Class

org.apache.spark.sql

ForeachWriter

Related Doc: package sql

abstract class ForeachWriter[T] extends Serializable

A class to consume data generated by a StreamingQuery, typically to send the generated data to external systems. Each partition uses a new deserialized instance, so initialization (e.g. opening a connection or starting a transaction) should usually be done in the open method rather than in the constructor.

Scala example:

import org.apache.spark.sql.ForeachWriter

datasetOfString.writeStream.foreach(new ForeachWriter[String] {

  def open(partitionId: Long, version: Long): Boolean = {
    // open connection
    true // return true so that this partition is processed
  }

  def process(record: String): Unit = {
    // write string to connection
  }

  def close(errorOrNull: Throwable): Unit = {
    // close the connection
  }
})

Java example:

import org.apache.spark.sql.ForeachWriter;

datasetOfString.writeStream().foreach(new ForeachWriter<String>() {

  @Override
  public boolean open(long partitionId, long version) {
    // open connection
    return true; // return true so that this partition is processed
  }

  @Override
  public void process(String value) {
    // write string to connection
  }

  @Override
  public void close(Throwable errorOrNull) {
    // close the connection
  }
});

Annotations: @Evolving()
Source: ForeachWriter.scala
Since: 2.0.0

Linear Supertypes
Serializable, Serializable, AnyRef, Any

Instance Constructors

  1. new ForeachWriter()


Abstract Value Members

  1. abstract def close(errorOrNull: Throwable): Unit

    Called when Spark stops processing one partition of new data on the executor side. This is guaranteed to be called whether open returns true or false. However, close won't be called in the following cases:

    • The JVM crashes without throwing a Throwable.
    • open throws a Throwable.

    errorOrNull

    the error thrown while processing data, or null if there was no error.
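
The errorOrNull contract suggests a commit-or-rollback pattern in close. The following is a minimal Java sketch of that idea, not Spark code: the class BufferingWriterSketch and its in-memory committed list are hypothetical stand-ins for a real transactional sink, and the class deliberately does not extend ForeachWriter so that it compiles without Spark on the classpath. In a real job, the same three methods would override those of ForeachWriter<String>.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a ForeachWriter<String> implementation.
// It mirrors the open/process/close signatures but does not depend on Spark.
class BufferingWriterSketch {
    // Stand-in for an external system; a real writer would hold a connection instead.
    final List<String> committed = new ArrayList<>();
    private final List<String> pending = new ArrayList<>();

    boolean open(long partitionId, long version) {
        pending.clear(); // start a fresh "transaction" for this partition
        return true;     // this sketch always processes the partition
    }

    void process(String value) {
        pending.add(value); // buffer the record until close decides the outcome
    }

    void close(Throwable errorOrNull) {
        if (errorOrNull == null) {
            committed.addAll(pending); // no error: commit the buffered records
        }
        pending.clear(); // on error the buffered records are discarded (rollback)
    }
}
```

Buffering in memory is only for illustration; with a real sink the same branch on errorOrNull would typically choose between committing and rolling back a transaction opened in open.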

  2. abstract def open(partitionId: Long, version: Long): Boolean

    Called when starting to process one partition of new data on the executor side. The version is used for data deduplication when there are failures. When recovering from a failure, some data may be generated multiple times, but it will always have the same version.

    If this method determines from partitionId and version that this partition has already been processed, it can return false to skip further processing. However, close will still be called so that resources can be cleaned up.

    partitionId

    the partition id.

    version

    a unique id for data deduplication.

    returns

    true if the corresponding partition and version id should be processed. false indicates the partition should be skipped.
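
One way to use partitionId and version for deduplication is to record each completed pair and return false from open when the pair has been seen before. The Java sketch below illustrates this under stated assumptions: DedupWriterSketch and its static COMPLETED set are hypothetical, the set stands in for a durable store that a real writer would keep in the external system, and the class does not extend ForeachWriter so that it compiles without Spark.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for a ForeachWriter<String> implementation; no Spark dependency.
class DedupWriterSketch {
    // Stand-in for a durable store of completed (partitionId, version) pairs.
    // JVM memory does not survive a restart, so a real writer needs external storage.
    static final Set<String> COMPLETED = ConcurrentHashMap.newKeySet();

    private String key;            // identifies the partition and version being written
    private boolean shouldProcess; // remembers what open returned

    boolean open(long partitionId, long version) {
        key = partitionId + ":" + version;
        shouldProcess = !COMPLETED.contains(key); // skip output that was already written
        return shouldProcess;
    }

    void process(String value) {
        // write value to the external system
    }

    void close(Throwable errorOrNull) {
        // close is called even when open returned false, so guard the bookkeeping
        if (shouldProcess && errorOrNull == null) {
            COMPLETED.add(key); // record success so a replay of this version is skipped
        }
    }
}
```

The pair is recorded only after a successful close, so a partition that failed part-way is processed again on recovery rather than skipped.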

  3. abstract def process(value: T): Unit

    Called to process the data on the executor side. This method is called only when open returns true.
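
The call order described for open, process and close can be summarized with a small driver. This is a sketch of the documented contract only, not Spark's actual executor code: WriterCallbacks and LifecycleSketch are hypothetical names introduced here, and the interface merely mirrors the three ForeachWriter methods so that the example compiles without Spark.

```java
import java.util.List;

// Hypothetical stand-in for the three ForeachWriter callbacks; not Spark's class.
interface WriterCallbacks<T> {
    boolean open(long partitionId, long version);
    void process(T value);
    void close(Throwable errorOrNull);
}

class LifecycleSketch {
    // Drives one partition through the documented call order.
    static <T> void runPartition(WriterCallbacks<T> w, long partitionId,
                                 long version, List<T> rows) {
        // If open throws, the exception propagates and close is never called.
        boolean opened = w.open(partitionId, version);
        try {
            if (opened) {
                for (T row : rows) {
                    w.process(row); // only reached when open returned true
                }
            }
        } catch (RuntimeException | Error t) {
            w.close(t); // close receives the error thrown during processing
            throw t;
        }
        w.close(null); // no error: close is called with null, even if open returned false
    }
}
```

Rethrowing after close(t) is a choice made for this sketch so that the failure remains visible to the caller.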

Concrete Value Members

  1. final def !=(arg0: Any): Boolean

    Definition Classes: AnyRef → Any

  2. final def ##(): Int

    Definition Classes: AnyRef → Any

  3. final def ==(arg0: Any): Boolean

    Definition Classes: AnyRef → Any

  4. final def asInstanceOf[T0]: T0

    Definition Classes: Any

  5. def clone(): AnyRef

    Attributes: protected[java.lang]
    Definition Classes: AnyRef
    Annotations: @throws( ... )

  6. final def eq(arg0: AnyRef): Boolean

    Definition Classes: AnyRef

  7. def equals(arg0: Any): Boolean

    Definition Classes: AnyRef → Any

  8. def finalize(): Unit

    Attributes: protected[java.lang]
    Definition Classes: AnyRef
    Annotations: @throws( classOf[java.lang.Throwable] )

  9. final def getClass(): Class[_]

    Definition Classes: AnyRef → Any

  10. def hashCode(): Int

    Definition Classes: AnyRef → Any

  11. final def isInstanceOf[T0]: Boolean

    Definition Classes: Any

  12. final def ne(arg0: AnyRef): Boolean

    Definition Classes: AnyRef

  13. final def notify(): Unit

    Definition Classes: AnyRef

  14. final def notifyAll(): Unit

    Definition Classes: AnyRef

  15. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes: AnyRef

  16. def toString(): String

    Definition Classes: AnyRef → Any

  17. final def wait(): Unit

    Definition Classes: AnyRef
    Annotations: @throws( ... )

  18. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes: AnyRef
    Annotations: @throws( ... )

  19. final def wait(arg0: Long): Unit

    Definition Classes: AnyRef
    Annotations: @throws( ... )
