@InterfaceStability.Evolving
public interface InputPartition<T>
extends java.io.Serializable
An input partition returned by DataSourceReader.planInputPartitions(), responsible for creating the actual data reader of one RDD partition. The relationship between InputPartition and InputPartitionReader is similar to the relationship between Iterable and Iterator.

Note that InputPartitions will be serialized and sent to executors, and InputPartitionReaders will then be created on the executors to do the actual reading. InputPartition must therefore be serializable, while InputPartitionReader does not need to be.

Modifier and Type | Method and Description |
---|---|
InputPartitionReader<T> | createPartitionReader(): Returns an input partition reader to do the actual reading work. |
default String[] | preferredLocations(): The preferred locations where the input partition reader returned by this partition can run faster. Spark does not guarantee that the reader will run on these locations. |
default String[] preferredLocations()
InputPartitionReader<T> createPartitionReader()
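
The Iterable/Iterator analogy and the serialization requirement can be illustrated with a small, self-contained sketch. It does not use the Spark classes: `Partition` and `PartitionReader` are local stand-ins that mirror the two methods listed above, and the reader methods `next()` and `get()`, the empty-array default for `preferredLocations()`, the `RangePartition` class, and the `shipToExecutor` helper are all assumptions made for illustration. Java serialization stands in for the transfer from the driver to an executor.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Closeable;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

public class InputPartitionSketch {

    // Stand-in for InputPartitionReader<T>. The next()/get() shape is an assumption.
    public interface PartitionReader<T> extends Closeable {
        boolean next() throws IOException;
        T get();
    }

    // Stand-in for InputPartition<T>: serializable, and a factory for readers.
    public interface Partition<T> extends Serializable {
        PartitionReader<T> createPartitionReader();

        // Assumed default: an empty array, meaning no location preference.
        default String[] preferredLocations() {
            return new String[0];
        }
    }

    // Hypothetical partition covering the integers in [start, end).
    public static final class RangePartition implements Partition<Integer> {
        private static final long serialVersionUID = 1L;
        private final int start;
        private final int end;

        public RangePartition(int start, int end) {
            this.start = start;
            this.end = end;
        }

        @Override
        public PartitionReader<Integer> createPartitionReader() {
            // The reader holds mutable cursor state and is never serialized.
            return new PartitionReader<Integer>() {
                private int current = start - 1;

                @Override
                public boolean next() {
                    current++;
                    return current < end;
                }

                @Override
                public Integer get() {
                    return current;
                }

                @Override
                public void close() {
                    // Nothing to release in this sketch.
                }
            };
        }
    }

    // Simulates sending a partition to an executor via Java serialization.
    @SuppressWarnings("unchecked")
    public static <T> Partition<T> shipToExecutor(Partition<T> partition)
            throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(partition);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (Partition<T>) in.readObject();
        }
    }

    // Creates a fresh reader from the partition and drains it.
    public static <T> List<T> readAll(Partition<T> partition) throws IOException {
        List<T> rows = new ArrayList<>();
        try (PartitionReader<T> reader = partition.createPartitionReader()) {
            while (reader.next()) {
                rows.add(reader.get());
            }
        }
        return rows;
    }

    public static void main(String[] args) throws Exception {
        Partition<Integer> onDriver = new RangePartition(0, 3);
        Partition<Integer> onExecutor = shipToExecutor(onDriver);
        System.out.println(readAll(onExecutor)); // prints [0, 1, 2]
        System.out.println(readAll(onExecutor)); // a new reader each time: [0, 1, 2]
    }
}
```

As with an Iterable, the partition in this sketch carries only immutable description (the range bounds) and can hand out any number of readers, while each reader owns its own cursor, which is why only the partition has to survive serialization.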