An RDD that cogroups its parents.
Represents a coalesced RDD that has fewer partitions than its parent RDD. This class uses the PartitionCoalescer class to find a good partitioning of the parent RDD, so that each new partition has roughly the same number of parent partitions and the preferred location of each new partition overlaps with as many preferred locations of its parent partitions as possible.
Class that captures a coalesced RDD by essentially keeping track of parent partitions.
Extra functions available on RDDs of Doubles through an implicit conversion.
An RDD that is empty.
An RDD that reads a Hadoop dataset as specified by a JobConf.
An RDD that executes an SQL query on a JDBC connection and reads results.
Extra functions available on RDDs of (key, value) pairs where the key is sortable through an implicit conversion.
Extra functions available on RDDs of (key, value) pairs through an implicit conversion.
An RDD used to prune RDD partitions so that tasks are not launched on all partitions.
An RDD that pipes the contents of each parent partition through an external command (printing them one per line) and returns the output as a collection of strings.
Represents a dependency between the PartitionPruningRDD and its parent.
A Resilient Distributed Dataset (RDD), the basic abstraction in Spark.
Extra functions available on RDDs of (key, value) pairs to create a Hadoop SequenceFile, through an implicit conversion.
The resulting RDD from a shuffle.
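
The coalesced RDD entry above says that each new partition should cover roughly the same number of parent partitions. The following is a minimal sketch of that balancing idea only, written in plain Python with no Spark dependency. It is not Spark's PartitionCoalescer algorithm: it ignores preferred locations entirely, and the function name `coalesce_partitions` and its parameters are hypothetical names chosen for illustration.

```python
from typing import List


def coalesce_partitions(num_parent_partitions: int,
                        num_new_partitions: int) -> List[List[int]]:
    """Group parent partition indices into contiguous, near-equal groups.

    Assumption: locality is ignored, so this only illustrates the
    "roughly the same number of parent partitions" part of the description.
    """
    if num_parent_partitions < 0 or num_new_partitions <= 0:
        raise ValueError("partition counts must be non-negative / positive")

    # Never create more groups than there are parent partitions.
    n = min(num_parent_partitions, num_new_partitions)

    groups = []
    for i in range(n):
        # Integer arithmetic spreads any remainder across the groups,
        # so group sizes differ by at most one.
        start = i * num_parent_partitions // n
        end = (i + 1) * num_parent_partitions // n
        groups.append(list(range(start, end)))
    return groups


if __name__ == "__main__":
    # Ten parent partitions coalesced into three new partitions.
    for new_index, parents in enumerate(coalesce_partitions(10, 3)):
        print(new_index, parents)
```

Each inner list plays the role of the "parent partitions" that a single coalesced partition keeps track of. A locality-aware version would additionally have to weigh the preferred locations of each parent partition when choosing the groups.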