Output the RDD as a Hadoop SequenceFile using the Writable types we infer from the RDD's key and value types. If the key or value is already a Writable, we use its class directly; otherwise we map primitive types such as Int and Double to IntWritable, DoubleWritable, etc., byte arrays to BytesWritable, and Strings to Text. The path can be on any Hadoop-supported file system.
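A minimal usage sketch, assuming a local Spark installation, may help make the type mapping concrete. The application name, the local master setting, and the output path `/tmp/seqfile-example` are illustrative assumptions, not part of the API; the output directory is expected not to exist beforehand. The `SparkContext._` import is included because this documentation asks for it; newer Spark releases may resolve the implicit conversion without it.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.SparkContext._ // brings the implicit conversion into scope, as described here

object SequenceFileExample {
  def main(args: Array[String]): Unit = {
    // Hypothetical app name and local master, for illustration only.
    val conf = new SparkConf().setAppName("SequenceFileExample").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      // Hypothetical output path; any Hadoop-supported file system URI should work.
      val path = "/tmp/seqfile-example"

      // An RDD of (String, Int) pairs: keys are written as Text,
      // values as IntWritable, following the mapping described above.
      val pairs = sc.parallelize(Seq(("a", 1), ("b", 2), ("c", 3)))
      pairs.saveAsSequenceFile(path)

      // Read the file back, converting the Writables to String and Int again.
      val restored = sc.sequenceFile[String, Int](path).collect().sortBy(_._1)
      restored.foreach(println)
    } finally {
      sc.stop()
    }
  }
}
```

Because the key and value types here are plain String and Int, no Writable classes need to be named explicitly; an RDD whose keys or values are already Writable would be saved with those classes unchanged.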
Extra functions available on RDDs of (key, value) pairs to create a Hadoop SequenceFile, through an implicit conversion. Note that this can't be part of PairRDDFunctions because we need more implicit parameters to convert our keys and values to Writable.
Import org.apache.spark.SparkContext._ at the top of your program to use these functions.