- abs(Column) - Static method in class org.apache.spark.sql.functions
-
Computes the absolute value.
- AbsoluteError - Class in org.apache.spark.mllib.tree.loss
-
:: DeveloperApi ::
Class for absolute error loss calculation (for regression).
- AbsoluteError() - Constructor for class org.apache.spark.mllib.tree.loss.AbsoluteError
-
- accId() - Method in class org.apache.spark.CleanAccum
-
- Accumulable<R,T> - Class in org.apache.spark
-
A data type that can be accumulated, i.e., has a commutative and associative "add" operation,
but where the result type, `R`, may be different from the element type being added, `T`.
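The split between the result type `R` and the element type `T` can be illustrated without Spark. The following is a minimal plain-Java sketch, not Spark's implementation: the `Param` interface is a hypothetical stand-in that only mirrors the `addAccumulator(R, T)` and `addInPlace(R, R)` pair listed in this index, and the partitions are ordinary arrays.

```java
import java.util.HashSet;
import java.util.Set;

// Plain-Java sketch (no Spark dependency). Param is a hypothetical stand-in
// for the addAccumulator/addInPlace pair; it is not the Spark interface itself.
class AccumulableSketch {
    interface Param<R, T> {
        R addAccumulator(R acc, T element); // fold one element of type T into a result of type R
        R addInPlace(R left, R right);      // merge two partial results of type R
    }

    // R = Set<String>, T = String: the result type differs from the element type.
    static final Param<Set<String>, String> DISTINCT_WORDS = new Param<Set<String>, String>() {
        @Override
        public Set<String> addAccumulator(Set<String> acc, String element) {
            acc.add(element);
            return acc;
        }

        @Override
        public Set<String> addInPlace(Set<String> left, Set<String> right) {
            left.addAll(right);
            return left;
        }
    };

    // Each inner array plays the role of one task's input.
    static Set<String> accumulate(String[][] partitions) {
        Set<String> total = new HashSet<>();
        for (String[] partition : partitions) {
            Set<String> partial = new HashSet<>(); // each "task" starts from an empty value
            for (String word : partition) {
                partial = DISTINCT_WORDS.addAccumulator(partial, word);
            }
            total = DISTINCT_WORDS.addInPlace(total, partial);
        }
        return total;
    }

    public static void main(String[] args) {
        String[][] partitions = { {"a", "b"}, {"b", "c"} };
        System.out.println(accumulate(partitions).size()); // prints 3
    }
}
```

Because set union is commutative and associative, the order in which partial results are merged does not change the outcome.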
- Accumulable(R, AccumulableParam<R, T>, Option<String>) - Constructor for class org.apache.spark.Accumulable
-
- Accumulable(R, AccumulableParam<R, T>) - Constructor for class org.apache.spark.Accumulable
-
- accumulable(T, AccumulableParam<T, R>) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Create an `Accumulable` shared variable of the given type, to which tasks can "add" values with `add`.
- accumulable(T, String, AccumulableParam<T, R>) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Create an `Accumulable` shared variable of the given type, to which tasks can "add" values with `add`.
- accumulable(R, AccumulableParam<R, T>) - Method in class org.apache.spark.SparkContext
-
Create an `Accumulable` shared variable, to which tasks can add values with `+=`.
- accumulable(R, String, AccumulableParam<R, T>) - Method in class org.apache.spark.SparkContext
-
Create an `Accumulable` shared variable, with a name for display in the Spark UI.
- accumulableCollection(R, Function1<R, Growable<T>>, ClassTag<R>) - Method in class org.apache.spark.SparkContext
-
Create an accumulator from a "mutable collection" type.
- AccumulableInfo - Class in org.apache.spark.scheduler
-
:: DeveloperApi ::
Information about an `Accumulable` modified during a task or stage.
- AccumulableInfo(long, String, Option<String>, String) - Constructor for class org.apache.spark.scheduler.AccumulableInfo
-
- AccumulableInfo - Class in org.apache.spark.status.api.v1
-
- AccumulableParam<R,T> - Interface in org.apache.spark
-
Helper object defining how to accumulate values of a particular type.
- accumulables() - Method in class org.apache.spark.scheduler.StageInfo
-
Terminal values of accumulables updated during this stage.
- accumulables() - Method in class org.apache.spark.scheduler.TaskInfo
-
Intermediate updates to accumulables during this task.
- Accumulator<T> - Class in org.apache.spark
-
A simpler value of `Accumulable` where the result type being accumulated is the same
as the types of elements being merged.
- Accumulator(T, AccumulatorParam<T>, Option<String>) - Constructor for class org.apache.spark.Accumulator
-
- Accumulator(T, AccumulatorParam<T>) - Constructor for class org.apache.spark.Accumulator
-
- accumulator(int) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Create an `Accumulator` integer variable, which tasks can "add" values to using the `add` method.
- accumulator(int, String) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Create an `Accumulator` integer variable, which tasks can "add" values to using the `add` method.
- accumulator(double) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Create an `Accumulator` double variable, which tasks can "add" values to using the `add` method.
- accumulator(double, String) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Create an `Accumulator` double variable, which tasks can "add" values to using the `add` method.
- accumulator(T, AccumulatorParam<T>) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Create an `Accumulator` variable of a given type, which tasks can "add" values to using the `add` method.
- accumulator(T, String, AccumulatorParam<T>) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Create an `Accumulator` variable of a given type, which tasks can "add" values to using the `add` method.
- accumulator(T, AccumulatorParam<T>) - Method in class org.apache.spark.SparkContext
-
Create an `Accumulator` variable of a given type, which tasks can "add" values to using the `+=` method.
- accumulator(T, String, AccumulatorParam<T>) - Method in class org.apache.spark.SparkContext
-
Create an `Accumulator` variable of a given type, with a name for display in the Spark UI.
- AccumulatorParam<T> - Interface in org.apache.spark
-
A simpler version of `AccumulableParam` where the only data type you can add
in is the same type as the accumulated value.
- AccumulatorParam.DoubleAccumulatorParam$ - Class in org.apache.spark
-
- AccumulatorParam.DoubleAccumulatorParam$() - Constructor for class org.apache.spark.AccumulatorParam.DoubleAccumulatorParam$
-
- AccumulatorParam.FloatAccumulatorParam$ - Class in org.apache.spark
-
- AccumulatorParam.FloatAccumulatorParam$() - Constructor for class org.apache.spark.AccumulatorParam.FloatAccumulatorParam$
-
- AccumulatorParam.IntAccumulatorParam$ - Class in org.apache.spark
-
- AccumulatorParam.IntAccumulatorParam$() - Constructor for class org.apache.spark.AccumulatorParam.IntAccumulatorParam$
-
- AccumulatorParam.LongAccumulatorParam$ - Class in org.apache.spark
-
- AccumulatorParam.LongAccumulatorParam$() - Constructor for class org.apache.spark.AccumulatorParam.LongAccumulatorParam$
-
- accumulatorUpdates() - Method in class org.apache.spark.status.api.v1.StageData
-
- accumulatorUpdates() - Method in class org.apache.spark.status.api.v1.TaskData
-
- accuracy() - Method in class org.apache.spark.mllib.evaluation.MultilabelMetrics
-
Returns accuracy
- acos(Column) - Static method in class org.apache.spark.sql.functions
-
Computes the cosine inverse of the given value; the returned angle is in the range
0.0 through pi.
- acos(String) - Static method in class org.apache.spark.sql.functions
-
Computes the cosine inverse of the given column; the returned angle is in the range
0.0 through pi.
- active() - Method in class org.apache.spark.streaming.scheduler.ReceiverInfo
-
- activeJobs() - Method in class org.apache.spark.ui.jobs.JobProgressListener
-
- activeStages() - Method in class org.apache.spark.ui.jobs.JobProgressListener
-
- activeTasks() - Method in class org.apache.spark.status.api.v1.ExecutorSummary
-
- ActorHelper - Interface in org.apache.spark.streaming.receiver
-
:: DeveloperApi ::
A receiver trait to be mixed in with your Actor to gain access to
the API for pushing received data into Spark Streaming for being processed.
- actorStream(Props, String, StorageLevel, SupervisorStrategy) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Create an input stream with any arbitrary user implemented actor receiver.
- actorStream(Props, String, StorageLevel) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Create an input stream with any arbitrary user implemented actor receiver.
- actorStream(Props, String) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Create an input stream with any arbitrary user implemented actor receiver.
- actorStream(Props, String, StorageLevel, SupervisorStrategy, ClassTag<T>) - Method in class org.apache.spark.streaming.StreamingContext
-
Create an input stream with any arbitrary user implemented actor receiver.
- ActorSupervisorStrategy - Class in org.apache.spark.streaming.receiver
-
:: DeveloperApi ::
A helper with a set of defaults for the supervisor strategy.
- ActorSupervisorStrategy() - Constructor for class org.apache.spark.streaming.receiver.ActorSupervisorStrategy
-
- actorSystem() - Method in class org.apache.spark.SparkEnv
-
- add(T) - Method in class org.apache.spark.Accumulable
-
Add more data to this accumulator / accumulable
- add(double, Vector) - Method in class org.apache.spark.ml.classification.LogisticAggregator
-
Add a new training instance to this LogisticAggregator, and update the loss and gradient
of the objective function.
- add(double, Vector) - Method in class org.apache.spark.ml.regression.LeastSquaresAggregator
-
Add a new training instance to this LeastSquaresAggregator, and update the loss and gradient
of the objective function.
- add(double[], MultivariateGaussian[], ExpectationSum, Vector<Object>) - Static method in class org.apache.spark.mllib.clustering.ExpectationSum
-
- add(Vector) - Method in class org.apache.spark.mllib.feature.IDF.DocumentFrequencyAggregator
-
Adds a new document.
- add(BlockMatrix) - Method in class org.apache.spark.mllib.linalg.distributed.BlockMatrix
-
Adds two block matrices together.
- add(Vector) - Method in class org.apache.spark.mllib.stat.MultivariateOnlineSummarizer
-
Add a new sample to this summarizer, and update the statistical summary.
- add(Vector) - Method in class org.apache.spark.util.Vector
-
- addAccumulator(R, T) - Method in interface org.apache.spark.AccumulableParam
-
Add additional data to the accumulator value.
- addAccumulator(T, T) - Method in interface org.apache.spark.AccumulatorParam
-
- addAppArgs(String...) - Method in class org.apache.spark.launcher.SparkLauncher
-
Adds command line arguments for the application.
- addedFiles() - Method in class org.apache.spark.SparkContext
-
- addedJars() - Method in class org.apache.spark.SparkContext
-
- addFile(String) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Add a file to be downloaded with this Spark job on every node.
- addFile(String) - Method in class org.apache.spark.launcher.SparkLauncher
-
Adds a file to be submitted with the application.
- addFile(String) - Method in class org.apache.spark.SparkContext
-
Add a file to be downloaded with this Spark job on every node.
- addFile(String, boolean) - Method in class org.apache.spark.SparkContext
-
Add a file to be downloaded with this Spark job on every node.
- addGrid(Param<T>, Iterable<T>) - Method in class org.apache.spark.ml.tuning.ParamGridBuilder
-
Adds a param with multiple values (overwrites if the input param exists).
- addGrid(DoubleParam, double[]) - Method in class org.apache.spark.ml.tuning.ParamGridBuilder
-
Adds a double param with multiple values.
- addGrid(IntParam, int[]) - Method in class org.apache.spark.ml.tuning.ParamGridBuilder
-
Adds an int param with multiple values.
- addGrid(FloatParam, float[]) - Method in class org.apache.spark.ml.tuning.ParamGridBuilder
-
Adds a float param with multiple values.
- addGrid(LongParam, long[]) - Method in class org.apache.spark.ml.tuning.ParamGridBuilder
-
Adds a long param with multiple values.
- addGrid(BooleanParam) - Method in class org.apache.spark.ml.tuning.ParamGridBuilder
-
Adds a boolean param with true and false.
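The `addGrid` overloads above each register a set of candidate values for one param. A minimal plain-Java sketch of how such a grid expands into every combination of values follows. It is an illustration under assumptions, not the Spark class: `ParamGridSketch`, its string param names, and its `build` method are hypothetical stand-ins, and params are identified by name rather than by `Param<T>` objects.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a param grid builder (no Spark dependency).
class ParamGridSketch {
    private final Map<String, List<Object>> grid = new LinkedHashMap<>();

    // Registers candidate values for a param; overwrites if the param already exists.
    ParamGridSketch addGrid(String param, Object... values) {
        grid.put(param, Arrays.asList(values));
        return this;
    }

    // Expands the grid into the Cartesian product of all registered values.
    List<Map<String, Object>> build() {
        List<Map<String, Object>> combos = new ArrayList<>();
        combos.add(new LinkedHashMap<String, Object>()); // start with one empty combination
        for (Map.Entry<String, List<Object>> entry : grid.entrySet()) {
            List<Map<String, Object>> next = new ArrayList<>();
            for (Map<String, Object> combo : combos) {
                for (Object value : entry.getValue()) {
                    Map<String, Object> extended = new LinkedHashMap<>(combo);
                    extended.put(entry.getKey(), value);
                    next.add(extended);
                }
            }
            combos = next;
        }
        return combos;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> combos = new ParamGridSketch()
                .addGrid("regParam", 0.1, 0.01)
                .addGrid("fitIntercept", true, false)
                .build();
        System.out.println(combos.size()); // prints 4
        System.out.println(combos.get(0)); // prints {regParam=0.1, fitIntercept=true}
    }
}
```

Two params with two values each yield four combinations; adding the same param twice keeps only the most recent values, matching the "overwrites if the input param exists" note.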
- addInPlace(R, R) - Method in interface org.apache.spark.AccumulableParam
-
Merge two accumulated values together.
- addInPlace(double, double) - Method in class org.apache.spark.AccumulatorParam.DoubleAccumulatorParam$
-
- addInPlace(float, float) - Method in class org.apache.spark.AccumulatorParam.FloatAccumulatorParam$
-
- addInPlace(int, int) - Method in class org.apache.spark.AccumulatorParam.IntAccumulatorParam$
-
- addInPlace(long, long) - Method in class org.apache.spark.AccumulatorParam.LongAccumulatorParam$
-
- addInPlace(double, double) - Method in class org.apache.spark.SparkContext.DoubleAccumulatorParam$
-
- addInPlace(float, float) - Method in class org.apache.spark.SparkContext.FloatAccumulatorParam$
-
- addInPlace(int, int) - Method in class org.apache.spark.SparkContext.IntAccumulatorParam$
-
- addInPlace(long, long) - Method in class org.apache.spark.SparkContext.LongAccumulatorParam$
-
- addInPlace(Vector) - Method in class org.apache.spark.util.Vector
-
- addInPlace(Vector, Vector) - Method in class org.apache.spark.util.Vector.VectorAccumParam$
-
- addJar(String) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Adds a JAR dependency for all tasks to be executed on this SparkContext in the future.
- addJar(String) - Method in class org.apache.spark.launcher.SparkLauncher
-
Adds a jar file to be submitted with the application.
- addJar(String) - Method in class org.apache.spark.SparkContext
-
Adds a JAR dependency for all tasks to be executed on this SparkContext in the future.
- addLocalConfiguration(String, int, int, int, JobConf) - Static method in class org.apache.spark.rdd.HadoopRDD
-
Add Hadoop configuration specific to a single partition and attempt.
- addOnCompleteCallback(Function0<BoxedUnit>) - Method in class org.apache.spark.TaskContext
-
Adds a callback function to be executed on task completion.
- addPartToPGroup(Partition, PartitionGroup) - Method in class org.apache.spark.rdd.PartitionCoalescer
-
- addPyFile(String) - Method in class org.apache.spark.launcher.SparkLauncher
-
Adds a Python file / zip / egg to be submitted with the application.
- address() - Method in class org.apache.spark.status.api.v1.RDDDataDistribution
-
- addSparkListener(SparkListener) - Method in class org.apache.spark.SparkContext
-
:: DeveloperApi ::
Register a listener to receive up-calls from events that happen during execution.
- addStreamingListener(StreamingListener) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
- addStreamingListener(StreamingListener) - Method in class org.apache.spark.streaming.StreamingContext
-
- addTaskCompletionListener(TaskCompletionListener) - Method in class org.apache.spark.TaskContext
-
Adds a (Java friendly) listener to be executed on task completion.
- addTaskCompletionListener(Function1<TaskContext, BoxedUnit>) - Method in class org.apache.spark.TaskContext
-
Adds a listener in the form of a Scala closure to be executed on task completion.
- addVector(Vector) - Method in class org.apache.spark.ml.feature.VectorIndexer.CategoryStats
-
Add a new vector to this index, updating sets of unique feature values
- agg(Column, Column...) - Method in class org.apache.spark.sql.DataFrame
-
Aggregates on the entire `DataFrame` without groups.
- agg(Tuple2<String, String>, Seq<Tuple2<String, String>>) - Method in class org.apache.spark.sql.DataFrame
-
(Scala-specific) Aggregates on the entire `DataFrame` without groups.
- agg(Map<String, String>) - Method in class org.apache.spark.sql.DataFrame
-
(Scala-specific) Aggregates on the entire `DataFrame` without groups.
- agg(Map<String, String>) - Method in class org.apache.spark.sql.DataFrame
-
(Java-specific) Aggregates on the entire `DataFrame` without groups.
- agg(Column, Seq<Column>) - Method in class org.apache.spark.sql.DataFrame
-
Aggregates on the entire `DataFrame` without groups.
- agg(Column, Column...) - Method in class org.apache.spark.sql.GroupedData
-
Compute aggregates by specifying a series of aggregate columns.
- agg(Tuple2<String, String>, Seq<Tuple2<String, String>>) - Method in class org.apache.spark.sql.GroupedData
-
(Scala-specific) Compute aggregates by specifying a map from column name to
aggregate methods.
- agg(Map<String, String>) - Method in class org.apache.spark.sql.GroupedData
-
(Scala-specific) Compute aggregates by specifying a map from column name to
aggregate methods.
- agg(Map<String, String>) - Method in class org.apache.spark.sql.GroupedData
-
(Java-specific) Compute aggregates by specifying a map from column name to
aggregate methods.
- agg(Column, Seq<Column>) - Method in class org.apache.spark.sql.GroupedData
-
Compute aggregates by specifying a series of aggregate columns.
- aggregate(U, Function2<U, T, U>, Function2<U, U, U>) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Aggregate the elements of each partition, and then the results for all the partitions, using
given combine functions and a neutral "zero value".
- aggregate(U, Function2<U, T, U>, Function2<U, U, U>, ClassTag<U>) - Method in class org.apache.spark.rdd.RDD
-
Aggregate the elements of each partition, and then the results for all the partitions, using
given combine functions and a neutral "zero value".
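The two-level shape of `aggregate` (one function folding elements within a partition, a second merging per-partition results, both starting from the zero value) can be sketched in plain Java. This is an illustration only: partitions are modelled as nested lists, the names `seqOp` and `combOp` are labels chosen for this sketch, and nothing here runs on Spark.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.BiFunction;
import java.util.function.BinaryOperator;

// Plain-Java sketch of a two-level aggregate (no Spark dependency).
class AggregateSketch {
    static <T, U> U aggregate(List<List<T>> partitions, U zero,
                              BiFunction<U, T, U> seqOp, BinaryOperator<U> combOp) {
        U result = zero;
        for (List<T> partition : partitions) {
            U partial = zero; // every partition starts from the zero value
            for (T element : partition) {
                partial = seqOp.apply(partial, element); // fold elements within the partition
            }
            result = combOp.apply(result, partial);      // merge per-partition results
        }
        return result;
    }

    // T = String, U = Integer: sums the lengths of all strings.
    static int totalLength(List<List<String>> partitions) {
        Integer total = AggregateSketch.<String, Integer>aggregate(
                partitions, 0, (acc, s) -> acc + s.length(), (x, y) -> x + y);
        return total;
    }

    public static void main(String[] args) {
        List<List<String>> partitions = Arrays.asList(
                Arrays.asList("spark", "rdd"),
                Arrays.asList("aggregate"));
        System.out.println(totalLength(partitions)); // prints 17
    }
}
```

The zero value is applied once per partition and once more for the final merge, so it should be neutral for both functions (0 for addition here).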
- aggregateByKey(U, Partitioner, Function2<U, V, U>, Function2<U, U, U>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Aggregate the values of each key, using given combine functions and a neutral "zero value".
- aggregateByKey(U, int, Function2<U, V, U>, Function2<U, U, U>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Aggregate the values of each key, using given combine functions and a neutral "zero value".
- aggregateByKey(U, Function2<U, V, U>, Function2<U, U, U>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Aggregate the values of each key, using given combine functions and a neutral "zero value".
- aggregateByKey(U, Partitioner, Function2<U, V, U>, Function2<U, U, U>, ClassTag<U>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Aggregate the values of each key, using given combine functions and a neutral "zero value".
- aggregateByKey(U, int, Function2<U, V, U>, Function2<U, U, U>, ClassTag<U>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Aggregate the values of each key, using given combine functions and a neutral "zero value".
- aggregateByKey(U, Function2<U, V, U>, Function2<U, U, U>, ClassTag<U>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Aggregate the values of each key, using given combine functions and a neutral "zero value".
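The per-key variant follows the same idea, with the zero value used as the starting point for each key inside each partition. A minimal plain-Java sketch, assuming partitions are nested lists of key-value entries; the class and helper names are hypothetical and no partitioner is modelled.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiFunction;
import java.util.function.BinaryOperator;

// Plain-Java sketch of per-key aggregation (no Spark dependency).
class AggregateByKeySketch {
    static <K, V, U> Map<K, U> aggregateByKey(List<List<Map.Entry<K, V>>> partitions, U zero,
                                              BiFunction<U, V, U> seqOp, BinaryOperator<U> combOp) {
        Map<K, U> merged = new LinkedHashMap<>();
        for (List<Map.Entry<K, V>> partition : partitions) {
            Map<K, U> partial = new LinkedHashMap<>();
            for (Map.Entry<K, V> pair : partition) {
                // The first value seen for a key in this partition is folded into the zero value.
                U current = partial.containsKey(pair.getKey()) ? partial.get(pair.getKey()) : zero;
                partial.put(pair.getKey(), seqOp.apply(current, pair.getValue()));
            }
            for (Map.Entry<K, U> entry : partial.entrySet()) {
                merged.merge(entry.getKey(), entry.getValue(), combOp); // combine across partitions
            }
        }
        return merged;
    }

    static Map.Entry<String, Integer> pair(String key, int value) {
        return new SimpleEntry<String, Integer>(key, value);
    }

    static Map<String, Integer> sumPerKey() {
        List<List<Map.Entry<String, Integer>>> partitions = Arrays.asList(
                Arrays.asList(pair("a", 1), pair("b", 2)),
                Arrays.asList(pair("a", 4)));
        return AggregateByKeySketch.<String, Integer, Integer>aggregateByKey(
                partitions, 0, (acc, v) -> acc + v, (x, y) -> x + y);
    }

    public static void main(String[] args) {
        System.out.println(sumPerKey()); // prints {a=5, b=2}
    }
}
```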
- AggregatedDialect - Class in org.apache.spark.sql.jdbc
-
:: DeveloperApi ::
AggregatedDialect can unify multiple dialects into one virtual Dialect.
- AggregatedDialect(List<JdbcDialect>) - Constructor for class org.apache.spark.sql.jdbc.AggregatedDialect
-
- aggregateMessages(Function1<EdgeContext<VD, ED, A>, BoxedUnit>, Function2<A, A, A>, TripletFields, ClassTag<A>) - Method in class org.apache.spark.graphx.Graph
-
Aggregates values from the neighboring edges and vertices of each vertex.
- aggregateMessagesWithActiveSet(Function1<EdgeContext<VD, ED, A>, BoxedUnit>, Function2<A, A, A>, TripletFields, Option<Tuple2<VertexRDD<?>, EdgeDirection>>, ClassTag<A>) - Method in class org.apache.spark.graphx.impl.GraphImpl
-
- aggregateUsingIndex(RDD<Tuple2<Object, VD2>>, Function2<VD2, VD2, VD2>, ClassTag<VD2>) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
-
- aggregateUsingIndex(RDD<Tuple2<Object, VD2>>, Function2<VD2, VD2, VD2>, ClassTag<VD2>) - Method in class org.apache.spark.graphx.VertexRDD
-
Aggregates vertices in `messages` that have the same ids using `reduceFunc`, returning a
`VertexRDD` co-indexed with `this`.
- AggregatingEdgeContext<VD,ED,A> - Class in org.apache.spark.graphx.impl
-
- AggregatingEdgeContext(Function2<A, A, A>, Object, BitSet) - Constructor for class org.apache.spark.graphx.impl.AggregatingEdgeContext
-
- Aggregator<K,V,C> - Class in org.apache.spark
-
:: DeveloperApi ::
A set of functions used to aggregate data.
- Aggregator(Function1<V, C>, Function2<C, V, C>, Function2<C, C, C>) - Constructor for class org.apache.spark.Aggregator
-
- aggregator() - Method in class org.apache.spark.ShuffleDependency
-
- Algo - Class in org.apache.spark.mllib.tree.configuration
-
:: Experimental ::
Enum to select the algorithm for the decision tree
- Algo() - Constructor for class org.apache.spark.mllib.tree.configuration.Algo
-
- algo() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- algo() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel
-
- algo() - Method in class org.apache.spark.mllib.tree.model.GradientBoostedTreesModel
-
- algo() - Method in class org.apache.spark.mllib.tree.model.RandomForestModel
-
- algorithm() - Method in class org.apache.spark.mllib.regression.StreamingLinearRegressionWithSGD
-
- alias(String) - Method in class org.apache.spark.sql.Column
-
Gives the column an alias.
- All - Static variable in class org.apache.spark.graphx.TripletFields
-
Expose all the fields (source, edge, and destination).
- AlphaComponent - Annotation Type in org.apache.spark.annotation
-
A new component of Spark which may have unstable APIs.
- ALS - Class in org.apache.spark.ml.recommendation
-
:: Experimental ::
Alternating Least Squares (ALS) matrix factorization.
- ALS(String) - Constructor for class org.apache.spark.ml.recommendation.ALS
-
- ALS() - Constructor for class org.apache.spark.ml.recommendation.ALS
-
- ALS - Class in org.apache.spark.mllib.recommendation
-
- ALS() - Constructor for class org.apache.spark.mllib.recommendation.ALS
-
- ALS.Rating<ID> - Class in org.apache.spark.ml.recommendation
-
:: DeveloperApi ::
Rating class for better code readability.
- ALS.Rating(ID, ID, float) - Constructor for class org.apache.spark.ml.recommendation.ALS.Rating
-
- ALS.Rating$ - Class in org.apache.spark.ml.recommendation
-
- ALS.Rating$() - Constructor for class org.apache.spark.ml.recommendation.ALS.Rating$
-
- ALSModel - Class in org.apache.spark.ml.recommendation
-
:: Experimental ::
Model fitted by ALS.
- AnalysisException - Exception in org.apache.spark.sql
-
:: DeveloperApi ::
Thrown when a query fails to analyze, usually because the query itself is invalid.
- analyze(String) - Method in class org.apache.spark.sql.hive.HiveContext
-
Analyzes the given table in the current database to generate statistics, which will be
used in query optimizations.
- and(Column) - Method in class org.apache.spark.sql.Column
-
Boolean AND.
- And - Class in org.apache.spark.sql.sources
-
A filter that evaluates to `true` iff both `left` and `right` evaluate to `true`.
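The conjunction semantics can be sketched with plain Java predicates. This is only an illustration of a composite filter that passes when both of its children pass; `AndFilterSketch` is a hypothetical name, and `java.util.function.Predicate` stands in for the data source `Filter` type.

```java
import java.util.function.Predicate;

// Plain-Java sketch of a conjunctive filter (no Spark dependency).
class AndFilterSketch {
    // Returns a predicate that is true only when both children are true.
    static <T> Predicate<T> and(Predicate<T> left, Predicate<T> right) {
        return row -> left.test(row) && right.test(row);
    }

    public static void main(String[] args) {
        Predicate<Integer> positive = v -> v > 0;
        Predicate<Integer> even = v -> v % 2 == 0;
        Predicate<Integer> both = and(positive, even);
        System.out.println(both.test(4));  // prints true
        System.out.println(both.test(3));  // prints false
        System.out.println(both.test(-2)); // prints false
    }
}
```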
- And(Filter, Filter) - Constructor for class org.apache.spark.sql.sources.And
-
- ANY() - Static method in class org.apache.spark.scheduler.TaskLocality
-
- anyNull() - Method in interface org.apache.spark.sql.Row
-
Returns true if there are any NULL values in this row.
- appAttemptId() - Method in class org.apache.spark.scheduler.SparkListenerApplicationStart
-
- appendBias(Vector) - Static method in class org.apache.spark.mllib.util.MLUtils
-
Returns a new vector with `1.0` (bias) appended to the input vector.
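As a rough illustration of what appending a bias term means, here is a plain-Java sketch that works on a `double[]` instead of an MLlib `Vector`. The class name is hypothetical, and only the dense case is shown.

```java
import java.util.Arrays;

// Plain-Java sketch of appending a bias term (no Spark dependency).
class AppendBiasSketch {
    // Returns a copy of the input with a trailing 1.0; the input is left unchanged.
    static double[] appendBias(double[] features) {
        double[] out = Arrays.copyOf(features, features.length + 1);
        out[features.length] = 1.0;
        return out;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(appendBias(new double[] {0.5, -2.0})));
        // prints [0.5, -2.0, 1.0]
    }
}
```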
- appId() - Method in class org.apache.spark.scheduler.SparkListenerApplicationStart
-
- applicationAttemptId() - Method in class org.apache.spark.SparkContext
-
- ApplicationAttemptInfo - Class in org.apache.spark.status.api.v1
-
- applicationId() - Method in class org.apache.spark.SparkContext
-
- ApplicationInfo - Class in org.apache.spark.status.api.v1
-
- ApplicationStatus - Enum in org.apache.spark.status.api.v1
-
- apply(RDD<Tuple2<Object, VD>>, RDD<Edge<ED>>, VD, StorageLevel, StorageLevel, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.Graph
-
Construct a graph from a collection of vertices and
edges with attributes.
- apply(RDD<Edge<ED>>, VD, StorageLevel, StorageLevel, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.impl.GraphImpl
-
Create a graph from edges, setting referenced vertices to `defaultVertexAttr`.
- apply(RDD<Tuple2<Object, VD>>, RDD<Edge<ED>>, VD, StorageLevel, StorageLevel, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.impl.GraphImpl
-
Create a graph from vertices and edges, setting missing vertices to `defaultVertexAttr`.
- apply(VertexRDD<VD>, EdgeRDD<ED>, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.impl.GraphImpl
-
Create a graph from a VertexRDD and an EdgeRDD with arbitrary replicated vertices.
- apply(Graph<VD, ED>, A, int, EdgeDirection, Function3<Object, VD, A, VD>, Function1<EdgeTriplet<VD, ED>, Iterator<Tuple2<Object, A>>>, Function2<A, A, A>, ClassTag<VD>, ClassTag<ED>, ClassTag<A>) - Static method in class org.apache.spark.graphx.Pregel
-
Execute a Pregel-like iterative vertex-parallel abstraction.
- apply(RDD<Tuple2<Object, VD>>, ClassTag<VD>) - Static method in class org.apache.spark.graphx.VertexRDD
-
Constructs a standalone `VertexRDD` (one that is not set up for efficient joins with an
`EdgeRDD`) from an RDD of vertex-attribute pairs.
- apply(RDD<Tuple2<Object, VD>>, EdgeRDD<?>, VD, ClassTag<VD>) - Static method in class org.apache.spark.graphx.VertexRDD
-
Constructs a `VertexRDD` from an RDD of vertex-attribute pairs.
- apply(RDD<Tuple2<Object, VD>>, EdgeRDD<?>, VD, Function2<VD, VD, VD>, ClassTag<VD>) - Static method in class org.apache.spark.graphx.VertexRDD
-
Constructs a `VertexRDD` from an RDD of vertex-attribute pairs.
- apply(String) - Method in class org.apache.spark.ml.attribute.AttributeGroup
-
Gets an attribute by its name.
- apply(int) - Method in class org.apache.spark.ml.attribute.AttributeGroup
-
Gets an attribute by its index.
- apply(Param<T>) - Method in class org.apache.spark.ml.param.ParamMap
-
Gets the value of the input param or its default value if it does not exist.
- apply(int, int) - Method in class org.apache.spark.mllib.linalg.DenseMatrix
-
- apply(int) - Method in class org.apache.spark.mllib.linalg.DenseVector
-
- apply(int, int) - Method in interface org.apache.spark.mllib.linalg.Matrix
-
Gets the (i, j)-th element.
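For a dense matrix backed by a flat array, element lookup reduces to index arithmetic. The sketch below assumes column-major storage, which is an assumption of this example and is not stated in this index; the class and method names are hypothetical.

```java
// Plain-Java sketch of (i, j) lookup in a flat array (no Spark dependency).
// Assumption: values are stored column by column (column-major order).
class ColumnMajorSketch {
    static double get(double[] values, int numRows, int i, int j) {
        return values[i + numRows * j];
    }

    public static void main(String[] args) {
        // A 2 x 3 matrix laid out column by column:
        // 1.0 3.0 5.0
        // 2.0 4.0 6.0
        double[] values = {1.0, 2.0, 3.0, 4.0, 5.0, 6.0};
        System.out.println(get(values, 2, 0, 2)); // prints 5.0
        System.out.println(get(values, 2, 1, 0)); // prints 2.0
    }
}
```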
- apply(int, int) - Method in class org.apache.spark.mllib.linalg.SparseMatrix
-
- apply(int) - Method in interface org.apache.spark.mllib.linalg.Vector
-
Gets the value of the ith element.
- apply(int, Predict, double, boolean) - Static method in class org.apache.spark.mllib.tree.model.Node
-
Construct a node with nodeIndex, predict, impurity and isLeaf parameters.
- apply(String) - Static method in class org.apache.spark.rdd.PartitionGroup
-
- apply(long, String, Option<String>, String) - Static method in class org.apache.spark.scheduler.AccumulableInfo
-
- apply(long, String, String) - Static method in class org.apache.spark.scheduler.AccumulableInfo
-
- apply(long, TaskMetrics) - Static method in class org.apache.spark.scheduler.RuntimePercentage
-
- apply(Object) - Method in class org.apache.spark.sql.Column
-
Extracts a value or values from a complex type.
- apply(String) - Method in class org.apache.spark.sql.DataFrame
-
Selects a column based on the column name and returns it as a `Column`.
- apply(DataFrame, Seq<Expression>, GroupedData.GroupType) - Static method in class org.apache.spark.sql.GroupedData
-
- apply(int) - Method in interface org.apache.spark.sql.Row
-
Returns the value at position i.
- apply(DataType) - Static method in class org.apache.spark.sql.types.ArrayType
-
Construct an `ArrayType` object with the given element type.
- apply(double) - Static method in class org.apache.spark.sql.types.Decimal
-
- apply(long) - Static method in class org.apache.spark.sql.types.Decimal
-
- apply(int) - Static method in class org.apache.spark.sql.types.Decimal
-
- apply(BigDecimal) - Static method in class org.apache.spark.sql.types.Decimal
-
- apply(BigDecimal) - Static method in class org.apache.spark.sql.types.Decimal
-
- apply(BigDecimal, int, int) - Static method in class org.apache.spark.sql.types.Decimal
-
- apply(long, int, int) - Static method in class org.apache.spark.sql.types.Decimal
-
- apply(String) - Static method in class org.apache.spark.sql.types.Decimal
-
- apply() - Static method in class org.apache.spark.sql.types.DecimalType
-
- apply(int, int) - Static method in class org.apache.spark.sql.types.DecimalType
-
- apply(DataType, DataType) - Static method in class org.apache.spark.sql.types.MapType
-
Construct a `MapType` object with the given key type and value type.
- apply(String) - Method in class org.apache.spark.sql.types.StructType
-
- apply(Set<String>) - Method in class org.apache.spark.sql.types.StructType
-
Returns a `StructType` containing `StructField`s of the given names, preserving the
original order of fields.
- apply(int) - Method in class org.apache.spark.sql.types.StructType
-
- apply(String) - Static method in class org.apache.spark.sql.types.UTF8String
-
Create a UTF-8 String from String
- apply(byte[]) - Static method in class org.apache.spark.sql.types.UTF8String
-
Create a UTF-8 String from Array[Byte], which should be encoded in UTF-8
- apply(Seq<Column>) - Method in class org.apache.spark.sql.UserDefinedFunction
-
- apply(String) - Static method in class org.apache.spark.storage.BlockId
-
Converts a BlockId "name" String back into a BlockId.
- apply(String, String, int) - Static method in class org.apache.spark.storage.BlockManagerId
-
- apply(ObjectInput) - Static method in class org.apache.spark.storage.BlockManagerId
-
- apply(boolean, boolean, boolean, boolean, int) - Static method in class org.apache.spark.storage.StorageLevel
-
:: DeveloperApi ::
Create a new StorageLevel object without setting useOffHeap.
- apply(boolean, boolean, boolean, int) - Static method in class org.apache.spark.storage.StorageLevel
-
:: DeveloperApi ::
Create a new StorageLevel object.
- apply(int, int) - Static method in class org.apache.spark.storage.StorageLevel
-
:: DeveloperApi ::
Create a new StorageLevel object from its integer representation.
- apply(ObjectInput) - Static method in class org.apache.spark.storage.StorageLevel
-
:: DeveloperApi ::
Read StorageLevel object from ObjectInput stream.
- apply(String, int) - Static method in class org.apache.spark.streaming.kafka.Broker
-
- apply(String, int, long, long) - Static method in class org.apache.spark.streaming.kafka.OffsetRange
-
- apply(TopicAndPartition, long, long) - Static method in class org.apache.spark.streaming.kafka.OffsetRange
-
- apply(long) - Static method in class org.apache.spark.streaming.Milliseconds
-
- apply(long) - Static method in class org.apache.spark.streaming.Minutes
-
- apply(long) - Static method in class org.apache.spark.streaming.Seconds
-
- apply(TraversableOnce<Object>) - Static method in class org.apache.spark.util.StatCounter
-
Build a StatCounter from a list of values.
- apply(Seq<Object>) - Static method in class org.apache.spark.util.StatCounter
-
Build a StatCounter from a list of values passed as variable-length arguments.
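A counter of this kind can be built in a single pass over the values. The following plain-Java sketch uses Welford's online update to track count, mean, and population variance. It is a stand-in chosen for illustration: the class, its fields, and its methods are hypothetical and do not reproduce the `StatCounter` API.

```java
// Plain-Java sketch of a one-pass statistics counter (no Spark dependency).
class StatCounterSketch {
    long count = 0;
    double mean = 0.0;
    double m2 = 0.0; // running sum of squared deviations from the mean

    // Folds one value into the running statistics (Welford's update).
    StatCounterSketch merge(double value) {
        count++;
        double delta = value - mean;
        mean += delta / count;
        m2 += delta * (value - mean);
        return this;
    }

    // Population variance; NaN when no values have been merged.
    double variance() {
        return count == 0 ? Double.NaN : m2 / count;
    }

    // Builds a counter from values passed as variable-length arguments.
    static StatCounterSketch of(double... values) {
        StatCounterSketch stats = new StatCounterSketch();
        for (double value : values) {
            stats.merge(value);
        }
        return stats;
    }

    public static void main(String[] args) {
        StatCounterSketch stats = of(1.0, 2.0, 3.0, 4.0);
        System.out.println(stats.count + " " + stats.mean + " " + stats.variance());
        // prints 4 2.5 1.25
    }
}
```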
- apply(int) - Method in class org.apache.spark.util.Vector
-
- applySchema(RDD<Row>, StructType) - Method in class org.apache.spark.sql.SQLContext
-
- applySchema(JavaRDD<Row>, StructType) - Method in class org.apache.spark.sql.SQLContext
-
- applySchema(RDD<?>, Class<?>) - Method in class org.apache.spark.sql.SQLContext
-
- applySchema(JavaRDD<?>, Class<?>) - Method in class org.apache.spark.sql.SQLContext
-
- appName() - Method in class org.apache.spark.api.java.JavaSparkContext
-
- appName() - Method in class org.apache.spark.scheduler.SparkListenerApplicationStart
-
- appName() - Method in class org.apache.spark.SparkContext
-
- approxCountDistinct(Column) - Static method in class org.apache.spark.sql.functions
-
Aggregate function: returns the approximate number of distinct items in a group.
- approxCountDistinct(String) - Static method in class org.apache.spark.sql.functions
-
Aggregate function: returns the approximate number of distinct items in a group.
- approxCountDistinct(Column, double) - Static method in class org.apache.spark.sql.functions
-
Aggregate function: returns the approximate number of distinct items in a group.
- approxCountDistinct(String, double) - Static method in class org.apache.spark.sql.functions
-
Aggregate function: returns the approximate number of distinct items in a group.
- ApproxHist() - Static method in class org.apache.spark.mllib.tree.configuration.QuantileStrategy
-
- areaUnderPR() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
-
Computes the area under the precision-recall curve.
- areaUnderROC() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
-
Computes the area under the receiver operating characteristic (ROC) curve.
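One common way to compute the area under a curve given as a sequence of points is the trapezoidal rule. The sketch below applies it to arrays of x and y coordinates sorted by x. It is an illustration under that assumption; this index does not say which numerical method these metrics use, and the class name is hypothetical.

```java
// Plain-Java sketch of the trapezoidal rule (no Spark dependency).
class AreaUnderCurveSketch {
    // x must be sorted in ascending order and have the same length as y.
    static double trapezoidArea(double[] x, double[] y) {
        double area = 0.0;
        for (int i = 1; i < x.length; i++) {
            // Area of the trapezoid between consecutive points.
            area += (x[i] - x[i - 1]) * (y[i] + y[i - 1]) / 2.0;
        }
        return area;
    }

    public static void main(String[] args) {
        // A small ROC-like curve: (0, 0) -> (0.5, 1) -> (1, 1)
        double[] falsePositiveRate = {0.0, 0.5, 1.0};
        double[] truePositiveRate = {0.0, 1.0, 1.0};
        System.out.println(trapezoidArea(falsePositiveRate, truePositiveRate)); // prints 0.75
    }
}
```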
- arr() - Method in class org.apache.spark.rdd.PartitionGroup
-
- array(DataType) - Method in class org.apache.spark.sql.ColumnName
-
Creates a new `StructField` of type array.
- array(Column...) - Static method in class org.apache.spark.sql.functions
-
Creates a new array column.
- array(Seq<Column>) - Static method in class org.apache.spark.sql.functions
-
Creates a new array column.
- array(String, Seq<String>) - Static method in class org.apache.spark.sql.functions
-
Creates a new array column.
- ArrayType - Class in org.apache.spark.sql.types
-
- ArrayType(DataType, boolean) - Constructor for class org.apache.spark.sql.types.ArrayType
-
- as(String) - Method in class org.apache.spark.sql.Column
-
Gives the column an alias.
- as(Seq<String>) - Method in class org.apache.spark.sql.Column
-
(Scala-specific) Assigns the given aliases to the results of a table generating function.
- as(String[]) - Method in class org.apache.spark.sql.Column
-
Assigns the given aliases to the results of a table generating function.
- as(Symbol) - Method in class org.apache.spark.sql.Column
-
Gives the column an alias.
- as(String, Metadata) - Method in class org.apache.spark.sql.Column
-
Gives the column an alias with metadata.
- as(String) - Method in class org.apache.spark.sql.DataFrame
-
- as(Symbol) - Method in class org.apache.spark.sql.DataFrame
-
(Scala-specific) Returns a new DataFrame with an alias set.
- asc() - Method in class org.apache.spark.sql.Column
-
Returns an ordering used in sorting.
- asc(String) - Static method in class org.apache.spark.sql.functions
-
Returns a sort expression based on ascending order of the column.
- asin(Column) - Static method in class org.apache.spark.sql.functions
-
Computes the sine inverse of the given value; the returned angle is in the range
-pi/2 through pi/2.
- asin(String) - Static method in class org.apache.spark.sql.functions
-
Computes the sine inverse of the given column; the returned angle is in the range
-pi/2 through pi/2.
- asIntegral() - Method in class org.apache.spark.sql.types.DecimalType
-
- asIntegral() - Method in class org.apache.spark.sql.types.DoubleType
-
- asIntegral() - Method in class org.apache.spark.sql.types.FloatType
-
- asIterator() - Method in class org.apache.spark.serializer.DeserializationStream
-
Read the elements of this stream through an iterator.
- asJavaPairRDD() - Method in class org.apache.spark.api.r.PairwiseRRDD
-
- asJavaRDD() - Method in class org.apache.spark.api.r.RRDD
-
- asJavaRDD() - Method in class org.apache.spark.api.r.StringRRDD
-
- asKeyValueIterator() - Method in class org.apache.spark.serializer.DeserializationStream
-
Read the elements of this stream through an iterator over key-value pairs.
- AskPermissionToCommitOutput - Class in org.apache.spark.scheduler
-
- AskPermissionToCommitOutput(int, long, long) - Constructor for class org.apache.spark.scheduler.AskPermissionToCommitOutput
-
- askTimeout(SparkConf) - Static method in class org.apache.spark.util.RpcUtils
-
Returns the default Spark timeout to use for RPC ask operations.
- asRDDId() - Method in class org.apache.spark.storage.BlockId
-
- assignments() - Method in class org.apache.spark.mllib.clustering.PowerIterationClusteringModel
-
- AsyncRDDActions<T> - Class in org.apache.spark.rdd
-
A set of asynchronous RDD actions available through an implicit conversion.
- AsyncRDDActions(RDD<T>, ClassTag<T>) - Constructor for class org.apache.spark.rdd.AsyncRDDActions
-
- atan(Column) - Static method in class org.apache.spark.sql.functions
-
Computes the tangent inverse of the given value.
- atan(String) - Static method in class org.apache.spark.sql.functions
-
Computes the tangent inverse of the given column.
- atan2(Column, Column) - Static method in class org.apache.spark.sql.functions
-
Returns the angle theta from the conversion of rectangular coordinates (x, y) to
polar coordinates (r, theta).
- atan2(Column, String) - Static method in class org.apache.spark.sql.functions
-
Returns the angle theta from the conversion of rectangular coordinates (x, y) to
polar coordinates (r, theta).
- atan2(String, Column) - Static method in class org.apache.spark.sql.functions
-
Returns the angle theta from the conversion of rectangular coordinates (x, y) to
polar coordinates (r, theta).
- atan2(String, String) - Static method in class org.apache.spark.sql.functions
-
Returns the angle theta from the conversion of rectangular coordinates (x, y) to
polar coordinates (r, theta).
- atan2(Column, double) - Static method in class org.apache.spark.sql.functions
-
Returns the angle theta from the conversion of rectangular coordinates (x, y) to
polar coordinates (r, theta).
- atan2(String, double) - Static method in class org.apache.spark.sql.functions
-
Returns the angle theta from the conversion of rectangular coordinates (x, y) to
polar coordinates (r, theta).
- atan2(double, Column) - Static method in class org.apache.spark.sql.functions
-
Returns the angle theta from the conversion of rectangular coordinates (x, y) to
polar coordinates (r, theta).
- atan2(double, String) - Static method in class org.apache.spark.sql.functions
-
Returns the angle theta from the conversion of rectangular coordinates (x, y) to
polar coordinates (r, theta).
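The rectangular-to-polar conversion described for `atan2` can be sketched with plain `java.lang.Math`. This is a minimal illustration outside Spark: the class `Atan2Sketch` and method `toPolar` are hypothetical names. Note that `Math.atan2` takes the y coordinate first.

```java
// Hypothetical standalone sketch; uses java.lang.Math rather than Spark columns.
class Atan2Sketch {
    /** Converts rectangular (x, y) to polar {r, theta}, with theta in [-pi, pi]. */
    static double[] toPolar(double x, double y) {
        double r = Math.hypot(x, y);     // distance from the origin
        double theta = Math.atan2(y, x); // angle; note the (y, x) argument order
        return new double[] {r, theta};
    }

    public static void main(String[] args) {
        double[] p = toPolar(1.0, 1.0);
        System.out.println("r = " + p[0] + ", theta = " + p[1]);
    }
}
```

Unlike `atan(y / x)`, the two-argument form distinguishes all four quadrants and handles `x == 0` without a division.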
- attempt() - Method in class org.apache.spark.scheduler.TaskInfo
-
- attempt() - Method in class org.apache.spark.status.api.v1.TaskData
-
- attemptId() - Method in class org.apache.spark.scheduler.StageInfo
-
- attemptId() - Method in class org.apache.spark.status.api.v1.ApplicationAttemptInfo
-
- attemptId() - Method in class org.apache.spark.status.api.v1.StageData
-
- attemptID() - Method in class org.apache.spark.TaskCommitDenied
-
- attemptId() - Method in class org.apache.spark.TaskContext
-
- attemptNumber() - Method in class org.apache.spark.TaskContext
-
How many times this task has been attempted.
- attempts() - Method in class org.apache.spark.status.api.v1.ApplicationInfo
-
- attr() - Method in class org.apache.spark.graphx.Edge
-
- attr() - Method in class org.apache.spark.graphx.EdgeContext
-
The attribute associated with the edge.
- attr() - Method in class org.apache.spark.graphx.impl.AggregatingEdgeContext
-
- Attribute - Class in org.apache.spark.ml.attribute
-
:: DeveloperApi ::
Abstract class for ML attributes.
- Attribute() - Constructor for class org.apache.spark.ml.attribute.Attribute
-
- attribute() - Method in class org.apache.spark.sql.sources.EqualTo
-
- attribute() - Method in class org.apache.spark.sql.sources.GreaterThan
-
- attribute() - Method in class org.apache.spark.sql.sources.GreaterThanOrEqual
-
- attribute() - Method in class org.apache.spark.sql.sources.In
-
- attribute() - Method in class org.apache.spark.sql.sources.IsNotNull
-
- attribute() - Method in class org.apache.spark.sql.sources.IsNull
-
- attribute() - Method in class org.apache.spark.sql.sources.LessThan
-
- attribute() - Method in class org.apache.spark.sql.sources.LessThanOrEqual
-
- attribute() - Method in class org.apache.spark.sql.sources.StringContains
-
- attribute() - Method in class org.apache.spark.sql.sources.StringEndsWith
-
- attribute() - Method in class org.apache.spark.sql.sources.StringStartsWith
-
- AttributeGroup - Class in org.apache.spark.ml.attribute
-
:: DeveloperApi ::
Attributes that describe a vector ML column.
- AttributeGroup(String) - Constructor for class org.apache.spark.ml.attribute.AttributeGroup
-
Creates an attribute group without attribute info.
- AttributeGroup(String, int) - Constructor for class org.apache.spark.ml.attribute.AttributeGroup
-
Creates an attribute group knowing only the number of attributes.
- AttributeGroup(String, Attribute[]) - Constructor for class org.apache.spark.ml.attribute.AttributeGroup
-
Creates an attribute group with attributes.
- attributes() - Method in class org.apache.spark.ml.attribute.AttributeGroup
-
Optional array of attributes.
- AttributeType - Class in org.apache.spark.ml.attribute
-
:: DeveloperApi ::
An enum-like type for attribute types: AttributeType$.Numeric, AttributeType$.Nominal,
and AttributeType$.Binary.
- AttributeType(String) - Constructor for class org.apache.spark.ml.attribute.AttributeType
-
- attrType() - Method in class org.apache.spark.ml.attribute.Attribute
-
Attribute type.
- attrType() - Method in class org.apache.spark.ml.attribute.BinaryAttribute
-
- attrType() - Method in class org.apache.spark.ml.attribute.NominalAttribute
-
- attrType() - Method in class org.apache.spark.ml.attribute.NumericAttribute
-
- attrType() - Static method in class org.apache.spark.ml.attribute.UnresolvedAttribute
-
- avg(Column) - Static method in class org.apache.spark.sql.functions
-
Aggregate function: returns the average of the values in a group.
- avg(String) - Static method in class org.apache.spark.sql.functions
-
Aggregate function: returns the average of the values in a group.
- avg(String...) - Method in class org.apache.spark.sql.GroupedData
-
Compute the mean value of each numeric column for each group.
- avg(Seq<String>) - Method in class org.apache.spark.sql.GroupedData
-
Compute the mean value of each numeric column for each group.
- awaitTermination() - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Wait for the execution to stop.
- awaitTermination(long) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Deprecated.
As of 1.3.0, replaced by awaitTerminationOrTimeout(Long).
- awaitTermination() - Method in class org.apache.spark.streaming.StreamingContext
-
Wait for the execution to stop.
- awaitTermination(long) - Method in class org.apache.spark.streaming.StreamingContext
-
Deprecated.
As of 1.3.0, replaced by awaitTerminationOrTimeout(Long).
- awaitTerminationOrTimeout(long) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Wait for the execution to stop.
- awaitTerminationOrTimeout(long) - Method in class org.apache.spark.streaming.StreamingContext
-
Wait for the execution to stop.
- cache() - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
Persist this RDD with the default storage level (`MEMORY_ONLY`).
- cache() - Method in class org.apache.spark.api.java.JavaPairRDD
-
Persist this RDD with the default storage level (`MEMORY_ONLY`).
- cache() - Method in class org.apache.spark.api.java.JavaRDD
-
Persist this RDD with the default storage level (`MEMORY_ONLY`).
- cache() - Method in class org.apache.spark.graphx.Graph
-
Caches the vertices and edges associated with this graph at the previously-specified target
storage levels, which default to `MEMORY_ONLY`.
- cache() - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
-
Persists the edge partitions using `targetStorageLevel`, which defaults to MEMORY_ONLY.
- cache() - Method in class org.apache.spark.graphx.impl.GraphImpl
-
- cache() - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
-
Persists the vertex partitions at `targetStorageLevel`, which defaults to MEMORY_ONLY.
- cache() - Method in class org.apache.spark.mllib.linalg.distributed.BlockMatrix
-
Caches the underlying RDD.
- cache() - Method in class org.apache.spark.rdd.RDD
-
Persist this RDD with the default storage level (`MEMORY_ONLY`).
- cache() - Method in class org.apache.spark.sql.DataFrame
-
- cache() - Method in class org.apache.spark.streaming.api.java.JavaDStream
-
Persist RDDs of this DStream with the default storage level (MEMORY_ONLY_SER)
- cache() - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Persist RDDs of this DStream with the default storage level (MEMORY_ONLY_SER)
- cache() - Method in class org.apache.spark.streaming.dstream.DStream
-
Persist RDDs of this DStream with the default storage level (MEMORY_ONLY_SER)
- cacheManager() - Method in class org.apache.spark.SparkEnv
-
- cacheTable(String) - Method in class org.apache.spark.sql.SQLContext
-
Caches the specified table in-memory.
- calculate(DenseVector<Object>) - Method in class org.apache.spark.ml.classification.LogisticCostFun
-
- calculate(DenseVector<Object>) - Method in class org.apache.spark.ml.regression.LeastSquaresCostFun
-
- calculate(double[], double) - Static method in class org.apache.spark.mllib.tree.impurity.Entropy
-
:: DeveloperApi ::
information calculation for multiclass classification
- calculate(double, double, double) - Static method in class org.apache.spark.mllib.tree.impurity.Entropy
-
:: DeveloperApi ::
variance calculation
- calculate(double[], double) - Static method in class org.apache.spark.mllib.tree.impurity.Gini
-
:: DeveloperApi ::
information calculation for multiclass classification
- calculate(double, double, double) - Static method in class org.apache.spark.mllib.tree.impurity.Gini
-
:: DeveloperApi ::
variance calculation
- calculate(double[], double) - Method in interface org.apache.spark.mllib.tree.impurity.Impurity
-
:: DeveloperApi ::
information calculation for multiclass classification
- calculate(double, double, double) - Method in interface org.apache.spark.mllib.tree.impurity.Impurity
-
:: DeveloperApi ::
information calculation for regression
- calculate(double[], double) - Static method in class org.apache.spark.mllib.tree.impurity.Variance
-
:: DeveloperApi ::
information calculation for multiclass classification
- calculate(double, double, double) - Static method in class org.apache.spark.mllib.tree.impurity.Variance
-
:: DeveloperApi ::
variance calculation
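The impurity calculations listed above can be illustrated with the textbook definitions of Gini impurity, entropy, and variance. The sketch below is an assumption-laden illustration rather than Spark's code: the class `ImpuritySketch` is hypothetical, and the parameter meanings (per-class counts plus a total for the two-argument form; count, sum, and sum of squares for the three-argument form) are assumed, since the index gives only the types.

```java
// Hypothetical standalone sketch using textbook definitions; not Spark's implementation.
class ImpuritySketch {
    /** Gini impurity: 1 - sum(p_i^2), where p_i = counts[i] / totalCount. */
    static double gini(double[] counts, double totalCount) {
        if (totalCount == 0) return 0.0;
        double impurity = 1.0;
        for (double c : counts) {
            double p = c / totalCount;
            impurity -= p * p;
        }
        return impurity;
    }

    /** Entropy in bits: -sum(p_i * log2(p_i)), skipping empty classes. */
    static double entropy(double[] counts, double totalCount) {
        if (totalCount == 0) return 0.0;
        double h = 0.0;
        for (double c : counts) {
            if (c > 0) {
                double p = c / totalCount;
                h -= p * Math.log(p) / Math.log(2.0);
            }
        }
        return h;
    }

    /** Population variance from sufficient statistics: E[x^2] - (E[x])^2. */
    static double variance(double count, double sum, double sumSquares) {
        if (count == 0) return 0.0;
        double mean = sum / count;
        return sumSquares / count - mean * mean;
    }

    public static void main(String[] args) {
        double[] counts = {5.0, 5.0}; // two equally frequent classes
        System.out.println("gini = " + gini(counts, 10.0));
        System.out.println("entropy = " + entropy(counts, 10.0));
        System.out.println("variance = " + variance(4.0, 10.0, 30.0));
    }
}
```

Working from counts and running sums lets a tree learner evaluate candidate splits from aggregated statistics without revisiting individual rows.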
- call(T) - Method in interface org.apache.spark.api.java.function.DoubleFlatMapFunction
-
- call(T) - Method in interface org.apache.spark.api.java.function.DoubleFunction
-
- call(T) - Method in interface org.apache.spark.api.java.function.FlatMapFunction
-
- call(T1, T2) - Method in interface org.apache.spark.api.java.function.FlatMapFunction2
-
- call(T1) - Method in interface org.apache.spark.api.java.function.Function
-
- call() - Method in interface org.apache.spark.api.java.function.Function0
-
- call(T1, T2) - Method in interface org.apache.spark.api.java.function.Function2
-
- call(T1, T2, T3) - Method in interface org.apache.spark.api.java.function.Function3
-
- call(T) - Method in interface org.apache.spark.api.java.function.PairFlatMapFunction
-
- call(T) - Method in interface org.apache.spark.api.java.function.PairFunction
-
- call(T) - Method in interface org.apache.spark.api.java.function.VoidFunction
-
- call(T1) - Method in interface org.apache.spark.sql.api.java.UDF1
-
- call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10) - Method in interface org.apache.spark.sql.api.java.UDF10
-
- call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11) - Method in interface org.apache.spark.sql.api.java.UDF11
-
- call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12) - Method in interface org.apache.spark.sql.api.java.UDF12
-
- call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13) - Method in interface org.apache.spark.sql.api.java.UDF13
-
- call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14) - Method in interface org.apache.spark.sql.api.java.UDF14
-
- call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15) - Method in interface org.apache.spark.sql.api.java.UDF15
-
- call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15, T16) - Method in interface org.apache.spark.sql.api.java.UDF16
-
- call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15, T16, T17) - Method in interface org.apache.spark.sql.api.java.UDF17
-
- call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15, T16, T17, T18) - Method in interface org.apache.spark.sql.api.java.UDF18
-
- call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15, T16, T17, T18, T19) - Method in interface org.apache.spark.sql.api.java.UDF19
-
- call(T1, T2) - Method in interface org.apache.spark.sql.api.java.UDF2
-
- call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15, T16, T17, T18, T19, T20) - Method in interface org.apache.spark.sql.api.java.UDF20
-
- call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15, T16, T17, T18, T19, T20, T21) - Method in interface org.apache.spark.sql.api.java.UDF21
-
- call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15, T16, T17, T18, T19, T20, T21, T22) - Method in interface org.apache.spark.sql.api.java.UDF22
-
- call(T1, T2, T3) - Method in interface org.apache.spark.sql.api.java.UDF3
-
- call(T1, T2, T3, T4) - Method in interface org.apache.spark.sql.api.java.UDF4
-
- call(T1, T2, T3, T4, T5) - Method in interface org.apache.spark.sql.api.java.UDF5
-
- call(T1, T2, T3, T4, T5, T6) - Method in interface org.apache.spark.sql.api.java.UDF6
-
- call(T1, T2, T3, T4, T5, T6, T7) - Method in interface org.apache.spark.sql.api.java.UDF7
-
- call(T1, T2, T3, T4, T5, T6, T7, T8) - Method in interface org.apache.spark.sql.api.java.UDF8
-
- call(T1, T2, T3, T4, T5, T6, T7, T8, T9) - Method in interface org.apache.spark.sql.api.java.UDF9
-
- callUDF(Function0<?>, DataType) - Static method in class org.apache.spark.sql.functions
-
Call a Scala function of 0 arguments as user-defined function (UDF).
- callUDF(Function1<?, ?>, DataType, Column) - Static method in class org.apache.spark.sql.functions
-
Call a Scala function of 1 argument as user-defined function (UDF).
- callUDF(Function2<?, ?, ?>, DataType, Column, Column) - Static method in class org.apache.spark.sql.functions
-
Call a Scala function of 2 arguments as user-defined function (UDF).
- callUDF(Function3<?, ?, ?, ?>, DataType, Column, Column, Column) - Static method in class org.apache.spark.sql.functions
-
Call a Scala function of 3 arguments as user-defined function (UDF).
- callUDF(Function4<?, ?, ?, ?, ?>, DataType, Column, Column, Column, Column) - Static method in class org.apache.spark.sql.functions
-
Call a Scala function of 4 arguments as user-defined function (UDF).
- callUDF(Function5<?, ?, ?, ?, ?, ?>, DataType, Column, Column, Column, Column, Column) - Static method in class org.apache.spark.sql.functions
-
Call a Scala function of 5 arguments as user-defined function (UDF).
- callUDF(Function6<?, ?, ?, ?, ?, ?, ?>, DataType, Column, Column, Column, Column, Column, Column) - Static method in class org.apache.spark.sql.functions
-
Call a Scala function of 6 arguments as user-defined function (UDF).
- callUDF(Function7<?, ?, ?, ?, ?, ?, ?, ?>, DataType, Column, Column, Column, Column, Column, Column, Column) - Static method in class org.apache.spark.sql.functions
-
Call a Scala function of 7 arguments as user-defined function (UDF).
- callUDF(Function8<?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType, Column, Column, Column, Column, Column, Column, Column, Column) - Static method in class org.apache.spark.sql.functions
-
Call a Scala function of 8 arguments as user-defined function (UDF).
- callUDF(Function9<?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType, Column, Column, Column, Column, Column, Column, Column, Column, Column) - Static method in class org.apache.spark.sql.functions
-
Call a Scala function of 9 arguments as user-defined function (UDF).
- callUDF(Function10<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType, Column, Column, Column, Column, Column, Column, Column, Column, Column, Column) - Static method in class org.apache.spark.sql.functions
-
Call a Scala function of 10 arguments as user-defined function (UDF).
- callUdf(String, Seq<Column>) - Static method in class org.apache.spark.sql.functions
-
Call a user-defined function.
- cancel() - Method in class org.apache.spark.ComplexFutureAction
-
- cancel() - Method in interface org.apache.spark.FutureAction
-
Cancels the execution of this action.
- cancel() - Method in class org.apache.spark.SimpleFutureAction
-
- cancelAllJobs() - Method in class org.apache.spark.api.java.JavaSparkContext
-
Cancel all jobs that have been scheduled or are running.
- cancelAllJobs() - Method in class org.apache.spark.SparkContext
-
Cancel all jobs that have been scheduled or are running.
- cancelJobGroup(String) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Cancel active jobs for the specified group.
- cancelJobGroup(String) - Method in class org.apache.spark.SparkContext
-
Cancel active jobs for the specified group.
- canEqual(Object) - Method in class org.apache.spark.scheduler.cluster.ExecutorInfo
-
- canEqual(Object) - Method in class org.apache.spark.util.MutablePair
-
- canHandle(String) - Method in class org.apache.spark.sql.jdbc.AggregatedDialect
-
- canHandle(String) - Method in class org.apache.spark.sql.jdbc.JdbcDialect
-
Check if this dialect instance can handle a certain JDBC URL.
- canHandle(String) - Static method in class org.apache.spark.sql.jdbc.MySQLDialect
-
- canHandle(String) - Static method in class org.apache.spark.sql.jdbc.NoopDialect
-
- canHandle(String) - Static method in class org.apache.spark.sql.jdbc.PostgresDialect
-
- cartesian(JavaRDDLike<U, ?>) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return the Cartesian product of this RDD and another one, that is, the RDD of all pairs of
elements (a, b) where a is in `this` and b is in `other`.
- cartesian(RDD<U>, ClassTag<U>) - Method in class org.apache.spark.rdd.RDD
-
Return the Cartesian product of this RDD and another one, that is, the RDD of all pairs of
elements (a, b) where a is in `this` and b is in `other`.
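The Cartesian product described for `cartesian` can be pictured with ordinary in-memory lists. This is a local sketch, not the distributed RDD operation: the class `CartesianSketch` and its `cartesian` helper are hypothetical.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

// Hypothetical standalone sketch over java.util lists; not the RDD API.
class CartesianSketch {
    /** Returns every pair (a, b) with a taken from left and b taken from right. */
    static <T, U> List<Map.Entry<T, U>> cartesian(List<T> left, List<U> right) {
        List<Map.Entry<T, U>> pairs = new ArrayList<>(left.size() * right.size());
        for (T a : left) {
            for (U b : right) {
                pairs.add(new AbstractMap.SimpleImmutableEntry<>(a, b));
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        List<Map.Entry<Integer, String>> pairs =
            cartesian(Arrays.asList(1, 2), Arrays.asList("x", "y", "z"));
        System.out.println(pairs.size() + " pairs: " + pairs);
    }
}
```

The result has `left.size() * right.size()` elements, so the product of two large collections grows quickly.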
- cast(DataType) - Method in class org.apache.spark.sql.Column
-
Casts the column to a different data type.
- cast(String) - Method in class org.apache.spark.sql.Column
-
Casts the column to a different data type, using the canonical string representation
of the type.
- CatalystScan - Interface in org.apache.spark.sql.sources
-
:: Experimental ::
An interface for experimenting with a more direct connection to the query planner.
- Categorical() - Static method in class org.apache.spark.mllib.tree.configuration.FeatureType
-
- categoricalFeaturesInfo() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- CategoricalSplit - Class in org.apache.spark.ml.tree
-
:: DeveloperApi ::
Split which tests a categorical feature.
- categories() - Method in class org.apache.spark.mllib.tree.model.Split
-
- categoryMaps() - Method in class org.apache.spark.ml.feature.VectorIndexerModel
-
- cbrt(Column) - Static method in class org.apache.spark.sql.functions
-
Computes the cube-root of the given value.
- cbrt(String) - Static method in class org.apache.spark.sql.functions
-
Computes the cube-root of the given column.
- ceil(Column) - Static method in class org.apache.spark.sql.functions
-
Computes the ceiling of the given value.
- ceil(String) - Static method in class org.apache.spark.sql.functions
-
Computes the ceiling of the given column.
- changePrecision(int, int) - Method in class org.apache.spark.sql.types.Decimal
-
Update precision and scale while keeping our value the same, and return true if successful.
- checkpoint() - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Mark this RDD for checkpointing.
- checkpoint() - Method in class org.apache.spark.graphx.Graph
-
Mark this Graph for checkpointing.
- checkpoint() - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
-
- checkpoint() - Method in class org.apache.spark.graphx.impl.GraphImpl
-
- checkpoint() - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
-
- checkpoint() - Method in class org.apache.spark.rdd.HadoopRDD
-
- checkpoint() - Method in class org.apache.spark.rdd.RDD
-
Mark this RDD for checkpointing.
- checkpoint(Duration) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Enable periodic checkpointing of RDDs of this DStream.
- checkpoint(String) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Sets the context to periodically checkpoint the DStream operations for master
fault-tolerance.
- checkpoint(Duration) - Method in class org.apache.spark.streaming.dstream.DStream
-
Enable periodic checkpointing of RDDs of this DStream.
- checkpoint(String) - Method in class org.apache.spark.streaming.StreamingContext
-
Set the context to periodically checkpoint the DStream operations for driver
fault-tolerance.
- checkpointData() - Method in class org.apache.spark.rdd.RDD
-
- checkpointData() - Method in class org.apache.spark.streaming.dstream.DStream
-
- checkpointDir() - Method in class org.apache.spark.SparkContext
-
- checkpointDir() - Method in class org.apache.spark.streaming.StreamingContext
-
- checkpointDuration() - Method in class org.apache.spark.streaming.dstream.DStream
-
- checkpointDuration() - Method in class org.apache.spark.streaming.StreamingContext
-
- checkpointInterval() - Method in class org.apache.spark.mllib.clustering.EMLDAOptimizer
-
- checkpointInterval() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- checkSplits(double[]) - Static method in class org.apache.spark.ml.feature.Bucketizer
-
We require splits to be of length >= 3 and to be in strictly increasing order.
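The constraint stated for `checkSplits` (length of at least 3, strictly increasing) can be expressed as a small validation routine. The sketch below is a hypothetical standalone version: the class name `SplitsCheckSketch` and the choice to return a `boolean` are assumptions, since the index does not state the return type.

```java
// Hypothetical standalone sketch; not the Bucketizer implementation.
class SplitsCheckSketch {
    /** True if splits has at least 3 entries and each entry exceeds its predecessor. */
    static boolean checkSplits(double[] splits) {
        if (splits.length < 3) {
            return false;
        }
        for (int i = 1; i < splits.length; i++) {
            // Written as a negated '>' so that NaN entries are rejected as well.
            if (!(splits[i] > splits[i - 1])) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(checkSplits(new double[] {0.0, 1.0, 2.0})); // valid
        System.out.println(checkSplits(new double[] {0.0, 1.0, 1.0})); // repeated value
    }
}
```

Three split points are the minimum that define two buckets, which is presumably why shorter arrays are rejected.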
- child() - Method in class org.apache.spark.sql.sources.Not
-
- ChiSqSelector - Class in org.apache.spark.mllib.feature
-
:: Experimental ::
Creates a ChiSquared feature selector.
- ChiSqSelector(int) - Constructor for class org.apache.spark.mllib.feature.ChiSqSelector
-
- ChiSqSelectorModel - Class in org.apache.spark.mllib.feature
-
:: Experimental ::
Chi Squared selector model.
- ChiSqSelectorModel(int[]) - Constructor for class org.apache.spark.mllib.feature.ChiSqSelectorModel
-
- chiSqTest(Vector, Vector) - Static method in class org.apache.spark.mllib.stat.Statistics
-
Conduct Pearson's chi-squared goodness of fit test of the observed data against the
expected distribution.
- chiSqTest(Vector) - Static method in class org.apache.spark.mllib.stat.Statistics
-
Conduct Pearson's chi-squared goodness of fit test of the observed data against the uniform
distribution, with each category having an expected frequency of `1 / observed.size`.
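For the goodness-of-fit variants above, the Pearson statistic is the sum over categories of (observed - expected)^2 / expected. The sketch below computes only that statistic and the degrees of freedom, not a p-value. It is a hypothetical standalone illustration: the class `ChiSqSketch` and its methods are not part of the Spark API, and the expected vector is assumed to hold counts on the same scale as the observed counts.

```java
// Hypothetical standalone sketch of the Pearson statistic; not Spark's implementation.
class ChiSqSketch {
    /** Sum of (o - e)^2 / e over all categories; expected must hold positive counts. */
    static double statistic(double[] observed, double[] expected) {
        if (observed.length != expected.length) {
            throw new IllegalArgumentException("observed and expected differ in length");
        }
        double chi2 = 0.0;
        for (int i = 0; i < observed.length; i++) {
            double diff = observed[i] - expected[i];
            chi2 += diff * diff / expected[i];
        }
        return chi2;
    }

    /** Tests against the uniform distribution: each category expects total / n counts. */
    static double statisticUniform(double[] observed) {
        double total = 0.0;
        for (double o : observed) {
            total += o;
        }
        double[] expected = new double[observed.length];
        java.util.Arrays.fill(expected, total / observed.length);
        return statistic(observed, expected);
    }

    /** Degrees of freedom for a goodness-of-fit test with n categories. */
    static int degreesOfFreedom(double[] observed) {
        return observed.length - 1;
    }

    public static void main(String[] args) {
        double[] observed = {10.0, 20.0, 30.0}; // made-up counts
        System.out.println("chi2 = " + statisticUniform(observed)
            + ", df = " + degreesOfFreedom(observed));
    }
}
```

A relative frequency of `1 / observed.size` per category corresponds to an expected count of `total / observed.size`, which is what `statisticUniform` uses.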
- chiSqTest(Matrix) - Static method in class org.apache.spark.mllib.stat.Statistics
-
Conduct Pearson's independence test on the input contingency matrix, which cannot contain
negative entries or columns or rows that sum up to 0.
- chiSqTest(RDD<LabeledPoint>) - Static method in class org.apache.spark.mllib.stat.Statistics
-
Conduct Pearson's independence test for every feature against the label across the input RDD.
- ChiSqTestResult - Class in org.apache.spark.mllib.stat.test
-
:: Experimental ::
Object containing the test results for the chi-squared hypothesis test.
- Classification() - Static method in class org.apache.spark.mllib.tree.configuration.Algo
-
- ClassificationModel<FeaturesType,M extends ClassificationModel<FeaturesType,M>> - Class in org.apache.spark.ml.classification
-
:: DeveloperApi ::
- ClassificationModel() - Constructor for class org.apache.spark.ml.classification.ClassificationModel
-
- ClassificationModel - Interface in org.apache.spark.mllib.classification
-
:: Experimental ::
Represents a classification model that predicts to which of a set of categories an example
belongs.
- Classifier<FeaturesType,E extends Classifier<FeaturesType,E,M>,M extends ClassificationModel<FeaturesType,M>> - Class in org.apache.spark.ml.classification
-
:: DeveloperApi ::
- Classifier() - Constructor for class org.apache.spark.ml.classification.Classifier
-
- className() - Method in class org.apache.spark.ExceptionFailure
-
- classpathEntries() - Method in class org.apache.spark.ui.env.EnvironmentListener
-
- classTag() - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
- classTag() - Method in class org.apache.spark.api.java.JavaPairRDD
-
- classTag() - Method in class org.apache.spark.api.java.JavaRDD
-
- classTag() - Method in interface org.apache.spark.api.java.JavaRDDLike
-
- classTag() - Method in class org.apache.spark.streaming.api.java.JavaDStream
-
- classTag() - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
- classTag() - Method in class org.apache.spark.streaming.api.java.JavaInputDStream
-
- classTag() - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
- classTag() - Method in class org.apache.spark.streaming.api.java.JavaReceiverInputDStream
-
- clean(long, boolean) - Method in class org.apache.spark.streaming.util.WriteAheadLog
-
Clean all the records that are older than the threshold time.
- CleanAccum - Class in org.apache.spark
-
- CleanAccum(long) - Constructor for class org.apache.spark.CleanAccum
-
- CleanBroadcast - Class in org.apache.spark
-
- CleanBroadcast(long) - Constructor for class org.apache.spark.CleanBroadcast
-
- CleanCheckpoint - Class in org.apache.spark
-
- CleanCheckpoint(int) - Constructor for class org.apache.spark.CleanCheckpoint
-
- CleanRDD - Class in org.apache.spark
-
- CleanRDD(int) - Constructor for class org.apache.spark.CleanRDD
-
- CleanShuffle - Class in org.apache.spark
-
- CleanShuffle(int) - Constructor for class org.apache.spark.CleanShuffle
-
- CleanupTask - Interface in org.apache.spark
-
Classes that represent cleaning tasks.
- CleanupTaskWeakReference - Class in org.apache.spark
-
A WeakReference associated with a CleanupTask.
- CleanupTaskWeakReference(CleanupTask, Object, ReferenceQueue<Object>) - Constructor for class org.apache.spark.CleanupTaskWeakReference
-
- clear(Param<?>) - Method in interface org.apache.spark.ml.param.Params
-
Clears the user-supplied value for the input param.
- clearCache() - Method in class org.apache.spark.sql.SQLContext
-
Removes all cached tables from the in-memory cache.
- clearCallSite() - Method in class org.apache.spark.api.java.JavaSparkContext
-
Pass-through to SparkContext.clearCallSite.
- clearCallSite() - Method in class org.apache.spark.SparkContext
-
Clear the thread-local property for overriding the call sites
of actions and RDDs.
- clearDependencies() - Method in class org.apache.spark.rdd.CoGroupedRDD
-
- clearDependencies() - Method in class org.apache.spark.rdd.ShuffledRDD
-
- clearDependencies() - Method in class org.apache.spark.rdd.UnionRDD
-
- clearFiles() - Method in class org.apache.spark.api.java.JavaSparkContext
-
Clear the job's list of files added by `addFile` so that they do not get downloaded to
any new nodes.
- clearFiles() - Method in class org.apache.spark.SparkContext
-
Clear the job's list of files added by `addFile` so that they do not get downloaded to
any new nodes.
- clearJars() - Method in class org.apache.spark.api.java.JavaSparkContext
-
Clear the job's list of JARs added by `addJar` so that they do not get downloaded to
any new nodes.
- clearJars() - Method in class org.apache.spark.SparkContext
-
Clear the job's list of JARs added by `addJar` so that they do not get downloaded to
any new nodes.
- clearJobGroup() - Method in class org.apache.spark.api.java.JavaSparkContext
-
Clear the current thread's job group ID and its description.
- clearJobGroup() - Method in class org.apache.spark.SparkContext
-
Clear the current thread's job group ID and its description.
- clearThreshold() - Method in class org.apache.spark.mllib.classification.LogisticRegressionModel
-
:: Experimental ::
Clears the threshold so that `predict` will output raw prediction scores.
- clearThreshold() - Method in class org.apache.spark.mllib.classification.SVMModel
-
:: Experimental ::
Clears the threshold so that `predict` will output raw prediction scores.
- clone() - Method in class org.apache.spark.SparkConf
-
Copy this object
- clone() - Method in class org.apache.spark.sql.types.Decimal
-
- clone() - Method in class org.apache.spark.sql.types.UTF8String
-
- clone() - Method in class org.apache.spark.storage.StorageLevel
-
- clone() - Method in class org.apache.spark.util.random.BernoulliCellSampler
-
- clone() - Method in class org.apache.spark.util.random.BernoulliSampler
-
- clone() - Method in class org.apache.spark.util.random.PoissonSampler
-
- clone() - Method in interface org.apache.spark.util.random.RandomSampler
-
Return a copy of the RandomSampler object.
- cloneComplement() - Method in class org.apache.spark.util.random.BernoulliCellSampler
-
Return a sampler that is the complement of the range specified by the current sampler.
- close() - Method in class org.apache.spark.api.java.JavaSparkContext
-
- close() - Method in class org.apache.spark.input.PortableDataStream
-
Close the file (if it is currently open)
- close() - Method in class org.apache.spark.io.SnappyOutputStreamWrapper
-
- close() - Method in class org.apache.spark.serializer.DeserializationStream
-
- close() - Method in class org.apache.spark.serializer.SerializationStream
-
- close() - Method in class org.apache.spark.sql.sources.OutputWriter
-
- close() - Method in class org.apache.spark.storage.TimeTrackingOutputStream
-
- close() - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
- close() - Method in class org.apache.spark.streaming.util.WriteAheadLog
-
Close this log and release any resources.
- closureSerializer() - Method in class org.apache.spark.SparkEnv
-
- cls() - Method in class org.apache.spark.util.MethodIdentifier
-
- cluster() - Method in class org.apache.spark.mllib.clustering.PowerIterationClustering.Assignment
-
- clusterCenters() - Method in class org.apache.spark.mllib.clustering.KMeansModel
-
- clusterCenters() - Method in class org.apache.spark.mllib.clustering.StreamingKMeansModel
-
- clusterWeights() - Method in class org.apache.spark.mllib.clustering.StreamingKMeansModel
-
- cn() - Method in class org.apache.spark.mllib.feature.VocabWord
-
- coalesce(int) - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
Return a new RDD that is reduced into numPartitions partitions.
- coalesce(int, boolean) - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
Return a new RDD that is reduced into numPartitions partitions.
- coalesce(int) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Return a new RDD that is reduced into numPartitions partitions.
- coalesce(int, boolean) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Return a new RDD that is reduced into numPartitions partitions.
- coalesce(int) - Method in class org.apache.spark.api.java.JavaRDD
-
Return a new RDD that is reduced into numPartitions partitions.
- coalesce(int, boolean) - Method in class org.apache.spark.api.java.JavaRDD
-
Return a new RDD that is reduced into numPartitions partitions.
- coalesce(int, boolean, Ordering<T>) - Method in class org.apache.spark.rdd.RDD
-
Return a new RDD that is reduced into numPartitions partitions.
- coalesce(int) - Method in class org.apache.spark.sql.DataFrame
-
Returns a new DataFrame that has exactly numPartitions partitions.
- coalesce(Column...) - Static method in class org.apache.spark.sql.functions
-
Returns the first column that is not null.
- coalesce(Seq<Column>) - Static method in class org.apache.spark.sql.functions
-
Returns the first column that is not null.
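The "first column that is not null" rule can be illustrated on the values of a single row. This is a plain-Java sketch of the semantics only; the Spark function itself operates on Column expressions, and the class and method names below are hypothetical.

```java
// Plain-Java illustration only; it does not use Spark.
class CoalesceSketch {

    // Returns the first argument that is not null, or null if all are null.
    @SafeVarargs
    static <T> T coalesce(T... values) {
        for (T value : values) {
            if (value != null) {
                return value;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Hypothetical row with three nullable string columns.
        System.out.println(coalesce(null, "fallback", "ignored"));
    }
}
```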
- code() - Method in class org.apache.spark.mllib.feature.VocabWord
-
- codeLen() - Method in class org.apache.spark.mllib.feature.VocabWord
-
- cogroup(JavaPairRDD<K, W>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
-
For each key k in this or other, return a resulting RDD that contains a tuple with the
list of values for that key in this as well as other.
- cogroup(JavaPairRDD<K, W1>, JavaPairRDD<K, W2>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
-
For each key k in this or other1 or other2, return a resulting RDD that contains a
tuple with the list of values for that key in this, other1 and other2.
- cogroup(JavaPairRDD<K, W1>, JavaPairRDD<K, W2>, JavaPairRDD<K, W3>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
-
For each key k in this or other1 or other2 or other3,
return a resulting RDD that contains a tuple with the list of values
for that key in this, other1, other2 and other3.
- cogroup(JavaPairRDD<K, W>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
For each key k in this or other, return a resulting RDD that contains a tuple with the
list of values for that key in this as well as other.
- cogroup(JavaPairRDD<K, W1>, JavaPairRDD<K, W2>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
For each key k in this or other1 or other2, return a resulting RDD that contains a
tuple with the list of values for that key in this, other1 and other2.
- cogroup(JavaPairRDD<K, W1>, JavaPairRDD<K, W2>, JavaPairRDD<K, W3>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
For each key k in this or other1 or other2 or other3,
return a resulting RDD that contains a tuple with the list of values
for that key in this, other1, other2 and other3.
- cogroup(JavaPairRDD<K, W>, int) - Method in class org.apache.spark.api.java.JavaPairRDD
-
For each key k in this or other, return a resulting RDD that contains a tuple with the
list of values for that key in this as well as other.
- cogroup(JavaPairRDD<K, W1>, JavaPairRDD<K, W2>, int) - Method in class org.apache.spark.api.java.JavaPairRDD
-
For each key k in this or other1 or other2, return a resulting RDD that contains a
tuple with the list of values for that key in this, other1 and other2.
- cogroup(JavaPairRDD<K, W1>, JavaPairRDD<K, W2>, JavaPairRDD<K, W3>, int) - Method in class org.apache.spark.api.java.JavaPairRDD
-
For each key k in this or other1 or other2 or other3,
return a resulting RDD that contains a tuple with the list of values
for that key in this, other1, other2 and other3.
- cogroup(RDD<Tuple2<K, W1>>, RDD<Tuple2<K, W2>>, RDD<Tuple2<K, W3>>, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
For each key k in this or other1 or other2 or other3,
return a resulting RDD that contains a tuple with the list of values
for that key in this, other1, other2 and other3.
- cogroup(RDD<Tuple2<K, W>>, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
- cogroup(RDD<Tuple2<K, W1>>, RDD<Tuple2<K, W2>>, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
- cogroup(RDD<Tuple2<K, W1>>, RDD<Tuple2<K, W2>>, RDD<Tuple2<K, W3>>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
- cogroup(RDD<Tuple2<K, W>>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
For each key k in this or other, return a resulting RDD that contains a tuple with the
list of values for that key in this as well as other.
- cogroup(RDD<Tuple2<K, W1>>, RDD<Tuple2<K, W2>>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
For each key k in this or other1 or other2, return a resulting RDD that contains a
tuple with the list of values for that key in this, other1 and other2.
- cogroup(RDD<Tuple2<K, W>>, int) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
For each key k in this or other, return a resulting RDD that contains a tuple with the
list of values for that key in this as well as other.
- cogroup(RDD<Tuple2<K, W1>>, RDD<Tuple2<K, W2>>, int) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
For each key k in this or other1 or other2, return a resulting RDD that contains a
tuple with the list of values for that key in this, other1 and other2.
- cogroup(RDD<Tuple2<K, W1>>, RDD<Tuple2<K, W2>>, RDD<Tuple2<K, W3>>, int) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
For each key k in this or other1 or other2 or other3,
return a resulting RDD that contains a tuple with the list of values
for that key in this, other1, other2 and other3.
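The cogroup entries above all describe the same grouping rule: every key found in any input appears once in the result, paired with one list of values per input. A minimal plain-Java sketch of the two-input case follows. It does not use Spark, it works on in-memory lists rather than RDDs, and the names CogroupSketch, cogroup, slot and demo are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java illustration of cogroup semantics; it does not use Spark.
class CogroupSketch {

    // Groups the values of two key-value lists by key. Every key present in
    // either input appears in the result with two lists: the values from
    // `left` first, then the values from `right` (either list may be empty).
    static <K extends Comparable<K>, V, W> Map<K, List<List<Object>>> cogroup(
            List<Map.Entry<K, V>> left, List<Map.Entry<K, W>> right) {
        Map<K, List<List<Object>>> result = new TreeMap<>();
        for (Map.Entry<K, V> e : left) {
            slot(result, e.getKey()).get(0).add(e.getValue());
        }
        for (Map.Entry<K, W> e : right) {
            slot(result, e.getKey()).get(1).add(e.getValue());
        }
        return result;
    }

    // Returns the [leftValues, rightValues] pair for a key, creating it if absent.
    private static <K> List<List<Object>> slot(Map<K, List<List<Object>>> m, K key) {
        return m.computeIfAbsent(key, k -> {
            List<List<Object>> pair = new ArrayList<>();
            pair.add(new ArrayList<>());
            pair.add(new ArrayList<>());
            return pair;
        });
    }

    // Small made-up data set: "a" occurs in both inputs, "b" and "c" in one each.
    static Map<String, List<List<Object>>> demo() {
        List<Map.Entry<String, Integer>> left =
                List.of(Map.entry("a", 1), Map.entry("b", 2), Map.entry("a", 3));
        List<Map.Entry<String, String>> right =
                List.of(Map.entry("a", "x"), Map.entry("c", "y"));
        return cogroup(left, right);
    }

    public static void main(String[] args) {
        // prints {a=[[1, 3], [x]], b=[[2], []], c=[[], [y]]}
        System.out.println(demo());
    }
}
```

Keys missing from one input still appear in the result, with an empty list on that side.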
- cogroup(JavaPairDStream<K, W>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying 'cogroup' between RDDs of this DStream and other DStream.
- cogroup(JavaPairDStream<K, W>, int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying 'cogroup' between RDDs of this DStream and other DStream.
- cogroup(JavaPairDStream<K, W>, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying 'cogroup' between RDDs of this DStream and other DStream.
- cogroup(DStream<Tuple2<K, W>>, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying 'cogroup' between RDDs of this DStream and other DStream.
- cogroup(DStream<Tuple2<K, W>>, int, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying 'cogroup' between RDDs of this DStream and other DStream.
- cogroup(DStream<Tuple2<K, W>>, Partitioner, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying 'cogroup' between RDDs of this DStream and other DStream.
- CoGroupedRDD<K> - Class in org.apache.spark.rdd
-
:: DeveloperApi ::
An RDD that cogroups its parents.
- CoGroupedRDD(Seq<RDD<? extends Product2<K, ?>>>, Partitioner) - Constructor for class org.apache.spark.rdd.CoGroupedRDD
-
- col(String) - Method in class org.apache.spark.sql.DataFrame
-
Selects a column based on the column name and returns it as a Column.
- col(String) - Static method in class org.apache.spark.sql.functions
-
Returns a Column based on the given column name.
- collect() - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return an array that contains all of the elements in this RDD.
- collect() - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
-
- collect() - Method in class org.apache.spark.rdd.RDD
-
Return an array that contains all of the elements in this RDD.
- collect(PartialFunction<T, U>, ClassTag<U>) - Method in class org.apache.spark.rdd.RDD
-
Return an RDD that contains all matching values by applying f.
- collect() - Method in class org.apache.spark.sql.DataFrame
-
Returns an array that contains all of the Rows in this DataFrame.
- collectAsList() - Method in class org.apache.spark.sql.DataFrame
-
Returns a Java list that contains all of the Rows in this DataFrame.
- collectAsMap() - Method in class org.apache.spark.api.java.JavaPairRDD
-
Return the key-value pairs in this RDD to the master as a Map.
- collectAsMap() - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Return the key-value pairs in this RDD to the master as a Map.
- collectAsync() - Method in interface org.apache.spark.api.java.JavaRDDLike
-
The asynchronous version of collect, which returns a future for
retrieving an array containing all of the elements in this RDD.
- collectAsync() - Method in class org.apache.spark.rdd.AsyncRDDActions
-
Returns a future for retrieving all elements of this RDD.
- collectEdges(EdgeDirection) - Method in class org.apache.spark.graphx.GraphOps
-
Returns an RDD that contains for each vertex v its local edges,
i.e., the edges that are incident on v, in the user-specified direction.
- collectNeighborIds(EdgeDirection) - Method in class org.apache.spark.graphx.GraphOps
-
Collect the neighbor vertex ids for each vertex.
- collectNeighbors(EdgeDirection) - Method in class org.apache.spark.graphx.GraphOps
-
Collect the neighbor vertex attributes for each vertex.
- collectPartitions(int[]) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return an array that contains all of the elements in a specific partition of this RDD.
- colPtrs() - Method in class org.apache.spark.mllib.linalg.SparseMatrix
-
- colsPerBlock() - Method in class org.apache.spark.mllib.linalg.distributed.BlockMatrix
-
- colStats(RDD<Vector>) - Static method in class org.apache.spark.mllib.stat.Statistics
-
Computes column-wise summary statistics for the input RDD[Vector].
- Column - Class in org.apache.spark.sql
-
- Column(Expression) - Constructor for class org.apache.spark.sql.Column
-
- Column(String) - Constructor for class org.apache.spark.sql.Column
-
- column(String) - Static method in class org.apache.spark.sql.functions
-
Returns a Column based on the given column name.
- ColumnName - Class in org.apache.spark.sql
-
:: Experimental ::
A convenient class used for constructing schema.
- ColumnName(String) - Constructor for class org.apache.spark.sql.ColumnName
-
- columns() - Method in class org.apache.spark.sql.DataFrame
-
Returns all column names as an array.
- columnSimilarities() - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
-
Compute all cosine similarities between columns of this matrix using the brute-force
approach of computing normalized dot products.
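The "normalized dot product" in the brute-force entry above is the cosine similarity between two columns. The following plain-Java sketch works on a small dense matrix held as an array of rows. It does not use Spark or RowMatrix, and the class and method names are hypothetical.

```java
// Plain-Java illustration of column cosine similarity; it does not use Spark.
class ColumnSimilaritySketch {

    // Cosine similarity between columns i and j of a matrix given as rows:
    // dot(col_i, col_j) / (norm(col_i) * norm(col_j)).
    // The result is NaN if either column is all zeros.
    static double cosine(double[][] rows, int i, int j) {
        double dot = 0.0;
        double normI = 0.0;
        double normJ = 0.0;
        for (double[] row : rows) {
            dot += row[i] * row[j];
            normI += row[i] * row[i];
            normJ += row[j] * row[j];
        }
        return dot / (Math.sqrt(normI) * Math.sqrt(normJ));
    }

    public static void main(String[] args) {
        // Column 1 is twice column 0, so the similarity is approximately 1.
        double[][] rows = {{1.0, 2.0}, {2.0, 4.0}};
        System.out.println(cosine(rows, 0, 1));
    }
}
```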
- columnSimilarities(double) - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
-
Compute similarities between columns of this matrix using a sampling approach.
- combineByKey(Function<V, C>, Function2<C, V, C>, Function2<C, C, C>, Partitioner, boolean, Serializer) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Generic function to combine the elements for each key using a custom set of aggregation
functions.
- combineByKey(Function<V, C>, Function2<C, V, C>, Function2<C, C, C>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Generic function to combine the elements for each key using a custom set of aggregation
functions.
- combineByKey(Function<V, C>, Function2<C, V, C>, Function2<C, C, C>, int) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Simplified version of combineByKey that hash-partitions the output RDD and uses map-side
aggregation.
- combineByKey(Function<V, C>, Function2<C, V, C>, Function2<C, C, C>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Simplified version of combineByKey that hash-partitions the resulting RDD using the existing
partitioner/parallelism level and using map-side aggregation.
- combineByKey(Function1<V, C>, Function2<C, V, C>, Function2<C, C, C>, Partitioner, boolean, Serializer) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Generic function to combine the elements for each key using a custom set of aggregation
functions.
- combineByKey(Function1<V, C>, Function2<C, V, C>, Function2<C, C, C>, int) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Simplified version of combineByKey that hash-partitions the output RDD.
- combineByKey(Function1<V, C>, Function2<C, V, C>, Function2<C, C, C>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
- combineByKey(Function<V, C>, Function2<C, V, C>, Function2<C, C, C>, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Combine elements of each key in DStream's RDDs using custom functions.
- combineByKey(Function<V, C>, Function2<C, V, C>, Function2<C, C, C>, Partitioner, boolean) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Combine elements of each key in DStream's RDDs using custom functions.
- combineByKey(Function1<V, C>, Function2<C, V, C>, Function2<C, C, C>, Partitioner, boolean, ClassTag<C>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Combine elements of each key in DStream's RDDs using custom functions.
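The three function arguments shared by the combineByKey overloads above can be read as: create a combiner from the first value seen for a key, merge further values into it, and merge combiners built on different partitions. The plain-Java sketch below applies that pattern to in-memory lists standing in for partitions. It does not use Spark, and CombineByKeySketch, combineByKey and averages are hypothetical names.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.BiFunction;
import java.util.function.BinaryOperator;
import java.util.function.Function;

// Plain-Java illustration of the combineByKey pattern; it does not use Spark.
class CombineByKeySketch {

    // Builds one combiner per key inside each partition (the map-side step),
    // then merges the per-partition combiners into the final result.
    static <K extends Comparable<K>, V, C> Map<K, C> combineByKey(
            List<List<Map.Entry<K, V>>> partitions,
            Function<V, C> createCombiner,
            BiFunction<C, V, C> mergeValue,
            BinaryOperator<C> mergeCombiners) {
        Map<K, C> merged = new TreeMap<>();
        for (List<Map.Entry<K, V>> partition : partitions) {
            Map<K, C> local = new TreeMap<>();
            for (Map.Entry<K, V> e : partition) {
                C current = local.get(e.getKey());
                local.put(e.getKey(), current == null
                        ? createCombiner.apply(e.getValue())
                        : mergeValue.apply(current, e.getValue()));
            }
            for (Map.Entry<K, C> e : local.entrySet()) {
                merged.merge(e.getKey(), e.getValue(), mergeCombiners);
            }
        }
        return merged;
    }

    // Per-key average over two made-up partitions; the combiner is {sum, count}.
    static Map<String, Double> averages() {
        List<List<Map.Entry<String, Integer>>> partitions = List.of(
                List.of(Map.entry("a", 1), Map.entry("b", 4), Map.entry("a", 3)),
                List.of(Map.entry("a", 5), Map.entry("b", 6)));
        Map<String, int[]> sums = combineByKey(
                partitions,
                v -> new int[] {v, 1},
                (c, v) -> new int[] {c[0] + v, c[1] + 1},
                (c1, c2) -> new int[] {c1[0] + c2[0], c1[1] + c2[1]});
        Map<String, Double> result = new TreeMap<>();
        sums.forEach((key, c) -> result.put(key, (double) c[0] / c[1]));
        return result;
    }

    public static void main(String[] args) {
        // prints {a=3.0, b=5.0}
        System.out.println(averages());
    }
}
```

A {sum, count} combiner is used because an average cannot be merged directly across partitions, while sums and counts can.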
- combineCombinersByKey(Iterator<Product2<K, C>>) - Method in class org.apache.spark.Aggregator
-
- combineCombinersByKey(Iterator<Product2<K, C>>, TaskContext) - Method in class org.apache.spark.Aggregator
-
- combineValuesByKey(Iterator<Product2<K, V>>) - Method in class org.apache.spark.Aggregator
-
- combineValuesByKey(Iterator<Product2<K, V>>, TaskContext) - Method in class org.apache.spark.Aggregator
-
- compare(PartitionGroup, PartitionGroup) - Method in class org.apache.spark.rdd.PartitionCoalescer
-
- compare(Option<PartitionGroup>, Option<PartitionGroup>) - Method in class org.apache.spark.rdd.PartitionCoalescer
-
- compare(Decimal) - Method in class org.apache.spark.sql.types.Decimal
-
- compare(UTF8String) - Method in class org.apache.spark.sql.types.UTF8String
-
- compare(RDDInfo) - Method in class org.apache.spark.storage.RDDInfo
-
- compareTo(UTF8String) - Method in class org.apache.spark.sql.types.UTF8String
-
- compareTo(SparkShutdownHook) - Method in class org.apache.spark.util.SparkShutdownHook
-
- completed() - Method in class org.apache.spark.status.api.v1.ApplicationAttemptInfo
-
- completedJobs() - Method in class org.apache.spark.ui.jobs.JobProgressListener
-
- completedStages() - Method in class org.apache.spark.ui.jobs.JobProgressListener
-
- completedTasks() - Method in class org.apache.spark.status.api.v1.ExecutorSummary
-
- completionTime() - Method in class org.apache.spark.scheduler.StageInfo
-
Time when all tasks in the stage completed or when the stage was cancelled.
- completionTime() - Method in class org.apache.spark.status.api.v1.JobData
-
- ComplexFutureAction<T> - Class in org.apache.spark
-
A FutureAction for actions that could trigger multiple Spark jobs.
- ComplexFutureAction() - Constructor for class org.apache.spark.ComplexFutureAction
-
- compressed() - Method in interface org.apache.spark.mllib.linalg.Vector
-
Returns a vector in either dense or sparse format, whichever uses less storage.
- compressedInputStream(InputStream) - Method in interface org.apache.spark.io.CompressionCodec
-
- compressedInputStream(InputStream) - Method in class org.apache.spark.io.LZ4CompressionCodec
-
- compressedInputStream(InputStream) - Method in class org.apache.spark.io.LZFCompressionCodec
-
- compressedInputStream(InputStream) - Method in class org.apache.spark.io.SnappyCompressionCodec
-
- compressedOutputStream(OutputStream) - Method in interface org.apache.spark.io.CompressionCodec
-
- compressedOutputStream(OutputStream) - Method in class org.apache.spark.io.LZ4CompressionCodec
-
- compressedOutputStream(OutputStream) - Method in class org.apache.spark.io.LZFCompressionCodec
-
- compressedOutputStream(OutputStream) - Method in class org.apache.spark.io.SnappyCompressionCodec
-
- CompressionCodec - Interface in org.apache.spark.io
-
:: DeveloperApi ::
CompressionCodec allows the customization of choosing different compression implementations
to be used in block storage.
- compute(Partition, TaskContext) - Method in class org.apache.spark.api.r.BaseRRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.graphx.EdgeRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.graphx.VertexRDD
-
Provides the RDD[(VertexId, VD)] equivalent output.
- compute(Vector, double, Vector) - Method in class org.apache.spark.mllib.optimization.Gradient
-
Compute the gradient and loss given the features of a single data point.
- compute(Vector, double, Vector, Vector) - Method in class org.apache.spark.mllib.optimization.Gradient
-
Compute the gradient and loss given the features of a single data point,
add the gradient to a provided vector to avoid creating new objects, and return loss.
- compute(Vector, double, Vector) - Method in class org.apache.spark.mllib.optimization.HingeGradient
-
- compute(Vector, double, Vector, Vector) - Method in class org.apache.spark.mllib.optimization.HingeGradient
-
- compute(Vector, Vector, double, int, double) - Method in class org.apache.spark.mllib.optimization.L1Updater
-
- compute(Vector, double, Vector) - Method in class org.apache.spark.mllib.optimization.LeastSquaresGradient
-
- compute(Vector, double, Vector, Vector) - Method in class org.apache.spark.mllib.optimization.LeastSquaresGradient
-
- compute(Vector, double, Vector) - Method in class org.apache.spark.mllib.optimization.LogisticGradient
-
- compute(Vector, double, Vector, Vector) - Method in class org.apache.spark.mllib.optimization.LogisticGradient
-
- compute(Vector, Vector, double, int, double) - Method in class org.apache.spark.mllib.optimization.SimpleUpdater
-
- compute(Vector, Vector, double, int, double) - Method in class org.apache.spark.mllib.optimization.SquaredL2Updater
-
- compute(Vector, Vector, double, int, double) - Method in class org.apache.spark.mllib.optimization.Updater
-
Compute an updated value for weights given the gradient, stepSize, iteration number and
regularization parameter.
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.CoGroupedRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.HadoopRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.JdbcRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.NewHadoopRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.PartitionPruningRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.RDD
-
:: DeveloperApi ::
Implemented by subclasses to compute a given partition.
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.ShuffledRDD
-
- compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.UnionRDD
-
- compute(Time) - Method in class org.apache.spark.streaming.api.java.JavaDStream
-
Generate an RDD for the given duration
- compute(Time) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Method that generates an RDD for the given Duration
- compute(Time) - Method in class org.apache.spark.streaming.dstream.ConstantInputDStream
-
- compute(Time) - Method in class org.apache.spark.streaming.dstream.DStream
-
Method that generates an RDD for the given time
- compute(Time) - Method in class org.apache.spark.streaming.dstream.ReceiverInputDStream
-
- computeColumnSummaryStatistics() - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
-
Computes column-wise summary statistics.
- computeCost(RDD<Vector>) - Method in class org.apache.spark.mllib.clustering.KMeansModel
-
Return the K-means cost (sum of squared distances of points to their nearest center) for this
model on the given data.
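The K-means cost named above, the sum of squared distances from each point to its nearest center, can be written out directly. This plain-Java sketch uses arrays in place of RDD[Vector] and does not use Spark. The class and method names are hypothetical.

```java
// Plain-Java illustration of the K-means cost; it does not use Spark.
class KMeansCostSketch {

    // Sums, over all points, the squared Euclidean distance to the nearest center.
    static double cost(double[][] points, double[][] centers) {
        double total = 0.0;
        for (double[] point : points) {
            double best = Double.POSITIVE_INFINITY;
            for (double[] center : centers) {
                double squared = 0.0;
                for (int k = 0; k < point.length; k++) {
                    double diff = point[k] - center[k];
                    squared += diff * diff;
                }
                best = Math.min(best, squared);
            }
            total += best;
        }
        return total;
    }

    public static void main(String[] args) {
        double[][] points = {{0.0, 0.0}, {1.0, 0.0}, {10.0, 0.0}};
        double[][] centers = {{0.0, 0.0}, {10.0, 0.0}};
        // Only the point (1, 0) is off-center, at squared distance 1: prints 1.0
        System.out.println(cost(points, centers));
    }
}
```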
- computeCovariance() - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
-
Computes the covariance matrix, treating each row as an observation.
- computeError(org.apache.spark.mllib.tree.model.TreeEnsembleModel, RDD<LabeledPoint>) - Method in interface org.apache.spark.mllib.tree.loss.Loss
-
Method to calculate error of the base learner for the gradient boosting calculation.
- computeError(double, double) - Method in interface org.apache.spark.mllib.tree.loss.Loss
-
Method to calculate loss when the predictions are already known.
- computeGramianMatrix() - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix
-
- computeGramianMatrix() - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
-
Computes the Gramian matrix A^T A.
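For an m x n matrix A, the Gramian A^T A is the n x n matrix whose (i, j) entry is the dot product of columns i and j. It can be accumulated one row at a time, which is why the computation suits a row-oriented matrix. The sketch below is plain Java on a dense array and does not use Spark. The class and method names are hypothetical.

```java
import java.util.Arrays;

// Plain-Java illustration of the Gramian matrix; it does not use Spark.
class GramianSketch {

    // Computes G = A^T A for a matrix given as an array of rows by adding
    // the outer product of each row with itself.
    static double[][] gramian(double[][] a) {
        int n = a.length == 0 ? 0 : a[0].length;
        double[][] g = new double[n][n];
        for (double[] row : a) {
            for (int i = 0; i < n; i++) {
                for (int j = 0; j < n; j++) {
                    g[i][j] += row[i] * row[j];
                }
            }
        }
        return g;
    }

    public static void main(String[] args) {
        double[][] a = {{1.0, 2.0}, {3.0, 4.0}};
        // prints [[10.0, 14.0], [14.0, 20.0]]
        System.out.println(Arrays.deepToString(gramian(a)));
    }
}
```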
- computeInitialPredictionAndError(RDD<LabeledPoint>, double, DecisionTreeModel, Loss) - Static method in class org.apache.spark.mllib.tree.model.GradientBoostedTreesModel
-
Compute the initial predictions and errors for a dataset for the first
iteration of gradient boosting.
- computePreferredLocations(Seq<InputFormatInfo>) - Static method in class org.apache.spark.scheduler.InputFormatInfo
-
Computes the preferred locations based on the input(s) and returns a location-to-block map.
- computePrincipalComponents(int) - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
-
Computes the top k principal components.
- computeSVD(int, boolean, double) - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix
-
- computeSVD(int, boolean, double) - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
-
Computes singular value decomposition of this matrix.
- conf() - Method in class org.apache.spark.SparkEnv
-
- conf() - Method in class org.apache.spark.streaming.StreamingContext
-
- confidence() - Method in class org.apache.spark.partial.BoundedDouble
-
- configuration() - Method in class org.apache.spark.scheduler.InputFormatInfo
-
- CONFIGURATION_INSTANTIATION_LOCK() - Static method in class org.apache.spark.rdd.HadoopRDD
-
Configuration's constructor is not threadsafe (see SPARK-1097 and HADOOP-10456).
- confusionMatrix() - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics
-
Returns the confusion matrix:
predicted classes are in columns,
ordered by class label ascending,
as in "labels".
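With predicted classes in columns, the rows hold the actual classes, so entry (i, j) counts examples of class i that were predicted as class j. The plain-Java sketch below assumes labels are the integers 0 to numClasses - 1, which is an assumption of this example. It does not use Spark, and the names are hypothetical.

```java
import java.util.Arrays;

// Plain-Java illustration of a confusion matrix; it does not use Spark.
class ConfusionMatrixSketch {

    // counts[i][j] is the number of examples whose actual label is i and whose
    // predicted label is j. Labels are assumed to be 0 .. numClasses - 1.
    static long[][] confusion(int[] actual, int[] predicted, int numClasses) {
        if (actual.length != predicted.length) {
            throw new IllegalArgumentException("label arrays must have equal length");
        }
        long[][] counts = new long[numClasses][numClasses];
        for (int k = 0; k < actual.length; k++) {
            counts[actual[k]][predicted[k]]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        int[] actual = {0, 0, 1, 1, 2};
        int[] predicted = {0, 1, 1, 1, 0};
        // prints [[1, 1, 0], [0, 2, 0], [1, 0, 0]]
        System.out.println(Arrays.deepToString(confusion(actual, predicted, 3)));
    }
}
```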
- connectedComponents() - Method in class org.apache.spark.graphx.GraphOps
-
Compute the connected component membership of each vertex and return a graph with the vertex
value containing the lowest vertex id in the connected component containing that vertex.
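The result described above, each vertex labelled with the lowest vertex id in its component, can be reproduced on a small in-memory graph by repeatedly pushing the smaller label across each edge until nothing changes. This plain-Java sketch is only an illustration of the output, not of how GraphX computes it. The vertex numbering 0 to n - 1 and all names are assumptions of the example.

```java
import java.util.Arrays;

// Plain-Java illustration of lowest-id component labels; it does not use Spark.
class ConnectedComponentsSketch {

    // Vertices are 0 .. n - 1; each edge is {src, dst} and is treated as
    // undirected. Returns, for every vertex, the lowest id in its component.
    static int[] lowestIdLabels(int n, int[][] edges) {
        int[] label = new int[n];
        for (int v = 0; v < n; v++) {
            label[v] = v;
        }
        boolean changed = true;
        while (changed) {
            changed = false;
            for (int[] edge : edges) {
                int min = Math.min(label[edge[0]], label[edge[1]]);
                if (label[edge[0]] != min) {
                    label[edge[0]] = min;
                    changed = true;
                }
                if (label[edge[1]] != min) {
                    label[edge[1]] = min;
                    changed = true;
                }
            }
        }
        return label;
    }

    public static void main(String[] args) {
        int[][] edges = {{0, 1}, {1, 2}, {3, 4}};
        // Components {0, 1, 2} and {3, 4}: prints [0, 0, 0, 3, 3]
        System.out.println(Arrays.toString(lowestIdLabels(5, edges)));
    }
}
```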
- ConnectedComponents - Class in org.apache.spark.graphx.lib
-
Connected components algorithm.
- ConnectedComponents() - Constructor for class org.apache.spark.graphx.lib.ConnectedComponents
-
- ConstantInputDStream<T> - Class in org.apache.spark.streaming.dstream
-
An input stream that always returns the same RDD on each timestep.
- ConstantInputDStream(StreamingContext, RDD<T>, ClassTag<T>) - Constructor for class org.apache.spark.streaming.dstream.ConstantInputDStream
-
- contains(Param<?>) - Method in class org.apache.spark.ml.param.ParamMap
-
Checks whether a parameter is explicitly specified.
- contains(String) - Method in class org.apache.spark.SparkConf
-
Does the configuration contain a given parameter?
- contains(Object) - Method in class org.apache.spark.sql.Column
-
Contains the other element.
- contains(String) - Method in class org.apache.spark.sql.types.Metadata
-
Tests whether this Metadata contains a binding for a key.
- contains(UTF8String) - Method in class org.apache.spark.sql.types.UTF8String
-
- containsBlock(BlockId) - Method in class org.apache.spark.storage.StorageStatus
-
Return whether the given block is stored in this block manager in O(1) time.
- containsCachedMetadata(String) - Static method in class org.apache.spark.rdd.HadoopRDD
-
- containsNull() - Method in class org.apache.spark.sql.types.ArrayType
-
- context() - Method in interface org.apache.spark.api.java.JavaRDDLike
-
- context() - Method in class org.apache.spark.InterruptibleIterator
-
- context() - Method in class org.apache.spark.rdd.RDD
-
- context() - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
- context() - Method in class org.apache.spark.streaming.dstream.DStream
-
Return the StreamingContext associated with this DStream
- Continuous() - Static method in class org.apache.spark.mllib.tree.configuration.FeatureType
-
- ContinuousSplit - Class in org.apache.spark.ml.tree
-
:: DeveloperApi ::
Split which tests a continuous feature.
- convertToCanonicalEdges(Function2<ED, ED, ED>) - Method in class org.apache.spark.graphx.GraphOps
-
Convert bi-directional edges into uni-directional ones.
- CoordinateMatrix - Class in org.apache.spark.mllib.linalg.distributed
-
- CoordinateMatrix(RDD<MatrixEntry>, long, long) - Constructor for class org.apache.spark.mllib.linalg.distributed.CoordinateMatrix
-
- CoordinateMatrix(RDD<MatrixEntry>) - Constructor for class org.apache.spark.mllib.linalg.distributed.CoordinateMatrix
-
- copy(ParamMap) - Method in class org.apache.spark.ml.classification.DecisionTreeClassificationModel
-
- copy(ParamMap) - Method in class org.apache.spark.ml.classification.GBTClassificationModel
-
- copy(ParamMap) - Method in class org.apache.spark.ml.classification.LogisticRegressionModel
-
- copy(ParamMap) - Method in class org.apache.spark.ml.classification.RandomForestClassificationModel
-
- copy(ParamMap) - Method in class org.apache.spark.ml.Estimator
-
- copy(ParamMap) - Method in class org.apache.spark.ml.evaluation.Evaluator
-
- copy(ParamMap) - Method in class org.apache.spark.ml.Model
-
- copy() - Method in class org.apache.spark.ml.param.ParamMap
-
Creates a copy of this param map.
- copy(ParamMap) - Method in interface org.apache.spark.ml.param.Params
-
Creates a copy of this instance with the same UID and some extra params.
- copy(ParamMap) - Method in class org.apache.spark.ml.Pipeline
-
- copy(ParamMap) - Method in class org.apache.spark.ml.PipelineModel
-
- copy(ParamMap) - Method in class org.apache.spark.ml.PipelineStage
-
- copy(ParamMap) - Method in class org.apache.spark.ml.Predictor
-
- copy(ParamMap) - Method in class org.apache.spark.ml.regression.DecisionTreeRegressionModel
-
- copy(ParamMap) - Method in class org.apache.spark.ml.regression.GBTRegressionModel
-
- copy(ParamMap) - Method in class org.apache.spark.ml.regression.LinearRegressionModel
-
- copy(ParamMap) - Method in class org.apache.spark.ml.regression.RandomForestRegressionModel
-
- copy(ParamMap) - Method in class org.apache.spark.ml.Transformer
-
- copy(ParamMap) - Method in class org.apache.spark.ml.tuning.CrossValidatorModel
-
- copy() - Method in class org.apache.spark.mllib.linalg.DenseMatrix
-
- copy() - Method in class org.apache.spark.mllib.linalg.DenseVector
-
- copy() - Method in interface org.apache.spark.mllib.linalg.Matrix
-
Get a deep copy of the matrix.
- copy() - Method in class org.apache.spark.mllib.linalg.SparseMatrix
-
- copy() - Method in class org.apache.spark.mllib.linalg.SparseVector
-
- copy() - Method in interface org.apache.spark.mllib.linalg.Vector
-
Makes a deep copy of this vector.
- copy() - Method in class org.apache.spark.mllib.random.ExponentialGenerator
-
- copy() - Method in class org.apache.spark.mllib.random.GammaGenerator
-
- copy() - Method in class org.apache.spark.mllib.random.LogNormalGenerator
-
- copy() - Method in class org.apache.spark.mllib.random.PoissonGenerator
-
- copy() - Method in interface org.apache.spark.mllib.random.RandomDataGenerator
-
Returns a copy of the RandomDataGenerator with a new instance of the rng object used in the
class when applicable for non-locking concurrent usage.
- copy() - Method in class org.apache.spark.mllib.random.StandardNormalGenerator
-
- copy() - Method in class org.apache.spark.mllib.random.UniformGenerator
-
- copy() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
Returns a shallow copy of this instance.
- copy() - Method in interface org.apache.spark.sql.Row
-
Make a copy of the current Row object.
- copy() - Method in class org.apache.spark.util.StatCounter
-
Clone this StatCounter
- copyValues(T, ParamMap) - Method in interface org.apache.spark.ml.param.Params
-
Copies param values from this instance to another instance for params shared by them.
- corr(RDD<Vector>) - Static method in class org.apache.spark.mllib.stat.Statistics
-
Compute the Pearson correlation matrix for the input RDD of Vectors.
- corr(RDD<Vector>, String) - Static method in class org.apache.spark.mllib.stat.Statistics
-
Compute the correlation matrix for the input RDD of Vectors using the specified method.
- corr(RDD<Object>, RDD<Object>) - Static method in class org.apache.spark.mllib.stat.Statistics
-
Compute the Pearson correlation for the input RDDs.
- corr(JavaRDD<Double>, JavaRDD<Double>) - Static method in class org.apache.spark.mllib.stat.Statistics
-
Java-friendly version of corr()
- corr(RDD<Object>, RDD<Object>, String) - Static method in class org.apache.spark.mllib.stat.Statistics
-
Compute the correlation for the input RDDs using the specified method.
- corr(JavaRDD<Double>, JavaRDD<Double>, String) - Static method in class org.apache.spark.mllib.stat.Statistics
-
Java-friendly version of corr()
- corr(String, String, String) - Method in class org.apache.spark.sql.DataFrameStatFunctions
-
Calculates the correlation of two columns of a DataFrame.
- corr(String, String) - Method in class org.apache.spark.sql.DataFrameStatFunctions
-
Calculates the Pearson Correlation Coefficient of two columns of a DataFrame.
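The Pearson correlation coefficient named in the corr entries above is the covariance of the two series divided by the product of their standard deviations. The plain-Java sketch below computes it for two equal-length arrays. It does not use Spark, and the class and method names are hypothetical.

```java
// Plain-Java illustration of the Pearson correlation; it does not use Spark.
class PearsonSketch {

    // Returns cov(x, y) / (std(x) * std(y)); NaN if either series is constant.
    static double pearson(double[] x, double[] y) {
        if (x.length != y.length || x.length == 0) {
            throw new IllegalArgumentException("inputs must be non-empty and of equal length");
        }
        int n = x.length;
        double meanX = 0.0;
        double meanY = 0.0;
        for (int i = 0; i < n; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= n;
        meanY /= n;
        double cov = 0.0;
        double varX = 0.0;
        double varY = 0.0;
        for (int i = 0; i < n; i++) {
            double dx = x[i] - meanX;
            double dy = y[i] - meanY;
            cov += dx * dy;
            varX += dx * dx;
            varY += dy * dy;
        }
        return cov / Math.sqrt(varX * varY);
    }

    public static void main(String[] args) {
        // y is an exact positive multiple of x, so the coefficient is 1.
        System.out.println(pearson(new double[] {1, 2, 3}, new double[] {2, 4, 6}));
    }
}
```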
- cos(Column) - Static method in class org.apache.spark.sql.functions
-
Computes the cosine of the given value.
- cos(String) - Static method in class org.apache.spark.sql.functions
-
Computes the cosine of the given column.
- cosh(Column) - Static method in class org.apache.spark.sql.functions
-
Computes the hyperbolic cosine of the given value.
- cosh(String) - Static method in class org.apache.spark.sql.functions
-
Computes the hyperbolic cosine of the given column.
- count() - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return the number of elements in the RDD.
- count() - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
-
The number of edges in the RDD.
- count() - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
-
The number of vertices in the RDD.
- count() - Method in class org.apache.spark.ml.classification.LogisticAggregator
-
- count() - Method in class org.apache.spark.ml.regression.LeastSquaresAggregator
-
- count() - Method in class org.apache.spark.mllib.stat.MultivariateOnlineSummarizer
-
- count() - Method in interface org.apache.spark.mllib.stat.MultivariateStatisticalSummary
-
Sample size.
- count() - Method in class org.apache.spark.rdd.RDD
-
Return the number of elements in the RDD.
- count() - Method in class org.apache.spark.sql.DataFrame
-
- count(Column) - Static method in class org.apache.spark.sql.functions
-
Aggregate function: returns the number of items in a group.
- count(String) - Static method in class org.apache.spark.sql.functions
-
Aggregate function: returns the number of items in a group.
- count() - Method in class org.apache.spark.sql.GroupedData
-
Count the number of rows for each group.
- count() - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Return a new DStream in which each RDD has a single element generated by counting each RDD
of this DStream.
- count() - Method in class org.apache.spark.streaming.dstream.DStream
-
Return a new DStream in which each RDD has a single element generated by counting each RDD
of this DStream.
- count() - Method in class org.apache.spark.util.StatCounter
-
- countApprox(long, double) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
:: Experimental ::
Approximate version of count() that returns a potentially incomplete result
within a timeout, even if not all tasks have finished.
- countApprox(long) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
:: Experimental ::
Approximate version of count() that returns a potentially incomplete result
within a timeout, even if not all tasks have finished.
- countApprox(long, double) - Method in class org.apache.spark.rdd.RDD
-
:: Experimental ::
Approximate version of count() that returns a potentially incomplete result
within a timeout, even if not all tasks have finished.
- countApproxDistinct(double) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return approximate number of distinct elements in the RDD.
- countApproxDistinct(int, int) - Method in class org.apache.spark.rdd.RDD
-
:: Experimental ::
Return approximate number of distinct elements in the RDD.
- countApproxDistinct(double) - Method in class org.apache.spark.rdd.RDD
-
Return approximate number of distinct elements in the RDD.
- countApproxDistinctByKey(double, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Return approximate number of distinct values for each key in this RDD.
- countApproxDistinctByKey(double, int) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Return approximate number of distinct values for each key in this RDD.
- countApproxDistinctByKey(double) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Return approximate number of distinct values for each key in this RDD.
- countApproxDistinctByKey(int, int, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
:: Experimental ::
- countApproxDistinctByKey(double, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Return approximate number of distinct values for each key in this RDD.
- countApproxDistinctByKey(double, int) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Return approximate number of distinct values for each key in this RDD.
- countApproxDistinctByKey(double) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Return approximate number of distinct values for each key in this RDD.
- countAsync() - Method in interface org.apache.spark.api.java.JavaRDDLike
-
The asynchronous version of count, which returns a
future for counting the number of elements in this RDD.
- countAsync() - Method in class org.apache.spark.rdd.AsyncRDDActions
-
Returns a future for counting the number of elements in the RDD.
- countByKey() - Method in class org.apache.spark.api.java.JavaPairRDD
-
Count the number of elements for each key, and return the result to the master as a Map.
- countByKey() - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Count the number of elements for each key, collecting the results to a local Map.
- countByKeyApprox(long) - Method in class org.apache.spark.api.java.JavaPairRDD
-
:: Experimental ::
Approximate version of countByKey that can return a partial result if it does
not finish within a timeout.
- countByKeyApprox(long, double) - Method in class org.apache.spark.api.java.JavaPairRDD
-
:: Experimental ::
Approximate version of countByKey that can return a partial result if it does
not finish within a timeout.
- countByKeyApprox(long, double) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
:: Experimental ::
Approximate version of countByKey that can return a partial result if it does
not finish within a timeout.
- countByValue() - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return the count of each unique value in this RDD as a map of (value, count) pairs.
- countByValue(Ordering<T>) - Method in class org.apache.spark.rdd.RDD
-
Return the count of each unique value in this RDD as a local map of (value, count) pairs.
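The shape of the (value, count) map can be sketched on an ordinary in-memory collection. This is a plain-Java illustration under the assumption that a local Iterable stands in for the RDD; the class name CountByValueSketch is hypothetical and the code does not use or reproduce Spark's implementation.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only: counts occurrences in a local collection.
final class CountByValueSketch {
    /** Returns a map from each distinct value to the number of times it occurs. */
    static <T> Map<T, Long> countByValue(Iterable<T> values) {
        Map<T, Long> counts = new HashMap<>();
        for (T value : values) {
            // Insert 1 for a new value, otherwise add 1 to the existing count.
            counts.merge(value, 1L, Long::sum);
        }
        return counts;
    }
}
```

For the input "a", "b", "a" the resulting map holds a count of 2 for "a" and 1 for "b". Because the result is a local map, it has to fit in the memory of the caller.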
- countByValue() - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Return a new DStream in which each RDD contains the counts of each distinct value in
each RDD of this DStream.
- countByValue(int) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Return a new DStream in which each RDD contains the counts of each distinct value in
each RDD of this DStream.
- countByValue(int, Ordering<T>) - Method in class org.apache.spark.streaming.dstream.DStream
-
Return a new DStream in which each RDD contains the counts of each distinct value in
each RDD of this DStream.
- countByValueAndWindow(Duration, Duration) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Return a new DStream in which each RDD contains the count of distinct elements in
RDDs in a sliding window over this DStream.
- countByValueAndWindow(Duration, Duration, int) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Return a new DStream in which each RDD contains the count of distinct elements in
RDDs in a sliding window over this DStream.
- countByValueAndWindow(Duration, Duration, int, Ordering<T>) - Method in class org.apache.spark.streaming.dstream.DStream
-
Return a new DStream in which each RDD contains the count of distinct elements in
RDDs in a sliding window over this DStream.
- countByValueApprox(long, double) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
(Experimental) Approximate version of countByValue().
- countByValueApprox(long) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
(Experimental) Approximate version of countByValue().
- countByValueApprox(long, double, Ordering<T>) - Method in class org.apache.spark.rdd.RDD
-
:: Experimental ::
Approximate version of countByValue().
- countByWindow(Duration, Duration) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Return a new DStream in which each RDD has a single element generated by counting the number
of elements in a window over this DStream.
- countByWindow(Duration, Duration) - Method in class org.apache.spark.streaming.dstream.DStream
-
Return a new DStream in which each RDD has a single element generated by counting the number
of elements in a sliding window over this DStream.
- countDistinct(Column, Column...) - Static method in class org.apache.spark.sql.functions
-
Aggregate function: returns the number of distinct items in a group.
- countDistinct(String, String...) - Static method in class org.apache.spark.sql.functions
-
Aggregate function: returns the number of distinct items in a group.
- countDistinct(Column, Seq<Column>) - Static method in class org.apache.spark.sql.functions
-
Aggregate function: returns the number of distinct items in a group.
- countDistinct(String, Seq<String>) - Static method in class org.apache.spark.sql.functions
-
Aggregate function: returns the number of distinct items in a group.
- cov(String, String) - Method in class org.apache.spark.sql.DataFrameStatFunctions
-
Calculate the sample covariance of two numerical columns of a DataFrame.
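Sample covariance conventionally divides the summed co-deviations by n - 1 rather than n. The sketch below shows that definition in plain Java on two arrays; SampleCovarianceSketch is a hypothetical name, and the code is an illustration of the formula, not the code behind cov(String, String).

```java
// Illustrative sketch only: textbook sample covariance, not Spark's implementation.
final class SampleCovarianceSketch {
    /** Sample covariance of two equal-length samples, using the n - 1 divisor. */
    static double sampleCovariance(double[] x, double[] y) {
        if (x.length != y.length || x.length < 2) {
            throw new IllegalArgumentException("need two samples of equal length >= 2");
        }
        int n = x.length;
        double meanX = 0.0;
        double meanY = 0.0;
        for (int i = 0; i < n; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= n;
        meanY /= n;

        double sum = 0.0;
        for (int i = 0; i < n; i++) {
            sum += (x[i] - meanX) * (y[i] - meanY);
        }
        return sum / (n - 1);
    }
}
```

For {1, 2, 3, 4} and {2, 4, 6, 8} the summed co-deviations are 10, so the sample covariance is 10 / 3.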
- CreatableRelationProvider - Interface in org.apache.spark.sql.sources
-
- create(boolean, boolean, boolean, int) - Static method in class org.apache.spark.api.java.StorageLevels
-
Deprecated.
- create(boolean, boolean, boolean, boolean, int) - Static method in class org.apache.spark.api.java.StorageLevels
-
Create a new StorageLevel object.
- create(JavaSparkContext, JdbcRDD.ConnectionFactory, String, long, long, int, Function<ResultSet, T>) - Static method in class org.apache.spark.rdd.JdbcRDD
-
Create an RDD that executes an SQL query on a JDBC connection and reads results.
- create(JavaSparkContext, JdbcRDD.ConnectionFactory, String, long, long, int) - Static method in class org.apache.spark.rdd.JdbcRDD
-
Create an RDD that executes an SQL query on a JDBC connection and reads results.
- create(RDD<T>, Function1<Object, Object>) - Static method in class org.apache.spark.rdd.PartitionPruningRDD
-
Create a PartitionPruningRDD.
- create(Object...) - Static method in class org.apache.spark.sql.RowFactory
-
Create a Row from the given arguments.
- create() - Method in interface org.apache.spark.streaming.api.java.JavaStreamingContextFactory
-
- create(String, int) - Static method in class org.apache.spark.streaming.kafka.Broker
-
- create(String, int, long, long) - Static method in class org.apache.spark.streaming.kafka.OffsetRange
-
- create(TopicAndPartition, long, long) - Static method in class org.apache.spark.streaming.kafka.OffsetRange
-
- createArrayType(DataType) - Static method in class org.apache.spark.sql.types.DataTypes
-
Creates an ArrayType by specifying the data type of elements (elementType).
- createArrayType(DataType, boolean) - Static method in class org.apache.spark.sql.types.DataTypes
-
Creates an ArrayType by specifying the data type of elements (elementType) and
whether the array contains null values (containsNull).
- createBroker(String, Integer) - Method in class org.apache.spark.streaming.kafka.KafkaUtilsPythonHelper
-
- createCombiner() - Method in class org.apache.spark.Aggregator
-
- createDataFrame(RDD<A>, TypeTags.TypeTag<A>) - Method in class org.apache.spark.sql.SQLContext
-
- createDataFrame(Seq<A>, TypeTags.TypeTag<A>) - Method in class org.apache.spark.sql.SQLContext
-
- createDataFrame(RDD<Row>, StructType) - Method in class org.apache.spark.sql.SQLContext
-
- createDataFrame(JavaRDD<Row>, StructType) - Method in class org.apache.spark.sql.SQLContext
-
- createDataFrame(RDD<?>, Class<?>) - Method in class org.apache.spark.sql.SQLContext
-
- createDataFrame(JavaRDD<?>, Class<?>) - Method in class org.apache.spark.sql.SQLContext
-
- createDecimalType(int, int) - Static method in class org.apache.spark.sql.types.DataTypes
-
- createDecimalType() - Static method in class org.apache.spark.sql.types.DataTypes
-
- createDirectStream(StreamingContext, Map<String, String>, Map<TopicAndPartition, Object>, Function1<MessageAndMetadata<K, V>, R>, ClassTag<K>, ClassTag<V>, ClassTag<KD>, ClassTag<VD>, ClassTag<R>) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils
-
:: Experimental ::
Create an input stream that directly pulls messages from Kafka Brokers
without using any receiver.
- createDirectStream(StreamingContext, Map<String, String>, Set<String>, ClassTag<K>, ClassTag<V>, ClassTag<KD>, ClassTag<VD>) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils
-
:: Experimental ::
Create an input stream that directly pulls messages from Kafka Brokers
without using any receiver.
- createDirectStream(JavaStreamingContext, Class<K>, Class<V>, Class<KD>, Class<VD>, Class<R>, Map<String, String>, Map<TopicAndPartition, Long>, Function<MessageAndMetadata<K, V>, R>) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils
-
:: Experimental ::
Create an input stream that directly pulls messages from Kafka Brokers
without using any receiver.
- createDirectStream(JavaStreamingContext, Class<K>, Class<V>, Class<KD>, Class<VD>, Map<String, String>, Set<String>) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils
-
:: Experimental ::
Create an input stream that directly pulls messages from Kafka Brokers
without using any receiver.
- createDirectStream(JavaStreamingContext, Map<String, String>, Set<String>, Map<TopicAndPartition, Long>) - Method in class org.apache.spark.streaming.kafka.KafkaUtilsPythonHelper
-
- createExternalTable(String, String) - Method in class org.apache.spark.sql.SQLContext
-
- createExternalTable(String, String, String) - Method in class org.apache.spark.sql.SQLContext
-
- createExternalTable(String, String, Map<String, String>) - Method in class org.apache.spark.sql.SQLContext
-
- createExternalTable(String, String, Map<String, String>) - Method in class org.apache.spark.sql.SQLContext
-
- createExternalTable(String, String, StructType, Map<String, String>) - Method in class org.apache.spark.sql.SQLContext
-
- createExternalTable(String, String, StructType, Map<String, String>) - Method in class org.apache.spark.sql.SQLContext
-
- createJDBCTable(String, String, boolean) - Method in class org.apache.spark.sql.DataFrame
-
Deprecated.
As of 1.4.0, replaced by write().jdbc().
- createMapType(DataType, DataType) - Static method in class org.apache.spark.sql.types.DataTypes
-
Creates a MapType by specifying the data type of keys (keyType) and values
(valueType).
- createMapType(DataType, DataType, boolean) - Static method in class org.apache.spark.sql.types.DataTypes
-
Creates a MapType by specifying the data type of keys (keyType), the data type of
values (valueType), and whether values contain any null value (valueContainsNull).
- createOffsetRange(String, Integer, Long, Long) - Method in class org.apache.spark.streaming.kafka.KafkaUtilsPythonHelper
-
- createPollingStream(StreamingContext, String, int, StorageLevel) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
-
Creates an input stream that is to be used with the Spark Sink deployed on a Flume agent.
- createPollingStream(StreamingContext, Seq<InetSocketAddress>, StorageLevel) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
-
Creates an input stream that is to be used with the Spark Sink deployed on a Flume agent.
- createPollingStream(StreamingContext, Seq<InetSocketAddress>, StorageLevel, int, int) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
-
Creates an input stream that is to be used with the Spark Sink deployed on a Flume agent.
- createPollingStream(JavaStreamingContext, String, int) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
-
Creates an input stream that is to be used with the Spark Sink deployed on a Flume agent.
- createPollingStream(JavaStreamingContext, String, int, StorageLevel) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
-
Creates an input stream that is to be used with the Spark Sink deployed on a Flume agent.
- createPollingStream(JavaStreamingContext, InetSocketAddress[], StorageLevel) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
-
Creates an input stream that is to be used with the Spark Sink deployed on a Flume agent.
- createPollingStream(JavaStreamingContext, InetSocketAddress[], StorageLevel, int, int) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
-
Creates an input stream that is to be used with the Spark Sink deployed on a Flume agent.
- createRDD(SparkContext, Map<String, String>, OffsetRange[], ClassTag<K>, ClassTag<V>, ClassTag<KD>, ClassTag<VD>) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils
-
Create an RDD from Kafka using offset ranges for each topic and partition.
- createRDD(SparkContext, Map<String, String>, OffsetRange[], Map<TopicAndPartition, Broker>, Function1<MessageAndMetadata<K, V>, R>, ClassTag<K>, ClassTag<V>, ClassTag<KD>, ClassTag<VD>, ClassTag<R>) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils
-
:: Experimental ::
Create an RDD from Kafka using offset ranges for each topic and partition.
- createRDD(JavaSparkContext, Class<K>, Class<V>, Class<KD>, Class<VD>, Map<String, String>, OffsetRange[]) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils
-
Create an RDD from Kafka using offset ranges for each topic and partition.
- createRDD(JavaSparkContext, Class<K>, Class<V>, Class<KD>, Class<VD>, Class<R>, Map<String, String>, OffsetRange[], Map<TopicAndPartition, Broker>, Function<MessageAndMetadata<K, V>, R>) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils
-
:: Experimental ::
Create an RDD from Kafka using offset ranges for each topic and partition.
- createRDD(JavaSparkContext, Map<String, String>, List<OffsetRange>, Map<TopicAndPartition, Broker>) - Method in class org.apache.spark.streaming.kafka.KafkaUtilsPythonHelper
-
- createRDDFromArray(JavaSparkContext, byte[][]) - Static method in class org.apache.spark.api.r.RRDD
-
Create an RRDD given a sequence of byte arrays.
- createRelation(SQLContext, SaveMode, Map<String, String>, DataFrame) - Method in interface org.apache.spark.sql.sources.CreatableRelationProvider
-
Creates a relation with the given parameters based on the contents of the given
DataFrame.
- createRelation(SQLContext, String[], Option<StructType>, Option<StructType>, Map<String, String>) - Method in interface org.apache.spark.sql.sources.HadoopFsRelationProvider
-
Returns a new base relation with the given parameters, a user defined schema, and a list of
partition columns.
- createRelation(SQLContext, Map<String, String>) - Method in interface org.apache.spark.sql.sources.RelationProvider
-
Returns a new base relation with the given parameters.
- createRelation(SQLContext, Map<String, String>, StructType) - Method in interface org.apache.spark.sql.sources.SchemaRelationProvider
-
Returns a new base relation with the given parameters and user defined schema.
- createRWorker(String, int) - Static method in class org.apache.spark.api.r.RRDD
-
ProcessBuilder used to launch worker R processes.
- createSparkContext(String, String, String, String[], Map<Object, Object>, Map<Object, Object>) - Static method in class org.apache.spark.api.r.RRDD
-
- createStream(StreamingContext, String, int, StorageLevel) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
-
Create an input stream from a Flume source.
- createStream(StreamingContext, String, int, StorageLevel, boolean) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
-
Create an input stream from a Flume source.
- createStream(JavaStreamingContext, String, int) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
-
Creates an input stream from a Flume source.
- createStream(JavaStreamingContext, String, int, StorageLevel) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
-
Creates an input stream from a Flume source.
- createStream(JavaStreamingContext, String, int, StorageLevel, boolean) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
-
Creates an input stream from a Flume source.
- createStream(StreamingContext, String, String, Map<String, Object>, StorageLevel) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils
-
Create an input stream that pulls messages from Kafka Brokers.
- createStream(StreamingContext, Map<String, String>, Map<String, Object>, StorageLevel, ClassTag<K>, ClassTag<V>, ClassTag<U>, ClassTag<T>) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils
-
Create an input stream that pulls messages from Kafka Brokers.
- createStream(JavaStreamingContext, String, String, Map<String, Integer>) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils
-
Create an input stream that pulls messages from Kafka Brokers.
- createStream(JavaStreamingContext, String, String, Map<String, Integer>, StorageLevel) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils
-
Create an input stream that pulls messages from Kafka Brokers.
- createStream(JavaStreamingContext, Class<K>, Class<V>, Class<U>, Class<T>, Map<String, String>, Map<String, Integer>, StorageLevel) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils
-
Create an input stream that pulls messages from Kafka Brokers.
- createStream(JavaStreamingContext, Map<String, String>, Map<String, Integer>, StorageLevel) - Method in class org.apache.spark.streaming.kafka.KafkaUtilsPythonHelper
-
- createStream(StreamingContext, String, String, String, String, InitialPositionInStream, Duration, StorageLevel) - Static method in class org.apache.spark.streaming.kinesis.KinesisUtils
-
Create an input stream that pulls messages from a Kinesis stream.
- createStream(StreamingContext, String, String, String, String, InitialPositionInStream, Duration, StorageLevel, String, String) - Static method in class org.apache.spark.streaming.kinesis.KinesisUtils
-
Create an input stream that pulls messages from a Kinesis stream.
- createStream(StreamingContext, String, String, Duration, InitialPositionInStream, StorageLevel) - Static method in class org.apache.spark.streaming.kinesis.KinesisUtils
-
Create an input stream that pulls messages from a Kinesis stream.
- createStream(JavaStreamingContext, String, String, String, String, InitialPositionInStream, Duration, StorageLevel) - Static method in class org.apache.spark.streaming.kinesis.KinesisUtils
-
Create an input stream that pulls messages from a Kinesis stream.
- createStream(JavaStreamingContext, String, String, String, String, InitialPositionInStream, Duration, StorageLevel, String, String) - Static method in class org.apache.spark.streaming.kinesis.KinesisUtils
-
Create an input stream that pulls messages from a Kinesis stream.
- createStream(JavaStreamingContext, String, String, Duration, InitialPositionInStream, StorageLevel) - Static method in class org.apache.spark.streaming.kinesis.KinesisUtils
-
Create an input stream that pulls messages from a Kinesis stream.
- createStream(StreamingContext, String, String, StorageLevel) - Static method in class org.apache.spark.streaming.mqtt.MQTTUtils
-
Create an input stream that receives messages pushed by an MQTT publisher.
- createStream(JavaStreamingContext, String, String) - Static method in class org.apache.spark.streaming.mqtt.MQTTUtils
-
Create an input stream that receives messages pushed by an MQTT publisher.
- createStream(JavaStreamingContext, String, String, StorageLevel) - Static method in class org.apache.spark.streaming.mqtt.MQTTUtils
-
Create an input stream that receives messages pushed by an MQTT publisher.
- createStream(StreamingContext, Option<Authorization>, Seq<String>, StorageLevel) - Static method in class org.apache.spark.streaming.twitter.TwitterUtils
-
Create an input stream that returns tweets received from Twitter.
- createStream(JavaStreamingContext) - Static method in class org.apache.spark.streaming.twitter.TwitterUtils
-
Create an input stream that returns tweets received from Twitter using Twitter4J's default
OAuth authentication; this requires the system properties twitter4j.oauth.consumerKey,
twitter4j.oauth.consumerSecret, twitter4j.oauth.accessToken and
twitter4j.oauth.accessTokenSecret.
- createStream(JavaStreamingContext, String[]) - Static method in class org.apache.spark.streaming.twitter.TwitterUtils
-
Create an input stream that returns tweets received from Twitter using Twitter4J's default
OAuth authentication; this requires the system properties twitter4j.oauth.consumerKey,
twitter4j.oauth.consumerSecret, twitter4j.oauth.accessToken and
twitter4j.oauth.accessTokenSecret.
- createStream(JavaStreamingContext, String[], StorageLevel) - Static method in class org.apache.spark.streaming.twitter.TwitterUtils
-
Create an input stream that returns tweets received from Twitter using Twitter4J's default
OAuth authentication; this requires the system properties twitter4j.oauth.consumerKey,
twitter4j.oauth.consumerSecret, twitter4j.oauth.accessToken and
twitter4j.oauth.accessTokenSecret.
- createStream(JavaStreamingContext, Authorization) - Static method in class org.apache.spark.streaming.twitter.TwitterUtils
-
Create an input stream that returns tweets received from Twitter.
- createStream(JavaStreamingContext, Authorization, String[]) - Static method in class org.apache.spark.streaming.twitter.TwitterUtils
-
Create an input stream that returns tweets received from Twitter.
- createStream(JavaStreamingContext, Authorization, String[], StorageLevel) - Static method in class org.apache.spark.streaming.twitter.TwitterUtils
-
Create an input stream that returns tweets received from Twitter.
- createStream(StreamingContext, String, Subscribe, Function1<Seq<ByteString>, Iterator<T>>, StorageLevel, SupervisorStrategy, ClassTag<T>) - Static method in class org.apache.spark.streaming.zeromq.ZeroMQUtils
-
Create an input stream that receives messages pushed by a zeromq publisher.
- createStream(JavaStreamingContext, String, Subscribe, Function<byte[][], Iterable<T>>, StorageLevel, SupervisorStrategy) - Static method in class org.apache.spark.streaming.zeromq.ZeroMQUtils
-
Create an input stream that receives messages pushed by a zeromq publisher.
- createStream(JavaStreamingContext, String, Subscribe, Function<byte[][], Iterable<T>>, StorageLevel) - Static method in class org.apache.spark.streaming.zeromq.ZeroMQUtils
-
Create an input stream that receives messages pushed by a zeromq publisher.
- createStream(JavaStreamingContext, String, Subscribe, Function<byte[][], Iterable<T>>) - Static method in class org.apache.spark.streaming.zeromq.ZeroMQUtils
-
Create an input stream that receives messages pushed by a zeromq publisher.
- createStructField(String, DataType, boolean, Metadata) - Static method in class org.apache.spark.sql.types.DataTypes
-
Creates a StructField by specifying the name (name), data type (dataType) and
whether values of this field can be null values (nullable).
- createStructField(String, DataType, boolean) - Static method in class org.apache.spark.sql.types.DataTypes
-
Creates a StructField with empty metadata.
- createStructType(List<StructField>) - Static method in class org.apache.spark.sql.types.DataTypes
-
Creates a StructType with the given list of StructFields (fields).
- createStructType(StructField[]) - Static method in class org.apache.spark.sql.types.DataTypes
-
Creates a StructType with the given StructField array (fields).
- createTopic(String) - Method in class org.apache.spark.streaming.kafka.KafkaTestUtils
-
Create a Kafka topic and wait until it has propagated to the whole cluster.
- createTopicAndPartition(String, Integer) - Method in class org.apache.spark.streaming.kafka.KafkaUtilsPythonHelper
-
- creationSite() - Method in class org.apache.spark.rdd.RDD
-
User code that created this RDD (e.g.
- creationSite() - Method in class org.apache.spark.streaming.dstream.DStream
-
- crosstab(String, String) - Method in class org.apache.spark.sql.DataFrameStatFunctions
-
Computes a pair-wise frequency table of the given columns.
- CrossValidator - Class in org.apache.spark.ml.tuning
-
:: Experimental ::
K-fold cross validation.
- CrossValidator(String) - Constructor for class org.apache.spark.ml.tuning.CrossValidator
-
- CrossValidator() - Constructor for class org.apache.spark.ml.tuning.CrossValidator
-
- CrossValidatorModel - Class in org.apache.spark.ml.tuning
-
:: Experimental ::
Model from k-fold cross validation.
- cube(Column...) - Method in class org.apache.spark.sql.DataFrame
-
Create a multi-dimensional cube for the current DataFrame using the specified columns,
so we can run aggregation on them.
- cube(String, String...) - Method in class org.apache.spark.sql.DataFrame
-
Create a multi-dimensional cube for the current DataFrame using the specified columns,
so we can run aggregation on them.
- cube(Seq<Column>) - Method in class org.apache.spark.sql.DataFrame
-
Create a multi-dimensional cube for the current DataFrame using the specified columns,
so we can run aggregation on them.
- cube(String, Seq<String>) - Method in class org.apache.spark.sql.DataFrame
-
Create a multi-dimensional cube for the current DataFrame using the specified columns,
so we can run aggregation on them.
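A cube over n columns yields one aggregation per subset of those columns, 2^n grouping sets in total, including the empty set for the grand total. The plain-Java sketch below only enumerates those grouping sets; the class name CubeGroupingSetsSketch and the column names used in the note are hypothetical, and nothing here calls or reproduces Spark code.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: lists the grouping sets a cube aggregates over.
final class CubeGroupingSetsSketch {
    /** Returns every subset of the given columns, from the full set down to the empty set. */
    static List<List<String>> groupingSets(List<String> columns) {
        int n = columns.size();
        if (n > 30) {
            throw new IllegalArgumentException("too many columns for an int bitmask");
        }
        List<List<String>> sets = new ArrayList<>();
        // Each bit of the mask decides whether the column at that index is kept.
        for (int mask = (1 << n) - 1; mask >= 0; mask--) {
            List<String> subset = new ArrayList<>();
            for (int i = 0; i < n; i++) {
                if ((mask & (1 << i)) != 0) {
                    subset.add(columns.get(i));
                }
            }
            sets.add(subset);
        }
        return sets;
    }
}
```

For the two columns "department" and "gender" this produces four grouping sets: both columns, each column alone, and the empty set.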
- cumeDist() - Static method in class org.apache.spark.sql.functions
-
Window function: returns the cumulative distribution of values within a window partition,
i.e.
- currentAttemptId() - Method in interface org.apache.spark.SparkStageInfo
-
- currentAttemptId() - Method in class org.apache.spark.SparkStageInfoImpl
-
- currPrefLocs(Partition) - Method in class org.apache.spark.rdd.PartitionCoalescer
-
- databaseTypeDefinition() - Method in class org.apache.spark.sql.jdbc.JdbcType
-
- dataDistribution() - Method in class org.apache.spark.status.api.v1.RDDStorageInfo
-
- DataFrame - Class in org.apache.spark.sql
-
:: Experimental ::
A distributed collection of data organized into named columns.
- DataFrame(SQLContext, LogicalPlan) - Constructor for class org.apache.spark.sql.DataFrame
-
A constructor that automatically analyzes the logical plan.
- DataFrameNaFunctions - Class in org.apache.spark.sql
-
:: Experimental ::
Functionality for working with missing data in DataFrames.
- DataFrameReader - Class in org.apache.spark.sql
-
:: Experimental ::
Interface used to load a DataFrame from external storage systems (e.g.
- DataFrameStatFunctions - Class in org.apache.spark.sql
-
:: Experimental ::
Statistic functions for DataFrames.
- DataFrameWriter - Class in org.apache.spark.sql
-
:: Experimental ::
Interface used to write a DataFrame to external storage systems (e.g.
- dataSchema() - Method in class org.apache.spark.sql.sources.HadoopFsRelation
-
Specifies schema of actual data files.
- DataType - Class in org.apache.spark.sql.types
-
:: DeveloperApi ::
The base type of all Spark SQL data types.
- DataType() - Constructor for class org.apache.spark.sql.types.DataType
-
- dataType() - Method in class org.apache.spark.sql.types.StructField
-
- dataType() - Method in class org.apache.spark.sql.UserDefinedFunction
-
- DataTypes - Class in org.apache.spark.sql.types
-
To get/create specific data type, users should use singleton objects and factory methods
provided by this class.
- DataTypes() - Constructor for class org.apache.spark.sql.types.DataTypes
-
- DataValidators - Class in org.apache.spark.mllib.util
-
:: DeveloperApi ::
A collection of methods used to validate data before applying ML algorithms.
- DataValidators() - Constructor for class org.apache.spark.mllib.util.DataValidators
-
- date() - Method in class org.apache.spark.sql.ColumnName
-
Creates a new StructField of type date.
- DateType - Static variable in class org.apache.spark.sql.types.DataTypes
-
Gets the DateType object.
- DateType - Class in org.apache.spark.sql.types
-
:: DeveloperApi ::
The data type representing java.sql.Date values.
- decayFactor() - Method in class org.apache.spark.mllib.clustering.StreamingKMeans
-
- decimal() - Method in class org.apache.spark.sql.ColumnName
-
Creates a new StructField of type decimal.
- decimal(int, int) - Method in class org.apache.spark.sql.ColumnName
-
Creates a new StructField of type decimal.
- Decimal - Class in org.apache.spark.sql.types
-
A mutable implementation of BigDecimal that can hold a Long if values are small enough.
- Decimal() - Constructor for class org.apache.spark.sql.types.Decimal
-
- DecimalType - Class in org.apache.spark.sql.types
-
- DecimalType(Option<PrecisionInfo>) - Constructor for class org.apache.spark.sql.types.DecimalType
-
- DecisionTree - Class in org.apache.spark.mllib.tree
-
:: Experimental ::
A class which implements a decision tree learning algorithm for classification and regression.
- DecisionTree(Strategy) - Constructor for class org.apache.spark.mllib.tree.DecisionTree
-
- DecisionTreeClassificationModel - Class in org.apache.spark.ml.classification
-
:: Experimental ::
Decision tree model for classification.
- DecisionTreeClassifier - Class in org.apache.spark.ml.classification
-
:: Experimental ::
Decision tree learning algorithm for classification.
- DecisionTreeClassifier(String) - Constructor for class org.apache.spark.ml.classification.DecisionTreeClassifier
-
- DecisionTreeClassifier() - Constructor for class org.apache.spark.ml.classification.DecisionTreeClassifier
-
- DecisionTreeModel - Class in org.apache.spark.mllib.tree.model
-
:: Experimental ::
Decision tree model for classification or regression.
- DecisionTreeModel(Node, Enumeration.Value) - Constructor for class org.apache.spark.mllib.tree.model.DecisionTreeModel
-
- DecisionTreeRegressionModel - Class in org.apache.spark.ml.regression
-
:: Experimental ::
Decision tree
model for regression.
- DecisionTreeRegressor - Class in org.apache.spark.ml.regression
-
:: Experimental ::
Decision tree
learning algorithm
for regression.
- DecisionTreeRegressor(String) - Constructor for class org.apache.spark.ml.regression.DecisionTreeRegressor
-
- DecisionTreeRegressor() - Constructor for class org.apache.spark.ml.regression.DecisionTreeRegressor
-
- defaultAttr() - Static method in class org.apache.spark.ml.attribute.BinaryAttribute
-
The default binary attribute.
- defaultAttr() - Static method in class org.apache.spark.ml.attribute.NominalAttribute
-
The default nominal attribute.
- defaultAttr() - Static method in class org.apache.spark.ml.attribute.NumericAttribute
-
The default numeric attribute.
- defaultMinPartitions() - Method in class org.apache.spark.api.java.JavaSparkContext
-
Default min number of partitions for Hadoop RDDs when not given by user
- defaultMinPartitions() - Method in class org.apache.spark.SparkContext
-
Default min number of partitions for Hadoop RDDs when not given by user.
Notice that we use math.min so the "defaultMinPartitions" cannot be higher than 2.
- defaultMinSplits() - Method in class org.apache.spark.api.java.JavaSparkContext
-
- defaultMinSplits() - Method in class org.apache.spark.SparkContext
-
Default min number of partitions for Hadoop RDDs when not given by user
- defaultParallelism() - Method in class org.apache.spark.api.java.JavaSparkContext
-
Default level of parallelism to use when not given by user.
- defaultParallelism() - Method in class org.apache.spark.SparkContext
-
Default level of parallelism to use when not given by user.
- defaultParamMap() - Method in interface org.apache.spark.ml.param.Params
-
Internal param map for default values.
- defaultParams(String) - Static method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
-
- defaultParams(Enumeration.Value) - Static method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
-
- defaultPartitioner(RDD<?>, Seq<RDD<?>>) - Static method in class org.apache.spark.Partitioner
-
Choose a partitioner to use for a cogroup-like operation between a number of RDDs.
- defaultSize() - Method in class org.apache.spark.sql.types.ArrayType
-
The default size of a value of the ArrayType is 100 * the default size of the element type.
- defaultSize() - Method in class org.apache.spark.sql.types.BinaryType
-
The default size of a value of the BinaryType is 4096 bytes.
- defaultSize() - Method in class org.apache.spark.sql.types.BooleanType
-
The default size of a value of the BooleanType is 1 byte.
- defaultSize() - Method in class org.apache.spark.sql.types.ByteType
-
The default size of a value of the ByteType is 1 byte.
- defaultSize() - Method in class org.apache.spark.sql.types.DataType
-
The default size of a value of this data type, used internally for size estimation.
- defaultSize() - Method in class org.apache.spark.sql.types.DateType
-
The default size of a value of the DateType is 4 bytes.
- defaultSize() - Method in class org.apache.spark.sql.types.DecimalType
-
The default size of a value of the DecimalType is 4096 bytes.
- defaultSize() - Method in class org.apache.spark.sql.types.DoubleType
-
The default size of a value of the DoubleType is 8 bytes.
- defaultSize() - Method in class org.apache.spark.sql.types.FloatType
-
The default size of a value of the FloatType is 4 bytes.
- defaultSize() - Method in class org.apache.spark.sql.types.IntegerType
-
The default size of a value of the IntegerType is 4 bytes.
- defaultSize() - Method in class org.apache.spark.sql.types.LongType
-
The default size of a value of the LongType is 8 bytes.
- defaultSize() - Method in class org.apache.spark.sql.types.MapType
-
The default size of a value of the MapType is
100 * (the default size of the key type + the default size of the value type).
- defaultSize() - Method in class org.apache.spark.sql.types.NullType
-
- defaultSize() - Method in class org.apache.spark.sql.types.ShortType
-
The default size of a value of the ShortType is 2 bytes.
- defaultSize() - Method in class org.apache.spark.sql.types.StringType
-
The default size of a value of the StringType is 4096 bytes.
- defaultSize() - Method in class org.apache.spark.sql.types.StructType
-
The default size of a value of the StructType is the total default sizes of all field types.
- defaultSize() - Method in class org.apache.spark.sql.types.TimestampType
-
The default size of a value of the TimestampType is 12 bytes.
- defaultSize() - Method in class org.apache.spark.sql.types.UserDefinedType
-
The default size of a value of the UserDefinedType is 4096 bytes.
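The composite-type rules listed above for ArrayType, MapType, and StructType reduce to simple arithmetic. The following is a minimal sketch in plain Java, not Spark's implementation: the class `DefaultSizeSketch` and its method names are hypothetical, and only the byte counts are taken from the entries above.

```java
/** Hypothetical helper (not part of Spark): re-computes the size estimates listed above. */
final class DefaultSizeSketch {
    // Byte counts copied from the index entries above.
    static final int INTEGER_SIZE = 4;
    static final int LONG_SIZE = 8;
    static final int STRING_SIZE = 4096;

    /** ArrayType: 100 * the default size of the element type. */
    static int arraySize(int elementSize) {
        return 100 * elementSize;
    }

    /** MapType: 100 * (default size of the key type + default size of the value type). */
    static int mapSize(int keySize, int valueSize) {
        return 100 * (keySize + valueSize);
    }

    /** StructType: the total of the default sizes of all field types. */
    static int structSize(int... fieldSizes) {
        int total = 0;
        for (int size : fieldSizes) {
            total += size;
        }
        return total;
    }
}
```

Under these rules an array of integers is estimated at 100 * 4 = 400 bytes, and a map from strings to longs at 100 * (4096 + 8) = 410400 bytes.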
- defaultStategy(Enumeration.Value) - Static method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- defaultStrategy(String) - Static method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- defaultStrategy() - Static method in class org.apache.spark.streaming.receiver.ActorSupervisorStrategy
-
- degree() - Method in class org.apache.spark.ml.feature.PolynomialExpansion
-
The polynomial degree to expand, which should be >= 1.
- degrees() - Method in class org.apache.spark.graphx.GraphOps
-
The degree of each vertex in the graph.
- degreesOfFreedom() - Method in class org.apache.spark.mllib.stat.test.ChiSqTestResult
-
- degreesOfFreedom() - Method in interface org.apache.spark.mllib.stat.test.TestResult
-
Returns the degree(s) of freedom of the hypothesis test.
- delegate() - Method in class org.apache.spark.InterruptibleIterator
-
- dense(int, int, double[]) - Static method in class org.apache.spark.mllib.linalg.Matrices
-
Creates a column-major dense matrix.
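"Column-major" means the values array holds the first column, then the second, and so on. As a rough illustration in plain Java (the class `ColumnMajorSketch` and its `get` method are hypothetical and not part of Spark), entry (i, j) of a matrix with `numRows` rows sits at offset `j * numRows + i`:

```java
/** Hypothetical helper (not part of Spark): index arithmetic for a column-major layout. */
final class ColumnMajorSketch {
    /** Returns entry (i, j) of a matrix with numRows rows whose values are stored column by column. */
    static double get(double[] values, int numRows, int i, int j) {
        return values[j * numRows + i];
    }
}
```

For example, the 3 x 2 matrix with rows (1, 2), (3, 4), (5, 6) would be passed as the array {1, 3, 5, 2, 4, 6}, so row 1, column 1 is found at offset 1 * 3 + 1 = 4.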
- dense(double, double...) - Static method in class org.apache.spark.mllib.linalg.Vectors
-
Creates a dense vector from its values.
- dense(double, Seq<Object>) - Static method in class org.apache.spark.mllib.linalg.Vectors
-
Creates a dense vector from its values.
- dense(double[]) - Static method in class org.apache.spark.mllib.linalg.Vectors
-
Creates a dense vector from a double array.
- DenseMatrix - Class in org.apache.spark.mllib.linalg
-
Column-major dense matrix.
- DenseMatrix(int, int, double[], boolean) - Constructor for class org.apache.spark.mllib.linalg.DenseMatrix
-
- DenseMatrix(int, int, double[]) - Constructor for class org.apache.spark.mllib.linalg.DenseMatrix
-
Column-major dense matrix.
- denseRank() - Static method in class org.apache.spark.sql.functions
-
Window function: returns the rank of rows within a window partition, without any gaps.
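"Without any gaps" means tied rows share a rank and the next distinct value receives the next consecutive rank. The sketch below shows that rule on an already sorted array in plain Java. It is an illustration only, not how Spark evaluates the window function, and the class `DenseRankSketch` is hypothetical.

```java
/** Hypothetical helper (not part of Spark): dense ranking over an already sorted array. */
final class DenseRankSketch {
    /** Assigns ranks 1, 2, 3, ... where equal neighbours share a rank and no rank is skipped. */
    static int[] denseRank(int[] sorted) {
        int[] ranks = new int[sorted.length];
        int rank = 0;
        for (int i = 0; i < sorted.length; i++) {
            // Move to the next rank only when the value changes.
            if (i == 0 || sorted[i] != sorted[i - 1]) {
                rank++;
            }
            ranks[i] = rank;
        }
        return ranks;
    }
}
```

With the input {10, 10, 20, 30, 30, 40} the ranks come out as {1, 1, 2, 3, 3, 4}; a ranking that leaves gaps would instead assign 3 to the value 20.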
- DenseVector - Class in org.apache.spark.mllib.linalg
-
A dense vector represented by a value array.
- DenseVector(double[]) - Constructor for class org.apache.spark.mllib.linalg.DenseVector
-
- dependencies() - Method in class org.apache.spark.rdd.RDD
-
Get the list of dependencies of this RDD, taking into account whether the
RDD is checkpointed or not.
- dependencies() - Method in class org.apache.spark.streaming.dstream.DStream
-
List of parent DStreams on which this DStream depends.
- dependencies() - Method in class org.apache.spark.streaming.dstream.InputDStream
-
- Dependency<T> - Class in org.apache.spark
-
:: DeveloperApi ::
Base class for dependencies.
- Dependency() - Constructor for class org.apache.spark.Dependency
-
- depth() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel
-
Get depth of tree.
- desc() - Method in class org.apache.spark.sql.Column
-
Returns an ordering used in sorting.
- desc(String) - Static method in class org.apache.spark.sql.functions
-
Returns a sort expression based on the descending order of the column.
- desc() - Method in class org.apache.spark.util.MethodIdentifier
-
- describe(String...) - Method in class org.apache.spark.sql.DataFrame
-
Computes statistics for numeric columns, including count, mean, stddev, min, and max.
- describe(Seq<String>) - Method in class org.apache.spark.sql.DataFrame
-
Computes statistics for numeric columns, including count, mean, stddev, min, and max.
- describeTopics(int) - Method in class org.apache.spark.mllib.clustering.DistributedLDAModel
-
- describeTopics(int) - Method in class org.apache.spark.mllib.clustering.LDAModel
-
Return the topics described by weighted terms.
- describeTopics() - Method in class org.apache.spark.mllib.clustering.LDAModel
-
Return the topics described by weighted terms.
- describeTopics(int) - Method in class org.apache.spark.mllib.clustering.LocalLDAModel
-
- description() - Method in class org.apache.spark.ExceptionFailure
-
- description() - Method in class org.apache.spark.status.api.v1.JobData
-
- description() - Method in class org.apache.spark.storage.StorageLevel
-
- DeserializationStream - Class in org.apache.spark.serializer
-
:: DeveloperApi ::
A stream for reading serialized objects.
- DeserializationStream() - Constructor for class org.apache.spark.serializer.DeserializationStream
-
- deserialize(ByteBuffer, ClassTag<T>) - Method in class org.apache.spark.serializer.SerializerInstance
-
- deserialize(ByteBuffer, ClassLoader, ClassTag<T>) - Method in class org.apache.spark.serializer.SerializerInstance
-
- deserialize(Object) - Method in class org.apache.spark.sql.types.UserDefinedType
-
Convert a SQL datum to the user type
- deserialized() - Method in class org.apache.spark.storage.MemoryEntry
-
- deserialized() - Method in class org.apache.spark.storage.StorageLevel
-
- deserializeStream(InputStream) - Method in class org.apache.spark.serializer.SerializerInstance
-
- destroy() - Method in class org.apache.spark.broadcast.Broadcast
-
Destroy all data and metadata related to this broadcast variable.
- details() - Method in class org.apache.spark.scheduler.StageInfo
-
- details() - Method in class org.apache.spark.status.api.v1.StageData
-
- determineBounds(ArrayBuffer<Tuple2<K, Object>>, int, Ordering<K>, ClassTag<K>) - Static method in class org.apache.spark.RangePartitioner
-
Determines the bounds for range partitioning from candidates with weights indicating how many
items each represents.
- DeveloperApi - Annotation Type in org.apache.spark.annotation
-
A lower-level, unstable API intended for developers.
- diag(Vector) - Static method in class org.apache.spark.mllib.linalg.DenseMatrix
-
Generate a diagonal matrix in DenseMatrix
format from the supplied values.
- diag(Vector) - Static method in class org.apache.spark.mllib.linalg.Matrices
-
Generate a diagonal matrix in Matrix
format from the supplied values.
- diff(RDD<Tuple2<Object, VD>>) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
-
- diff(VertexRDD<VD>) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
-
- diff(RDD<Tuple2<Object, VD>>) - Method in class org.apache.spark.graphx.VertexRDD
-
For each vertex present in both this and other, diff returns only those vertices with
differing values; for values that are different, keeps the values from other.
- diff(VertexRDD<VD>) - Method in class org.apache.spark.graphx.VertexRDD
-
For each vertex present in both this and other, diff returns only those vertices with
differing values; for values that are different, keeps the values from other.
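The diff semantics described above can be sketched with ordinary maps standing in for vertex sets. This is an illustration in plain Java, not GraphX code: the class `VertexDiffSketch` is hypothetical, and a `Map<Long, V>` from vertex id to attribute is an assumed stand-in for a VertexRDD.

```java
import java.util.Map;
import java.util.Objects;
import java.util.TreeMap;

/** Hypothetical helper (not part of GraphX): diff over maps of vertex id to attribute. */
final class VertexDiffSketch {
    /** Keeps ids present in both maps whose values differ, with the value taken from other. */
    static <V> Map<Long, V> diff(Map<Long, V> self, Map<Long, V> other) {
        Map<Long, V> changed = new TreeMap<>();
        for (Map.Entry<Long, V> entry : other.entrySet()) {
            Long id = entry.getKey();
            // Only vertices present on both sides are candidates.
            if (self.containsKey(id) && !Objects.equals(self.get(id), entry.getValue())) {
                changed.put(id, entry.getValue()); // keep the value from other
            }
        }
        return changed;
    }
}
```

Given {1: a, 2: b, 3: c} and an other of {2: b, 3: x, 4: d}, only vertex 3 is returned, carrying the value x: vertex 2 is unchanged, and vertices 1 and 4 each appear on one side only.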
- disableOutputSpecValidation() - Static method in class org.apache.spark.rdd.PairRDDFunctions
-
- DISK_ONLY - Static variable in class org.apache.spark.api.java.StorageLevels
-
- DISK_ONLY() - Static method in class org.apache.spark.storage.StorageLevel
-
- DISK_ONLY_2 - Static variable in class org.apache.spark.api.java.StorageLevels
-
- DISK_ONLY_2() - Static method in class org.apache.spark.storage.StorageLevel
-
- diskBytesSpilled() - Method in class org.apache.spark.status.api.v1.ExecutorStageSummary
-
- diskBytesSpilled() - Method in class org.apache.spark.status.api.v1.StageData
-
- diskBytesSpilled() - Method in class org.apache.spark.status.api.v1.TaskMetricDistributions
-
- diskBytesSpilled() - Method in class org.apache.spark.status.api.v1.TaskMetrics
-
- diskSize() - Method in class org.apache.spark.storage.BlockStatus
-
- diskSize() - Method in class org.apache.spark.storage.RDDInfo
-
- diskUsed() - Method in class org.apache.spark.status.api.v1.ExecutorSummary
-
- diskUsed() - Method in class org.apache.spark.status.api.v1.RDDDataDistribution
-
- diskUsed() - Method in class org.apache.spark.status.api.v1.RDDPartitionInfo
-
- diskUsed() - Method in class org.apache.spark.status.api.v1.RDDStorageInfo
-
- diskUsed() - Method in class org.apache.spark.storage.StorageStatus
-
Return the disk space used by this block manager.
- diskUsedByRdd(int) - Method in class org.apache.spark.storage.StorageStatus
-
Return the disk space used by the given RDD in this block manager in O(1) time.
- dist(Vector) - Method in class org.apache.spark.util.Vector
-
- distinct() - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
Return a new RDD containing the distinct elements in this RDD.
- distinct(int) - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
Return a new RDD containing the distinct elements in this RDD.
- distinct() - Method in class org.apache.spark.api.java.JavaPairRDD
-
Return a new RDD containing the distinct elements in this RDD.
- distinct(int) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Return a new RDD containing the distinct elements in this RDD.
- distinct() - Method in class org.apache.spark.api.java.JavaRDD
-
Return a new RDD containing the distinct elements in this RDD.
- distinct(int) - Method in class org.apache.spark.api.java.JavaRDD
-
Return a new RDD containing the distinct elements in this RDD.
- distinct(int, Ordering<T>) - Method in class org.apache.spark.rdd.RDD
-
Return a new RDD containing the distinct elements in this RDD.
- distinct() - Method in class org.apache.spark.rdd.RDD
-
Return a new RDD containing the distinct elements in this RDD.
- distinct() - Method in class org.apache.spark.sql.DataFrame
-
- DistributedLDAModel - Class in org.apache.spark.mllib.clustering
-
:: Experimental ::
- DistributedMatrix - Interface in org.apache.spark.mllib.linalg.distributed
-
Represents a distributively stored matrix backed by one or more RDDs.
- div(Duration) - Method in class org.apache.spark.streaming.Duration
-
- divide(Object) - Method in class org.apache.spark.sql.Column
-
Divides this expression by another expression.
- divide(double) - Method in class org.apache.spark.util.Vector
-
- doc() - Method in class org.apache.spark.ml.param.Param
-
- docConcentration() - Method in class org.apache.spark.mllib.clustering.EMLDAOptimizer
-
- dot(Vector) - Method in class org.apache.spark.util.Vector
-
- doubleAccumulator(double) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Create an
Accumulator
double variable, which tasks can "add" values
to using the
add
method.
- doubleAccumulator(double, String) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Create an
Accumulator
double variable, which tasks can "add" values
to using the
add
method.
- DoubleArrayParam - Class in org.apache.spark.ml.param
-
:: DeveloperApi ::
Specialized version of Param[Array[Double]] for Java.
- DoubleArrayParam(Params, String, String, Function1<double[], Object>) - Constructor for class org.apache.spark.ml.param.DoubleArrayParam
-
- DoubleArrayParam(Params, String, String) - Constructor for class org.apache.spark.ml.param.DoubleArrayParam
-
- DoubleFlatMapFunction<T> - Interface in org.apache.spark.api.java.function
-
A function that returns zero or more records of type Double from each input record.
- DoubleFunction<T> - Interface in org.apache.spark.api.java.function
-
A function that returns Doubles, and can be used to construct DoubleRDDs.
- DoubleParam - Class in org.apache.spark.ml.param
-
:: DeveloperApi ::
Specialized version of Param[Double] for Java.
- DoubleParam(String, String, String, Function1<Object, Object>) - Constructor for class org.apache.spark.ml.param.DoubleParam
-
- DoubleParam(String, String, String) - Constructor for class org.apache.spark.ml.param.DoubleParam
-
- DoubleParam(org.apache.spark.ml.util.Identifiable, String, String, Function1<Object, Object>) - Constructor for class org.apache.spark.ml.param.DoubleParam
-
- DoubleParam(org.apache.spark.ml.util.Identifiable, String, String) - Constructor for class org.apache.spark.ml.param.DoubleParam
-
- DoubleRDDFunctions - Class in org.apache.spark.rdd
-
Extra functions available on RDDs of Doubles through an implicit conversion.
- DoubleRDDFunctions(RDD<Object>) - Constructor for class org.apache.spark.rdd.DoubleRDDFunctions
-
- doubleRDDToDoubleRDDFunctions(RDD<Object>) - Static method in class org.apache.spark.rdd.RDD
-
- doubleRDDToDoubleRDDFunctions(RDD<Object>) - Static method in class org.apache.spark.SparkContext
-
- doubleToDoubleWritable(double) - Static method in class org.apache.spark.SparkContext
-
- doubleToMultiplier(double) - Static method in class org.apache.spark.util.Vector
-
- DoubleType - Static variable in class org.apache.spark.sql.types.DataTypes
-
Gets the DoubleType object.
- DoubleType - Class in org.apache.spark.sql.types
-
:: DeveloperApi ::
The data type representing Double
values.
- doubleWritableConverter() - Static method in class org.apache.spark.SparkContext
-
- DRIVER_EXTRA_CLASSPATH - Static variable in class org.apache.spark.launcher.SparkLauncher
-
Configuration key for the driver class path.
- DRIVER_EXTRA_JAVA_OPTIONS - Static variable in class org.apache.spark.launcher.SparkLauncher
-
Configuration key for the driver VM options.
- DRIVER_EXTRA_LIBRARY_PATH - Static variable in class org.apache.spark.launcher.SparkLauncher
-
Configuration key for the driver native library path.
- DRIVER_IDENTIFIER() - Static method in class org.apache.spark.SparkContext
-
Executor id for the driver.
- DRIVER_MEMORY - Static variable in class org.apache.spark.launcher.SparkLauncher
-
Configuration key for the driver memory.
- driverActorSystemName() - Static method in class org.apache.spark.SparkEnv
-
- drop(String) - Method in class org.apache.spark.sql.DataFrame
-
Returns a new
DataFrame
with a column dropped.
- drop(Column) - Method in class org.apache.spark.sql.DataFrame
-
Returns a new
DataFrame
with a column dropped.
- drop() - Method in class org.apache.spark.sql.DataFrameNaFunctions
-
Returns a new
DataFrame
that drops rows containing any null values.
- drop(String) - Method in class org.apache.spark.sql.DataFrameNaFunctions
-
Returns a new
DataFrame
that drops rows containing null values.
- drop(String[]) - Method in class org.apache.spark.sql.DataFrameNaFunctions
-
Returns a new
DataFrame
that drops rows containing any null values
in the specified columns.
- drop(Seq<String>) - Method in class org.apache.spark.sql.DataFrameNaFunctions
-
(Scala-specific) Returns a new
DataFrame
that drops rows containing any null values
in the specified columns.
- drop(String, String[]) - Method in class org.apache.spark.sql.DataFrameNaFunctions
-
Returns a new
DataFrame
that drops rows containing null values
in the specified columns.
- drop(String, Seq<String>) - Method in class org.apache.spark.sql.DataFrameNaFunctions
-
(Scala-specific) Returns a new
DataFrame
that drops rows containing null values
in the specified columns.
- drop(int) - Method in class org.apache.spark.sql.DataFrameNaFunctions
-
Returns a new
DataFrame
that drops rows containing fewer than
minNonNulls
non-null values.
- drop(int, String[]) - Method in class org.apache.spark.sql.DataFrameNaFunctions
-
Returns a new
DataFrame
that drops rows containing fewer than
minNonNulls
non-null
values in the specified columns.
- drop(int, Seq<String>) - Method in class org.apache.spark.sql.DataFrameNaFunctions
-
(Scala-specific) Returns a new
DataFrame
that drops rows containing fewer than
minNonNulls
non-null values in the specified columns.
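The minNonNulls rule in the drop entries above amounts to counting non-null values per row and keeping the row only when the count reaches the threshold. Here is a minimal sketch in plain Java, assuming rows are represented as `Object[]` arrays. The class `DropNullsSketch` is hypothetical and is not the DataFrameNaFunctions implementation.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not part of Spark): the minNonNulls rule applied to Object[] rows. */
final class DropNullsSketch {
    /** Keeps only the rows that contain at least minNonNulls non-null values. */
    static List<Object[]> drop(int minNonNulls, List<Object[]> rows) {
        List<Object[]> kept = new ArrayList<>();
        for (Object[] row : rows) {
            int nonNulls = 0;
            for (Object value : row) {
                if (value != null) {
                    nonNulls++;
                }
            }
            // A row with too few non-null values is dropped.
            if (nonNulls >= minNonNulls) {
                kept.add(row);
            }
        }
        return kept;
    }
}
```

In this sketch, setting the threshold to a row's column count keeps only rows with no nulls, and a threshold of 1 drops only rows that are entirely null.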
- dropDuplicates() - Method in class org.apache.spark.sql.DataFrame
-
- dropDuplicates(Seq<String>) - Method in class org.apache.spark.sql.DataFrame
-
(Scala-specific) Returns a new
DataFrame
with duplicate rows removed, considering only
the subset of columns.
- dropDuplicates(String[]) - Method in class org.apache.spark.sql.DataFrame
-
Returns a new
DataFrame
with duplicate rows removed, considering only
the subset of columns.
- dropLast() - Method in class org.apache.spark.ml.feature.OneHotEncoder
-
Whether to drop the last category in the encoded vector (default: true)
- dropTempTable(String) - Method in class org.apache.spark.sql.SQLContext
-
- Dst - Static variable in class org.apache.spark.graphx.TripletFields
-
Expose the destination and edge fields but not the source field.
- dstAttr() - Method in class org.apache.spark.graphx.EdgeContext
-
The vertex attribute of the edge's destination vertex.
- dstAttr() - Method in class org.apache.spark.graphx.EdgeTriplet
-
The destination vertex attribute
- dstAttr() - Method in class org.apache.spark.graphx.impl.AggregatingEdgeContext
-
- dstId() - Method in class org.apache.spark.graphx.Edge
-
- dstId() - Method in class org.apache.spark.graphx.EdgeContext
-
The vertex id of the edge's destination vertex.
- dstId() - Method in class org.apache.spark.graphx.impl.AggregatingEdgeContext
-
- dstream() - Method in class org.apache.spark.streaming.api.java.JavaDStream
-
- dstream() - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
- dstream() - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
- DStream<T> - Class in org.apache.spark.streaming.dstream
-
A Discretized Stream (DStream), the basic abstraction in Spark Streaming, is a continuous
sequence of RDDs (of the same type) representing a continuous stream of data (see
org.apache.spark.rdd.RDD in the Spark core documentation for more details on RDDs).
- DStream(StreamingContext, ClassTag<T>) - Constructor for class org.apache.spark.streaming.dstream.DStream
-
- dtypes() - Method in class org.apache.spark.sql.DataFrame
-
Returns all column names and their data types as an array.
- duration() - Method in class org.apache.spark.scheduler.TaskInfo
-
- Duration - Class in org.apache.spark.streaming
-
- Duration(long) - Constructor for class org.apache.spark.streaming.Duration
-
- Durations - Class in org.apache.spark.streaming
-
- Durations() - Constructor for class org.apache.spark.streaming.Durations
-
- f() - Method in class org.apache.spark.sql.UserDefinedFunction
-
- f1Measure() - Method in class org.apache.spark.mllib.evaluation.MultilabelMetrics
-
Returns document-based f1-measure averaged by the number of documents
- f1Measure(double) - Method in class org.apache.spark.mllib.evaluation.MultilabelMetrics
-
Returns f1-measure for a given label (category)
- failed() - Method in class org.apache.spark.scheduler.TaskInfo
-
- failedJobs() - Method in class org.apache.spark.ui.jobs.JobProgressListener
-
- failedStages() - Method in class org.apache.spark.ui.jobs.JobProgressListener
-
- failedTasks() - Method in class org.apache.spark.status.api.v1.ExecutorStageSummary
-
- failedTasks() - Method in class org.apache.spark.status.api.v1.ExecutorSummary
-
- failureReason() - Method in class org.apache.spark.scheduler.StageInfo
-
If the stage failed, the reason why.
- FAIR() - Static method in class org.apache.spark.scheduler.SchedulingMode
-
- falsePositiveRate(double) - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics
-
Returns false positive rate for a given label (category)
- feature() - Method in class org.apache.spark.mllib.tree.model.Split
-
- featureIndex() - Method in class org.apache.spark.ml.tree.CategoricalSplit
-
- featureIndex() - Method in class org.apache.spark.ml.tree.ContinuousSplit
-
- featureIndex() - Method in interface org.apache.spark.ml.tree.Split
-
Index of feature which this split tests
- features() - Method in class org.apache.spark.mllib.regression.LabeledPoint
-
- FeatureType - Class in org.apache.spark.mllib.tree.configuration
-
:: Experimental ::
Enum to describe whether a feature is "continuous" or "categorical"
- FeatureType() - Constructor for class org.apache.spark.mllib.tree.configuration.FeatureType
-
- featureType() - Method in class org.apache.spark.mllib.tree.model.Split
-
- FetchFailed - Class in org.apache.spark
-
:: DeveloperApi ::
Task failed to fetch shuffle data from a remote node.
- FetchFailed(BlockManagerId, int, int, int, String) - Constructor for class org.apache.spark.FetchFailed
-
- fetchPct() - Method in class org.apache.spark.scheduler.RuntimePercentage
-
- fetchWaitTime() - Method in class org.apache.spark.status.api.v1.ShuffleReadMetricDistributions
-
- fetchWaitTime() - Method in class org.apache.spark.status.api.v1.ShuffleReadMetrics
-
- field() - Method in class org.apache.spark.storage.BroadcastBlockId
-
- fieldIndex(String) - Method in interface org.apache.spark.sql.Row
-
Returns the index of a given field name.
- fieldIndex(String) - Method in class org.apache.spark.sql.types.StructType
-
Returns the index of a given field.
- fieldNames() - Method in class org.apache.spark.sql.types.StructType
-
Returns all field names in an array.
- fields() - Method in class org.apache.spark.sql.types.StructType
-
- FIFO() - Static method in class org.apache.spark.scheduler.SchedulingMode
-
- files() - Method in class org.apache.spark.SparkContext
-
- fileStream(String, Class<K>, Class<V>, Class<F>) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Create an input stream that monitors a Hadoop-compatible filesystem
for new files and reads them using the given key-value types and input format.
- fileStream(String, Class<K>, Class<V>, Class<F>, Function<Path, Boolean>, boolean) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Create an input stream that monitors a Hadoop-compatible filesystem
for new files and reads them using the given key-value types and input format.
- fileStream(String, Class<K>, Class<V>, Class<F>, Function<Path, Boolean>, boolean, Configuration) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Create an input stream that monitors a Hadoop-compatible filesystem
for new files and reads them using the given key-value types and input format.
- fileStream(String, ClassTag<K>, ClassTag<V>, ClassTag<F>) - Method in class org.apache.spark.streaming.StreamingContext
-
Create an input stream that monitors a Hadoop-compatible filesystem
for new files and reads them using the given key-value types and input format.
- fileStream(String, Function1<Path, Object>, boolean, ClassTag<K>, ClassTag<V>, ClassTag<F>) - Method in class org.apache.spark.streaming.StreamingContext
-
Create an input stream that monitors a Hadoop-compatible filesystem
for new files and reads them using the given key-value types and input format.
- fileStream(String, Function1<Path, Object>, boolean, Configuration, ClassTag<K>, ClassTag<V>, ClassTag<F>) - Method in class org.apache.spark.streaming.StreamingContext
-
Create an input stream that monitors a Hadoop-compatible filesystem
for new files and reads them using the given key-value types and input format.
- fill(double) - Method in class org.apache.spark.sql.DataFrameNaFunctions
-
Returns a new DataFrame that replaces null values in numeric columns with value.
- fill(String) - Method in class org.apache.spark.sql.DataFrameNaFunctions
-
Returns a new DataFrame that replaces null values in string columns with value.
- fill(double, String[]) - Method in class org.apache.spark.sql.DataFrameNaFunctions
-
Returns a new
DataFrame
that replaces null values in specified numeric columns.
- fill(double, Seq<String>) - Method in class org.apache.spark.sql.DataFrameNaFunctions
-
(Scala-specific) Returns a new
DataFrame
that replaces null values in specified
numeric columns.
- fill(String, String[]) - Method in class org.apache.spark.sql.DataFrameNaFunctions
-
Returns a new
DataFrame
that replaces null values in specified string columns.
- fill(String, Seq<String>) - Method in class org.apache.spark.sql.DataFrameNaFunctions
-
(Scala-specific) Returns a new
DataFrame
that replaces null values in
specified string columns.
- fill(Map<String, Object>) - Method in class org.apache.spark.sql.DataFrameNaFunctions
-
Returns a new
DataFrame
that replaces null values.
- fill(Map<String, Object>) - Method in class org.apache.spark.sql.DataFrameNaFunctions
-
(Scala-specific) Returns a new
DataFrame
that replaces null values.
- filter(Function<Double, Boolean>) - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
Return a new RDD containing only the elements that satisfy a predicate.
- filter(Function<Tuple2<K, V>, Boolean>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Return a new RDD containing only the elements that satisfy a predicate.
- filter(Function<T, Boolean>) - Method in class org.apache.spark.api.java.JavaRDD
-
Return a new RDD containing only the elements that satisfy a predicate.
- filter(Function1<Graph<VD, ED>, Graph<VD2, ED2>>, Function1<EdgeTriplet<VD2, ED2>, Object>, Function2<Object, VD2, Object>, ClassTag<VD2>, ClassTag<ED2>) - Method in class org.apache.spark.graphx.GraphOps
-
Filter the graph by computing some values to filter on, and applying the predicates.
- filter(Function1<EdgeTriplet<VD, ED>, Object>, Function2<Object, VD, Object>) - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
-
- filter(Function1<Tuple2<Object, VD>, Object>) - Method in class org.apache.spark.graphx.VertexRDD
-
Restricts the vertex set to the set of vertices satisfying the given predicate.
- filter(Params) - Method in class org.apache.spark.ml.param.ParamMap
-
Filters this param map for the given parent.
- filter(Function1<T, Object>) - Method in class org.apache.spark.rdd.RDD
-
Return a new RDD containing only the elements that satisfy a predicate.
- filter(Column) - Method in class org.apache.spark.sql.DataFrame
-
Filters rows using the given condition.
- filter(String) - Method in class org.apache.spark.sql.DataFrame
-
Filters rows using the given SQL expression.
- Filter - Class in org.apache.spark.sql.sources
-
A filter predicate for data sources.
- Filter() - Constructor for class org.apache.spark.sql.sources.Filter
-
- filter(Function<T, Boolean>) - Method in class org.apache.spark.streaming.api.java.JavaDStream
-
Return a new DStream containing only the elements that satisfy a predicate.
- filter(Function<Tuple2<K, V>, Boolean>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream containing only the elements that satisfy a predicate.
- filter(Function1<T, Object>) - Method in class org.apache.spark.streaming.dstream.DStream
-
Return a new DStream containing only the elements that satisfy a predicate.
- filterByRange(K, K) - Method in class org.apache.spark.rdd.OrderedRDDFunctions
-
Returns an RDD containing only the elements in the inclusive range lower to upper.
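"Inclusive" here means that keys equal to either bound are kept. The following plain-Java sketch shows the comparison on a list of keys. It is illustrative only: the class `RangeFilterSketch` is hypothetical, and it filters bare keys rather than the key-value pairs of an RDD.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not part of Spark): inclusive range filtering over a list of keys. */
final class RangeFilterSketch {
    /** Keeps keys k with lower <= k <= upper; both bounds are included. */
    static <K extends Comparable<K>> List<K> filterByRange(List<K> keys, K lower, K upper) {
        List<K> kept = new ArrayList<>();
        for (K key : keys) {
            if (key.compareTo(lower) >= 0 && key.compareTo(upper) <= 0) {
                kept.add(key);
            }
        }
        return kept;
    }
}
```

Filtering the keys 1, 5, 10, 15 with bounds 5 and 10 keeps both 5 and 10.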
- filterWith(Function1<Object, A>, Function2<T, A, Object>) - Method in class org.apache.spark.rdd.RDD
-
Filters this RDD with p, where p takes an additional parameter of type A.
- findSynonyms(String, int) - Method in class org.apache.spark.mllib.feature.Word2VecModel
-
Find synonyms of a word
- findSynonyms(Vector, int) - Method in class org.apache.spark.mllib.feature.Word2VecModel
-
Find synonyms of the vector representation of a word
- finished() - Method in class org.apache.spark.scheduler.TaskInfo
-
- finishTime() - Method in class org.apache.spark.scheduler.TaskInfo
-
The time when the task has completed successfully (including the time to remotely fetch
results, if necessary).
- first() - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
- first() - Method in class org.apache.spark.api.java.JavaPairRDD
-
- first() - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return the first element in this RDD.
- first() - Method in class org.apache.spark.rdd.RDD
-
Return the first element in this RDD.
- first() - Method in class org.apache.spark.sql.DataFrame
-
Returns the first row.
- first(Column) - Static method in class org.apache.spark.sql.functions
-
Aggregate function: returns the first value in a group.
- first(String) - Static method in class org.apache.spark.sql.functions
-
Aggregate function: returns the first value of a column in a group.
- fit(DataFrame) - Method in class org.apache.spark.ml.classification.OneVsRest
-
- fit(DataFrame, ParamPair<?>, ParamPair<?>...) - Method in class org.apache.spark.ml.Estimator
-
Fits a single model to the input data with optional parameters.
- fit(DataFrame, ParamPair<?>, Seq<ParamPair<?>>) - Method in class org.apache.spark.ml.Estimator
-
Fits a single model to the input data with optional parameters.
- fit(DataFrame, ParamMap) - Method in class org.apache.spark.ml.Estimator
-
Fits a single model to the input data with provided parameter map.
- fit(DataFrame) - Method in class org.apache.spark.ml.Estimator
-
Fits a model to the input data.
- fit(DataFrame, ParamMap[]) - Method in class org.apache.spark.ml.Estimator
-
Fits multiple models to the input data with multiple sets of parameters.
- fit(DataFrame) - Method in class org.apache.spark.ml.feature.IDF
-
- fit(DataFrame) - Method in class org.apache.spark.ml.feature.StandardScaler
-
- fit(DataFrame) - Method in class org.apache.spark.ml.feature.StringIndexer
-
- fit(DataFrame) - Method in class org.apache.spark.ml.feature.VectorIndexer
-
- fit(DataFrame) - Method in class org.apache.spark.ml.feature.Word2Vec
-
- fit(DataFrame) - Method in class org.apache.spark.ml.Pipeline
-
Fits the pipeline to the input dataset with additional parameters.
- fit(DataFrame) - Method in class org.apache.spark.ml.Predictor
-
- fit(DataFrame) - Method in class org.apache.spark.ml.recommendation.ALS
-
- fit(DataFrame) - Method in class org.apache.spark.ml.tuning.CrossValidator
-
- fit(RDD<LabeledPoint>) - Method in class org.apache.spark.mllib.feature.ChiSqSelector
-
Returns a ChiSquared feature selector.
- fit(RDD<Vector>) - Method in class org.apache.spark.mllib.feature.IDF
-
Computes the inverse document frequency.
- fit(JavaRDD<Vector>) - Method in class org.apache.spark.mllib.feature.IDF
-
Computes the inverse document frequency.
- fit(RDD<Vector>) - Method in class org.apache.spark.mllib.feature.PCA
-
Computes a PCAModel that contains the principal components of the input vectors.
- fit(JavaRDD<Vector>) - Method in class org.apache.spark.mllib.feature.PCA
-
Java-friendly version of fit()
- fit(RDD<Vector>) - Method in class org.apache.spark.mllib.feature.StandardScaler
-
Computes the mean and variance and stores as a model to be used for later scaling.
- fit(RDD<S>) - Method in class org.apache.spark.mllib.feature.Word2Vec
-
- fit(JavaRDD<S>) - Method in class org.apache.spark.mllib.feature.Word2Vec
-
Computes the vector representation of each word in vocabulary (Java version).
- flatMap(FlatMapFunction<T, U>) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return a new RDD by first applying a function to all elements of this
RDD, and then flattening the results.
- flatMap(Function1<T, TraversableOnce<U>>, ClassTag<U>) - Method in class org.apache.spark.rdd.RDD
-
Return a new RDD by first applying a function to all elements of this
RDD, and then flattening the results.
- flatMap(Function1<Row, TraversableOnce<R>>, ClassTag<R>) - Method in class org.apache.spark.sql.DataFrame
-
Returns a new RDD by first applying a function to all rows of this DataFrame,
and then flattening the results.
- flatMap(FlatMapFunction<T, U>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Return a new DStream by applying a function to all elements of this DStream,
and then flattening the results
- flatMap(Function1<T, Traversable<U>>, ClassTag<U>) - Method in class org.apache.spark.streaming.dstream.DStream
-
Return a new DStream by applying a function to all elements of this DStream,
and then flattening the results
- FlatMapFunction<T,R> - Interface in org.apache.spark.api.java.function
-
A function that returns zero or more output records from each input record.
- FlatMapFunction2<T1,T2,R> - Interface in org.apache.spark.api.java.function
-
A function that takes two inputs and returns zero or more output records.
- flatMapToDouble(DoubleFlatMapFunction<T>) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return a new RDD by first applying a function to all elements of this
RDD, and then flattening the results.
- flatMapToPair(PairFlatMapFunction<T, K2, V2>) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return a new RDD by first applying a function to all elements of this
RDD, and then flattening the results.
- flatMapToPair(PairFlatMapFunction<T, K2, V2>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Return a new DStream by applying a function to all elements of this DStream,
and then flattening the results
- flatMapValues(Function<V, Iterable<U>>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Pass each value in the key-value pair RDD through a flatMap function without changing the
keys; this also retains the original RDD's partitioning.
- flatMapValues(Function1<V, TraversableOnce<U>>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Pass each value in the key-value pair RDD through a flatMap function without changing the
keys; this also retains the original RDD's partitioning.
- flatMapValues(Function<V, Iterable<U>>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying a flatmap function to the value of each key-value pair in
'this' DStream without changing the key.
- flatMapValues(Function1<V, TraversableOnce<U>>, ClassTag<U>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying a flatmap function to the value of each key-value pair in
'this' DStream without changing the key.
- flatMapWith(Function1<Object, A>, boolean, Function2<T, A, Seq<U>>, ClassTag<U>) - Method in class org.apache.spark.rdd.RDD
-
FlatMaps f over this RDD, where f takes an additional parameter of type A.
- FloatParam - Class in org.apache.spark.ml.param
-
:: DeveloperApi ::
Specialized version of Param[Float] for Java.
- FloatParam(String, String, String, Function1<Object, Object>) - Constructor for class org.apache.spark.ml.param.FloatParam
-
- FloatParam(String, String, String) - Constructor for class org.apache.spark.ml.param.FloatParam
-
- FloatParam(org.apache.spark.ml.util.Identifiable, String, String, Function1<Object, Object>) - Constructor for class org.apache.spark.ml.param.FloatParam
-
- FloatParam(org.apache.spark.ml.util.Identifiable, String, String) - Constructor for class org.apache.spark.ml.param.FloatParam
-
- floatToFloatWritable(float) - Static method in class org.apache.spark.SparkContext
-
- FloatType - Static variable in class org.apache.spark.sql.types.DataTypes
-
Gets the FloatType object.
- FloatType - Class in org.apache.spark.sql.types
-
:: DeveloperApi ::
The data type representing Float values.
- floatWritableConverter() - Static method in class org.apache.spark.SparkContext
-
- floor(Column) - Static method in class org.apache.spark.sql.functions
-
Computes the floor of the given value.
- floor(String) - Static method in class org.apache.spark.sql.functions
-
Computes the floor of the given column.
- floor(Duration) - Method in class org.apache.spark.streaming.Time
-
- floor(Duration, Time) - Method in class org.apache.spark.streaming.Time
-
- FlumeUtils - Class in org.apache.spark.streaming.flume
-
- FlumeUtils() - Constructor for class org.apache.spark.streaming.flume.FlumeUtils
-
- flush() - Method in class org.apache.spark.io.SnappyOutputStreamWrapper
-
- flush() - Method in class org.apache.spark.serializer.SerializationStream
-
- flush() - Method in class org.apache.spark.storage.TimeTrackingOutputStream
-
- fMeasure(double, double) - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics
-
Returns f-measure for a given label (category)
- fMeasure(double) - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics
-
Returns f1-measure for a given label (category)
- fMeasure() - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics
-
Returns f-measure (equal to precision and recall, because precision equals recall).
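For reference, the textbook F-measure with weight beta combines precision P and recall R as shown below; beta = 1 gives the F1-measure. This is the standard definition rather than something taken from the Spark source. When P = R, as the entry above states for the overall metric, the expression reduces to that common value.

```latex
F_\beta = \frac{(1 + \beta^2)\, P\, R}{\beta^2 P + R},
\qquad
P = R \;\Rightarrow\; F_\beta = \frac{(1 + \beta^2)\, P^2}{(\beta^2 + 1)\, P} = P .
```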
- fMeasureByThreshold(double) - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
-
Returns the (threshold, F-Measure) curve.
- fMeasureByThreshold() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
-
Returns the (threshold, F-Measure) curve with beta = 1.0.
- fold(T, Function2<T, T, T>) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Aggregate the elements of each partition, and then the results for all the partitions, using a
given associative and commutative function and a neutral "zero value".
- fold(T, Function2<T, T, T>) - Method in class org.apache.spark.rdd.RDD
-
Aggregate the elements of each partition, and then the results for all the partitions, using a
given associative and commutative function and a neutral "zero value".
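A minimal sketch of the fold semantics described above, in plain Java with no Spark dependency. The class `FoldSketch`, its `fold` method, and the hand-built partition lists are hypothetical names used only for illustration. Each partition is folded starting from the zero value, and the per-partition results are then folded together, again starting from the zero value. The second call suggests why the zero value should be neutral: a non-neutral zero is applied once per partition and once more in the final merge.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.BinaryOperator;

// Hypothetical illustration only: simulates fold over hand-built "partitions"
// without using Spark.
final class FoldSketch {

    // Fold each partition from `zero`, then fold the partition results from `zero`.
    static <T> T fold(List<List<T>> partitions, T zero, BinaryOperator<T> op) {
        T result = zero;
        for (List<T> partition : partitions) {
            T partial = zero;
            for (T element : partition) {
                partial = op.apply(partial, element);
            }
            result = op.apply(result, partial);
        }
        return result;
    }

    public static void main(String[] args) {
        List<List<Integer>> partitions = Arrays.asList(
                Arrays.asList(1, 2),
                Arrays.asList(3, 4),
                Arrays.asList(5));

        // Neutral zero for addition: 1 + 2 + 3 + 4 + 5.
        System.out.println(fold(partitions, 0, Integer::sum));   // prints 15

        // Non-neutral zero: added once per partition (3 times) and once in the merge.
        System.out.println(fold(partitions, 10, Integer::sum));  // prints 55
    }
}
```

With three partitions, a zero value of 10 contributes four times, so the result is 15 + 40 = 55 instead of 15.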
- foldByKey(V, Partitioner, Function2<V, V, V>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Merge the values for each key using an associative function and a neutral "zero value" which
may be added to the result an arbitrary number of times, and must not change the result
(e.g., Nil for list concatenation, 0 for addition, or 1 for multiplication).
- foldByKey(V, int, Function2<V, V, V>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Merge the values for each key using an associative function and a neutral "zero value" which
may be added to the result an arbitrary number of times, and must not change the result
(e.g., Nil for list concatenation, 0 for addition, or 1 for multiplication).
- foldByKey(V, Function2<V, V, V>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Merge the values for each key using an associative function and a neutral "zero value"
which may be added to the result an arbitrary number of times, and must not change the result
(e.g., Nil for list concatenation, 0 for addition, or 1 for multiplication).
- foldByKey(V, Partitioner, Function2<V, V, V>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Merge the values for each key using an associative function and a neutral "zero value" which
may be added to the result an arbitrary number of times, and must not change the result
(e.g., Nil for list concatenation, 0 for addition, or 1 for multiplication).
- foldByKey(V, int, Function2<V, V, V>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Merge the values for each key using an associative function and a neutral "zero value" which
may be added to the result an arbitrary number of times, and must not change the result
(e.g., Nil for list concatenation, 0 for addition, or 1 for multiplication).
- foldByKey(V, Function2<V, V, V>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Merge the values for each key using an associative function and a neutral "zero value" which
may be added to the result an arbitrary number of times, and must not change the result
(e.g., Nil for list concatenation, 0 for addition, or 1 for multiplication).
- foreach(VoidFunction<T>) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Applies a function f to all elements of this RDD.
- foreach(Function1<T, BoxedUnit>) - Method in class org.apache.spark.rdd.RDD
-
Applies a function f to all elements of this RDD.
- foreach(Function1<Row, BoxedUnit>) - Method in class org.apache.spark.sql.DataFrame
-
Applies a function f to all rows.
- foreach(Function<R, Void>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Deprecated.
As of release 0.9.0, replaced by foreachRDD
- foreach(Function2<R, Time, Void>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Deprecated.
As of release 0.9.0, replaced by foreachRDD
- foreach(Function1<RDD<T>, BoxedUnit>) - Method in class org.apache.spark.streaming.dstream.DStream
-
Deprecated.
As of 0.9.0, replaced by foreachRDD.
- foreach(Function2<RDD<T>, Time, BoxedUnit>) - Method in class org.apache.spark.streaming.dstream.DStream
-
Deprecated.
As of 0.9.0, replaced by foreachRDD.
- foreachActive(Function3<Object, Object, Object, BoxedUnit>) - Method in interface org.apache.spark.mllib.linalg.Matrix
-
Applies a function f to all the active elements of dense and sparse matrix.
- foreachActive(Function2<Object, Object, BoxedUnit>) - Method in interface org.apache.spark.mllib.linalg.Vector
-
Applies a function f to all the active elements of dense and sparse vector.
- foreachAsync(VoidFunction<T>) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
The asynchronous version of the foreach action, which
applies a function f to all the elements of this RDD.
- foreachAsync(Function1<T, BoxedUnit>) - Method in class org.apache.spark.rdd.AsyncRDDActions
-
Applies a function f to all elements of this RDD.
- foreachPartition(VoidFunction<Iterator<T>>) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Applies a function f to each partition of this RDD.
- foreachPartition(Function1<Iterator<T>, BoxedUnit>) - Method in class org.apache.spark.rdd.RDD
-
Applies a function f to each partition of this RDD.
- foreachPartition(Function1<Iterator<Row>, BoxedUnit>) - Method in class org.apache.spark.sql.DataFrame
-
Applies a function f to each partition of this DataFrame.
- foreachPartitionAsync(VoidFunction<Iterator<T>>) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
The asynchronous version of the foreachPartition action, which
applies a function f to each partition of this RDD.
- foreachPartitionAsync(Function1<Iterator<T>, BoxedUnit>) - Method in class org.apache.spark.rdd.AsyncRDDActions
-
Applies a function f to each partition of this RDD.
- foreachRDD(Function<R, Void>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Apply a function to each RDD in this DStream.
- foreachRDD(Function2<R, Time, Void>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Apply a function to each RDD in this DStream.
- foreachRDD(Function1<RDD<T>, BoxedUnit>) - Method in class org.apache.spark.streaming.dstream.DStream
-
Apply a function to each RDD in this DStream.
- foreachRDD(Function2<RDD<T>, Time, BoxedUnit>) - Method in class org.apache.spark.streaming.dstream.DStream
-
Apply a function to each RDD in this DStream.
- foreachWith(Function1<Object, A>, Function2<T, A, BoxedUnit>) - Method in class org.apache.spark.rdd.RDD
-
Applies f to each element of this RDD, where f takes an additional parameter of type A.
- format(String) - Method in class org.apache.spark.sql.DataFrameReader
-
Specifies the input data source format.
- format(String) - Method in class org.apache.spark.sql.DataFrameWriter
-
Specifies the underlying output data source.
- formatVersion() - Method in interface org.apache.spark.mllib.util.Saveable
-
Current version of model save/load format.
- FPGrowth - Class in org.apache.spark.mllib.fpm
-
:: Experimental ::
- FPGrowth() - Constructor for class org.apache.spark.mllib.fpm.FPGrowth
-
Constructs a default instance with default parameters {minSupport: 0.3, numPartitions: same
as the input data}.
- FPGrowth.FreqItemset<Item> - Class in org.apache.spark.mllib.fpm
-
Frequent itemset.
- FPGrowth.FreqItemset(Object, long) - Constructor for class org.apache.spark.mllib.fpm.FPGrowth.FreqItemset
-
- FPGrowthModel<Item> - Class in org.apache.spark.mllib.fpm
-
:: Experimental ::
- FPGrowthModel(RDD<FPGrowth.FreqItemset<Item>>, ClassTag<Item>) - Constructor for class org.apache.spark.mllib.fpm.FPGrowthModel
-
- fractional() - Method in class org.apache.spark.sql.types.DecimalType
-
- fractional() - Method in class org.apache.spark.sql.types.DoubleType
-
- fractional() - Method in class org.apache.spark.sql.types.FloatType
-
- freq() - Method in class org.apache.spark.mllib.fpm.FPGrowth.FreqItemset
-
- freqItems(String[], double) - Method in class org.apache.spark.sql.DataFrameStatFunctions
-
Finding frequent items for columns, possibly with false positives.
- freqItems(String[]) - Method in class org.apache.spark.sql.DataFrameStatFunctions
-
Finding frequent items for columns, possibly with false positives.
- freqItems(Seq<String>, double) - Method in class org.apache.spark.sql.DataFrameStatFunctions
-
(Scala-specific) Finding frequent items for columns, possibly with false positives.
- freqItems(Seq<String>) - Method in class org.apache.spark.sql.DataFrameStatFunctions
-
(Scala-specific) Finding frequent items for columns, possibly with false positives.
- freqItemsets() - Method in class org.apache.spark.mllib.fpm.FPGrowthModel
-
- fromAvroFlumeEvent(AvroFlumeEvent) - Static method in class org.apache.spark.streaming.flume.SparkFlumeEvent
-
- fromCaseClassString(String) - Static method in class org.apache.spark.sql.types.DataType
-
Deprecated.
As of 1.2.0, replaced by DataType.fromJson()
- fromCOO(int, int, Iterable<Tuple3<Object, Object, Object>>) - Static method in class org.apache.spark.mllib.linalg.SparseMatrix
-
Generate a SparseMatrix from Coordinate List (COO) format.
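To illustrate what COO input looks like, here is a minimal plain-Java sketch that turns (row, column, value) triplets into compressed sparse column (CSC) arrays. It does not use Spark and is not the library's implementation: the class `CooSketch`, its field names, and the choice of CSC as the target layout are assumptions made for the example, and the sketch assumes the input has no duplicate (row, column) pairs.

```java
import java.util.Arrays;

// Hypothetical illustration only: converts COO triplets to compressed sparse
// column (CSC) arrays in plain Java. Assumes no duplicate (row, col) pairs.
final class CooSketch {

    final int[] colPtrs;     // entries of column j live in slots colPtrs[j] .. colPtrs[j + 1] - 1
    final int[] rowIndices;  // row index of each stored entry
    final double[] values;   // value of each stored entry

    CooSketch(int numCols, int[] rows, int[] cols, double[] vals) {
        int nnz = vals.length;
        colPtrs = new int[numCols + 1];
        rowIndices = new int[nnz];
        values = new double[nnz];

        // Count the entries in each column.
        for (int c : cols) {
            colPtrs[c + 1]++;
        }
        // Prefix sums turn the counts into column start offsets.
        for (int j = 0; j < numCols; j++) {
            colPtrs[j + 1] += colPtrs[j];
        }
        // Place each entry into the next free slot of its column;
        // entries within a column keep their input order.
        int[] next = Arrays.copyOf(colPtrs, numCols);
        for (int k = 0; k < nnz; k++) {
            int slot = next[cols[k]]++;
            rowIndices[slot] = rows[k];
            values[slot] = vals[k];
        }
    }

    public static void main(String[] args) {
        // 3 x 3 matrix with entries (0,0)=1.0, (1,2)=3.0, (2,0)=2.0.
        CooSketch m = new CooSketch(3,
                new int[] {0, 1, 2},
                new int[] {0, 2, 0},
                new double[] {1.0, 3.0, 2.0});
        System.out.println(Arrays.toString(m.colPtrs));     // [0, 2, 2, 3]
        System.out.println(Arrays.toString(m.rowIndices));  // [0, 2, 1]
        System.out.println(Arrays.toString(m.values));      // [1.0, 2.0, 3.0]
    }
}
```

Column 0 holds two entries, column 1 none, and column 2 one, which is what the offsets 0, 2, 2, 3 encode.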
- fromDStream(DStream<T>, ClassTag<T>) - Static method in class org.apache.spark.streaming.api.java.JavaDStream
-
- fromEdgePartitions(RDD<Tuple2<Object, EdgePartition<ED, VD>>>, VD, StorageLevel, StorageLevel, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.impl.GraphImpl
-
Create a graph from EdgePartitions, setting referenced vertices to `defaultVertexAttr`.
- fromEdges(RDD<Edge<ED>>, ClassTag<ED>, ClassTag<VD>) - Static method in class org.apache.spark.graphx.EdgeRDD
-
Creates an EdgeRDD from a set of edges.
- fromEdges(RDD<Edge<ED>>, VD, StorageLevel, StorageLevel, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.Graph
-
Construct a graph from a collection of edges.
- fromEdges(EdgeRDD<?>, int, VD, ClassTag<VD>) - Static method in class org.apache.spark.graphx.VertexRDD
-
Constructs a VertexRDD containing all vertices referred to in edges.
- fromEdgeTuples(RDD<Tuple2<Object, Object>>, VD, Option<PartitionStrategy>, StorageLevel, StorageLevel, ClassTag<VD>) - Static method in class org.apache.spark.graphx.Graph
-
Construct a graph from a collection of edges encoded as vertex id pairs.
- fromExistingRDDs(VertexRDD<VD>, EdgeRDD<ED>, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.impl.GraphImpl
-
Create a graph from a VertexRDD and an EdgeRDD with the same replicated vertex type as the
vertices.
- fromInputDStream(InputDStream<T>, ClassTag<T>) - Static method in class org.apache.spark.streaming.api.java.JavaInputDStream
-
- fromInputDStream(InputDStream<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>) - Static method in class org.apache.spark.streaming.api.java.JavaPairInputDStream
-
- fromJavaDStream(JavaDStream<Tuple2<K, V>>) - Static method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
- fromJavaRDD(JavaRDD<Tuple2<K, V>>) - Static method in class org.apache.spark.api.java.JavaPairRDD
-
Convert a JavaRDD of key-value pairs to JavaPairRDD.
- fromJson(String) - Static method in class org.apache.spark.sql.types.DataType
-
- fromJson(String) - Static method in class org.apache.spark.sql.types.Metadata
-
Creates a Metadata instance from JSON.
- fromName(String) - Static method in class org.apache.spark.ml.attribute.AttributeType
-
- fromOffset() - Method in class org.apache.spark.streaming.kafka.OffsetRange
-
inclusive starting offset
- fromOld(DecisionTreeModel, DecisionTreeClassifier, Map<Object, Object>) - Static method in class org.apache.spark.ml.classification.DecisionTreeClassificationModel
-
(private[ml]) Convert a model from the old API
- fromOld(GradientBoostedTreesModel, GBTClassifier, Map<Object, Object>) - Static method in class org.apache.spark.ml.classification.GBTClassificationModel
-
(private[ml]) Convert a model from the old API
- fromOld(RandomForestModel, RandomForestClassifier, Map<Object, Object>) - Static method in class org.apache.spark.ml.classification.RandomForestClassificationModel
-
(private[ml]) Convert a model from the old API
- fromOld(DecisionTreeModel, DecisionTreeRegressor, Map<Object, Object>) - Static method in class org.apache.spark.ml.regression.DecisionTreeRegressionModel
-
(private[ml]) Convert a model from the old API
- fromOld(GradientBoostedTreesModel, GBTRegressor, Map<Object, Object>) - Static method in class org.apache.spark.ml.regression.GBTRegressionModel
-
(private[ml]) Convert a model from the old API
- fromOld(RandomForestModel, RandomForestRegressor, Map<Object, Object>) - Static method in class org.apache.spark.ml.regression.RandomForestRegressionModel
-
(private[ml]) Convert a model from the old API
- fromOld(Node, Map<Object, Object>) - Static method in class org.apache.spark.ml.tree.Node
-
Create a new Node from the old Node format, recursively creating child nodes as needed.
- fromPairDStream(DStream<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>) - Static method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
- fromPairRDD(RDD<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>) - Static method in class org.apache.spark.mllib.rdd.MLPairRDDFunctions
-
Implicit conversion from a pair RDD to MLPairRDDFunctions.
- fromRDD(RDD<Object>) - Static method in class org.apache.spark.api.java.JavaDoubleRDD
-
- fromRDD(RDD<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>) - Static method in class org.apache.spark.api.java.JavaPairRDD
-
- fromRDD(RDD<T>, ClassTag<T>) - Static method in class org.apache.spark.api.java.JavaRDD
-
- fromRDD(RDD<T>, ClassTag<T>) - Static method in class org.apache.spark.mllib.rdd.RDDFunctions
-
Implicit conversion from an RDD to RDDFunctions.
- fromRdd(RDD<?>) - Static method in class org.apache.spark.storage.RDDInfo
-
- fromReceiverInputDStream(ReceiverInputDStream<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>) - Static method in class org.apache.spark.streaming.api.java.JavaPairReceiverInputDStream
-
- fromReceiverInputDStream(ReceiverInputDStream<T>, ClassTag<T>) - Static method in class org.apache.spark.streaming.api.java.JavaReceiverInputDStream
-
- fromSparkContext(SparkContext) - Static method in class org.apache.spark.api.java.JavaSparkContext
-
- fromStage(Stage, Option<Object>) - Static method in class org.apache.spark.scheduler.StageInfo
-
Construct a StageInfo from a Stage.
- fromString(String) - Static method in enum org.apache.spark.JobExecutionStatus
-
- fromString(String) - Static method in class org.apache.spark.mllib.tree.loss.Losses
-
- fromString(String) - Static method in enum org.apache.spark.status.api.v1.ApplicationStatus
-
- fromString(String) - Static method in enum org.apache.spark.status.api.v1.StageStatus
-
- fromString(String) - Static method in enum org.apache.spark.status.api.v1.TaskSorting
-
- fromString(String) - Static method in class org.apache.spark.storage.StorageLevel
-
:: DeveloperApi ::
Return the StorageLevel object with the specified name.
- fromStructField(StructField) - Static method in class org.apache.spark.ml.attribute.AttributeGroup
-
Creates an attribute group from a StructField instance.
- fullOuterJoin(JavaPairRDD<K, W>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Perform a full outer join of this and other.
- fullOuterJoin(JavaPairRDD<K, W>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Perform a full outer join of this and other.
- fullOuterJoin(JavaPairRDD<K, W>, int) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Perform a full outer join of this and other.
- fullOuterJoin(RDD<Tuple2<K, W>>, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Perform a full outer join of this and other.
- fullOuterJoin(RDD<Tuple2<K, W>>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Perform a full outer join of this and other.
- fullOuterJoin(RDD<Tuple2<K, W>>, int) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Perform a full outer join of this and other.
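As a rough sketch of full outer join semantics, the plain-Java example below joins two in-memory maps and keeps keys that appear on either side. It does not use Spark: `FullOuterJoinSketch`, `Row`, and the use of `java.util.Optional` to mark a missing side are assumptions for illustration, and the optional type used by the Spark Java API itself is not shown in this index.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical illustration only: a full outer join over in-memory maps,
// using java.util.Optional to mark a missing side.
final class FullOuterJoinSketch {

    // One output row: a key plus an optional value from each side.
    static final class Row {
        final String key;
        final Optional<Integer> left;
        final Optional<String> right;

        Row(String key, Optional<Integer> left, Optional<String> right) {
            this.key = key;
            this.left = left;
            this.right = right;
        }

        @Override
        public String toString() {
            return key + " -> (" + left + ", " + right + ")";
        }
    }

    static List<Row> fullOuterJoin(Map<String, List<Integer>> left,
                                   Map<String, List<String>> right) {
        // Visit every key that appears on either side, in sorted order.
        TreeSet<String> keys = new TreeSet<>(left.keySet());
        keys.addAll(right.keySet());

        List<Row> out = new ArrayList<>();
        for (String key : keys) {
            List<Integer> vs = left.getOrDefault(key, Collections.emptyList());
            List<String> ws = right.getOrDefault(key, Collections.emptyList());
            if (ws.isEmpty()) {
                // Key only on the left: the right side is empty.
                for (Integer v : vs) {
                    out.add(new Row(key, Optional.of(v), Optional.empty()));
                }
            } else if (vs.isEmpty()) {
                // Key only on the right: the left side is empty.
                for (String w : ws) {
                    out.add(new Row(key, Optional.empty(), Optional.of(w)));
                }
            } else {
                // Key on both sides: emit every (v, w) combination.
                for (Integer v : vs) {
                    for (String w : ws) {
                        out.add(new Row(key, Optional.of(v), Optional.of(w)));
                    }
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, List<Integer>> left = new TreeMap<>();
        left.put("a", Arrays.asList(1));
        left.put("b", Arrays.asList(2));
        Map<String, List<String>> right = new TreeMap<>();
        right.put("b", Arrays.asList("x", "y"));
        right.put("c", Arrays.asList("z"));

        // Four rows: "a" with no right value, "b" twice, "c" with no left value.
        for (Row row : fullOuterJoin(left, right)) {
            System.out.println(row);
        }
    }
}
```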
- fullOuterJoin(JavaPairDStream<K, W>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying 'full outer join' between RDDs of this DStream and
other DStream.
- fullOuterJoin(JavaPairDStream<K, W>, int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying 'full outer join' between RDDs of this DStream and
other DStream.
- fullOuterJoin(JavaPairDStream<K, W>, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying 'full outer join' between RDDs of this DStream and
other DStream.
- fullOuterJoin(DStream<Tuple2<K, W>>, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying 'full outer join' between RDDs of this DStream and
other DStream.
- fullOuterJoin(DStream<Tuple2<K, W>>, int, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying 'full outer join' between RDDs of this DStream and
other DStream.
- fullOuterJoin(DStream<Tuple2<K, W>>, Partitioner, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying 'full outer join' between RDDs of this DStream and
other DStream.
- fullStackTrace() - Method in class org.apache.spark.ExceptionFailure
-
- Function<T1,R> - Interface in org.apache.spark.api.java.function
-
Base interface for functions whose return types do not create special RDDs.
- Function0<R> - Interface in org.apache.spark.api.java.function
-
A zero-argument function that returns an R.
- Function2<T1,T2,R> - Interface in org.apache.spark.api.java.function
-
A two-argument function that takes arguments of type T1 and T2 and returns an R.
- Function3<T1,T2,T3,R> - Interface in org.apache.spark.api.java.function
-
A three-argument function that takes arguments of type T1, T2 and T3 and returns an R.
- functions - Class in org.apache.spark.sql
-
- functions() - Constructor for class org.apache.spark.sql.functions
-
- FutureAction<T> - Interface in org.apache.spark
-
A future for the result of an action to support cancellation.
- futureExecutionContext() - Static method in class org.apache.spark.rdd.AsyncRDDActions
-
- gain() - Method in class org.apache.spark.ml.tree.InternalNode
-
- gain() - Method in class org.apache.spark.mllib.tree.model.InformationGainStats
-
- gamma1() - Method in class org.apache.spark.graphx.lib.SVDPlusPlus.Conf
-
- gamma2() - Method in class org.apache.spark.graphx.lib.SVDPlusPlus.Conf
-
- gamma6() - Method in class org.apache.spark.graphx.lib.SVDPlusPlus.Conf
-
- gamma7() - Method in class org.apache.spark.graphx.lib.SVDPlusPlus.Conf
-
- GammaGenerator - Class in org.apache.spark.mllib.random
-
:: DeveloperApi ::
Generates i.i.d.
- GammaGenerator(double, double) - Constructor for class org.apache.spark.mllib.random.GammaGenerator
-
- gammaJavaRDD(JavaSparkContext, double, double, long, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
- gammaJavaRDD(JavaSparkContext, double, double, long, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
- gammaJavaRDD(JavaSparkContext, double, double, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
- gammaJavaVectorRDD(JavaSparkContext, double, double, long, int, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
- gammaJavaVectorRDD(JavaSparkContext, double, double, long, int, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
- gammaJavaVectorRDD(JavaSparkContext, double, double, long, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
- gammaRDD(SparkContext, double, double, long, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
Generates an RDD comprised of i.i.d. samples from the gamma distribution with the input
shape and scale.
- gammaVectorRDD(SparkContext, double, double, long, int, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
Generates an RDD[Vector] with vectors containing i.i.d. samples drawn from the
gamma distribution with the input shape and scale.
- gaps() - Method in class org.apache.spark.ml.feature.RegexTokenizer
-
Indicates whether regex splits on gaps (true) or matches tokens (false).
- GaussianMixture - Class in org.apache.spark.mllib.clustering
-
:: Experimental ::
- GaussianMixture() - Constructor for class org.apache.spark.mllib.clustering.GaussianMixture
-
Constructs a default instance.
- GaussianMixtureModel - Class in org.apache.spark.mllib.clustering
-
:: Experimental ::
- GaussianMixtureModel(double[], MultivariateGaussian[]) - Constructor for class org.apache.spark.mllib.clustering.GaussianMixtureModel
-
- gaussians() - Method in class org.apache.spark.mllib.clustering.GaussianMixtureModel
-
- GBTClassificationModel - Class in org.apache.spark.ml.classification
-
:: Experimental ::
Gradient-Boosted Trees (GBTs) model for classification.
- GBTClassificationModel(String, DecisionTreeRegressionModel[], double[]) - Constructor for class org.apache.spark.ml.classification.GBTClassificationModel
-
- GBTClassifier - Class in org.apache.spark.ml.classification
-
:: Experimental ::
Gradient-Boosted Trees (GBTs) learning algorithm for classification.
- GBTClassifier(String) - Constructor for class org.apache.spark.ml.classification.GBTClassifier
-
- GBTClassifier() - Constructor for class org.apache.spark.ml.classification.GBTClassifier
-
- GBTRegressionModel - Class in org.apache.spark.ml.regression
-
:: Experimental ::
- GBTRegressionModel(String, DecisionTreeRegressionModel[], double[]) - Constructor for class org.apache.spark.ml.regression.GBTRegressionModel
-
- GBTRegressor - Class in org.apache.spark.ml.regression
-
:: Experimental ::
Gradient-Boosted Trees (GBTs) learning algorithm for regression.
- GBTRegressor(String) - Constructor for class org.apache.spark.ml.regression.GBTRegressor
-
- GBTRegressor() - Constructor for class org.apache.spark.ml.regression.GBTRegressor
-
- GeneralizedLinearAlgorithm<M extends GeneralizedLinearModel> - Class in org.apache.spark.mllib.regression
-
:: DeveloperApi ::
GeneralizedLinearAlgorithm implements methods to train a Generalized Linear Model (GLM).
- GeneralizedLinearAlgorithm() - Constructor for class org.apache.spark.mllib.regression.GeneralizedLinearAlgorithm
-
- GeneralizedLinearModel - Class in org.apache.spark.mllib.regression
-
:: DeveloperApi ::
GeneralizedLinearModel (GLM) represents a model trained using
GeneralizedLinearAlgorithm.
- GeneralizedLinearModel(Vector, double) - Constructor for class org.apache.spark.mllib.regression.GeneralizedLinearModel
-
- generate(String, String, int, int) - Static method in class org.apache.spark.examples.streaming.KinesisWordProducerASL
-
- generatedRDDs() - Method in class org.apache.spark.streaming.dstream.DStream
-
- generateKMeansRDD(SparkContext, int, int, int, double, int) - Static method in class org.apache.spark.mllib.util.KMeansDataGenerator
-
Generate an RDD containing test data for KMeans.
- generateLinearInput(double, double[], int, int, double) - Static method in class org.apache.spark.mllib.util.LinearDataGenerator
-
For compatibility, data generated without specifying the mean and variance
has zero mean and a variance of (1.0/3.0): the original output range is
[-1, 1] with a uniform distribution, and the variance of a uniform distribution
is (b - a)^2 / 12, which here is (1.0/3.0).
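The variance quoted above follows from the standard formula for a uniform distribution on [a, b], evaluated at a = -1 and b = 1:

```latex
\operatorname{Var}(X) = \frac{(b - a)^2}{12}
= \frac{\bigl(1 - (-1)\bigr)^2}{12} = \frac{4}{12} = \frac{1}{3},
\qquad X \sim \mathrm{Uniform}(a, b),\; a = -1,\; b = 1 .
```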
- generateLinearInput(double, double[], double[], double[], int, int, double) - Static method in class org.apache.spark.mllib.util.LinearDataGenerator
-
- generateLinearInputAsList(double, double[], int, int, double) - Static method in class org.apache.spark.mllib.util.LinearDataGenerator
-
Return a Java List of synthetic data randomly generated according to a
multicollinear model.
- generateLinearRDD(SparkContext, int, int, double, int, double) - Static method in class org.apache.spark.mllib.util.LinearDataGenerator
-
Generate an RDD containing sample data for Linear Regression models - including Ridge, Lasso,
and unregularized variants.
- generateLogisticRDD(SparkContext, int, int, double, int, double) - Static method in class org.apache.spark.mllib.util.LogisticRegressionDataGenerator
-
Generate an RDD containing test data for LogisticRegression.
- generateRandomEdges(int, int, int, long) - Static method in class org.apache.spark.graphx.util.GraphGenerators
-
- geq(Object) - Method in class org.apache.spark.sql.Column
-
Greater than or equal to an expression.
- get() - Method in interface org.apache.spark.FutureAction
-
Blocks and returns the result of this job.
- get(Param<T>) - Method in class org.apache.spark.ml.param.ParamMap
-
Optionally returns the value associated with a param.
- get(Param<T>) - Method in interface org.apache.spark.ml.param.Params
-
Optionally returns the user-supplied value of a param.
- get(String) - Method in class org.apache.spark.SparkConf
-
Get a parameter; throws a NoSuchElementException if it's not set
- get(String, String) - Method in class org.apache.spark.SparkConf
-
Get a parameter, falling back to a default if not set
- get() - Static method in class org.apache.spark.SparkEnv
-
Returns the SparkEnv.
- get(String) - Static method in class org.apache.spark.SparkFiles
-
Get the absolute path of a file added through SparkContext.addFile().
- get(int) - Method in interface org.apache.spark.sql.Row
-
Returns the value at position i.
- get() - Static method in class org.apache.spark.TaskContext
-
Return the currently active TaskContext.
- getActive() - Static method in class org.apache.spark.streaming.StreamingContext
-
:: Experimental ::
- getActiveJobIds() - Method in class org.apache.spark.api.java.JavaSparkStatusTracker
-
Returns an array containing the ids of all active jobs.
- getActiveJobIds() - Method in class org.apache.spark.SparkStatusTracker
-
Returns an array containing the ids of all active jobs.
- getActiveOrCreate(Function0<StreamingContext>) - Static method in class org.apache.spark.streaming.StreamingContext
-
:: Experimental ::
- getActiveOrCreate(String, Function0<StreamingContext>, Configuration, boolean) - Static method in class org.apache.spark.streaming.StreamingContext
-
:: Experimental ::
- getActiveStageIds() - Method in class org.apache.spark.api.java.JavaSparkStatusTracker
-
Returns an array containing the ids of all active stages.
- getActiveStageIds() - Method in class org.apache.spark.SparkStatusTracker
-
Returns an array containing the ids of all active stages.
- getAkkaConf() - Method in class org.apache.spark.SparkConf
-
Get all akka conf variables set on this SparkConf
- getAlgo() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- getAll() - Method in class org.apache.spark.SparkConf
-
Get all parameters as a list of pairs
- getAllConfs() - Method in class org.apache.spark.sql.SQLContext
-
Return all the configuration properties that have been set.
- getAllPools() - Method in class org.apache.spark.SparkContext
-
:: DeveloperApi ::
Return pools for fair scheduler
- getAlpha() - Method in class org.apache.spark.mllib.clustering.LDA
-
Alias for getDocConcentration
- getAppId() - Method in class org.apache.spark.SparkConf
-
Returns the Spark application id, valid in the Driver after TaskScheduler registration and
from the start in the Executor.
- getAs(int) - Method in interface org.apache.spark.sql.Row
-
Returns the value at position i.
- getAs(String) - Method in interface org.apache.spark.sql.Row
-
Returns the value of a given fieldName.
- getAttr(String) - Method in class org.apache.spark.ml.attribute.AttributeGroup
-
Gets an attribute by its name.
- getAttr(int) - Method in class org.apache.spark.ml.attribute.AttributeGroup
-
Gets an attribute by its index.
- getBeta() - Method in class org.apache.spark.mllib.clustering.LDA
-
Alias for getTopicConcentration
- getBlock(BlockId) - Method in class org.apache.spark.storage.StorageStatus
-
Return the given block stored in this block manager in O(1) time.
- getBoolean(String, boolean) - Method in class org.apache.spark.SparkConf
-
Get a parameter as a boolean, falling back to a default if not set
- getBoolean(int) - Method in interface org.apache.spark.sql.Row
-
Returns the value at position i as a primitive boolean.
- getBoolean(String) - Method in class org.apache.spark.sql.types.Metadata
-
Gets a Boolean.
- getBooleanArray(String) - Method in class org.apache.spark.sql.types.Metadata
-
Gets a Boolean array.
- getByte(int) - Method in interface org.apache.spark.sql.Row
-
Returns the value at position i as a primitive byte.
- getBytes() - Method in class org.apache.spark.sql.types.UTF8String
-
- getCachedBlockManagerId(BlockManagerId) - Static method in class org.apache.spark.storage.BlockManagerId
-
- getCachedMetadata(String) - Static method in class org.apache.spark.rdd.HadoopRDD
-
The three methods below are helpers for accessing the local map, a property of the SparkEnv of
the local process.
- getCatalystType(int, String, int, MetadataBuilder) - Method in class org.apache.spark.sql.jdbc.AggregatedDialect
-
- getCatalystType(int, String, int, MetadataBuilder) - Method in class org.apache.spark.sql.jdbc.JdbcDialect
-
Get the custom datatype mapping for the given jdbc meta information.
- getCatalystType(int, String, int, MetadataBuilder) - Static method in class org.apache.spark.sql.jdbc.MySQLDialect
-
- getCatalystType(int, String, int, MetadataBuilder) - Static method in class org.apache.spark.sql.jdbc.PostgresDialect
-
- getCategoricalFeaturesInfo() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- getCategoryMaps() - Method in class org.apache.spark.ml.feature.VectorIndexer.CategoryStats
-
Based on stats collected, decide which features are categorical,
and choose indices for categories.
- getCheckpointDir() - Method in class org.apache.spark.api.java.JavaSparkContext
-
- getCheckpointDir() - Method in class org.apache.spark.SparkContext
-
- getCheckpointFile() - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Gets the name of the file to which this RDD was checkpointed
- getCheckpointFile() - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
-
- getCheckpointFile() - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
-
- getCheckpointFile() - Method in class org.apache.spark.rdd.RDD
-
Gets the name of the file to which this RDD was checkpointed
- getCheckpointFiles() - Method in class org.apache.spark.graphx.Graph
-
Gets the name of the files to which this Graph was checkpointed.
- getCheckpointFiles() - Method in class org.apache.spark.graphx.impl.GraphImpl
-
- getCheckpointInterval() - Method in class org.apache.spark.mllib.clustering.LDA
-
Period (in iterations) between checkpoints.
- getCheckpointInterval() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- getConf() - Method in class org.apache.spark.api.java.JavaSparkContext
-
Return a copy of this JavaSparkContext's configuration.
- getConf() - Method in class org.apache.spark.rdd.HadoopRDD
-
- getConf() - Method in class org.apache.spark.rdd.NewHadoopRDD
-
- getConf() - Method in class org.apache.spark.SparkContext
-
Return a copy of this SparkContext's configuration.
- getConf(String) - Method in class org.apache.spark.sql.SQLContext
-
Return the value of Spark SQL configuration property for the given key.
- getConf(String, String) - Method in class org.apache.spark.sql.SQLContext
-
Return the value of Spark SQL configuration property for the given key.
- getConnection() - Method in interface org.apache.spark.rdd.JdbcRDD.ConnectionFactory
-
- getConvergenceTol() - Method in class org.apache.spark.mllib.clustering.GaussianMixture
-
Return the largest change in log-likelihood at which convergence is
considered to have occurred.
- getDate(int) - Method in interface org.apache.spark.sql.Row
-
Returns the value at position i of date type as java.sql.Date.
- getDecimal(int) - Method in interface org.apache.spark.sql.Row
-
Returns the value at position i of decimal type as java.math.BigDecimal.
- getDefault(Param<T>) - Method in interface org.apache.spark.ml.param.Params
-
Gets the default value of a parameter.
- getDegree() - Method in class org.apache.spark.ml.feature.PolynomialExpansion
-
- getDependencies() - Method in class org.apache.spark.rdd.CoGroupedRDD
-
- getDependencies() - Method in class org.apache.spark.rdd.ShuffledRDD
-
- getDependencies() - Method in class org.apache.spark.rdd.UnionRDD
-
- getDeprecatedConfig(String, SparkConf) - Static method in class org.apache.spark.SparkConf
-
Looks for available deprecated keys for the given config option, and returns the first
value available.
- getDocConcentration() - Method in class org.apache.spark.mllib.clustering.LDA
-
Concentration parameter (commonly named "alpha") for the prior placed on documents'
distributions over topics ("theta").
- getDouble(String, double) - Method in class org.apache.spark.SparkConf
-
Get a parameter as a double, falling back to a default if not set
- getDouble(int) - Method in interface org.apache.spark.sql.Row
-
Returns the value at position i as a primitive double.
- getDouble(String) - Method in class org.apache.spark.sql.types.Metadata
-
Gets a Double.
- getDoubleArray(String) - Method in class org.apache.spark.sql.types.Metadata
-
Gets a Double array.
- getEpsilon() - Method in class org.apache.spark.mllib.clustering.KMeans
-
The distance threshold within which we consider centers to have converged.
- getExecutorEnv() - Method in class org.apache.spark.SparkConf
-
Get all executor environment variables set on this SparkConf
- getExecutorMemoryStatus() - Method in class org.apache.spark.SparkContext
-
Return a map from the slave to the max memory available for caching and the remaining
memory available for caching.
- getExecutorStorageStatus() - Method in class org.apache.spark.SparkContext
-
:: DeveloperApi ::
Return information about blocks stored in all of the slaves
- getField(String) - Method in class org.apache.spark.sql.Column
-
An expression that gets a field by name in a StructType.
- getFinalValue() - Method in class org.apache.spark.partial.PartialResult
-
Blocking method to wait for and return the final value.
- getFloat(int) - Method in interface org.apache.spark.sql.Row
-
Returns the value at position i as a primitive float.
- getGaps() - Method in class org.apache.spark.ml.feature.RegexTokenizer
-
- getImpurity() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- getInitializationMode() - Method in class org.apache.spark.mllib.clustering.KMeans
-
The initialization algorithm.
- getInitializationSteps() - Method in class org.apache.spark.mllib.clustering.KMeans
-
Number of steps for the k-means|| initialization mode
- getInitialModel() - Method in class org.apache.spark.mllib.clustering.GaussianMixture
-
Return the user supplied initial GMM, if supplied
- getInt(String, int) - Method in class org.apache.spark.SparkConf
-
Get a parameter as an integer, falling back to a default if not set
- getInt(int) - Method in interface org.apache.spark.sql.Row
-
Returns the value at position i as a primitive int.
- getItem(Object) - Method in class org.apache.spark.sql.Column
-
An expression that gets an item at position ordinal out of an array,
or gets a value by key key in a MapType.
- getJavaMap(int) - Method in interface org.apache.spark.sql.Row
-
Returns the value at position i of map type as a java.util.Map.
- getJDBCType(DataType) - Method in class org.apache.spark.sql.jdbc.AggregatedDialect
-
- getJDBCType(DataType) - Method in class org.apache.spark.sql.jdbc.JdbcDialect
-
Retrieve the jdbc / sql type for a given datatype.
- getJDBCType(DataType) - Static method in class org.apache.spark.sql.jdbc.PostgresDialect
-
- getJobIdsForGroup(String) - Method in class org.apache.spark.api.java.JavaSparkStatusTracker
-
Return a list of all known jobs in a particular job group.
- getJobIdsForGroup(String) - Method in class org.apache.spark.SparkStatusTracker
-
Return a list of all known jobs in a particular job group.
- getJobInfo(int) - Method in class org.apache.spark.api.java.JavaSparkStatusTracker
-
Returns job information, or null if the job info could not be found or was garbage collected.
- getJobInfo(int) - Method in class org.apache.spark.SparkStatusTracker
-
Returns job information, or None if the job info could not be found or was garbage collected.
- getK() - Method in class org.apache.spark.mllib.clustering.GaussianMixture
-
Return the number of Gaussians in the mixture model
- getK() - Method in class org.apache.spark.mllib.clustering.KMeans
-
Number of clusters to create (k).
- getK() - Method in class org.apache.spark.mllib.clustering.LDA
-
Number of topics to infer.
- getKappa() - Method in class org.apache.spark.mllib.clustering.OnlineLDAOptimizer
-
Learning rate: exponential decay rate
- getLambda() - Method in class org.apache.spark.mllib.classification.NaiveBayes
-
Get the smoothing parameter.
- getLDAModel(double[]) - Method in interface org.apache.spark.mllib.clustering.LDAOptimizer
-
- getLearningRate() - Method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
-
- getLeastGroupHash(String) - Method in class org.apache.spark.rdd.PartitionCoalescer
-
Sorts and gets the least element of the list associated with key in groupHash.
The returned PartitionGroup is the least loaded of all groups that represent the machine "key".
- getList(int) - Method in interface org.apache.spark.sql.Row
-
Returns the value at position i of array type as a List.
- getLocalProperty(String) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Get a local property set in this thread, or null if it is missing.
- getLocalProperty(String) - Method in class org.apache.spark.SparkContext
-
Get a local property set in this thread, or null if it is missing.
- getLong(String, long) - Method in class org.apache.spark.SparkConf
-
Get a parameter as a long, falling back to a default if not set
- getLong(int) - Method in interface org.apache.spark.sql.Row
-
Returns the value at position i as a primitive long.
- getLong(String) - Method in class org.apache.spark.sql.types.Metadata
-
Gets a Long.
- getLongArray(String) - Method in class org.apache.spark.sql.types.Metadata
-
Gets a Long array.
- getLoss() - Method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
-
- getLossType() - Method in class org.apache.spark.ml.classification.GBTClassifier
-
- getLossType() - Method in class org.apache.spark.ml.regression.GBTRegressor
-
- getMap(int) - Method in interface org.apache.spark.sql.Row
-
Returns the value at position i of map type as a Scala Map.
- getMaxBins() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- getMaxDepth() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- getMaxIterations() - Method in class org.apache.spark.mllib.clustering.GaussianMixture
-
Return the maximum number of iterations to run
- getMaxIterations() - Method in class org.apache.spark.mllib.clustering.KMeans
-
Maximum number of iterations to run.
- getMaxIterations() - Method in class org.apache.spark.mllib.clustering.LDA
-
Maximum number of iterations for learning.
- getMaxMemoryInMB() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- getMessage() - Method in exception org.apache.spark.sql.AnalysisException
-
- getMetadata(String) - Method in class org.apache.spark.sql.types.Metadata
-
Gets a Metadata.
- getMetadataArray(String) - Method in class org.apache.spark.sql.types.Metadata
-
Gets a Metadata array.
- getMetricName() - Method in class org.apache.spark.ml.evaluation.BinaryClassificationEvaluator
-
- getMetricName() - Method in class org.apache.spark.ml.evaluation.RegressionEvaluator
-
- getMiniBatchFraction() - Method in class org.apache.spark.mllib.clustering.OnlineLDAOptimizer
-
Mini-batch fraction, which sets the fraction of documents sampled and used in each iteration
- getMinInfoGain() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- getMinInstancesPerNode() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- getMinTokenLength() - Method in class org.apache.spark.ml.feature.RegexTokenizer
-
- getModelType() - Method in class org.apache.spark.mllib.classification.NaiveBayes
-
Get the model type.
- getNode(int, Node) - Static method in class org.apache.spark.mllib.tree.model.Node
-
Traces down from a root node to get the node with the given node index.
- getNumClasses() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- getNumFeatures() - Method in class org.apache.spark.ml.feature.HashingTF
-
- getNumFeatures() - Method in class org.apache.spark.mllib.regression.GeneralizedLinearAlgorithm
-
The dimension of training features.
- getNumIterations() - Method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
-
- getNumValues() - Method in class org.apache.spark.ml.attribute.NominalAttribute
-
Get the number of values, either from numValues or from values.
- getOptimizer() - Method in class org.apache.spark.mllib.clustering.LDA
-
:: DeveloperApi ::
- getOption(String) - Method in class org.apache.spark.SparkConf
-
Get a parameter as an Option
- getOrCreate(SparkConf) - Static method in class org.apache.spark.SparkContext
-
This function may be used to get or instantiate a SparkContext and register it as a
singleton object.
- getOrCreate() - Static method in class org.apache.spark.SparkContext
-
This function may be used to get or instantiate a SparkContext and register it as a
singleton object.
- getOrCreate(SparkContext) - Static method in class org.apache.spark.sql.SQLContext
-
Get the singleton SQLContext if it exists or create a new one using the given SparkContext.
- getOrCreate(String, JavaStreamingContextFactory) - Static method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Deprecated.
As of 1.4.0, replaced by getOrCreate without JavaStreamingContextFactory.
- getOrCreate(String, Configuration, JavaStreamingContextFactory) - Static method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Deprecated.
As of 1.4.0, replaced by getOrCreate without JavaStreamingContextFactory.
- getOrCreate(String, Configuration, JavaStreamingContextFactory, boolean) - Static method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Deprecated.
As of 1.4.0, replaced by getOrCreate without JavaStreamingContextFactory.
- getOrCreate(String, Function0<JavaStreamingContext>) - Static method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Either recreate a StreamingContext from checkpoint data or create a new StreamingContext.
- getOrCreate(String, Function0<JavaStreamingContext>, Configuration) - Static method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Either recreate a StreamingContext from checkpoint data or create a new StreamingContext.
- getOrCreate(String, Function0<JavaStreamingContext>, Configuration, boolean) - Static method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Either recreate a StreamingContext from checkpoint data or create a new StreamingContext.
- getOrCreate(String, Function0<StreamingContext>, Configuration, boolean) - Static method in class org.apache.spark.streaming.StreamingContext
-
Either recreate a StreamingContext from checkpoint data or create a new StreamingContext.
- getOrDefault(Param<T>) - Method in interface org.apache.spark.ml.param.Params
-
Gets the value of a param in the embedded param map or its default value.
- getOrElse(Param<T>, T) - Method in class org.apache.spark.ml.param.ParamMap
-
Returns the value associated with a param or a default value.
- getP() - Method in class org.apache.spark.ml.feature.Normalizer
-
- getParam(String) - Method in interface org.apache.spark.ml.param.Params
-
Gets a param by its name.
- getParents(int) - Method in class org.apache.spark.NarrowDependency
-
Get the parent partitions for a child partition.
- getParents(int) - Method in class org.apache.spark.OneToOneDependency
-
- getParents(int) - Method in class org.apache.spark.RangeDependency
-
- getPartition(long, long, int) - Method in class org.apache.spark.graphx.PartitionStrategy.CanonicalRandomVertexCut$
-
- getPartition(long, long, int) - Method in class org.apache.spark.graphx.PartitionStrategy.EdgePartition1D$
-
- getPartition(long, long, int) - Method in class org.apache.spark.graphx.PartitionStrategy.EdgePartition2D$
-
- getPartition(long, long, int) - Method in interface org.apache.spark.graphx.PartitionStrategy
-
Returns the partition number for a given edge.
- getPartition(long, long, int) - Method in class org.apache.spark.graphx.PartitionStrategy.RandomVertexCut$
-
- getPartition(Object) - Method in class org.apache.spark.HashPartitioner
-
- getPartition(Object) - Method in class org.apache.spark.Partitioner
-
- getPartition(Object) - Method in class org.apache.spark.RangePartitioner
-
- getPartitions() - Method in class org.apache.spark.api.r.BaseRRDD
-
- getPartitions() - Method in class org.apache.spark.rdd.CoGroupedRDD
-
- getPartitions() - Method in class org.apache.spark.rdd.HadoopRDD
-
- getPartitions() - Method in class org.apache.spark.rdd.JdbcRDD
-
- getPartitions() - Method in class org.apache.spark.rdd.NewHadoopRDD
-
- getPartitions() - Method in class org.apache.spark.rdd.PartitionCoalescer
-
- getPartitions() - Method in class org.apache.spark.rdd.ShuffledRDD
-
- getPartitions() - Method in class org.apache.spark.rdd.UnionRDD
-
- getPath() - Method in class org.apache.spark.input.PortableDataStream
-
- getPattern() - Method in class org.apache.spark.ml.feature.RegexTokenizer
-
- getPersistentRDDs() - Method in class org.apache.spark.SparkContext
-
Returns an immutable map of RDDs that have marked themselves as persistent via a cache() call.
- getPoolForName(String) - Method in class org.apache.spark.SparkContext
-
:: DeveloperApi ::
Return the pool associated with the given name, if one exists
- getPreferredLocations(Partition) - Method in class org.apache.spark.rdd.HadoopRDD
-
- getPreferredLocations(Partition) - Method in class org.apache.spark.rdd.NewHadoopRDD
-
- getPreferredLocations(Partition) - Method in class org.apache.spark.rdd.UnionRDD
-
- getQuantileCalculationStrategy() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- getRDDStorageInfo() - Method in class org.apache.spark.SparkContext
-
:: DeveloperApi ::
Return information about which RDDs are cached, whether they are in memory or on disk, how much space
they take, etc.
- getReceiver() - Method in class org.apache.spark.streaming.dstream.ReceiverInputDStream
-
Gets the receiver object that will be sent to the worker nodes
to receive data.
- getRootDirectory() - Static method in class org.apache.spark.SparkFiles
-
Get the root directory that contains files added through SparkContext.addFile().
- getRuns() - Method in class org.apache.spark.mllib.clustering.KMeans
-
:: Experimental ::
Number of runs of the algorithm to execute in parallel.
- getScalingVec() - Method in class org.apache.spark.ml.feature.ElementwiseProduct
-
- getSchedulingMode() - Method in class org.apache.spark.SparkContext
-
Return current scheduling mode
- getSeed() - Method in class org.apache.spark.mllib.clustering.GaussianMixture
-
Return the random seed
- getSeed() - Method in class org.apache.spark.mllib.clustering.KMeans
-
The random seed for cluster initialization.
- getSeed() - Method in class org.apache.spark.mllib.clustering.LDA
-
Random seed
- getSeq(int) - Method in interface org.apache.spark.sql.Row
-
Returns the value at position i of array type as a Scala Seq.
- getSerializer(Serializer) - Static method in class org.apache.spark.serializer.Serializer
-
- getSerializer(Option<Serializer>) - Static method in class org.apache.spark.serializer.Serializer
-
- getShort(int) - Method in interface org.apache.spark.sql.Row
-
Returns the value at position i as a primitive short.
- getSizeAsBytes(String) - Method in class org.apache.spark.SparkConf
-
Get a size parameter as bytes; throws a NoSuchElementException if it's not set.
- getSizeAsBytes(String, String) - Method in class org.apache.spark.SparkConf
-
Get a size parameter as bytes, falling back to a default if not set.
- getSizeAsGb(String) - Method in class org.apache.spark.SparkConf
-
Get a size parameter as Gibibytes; throws a NoSuchElementException if it's not set.
- getSizeAsGb(String, String) - Method in class org.apache.spark.SparkConf
-
Get a size parameter as Gibibytes, falling back to a default if not set.
- getSizeAsKb(String) - Method in class org.apache.spark.SparkConf
-
Get a size parameter as Kibibytes; throws a NoSuchElementException if it's not set.
- getSizeAsKb(String, String) - Method in class org.apache.spark.SparkConf
-
Get a size parameter as Kibibytes, falling back to a default if not set.
- getSizeAsMb(String) - Method in class org.apache.spark.SparkConf
-
Get a size parameter as Mebibytes; throws a NoSuchElementException if it's not set.
- getSizeAsMb(String, String) - Method in class org.apache.spark.SparkConf
-
Get a size parameter as Mebibytes, falling back to a default if not set.
- getSparkHome() - Method in class org.apache.spark.api.java.JavaSparkContext
-
Get Spark's home location from either a value set through the constructor,
or the spark.home Java property, or the SPARK_HOME environment variable
(in that order of preference).
- getSplits() - Method in class org.apache.spark.ml.feature.Bucketizer
-
- getStageInfo(int) - Method in class org.apache.spark.api.java.JavaSparkStatusTracker
-
Returns stage information, or null if the stage info could not be found or was
garbage collected.
- getStageInfo(int) - Method in class org.apache.spark.SparkStatusTracker
-
Returns stage information, or None if the stage info could not be found or was
garbage collected.
- getStages() - Method in class org.apache.spark.ml.Pipeline
-
- getState() - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
:: DeveloperApi ::
- getState() - Method in class org.apache.spark.streaming.StreamingContext
-
:: DeveloperApi ::
- getStorageLevel() - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Get the RDD's current storage level, or StorageLevel.NONE if none is set.
- getStorageLevel() - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
-
- getStorageLevel() - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
-
- getStorageLevel() - Method in class org.apache.spark.rdd.RDD
-
Get the RDD's current storage level, or StorageLevel.NONE if none is set.
- getString(int) - Method in interface org.apache.spark.sql.Row
-
Returns the value at position i as a String object.
- getString(String) - Method in class org.apache.spark.sql.types.Metadata
-
Gets a String.
- getStringArray(String) - Method in class org.apache.spark.sql.types.Metadata
-
Gets a String array.
- getStruct(int) - Method in interface org.apache.spark.sql.Row
-
Returns the value at position i of struct type as a Row object.
- getSubsamplingRate() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- getTau0() - Method in class org.apache.spark.mllib.clustering.OnlineLDAOptimizer
-
A (positive) learning parameter that downweights early iterations.
- getThreadLocal() - Static method in class org.apache.spark.SparkEnv
-
Returns the ThreadLocal SparkEnv.
- getThreshold() - Method in class org.apache.spark.ml.feature.Binarizer
-
- getThreshold() - Method in class org.apache.spark.mllib.classification.LogisticRegressionModel
-
:: Experimental ::
Returns the threshold (if any) used for converting raw prediction scores into 0/1 predictions.
- getThreshold() - Method in class org.apache.spark.mllib.classification.SVMModel
-
:: Experimental ::
Returns the threshold (if any) used for converting raw prediction scores into 0/1 predictions.
- getTimeAsMs(String) - Method in class org.apache.spark.SparkConf
-
Get a time parameter as milliseconds; throws a NoSuchElementException if it's not set.
- getTimeAsMs(String, String) - Method in class org.apache.spark.SparkConf
-
Get a time parameter as milliseconds, falling back to a default if not set.
- getTimeAsSeconds(String) - Method in class org.apache.spark.SparkConf
-
Get a time parameter as seconds; throws a NoSuchElementException if it's not set.
- getTimeAsSeconds(String, String) - Method in class org.apache.spark.SparkConf
-
Get a time parameter as seconds, falling back to a default if not set.
- gettingResult() - Method in class org.apache.spark.scheduler.TaskInfo
-
- gettingResultTime() - Method in class org.apache.spark.scheduler.TaskInfo
-
The time when the task started remotely getting the result.
- getTopicConcentration() - Method in class org.apache.spark.mllib.clustering.LDA
-
Concentration parameter (commonly named "beta" or "eta") for the prior placed on topics'
distributions over terms.
- getTreeStrategy() - Method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
-
- getUseNodeIdCache() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- getValidationTol() - Method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
-
- getValue(int) - Method in class org.apache.spark.ml.attribute.NominalAttribute
-
Gets a value given its index.
- getValuesMap(Seq<String>) - Method in interface org.apache.spark.sql.Row
-
Returns a Map(name -> value) for the requested fieldNames
- getVectors() - Method in class org.apache.spark.mllib.feature.Word2VecModel
-
Returns a map of words to their vector representations.
- Gini - Class in org.apache.spark.mllib.tree.impurity
-
:: Experimental ::
Class for calculating the Gini impurity during binary classification.
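As a rough illustration of the quantity this class computes, the sketch below uses the standard definition of Gini impurity, 1 minus the sum of squared class proportions. It is a standalone, hypothetical helper (GiniImpuritySketch, giniImpurity) and does not reproduce the signatures of the Spark class.

```java
// Hypothetical helper class, not part of Spark: standard Gini impurity from class counts.
final class GiniImpuritySketch {

    // Returns 1 - sum_i p_i^2, where p_i = classCounts[i] / total.
    // An empty node (total of zero) is treated as pure and returns 0.
    static double giniImpurity(double[] classCounts) {
        double total = 0.0;
        for (double count : classCounts) {
            total += count;
        }
        if (total == 0.0) {
            return 0.0;
        }
        double sumOfSquares = 0.0;
        for (double count : classCounts) {
            double proportion = count / total;
            sumOfSquares += proportion * proportion;
        }
        return 1.0 - sumOfSquares;
    }

    public static void main(String[] args) {
        // A perfectly mixed binary node versus a pure node.
        System.out.println(giniImpurity(new double[] {5.0, 5.0}));
        System.out.println(giniImpurity(new double[] {10.0, 0.0}));
    }
}
```

For two classes the impurity is largest when the node is evenly split and zero when all instances share one label.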
- Gini() - Constructor for class org.apache.spark.mllib.tree.impurity.Gini
-
- globalTopicTotals() - Method in class org.apache.spark.mllib.clustering.EMLDAOptimizer
-
Aggregate distributions over topics from all term vertices.
- glom() - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return an RDD created by coalescing all elements within each partition into an array.
- glom() - Method in class org.apache.spark.rdd.RDD
-
Return an RDD created by coalescing all elements within each partition into an array.
- glom() - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Return a new DStream in which each RDD is generated by applying glom() to each RDD of
this DStream.
- glom() - Method in class org.apache.spark.streaming.dstream.DStream
-
Return a new DStream in which each RDD is generated by applying glom() to each RDD of
this DStream.
- gradient() - Method in class org.apache.spark.ml.classification.LogisticAggregator
-
- gradient() - Method in class org.apache.spark.ml.regression.LeastSquaresAggregator
-
- Gradient - Class in org.apache.spark.mllib.optimization
-
:: DeveloperApi ::
Class used to compute the gradient for a loss function, given a single data point.
- Gradient() - Constructor for class org.apache.spark.mllib.optimization.Gradient
-
- gradient(double, double) - Static method in class org.apache.spark.mllib.tree.loss.AbsoluteError
-
Method to calculate the gradients for the gradient boosting calculation for least
absolute error calculation.
- gradient(double, double) - Static method in class org.apache.spark.mllib.tree.loss.LogLoss
-
Method to calculate the loss gradients for the gradient boosting calculation for binary
classification.
The gradient with respect to F(x) is: -4 y / (1 + exp(2 y F(x)))
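The gradient expression in the entry above can be sanity-checked numerically. The sketch below assumes the loss has the form 2 log(1 + exp(-2 y F(x))) with labels y in {-1, +1}; that loss form is an assumption not stated in this entry, and the class and method names are hypothetical, not the Spark API.

```java
// Hypothetical helper class, not part of Spark: compares the quoted gradient
// against a finite-difference derivative of an assumed log loss.
final class LogLossGradientSketch {

    // Assumed loss: 2 * log(1 + exp(-2 * y * F(x))), with label y in {-1, +1}.
    static double loss(double label, double prediction) {
        return 2.0 * Math.log1p(Math.exp(-2.0 * label * prediction));
    }

    // Gradient with respect to F(x), as quoted: -4 y / (1 + exp(2 y F(x))).
    static double gradient(double label, double prediction) {
        return -4.0 * label / (1.0 + Math.exp(2.0 * label * prediction));
    }

    // Central finite-difference approximation of d(loss)/d(prediction).
    static double numericGradient(double label, double prediction, double step) {
        double upper = loss(label, prediction + step);
        double lower = loss(label, prediction - step);
        return (upper - lower) / (2.0 * step);
    }

    public static void main(String[] args) {
        double label = -1.0;
        double prediction = 0.3;
        // The two values should agree closely if the formula matches the assumed loss.
        System.out.println(gradient(label, prediction));
        System.out.println(numericGradient(label, prediction, 1e-6));
    }
}
```

Under the assumed loss, differentiating gives 2 * (-2 y exp(-2 y F)) / (1 + exp(-2 y F)), which simplifies to the quoted -4 y / (1 + exp(2 y F)).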
- gradient(double, double) - Method in interface org.apache.spark.mllib.tree.loss.Loss
-
Method to calculate the gradients for the gradient boosting calculation.
- gradient(double, double) - Static method in class org.apache.spark.mllib.tree.loss.SquaredError
-
Method to calculate the gradients for the gradient boosting calculation for least
squares error calculation.
- GradientBoostedTrees - Class in org.apache.spark.mllib.tree
-
:: Experimental ::
A class that implements Stochastic Gradient Boosting for regression and binary classification.
- GradientBoostedTrees(BoostingStrategy) - Constructor for class org.apache.spark.mllib.tree.GradientBoostedTrees
-
- GradientBoostedTreesModel - Class in org.apache.spark.mllib.tree.model
-
:: Experimental ::
Represents a gradient boosted trees model.
- GradientBoostedTreesModel(Enumeration.Value, DecisionTreeModel[], double[]) - Constructor for class org.apache.spark.mllib.tree.model.GradientBoostedTreesModel
-
- GradientDescent - Class in org.apache.spark.mllib.optimization
-
Class used to solve an optimization problem using Gradient Descent.
- Graph<VD,ED> - Class in org.apache.spark.graphx
-
The Graph abstractly represents a graph with arbitrary objects
associated with vertices and edges.
- graph() - Method in class org.apache.spark.mllib.clustering.EMLDAOptimizer
-
The following fields will only be initialized through the initialize() method
- graph() - Method in class org.apache.spark.streaming.dstream.DStream
-
- graph() - Method in class org.apache.spark.streaming.StreamingContext
-
- GraphGenerators - Class in org.apache.spark.graphx.util
-
A collection of graph generating functions.
- GraphGenerators() - Constructor for class org.apache.spark.graphx.util.GraphGenerators
-
- GraphImpl<VD,ED> - Class in org.apache.spark.graphx.impl
-
An implementation of Graph to support computation on graphs.
- GraphKryoRegistrator - Class in org.apache.spark.graphx
-
Registers GraphX classes with Kryo for improved performance.
- GraphKryoRegistrator() - Constructor for class org.apache.spark.graphx.GraphKryoRegistrator
-
- GraphLoader - Class in org.apache.spark.graphx
-
Provides utilities for loading Graphs from files.
- GraphLoader() - Constructor for class org.apache.spark.graphx.GraphLoader
-
- GraphOps<VD,ED> - Class in org.apache.spark.graphx
-
Contains additional functionality for Graph.
- GraphOps(Graph<VD, ED>, ClassTag<VD>, ClassTag<ED>) - Constructor for class org.apache.spark.graphx.GraphOps
-
- graphToGraphOps(Graph<VD, ED>, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.Graph
-
Implicitly extracts the GraphOps member from a graph.
- GraphXUtils - Class in org.apache.spark.graphx
-
- GraphXUtils() - Constructor for class org.apache.spark.graphx.GraphXUtils
-
- greater(Duration) - Method in class org.apache.spark.streaming.Duration
-
- greater(Time) - Method in class org.apache.spark.streaming.Time
-
- greaterEq(Duration) - Method in class org.apache.spark.streaming.Duration
-
- greaterEq(Time) - Method in class org.apache.spark.streaming.Time
-
- GreaterThan - Class in org.apache.spark.sql.sources
-
A filter that evaluates to true iff the attribute evaluates to a value greater than value.
- GreaterThan(String, Object) - Constructor for class org.apache.spark.sql.sources.GreaterThan
-
- GreaterThanOrEqual - Class in org.apache.spark.sql.sources
-
A filter that evaluates to true iff the attribute evaluates to a value greater than or equal to value.
- GreaterThanOrEqual(String, Object) - Constructor for class org.apache.spark.sql.sources.GreaterThanOrEqual
-
- gridGraph(SparkContext, int, int) - Static method in class org.apache.spark.graphx.util.GraphGenerators
-
Create a rows by cols grid graph with each vertex connected to its
row+1 and col+1 neighbors.
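To make the connectivity rule concrete, here is a small sketch that enumerates the edges of such a grid in plain Java. It does not use GraphX; the class name and the row-major vertex numbering (id = row * cols + col) are assumptions of this example.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: enumerates grid edges without GraphX.
final class GridGraphSketch {
    // Each edge is {srcId, dstId}; vertex ids are assumed to be row * cols + col.
    static List<long[]> gridEdges(int rows, int cols) {
        List<long[]> edges = new ArrayList<>();
        for (int r = 0; r < rows; r++) {
            for (int c = 0; c < cols; c++) {
                long src = (long) r * cols + c;
                if (r + 1 < rows) {
                    // Connect to the neighbor in the next row.
                    edges.add(new long[] {src, (long) (r + 1) * cols + c});
                }
                if (c + 1 < cols) {
                    // Connect to the neighbor in the next column.
                    edges.add(new long[] {src, (long) r * cols + (c + 1)});
                }
            }
        }
        return edges;
    }

    public static void main(String[] args) {
        for (long[] e : gridEdges(2, 2)) {
            System.out.println(e[0] + " -> " + e[1]);
        }
    }
}
```

Under this rule a rows by cols grid has rows * (cols - 1) + (rows - 1) * cols edges.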
- groupArr() - Method in class org.apache.spark.rdd.PartitionCoalescer
-
- groupBy(Function<T, U>) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return an RDD of grouped elements.
- groupBy(Function<T, U>, int) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return an RDD of grouped elements.
- groupBy(Function1<T, K>, ClassTag<K>) - Method in class org.apache.spark.rdd.RDD
-
Return an RDD of grouped items.
- groupBy(Function1<T, K>, int, ClassTag<K>) - Method in class org.apache.spark.rdd.RDD
-
Return an RDD of grouped elements.
- groupBy(Function1<T, K>, Partitioner, ClassTag<K>, Ordering<K>) - Method in class org.apache.spark.rdd.RDD
-
Return an RDD of grouped items.
- groupBy(Column...) - Method in class org.apache.spark.sql.DataFrame
-
Groups the DataFrame using the specified columns, so we can run aggregation on them.
- groupBy(String, String...) - Method in class org.apache.spark.sql.DataFrame
-
Groups the DataFrame using the specified columns, so we can run aggregation on them.
- groupBy(Seq<Column>) - Method in class org.apache.spark.sql.DataFrame
-
Groups the DataFrame using the specified columns, so we can run aggregation on them.
- groupBy(String, Seq<String>) - Method in class org.apache.spark.sql.DataFrame
-
Groups the DataFrame using the specified columns, so we can run aggregation on them.
- groupByKey(Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Group the values for each key in the RDD into a single sequence.
- groupByKey(int) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Group the values for each key in the RDD into a single sequence.
- groupByKey() - Method in class org.apache.spark.api.java.JavaPairRDD
-
Group the values for each key in the RDD into a single sequence.
- groupByKey(Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Group the values for each key in the RDD into a single sequence.
- groupByKey(int) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Group the values for each key in the RDD into a single sequence.
- groupByKey() - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Group the values for each key in the RDD into a single sequence.
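The grouping described by groupByKey can be pictured with ordinary Java collections. This is only a sketch under stated assumptions: the class name is hypothetical, the input is a local list of key-value entries rather than an RDD, and the insertion-ordered output is an artifact of the sketch, not a guarantee made by Spark.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only: groups local key-value pairs, not a Spark RDD.
final class GroupByKeySketch {
    // Collects all values that share a key into a single list per key.
    static Map<String, List<Integer>> groupByKey(List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>()).add(pair.getValue());
        }
        return grouped;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        pairs.add(new AbstractMap.SimpleEntry<String, Integer>("a", 1));
        pairs.add(new AbstractMap.SimpleEntry<String, Integer>("b", 2));
        pairs.add(new AbstractMap.SimpleEntry<String, Integer>("a", 3));
        System.out.println(groupByKey(pairs)); // prints {a=[1, 3], b=[2]}
    }
}
```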
- groupByKey() - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying groupByKey to each RDD.
- groupByKey(int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying groupByKey to each RDD.
- groupByKey(Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying groupByKey on each RDD of this DStream.
- groupByKey() - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying groupByKey to each RDD.
- groupByKey(int) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying groupByKey to each RDD.
- groupByKey(Partitioner) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying groupByKey on each RDD.
- groupByKeyAndWindow(Duration) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying groupByKey over a sliding window.
- groupByKeyAndWindow(Duration, Duration) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying groupByKey over a sliding window.
- groupByKeyAndWindow(Duration, Duration, int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying groupByKey over a sliding window on this DStream.
- groupByKeyAndWindow(Duration, Duration, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying groupByKey over a sliding window on this DStream.
- groupByKeyAndWindow(Duration) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying groupByKey over a sliding window.
- groupByKeyAndWindow(Duration, Duration) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying groupByKey over a sliding window.
- groupByKeyAndWindow(Duration, Duration, int) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying groupByKey over a sliding window on this DStream.
- groupByKeyAndWindow(Duration, Duration, Partitioner) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Create a new DStream by applying groupByKey over a sliding window on this DStream.
- GroupedData - Class in org.apache.spark.sql
-
:: Experimental ::
A set of methods for aggregations on a DataFrame, created by DataFrame.groupBy.
- groupEdges(Function2<ED, ED, ED>) - Method in class org.apache.spark.graphx.Graph
-
Merges multiple edges between two vertices into a single edge.
- groupEdges(Function2<ED, ED, ED>) - Method in class org.apache.spark.graphx.impl.GraphImpl
-
- groupHash() - Method in class org.apache.spark.rdd.PartitionCoalescer
-
- groupWith(JavaPairRDD<K, W>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Alias for cogroup.
- groupWith(JavaPairRDD<K, W1>, JavaPairRDD<K, W2>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Alias for cogroup.
- groupWith(JavaPairRDD<K, W1>, JavaPairRDD<K, W2>, JavaPairRDD<K, W3>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Alias for cogroup.
- groupWith(RDD<Tuple2<K, W>>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Alias for cogroup.
- groupWith(RDD<Tuple2<K, W1>>, RDD<Tuple2<K, W2>>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Alias for cogroup.
- groupWith(RDD<Tuple2<K, W1>>, RDD<Tuple2<K, W2>>, RDD<Tuple2<K, W3>>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Alias for cogroup.
- gt(double) - Static method in class org.apache.spark.ml.param.ParamValidators
-
Check if value > lowerBound
- gt(Object) - Method in class org.apache.spark.sql.Column
-
Greater than.
- gtEq(double) - Static method in class org.apache.spark.ml.param.ParamValidators
-
Check if value >= lowerBound
- L1Updater - Class in org.apache.spark.mllib.optimization
-
:: DeveloperApi ::
Updater for L1 regularized problems.
- L1Updater() - Constructor for class org.apache.spark.mllib.optimization.L1Updater
-
- label() - Method in class org.apache.spark.mllib.regression.LabeledPoint
-
- LabeledPoint - Class in org.apache.spark.mllib.regression
-
Class that represents the features and labels of a data point.
- LabeledPoint(double, Vector) - Constructor for class org.apache.spark.mllib.regression.LabeledPoint
-
- LabelPropagation - Class in org.apache.spark.graphx.lib
-
Label Propagation algorithm.
- LabelPropagation() - Constructor for class org.apache.spark.graphx.lib.LabelPropagation
-
- labels() - Method in class org.apache.spark.mllib.classification.NaiveBayesModel
-
- labels() - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics
-
Returns the sequence of labels in ascending order
- labels() - Method in class org.apache.spark.mllib.evaluation.MultilabelMetrics
-
Returns the sequence of labels in ascending order
- lag(Column, int) - Static method in class org.apache.spark.sql.functions
-
Window function: returns the value that is offset rows before the current row, and
null if there are fewer than offset rows before the current row.
- lag(String, int) - Static method in class org.apache.spark.sql.functions
-
Window function: returns the value that is offset rows before the current row, and
null if there are fewer than offset rows before the current row.
- lag(String, int, Object) - Static method in class org.apache.spark.sql.functions
-
Window function: returns the value that is offset rows before the current row, and
defaultValue if there are fewer than offset rows before the current row.
- lag(Column, int, Object) - Static method in class org.apache.spark.sql.functions
-
Window function: returns the value that is offset rows before the current row, and
defaultValue if there are fewer than offset rows before the current row.
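The offset and default-value behavior of lag can be sketched over an ordered list of values. This is not the Spark SQL window function itself: the class name and the list-based signature are assumptions for illustration, and the list stands in for the rows of a single ordered window partition.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative sketch only: models one ordered window partition as a list.
final class LagSketch {
    // For each row i, returns the value at row i - offset, or defaultValue
    // when there are fewer than offset rows before row i.
    static <T> List<T> lag(List<T> rows, int offset, T defaultValue) {
        List<T> out = new ArrayList<>(rows.size());
        for (int i = 0; i < rows.size(); i++) {
            int j = i - offset;
            out.add(j >= 0 && j < rows.size() ? rows.get(j) : defaultValue);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Integer> rows = Arrays.asList(10, 20, 30);
        System.out.println(lag(rows, 1, null)); // prints [null, 10, 20]
        System.out.println(lag(rows, 2, 0));    // prints [0, 0, 10]
    }
}
```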
- LassoModel - Class in org.apache.spark.mllib.regression
-
Regression model trained using Lasso.
- LassoModel(Vector, double) - Constructor for class org.apache.spark.mllib.regression.LassoModel
-
- LassoWithSGD - Class in org.apache.spark.mllib.regression
-
Train a regression model with L1-regularization using Stochastic Gradient Descent.
- LassoWithSGD() - Constructor for class org.apache.spark.mllib.regression.LassoWithSGD
-
Construct a Lasso object with default parameters: {stepSize: 1.0, numIterations: 100,
regParam: 0.01, miniBatchFraction: 1.0}.
- last(Column) - Static method in class org.apache.spark.sql.functions
-
Aggregate function: returns the last value in a group.
- last(String) - Static method in class org.apache.spark.sql.functions
-
Aggregate function: returns the last value of the column in a group.
- lastError() - Method in class org.apache.spark.streaming.scheduler.ReceiverInfo
-
- lastErrorMessage() - Method in class org.apache.spark.streaming.scheduler.ReceiverInfo
-
- lastErrorTime() - Method in class org.apache.spark.streaming.scheduler.ReceiverInfo
-
- lastValidTime() - Method in class org.apache.spark.streaming.dstream.InputDStream
-
- latestModel() - Method in class org.apache.spark.mllib.clustering.StreamingKMeans
-
Return the latest model.
- latestModel() - Method in class org.apache.spark.mllib.regression.StreamingLinearAlgorithm
-
Return the latest model.
- launch() - Method in class org.apache.spark.launcher.SparkLauncher
-
Launches a sub-process that will start the configured Spark application.
- launchTime() - Method in class org.apache.spark.scheduler.TaskInfo
-
- launchTime() - Method in class org.apache.spark.status.api.v1.TaskData
-
- LBFGS - Class in org.apache.spark.mllib.optimization
-
:: DeveloperApi ::
Class used to solve an optimization problem using Limited-memory BFGS.
- LBFGS(Gradient, Updater) - Constructor for class org.apache.spark.mllib.optimization.LBFGS
-
- LDA - Class in org.apache.spark.mllib.clustering
-
:: Experimental ::
- LDA() - Constructor for class org.apache.spark.mllib.clustering.LDA
-
- LDAModel - Class in org.apache.spark.mllib.clustering
-
:: Experimental ::
- LDAOptimizer - Interface in org.apache.spark.mllib.clustering
-
:: DeveloperApi ::
- lead(String, int) - Static method in class org.apache.spark.sql.functions
-
Window function: returns the value that is offset rows after the current row, and
null if there are fewer than offset rows after the current row.
- lead(Column, int) - Static method in class org.apache.spark.sql.functions
-
Window function: returns the value that is offset rows after the current row, and
null if there are fewer than offset rows after the current row.
- lead(String, int, Object) - Static method in class org.apache.spark.sql.functions
-
Window function: returns the value that is offset rows after the current row, and
defaultValue if there are fewer than offset rows after the current row.
- lead(Column, int, Object) - Static method in class org.apache.spark.sql.functions
-
Window function: returns the value that is offset rows after the current row, and
defaultValue if there are fewer than offset rows after the current row.
- LeafNode - Class in org.apache.spark.ml.tree
-
:: DeveloperApi ::
Decision tree leaf node.
- learningRate() - Method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
-
- LeastSquaresAggregator - Class in org.apache.spark.ml.regression
-
LeastSquaresAggregator computes the gradient and loss for a Least-squared loss function,
as used in linear regression for samples in sparse or dense vector in an online fashion.
- LeastSquaresAggregator(Vector, double, double, double[], double[]) - Constructor for class org.apache.spark.ml.regression.LeastSquaresAggregator
-
- LeastSquaresCostFun - Class in org.apache.spark.ml.regression
-
LeastSquaresCostFun implements Breeze's DiffFunction[T] for Least Squares cost.
- LeastSquaresCostFun(RDD<Tuple2<Object, Vector>>, double, double, double[], double[], double) - Constructor for class org.apache.spark.ml.regression.LeastSquaresCostFun
-
- LeastSquaresGradient - Class in org.apache.spark.mllib.optimization
-
:: DeveloperApi ::
Compute gradient and loss for a Least-squared loss function, as used in linear regression.
- LeastSquaresGradient() - Constructor for class org.apache.spark.mllib.optimization.LeastSquaresGradient
-
- left() - Method in class org.apache.spark.sql.sources.And
-
- left() - Method in class org.apache.spark.sql.sources.Or
-
- leftCategories() - Method in class org.apache.spark.ml.tree.CategoricalSplit
-
Get sorted categories which split to the left
- leftChild() - Method in class org.apache.spark.ml.tree.InternalNode
-
- leftChildIndex(int) - Static method in class org.apache.spark.mllib.tree.model.Node
-
Return the index of the left child of this node.
- leftImpurity() - Method in class org.apache.spark.mllib.tree.model.InformationGainStats
-
- leftJoin(RDD<Tuple2<Object, VD2>>, Function3<Object, VD, Option<VD2>, VD3>, ClassTag<VD2>, ClassTag<VD3>) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
-
- leftJoin(RDD<Tuple2<Object, VD2>>, Function3<Object, VD, Option<VD2>, VD3>, ClassTag<VD2>, ClassTag<VD3>) - Method in class org.apache.spark.graphx.VertexRDD
-
Left joins this VertexRDD with an RDD containing vertex attribute pairs.
- leftNode() - Method in class org.apache.spark.mllib.tree.model.Node
-
- leftOuterJoin(JavaPairRDD<K, W>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Perform a left outer join of this and other.
- leftOuterJoin(JavaPairRDD<K, W>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Perform a left outer join of this and other.
- leftOuterJoin(JavaPairRDD<K, W>, int) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Perform a left outer join of this and other.
- leftOuterJoin(RDD<Tuple2<K, W>>, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Perform a left outer join of this and other.
- leftOuterJoin(RDD<Tuple2<K, W>>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Perform a left outer join of this and other.
- leftOuterJoin(RDD<Tuple2<K, W>>, int) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Perform a left outer join of this and other.
- leftOuterJoin(JavaPairDStream<K, W>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying 'left outer join' between RDDs of this
DStream and other DStream.
- leftOuterJoin(JavaPairDStream<K, W>, int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying 'left outer join' between RDDs of this
DStream and other DStream.
- leftOuterJoin(JavaPairDStream<K, W>, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying 'left outer join' between RDDs of this
DStream and other DStream.
- leftOuterJoin(DStream<Tuple2<K, W>>, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying 'left outer join' between RDDs of this
DStream and other DStream.
- leftOuterJoin(DStream<Tuple2<K, W>>, int, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying 'left outer join' between RDDs of this
DStream and other DStream.
- leftOuterJoin(DStream<Tuple2<K, W>>, Partitioner, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying 'left outer join' between RDDs of this
DStream and other DStream.
- leftPredict() - Method in class org.apache.spark.mllib.tree.model.InformationGainStats
-
- leftZipJoin(VertexRDD<VD2>, Function3<Object, VD, Option<VD2>, VD3>, ClassTag<VD2>, ClassTag<VD3>) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
-
- leftZipJoin(VertexRDD<VD2>, Function3<Object, VD, Option<VD2>, VD3>, ClassTag<VD2>, ClassTag<VD3>) - Method in class org.apache.spark.graphx.VertexRDD
-
Left joins this RDD with another VertexRDD with the same index.
- LEGACY_DRIVER_IDENTIFIER() - Static method in class org.apache.spark.SparkContext
-
Legacy version of DRIVER_IDENTIFIER, retained for backwards-compatibility.
- length() - Method in class org.apache.spark.scheduler.SplitInfo
-
- length() - Method in interface org.apache.spark.sql.Row
-
Number of elements in the Row.
- length() - Method in class org.apache.spark.sql.types.StructType
-
- length() - Method in class org.apache.spark.sql.types.UTF8String
-
Return the number of code points in it.
- length() - Method in class org.apache.spark.util.Vector
-
- leq(Object) - Method in class org.apache.spark.sql.Column
-
Less than or equal to.
- less(Duration) - Method in class org.apache.spark.streaming.Duration
-
- less(Time) - Method in class org.apache.spark.streaming.Time
-
- lessEq(Duration) - Method in class org.apache.spark.streaming.Duration
-
- lessEq(Time) - Method in class org.apache.spark.streaming.Time
-
- LessThan - Class in org.apache.spark.sql.sources
-
A filter that evaluates to true iff the attribute evaluates to a value less than value.
- LessThan(String, Object) - Constructor for class org.apache.spark.sql.sources.LessThan
-
- LessThanOrEqual - Class in org.apache.spark.sql.sources
-
A filter that evaluates to true iff the attribute evaluates to a value less than or equal to value.
- LessThanOrEqual(String, Object) - Constructor for class org.apache.spark.sql.sources.LessThanOrEqual
-
- like(String) - Method in class org.apache.spark.sql.Column
-
SQL like expression.
- limit(int) - Method in class org.apache.spark.sql.DataFrame
-
Returns a new DataFrame by taking the first n rows.
- line() - Method in exception org.apache.spark.sql.AnalysisException
-
- LinearDataGenerator - Class in org.apache.spark.mllib.util
-
:: DeveloperApi ::
Generate sample data used for Linear Data.
- LinearDataGenerator() - Constructor for class org.apache.spark.mllib.util.LinearDataGenerator
-
- LinearRegression - Class in org.apache.spark.ml.regression
-
:: Experimental ::
Linear regression.
- LinearRegression(String) - Constructor for class org.apache.spark.ml.regression.LinearRegression
-
- LinearRegression() - Constructor for class org.apache.spark.ml.regression.LinearRegression
-
- LinearRegressionModel - Class in org.apache.spark.ml.regression
-
- LinearRegressionModel - Class in org.apache.spark.mllib.regression
-
Regression model trained using LinearRegression.
- LinearRegressionModel(Vector, double) - Constructor for class org.apache.spark.mllib.regression.LinearRegressionModel
-
- LinearRegressionWithSGD - Class in org.apache.spark.mllib.regression
-
Train a linear regression model with no regularization using Stochastic Gradient Descent.
- LinearRegressionWithSGD() - Constructor for class org.apache.spark.mllib.regression.LinearRegressionWithSGD
-
Construct a LinearRegression object with default parameters: {stepSize: 1.0,
numIterations: 100, miniBatchFraction: 1.0}.
- listenerBus() - Method in class org.apache.spark.SparkContext
-
- lit(Object) - Static method in class org.apache.spark.sql.functions
-
Creates a Column of literal value.
- load(SparkContext, String) - Static method in class org.apache.spark.mllib.classification.LogisticRegressionModel
-
- load(SparkContext, String) - Static method in class org.apache.spark.mllib.classification.NaiveBayesModel
-
- load(SparkContext, String) - Static method in class org.apache.spark.mllib.classification.SVMModel
-
- load(SparkContext, String) - Static method in class org.apache.spark.mllib.clustering.GaussianMixtureModel
-
- load(SparkContext, String) - Static method in class org.apache.spark.mllib.clustering.KMeansModel
-
- load(SparkContext, String) - Static method in class org.apache.spark.mllib.clustering.PowerIterationClusteringModel
-
- load(SparkContext, String) - Static method in class org.apache.spark.mllib.feature.Word2VecModel
-
- load(SparkContext, String) - Static method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel
-
- load(SparkContext, String) - Static method in class org.apache.spark.mllib.regression.IsotonicRegressionModel
-
- load(SparkContext, String) - Static method in class org.apache.spark.mllib.regression.LassoModel
-
- load(SparkContext, String) - Static method in class org.apache.spark.mllib.regression.LinearRegressionModel
-
- load(SparkContext, String) - Static method in class org.apache.spark.mllib.regression.RidgeRegressionModel
-
- load(SparkContext, String) - Static method in class org.apache.spark.mllib.tree.model.DecisionTreeModel
-
- load(SparkContext, String) - Static method in class org.apache.spark.mllib.tree.model.GradientBoostedTreesModel
-
- load(SparkContext, String) - Static method in class org.apache.spark.mllib.tree.model.RandomForestModel
-
- load(SparkContext, String) - Method in interface org.apache.spark.mllib.util.Loader
-
Load a model from the given path.
- load(String) - Method in class org.apache.spark.sql.DataFrameReader
-
Loads input in as a DataFrame, for data sources that require a path (e.g.
- load() - Method in class org.apache.spark.sql.DataFrameReader
-
Loads input in as a DataFrame, for data sources that don't require a path (e.g.
- load(String) - Method in class org.apache.spark.sql.SQLContext
-
Deprecated.
As of 1.4.0, replaced by read().load(path).
- load(String, String) - Method in class org.apache.spark.sql.SQLContext
-
Deprecated.
As of 1.4.0, replaced by read().format(source).load(path).
- load(String, Map<String, String>) - Method in class org.apache.spark.sql.SQLContext
-
Deprecated.
As of 1.4.0, replaced by read().format(source).options(options).load().
- load(String, Map<String, String>) - Method in class org.apache.spark.sql.SQLContext
-
Deprecated.
As of 1.4.0, replaced by read().format(source).options(options).load().
- load(String, StructType, Map<String, String>) - Method in class org.apache.spark.sql.SQLContext
-
Deprecated.
As of 1.4.0, replaced by read().format(source).schema(schema).options(options).load().
- load(String, StructType, Map<String, String>) - Method in class org.apache.spark.sql.SQLContext
-
Deprecated.
As of 1.4.0, replaced by read().format(source).schema(schema).options(options).load().
- Loader<M extends Saveable> - Interface in org.apache.spark.mllib.util
-
:: DeveloperApi ::
- loadLabeledData(SparkContext, String) - Static method in class org.apache.spark.mllib.util.MLUtils
-
- loadLabeledPoints(SparkContext, String, int) - Static method in class org.apache.spark.mllib.util.MLUtils
-
Loads labeled points saved using RDD[LabeledPoint].saveAsTextFile.
- loadLabeledPoints(SparkContext, String) - Static method in class org.apache.spark.mllib.util.MLUtils
-
Loads labeled points saved using RDD[LabeledPoint].saveAsTextFile with the default number of
partitions.
- loadLibSVMFile(SparkContext, String, int, int) - Static method in class org.apache.spark.mllib.util.MLUtils
-
Loads labeled data in the LIBSVM format into an RDD[LabeledPoint].
- loadLibSVMFile(SparkContext, String, boolean, int, int) - Static method in class org.apache.spark.mllib.util.MLUtils
-
- loadLibSVMFile(SparkContext, String, int) - Static method in class org.apache.spark.mllib.util.MLUtils
-
Loads labeled data in the LIBSVM format into an RDD[LabeledPoint], with the default number of
partitions.
- loadLibSVMFile(SparkContext, String, boolean, int) - Static method in class org.apache.spark.mllib.util.MLUtils
-
- loadLibSVMFile(SparkContext, String, boolean) - Static method in class org.apache.spark.mllib.util.MLUtils
-
- loadLibSVMFile(SparkContext, String) - Static method in class org.apache.spark.mllib.util.MLUtils
-
Loads binary labeled data in the LIBSVM format into an RDD[LabeledPoint], with number of
features determined automatically and the default number of partitions.
- loadVectors(SparkContext, String, int) - Static method in class org.apache.spark.mllib.util.MLUtils
-
Loads vectors saved using RDD[Vector].saveAsTextFile.
- loadVectors(SparkContext, String) - Static method in class org.apache.spark.mllib.util.MLUtils
-
Loads vectors saved using RDD[Vector].saveAsTextFile with the default number of partitions.
- localBlocksFetched() - Method in class org.apache.spark.status.api.v1.ShuffleReadMetricDistributions
-
- localBlocksFetched() - Method in class org.apache.spark.status.api.v1.ShuffleReadMetrics
-
- LocalLDAModel - Class in org.apache.spark.mllib.clustering
-
:: Experimental ::
- localSeqToDataFrameHolder(Seq<A>, TypeTags.TypeTag<A>) - Method in class org.apache.spark.sql.SQLContext.implicits$
-
- LocalSQLContext - Class in org.apache.spark.sql.test
-
A SQLContext that can be used for local testing.
- LocalSQLContext() - Constructor for class org.apache.spark.sql.test.LocalSQLContext
-
- localValue() - Method in class org.apache.spark.Accumulable
-
Get the current value of this accumulator from within a task.
- location() - Method in class org.apache.spark.streaming.scheduler.ReceiverInfo
-
- log() - Method in interface org.apache.spark.Logging
-
- log(Column) - Static method in class org.apache.spark.sql.functions
-
Computes the natural logarithm of the given value.
- log(String) - Static method in class org.apache.spark.sql.functions
-
Computes the natural logarithm of the given column.
- log10(Column) - Static method in class org.apache.spark.sql.functions
-
Computes the logarithm of the given value in Base 10.
- log10(String) - Static method in class org.apache.spark.sql.functions
-
Computes the logarithm of the given value in Base 10.
- log1p(Column) - Static method in class org.apache.spark.sql.functions
-
Computes the natural logarithm of the given value plus one.
- log1p(String) - Static method in class org.apache.spark.sql.functions
-
Computes the natural logarithm of the given column plus one.
- log_() - Method in interface org.apache.spark.Logging
-
- logDebug(Function0<String>) - Method in interface org.apache.spark.Logging
-
- logDebug(Function0<String>, Throwable) - Method in interface org.apache.spark.Logging
-
- logDeprecationWarning(String) - Static method in class org.apache.spark.SparkConf
-
Logs a warning message if the given config key is deprecated.
- logDirName() - Method in class org.apache.spark.scheduler.JobLogger
-
- logError(Function0<String>) - Method in interface org.apache.spark.Logging
-
- logError(Function0<String>, Throwable) - Method in interface org.apache.spark.Logging
-
- Logging - Interface in org.apache.spark
-
:: DeveloperApi ::
Utility trait for classes that want to log data.
- logInfo(Function0<String>) - Method in interface org.apache.spark.Logging
-
- logInfo(Function0<String>, Throwable) - Method in interface org.apache.spark.Logging
-
- LogisticAggregator - Class in org.apache.spark.ml.classification
-
LogisticAggregator computes the gradient and loss for binary logistic loss function, as used
in binary classification for samples in sparse or dense vector in an online fashion.
- LogisticAggregator(Vector, int, boolean, double[], double[]) - Constructor for class org.apache.spark.ml.classification.LogisticAggregator
-
- LogisticCostFun - Class in org.apache.spark.ml.classification
-
LogisticCostFun implements Breeze's DiffFunction[T] for a multinomial logistic loss function,
as used in multi-class classification (it is also used in binary logistic regression).
- LogisticCostFun(RDD<Tuple2<Object, Vector>>, int, boolean, double[], double[], double) - Constructor for class org.apache.spark.ml.classification.LogisticCostFun
-
- LogisticGradient - Class in org.apache.spark.mllib.optimization
-
:: DeveloperApi ::
Compute gradient and loss for a multinomial logistic loss function, as used
in multi-class classification (it is also used in binary logistic regression).
- LogisticGradient(int) - Constructor for class org.apache.spark.mllib.optimization.LogisticGradient
-
- LogisticGradient() - Constructor for class org.apache.spark.mllib.optimization.LogisticGradient
-
- LogisticRegression - Class in org.apache.spark.ml.classification
-
:: Experimental ::
Logistic regression.
- LogisticRegression(String) - Constructor for class org.apache.spark.ml.classification.LogisticRegression
-
- LogisticRegression() - Constructor for class org.apache.spark.ml.classification.LogisticRegression
-
- LogisticRegressionDataGenerator - Class in org.apache.spark.mllib.util
-
:: DeveloperApi ::
Generate test data for LogisticRegression.
- LogisticRegressionDataGenerator() - Constructor for class org.apache.spark.mllib.util.LogisticRegressionDataGenerator
-
- LogisticRegressionModel - Class in org.apache.spark.ml.classification
-
- LogisticRegressionModel - Class in org.apache.spark.mllib.classification
-
Classification model trained using Multinomial/Binary Logistic Regression.
- LogisticRegressionModel(Vector, double, int, int) - Constructor for class org.apache.spark.mllib.classification.LogisticRegressionModel
-
- LogisticRegressionModel(Vector, double) - Constructor for class org.apache.spark.mllib.classification.LogisticRegressionModel
-
- LogisticRegressionWithLBFGS - Class in org.apache.spark.mllib.classification
-
Train a classification model for Multinomial/Binary Logistic Regression using
Limited-memory BFGS.
- LogisticRegressionWithLBFGS() - Constructor for class org.apache.spark.mllib.classification.LogisticRegressionWithLBFGS
-
- LogisticRegressionWithSGD - Class in org.apache.spark.mllib.classification
-
Train a classification model for Binary Logistic Regression
using Stochastic Gradient Descent.
- LogisticRegressionWithSGD() - Constructor for class org.apache.spark.mllib.classification.LogisticRegressionWithSGD
-
Construct a LogisticRegression object with default parameters: {stepSize: 1.0,
numIterations: 100, regParam: 0.01, miniBatchFraction: 1.0}.
- logLikelihood() - Method in class org.apache.spark.mllib.clustering.DistributedLDAModel
-
Log likelihood of the observed tokens in the training set,
given the current parameter estimates:
log P(docs | topics, topic distributions for docs, alpha, eta)
- logLikelihood() - Method in class org.apache.spark.mllib.clustering.ExpectationSum
-
- LogLoss - Class in org.apache.spark.mllib.tree.loss
-
:: DeveloperApi ::
Class for log loss calculation (for classification).
- LogLoss() - Constructor for class org.apache.spark.mllib.tree.loss.LogLoss
-
- logName() - Method in interface org.apache.spark.Logging
-
- LogNormalGenerator - Class in org.apache.spark.mllib.random
-
:: DeveloperApi ::
Generates i.i.d. samples from the log normal distribution.
- LogNormalGenerator(double, double) - Constructor for class org.apache.spark.mllib.random.LogNormalGenerator
-
- logNormalGraph(SparkContext, int, int, double, double, long) - Static method in class org.apache.spark.graphx.util.GraphGenerators
-
Generate a graph whose vertex out degree distribution is log normal.
- logNormalJavaRDD(JavaSparkContext, double, double, long, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
- logNormalJavaRDD(JavaSparkContext, double, double, long, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
- logNormalJavaRDD(JavaSparkContext, double, double, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
- logNormalJavaVectorRDD(JavaSparkContext, double, double, long, int, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
- logNormalJavaVectorRDD(JavaSparkContext, double, double, long, int, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
- logNormalJavaVectorRDD(JavaSparkContext, double, double, long, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
- logNormalRDD(SparkContext, double, double, long, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
Generates an RDD comprised of i.i.d. samples from the log normal distribution with the input
mean and standard deviation.
- logNormalVectorRDD(SparkContext, double, double, long, int, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
Generates an RDD[Vector] with vectors containing i.i.d. samples drawn from a
log normal distribution.
- logpdf(Vector) - Method in class org.apache.spark.mllib.stat.distribution.MultivariateGaussian
-
Returns the log-density of this multivariate Gaussian at the given point, x.
- logPrior() - Method in class org.apache.spark.mllib.clustering.DistributedLDAModel
-
Log probability of the current parameter estimate:
log P(topics, topic distributions for docs | alpha, eta)
- logTrace(Function0<String>) - Method in interface org.apache.spark.Logging
-
- logTrace(Function0<String>, Throwable) - Method in interface org.apache.spark.Logging
-
- logUrlMap() - Method in class org.apache.spark.scheduler.cluster.ExecutorInfo
-
- logWarning(Function0<String>) - Method in interface org.apache.spark.Logging
-
- logWarning(Function0<String>, Throwable) - Method in interface org.apache.spark.Logging
-
- LongParam - Class in org.apache.spark.ml.param
-
:: DeveloperApi ::
Specialized version of Param[Long] for Java.
- LongParam(String, String, String, Function1<Object, Object>) - Constructor for class org.apache.spark.ml.param.LongParam
-
- LongParam(String, String, String) - Constructor for class org.apache.spark.ml.param.LongParam
-
- LongParam(org.apache.spark.ml.util.Identifiable, String, String, Function1<Object, Object>) - Constructor for class org.apache.spark.ml.param.LongParam
-
- LongParam(org.apache.spark.ml.util.Identifiable, String, String) - Constructor for class org.apache.spark.ml.param.LongParam
-
- longRddToDataFrameHolder(RDD<Object>) - Method in class org.apache.spark.sql.SQLContext.implicits$
-
- longToLongWritable(long) - Static method in class org.apache.spark.SparkContext
-
- LongType - Static variable in class org.apache.spark.sql.types.DataTypes
-
Gets the LongType object.
- LongType - Class in org.apache.spark.sql.types
-
:: DeveloperApi ::
The data type representing Long values.
- longWritableConverter() - Static method in class org.apache.spark.SparkContext
-
- lookup(K) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Return the list of values in the RDD for key key.
- lookup(K) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Return the list of values in the RDD for key key.
- lookupTimeout(SparkConf) - Static method in class org.apache.spark.util.RpcUtils
-
Returns the default Spark timeout to use for RPC remote endpoint lookup.
- loss() - Method in class org.apache.spark.ml.classification.LogisticAggregator
-
- loss() - Method in class org.apache.spark.ml.regression.LeastSquaresAggregator
-
- loss() - Method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
-
- Loss - Interface in org.apache.spark.mllib.tree.loss
-
:: DeveloperApi ::
Trait for adding "pluggable" loss functions for the gradient boosting algorithm.
- Losses - Class in org.apache.spark.mllib.tree.loss
-
- Losses() - Constructor for class org.apache.spark.mllib.tree.loss.Losses
-
- lossType() - Method in class org.apache.spark.ml.classification.GBTClassifier
-
Loss function which GBT tries to minimize.
- lossType() - Method in class org.apache.spark.ml.regression.GBTRegressor
-
Loss function which GBT tries to minimize.
- low() - Method in class org.apache.spark.partial.BoundedDouble
-
- lower(Column) - Static method in class org.apache.spark.sql.functions
-
Converts a string expression to lower case.
- lt(double) - Static method in class org.apache.spark.ml.param.ParamValidators
-
Check if value < upperBound
- lt(Object) - Method in class org.apache.spark.sql.Column
-
Less than.
- ltEq(double) - Static method in class org.apache.spark.ml.param.ParamValidators
-
Check if value <= upperBound
- LZ4CompressionCodec - Class in org.apache.spark.io
-
- LZ4CompressionCodec(SparkConf) - Constructor for class org.apache.spark.io.LZ4CompressionCodec
-
- LZFCompressionCodec - Class in org.apache.spark.io
-
- LZFCompressionCodec(SparkConf) - Constructor for class org.apache.spark.io.LZFCompressionCodec
-
- main(String[]) - Static method in class org.apache.spark.examples.streaming.JavaKinesisWordCountASL
-
- main(String[]) - Static method in class org.apache.spark.examples.streaming.KinesisWordCountASL
-
- main(String[]) - Static method in class org.apache.spark.examples.streaming.KinesisWordProducerASL
-
- main(String[]) - Static method in class org.apache.spark.mllib.util.KMeansDataGenerator
-
- main(String[]) - Static method in class org.apache.spark.mllib.util.LinearDataGenerator
-
- main(String[]) - Static method in class org.apache.spark.mllib.util.LogisticRegressionDataGenerator
-
- main(String[]) - Static method in class org.apache.spark.mllib.util.MFDataGenerator
-
- main(String[]) - Static method in class org.apache.spark.mllib.util.SVMDataGenerator
-
- makeDriverRef(String, SparkConf, org.apache.spark.rpc.RpcEnv) - Static method in class org.apache.spark.util.RpcUtils
-
Retrieve an RpcEndpointRef which is located in the driver via its name.
- makeRDD(Seq<T>, int, ClassTag<T>) - Method in class org.apache.spark.SparkContext
-
Distribute a local Scala collection to form an RDD.
- makeRDD(Seq<Tuple2<T, Seq<String>>>, ClassTag<T>) - Method in class org.apache.spark.SparkContext
-
Distribute a local Scala collection to form an RDD, with one or more
location preferences (hostnames of Spark nodes) for each object.
- map(Function<T, R>) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return a new RDD by applying a function to all elements of this RDD.
- map(Function1<Object, Object>) - Method in interface org.apache.spark.mllib.linalg.Matrix
-
Map the values of this matrix using a function.
- map(Function1<R, T>) - Method in class org.apache.spark.partial.PartialResult
-
Transform this PartialResult into a PartialResult of type T.
- map(Function1<T, U>, ClassTag<U>) - Method in class org.apache.spark.rdd.RDD
-
Return a new RDD by applying a function to all elements of this RDD.
- map(DataType, DataType) - Method in class org.apache.spark.sql.ColumnName
-
Creates a new StructField of type map.
- map(MapType) - Method in class org.apache.spark.sql.ColumnName
-
- map(Function1<Row, R>, ClassTag<R>) - Method in class org.apache.spark.sql.DataFrame
-
Returns a new RDD by applying a function to all rows of this DataFrame.
- map() - Method in class org.apache.spark.sql.types.Metadata
-
- map(Function<T, R>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Return a new DStream by applying a function to all elements of this DStream.
- map(Function1<T, U>, ClassTag<U>) - Method in class org.apache.spark.streaming.dstream.DStream
-
Return a new DStream by applying a function to all elements of this DStream.
- mapEdgePartitions(Function2<Object, EdgePartition<ED, VD>, EdgePartition<ED2, VD2>>, ClassTag<ED2>, ClassTag<VD2>) - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
-
- mapEdges(Function1<Edge<ED>, ED2>, ClassTag<ED2>) - Method in class org.apache.spark.graphx.Graph
-
Transforms each edge attribute in the graph using the map function.
- mapEdges(Function2<Object, Iterator<Edge<ED>>, Iterator<ED2>>, ClassTag<ED2>) - Method in class org.apache.spark.graphx.Graph
-
Transforms each edge attribute using the map function, passing it a whole partition at a
time.
- mapEdges(Function2<Object, Iterator<Edge<ED>>, Iterator<ED2>>, ClassTag<ED2>) - Method in class org.apache.spark.graphx.impl.GraphImpl
-
- mapId() - Method in class org.apache.spark.FetchFailed
-
- mapId() - Method in class org.apache.spark.storage.ShuffleBlockId
-
- mapId() - Method in class org.apache.spark.storage.ShuffleDataBlockId
-
- mapId() - Method in class org.apache.spark.storage.ShuffleIndexBlockId
-
- mapOutputTracker() - Method in class org.apache.spark.SparkEnv
-
- mapPartitions(FlatMapFunction<Iterator<T>, U>) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return a new RDD by applying a function to each partition of this RDD.
- mapPartitions(FlatMapFunction<Iterator<T>, U>, boolean) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return a new RDD by applying a function to each partition of this RDD.
- mapPartitions(Function1<Iterator<T>, Iterator<U>>, boolean, ClassTag<U>) - Method in class org.apache.spark.rdd.RDD
-
Return a new RDD by applying a function to each partition of this RDD.
- mapPartitions(Function1<Iterator<Row>, Iterator<R>>, ClassTag<R>) - Method in class org.apache.spark.sql.DataFrame
-
Returns a new RDD by applying a function to each partition of this DataFrame.
- mapPartitions(FlatMapFunction<Iterator<T>, U>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Return a new DStream in which each RDD is generated by applying mapPartitions() to each RDD
of this DStream.
- mapPartitions(Function1<Iterator<T>, Iterator<U>>, boolean, ClassTag<U>) - Method in class org.apache.spark.streaming.dstream.DStream
-
Return a new DStream in which each RDD is generated by applying mapPartitions() to each RDD
of this DStream.
- mapPartitionsToDouble(DoubleFlatMapFunction<Iterator<T>>) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return a new RDD by applying a function to each partition of this RDD.
- mapPartitionsToDouble(DoubleFlatMapFunction<Iterator<T>>, boolean) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return a new RDD by applying a function to each partition of this RDD.
- mapPartitionsToPair(PairFlatMapFunction<Iterator<T>, K2, V2>) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return a new RDD by applying a function to each partition of this RDD.
- mapPartitionsToPair(PairFlatMapFunction<Iterator<T>, K2, V2>, boolean) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return a new RDD by applying a function to each partition of this RDD.
- mapPartitionsToPair(PairFlatMapFunction<Iterator<T>, K2, V2>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Return a new DStream in which each RDD is generated by applying mapPartitions() to each RDD
of this DStream.
- mapPartitionsWithContext(Function2<TaskContext, Iterator<T>, Iterator<U>>, boolean, ClassTag<U>) - Method in class org.apache.spark.rdd.RDD
-
:: DeveloperApi ::
Return a new RDD by applying a function to each partition of this RDD.
- mapPartitionsWithIndex(Function2<Integer, Iterator<T>, Iterator<R>>, boolean) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return a new RDD by applying a function to each partition of this RDD, while tracking the index
of the original partition.
- mapPartitionsWithIndex(Function2<Object, Iterator<T>, Iterator<U>>, boolean, ClassTag<U>) - Method in class org.apache.spark.rdd.RDD
-
Return a new RDD by applying a function to each partition of this RDD, while tracking the index
of the original partition.
- mapPartitionsWithInputSplit(Function2<InputSplit, Iterator<Tuple2<K, V>>, Iterator<R>>, boolean) - Method in class org.apache.spark.api.java.JavaHadoopRDD
-
Maps over a partition, providing the InputSplit that was used as the base of the partition.
- mapPartitionsWithInputSplit(Function2<InputSplit, Iterator<Tuple2<K, V>>, Iterator<R>>, boolean) - Method in class org.apache.spark.api.java.JavaNewHadoopRDD
-
Maps over a partition, providing the InputSplit that was used as the base of the partition.
- mapPartitionsWithInputSplit(Function2<InputSplit, Iterator<Tuple2<K, V>>, Iterator<U>>, boolean, ClassTag<U>) - Method in class org.apache.spark.rdd.HadoopRDD
-
Maps over a partition, providing the InputSplit that was used as the base of the partition.
- mapPartitionsWithInputSplit(Function2<InputSplit, Iterator<Tuple2<K, V>>, Iterator<U>>, boolean, ClassTag<U>) - Method in class org.apache.spark.rdd.NewHadoopRDD
-
Maps over a partition, providing the InputSplit that was used as the base of the partition.
- mapPartitionsWithSplit(Function2<Object, Iterator<T>, Iterator<U>>, boolean, ClassTag<U>) - Method in class org.apache.spark.rdd.RDD
-
Return a new RDD by applying a function to each partition of this RDD, while tracking the index
of the original partition.
- mapredInputFormat() - Method in class org.apache.spark.scheduler.InputFormatInfo
-
- mapreduceInputFormat() - Method in class org.apache.spark.scheduler.InputFormatInfo
-
- mapReduceTriplets(Function1<EdgeTriplet<VD, ED>, Iterator<Tuple2<Object, A>>>, Function2<A, A, A>, Option<Tuple2<VertexRDD<?>, EdgeDirection>>, ClassTag<A>) - Method in class org.apache.spark.graphx.Graph
-
Aggregates values from the neighboring edges and vertices of each vertex.
- mapReduceTriplets(Function1<EdgeTriplet<VD, ED>, Iterator<Tuple2<Object, A>>>, Function2<A, A, A>, Option<Tuple2<VertexRDD<?>, EdgeDirection>>, ClassTag<A>) - Method in class org.apache.spark.graphx.impl.GraphImpl
-
- mapSideCombine() - Method in class org.apache.spark.ShuffleDependency
-
- mapToDouble(DoubleFunction<T>) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return a new RDD by applying a function to all elements of this RDD.
- mapToPair(PairFunction<T, K2, V2>) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return a new RDD by applying a function to all elements of this RDD.
- mapToPair(PairFunction<T, K2, V2>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Return a new DStream by applying a function to all elements of this DStream.
- mapTriplets(Function1<EdgeTriplet<VD, ED>, ED2>, ClassTag<ED2>) - Method in class org.apache.spark.graphx.Graph
-
Transforms each edge attribute using the map function, passing it the adjacent vertex
attributes as well.
- mapTriplets(Function1<EdgeTriplet<VD, ED>, ED2>, TripletFields, ClassTag<ED2>) - Method in class org.apache.spark.graphx.Graph
-
Transforms each edge attribute using the map function, passing it the adjacent vertex
attributes as well.
- mapTriplets(Function2<Object, Iterator<EdgeTriplet<VD, ED>>, Iterator<ED2>>, TripletFields, ClassTag<ED2>) - Method in class org.apache.spark.graphx.Graph
-
Transforms each edge attribute a partition at a time using the map function, passing it the
adjacent vertex attributes as well.
- mapTriplets(Function2<Object, Iterator<EdgeTriplet<VD, ED>>, Iterator<ED2>>, TripletFields, ClassTag<ED2>) - Method in class org.apache.spark.graphx.impl.GraphImpl
-
- MapType - Class in org.apache.spark.sql.types
-
:: DeveloperApi ::
The data type for Maps.
- MapType(DataType, DataType, boolean) - Constructor for class org.apache.spark.sql.types.MapType
-
- MapType() - Constructor for class org.apache.spark.sql.types.MapType
-
No-arg constructor for kryo.
- mapValues(Function<V, U>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Pass each value in the key-value pair RDD through a map function without changing the keys;
this also retains the original RDD's partitioning.
- mapValues(Function1<Edge<ED>, ED2>, ClassTag<ED2>) - Method in class org.apache.spark.graphx.EdgeRDD
-
Map the values in an edge partitioning preserving the structure but changing the values.
- mapValues(Function1<Edge<ED>, ED2>, ClassTag<ED2>) - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
-
- mapValues(Function1<VD, VD2>, ClassTag<VD2>) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
-
- mapValues(Function2<Object, VD, VD2>, ClassTag<VD2>) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
-
- mapValues(Function1<VD, VD2>, ClassTag<VD2>) - Method in class org.apache.spark.graphx.VertexRDD
-
Maps each vertex attribute, preserving the index.
- mapValues(Function2<Object, VD, VD2>, ClassTag<VD2>) - Method in class org.apache.spark.graphx.VertexRDD
-
Maps each vertex attribute, additionally supplying the vertex ID.
- mapValues(Function1<V, U>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Pass each value in the key-value pair RDD through a map function without changing the keys;
this also retains the original RDD's partitioning.
- mapValues(Function<V, U>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying a map function to the value of each key-value pair in
'this' DStream without changing the key.
- mapValues(Function1<V, U>, ClassTag<U>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying a map function to the value of each key-value pair in
'this' DStream without changing the key.
- mapVertices(Function2<Object, VD, VD2>, ClassTag<VD2>, Predef.$eq$colon$eq<VD, VD2>) - Method in class org.apache.spark.graphx.Graph
-
Transforms each vertex attribute in the graph using the map function.
- mapVertices(Function2<Object, VD, VD2>, ClassTag<VD2>, Predef.$eq$colon$eq<VD, VD2>) - Method in class org.apache.spark.graphx.impl.GraphImpl
-
- mapWith(Function1<Object, A>, boolean, Function2<T, A, U>, ClassTag<U>) - Method in class org.apache.spark.rdd.RDD
-
Maps f over this RDD, where f takes an additional parameter of type A.
- mask(Graph<VD2, ED2>, ClassTag<VD2>, ClassTag<ED2>) - Method in class org.apache.spark.graphx.Graph
-
Restricts the graph to only the vertices and edges that are also in other, but keeps the
attributes from this graph.
- mask(Graph<VD2, ED2>, ClassTag<VD2>, ClassTag<ED2>) - Method in class org.apache.spark.graphx.impl.GraphImpl
-
- master() - Method in class org.apache.spark.api.java.JavaSparkContext
-
- master() - Method in class org.apache.spark.SparkContext
-
- Matrices - Class in org.apache.spark.mllib.linalg
-
- Matrices() - Constructor for class org.apache.spark.mllib.linalg.Matrices
-
- Matrix - Interface in org.apache.spark.mllib.linalg
-
Trait for a local matrix.
- MatrixEntry - Class in org.apache.spark.mllib.linalg.distributed
-
:: Experimental ::
Represents an entry in a distributed matrix.
- MatrixEntry(long, long, double) - Constructor for class org.apache.spark.mllib.linalg.distributed.MatrixEntry
-
- MatrixFactorizationModel - Class in org.apache.spark.mllib.recommendation
-
Model representing the result of matrix factorization.
- MatrixFactorizationModel(int, RDD<Tuple2<Object, double[]>>, RDD<Tuple2<Object, double[]>>) - Constructor for class org.apache.spark.mllib.recommendation.MatrixFactorizationModel
-
- max() - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
Returns the maximum element from this RDD as defined by
the default comparator natural order.
- max(Comparator<T>) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Returns the maximum element from this RDD as defined by the specified
Comparator[T].
- max() - Method in class org.apache.spark.ml.attribute.NumericAttribute
-
- max() - Method in class org.apache.spark.mllib.stat.MultivariateOnlineSummarizer
-
- max() - Method in interface org.apache.spark.mllib.stat.MultivariateStatisticalSummary
-
Maximum value of each column.
- max(Ordering<T>) - Method in class org.apache.spark.rdd.RDD
-
Returns the max of this RDD as defined by the implicit Ordering[T].
- max(Column) - Static method in class org.apache.spark.sql.functions
-
Aggregate function: returns the maximum value of the expression in a group.
- max(String) - Static method in class org.apache.spark.sql.functions
-
Aggregate function: returns the maximum value of the column in a group.
- max(String...) - Method in class org.apache.spark.sql.GroupedData
-
Compute the max value for each numeric column for each group.
- max(Seq<String>) - Method in class org.apache.spark.sql.GroupedData
-
Compute the max value for each numeric column for each group.
- max(Duration) - Method in class org.apache.spark.streaming.Duration
-
- max(Time) - Method in class org.apache.spark.streaming.Time
-
- max() - Method in class org.apache.spark.util.StatCounter
-
- MAX_LONG_DIGITS() - Static method in class org.apache.spark.sql.types.Decimal
-
Maximum number of decimal digits a Long can represent
- maxBins() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- maxBufferSizeMb() - Method in class org.apache.spark.serializer.KryoSerializer
-
- maxDepth() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- maxIters() - Method in class org.apache.spark.graphx.lib.SVDPlusPlus.Conf
-
- maxMem() - Method in class org.apache.spark.scheduler.SparkListenerBlockManagerAdded
-
- maxMem() - Method in class org.apache.spark.storage.StorageStatus
-
- maxMemory() - Method in class org.apache.spark.status.api.v1.ExecutorSummary
-
- maxMemoryInMB() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- maxNodesInLevel(int) - Static method in class org.apache.spark.mllib.tree.model.Node
-
Return the maximum number of nodes which can be in the given level of the tree.
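As a rough illustration of the maxNodesInLevel entry above: assuming a binary tree whose root sits at level 0 (an assumption, since this index does not state the convention), level n can hold at most 2^n nodes. The class and method below are hypothetical stand-ins, not Spark's Node class.

```java
// Hypothetical sketch, not Spark code: assumes a binary tree with the root at level 0.
final class TreeLevelSketch {
    /** Maximum number of nodes a binary tree can hold at the given level (2^level). */
    static int maxNodesInLevel(int level) {
        if (level < 0 || level > 30) {
            // 1 << 31 would overflow a signed 32-bit int.
            throw new IllegalArgumentException("level must be in [0, 30]: " + level);
        }
        return 1 << level;
    }

    public static void main(String[] args) {
        for (int level = 0; level <= 4; level++) {
            System.out.println("level " + level + ": " + maxNodesInLevel(level));
        }
    }
}
```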
- maxVal() - Method in class org.apache.spark.graphx.lib.SVDPlusPlus.Conf
-
- mean() - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
Compute the mean of this RDD's elements.
- mean() - Method in class org.apache.spark.mllib.feature.StandardScalerModel
-
- mean() - Method in class org.apache.spark.mllib.random.ExponentialGenerator
-
- mean() - Method in class org.apache.spark.mllib.random.LogNormalGenerator
-
- mean() - Method in class org.apache.spark.mllib.random.PoissonGenerator
-
- mean() - Method in class org.apache.spark.mllib.stat.MultivariateOnlineSummarizer
-
- mean() - Method in interface org.apache.spark.mllib.stat.MultivariateStatisticalSummary
-
Sample mean vector.
- mean() - Method in class org.apache.spark.partial.BoundedDouble
-
- mean() - Method in class org.apache.spark.rdd.DoubleRDDFunctions
-
Compute the mean of this RDD's elements.
- mean(Column) - Static method in class org.apache.spark.sql.functions
-
Aggregate function: returns the average of the values in a group.
- mean(String) - Static method in class org.apache.spark.sql.functions
-
Aggregate function: returns the average of the values in a group.
- mean(String...) - Method in class org.apache.spark.sql.GroupedData
-
Compute the average value for each numeric column for each group.
- mean(Seq<String>) - Method in class org.apache.spark.sql.GroupedData
-
Compute the average value for each numeric column for each group.
- mean() - Method in class org.apache.spark.util.StatCounter
-
- meanAbsoluteError() - Method in class org.apache.spark.mllib.evaluation.RegressionMetrics
-
Returns the mean absolute error, which is a risk function corresponding to the
expected value of the absolute error loss or l1-norm loss.
- meanApprox(long, Double) - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
Return the approximate mean of the elements in this RDD.
- meanApprox(long) - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
:: Experimental ::
Approximate operation to return the mean within a timeout.
- meanApprox(long, double) - Method in class org.apache.spark.rdd.DoubleRDDFunctions
-
:: Experimental ::
Approximate operation to return the mean within a timeout.
- meanAveragePrecision() - Method in class org.apache.spark.mllib.evaluation.RankingMetrics
-
Returns the mean average precision (MAP) of all the queries.
- means() - Method in class org.apache.spark.mllib.clustering.ExpectationSum
-
- meanSquaredError() - Method in class org.apache.spark.mllib.evaluation.RegressionMetrics
-
Returns the mean squared error, which is a risk function corresponding to the
expected value of the squared error loss or quadratic loss.
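The meanAbsoluteError and meanSquaredError entries describe the averages of the absolute and squared differences between predictions and labels. The following is a minimal sketch over plain arrays, using a hypothetical helper class rather than Spark's RegressionMetrics, to show what the two quantities compute.

```java
// Hypothetical helper, not part of Spark: computes MAE and MSE over plain arrays.
final class RegressionErrorSketch {
    /** Mean absolute error: the average of |prediction - label|. */
    static double meanAbsoluteError(double[] predictions, double[] labels) {
        checkInputs(predictions, labels);
        double sum = 0.0;
        for (int i = 0; i < predictions.length; i++) {
            sum += Math.abs(predictions[i] - labels[i]);
        }
        return sum / predictions.length;
    }

    /** Mean squared error: the average of (prediction - label)^2. */
    static double meanSquaredError(double[] predictions, double[] labels) {
        checkInputs(predictions, labels);
        double sum = 0.0;
        for (int i = 0; i < predictions.length; i++) {
            double diff = predictions[i] - labels[i];
            sum += diff * diff;
        }
        return sum / predictions.length;
    }

    private static void checkInputs(double[] predictions, double[] labels) {
        if (predictions.length == 0 || predictions.length != labels.length) {
            throw new IllegalArgumentException("inputs must be non-empty and of equal length");
        }
    }

    public static void main(String[] args) {
        double[] predictions = {2.5, 0.0, 2.0, 8.0};
        double[] labels = {3.0, -0.5, 2.0, 7.0};
        System.out.println("MAE = " + meanAbsoluteError(predictions, labels));
        System.out.println("MSE = " + meanSquaredError(predictions, labels));
    }
}
```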
- MEMORY_AND_DISK - Static variable in class org.apache.spark.api.java.StorageLevels
-
- MEMORY_AND_DISK() - Static method in class org.apache.spark.storage.StorageLevel
-
- MEMORY_AND_DISK_2 - Static variable in class org.apache.spark.api.java.StorageLevels
-
- MEMORY_AND_DISK_2() - Static method in class org.apache.spark.storage.StorageLevel
-
- MEMORY_AND_DISK_SER - Static variable in class org.apache.spark.api.java.StorageLevels
-
- MEMORY_AND_DISK_SER() - Static method in class org.apache.spark.storage.StorageLevel
-
- MEMORY_AND_DISK_SER_2 - Static variable in class org.apache.spark.api.java.StorageLevels
-
- MEMORY_AND_DISK_SER_2() - Static method in class org.apache.spark.storage.StorageLevel
-
- MEMORY_ONLY - Static variable in class org.apache.spark.api.java.StorageLevels
-
- MEMORY_ONLY() - Static method in class org.apache.spark.storage.StorageLevel
-
- MEMORY_ONLY_2 - Static variable in class org.apache.spark.api.java.StorageLevels
-
- MEMORY_ONLY_2() - Static method in class org.apache.spark.storage.StorageLevel
-
- MEMORY_ONLY_SER - Static variable in class org.apache.spark.api.java.StorageLevels
-
- MEMORY_ONLY_SER() - Static method in class org.apache.spark.storage.StorageLevel
-
- MEMORY_ONLY_SER_2 - Static variable in class org.apache.spark.api.java.StorageLevels
-
- MEMORY_ONLY_SER_2() - Static method in class org.apache.spark.storage.StorageLevel
-
- memoryBytesSpilled() - Method in class org.apache.spark.status.api.v1.ExecutorStageSummary
-
- memoryBytesSpilled() - Method in class org.apache.spark.status.api.v1.StageData
-
- memoryBytesSpilled() - Method in class org.apache.spark.status.api.v1.TaskMetricDistributions
-
- memoryBytesSpilled() - Method in class org.apache.spark.status.api.v1.TaskMetrics
-
- MemoryEntry - Class in org.apache.spark.storage
-
- MemoryEntry(Object, long, boolean) - Constructor for class org.apache.spark.storage.MemoryEntry
-
- memoryRemaining() - Method in class org.apache.spark.status.api.v1.RDDDataDistribution
-
- memoryUsed() - Method in class org.apache.spark.status.api.v1.ExecutorSummary
-
- memoryUsed() - Method in class org.apache.spark.status.api.v1.RDDDataDistribution
-
- memoryUsed() - Method in class org.apache.spark.status.api.v1.RDDPartitionInfo
-
- memoryUsed() - Method in class org.apache.spark.status.api.v1.RDDStorageInfo
-
- memRemaining() - Method in class org.apache.spark.storage.StorageStatus
-
Return the memory remaining in this block manager.
- memSize() - Method in class org.apache.spark.storage.BlockStatus
-
- memSize() - Method in class org.apache.spark.storage.RDDInfo
-
- memUsed() - Method in class org.apache.spark.storage.StorageStatus
-
Return the memory used by this block manager.
- memUsedByRdd(int) - Method in class org.apache.spark.storage.StorageStatus
-
Return the memory used by the given RDD in this block manager in O(1) time.
- merge(R) - Method in class org.apache.spark.Accumulable
-
Merge two accumulable objects together
- merge(LogisticAggregator) - Method in class org.apache.spark.ml.classification.LogisticAggregator
-
Merge another LogisticAggregator, and update the loss and gradient
of the objective function.
- merge(VectorIndexer.CategoryStats) - Method in class org.apache.spark.ml.feature.VectorIndexer.CategoryStats
-
Merge with another instance, modifying this instance.
- merge(LeastSquaresAggregator) - Method in class org.apache.spark.ml.regression.LeastSquaresAggregator
-
Merge another LeastSquaresAggregator, and update the loss and gradient
of the objective function.
- merge(IDF.DocumentFrequencyAggregator) - Method in class org.apache.spark.mllib.feature.IDF.DocumentFrequencyAggregator
-
Merges another.
- merge(MultivariateOnlineSummarizer) - Method in class org.apache.spark.mllib.stat.MultivariateOnlineSummarizer
-
Merge another MultivariateOnlineSummarizer, and update the statistical summary.
- merge(double) - Method in class org.apache.spark.util.StatCounter
-
Add a value into this StatCounter, updating the internal statistics.
- merge(TraversableOnce<Object>) - Method in class org.apache.spark.util.StatCounter
-
Add multiple values into this StatCounter, updating the internal statistics.
- merge(StatCounter) - Method in class org.apache.spark.util.StatCounter
-
Merge another StatCounter into this one, adding up the internal statistics.
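The three StatCounter merge overloads above add single values, collections of values, or another counter's statistics. As a sketch of how such merging can work, the hypothetical class below keeps a count, a mean, and a sum of squared deviations, and combines two instances with the standard pairwise update. It is an illustration under those assumptions, not a copy of StatCounter, and its field and method names are invented here.

```java
// Hypothetical sketch, not Spark's StatCounter: mergeable running statistics.
final class RunningStatsSketch {
    private long count = 0;
    private double mean = 0.0;
    private double m2 = 0.0; // sum of squared deviations from the current mean

    /** Adds one value, updating count, mean and m2 incrementally. */
    RunningStatsSketch add(double value) {
        count++;
        double delta = value - mean;
        mean += delta / count;
        m2 += delta * (value - mean);
        return this;
    }

    /** Merges another instance into this one without revisiting the raw values. */
    RunningStatsSketch merge(RunningStatsSketch other) {
        if (other.count == 0) {
            return this;
        }
        if (count == 0) {
            count = other.count;
            mean = other.mean;
            m2 = other.m2;
            return this;
        }
        double delta = other.mean - mean;
        long total = count + other.count;
        // Both updates below use the pre-merge count of this instance.
        mean += delta * other.count / total;
        m2 += other.m2 + delta * delta * count * other.count / total;
        count = total;
        return this;
    }

    long count() { return count; }

    double mean() { return mean; }

    /** Population variance; NaN when no values have been added. */
    double variance() { return count == 0 ? Double.NaN : m2 / count; }

    public static void main(String[] args) {
        RunningStatsSketch left = new RunningStatsSketch().add(1).add(2).add(3);
        RunningStatsSketch right = new RunningStatsSketch().add(4).add(5).add(6);
        left.merge(right);
        System.out.println(left.count() + " values, mean " + left.mean()
                + ", variance " + left.variance());
    }
}
```

Because merging needs only the three summary fields, partial results computed on separate partitions can be combined in any order.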
- mergeCombiners() - Method in class org.apache.spark.Aggregator
-
- mergeValue() - Method in class org.apache.spark.Aggregator
-
- message() - Method in class org.apache.spark.FetchFailed
-
- message() - Method in exception org.apache.spark.sql.AnalysisException
-
- Metadata - Class in org.apache.spark.sql.types
-
:: DeveloperApi ::
- metadata() - Method in class org.apache.spark.sql.types.StructField
-
- MetadataBuilder - Class in org.apache.spark.sql.types
-
:: DeveloperApi ::
- MetadataBuilder() - Constructor for class org.apache.spark.sql.types.MetadataBuilder
-
- method() - Method in class org.apache.spark.mllib.stat.test.ChiSqTestResult
-
- MethodIdentifier<T> - Class in org.apache.spark.util
-
Helper class to identify a method.
- MethodIdentifier(Class<T>, String, String) - Constructor for class org.apache.spark.util.MethodIdentifier
-
- metricName() - Method in class org.apache.spark.ml.evaluation.BinaryClassificationEvaluator
-
param for metric name in evaluation
- metricName() - Method in class org.apache.spark.ml.evaluation.RegressionEvaluator
-
param for metric name in evaluation (supports "rmse" (default), "mse", "r2", and "mae")
- metrics() - Method in class org.apache.spark.ExceptionFailure
-
- metricsSystem() - Method in class org.apache.spark.SparkContext
-
- metricsSystem() - Method in class org.apache.spark.SparkEnv
-
- MFDataGenerator - Class in org.apache.spark.mllib.util
-
:: DeveloperApi ::
Generate RDD(s) containing data for Matrix Factorization.
- MFDataGenerator() - Constructor for class org.apache.spark.mllib.util.MFDataGenerator
-
- microF1Measure() - Method in class org.apache.spark.mllib.evaluation.MultilabelMetrics
-
Returns micro-averaged label-based f1-measure
(equal to micro-averaged document-based f1-measure)
- microPrecision() - Method in class org.apache.spark.mllib.evaluation.MultilabelMetrics
-
Returns micro-averaged label-based precision
(equal to micro-averaged document-based precision)
- microRecall() - Method in class org.apache.spark.mllib.evaluation.MultilabelMetrics
-
Returns micro-averaged label-based recall
(equal to micro-averaged document-based recall)
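The micro-averaged measures above pool the counts over all documents before dividing. The following is a rough sketch under stated assumptions, not the `MultilabelMetrics` implementation: the class `MicroAverages` and its array-based input are hypothetical, and the labels within each document are assumed to be distinct.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch of micro-averaged precision, recall and F1.
final class MicroAverages {
    // predicted[i] and actual[i] hold the label sets of document i.
    // Returns {precision, recall, f1}.
    static double[] compute(int[][] predicted, int[][] actual) {
        long truePositives = 0;
        long predictedCount = 0;
        long actualCount = 0;
        for (int i = 0; i < predicted.length; i++) {
            Set<Integer> truth = new HashSet<>();
            for (int label : actual[i]) {
                truth.add(label);
            }
            for (int label : predicted[i]) {
                if (truth.contains(label)) {
                    truePositives++; // predicted label that is also a true label
                }
            }
            predictedCount += predicted[i].length;
            actualCount += actual[i].length;
        }
        double precision = (double) truePositives / predictedCount;
        double recall = (double) truePositives / actualCount;
        double f1 = 2.0 * truePositives / (predictedCount + actualCount);
        return new double[] {precision, recall, f1};
    }
}
```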
- milliseconds() - Method in class org.apache.spark.streaming.Duration
-
- milliseconds(long) - Static method in class org.apache.spark.streaming.Durations
-
- Milliseconds - Class in org.apache.spark.streaming
-
Helper object that creates an instance of Duration representing
a given number of milliseconds.
- Milliseconds() - Constructor for class org.apache.spark.streaming.Milliseconds
-
- milliseconds() - Method in class org.apache.spark.streaming.Time
-
- millisToString(long) - Static method in class org.apache.spark.scheduler.StatsReportListener
-
Reformat a time interval in milliseconds to a prettier format for output
- min() - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
Returns the minimum element from this RDD as defined by
the default comparator natural order.
- min(Comparator<T>) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Returns the minimum element from this RDD as defined by the specified
Comparator[T].
- min() - Method in class org.apache.spark.ml.attribute.NumericAttribute
-
- min() - Method in class org.apache.spark.mllib.stat.MultivariateOnlineSummarizer
-
- min() - Method in interface org.apache.spark.mllib.stat.MultivariateStatisticalSummary
-
Minimum value of each column.
- min(Ordering<T>) - Method in class org.apache.spark.rdd.RDD
-
Returns the min of this RDD as defined by the implicit Ordering[T].
- min(Column) - Static method in class org.apache.spark.sql.functions
-
Aggregate function: returns the minimum value of the expression in a group.
- min(String) - Static method in class org.apache.spark.sql.functions
-
Aggregate function: returns the minimum value of the column in a group.
- min(String...) - Method in class org.apache.spark.sql.GroupedData
-
Compute the min value for each numeric column for each group.
- min(Seq<String>) - Method in class org.apache.spark.sql.GroupedData
-
Compute the min value for each numeric column for each group.
- min(Duration) - Method in class org.apache.spark.streaming.Duration
-
- min(Time) - Method in class org.apache.spark.streaming.Time
-
- min() - Method in class org.apache.spark.util.StatCounter
-
- minDocFreq() - Method in class org.apache.spark.mllib.feature.IDF.DocumentFrequencyAggregator
-
- minDocFreq() - Method in class org.apache.spark.mllib.feature.IDF
-
- minInfoGain() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- minInstancesPerNode() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- MinMax() - Static method in class org.apache.spark.mllib.tree.configuration.QuantileStrategy
-
- minTokenLength() - Method in class org.apache.spark.ml.feature.RegexTokenizer
-
Minimum token length, >= 0.
- minus(RDD<Tuple2<Object, VD>>) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
-
- minus(VertexRDD<VD>) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
-
- minus(RDD<Tuple2<Object, VD>>) - Method in class org.apache.spark.graphx.VertexRDD
-
For each VertexId present in both this and other, minus will act as a set difference
operation returning only those unique VertexId's present in this.
- minus(VertexRDD<VD>) - Method in class org.apache.spark.graphx.VertexRDD
-
For each VertexId present in both this and other, minus will act as a set difference
operation returning only those unique VertexId's present in this.
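One way to read the `minus` description is as a set difference on vertex IDs: keep the vertices of this whose ID does not occur in other. The sketch below models that reading on plain maps; the class `VertexMinus` is hypothetical and uses `java.util.Map` in place of a `VertexRDD`.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: set difference by vertex ID on in-memory maps.
final class VertexMinus {
    // Returns the entries of self whose key is absent from other.
    // The attribute values in other are ignored; only its IDs matter.
    static <V> Map<Long, V> minus(Map<Long, V> self, Map<Long, ?> other) {
        Map<Long, V> result = new TreeMap<>(self); // copy, so self is left untouched
        result.keySet().removeAll(other.keySet());
        return result;
    }
}
```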
- minus(Object) - Method in class org.apache.spark.sql.Column
-
Subtraction.
- minus(Duration) - Method in class org.apache.spark.streaming.Duration
-
- minus(Time) - Method in class org.apache.spark.streaming.Time
-
- minus(Duration) - Method in class org.apache.spark.streaming.Time
-
- minutes() - Static method in class org.apache.spark.scheduler.StatsReportListener
-
- minutes(long) - Static method in class org.apache.spark.streaming.Durations
-
- Minutes - Class in org.apache.spark.streaming
-
Helper object that creates an instance of Duration representing
a given number of minutes.
- Minutes() - Constructor for class org.apache.spark.streaming.Minutes
-
- minVal() - Method in class org.apache.spark.graphx.lib.SVDPlusPlus.Conf
-
- mkString() - Method in interface org.apache.spark.sql.Row
-
Displays all elements of this sequence in a string (without a separator).
- mkString(String) - Method in interface org.apache.spark.sql.Row
-
Displays all elements of this sequence in a string using a separator string.
- mkString(String, String, String) - Method in interface org.apache.spark.sql.Row
-
Displays all elements of this traversable or iterator in a string using
start, end, and separator strings.
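The three-argument form can be mimicked in plain Java with `Collectors.joining`. This is only an illustration on a `java.util.List`, not on a `Row`; the class `MkStringSketch` is hypothetical, and the argument order (start, separator, end) is assumed to follow Scala's `mkString` convention.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical sketch of mkString(start, sep, end) on a plain Java list.
final class MkStringSketch {
    static String mkString(List<?> values, String start, String sep, String end) {
        return values.stream()
                .map(v -> String.valueOf(v))
                // Collectors.joining takes (delimiter, prefix, suffix).
                .collect(Collectors.joining(sep, start, end));
    }
}
```

Passing empty strings for start and end gives the separator-only form, and an empty separator as well gives the no-argument form.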
- MLPairRDDFunctions<K,V> - Class in org.apache.spark.mllib.rdd
-
Machine learning specific Pair RDD functions.
- MLPairRDDFunctions(RDD<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>) - Constructor for class org.apache.spark.mllib.rdd.MLPairRDDFunctions
-
- MLUtils - Class in org.apache.spark.mllib.util
-
Helper methods to load, save and pre-process data used in MLlib.
- MLUtils() - Constructor for class org.apache.spark.mllib.util.MLUtils
-
- mod(Object) - Method in class org.apache.spark.sql.Column
-
Modulo (a.k.a. remainder) expression.
- mode(SaveMode) - Method in class org.apache.spark.sql.DataFrameWriter
-
Specifies the behavior when data or table already exists.
- mode(String) - Method in class org.apache.spark.sql.DataFrameWriter
-
Specifies the behavior when data or table already exists.
- Model<M extends Model<M>> - Class in org.apache.spark.ml
-
- Model() - Constructor for class org.apache.spark.ml.Model
-
- models() - Method in class org.apache.spark.ml.classification.OneVsRestModel
-
- modelType() - Method in class org.apache.spark.mllib.classification.NaiveBayesModel
-
- MODULE$ - Static variable in class org.apache.spark.AccumulatorParam.DoubleAccumulatorParam$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.AccumulatorParam.FloatAccumulatorParam$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.AccumulatorParam.IntAccumulatorParam$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.AccumulatorParam.LongAccumulatorParam$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.graphx.PartitionStrategy.CanonicalRandomVertexCut$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.graphx.PartitionStrategy.EdgePartition1D$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.graphx.PartitionStrategy.EdgePartition2D$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.graphx.PartitionStrategy.RandomVertexCut$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.ml.recommendation.ALS.Rating$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.mllib.clustering.PowerIterationClustering.Assignment$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.SparkContext.DoubleAccumulatorParam$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.SparkContext.FloatAccumulatorParam$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.SparkContext.IntAccumulatorParam$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.SparkContext.LongAccumulatorParam$
-
Static reference to the singleton instance of this Scala object.
- MODULE$ - Static variable in class org.apache.spark.util.Vector.VectorAccumParam$
-
Static reference to the singleton instance of this Scala object.
- monotonicallyIncreasingId() - Static method in class org.apache.spark.sql.functions
-
A column expression that generates monotonically increasing 64-bit integers.
- MQTTUtils - Class in org.apache.spark.streaming.mqtt
-
- MQTTUtils() - Constructor for class org.apache.spark.streaming.mqtt.MQTTUtils
-
- mu() - Method in class org.apache.spark.mllib.stat.distribution.MultivariateGaussian
-
- MulticlassMetrics - Class in org.apache.spark.mllib.evaluation
-
:: Experimental ::
Evaluator for multiclass classification.
- MulticlassMetrics(RDD<Tuple2<Object, Object>>) - Constructor for class org.apache.spark.mllib.evaluation.MulticlassMetrics
-
- MultilabelMetrics - Class in org.apache.spark.mllib.evaluation
-
Evaluator for multilabel classification.
- MultilabelMetrics(RDD<Tuple2<double[], double[]>>) - Constructor for class org.apache.spark.mllib.evaluation.MultilabelMetrics
-
- multiLabelValidator(int) - Static method in class org.apache.spark.mllib.util.DataValidators
-
Function to check if labels used for k class multi-label classification are
in the range of {0, 1, ..., k - 1}.
- Multinomial() - Static method in class org.apache.spark.mllib.classification.NaiveBayes
-
String name for multinomial model type.
- multiply(BlockMatrix) - Method in class org.apache.spark.mllib.linalg.distributed.BlockMatrix
-
- multiply(Matrix) - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix
-
- multiply(Matrix) - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
-
Multiply this matrix by a local matrix on the right.
- multiply(DenseMatrix) - Method in interface org.apache.spark.mllib.linalg.Matrix
-
Convenience method for `Matrix`-`DenseMatrix` multiplication.
- multiply(DenseVector) - Method in interface org.apache.spark.mllib.linalg.Matrix
-
Convenience method for `Matrix`-`DenseVector` multiplication.
- multiply(Vector) - Method in interface org.apache.spark.mllib.linalg.Matrix
-
Convenience method for `Matrix`-`Vector` multiplication.
- multiply(Object) - Method in class org.apache.spark.sql.Column
-
Multiplication of this expression and another expression.
- multiply(double) - Method in class org.apache.spark.util.Vector
-
- MultivariateGaussian - Class in org.apache.spark.mllib.stat.distribution
-
:: DeveloperApi ::
This class provides basic functionality for a Multivariate Gaussian (Normal) Distribution.
- MultivariateGaussian(Vector, Matrix) - Constructor for class org.apache.spark.mllib.stat.distribution.MultivariateGaussian
-
- MultivariateOnlineSummarizer - Class in org.apache.spark.mllib.stat
-
:: DeveloperApi ::
MultivariateOnlineSummarizer implements MultivariateStatisticalSummary to compute the mean,
variance, minimum, maximum, counts, and nonzero counts for samples in sparse or dense vector
format in an online fashion.
- MultivariateOnlineSummarizer() - Constructor for class org.apache.spark.mllib.stat.MultivariateOnlineSummarizer
-
- MultivariateStatisticalSummary - Interface in org.apache.spark.mllib.stat
-
Trait for multivariate statistical summary of a data matrix.
- mustCheckpoint() - Method in class org.apache.spark.streaming.dstream.DStream
-
- MutablePair<T1,T2> - Class in org.apache.spark.util
-
:: DeveloperApi ::
A tuple of 2 elements.
- MutablePair(T1, T2) - Constructor for class org.apache.spark.util.MutablePair
-
- MutablePair() - Constructor for class org.apache.spark.util.MutablePair
-
No-arg constructor for serialization
- myName() - Method in class org.apache.spark.util.InnerClosureFinder
-
- MySQLDialect - Class in org.apache.spark.sql.jdbc
-
:: DeveloperApi ::
Default mysql dialect to read bit/bitsets correctly.
- MySQLDialect() - Constructor for class org.apache.spark.sql.jdbc.MySQLDialect
-
- p() - Method in class org.apache.spark.ml.feature.Normalizer
-
Normalization in L^p^ space.
- pageRank(double, double) - Method in class org.apache.spark.graphx.GraphOps
-
Run a dynamic version of PageRank returning a graph with vertex attributes containing the
PageRank and edge attributes containing the normalized edge weight.
- PageRank - Class in org.apache.spark.graphx.lib
-
PageRank algorithm implementation.
- PageRank() - Constructor for class org.apache.spark.graphx.lib.PageRank
-
- PairDStreamFunctions<K,V> - Class in org.apache.spark.streaming.dstream
-
Extra functions available on DStream of (key, value) pairs through an implicit conversion.
- PairDStreamFunctions(DStream<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>, Ordering<K>) - Constructor for class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
- PairFlatMapFunction<T,K,V> - Interface in org.apache.spark.api.java.function
-
A function that returns zero or more key-value pair records from each input record.
- PairFunction<T,K,V> - Interface in org.apache.spark.api.java.function
-
A function that returns key-value pairs (Tuple2<K, V>), and can be used to
construct PairRDDs.
- PairRDDFunctions<K,V> - Class in org.apache.spark.rdd
-
Extra functions available on RDDs of (key, value) pairs through an implicit conversion.
- PairRDDFunctions(RDD<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>, Ordering<K>) - Constructor for class org.apache.spark.rdd.PairRDDFunctions
-
- PairwiseRRDD<T> - Class in org.apache.spark.api.r
-
Form an RDD[(Int, Array[Byte])] from key-value pairs returned from R.
- PairwiseRRDD(RDD<T>, int, byte[], String, byte[], String, Object[], ClassTag<T>) - Constructor for class org.apache.spark.api.r.PairwiseRRDD
-
- parallelize(List<T>, int) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Distribute a local Scala collection to form an RDD.
- parallelize(List<T>) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Distribute a local Scala collection to form an RDD.
- parallelize(Seq<T>, int, ClassTag<T>) - Method in class org.apache.spark.SparkContext
-
Distribute a local Scala collection to form an RDD.
- parallelizeDoubles(List<Double>, int) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Distribute a local Scala collection to form an RDD.
- parallelizeDoubles(List<Double>) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Distribute a local Scala collection to form an RDD.
- parallelizePairs(List<Tuple2<K, V>>, int) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Distribute a local Scala collection to form an RDD.
- parallelizePairs(List<Tuple2<K, V>>) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Distribute a local Scala collection to form an RDD.
- Param<T> - Class in org.apache.spark.ml.param
-
:: DeveloperApi ::
A param with self-contained documentation and optionally default value.
- Param(String, String, String, Function1<T, Object>) - Constructor for class org.apache.spark.ml.param.Param
-
- Param(org.apache.spark.ml.util.Identifiable, String, String, Function1<T, Object>) - Constructor for class org.apache.spark.ml.param.Param
-
- Param(String, String, String) - Constructor for class org.apache.spark.ml.param.Param
-
- Param(org.apache.spark.ml.util.Identifiable, String, String) - Constructor for class org.apache.spark.ml.param.Param
-
- param() - Method in class org.apache.spark.ml.param.ParamPair
-
- ParamGridBuilder - Class in org.apache.spark.ml.tuning
-
:: Experimental ::
Builder for a param grid used in grid search-based model selection.
- ParamGridBuilder() - Constructor for class org.apache.spark.ml.tuning.ParamGridBuilder
-
- ParamMap - Class in org.apache.spark.ml.param
-
:: Experimental ::
A param to value map.
- ParamMap() - Constructor for class org.apache.spark.ml.param.ParamMap
-
Creates an empty param map.
- paramMap() - Method in interface org.apache.spark.ml.param.Params
-
Internal param map for user-supplied values.
- ParamPair<T> - Class in org.apache.spark.ml.param
-
:: Experimental ::
A param and its value.
- ParamPair(Param<T>, T) - Constructor for class org.apache.spark.ml.param.ParamPair
-
- Params - Interface in org.apache.spark.ml.param
-
:: DeveloperApi ::
Trait for components that take parameters.
- params() - Method in interface org.apache.spark.ml.param.Params
-
Returns all params sorted by their names.
- ParamValidators - Class in org.apache.spark.ml.param
-
:: DeveloperApi ::
Factory methods for common validation functions for Param.isValid.
- ParamValidators() - Constructor for class org.apache.spark.ml.param.ParamValidators
-
- parent() - Method in class org.apache.spark.ml.Model
-
The parent estimator that produced this model.
- parent() - Method in class org.apache.spark.ml.param.Param
-
- parentIds() - Method in class org.apache.spark.scheduler.StageInfo
-
- parentIds() - Method in class org.apache.spark.storage.RDDInfo
-
- parentIndex(int) - Static method in class org.apache.spark.mllib.tree.model.Node
-
Get the parent index of the given node, or 0 if it is the root.
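A parent index of 0 for the root is what falls out of a heap-style numbering in which the root has index 1 and the children of node i are 2i and 2i + 1. The sketch below assumes that numbering; the class `NodeIndexSketch` is hypothetical and is not the `Node` source.

```java
// Hypothetical sketch of heap-style node indexing, assuming the root has index 1.
final class NodeIndexSketch {
    static int leftChildIndex(int nodeIndex) {
        return nodeIndex << 1;
    }

    static int rightChildIndex(int nodeIndex) {
        return (nodeIndex << 1) + 1;
    }

    // Halving undoes either child step; for the root (index 1) this yields 0.
    static int parentIndex(int nodeIndex) {
        return nodeIndex >> 1;
    }
}
```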
- parquet(String...) - Method in class org.apache.spark.sql.DataFrameReader
-
Loads a Parquet file, returning the result as a DataFrame.
- parquet(Seq<String>) - Method in class org.apache.spark.sql.DataFrameReader
-
Loads a Parquet file, returning the result as a DataFrame.
- parquet(String) - Method in class org.apache.spark.sql.DataFrameWriter
-
Saves the content of the DataFrame in Parquet format at the specified path.
- parquetFile(String...) - Method in class org.apache.spark.sql.SQLContext
-
Deprecated.
As of 1.4.0, replaced by read().parquet().
- parquetFile(Seq<String>) - Method in class org.apache.spark.sql.SQLContext
-
- parse(String) - Static method in class org.apache.spark.mllib.linalg.Vectors
-
Parses a string resulting from Vector.toString into a Vector.
- parse(String) - Static method in class org.apache.spark.mllib.regression.LabeledPoint
-
- parseIgnoreCase(Class<E>, String) - Static method in class org.apache.spark.util.EnumUtil
-
- PartialResult<R> - Class in org.apache.spark.partial
-
- PartialResult(R, boolean) - Constructor for class org.apache.spark.partial.PartialResult
-
- Partition - Interface in org.apache.spark
-
An identifier for a partition in an RDD.
- partition() - Method in class org.apache.spark.streaming.kafka.OffsetRange
-
Kafka partition id
- partitionBy(Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Return a copy of the RDD partitioned using the specified partitioner.
- partitionBy(PartitionStrategy) - Method in class org.apache.spark.graphx.Graph
-
Repartitions the edges in the graph according to partitionStrategy.
- partitionBy(PartitionStrategy, int) - Method in class org.apache.spark.graphx.Graph
-
Repartitions the edges in the graph according to partitionStrategy.
- partitionBy(PartitionStrategy) - Method in class org.apache.spark.graphx.impl.GraphImpl
-
- partitionBy(PartitionStrategy, int) - Method in class org.apache.spark.graphx.impl.GraphImpl
-
- partitionBy(Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Return a copy of the RDD partitioned using the specified partitioner.
- partitionBy(String...) - Method in class org.apache.spark.sql.DataFrameWriter
-
Partitions the output by the given columns on the file system.
- partitionBy(Seq<String>) - Method in class org.apache.spark.sql.DataFrameWriter
-
Partitions the output by the given columns on the file system.
- partitionBy(String, String...) - Static method in class org.apache.spark.sql.expressions.Window
-
Creates a WindowSpec with the partitioning defined.
- partitionBy(Column...) - Static method in class org.apache.spark.sql.expressions.Window
-
Creates a WindowSpec with the partitioning defined.
- partitionBy(String, Seq<String>) - Static method in class org.apache.spark.sql.expressions.Window
-
Creates a WindowSpec with the partitioning defined.
- partitionBy(Seq<Column>) - Static method in class org.apache.spark.sql.expressions.Window
-
Creates a WindowSpec with the partitioning defined.
- partitionBy(String, String...) - Method in class org.apache.spark.sql.expressions.WindowSpec
-
- partitionBy(Column...) - Method in class org.apache.spark.sql.expressions.WindowSpec
-
- partitionBy(String, Seq<String>) - Method in class org.apache.spark.sql.expressions.WindowSpec
-
- partitionBy(Seq<Column>) - Method in class org.apache.spark.sql.expressions.WindowSpec
-
- PartitionCoalescer - Class in org.apache.spark.rdd
-
Coalesce the partitions of a parent RDD (prev) into fewer partitions, so that each partition of
this RDD computes one or more of the parent ones.
- PartitionCoalescer(int, RDD<?>, double) - Constructor for class org.apache.spark.rdd.PartitionCoalescer
-
- PartitionCoalescer.LocationIterator - Class in org.apache.spark.rdd
-
- PartitionCoalescer.LocationIterator(RDD<?>) - Constructor for class org.apache.spark.rdd.PartitionCoalescer.LocationIterator
-
- partitionColumns() - Method in class org.apache.spark.sql.sources.HadoopFsRelation
-
Partition columns.
- partitioner() - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
-
If partitionsRDD already has a partitioner, use it.
- partitioner() - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
-
- Partitioner - Class in org.apache.spark
-
An object that defines how the elements in a key-value pair RDD are partitioned by key.
- Partitioner() - Constructor for class org.apache.spark.Partitioner
-
- partitioner() - Method in class org.apache.spark.rdd.CoGroupedRDD
-
- partitioner() - Method in class org.apache.spark.rdd.RDD
-
Optionally overridden by subclasses to specify how they are partitioned.
- partitioner() - Method in class org.apache.spark.rdd.ShuffledRDD
-
- partitioner() - Method in class org.apache.spark.ShuffleDependency
-
- PartitionGroup - Class in org.apache.spark.rdd
-
- PartitionGroup(Option<String>) - Constructor for class org.apache.spark.rdd.PartitionGroup
-
- partitionID() - Method in class org.apache.spark.TaskCommitDenied
-
- partitionId() - Method in class org.apache.spark.TaskContext
-
The ID of the RDD partition that is computed by this task.
- PartitionPruningRDD<T> - Class in org.apache.spark.rdd
-
:: DeveloperApi ::
An RDD used to prune RDD partitions so we can avoid launching tasks on
all partitions.
- PartitionPruningRDD(RDD<T>, Function1<Object, Object>, ClassTag<T>) - Constructor for class org.apache.spark.rdd.PartitionPruningRDD
-
- partitions() - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Set of partitions in this RDD.
- partitions() - Method in class org.apache.spark.rdd.RDD
-
Get the array of partitions of this RDD, taking into account whether the
RDD is checkpointed or not.
- partitions() - Method in class org.apache.spark.status.api.v1.RDDStorageInfo
-
- partitionsRDD() - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
-
- partitionsRDD() - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
-
- PartitionStrategy - Interface in org.apache.spark.graphx
-
Represents the way edges are assigned to edge partitions based on their source and destination
vertex IDs.
- PartitionStrategy.CanonicalRandomVertexCut$ - Class in org.apache.spark.graphx
-
Assigns edges to partitions by hashing the source and destination vertex IDs in a canonical
direction, resulting in a random vertex cut that colocates all edges between two vertices,
regardless of direction.
- PartitionStrategy.CanonicalRandomVertexCut$() - Constructor for class org.apache.spark.graphx.PartitionStrategy.CanonicalRandomVertexCut$
-
- PartitionStrategy.EdgePartition1D$ - Class in org.apache.spark.graphx
-
Assigns edges to partitions using only the source vertex ID, colocating edges with the same
source.
- PartitionStrategy.EdgePartition1D$() - Constructor for class org.apache.spark.graphx.PartitionStrategy.EdgePartition1D$
-
- PartitionStrategy.EdgePartition2D$ - Class in org.apache.spark.graphx
-
Assigns edges to partitions using a 2D partitioning of the sparse edge adjacency matrix,
guaranteeing a 2 * sqrt(numParts) - 1 bound on vertex replication.
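The 2 * sqrt(numParts) - 1 bound can be seen on a simplified grid assignment: lay the partitions out as a square grid, pick the column from the source ID and the row from the destination ID. A vertex then only ever appears in one column (as a source) and one row (as a destination), which share one cell. The sketch below is an illustration under assumptions, not the GraphX implementation: the class `Grid2DSketch` is hypothetical, numParts is assumed to be a perfect square, vertex IDs are assumed non-negative, and no hashing or mixing of IDs is applied.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch of a 2D grid edge partitioning.
final class Grid2DSketch {
    // Maps an edge (src, dst) to a cell of a side x side grid of partitions.
    static int partitionOf(long src, long dst, int numParts) {
        int side = (int) Math.round(Math.sqrt(numParts));
        int col = (int) (src % side); // column chosen by the source vertex
        int row = (int) (dst % side); // row chosen by the destination vertex
        return col * side + row;
    }

    // Counts the distinct partitions holding an edge that touches vertex v,
    // over all edges between v and the vertices 0..maxId in either direction.
    static int replication(long v, long maxId, int numParts) {
        Set<Integer> parts = new HashSet<>();
        for (long u = 0; u <= maxId; u++) {
            parts.add(partitionOf(v, u, numParts));
            parts.add(partitionOf(u, v, numParts));
        }
        return parts.size();
    }
}
```

With numParts = 16 the grid has side 4, so a vertex connected to everything is replicated to at most 4 + 4 - 1 = 7 partitions.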
- PartitionStrategy.EdgePartition2D$() - Constructor for class org.apache.spark.graphx.PartitionStrategy.EdgePartition2D$
-
- PartitionStrategy.RandomVertexCut$ - Class in org.apache.spark.graphx
-
Assigns edges to partitions by hashing the source and destination vertex IDs, resulting in a
random vertex cut that colocates all same-direction edges between two vertices.
- PartitionStrategy.RandomVertexCut$() - Constructor for class org.apache.spark.graphx.PartitionStrategy.RandomVertexCut$
-
- path() - Method in class org.apache.spark.scheduler.InputFormatInfo
-
- path() - Method in class org.apache.spark.scheduler.SplitInfo
-
- paths() - Method in class org.apache.spark.sql.sources.HadoopFsRelation
-
Base paths of this relation.
- pattern() - Method in class org.apache.spark.ml.feature.RegexTokenizer
-
Regex pattern used to match delimiters if gaps is true or tokens if gaps is false.
- pc() - Method in class org.apache.spark.mllib.feature.PCAModel
-
- PCA - Class in org.apache.spark.mllib.feature
-
A feature transformer that projects vectors to a low-dimensional space using PCA.
- PCA(int) - Constructor for class org.apache.spark.mllib.feature.PCA
-
- PCAModel - Class in org.apache.spark.mllib.feature
-
Model fitted by PCA that can project vectors to a low-dimensional space using PCA.
- pdf(Vector) - Method in class org.apache.spark.mllib.stat.distribution.MultivariateGaussian
-
Returns density of this multivariate Gaussian at given point, x
- pendingStages() - Method in class org.apache.spark.ui.jobs.JobProgressListener
-
- percentiles() - Static method in class org.apache.spark.scheduler.StatsReportListener
-
- percentilesHeader() - Static method in class org.apache.spark.scheduler.StatsReportListener
-
- percentRank() - Static method in class org.apache.spark.sql.functions
-
Window function: returns the relative rank (i.e. percentile) of rows within a window partition.
- persist(StorageLevel) - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
Set this RDD's storage level to persist its values across operations after the first time
it is computed.
- persist(StorageLevel) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Set this RDD's storage level to persist its values across operations after the first time
it is computed.
- persist(StorageLevel) - Method in class org.apache.spark.api.java.JavaRDD
-
Set this RDD's storage level to persist its values across operations after the first time
it is computed.
- persist(StorageLevel) - Method in class org.apache.spark.graphx.Graph
-
Caches the vertices and edges associated with this graph at the specified storage level,
ignoring any target storage levels previously set.
- persist(StorageLevel) - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
-
Persists the edge partitions at the specified storage level, ignoring any existing target
storage level.
- persist(StorageLevel) - Method in class org.apache.spark.graphx.impl.GraphImpl
-
- persist(StorageLevel) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
-
Persists the vertex partitions at the specified storage level, ignoring any existing target
storage level.
- persist(StorageLevel) - Method in class org.apache.spark.mllib.linalg.distributed.BlockMatrix
-
Persists the underlying RDD with the specified storage level.
- persist(StorageLevel) - Method in class org.apache.spark.rdd.HadoopRDD
-
- persist(StorageLevel) - Method in class org.apache.spark.rdd.NewHadoopRDD
-
- persist(StorageLevel) - Method in class org.apache.spark.rdd.RDD
-
Set this RDD's storage level to persist its values across operations after the first time
it is computed.
- persist() - Method in class org.apache.spark.rdd.RDD
-
Persist this RDD with the default storage level (`MEMORY_ONLY`).
- persist() - Method in class org.apache.spark.sql.DataFrame
-
- persist(StorageLevel) - Method in class org.apache.spark.sql.DataFrame
-
- persist() - Method in class org.apache.spark.streaming.api.java.JavaDStream
-
Persist RDDs of this DStream with the default storage level (MEMORY_ONLY_SER)
- persist(StorageLevel) - Method in class org.apache.spark.streaming.api.java.JavaDStream
-
Persist the RDDs of this DStream with the given storage level
- persist() - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Persist RDDs of this DStream with the default storage level (MEMORY_ONLY_SER)
- persist(StorageLevel) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Persist the RDDs of this DStream with the given storage level
- persist(StorageLevel) - Method in class org.apache.spark.streaming.dstream.DStream
-
Persist the RDDs of this DStream with the given storage level
- persist() - Method in class org.apache.spark.streaming.dstream.DStream
-
Persist RDDs of this DStream with the default storage level (MEMORY_ONLY_SER)
- persistentRdds() - Method in class org.apache.spark.SparkContext
-
- personalizedPageRank(long, double, double) - Method in class org.apache.spark.graphx.GraphOps
-
Run personalized PageRank for a given vertex, such that all random walks
are started relative to the source node.
- pi() - Method in class org.apache.spark.mllib.classification.NaiveBayesModel
-
- pickBin(Partition) - Method in class org.apache.spark.rdd.PartitionCoalescer
-
Takes a parent RDD partition and decides which of the partition groups to put it in.
Takes locality into account, but also uses power of 2 choices to load balance.
It strikes a balance between the two using the balanceSlack variable.
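The "power of 2 choices" part of this description refers to a general load-balancing technique: sample two candidate groups at random and put the item in the less loaded one. The sketch below shows only that technique; it leaves out locality and balanceSlack entirely, and the class `TwoChoicesSketch` with its methods is hypothetical rather than taken from `PartitionCoalescer`.

```java
import java.util.Random;

// Hypothetical sketch of "power of two choices" load balancing.
final class TwoChoicesSketch {
    // Samples two groups uniformly at random and returns the less loaded one.
    static int pick(int[] load, Random rnd) {
        int a = rnd.nextInt(load.length);
        int b = rnd.nextInt(load.length);
        return load[a] <= load[b] ? a : b;
    }

    // Assigns the given number of items to groups, one at a time.
    static int[] assign(int items, int groups, long seed) {
        Random rnd = new Random(seed);
        int[] load = new int[groups];
        for (int i = 0; i < items; i++) {
            load[pick(load, rnd)]++;
        }
        return load;
    }
}
```

Comparing two random candidates instead of one keeps the group sizes much closer together than purely random placement, at the cost of one extra lookup per item.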
- pickRandomVertex() - Method in class org.apache.spark.graphx.GraphOps
-
Picks a random vertex from the graph and returns its ID.
- pipe(String) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return an RDD created by piping elements to a forked external process.
- pipe(List<String>) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return an RDD created by piping elements to a forked external process.
- pipe(List<String>, Map<String, String>) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return an RDD created by piping elements to a forked external process.
- pipe(String) - Method in class org.apache.spark.rdd.RDD
-
Return an RDD created by piping elements to a forked external process.
- pipe(String, Map<String, String>) - Method in class org.apache.spark.rdd.RDD
-
Return an RDD created by piping elements to a forked external process.
- pipe(Seq<String>, Map<String, String>, Function1<Function1<String, BoxedUnit>, BoxedUnit>, Function2<T, Function1<String, BoxedUnit>, BoxedUnit>, boolean) - Method in class org.apache.spark.rdd.RDD
-
Return an RDD created by piping elements to a forked external process.
- Pipeline - Class in org.apache.spark.ml
-
:: Experimental ::
A simple pipeline, which acts as an estimator.
- Pipeline(String) - Constructor for class org.apache.spark.ml.Pipeline
-
- Pipeline() - Constructor for class org.apache.spark.ml.Pipeline
-
- PipelineModel - Class in org.apache.spark.ml
-
:: Experimental ::
Represents a fitted pipeline.
- PipelineStage - Class in org.apache.spark.ml
-
- PipelineStage() - Constructor for class org.apache.spark.ml.PipelineStage
-
- plus(Object) - Method in class org.apache.spark.sql.Column
-
Sum of this expression and another expression.
- plus(Duration) - Method in class org.apache.spark.streaming.Duration
-
- plus(Duration) - Method in class org.apache.spark.streaming.Time
-
- plusDot(Vector, Vector) - Method in class org.apache.spark.util.Vector
-
Returns (this + plus) dot other, without creating any intermediate storage.
- PMMLExportable - Interface in org.apache.spark.mllib.pmml
-
:: DeveloperApi ::
Export model to the PMML format.
Predictive Model Markup Language (PMML) is an XML-based file format
developed by the Data Mining Group (www.dmg.org).
- point() - Method in class org.apache.spark.mllib.feature.VocabWord
-
- POINTS() - Static method in class org.apache.spark.mllib.clustering.StreamingKMeans
-
- PoissonGenerator - Class in org.apache.spark.mllib.random
-
:: DeveloperApi ::
Generates i.i.d. samples from the Poisson distribution with the given mean.
- PoissonGenerator(double) - Constructor for class org.apache.spark.mllib.random.PoissonGenerator
-
- poissonJavaRDD(JavaSparkContext, double, long, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
- poissonJavaRDD(JavaSparkContext, double, long, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
- poissonJavaRDD(JavaSparkContext, double, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
- poissonJavaVectorRDD(JavaSparkContext, double, long, int, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
- poissonJavaVectorRDD(JavaSparkContext, double, long, int, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
- poissonJavaVectorRDD(JavaSparkContext, double, long, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
- poissonRDD(SparkContext, double, long, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
Generates an RDD comprised of i.i.d. samples from the Poisson distribution with the input mean.
- PoissonSampler<T> - Class in org.apache.spark.util.random
-
:: DeveloperApi ::
A sampler for sampling with replacement, based on values drawn from Poisson distribution.
- PoissonSampler(double, ClassTag<T>) - Constructor for class org.apache.spark.util.random.PoissonSampler
-
- poissonVectorRDD(SparkContext, double, long, int, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
Generates an RDD[Vector] with vectors containing i.i.d. samples drawn from the
Poisson distribution with the input mean.
- PolynomialExpansion - Class in org.apache.spark.ml.feature
-
:: Experimental ::
Perform feature expansion in a polynomial space.
- PolynomialExpansion(String) - Constructor for class org.apache.spark.ml.feature.PolynomialExpansion
-
- PolynomialExpansion() - Constructor for class org.apache.spark.ml.feature.PolynomialExpansion
-
- poolToActiveStages() - Method in class org.apache.spark.ui.jobs.JobProgressListener
-
- port() - Method in class org.apache.spark.storage.BlockManagerId
-
- port() - Method in class org.apache.spark.streaming.kafka.Broker
-
Broker's port
- PortableDataStream - Class in org.apache.spark.input
-
A class that allows DataStreams to be serialized and moved around by not creating them
until they need to be read.
- PortableDataStream(CombineFileSplit, TaskAttemptContext, Integer) - Constructor for class org.apache.spark.input.PortableDataStream
-
- PostgresDialect - Class in org.apache.spark.sql.jdbc
-
:: DeveloperApi ::
Default postgres dialect, mapping bit/cidr/inet on read and string/binary/boolean on write.
- PostgresDialect() - Constructor for class org.apache.spark.sql.jdbc.PostgresDialect
-
- pow(Column, Column) - Static method in class org.apache.spark.sql.functions
-
Returns the value of the first argument raised to the power of the second argument.
- pow(Column, String) - Static method in class org.apache.spark.sql.functions
-
Returns the value of the first argument raised to the power of the second argument.
- pow(String, Column) - Static method in class org.apache.spark.sql.functions
-
Returns the value of the first argument raised to the power of the second argument.
- pow(String, String) - Static method in class org.apache.spark.sql.functions
-
Returns the value of the first argument raised to the power of the second argument.
- pow(Column, double) - Static method in class org.apache.spark.sql.functions
-
Returns the value of the first argument raised to the power of the second argument.
- pow(String, double) - Static method in class org.apache.spark.sql.functions
-
Returns the value of the first argument raised to the power of the second argument.
- pow(double, Column) - Static method in class org.apache.spark.sql.functions
-
Returns the value of the first argument raised to the power of the second argument.
- pow(double, String) - Static method in class org.apache.spark.sql.functions
-
Returns the value of the first argument raised to the power of the second argument.
- PowerIterationClustering - Class in org.apache.spark.mllib.clustering
-
- PowerIterationClustering() - Constructor for class org.apache.spark.mllib.clustering.PowerIterationClustering
-
- PowerIterationClustering.Assignment - Class in org.apache.spark.mllib.clustering
-
:: Experimental ::
Cluster assignment.
- PowerIterationClustering.Assignment(long, int) - Constructor for class org.apache.spark.mllib.clustering.PowerIterationClustering.Assignment
-
- PowerIterationClustering.Assignment$ - Class in org.apache.spark.mllib.clustering
-
- PowerIterationClustering.Assignment$() - Constructor for class org.apache.spark.mllib.clustering.PowerIterationClustering.Assignment$
-
- PowerIterationClusteringModel - Class in org.apache.spark.mllib.clustering
-
:: Experimental ::
- PowerIterationClusteringModel(int, RDD<PowerIterationClustering.Assignment>) - Constructor for class org.apache.spark.mllib.clustering.PowerIterationClusteringModel
-
- pr() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
-
Returns the precision-recall curve, which is an RDD of (recall, precision) pairs,
NOT (precision, recall) pairs, with (0.0, 1.0) prepended to it.
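The point ordering described for pr() can be illustrated without Spark. The following is a minimal sketch in plain Java, not Spark's implementation: the class name PrCurveSketch and the method prCurve are hypothetical, and it assumes all scores are distinct so that every example acts as its own threshold.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper, not part of the Spark API.
final class PrCurveSketch {

    /**
     * Returns (recall, precision) points, one per scored example,
     * with the point (0.0, 1.0) prepended.
     * Assumes distinct scores; ties are not grouped into one threshold.
     */
    static List<double[]> prCurve(double[] scores, boolean[] labels) {
        Integer[] order = new Integer[scores.length];
        for (int i = 0; i < order.length; i++) {
            order[i] = i;
        }
        // Sort indices by descending score: each prefix is "predicted positive".
        Arrays.sort(order, Comparator.comparingDouble((Integer i) -> -scores[i]));

        int totalPositives = 0;
        for (boolean label : labels) {
            if (label) {
                totalPositives++;
            }
        }

        List<double[]> curve = new ArrayList<>();
        curve.add(new double[] {0.0, 1.0}); // prepended point: recall 0, precision 1

        int truePositives = 0;
        for (int rank = 0; rank < order.length; rank++) {
            if (labels[order[rank]]) {
                truePositives++;
            }
            double recall = totalPositives == 0 ? 0.0 : (double) truePositives / totalPositives;
            double precision = (double) truePositives / (rank + 1);
            curve.add(new double[] {recall, precision}); // recall first, then precision
        }
        return curve;
    }

    public static void main(String[] args) {
        double[] scores = {0.9, 0.8, 0.3};
        boolean[] labels = {true, false, true};
        for (double[] point : prCurve(scores, labels)) {
            System.out.println("recall=" + point[0] + " precision=" + point[1]);
        }
    }
}
```

Each point stores recall at index 0 and precision at index 1, mirroring the (recall, precision) order noted above.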
- precision(double) - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics
-
Returns precision for a given label (category)
- precision() - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics
-
Returns precision
- precision() - Method in class org.apache.spark.mllib.evaluation.MultilabelMetrics
-
Returns document-based precision averaged by the number of documents
- precision(double) - Method in class org.apache.spark.mllib.evaluation.MultilabelMetrics
-
Returns precision for a given label (category)
- precision() - Method in class org.apache.spark.sql.types.Decimal
-
- precision() - Method in class org.apache.spark.sql.types.DecimalType
-
- precision() - Method in class org.apache.spark.sql.types.PrecisionInfo
-
- precisionAt(int) - Method in class org.apache.spark.mllib.evaluation.RankingMetrics
-
Compute the average precision of all the queries, truncated at ranking position k.
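As a rough illustration of precision truncated at ranking position k, the sketch below computes it for one query and then averages over queries. It is plain Java rather than Spark code: PrecisionAtKSketch and its methods are hypothetical names, and dividing by k even when fewer than k predictions exist is an assumption about the intended convention.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of the Spark API.
final class PrecisionAtKSketch {

    /** Fraction of the top-k predicted items that are relevant, for one query. */
    static <T> double precisionAtK(List<T> predicted, Set<T> relevant, int k) {
        if (k <= 0) {
            throw new IllegalArgumentException("k must be positive");
        }
        int limit = Math.min(k, predicted.size());
        int hits = 0;
        for (int i = 0; i < limit; i++) {
            if (relevant.contains(predicted.get(i))) {
                hits++;
            }
        }
        // Assumption: the denominator is k, even if fewer than k items were predicted.
        return (double) hits / k;
    }

    /** Mean of precisionAtK over all queries; 0.0 when there are no queries. */
    static <T> double meanPrecisionAtK(List<List<T>> predictions, List<Set<T>> relevant, int k) {
        if (predictions.isEmpty()) {
            return 0.0;
        }
        double sum = 0.0;
        for (int q = 0; q < predictions.size(); q++) {
            sum += precisionAtK(predictions.get(q), relevant.get(q), k);
        }
        return sum / predictions.size();
    }

    public static void main(String[] args) {
        List<Integer> predicted = Arrays.asList(1, 2, 3, 4);
        Set<Integer> relevant = new HashSet<>(Arrays.asList(1, 3, 5));
        System.out.println(precisionAtK(predicted, relevant, 2));
    }
}
```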
- precisionByThreshold() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
-
Returns the (threshold, precision) curve.
- precisionInfo() - Method in class org.apache.spark.sql.types.DecimalType
-
- PrecisionInfo - Class in org.apache.spark.sql.types
-
Precision parameters for a Decimal
- PrecisionInfo(int, int) - Constructor for class org.apache.spark.sql.types.PrecisionInfo
-
- predict(RDD<Vector>) - Method in interface org.apache.spark.mllib.classification.ClassificationModel
-
Predict values for the given data set using the model trained.
- predict(Vector) - Method in interface org.apache.spark.mllib.classification.ClassificationModel
-
Predict values for a single data point using the model trained.
- predict(JavaRDD<Vector>) - Method in interface org.apache.spark.mllib.classification.ClassificationModel
-
Predict values for examples stored in a JavaRDD.
- predict(RDD<Vector>) - Method in class org.apache.spark.mllib.classification.NaiveBayesModel
-
- predict(Vector) - Method in class org.apache.spark.mllib.classification.NaiveBayesModel
-
- predict(RDD<Vector>) - Method in class org.apache.spark.mllib.clustering.GaussianMixtureModel
-
Maps given points to their cluster indices.
- predict(JavaRDD<Vector>) - Method in class org.apache.spark.mllib.clustering.GaussianMixtureModel
-
Java-friendly version of predict()
- predict(Vector) - Method in class org.apache.spark.mllib.clustering.KMeansModel
-
Returns the cluster index that a given point belongs to.
- predict(RDD<Vector>) - Method in class org.apache.spark.mllib.clustering.KMeansModel
-
Maps given points to their cluster indices.
- predict(JavaRDD<Vector>) - Method in class org.apache.spark.mllib.clustering.KMeansModel
-
Maps given points to their cluster indices.
- predict(int, int) - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel
-
Predict the rating of one user for one product.
- predict(RDD<Tuple2<Object, Object>>) - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel
-
Predict the rating of many users for many products.
- predict(JavaPairRDD<Integer, Integer>) - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel
-
Java-friendly version of MatrixFactorizationModel.predict.
- predict(RDD<Vector>) - Method in class org.apache.spark.mllib.regression.GeneralizedLinearModel
-
Predict values for the given data set using the model trained.
- predict(Vector) - Method in class org.apache.spark.mllib.regression.GeneralizedLinearModel
-
Predict values for a single data point using the model trained.
- predict(RDD<Object>) - Method in class org.apache.spark.mllib.regression.IsotonicRegressionModel
-
Predict labels for provided features.
- predict(JavaDoubleRDD) - Method in class org.apache.spark.mllib.regression.IsotonicRegressionModel
-
Predict labels for provided features.
- predict(double) - Method in class org.apache.spark.mllib.regression.IsotonicRegressionModel
-
Predict a single label.
- predict(RDD<Vector>) - Method in interface org.apache.spark.mllib.regression.RegressionModel
-
Predict values for the given data set using the model trained.
- predict(Vector) - Method in interface org.apache.spark.mllib.regression.RegressionModel
-
Predict values for a single data point using the model trained.
- predict(JavaRDD<Vector>) - Method in interface org.apache.spark.mllib.regression.RegressionModel
-
Predict values for examples stored in a JavaRDD.
- predict(Vector) - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel
-
Predict values for a single data point using the model trained.
- predict(RDD<Vector>) - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel
-
Predict values for the given data set using the model trained.
- predict(JavaRDD<Vector>) - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel
-
Predict values for the given data set using the model trained.
- predict() - Method in class org.apache.spark.mllib.tree.model.Node
-
- predict(Vector) - Method in class org.apache.spark.mllib.tree.model.Node
-
predict value if node is not leaf
- Predict - Class in org.apache.spark.mllib.tree.model
-
Predicted value for a node
param: predict predicted value
param: prob probability of the label (classification only)
- Predict(double, double) - Constructor for class org.apache.spark.mllib.tree.model.Predict
-
- predict() - Method in class org.apache.spark.mllib.tree.model.Predict
-
- prediction() - Method in class org.apache.spark.ml.tree.InternalNode
-
- prediction() - Method in class org.apache.spark.ml.tree.LeafNode
-
- prediction() - Method in class org.apache.spark.ml.tree.Node
-
Prediction a leaf node makes, or which an internal node would make if it were a leaf node
- PredictionModel<FeaturesType,M extends PredictionModel<FeaturesType,M>> - Class in org.apache.spark.ml
-
:: DeveloperApi ::
Abstraction for a model for prediction tasks (regression and classification).
- PredictionModel() - Constructor for class org.apache.spark.ml.PredictionModel
-
- predictions() - Method in class org.apache.spark.mllib.regression.IsotonicRegressionModel
-
- predictOn(DStream<Vector>) - Method in class org.apache.spark.mllib.clustering.StreamingKMeans
-
Use the clustering model to make predictions on batches of data from a DStream.
- predictOn(JavaDStream<Vector>) - Method in class org.apache.spark.mllib.clustering.StreamingKMeans
-
Java-friendly version of `predictOn`.
- predictOn(DStream<Vector>) - Method in class org.apache.spark.mllib.regression.StreamingLinearAlgorithm
-
Use the model to make predictions on batches of data from a DStream.
- predictOn(JavaDStream<Vector>) - Method in class org.apache.spark.mllib.regression.StreamingLinearAlgorithm
-
Java-friendly version of `predictOn`.
- predictOnValues(DStream<Tuple2<K, Vector>>, ClassTag<K>) - Method in class org.apache.spark.mllib.clustering.StreamingKMeans
-
Use the model to make predictions on the values of a DStream and carry over its keys.
- predictOnValues(JavaPairDStream<K, Vector>) - Method in class org.apache.spark.mllib.clustering.StreamingKMeans
-
Java-friendly version of `predictOnValues`.
- predictOnValues(DStream<Tuple2<K, Vector>>, ClassTag<K>) - Method in class org.apache.spark.mllib.regression.StreamingLinearAlgorithm
-
Use the model to make predictions on the values of a DStream and carry over its keys.
- predictOnValues(JavaPairDStream<K, Vector>) - Method in class org.apache.spark.mllib.regression.StreamingLinearAlgorithm
-
Java-friendly version of `predictOnValues`.
- Predictor<FeaturesType,Learner extends Predictor<FeaturesType,Learner,M>,M extends PredictionModel<FeaturesType,M>> - Class in org.apache.spark.ml
-
:: DeveloperApi ::
Abstraction for prediction problems (regression and classification).
- Predictor() - Constructor for class org.apache.spark.ml.Predictor
-
- predictSoft(RDD<Vector>) - Method in class org.apache.spark.mllib.clustering.GaussianMixtureModel
-
Given the input vectors, return the membership value of each vector
to all mixture components.
- preferredLocation() - Method in class org.apache.spark.streaming.receiver.Receiver
-
Override this to specify a preferred location (hostname).
- preferredLocations(Partition) - Method in class org.apache.spark.rdd.RDD
-
Get the preferred locations of a partition, taking into account whether the
RDD is checkpointed.
- preferredNodeLocationData() - Method in class org.apache.spark.SparkContext
-
- prefLoc() - Method in class org.apache.spark.rdd.PartitionGroup
-
- pregel(A, int, EdgeDirection, Function3<Object, VD, A, VD>, Function1<EdgeTriplet<VD, ED>, Iterator<Tuple2<Object, A>>>, Function2<A, A, A>, ClassTag<A>) - Method in class org.apache.spark.graphx.GraphOps
-
Execute a Pregel-like iterative vertex-parallel abstraction.
- Pregel - Class in org.apache.spark.graphx
-
Implements a Pregel-like bulk-synchronous message-passing API.
- Pregel() - Constructor for class org.apache.spark.graphx.Pregel
-
- prepareJobForWrite(Job) - Method in class org.apache.spark.sql.sources.HadoopFsRelation
-
- prettyJson() - Method in class org.apache.spark.sql.types.DataType
-
The pretty (i.e. indented) JSON representation of this data type.
- prettyPrint() - Method in class org.apache.spark.streaming.Duration
-
- prev() - Method in class org.apache.spark.rdd.ShuffledRDD
-
- prevHandler() - Method in class org.apache.spark.util.SignalLoggerHandler
-
- print() - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Print the first ten elements of each RDD generated in this DStream.
- print(int) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Print the first num elements of each RDD generated in this DStream.
- print() - Method in class org.apache.spark.streaming.dstream.DStream
-
Print the first ten elements of each RDD generated in this DStream.
- print(int) - Method in class org.apache.spark.streaming.dstream.DStream
-
Print the first num elements of each RDD generated in this DStream.
- printSchema() - Method in class org.apache.spark.sql.DataFrame
-
Prints the schema to the console in a nice tree format.
- printStats() - Method in class org.apache.spark.streaming.scheduler.StatsReportListener
-
- printTreeString() - Method in class org.apache.spark.sql.types.StructType
-
- Private - Annotation Type in org.apache.spark.annotation
-
A class that is considered private to the internals of Spark; there is a high likelihood
it will be changed in future versions of Spark.
- prob() - Method in class org.apache.spark.mllib.tree.model.Predict
-
- probabilities() - Static method in class org.apache.spark.scheduler.StatsReportListener
-
- PROCESS_LOCAL() - Static method in class org.apache.spark.scheduler.TaskLocality
-
- processingDelay() - Method in class org.apache.spark.streaming.scheduler.BatchInfo
-
Time taken for all the jobs of this batch to finish processing, measured from the time
they started processing.
- processingEndTime() - Method in class org.apache.spark.streaming.scheduler.BatchInfo
-
- processingStartTime() - Method in class org.apache.spark.streaming.scheduler.BatchInfo
-
- product() - Method in class org.apache.spark.mllib.recommendation.Rating
-
- productFeatures() - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel
-
- progressListener() - Method in class org.apache.spark.streaming.StreamingContext
-
- properties() - Method in class org.apache.spark.scheduler.SparkListenerJobStart
-
- properties() - Method in class org.apache.spark.scheduler.SparkListenerStageSubmitted
-
- PrunedFilteredScan - Interface in org.apache.spark.sql.sources
-
:: DeveloperApi ::
A BaseRelation that can eliminate unneeded columns and filter using selected
predicates before producing an RDD containing all matching tuples as Row objects.
- PrunedScan - Interface in org.apache.spark.sql.sources
-
:: DeveloperApi ::
A BaseRelation that can eliminate unneeded columns before producing an RDD
containing all of its tuples as Row objects.
- Pseudorandom - Interface in org.apache.spark.util.random
-
:: DeveloperApi ::
A class with pseudorandom behavior.
- put(ParamPair<?>...) - Method in class org.apache.spark.ml.param.ParamMap
-
Puts a list of param pairs (overwrites if the input params exist).
- put(Param<T>, T) - Method in class org.apache.spark.ml.param.ParamMap
-
Puts a (param, value) pair (overwrites if the input param exists).
- put(Seq<ParamPair<?>>) - Method in class org.apache.spark.ml.param.ParamMap
-
Puts a list of param pairs (overwrites if the input params exist).
- putBoolean(String, boolean) - Method in class org.apache.spark.sql.types.MetadataBuilder
-
Puts a Boolean.
- putBooleanArray(String, boolean[]) - Method in class org.apache.spark.sql.types.MetadataBuilder
-
Puts a Boolean array.
- putDouble(String, double) - Method in class org.apache.spark.sql.types.MetadataBuilder
-
Puts a Double.
- putDoubleArray(String, double[]) - Method in class org.apache.spark.sql.types.MetadataBuilder
-
Puts a Double array.
- putLong(String, long) - Method in class org.apache.spark.sql.types.MetadataBuilder
-
Puts a Long.
- putLongArray(String, long[]) - Method in class org.apache.spark.sql.types.MetadataBuilder
-
Puts a Long array.
- putMetadata(String, Metadata) - Method in class org.apache.spark.sql.types.MetadataBuilder
-
- putMetadataArray(String, Metadata[]) - Method in class org.apache.spark.sql.types.MetadataBuilder
-
- putString(String, String) - Method in class org.apache.spark.sql.types.MetadataBuilder
-
Puts a String.
- putStringArray(String, String[]) - Method in class org.apache.spark.sql.types.MetadataBuilder
-
Puts a String array.
- pValue() - Method in class org.apache.spark.mllib.stat.test.ChiSqTestResult
-
- pValue() - Method in interface org.apache.spark.mllib.stat.test.TestResult
-
The probability of obtaining a test statistic result at least as extreme as the one that was
actually observed, assuming that the null hypothesis is true.
- pyUDT() - Method in class org.apache.spark.sql.types.UserDefinedType
-
Paired Python UDT class, if it exists.
- r2() - Method in class org.apache.spark.mllib.evaluation.RegressionMetrics
-
Returns R^2, the coefficient of determination.
- RACK_LOCAL() - Static method in class org.apache.spark.scheduler.TaskLocality
-
- rand(int, int, Random) - Static method in class org.apache.spark.mllib.linalg.DenseMatrix
-
Generate a DenseMatrix consisting of i.i.d. uniform random numbers.
- rand(int, int, Random) - Static method in class org.apache.spark.mllib.linalg.Matrices
-
Generate a DenseMatrix consisting of i.i.d. uniform random numbers.
- rand(long) - Static method in class org.apache.spark.sql.functions
-
Generate a random column with i.i.d. samples from U[0.0, 1.0].
- rand() - Static method in class org.apache.spark.sql.functions
-
Generate a random column with i.i.d. samples from U[0.0, 1.0].
- randn(int, int, Random) - Static method in class org.apache.spark.mllib.linalg.DenseMatrix
-
Generate a DenseMatrix consisting of i.i.d. Gaussian random numbers.
- randn(int, int, Random) - Static method in class org.apache.spark.mllib.linalg.Matrices
-
Generate a DenseMatrix consisting of i.i.d. Gaussian random numbers.
- randn(long) - Static method in class org.apache.spark.sql.functions
-
Generate a column with i.i.d. samples from the standard normal distribution.
- randn() - Static method in class org.apache.spark.sql.functions
-
Generate a column with i.i.d. samples from the standard normal distribution.
- RANDOM() - Static method in class org.apache.spark.mllib.clustering.KMeans
-
- random(int, Random) - Static method in class org.apache.spark.util.Vector
-
Creates a Vector of the given length containing random numbers between 0.0 and 1.0.
- RandomDataGenerator<T> - Interface in org.apache.spark.mllib.random
-
:: DeveloperApi ::
Trait for random data generators that generate i.i.d. data.
- RandomForest - Class in org.apache.spark.mllib.tree
-
:: Experimental ::
A class that implements a Random Forest learning algorithm for classification and regression.
- RandomForest(Strategy, int, String, int) - Constructor for class org.apache.spark.mllib.tree.RandomForest
-
- RandomForestClassificationModel - Class in org.apache.spark.ml.classification
-
:: Experimental ::
Random Forest model for classification.
- RandomForestClassifier - Class in org.apache.spark.ml.classification
-
:: Experimental ::
Random Forest learning algorithm for classification.
- RandomForestClassifier(String) - Constructor for class org.apache.spark.ml.classification.RandomForestClassifier
-
- RandomForestClassifier() - Constructor for class org.apache.spark.ml.classification.RandomForestClassifier
-
- RandomForestModel - Class in org.apache.spark.mllib.tree.model
-
:: Experimental ::
Represents a random forest model.
- RandomForestModel(Enumeration.Value, DecisionTreeModel[]) - Constructor for class org.apache.spark.mllib.tree.model.RandomForestModel
-
- RandomForestRegressionModel - Class in org.apache.spark.ml.regression
-
:: Experimental ::
Random Forest model for regression.
- RandomForestRegressor - Class in org.apache.spark.ml.regression
-
:: Experimental ::
Random Forest learning algorithm for regression.
- RandomForestRegressor(String) - Constructor for class org.apache.spark.ml.regression.RandomForestRegressor
-
- RandomForestRegressor() - Constructor for class org.apache.spark.ml.regression.RandomForestRegressor
-
- randomRDD(SparkContext, RandomDataGenerator<T>, long, int, long, ClassTag<T>) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
:: DeveloperApi ::
Generates an RDD comprised of i.i.d. samples produced by the input RandomDataGenerator.
- RandomRDDs - Class in org.apache.spark.mllib.random
-
:: Experimental ::
Generator methods for creating RDDs comprised of i.i.d. samples from some distribution.
- RandomRDDs() - Constructor for class org.apache.spark.mllib.random.RandomRDDs
-
- RandomSampler<T,U> - Interface in org.apache.spark.util.random
-
:: DeveloperApi ::
A pseudorandom sampler.
- randomSplit(double[]) - Method in class org.apache.spark.api.java.JavaRDD
-
Randomly splits this RDD with the provided weights.
- randomSplit(double[], long) - Method in class org.apache.spark.api.java.JavaRDD
-
Randomly splits this RDD with the provided weights.
- randomSplit(double[], long) - Method in class org.apache.spark.rdd.RDD
-
Randomly splits this RDD with the provided weights.
- randomSplit(double[], long) - Method in class org.apache.spark.sql.DataFrame
-
Randomly splits this DataFrame with the provided weights.
- randomSplit(double[]) - Method in class org.apache.spark.sql.DataFrame
-
Randomly splits this DataFrame with the provided weights.
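One way to picture a weighted random split is as a partition of the interval [0, 1): the weights are normalized, turned into cumulative boundaries, and each element is assigned to the split whose sub-interval contains a uniform draw. The sketch below shows that idea in plain Java. It is an illustration under these assumptions, not Spark's implementation, and RandomSplitSketch, cumulativeBounds and splitFor are hypothetical names.

```java
// Hypothetical helper, not part of the Spark API.
final class RandomSplitSketch {

    /** Normalizes the weights to sum to 1 and returns cumulative boundaries, starting at 0.0. */
    static double[] cumulativeBounds(double[] weights) {
        double sum = 0.0;
        for (double w : weights) {
            if (w < 0.0) {
                throw new IllegalArgumentException("weights must be non-negative");
            }
            sum += w;
        }
        if (sum <= 0.0) {
            throw new IllegalArgumentException("weights must have a positive sum");
        }
        double[] bounds = new double[weights.length + 1];
        double running = 0.0;
        for (int i = 0; i < weights.length; i++) {
            running += weights[i] / sum;
            bounds[i + 1] = running;
        }
        bounds[weights.length] = 1.0; // guard against floating-point rounding
        return bounds;
    }

    /** Returns the index of the split whose interval [lower, upper) contains x. */
    static int splitFor(double x, double[] bounds) {
        for (int i = 0; i + 1 < bounds.length; i++) {
            if (x >= bounds[i] && x < bounds[i + 1]) {
                return i;
            }
        }
        throw new IllegalArgumentException("x must be in [0, 1)");
    }

    public static void main(String[] args) {
        double[] bounds = cumulativeBounds(new double[] {3.0, 1.0});
        java.util.Random random = new java.util.Random(42L);
        int[] counts = new int[2];
        for (int i = 0; i < 1000; i++) {
            counts[splitFor(random.nextDouble(), bounds)]++;
        }
        System.out.println(counts[0] + " vs " + counts[1]);
    }
}
```

In this sketch, passing a fixed seed to the random generator makes the split reproducible, which is the role the long argument plays in the seeded overloads listed above.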
- randomVectorRDD(SparkContext, RandomDataGenerator<Object>, long, int, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
:: DeveloperApi ::
Generates an RDD[Vector] with vectors containing i.i.d. samples produced by the
input RandomDataGenerator.
- range(long, long, long, int) - Method in class org.apache.spark.SparkContext
-
Creates a new RDD[Long] containing elements from start to end (exclusive), increasing by
step with each element.
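The element sequence described for range (start inclusive, end exclusive, advancing by step) can be sketched in plain Java as follows. This only illustrates which values are produced; it does not create an RDD or partition anything, RangeSketch is a hypothetical name, and support for negative steps is an assumption of the sketch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the Spark API.
final class RangeSketch {

    /** Values from start (inclusive) to end (exclusive), advancing by step. */
    static List<Long> range(long start, long end, long step) {
        if (step == 0) {
            throw new IllegalArgumentException("step must be non-zero");
        }
        List<Long> out = new ArrayList<>();
        if (step > 0) {
            for (long v = start; v < end; v += step) {
                out.add(v);
            }
        } else {
            // Assumption: a negative step counts down towards end.
            for (long v = start; v > end; v += step) {
                out.add(v);
            }
        }
        // Note: this sketch does not guard against long overflow near Long.MAX_VALUE.
        return out;
    }

    public static void main(String[] args) {
        System.out.println(range(0L, 10L, 3L));
    }
}
```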
- range(long, long) - Method in class org.apache.spark.sql.SQLContext
-
- range(long) - Method in class org.apache.spark.sql.SQLContext
-
- range(long, long, long, int) - Method in class org.apache.spark.sql.SQLContext
-
- rangeBetween(long, long) - Method in class org.apache.spark.sql.expressions.WindowSpec
-
Defines the frame boundaries, from start (inclusive) to end (inclusive).
- RangeDependency<T> - Class in org.apache.spark
-
:: DeveloperApi ::
Represents a one-to-one dependency between ranges of partitions in the parent and child RDDs.
- RangeDependency(RDD<T>, int, int, int) - Constructor for class org.apache.spark.RangeDependency
-
- RangePartitioner<K,V> - Class in org.apache.spark
-
A Partitioner that partitions sortable records by range into roughly equal ranges.
- RangePartitioner(int, RDD<? extends Product2<K, V>>, boolean, Ordering<K>, ClassTag<K>) - Constructor for class org.apache.spark.RangePartitioner
-
- rank() - Method in class org.apache.spark.graphx.lib.SVDPlusPlus.Conf
-
- rank() - Method in class org.apache.spark.ml.recommendation.ALSModel
-
- rank() - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel
-
- rank() - Static method in class org.apache.spark.sql.functions
-
Window function: returns the rank of rows within a window partition.
- RankingMetrics<T> - Class in org.apache.spark.mllib.evaluation
-
:: Experimental ::
Evaluator for ranking algorithms.
- RankingMetrics(RDD<Tuple2<Object, Object>>, ClassTag<T>) - Constructor for class org.apache.spark.mllib.evaluation.RankingMetrics
-
- rating() - Method in class org.apache.spark.ml.recommendation.ALS.Rating
-
- Rating - Class in org.apache.spark.mllib.recommendation
-
A more compact class to represent a rating than Tuple3[Int, Int, Double].
- Rating(int, int, double) - Constructor for class org.apache.spark.mllib.recommendation.Rating
-
- rating() - Method in class org.apache.spark.mllib.recommendation.Rating
-
- rawSocketStream(String, int, StorageLevel) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Create an input stream from network source hostname:port, where data is received
as serialized blocks (serialized using Spark's serializer) that can be directly
pushed into the block manager without deserializing them.
- rawSocketStream(String, int) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Create an input stream from network source hostname:port, where data is received
as serialized blocks (serialized using Spark's serializer) that can be directly
pushed into the block manager without deserializing them.
- rawSocketStream(String, int, StorageLevel, ClassTag<T>) - Method in class org.apache.spark.streaming.StreamingContext
-
Create an input stream from network source hostname:port, where data is received
as serialized blocks (serialized using Spark's serializer) that can be directly
pushed into the block manager without deserializing them.
- rdd() - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
- rdd() - Method in class org.apache.spark.api.java.JavaPairRDD
-
- rdd() - Method in class org.apache.spark.api.java.JavaRDD
-
- rdd() - Method in interface org.apache.spark.api.java.JavaRDDLike
-
- rdd() - Method in class org.apache.spark.Dependency
-
- rdd() - Method in class org.apache.spark.NarrowDependency
-
- RDD<T> - Class in org.apache.spark.rdd
-
A Resilient Distributed Dataset (RDD), the basic abstraction in Spark.
- RDD(SparkContext, Seq<Dependency<?>>, ClassTag<T>) - Constructor for class org.apache.spark.rdd.RDD
-
- RDD(RDD<?>, ClassTag<T>) - Constructor for class org.apache.spark.rdd.RDD
-
Construct an RDD with just a one-to-one dependency on one parent.
- rdd() - Method in class org.apache.spark.ShuffleDependency
-
- rdd() - Method in class org.apache.spark.sql.DataFrame
-
- RDD() - Static method in class org.apache.spark.storage.BlockId
-
- RDD_SCOPE_KEY() - Static method in class org.apache.spark.SparkContext
-
- RDD_SCOPE_NO_OVERRIDE_KEY() - Static method in class org.apache.spark.SparkContext
-
- RDDBlockId - Class in org.apache.spark.storage
-
- RDDBlockId(int, int) - Constructor for class org.apache.spark.storage.RDDBlockId
-
- rddBlocks() - Method in class org.apache.spark.status.api.v1.ExecutorSummary
-
- rddBlocks() - Method in class org.apache.spark.storage.StorageStatus
-
Return the RDD blocks stored in this block manager.
- rddBlocksById(int) - Method in class org.apache.spark.storage.StorageStatus
-
Return the blocks that belong to the given RDD stored in this block manager.
- RDDDataDistribution - Class in org.apache.spark.status.api.v1
-
- RDDFunctions<T> - Class in org.apache.spark.mllib.rdd
-
Machine learning specific RDD functions.
- RDDFunctions(RDD<T>, ClassTag<T>) - Constructor for class org.apache.spark.mllib.rdd.RDDFunctions
-
- rddId() - Method in class org.apache.spark.CleanCheckpoint
-
- rddId() - Method in class org.apache.spark.CleanRDD
-
- rddId() - Method in class org.apache.spark.scheduler.SparkListenerUnpersistRDD
-
- rddId() - Method in class org.apache.spark.storage.RDDBlockId
-
- RDDInfo - Class in org.apache.spark.storage
-
- RDDInfo(int, String, int, StorageLevel, Seq<Object>, Option<org.apache.spark.rdd.RDDOperationScope>) - Constructor for class org.apache.spark.storage.RDDInfo
-
- rddInfoList() - Method in class org.apache.spark.ui.storage.StorageListener
-
Filter RDD info to include only those with cached partitions.
- rddInfos() - Method in class org.apache.spark.scheduler.StageInfo
-
- RDDPartitionInfo - Class in org.apache.spark.status.api.v1
-
- rdds() - Method in class org.apache.spark.rdd.CoGroupedRDD
-
- rdds() - Method in class org.apache.spark.rdd.UnionRDD
-
- RDDStorageInfo - Class in org.apache.spark.status.api.v1
-
- rddStorageLevel(int) - Method in class org.apache.spark.storage.StorageStatus
-
Return the storage level, if any, used by the given RDD in this block manager.
- rddToAsyncRDDActions(RDD<T>, ClassTag<T>) - Static method in class org.apache.spark.rdd.RDD
-
- rddToAsyncRDDActions(RDD<T>, ClassTag<T>) - Static method in class org.apache.spark.SparkContext
-
- rddToDataFrameHolder(RDD<A>, TypeTags.TypeTag<A>) - Method in class org.apache.spark.sql.SQLContext.implicits$
-
- rddToOrderedRDDFunctions(RDD<Tuple2<K, V>>, Ordering<K>, ClassTag<K>, ClassTag<V>) - Static method in class org.apache.spark.rdd.RDD
-
- rddToOrderedRDDFunctions(RDD<Tuple2<K, V>>, Ordering<K>, ClassTag<K>, ClassTag<V>) - Static method in class org.apache.spark.SparkContext
-
- rddToPairRDDFunctions(RDD<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>, Ordering<K>) - Static method in class org.apache.spark.rdd.RDD
-
- rddToPairRDDFunctions(RDD<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>, Ordering<K>) - Static method in class org.apache.spark.SparkContext
-
- rddToSequenceFileRDDFunctions(RDD<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>, <any>, <any>) - Static method in class org.apache.spark.rdd.RDD
-
- rddToSequenceFileRDDFunctions(RDD<Tuple2<K, V>>, Function1<K, Writable>, ClassTag<K>, Function1<V, Writable>, ClassTag<V>) - Static method in class org.apache.spark.SparkContext
-
- read(Kryo, Input, Class<Iterable<?>>) - Method in class org.apache.spark.serializer.JavaIterableWrapperSerializer
-
- read() - Method in class org.apache.spark.sql.SQLContext
-
- read(WriteAheadLogRecordHandle) - Method in class org.apache.spark.streaming.util.WriteAheadLog
-
Read a written record based on the given record handle.
- readAll() - Method in class org.apache.spark.streaming.util.WriteAheadLog
-
Read and return an iterator of all the records that have been written but not yet cleaned up.
- readBytes() - Method in class org.apache.spark.status.api.v1.ShuffleReadMetricDistributions
-
- readExternal(ObjectInput) - Method in class org.apache.spark.serializer.JavaSerializer
-
- readExternal(ObjectInput) - Method in class org.apache.spark.storage.BlockManagerId
-
- readExternal(ObjectInput) - Method in class org.apache.spark.storage.StorageLevel
-
- readExternal(ObjectInput) - Method in class org.apache.spark.streaming.flume.SparkFlumeEvent
-
- readKey(ClassTag<T>) - Method in class org.apache.spark.serializer.DeserializationStream
-
Reads the object representing the key of a key-value pair.
- readObject(ClassTag<T>) - Method in class org.apache.spark.serializer.DeserializationStream
-
The most general-purpose method to read an object.
- readRecords() - Method in class org.apache.spark.status.api.v1.ShuffleReadMetricDistributions
-
- readValue(ClassTag<T>) - Method in class org.apache.spark.serializer.DeserializationStream
-
Reads the object representing the value of a key-value pair.
- ready(Duration, CanAwait) - Method in class org.apache.spark.ComplexFutureAction
-
- ready(Duration, CanAwait) - Method in interface org.apache.spark.FutureAction
-
Blocks until this action completes.
- ready(Duration, CanAwait) - Method in class org.apache.spark.SimpleFutureAction
-
- reason() - Method in class org.apache.spark.scheduler.SparkListenerExecutorRemoved
-
- reason() - Method in class org.apache.spark.scheduler.SparkListenerTaskEnd
-
- recall(double) - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics
-
Returns recall for a given label (category)
- recall() - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics
-
Returns recall (equal to precision for a multiclass classifier,
because the sum of all false positives equals the sum of all false negatives).
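The equality between overall recall and overall precision can be checked on a small confusion matrix. The following is a minimal sketch in plain Java with no Spark dependency; the class name, method names, and matrix layout are assumptions made for illustration and are not taken from MulticlassMetrics.

```java
// Hypothetical illustration; this is not the MulticlassMetrics implementation.
class MicroAveragedRecallSketch {

    // Layout assumed here: confusion[i][j] counts examples whose true label is i
    // and whose predicted label is j.
    static long truePositives(int[][] confusion) {
        long sum = 0;
        for (int i = 0; i < confusion.length; i++) {
            sum += confusion[i][i];
        }
        return sum;
    }

    // Per label, a false positive is "predicted as this label, but actually another".
    // Summed over all labels, this walks each column and skips the diagonal.
    static long falsePositives(int[][] confusion) {
        long sum = 0;
        for (int label = 0; label < confusion.length; label++) {
            for (int actual = 0; actual < confusion.length; actual++) {
                if (actual != label) sum += confusion[actual][label];
            }
        }
        return sum;
    }

    // Per label, a false negative is "actually this label, but predicted as another".
    // Summed over all labels, this walks each row and skips the diagonal.
    static long falseNegatives(int[][] confusion) {
        long sum = 0;
        for (int label = 0; label < confusion.length; label++) {
            for (int predicted = 0; predicted < confusion.length; predicted++) {
                if (predicted != label) sum += confusion[label][predicted];
            }
        }
        return sum;
    }

    static double precision(int[][] confusion) {
        long tp = truePositives(confusion);
        return (double) tp / (tp + falsePositives(confusion));
    }

    static double recall(int[][] confusion) {
        long tp = truePositives(confusion);
        return (double) tp / (tp + falseNegatives(confusion));
    }

    public static void main(String[] args) {
        int[][] confusion = {
            {5, 1, 0},
            {2, 3, 1},
            {0, 2, 6}
        };
        System.out.println("precision = " + precision(confusion));
        System.out.println("recall    = " + recall(confusion));
    }
}
```

Both sums visit every off-diagonal cell exactly once, which is why the two totals always agree and the two ratios coincide.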
- recall() - Method in class org.apache.spark.mllib.evaluation.MultilabelMetrics
-
Returns document-based recall averaged by the number of documents
- recall(double) - Method in class org.apache.spark.mllib.evaluation.MultilabelMetrics
-
Returns recall for a given label (category)
- recallByThreshold() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
-
Returns the (threshold, recall) curve.
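As a rough illustration of what a (threshold, recall) curve contains, the sketch below builds one in plain Java. It assumes that an example counts as predicted positive when its score is at least the threshold and that every distinct score is used as a threshold; the class and method names are hypothetical and do not reproduce BinaryClassificationMetrics.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.TreeSet;

// Hypothetical illustration; not the BinaryClassificationMetrics implementation.
class RecallCurveSketch {

    // Returns {threshold, recall} pairs, one per distinct score, highest threshold first.
    static List<double[]> recallByThreshold(double[] scores, boolean[] labels) {
        int totalPositives = 0;
        for (boolean label : labels) {
            if (label) totalPositives++;
        }

        // Distinct scores in descending order serve as the candidate thresholds.
        TreeSet<Double> thresholds = new TreeSet<>(Collections.reverseOrder());
        for (double score : scores) {
            thresholds.add(score);
        }

        List<double[]> curve = new ArrayList<>();
        for (double threshold : thresholds) {
            int truePositives = 0;
            for (int i = 0; i < scores.length; i++) {
                // Assumption: score >= threshold means "predicted positive".
                if (labels[i] && scores[i] >= threshold) truePositives++;
            }
            double recall = totalPositives == 0 ? 0.0 : (double) truePositives / totalPositives;
            curve.add(new double[] {threshold, recall});
        }
        return curve;
    }

    public static void main(String[] args) {
        double[] scores = {0.9, 0.8, 0.4, 0.2};
        boolean[] labels = {true, false, true, false};
        for (double[] point : recallByThreshold(scores, labels)) {
            System.out.println(point[0] + " -> " + point[1]);
        }
    }
}
```

Lowering the threshold can only add predicted positives, so recall never decreases along the curve.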
- Receiver<T> - Class in org.apache.spark.streaming.receiver
-
:: DeveloperApi ::
Abstract class of a receiver that can be run on worker nodes to receive external data.
- Receiver(StorageLevel) - Constructor for class org.apache.spark.streaming.receiver.Receiver
-
- ReceiverInfo - Class in org.apache.spark.streaming.scheduler
-
:: DeveloperApi ::
Class having information about a receiver
- ReceiverInfo(int, String, org.apache.spark.rpc.RpcEndpointRef, boolean, String, String, String, long) - Constructor for class org.apache.spark.streaming.scheduler.ReceiverInfo
-
- receiverInfo() - Method in class org.apache.spark.streaming.scheduler.StreamingListenerReceiverError
-
- receiverInfo() - Method in class org.apache.spark.streaming.scheduler.StreamingListenerReceiverStarted
-
- receiverInfo() - Method in class org.apache.spark.streaming.scheduler.StreamingListenerReceiverStopped
-
- receiverInputDStream() - Method in class org.apache.spark.streaming.api.java.JavaPairReceiverInputDStream
-
- receiverInputDStream() - Method in class org.apache.spark.streaming.api.java.JavaReceiverInputDStream
-
- ReceiverInputDStream<T> - Class in org.apache.spark.streaming.dstream
-
Abstract class for defining any InputDStream that has to start a receiver on worker nodes to receive external data.
- ReceiverInputDStream(StreamingContext, ClassTag<T>) - Constructor for class org.apache.spark.streaming.dstream.ReceiverInputDStream
-
- receiverStream(Receiver<T>) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Create an input stream with any arbitrary user implemented receiver.
- receiverStream(Receiver<T>, ClassTag<T>) - Method in class org.apache.spark.streaming.StreamingContext
-
Create an input stream with any arbitrary user implemented receiver.
- recommendProducts(int, int) - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel
-
Recommends products to a user.
- recommendProductsForUsers(int) - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel
-
Recommends topK products for all users.
- recommendUsers(int, int) - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel
-
Recommends users to a product.
- recommendUsersForProducts(int) - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel
-
Recommends topK users for all products.
- RECORDS_BETWEEN_BYTES_READ_METRIC_UPDATES() - Static method in class org.apache.spark.rdd.HadoopRDD
-
Update the input bytes read metric each time this number of records has been read
- RECORDS_BETWEEN_BYTES_WRITTEN_METRIC_UPDATES() - Static method in class org.apache.spark.rdd.PairRDDFunctions
-
- recordsRead() - Method in class org.apache.spark.status.api.v1.InputMetricDistributions
-
- recordsRead() - Method in class org.apache.spark.status.api.v1.InputMetrics
-
- recordsRead() - Method in class org.apache.spark.status.api.v1.ShuffleReadMetrics
-
- recordsWritten() - Method in class org.apache.spark.status.api.v1.OutputMetricDistributions
-
- recordsWritten() - Method in class org.apache.spark.status.api.v1.OutputMetrics
-
- recordsWritten() - Method in class org.apache.spark.status.api.v1.ShuffleWriteMetrics
-
- reduce(Function2<T, T, T>) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Reduces the elements of this RDD using the specified commutative and associative binary
operator.
- reduce(Function2<T, T, T>) - Method in class org.apache.spark.rdd.RDD
-
Reduces the elements of this RDD using the specified commutative and
associative binary operator.
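The requirement that the operator be commutative and associative matters because elements are combined per partition first and the partial results are merged afterwards. The sketch below imitates that grouping in plain Java without Spark; nested lists stand in for partitions, and the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.BinaryOperator;

// Hypothetical illustration: nested lists stand in for the partitions of an RDD.
class PartitionedReduceSketch {

    // Reduces each partition on its own, then combines the partial results,
    // the way a distributed reduce is free to group elements.
    static int reducePartitions(List<List<Integer>> partitions, BinaryOperator<Integer> op) {
        Integer total = null;
        for (List<Integer> partition : partitions) {
            Integer partial = null;
            for (Integer value : partition) {
                partial = (partial == null) ? value : op.apply(partial, value);
            }
            if (partial != null) {
                total = (total == null) ? partial : op.apply(total, partial);
            }
        }
        if (total == null) {
            throw new UnsupportedOperationException("nothing to reduce");
        }
        return total;
    }

    public static void main(String[] args) {
        // The same four numbers, split into partitions in two different ways.
        List<List<Integer>> splitA = Arrays.asList(Arrays.asList(1, 2), Arrays.asList(3, 4));
        List<List<Integer>> splitB = Arrays.asList(Arrays.asList(1), Arrays.asList(2, 3, 4));

        // Addition is commutative and associative: the partitioning does not matter.
        System.out.println(reducePartitions(splitA, Integer::sum));
        System.out.println(reducePartitions(splitB, Integer::sum));

        // Subtraction is neither: the result changes with the partitioning.
        System.out.println(reducePartitions(splitA, (a, b) -> a - b));
        System.out.println(reducePartitions(splitB, (a, b) -> a - b));
    }
}
```

With these inputs, addition yields 10 for both partitionings, while subtraction yields 0 for the first and 6 for the second.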
- reduce(Function2<T, T, T>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Return a new DStream in which each RDD has a single element generated by reducing each RDD
of this DStream.
- reduce(Function2<T, T, T>) - Method in class org.apache.spark.streaming.dstream.DStream
-
Return a new DStream in which each RDD has a single element generated by reducing each RDD
of this DStream.
- reduceByKey(Partitioner, Function2<V, V, V>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Merge the values for each key using an associative reduce function.
- reduceByKey(Function2<V, V, V>, int) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Merge the values for each key using an associative reduce function.
- reduceByKey(Function2<V, V, V>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Merge the values for each key using an associative reduce function.
- reduceByKey(Partitioner, Function2<V, V, V>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Merge the values for each key using an associative reduce function.
- reduceByKey(Function2<V, V, V>, int) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Merge the values for each key using an associative reduce function.
- reduceByKey(Function2<V, V, V>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Merge the values for each key using an associative reduce function.
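What "merge the values for each key" means can be sketched on an in-memory list of pairs. This is plain Java with no Spark dependency; the class, the `pair` helper, and the use of `Map.merge` are choices made for this illustration and say nothing about how PairRDDFunctions is implemented.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BinaryOperator;

// Hypothetical illustration; an in-memory list stands in for a pair RDD.
class ReduceByKeySketch {

    // Small helper so call sites get a List<Map.Entry<K, V>> without extra casts.
    static <K, V> Map.Entry<K, V> pair(K key, V value) {
        return new SimpleEntry<>(key, value);
    }

    // Folds all values that share a key into one value using the given function.
    static <K, V> Map<K, V> reduceByKey(List<Map.Entry<K, V>> pairs, BinaryOperator<V> func) {
        Map<K, V> merged = new LinkedHashMap<>();
        for (Map.Entry<K, V> entry : pairs) {
            // First occurrence of a key stores the value; later ones are combined with func.
            merged.merge(entry.getKey(), entry.getValue(), func);
        }
        return merged;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> pairs =
            Arrays.asList(pair("a", 1), pair("b", 2), pair("a", 3));
        System.out.println(reduceByKey(pairs, Integer::sum));
    }
}
```

The result holds one entry per distinct key, which is also the shape that the `reduceByKeyLocally` variants return as a Map.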
- reduceByKey(Function2<V, V, V>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying reduceByKey to each RDD.
- reduceByKey(Function2<V, V, V>, int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying reduceByKey to each RDD.
- reduceByKey(Function2<V, V, V>, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying reduceByKey to each RDD.
- reduceByKey(Function2<V, V, V>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying reduceByKey to each RDD.
- reduceByKey(Function2<V, V, V>, int) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying reduceByKey to each RDD.
- reduceByKey(Function2<V, V, V>, Partitioner) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying reduceByKey to each RDD.
- reduceByKeyAndWindow(Function2<V, V, V>, Duration) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Create a new DStream by applying reduceByKey over a sliding window on this DStream.
- reduceByKeyAndWindow(Function2<V, V, V>, Duration, Duration) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying reduceByKey over a sliding window.
- reduceByKeyAndWindow(Function2<V, V, V>, Duration, Duration, int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying reduceByKey over a sliding window.
- reduceByKeyAndWindow(Function2<V, V, V>, Duration, Duration, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying reduceByKey over a sliding window.
- reduceByKeyAndWindow(Function2<V, V, V>, Function2<V, V, V>, Duration, Duration) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by reducing over a sliding window using incremental computation.
- reduceByKeyAndWindow(Function2<V, V, V>, Function2<V, V, V>, Duration, Duration, int, Function<Tuple2<K, V>, Boolean>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying incremental reduceByKey over a sliding window.
- reduceByKeyAndWindow(Function2<V, V, V>, Function2<V, V, V>, Duration, Duration, Partitioner, Function<Tuple2<K, V>, Boolean>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying incremental reduceByKey over a sliding window.
- reduceByKeyAndWindow(Function2<V, V, V>, Duration) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying reduceByKey over a sliding window on this DStream.
- reduceByKeyAndWindow(Function2<V, V, V>, Duration, Duration) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying reduceByKey over a sliding window.
- reduceByKeyAndWindow(Function2<V, V, V>, Duration, Duration, int) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying reduceByKey over a sliding window.
- reduceByKeyAndWindow(Function2<V, V, V>, Duration, Duration, Partitioner) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying reduceByKey over a sliding window.
- reduceByKeyAndWindow(Function2<V, V, V>, Function2<V, V, V>, Duration, Duration, int, Function1<Tuple2<K, V>, Object>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying incremental reduceByKey over a sliding window.
- reduceByKeyAndWindow(Function2<V, V, V>, Function2<V, V, V>, Duration, Duration, Partitioner, Function1<Tuple2<K, V>, Object>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying incremental reduceByKey over a sliding window.
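The incremental variants take two functions: a reduce function and an inverse reduce function. One way to picture their roles is the sketch below, written in plain Java for a single key, with a window measured in batches and a slide of one batch. All names are hypothetical and the bookkeeping is an assumption for illustration, not the PairDStreamFunctions implementation.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.BinaryOperator;

// Hypothetical illustration for one key; window length is counted in batches.
class IncrementalWindowSketch {
    private final int windowLength;
    private final BinaryOperator<Integer> reduceFunc;
    private final BinaryOperator<Integer> invReduceFunc;
    private final Deque<Integer> batches = new ArrayDeque<>();
    private Integer windowValue = null;

    IncrementalWindowSketch(int windowLength,
                            BinaryOperator<Integer> reduceFunc,
                            BinaryOperator<Integer> invReduceFunc) {
        this.windowLength = windowLength;
        this.reduceFunc = reduceFunc;
        this.invReduceFunc = invReduceFunc;
    }

    // Adds one batch value and returns the reduced value of the current window.
    int addBatch(int batchValue) {
        // Step 1: fold the value that entered the window into the previous result.
        batches.addLast(batchValue);
        windowValue = (windowValue == null) ? batchValue : reduceFunc.apply(windowValue, batchValue);

        // Step 2: "inverse reduce" the value that left the window, if any.
        if (batches.size() > windowLength) {
            windowValue = invReduceFunc.apply(windowValue, batches.removeFirst());
        }
        return windowValue;
    }

    public static void main(String[] args) {
        // Sum over the last three batches; subtraction undoes addition.
        IncrementalWindowSketch window =
            new IncrementalWindowSketch(3, Integer::sum, (total, old) -> total - old);
        for (int batch = 1; batch <= 5; batch++) {
            System.out.println(window.addBatch(batch));
        }
    }
}
```

Each step touches only the entering and leaving batch rather than re-reducing the whole window, which only works when the reduce function has a matching inverse, as addition does with subtraction.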
- reduceByKeyLocally(Function2<V, V, V>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Merge the values for each key using an associative reduce function, but return the results
immediately to the master as a Map.
- reduceByKeyLocally(Function2<V, V, V>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Merge the values for each key using an associative reduce function, but return the results
immediately to the master as a Map.
- reduceByKeyToDriver(Function2<V, V, V>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Alias for reduceByKeyLocally
- reduceByWindow(Function2<T, T, T>, Duration, Duration) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Deprecated.
This API is not Java compatible.
- reduceByWindow(Function2<T, T, T>, Duration, Duration) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Return a new DStream in which each RDD has a single element generated by reducing all
elements in a sliding window over this DStream.
- reduceByWindow(Function2<T, T, T>, Function2<T, T, T>, Duration, Duration) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Return a new DStream in which each RDD has a single element generated by reducing all
elements in a sliding window over this DStream.
- reduceByWindow(Function2<T, T, T>, Duration, Duration) - Method in class org.apache.spark.streaming.dstream.DStream
-
Return a new DStream in which each RDD has a single element generated by reducing all
elements in a sliding window over this DStream.
- reduceByWindow(Function2<T, T, T>, Function2<T, T, T>, Duration, Duration) - Method in class org.apache.spark.streaming.dstream.DStream
-
Return a new DStream in which each RDD has a single element generated by reducing all
elements in a sliding window over this DStream.
- reduceId() - Method in class org.apache.spark.FetchFailed
-
- reduceId() - Method in class org.apache.spark.storage.ShuffleBlockId
-
- reduceId() - Method in class org.apache.spark.storage.ShuffleDataBlockId
-
- reduceId() - Method in class org.apache.spark.storage.ShuffleIndexBlockId
-
- refreshTable(String) - Method in class org.apache.spark.sql.hive.HiveContext
-
Invalidate and refresh all the cached metadata of the given table.
- RegexTokenizer - Class in org.apache.spark.ml.feature
-
:: Experimental ::
A regex-based tokenizer that extracts tokens either by using the provided regex pattern to split
the text (default) or by repeatedly matching the regex (if gaps is false).
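The difference between splitting on a pattern and repeatedly matching it can be sketched with `java.util.regex` alone. The class name and the `splitOnPattern` flag below are hypothetical names for this illustration and are not RegexTokenizer parameters; dropping empty pieces is likewise a choice made for the sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration of the two extraction modes; not the RegexTokenizer code.
class RegexTokenizeSketch {

    // splitOnPattern = true : the regex describes the separators between tokens.
    // splitOnPattern = false: the regex describes the tokens themselves.
    static List<String> tokenize(String text, String regex, boolean splitOnPattern) {
        Pattern pattern = Pattern.compile(regex);
        List<String> tokens = new ArrayList<>();
        if (splitOnPattern) {
            for (String piece : pattern.split(text)) {
                // A leading separator produces an empty first piece; skip such pieces.
                if (!piece.isEmpty()) tokens.add(piece);
            }
        } else {
            Matcher matcher = pattern.matcher(text);
            while (matcher.find()) {
                tokens.add(matcher.group());
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        String text = "a, b,c";
        System.out.println(tokenize(text, "[,\\s]+", true));  // separators: commas and whitespace
        System.out.println(tokenize(text, "\\w+", false));    // tokens: runs of word characters
    }
}
```

Both calls in `main` extract the same three tokens; which mode is more convenient depends on whether the separators or the tokens are easier to describe with a pattern.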
- RegexTokenizer(String) - Constructor for class org.apache.spark.ml.feature.RegexTokenizer
-
- RegexTokenizer() - Constructor for class org.apache.spark.ml.feature.RegexTokenizer
-
- register(String, Function0<RT>, TypeTags.TypeTag<RT>) - Method in class org.apache.spark.sql.UDFRegistration
-
Register a Scala closure of 0 arguments as user-defined function (UDF).
- register(String, Function1<A1, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>) - Method in class org.apache.spark.sql.UDFRegistration
-
Register a Scala closure of 1 argument as user-defined function (UDF).
- register(String, Function2<A1, A2, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>) - Method in class org.apache.spark.sql.UDFRegistration
-
Register a Scala closure of 2 arguments as user-defined function (UDF).
- register(String, Function3<A1, A2, A3, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>) - Method in class org.apache.spark.sql.UDFRegistration
-
Register a Scala closure of 3 arguments as user-defined function (UDF).
- register(String, Function4<A1, A2, A3, A4, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>) - Method in class org.apache.spark.sql.UDFRegistration
-
Register a Scala closure of 4 arguments as user-defined function (UDF).
- register(String, Function5<A1, A2, A3, A4, A5, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>) - Method in class org.apache.spark.sql.UDFRegistration
-
Register a Scala closure of 5 arguments as user-defined function (UDF).
- register(String, Function6<A1, A2, A3, A4, A5, A6, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>) - Method in class org.apache.spark.sql.UDFRegistration
-
Register a Scala closure of 6 arguments as user-defined function (UDF).
- register(String, Function7<A1, A2, A3, A4, A5, A6, A7, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>) - Method in class org.apache.spark.sql.UDFRegistration
-
Register a Scala closure of 7 arguments as user-defined function (UDF).
- register(String, Function8<A1, A2, A3, A4, A5, A6, A7, A8, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>) - Method in class org.apache.spark.sql.UDFRegistration
-
Register a Scala closure of 8 arguments as user-defined function (UDF).
- register(String, Function9<A1, A2, A3, A4, A5, A6, A7, A8, A9, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>, TypeTags.TypeTag<A9>) - Method in class org.apache.spark.sql.UDFRegistration
-
Register a Scala closure of 9 arguments as user-defined function (UDF).
- register(String, Function10<A1, A2, A3, A4, A5, A6, A7, A8, A9, A10, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>, TypeTags.TypeTag<A9>, TypeTags.TypeTag<A10>) - Method in class org.apache.spark.sql.UDFRegistration
-
Register a Scala closure of 10 arguments as user-defined function (UDF).
- register(String, Function11<A1, A2, A3, A4, A5, A6, A7, A8, A9, A10, A11, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>, TypeTags.TypeTag<A9>, TypeTags.TypeTag<A10>, TypeTags.TypeTag<A11>) - Method in class org.apache.spark.sql.UDFRegistration
-
Register a Scala closure of 11 arguments as user-defined function (UDF).
- register(String, Function12<A1, A2, A3, A4, A5, A6, A7, A8, A9, A10, A11, A12, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>, TypeTags.TypeTag<A9>, TypeTags.TypeTag<A10>, TypeTags.TypeTag<A11>, TypeTags.TypeTag<A12>) - Method in class org.apache.spark.sql.UDFRegistration
-
Register a Scala closure of 12 arguments as user-defined function (UDF).
- register(String, Function13<A1, A2, A3, A4, A5, A6, A7, A8, A9, A10, A11, A12, A13, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>, TypeTags.TypeTag<A9>, TypeTags.TypeTag<A10>, TypeTags.TypeTag<A11>, TypeTags.TypeTag<A12>, TypeTags.TypeTag<A13>) - Method in class org.apache.spark.sql.UDFRegistration
-
Register a Scala closure of 13 arguments as user-defined function (UDF).
- register(String, Function14<A1, A2, A3, A4, A5, A6, A7, A8, A9, A10, A11, A12, A13, A14, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>, TypeTags.TypeTag<A9>, TypeTags.TypeTag<A10>, TypeTags.TypeTag<A11>, TypeTags.TypeTag<A12>, TypeTags.TypeTag<A13>, TypeTags.TypeTag<A14>) - Method in class org.apache.spark.sql.UDFRegistration
-
Register a Scala closure of 14 arguments as user-defined function (UDF).
- register(String, Function15<A1, A2, A3, A4, A5, A6, A7, A8, A9, A10, A11, A12, A13, A14, A15, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>, TypeTags.TypeTag<A9>, TypeTags.TypeTag<A10>, TypeTags.TypeTag<A11>, TypeTags.TypeTag<A12>, TypeTags.TypeTag<A13>, TypeTags.TypeTag<A14>, TypeTags.TypeTag<A15>) - Method in class org.apache.spark.sql.UDFRegistration
-
Register a Scala closure of 15 arguments as user-defined function (UDF).
- register(String, Function16<A1, A2, A3, A4, A5, A6, A7, A8, A9, A10, A11, A12, A13, A14, A15, A16, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>, TypeTags.TypeTag<A9>, TypeTags.TypeTag<A10>, TypeTags.TypeTag<A11>, TypeTags.TypeTag<A12>, TypeTags.TypeTag<A13>, TypeTags.TypeTag<A14>, TypeTags.TypeTag<A15>, TypeTags.TypeTag<A16>) - Method in class org.apache.spark.sql.UDFRegistration
-
Register a Scala closure of 16 arguments as user-defined function (UDF).
- register(String, Function17<A1, A2, A3, A4, A5, A6, A7, A8, A9, A10, A11, A12, A13, A14, A15, A16, A17, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>, TypeTags.TypeTag<A9>, TypeTags.TypeTag<A10>, TypeTags.TypeTag<A11>, TypeTags.TypeTag<A12>, TypeTags.TypeTag<A13>, TypeTags.TypeTag<A14>, TypeTags.TypeTag<A15>, TypeTags.TypeTag<A16>, TypeTags.TypeTag<A17>) - Method in class org.apache.spark.sql.UDFRegistration
-
Register a Scala closure of 17 arguments as user-defined function (UDF).
- register(String, Function18<A1, A2, A3, A4, A5, A6, A7, A8, A9, A10, A11, A12, A13, A14, A15, A16, A17, A18, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>, TypeTags.TypeTag<A9>, TypeTags.TypeTag<A10>, TypeTags.TypeTag<A11>, TypeTags.TypeTag<A12>, TypeTags.TypeTag<A13>, TypeTags.TypeTag<A14>, TypeTags.TypeTag<A15>, TypeTags.TypeTag<A16>, TypeTags.TypeTag<A17>, TypeTags.TypeTag<A18>) - Method in class org.apache.spark.sql.UDFRegistration
-
Register a Scala closure of 18 arguments as user-defined function (UDF).
- register(String, Function19<A1, A2, A3, A4, A5, A6, A7, A8, A9, A10, A11, A12, A13, A14, A15, A16, A17, A18, A19, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>, TypeTags.TypeTag<A9>, TypeTags.TypeTag<A10>, TypeTags.TypeTag<A11>, TypeTags.TypeTag<A12>, TypeTags.TypeTag<A13>, TypeTags.TypeTag<A14>, TypeTags.TypeTag<A15>, TypeTags.TypeTag<A16>, TypeTags.TypeTag<A17>, TypeTags.TypeTag<A18>, TypeTags.TypeTag<A19>) - Method in class org.apache.spark.sql.UDFRegistration
-
Register a Scala closure of 19 arguments as user-defined function (UDF).
- register(String, Function20<A1, A2, A3, A4, A5, A6, A7, A8, A9, A10, A11, A12, A13, A14, A15, A16, A17, A18, A19, A20, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>, TypeTags.TypeTag<A9>, TypeTags.TypeTag<A10>, TypeTags.TypeTag<A11>, TypeTags.TypeTag<A12>, TypeTags.TypeTag<A13>, TypeTags.TypeTag<A14>, TypeTags.TypeTag<A15>, TypeTags.TypeTag<A16>, TypeTags.TypeTag<A17>, TypeTags.TypeTag<A18>, TypeTags.TypeTag<A19>, TypeTags.TypeTag<A20>) - Method in class org.apache.spark.sql.UDFRegistration
-
Register a Scala closure of 20 arguments as user-defined function (UDF).
- register(String, Function21<A1, A2, A3, A4, A5, A6, A7, A8, A9, A10, A11, A12, A13, A14, A15, A16, A17, A18, A19, A20, A21, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>, TypeTags.TypeTag<A9>, TypeTags.TypeTag<A10>, TypeTags.TypeTag<A11>, TypeTags.TypeTag<A12>, TypeTags.TypeTag<A13>, TypeTags.TypeTag<A14>, TypeTags.TypeTag<A15>, TypeTags.TypeTag<A16>, TypeTags.TypeTag<A17>, TypeTags.TypeTag<A18>, TypeTags.TypeTag<A19>, TypeTags.TypeTag<A20>, TypeTags.TypeTag<A21>) - Method in class org.apache.spark.sql.UDFRegistration
-
Register a Scala closure of 21 arguments as user-defined function (UDF).
- register(String, Function22<A1, A2, A3, A4, A5, A6, A7, A8, A9, A10, A11, A12, A13, A14, A15, A16, A17, A18, A19, A20, A21, A22, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>, TypeTags.TypeTag<A9>, TypeTags.TypeTag<A10>, TypeTags.TypeTag<A11>, TypeTags.TypeTag<A12>, TypeTags.TypeTag<A13>, TypeTags.TypeTag<A14>, TypeTags.TypeTag<A15>, TypeTags.TypeTag<A16>, TypeTags.TypeTag<A17>, TypeTags.TypeTag<A18>, TypeTags.TypeTag<A19>, TypeTags.TypeTag<A20>, TypeTags.TypeTag<A21>, TypeTags.TypeTag<A22>) - Method in class org.apache.spark.sql.UDFRegistration
-
Register a Scala closure of 22 arguments as user-defined function (UDF).
- register(String, UDF1<?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration
-
Register a user-defined function with 1 argument.
- register(String, UDF2<?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration
-
Register a user-defined function with 2 arguments.
- register(String, UDF3<?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration
-
Register a user-defined function with 3 arguments.
- register(String, UDF4<?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration
-
Register a user-defined function with 4 arguments.
- register(String, UDF5<?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration
-
Register a user-defined function with 5 arguments.
- register(String, UDF6<?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration
-
Register a user-defined function with 6 arguments.
- register(String, UDF7<?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration
-
Register a user-defined function with 7 arguments.
- register(String, UDF8<?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration
-
Register a user-defined function with 8 arguments.
- register(String, UDF9<?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration
-
Register a user-defined function with 9 arguments.
- register(String, UDF10<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration
-
Register a user-defined function with 10 arguments.
- register(String, UDF11<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration
-
Register a user-defined function with 11 arguments.
- register(String, UDF12<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration
-
Register a user-defined function with 12 arguments.
- register(String, UDF13<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration
-
Register a user-defined function with 13 arguments.
- register(String, UDF14<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration
-
Register a user-defined function with 14 arguments.
- register(String, UDF15<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration
-
Register a user-defined function with 15 arguments.
- register(String, UDF16<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration
-
Register a user-defined function with 16 arguments.
- register(String, UDF17<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration
-
Register a user-defined function with 17 arguments.
- register(String, UDF18<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration
-
Register a user-defined function with 18 arguments.
- register(String, UDF19<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration
-
Register a user-defined function with 19 arguments.
- register(String, UDF20<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration
-
Register a user-defined function with 20 arguments.
- register(String, UDF21<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration
-
Register a user-defined function with 21 arguments.
- register(String, UDF22<?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?>, DataType) - Method in class org.apache.spark.sql.UDFRegistration
-
Register a user-defined function with 22 arguments.
- registerClasses(Kryo) - Method in class org.apache.spark.graphx.GraphKryoRegistrator
-
- registerClasses(Kryo) - Method in interface org.apache.spark.serializer.KryoRegistrator
-
- registerDialect(JdbcDialect) - Static method in class org.apache.spark.sql.jdbc.JdbcDialects
-
Register a dialect for use on all new matching JDBC DataFrames.
- registerKryoClasses(SparkConf) - Static method in class org.apache.spark.graphx.GraphXUtils
-
Registers classes that GraphX uses with Kryo.
- registerKryoClasses(Class<?>[]) - Method in class org.apache.spark.SparkConf
-
Use Kryo serialization and register the given set of classes with Kryo.
- registerTempTable(String) - Method in class org.apache.spark.sql.DataFrame
-
Registers this DataFrame as a temporary table using the given name.
- Regression() - Static method in class org.apache.spark.mllib.tree.configuration.Algo
-
- RegressionEvaluator - Class in org.apache.spark.ml.evaluation
-
:: Experimental ::
Evaluator for regression, which expects two input columns: prediction and label.
- RegressionEvaluator(String) - Constructor for class org.apache.spark.ml.evaluation.RegressionEvaluator
-
- RegressionEvaluator() - Constructor for class org.apache.spark.ml.evaluation.RegressionEvaluator
-
- RegressionMetrics - Class in org.apache.spark.mllib.evaluation
-
:: Experimental ::
Evaluator for regression.
- RegressionMetrics(RDD<Tuple2<Object, Object>>) - Constructor for class org.apache.spark.mllib.evaluation.RegressionMetrics
-
- RegressionModel<FeaturesType,M extends RegressionModel<FeaturesType,M>> - Class in org.apache.spark.ml.regression
-
:: DeveloperApi ::
- RegressionModel() - Constructor for class org.apache.spark.ml.regression.RegressionModel
-
- RegressionModel - Interface in org.apache.spark.mllib.regression
-
- reindex() - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
-
- reindex() - Method in class org.apache.spark.graphx.VertexRDD
-
Construct a new VertexRDD that is indexed by only the visible vertices.
- RelationProvider - Interface in org.apache.spark.sql.sources
-
::DeveloperApi::
Implemented by objects that produce relations for a specific kind of data source.
- relativeDirection(long) - Method in class org.apache.spark.graphx.Edge
-
Return the relative direction of the edge to the corresponding
vertex.
- remainder(Decimal) - Method in class org.apache.spark.sql.types.Decimal
-
- remember(Duration) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Sets each DStream in this context to remember the RDDs it generated in the last given duration.
- remember(Duration) - Method in class org.apache.spark.streaming.StreamingContext
-
Sets each DStream in this context to remember the RDDs it generated in the last given duration.
- rememberDuration() - Method in class org.apache.spark.streaming.dstream.DStream
-
- remoteBlocksFetched() - Method in class org.apache.spark.status.api.v1.ShuffleReadMetricDistributions
-
- remoteBlocksFetched() - Method in class org.apache.spark.status.api.v1.ShuffleReadMetrics
-
- remoteBytesRead() - Method in class org.apache.spark.status.api.v1.ShuffleReadMetricDistributions
-
- remoteBytesRead() - Method in class org.apache.spark.status.api.v1.ShuffleReadMetrics
-
- remove(Param<T>) - Method in class org.apache.spark.ml.param.ParamMap
-
Removes a key from this map and returns its previously associated value as an option.
- remove(String) - Method in class org.apache.spark.SparkConf
-
Remove a parameter from the configuration.
- repartition(int) - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
Return a new RDD that has exactly numPartitions partitions.
- repartition(int) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Return a new RDD that has exactly numPartitions partitions.
- repartition(int) - Method in class org.apache.spark.api.java.JavaRDD
-
Return a new RDD that has exactly numPartitions partitions.
- repartition(int, Ordering<T>) - Method in class org.apache.spark.rdd.RDD
-
Return a new RDD that has exactly numPartitions partitions.
- repartition(int) - Method in class org.apache.spark.sql.DataFrame
-
Returns a new DataFrame that has exactly numPartitions partitions.
- repartition(int) - Method in class org.apache.spark.streaming.api.java.JavaDStream
-
Return a new DStream with an increased or decreased level of parallelism.
- repartition(int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream with an increased or decreased level of parallelism.
- repartition(int) - Method in class org.apache.spark.streaming.dstream.DStream
-
Return a new DStream with an increased or decreased level of parallelism.
- repartitionAndSortWithinPartitions(Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Repartition the RDD according to the given partitioner and, within each resulting partition,
sort records by their keys.
- repartitionAndSortWithinPartitions(Partitioner, Comparator<K>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Repartition the RDD according to the given partitioner and, within each resulting partition,
sort records by their keys.
- repartitionAndSortWithinPartitions(Partitioner) - Method in class org.apache.spark.rdd.OrderedRDDFunctions
-
Repartition the RDD according to the given partitioner and, within each resulting partition,
sort records by their keys.
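As a rough illustration of the behavior described for repartitionAndSortWithinPartitions, the plain-Java sketch below assigns integer keys to partitions and then sorts each partition by key. It does not use Spark: the class name PartitionSortSketch and the modulo-based partition assignment are assumptions standing in for a real Partitioner.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical, Spark-free illustration; not the library's implementation.
final class PartitionSortSketch {

    // Place each key in partition floorMod(key, numPartitions), then sort every partition.
    static List<List<Integer>> partitionAndSort(List<Integer> keys, int numPartitions) {
        List<List<Integer>> partitions = new ArrayList<>();
        for (int i = 0; i < numPartitions; i++) {
            partitions.add(new ArrayList<>());
        }
        for (int key : keys) {
            // Assumed partitioning rule: non-negative remainder of the key.
            partitions.get(Math.floorMod(key, numPartitions)).add(key);
        }
        for (List<Integer> partition : partitions) {
            Collections.sort(partition);
        }
        return partitions;
    }

    public static void main(String[] args) {
        // Even keys land in partition 0, odd keys in partition 1, each sorted.
        System.out.println(partitionAndSort(Arrays.asList(5, 2, 4, 1, 3), 2));
    }
}
```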
- replace(String, Map<T, T>) - Method in class org.apache.spark.sql.DataFrameNaFunctions
-
Replaces values matching keys in replacement map with the corresponding values.
- replace(String[], Map<T, T>) - Method in class org.apache.spark.sql.DataFrameNaFunctions
-
Replaces values matching keys in replacement map with the corresponding values.
- replace(String, Map<T, T>) - Method in class org.apache.spark.sql.DataFrameNaFunctions
-
(Scala-specific) Replaces values matching keys in replacement map.
- replace(Seq<String>, Map<T, T>) - Method in class org.apache.spark.sql.DataFrameNaFunctions
-
(Scala-specific) Replaces values matching keys in replacement map.
- replicatedVertexView() - Method in class org.apache.spark.graphx.impl.GraphImpl
-
- replication() - Method in class org.apache.spark.storage.StorageLevel
-
- reportError(String, Throwable) - Method in class org.apache.spark.streaming.receiver.Receiver
-
Report exceptions in receiving data.
- requestExecutors(int) - Method in class org.apache.spark.SparkContext
-
:: DeveloperApi ::
Request an additional number of executors from the cluster manager.
- resetIterator() - Method in class org.apache.spark.rdd.PartitionCoalescer.LocationIterator
-
- restart(String) - Method in class org.apache.spark.streaming.receiver.Receiver
-
Restart the receiver.
- restart(String, Throwable) - Method in class org.apache.spark.streaming.receiver.Receiver
-
Restart the receiver.
- restart(String, Throwable, int) - Method in class org.apache.spark.streaming.receiver.Receiver
-
Restart the receiver.
- Resubmitted - Class in org.apache.spark
-
:: DeveloperApi ::
A ShuffleMapTask that completed successfully earlier, but we lost the executor before the stage completed.
- Resubmitted() - Constructor for class org.apache.spark.Resubmitted
-
- result(Duration, CanAwait) - Method in class org.apache.spark.ComplexFutureAction
-
- result(Duration, CanAwait) - Method in interface org.apache.spark.FutureAction
-
Awaits and returns the result (of type T) of this action.
- result(Duration, CanAwait) - Method in class org.apache.spark.SimpleFutureAction
-
- resultSerializationTime() - Method in class org.apache.spark.status.api.v1.TaskMetricDistributions
-
- resultSerializationTime() - Method in class org.apache.spark.status.api.v1.TaskMetrics
-
- resultSetToObjectArray(ResultSet) - Static method in class org.apache.spark.rdd.JdbcRDD
-
- resultSize() - Method in class org.apache.spark.status.api.v1.TaskMetricDistributions
-
- resultSize() - Method in class org.apache.spark.status.api.v1.TaskMetrics
-
- retainedJobs() - Method in class org.apache.spark.ui.jobs.JobProgressListener
-
- retainedStages() - Method in class org.apache.spark.ui.jobs.JobProgressListener
-
- retryWaitMs(SparkConf) - Static method in class org.apache.spark.util.RpcUtils
-
Returns the configured number of milliseconds to wait on each retry.
- ReturnStatementFinder - Class in org.apache.spark.util
-
- ReturnStatementFinder() - Constructor for class org.apache.spark.util.ReturnStatementFinder
-
- reverse() - Method in class org.apache.spark.graphx.EdgeDirection
-
Reverse the direction of an edge.
- reverse() - Method in class org.apache.spark.graphx.EdgeRDD
-
Reverse all the edges in this RDD.
- reverse() - Method in class org.apache.spark.graphx.Graph
-
Reverses all edges in the graph.
- reverse() - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
-
- reverse() - Method in class org.apache.spark.graphx.impl.GraphImpl
-
- reverseRoutingTables() - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
-
- reverseRoutingTables() - Method in class org.apache.spark.graphx.VertexRDD
-
Returns a new VertexRDD reflecting a reversal of all edge directions in the corresponding EdgeRDD.
- ReviveOffers - Class in org.apache.spark.scheduler.local
-
- ReviveOffers() - Constructor for class org.apache.spark.scheduler.local.ReviveOffers
-
- RidgeRegressionModel - Class in org.apache.spark.mllib.regression
-
Regression model trained using RidgeRegression.
- RidgeRegressionModel(Vector, double) - Constructor for class org.apache.spark.mllib.regression.RidgeRegressionModel
-
- RidgeRegressionWithSGD - Class in org.apache.spark.mllib.regression
-
Train a regression model with L2-regularization using Stochastic Gradient Descent.
- RidgeRegressionWithSGD() - Constructor for class org.apache.spark.mllib.regression.RidgeRegressionWithSGD
-
Construct a RidgeRegression object with default parameters: {stepSize: 1.0, numIterations: 100,
regParam: 0.01, miniBatchFraction: 1.0}.
- right() - Method in class org.apache.spark.sql.sources.And
-
- right() - Method in class org.apache.spark.sql.sources.Or
-
- rightCategories() - Method in class org.apache.spark.ml.tree.CategoricalSplit
-
Get sorted categories which split to the right.
- rightChild() - Method in class org.apache.spark.ml.tree.InternalNode
-
- rightChildIndex(int) - Static method in class org.apache.spark.mllib.tree.model.Node
-
Return the index of the right child of this node.
- rightImpurity() - Method in class org.apache.spark.mllib.tree.model.InformationGainStats
-
- rightNode() - Method in class org.apache.spark.mllib.tree.model.Node
-
- rightOuterJoin(JavaPairRDD<K, W>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Perform a right outer join of this and other.
- rightOuterJoin(JavaPairRDD<K, W>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Perform a right outer join of this and other.
- rightOuterJoin(JavaPairRDD<K, W>, int) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Perform a right outer join of this and other.
- rightOuterJoin(RDD<Tuple2<K, W>>, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Perform a right outer join of this and other.
- rightOuterJoin(RDD<Tuple2<K, W>>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Perform a right outer join of this and other.
- rightOuterJoin(RDD<Tuple2<K, W>>, int) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Perform a right outer join of this and other.
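The right outer join entries above can be pictured with a small plain-Java sketch: every pair from the right-hand side appears in the output, and a right key with no match on the left is paired with a placeholder. This is not Spark code. The class RightOuterJoinSketch, its Pair type, and the string rendering of results are all hypothetical, chosen only to keep the example self-contained.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical, Spark-free illustration of right outer join semantics on key/value pairs.
final class RightOuterJoinSketch {

    // A minimal immutable key/value pair.
    static final class Pair<K, V> {
        final K key;
        final V value;

        Pair(K key, V value) {
            this.key = key;
            this.value = value;
        }
    }

    // For each (k, w) on the right, emit "k:(v,w)" for every left value v with the same key,
    // or "k:(none,w)" when the left side has no pair with key k.
    static <K, V, W> List<String> rightOuterJoin(List<Pair<K, V>> left, List<Pair<K, W>> right) {
        List<String> out = new ArrayList<>();
        for (Pair<K, W> r : right) {
            boolean matched = false;
            for (Pair<K, V> l : left) {
                if (l.key.equals(r.key)) {
                    out.add(r.key + ":(" + l.value + "," + r.value + ")");
                    matched = true;
                }
            }
            if (!matched) {
                out.add(r.key + ":(none," + r.value + ")");
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Pair<String, Integer>> left = Arrays.asList(
                new Pair<String, Integer>("a", 1),
                new Pair<String, Integer>("b", 2));
        List<Pair<String, Integer>> right = Arrays.asList(
                new Pair<String, Integer>("a", 10),
                new Pair<String, Integer>("c", 30));
        // Key "b" exists only on the left, so it is dropped; key "c" is kept with no left value.
        System.out.println(rightOuterJoin(left, right));
    }
}
```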
- rightOuterJoin(JavaPairDStream<K, W>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying 'right outer join' between RDDs of this DStream and other DStream.
- rightOuterJoin(JavaPairDStream<K, W>, int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying 'right outer join' between RDDs of this DStream and other DStream.
- rightOuterJoin(JavaPairDStream<K, W>, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by applying 'right outer join' between RDDs of this DStream and other DStream.
- rightOuterJoin(DStream<Tuple2<K, W>>, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying 'right outer join' between RDDs of this DStream and other DStream.
- rightOuterJoin(DStream<Tuple2<K, W>>, int, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying 'right outer join' between RDDs of this DStream and other DStream.
- rightOuterJoin(DStream<Tuple2<K, W>>, Partitioner, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new DStream by applying 'right outer join' between RDDs of this DStream and other DStream.
- rightPredict() - Method in class org.apache.spark.mllib.tree.model.InformationGainStats
-
- rint(Column) - Static method in class org.apache.spark.sql.functions
-
Returns the double value that is closest in value to the argument and
is equal to a mathematical integer.
- rint(String) - Static method in class org.apache.spark.sql.functions
-
Returns the double value that is closest in value to the argument and
is equal to a mathematical integer.
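The wording of the rint entries matches java.lang.Math.rint, so the sketch below shows that rounding rule in plain Java. That the column function follows the same rule is an assumption here, not something the index states. The class name RintSketch is hypothetical.

```java
// Sketch of the rounding rule, assuming the column function behaves like java.lang.Math.rint.
final class RintSketch {

    // Returns the closest mathematical integer as a double; exact ties go to the even neighbor.
    static double roundToNearestInteger(double x) {
        return Math.rint(x);
    }

    public static void main(String[] args) {
        double[] inputs = {2.4, 2.5, 3.5, -2.5};
        for (double x : inputs) {
            System.out.println(x + " -> " + roundToNearestInteger(x));
        }
    }
}
```

Note the tie-breaking: 2.5 rounds to 2.0 while 3.5 rounds to 4.0, which differs from Math.round.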
- rlike(String) - Method in class org.apache.spark.sql.Column
-
SQL RLIKE expression (LIKE with Regex).
- RMATa() - Static method in class org.apache.spark.graphx.util.GraphGenerators
-
- RMATb() - Static method in class org.apache.spark.graphx.util.GraphGenerators
-
- RMATc() - Static method in class org.apache.spark.graphx.util.GraphGenerators
-
- RMATd() - Static method in class org.apache.spark.graphx.util.GraphGenerators
-
- rmatGraph(SparkContext, int, int) - Static method in class org.apache.spark.graphx.util.GraphGenerators
-
A random graph generator using the R-MAT model, proposed in
"R-MAT: A Recursive Model for Graph Mining" by Chakrabarti et al.
- rnd() - Method in class org.apache.spark.rdd.PartitionCoalescer
-
- roc() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
-
Returns the receiver operating characteristic (ROC) curve,
which is an RDD of (false positive rate, true positive rate)
with (0.0, 0.0) prepended and (1.0, 1.0) appended to it.
- rollup(Column...) - Method in class org.apache.spark.sql.DataFrame
-
Create a multi-dimensional rollup for the current DataFrame using the specified columns, so we can run aggregation on them.
- rollup(String, String...) - Method in class org.apache.spark.sql.DataFrame
-
Create a multi-dimensional rollup for the current DataFrame using the specified columns, so we can run aggregation on them.
- rollup(Seq<Column>) - Method in class org.apache.spark.sql.DataFrame
-
Create a multi-dimensional rollup for the current DataFrame using the specified columns, so we can run aggregation on them.
- rollup(String, Seq<String>) - Method in class org.apache.spark.sql.DataFrame
-
Create a multi-dimensional rollup for the current DataFrame using the specified columns, so we can run aggregation on them.
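In standard SQL terms, a rollup over columns (c1, ..., cn) aggregates over each prefix of the column list, from all n columns down to the empty grand-total grouping. The plain-Java sketch below only enumerates those grouping sets. It does not call Spark, and the class name RollupSketch and the column names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical, Spark-free illustration of which grouping sets a rollup covers.
final class RollupSketch {

    // Returns the prefixes of the column list, longest first, ending with the empty (grand total) set.
    static List<List<String>> groupingSets(List<String> columns) {
        List<List<String>> sets = new ArrayList<>();
        for (int len = columns.size(); len >= 0; len--) {
            sets.add(new ArrayList<>(columns.subList(0, len)));
        }
        return sets;
    }

    public static void main(String[] args) {
        // Two columns yield three grouping sets: both columns, the first column, and none.
        System.out.println(groupingSets(Arrays.asList("department", "group")));
    }
}
```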
- rootMeanSquaredError() - Method in class org.apache.spark.mllib.evaluation.RegressionMetrics
-
Returns the root mean squared error, which is defined as the square root of
the mean squared error.
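The definition above can be written out directly. The following plain-Java sketch computes the mean squared error over prediction/label arrays and takes its square root. It is an illustration only, not the RegressionMetrics implementation, and the class and method names are hypothetical.

```java
// Hypothetical, Spark-free illustration of RMSE = sqrt(mean squared error).
final class RmseSketch {

    // Mean of the squared differences between predictions and labels.
    static double meanSquaredError(double[] predictions, double[] labels) {
        if (predictions.length != labels.length || predictions.length == 0) {
            throw new IllegalArgumentException("arrays must be non-empty and of equal length");
        }
        double total = 0.0;
        for (int i = 0; i < predictions.length; i++) {
            double error = predictions[i] - labels[i];
            total += error * error;
        }
        return total / predictions.length;
    }

    // Square root of the mean squared error.
    static double rootMeanSquaredError(double[] predictions, double[] labels) {
        return Math.sqrt(meanSquaredError(predictions, labels));
    }

    public static void main(String[] args) {
        double[] predictions = {1, 3, 5, 7};
        double[] labels = {3, 1, 7, 5};
        System.out.println("MSE  = " + meanSquaredError(predictions, labels));
        System.out.println("RMSE = " + rootMeanSquaredError(predictions, labels));
    }
}
```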
- rootNode() - Method in class org.apache.spark.ml.classification.DecisionTreeClassificationModel
-
- rootNode() - Method in class org.apache.spark.ml.regression.DecisionTreeRegressionModel
-
- Row - Interface in org.apache.spark.sql
-
Represents one row of output from a relational operator.
- RowFactory - Class in org.apache.spark.sql
-
A factory class used to construct Row objects.
- RowFactory() - Constructor for class org.apache.spark.sql.RowFactory
-
- rowIndices() - Method in class org.apache.spark.mllib.linalg.SparseMatrix
-
- RowMatrix - Class in org.apache.spark.mllib.linalg.distributed
-
:: Experimental ::
Represents a row-oriented distributed Matrix with no meaningful row indices.
- RowMatrix(RDD<Vector>, long, int) - Constructor for class org.apache.spark.mllib.linalg.distributed.RowMatrix
-
- RowMatrix(RDD<Vector>) - Constructor for class org.apache.spark.mllib.linalg.distributed.RowMatrix
-
Alternative constructor leaving matrix dimensions to be determined automatically.
- rowNumber() - Static method in class org.apache.spark.sql.functions
-
Window function: returns a sequential number starting at 1 within a window partition.
- rows() - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix
-
- rows() - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
-
- rowsBetween(long, long) - Method in class org.apache.spark.sql.expressions.WindowSpec
-
Defines the frame boundaries, from start (inclusive) to end (inclusive).
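To make "inclusive at both ends" concrete, here is a plain-Java sketch that sums values over a row frame. It assumes, beyond what the index entry says, that start and end are offsets relative to the current row and that the frame is clipped at the partition edges. The class name RowsBetweenSketch is hypothetical and no Spark API is used.

```java
import java.util.Arrays;

// Hypothetical, Spark-free illustration of an inclusive row frame.
final class RowsBetweenSketch {

    // For each row i, sum the values in rows [i + start, i + end], both ends inclusive,
    // clipped to the bounds of the array (standing in for one window partition).
    static long[] frameSums(long[] values, int start, int end) {
        long[] sums = new long[values.length];
        for (int i = 0; i < values.length; i++) {
            int from = Math.max(0, i + start);
            int to = Math.min(values.length - 1, i + end);
            long sum = 0;
            for (int j = from; j <= to; j++) {
                sum += values[j];
            }
            sums[i] = sum;
        }
        return sums;
    }

    public static void main(String[] args) {
        // Frame of one row before through one row after the current row.
        System.out.println(Arrays.toString(frameSums(new long[] {1, 2, 3, 4, 5}, -1, 1)));
    }
}
```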
- rowsPerBlock() - Method in class org.apache.spark.mllib.linalg.distributed.BlockMatrix
-
- rpcEnv() - Method in class org.apache.spark.SparkEnv
-
- RpcUtils - Class in org.apache.spark.util
-
- RpcUtils() - Constructor for class org.apache.spark.util.RpcUtils
-
- RRDD<T> - Class in org.apache.spark.api.r
-
An RDD that stores serialized R objects as Array[Byte].
- RRDD(RDD<T>, byte[], String, String, byte[], String, Object[], ClassTag<T>) - Constructor for class org.apache.spark.api.r.RRDD
-
- run(Function0<T>, ExecutionContext) - Method in class org.apache.spark.ComplexFutureAction
-
Executes some action enclosed in the closure.
- run(Graph<VD, ED>, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.lib.ConnectedComponents
-
Compute the connected component membership of each vertex and return a graph with the vertex
value containing the lowest vertex id in the connected component containing that vertex.
- run(Graph<VD, ED>, int, ClassTag<ED>) - Static method in class org.apache.spark.graphx.lib.LabelPropagation
-
Run static Label Propagation for detecting communities in networks.
- run(Graph<VD, ED>, int, double, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.lib.PageRank
-
Run PageRank for a fixed number of iterations returning a graph
with vertex attributes containing the PageRank and edge
attributes the normalized edge weight.
- run(Graph<VD, ED>, Seq<Object>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.lib.ShortestPaths
-
Computes shortest paths to the given set of landmark vertices.
- run(Graph<VD, ED>, int, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.lib.StronglyConnectedComponents
-
Compute the strongly connected component (SCC) of each vertex and return a graph with the
vertex value containing the lowest vertex id in the SCC containing that vertex.
- run(RDD<Edge<Object>>, SVDPlusPlus.Conf) - Static method in class org.apache.spark.graphx.lib.SVDPlusPlus
-
Implement SVD++ based on "Factorization Meets the Neighborhood:
a Multifaceted Collaborative Filtering Model",
available at http://public.research.att.com/~volinsky/netflix/kdd08koren.pdf
.
- run(Graph<VD, ED>, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.lib.TriangleCount
-
- run(RDD<LabeledPoint>) - Method in class org.apache.spark.mllib.classification.NaiveBayes
-
Run the algorithm with the configured parameters on an input RDD of LabeledPoint entries.
- run(RDD<Vector>) - Method in class org.apache.spark.mllib.clustering.GaussianMixture
-
Perform expectation maximization.
- run(JavaRDD<Vector>) - Method in class org.apache.spark.mllib.clustering.GaussianMixture
-
Java-friendly version of run()
- run(RDD<Vector>) - Method in class org.apache.spark.mllib.clustering.KMeans
-
Train a K-means model on the given set of points; data should be cached for high performance, because this is an iterative algorithm.
- run(RDD<Tuple2<Object, Vector>>) - Method in class org.apache.spark.mllib.clustering.LDA
-
Learn an LDA model using the given dataset.
- run(JavaPairRDD<Long, Vector>) - Method in class org.apache.spark.mllib.clustering.LDA
-
Java-friendly version of run()
- run(RDD<Tuple3<Object, Object, Object>>) - Method in class org.apache.spark.mllib.clustering.PowerIterationClustering
-
Run the PIC algorithm.
- run(JavaRDD<Tuple3<Long, Long, Double>>) - Method in class org.apache.spark.mllib.clustering.PowerIterationClustering
-
A Java-friendly version of PowerIterationClustering.run.
- run(RDD<Object>, ClassTag<Item>) - Method in class org.apache.spark.mllib.fpm.FPGrowth
-
Computes an FP-Growth model that contains frequent itemsets.
- run(JavaRDD<Basket>) - Method in class org.apache.spark.mllib.fpm.FPGrowth
-
- run(RDD<Rating>) - Method in class org.apache.spark.mllib.recommendation.ALS
-
- run(JavaRDD<Rating>) - Method in class org.apache.spark.mllib.recommendation.ALS
-
- run(RDD<LabeledPoint>) - Method in class org.apache.spark.mllib.regression.GeneralizedLinearAlgorithm
-
Run the algorithm with the configured parameters on an input
RDD of LabeledPoint entries.
- run(RDD<LabeledPoint>, Vector) - Method in class org.apache.spark.mllib.regression.GeneralizedLinearAlgorithm
-
Run the algorithm with the configured parameters on an input RDD
of LabeledPoint entries starting from the initial weights provided.
- run(RDD<Tuple3<Object, Object, Object>>) - Method in class org.apache.spark.mllib.regression.IsotonicRegression
-
- run(JavaRDD<Tuple3<Double, Double, Double>>) - Method in class org.apache.spark.mllib.regression.IsotonicRegression
-
- run(RDD<LabeledPoint>) - Method in class org.apache.spark.mllib.tree.DecisionTree
-
Method to train a decision tree model over an RDD
- run(RDD<LabeledPoint>) - Method in class org.apache.spark.mllib.tree.GradientBoostedTrees
-
Method to train a gradient boosting model
- run(JavaRDD<LabeledPoint>) - Method in class org.apache.spark.mllib.tree.GradientBoostedTrees
-
Java-friendly API for org.apache.spark.mllib.tree.GradientBoostedTrees.run.
- run(RDD<LabeledPoint>) - Method in class org.apache.spark.mllib.tree.RandomForest
-
Method to train a random forest model over an RDD.
- run() - Method in class org.apache.spark.rdd.PartitionCoalescer
-
Runs the packing algorithm and returns an array of PartitionGroups that, if possible, are
load balanced and grouped by locality.
- run() - Method in class org.apache.spark.util.SparkShutdownHook
-
- runApproximateJob(RDD<T>, Function2<TaskContext, Iterator<T>, U>, <any>, long) - Method in class org.apache.spark.SparkContext
-
:: DeveloperApi ::
Run a job that can return approximate results.
- runJob(RDD<T>, Function1<Iterator<T>, U>, Seq<Object>, Function2<Object, U, BoxedUnit>, Function0<R>) - Method in class org.apache.spark.ComplexFutureAction
-
Runs a Spark job.
- runJob(RDD<T>, Function2<TaskContext, Iterator<T>, U>, Seq<Object>, boolean, Function2<Object, U, BoxedUnit>, ClassTag<U>) - Method in class org.apache.spark.SparkContext
-
Run a function on a given set of partitions in an RDD and pass the results to the given
handler function.
- runJob(RDD<T>, Function2<TaskContext, Iterator<T>, U>, Seq<Object>, boolean, ClassTag<U>) - Method in class org.apache.spark.SparkContext
-
Run a function on a given set of partitions in an RDD and return the results as an array.
- runJob(RDD<T>, Function1<Iterator<T>, U>, Seq<Object>, boolean, ClassTag<U>) - Method in class org.apache.spark.SparkContext
-
Run a job on a given set of partitions of an RDD, but take a function of type
Iterator[T] => U instead of (TaskContext, Iterator[T]) => U.
- runJob(RDD<T>, Function2<TaskContext, Iterator<T>, U>, ClassTag<U>) - Method in class org.apache.spark.SparkContext
-
Run a job on all partitions in an RDD and return the results in an array.
- runJob(RDD<T>, Function1<Iterator<T>, U>, ClassTag<U>) - Method in class org.apache.spark.SparkContext
-
Run a job on all partitions in an RDD and return the results in an array.
- runJob(RDD<T>, Function2<TaskContext, Iterator<T>, U>, Function2<Object, U, BoxedUnit>, ClassTag<U>) - Method in class org.apache.spark.SparkContext
-
Run a job on all partitions in an RDD and pass the results to a handler function.
- runJob(RDD<T>, Function1<Iterator<T>, U>, Function2<Object, U, BoxedUnit>, ClassTag<U>) - Method in class org.apache.spark.SparkContext
-
Run a job on all partitions in an RDD and pass the results to a handler function.
- runLBFGS(RDD<Tuple2<Object, Vector>>, Gradient, Updater, int, double, int, double, Vector) - Static method in class org.apache.spark.mllib.optimization.LBFGS
-
Run Limited-memory BFGS (L-BFGS) in parallel.
- runMiniBatchSGD(RDD<Tuple2<Object, Vector>>, Gradient, Updater, double, int, double, double, Vector) - Static method in class org.apache.spark.mllib.optimization.GradientDescent
-
Run stochastic gradient descent (SGD) in parallel using mini batches.
- running() - Method in class org.apache.spark.scheduler.TaskInfo
-
- runningLocally() - Method in class org.apache.spark.TaskContext
-
- runSVDPlusPlus(RDD<Edge<Object>>, SVDPlusPlus.Conf) - Static method in class org.apache.spark.graphx.lib.SVDPlusPlus
-
This method is now replaced by the updated version of run() and returns exactly the same result.
- RuntimePercentage - Class in org.apache.spark.scheduler
-
- RuntimePercentage(double, Option<Object>, double) - Constructor for class org.apache.spark.scheduler.RuntimePercentage
-
- runUntilConvergence(Graph<VD, ED>, double, double, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.lib.PageRank
-
Run a dynamic version of PageRank returning a graph with vertex attributes containing the
PageRank and edge attributes containing the normalized edge weight.
- runUntilConvergenceWithOptions(Graph<VD, ED>, double, double, Option<Object>, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.lib.PageRank
-
Run a dynamic version of PageRank returning a graph with vertex attributes containing the
PageRank and edge attributes containing the normalized edge weight.
- runWithOptions(Graph<VD, ED>, int, double, Option<Object>, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.lib.PageRank
-
Run PageRank for a fixed number of iterations returning a graph
with vertex attributes containing the PageRank and edge
attributes the normalized edge weight.
- runWithValidation(RDD<LabeledPoint>, RDD<LabeledPoint>) - Method in class org.apache.spark.mllib.tree.GradientBoostedTrees
-
Method to validate a gradient boosting model
- runWithValidation(JavaRDD<LabeledPoint>, JavaRDD<LabeledPoint>) - Method in class org.apache.spark.mllib.tree.GradientBoostedTrees
-
Java-friendly API for org.apache.spark.mllib.tree.GradientBoostedTrees.runWithValidation.
- s() - Method in class org.apache.spark.mllib.linalg.SingularValueDecomposition
-
- sample(boolean, Double) - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
Return a sampled subset of this RDD.
- sample(boolean, Double, long) - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
Return a sampled subset of this RDD.
- sample(boolean, double) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Return a sampled subset of this RDD.
- sample(boolean, double, long) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Return a sampled subset of this RDD.
- sample(boolean, double) - Method in class org.apache.spark.api.java.JavaRDD
-
Return a sampled subset of this RDD.
- sample(boolean, double, long) - Method in class org.apache.spark.api.java.JavaRDD
-
Return a sampled subset of this RDD.
- sample(boolean, double, long) - Method in class org.apache.spark.rdd.RDD
-
Return a sampled subset of this RDD.
- sample(boolean, double, long) - Method in class org.apache.spark.sql.DataFrame
-
Returns a new DataFrame by sampling a fraction of rows.
- sample(boolean, double) - Method in class org.apache.spark.sql.DataFrame
-
Returns a new DataFrame by sampling a fraction of rows, using a random seed.
- sample(Iterator<T>) - Method in class org.apache.spark.util.random.BernoulliCellSampler
-
- sample(Iterator<T>) - Method in class org.apache.spark.util.random.BernoulliSampler
-
- sample(Iterator<T>) - Method in class org.apache.spark.util.random.PoissonSampler
-
- sample(Iterator<T>) - Method in interface org.apache.spark.util.random.RandomSampler
-
Take a random sample.
- sampleByKey(boolean, Map<K, Object>, long) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Return a subset of this RDD sampled by key (via stratified sampling).
- sampleByKey(boolean, Map<K, Object>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Return a subset of this RDD sampled by key (via stratified sampling).
- sampleByKey(boolean, Map<K, Object>, long) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Return a subset of this RDD sampled by key (via stratified sampling).
- sampleByKeyExact(boolean, Map<K, Object>, long) - Method in class org.apache.spark.api.java.JavaPairRDD
-
::Experimental::
Return a subset of this RDD sampled by key (via stratified sampling) containing exactly
math.ceil(numItems * samplingRate) for each stratum (group of pairs with the same key).
- sampleByKeyExact(boolean, Map<K, Object>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
::Experimental::
Return a subset of this RDD sampled by key (via stratified sampling) containing exactly
math.ceil(numItems * samplingRate) for each stratum (group of pairs with the same key).
- sampleByKeyExact(boolean, Map<K, Object>, long) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
::Experimental::
Return a subset of this RDD sampled by key (via stratified sampling) containing exactly
math.ceil(numItems * samplingRate) for each stratum (group of pairs with the same key).
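The exact per-stratum size mentioned above, math.ceil(numItems * samplingRate), can be worked through with a short plain-Java sketch. It only computes the target sample size for each key and does not perform the sampling itself. The class name ExactStratumSizeSketch, the map-based inputs, and the treatment of keys without a rate as rate 0 are assumptions for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical, Spark-free illustration of the per-key sample size ceil(numItems * samplingRate).
final class ExactStratumSizeSketch {

    // For every key, compute how many items an exact stratified sample would contain.
    static Map<String, Long> exactSampleSizes(Map<String, Long> countsByKey,
                                              Map<String, Double> fractionsByKey) {
        Map<String, Long> sizes = new LinkedHashMap<>();
        for (Map.Entry<String, Long> entry : countsByKey.entrySet()) {
            // Assumption: a key with no configured rate is sampled at rate 0.
            double fraction = fractionsByKey.getOrDefault(entry.getKey(), 0.0);
            long numItems = entry.getValue();
            sizes.put(entry.getKey(), (long) Math.ceil(numItems * fraction));
        }
        return sizes;
    }

    public static void main(String[] args) {
        Map<String, Long> counts = new LinkedHashMap<>();
        counts.put("a", 10L);
        counts.put("b", 3L);
        Map<String, Double> fractions = new LinkedHashMap<>();
        fractions.put("a", 0.25);
        fractions.put("b", 0.5);
        // ceil(10 * 0.25) and ceil(3 * 0.5): fractional targets round up.
        System.out.println(exactSampleSizes(counts, fractions));
    }
}
```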
- sampleStdev() - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
Compute the sample standard deviation of this RDD's elements (which corrects for bias in
estimating the standard deviation by dividing by N-1 instead of N).
- sampleStdev() - Method in class org.apache.spark.rdd.DoubleRDDFunctions
-
Compute the sample standard deviation of this RDD's elements (which corrects for bias in
estimating the standard deviation by dividing by N-1 instead of N).
- sampleStdev() - Method in class org.apache.spark.util.StatCounter
-
Return the sample standard deviation of the values, which corrects for bias in estimating the
variance by dividing by N-1 instead of N.
- sampleVariance() - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
Compute the sample variance of this RDD's elements (which corrects for bias in
estimating the variance by dividing by N-1 instead of N).
- sampleVariance() - Method in class org.apache.spark.rdd.DoubleRDDFunctions
-
Compute the sample variance of this RDD's elements (which corrects for bias in
estimating the variance by dividing by N-1 instead of N).
- sampleVariance() - Method in class org.apache.spark.util.StatCounter
-
Return the sample variance, which corrects for bias in estimating the variance by dividing
by N-1 instead of N.
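The N-1 versus N distinction in the sampleStdev and sampleVariance entries is easy to see numerically. The plain-Java sketch below computes both variants for a small array. It is an illustration only and is not how StatCounter or DoubleRDDFunctions are implemented. The class name SampleVarianceSketch is hypothetical, and the input is assumed to hold at least two values.

```java
// Hypothetical, Spark-free illustration: population variance divides by N, sample variance by N-1.
final class SampleVarianceSketch {

    static double mean(double[] xs) {
        double sum = 0.0;
        for (double x : xs) {
            sum += x;
        }
        return sum / xs.length;
    }

    // Sum of squared deviations from the mean.
    static double sumSquaredDeviations(double[] xs) {
        double m = mean(xs);
        double total = 0.0;
        for (double x : xs) {
            total += (x - m) * (x - m);
        }
        return total;
    }

    // Divides by N.
    static double populationVariance(double[] xs) {
        return sumSquaredDeviations(xs) / xs.length;
    }

    // Divides by N-1 to correct for bias; requires at least two values.
    static double sampleVariance(double[] xs) {
        return sumSquaredDeviations(xs) / (xs.length - 1);
    }

    // Square root of the sample variance.
    static double sampleStdev(double[] xs) {
        return Math.sqrt(sampleVariance(xs));
    }

    public static void main(String[] args) {
        double[] xs = {2, 4, 4, 4, 5, 5, 7, 9};
        System.out.println("population variance = " + populationVariance(xs));
        System.out.println("sample variance     = " + sampleVariance(xs));
        System.out.println("sample stdev        = " + sampleStdev(xs));
    }
}
```

For these eight values the squared deviations sum to 32, so the population variance is 32/8 and the sample variance is 32/7.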
- save(SparkContext, String) - Method in class org.apache.spark.mllib.classification.LogisticRegressionModel
-
- save(SparkContext, String) - Method in class org.apache.spark.mllib.classification.NaiveBayesModel
-
- save(SparkContext, String) - Method in class org.apache.spark.mllib.classification.SVMModel
-
- save(SparkContext, String) - Method in class org.apache.spark.mllib.clustering.GaussianMixtureModel
-
- save(SparkContext, String) - Method in class org.apache.spark.mllib.clustering.KMeansModel
-
- save(SparkContext, String) - Method in class org.apache.spark.mllib.clustering.PowerIterationClusteringModel
-
- save(SparkContext, String) - Method in class org.apache.spark.mllib.feature.Word2VecModel
-
- save(SparkContext, String) - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel
-
- save(SparkContext, String) - Method in class org.apache.spark.mllib.regression.IsotonicRegressionModel
-
- save(SparkContext, String) - Method in class org.apache.spark.mllib.regression.LassoModel
-
- save(SparkContext, String) - Method in class org.apache.spark.mllib.regression.LinearRegressionModel
-
- save(SparkContext, String) - Method in class org.apache.spark.mllib.regression.RidgeRegressionModel
-
- save(SparkContext, String) - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel
-
- save(SparkContext, String) - Method in class org.apache.spark.mllib.tree.model.GradientBoostedTreesModel
-
- save(SparkContext, String) - Method in class org.apache.spark.mllib.tree.model.RandomForestModel
-
- save(SparkContext, String) - Method in interface org.apache.spark.mllib.util.Saveable
-
Save this model to the given path.
- save(String) - Method in class org.apache.spark.sql.DataFrame
-
Deprecated.
As of 1.4.0, replaced by write().save(path).
- save(String, SaveMode) - Method in class org.apache.spark.sql.DataFrame
-
Deprecated.
As of 1.4.0, replaced by write().mode(mode).save(path).
- save(String, String) - Method in class org.apache.spark.sql.DataFrame
-
Deprecated.
As of 1.4.0, replaced by write().format(source).save(path).
- save(String, String, SaveMode) - Method in class org.apache.spark.sql.DataFrame
-
Deprecated.
As of 1.4.0, replaced by write().format(source).mode(mode).save(path).
- save(String, SaveMode, Map<String, String>) - Method in class org.apache.spark.sql.DataFrame
-
Deprecated.
As of 1.4.0, replaced by write().format(source).mode(mode).options(options).save(path).
- save(String, SaveMode, Map<String, String>) - Method in class org.apache.spark.sql.DataFrame
-
Deprecated.
As of 1.4.0, replaced by write().format(source).mode(mode).options(options).save(path).
- save(String) - Method in class org.apache.spark.sql.DataFrameWriter
-
Saves the content of the DataFrame at the specified path.
- save() - Method in class org.apache.spark.sql.DataFrameWriter
-
Saves the content of the DataFrame as the specified table.
- Saveable - Interface in org.apache.spark.mllib.util
-
:: DeveloperApi ::
- saveAsHadoopDataset(JobConf) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Output the RDD to any Hadoop-supported storage system, using a Hadoop JobConf object for
that storage system.
- saveAsHadoopDataset(JobConf) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Output the RDD to any Hadoop-supported storage system, using a Hadoop JobConf object for
that storage system.
- saveAsHadoopFile(String, Class<?>, Class<?>, Class<F>, JobConf) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Output the RDD to any Hadoop-supported file system.
- saveAsHadoopFile(String, Class<?>, Class<?>, Class<F>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Output the RDD to any Hadoop-supported file system.
- saveAsHadoopFile(String, Class<?>, Class<?>, Class<F>, Class<? extends CompressionCodec>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Output the RDD to any Hadoop-supported file system, compressing with the supplied codec.
- saveAsHadoopFile(String, ClassTag<F>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Output the RDD to any Hadoop-supported file system, using a Hadoop OutputFormat class
supporting the key and value types K and V in this RDD.
- saveAsHadoopFile(String, Class<? extends CompressionCodec>, ClassTag<F>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Output the RDD to any Hadoop-supported file system, using a Hadoop OutputFormat class
supporting the key and value types K and V in this RDD.
- saveAsHadoopFile(String, Class<?>, Class<?>, Class<? extends OutputFormat<?, ?>>, Class<? extends CompressionCodec>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Output the RDD to any Hadoop-supported file system, using a Hadoop OutputFormat class
supporting the key and value types K and V in this RDD.
- saveAsHadoopFile(String, Class<?>, Class<?>, Class<? extends OutputFormat<?, ?>>, JobConf, Option<Class<? extends CompressionCodec>>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Output the RDD to any Hadoop-supported file system, using a Hadoop OutputFormat class
supporting the key and value types K and V in this RDD.
- saveAsHadoopFiles(String, String) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Save each RDD in this DStream as a Hadoop file.
- saveAsHadoopFiles(String, String, Class<?>, Class<?>, Class<F>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Save each RDD in this DStream as a Hadoop file.
- saveAsHadoopFiles(String, String, Class<?>, Class<?>, Class<F>, JobConf) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Save each RDD in this DStream as a Hadoop file.
- saveAsHadoopFiles(String, String, ClassTag<F>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Save each RDD in this DStream as a Hadoop file.
- saveAsHadoopFiles(String, String, Class<?>, Class<?>, Class<? extends OutputFormat<?, ?>>, JobConf) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Save each RDD in this DStream as a Hadoop file.
- saveAsLibSVMFile(RDD<LabeledPoint>, String) - Static method in class org.apache.spark.mllib.util.MLUtils
-
Save labeled data in LIBSVM format.
- saveAsNewAPIHadoopDataset(Configuration) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Output the RDD to any Hadoop-supported storage system, using
a Configuration object for that storage system.
- saveAsNewAPIHadoopDataset(Configuration) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Output the RDD to any Hadoop-supported storage system with new Hadoop API, using a Hadoop
Configuration object for that storage system.
- saveAsNewAPIHadoopFile(String, Class<?>, Class<?>, Class<F>, Configuration) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Output the RDD to any Hadoop-supported file system.
- saveAsNewAPIHadoopFile(String, Class<?>, Class<?>, Class<F>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Output the RDD to any Hadoop-supported file system.
- saveAsNewAPIHadoopFile(String, ClassTag<F>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Output the RDD to any Hadoop-supported file system, using a new Hadoop API OutputFormat
(mapreduce.OutputFormat) object supporting the key and value types K and V in this RDD.
- saveAsNewAPIHadoopFile(String, Class<?>, Class<?>, Class<? extends OutputFormat<?, ?>>, Configuration) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Output the RDD to any Hadoop-supported file system, using a new Hadoop API OutputFormat
(mapreduce.OutputFormat) object supporting the key and value types K and V in this RDD.
- saveAsNewAPIHadoopFiles(String, String) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Save each RDD in this DStream as a Hadoop file.
- saveAsNewAPIHadoopFiles(String, String, Class<?>, Class<?>, Class<F>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Save each RDD in this DStream as a Hadoop file.
- saveAsNewAPIHadoopFiles(String, String, Class<?>, Class<?>, Class<F>, Configuration) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Save each RDD in this DStream as a Hadoop file.
- saveAsNewAPIHadoopFiles(String, String, ClassTag<F>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Save each RDD in this DStream as a Hadoop file.
- saveAsNewAPIHadoopFiles(String, String, Class<?>, Class<?>, Class<? extends OutputFormat<?, ?>>, Configuration) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Save each RDD in this DStream as a Hadoop file.
- saveAsObjectFile(String) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Save this RDD as a SequenceFile of serialized objects.
- saveAsObjectFile(String) - Method in class org.apache.spark.rdd.RDD
-
Save this RDD as a SequenceFile of serialized objects.
- saveAsObjectFiles(String, String) - Method in class org.apache.spark.streaming.dstream.DStream
-
Save each RDD in this DStream as a SequenceFile of serialized objects.
- saveAsParquetFile(String) - Method in class org.apache.spark.sql.DataFrame
-
Deprecated.
As of 1.4.0, replaced by write().parquet().
- saveAsSequenceFile(String, Option<Class<? extends CompressionCodec>>) - Method in class org.apache.spark.rdd.SequenceFileRDDFunctions
-
Output the RDD as a Hadoop SequenceFile using the Writable types we infer from the RDD's key
and value types.
- saveAsTable(String) - Method in class org.apache.spark.sql.DataFrame
-
Deprecated.
As of 1.4.0, replaced by write().saveAsTable(tableName).
- saveAsTable(String, SaveMode) - Method in class org.apache.spark.sql.DataFrame
-
Deprecated.
As of 1.4.0, replaced by write().mode(mode).saveAsTable(tableName).
- saveAsTable(String, String) - Method in class org.apache.spark.sql.DataFrame
-
Deprecated.
As of 1.4.0, replaced by write().format(source).saveAsTable(tableName).
- saveAsTable(String, String, SaveMode) - Method in class org.apache.spark.sql.DataFrame
-
Deprecated.
As of 1.4.0, replaced by write().format(source).mode(mode).saveAsTable(tableName).
- saveAsTable(String, String, SaveMode, Map<String, String>) - Method in class org.apache.spark.sql.DataFrame
-
Deprecated.
As of 1.4.0, replaced by write().format(source).mode(mode).options(options).saveAsTable(tableName).
- saveAsTable(String, String, SaveMode, Map<String, String>) - Method in class org.apache.spark.sql.DataFrame
-
Deprecated.
As of 1.4.0, replaced by write().format(source).mode(mode).options(options).saveAsTable(tableName).
- saveAsTable(String) - Method in class org.apache.spark.sql.DataFrameWriter
-
Saves the content of the DataFrame as the specified table.
- saveAsTextFile(String) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Save this RDD as a text file, using string representations of elements.
- saveAsTextFile(String, Class<? extends CompressionCodec>) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Save this RDD as a compressed text file, using string representations of elements.
- saveAsTextFile(String) - Method in class org.apache.spark.rdd.RDD
-
Save this RDD as a text file, using string representations of elements.
- saveAsTextFile(String, Class<? extends CompressionCodec>) - Method in class org.apache.spark.rdd.RDD
-
Save this RDD as a compressed text file, using string representations of elements.
- saveAsTextFiles(String, String) - Method in class org.apache.spark.streaming.dstream.DStream
-
Save each RDD in this DStream as a text file, using the string representation
of elements.
- saveLabeledData(RDD<LabeledPoint>, String) - Static method in class org.apache.spark.mllib.util.MLUtils
-
- SaveMode - Enum in org.apache.spark.sql
-
SaveMode is used to specify the expected behavior of saving a DataFrame to a data source.
- sc() - Method in class org.apache.spark.api.java.JavaSparkContext
-
- sc() - Method in class org.apache.spark.sql.SQLContext.implicits$.StringToColumn
-
- sc() - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Deprecated.
As of 0.9.0, replaced by sparkContext
- sc() - Method in class org.apache.spark.streaming.StreamingContext
-
- scalaIntToJavaLong(DStream<Object>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
- scalaToJavaLong(JavaPairDStream<K, Object>, ClassTag<K>) - Static method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
- scale() - Method in class org.apache.spark.mllib.random.GammaGenerator
-
- scale() - Method in class org.apache.spark.sql.types.Decimal
-
- scale() - Method in class org.apache.spark.sql.types.DecimalType
-
- scale() - Method in class org.apache.spark.sql.types.PrecisionInfo
-
- scalingVec() - Method in class org.apache.spark.ml.feature.ElementwiseProduct
-
the vector to multiply with input vectors
- scalingVec() - Method in class org.apache.spark.mllib.feature.ElementwiseProduct
-
- scheduler() - Method in class org.apache.spark.streaming.StreamingContext
-
- schedulingDelay() - Method in class org.apache.spark.streaming.scheduler.BatchInfo
-
Time taken for the first job of this batch to start processing from the time this batch
was submitted to the streaming scheduler.
- SchedulingMode - Class in org.apache.spark.scheduler
-
"FAIR" and "FIFO" determine which policy is used
to order tasks among a Schedulable's sub-queues.
"NONE" is used when a Schedulable has no sub-queues.
- SchedulingMode() - Constructor for class org.apache.spark.scheduler.SchedulingMode
-
- schedulingMode() - Method in class org.apache.spark.ui.jobs.JobProgressListener
-
- schedulingPool() - Method in class org.apache.spark.status.api.v1.StageData
-
- schema() - Method in class org.apache.spark.sql.DataFrame
-
- schema(StructType) - Method in class org.apache.spark.sql.DataFrameReader
-
Specifies the input schema.
- schema() - Method in interface org.apache.spark.sql.Row
-
Schema for the row.
- schema() - Method in class org.apache.spark.sql.sources.BaseRelation
-
- schema() - Method in class org.apache.spark.sql.sources.HadoopFsRelation
-
Schema of this relation.
- SchemaRelationProvider - Interface in org.apache.spark.sql.sources
-
:: DeveloperApi ::
Implemented by objects that produce relations for a specific kind of data source
with a given schema.
- scope() - Method in class org.apache.spark.rdd.RDD
-
The scope associated with the operation that created this RDD.
- scope() - Method in class org.apache.spark.storage.RDDInfo
-
- scoreAndLabels() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
-
- seconds() - Static method in class org.apache.spark.scheduler.StatsReportListener
-
- seconds(long) - Static method in class org.apache.spark.streaming.Durations
-
- Seconds - Class in org.apache.spark.streaming
-
Helper object that creates an instance of Duration representing
a given number of seconds.
- Seconds() - Constructor for class org.apache.spark.streaming.Seconds
-
- securityManager() - Method in class org.apache.spark.SparkEnv
-
- select(Column...) - Method in class org.apache.spark.sql.DataFrame
-
Selects a set of expressions.
- select(String, String...) - Method in class org.apache.spark.sql.DataFrame
-
Selects a set of columns.
- select(Seq<Column>) - Method in class org.apache.spark.sql.DataFrame
-
Selects a set of expressions.
- select(String, Seq<String>) - Method in class org.apache.spark.sql.DataFrame
-
Selects a set of columns.
- selectedFeatures() - Method in class org.apache.spark.mllib.feature.ChiSqSelectorModel
-
- selectExpr(String...) - Method in class org.apache.spark.sql.DataFrame
-
Selects a set of SQL expressions.
- selectExpr(Seq<String>) - Method in class org.apache.spark.sql.DataFrame
-
Selects a set of SQL expressions.
- sendMessages(String, Map<String, Integer>) - Method in class org.apache.spark.streaming.kafka.KafkaTestUtils
-
Java-friendly function for sending messages to the Kafka broker
- sendMessages(String, Map<String, Object>) - Method in class org.apache.spark.streaming.kafka.KafkaTestUtils
-
Send the messages to the Kafka broker
- sendMessages(String, String[]) - Method in class org.apache.spark.streaming.kafka.KafkaTestUtils
-
Send the array of messages to the Kafka broker
- sendToDst(A) - Method in class org.apache.spark.graphx.EdgeContext
-
Sends a message to the destination vertex.
- sendToDst(A) - Method in class org.apache.spark.graphx.impl.AggregatingEdgeContext
-
- sendToSrc(A) - Method in class org.apache.spark.graphx.EdgeContext
-
Sends a message to the source vertex.
- sendToSrc(A) - Method in class org.apache.spark.graphx.impl.AggregatingEdgeContext
-
- sequenceFile(String, Class<K>, Class<V>, int) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Get an RDD for a Hadoop SequenceFile with given key and value types.
- sequenceFile(String, Class<K>, Class<V>) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Get an RDD for a Hadoop SequenceFile.
- sequenceFile(String, Class<K>, Class<V>, int) - Method in class org.apache.spark.SparkContext
-
Get an RDD for a Hadoop SequenceFile with given key and value types.
- sequenceFile(String, Class<K>, Class<V>) - Method in class org.apache.spark.SparkContext
-
Get an RDD for a Hadoop SequenceFile with given key and value types.
- sequenceFile(String, int, ClassTag<K>, ClassTag<V>, Function0<WritableConverter<K>>, Function0<WritableConverter<V>>) - Method in class org.apache.spark.SparkContext
-
Version of sequenceFile() for types implicitly convertible to Writables through a
WritableConverter.
- SequenceFileRDDFunctions<K,V> - Class in org.apache.spark.rdd
-
Extra functions available on RDDs of (key, value) pairs to create a Hadoop SequenceFile,
through an implicit conversion.
- SequenceFileRDDFunctions(RDD<Tuple2<K, V>>, Class<? extends Writable>, Class<? extends Writable>, Function1<K, Writable>, ClassTag<K>, Function1<V, Writable>, ClassTag<V>) - Constructor for class org.apache.spark.rdd.SequenceFileRDDFunctions
-
- SequenceFileRDDFunctions(RDD<Tuple2<K, V>>, Function1<K, Writable>, ClassTag<K>, Function1<V, Writable>, ClassTag<V>) - Constructor for class org.apache.spark.rdd.SequenceFileRDDFunctions
-
- SerializableWritable<T extends org.apache.hadoop.io.Writable> - Class in org.apache.spark
-
- SerializableWritable(T) - Constructor for class org.apache.spark.SerializableWritable
-
- SerializationStream - Class in org.apache.spark.serializer
-
:: DeveloperApi ::
A stream for writing serialized objects.
- SerializationStream() - Constructor for class org.apache.spark.serializer.SerializationStream
-
- serialize(T, ClassTag<T>) - Method in class org.apache.spark.serializer.SerializerInstance
-
- serialize(Object) - Method in class org.apache.spark.sql.types.UserDefinedType
-
Convert the user type to a SQL datum
- serializedData() - Method in class org.apache.spark.scheduler.local.StatusUpdate
-
- Serializer - Class in org.apache.spark.serializer
-
:: DeveloperApi ::
A serializer.
- Serializer() - Constructor for class org.apache.spark.serializer.Serializer
-
- serializer() - Method in class org.apache.spark.ShuffleDependency
-
- serializer() - Method in class org.apache.spark.SparkEnv
-
- SerializerInstance - Class in org.apache.spark.serializer
-
:: DeveloperApi ::
An instance of a serializer, for use by one thread at a time.
- SerializerInstance() - Constructor for class org.apache.spark.serializer.SerializerInstance
-
- serializeStream(OutputStream) - Method in class org.apache.spark.serializer.SerializerInstance
-
- set(long, long, int, int, VD, VD, ED) - Method in class org.apache.spark.graphx.impl.AggregatingEdgeContext
-
- set(Param<T>, T) - Method in interface org.apache.spark.ml.param.Params
-
Sets a parameter in the embedded param map.
- set(String, Object) - Method in interface org.apache.spark.ml.param.Params
-
Sets a parameter (by name) in the embedded param map.
- set(ParamPair<?>) - Method in interface org.apache.spark.ml.param.Params
-
Sets a parameter in the embedded param map.
- set(String, String) - Method in class org.apache.spark.SparkConf
-
Set a configuration variable.
- set(SparkEnv) - Static method in class org.apache.spark.SparkEnv
-
- set(long) - Method in class org.apache.spark.sql.types.Decimal
-
Set this Decimal to the given Long.
- set(int) - Method in class org.apache.spark.sql.types.Decimal
-
Set this Decimal to the given Int.
- set(long, int, int) - Method in class org.apache.spark.sql.types.Decimal
-
Set this Decimal to the given unscaled Long, with a given precision and scale.
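To make the "unscaled Long, with a given precision and scale" wording concrete, here is a minimal sketch that uses only java.math.BigDecimal from the Java standard library. It does not call Spark's Decimal class; the class and method names (UnscaledDecimalSketch, fromUnscaled, fitsPrecision) are hypothetical and exist only for illustration.

```java
import java.math.BigDecimal;

public class UnscaledDecimalSketch {
    // Interprets an unscaled long with the given scale:
    // the value is unscaled * 10^(-scale), so (12345, 2) means 123.45.
    static BigDecimal fromUnscaled(long unscaled, int scale) {
        return BigDecimal.valueOf(unscaled, scale);
    }

    // Checks whether the value fits in the given precision,
    // where precision is the total number of significant digits.
    static boolean fitsPrecision(BigDecimal value, int precision) {
        return value.precision() <= precision;
    }

    public static void main(String[] args) {
        BigDecimal d = fromUnscaled(12345L, 2);
        System.out.println(d);                    // the decoded value
        System.out.println(fitsPrecision(d, 5));  // five digits fit in precision 5
        System.out.println(fitsPrecision(d, 4));  // but not in precision 4
    }
}
```

The scale says where the decimal point goes and the precision bounds how many digits are allowed in total, which is why both are passed alongside the unscaled value.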
- set(BigDecimal, int, int) - Method in class org.apache.spark.sql.types.Decimal
-
Set this Decimal to the given BigDecimal value, with a given precision and scale.
- set(BigDecimal) - Method in class org.apache.spark.sql.types.Decimal
-
Set this Decimal to the given BigDecimal value, inheriting its precision and scale.
- set(Decimal) - Method in class org.apache.spark.sql.types.Decimal
-
Set this Decimal to the given Decimal value.
- set(String) - Method in class org.apache.spark.sql.types.UTF8String
-
Update the UTF8String with String.
- set(byte[]) - Method in class org.apache.spark.sql.types.UTF8String
-
Update the UTF8String with Array[Byte], which should be encoded in UTF-8
- setAggregator(Aggregator<K, V, C>) - Method in class org.apache.spark.rdd.ShuffledRDD
-
Set aggregator for RDD's shuffle.
- setAlgo(String) - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
Sets Algorithm using a String.
- setAll(Traversable<Tuple2<String, String>>) - Method in class org.apache.spark.SparkConf
-
Set multiple parameters together
- setAlpha(double) - Method in class org.apache.spark.ml.recommendation.ALS
-
- setAlpha(double) - Method in class org.apache.spark.mllib.clustering.LDA
-
Alias for setDocConcentration()
- setAlpha(double) - Method in class org.apache.spark.mllib.recommendation.ALS
-
- setAppName(String) - Method in class org.apache.spark.launcher.SparkLauncher
-
Set the application name.
- setAppName(String) - Method in class org.apache.spark.SparkConf
-
Set a name for your application.
- setAppResource(String) - Method in class org.apache.spark.launcher.SparkLauncher
-
Set the main application resource.
- setBandwidth(double) - Method in class org.apache.spark.mllib.stat.KernelDensity
-
Sets the bandwidth (standard deviation) of the Gaussian kernel (default: 1.0).
- setBeta(double) - Method in class org.apache.spark.mllib.clustering.LDA
-
Alias for setTopicConcentration()
- setBlocks(int) - Method in class org.apache.spark.mllib.recommendation.ALS
-
- setCacheNodeIds(boolean) - Method in class org.apache.spark.ml.classification.DecisionTreeClassifier
-
- setCacheNodeIds(boolean) - Method in class org.apache.spark.ml.classification.GBTClassifier
-
- setCacheNodeIds(boolean) - Method in class org.apache.spark.ml.classification.RandomForestClassifier
-
- setCacheNodeIds(boolean) - Method in class org.apache.spark.ml.regression.DecisionTreeRegressor
-
- setCacheNodeIds(boolean) - Method in class org.apache.spark.ml.regression.GBTRegressor
-
- setCacheNodeIds(boolean) - Method in class org.apache.spark.ml.regression.RandomForestRegressor
-
- setCallSite(String) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Pass-through to SparkContext.setCallSite.
- setCallSite(String) - Method in class org.apache.spark.SparkContext
-
Set the thread-local property for overriding the call sites
of actions and RDDs.
- setCategoricalFeaturesInfo(Map<Integer, Integer>) - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
Sets categoricalFeaturesInfo using a Java Map.
- setCheckpointDir(String) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Set the directory under which RDDs are going to be checkpointed.
- setCheckpointDir(String) - Method in class org.apache.spark.SparkContext
-
Set the directory under which RDDs are going to be checkpointed.
- setCheckpointInterval(int) - Method in class org.apache.spark.ml.classification.DecisionTreeClassifier
-
- setCheckpointInterval(int) - Method in class org.apache.spark.ml.classification.GBTClassifier
-
- setCheckpointInterval(int) - Method in class org.apache.spark.ml.classification.RandomForestClassifier
-
- setCheckpointInterval(int) - Method in class org.apache.spark.ml.recommendation.ALS
-
- setCheckpointInterval(int) - Method in class org.apache.spark.ml.regression.DecisionTreeRegressor
-
- setCheckpointInterval(int) - Method in class org.apache.spark.ml.regression.GBTRegressor
-
- setCheckpointInterval(int) - Method in class org.apache.spark.ml.regression.RandomForestRegressor
-
- setCheckpointInterval(int) - Method in class org.apache.spark.mllib.clustering.LDA
-
Period (in iterations) between checkpoints (default = 10).
- setCheckpointInterval(int) - Method in class org.apache.spark.mllib.recommendation.ALS
-
- setCheckpointInterval(int) - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- setClassifier(Classifier<?, ?, ?>) - Method in class org.apache.spark.ml.classification.OneVsRest
-
- setConf(String, String) - Method in class org.apache.spark.launcher.SparkLauncher
-
Set a single configuration value for the application.
- setConf(String, String) - Method in class org.apache.spark.sql.hive.HiveContext
-
- setConf(Properties) - Method in class org.apache.spark.sql.SQLContext
-
Set Spark SQL configuration properties.
- setConf(String, String) - Method in class org.apache.spark.sql.SQLContext
-
Set the given Spark SQL configuration property.
- setConvergenceTol(double) - Method in class org.apache.spark.mllib.clustering.GaussianMixture
-
Set the largest change in log-likelihood at which convergence is
considered to have occurred.
- setConvergenceTol(double) - Method in class org.apache.spark.mllib.optimization.LBFGS
-
Set the convergence tolerance of iterations for L-BFGS.
- setDecayFactor(double) - Method in class org.apache.spark.mllib.clustering.StreamingKMeans
-
Set the decay factor directly (for forgetful algorithms).
- setDefault(ParamPair<?>...) - Method in interface org.apache.spark.ml.param.Params
-
Sets default values for a list of params.
- setDefault(Param<T>, T) - Method in interface org.apache.spark.ml.param.Params
-
Sets a default value for a param.
- setDefault(Seq<ParamPair<?>>) - Method in interface org.apache.spark.ml.param.Params
-
Sets default values for a list of params.
- setDefaultClassLoader(ClassLoader) - Method in class org.apache.spark.serializer.Serializer
-
Sets a class loader for the serializer to use in deserialization.
- setDegree(int) - Method in class org.apache.spark.ml.feature.PolynomialExpansion
-
- setDeployMode(String) - Method in class org.apache.spark.launcher.SparkLauncher
-
Set the deploy mode for the application.
- setDocConcentration(double) - Method in class org.apache.spark.mllib.clustering.LDA
-
Concentration parameter (commonly named "alpha") for the prior placed on documents'
distributions over topics ("theta").
- setDropLast(boolean) - Method in class org.apache.spark.ml.feature.OneHotEncoder
-
- setElasticNetParam(double) - Method in class org.apache.spark.ml.classification.LogisticRegression
-
Set the ElasticNet mixing parameter.
- setElasticNetParam(double) - Method in class org.apache.spark.ml.regression.LinearRegression
-
Set the ElasticNet mixing parameter.
- setEpsilon(double) - Method in class org.apache.spark.mllib.clustering.KMeans
-
Set the distance threshold within which we consider centers to have converged.
- setEstimator(Estimator<?>) - Method in class org.apache.spark.ml.tuning.CrossValidator
-
- setEstimatorParamMaps(ParamMap[]) - Method in class org.apache.spark.ml.tuning.CrossValidator
-
- setEvaluator(Evaluator) - Method in class org.apache.spark.ml.tuning.CrossValidator
-
- setExecutorEnv(String, String) - Method in class org.apache.spark.SparkConf
-
Set an environment variable to be used when launching executors for this application.
- setExecutorEnv(Seq<Tuple2<String, String>>) - Method in class org.apache.spark.SparkConf
-
Set multiple environment variables to be used when launching executors.
- setExecutorEnv(Tuple2<String, String>[]) - Method in class org.apache.spark.SparkConf
-
Set multiple environment variables to be used when launching executors.
- setFeaturesCol(String) - Method in class org.apache.spark.ml.PredictionModel
-
- setFeaturesCol(String) - Method in class org.apache.spark.ml.Predictor
-
- setFeatureSubsetStrategy(String) - Method in class org.apache.spark.ml.classification.RandomForestClassifier
-
- setFeatureSubsetStrategy(String) - Method in class org.apache.spark.ml.regression.RandomForestRegressor
-
- setFinalRDDStorageLevel(StorageLevel) - Method in class org.apache.spark.mllib.recommendation.ALS
-
- setFitIntercept(boolean) - Method in class org.apache.spark.ml.classification.LogisticRegression
-
Whether to fit an intercept term.
- setGaps(boolean) - Method in class org.apache.spark.ml.feature.RegexTokenizer
-
- setGradient(Gradient) - Method in class org.apache.spark.mllib.optimization.GradientDescent
-
Set the gradient function (of the loss function of one single data example)
to be used for SGD.
- setGradient(Gradient) - Method in class org.apache.spark.mllib.optimization.LBFGS
-
Set the gradient function (of the loss function of one single data example)
to be used for L-BFGS.
- setHalfLife(double, String) - Method in class org.apache.spark.mllib.clustering.StreamingKMeans
-
Set the half life and time unit ("batches" or "points") for forgetful algorithms.
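The relationship between a half life and the decay factor of setDecayFactor can be sketched with plain arithmetic. The sketch below assumes that the decay factor a is chosen so that a raised to the half life equals 0.5, that is, a = exp(ln(0.5) / halfLife). It uses only the Java standard library and does not call StreamingKMeans; the class and method names are hypothetical.

```java
public class HalfLifeSketch {
    // Assumed relationship: decayFactor^halfLife == 0.5,
    // which gives decayFactor = exp(ln(0.5) / halfLife).
    static double decayFactor(double halfLife) {
        return Math.exp(Math.log(0.5) / halfLife);
    }

    // Weight that an old contribution retains after `elapsed` time units
    // (batches or points, depending on the chosen time unit).
    static double remainingWeight(double decayFactor, double elapsed) {
        return Math.pow(decayFactor, elapsed);
    }

    public static void main(String[] args) {
        double a = decayFactor(5.0);
        System.out.println("decay factor for half life 5: " + a);
        System.out.println("weight after 5 units: " + remainingWeight(a, 5.0));
        System.out.println("weight after 10 units: " + remainingWeight(a, 10.0));
    }
}
```

Under this assumption, a contribution keeps half of its weight after one half life and a quarter after two.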
- setIfMissing(String, String) - Method in class org.apache.spark.SparkConf
-
Set a parameter if it isn't already configured
- setImplicitPrefs(boolean) - Method in class org.apache.spark.ml.recommendation.ALS
-
- setImplicitPrefs(boolean) - Method in class org.apache.spark.mllib.recommendation.ALS
-
- setImpurity(String) - Method in class org.apache.spark.ml.classification.DecisionTreeClassifier
-
- setImpurity(String) - Method in class org.apache.spark.ml.classification.GBTClassifier
-
The impurity setting is ignored for GBT models.
- setImpurity(String) - Method in class org.apache.spark.ml.classification.RandomForestClassifier
-
- setImpurity(String) - Method in class org.apache.spark.ml.regression.DecisionTreeRegressor
-
- setImpurity(String) - Method in class org.apache.spark.ml.regression.GBTRegressor
-
The impurity setting is ignored for GBT models.
- setImpurity(String) - Method in class org.apache.spark.ml.regression.RandomForestRegressor
-
- setImpurity(Impurity) - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- setInitialCenters(Vector[], double[]) - Method in class org.apache.spark.mllib.clustering.StreamingKMeans
-
Specify initial centers directly.
- setInitializationMode(String) - Method in class org.apache.spark.mllib.clustering.KMeans
-
Set the initialization algorithm.
- setInitializationMode(String) - Method in class org.apache.spark.mllib.clustering.PowerIterationClustering
-
- setInitializationSteps(int) - Method in class org.apache.spark.mllib.clustering.KMeans
-
Set the number of steps for the k-means|| initialization mode.
- setInitialModel(GaussianMixtureModel) - Method in class org.apache.spark.mllib.clustering.GaussianMixture
-
Set the initial GMM starting point, bypassing the random initialization.
- setInitialWeights(Vector) - Method in class org.apache.spark.mllib.classification.StreamingLogisticRegressionWithSGD
-
Set the initial weights.
- setInitialWeights(Vector) - Method in class org.apache.spark.mllib.regression.StreamingLinearRegressionWithSGD
-
Set the initial weights.
- setInputCol(String) - Method in class org.apache.spark.ml.feature.Binarizer
-
- setInputCol(String) - Method in class org.apache.spark.ml.feature.Bucketizer
-
- setInputCol(String) - Method in class org.apache.spark.ml.feature.HashingTF
-
- setInputCol(String) - Method in class org.apache.spark.ml.feature.IDF
-
- setInputCol(String) - Method in class org.apache.spark.ml.feature.IDFModel
-
- setInputCol(String) - Method in class org.apache.spark.ml.feature.OneHotEncoder
-
- setInputCol(String) - Method in class org.apache.spark.ml.feature.StandardScaler
-
- setInputCol(String) - Method in class org.apache.spark.ml.feature.StandardScalerModel
-
- setInputCol(String) - Method in class org.apache.spark.ml.feature.StringIndexer
-
- setInputCol(String) - Method in class org.apache.spark.ml.feature.StringIndexerModel
-
- setInputCol(String) - Method in class org.apache.spark.ml.feature.VectorIndexer
-
- setInputCol(String) - Method in class org.apache.spark.ml.feature.VectorIndexerModel
-
- setInputCol(String) - Method in class org.apache.spark.ml.feature.Word2Vec
-
- setInputCol(String) - Method in class org.apache.spark.ml.feature.Word2VecModel
-
- setInputCol(String) - Method in class org.apache.spark.ml.UnaryTransformer
-
- setInputCols(String[]) - Method in class org.apache.spark.ml.feature.VectorAssembler
-
- setIntercept(boolean) - Method in class org.apache.spark.mllib.regression.GeneralizedLinearAlgorithm
-
Set if the algorithm should add an intercept.
- setIntermediateRDDStorageLevel(StorageLevel) - Method in class org.apache.spark.mllib.recommendation.ALS
-
- setIsotonic(boolean) - Method in class org.apache.spark.mllib.regression.IsotonicRegression
-
- setItemCol(String) - Method in class org.apache.spark.ml.recommendation.ALS
-
- setItemCol(String) - Method in class org.apache.spark.ml.recommendation.ALSModel
-
- setIterations(int) - Method in class org.apache.spark.mllib.recommendation.ALS
-
- setJars(Seq<String>) - Method in class org.apache.spark.SparkConf
-
Set JAR files to distribute to the cluster.
- setJars(String[]) - Method in class org.apache.spark.SparkConf
-
Set JAR files to distribute to the cluster.
- setJavaHome(String) - Method in class org.apache.spark.launcher.SparkLauncher
-
Set a custom JAVA_HOME for launching the Spark application.
- setJobDescription(String) - Method in class org.apache.spark.SparkContext
-
Set a human readable description of the current job.
- setJobGroup(String, String, boolean) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Assigns a group ID to all the jobs started by this thread until the group ID is set to a
different value or cleared.
- setJobGroup(String, String) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Assigns a group ID to all the jobs started by this thread until the group ID is set to a
different value or cleared.
- setJobGroup(String, String, boolean) - Method in class org.apache.spark.SparkContext
-
Assigns a group ID to all the jobs started by this thread until the group ID is set to a
different value or cleared.
- setK(int) - Method in class org.apache.spark.mllib.clustering.GaussianMixture
-
Set the number of Gaussians in the mixture model.
- setK(int) - Method in class org.apache.spark.mllib.clustering.KMeans
-
Set the number of clusters to create (k).
- setK(int) - Method in class org.apache.spark.mllib.clustering.LDA
-
Number of topics to infer.
- setK(int) - Method in class org.apache.spark.mllib.clustering.PowerIterationClustering
-
- setK(int) - Method in class org.apache.spark.mllib.clustering.StreamingKMeans
-
Set the number of clusters.
- setKappa(double) - Method in class org.apache.spark.mllib.clustering.OnlineLDAOptimizer
-
Learning rate: exponential decay rate. Should be in
(0.5, 1.0] to guarantee asymptotic convergence.
- setKeyOrdering(Ordering<K>) - Method in class org.apache.spark.rdd.ShuffledRDD
-
Set key ordering for RDD's shuffle.
- setLabelCol(String) - Method in class org.apache.spark.ml.evaluation.BinaryClassificationEvaluator
-
- setLabelCol(String) - Method in class org.apache.spark.ml.evaluation.RegressionEvaluator
-
- setLabelCol(String) - Method in class org.apache.spark.ml.Predictor
-
- setLambda(double) - Method in class org.apache.spark.mllib.classification.NaiveBayes
-
Set the smoothing parameter.
- setLambda(double) - Method in class org.apache.spark.mllib.recommendation.ALS
-
- setLearningRate(double) - Method in class org.apache.spark.mllib.feature.Word2Vec
-
- setLearningRate(double) - Method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
-
- setLocalProperty(String, String) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Set a local property that affects jobs submitted from this thread, such as the
Spark fair scheduler pool.
- setLocalProperty(String, String) - Method in class org.apache.spark.SparkContext
-
Set a local property that affects jobs submitted from this thread, such as the
Spark fair scheduler pool.
- setLogLevel(String) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Control our logLevel.
- setLogLevel(String) - Method in class org.apache.spark.SparkContext
-
Control our logLevel.
- setLoss(Loss) - Method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
-
- setLossType(String) - Method in class org.apache.spark.ml.classification.GBTClassifier
-
- setLossType(String) - Method in class org.apache.spark.ml.regression.GBTRegressor
-
- setMainClass(String) - Method in class org.apache.spark.launcher.SparkLauncher
-
Sets the application class name for Java/Scala applications.
- setMapSideCombine(boolean) - Method in class org.apache.spark.rdd.ShuffledRDD
-
Set mapSideCombine flag for RDD's shuffle.
- setMaster(String) - Method in class org.apache.spark.launcher.SparkLauncher
-
Set the Spark master for the application.
- setMaster(String) - Method in class org.apache.spark.SparkConf
-
The master URL to connect to, such as "local" to run locally with one thread, "local[4]" to
run locally with 4 cores, or "spark://master:7077" to run on a Spark standalone cluster.
- setMaxBins(int) - Method in class org.apache.spark.ml.classification.DecisionTreeClassifier
-
- setMaxBins(int) - Method in class org.apache.spark.ml.classification.GBTClassifier
-
- setMaxBins(int) - Method in class org.apache.spark.ml.classification.RandomForestClassifier
-
- setMaxBins(int) - Method in class org.apache.spark.ml.regression.DecisionTreeRegressor
-
- setMaxBins(int) - Method in class org.apache.spark.ml.regression.GBTRegressor
-
- setMaxBins(int) - Method in class org.apache.spark.ml.regression.RandomForestRegressor
-
- setMaxBins(int) - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- setMaxCategories(int) - Method in class org.apache.spark.ml.feature.VectorIndexer
-
- setMaxDepth(int) - Method in class org.apache.spark.ml.classification.DecisionTreeClassifier
-
- setMaxDepth(int) - Method in class org.apache.spark.ml.classification.GBTClassifier
-
- setMaxDepth(int) - Method in class org.apache.spark.ml.classification.RandomForestClassifier
-
- setMaxDepth(int) - Method in class org.apache.spark.ml.regression.DecisionTreeRegressor
-
- setMaxDepth(int) - Method in class org.apache.spark.ml.regression.GBTRegressor
-
- setMaxDepth(int) - Method in class org.apache.spark.ml.regression.RandomForestRegressor
-
- setMaxDepth(int) - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- setMaxIter(int) - Method in class org.apache.spark.ml.classification.GBTClassifier
-
- setMaxIter(int) - Method in class org.apache.spark.ml.classification.LogisticRegression
-
Set the maximum number of iterations.
- setMaxIter(int) - Method in class org.apache.spark.ml.feature.Word2Vec
-
- setMaxIter(int) - Method in class org.apache.spark.ml.recommendation.ALS
-
- setMaxIter(int) - Method in class org.apache.spark.ml.regression.GBTRegressor
-
- setMaxIter(int) - Method in class org.apache.spark.ml.regression.LinearRegression
-
Set the maximum number of iterations.
- setMaxIterations(int) - Method in class org.apache.spark.mllib.clustering.GaussianMixture
-
Set the maximum number of iterations to run.
- setMaxIterations(int) - Method in class org.apache.spark.mllib.clustering.KMeans
-
Set maximum number of iterations to run.
- setMaxIterations(int) - Method in class org.apache.spark.mllib.clustering.LDA
-
Maximum number of iterations for learning.
- setMaxIterations(int) - Method in class org.apache.spark.mllib.clustering.PowerIterationClustering
-
- setMaxMemoryInMB(int) - Method in class org.apache.spark.ml.classification.DecisionTreeClassifier
-
- setMaxMemoryInMB(int) - Method in class org.apache.spark.ml.classification.GBTClassifier
-
- setMaxMemoryInMB(int) - Method in class org.apache.spark.ml.classification.RandomForestClassifier
-
- setMaxMemoryInMB(int) - Method in class org.apache.spark.ml.regression.DecisionTreeRegressor
-
- setMaxMemoryInMB(int) - Method in class org.apache.spark.ml.regression.GBTRegressor
-
- setMaxMemoryInMB(int) - Method in class org.apache.spark.ml.regression.RandomForestRegressor
-
- setMaxMemoryInMB(int) - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- setMaxNumIterations(int) - Method in class org.apache.spark.mllib.optimization.LBFGS
-
- setMetricName(String) - Method in class org.apache.spark.ml.evaluation.BinaryClassificationEvaluator
-
- setMetricName(String) - Method in class org.apache.spark.ml.evaluation.RegressionEvaluator
-
- setMinCount(int) - Method in class org.apache.spark.ml.feature.Word2Vec
-
- setMinCount(int) - Method in class org.apache.spark.mllib.feature.Word2Vec
-
- setMiniBatchFraction(double) - Method in class org.apache.spark.mllib.classification.StreamingLogisticRegressionWithSGD
-
Set the fraction of each batch to use for updates.
- setMiniBatchFraction(double) - Method in class org.apache.spark.mllib.clustering.OnlineLDAOptimizer
-
Mini-batch fraction in (0, 1], which sets the fraction of documents sampled and used in
each iteration.
- setMiniBatchFraction(double) - Method in class org.apache.spark.mllib.optimization.GradientDescent
-
:: Experimental ::
Set fraction of data to be used for each SGD iteration.
- setMiniBatchFraction(double) - Method in class org.apache.spark.mllib.regression.StreamingLinearRegressionWithSGD
-
Set the fraction of each batch to use for updates.
- setMinInfoGain(double) - Method in class org.apache.spark.ml.classification.DecisionTreeClassifier
-
- setMinInfoGain(double) - Method in class org.apache.spark.ml.classification.GBTClassifier
-
- setMinInfoGain(double) - Method in class org.apache.spark.ml.classification.RandomForestClassifier
-
- setMinInfoGain(double) - Method in class org.apache.spark.ml.regression.DecisionTreeRegressor
-
- setMinInfoGain(double) - Method in class org.apache.spark.ml.regression.GBTRegressor
-
- setMinInfoGain(double) - Method in class org.apache.spark.ml.regression.RandomForestRegressor
-
- setMinInfoGain(double) - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- setMinInstancesPerNode(int) - Method in class org.apache.spark.ml.classification.DecisionTreeClassifier
-
- setMinInstancesPerNode(int) - Method in class org.apache.spark.ml.classification.GBTClassifier
-
- setMinInstancesPerNode(int) - Method in class org.apache.spark.ml.classification.RandomForestClassifier
-
- setMinInstancesPerNode(int) - Method in class org.apache.spark.ml.regression.DecisionTreeRegressor
-
- setMinInstancesPerNode(int) - Method in class org.apache.spark.ml.regression.GBTRegressor
-
- setMinInstancesPerNode(int) - Method in class org.apache.spark.ml.regression.RandomForestRegressor
-
- setMinInstancesPerNode(int) - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- setMinSupport(double) - Method in class org.apache.spark.mllib.fpm.FPGrowth
-
Sets the minimal support level (default: 0.3).
- setMinTokenLength(int) - Method in class org.apache.spark.ml.feature.RegexTokenizer
-
- setModelType(String) - Method in class org.apache.spark.mllib.classification.NaiveBayes
-
Set the model type using a string (case-sensitive).
- setName(String) - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
Assign a name to this RDD
- setName(String) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Assign a name to this RDD
- setName(String) - Method in class org.apache.spark.api.java.JavaRDD
-
Assign a name to this RDD
- setName(String) - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
-
- setName(String) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
-
- setName(String) - Method in class org.apache.spark.rdd.RDD
-
Assign a name to this RDD
- setNonnegative(boolean) - Method in class org.apache.spark.ml.recommendation.ALS
-
- setNonnegative(boolean) - Method in class org.apache.spark.mllib.recommendation.ALS
-
- setNumBlocks(int) - Method in class org.apache.spark.ml.recommendation.ALS
-
Sets both numUserBlocks and numItemBlocks to the specific value.
- setNumClasses(int) - Method in class org.apache.spark.mllib.classification.LogisticRegressionWithLBFGS
-
:: Experimental ::
Set the number of possible outcomes for a k-class classification problem in
Multinomial Logistic Regression.
- setNumClasses(int) - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- setNumCorrections(int) - Method in class org.apache.spark.mllib.optimization.LBFGS
-
Set the number of corrections used in the LBFGS update.
- setNumFeatures(int) - Method in class org.apache.spark.ml.feature.HashingTF
-
- setNumFolds(int) - Method in class org.apache.spark.ml.tuning.CrossValidator
-
- setNumItemBlocks(int) - Method in class org.apache.spark.ml.recommendation.ALS
-
- setNumIterations(int) - Method in class org.apache.spark.mllib.classification.StreamingLogisticRegressionWithSGD
-
Set the number of iterations of gradient descent to run per update.
- setNumIterations(int) - Method in class org.apache.spark.mllib.feature.Word2Vec
-
- setNumIterations(int) - Method in class org.apache.spark.mllib.optimization.GradientDescent
-
Set the number of iterations for SGD.
- setNumIterations(int) - Method in class org.apache.spark.mllib.optimization.LBFGS
-
Set the maximal number of iterations for L-BFGS.
- setNumIterations(int) - Method in class org.apache.spark.mllib.regression.StreamingLinearRegressionWithSGD
-
Set the number of iterations of gradient descent to run per update.
- setNumIterations(int) - Method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
-
- setNumPartitions(int) - Method in class org.apache.spark.ml.feature.Word2Vec
-
- setNumPartitions(int) - Method in class org.apache.spark.mllib.feature.Word2Vec
-
- setNumPartitions(int) - Method in class org.apache.spark.mllib.fpm.FPGrowth
-
Sets the number of partitions used by parallel FP-growth (default: same as input data).
- setNumTrees(int) - Method in class org.apache.spark.ml.classification.RandomForestClassifier
-
- setNumTrees(int) - Method in class org.apache.spark.ml.regression.RandomForestRegressor
-
- setNumUserBlocks(int) - Method in class org.apache.spark.ml.recommendation.ALS
-
- setOptimizer(LDAOptimizer) - Method in class org.apache.spark.mllib.clustering.LDA
-
:: DeveloperApi ::
- setOptimizer(String) - Method in class org.apache.spark.mllib.clustering.LDA
-
Set the LDAOptimizer used to perform the actual calculation by algorithm name.
- setOrNull(long, int, int) - Method in class org.apache.spark.sql.types.Decimal
-
Set this Decimal to the given unscaled Long, with a given precision and scale,
and return it, or return null if it cannot be set due to overflow.
- setOutputCol(String) - Method in class org.apache.spark.ml.feature.Binarizer
-
- setOutputCol(String) - Method in class org.apache.spark.ml.feature.Bucketizer
-
- setOutputCol(String) - Method in class org.apache.spark.ml.feature.HashingTF
-
- setOutputCol(String) - Method in class org.apache.spark.ml.feature.IDF
-
- setOutputCol(String) - Method in class org.apache.spark.ml.feature.IDFModel
-
- setOutputCol(String) - Method in class org.apache.spark.ml.feature.OneHotEncoder
-
- setOutputCol(String) - Method in class org.apache.spark.ml.feature.StandardScaler
-
- setOutputCol(String) - Method in class org.apache.spark.ml.feature.StandardScalerModel
-
- setOutputCol(String) - Method in class org.apache.spark.ml.feature.StringIndexer
-
- setOutputCol(String) - Method in class org.apache.spark.ml.feature.StringIndexerModel
-
- setOutputCol(String) - Method in class org.apache.spark.ml.feature.VectorAssembler
-
- setOutputCol(String) - Method in class org.apache.spark.ml.feature.VectorIndexer
-
- setOutputCol(String) - Method in class org.apache.spark.ml.feature.VectorIndexerModel
-
- setOutputCol(String) - Method in class org.apache.spark.ml.feature.Word2Vec
-
- setOutputCol(String) - Method in class org.apache.spark.ml.feature.Word2VecModel
-
- setOutputCol(String) - Method in class org.apache.spark.ml.UnaryTransformer
-
- setP(double) - Method in class org.apache.spark.ml.feature.Normalizer
-
- setParent(Estimator<M>) - Method in class org.apache.spark.ml.Model
-
Sets the parent of this model (Java API).
- setPattern(String) - Method in class org.apache.spark.ml.feature.RegexTokenizer
-
- setPredictionCol(String) - Method in class org.apache.spark.ml.evaluation.RegressionEvaluator
-
- setPredictionCol(String) - Method in class org.apache.spark.ml.PredictionModel
-
- setPredictionCol(String) - Method in class org.apache.spark.ml.Predictor
-
- setPredictionCol(String) - Method in class org.apache.spark.ml.recommendation.ALS
-
- setPredictionCol(String) - Method in class org.apache.spark.ml.recommendation.ALSModel
-
- setProductBlocks(int) - Method in class org.apache.spark.mllib.recommendation.ALS
-
- setPropertiesFile(String) - Method in class org.apache.spark.launcher.SparkLauncher
-
Set a custom properties file with Spark configuration for the application.
- setQuantileCalculationStrategy(Enumeration.Value) - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- setRandomCenters(int, double, long) - Method in class org.apache.spark.mllib.clustering.StreamingKMeans
-
Initialize random centers, requiring only the number of dimensions.
- setRank(int) - Method in class org.apache.spark.ml.recommendation.ALS
-
- setRank(int) - Method in class org.apache.spark.mllib.recommendation.ALS
-
- setRatingCol(String) - Method in class org.apache.spark.ml.recommendation.ALS
-
- setRawPredictionCol(String) - Method in class org.apache.spark.ml.classification.ClassificationModel
-
- setRawPredictionCol(String) - Method in class org.apache.spark.ml.classification.Classifier
-
- setRegParam(double) - Method in class org.apache.spark.ml.classification.LogisticRegression
-
Set the regularization parameter.
- setRegParam(double) - Method in class org.apache.spark.ml.recommendation.ALS
-
- setRegParam(double) - Method in class org.apache.spark.ml.regression.LinearRegression
-
Set the regularization parameter.
- setRegParam(double) - Method in class org.apache.spark.mllib.classification.StreamingLogisticRegressionWithSGD
-
Set the regularization parameter.
- setRegParam(double) - Method in class org.apache.spark.mllib.optimization.GradientDescent
-
Set the regularization parameter.
- setRegParam(double) - Method in class org.apache.spark.mllib.optimization.LBFGS
-
Set the regularization parameter.
- setRest(long, int, VD, ED) - Method in class org.apache.spark.graphx.impl.AggregatingEdgeContext
-
- setRuns(int) - Method in class org.apache.spark.mllib.clustering.KMeans
-
:: Experimental ::
Set the number of runs of the algorithm to execute in parallel.
- setSample(RDD<Object>) - Method in class org.apache.spark.mllib.stat.KernelDensity
-
Sets the sample to use for density estimation.
- setSample(JavaRDD<Double>) - Method in class org.apache.spark.mllib.stat.KernelDensity
-
Sets the sample to use for density estimation (for Java users).
- setScalingVec(Vector) - Method in class org.apache.spark.ml.feature.ElementwiseProduct
-
- setScoreCol(String) - Method in class org.apache.spark.ml.evaluation.BinaryClassificationEvaluator
-
- setSeed(long) - Method in class org.apache.spark.ml.classification.GBTClassifier
-
- setSeed(long) - Method in class org.apache.spark.ml.classification.RandomForestClassifier
-
- setSeed(long) - Method in class org.apache.spark.ml.feature.Word2Vec
-
- setSeed(long) - Method in class org.apache.spark.ml.recommendation.ALS
-
- setSeed(long) - Method in class org.apache.spark.ml.regression.GBTRegressor
-
- setSeed(long) - Method in class org.apache.spark.ml.regression.RandomForestRegressor
-
- setSeed(long) - Method in class org.apache.spark.mllib.clustering.GaussianMixture
-
Set the random seed
- setSeed(long) - Method in class org.apache.spark.mllib.clustering.KMeans
-
Set the random seed for cluster initialization.
- setSeed(long) - Method in class org.apache.spark.mllib.clustering.LDA
-
Random seed
- setSeed(long) - Method in class org.apache.spark.mllib.feature.Word2Vec
-
- setSeed(long) - Method in class org.apache.spark.mllib.random.ExponentialGenerator
-
- setSeed(long) - Method in class org.apache.spark.mllib.random.GammaGenerator
-
- setSeed(long) - Method in class org.apache.spark.mllib.random.LogNormalGenerator
-
- setSeed(long) - Method in class org.apache.spark.mllib.random.PoissonGenerator
-
- setSeed(long) - Method in class org.apache.spark.mllib.random.StandardNormalGenerator
-
- setSeed(long) - Method in class org.apache.spark.mllib.random.UniformGenerator
-
- setSeed(long) - Method in class org.apache.spark.mllib.recommendation.ALS
-
- setSeed(long) - Method in class org.apache.spark.util.random.BernoulliCellSampler
-
- setSeed(long) - Method in class org.apache.spark.util.random.BernoulliSampler
-
- setSeed(long) - Method in class org.apache.spark.util.random.PoissonSampler
-
- setSeed(long) - Method in interface org.apache.spark.util.random.Pseudorandom
-
Set random seed.
- setSerializer(Serializer) - Method in class org.apache.spark.rdd.CoGroupedRDD
-
Set a serializer for this RDD's shuffle, or null to use the default (spark.serializer)
- setSerializer(Serializer) - Method in class org.apache.spark.rdd.ShuffledRDD
-
Set a serializer for this RDD's shuffle, or null to use the default (spark.serializer)
- setSparkHome(String) - Method in class org.apache.spark.launcher.SparkLauncher
-
Set a custom Spark installation location for the application.
- setSparkHome(String) - Method in class org.apache.spark.SparkConf
-
Set the location where Spark is installed on worker nodes.
- setSplits(double[]) - Method in class org.apache.spark.ml.feature.Bucketizer
-
- setSrcOnly(long, int, VD) - Method in class org.apache.spark.graphx.impl.AggregatingEdgeContext
-
- setStages(PipelineStage[]) - Method in class org.apache.spark.ml.Pipeline
-
- setStepSize(double) - Method in class org.apache.spark.ml.classification.GBTClassifier
-
- setStepSize(double) - Method in class org.apache.spark.ml.feature.Word2Vec
-
- setStepSize(double) - Method in class org.apache.spark.ml.regression.GBTRegressor
-
- setStepSize(double) - Method in class org.apache.spark.mllib.classification.StreamingLogisticRegressionWithSGD
-
Set the step size for gradient descent.
- setStepSize(double) - Method in class org.apache.spark.mllib.optimization.GradientDescent
-
Set the initial step size of SGD for the first step.
- setStepSize(double) - Method in class org.apache.spark.mllib.regression.StreamingLinearRegressionWithSGD
-
Set the step size for gradient descent.
- setSubsamplingRate(double) - Method in class org.apache.spark.ml.classification.GBTClassifier
-
- setSubsamplingRate(double) - Method in class org.apache.spark.ml.classification.RandomForestClassifier
-
- setSubsamplingRate(double) - Method in class org.apache.spark.ml.regression.GBTRegressor
-
- setSubsamplingRate(double) - Method in class org.apache.spark.ml.regression.RandomForestRegressor
-
- setSubsamplingRate(double) - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- setTau0(double) - Method in class org.apache.spark.mllib.clustering.OnlineLDAOptimizer
-
A (positive) learning parameter that downweights early iterations.
- setThreshold(double) - Method in class org.apache.spark.ml.classification.LogisticRegression
-
- setThreshold(double) - Method in class org.apache.spark.ml.classification.LogisticRegressionModel
-
- setThreshold(double) - Method in class org.apache.spark.ml.feature.Binarizer
-
- setThreshold(double) - Method in class org.apache.spark.mllib.classification.LogisticRegressionModel
-
:: Experimental ::
Sets the threshold that separates positive predictions from negative predictions
in Binary Logistic Regression.
- setThreshold(double) - Method in class org.apache.spark.mllib.classification.SVMModel
-
:: Experimental ::
Sets the threshold that separates positive predictions from negative predictions.
- setTol(double) - Method in class org.apache.spark.ml.classification.LogisticRegression
-
Set the convergence tolerance of iterations.
- setTol(double) - Method in class org.apache.spark.ml.regression.LinearRegression
-
Set the convergence tolerance of iterations.
- setTopicConcentration(double) - Method in class org.apache.spark.mllib.clustering.LDA
-
Concentration parameter (commonly named "beta" or "eta") for the prior placed on topics'
distributions over terms.
- setTreeStrategy(Strategy) - Method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
-
- setup() - Method in class org.apache.spark.streaming.kafka.KafkaTestUtils
-
Set up all embedded servers, including Zookeeper and Kafka brokers.
- setUpdater(Updater) - Method in class org.apache.spark.mllib.optimization.GradientDescent
-
Set the updater function to actually perform a gradient step in a given direction.
- setUpdater(Updater) - Method in class org.apache.spark.mllib.optimization.LBFGS
-
Set the updater function to actually perform a gradient step in a given direction.
- setupGroups(int) - Method in class org.apache.spark.rdd.PartitionCoalescer
-
Initializes targetLen partition groups and assigns a preferredLocation.
This uses the coupon collector estimate of how many preferredLocations it must rotate through
until it has seen most of the preferred locations (2 * n log(n)).
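As a rough illustration of the 2 * n log(n) figure quoted in the setupGroups entry, the sketch below computes that bound for a given number of distinct locations n. The class and method names are hypothetical, and the natural logarithm and the round-up are assumptions; this is not the code used by PartitionCoalescer.

```java
// Hypothetical helper, not part of Spark: evaluates the 2 * n * ln(n) bound.
class CouponCollectorEstimate {

    // Number of draws suggested by the bound for n distinct locations.
    // Assumes the natural logarithm; rounds up to a whole number of draws.
    static long expectedDraws(int n) {
        if (n <= 1) {
            return Math.max(n, 0); // nothing to rotate through for 0 or 1 locations
        }
        return (long) Math.ceil(2.0 * n * Math.log(n));
    }

    public static void main(String[] args) {
        for (int n : new int[] {2, 10, 100}) {
            System.out.println(n + " locations -> " + expectedDraws(n) + " draws");
        }
    }
}
```

The bound grows only slightly faster than linearly in n.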
- setUseNodeIdCache(boolean) - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- setUserBlocks(int) - Method in class org.apache.spark.mllib.recommendation.ALS
-
- setUserCol(String) - Method in class org.apache.spark.ml.recommendation.ALS
-
- setUserCol(String) - Method in class org.apache.spark.ml.recommendation.ALSModel
-
- setValidateData(boolean) - Method in class org.apache.spark.mllib.regression.GeneralizedLinearAlgorithm
-
Set whether the algorithm should validate data before training.
- setValidationTol(double) - Method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
-
- setValue(R) - Method in class org.apache.spark.Accumulable
-
Set the accumulator's value; only allowed on master
- setVectorSize(int) - Method in class org.apache.spark.ml.feature.Word2Vec
-
- setVectorSize(int) - Method in class org.apache.spark.mllib.feature.Word2Vec
-
- setVerbose(boolean) - Method in class org.apache.spark.launcher.SparkLauncher
-
Enables verbose reporting for SparkSubmit.
- setWithMean(boolean) - Method in class org.apache.spark.ml.feature.StandardScaler
-
- setWithMean(boolean) - Method in class org.apache.spark.mllib.feature.StandardScalerModel
-
- setWithStd(boolean) - Method in class org.apache.spark.ml.feature.StandardScaler
-
- setWithStd(boolean) - Method in class org.apache.spark.mllib.feature.StandardScalerModel
-
- shape() - Method in class org.apache.spark.mllib.random.GammaGenerator
-
- ShortestPaths - Class in org.apache.spark.graphx.lib
-
Computes shortest paths to the given set of landmark vertices, returning a graph where each
vertex attribute is a map containing the shortest-path distance to each reachable landmark.
- ShortestPaths() - Constructor for class org.apache.spark.graphx.lib.ShortestPaths
-
- ShortType - Static variable in class org.apache.spark.sql.types.DataTypes
-
Gets the ShortType object.
- ShortType - Class in org.apache.spark.sql.types
-
:: DeveloperApi ::
The data type representing Short values.
- shouldGoLeft(Vector) - Method in interface org.apache.spark.ml.tree.Split
-
Return true (split to left) or false (split to right)
- shouldOwn(Param<?>) - Method in interface org.apache.spark.ml.param.Params
-
Validates that the input param belongs to this instance.
- show(int) - Method in class org.apache.spark.sql.DataFrame
-
- show() - Method in class org.apache.spark.sql.DataFrame
-
Displays the top 20 rows of DataFrame in a tabular form.
- showBytesDistribution(String, Function2<TaskInfo, TaskMetrics, Option<Object>>, Seq<Tuple2<TaskInfo, TaskMetrics>>) - Static method in class org.apache.spark.scheduler.StatsReportListener
-
- showBytesDistribution(String, Option<org.apache.spark.util.Distribution>) - Static method in class org.apache.spark.scheduler.StatsReportListener
-
- showBytesDistribution(String, org.apache.spark.util.Distribution) - Static method in class org.apache.spark.scheduler.StatsReportListener
-
- showDistribution(String, org.apache.spark.util.Distribution, Function1<Object, String>) - Static method in class org.apache.spark.scheduler.StatsReportListener
-
- showDistribution(String, Option<org.apache.spark.util.Distribution>, Function1<Object, String>) - Static method in class org.apache.spark.scheduler.StatsReportListener
-
- showDistribution(String, Option<org.apache.spark.util.Distribution>, String) - Static method in class org.apache.spark.scheduler.StatsReportListener
-
- showDistribution(String, String, Function2<TaskInfo, TaskMetrics, Option<Object>>, Seq<Tuple2<TaskInfo, TaskMetrics>>) - Static method in class org.apache.spark.scheduler.StatsReportListener
-
- showMillisDistribution(String, Option<org.apache.spark.util.Distribution>) - Static method in class org.apache.spark.scheduler.StatsReportListener
-
- showMillisDistribution(String, Function2<TaskInfo, TaskMetrics, Option<Object>>, Seq<Tuple2<TaskInfo, TaskMetrics>>) - Static method in class org.apache.spark.scheduler.StatsReportListener
-
- showMillisDistribution(String, Function1<BatchInfo, Option<Object>>) - Method in class org.apache.spark.streaming.scheduler.StatsReportListener
-
- SHUFFLE() - Static method in class org.apache.spark.storage.BlockId
-
- SHUFFLE_DATA() - Static method in class org.apache.spark.storage.BlockId
-
- SHUFFLE_INDEX() - Static method in class org.apache.spark.storage.BlockId
-
- ShuffleBlockId - Class in org.apache.spark.storage
-
- ShuffleBlockId(int, int, int) - Constructor for class org.apache.spark.storage.ShuffleBlockId
-
- ShuffleDataBlockId - Class in org.apache.spark.storage
-
- ShuffleDataBlockId(int, int, int) - Constructor for class org.apache.spark.storage.ShuffleDataBlockId
-
- ShuffleDependency<K,V,C> - Class in org.apache.spark
-
:: DeveloperApi ::
Represents a dependency on the output of a shuffle stage.
- ShuffleDependency(RDD<? extends Product2<K, V>>, Partitioner, Option<Serializer>, Option<Ordering<K>>, Option<Aggregator<K, V, C>>, boolean) - Constructor for class org.apache.spark.ShuffleDependency
-
- ShuffledRDD<K,V,C> - Class in org.apache.spark.rdd
-
:: DeveloperApi ::
The resulting RDD from a shuffle (e.g.
- ShuffledRDD(RDD<? extends Product2<K, V>>, Partitioner) - Constructor for class org.apache.spark.rdd.ShuffledRDD
-
- shuffleHandle() - Method in class org.apache.spark.ShuffleDependency
-
- shuffleId() - Method in class org.apache.spark.CleanShuffle
-
- shuffleId() - Method in class org.apache.spark.FetchFailed
-
- shuffleId() - Method in class org.apache.spark.ShuffleDependency
-
- shuffleId() - Method in class org.apache.spark.storage.ShuffleBlockId
-
- shuffleId() - Method in class org.apache.spark.storage.ShuffleDataBlockId
-
- shuffleId() - Method in class org.apache.spark.storage.ShuffleIndexBlockId
-
- ShuffleIndexBlockId - Class in org.apache.spark.storage
-
- ShuffleIndexBlockId(int, int, int) - Constructor for class org.apache.spark.storage.ShuffleIndexBlockId
-
- shuffleManager() - Method in class org.apache.spark.SparkEnv
-
- shuffleMemoryManager() - Method in class org.apache.spark.SparkEnv
-
- shuffleRead() - Method in class org.apache.spark.status.api.v1.ExecutorStageSummary
-
- shuffleReadBytes() - Method in class org.apache.spark.status.api.v1.StageData
-
- ShuffleReadMetricDistributions - Class in org.apache.spark.status.api.v1
-
- ShuffleReadMetrics - Class in org.apache.spark.status.api.v1
-
- shuffleReadMetrics() - Method in class org.apache.spark.status.api.v1.TaskMetricDistributions
-
- shuffleReadMetrics() - Method in class org.apache.spark.status.api.v1.TaskMetrics
-
- shuffleReadRecords() - Method in class org.apache.spark.status.api.v1.StageData
-
- shuffleWrite() - Method in class org.apache.spark.status.api.v1.ExecutorStageSummary
-
- shuffleWriteBytes() - Method in class org.apache.spark.status.api.v1.StageData
-
- ShuffleWriteMetricDistributions - Class in org.apache.spark.status.api.v1
-
- ShuffleWriteMetrics - Class in org.apache.spark.status.api.v1
-
- shuffleWriteMetrics() - Method in class org.apache.spark.status.api.v1.TaskMetricDistributions
-
- shuffleWriteMetrics() - Method in class org.apache.spark.status.api.v1.TaskMetrics
-
- shuffleWriteRecords() - Method in class org.apache.spark.status.api.v1.StageData
-
- sigma() - Method in class org.apache.spark.mllib.stat.distribution.MultivariateGaussian
-
- sigmas() - Method in class org.apache.spark.mllib.clustering.ExpectationSum
-
- SignalLoggerHandler - Class in org.apache.spark.util
-
- SignalLoggerHandler(String, Logger) - Constructor for class org.apache.spark.util.SignalLoggerHandler
-
- signum(Column) - Static method in class org.apache.spark.sql.functions
-
Computes the signum of the given value.
- signum(String) - Static method in class org.apache.spark.sql.functions
-
Computes the signum of the given column.
- SimpleFutureAction<T> - Class in org.apache.spark
-
A FutureAction holding the result of an action that triggers a single job.
- simpleString() - Method in class org.apache.spark.sql.types.ArrayType
-
- simpleString() - Method in class org.apache.spark.sql.types.ByteType
-
- simpleString() - Method in class org.apache.spark.sql.types.DataType
-
Readable string representation for the type.
- simpleString() - Method in class org.apache.spark.sql.types.DecimalType
-
- simpleString() - Method in class org.apache.spark.sql.types.IntegerType
-
- simpleString() - Method in class org.apache.spark.sql.types.LongType
-
- simpleString() - Method in class org.apache.spark.sql.types.MapType
-
- simpleString() - Method in class org.apache.spark.sql.types.ShortType
-
- simpleString() - Method in class org.apache.spark.sql.types.StructType
-
- SimpleUpdater - Class in org.apache.spark.mllib.optimization
-
:: DeveloperApi ::
A simple updater for gradient descent *without* any regularization.
- SimpleUpdater() - Constructor for class org.apache.spark.mllib.optimization.SimpleUpdater
-
- sin(Column) - Static method in class org.apache.spark.sql.functions
-
Computes the sine of the given value.
- sin(String) - Static method in class org.apache.spark.sql.functions
-
Computes the sine of the given column.
- SingularValueDecomposition<UType,VType> - Class in org.apache.spark.mllib.linalg
-
:: Experimental ::
Represents singular value decomposition (SVD) factors.
- SingularValueDecomposition(UType, Vector, VType) - Constructor for class org.apache.spark.mllib.linalg.SingularValueDecomposition
-
- sinh(Column) - Static method in class org.apache.spark.sql.functions
-
Computes the hyperbolic sine of the given value.
- sinh(String) - Static method in class org.apache.spark.sql.functions
-
Computes the hyperbolic sine of the given column.
- size() - Method in class org.apache.spark.ml.attribute.AttributeGroup
-
Size of the attribute group.
- size() - Method in class org.apache.spark.ml.param.ParamMap
-
Number of param pairs in this map.
- size() - Method in class org.apache.spark.mllib.linalg.DenseVector
-
- size() - Method in class org.apache.spark.mllib.linalg.SparseVector
-
- size() - Method in interface org.apache.spark.mllib.linalg.Vector
-
Size of the vector.
- size() - Method in class org.apache.spark.rdd.PartitionGroup
-
- size() - Method in interface org.apache.spark.sql.Row
-
Number of elements in the Row.
- size() - Method in class org.apache.spark.storage.MemoryEntry
-
- SizeEstimator - Class in org.apache.spark.util
-
:: DeveloperApi ::
Estimates the sizes of Java objects (number of bytes of memory they occupy), for use in
memory-aware caches.
- SizeEstimator() - Constructor for class org.apache.spark.util.SizeEstimator
-
- sizeInBytes() - Method in class org.apache.spark.sql.sources.BaseRelation
-
Returns an estimated size of this relation in bytes.
- sketch(RDD<K>, int, ClassTag<K>) - Static method in class org.apache.spark.RangePartitioner
-
Sketches the input RDD via reservoir sampling on each partition.
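As an illustration of the reservoir sampling mentioned above, the following is a minimal single-machine sketch in plain Java. It is not Spark's implementation: the class name ReservoirSketch and the method sample are hypothetical, and it works on one iterator rather than on the partitions of an RDD.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Random;

// Hypothetical helper, for illustration only (not part of the Spark API).
final class ReservoirSketch {
    // Keeps a uniform random sample of at most k items from a stream of unknown length.
    static <T> List<T> sample(Iterator<T> input, int k, Random rng) {
        if (k < 0) {
            throw new IllegalArgumentException("k must be non-negative");
        }
        List<T> reservoir = new ArrayList<>(k);
        long seen = 0;
        while (input.hasNext()) {
            T item = input.next();
            seen++;
            if (reservoir.size() < k) {
                // The first k items fill the reservoir directly.
                reservoir.add(item);
            } else {
                // Pick a slot in [0, seen); the item survives only if the slot is below k,
                // which happens with probability k / seen.
                long slot = (long) (rng.nextDouble() * seen);
                if (slot < k) {
                    reservoir.set((int) slot, item);
                }
            }
        }
        return reservoir;
    }

    public static void main(String[] args) {
        List<Integer> input = new ArrayList<>();
        for (int i = 1; i <= 100; i++) {
            input.add(i);
        }
        // Prints 10 values drawn from 1..100; the exact values depend on the seed.
        System.out.println(sample(input.iterator(), 10, new Random(42L)));
    }
}
```

When the input holds fewer than k items, the sketch returns all of them in their original order.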
- skippedStages() - Method in class org.apache.spark.ui.jobs.JobProgressListener
-
- slack() - Method in class org.apache.spark.rdd.PartitionCoalescer
-
- slice(int, int) - Method in class org.apache.spark.sql.types.UTF8String
-
Return a substring of this,
- slice(Time, Time) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Return all the RDDs between 'fromDuration' and 'toDuration' (both included)
- slice(org.apache.spark.streaming.Interval) - Method in class org.apache.spark.streaming.dstream.DStream
-
Return all the RDDs defined by the Interval object (both end times included)
- slice(Time, Time) - Method in class org.apache.spark.streaming.dstream.DStream
-
Return all the RDDs between 'fromTime' and 'toTime' (both included)
- slideDuration() - Method in class org.apache.spark.streaming.dstream.DStream
-
Time interval after which the DStream generates an RDD
- slideDuration() - Method in class org.apache.spark.streaming.dstream.InputDStream
-
- sliding(int) - Method in class org.apache.spark.mllib.rdd.RDDFunctions
-
Returns an RDD formed by grouping items of its parent RDD into fixed-size blocks by passing a sliding
window over them.
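The sliding-window grouping described above can be sketched on an ordinary in-memory list. This is only a conceptual illustration in plain Java, not the distributed RDDFunctions implementation; the class name SlidingSketch is hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, for illustration only (not part of the Spark API).
final class SlidingSketch {
    // Returns every contiguous block of exactly windowSize items, in input order.
    static <T> List<List<T>> sliding(List<T> items, int windowSize) {
        if (windowSize < 1) {
            throw new IllegalArgumentException("windowSize must be positive");
        }
        List<List<T>> windows = new ArrayList<>();
        // Advance the window start by one item at a time until the window no longer fits.
        for (int start = 0; start + windowSize <= items.size(); start++) {
            windows.add(new ArrayList<>(items.subList(start, start + windowSize)));
        }
        return windows;
    }

    public static void main(String[] args) {
        // Prints [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
        System.out.println(sliding(Arrays.asList(1, 2, 3, 4, 5), 3));
    }
}
```

An input shorter than the window size yields no blocks in this sketch.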
- SnappyCompressionCodec - Class in org.apache.spark.io
-
- SnappyCompressionCodec(SparkConf) - Constructor for class org.apache.spark.io.SnappyCompressionCodec
-
- SnappyOutputStreamWrapper - Class in org.apache.spark.io
-
Wrapper over SnappyOutputStream which guards against write-after-close and double-close issues.
- SnappyOutputStreamWrapper(SnappyOutputStream) - Constructor for class org.apache.spark.io.SnappyOutputStreamWrapper
-
- socketStream(String, int, Function<InputStream, Iterable<T>>, StorageLevel) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Create an input stream from network source hostname:port.
- socketStream(String, int, Function1<InputStream, Iterator<T>>, StorageLevel, ClassTag<T>) - Method in class org.apache.spark.streaming.StreamingContext
-
Create an input stream from TCP source hostname:port.
- socketTextStream(String, int, StorageLevel) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Create an input stream from network source hostname:port.
- socketTextStream(String, int) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Create an input stream from network source hostname:port.
- socketTextStream(String, int, StorageLevel) - Method in class org.apache.spark.streaming.StreamingContext
-
Create an input stream from TCP source hostname:port.
- Sort() - Static method in class org.apache.spark.mllib.tree.configuration.QuantileStrategy
-
- sort(String, String...) - Method in class org.apache.spark.sql.DataFrame
-
Returns a new DataFrame sorted by the specified column, all in ascending order.
- sort(Column...) - Method in class org.apache.spark.sql.DataFrame
-
Returns a new DataFrame sorted by the given expressions.
- sort(String, Seq<String>) - Method in class org.apache.spark.sql.DataFrame
-
Returns a new DataFrame sorted by the specified column, all in ascending order.
- sort(Seq<Column>) - Method in class org.apache.spark.sql.DataFrame
-
Returns a new DataFrame sorted by the given expressions.
- sortBy(Function<T, S>, boolean, int) - Method in class org.apache.spark.api.java.JavaRDD
-
Return this RDD sorted by the given key function.
- sortBy(Function1<T, K>, boolean, int, Ordering<K>, ClassTag<K>) - Method in class org.apache.spark.rdd.RDD
-
Return this RDD sorted by the given key function.
- sortByKey() - Method in class org.apache.spark.api.java.JavaPairRDD
-
Sort the RDD by key, so that each partition contains a sorted range of the elements in
ascending order.
- sortByKey(boolean) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Sort the RDD by key, so that each partition contains a sorted range of the elements.
- sortByKey(boolean, int) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Sort the RDD by key, so that each partition contains a sorted range of the elements.
- sortByKey(Comparator<K>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Sort the RDD by key, so that each partition contains a sorted range of the elements.
- sortByKey(Comparator<K>, boolean) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Sort the RDD by key, so that each partition contains a sorted range of the elements.
- sortByKey(Comparator<K>, boolean, int) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Sort the RDD by key, so that each partition contains a sorted range of the elements.
- sortByKey(boolean, int) - Method in class org.apache.spark.rdd.OrderedRDDFunctions
-
Sort the RDD by key, so that each partition contains a sorted range of the elements.
- SPARK_JOB_DESCRIPTION() - Static method in class org.apache.spark.SparkContext
-
- SPARK_JOB_GROUP_ID() - Static method in class org.apache.spark.SparkContext
-
- SPARK_JOB_INTERRUPT_ON_CANCEL() - Static method in class org.apache.spark.SparkContext
-
- SPARK_MASTER - Static variable in class org.apache.spark.launcher.SparkLauncher
-
The Spark master.
- SparkConf - Class in org.apache.spark
-
Configuration for a Spark application.
- SparkConf(boolean) - Constructor for class org.apache.spark.SparkConf
-
- SparkConf() - Constructor for class org.apache.spark.SparkConf
-
Create a SparkConf that loads defaults from system properties and the classpath
- sparkContext() - Method in class org.apache.spark.rdd.RDD
-
The SparkContext that created this RDD.
- SparkContext - Class in org.apache.spark
-
Main entry point for Spark functionality.
- SparkContext(SparkConf) - Constructor for class org.apache.spark.SparkContext
-
- SparkContext() - Constructor for class org.apache.spark.SparkContext
-
Create a SparkContext that loads settings from system properties (for instance, when
launching with ./bin/spark-submit).
- SparkContext(SparkConf, Map<String, Set<SplitInfo>>) - Constructor for class org.apache.spark.SparkContext
-
:: DeveloperApi ::
Alternative constructor for setting preferred locations where Spark will create executors.
- SparkContext(String, String, SparkConf) - Constructor for class org.apache.spark.SparkContext
-
Alternative constructor that allows setting common Spark properties directly
- SparkContext(String, String, String, Seq<String>, Map<String, String>, Map<String, Set<SplitInfo>>) - Constructor for class org.apache.spark.SparkContext
-
Alternative constructor that allows setting common Spark properties directly
- sparkContext() - Method in class org.apache.spark.sql.SQLContext
-
- sparkContext() - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
The underlying SparkContext
- sparkContext() - Method in class org.apache.spark.streaming.StreamingContext
-
Return the associated Spark context
- SparkContext.DoubleAccumulatorParam$ - Class in org.apache.spark
-
- SparkContext.DoubleAccumulatorParam$() - Constructor for class org.apache.spark.SparkContext.DoubleAccumulatorParam$
-
- SparkContext.FloatAccumulatorParam$ - Class in org.apache.spark
-
- SparkContext.FloatAccumulatorParam$() - Constructor for class org.apache.spark.SparkContext.FloatAccumulatorParam$
-
- SparkContext.IntAccumulatorParam$ - Class in org.apache.spark
-
- SparkContext.IntAccumulatorParam$() - Constructor for class org.apache.spark.SparkContext.IntAccumulatorParam$
-
- SparkContext.LongAccumulatorParam$ - Class in org.apache.spark
-
- SparkContext.LongAccumulatorParam$() - Constructor for class org.apache.spark.SparkContext.LongAccumulatorParam$
-
- SparkEnv - Class in org.apache.spark
-
:: DeveloperApi ::
Holds all the runtime environment objects for a running Spark instance (either master or worker),
including the serializer, Akka actor system, block manager, map output tracker, etc.
- SparkEnv(String, org.apache.spark.rpc.RpcEnv, Serializer, Serializer, CacheManager, MapOutputTracker, ShuffleManager, org.apache.spark.broadcast.BroadcastManager, BlockTransferService, org.apache.spark.storage.BlockManager, SecurityManager, HttpFileServer, String, org.apache.spark.metrics.MetricsSystem, ShuffleMemoryManager, ExecutorMemoryManager, org.apache.spark.scheduler.OutputCommitCoordinator, SparkConf) - Constructor for class org.apache.spark.SparkEnv
-
- SparkException - Exception in org.apache.spark
-
- SparkException(String, Throwable) - Constructor for exception org.apache.spark.SparkException
-
- SparkException(String) - Constructor for exception org.apache.spark.SparkException
-
- SparkFiles - Class in org.apache.spark
-
Resolves paths to files added through SparkContext.addFile().
- SparkFiles() - Constructor for class org.apache.spark.SparkFiles
-
- sparkFilesDir() - Method in class org.apache.spark.SparkEnv
-
- SparkFirehoseListener - Class in org.apache.spark
-
Class that allows users to receive all SparkListener events.
- SparkFirehoseListener() - Constructor for class org.apache.spark.SparkFirehoseListener
-
- SparkFlumeEvent - Class in org.apache.spark.streaming.flume
-
A wrapper class for AvroFlumeEvents with a custom serialization format.
- SparkFlumeEvent() - Constructor for class org.apache.spark.streaming.flume.SparkFlumeEvent
-
- sparkJobId() - Method in class org.apache.spark.streaming.ui.SparkJobIdWithUIData
-
- SparkJobIdWithUIData - Class in org.apache.spark.streaming.ui
-
- SparkJobIdWithUIData(int, Option<org.apache.spark.ui.jobs.UIData.JobUIData>) - Constructor for class org.apache.spark.streaming.ui.SparkJobIdWithUIData
-
- SparkJobInfo - Interface in org.apache.spark
-
Exposes information about Spark Jobs.
- SparkJobInfoImpl - Class in org.apache.spark
-
- SparkJobInfoImpl(int, int[], JobExecutionStatus) - Constructor for class org.apache.spark.SparkJobInfoImpl
-
- SparkLauncher - Class in org.apache.spark.launcher
-
Launcher for Spark applications.
- SparkLauncher() - Constructor for class org.apache.spark.launcher.SparkLauncher
-
- SparkLauncher(Map<String, String>) - Constructor for class org.apache.spark.launcher.SparkLauncher
-
Creates a launcher that will set the given environment variables in the child.
- SparkListener - Interface in org.apache.spark.scheduler
-
:: DeveloperApi ::
Interface for listening to events from the Spark scheduler.
- SparkListenerApplicationEnd - Class in org.apache.spark.scheduler
-
- SparkListenerApplicationEnd(long) - Constructor for class org.apache.spark.scheduler.SparkListenerApplicationEnd
-
- SparkListenerApplicationStart - Class in org.apache.spark.scheduler
-
- SparkListenerApplicationStart(String, Option<String>, long, String, Option<String>) - Constructor for class org.apache.spark.scheduler.SparkListenerApplicationStart
-
- SparkListenerBlockManagerAdded - Class in org.apache.spark.scheduler
-
- SparkListenerBlockManagerAdded(long, BlockManagerId, long) - Constructor for class org.apache.spark.scheduler.SparkListenerBlockManagerAdded
-
- SparkListenerBlockManagerRemoved - Class in org.apache.spark.scheduler
-
- SparkListenerBlockManagerRemoved(long, BlockManagerId) - Constructor for class org.apache.spark.scheduler.SparkListenerBlockManagerRemoved
-
- SparkListenerEnvironmentUpdate - Class in org.apache.spark.scheduler
-
- SparkListenerEnvironmentUpdate(Map<String, Seq<Tuple2<String, String>>>) - Constructor for class org.apache.spark.scheduler.SparkListenerEnvironmentUpdate
-
- SparkListenerEvent - Interface in org.apache.spark.scheduler
-
- SparkListenerExecutorAdded - Class in org.apache.spark.scheduler
-
- SparkListenerExecutorAdded(long, String, ExecutorInfo) - Constructor for class org.apache.spark.scheduler.SparkListenerExecutorAdded
-
- SparkListenerExecutorMetricsUpdate - Class in org.apache.spark.scheduler
-
Periodic updates from executors.
- SparkListenerExecutorMetricsUpdate(String, Seq<Tuple4<Object, Object, Object, TaskMetrics>>) - Constructor for class org.apache.spark.scheduler.SparkListenerExecutorMetricsUpdate
-
- SparkListenerExecutorRemoved - Class in org.apache.spark.scheduler
-
- SparkListenerExecutorRemoved(long, String, String) - Constructor for class org.apache.spark.scheduler.SparkListenerExecutorRemoved
-
- SparkListenerJobEnd - Class in org.apache.spark.scheduler
-
- SparkListenerJobEnd(int, long, JobResult) - Constructor for class org.apache.spark.scheduler.SparkListenerJobEnd
-
- SparkListenerJobStart - Class in org.apache.spark.scheduler
-
- SparkListenerJobStart(int, long, Seq<StageInfo>, Properties) - Constructor for class org.apache.spark.scheduler.SparkListenerJobStart
-
- SparkListenerStageCompleted - Class in org.apache.spark.scheduler
-
- SparkListenerStageCompleted(StageInfo) - Constructor for class org.apache.spark.scheduler.SparkListenerStageCompleted
-
- SparkListenerStageSubmitted - Class in org.apache.spark.scheduler
-
- SparkListenerStageSubmitted(StageInfo, Properties) - Constructor for class org.apache.spark.scheduler.SparkListenerStageSubmitted
-
- SparkListenerTaskEnd - Class in org.apache.spark.scheduler
-
- SparkListenerTaskEnd(int, int, String, TaskEndReason, TaskInfo, TaskMetrics) - Constructor for class org.apache.spark.scheduler.SparkListenerTaskEnd
-
- SparkListenerTaskGettingResult - Class in org.apache.spark.scheduler
-
- SparkListenerTaskGettingResult(TaskInfo) - Constructor for class org.apache.spark.scheduler.SparkListenerTaskGettingResult
-
- SparkListenerTaskStart - Class in org.apache.spark.scheduler
-
- SparkListenerTaskStart(int, int, TaskInfo) - Constructor for class org.apache.spark.scheduler.SparkListenerTaskStart
-
- SparkListenerUnpersistRDD - Class in org.apache.spark.scheduler
-
- SparkListenerUnpersistRDD(int) - Constructor for class org.apache.spark.scheduler.SparkListenerUnpersistRDD
-
- sparkPartitionId() - Static method in class org.apache.spark.sql.functions
-
Partition ID of the Spark task.
- sparkProperties() - Method in class org.apache.spark.ui.env.EnvironmentListener
-
- SparkShutdownHook - Class in org.apache.spark.util
-
- SparkShutdownHook(int, Function0<BoxedUnit>) - Constructor for class org.apache.spark.util.SparkShutdownHook
-
- SparkStageInfo - Interface in org.apache.spark
-
Exposes information about Spark Stages.
- SparkStageInfoImpl - Class in org.apache.spark
-
- SparkStageInfoImpl(int, int, long, String, int, int, int, int) - Constructor for class org.apache.spark.SparkStageInfoImpl
-
- SparkStatusTracker - Class in org.apache.spark
-
Low-level status reporting APIs for monitoring job and stage progress.
- sparkUser() - Method in class org.apache.spark.api.java.JavaSparkContext
-
- sparkUser() - Method in class org.apache.spark.scheduler.SparkListenerApplicationStart
-
- sparkUser() - Method in class org.apache.spark.SparkContext
-
- sparkUser() - Method in class org.apache.spark.status.api.v1.ApplicationAttemptInfo
-
- sparse(int, int, int[], int[], double[]) - Static method in class org.apache.spark.mllib.linalg.Matrices
-
Creates a column-major sparse matrix in Compressed Sparse Column (CSC) format.
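To make the Compressed Sparse Column layout concrete, here is a minimal plain-Java sketch of how an entry can be read from the three arrays. The names colPtrs, rowIndices and values, and the class CscSketch, are assumptions chosen for illustration; the index entry above lists only the argument types.

```java
// Hypothetical helper, for illustration only (not part of the Spark API).
final class CscSketch {
    // Looks up entry (row, col) of a matrix stored in CSC form.
    // colPtrs has one entry per column plus one: the non-zeros of column j sit at
    // positions colPtrs[j] .. colPtrs[j + 1] - 1 of rowIndices and values.
    static double get(int[] colPtrs, int[] rowIndices, double[] values, int row, int col) {
        for (int k = colPtrs[col]; k < colPtrs[col + 1]; k++) {
            if (rowIndices[k] == row) {
                return values[k];
            }
        }
        return 0.0; // entries that are not stored are zero
    }

    public static void main(String[] args) {
        // The 3 x 3 matrix
        //   1.0 0.0 4.0
        //   0.0 3.0 5.0
        //   2.0 0.0 6.0
        // stored column by column.
        int[] colPtrs = {0, 2, 3, 6};
        int[] rowIndices = {0, 2, 1, 0, 1, 2};
        double[] values = {1.0, 2.0, 3.0, 4.0, 5.0, 6.0};
        System.out.println(get(colPtrs, rowIndices, values, 0, 2)); // prints 4.0
        System.out.println(get(colPtrs, rowIndices, values, 1, 0)); // prints 0.0
    }
}
```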
- sparse(int, int[], double[]) - Static method in class org.apache.spark.mllib.linalg.Vectors
-
Creates a sparse vector providing its index array and value array.
- sparse(int, Seq<Tuple2<Object, Object>>) - Static method in class org.apache.spark.mllib.linalg.Vectors
-
Creates a sparse vector using unordered (index, value) pairs.
- sparse(int, Iterable<Tuple2<Integer, Double>>) - Static method in class org.apache.spark.mllib.linalg.Vectors
-
Creates a sparse vector using unordered (index, value) pairs in a Java friendly way.
- SparseMatrix - Class in org.apache.spark.mllib.linalg
-
Column-major sparse matrix.
- SparseMatrix(int, int, int[], int[], double[], boolean) - Constructor for class org.apache.spark.mllib.linalg.SparseMatrix
-
- SparseMatrix(int, int, int[], int[], double[]) - Constructor for class org.apache.spark.mllib.linalg.SparseMatrix
-
Column-major sparse matrix.
- SparseVector - Class in org.apache.spark.mllib.linalg
-
A sparse vector represented by an index array and a value array.
- SparseVector(int, int[], double[]) - Constructor for class org.apache.spark.mllib.linalg.SparseVector
-
- sparsity() - Method in class org.apache.spark.ml.attribute.NumericAttribute
-
- spdiag(Vector) - Static method in class org.apache.spark.mllib.linalg.SparseMatrix
-
Generate a diagonal matrix in SparseMatrix format from the supplied values.
- SpecialLengths - Class in org.apache.spark.api.r
-
- SpecialLengths() - Constructor for class org.apache.spark.api.r.SpecialLengths
-
- speculative() - Method in class org.apache.spark.scheduler.TaskInfo
-
- speculative() - Method in class org.apache.spark.status.api.v1.TaskData
-
- speye(int) - Static method in class org.apache.spark.mllib.linalg.Matrices
-
Generate a sparse Identity Matrix in Matrix format.
- speye(int) - Static method in class org.apache.spark.mllib.linalg.SparseMatrix
-
Generate an Identity Matrix in SparseMatrix format.
- split() - Method in class org.apache.spark.ml.tree.InternalNode
-
- Split - Interface in org.apache.spark.ml.tree
-
:: DeveloperApi ::
Interface for a "Split," which specifies a test made at a decision tree node
to choose the left or right path.
- split() - Method in class org.apache.spark.mllib.tree.model.Node
-
- Split - Class in org.apache.spark.mllib.tree.model
-
:: DeveloperApi ::
Split applied to a feature
param: feature feature index
param: threshold Threshold for continuous feature.
- Split(int, double, Enumeration.Value, List<Object>) - Constructor for class org.apache.spark.mllib.tree.model.Split
-
- SPLIT_INFO_REFLECTIONS() - Static method in class org.apache.spark.rdd.HadoopRDD
-
- splitIndex() - Method in class org.apache.spark.storage.RDDBlockId
-
- SplitInfo - Class in org.apache.spark.scheduler
-
- SplitInfo(Class<?>, String, String, long, Object) - Constructor for class org.apache.spark.scheduler.SplitInfo
-
- splits() - Method in interface org.apache.spark.api.java.JavaRDDLike
-
- splits() - Method in class org.apache.spark.ml.feature.Bucketizer
-
Parameter for mapping continuous features into buckets.
- sprand(int, int, double, Random) - Static method in class org.apache.spark.mllib.linalg.Matrices
-
Generate a SparseMatrix consisting of i.i.d. uniform random numbers.
- sprand(int, int, double, Random) - Static method in class org.apache.spark.mllib.linalg.SparseMatrix
-
Generate a SparseMatrix consisting of i.i.d. uniform random numbers.
- sprandn(int, int, double, Random) - Static method in class org.apache.spark.mllib.linalg.Matrices
-
Generate a SparseMatrix consisting of i.i.d. gaussian random numbers.
- sprandn(int, int, double, Random) - Static method in class org.apache.spark.mllib.linalg.SparseMatrix
-
Generate a SparseMatrix consisting of i.i.d. gaussian random numbers.
- sqdist(Vector, Vector) - Static method in class org.apache.spark.mllib.linalg.Vectors
-
Returns the squared distance between two Vectors.
- sql(String) - Method in class org.apache.spark.sql.SQLContext
-
- sqlContext() - Method in class org.apache.spark.sql.DataFrame
-
- sqlContext() - Method in class org.apache.spark.sql.sources.BaseRelation
-
- SQLContext - Class in org.apache.spark.sql
-
The entry point for working with structured data (rows and columns) in Spark.
- SQLContext(SparkContext) - Constructor for class org.apache.spark.sql.SQLContext
-
- SQLContext(JavaSparkContext) - Constructor for class org.apache.spark.sql.SQLContext
-
- SQLContext.implicits$ - Class in org.apache.spark.sql
-
:: Experimental ::
(Scala-specific) Implicit methods available in Scala for converting
common Scala objects into DataFrames.
- SQLContext.implicits$() - Constructor for class org.apache.spark.sql.SQLContext.implicits$
-
- SQLContext.implicits$.StringToColumn - Class in org.apache.spark.sql
-
Converts $"col name" into a Column.
- SQLContext.implicits$.StringToColumn(StringContext) - Constructor for class org.apache.spark.sql.SQLContext.implicits$.StringToColumn
-
- sqlType() - Method in class org.apache.spark.sql.types.UserDefinedType
-
Underlying storage type for this UDT
- SQLUserDefinedType - Annotation Type in org.apache.spark.sql.types
-
:: DeveloperApi ::
A user-defined type which can be automatically recognized by a SQLContext and registered.
- sqrt(Column) - Static method in class org.apache.spark.sql.functions
-
Computes the square root of the specified float value.
- squaredDist(Vector) - Method in class org.apache.spark.util.Vector
-
- SquaredError - Class in org.apache.spark.mllib.tree.loss
-
:: DeveloperApi ::
Class for squared error loss calculation.
- SquaredError() - Constructor for class org.apache.spark.mllib.tree.loss.SquaredError
-
- SquaredL2Updater - Class in org.apache.spark.mllib.optimization
-
:: DeveloperApi ::
Updater for L2 regularized problems.
- SquaredL2Updater() - Constructor for class org.apache.spark.mllib.optimization.SquaredL2Updater
-
- Src - Static variable in class org.apache.spark.graphx.TripletFields
-
Expose the source and edge fields but not the destination field.
- srcAttr() - Method in class org.apache.spark.graphx.EdgeContext
-
The vertex attribute of the edge's source vertex.
- srcAttr() - Method in class org.apache.spark.graphx.EdgeTriplet
-
The source vertex attribute
- srcAttr() - Method in class org.apache.spark.graphx.impl.AggregatingEdgeContext
-
- srcId() - Method in class org.apache.spark.graphx.Edge
-
- srcId() - Method in class org.apache.spark.graphx.EdgeContext
-
The vertex id of the edge's source vertex.
- srcId() - Method in class org.apache.spark.graphx.impl.AggregatingEdgeContext
-
- srdd() - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
- ssc() - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
- ssc() - Method in class org.apache.spark.streaming.dstream.DStream
-
- stackTrace() - Method in class org.apache.spark.ExceptionFailure
-
- stage() - Method in class org.apache.spark.scheduler.AskPermissionToCommitOutput
-
- stageAttemptId() - Method in class org.apache.spark.scheduler.SparkListenerTaskEnd
-
- stageAttemptId() - Method in class org.apache.spark.scheduler.SparkListenerTaskStart
-
- StageData - Class in org.apache.spark.status.api.v1
-
- stageFailed(String) - Method in class org.apache.spark.scheduler.StageInfo
-
- stageId() - Method in class org.apache.spark.scheduler.SparkListenerTaskEnd
-
- stageId() - Method in class org.apache.spark.scheduler.SparkListenerTaskStart
-
- stageId() - Method in class org.apache.spark.scheduler.StageInfo
-
- stageId() - Method in interface org.apache.spark.SparkStageInfo
-
- stageId() - Method in class org.apache.spark.SparkStageInfoImpl
-
- stageId() - Method in class org.apache.spark.status.api.v1.StageData
-
- stageId() - Method in class org.apache.spark.TaskContext
-
The ID of the stage that this task belongs to.
- stageIds() - Method in class org.apache.spark.scheduler.SparkListenerJobStart
-
- stageIds() - Method in interface org.apache.spark.SparkJobInfo
-
- stageIds() - Method in class org.apache.spark.SparkJobInfoImpl
-
- stageIds() - Method in class org.apache.spark.status.api.v1.JobData
-
- stageIdToActiveJobIds() - Method in class org.apache.spark.ui.jobs.JobProgressListener
-
- stageIdToData() - Method in class org.apache.spark.ui.jobs.JobProgressListener
-
- stageIdToInfo() - Method in class org.apache.spark.ui.jobs.JobProgressListener
-
- stageInfo() - Method in class org.apache.spark.scheduler.SparkListenerStageCompleted
-
- stageInfo() - Method in class org.apache.spark.scheduler.SparkListenerStageSubmitted
-
- StageInfo - Class in org.apache.spark.scheduler
-
:: DeveloperApi ::
Stores information about a stage to pass from the scheduler to SparkListeners.
- StageInfo(int, int, String, int, Seq<RDDInfo>, Seq<Object>, String) - Constructor for class org.apache.spark.scheduler.StageInfo
-
- stageInfos() - Method in class org.apache.spark.scheduler.SparkListenerJobStart
-
- stages() - Method in class org.apache.spark.ml.Pipeline
-
param for pipeline stages
- stages() - Method in class org.apache.spark.ml.PipelineModel
-
- StageStatus - Enum in org.apache.spark.status.api.v1
-
- StandardNormalGenerator - Class in org.apache.spark.mllib.random
-
:: DeveloperApi ::
Generates i.i.d. samples from the standard normal distribution.
- StandardNormalGenerator() - Constructor for class org.apache.spark.mllib.random.StandardNormalGenerator
-
- StandardScaler - Class in org.apache.spark.ml.feature
-
:: Experimental ::
Standardizes features by removing the mean and scaling to unit variance using column summary
statistics on the samples in the training set.
- StandardScaler(String) - Constructor for class org.apache.spark.ml.feature.StandardScaler
-
- StandardScaler() - Constructor for class org.apache.spark.ml.feature.StandardScaler
-
- StandardScaler - Class in org.apache.spark.mllib.feature
-
:: Experimental ::
Standardizes features by removing the mean and scaling to unit std using column summary
statistics on the samples in the training set.
- StandardScaler(boolean, boolean) - Constructor for class org.apache.spark.mllib.feature.StandardScaler
-
- StandardScaler() - Constructor for class org.apache.spark.mllib.feature.StandardScaler
-
- StandardScalerModel - Class in org.apache.spark.ml.feature
-
- StandardScalerModel - Class in org.apache.spark.mllib.feature
-
:: Experimental ::
Represents a StandardScaler model that can transform vectors.
- StandardScalerModel(Vector, Vector, boolean, boolean) - Constructor for class org.apache.spark.mllib.feature.StandardScalerModel
-
- StandardScalerModel(Vector, Vector) - Constructor for class org.apache.spark.mllib.feature.StandardScalerModel
-
- StandardScalerModel(Vector) - Constructor for class org.apache.spark.mllib.feature.StandardScalerModel
-
- starGraph(SparkContext, int) - Static method in class org.apache.spark.graphx.util.GraphGenerators
-
Create a star graph with vertex 0 being the center.
- start() - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Start the execution of the streams.
- start() - Method in class org.apache.spark.streaming.dstream.ConstantInputDStream
-
- start() - Method in class org.apache.spark.streaming.dstream.InputDStream
-
Method called to start receiving data.
- start() - Method in class org.apache.spark.streaming.dstream.ReceiverInputDStream
-
- start() - Method in class org.apache.spark.streaming.StreamingContext
-
Start the execution of the streams.
- startIndexInLevel(int) - Static method in class org.apache.spark.mllib.tree.model.Node
-
Return the index of the first node in the given level.
- startPosition() - Method in exception org.apache.spark.sql.AnalysisException
-
- startsWith(Column) - Method in class org.apache.spark.sql.Column
-
String starts with.
- startsWith(String) - Method in class org.apache.spark.sql.Column
-
String starts with another string literal.
- startsWith(UTF8String) - Method in class org.apache.spark.sql.types.UTF8String
-
- startTime() - Method in class org.apache.spark.api.java.JavaSparkContext
-
- startTime() - Method in class org.apache.spark.SparkContext
-
- startTime() - Method in class org.apache.spark.status.api.v1.ApplicationAttemptInfo
-
- startTime() - Method in class org.apache.spark.ui.jobs.JobProgressListener
-
- stat() - Method in class org.apache.spark.sql.DataFrame
-
- StatCounter - Class in org.apache.spark.util
-
A class for tracking the statistics of a set of numbers (count, mean and variance) in a
numerically robust way.
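One common way to track count, mean and variance in a numerically robust, single-pass manner is Welford's online update. The sketch below shows that technique in plain Java; it is an assumption that this resembles what StatCounter does internally, and the class RunningStats, along with its choice of population variance, is hypothetical.

```java
// Hypothetical helper, for illustration only (not part of the Spark API).
final class RunningStats {
    private long count = 0;
    private double mean = 0.0;
    private double m2 = 0.0; // running sum of squared deviations from the mean

    // Folds one value into the running statistics (Welford's update).
    void add(double x) {
        count++;
        double delta = x - mean;
        mean += delta / count;
        m2 += delta * (x - mean); // uses the mean both before and after the update
    }

    long count() {
        return count;
    }

    double mean() {
        return mean;
    }

    // Population variance (divides by count); NaN when no values were added.
    double variance() {
        return count == 0 ? Double.NaN : m2 / count;
    }

    double stdev() {
        return Math.sqrt(variance());
    }

    public static void main(String[] args) {
        RunningStats stats = new RunningStats();
        for (double x : new double[] {2, 4, 4, 4, 5, 5, 7, 9}) {
            stats.add(x);
        }
        // For this input the mean is 5 and the population variance is 4.
        System.out.println(stats.count() + " " + stats.mean() + " " + stats.variance());
    }
}
```

Updating the mean incrementally avoids subtracting two large, nearly equal sums, which is where the naive sum-of-squares formula loses precision.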
- StatCounter(TraversableOnce<Object>) - Constructor for class org.apache.spark.util.StatCounter
-
- StatCounter() - Constructor for class org.apache.spark.util.StatCounter
-
Initialize the StatCounter with no values.
- state() - Method in class org.apache.spark.scheduler.local.StatusUpdate
-
- staticPageRank(int, double) - Method in class org.apache.spark.graphx.GraphOps
-
Run PageRank for a fixed number of iterations, returning a graph whose vertex attributes
contain the PageRank and whose edge attributes contain the normalized edge weight.
- staticPersonalizedPageRank(long, int, double) - Method in class org.apache.spark.graphx.GraphOps
-
Run Personalized PageRank for a fixed number of iterations, with all iterations
originating at the source node, returning a graph with vertex attributes
containing the PageRank and edge attributes the normalized edge weight.
- statistic() - Method in class org.apache.spark.mllib.stat.test.ChiSqTestResult
-
- statistic() - Method in interface org.apache.spark.mllib.stat.test.TestResult
-
Test statistic.
- Statistics - Class in org.apache.spark.mllib.stat
-
:: Experimental ::
API for statistical functions in MLlib.
- Statistics() - Constructor for class org.apache.spark.mllib.stat.Statistics
-
- Statistics - Class in org.apache.spark.streaming.receiver
-
:: DeveloperApi ::
Statistics for querying the supervisor about state of workers.
- Statistics(int, int, int, String) - Constructor for class org.apache.spark.streaming.receiver.Statistics
-
- stats() - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
Return a StatCounter object that captures the mean, variance and
count of the RDD's elements in one operation.
- stats() - Method in class org.apache.spark.mllib.tree.model.Node
-
- stats() - Method in class org.apache.spark.rdd.DoubleRDDFunctions
-
Return a StatCounter object that captures the mean, variance and
count of the RDD's elements in one operation.
- StatsReportListener - Class in org.apache.spark.scheduler
-
:: DeveloperApi ::
Simple SparkListener that logs a few summary statistics when each stage completes
- StatsReportListener() - Constructor for class org.apache.spark.scheduler.StatsReportListener
-
- StatsReportListener - Class in org.apache.spark.streaming.scheduler
-
:: DeveloperApi ::
A simple StreamingListener that logs summary statistics across Spark Streaming batches
param: numBatchInfos Number of last batches to consider for generating statistics (default: 10)
- StatsReportListener(int) - Constructor for class org.apache.spark.streaming.scheduler.StatsReportListener
-
- status() - Method in class org.apache.spark.scheduler.TaskInfo
-
- status() - Method in interface org.apache.spark.SparkJobInfo
-
- status() - Method in class org.apache.spark.SparkJobInfoImpl
-
- status() - Method in class org.apache.spark.status.api.v1.JobData
-
- status() - Method in class org.apache.spark.status.api.v1.StageData
-
- statusTracker() - Method in class org.apache.spark.api.java.JavaSparkContext
-
- statusTracker() - Method in class org.apache.spark.SparkContext
-
- StatusUpdate - Class in org.apache.spark.scheduler.local
-
- StatusUpdate(long, Enumeration.Value, ByteBuffer) - Constructor for class org.apache.spark.scheduler.local.StatusUpdate
-
- std() - Method in class org.apache.spark.ml.attribute.NumericAttribute
-
- std() - Method in class org.apache.spark.mllib.feature.StandardScalerModel
-
- std() - Method in class org.apache.spark.mllib.random.LogNormalGenerator
-
- stdev() - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
Compute the standard deviation of this RDD's elements.
- stdev() - Method in class org.apache.spark.rdd.DoubleRDDFunctions
-
Compute the standard deviation of this RDD's elements.
- stdev() - Method in class org.apache.spark.util.StatCounter
-
Return the standard deviation of the values.
- stop() - Method in class org.apache.spark.api.java.JavaSparkContext
-
Shut down the SparkContext.
- stop() - Method in interface org.apache.spark.broadcast.BroadcastFactory
-
- stop() - Method in class org.apache.spark.broadcast.HttpBroadcastFactory
-
- stop() - Method in class org.apache.spark.broadcast.TorrentBroadcastFactory
-
- stop() - Method in class org.apache.spark.SparkContext
-
- stop() - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Stop the execution of the streams.
- stop(boolean) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Stop the execution of the streams.
- stop(boolean, boolean) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Stop the execution of the streams.
- stop() - Method in class org.apache.spark.streaming.dstream.ConstantInputDStream
-
- stop() - Method in class org.apache.spark.streaming.dstream.InputDStream
-
Method called to stop receiving data.
- stop() - Method in class org.apache.spark.streaming.dstream.ReceiverInputDStream
-
- stop(String) - Method in class org.apache.spark.streaming.receiver.Receiver
-
Stop the receiver completely.
- stop(String, Throwable) - Method in class org.apache.spark.streaming.receiver.Receiver
-
Stop the receiver completely due to an exception
- stop(boolean) - Method in class org.apache.spark.streaming.StreamingContext
-
Stop the execution of the streams immediately (does not wait for all received data
to be processed).
- stop(boolean, boolean) - Method in class org.apache.spark.streaming.StreamingContext
-
Stop the execution of the streams, with the option of ensuring all received data
has been processed.
- StopCoordinator - Class in org.apache.spark.scheduler
-
- StopCoordinator() - Constructor for class org.apache.spark.scheduler.StopCoordinator
-
- StopExecutor - Class in org.apache.spark.scheduler.local
-
- StopExecutor() - Constructor for class org.apache.spark.scheduler.local.StopExecutor
-
- storageLevel() - Method in class org.apache.spark.status.api.v1.RDDPartitionInfo
-
- storageLevel() - Method in class org.apache.spark.status.api.v1.RDDStorageInfo
-
- storageLevel() - Method in class org.apache.spark.storage.BlockStatus
-
- storageLevel() - Method in class org.apache.spark.storage.RDDInfo
-
- StorageLevel - Class in org.apache.spark.storage
-
:: DeveloperApi ::
Flags for controlling the storage of an RDD.
- StorageLevel() - Constructor for class org.apache.spark.storage.StorageLevel
-
- storageLevel() - Method in class org.apache.spark.streaming.dstream.DStream
-
- storageLevel() - Method in class org.apache.spark.streaming.receiver.Receiver
-
- storageLevelCache() - Static method in class org.apache.spark.storage.StorageLevel
-
:: DeveloperApi ::
Read StorageLevel object from ObjectInput stream.
- StorageLevels - Class in org.apache.spark.api.java
-
Expose some commonly useful storage level constants.
- StorageLevels() - Constructor for class org.apache.spark.api.java.StorageLevels
-
- StorageListener - Class in org.apache.spark.ui.storage
-
:: DeveloperApi ::
A SparkListener that prepares information to be displayed on the BlockManagerUI.
- StorageListener(StorageStatusListener) - Constructor for class org.apache.spark.ui.storage.StorageListener
-
- StorageStatus - Class in org.apache.spark.storage
-
:: DeveloperApi ::
Storage information for each BlockManager.
- StorageStatus(BlockManagerId, long) - Constructor for class org.apache.spark.storage.StorageStatus
-
- StorageStatus(BlockManagerId, long, Map<BlockId, BlockStatus>) - Constructor for class org.apache.spark.storage.StorageStatus
-
Create a storage status with an initial set of blocks, leaving the source unmodified.
- storageStatusList() - Method in class org.apache.spark.storage.StorageStatusListener
-
- storageStatusList() - Method in class org.apache.spark.ui.exec.ExecutorsListener
-
- storageStatusList() - Method in class org.apache.spark.ui.storage.StorageListener
-
- StorageStatusListener - Class in org.apache.spark.storage
-
:: DeveloperApi ::
A SparkListener that maintains executor storage status.
- StorageStatusListener() - Constructor for class org.apache.spark.storage.StorageStatusListener
-
- store(Iterator<T>) - Method in interface org.apache.spark.streaming.receiver.ActorHelper
-
Store an iterator of received data as a data block into Spark's memory.
- store(ByteBuffer) - Method in interface org.apache.spark.streaming.receiver.ActorHelper
-
Store the bytes of received data as a data block into Spark's memory.
- store(T) - Method in interface org.apache.spark.streaming.receiver.ActorHelper
-
Store a single item of received data to Spark's memory.
- store(T) - Method in class org.apache.spark.streaming.receiver.Receiver
-
Store a single item of received data to Spark's memory.
- store(ArrayBuffer<T>) - Method in class org.apache.spark.streaming.receiver.Receiver
-
Store an ArrayBuffer of received data as a data block into Spark's memory.
- store(ArrayBuffer<T>, Object) - Method in class org.apache.spark.streaming.receiver.Receiver
-
Store an ArrayBuffer of received data as a data block into Spark's memory.
- store(Iterator<T>) - Method in class org.apache.spark.streaming.receiver.Receiver
-
Store an iterator of received data as a data block into Spark's memory.
- store(Iterator<T>, Object) - Method in class org.apache.spark.streaming.receiver.Receiver
-
Store an iterator of received data as a data block into Spark's memory.
- store(Iterator<T>) - Method in class org.apache.spark.streaming.receiver.Receiver
-
Store an iterator of received data as a data block into Spark's memory.
- store(Iterator<T>, Object) - Method in class org.apache.spark.streaming.receiver.Receiver
-
Store an iterator of received data as a data block into Spark's memory.
- store(ByteBuffer) - Method in class org.apache.spark.streaming.receiver.Receiver
-
Store the bytes of received data as a data block into Spark's memory.
- store(ByteBuffer, Object) - Method in class org.apache.spark.streaming.receiver.Receiver
-
Store the bytes of received data as a data block into Spark's memory.
- Strategy - Class in org.apache.spark.mllib.tree.configuration
-
:: Experimental ::
Stores all the configuration options for tree construction
param: algo Learning goal.
- Strategy(Enumeration.Value, Impurity, int, int, int, Enumeration.Value, Map<Object, Object>, int, double, int, double, boolean, int) - Constructor for class org.apache.spark.mllib.tree.configuration.Strategy
-
- Strategy(Enumeration.Value, Impurity, int, int, int, Map<Integer, Integer>) - Constructor for class org.apache.spark.mllib.tree.configuration.Strategy
-
- STREAM() - Static method in class org.apache.spark.storage.BlockId
-
- StreamBlockId - Class in org.apache.spark.storage
-
- StreamBlockId(int, long) - Constructor for class org.apache.spark.storage.StreamBlockId
-
- streamId() - Method in class org.apache.spark.storage.StreamBlockId
-
- streamId() - Method in class org.apache.spark.streaming.receiver.Receiver
-
Get the unique identifier of the receiver input stream that this
receiver is associated with.
- streamId() - Method in class org.apache.spark.streaming.scheduler.ReceiverInfo
-
- streamIdToNumRecords() - Method in class org.apache.spark.streaming.scheduler.BatchInfo
-
- StreamingContext - Class in org.apache.spark.streaming
-
Main entry point for Spark Streaming functionality.
- StreamingContext(SparkContext, Duration) - Constructor for class org.apache.spark.streaming.StreamingContext
-
Create a StreamingContext using an existing SparkContext.
- StreamingContext(SparkConf, Duration) - Constructor for class org.apache.spark.streaming.StreamingContext
-
Create a StreamingContext by providing the configuration necessary for a new SparkContext.
- StreamingContext(String, String, Duration, String, Seq<String>, Map<String, String>) - Constructor for class org.apache.spark.streaming.StreamingContext
-
Create a StreamingContext by providing the details necessary for creating a new SparkContext.
- StreamingContext(String, Configuration) - Constructor for class org.apache.spark.streaming.StreamingContext
-
Recreate a StreamingContext from a checkpoint file.
- StreamingContext(String) - Constructor for class org.apache.spark.streaming.StreamingContext
-
Recreate a StreamingContext from a checkpoint file.
- StreamingContext(String, SparkContext) - Constructor for class org.apache.spark.streaming.StreamingContext
-
Recreate a StreamingContext from a checkpoint file using an existing SparkContext.
- StreamingContextState - Enum in org.apache.spark.streaming
-
:: DeveloperApi ::
Represents the state of a StreamingContext.
- StreamingKMeans - Class in org.apache.spark.mllib.clustering
-
:: Experimental ::
- StreamingKMeans(int, double, String) - Constructor for class org.apache.spark.mllib.clustering.StreamingKMeans
-
- StreamingKMeans() - Constructor for class org.apache.spark.mllib.clustering.StreamingKMeans
-
- StreamingKMeansModel - Class in org.apache.spark.mllib.clustering
-
:: Experimental ::
- StreamingKMeansModel(Vector[], double[]) - Constructor for class org.apache.spark.mllib.clustering.StreamingKMeansModel
-
- StreamingLinearAlgorithm<M extends GeneralizedLinearModel,A extends GeneralizedLinearAlgorithm<M>> - Class in org.apache.spark.mllib.regression
-
:: DeveloperApi ::
StreamingLinearAlgorithm implements methods for continuously
training a generalized linear model on streaming data,
and using it for prediction on (possibly different) streaming data.
- StreamingLinearAlgorithm() - Constructor for class org.apache.spark.mllib.regression.StreamingLinearAlgorithm
-
- StreamingLinearRegressionWithSGD - Class in org.apache.spark.mllib.regression
-
:: Experimental ::
Train or predict a linear regression model on streaming data.
- StreamingLinearRegressionWithSGD() - Constructor for class org.apache.spark.mllib.regression.StreamingLinearRegressionWithSGD
-
Construct a StreamingLinearRegression object with default parameters:
{stepSize: 0.1, numIterations: 50, miniBatchFraction: 1.0}.
- StreamingListener - Interface in org.apache.spark.streaming.scheduler
-
:: DeveloperApi ::
A listener interface for receiving information about an ongoing streaming
computation.
- StreamingListenerBatchCompleted - Class in org.apache.spark.streaming.scheduler
-
- StreamingListenerBatchCompleted(BatchInfo) - Constructor for class org.apache.spark.streaming.scheduler.StreamingListenerBatchCompleted
-
- StreamingListenerBatchStarted - Class in org.apache.spark.streaming.scheduler
-
- StreamingListenerBatchStarted(BatchInfo) - Constructor for class org.apache.spark.streaming.scheduler.StreamingListenerBatchStarted
-
- StreamingListenerBatchSubmitted - Class in org.apache.spark.streaming.scheduler
-
- StreamingListenerBatchSubmitted(BatchInfo) - Constructor for class org.apache.spark.streaming.scheduler.StreamingListenerBatchSubmitted
-
- StreamingListenerEvent - Interface in org.apache.spark.streaming.scheduler
-
:: DeveloperApi ::
Base trait for events related to StreamingListener
- StreamingListenerReceiverError - Class in org.apache.spark.streaming.scheduler
-
- StreamingListenerReceiverError(ReceiverInfo) - Constructor for class org.apache.spark.streaming.scheduler.StreamingListenerReceiverError
-
- StreamingListenerReceiverStarted - Class in org.apache.spark.streaming.scheduler
-
- StreamingListenerReceiverStarted(ReceiverInfo) - Constructor for class org.apache.spark.streaming.scheduler.StreamingListenerReceiverStarted
-
- StreamingListenerReceiverStopped - Class in org.apache.spark.streaming.scheduler
-
- StreamingListenerReceiverStopped(ReceiverInfo) - Constructor for class org.apache.spark.streaming.scheduler.StreamingListenerReceiverStopped
-
- StreamingLogisticRegressionWithSGD - Class in org.apache.spark.mllib.classification
-
:: Experimental ::
Train or predict a logistic regression model on streaming data.
- StreamingLogisticRegressionWithSGD() - Constructor for class org.apache.spark.mllib.classification.StreamingLogisticRegressionWithSGD
-
Construct a StreamingLogisticRegression object with default parameters:
{stepSize: 0.1, numIterations: 50, miniBatchFraction: 1.0, regParam: 0.0}.
- string() - Method in class org.apache.spark.sql.ColumnName
-
Creates a new `StructField` of type string.
- StringArrayParam - Class in org.apache.spark.ml.param
-
:: DeveloperApi ::
Specialized version of `Param[Array[String]]` for Java.
- StringArrayParam(Params, String, String, Function1<String[], Object>) - Constructor for class org.apache.spark.ml.param.StringArrayParam
-
- StringArrayParam(Params, String, String) - Constructor for class org.apache.spark.ml.param.StringArrayParam
-
- StringContains - Class in org.apache.spark.sql.sources
-
A filter that evaluates to `true` iff the attribute evaluates to
a string that contains the string `value`.
- StringContains(String, String) - Constructor for class org.apache.spark.sql.sources.StringContains
-
- StringEndsWith - Class in org.apache.spark.sql.sources
-
A filter that evaluates to `true` iff the attribute evaluates to
a string that ends with `value`.
- StringEndsWith(String, String) - Constructor for class org.apache.spark.sql.sources.StringEndsWith
-
- StringIndexer - Class in org.apache.spark.ml.feature
-
:: Experimental ::
A label indexer that maps a string column of labels to an ML column of label indices.
- StringIndexer(String) - Constructor for class org.apache.spark.ml.feature.StringIndexer
-
- StringIndexer() - Constructor for class org.apache.spark.ml.feature.StringIndexer
-
- StringIndexerModel - Class in org.apache.spark.ml.feature
-
- stringRddToDataFrameHolder(RDD<String>) - Method in class org.apache.spark.sql.SQLContext.implicits$
-
- StringRRDD<T> - Class in org.apache.spark.api.r
-
An RDD that stores R objects as Array[String].
- StringRRDD(RDD<T>, byte[], String, byte[], String, Object[], ClassTag<T>) - Constructor for class org.apache.spark.api.r.StringRRDD
-
- StringStartsWith - Class in org.apache.spark.sql.sources
-
A filter that evaluates to `true` iff the attribute evaluates to
a string that starts with `value`.
- StringStartsWith(String, String) - Constructor for class org.apache.spark.sql.sources.StringStartsWith
-
- stringToText(String) - Static method in class org.apache.spark.SparkContext
-
- StringType - Static variable in class org.apache.spark.sql.types.DataTypes
-
Gets the StringType object.
- StringType - Class in org.apache.spark.sql.types
-
:: DeveloperApi ::
The data type representing `String` values.
- stringWritableConverter() - Static method in class org.apache.spark.SparkContext
-
- stronglyConnectedComponents(int) - Method in class org.apache.spark.graphx.GraphOps
-
Compute the strongly connected component (SCC) of each vertex and return a graph with the
vertex value containing the lowest vertex id in the SCC containing that vertex.
- StronglyConnectedComponents - Class in org.apache.spark.graphx.lib
-
Strongly connected components algorithm implementation.
- StronglyConnectedComponents() - Constructor for class org.apache.spark.graphx.lib.StronglyConnectedComponents
-
- struct(Seq<StructField>) - Method in class org.apache.spark.sql.ColumnName
-
Creates a new `StructField` of type struct.
- struct(StructType) - Method in class org.apache.spark.sql.ColumnName
-
Creates a new `StructField` of type struct.
- struct(Column...) - Static method in class org.apache.spark.sql.functions
-
Creates a new struct column.
- struct(Seq<Column>) - Static method in class org.apache.spark.sql.functions
-
Creates a new struct column.
- struct(String, Seq<String>) - Static method in class org.apache.spark.sql.functions
-
Creates a new struct column that composes multiple input columns.
- StructField - Class in org.apache.spark.sql.types
-
A field inside a StructType.
- StructField(String, DataType, boolean, Metadata) - Constructor for class org.apache.spark.sql.types.StructField
-
- StructType - Class in org.apache.spark.sql.types
-
:: DeveloperApi ::
A
StructType
object can be constructed by
- StructType(StructField[]) - Constructor for class org.apache.spark.sql.types.StructType
-
- subgraph(Function1<EdgeTriplet<VD, ED>, Object>, Function2<Object, VD, Object>) - Method in class org.apache.spark.graphx.Graph
-
Restricts the graph to only the vertices and edges satisfying the predicates.
- subgraph(Function1<EdgeTriplet<VD, ED>, Object>, Function2<Object, VD, Object>) - Method in class org.apache.spark.graphx.impl.GraphImpl
-
- submissionTime() - Method in class org.apache.spark.scheduler.StageInfo
-
When this stage was submitted from the DAGScheduler to a TaskScheduler.
- submissionTime() - Method in interface org.apache.spark.SparkStageInfo
-
- submissionTime() - Method in class org.apache.spark.SparkStageInfoImpl
-
- submissionTime() - Method in class org.apache.spark.status.api.v1.JobData
-
- submissionTime() - Method in class org.apache.spark.streaming.scheduler.BatchInfo
-
- submitJob(RDD<T>, Function1<Iterator<T>, U>, Seq<Object>, Function2<Object, U, BoxedUnit>, Function0<R>) - Method in class org.apache.spark.SparkContext
-
:: Experimental ::
Submit a job for execution and return a FutureJob holding the result.
- subsamplingRate() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- subsetAccuracy() - Method in class org.apache.spark.mllib.evaluation.MultilabelMetrics
-
Returns subset accuracy (for equal sets of labels).
- substr(Column, Column) - Method in class org.apache.spark.sql.Column
-
An expression that returns a substring.
- substr(int, int) - Method in class org.apache.spark.sql.Column
-
An expression that returns a substring.
- subtract(JavaDoubleRDD) - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
Return an RDD with the elements from `this` that are not in `other`.
- subtract(JavaDoubleRDD, int) - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
Return an RDD with the elements from `this` that are not in `other`.
- subtract(JavaDoubleRDD, Partitioner) - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
Return an RDD with the elements from `this` that are not in `other`.
- subtract(JavaPairRDD<K, V>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Return an RDD with the elements from `this` that are not in `other`.
- subtract(JavaPairRDD<K, V>, int) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Return an RDD with the elements from `this` that are not in `other`.
- subtract(JavaPairRDD<K, V>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Return an RDD with the elements from `this` that are not in `other`.
- subtract(JavaRDD<T>) - Method in class org.apache.spark.api.java.JavaRDD
-
Return an RDD with the elements from `this` that are not in `other`.
- subtract(JavaRDD<T>, int) - Method in class org.apache.spark.api.java.JavaRDD
-
Return an RDD with the elements from `this` that are not in `other`.
- subtract(JavaRDD<T>, Partitioner) - Method in class org.apache.spark.api.java.JavaRDD
-
Return an RDD with the elements from `this` that are not in `other`.
- subtract(RDD<T>) - Method in class org.apache.spark.rdd.RDD
-
Return an RDD with the elements from `this` that are not in `other`.
- subtract(RDD<T>, int) - Method in class org.apache.spark.rdd.RDD
-
Return an RDD with the elements from `this` that are not in `other`.
- subtract(RDD<T>, Partitioner, Ordering<T>) - Method in class org.apache.spark.rdd.RDD
-
Return an RDD with the elements from `this` that are not in `other`.
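The `subtract` entries above all describe the same filtering rule. A minimal local sketch of that rule on plain Java lists follows, assuming that duplicates in `this` are kept when they do not occur in `other`. The class `SubtractSketch` is hypothetical and does not use Spark; a distributed implementation would also not guarantee the input order that this local version happens to preserve.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical local illustration of the "subtract" rule; not Spark code.
final class SubtractSketch {
    // Keep every element of `self` that does not occur in `other`.
    // Duplicates in `self` survive as long as their value is absent from `other`.
    static <T> List<T> subtract(List<T> self, List<T> other) {
        Set<T> exclude = new HashSet<>(other);
        List<T> result = new ArrayList<>();
        for (T item : self) {
            if (!exclude.contains(item)) {
                result.add(item);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // 2 and 4 occur in the second list, so only 1, 3, 3 remain.
        System.out.println(subtract(Arrays.asList(1, 2, 3, 3, 4), Arrays.asList(2, 4, 5)));
    }
}
```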
- subtract(Vector) - Method in class org.apache.spark.util.Vector
-
- subtractByKey(JavaPairRDD<K, W>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Return an RDD with the pairs from `this` whose keys are not in `other`.
- subtractByKey(JavaPairRDD<K, W>, int) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Return an RDD with the pairs from `this` whose keys are not in `other`.
- subtractByKey(JavaPairRDD<K, W>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Return an RDD with the pairs from `this` whose keys are not in `other`.
- subtractByKey(RDD<Tuple2<K, W>>, ClassTag<W>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Return an RDD with the pairs from `this` whose keys are not in `other`.
- subtractByKey(RDD<Tuple2<K, W>>, int, ClassTag<W>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Return an RDD with the pairs from `this` whose keys are not in `other`.
- subtractByKey(RDD<Tuple2<K, W>>, Partitioner, ClassTag<W>) - Method in class org.apache.spark.rdd.PairRDDFunctions
-
Return an RDD with the pairs from `this` whose keys are not in `other`.
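For the `subtractByKey` entries above, only the keys of `other` matter; its values are ignored. The following plain-Java sketch shows that rule on lists of key/value pairs. The class `SubtractByKeySketch` and the helper `pair` are hypothetical names introduced for illustration, and no Spark classes are involved.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical local illustration of the "subtractByKey" rule; not Spark code.
final class SubtractByKeySketch {
    // Small helper to build a key/value pair.
    static <K, V> Map.Entry<K, V> pair(K key, V value) {
        return new SimpleEntry<>(key, value);
    }

    // Keep each pair of `self` whose key does not appear as a key in `other`.
    static <K, V, W> List<Map.Entry<K, V>> subtractByKey(
            List<Map.Entry<K, V>> self, List<Map.Entry<K, W>> other) {
        Set<K> otherKeys = new HashSet<>();
        for (Map.Entry<K, W> entry : other) {
            otherKeys.add(entry.getKey());
        }
        List<Map.Entry<K, V>> result = new ArrayList<>();
        for (Map.Entry<K, V> entry : self) {
            if (!otherKeys.contains(entry.getKey())) {
                result.add(entry);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> left =
            Arrays.asList(pair("a", 1), pair("b", 2), pair("a", 3));
        List<Map.Entry<String, String>> right =
            Arrays.asList(pair("b", "ignored"));
        // Both "a" pairs survive; the "b" pair is dropped because "b" is a key on the right.
        System.out.println(subtractByKey(left, right));
    }
}
```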
- succeededTasks() - Method in class org.apache.spark.status.api.v1.ExecutorStageSummary
-
- Success - Class in org.apache.spark
-
:: DeveloperApi ::
Task succeeded.
- Success() - Constructor for class org.apache.spark.Success
-
- successful() - Method in class org.apache.spark.scheduler.TaskInfo
-
- sum() - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
Add up the elements in this RDD.
- sum() - Method in class org.apache.spark.rdd.DoubleRDDFunctions
-
Add up the elements in this RDD.
- sum(Column) - Static method in class org.apache.spark.sql.functions
-
Aggregate function: returns the sum of all values in the expression.
- sum(String) - Static method in class org.apache.spark.sql.functions
-
Aggregate function: returns the sum of all values in the given column.
- sum(String...) - Method in class org.apache.spark.sql.GroupedData
-
Compute the sum of each numeric column for each group.
- sum(Seq<String>) - Method in class org.apache.spark.sql.GroupedData
-
Compute the sum of each numeric column for each group.
- sum() - Method in class org.apache.spark.util.StatCounter
-
- sum() - Method in class org.apache.spark.util.Vector
-
- sumApprox(long, Double) - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
:: Experimental ::
Approximate operation to return the sum within a timeout.
- sumApprox(long) - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
:: Experimental ::
Approximate operation to return the sum within a timeout.
- sumApprox(long, double) - Method in class org.apache.spark.rdd.DoubleRDDFunctions
-
:: Experimental ::
Approximate operation to return the sum within a timeout.
- sumDistinct(Column) - Static method in class org.apache.spark.sql.functions
-
Aggregate function: returns the sum of distinct values in the expression.
- sumDistinct(String) - Static method in class org.apache.spark.sql.functions
-
Aggregate function: returns the sum of distinct values in the expression.
- supportedFeatureSubsetStrategies() - Static method in class org.apache.spark.ml.classification.RandomForestClassifier
-
Accessor for supported featureSubsetStrategy settings: auto, all, onethird, sqrt, log2
- supportedFeatureSubsetStrategies() - Static method in class org.apache.spark.ml.regression.RandomForestRegressor
-
Accessor for supported featureSubsetStrategy settings: auto, all, onethird, sqrt, log2
- supportedFeatureSubsetStrategies() - Static method in class org.apache.spark.mllib.tree.RandomForest
-
List of supported feature subset sampling strategies.
- supportedImpurities() - Static method in class org.apache.spark.ml.classification.DecisionTreeClassifier
-
Accessor for supported impurities: entropy, gini
- supportedImpurities() - Static method in class org.apache.spark.ml.classification.RandomForestClassifier
-
Accessor for supported impurity settings: entropy, gini
- supportedImpurities() - Static method in class org.apache.spark.ml.regression.DecisionTreeRegressor
-
Accessor for supported impurities: variance
- supportedImpurities() - Static method in class org.apache.spark.ml.regression.RandomForestRegressor
-
Accessor for supported impurity settings: variance
- supportedLossTypes() - Static method in class org.apache.spark.ml.classification.GBTClassifier
-
Accessor for supported loss settings: logistic
- supportedLossTypes() - Static method in class org.apache.spark.ml.regression.GBTRegressor
-
Accessor for supported loss settings: squared (L2), absolute (L1)
- supportedModelTypes() - Static method in class org.apache.spark.mllib.classification.NaiveBayes
-
- supportsRelocationOfSerializedObjects() - Method in class org.apache.spark.serializer.KryoSerializer
-
- SVDPlusPlus - Class in org.apache.spark.graphx.lib
-
Implementation of SVD++ algorithm.
- SVDPlusPlus() - Constructor for class org.apache.spark.graphx.lib.SVDPlusPlus
-
- SVDPlusPlus.Conf - Class in org.apache.spark.graphx.lib
-
Configuration parameters for SVDPlusPlus.
- SVDPlusPlus.Conf(int, int, double, double, double, double, double, double) - Constructor for class org.apache.spark.graphx.lib.SVDPlusPlus.Conf
-
- SVMDataGenerator - Class in org.apache.spark.mllib.util
-
:: DeveloperApi ::
Generate sample data used for SVM.
- SVMDataGenerator() - Constructor for class org.apache.spark.mllib.util.SVMDataGenerator
-
- SVMModel - Class in org.apache.spark.mllib.classification
-
Model for Support Vector Machines (SVMs).
- SVMModel(Vector, double) - Constructor for class org.apache.spark.mllib.classification.SVMModel
-
- SVMWithSGD - Class in org.apache.spark.mllib.classification
-
Train a Support Vector Machine (SVM) using Stochastic Gradient Descent.
- SVMWithSGD() - Constructor for class org.apache.spark.mllib.classification.SVMWithSGD
-
Construct an SVM object with default parameters: {stepSize: 1.0, numIterations: 100,
regParam: 0.01, miniBatchFraction: 1.0}.
- symbolToColumn(Symbol) - Method in class org.apache.spark.sql.SQLContext.implicits$
-
- systemProperties() - Method in class org.apache.spark.ui.env.EnvironmentListener
-
- t() - Method in class org.apache.spark.SerializableWritable
-
- table(String) - Method in class org.apache.spark.sql.DataFrameReader
-
- table(String) - Method in class org.apache.spark.sql.SQLContext
-
- tableNames() - Method in class org.apache.spark.sql.SQLContext
-
- tableNames(String) - Method in class org.apache.spark.sql.SQLContext
-
- tables() - Method in class org.apache.spark.sql.SQLContext
-
- tables(String) - Method in class org.apache.spark.sql.SQLContext
-
- TableScan - Interface in org.apache.spark.sql.sources
-
:: DeveloperApi ::
A BaseRelation that can produce all of its tuples as an RDD of Row objects.
- tachyonFolderName() - Method in class org.apache.spark.SparkContext
-
- tag() - Method in class org.apache.spark.sql.types.BinaryType
-
- tag() - Method in class org.apache.spark.sql.types.BooleanType
-
- tag() - Method in class org.apache.spark.sql.types.ByteType
-
- tag() - Method in class org.apache.spark.sql.types.DateType
-
- tag() - Method in class org.apache.spark.sql.types.DecimalType
-
- tag() - Method in class org.apache.spark.sql.types.DoubleType
-
- tag() - Method in class org.apache.spark.sql.types.FloatType
-
- tag() - Method in class org.apache.spark.sql.types.IntegerType
-
- tag() - Method in class org.apache.spark.sql.types.LongType
-
- tag() - Method in class org.apache.spark.sql.types.ShortType
-
- tag() - Method in class org.apache.spark.sql.types.StringType
-
- tag() - Method in class org.apache.spark.sql.types.TimestampType
-
- take(int) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Take the first num elements of the RDD.
- take(int) - Method in class org.apache.spark.rdd.RDD
-
Take the first num elements of the RDD.
- take(int) - Method in class org.apache.spark.sql.DataFrame
-
- takeAsync(int) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
The asynchronous version of the `take` action, which returns a
future for retrieving the first `num` elements of this RDD.
- takeAsync(int) - Method in class org.apache.spark.rdd.AsyncRDDActions
-
Returns a future for retrieving the first num elements of the RDD.
- takeOrdered(int, Comparator<T>) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Returns the first k (smallest) elements from this RDD as defined by
the specified Comparator[T] and maintains the order.
- takeOrdered(int) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Returns the first k (smallest) elements from this RDD using the
natural ordering for T while maintaining the order.
- takeOrdered(int, Ordering<T>) - Method in class org.apache.spark.rdd.RDD
-
Returns the first k (smallest) elements from this RDD as defined by the specified
implicit Ordering[T] and maintains the ordering.
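One common way to obtain the k smallest elements under a given ordering, as the `takeOrdered` entries above describe, is a bounded max-heap that never holds more than k items. The sketch below shows this locally in plain Java; the class `TakeOrderedSketch` is hypothetical, and nothing here should be read as Spark's actual distributed implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical local illustration of "first k smallest, in order"; not Spark code.
final class TakeOrderedSketch {
    static <T> List<T> takeOrdered(Iterable<T> items, int k, Comparator<T> cmp) {
        if (k <= 0) {
            return new ArrayList<>();
        }
        // Reversed comparator makes this a max-heap: the head is the largest
        // of the (at most k) smallest elements seen so far.
        PriorityQueue<T> heap = new PriorityQueue<>(k, cmp.reversed());
        for (T item : items) {
            if (heap.size() < k) {
                heap.add(item);
            } else if (cmp.compare(item, heap.peek()) < 0) {
                // The new item beats the current worst candidate, so swap it in.
                heap.poll();
                heap.add(item);
            }
        }
        // Heap iteration order is unspecified, so sort before returning.
        List<T> result = new ArrayList<>(heap);
        result.sort(cmp);
        return result;
    }

    public static void main(String[] args) {
        List<Integer> data = Arrays.asList(5, 1, 4, 2, 3);
        System.out.println(takeOrdered(data, 3, Comparator.<Integer>naturalOrder()));
    }
}
```

The heap keeps memory proportional to k rather than to the input size, which is the reason to prefer it over sorting everything when k is small.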
- takeSample(boolean, int) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
- takeSample(boolean, int, long) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
- takeSample(boolean, int, long) - Method in class org.apache.spark.rdd.RDD
-
Return a fixed-size sampled subset of this RDD in an array.
- tan(Column) - Static method in class org.apache.spark.sql.functions
-
Computes the tangent of the given value.
- tan(String) - Static method in class org.apache.spark.sql.functions
-
Computes the tangent of the given column.
- tanh(Column) - Static method in class org.apache.spark.sql.functions
-
Computes the hyperbolic tangent of the given value.
- tanh(String) - Static method in class org.apache.spark.sql.functions
-
Computes the hyperbolic tangent of the given column.
- targetStorageLevel() - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
-
- targetStorageLevel() - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
-
- task() - Method in class org.apache.spark.CleanupTaskWeakReference
-
- task() - Method in class org.apache.spark.scheduler.AskPermissionToCommitOutput
-
- taskAttempt() - Method in class org.apache.spark.scheduler.AskPermissionToCommitOutput
-
- taskAttemptId() - Method in class org.apache.spark.TaskContext
-
An ID that is unique to this task attempt (within the same SparkContext, no two task attempts
will share the same attempt ID).
- TaskCommitDenied - Class in org.apache.spark
-
:: DeveloperApi ::
Task requested the driver to commit, but was denied.
- TaskCommitDenied(int, int, int) - Constructor for class org.apache.spark.TaskCommitDenied
-
- TaskCompletionListener - Interface in org.apache.spark.util
-
:: DeveloperApi ::
- TaskContext - Class in org.apache.spark
-
Contextual information about a task which can be read or mutated during
execution.
- TaskContext() - Constructor for class org.apache.spark.TaskContext
-
- TaskData - Class in org.apache.spark.status.api.v1
-
- TaskEndReason - Interface in org.apache.spark
-
:: DeveloperApi ::
Various possible reasons why a task ended.
- TaskFailedReason - Interface in org.apache.spark
-
:: DeveloperApi ::
Various possible reasons why a task failed.
- taskId() - Method in class org.apache.spark.scheduler.local.KillTask
-
- taskId() - Method in class org.apache.spark.scheduler.local.StatusUpdate
-
- taskId() - Method in class org.apache.spark.scheduler.TaskInfo
-
- taskId() - Method in class org.apache.spark.status.api.v1.TaskData
-
- taskId() - Method in class org.apache.spark.storage.TaskResultBlockId
-
- taskInfo() - Method in class org.apache.spark.scheduler.SparkListenerTaskEnd
-
- taskInfo() - Method in class org.apache.spark.scheduler.SparkListenerTaskGettingResult
-
- taskInfo() - Method in class org.apache.spark.scheduler.SparkListenerTaskStart
-
- TaskInfo - Class in org.apache.spark.scheduler
-
:: DeveloperApi ::
Information about a running task attempt inside a TaskSet.
- TaskInfo(long, int, int, long, String, String, Enumeration.Value, boolean) - Constructor for class org.apache.spark.scheduler.TaskInfo
-
- TaskKilled - Class in org.apache.spark
-
:: DeveloperApi ::
Task was killed intentionally and needs to be rescheduled.
- TaskKilled() - Constructor for class org.apache.spark.TaskKilled
-
- TaskKilledException - Exception in org.apache.spark
-
:: DeveloperApi ::
Exception thrown when a task is explicitly killed (i.e., task failure is expected).
- TaskKilledException() - Constructor for exception org.apache.spark.TaskKilledException
-
- taskLocality() - Method in class org.apache.spark.scheduler.TaskInfo
-
- TaskLocality - Class in org.apache.spark.scheduler
-
- TaskLocality() - Constructor for class org.apache.spark.scheduler.TaskLocality
-
- taskLocality() - Method in class org.apache.spark.status.api.v1.TaskData
-
- TaskMetricDistributions - Class in org.apache.spark.status.api.v1
-
- taskMetrics() - Method in class org.apache.spark.scheduler.SparkListenerExecutorMetricsUpdate
-
- taskMetrics() - Method in class org.apache.spark.scheduler.SparkListenerTaskEnd
-
- taskMetrics() - Method in class org.apache.spark.status.api.v1.TaskData
-
- TaskMetrics - Class in org.apache.spark.status.api.v1
-
- taskMetrics() - Method in class org.apache.spark.TaskContext
-
:: DeveloperApi ::
- TASKRESULT() - Static method in class org.apache.spark.storage.BlockId
-
- TaskResultBlockId - Class in org.apache.spark.storage
-
- TaskResultBlockId(long) - Constructor for class org.apache.spark.storage.TaskResultBlockId
-
- TaskResultLost - Class in org.apache.spark
-
:: DeveloperApi ::
The task finished successfully, but the result was lost from the executor's block manager before
it was fetched.
- TaskResultLost() - Constructor for class org.apache.spark.TaskResultLost
-
- tasks() - Method in class org.apache.spark.status.api.v1.StageData
-
- TaskSorting - Enum in org.apache.spark.status.api.v1
-
- taskTime() - Method in class org.apache.spark.status.api.v1.ExecutorStageSummary
-
- taskType() - Method in class org.apache.spark.scheduler.SparkListenerTaskEnd
-
- teardown() - Method in class org.apache.spark.streaming.kafka.KafkaTestUtils
-
Tear down all the servers, including the Kafka broker and ZooKeeper
- TEST() - Static method in class org.apache.spark.storage.BlockId
-
- TestResult<DF> - Interface in org.apache.spark.mllib.stat.test
-
:: Experimental ::
Trait for hypothesis test results.
- TestSQLContext - Class in org.apache.spark.sql.test
-
- TestSQLContext() - Constructor for class org.apache.spark.sql.test.TestSQLContext
-
- textFile(String) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Read a text file from HDFS, a local file system (available on all nodes), or any
Hadoop-supported file system URI, and return it as an RDD of Strings.
- textFile(String, int) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Read a text file from HDFS, a local file system (available on all nodes), or any
Hadoop-supported file system URI, and return it as an RDD of Strings.
- textFile(String, int) - Method in class org.apache.spark.SparkContext
-
Read a text file from HDFS, a local file system (available on all nodes), or any
Hadoop-supported file system URI, and return it as an RDD of Strings.
- textFileStream(String) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Create an input stream that monitors a Hadoop-compatible filesystem
for new files and reads them as text files (using key as LongWritable, value
as Text and input format as TextInputFormat).
- textFileStream(String) - Method in class org.apache.spark.streaming.StreamingContext
-
Create an input stream that monitors a Hadoop-compatible filesystem
for new files and reads them as text files (using key as LongWritable, value
as Text and input format as TextInputFormat).
- theta() - Method in class org.apache.spark.mllib.classification.NaiveBayesModel
-
- threshold() - Method in class org.apache.spark.ml.feature.Binarizer
-
Param for threshold used to binarize continuous features.
- threshold() - Method in class org.apache.spark.ml.tree.ContinuousSplit
-
- threshold() - Method in class org.apache.spark.mllib.tree.model.Split
-
- thresholds() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
-
Returns thresholds in descending order.
- throwBalls() - Method in class org.apache.spark.rdd.PartitionCoalescer
-
- time() - Method in class org.apache.spark.scheduler.SparkListenerApplicationEnd
-
- time() - Method in class org.apache.spark.scheduler.SparkListenerApplicationStart
-
- time() - Method in class org.apache.spark.scheduler.SparkListenerBlockManagerAdded
-
- time() - Method in class org.apache.spark.scheduler.SparkListenerBlockManagerRemoved
-
- time() - Method in class org.apache.spark.scheduler.SparkListenerExecutorAdded
-
- time() - Method in class org.apache.spark.scheduler.SparkListenerExecutorRemoved
-
- time() - Method in class org.apache.spark.scheduler.SparkListenerJobEnd
-
- time() - Method in class org.apache.spark.scheduler.SparkListenerJobStart
-
- Time - Class in org.apache.spark.streaming
-
This is a simple class that represents an absolute instant of time.
- Time(long) - Constructor for class org.apache.spark.streaming.Time
-
- times(int) - Method in class org.apache.spark.streaming.Duration
-
- timestamp() - Method in class org.apache.spark.sql.ColumnName
-
Creates a new StructField of type timestamp.
- TimestampType - Static variable in class org.apache.spark.sql.types.DataTypes
-
Gets the TimestampType object.
- TimestampType - Class in org.apache.spark.sql.types
-
:: DeveloperApi ::
The data type representing java.sql.Timestamp values.
- TimeTrackingOutputStream - Class in org.apache.spark.storage
-
Intercepts write calls and tracks total time spent writing in order to update shuffle write
metrics.
- TimeTrackingOutputStream(ShuffleWriteMetrics, OutputStream) - Constructor for class org.apache.spark.storage.TimeTrackingOutputStream
-
- timeUnit() - Method in class org.apache.spark.mllib.clustering.StreamingKMeans
-
- TIMING_DATA() - Static method in class org.apache.spark.api.r.SpecialLengths
-
- to(Time, Duration) - Method in class org.apache.spark.streaming.Time
-
- toArray() - Method in interface org.apache.spark.api.java.JavaRDDLike
-
- toArray() - Method in class org.apache.spark.input.PortableDataStream
-
Read the file as a byte array
- toArray() - Method in class org.apache.spark.mllib.linalg.DenseVector
-
- toArray() - Method in interface org.apache.spark.mllib.linalg.Matrix
-
Converts to a dense array in column-major order.
- toArray() - Method in class org.apache.spark.mllib.linalg.SparseVector
-
- toArray() - Method in interface org.apache.spark.mllib.linalg.Vector
-
Converts the instance to a double array.
- toArray() - Method in class org.apache.spark.rdd.RDD
-
Return an array that contains all of the elements in this RDD.
- toBigDecimal() - Method in class org.apache.spark.sql.types.Decimal
-
- toBlockMatrix() - Method in class org.apache.spark.mllib.linalg.distributed.CoordinateMatrix
-
- toBlockMatrix(int, int) - Method in class org.apache.spark.mllib.linalg.distributed.CoordinateMatrix
-
- toBlockMatrix() - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix
-
- toBlockMatrix(int, int) - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix
-
- toBreeze() - Method in interface org.apache.spark.mllib.linalg.distributed.DistributedMatrix
-
Collects data and assembles a local dense breeze matrix (for test only).
- toBreeze() - Method in interface org.apache.spark.mllib.linalg.Matrix
-
Converts to a breeze matrix.
- toBreeze() - Method in interface org.apache.spark.mllib.linalg.Vector
-
Converts the instance to a breeze vector.
- toByte() - Method in class org.apache.spark.sql.types.Decimal
-
- toCoordinateMatrix() - Method in class org.apache.spark.mllib.linalg.distributed.BlockMatrix
-
Converts to CoordinateMatrix.
- toCoordinateMatrix() - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix
-
- toDebugString() - Method in interface org.apache.spark.api.java.JavaRDDLike
-
A description of this RDD and its recursive dependencies for debugging.
- toDebugString() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel
-
Print the full model to a string.
- toDebugString() - Method in class org.apache.spark.rdd.RDD
-
A description of this RDD and its recursive dependencies for debugging.
- toDebugString() - Method in class org.apache.spark.SparkConf
-
Return a string listing all keys and values, one per line.
- toDebugString() - Method in class org.apache.spark.sql.types.Decimal
-
- toDegrees(Column) - Static method in class org.apache.spark.sql.functions
-
Converts an angle measured in radians to an approximately equivalent angle measured in degrees.
- toDegrees(String) - Static method in class org.apache.spark.sql.functions
-
Converts an angle measured in radians to an approximately equivalent angle measured in degrees.
- toDense() - Method in class org.apache.spark.mllib.linalg.SparseMatrix
-
Generate a DenseMatrix from the given SparseMatrix.
- toDense() - Method in interface org.apache.spark.mllib.linalg.Vector
-
Converts this vector to a dense vector.
- toDF(String...) - Method in class org.apache.spark.sql.DataFrame
-
Returns a new DataFrame with columns renamed.
- toDF() - Method in class org.apache.spark.sql.DataFrame
-
Returns the object itself.
- toDF(Seq<String>) - Method in class org.apache.spark.sql.DataFrame
-
Returns a new DataFrame with columns renamed.
- toDouble() - Method in class org.apache.spark.sql.types.Decimal
-
- toEdgeTriplet() - Method in class org.apache.spark.graphx.EdgeContext
-
Converts the edge and vertex properties into an EdgeTriplet for convenience.
- toErrorString() - Method in class org.apache.spark.ExceptionFailure
-
- toErrorString() - Method in class org.apache.spark.ExecutorLostFailure
-
- toErrorString() - Method in class org.apache.spark.FetchFailed
-
- toErrorString() - Static method in class org.apache.spark.Resubmitted
-
- toErrorString() - Method in class org.apache.spark.TaskCommitDenied
-
- toErrorString() - Method in interface org.apache.spark.TaskFailedReason
-
Error message displayed in the web UI.
- toErrorString() - Static method in class org.apache.spark.TaskKilled
-
- toErrorString() - Static method in class org.apache.spark.TaskResultLost
-
- toErrorString() - Static method in class org.apache.spark.UnknownReason
-
- toFloat() - Method in class org.apache.spark.sql.types.Decimal
-
- toFormattedString() - Method in class org.apache.spark.streaming.Duration
-
- toIndexedRowMatrix() - Method in class org.apache.spark.mllib.linalg.distributed.BlockMatrix
-
Converts to IndexedRowMatrix.
- toIndexedRowMatrix() - Method in class org.apache.spark.mllib.linalg.distributed.CoordinateMatrix
-
- toInt() - Method in class org.apache.spark.sql.types.Decimal
-
- toInt() - Method in class org.apache.spark.storage.StorageLevel
-
- toJavaBigDecimal() - Method in class org.apache.spark.sql.types.Decimal
-
- toJavaDStream() - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Convert to a JavaDStream
- toJavaRDD() - Method in class org.apache.spark.rdd.RDD
-
- toJavaRDD() - Method in class org.apache.spark.sql.DataFrame
-
- toJSON() - Method in class org.apache.spark.sql.DataFrame
-
Returns the content of the DataFrame as an RDD of JSON strings.
- Tokenizer - Class in org.apache.spark.ml.feature
-
:: Experimental ::
A tokenizer that converts the input string to lowercase and then splits it on whitespace.
- Tokenizer(String) - Constructor for class org.apache.spark.ml.feature.Tokenizer
-
- Tokenizer() - Constructor for class org.apache.spark.ml.feature.Tokenizer
-
- toLocal() - Method in class org.apache.spark.mllib.clustering.DistributedLDAModel
-
Convert model to a local model.
- toLocalIterator() - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Return an iterator that contains all of the elements in this RDD.
- toLocalIterator() - Method in class org.apache.spark.rdd.RDD
-
Return an iterator that contains all of the elements in this RDD.
- toLocalMatrix() - Method in class org.apache.spark.mllib.linalg.distributed.BlockMatrix
-
Collect the distributed matrix on the driver as a `DenseMatrix`.
- toLong() - Method in class org.apache.spark.sql.types.Decimal
-
- toLowerCase() - Method in class org.apache.spark.sql.types.UTF8String
-
- toMetadata(Metadata) - Method in class org.apache.spark.ml.attribute.Attribute
-
Converts to ML metadata with some existing metadata.
- toMetadata() - Method in class org.apache.spark.ml.attribute.Attribute
-
Converts to ML metadata
- toMetadata(Metadata) - Method in class org.apache.spark.ml.attribute.AttributeGroup
-
Converts to ML metadata with some existing metadata.
- toMetadata() - Method in class org.apache.spark.ml.attribute.AttributeGroup
-
Converts to ML metadata
- toOld() - Method in interface org.apache.spark.ml.tree.Split
-
Convert to old Split format
- top(int, Comparator<T>) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Returns the top k (largest) elements from this RDD as defined by
the specified Comparator[T].
- top(int) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Returns the top k (largest) elements from this RDD using the
natural ordering for T.
- top(int, Ordering<T>) - Method in class org.apache.spark.rdd.RDD
-
- toPairDStreamFunctions(DStream<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>, Ordering<K>) - Static method in class org.apache.spark.streaming.dstream.DStream
-
- toPairDStreamFunctions(DStream<Tuple2<K, V>>, ClassTag<K>, ClassTag<V>, Ordering<K>) - Static method in class org.apache.spark.streaming.StreamingContext
-
Deprecated.
As of 1.3.0, replaced by implicit functions in the DStream companion object.
This is kept here only for backward compatibility.
- topByKey(int, Ordering<V>) - Method in class org.apache.spark.mllib.rdd.MLPairRDDFunctions
-
Returns the top k (largest) elements for each key from this RDD as defined by the specified
implicit Ordering[V].
- topic() - Method in class org.apache.spark.streaming.kafka.OffsetRange
-
Kafka topic name
- topicConcentration() - Method in class org.apache.spark.mllib.clustering.EMLDAOptimizer
-
- topicDistributions() - Method in class org.apache.spark.mllib.clustering.DistributedLDAModel
-
For each document in the training set, return the distribution over topics for that document
("theta_doc").
- topicsMatrix() - Method in class org.apache.spark.mllib.clustering.DistributedLDAModel
-
Inferred topics, where each topic is represented by a distribution over terms.
- topicsMatrix() - Method in class org.apache.spark.mllib.clustering.LDAModel
-
Inferred topics, where each topic is represented by a distribution over terms.
- topicsMatrix() - Method in class org.apache.spark.mllib.clustering.LocalLDAModel
-
- toPMML(StreamResult) - Method in interface org.apache.spark.mllib.pmml.PMMLExportable
-
Export the model to the stream result in PMML format
- toPMML(String) - Method in interface org.apache.spark.mllib.pmml.PMMLExportable
-
:: Experimental ::
Export the model to a local file in PMML format
- toPMML(SparkContext, String) - Method in interface org.apache.spark.mllib.pmml.PMMLExportable
-
:: Experimental ::
Export the model to a directory on a distributed file system in PMML format
- toPMML(OutputStream) - Method in interface org.apache.spark.mllib.pmml.PMMLExportable
-
:: Experimental ::
Export the model to the OutputStream in PMML format
- toPMML() - Method in interface org.apache.spark.mllib.pmml.PMMLExportable
-
:: Experimental ::
Export the model to a String in PMML format
- topNode() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel
-
- toRadians(Column) - Static method in class org.apache.spark.sql.functions
-
Converts an angle measured in degrees to an approximately equivalent angle measured in radians.
- toRadians(String) - Static method in class org.apache.spark.sql.functions
-
Converts an angle measured in degrees to an approximately equivalent angle measured in radians.
- toRDD(JavaDoubleRDD) - Static method in class org.apache.spark.api.java.JavaDoubleRDD
-
- toRDD(JavaPairRDD<K, V>) - Static method in class org.apache.spark.api.java.JavaPairRDD
-
- toRDD(JavaRDD<T>) - Static method in class org.apache.spark.api.java.JavaRDD
-
- toRowMatrix() - Method in class org.apache.spark.mllib.linalg.distributed.CoordinateMatrix
-
- toRowMatrix() - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix
-
- TorrentBroadcastFactory - Class in org.apache.spark.broadcast
-
A Broadcast implementation that uses a BitTorrent-like
protocol to do a distributed transfer of the broadcasted data to the executors.
- TorrentBroadcastFactory() - Constructor for class org.apache.spark.broadcast.TorrentBroadcastFactory
-
- toSchemaRDD() - Method in class org.apache.spark.sql.DataFrame
-
Deprecated.
As of 1.3.0, replaced by toDF().
- toSeq() - Method in class org.apache.spark.ml.param.ParamMap
-
Converts this param map to a sequence of param pairs.
- toSeq() - Method in interface org.apache.spark.sql.Row
-
Return a Scala Seq representing the row.
- toShort() - Method in class org.apache.spark.sql.types.Decimal
-
- toSparkContext(JavaSparkContext) - Static method in class org.apache.spark.api.java.JavaSparkContext
-
- toSparse() - Method in class org.apache.spark.mllib.linalg.DenseMatrix
-
Generate a SparseMatrix from the given DenseMatrix.
- toSparse() - Method in class org.apache.spark.mllib.linalg.DenseVector
-
- toSparse() - Method in class org.apache.spark.mllib.linalg.SparseVector
-
- toSparse() - Method in interface org.apache.spark.mllib.linalg.Vector
-
Converts this vector to a sparse vector with all explicit zeros removed.
- toSplitInfo(Class<?>, String, InputSplit) - Static method in class org.apache.spark.scheduler.SplitInfo
-
- toSplitInfo(Class<?>, String, InputSplit) - Static method in class org.apache.spark.scheduler.SplitInfo
-
- toString() - Method in class org.apache.spark.Accumulable
-
- toString() - Method in class org.apache.spark.api.java.JavaRDD
-
- toString() - Method in class org.apache.spark.broadcast.Broadcast
-
- toString() - Method in class org.apache.spark.graphx.EdgeDirection
-
- toString() - Method in class org.apache.spark.graphx.EdgeTriplet
-
- toString() - Method in class org.apache.spark.ml.attribute.Attribute
-
- toString() - Method in class org.apache.spark.ml.classification.DecisionTreeClassificationModel
-
- toString() - Method in class org.apache.spark.ml.classification.GBTClassificationModel
-
- toString() - Method in class org.apache.spark.ml.classification.RandomForestClassificationModel
-
- toString() - Method in class org.apache.spark.ml.param.Param
-
- toString() - Method in class org.apache.spark.ml.param.ParamMap
-
- toString() - Method in class org.apache.spark.ml.regression.DecisionTreeRegressionModel
-
- toString() - Method in class org.apache.spark.ml.regression.GBTRegressionModel
-
- toString() - Method in class org.apache.spark.ml.regression.RandomForestRegressionModel
-
- toString() - Method in class org.apache.spark.ml.tree.InternalNode
-
- toString() - Method in class org.apache.spark.ml.tree.LeafNode
-
- toString() - Method in class org.apache.spark.mllib.classification.LogisticRegressionModel
-
- toString() - Method in class org.apache.spark.mllib.classification.SVMModel
-
- toString() - Method in class org.apache.spark.mllib.linalg.DenseVector
-
- toString() - Method in interface org.apache.spark.mllib.linalg.Matrix
-
A human-readable representation of the matrix
- toString(int, int) - Method in interface org.apache.spark.mllib.linalg.Matrix
-
A human-readable representation of the matrix with maximum lines and width
- toString() - Method in class org.apache.spark.mllib.linalg.SparseVector
-
- toString() - Method in class org.apache.spark.mllib.regression.GeneralizedLinearModel
-
Print a summary of the model.
- toString() - Method in class org.apache.spark.mllib.regression.LabeledPoint
-
- toString() - Method in class org.apache.spark.mllib.stat.test.ChiSqTestResult
-
- toString() - Method in interface org.apache.spark.mllib.stat.test.TestResult
-
String explaining the hypothesis test result.
- toString() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel
-
Print a summary of the model.
- toString() - Method in class org.apache.spark.mllib.tree.model.InformationGainStats
-
- toString() - Method in class org.apache.spark.mllib.tree.model.Node
-
- toString() - Method in class org.apache.spark.mllib.tree.model.Predict
-
- toString() - Method in class org.apache.spark.mllib.tree.model.Split
-
- toString() - Method in class org.apache.spark.partial.BoundedDouble
-
- toString() - Method in class org.apache.spark.partial.PartialResult
-
- toString() - Method in class org.apache.spark.rdd.RDD
-
- toString() - Method in class org.apache.spark.scheduler.InputFormatInfo
-
- toString() - Method in class org.apache.spark.scheduler.SplitInfo
-
- toString() - Method in class org.apache.spark.SerializableWritable
-
- toString() - Method in class org.apache.spark.sql.Column
-
- toString() - Method in class org.apache.spark.sql.DataFrame
-
- toString() - Method in interface org.apache.spark.sql.Row
-
- toString() - Method in class org.apache.spark.sql.types.Decimal
-
- toString() - Method in class org.apache.spark.sql.types.DecimalType
-
- toString() - Method in class org.apache.spark.sql.types.Metadata
-
- toString() - Method in class org.apache.spark.sql.types.StructField
-
- toString() - Method in class org.apache.spark.sql.types.UTF8String
-
- toString() - Method in class org.apache.spark.storage.BlockId
-
- toString() - Method in class org.apache.spark.storage.BlockManagerId
-
- toString() - Method in class org.apache.spark.storage.RDDInfo
-
- toString() - Method in class org.apache.spark.storage.StorageLevel
-
- toString() - Method in class org.apache.spark.streaming.Duration
-
- toString() - Method in class org.apache.spark.streaming.kafka.Broker
-
- toString() - Method in class org.apache.spark.streaming.kafka.OffsetRange
-
- toString() - Method in class org.apache.spark.streaming.Time
-
- toString() - Method in class org.apache.spark.util.MutablePair
-
- toString() - Method in class org.apache.spark.util.StatCounter
-
- toString() - Method in class org.apache.spark.util.Vector
-
- toStructField(Metadata) - Method in class org.apache.spark.ml.attribute.Attribute
-
Converts to a StructField with some existing metadata.
- toStructField() - Method in class org.apache.spark.ml.attribute.Attribute
-
Converts to a StructField.
- toStructField(Metadata) - Method in class org.apache.spark.ml.attribute.AttributeGroup
-
Converts to a StructField with some existing metadata.
- toStructField() - Method in class org.apache.spark.ml.attribute.AttributeGroup
-
Converts to a StructField.
- totalBlocksFetched() - Method in class org.apache.spark.status.api.v1.ShuffleReadMetricDistributions
-
- totalBlocksFetched() - Method in class org.apache.spark.status.api.v1.ShuffleReadMetrics
-
- totalCores() - Method in class org.apache.spark.scheduler.cluster.ExecutorInfo
-
- totalDelay() - Method in class org.apache.spark.streaming.scheduler.BatchInfo
-
Time taken for all the jobs of this batch to finish processing from the time they
were submitted.
- totalDuration() - Method in class org.apache.spark.status.api.v1.ExecutorSummary
-
- totalInputBytes() - Method in class org.apache.spark.status.api.v1.ExecutorSummary
-
- totalShuffleRead() - Method in class org.apache.spark.status.api.v1.ExecutorSummary
-
- totalShuffleWrite() - Method in class org.apache.spark.status.api.v1.ExecutorSummary
-
- totalTasks() - Method in class org.apache.spark.status.api.v1.ExecutorSummary
-
- toTuple() - Method in class org.apache.spark.graphx.EdgeTriplet
-
- toUnscaledLong() - Method in class org.apache.spark.sql.types.Decimal
-
- toUpperCase() - Method in class org.apache.spark.sql.types.UTF8String
-
- train(RDD<ALS.Rating<ID>>, int, int, int, int, double, boolean, double, boolean, StorageLevel, StorageLevel, int, long, ClassTag<ID>, Ordering<ID>) - Static method in class org.apache.spark.ml.recommendation.ALS
-
:: DeveloperApi ::
Implementation of the ALS algorithm.
- train(RDD<LabeledPoint>, int, double, double, Vector) - Static method in class org.apache.spark.mllib.classification.LogisticRegressionWithSGD
-
Train a logistic regression model given an RDD of (label, features) pairs.
- train(RDD<LabeledPoint>, int, double, double) - Static method in class org.apache.spark.mllib.classification.LogisticRegressionWithSGD
-
Train a logistic regression model given an RDD of (label, features) pairs.
- train(RDD<LabeledPoint>, int, double) - Static method in class org.apache.spark.mllib.classification.LogisticRegressionWithSGD
-
Train a logistic regression model given an RDD of (label, features) pairs.
- train(RDD<LabeledPoint>, int) - Static method in class org.apache.spark.mllib.classification.LogisticRegressionWithSGD
-
Train a logistic regression model given an RDD of (label, features) pairs.
- train(RDD<LabeledPoint>) - Static method in class org.apache.spark.mllib.classification.NaiveBayes
-
Trains a Naive Bayes model given an RDD of (label, features)
pairs.
- train(RDD<LabeledPoint>, double) - Static method in class org.apache.spark.mllib.classification.NaiveBayes
-
Trains a Naive Bayes model given an RDD of (label, features)
pairs.
- train(RDD<LabeledPoint>, double, String) - Static method in class org.apache.spark.mllib.classification.NaiveBayes
-
Trains a Naive Bayes model given an RDD of (label, features)
pairs.
- train(RDD<LabeledPoint>, int, double, double, double, Vector) - Static method in class org.apache.spark.mllib.classification.SVMWithSGD
-
Train an SVM model given an RDD of (label, features) pairs.
- train(RDD<LabeledPoint>, int, double, double, double) - Static method in class org.apache.spark.mllib.classification.SVMWithSGD
-
Train an SVM model given an RDD of (label, features) pairs.
- train(RDD<LabeledPoint>, int, double, double) - Static method in class org.apache.spark.mllib.classification.SVMWithSGD
-
Train an SVM model given an RDD of (label, features) pairs.
- train(RDD<LabeledPoint>, int) - Static method in class org.apache.spark.mllib.classification.SVMWithSGD
-
Train an SVM model given an RDD of (label, features) pairs.
- train(RDD<Vector>, int, int, int, String, long) - Static method in class org.apache.spark.mllib.clustering.KMeans
-
Trains a k-means model using the given set of parameters.
- train(RDD<Vector>, int, int, int, String) - Static method in class org.apache.spark.mllib.clustering.KMeans
-
Trains a k-means model using the given set of parameters.
- train(RDD<Vector>, int, int) - Static method in class org.apache.spark.mllib.clustering.KMeans
-
Trains a k-means model using specified parameters and the default values for unspecified.
- train(RDD<Vector>, int, int, int) - Static method in class org.apache.spark.mllib.clustering.KMeans
-
Trains a k-means model using specified parameters and the default values for unspecified.
- train(RDD<Rating>, int, int, double, int, long) - Static method in class org.apache.spark.mllib.recommendation.ALS
-
- train(RDD<Rating>, int, int, double, int) - Static method in class org.apache.spark.mllib.recommendation.ALS
-
- train(RDD<Rating>, int, int, double) - Static method in class org.apache.spark.mllib.recommendation.ALS
-
Train a matrix factorization model given an RDD of ratings given by users to some products,
in the form of (userID, productID, rating) pairs.
- train(RDD<Rating>, int, int) - Static method in class org.apache.spark.mllib.recommendation.ALS
-
Train a matrix factorization model given an RDD of ratings given by users to some products,
in the form of (userID, productID, rating) pairs.
- train(RDD<LabeledPoint>, int, double, double, double, Vector) - Static method in class org.apache.spark.mllib.regression.LassoWithSGD
-
Train a Lasso model given an RDD of (label, features) pairs.
- train(RDD<LabeledPoint>, int, double, double, double) - Static method in class org.apache.spark.mllib.regression.LassoWithSGD
-
Train a Lasso model given an RDD of (label, features) pairs.
- train(RDD<LabeledPoint>, int, double, double) - Static method in class org.apache.spark.mllib.regression.LassoWithSGD
-
Train a Lasso model given an RDD of (label, features) pairs.
- train(RDD<LabeledPoint>, int) - Static method in class org.apache.spark.mllib.regression.LassoWithSGD
-
Train a Lasso model given an RDD of (label, features) pairs.
- train(RDD<LabeledPoint>, int, double, double, Vector) - Static method in class org.apache.spark.mllib.regression.LinearRegressionWithSGD
-
Train a LinearRegression model given an RDD of (label, features) pairs.
- train(RDD<LabeledPoint>, int, double, double) - Static method in class org.apache.spark.mllib.regression.LinearRegressionWithSGD
-
Train a LinearRegression model given an RDD of (label, features) pairs.
- train(RDD<LabeledPoint>, int, double) - Static method in class org.apache.spark.mllib.regression.LinearRegressionWithSGD
-
Train a LinearRegression model given an RDD of (label, features) pairs.
- train(RDD<LabeledPoint>, int) - Static method in class org.apache.spark.mllib.regression.LinearRegressionWithSGD
-
Train a LinearRegression model given an RDD of (label, features) pairs.
- train(RDD<LabeledPoint>, int, double, double, double, Vector) - Static method in class org.apache.spark.mllib.regression.RidgeRegressionWithSGD
-
Train a RidgeRegression model given an RDD of (label, features) pairs.
- train(RDD<LabeledPoint>, int, double, double, double) - Static method in class org.apache.spark.mllib.regression.RidgeRegressionWithSGD
-
Train a RidgeRegression model given an RDD of (label, features) pairs.
- train(RDD<LabeledPoint>, int, double, double) - Static method in class org.apache.spark.mllib.regression.RidgeRegressionWithSGD
-
Train a RidgeRegression model given an RDD of (label, features) pairs.
- train(RDD<LabeledPoint>, int) - Static method in class org.apache.spark.mllib.regression.RidgeRegressionWithSGD
-
Train a RidgeRegression model given an RDD of (label, features) pairs.
- train(RDD<LabeledPoint>, Strategy) - Static method in class org.apache.spark.mllib.tree.DecisionTree
-
Method to train a decision tree model.
- train(RDD<LabeledPoint>, Enumeration.Value, Impurity, int) - Static method in class org.apache.spark.mllib.tree.DecisionTree
-
Method to train a decision tree model.
- train(RDD<LabeledPoint>, Enumeration.Value, Impurity, int, int) - Static method in class org.apache.spark.mllib.tree.DecisionTree
-
Method to train a decision tree model.
- train(RDD<LabeledPoint>, Enumeration.Value, Impurity, int, int, int, Enumeration.Value, Map<Object, Object>) - Static method in class org.apache.spark.mllib.tree.DecisionTree
-
Method to train a decision tree model.
- train(RDD<LabeledPoint>, BoostingStrategy) - Static method in class org.apache.spark.mllib.tree.GradientBoostedTrees
-
Method to train a gradient boosting model.
- train(JavaRDD<LabeledPoint>, BoostingStrategy) - Static method in class org.apache.spark.mllib.tree.GradientBoostedTrees
-
Java-friendly API for GradientBoostedTrees$.train(org.apache.spark.rdd.RDD<org.apache.spark.mllib.regression.LabeledPoint>, org.apache.spark.mllib.tree.configuration.BoostingStrategy)
- trainClassifier(RDD<LabeledPoint>, int, Map<Object, Object>, String, int, int) - Static method in class org.apache.spark.mllib.tree.DecisionTree
-
Method to train a decision tree model for binary or multiclass classification.
- trainClassifier(JavaRDD<LabeledPoint>, int, Map<Integer, Integer>, String, int, int) - Static method in class org.apache.spark.mllib.tree.DecisionTree
-
Java-friendly API for DecisionTree$.trainClassifier(org.apache.spark.rdd.RDD<org.apache.spark.mllib.regression.LabeledPoint>, int, scala.collection.immutable.Map<java.lang.Object, java.lang.Object>, java.lang.String, int, int)
- trainClassifier(RDD<LabeledPoint>, Strategy, int, String, int) - Static method in class org.apache.spark.mllib.tree.RandomForest
-
Method to train a random forest model for binary or multiclass classification.
- trainClassifier(RDD<LabeledPoint>, int, Map<Object, Object>, int, String, String, int, int, int) - Static method in class org.apache.spark.mllib.tree.RandomForest
-
Method to train a random forest model for binary or multiclass classification.
- trainClassifier(JavaRDD<LabeledPoint>, int, Map<Integer, Integer>, int, String, String, int, int, int) - Static method in class org.apache.spark.mllib.tree.RandomForest
-
Java-friendly API for RandomForest$.trainClassifier(org.apache.spark.rdd.RDD<org.apache.spark.mllib.regression.LabeledPoint>, org.apache.spark.mllib.tree.configuration.Strategy, int, java.lang.String, int)
- trainImplicit(RDD<Rating>, int, int, double, int, double, long) - Static method in class org.apache.spark.mllib.recommendation.ALS
-
Train a matrix factorization model given an RDD of 'implicit preferences' given by users
to some products, in the form of (userID, productID, preference) pairs.
- trainImplicit(RDD<Rating>, int, int, double, int, double) - Static method in class org.apache.spark.mllib.recommendation.ALS
-
Train a matrix factorization model given an RDD of 'implicit preferences' given by users
to some products, in the form of (userID, productID, preference) pairs.
- trainImplicit(RDD<Rating>, int, int, double, double) - Static method in class org.apache.spark.mllib.recommendation.ALS
-
Train a matrix factorization model given an RDD of 'implicit preferences' given by users to
some products, in the form of (userID, productID, preference) pairs.
- trainImplicit(RDD<Rating>, int, int) - Static method in class org.apache.spark.mllib.recommendation.ALS
-
Train a matrix factorization model given an RDD of 'implicit preferences' ratings that
users gave to some products, in the form of (userID, productID, rating) pairs.
- trainOn(DStream<Vector>) - Method in class org.apache.spark.mllib.clustering.StreamingKMeans
-
Update the clustering model by training on batches of data from a DStream.
- trainOn(JavaDStream<Vector>) - Method in class org.apache.spark.mllib.clustering.StreamingKMeans
-
Java-friendly version of `trainOn`.
- trainOn(DStream<LabeledPoint>) - Method in class org.apache.spark.mllib.regression.StreamingLinearAlgorithm
-
Update the model by training on batches of data from a DStream.
- trainOn(JavaDStream<LabeledPoint>) - Method in class org.apache.spark.mllib.regression.StreamingLinearAlgorithm
-
Java-friendly version of `trainOn`.
- trainRegressor(RDD<LabeledPoint>, Map<Object, Object>, String, int, int) - Static method in class org.apache.spark.mllib.tree.DecisionTree
-
Method to train a decision tree model for regression.
- trainRegressor(JavaRDD<LabeledPoint>, Map<Integer, Integer>, String, int, int) - Static method in class org.apache.spark.mllib.tree.DecisionTree
-
Java-friendly API for DecisionTree$.trainRegressor(org.apache.spark.rdd.RDD<org.apache.spark.mllib.regression.LabeledPoint>, scala.collection.immutable.Map<java.lang.Object, java.lang.Object>, java.lang.String, int, int)
- trainRegressor(RDD<LabeledPoint>, Strategy, int, String, int) - Static method in class org.apache.spark.mllib.tree.RandomForest
-
Method to train a random forest model for regression.
- trainRegressor(RDD<LabeledPoint>, Map<Object, Object>, int, String, String, int, int, int) - Static method in class org.apache.spark.mllib.tree.RandomForest
-
Method to train a random forest model for regression.
- trainRegressor(JavaRDD<LabeledPoint>, Map<Integer, Integer>, int, String, String, int, int, int) - Static method in class org.apache.spark.mllib.tree.RandomForest
-
Java-friendly API for RandomForest$.trainRegressor(org.apache.spark.rdd.RDD<org.apache.spark.mllib.regression.LabeledPoint>, org.apache.spark.mllib.tree.configuration.Strategy, int, java.lang.String, int)
- transform(DataFrame) - Method in class org.apache.spark.ml.classification.ClassificationModel
-
Transforms dataset by reading from featuresCol, and appending new columns as specified by
parameters: predicted labels as predictionCol of type Double, and raw predictions
(confidences) as rawPredictionCol of type Vector.
- transform(DataFrame) - Method in class org.apache.spark.ml.classification.OneVsRestModel
-
- transform(DataFrame) - Method in class org.apache.spark.ml.feature.Binarizer
-
- transform(DataFrame) - Method in class org.apache.spark.ml.feature.Bucketizer
-
- transform(DataFrame) - Method in class org.apache.spark.ml.feature.HashingTF
-
- transform(DataFrame) - Method in class org.apache.spark.ml.feature.IDFModel
-
- transform(DataFrame) - Method in class org.apache.spark.ml.feature.OneHotEncoder
-
- transform(DataFrame) - Method in class org.apache.spark.ml.feature.StandardScalerModel
-
- transform(DataFrame) - Method in class org.apache.spark.ml.feature.StringIndexerModel
-
- transform(DataFrame) - Method in class org.apache.spark.ml.feature.VectorAssembler
-
- transform(DataFrame) - Method in class org.apache.spark.ml.feature.VectorIndexerModel
-
- transform(DataFrame) - Method in class org.apache.spark.ml.feature.Word2VecModel
-
Transform a sentence column to a vector column to represent the whole sentence.
- transform(DataFrame) - Method in class org.apache.spark.ml.PipelineModel
-
- transform(DataFrame) - Method in class org.apache.spark.ml.PredictionModel
-
Transforms dataset by reading from featuresCol, calling predict(), and storing
the predictions as a new column predictionCol.
- transform(DataFrame) - Method in class org.apache.spark.ml.recommendation.ALSModel
-
- transform(DataFrame, ParamPair<?>, ParamPair<?>...) - Method in class org.apache.spark.ml.Transformer
-
Transforms the dataset with optional parameters
- transform(DataFrame, ParamPair<?>, Seq<ParamPair<?>>) - Method in class org.apache.spark.ml.Transformer
-
Transforms the dataset with optional parameters
- transform(DataFrame, ParamMap) - Method in class org.apache.spark.ml.Transformer
-
Transforms the dataset with provided parameter map as additional parameters.
- transform(DataFrame) - Method in class org.apache.spark.ml.Transformer
-
Transforms the input dataset.
- transform(DataFrame) - Method in class org.apache.spark.ml.tuning.CrossValidatorModel
-
- transform(DataFrame) - Method in class org.apache.spark.ml.UnaryTransformer
-
- transform(Vector) - Method in class org.apache.spark.mllib.feature.ChiSqSelectorModel
-
Applies transformation on a vector.
- transform(Vector) - Method in class org.apache.spark.mllib.feature.ElementwiseProduct
-
Applies the Hadamard (element-wise) product transformation.
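As a rough illustration of the Hadamard product, the following plain-Java sketch multiplies two arrays component by component. It does not use Spark; the class name HadamardSketch and the array-based signature are assumptions made for this example only.

```java
import java.util.Arrays;

// Illustrative only: plain arrays stand in for MLlib vectors.
final class HadamardSketch {
    // Multiplies each component of `vector` by the matching component of `scaling`.
    static double[] hadamard(double[] scaling, double[] vector) {
        if (scaling.length != vector.length) {
            throw new IllegalArgumentException("vectors must have the same length");
        }
        double[] out = new double[vector.length];
        for (int i = 0; i < vector.length; i++) {
            out[i] = scaling[i] * vector[i];
        }
        return out;
    }

    public static void main(String[] args) {
        double[] result = hadamard(new double[] {0.0, 1.0, 2.0}, new double[] {1.0, 2.0, 3.0});
        System.out.println(Arrays.toString(result)); // prints [0.0, 2.0, 6.0]
    }
}
```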
- transform(Iterable<Object>) - Method in class org.apache.spark.mllib.feature.HashingTF
-
Transforms the input document into a sparse term frequency vector.
- transform(Iterable<?>) - Method in class org.apache.spark.mllib.feature.HashingTF
-
Transforms the input document into a sparse term frequency vector (Java version).
- transform(RDD<D>) - Method in class org.apache.spark.mllib.feature.HashingTF
-
Transforms the input document to term frequency vectors.
- transform(JavaRDD<D>) - Method in class org.apache.spark.mllib.feature.HashingTF
-
Transforms the input document to term frequency vectors (Java version).
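The idea behind a hashed term frequency vector can be sketched in plain Java as below. This is not the MLlib implementation: the class HashingTfSketch, the use of hashCode() with a non-negative modulus, and the map-based sparse representation are all assumptions chosen for illustration.

```java
import java.util.Map;
import java.util.TreeMap;

// Illustrative only: a sorted map of bucket index to count stands in for a sparse vector.
final class HashingTfSketch {
    // Maps a term to a bucket in [0, numFeatures) using a non-negative modulus of its hash code.
    static int indexOf(Object term, int numFeatures) {
        return Math.floorMod(term.hashCode(), numFeatures);
    }

    // Counts how often each bucket is hit; buckets that are never hit are not stored.
    static Map<Integer, Double> termFrequencies(Iterable<?> document, int numFeatures) {
        Map<Integer, Double> counts = new TreeMap<>();
        for (Object term : document) {
            counts.merge(indexOf(term, numFeatures), 1.0, Double::sum);
        }
        return counts;
    }
}
```

Two different terms can land in the same bucket, so the counts are approximate whenever numFeatures is small relative to the vocabulary.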
- transform(RDD<Vector>) - Method in class org.apache.spark.mllib.feature.IDFModel
-
Transforms term frequency (TF) vectors to TF-IDF vectors.
- transform(Vector) - Method in class org.apache.spark.mllib.feature.IDFModel
-
Transforms a term frequency (TF) vector to a TF-IDF vector
- transform(JavaRDD<Vector>) - Method in class org.apache.spark.mllib.feature.IDFModel
-
Transforms term frequency (TF) vectors to TF-IDF vectors (Java version).
- transform(Vector) - Method in class org.apache.spark.mllib.feature.Normalizer
-
Applies unit length normalization on a vector.
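A minimal plain-Java sketch of unit length normalization, assuming the Euclidean (L2) norm, is shown below. The class name UnitNormSketch and the choice to return zero vectors unchanged are assumptions of this example, not statements about the Normalizer class.

```java
// Illustrative only: scales a vector so that its Euclidean length becomes 1.
final class UnitNormSketch {
    static double[] normalize(double[] v) {
        double sumOfSquares = 0.0;
        for (double x : v) {
            sumOfSquares += x * x;
        }
        double norm = Math.sqrt(sumOfSquares);
        if (norm == 0.0) {
            return v.clone(); // a zero vector has no direction; return a copy unchanged
        }
        double[] out = new double[v.length];
        for (int i = 0; i < v.length; i++) {
            out[i] = v[i] / norm;
        }
        return out;
    }
}
```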
- transform(Vector) - Method in class org.apache.spark.mllib.feature.PCAModel
-
Transform a vector by computed Principal Components.
- transform(Vector) - Method in class org.apache.spark.mllib.feature.StandardScalerModel
-
Applies standardization transformation on a vector.
- transform(Vector) - Method in interface org.apache.spark.mllib.feature.VectorTransformer
-
Applies transformation on a vector.
- transform(RDD<Vector>) - Method in interface org.apache.spark.mllib.feature.VectorTransformer
-
Applies transformation on an RDD[Vector].
- transform(JavaRDD<Vector>) - Method in interface org.apache.spark.mllib.feature.VectorTransformer
-
Applies transformation on a JavaRDD[Vector].
- transform(String) - Method in class org.apache.spark.mllib.feature.Word2VecModel
-
Transforms a word to its vector representation
- transform(Function<R, JavaRDD<U>>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Return a new DStream in which each RDD is generated by applying a function
on each RDD of 'this' DStream.
- transform(Function2<R, Time, JavaRDD<U>>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Return a new DStream in which each RDD is generated by applying a function
on each RDD of 'this' DStream.
- transform(List<JavaDStream<?>>, Function2<List<JavaRDD<?>>, Time, JavaRDD<T>>) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Create a new DStream in which each RDD is generated by applying a function on RDDs of
the DStreams.
- transform(Function1<RDD<T>, RDD<U>>, ClassTag<U>) - Method in class org.apache.spark.streaming.dstream.DStream
-
Return a new DStream in which each RDD is generated by applying a function
on each RDD of 'this' DStream.
- transform(Function2<RDD<T>, Time, RDD<U>>, ClassTag<U>) - Method in class org.apache.spark.streaming.dstream.DStream
-
Return a new DStream in which each RDD is generated by applying a function
on each RDD of 'this' DStream.
- transform(Seq<DStream<?>>, Function2<Seq<RDD<?>>, Time, RDD<T>>, ClassTag<T>) - Method in class org.apache.spark.streaming.StreamingContext
-
Create a new DStream in which each RDD is generated by applying a function on RDDs of
the DStreams.
- Transformer - Class in org.apache.spark.ml
-
:: DeveloperApi ::
Abstract class for transformers that transform one dataset into another.
- Transformer() - Constructor for class org.apache.spark.ml.Transformer
-
- transformSchema(StructType) - Method in class org.apache.spark.ml.classification.OneVsRest
-
- transformSchema(StructType) - Method in class org.apache.spark.ml.classification.OneVsRestModel
-
- transformSchema(StructType) - Method in class org.apache.spark.ml.feature.Binarizer
-
- transformSchema(StructType) - Method in class org.apache.spark.ml.feature.Bucketizer
-
- transformSchema(StructType) - Method in class org.apache.spark.ml.feature.HashingTF
-
- transformSchema(StructType) - Method in class org.apache.spark.ml.feature.IDF
-
- transformSchema(StructType) - Method in class org.apache.spark.ml.feature.IDFModel
-
- transformSchema(StructType) - Method in class org.apache.spark.ml.feature.OneHotEncoder
-
- transformSchema(StructType) - Method in class org.apache.spark.ml.feature.StandardScaler
-
- transformSchema(StructType) - Method in class org.apache.spark.ml.feature.StandardScalerModel
-
- transformSchema(StructType) - Method in class org.apache.spark.ml.feature.StringIndexer
-
- transformSchema(StructType) - Method in class org.apache.spark.ml.feature.StringIndexerModel
-
- transformSchema(StructType) - Method in class org.apache.spark.ml.feature.VectorAssembler
-
- transformSchema(StructType) - Method in class org.apache.spark.ml.feature.VectorIndexer
-
- transformSchema(StructType) - Method in class org.apache.spark.ml.feature.VectorIndexerModel
-
- transformSchema(StructType) - Method in class org.apache.spark.ml.feature.Word2Vec
-
- transformSchema(StructType) - Method in class org.apache.spark.ml.feature.Word2VecModel
-
- transformSchema(StructType) - Method in class org.apache.spark.ml.Pipeline
-
- transformSchema(StructType) - Method in class org.apache.spark.ml.PipelineModel
-
- transformSchema(StructType) - Method in class org.apache.spark.ml.PipelineStage
-
:: DeveloperApi ::
- transformSchema(StructType) - Method in class org.apache.spark.ml.PredictionModel
-
- transformSchema(StructType) - Method in class org.apache.spark.ml.Predictor
-
- transformSchema(StructType) - Method in class org.apache.spark.ml.recommendation.ALS
-
- transformSchema(StructType) - Method in class org.apache.spark.ml.recommendation.ALSModel
-
- transformSchema(StructType) - Method in class org.apache.spark.ml.tuning.CrossValidator
-
- transformSchema(StructType) - Method in class org.apache.spark.ml.tuning.CrossValidatorModel
-
- transformSchema(StructType) - Method in class org.apache.spark.ml.UnaryTransformer
-
- transformToPair(Function<R, JavaPairRDD<K2, V2>>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Return a new DStream in which each RDD is generated by applying a function
on each RDD of 'this' DStream.
- transformToPair(Function2<R, Time, JavaPairRDD<K2, V2>>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Return a new DStream in which each RDD is generated by applying a function
on each RDD of 'this' DStream.
- transformToPair(List<JavaDStream<?>>, Function2<List<JavaRDD<?>>, Time, JavaPairRDD<K, V>>) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Create a new DStream in which each RDD is generated by applying a function on RDDs of
the DStreams.
- transformWith(JavaDStream<U>, Function3<R, JavaRDD<U>, Time, JavaRDD<W>>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Return a new DStream in which each RDD is generated by applying a function
on each RDD of 'this' DStream and 'other' DStream.
- transformWith(JavaPairDStream<K2, V2>, Function3<R, JavaPairRDD<K2, V2>, Time, JavaRDD<W>>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Return a new DStream in which each RDD is generated by applying a function
on each RDD of 'this' DStream and 'other' DStream.
- transformWith(DStream<U>, Function2<RDD<T>, RDD<U>, RDD<V>>, ClassTag<U>, ClassTag<V>) - Method in class org.apache.spark.streaming.dstream.DStream
-
Return a new DStream in which each RDD is generated by applying a function
on each RDD of 'this' DStream and 'other' DStream.
- transformWith(DStream<U>, Function3<RDD<T>, RDD<U>, Time, RDD<V>>, ClassTag<U>, ClassTag<V>) - Method in class org.apache.spark.streaming.dstream.DStream
-
Return a new DStream in which each RDD is generated by applying a function
on each RDD of 'this' DStream and 'other' DStream.
- transformWithToPair(JavaDStream<U>, Function3<R, JavaRDD<U>, Time, JavaPairRDD<K2, V2>>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Return a new DStream in which each RDD is generated by applying a function
on each RDD of 'this' DStream and 'other' DStream.
- transformWithToPair(JavaPairDStream<K2, V2>, Function3<R, JavaPairRDD<K2, V2>, Time, JavaPairRDD<K3, V3>>) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
-
Return a new DStream in which each RDD is generated by applying a function
on each RDD of 'this' DStream and 'other' DStream.
- transpose() - Method in class org.apache.spark.mllib.linalg.DenseMatrix
-
- transpose() - Method in class org.apache.spark.mllib.linalg.distributed.BlockMatrix
-
Transpose this BlockMatrix.
- transpose() - Method in class org.apache.spark.mllib.linalg.distributed.CoordinateMatrix
-
- transpose() - Method in interface org.apache.spark.mllib.linalg.Matrix
-
Transpose the Matrix.
- transpose() - Method in class org.apache.spark.mllib.linalg.SparseMatrix
-
- treeAggregate(U, Function2<U, T, U>, Function2<U, U, U>, int) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Aggregates the elements of this RDD in a multi-level tree pattern.
- treeAggregate(U, Function2<U, T, U>, Function2<U, U, U>) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
- treeAggregate(U, Function2<U, T, U>, Function2<U, U, U>, int, ClassTag<U>) - Method in class org.apache.spark.mllib.rdd.RDDFunctions
-
- treeAggregate(U, Function2<U, T, U>, Function2<U, U, U>, int, ClassTag<U>) - Method in class org.apache.spark.rdd.RDD
-
Aggregates the elements of this RDD in a multi-level tree pattern.
- treeReduce(Function2<T, T, T>, int) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
Reduces the elements of this RDD in a multi-level tree pattern.
- treeReduce(Function2<T, T, T>) - Method in interface org.apache.spark.api.java.JavaRDDLike
-
- treeReduce(Function2<T, T, T>, int) - Method in class org.apache.spark.mllib.rdd.RDDFunctions
-
- treeReduce(Function2<T, T, T>, int) - Method in class org.apache.spark.rdd.RDD
-
Reduces the elements of this RDD in a multi-level tree pattern.
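The multi-level tree pattern can be pictured as combining partial results in rounds rather than all at once. The plain-Java sketch below works on an in-memory list; the class TreeReduceSketch and the fanIn parameter are hypothetical and do not correspond to Spark's depth parameter or its implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BinaryOperator;

// Illustrative only: merges partial results level by level until one value remains.
final class TreeReduceSketch {
    static <T> T treeCombine(List<T> partials, BinaryOperator<T> combOp, int fanIn) {
        if (partials.isEmpty()) {
            throw new IllegalArgumentException("no partial results to combine");
        }
        if (fanIn < 2) {
            throw new IllegalArgumentException("fanIn must be at least 2");
        }
        List<T> level = new ArrayList<>(partials);
        while (level.size() > 1) {
            List<T> next = new ArrayList<>();
            // Merge each group of up to `fanIn` neighbouring results into one.
            for (int start = 0; start < level.size(); start += fanIn) {
                int end = Math.min(start + fanIn, level.size());
                T acc = level.get(start);
                for (int i = start + 1; i < end; i++) {
                    acc = combOp.apply(acc, level.get(i));
                }
                next.add(acc);
            }
            level = next;
        }
        return level.get(0);
    }
}
```

With seven partial sums and a fanIn of 2, the list shrinks from 7 to 4 to 2 to 1 entries, so no single step has to merge every partial result.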
- trees() - Method in class org.apache.spark.ml.classification.GBTClassificationModel
-
- trees() - Method in class org.apache.spark.ml.classification.RandomForestClassificationModel
-
- trees() - Method in class org.apache.spark.ml.regression.GBTRegressionModel
-
- trees() - Method in class org.apache.spark.ml.regression.RandomForestRegressionModel
-
- trees() - Method in class org.apache.spark.mllib.tree.model.GradientBoostedTreesModel
-
- trees() - Method in class org.apache.spark.mllib.tree.model.RandomForestModel
-
- treeStrategy() - Method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
-
- treeString() - Method in class org.apache.spark.sql.types.StructType
-
- treeWeights() - Method in class org.apache.spark.ml.classification.GBTClassificationModel
-
- treeWeights() - Method in class org.apache.spark.ml.classification.RandomForestClassificationModel
-
- treeWeights() - Method in class org.apache.spark.ml.regression.GBTRegressionModel
-
- treeWeights() - Method in class org.apache.spark.ml.regression.RandomForestRegressionModel
-
- treeWeights() - Method in class org.apache.spark.mllib.tree.model.GradientBoostedTreesModel
-
- triangleCount() - Method in class org.apache.spark.graphx.GraphOps
-
Compute the number of triangles passing through each vertex.
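To make "triangles passing through each vertex" concrete, the following plain-Java sketch counts them on a small undirected edge list. It is not the GraphX algorithm; the class TriangleCountSketch and the int[][] edge format are assumptions for this example.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Illustrative only: brute-force per-vertex triangle counts on an undirected graph.
final class TriangleCountSketch {
    static Map<Integer, Integer> triangleCounts(int[][] edges) {
        // Build neighbour sets, treating every edge as undirected.
        Map<Integer, Set<Integer>> neighbours = new HashMap<>();
        for (int[] e : edges) {
            if (e[0] == e[1]) {
                continue; // self-loops cannot be part of a triangle
            }
            neighbours.computeIfAbsent(e[0], k -> new HashSet<>()).add(e[1]);
            neighbours.computeIfAbsent(e[1], k -> new HashSet<>()).add(e[0]);
        }
        Map<Integer, Integer> counts = new HashMap<>();
        for (Map.Entry<Integer, Set<Integer>> entry : neighbours.entrySet()) {
            int v = entry.getKey();
            int shared = 0;
            for (int u : entry.getValue()) {
                for (int w : neighbours.get(u)) {
                    if (w != v && entry.getValue().contains(w)) {
                        shared++;
                    }
                }
            }
            // Each triangle {v, u, w} is found twice: once via u and once via w.
            counts.put(v, shared / 2);
        }
        return counts;
    }
}
```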
- TriangleCount - Class in org.apache.spark.graphx.lib
-
Compute the number of triangles passing through each vertex.
- TriangleCount() - Constructor for class org.apache.spark.graphx.lib.TriangleCount
-
- TripletFields - Class in org.apache.spark.graphx
-
Represents a subset of the fields of an EdgeTriplet or EdgeContext.
- TripletFields() - Constructor for class org.apache.spark.graphx.TripletFields
-
Constructs a default TripletFields in which all fields are included.
- TripletFields(boolean, boolean, boolean) - Constructor for class org.apache.spark.graphx.TripletFields
-
- triplets() - Method in class org.apache.spark.graphx.Graph
-
An RDD containing the edge triplets, which are edges along with the vertex data associated with
the adjacent vertices.
- triplets() - Method in class org.apache.spark.graphx.impl.GraphImpl
-
Return an RDD that brings edges together with their source and destination vertices.
- truePositiveRate(double) - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics
-
Returns true positive rate for a given label (category)
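The true positive rate for a label is the fraction of examples with that label that were also predicted as that label. A minimal plain-Java sketch follows; the class TruePositiveRateSketch, the parallel-array inputs, and the NaN result for an absent label are assumptions of this example rather than documented MulticlassMetrics behaviour.

```java
// Illustrative only: computes TP / (TP + FN) for one label from parallel arrays.
final class TruePositiveRateSketch {
    static double truePositiveRate(double[] predictions, double[] labels, double label) {
        if (predictions.length != labels.length) {
            throw new IllegalArgumentException("predictions and labels must have the same length");
        }
        int truePositives = 0;
        int actualPositives = 0;
        for (int i = 0; i < labels.length; i++) {
            if (labels[i] == label) {
                actualPositives++;
                if (predictions[i] == label) {
                    truePositives++;
                }
            }
        }
        if (actualPositives == 0) {
            return Double.NaN; // sketch choice: the rate is undefined when the label never occurs
        }
        return (double) truePositives / actualPositives;
    }
}
```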
- TwitterUtils - Class in org.apache.spark.streaming.twitter
-
- TwitterUtils() - Constructor for class org.apache.spark.streaming.twitter.TwitterUtils
-
- typeName() - Method in class org.apache.spark.sql.types.DataType
-
Name of the type used in JSON serialization.
- typeName() - Method in class org.apache.spark.sql.types.DecimalType
-
- U() - Method in class org.apache.spark.mllib.linalg.SingularValueDecomposition
-
- udf(Function0<RT>, TypeTags.TypeTag<RT>) - Static method in class org.apache.spark.sql.functions
-
Defines a function of 0 arguments as a user-defined function (UDF).
- udf(Function1<A1, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>) - Static method in class org.apache.spark.sql.functions
-
Defines a function of 1 argument as a user-defined function (UDF).
- udf(Function2<A1, A2, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>) - Static method in class org.apache.spark.sql.functions
-
Defines a function of 2 arguments as a user-defined function (UDF).
- udf(Function3<A1, A2, A3, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>) - Static method in class org.apache.spark.sql.functions
-
Defines a function of 3 arguments as a user-defined function (UDF).
- udf(Function4<A1, A2, A3, A4, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>) - Static method in class org.apache.spark.sql.functions
-
Defines a function of 4 arguments as a user-defined function (UDF).
- udf(Function5<A1, A2, A3, A4, A5, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>) - Static method in class org.apache.spark.sql.functions
-
Defines a function of 5 arguments as a user-defined function (UDF).
- udf(Function6<A1, A2, A3, A4, A5, A6, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>) - Static method in class org.apache.spark.sql.functions
-
Defines a function of 6 arguments as a user-defined function (UDF).
- udf(Function7<A1, A2, A3, A4, A5, A6, A7, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>) - Static method in class org.apache.spark.sql.functions
-
Defines a function of 7 arguments as a user-defined function (UDF).
- udf(Function8<A1, A2, A3, A4, A5, A6, A7, A8, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>) - Static method in class org.apache.spark.sql.functions
-
Defines a function of 8 arguments as a user-defined function (UDF).
- udf(Function9<A1, A2, A3, A4, A5, A6, A7, A8, A9, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>, TypeTags.TypeTag<A9>) - Static method in class org.apache.spark.sql.functions
-
Defines a function of 9 arguments as a user-defined function (UDF).
- udf(Function10<A1, A2, A3, A4, A5, A6, A7, A8, A9, A10, RT>, TypeTags.TypeTag<RT>, TypeTags.TypeTag<A1>, TypeTags.TypeTag<A2>, TypeTags.TypeTag<A3>, TypeTags.TypeTag<A4>, TypeTags.TypeTag<A5>, TypeTags.TypeTag<A6>, TypeTags.TypeTag<A7>, TypeTags.TypeTag<A8>, TypeTags.TypeTag<A9>, TypeTags.TypeTag<A10>) - Static method in class org.apache.spark.sql.functions
-
Defines a function of 10 arguments as a user-defined function (UDF).
- udf() - Method in class org.apache.spark.sql.SQLContext
-
A collection of methods for registering user-defined functions (UDF).
- UDF1<T1,R> - Interface in org.apache.spark.sql.api.java
-
A Spark SQL UDF that has 1 argument.
- UDF10<T1,T2,T3,T4,T5,T6,T7,T8,T9,T10,R> - Interface in org.apache.spark.sql.api.java
-
A Spark SQL UDF that has 10 arguments.
- UDF11<T1,T2,T3,T4,T5,T6,T7,T8,T9,T10,T11,R> - Interface in org.apache.spark.sql.api.java
-
A Spark SQL UDF that has 11 arguments.
- UDF12<T1,T2,T3,T4,T5,T6,T7,T8,T9,T10,T11,T12,R> - Interface in org.apache.spark.sql.api.java
-
A Spark SQL UDF that has 12 arguments.
- UDF13<T1,T2,T3,T4,T5,T6,T7,T8,T9,T10,T11,T12,T13,R> - Interface in org.apache.spark.sql.api.java
-
A Spark SQL UDF that has 13 arguments.
- UDF14<T1,T2,T3,T4,T5,T6,T7,T8,T9,T10,T11,T12,T13,T14,R> - Interface in org.apache.spark.sql.api.java
-
A Spark SQL UDF that has 14 arguments.
- UDF15<T1,T2,T3,T4,T5,T6,T7,T8,T9,T10,T11,T12,T13,T14,T15,R> - Interface in org.apache.spark.sql.api.java
-
A Spark SQL UDF that has 15 arguments.
- UDF16<T1,T2,T3,T4,T5,T6,T7,T8,T9,T10,T11,T12,T13,T14,T15,T16,R> - Interface in org.apache.spark.sql.api.java
-
A Spark SQL UDF that has 16 arguments.
- UDF17<T1,T2,T3,T4,T5,T6,T7,T8,T9,T10,T11,T12,T13,T14,T15,T16,T17,R> - Interface in org.apache.spark.sql.api.java
-
A Spark SQL UDF that has 17 arguments.
- UDF18<T1,T2,T3,T4,T5,T6,T7,T8,T9,T10,T11,T12,T13,T14,T15,T16,T17,T18,R> - Interface in org.apache.spark.sql.api.java
-
A Spark SQL UDF that has 18 arguments.
- UDF19<T1,T2,T3,T4,T5,T6,T7,T8,T9,T10,T11,T12,T13,T14,T15,T16,T17,T18,T19,R> - Interface in org.apache.spark.sql.api.java
-
A Spark SQL UDF that has 19 arguments.
- UDF2<T1,T2,R> - Interface in org.apache.spark.sql.api.java
-
A Spark SQL UDF that has 2 arguments.
- UDF20<T1,T2,T3,T4,T5,T6,T7,T8,T9,T10,T11,T12,T13,T14,T15,T16,T17,T18,T19,T20,R> - Interface in org.apache.spark.sql.api.java
-
A Spark SQL UDF that has 20 arguments.
- UDF21<T1,T2,T3,T4,T5,T6,T7,T8,T9,T10,T11,T12,T13,T14,T15,T16,T17,T18,T19,T20,T21,R> - Interface in org.apache.spark.sql.api.java
-
A Spark SQL UDF that has 21 arguments.
- UDF22<T1,T2,T3,T4,T5,T6,T7,T8,T9,T10,T11,T12,T13,T14,T15,T16,T17,T18,T19,T20,T21,T22,R> - Interface in org.apache.spark.sql.api.java
-
A Spark SQL UDF that has 22 arguments.
- UDF3<T1,T2,T3,R> - Interface in org.apache.spark.sql.api.java
-
A Spark SQL UDF that has 3 arguments.
- UDF4<T1,T2,T3,T4,R> - Interface in org.apache.spark.sql.api.java
-
A Spark SQL UDF that has 4 arguments.
- UDF5<T1,T2,T3,T4,T5,R> - Interface in org.apache.spark.sql.api.java
-
A Spark SQL UDF that has 5 arguments.
- UDF6<T1,T2,T3,T4,T5,T6,R> - Interface in org.apache.spark.sql.api.java
-
A Spark SQL UDF that has 6 arguments.
- UDF7<T1,T2,T3,T4,T5,T6,T7,R> - Interface in org.apache.spark.sql.api.java
-
A Spark SQL UDF that has 7 arguments.
- UDF8<T1,T2,T3,T4,T5,T6,T7,T8,R> - Interface in org.apache.spark.sql.api.java
-
A Spark SQL UDF that has 8 arguments.
- UDF9<T1,T2,T3,T4,T5,T6,T7,T8,T9,R> - Interface in org.apache.spark.sql.api.java
-
A Spark SQL UDF that has 9 arguments.
- UDFRegistration - Class in org.apache.spark.sql
-
Functions for registering user-defined functions.
- uid() - Method in class org.apache.spark.ml.classification.DecisionTreeClassificationModel
-
- uid() - Method in class org.apache.spark.ml.classification.DecisionTreeClassifier
-
- uid() - Method in class org.apache.spark.ml.classification.GBTClassificationModel
-
- uid() - Method in class org.apache.spark.ml.classification.GBTClassifier
-
- uid() - Method in class org.apache.spark.ml.classification.LogisticRegression
-
- uid() - Method in class org.apache.spark.ml.classification.LogisticRegressionModel
-
- uid() - Method in class org.apache.spark.ml.classification.OneVsRest
-
- uid() - Method in class org.apache.spark.ml.classification.OneVsRestModel
-
- uid() - Method in class org.apache.spark.ml.classification.RandomForestClassificationModel
-
- uid() - Method in class org.apache.spark.ml.classification.RandomForestClassifier
-
- uid() - Method in class org.apache.spark.ml.evaluation.BinaryClassificationEvaluator
-
- uid() - Method in class org.apache.spark.ml.evaluation.RegressionEvaluator
-
- uid() - Method in class org.apache.spark.ml.feature.Binarizer
-
- uid() - Method in class org.apache.spark.ml.feature.Bucketizer
-
- uid() - Method in class org.apache.spark.ml.feature.ElementwiseProduct
-
- uid() - Method in class org.apache.spark.ml.feature.HashingTF
-
- uid() - Method in class org.apache.spark.ml.feature.IDF
-
- uid() - Method in class org.apache.spark.ml.feature.IDFModel
-
- uid() - Method in class org.apache.spark.ml.feature.Normalizer
-
- uid() - Method in class org.apache.spark.ml.feature.OneHotEncoder
-
- uid() - Method in class org.apache.spark.ml.feature.PolynomialExpansion
-
- uid() - Method in class org.apache.spark.ml.feature.RegexTokenizer
-
- uid() - Method in class org.apache.spark.ml.feature.StandardScaler
-
- uid() - Method in class org.apache.spark.ml.feature.StandardScalerModel
-
- uid() - Method in class org.apache.spark.ml.feature.StringIndexer
-
- uid() - Method in class org.apache.spark.ml.feature.StringIndexerModel
-
- uid() - Method in class org.apache.spark.ml.feature.Tokenizer
-
- uid() - Method in class org.apache.spark.ml.feature.VectorAssembler
-
- uid() - Method in class org.apache.spark.ml.feature.VectorIndexer
-
- uid() - Method in class org.apache.spark.ml.feature.VectorIndexerModel
-
- uid() - Method in class org.apache.spark.ml.feature.Word2Vec
-
- uid() - Method in class org.apache.spark.ml.feature.Word2VecModel
-
- uid() - Method in class org.apache.spark.ml.Pipeline
-
- uid() - Method in class org.apache.spark.ml.PipelineModel
-
- uid() - Method in class org.apache.spark.ml.recommendation.ALS
-
- uid() - Method in class org.apache.spark.ml.recommendation.ALSModel
-
- uid() - Method in class org.apache.spark.ml.regression.DecisionTreeRegressionModel
-
- uid() - Method in class org.apache.spark.ml.regression.DecisionTreeRegressor
-
- uid() - Method in class org.apache.spark.ml.regression.GBTRegressionModel
-
- uid() - Method in class org.apache.spark.ml.regression.GBTRegressor
-
- uid() - Method in class org.apache.spark.ml.regression.LinearRegression
-
- uid() - Method in class org.apache.spark.ml.regression.LinearRegressionModel
-
- uid() - Method in class org.apache.spark.ml.regression.RandomForestRegressionModel
-
- uid() - Method in class org.apache.spark.ml.regression.RandomForestRegressor
-
- uid() - Method in class org.apache.spark.ml.tuning.CrossValidator
-
- uid() - Method in class org.apache.spark.ml.tuning.CrossValidatorModel
-
- uiTab() - Method in class org.apache.spark.streaming.StreamingContext
-
- unapply(EdgeContext<VD, ED, A>) - Static method in class org.apache.spark.graphx.EdgeContext
-
Extractor mainly used for Graph#aggregateMessages*.
- unapply(DenseVector) - Static method in class org.apache.spark.mllib.linalg.DenseVector
-
Extracts the value array from a dense vector.
- unapply(SparseVector) - Static method in class org.apache.spark.mllib.linalg.SparseVector
-
- unapply(Column) - Static method in class org.apache.spark.sql.Column
-
- unapply(DataType) - Static method in class org.apache.spark.sql.types.DecimalType
-
- unapply(Expression) - Static method in class org.apache.spark.sql.types.DecimalType
-
- unapply(Expression) - Static method in class org.apache.spark.sql.types.NumericType
-
Enables matching against NumericType for expressions.
- unapply(Broker) - Static method in class org.apache.spark.streaming.kafka.Broker
-
- UnaryTransformer<IN,OUT,T extends UnaryTransformer<IN,OUT,T>> - Class in org.apache.spark.ml
-
:: DeveloperApi ::
Abstract class for transformers that take one input column, apply transformation, and output the
result as a new column.
- UnaryTransformer() - Constructor for class org.apache.spark.ml.UnaryTransformer
-
- unbroadcast(long, boolean, boolean) - Method in interface org.apache.spark.broadcast.BroadcastFactory
-
- unbroadcast(long, boolean, boolean) - Method in class org.apache.spark.broadcast.HttpBroadcastFactory
-
Remove all persisted state associated with the HTTP broadcast with the given ID.
- unbroadcast(long, boolean, boolean) - Method in class org.apache.spark.broadcast.TorrentBroadcastFactory
-
Remove all persisted state associated with the torrent broadcast with the given ID.
- uncacheTable(String) - Method in class org.apache.spark.sql.SQLContext
-
Removes the specified table from the in-memory cache.
- underlyingSplit() - Method in class org.apache.spark.scheduler.SplitInfo
-
- UniformGenerator - Class in org.apache.spark.mllib.random
-
:: DeveloperApi ::
Generates i.i.d. samples from a uniform distribution.
- UniformGenerator() - Constructor for class org.apache.spark.mllib.random.UniformGenerator
-
- uniformJavaRDD(JavaSparkContext, long, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
- uniformJavaRDD(JavaSparkContext, long, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
- uniformJavaRDD(JavaSparkContext, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
- uniformJavaVectorRDD(JavaSparkContext, long, int, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
- uniformJavaVectorRDD(JavaSparkContext, long, int, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
- uniformJavaVectorRDD(JavaSparkContext, long, int) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
- uniformRDD(SparkContext, long, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
Generates an RDD comprised of i.i.d. samples from the uniform distribution U(0.0, 1.0).
- uniformVectorRDD(SparkContext, long, int, int, long) - Static method in class org.apache.spark.mllib.random.RandomRDDs
-
Generates an RDD[Vector] with vectors containing i.i.d. samples drawn from the
uniform distribution U(0.0, 1.0).
- union(JavaDoubleRDD) - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
Return the union of this RDD and another one.
- union(JavaPairRDD<K, V>) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Return the union of this RDD and another one.
- union(JavaRDD<T>) - Method in class org.apache.spark.api.java.JavaRDD
-
Return the union of this RDD and another one.
- union(JavaRDD<T>, List<JavaRDD<T>>) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Build the union of two or more RDDs.
- union(JavaPairRDD<K, V>, List<JavaPairRDD<K, V>>) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Build the union of two or more RDDs.
- union(JavaDoubleRDD, List<JavaDoubleRDD>) - Method in class org.apache.spark.api.java.JavaSparkContext
-
Build the union of two or more RDDs.
- union(RDD<T>) - Method in class org.apache.spark.rdd.RDD
-
Return the union of this RDD and another one.
- union(Seq<RDD<T>>, ClassTag<T>) - Method in class org.apache.spark.SparkContext
-
Build the union of a list of RDDs.
- union(RDD<T>, Seq<RDD<T>>, ClassTag<T>) - Method in class org.apache.spark.SparkContext
-
Build the union of a list of RDDs passed as variable-length arguments.
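As a rough illustration of what "union" means in the RDD entries above: Spark's RDD documentation notes that identical elements appear multiple times in the result, so a union behaves like concatenation rather than a set union. The sketch below models that behaviour with plain Java lists and does not use Spark; `UnionSketch` and its `union` helper are hypothetical names chosen for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class UnionSketch {
    // Hypothetical helper: concatenates the inputs in order and keeps duplicates.
    @SafeVarargs
    static <T> List<T> union(List<T> first, List<T>... rest) {
        List<T> out = new ArrayList<>(first);
        for (List<T> r : rest) {
            out.addAll(r);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Integer> a = Arrays.asList(1, 2, 3);
        List<Integer> b = Arrays.asList(3, 4);
        System.out.println(union(a, b)); // prints [1, 2, 3, 3, 4]
    }
}
```

The value 3 appears twice in the output; with real RDDs, a subsequent distinct() call would be the usual way to remove such duplicates.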
- union(JavaDStream<T>) - Method in class org.apache.spark.streaming.api.java.JavaDStream
-
Return a new DStream by unifying data of another DStream with this DStream.
- union(JavaPairDStream<K, V>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new DStream by unifying data of another DStream with this DStream.
- union(JavaDStream<T>, List<JavaDStream<T>>) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Create a unified DStream from multiple DStreams of the same type and same slide duration.
- union(JavaPairDStream<K, V>, List<JavaPairDStream<K, V>>) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
-
Create a unified DStream from multiple DStreams of the same type and same slide duration.
- union(DStream<T>) - Method in class org.apache.spark.streaming.dstream.DStream
-
Return a new DStream by unifying data of another DStream with this DStream.
- union(Seq<DStream<T>>, ClassTag<T>) - Method in class org.apache.spark.streaming.StreamingContext
-
Create a unified DStream from multiple DStreams of the same type and same slide duration.
- unionAll(DataFrame) - Method in class org.apache.spark.sql.DataFrame
-
Returns a new DataFrame containing the union of rows in this frame and another frame.
- UnionRDD<T> - Class in org.apache.spark.rdd
-
- UnionRDD(SparkContext, Seq<RDD<T>>, ClassTag<T>) - Constructor for class org.apache.spark.rdd.UnionRDD
-
- uniqueId() - Method in class org.apache.spark.storage.StreamBlockId
-
- UnknownReason - Class in org.apache.spark
-
:: DeveloperApi ::
We don't know why the task ended -- for example, because of a ClassNotFound exception when
deserializing the task result.
- UnknownReason() - Constructor for class org.apache.spark.UnknownReason
-
- Unlimited() - Static method in class org.apache.spark.sql.types.DecimalType
-
- unpersist() - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
Mark the RDD as non-persistent, and remove all blocks for it from memory and disk.
- unpersist(boolean) - Method in class org.apache.spark.api.java.JavaDoubleRDD
-
Mark the RDD as non-persistent, and remove all blocks for it from memory and disk.
- unpersist() - Method in class org.apache.spark.api.java.JavaPairRDD
-
Mark the RDD as non-persistent, and remove all blocks for it from memory and disk.
- unpersist(boolean) - Method in class org.apache.spark.api.java.JavaPairRDD
-
Mark the RDD as non-persistent, and remove all blocks for it from memory and disk.
- unpersist() - Method in class org.apache.spark.api.java.JavaRDD
-
Mark the RDD as non-persistent, and remove all blocks for it from memory and disk.
- unpersist(boolean) - Method in class org.apache.spark.api.java.JavaRDD
-
Mark the RDD as non-persistent, and remove all blocks for it from memory and disk.
- unpersist() - Method in class org.apache.spark.broadcast.Broadcast
-
Asynchronously delete cached copies of this broadcast on the executors.
- unpersist(boolean) - Method in class org.apache.spark.broadcast.Broadcast
-
Delete cached copies of this broadcast on the executors.
- unpersist(boolean) - Method in class org.apache.spark.graphx.Graph
-
Uncaches both vertices and edges of this graph.
- unpersist(boolean) - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
-
- unpersist(boolean) - Method in class org.apache.spark.graphx.impl.GraphImpl
-
- unpersist(boolean) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
-
- unpersist() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
-
Unpersist intermediate RDDs used in the computation.
- unpersist(boolean) - Method in class org.apache.spark.rdd.RDD
-
Mark the RDD as non-persistent, and remove all blocks for it from memory and disk.
- unpersist(boolean) - Method in class org.apache.spark.sql.DataFrame
-
- unpersist() - Method in class org.apache.spark.sql.DataFrame
-
- unpersistVertices(boolean) - Method in class org.apache.spark.graphx.Graph
-
Uncaches only the vertices of this graph, leaving the edges alone.
- unpersistVertices(boolean) - Method in class org.apache.spark.graphx.impl.GraphImpl
-
- unregisterDialect(JdbcDialect) - Static method in class org.apache.spark.sql.jdbc.JdbcDialects
-
Unregister a dialect.
- Unresolved() - Static method in class org.apache.spark.ml.attribute.AttributeType
-
Unresolved type.
- UnresolvedAttribute - Class in org.apache.spark.ml.attribute
-
:: DeveloperApi ::
An unresolved attribute.
- UnresolvedAttribute() - Constructor for class org.apache.spark.ml.attribute.UnresolvedAttribute
-
- until(Time, Duration) - Method in class org.apache.spark.streaming.Time
-
- untilOffset() - Method in class org.apache.spark.streaming.kafka.OffsetRange
-
Exclusive ending offset.
- update(RDD<Vector>, double, String) - Method in class org.apache.spark.mllib.clustering.StreamingKMeansModel
-
Perform a k-means update on a batch of data.
- update(int, int, double) - Method in interface org.apache.spark.mllib.linalg.Matrix
-
Update element at (i, j).
- update(Function1<Object, Object>) - Method in interface org.apache.spark.mllib.linalg.Matrix
-
Update all the values of this matrix using the function f.
- update() - Method in class org.apache.spark.scheduler.AccumulableInfo
-
- update() - Method in class org.apache.spark.status.api.v1.AccumulableInfo
-
- update(T1, T2) - Method in class org.apache.spark.util.MutablePair
-
Updates this pair with new values and returns itself.
- updateAggregateMetrics(UIData.StageUIData, String, TaskMetrics, Option<TaskMetrics>) - Method in class org.apache.spark.ui.jobs.JobProgressListener
-
Upon receiving new metrics for a task, updates the per-stage and per-executor-per-stage
aggregate metrics by calculating deltas between the currently recorded metrics and the new
metrics.
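The delta-based update described above can be sketched as follows, on the assumption that each task reports cumulative values more than once, so adding only the difference from the last recorded value avoids double counting. This is a plain-Java model with a single hypothetical metric, not the JobProgressListener implementation; the class, field, and method names are invented for illustration.

```java
import java.util.HashMap;
import java.util.Map;

class DeltaAggregationSketch {
    // Last cumulative value reported by each task (one hypothetical metric).
    private final Map<Long, Long> lastSeenByTask = new HashMap<>();
    // Running aggregate across all tasks of a stage.
    private long stageTotal = 0L;

    // Folds a task's new cumulative value into the total by adding only the delta.
    void onTaskMetrics(long taskId, long cumulativeValue) {
        long previous = lastSeenByTask.getOrDefault(taskId, 0L);
        stageTotal += cumulativeValue - previous;
        lastSeenByTask.put(taskId, cumulativeValue);
    }

    long stageTotal() {
        return stageTotal;
    }

    public static void main(String[] args) {
        DeltaAggregationSketch agg = new DeltaAggregationSketch();
        agg.onTaskMetrics(1L, 100L); // first report from task 1
        agg.onTaskMetrics(1L, 250L); // later report from task 1: delta is 150
        agg.onTaskMetrics(2L, 40L);  // first report from task 2
        System.out.println(agg.stageTotal()); // prints 290
    }
}
```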
- updatePredictionError(RDD<LabeledPoint>, RDD<Tuple2<Object, Object>>, double, DecisionTreeModel, Loss) - Static method in class org.apache.spark.mllib.tree.model.GradientBoostedTreesModel
-
Update a zipped predictionError RDD (as obtained with computeInitialPredictionAndError).
- Updater - Class in org.apache.spark.mllib.optimization
-
:: DeveloperApi ::
Class used to perform steps (weight update) using Gradient Descent methods.
- Updater() - Constructor for class org.apache.spark.mllib.optimization.Updater
-
- updateStateByKey(Function2<List<V>, Optional<S>, Optional<S>>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new "state" DStream where the state for each key is updated by applying
the given function on the previous state of the key and the new values of the key.
- updateStateByKey(Function2<List<V>, Optional<S>, Optional<S>>, int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new "state" DStream where the state for each key is updated by applying
the given function on the previous state of the key and the new values of the key.
- updateStateByKey(Function2<List<V>, Optional<S>, Optional<S>>, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new "state" DStream where the state for each key is updated by applying
the given function on the previous state of the key and the new values of the key.
- updateStateByKey(Function2<List<V>, Optional<S>, Optional<S>>, Partitioner, JavaPairRDD<K, S>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
-
Return a new "state" DStream where the state for each key is updated by applying
the given function on the previous state of the key and the new values of the key.
- updateStateByKey(Function2<Seq<V>, Option<S>, Option<S>>, ClassTag<S>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new "state" DStream where the state for each key is updated by applying
the given function on the previous state of the key and the new values of the key.
- updateStateByKey(Function2<Seq<V>, Option<S>, Option<S>>, int, ClassTag<S>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new "state" DStream where the state for each key is updated by applying
the given function on the previous state of the key and the new values of the key.
- updateStateByKey(Function2<Seq<V>, Option<S>, Option<S>>, Partitioner, ClassTag<S>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new "state" DStream where the state for each key is updated by applying
the given function on the previous state of the key and the new values of the key.
- updateStateByKey(Function1<Iterator<Tuple3<K, Seq<V>, Option<S>>>, Iterator<Tuple2<K, S>>>, Partitioner, boolean, ClassTag<S>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new "state" DStream where the state for each key is updated by applying
the given function on the previous state of the key and the new values of the key.
- updateStateByKey(Function2<Seq<V>, Option<S>, Option<S>>, Partitioner, RDD<Tuple2<K, S>>, ClassTag<S>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new "state" DStream where the state for each key is updated by applying
the given function on the previous state of the key and the new values of the key.
- updateStateByKey(Function1<Iterator<Tuple3<K, Seq<V>, Option<S>>>, Iterator<Tuple2<K, S>>>, Partitioner, boolean, RDD<Tuple2<K, S>>, ClassTag<S>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
-
Return a new "state" DStream where the state for each key is updated by applying
the given function on the previous state of the key and the new values of the key.
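The per-key state update described by the updateStateByKey entries above can be modelled for a single batch as in the sketch below. It is a plain-Java approximation, not Spark code: it substitutes `java.util.Optional` and `BiFunction` for the Optional and Function2 types in the signatures above, and `UpdateStateSketch`, `step`, and `runningSum` are hypothetical names. It assumes the documented Spark behaviour that a key is dropped when the update function returns an empty result, and that keys with existing state are visited even when the batch has no new values for them.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.Set;
import java.util.function.BiFunction;

class UpdateStateSketch {
    // Applies one batch to the state map; an empty result drops the key.
    static <K, V, S> Map<K, S> step(
            Map<K, S> state,
            Map<K, List<V>> batch,
            BiFunction<List<V>, Optional<S>, Optional<S>> updateFn) {
        Set<K> keys = new HashSet<>(state.keySet());
        keys.addAll(batch.keySet());
        Map<K, S> next = new HashMap<>();
        for (K key : keys) {
            List<V> newValues = batch.getOrDefault(key, Collections.<V>emptyList());
            Optional<S> previous = Optional.ofNullable(state.get(key));
            updateFn.apply(newValues, previous).ifPresent(s -> next.put(key, s));
        }
        return next;
    }

    // Example update function: a running sum per key.
    static Optional<Integer> runningSum(List<Integer> values, Optional<Integer> previous) {
        int sum = previous.orElse(0);
        for (int v : values) {
            sum += v;
        }
        return Optional.of(sum);
    }

    public static void main(String[] args) {
        Map<String, Integer> state = new HashMap<>();
        state.put("a", 1);
        Map<String, List<Integer>> batch = new HashMap<>();
        batch.put("a", Arrays.asList(2, 3));
        batch.put("b", Arrays.asList(5));
        Map<String, Integer> next = step(state, batch, UpdateStateSketch::runningSum);
        System.out.println(next.get("a") + " " + next.get("b")); // prints 6 5
    }
}
```

Key "a" carries its previous state of 1 forward and adds 2 and 3, while key "b" starts from an absent state and ends at 5.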
- upper(Column) - Static method in class org.apache.spark.sql.functions
-
Converts a string expression to upper case.
- useDisk() - Method in class org.apache.spark.storage.StorageLevel
-
- useDst - Variable in class org.apache.spark.graphx.TripletFields
-
Indicates whether the destination vertex attribute is included.
- useEdge - Variable in class org.apache.spark.graphx.TripletFields
-
Indicates whether the edge attribute is included.
- useMemory() - Method in class org.apache.spark.storage.StorageLevel
-
- useNodeIdCache() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
-
- useOffHeap() - Method in class org.apache.spark.storage.StorageLevel
-
- user() - Method in class org.apache.spark.ml.recommendation.ALS.Rating
-
- user() - Method in class org.apache.spark.mllib.recommendation.Rating
-
- user() - Method in class org.apache.spark.scheduler.JobLogger
-
- userClass() - Method in class org.apache.spark.sql.types.UserDefinedType
-
Class object for the UserType.
- UserDefinedFunction - Class in org.apache.spark.sql
-
A user-defined function.
- userDefinedPartitionColumns() - Method in class org.apache.spark.sql.sources.HadoopFsRelation
-
Optional user defined partition columns.
- UserDefinedType<UserType> - Class in org.apache.spark.sql.types
-
:: DeveloperApi ::
The data type for User Defined Types (UDTs).
- UserDefinedType() - Constructor for class org.apache.spark.sql.types.UserDefinedType
-
- userFactors() - Method in class org.apache.spark.ml.recommendation.ALSModel
-
- userFeatures() - Method in class org.apache.spark.mllib.recommendation.MatrixFactorizationModel
-
- useSrc - Variable in class org.apache.spark.graphx.TripletFields
-
Indicates whether the source vertex attribute is included.
- UTF8String - Class in org.apache.spark.sql.types
-
:: DeveloperApi ::
A UTF-8 string, used as the internal representation of StringType in Spark SQL.
- UTF8String() - Constructor for class org.apache.spark.sql.types.UTF8String
-