public class Accumulator<T> extends Accumulable<T,T>

Type parameter: T - result type

A simpler value of Accumulable where the result type being accumulated is the same
as the types of elements being merged, i.e. variables that are only "added" to through an
associative and commutative operation and can therefore be efficiently supported in parallel.
They can be used to implement counters (as in MapReduce) or sums. Spark natively supports
accumulators of numeric value types, and programmers can add support for new types.
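For example, here is a sketch of adding accumulator support for a new type via the
AccumulatorParam interface. The (Double, Double) vector representation and the object
name Vector2Param are illustrative, not part of Spark:

import org.apache.spark.AccumulatorParam

// Accumulate 2-D vectors represented as (Double, Double) tuples.
object Vector2Param extends AccumulatorParam[(Double, Double)] {
  // Neutral element used when merging partial results; the initial value is ignored.
  def zero(initialValue: (Double, Double)): (Double, Double) = (0.0, 0.0)
  // Element-wise addition; must be associative and commutative.
  def addInPlace(v1: (Double, Double), v2: (Double, Double)): (Double, Double) =
    (v1._1 + v2._1, v1._2 + v2._2)
}

// Usage, assuming an existing SparkContext named sc:
// val vecAccum = sc.accumulator((0.0, 0.0))(Vector2Param)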
An accumulator is created from an initial value v by calling SparkContext.accumulator.
Tasks running on the cluster can then add to it using the += operator. However, they
cannot read its value. Only the driver program can read the accumulator's value, using
its Accumulable.value() method.
The interpreter session below shows an accumulator being used to add up the elements of an array:
scala> val accum = sc.accumulator(0)
accum: org.apache.spark.Accumulator[Int] = 0

scala> sc.parallelize(Array(1, 2, 3, 4)).foreach(x => accum += x)
...
10/09/29 18:41:08 INFO SparkContext: Tasks finished in 0.317106 s

scala> accum.value
res2: Int = 10
param: initialValue - the initial value of the accumulator
param: param - helper object defining how to add elements of type T
param: name - human-readable name associated with this accumulator
param: countFailedValues - whether to accumulate values from failed tasks
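A minimal sketch of the initialValue and name parameters as they surface through
SparkContext.accumulator (assuming an existing SparkContext sc; the sample records
are made up). Supplying a name makes the accumulator visible in the Spark web UI:

// Named accumulator created via SparkContext.accumulator(initialValue, name).
val errorCount = sc.accumulator(0, "errorCount")
sc.parallelize(Seq("ok", "bad", "ok", "bad")).foreach { rec =>
  if (rec == "bad") errorCount += 1   // tasks may only add, never read
}
println(errorCount.value)             // 2, read on the driver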
Methods: add, id, localValue, merge, name, setValue, toString, value, zero
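A brief driver-side sketch of a few of these members (assuming an existing
SparkContext sc):

val acc = sc.accumulator(0)
acc += 5                  // the += operator; equivalent to acc.add(5)
acc.add(2)
println(acc.zero)         // 0, the neutral element defined by the helper param
println(acc.value)        // 7; readable on the driver only
acc.setValue(0)           // driver-side reset
println(acc.name)         // None, since no name was given
// Inside a task, use localValue: calling value from a task throws an exception.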