Accumulators are variables that tasks running on executors can only add to, and whose value only the driver program can read. They can be created with
val accumulator = sc.accumulator(0, name = "My accumulator") // name is optional
val someRDD = sc.parallelize(Array(1, 2, 3, 4))
someRDD.foreach(element => accumulator += element)
and read back on the driver with
accumulator.value // 'value' is now equal to 10
Using accumulators is complicated by Spark's at-least-once execution of transformations. Spark guarantees that accumulator updates made inside actions are applied exactly once, but a transformation may be re-run, for example when a task fails, when speculative execution launches a duplicate task, or when an uncached RDD is recomputed for a later action. In those cases the accumulator updates inside the transformation are repeated, so the final value can be very different from what it would be if every task had run exactly once.
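The recomputation pitfall can be demonstrated with a minimal sketch. It uses the same `sc.accumulator` API as above (deprecated since Spark 2.0 in favor of `sc.longAccumulator`, but kept here for consistency); the app name and local master setting are illustrative assumptions:

```scala
import org.apache.spark.{SparkConf, SparkContext}

object AccumulatorRecompute {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("accumulator-demo").setMaster("local[*]"))

    val counter = sc.accumulator(0, name = "elements seen")

    // The accumulator is updated inside a transformation, which is lazy
    // and may be executed more than once.
    val mapped = sc.parallelize(Array(1, 2, 3, 4)).map { x =>
      counter += 1
      x * 2
    }

    // `mapped` is not cached, so each action recomputes it from scratch,
    // running the map function (and the accumulator update) a second time.
    mapped.count()
    mapped.count()

    // Likely prints 8 rather than 4: each element was counted once per action.
    println(counter.value)

    sc.stop()
  }
}
```

Calling `mapped.cache()` before the first action would avoid the double count here, but caching is not a guarantee: cached partitions can still be evicted and recomputed, which is why updates that must be counted exactly once belong in an action such as `foreach`.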