pyspark.sql.DataFrame.foreachPartition¶
-
DataFrame.
foreachPartition
(f: Callable[[Iterator[pyspark.sql.types.Row]], None]) → None[source]¶ Applies the
f
function to each partition of thisDataFrame
.This a shorthand for
df.rdd.foreachPartition()
.New in version 1.3.0.
- Parameters
- ffunction
A function that accepts one parameter which will receive each partition to process.
Examples
>>> df = spark.createDataFrame( ... [(14, "Tom"), (23, "Alice"), (16, "Bob")], ["age", "name"]) >>> def func(itr): ... for person in itr: ... print(person.name) ... >>> df.foreachPartition(func)