pyspark.sql.functions.sort_array

pyspark.sql.functions.sort_array(col, asc=True)

Array function: Sorts the input array in ascending or descending order according to the natural ordering of its elements. Null elements are placed at the beginning of the returned array when sorting in ascending order, and at the end when sorting in descending order.

New in version 1.5.0.

Changed in version 3.4.0: Supports Spark Connect.

Parameters
col : Column or str

Name of the column or expression.

asc : bool, optional

Whether to sort in ascending or descending order. If asc is True (default), the array is sorted in ascending order; if False, in descending order.

Returns
Column

Sorted array.

Examples

Example 1: Sorting an array in ascending order

>>> import pyspark.sql.functions as sf
>>> df = spark.createDataFrame([([2, 1, None, 3],)], ['data'])
>>> df.select(sf.sort_array(df.data)).show()
+----------------------+
|sort_array(data, true)|
+----------------------+
|       [NULL, 1, 2, 3]|
+----------------------+

Example 2: Sorting an array in descending order

>>> import pyspark.sql.functions as sf
>>> df = spark.createDataFrame([([2, 1, None, 3],)], ['data'])
>>> df.select(sf.sort_array(df.data, asc=False)).show()
+-----------------------+
|sort_array(data, false)|
+-----------------------+
|        [3, 2, 1, NULL]|
+-----------------------+

Example 3: Sorting an array with a single element

>>> import pyspark.sql.functions as sf
>>> df = spark.createDataFrame([([1],)], ['data'])
>>> df.select(sf.sort_array(df.data)).show()
+----------------------+
|sort_array(data, true)|
+----------------------+
|                   [1]|
+----------------------+

Example 4: Sorting an empty array

>>> from pyspark.sql import functions as sf
>>> from pyspark.sql.types import ArrayType, StringType, StructField, StructType
>>> schema = StructType([StructField("data", ArrayType(StringType()), True)])
>>> df = spark.createDataFrame([([],)], schema=schema)
>>> df.select(sf.sort_array(df.data)).show()
+----------------------+
|sort_array(data, true)|
+----------------------+
|                    []|
+----------------------+

Example 5: Sorting an array that contains only null values

>>> from pyspark.sql import functions as sf
>>> from pyspark.sql.types import ArrayType, IntegerType, StructType, StructField
>>> schema = StructType([StructField("data", ArrayType(IntegerType()), True)])
>>> df = spark.createDataFrame([([None, None, None],)], schema=schema)
>>> df.select(sf.sort_array(df.data)).show()
+----------------------+
|sort_array(data, true)|
+----------------------+
|    [NULL, NULL, NULL]|
+----------------------+
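
The null placement shown in Examples 1 and 2 can be modeled in plain Python, which may help when reasoning about expected results without a Spark session. This is only an illustrative sketch: the helper name sort_array_model is hypothetical, it is not part of PySpark, and it does not reflect how Spark implements the function internally. It assumes the non-null elements are mutually comparable (for example, all integers).

```python
from typing import List, Optional


def sort_array_model(values: List[Optional[int]], asc: bool = True) -> List[Optional[int]]:
    """Hypothetical pure-Python model of the documented null placement.

    Not Spark's implementation; it only mirrors the ordering rule
    described for sort_array.
    """
    # Separate nulls from comparable values.
    nulls = [v for v in values if v is None]
    non_nulls = sorted((v for v in values if v is not None), reverse=not asc)
    # Ascending: nulls come first. Descending: nulls come last.
    return nulls + non_nulls if asc else non_nulls + nulls


print(sort_array_model([2, 1, None, 3]))             # [None, 1, 2, 3]
print(sort_array_model([2, 1, None, 3], asc=False))  # [3, 2, 1, None]
```

The two printed lists correspond to the arrays shown in Examples 1 and 2, with Python's None standing in for NULL.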