pyspark.pandas.DataFrame.sort_values#
- DataFrame.sort_values(by, ascending=True, inplace=False, na_position='last', ignore_index=False)[source]#
Sort by the values along either axis.
- Parameters
- bystr or list of str
- ascendingbool or list of bool, default True
Sort ascending vs. descending. Specify list for multiple sort orders. If this is a list of bools, must match the length of the by.
- inplacebool, default False
if True, perform operation in-place
- na_position{‘first’, ‘last’}, default ‘last’
first puts NaNs at the beginning, last puts NaNs at the end
- ignore_indexbool, default False
If True, the resulting axis will be labeled 0, 1, …, n - 1.
- Returns
- sorted_objDataFrame
Examples
>>> df = ps.DataFrame({ ... 'col1': ['A', 'B', None, 'D', 'C'], ... 'col2': [2, 9, 8, 7, 4], ... 'col3': [0, 9, 4, 2, 3], ... }, ... columns=['col1', 'col2', 'col3'], ... index=['a', 'b', 'c', 'd', 'e']) >>> df col1 col2 col3 a A 2 0 b B 9 9 c None 8 4 d D 7 2 e C 4 3
Sort by col1
>>> df.sort_values(by=['col1']) col1 col2 col3 a A 2 0 b B 9 9 e C 4 3 d D 7 2 c None 8 4
Ignore index for the resulting axis
>>> df.sort_values(by=['col1'], ignore_index=True) col1 col2 col3 0 A 2 0 1 B 9 9 2 C 4 3 3 D 7 2 4 None 8 4
Sort Descending
>>> df.sort_values(by='col1', ascending=False) col1 col2 col3 d D 7 2 e C 4 3 b B 9 9 a A 2 0 c None 8 4
Sort by multiple columns
>>> df = ps.DataFrame({ ... 'col1': ['A', 'A', 'B', None, 'D', 'C'], ... 'col2': [2, 1, 9, 8, 7, 4], ... 'col3': [0, 1, 9, 4, 2, 3], ... }, ... columns=['col1', 'col2', 'col3']) >>> df.sort_values(by=['col1', 'col2']) col1 col2 col3 1 A 1 1 0 A 2 0 2 B 9 9 5 C 4 3 4 D 7 2 3 None 8 4