pyspark.pandas.MultiIndex.from_frame

static MultiIndex.from_frame(df: pyspark.pandas.frame.DataFrame, names: Optional[List[Union[Any, Tuple[Any, …]]]] = None) → pyspark.pandas.indexes.multi.MultiIndex

Make a MultiIndex from a DataFrame.

Parameters
df : DataFrame

DataFrame to be converted to MultiIndex.

names : list-like, optional

If no names are provided, the column names are used, or tuples of column names if the columns are a MultiIndex. If a sequence is given, it replaces the column names as the level names.

Returns
MultiIndex

The MultiIndex representation of the given DataFrame.

See also

MultiIndex.from_arrays

Convert list of arrays to MultiIndex.

MultiIndex.from_tuples

Convert list of tuples to MultiIndex.

MultiIndex.from_product

Make a MultiIndex from cartesian product of iterables.

Examples

>>> df = ps.DataFrame([['HI', 'Temp'], ['HI', 'Precip'],
...                    ['NJ', 'Temp'], ['NJ', 'Precip']],
...                   columns=['a', 'b'])
>>> df  
      a       b
0    HI    Temp
1    HI  Precip
2    NJ    Temp
3    NJ  Precip
>>> ps.MultiIndex.from_frame(df)  
MultiIndex([('HI',   'Temp'),
            ('HI', 'Precip'),
            ('NJ',   'Temp'),
            ('NJ', 'Precip')],
           names=['a', 'b'])

Use explicit names instead of the column names:

>>> ps.MultiIndex.from_frame(df, names=['state', 'observation'])  
MultiIndex([('HI',   'Temp'),
            ('HI', 'Precip'),
            ('NJ',   'Temp'),
            ('NJ', 'Precip')],
           names=['state', 'observation'])