Supported pandas API

The following tables show which pandas APIs are implemented in pandas API on Spark and which are not. Some pandas APIs are implemented without their full set of parameters, so the third column lists the missing parameters for each API.

  • ‘Y’ in the second column means the API is implemented with all of its parameters.

  • ‘N’ means the API is not implemented yet.

  • ‘P’ means the API is partially implemented: some of its parameters are missing.
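The three markers can be read as the result of comparing two call signatures. The sketch below is only an illustration of that reading, not necessarily how these tables are produced: `classify` and the two `*_sort_values` functions are hypothetical stand-ins, not real pandas or pyspark.pandas code.

```python
import inspect


def classify(reference, candidate):
    """Return a (marker, missing_parameters) pair.

    `reference` plays the role of the pandas callable and `candidate` the
    pandas-on-Spark one. Pass None as `candidate` when nothing is implemented.
    """
    if candidate is None:
        return "N", []
    ref_params = set(inspect.signature(reference).parameters)
    got_params = set(inspect.signature(candidate).parameters)
    missing = sorted(ref_params - got_params)
    return ("P" if missing else "Y"), missing


# Hypothetical stand-ins used only to exercise classify().
def reference_sort_values(self, ascending=True, key=None, na_position="last"):
    ...


def candidate_sort_values(self, ascending=True):
    ...


status, missing = classify(reference_sort_values, candidate_sort_values)
print(status, " , ".join(missing))  # prints: P key , na_position
```

Comparing parameter names only says whether a parameter is accepted; it says nothing about whether every value of an accepted parameter is supported.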

All APIs in the lists below compute data with distributed execution, except those that require local execution by design. For example, DataFrame.to_numpy() needs to collect the data to the driver side.
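To make the distinction concrete, the following sketch keeps the heavy work distributed and collects only a small result to the driver. The helper name `total_then_preview` and the `amount` column are made up for illustration; `sum()`, `head()` and `to_numpy()` are assumed to behave as they do in pandas, and `demo()` needs a working PySpark installation.

```python
def total_then_preview(psdf, column, n=3):
    """Aggregate on the cluster, then collect only a small preview locally."""
    # sum() is evaluated by Spark across partitions; only a scalar comes back.
    total = psdf[column].sum()
    # head(n) limits the rows first, so to_numpy() collects just n values
    # to the driver instead of the whole column.
    preview = psdf[column].head(n).to_numpy()
    return total, preview


def demo():
    # Imported lazily so the helper above can be read and reused
    # on a machine where Spark is not installed.
    import pyspark.pandas as ps

    psdf = ps.DataFrame({"amount": [10, 20, 30, 40]})
    total, preview = total_then_preview(psdf, "amount")
    print(total, preview.tolist())
```

Calling `to_numpy()` directly on a large DataFrame would move every row to the driver, so reducing or limiting the data first is usually the safer order of operations.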

If you need a pandas API or parameter that is not implemented, you can create an Apache Spark JIRA to request it, or contribute it yourself.

The API list is kept up to date with the latest official pandas API reference.

CategoricalIndex API

API

Implemented

Missing parameters

add_categories()

Y

all()

Y

any()

Y

append()

Y

argmax()

P

axis , skipna

argmin()

P

axis , skipna

argsort

N

as_ordered()

Y

as_unordered()

Y

asof()

Y

asof_locs

N

astype()

P

copy

copy()

Y

delete()

Y

diff

N

difference()

Y

drop()

P

errors

drop_duplicates()

Y

droplevel()

Y

dropna()

Y

duplicated

N

equals()

Y

factorize()

Y

fillna()

P

downcast

format

N

get_indexer

N

get_indexer_for

N

get_indexer_non_unique

N

get_level_values()

Y

get_loc

N

get_slice_bound

N

groupby

N

holds_integer()

Y

identical()

Y

infer_objects

N

insert()

Y

intersection()

P

sort

is_

N

is_boolean()

Y

is_categorical()

Y

is_floating()

Y

is_integer()

Y

is_interval()

Y

is_numeric()

Y

is_object()

Y

isin()

P

level

isna()

Y

isnull()

Y

item()

Y

join

N

map()

P

na_action

max()

Y

memory_usage

N

min()

Y

notna()

Y

notnull()

Y

nunique()

Y

putmask

N

ravel

N

reindex

N

remove_categories()

Y

remove_unused_categories()

Y

rename()

Y

rename_categories()

Y

reorder_categories()

Y

repeat()

P

axis

round

N

searchsorted

N

set_categories()

Y

set_names()

Y

shift()

P

freq

slice_indexer

N

slice_locs

N

sort()

Y

sort_values()

P

key , na_position

sortlevel

N

symmetric_difference()

Y

take()

P

allow_fill , axis , fill_value

to_flat_index

N

to_frame()

Y

to_list()

Y

to_numpy()

P

na_value

to_series()

P

index

tolist()

Y

transpose()

Y

union()

Y

unique()

Y

value_counts()

Y

view()

Y

where

N

DataFrame API

API

Implemented

Missing parameters

abs()

Y

add()

P

axis , fill_value , level

add_prefix()

P

axis

add_suffix()

P

axis

agg()

P

axis

aggregate()

P

axis

align()

P

broadcast_axis , fill_axis , fill_value , level , limit and more. See pandas.DataFrame.align and pyspark.pandas.DataFrame.align for details.

all()

Y

any()

P

skipna

apply()

P

by_row , engine , engine_kwargs , raw , result_type

applymap()

P

na_action

asfreq

N

asof

N

assign()

Y

astype()

P

copy , errors

at_time()

Y

backfill()

P

downcast

between_time()

Y

bfill()

P

downcast , limit_area

bool()

Y

boxplot()

P

ax , backend , by , column , figsize and more. See pandas.DataFrame.boxplot and pyspark.pandas.DataFrame.boxplot for details.

clip()

P

axis , inplace

combine

N

combine_first()

Y

compare

N

convert_dtypes

N

copy()

Y

corr()

P

numeric_only

corrwith()

P

numeric_only

count()

Y

cov()

P

numeric_only

cummax()

P

axis

cummin()

P

axis

cumprod()

P

axis

cumsum()

P

axis

describe()

P

exclude , include

diff()

Y

div()

P

axis , fill_value , level

divide()

P

axis , fill_value , level

dot()

Y

drop()

P

errors , inplace , level

drop_duplicates()

Y

droplevel()

Y

dropna()

P

ignore_index

duplicated()

Y

eq()

P

axis , level

equals()

Y

eval()

Y

ewm()

P

adjust , axis , method , times

expanding()

P

axis , method

explode()

Y

ffill()

P

downcast , limit_area

fillna()

P

downcast

filter()

Y

first()

Y

first_valid_index()

Y

floordiv()

P

axis , fill_value , level

ge()

P

axis , level

get()

Y

groupby()

P

group_keys , level , observed , sort

gt()

P

axis , level

head()

Y

hist()

P

ax , backend , by , column , data and more. See pandas.DataFrame.hist and pyspark.pandas.DataFrame.hist for details.

idxmax()

P

numeric_only , skipna

idxmin()

P

numeric_only , skipna

infer_objects

N

info()

P

memory_usage

insert()

Y

interpolate()

P

axis , downcast , inplace

isetitem

N

isin()

Y

isna()

Y

isnull()

Y

items()

Y

iterrows()

Y

itertuples()

Y

join()

P

other , sort , validate

keys()

Y

kurt()

Y

kurtosis()

Y

last()

Y

last_valid_index()

Y

le()

P

axis , level

lt()

P

axis , level

map()

P

na_action

mask()

P

axis , inplace , level

max()

Y

mean()

Y

median()

Y

melt()

P

col_level , ignore_index

memory_usage

N

merge()

P

copy , indicator , sort , validate

min()

Y

mod()

P

axis , fill_value , level

mode()

Y

mul()

P

axis , fill_value , level

multiply()

P

axis , fill_value , level

ne()

P

axis , level

nlargest()

Y

notna()

Y

notnull()

Y

nsmallest()

Y

nunique()

Y

pad()

P

downcast

pct_change()

P

fill_method , freq , limit

pipe()

Y

pivot()

Y

pivot_table()

P

dropna , margins , margins_name , observed , sort

pop()

Y

pow()

P

axis , fill_value , level

prod()

Y

product()

Y

quantile()

P

interpolation , method

query()

Y

radd()

P

axis , fill_value , level

rank()

P

axis , na_option , pct

rdiv()

P

axis , fill_value , level

reindex()

P

level , limit , method , tolerance

reindex_like()

P

limit , method , tolerance

rename()

P

copy

rename_axis()

P

copy

reorder_levels

N

replace()

Y

resample()

P

axis , convention , group_keys , kind , level and more. See pandas.DataFrame.resample and pyspark.pandas.DataFrame.resample for details.

reset_index()

P

allow_duplicates , names

rfloordiv()

P

axis , fill_value , level

rmod()

P

axis , fill_value , level

rmul()

P

axis , fill_value , level

rolling()

P

axis , center , closed , method , on and more. See pandas.DataFrame.rolling and pyspark.pandas.DataFrame.rolling for details.

round()

Y

rpow()

P

axis , fill_value , level

rsub()

P

axis , fill_value , level

rtruediv()

P

axis , fill_value , level

sample()

P

axis , weights

select_dtypes()

Y

sem()

Y

set_axis

N

set_flags

N

set_index()

P

verify_integrity

shift()

P

axis , freq , suffix

skew()

Y

sort_index()

P

key , sort_remaining

sort_values()

P

axis , key , kind

squeeze()

Y

stack()

P

dropna , future_stack , level , sort

std()

Y

sub()

P

axis , fill_value , level

subtract()

P

axis , fill_value , level

sum()

Y

swapaxes()

P

axis1 , axis2

swaplevel()

Y

tail()

Y

take()

Y

to_clipboard()

Y

to_csv()

P

chunksize , compression , decimal , doublequote , encoding and more. See pandas.DataFrame.to_csv and pyspark.pandas.DataFrame.to_csv for details.

to_dict()

P

index

to_excel()

P

engine_kwargs , storage_options

to_feather()

Y

to_gbq

N

to_hdf()

Y

to_html()

P

encoding

to_json()

P

date_format , date_unit , default_handler , double_precision , force_ascii and more. See pandas.DataFrame.to_json and pyspark.pandas.DataFrame.to_json for details.

to_latex()

P

caption , label , position

to_markdown()

P

index , storage_options

to_numpy()

P

copy , dtype , na_value

to_orc()

P

engine , engine_kwargs , index

to_parquet()

P

engine , index , storage_options

to_period

N

to_pickle

N

to_records()

Y

to_sql

N

to_stata()

Y

to_string()

P

encoding , max_colwidth , min_rows

to_timestamp

N

to_xarray

N

to_xml

N

transform()

Y

transpose()

P

copy

truediv()

P

axis , fill_value , level

truncate()

Y

tz_convert

N

tz_localize

N

unstack()

P

fill_value , level , sort

update()

P

errors , filter_func

value_counts

N

var()

P

skipna

where()

P

inplace , level

xs()

P

drop_level

DatetimeIndex API

API

Implemented

Missing parameters

all()

Y

any()

Y

append()

Y

argmax()

P

axis , skipna

argmin()

P

axis , skipna

argsort

N

as_unit

N

asof()

Y

asof_locs

N

astype()

P

copy

ceil()

Y

copy()

Y

day_name()

Y

delete()

Y

diff

N

difference()

Y

drop()

P

errors

drop_duplicates()

Y

droplevel()

Y

dropna()

Y

duplicated

N

equals()

Y

factorize()

Y

fillna()

P

downcast

floor()

Y

format

N

get_indexer

N

get_indexer_for

N

get_indexer_non_unique

N

get_level_values()

Y

get_loc

N

get_slice_bound

N

groupby

N

holds_integer()

Y

identical()

Y

indexer_at_time()

Y

indexer_between_time()

Y

infer_objects

N

insert()

Y

intersection()

P

sort

is_

N

is_boolean()

Y

is_categorical()

Y

is_floating()

Y

is_integer()

Y

is_interval()

Y

is_numeric()

Y

is_object()

Y

isin()

P

level

isna()

Y

isnull()

Y

isocalendar()

Y

item()

Y

join

N

map()

Y

max()

P

axis , skipna

mean

N

memory_usage

N

min()

P

axis , skipna

month_name()

Y

normalize()

Y

notna()

Y

notnull()

Y

nunique()

Y

putmask

N

ravel

N

reindex

N

rename()

Y

repeat()

P

axis

round()

Y

searchsorted

N

set_names()

Y

shift()

P

freq

slice_indexer

N

slice_locs

N

snap

N

sort()

Y

sort_values()

P

key , na_position

sortlevel

N

std

N

strftime()

Y

symmetric_difference()

Y

take()

P

allow_fill , axis , fill_value

to_flat_index

N

to_frame()

Y

to_julian_date

N

to_list()

Y

to_numpy()

P

na_value

to_period

N

to_pydatetime

N

to_series()

P

index

tolist()

Y

transpose()

Y

tz_convert

N

tz_localize

N

union()

Y

unique()

Y

value_counts()

Y

view()

Y

where

N

Index API

API

Implemented

Missing parameters

all()

Y

any()

Y

append()

Y

argmax()

P

axis , skipna

argmin()

P

axis , skipna

argsort

N

asof()

Y

asof_locs

N

astype()

P

copy

copy()

Y

delete()

Y

diff

N

difference()

Y

drop()

P

errors

drop_duplicates()

Y

droplevel()

Y

dropna()

Y

duplicated

N

equals()

Y

factorize()

Y

fillna()

P

downcast

format

N

get_indexer

N

get_indexer_for

N

get_indexer_non_unique

N

get_level_values()

Y

get_loc

N

get_slice_bound

N

groupby

N

holds_integer()

Y

identical()

Y

infer_objects

N

insert()

Y

intersection()

P

sort

is_

N

is_boolean()

Y

is_categorical()

Y

is_floating()

Y

is_integer()

Y

is_interval()

Y

is_numeric()

Y

is_object()

Y

isin()

P

level

isna()

Y

isnull()

Y

item()

Y

join

N

map()

Y

max()

P

axis , skipna

memory_usage

N

min()

P

axis , skipna

notna()

Y

notnull()

Y

nunique()

Y

putmask

N

ravel

N

reindex

N

rename()

Y

repeat()

P

axis

round

N

searchsorted

N

set_names()

Y

shift()

P

freq

slice_indexer

N

slice_locs

N

sort()

Y

sort_values()

P

key , na_position

sortlevel

N

symmetric_difference()

Y

take()

P

allow_fill , axis , fill_value

to_flat_index

N

to_frame()

Y

to_list()

Y

to_numpy()

P

na_value

to_series()

P

index

tolist()

Y

transpose()

Y

union()

Y

unique()

Y

value_counts()

Y

view()

Y

where

N

MultiIndex API

API

Implemented

Missing parameters

all()

Y

any()

Y

append()

Y

argmax()

P

axis , skipna

argmin()

P

axis , skipna

argsort

N

asof()

Y

asof_locs

N

astype()

P

copy

copy()

P

name , names

delete()

Y

diff

N

difference()

Y

drop()

P

errors

drop_duplicates()

Y

droplevel()

Y

dropna()

Y

duplicated

N

equal_levels()

Y

equals()

Y

factorize()

P

use_na_sentinel

fillna()

P

downcast

format

N

get_indexer

N

get_indexer_for

N

get_indexer_non_unique

N

get_level_values()

Y

get_loc

N

get_loc_level

N

get_locs

N

get_slice_bound

N

groupby

N

holds_integer()

Y

identical()

Y

infer_objects

N

insert()

Y

intersection()

P

sort

is_

N

is_boolean()

Y

is_categorical()

Y

is_floating()

Y

is_integer()

Y

is_interval()

Y

is_numeric()

Y

is_object()

Y

isin()

P

level

isna()

Y

isnull()

Y

item()

Y

join

N

map()

Y

max()

P

axis , skipna

memory_usage

N

min()

P

axis , skipna

notna()

Y

notnull()

Y

nunique()

Y

putmask

N

ravel

N

reindex

N

remove_unused_levels

N

rename()

P

level , names

reorder_levels

N

repeat()

P

axis

round

N

searchsorted

N

set_codes

N

set_levels

N

set_names()

Y

shift()

P

freq

slice_indexer

N

slice_locs

N

sort()

Y

sort_values()

P

key , na_position

sortlevel

N

swaplevel()

Y

symmetric_difference()

Y

take()

P

allow_fill , axis , fill_value

to_flat_index

N

to_frame()

P

allow_duplicates

to_list()

Y

to_numpy()

P

na_value

to_series()

P

index

tolist()

Y

transpose()

Y

truncate

N

union()

Y

unique()

Y

value_counts()

Y

view()

Y

where

N

Series API

API

Implemented

Missing parameters

abs()

Y

add()

P

axis , level

add_prefix()

P

axis

add_suffix()

P

axis

agg()

P

axis

aggregate()

P

axis

align()

P

broadcast_axis , fill_axis , fill_value , level , limit and more. See pandas.Series.align and pyspark.pandas.Series.align for details.

all()

P

bool_only

any()

P

bool_only , skipna

apply()

P

by_row , convert_dtype

argmax()

Y

argmin()

Y

argsort()

P

axis , kind , order , stable

asfreq

N

asof()

P

subset

astype()

P

copy , errors

at_time()

Y

autocorr()

Y

backfill()

P

downcast

between()

Y

between_time()

Y

bfill()

P

downcast , limit_area

bool()

Y

case_when

N

clip()

P

axis

combine

N

combine_first()

Y

compare()

P

align_axis , result_names

convert_dtypes

N

copy()

Y

corr()

Y

count()

Y

cov()

Y

cummax()

P

axis

cummin()

P

axis

cumprod()

P

axis

cumsum()

P

axis

describe()

P

exclude , include

diff()

Y

div()

P

axis , fill_value , level

divide()

P

axis , fill_value , level

divmod()

P

axis , fill_value , level

dot()

Y

drop()

P

axis , errors

drop_duplicates()

P

ignore_index

droplevel()

P

axis

dropna()

P

how , ignore_index

duplicated()

Y

eq()

P

axis , fill_value , level

equals()

Y

ewm()

P

adjust , axis , method , times

expanding()

P

axis , method

explode()

P

ignore_index

factorize()

Y

ffill()

P

downcast , limit_area

fillna()

P

downcast

filter()

Y

first()

Y

first_valid_index()

Y

floordiv()

P

axis , fill_value , level

ge()

P

axis , fill_value , level

get()

Y

groupby()

P

group_keys , level , observed , sort

gt()

P

axis , fill_value , level

head()

Y

hist()

P

ax , backend , by , figsize , grid and more. See pandas.Series.hist and pyspark.pandas.Series.hist for details.

idxmax()

P

axis

idxmin()

P

axis

infer_objects

N

info

N

interpolate()

P

axis , downcast , inplace

isin()

Y

isna()

Y

isnull()

Y

item()

Y

items()

Y

keys()

Y

kurt()

Y

kurtosis()

Y

last()

Y

last_valid_index()

Y

le()

P

axis , fill_value , level

lt()

P

axis , fill_value , level

map()

Y

mask()

P

axis , inplace , level

max()

Y

mean()

Y

median()

Y

memory_usage

N

min()

Y

mod()

P

axis , fill_value , level

mode()

Y

mul()

P

axis , fill_value , level

multiply()

P

axis , fill_value , level

ne()

P

axis , fill_value , level

nlargest()

P

keep

notna()

Y

notnull()

Y

nsmallest()

P

keep

nunique()

Y

pad()

P

downcast

pct_change()

P

fill_method , freq , limit

pipe()

Y

pop()

Y

pow()

P

axis , fill_value , level

prod()

Y

product()

Y

quantile()

P

interpolation

radd()

P

axis , level

rank()

P

axis , na_option , pct

ravel

N

rdiv()

P

axis , fill_value , level

rdivmod()

P

axis , fill_value , level

reindex()

P

axis , copy , level , limit , method and more. See pandas.Series.reindex and pyspark.pandas.Series.reindex for details.

reindex_like()

P

copy , limit , method , tolerance

rename()

P

axis , copy , errors , inplace , level

rename_axis()

P

axis , copy

reorder_levels

N

repeat()

P

axis

replace()

P

inplace , limit , method

resample()

P

axis , convention , group_keys , kind , level and more. See pandas.Series.resample and pyspark.pandas.Series.resample for details.

reset_index()

P

allow_duplicates

rfloordiv()

P

axis , fill_value , level

rmod()

P

axis , fill_value , level

rmul()

P

axis , fill_value , level

rolling()

P

axis , center , closed , method , on and more. See pandas.Series.rolling and pyspark.pandas.Series.rolling for details.

round()

Y

rpow()

P

axis , fill_value , level

rsub()

P

axis , fill_value , level

rtruediv()

P

axis , fill_value , level

sample()

P

axis , weights

searchsorted()

P

sorter

sem()

Y

set_axis

N

set_flags

N

shift()

P

axis , freq , suffix

skew()

Y

sort_index()

P

key , sort_remaining

sort_values()

P

axis , key , kind

squeeze()

Y

std()

Y

sub()

P

axis , fill_value , level

subtract()

P

axis , fill_value , level

sum()

Y

swapaxes()

P

axis1 , axis2

swaplevel()

Y

tail()

Y

take()

P

axis

to_clipboard()

Y

to_csv()

P

chunksize , compression , decimal , doublequote , encoding and more. See pandas.Series.to_csv and pyspark.pandas.Series.to_csv for details.

to_dict()

Y

to_excel()

P

engine_kwargs , storage_options

to_frame()

Y

to_hdf()

Y

to_json()

P

date_format , date_unit , default_handler , double_precision , force_ascii and more. See pandas.Series.to_json and pyspark.pandas.Series.to_json for details.

to_latex()

P

caption , label , position

to_list()

Y

to_markdown()

P

index , storage_options

to_numpy()

P

copy , dtype , na_value

to_period

N

to_pickle

N

to_sql

N

to_string()

P

min_rows

to_timestamp

N

to_xarray

N

tolist()

Y

transform()

Y

transpose()

Y

truediv()

P

axis , fill_value , level

truncate()

Y

tz_convert

N

tz_localize

N

unique()

Y

unstack()

P

fill_value , sort

update()

Y

value_counts()

Y

var()

P

skipna

view

N

where()

P

axis , inplace , level

xs()

P

axis , drop_level

TimedeltaIndex API

API

Implemented

Missing parameters

all()

Y

any()

Y

append()

Y

argmax()

P

axis , skipna

argmin()

P

axis , skipna

argsort

N

as_unit

N

asof()

Y

asof_locs

N

astype()

P

copy

ceil

N

copy()

Y

delete()

Y

diff

N

difference()

Y

drop()

P

errors

drop_duplicates()

Y

droplevel()

Y

dropna()

Y

duplicated

N

equals()

Y

factorize()

Y

fillna()

P

downcast

floor

N

format

N

get_indexer

N

get_indexer_for

N

get_indexer_non_unique

N

get_level_values()

Y

get_loc

N

get_slice_bound

N

groupby

N

holds_integer()

Y

identical()

Y

infer_objects

N

insert()

Y

intersection()

P

sort

is_

N

is_boolean()

Y

is_categorical()

Y

is_floating()

Y

is_integer()

Y

is_interval()

Y

is_numeric()

Y

is_object()

Y

isin()

P

level

isna()

Y

isnull()

Y

item()

Y

join

N

map()

Y

max()

P

axis , skipna

mean

N

median

N

memory_usage

N

min()

P

axis , skipna

notna()

Y

notnull()

Y

nunique()

Y

putmask

N

ravel

N

reindex

N

rename()

Y

repeat()

P

axis

round

N

searchsorted

N

set_names()

Y

shift()

P

freq

slice_indexer

N

slice_locs

N

sort()

Y

sort_values()

P

key , na_position

sortlevel

N

std

N

sum

N

symmetric_difference()

Y

take()

P

allow_fill , axis , fill_value

to_flat_index

N

to_frame()

Y

to_list()

Y

to_numpy()

P

na_value

to_pytimedelta

N

to_series()

P

index

tolist()

Y

total_seconds

N

transpose()

Y

union()

Y

unique()

Y

value_counts()

Y

view()

Y

where

N

General Function API

API

Implemented

Missing parameters

array

N

bdate_range

N

concat()

P

copy , keys , levels , names , verify_integrity

crosstab

N

cut

N

date_range()

P

unit

eval

N

factorize

N

from_dummies

N

get_dummies()

Y

infer_freq

N

interval_range

N

isna()

Y

isnull()

Y

json_normalize

N

lreshape

N

melt()

P

col_level , ignore_index

merge()

P

copy , indicator , left , sort , validate

merge_asof()

Y

merge_ordered

N

notna()

Y

notnull()

Y

period_range

N

pivot

N

pivot_table

N

qcut

N

read_clipboard()

P

dtype_backend

read_csv()

P

cache_dates , chunksize , compression , converters , date_format and more. See pandas.read_csv and pyspark.pandas.read_csv for details.

read_excel()

P

date_format , decimal , dtype_backend , engine_kwargs , na_filter and more. See pandas.read_excel and pyspark.pandas.read_excel for details.

read_feather

N

read_fwf

N

read_gbq

N

read_hdf

N

read_html()

P

dtype_backend , extract_links , storage_options

read_json()

P

chunksize , compression , convert_axes , convert_dates , date_unit and more. See pandas.read_json and pyspark.pandas.read_json for details.

read_orc()

P

dtype_backend , filesystem

read_parquet()

P

dtype_backend , engine , filesystem , filters , storage_options and more. See pandas.read_parquet and pyspark.pandas.read_parquet for details.

read_pickle

N

read_sas

N

read_spss

N

read_sql()

P

chunksize , coerce_float , dtype , dtype_backend , params and more. See pandas.read_sql and pyspark.pandas.read_sql for details.

read_sql_query()

P

chunksize , coerce_float , dtype , dtype_backend , params and more. See pandas.read_sql_query and pyspark.pandas.read_sql_query for details.

read_sql_table()

P

chunksize , coerce_float , dtype_backend , parse_dates

read_stata

N

read_table()

P

cache_dates , chunksize , comment , compression , converters and more. See pandas.read_table and pyspark.pandas.read_table for details.

read_xml

N

set_eng_float_format

N

show_versions

N

test

N

timedelta_range()

P

unit

to_datetime()

P

cache , dayfirst , exact , utc , yearfirst

to_numeric()

P

downcast , dtype_backend

to_pickle

N

to_timedelta()

Y

unique

N

value_counts

N

wide_to_long

N

Expanding API

| API | Implemented | Missing parameters |
|---|---|---|
| agg | N | |
| aggregate | N | |
| apply | N | |
| corr | N | |
| count() | P | numeric_only |
| cov | N | |
| kurt() | P | numeric_only |
| max() | P | engine, engine_kwargs, numeric_only |
| mean() | P | engine, engine_kwargs, numeric_only |
| median | N | |
| min() | P | engine, engine_kwargs, numeric_only |
| quantile() | P | interpolation, numeric_only, q |
| rank | N | |
| sem | N | |
| skew() | P | numeric_only |
| std() | P | ddof, engine, engine_kwargs, numeric_only |
| sum() | P | engine, engine_kwargs, numeric_only |
| var() | P | ddof, engine, engine_kwargs, numeric_only |

ExpandingGroupby API

| API | Implemented | Missing parameters |
|---|---|---|
| agg | N | |
| aggregate | N | |
| apply | N | |
| corr | N | |
| count() | P | numeric_only |
| cov | N | |
| kurt() | P | numeric_only |
| max() | P | engine, engine_kwargs, numeric_only |
| mean() | P | engine, engine_kwargs, numeric_only |
| median | N | |
| min() | P | engine, engine_kwargs, numeric_only |
| quantile() | P | interpolation, numeric_only, q |
| rank | N | |
| sem | N | |
| skew() | P | numeric_only |
| std() | P | ddof, engine, engine_kwargs, numeric_only |
| sum() | P | engine, engine_kwargs, numeric_only |
| var() | P | ddof, engine, engine_kwargs, numeric_only |

Rolling API

| API | Implemented | Missing parameters |
|---|---|---|
| agg | N | |
| aggregate | N | |
| apply | N | |
| corr | N | |
| count() | P | numeric_only |
| cov | N | |
| kurt() | P | numeric_only |
| max() | P | engine, engine_kwargs, numeric_only |
| mean() | P | engine, engine_kwargs, numeric_only |
| median | N | |
| min() | P | engine, engine_kwargs, numeric_only |
| quantile() | P | interpolation, numeric_only, q |
| rank | N | |
| sem | N | |
| skew() | P | numeric_only |
| std() | P | ddof, engine, engine_kwargs, numeric_only |
| sum() | P | engine, engine_kwargs, numeric_only |
| var() | P | ddof, engine, engine_kwargs, numeric_only |

RollingGroupby API

| API | Implemented | Missing parameters |
|---|---|---|
| agg | N | |
| aggregate | N | |
| apply | N | |
| corr | N | |
| count() | P | numeric_only |
| cov | N | |
| kurt() | P | numeric_only |
| max() | P | engine, engine_kwargs, numeric_only |
| mean() | P | engine, engine_kwargs, numeric_only |
| median | N | |
| min() | P | engine, engine_kwargs, numeric_only |
| quantile() | P | interpolation, numeric_only, q |
| rank | N | |
| sem | N | |
| skew() | P | numeric_only |
| std() | P | ddof, engine, engine_kwargs, numeric_only |
| sum() | P | engine, engine_kwargs, numeric_only |
| var() | P | ddof, engine, engine_kwargs, numeric_only |

Window API

| API | Implemented | Missing parameters |
|---|---|---|
| agg | N | |
| aggregate | N | |
| mean | N | |
| std | N | |
| sum | N | |
| var | N | |

DataFrameGroupBy API

API

Implemented

Missing parameters

agg()

P

engine , engine_kwargs , func

aggregate()

P

engine , engine_kwargs , func

all()

Y

any()

P

skipna

apply()

P

include_groups

bfill()

Y

boxplot

N

corr()

Y

corrwith

N

count()

Y

cov

N

cumcount()

Y

cummax()

P

axis , numeric_only

cummin()

P

axis , numeric_only

cumprod()

P

axis

cumsum()

P

axis

describe()

P

exclude , include , percentiles

diff()

P

axis

ewm()

Y

expanding()

Y

ffill()

Y

fillna()

P

downcast

filter()

P

dropna

first()

P

skipna

get_group()

P

obj

head()

Y

hist

N

idxmax()

P

axis , numeric_only

idxmin()

P

axis , numeric_only

last()

P

skipna

max()

P

engine , engine_kwargs

mean()

P

engine , engine_kwargs

median()

Y

min()

P

engine , engine_kwargs

ngroup

N

nunique()

Y

ohlc

N

pct_change

N

pipe

N

prod()

Y

quantile()

P

interpolation , numeric_only

rank()

P

axis , na_option , pct

resample

N

rolling()

Y

sample

N

sem()

P

numeric_only

shift()

P

axis , freq , suffix

size()

Y

skew()

P

axis , numeric_only , skipna

std()

P

engine , engine_kwargs , numeric_only

sum()

P

engine , engine_kwargs

tail()

Y

take

N

transform()

P

engine , engine_kwargs

value_counts

N

var()

P

engine , engine_kwargs

GroupBy API

API

Implemented

Missing parameters

agg()

P

func

aggregate()

P

func

all()

Y

any()

P

skipna

apply()

P

include_groups

bfill()

Y

count()

Y

cumcount()

Y

cummax()

P

axis , numeric_only

cummin()

P

axis , numeric_only

cumprod()

P

axis

cumsum()

P

axis

describe

N

diff()

P

axis

ewm()

Y

expanding()

Y

ffill()

Y

first()

P

skipna

get_group()

P

obj

head()

Y

last()

P

skipna

max()

P

engine , engine_kwargs

mean()

P

engine , engine_kwargs

median()

Y

min()

P

engine , engine_kwargs

ngroup

N

ohlc

N

pct_change

N

pipe

N

prod()

Y

quantile()

P

interpolation , numeric_only

rank()

P

axis , na_option , pct

resample

N

rolling()

Y

sample

N

sem()

P

numeric_only

shift()

P

axis , freq , suffix

size()

Y

std()

P

engine , engine_kwargs , numeric_only

sum()

P

engine , engine_kwargs

tail()

Y

var()

P

engine , engine_kwargs

SeriesGroupBy API

API

Implemented

Missing parameters

agg()

P

engine , engine_kwargs , func

aggregate()

P

engine , engine_kwargs , func

all()

Y

any()

P

skipna

apply()

Y

bfill()

Y

corr

N

count()

Y

cov

N

cumcount()

Y

cummax()

P

axis , numeric_only

cummin()

P

axis , numeric_only

cumprod()

P

axis

cumsum()

P

axis

describe

N

diff()

P

axis

ewm()

Y

expanding()

Y

ffill()

Y

fillna()

P

downcast

filter()

P

dropna

first()

P

skipna

get_group()

P

obj

head()

Y

hist

N

idxmax()

P

axis

idxmin()

P

axis

last()

P

skipna

max()

P

engine , engine_kwargs

mean()

P

engine , engine_kwargs

median()

Y

min()

P

engine , engine_kwargs

ngroup

N

nlargest()

P

keep

nsmallest()

P

keep

nunique()

Y

ohlc

N

pct_change

N

pipe

N

prod()

Y

quantile()

P

interpolation , numeric_only

rank()

P

axis , na_option , pct

resample

N

rolling()

Y

sample

N

sem()

P

numeric_only

shift()

P

axis , freq , suffix

size()

Y

skew()

P

axis , numeric_only , skipna

std()

P

engine , engine_kwargs , numeric_only

sum()

P

engine , engine_kwargs

tail()

Y

take

N

transform()

P

engine , engine_kwargs

unique()

Y

value_counts()

P

bins , normalize

var()

P

engine , engine_kwargs