DataFrame.drop_duplicates
Return DataFrame with duplicate rows removed, optionally only considering certain columns.
subset: Only consider certain columns for identifying duplicates; by default, all columns are used.
keep: Determines which duplicates (if any) to keep.
- first : Drop duplicates except for the first occurrence.
- last : Drop duplicates except for the last occurrence.
- False : Drop all duplicates.
inplace: Whether to drop duplicates in place or to return a copy.
ignore_index: If True, the resulting axis will be labeled 0, 1, …, n - 1.
DataFrame with duplicates removed, or None if inplace=True.
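To make the parameter semantics concrete, here is a minimal plain-Python sketch of how the duplicate-selection rules described above behave. It is an illustration only, not the library's implementation: the function name `drop_duplicate_rows`, the `(label, record)` row representation, and the sample data are all hypothetical, and the in-place option is not modeled.

```python
from collections import Counter


def drop_duplicate_rows(rows, subset=None, keep="first", ignore_index=False):
    """Hypothetical model of the duplicate-dropping rules.

    `rows` is a list of (label, record) pairs; each record is a dict mapping
    a column name to a hashable value. All records are assumed to share the
    same columns.
    """
    if isinstance(subset, str):
        subset = [subset]  # allow a single column name, as in drop_duplicates('a')

    def key(record):
        # The columns that decide whether two rows count as duplicates.
        cols = sorted(record) if subset is None else subset
        return tuple(record[c] for c in cols)

    if keep is False:
        # Drop every row whose key occurs more than once.
        counts = Counter(key(rec) for _, rec in rows)
        kept = [(lbl, rec) for lbl, rec in rows if counts[key(rec)] == 1]
    elif keep in ("first", "last"):
        # For 'last', scan backwards so the final occurrence is seen first.
        ordered = rows if keep == "first" else list(reversed(rows))
        seen = set()
        kept = []
        for lbl, rec in ordered:
            k = key(rec)
            if k not in seen:
                seen.add(k)
                kept.append((lbl, rec))
        if keep == "last":
            kept.reverse()  # restore the original row order
    else:
        raise ValueError("keep must be 'first', 'last', or False")

    if ignore_index:
        # Relabel the surviving rows 0, 1, ..., n - 1.
        kept = [(i, rec) for i, (_, rec) in enumerate(kept)]
    return kept


# Made-up sample data: labels 1 and 2 are identical in both columns.
rows = [
    (0, {"a": 1, "b": "x"}),
    (1, {"a": 2, "b": "x"}),
    (2, {"a": 2, "b": "x"}),
    (3, {"a": 2, "b": "y"}),
    (4, {"a": 3, "b": "z"}),
]


def labels(result):
    """Return only the index labels of a result, for easy inspection."""
    return [lbl for lbl, _ in result]
```

With this sample, `labels(drop_duplicate_rows(rows))` keeps labels 0, 1, 3, 4, while `keep="last"` keeps 0, 2, 3, 4 and `keep=False` keeps only 0, 3, 4, since both copies of the repeated row are dropped.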
>>> df = ps.DataFrame( ..
>>> df
   a  b
>>> df.drop_duplicates().sort_index()
   a  b
>>> df.drop_duplicates(ignore_index=True).sort_index()
   a  b
>>> df.drop_duplicates('a').sort_index()
   a  b
>>> df.drop_duplicates(['a', 'b']).sort_index()
   a  b
>>> df.drop_duplicates(keep='last').sort_index()
   a  b
>>> df.drop_duplicates(keep=False).sort_index()
   a  b
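The calls above can be tried end to end with the sketch below. It assumes plain pandas (imported as `pd`) rather than the `ps` module used above, on the assumption that both expose `drop_duplicates` with the same `keep` and `ignore_index` arguments. The data is made up, because the original frame's contents are elided. pandas preserves row order, so the `.sort_index()` calls are omitted here.

```python
import pandas as pd

# Hypothetical data: the contents of the original frame are not shown above.
df = pd.DataFrame({"a": [1, 2, 2, 2, 3], "b": ["x", "x", "x", "y", "z"]})

# All columns considered: row 2 repeats row 1 and is dropped.
all_cols = df.drop_duplicates()

# Only column 'a' considered: rows 2 and 3 repeat a == 2.
by_a = df.drop_duplicates("a")

# keep='last' retains the final occurrence in each group of duplicates.
last = df.drop_duplicates(keep="last")

# keep=False drops every row that has a duplicate.
none_kept = df.drop_duplicates(keep=False)

# ignore_index=True relabels the result 0, 1, ..., n - 1.
relabeled = df.drop_duplicates(ignore_index=True)

print(all_cols, by_a, last, none_kept, relabeled, sep="\n\n")
```

Comparing the index labels of each result shows which occurrence survived: the default keeps the earlier of the two identical rows, `keep="last"` keeps the later one, and `keep=False` keeps neither.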