Drops duplicates in pandas DataFrames with awkward dtypes (for example, columns that hold lists)
Tested against Windows / Python 3.11 / Anaconda
pip install dropduplicatesplanb
import pandas as pd
from dropduplicatesplanb import pd_add_drop_duplicates_planB

# Register the d_drop_duplicates_planB method on pandas DataFrames
pd_add_drop_duplicates_planB()

df = pd.read_csv(
    "https://raw.githubusercontent.com/pandas-dev/pandas/main/doc/data/titanic.csv"
)

# Add a column of lists; the built-in drop_duplicates cannot hash such values
df["baba"] = df.Embarked.apply(lambda q: [q, q, q, q])
df.loc[0, "baba"] = [[[1, 2, 34, 4, 2, 2, 34, 2, 1]]]
df.loc[1, "baba"] = [[[1, 2, 34, 4, 2, 2, 34, 2, 1]]]

# Concatenate the frame with itself so that every row appears twice
df = pd.concat([df for _ in range(2)], ignore_index=True)

# Deduplicate on the list column, on two regular columns, and on all columns
df21 = df.d_drop_duplicates_planB(subset="baba")
df32 = df.d_drop_duplicates_planB(subset=["PassengerId", "Survived"])
df43 = df.d_drop_duplicates_planB(subset=["PassengerId", "Survived"], keep="first")
df54 = df.d_drop_duplicates_planB()

print(df)
print(df21)
print(df32)
print(df43)
print(df54)
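For comparison, the general idea of deduplicating columns that hold unhashable values can be sketched with plain pandas. This is a minimal illustration, not the package's implementation: the helper name `drop_duplicates_unhashable` and the choice of `repr()` as a hashable stand-in for each cell are assumptions made for this example.

```python
import pandas as pd


def drop_duplicates_unhashable(df, subset=None, keep="first"):
    """Drop duplicate rows even when columns hold unhashable values such as lists.

    Hypothetical helper for illustration; not part of dropduplicatesplanb.
    """
    if subset is None:
        cols = list(df.columns)
    elif isinstance(subset, str):
        cols = [subset]
    else:
        cols = list(subset)
    # Build a hashable string key for every cell with repr(); the original
    # values in df are left untouched.
    keys = df[cols].apply(lambda col: col.map(repr))
    # Keep only the rows whose key combination is not flagged as a duplicate.
    return df.loc[~keys.duplicated(keep=keep)]


df = pd.DataFrame(
    {
        "id": [1, 2, 1, 3],
        "tags": [["a", "b"], ["a", "b"], ["a", "b"], ["c"]],
    }
)
by_tags = drop_duplicates_unhashable(df, subset="tags")
by_all = drop_duplicates_unhashable(df)
print(by_tags)
print(by_all)
```

Because the comparison runs on `repr()` strings, values that are equal but print differently (such as `1` and `1.0`) are treated as distinct, so this approach suits columns whose values have a stable, canonical representation.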