Find and remove duplicate columns in pandas DataFrames

pandas-dupcol

A lightweight Python utility for finding and removing duplicate columns in pandas DataFrames.

Installation

pip install pandas-dupcol

Features

  • Detect duplicate columns in pandas DataFrames
  • Remove duplicate columns efficiently
  • Uses hashing to shortlist candidate duplicate columns, then verifies each candidate with an equality check
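
The hash-then-verify idea in the last feature can be sketched as follows. This is a minimal illustration, not the package's actual source: the function name find_duplicate_columns_sketch, the use of pandas.util.hash_pandas_object with SHA-256 as the fingerprint, and the choice to report the later of two identical columns are all assumptions made for the example.

```python
import hashlib

import pandas as pd
from pandas.util import hash_pandas_object


def find_duplicate_columns_sketch(df: pd.DataFrame) -> list:
    """Return labels of columns whose values repeat an earlier column.

    Hypothetical helper for illustration; not part of pandas-dupcol.
    """
    seen = {}        # fingerprint -> positions of distinct columns sharing it
    duplicates = []
    for pos in range(df.shape[1]):
        col = df.iloc[:, pos]
        # Cheap fingerprint: hash every value, then digest the resulting bytes.
        row_hashes = hash_pandas_object(col, index=False).to_numpy()
        key = hashlib.sha256(row_hashes.tobytes()).hexdigest()
        bucket = seen.setdefault(key, [])
        # A matching fingerprint only marks a candidate;
        # Series.equals confirms that the values really are identical.
        if any(col.equals(df.iloc[:, earlier]) for earlier in bucket):
            duplicates.append(df.columns[pos])
        else:
            bucket.append(pos)
    return duplicates
```

Only columns that share a fingerprint are compared value by value, so most pairs of columns never need a full comparison. The equality check guards against the rare case where two different columns produce the same fingerprint.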

Usage

import pandas as pd
import pandas_dupcol as pdc

df = pd.DataFrame({
    "A": [1, 2, 3],
    "B": [4, 5, 6],
    "C": [1, 2, 3]
})

duplicates = pdc.find_duplicate_columns(df)

print(duplicates)

Output:

['C']

Remove Duplicate Columns

cleaned_df = pdc.drop_duplicate_columns(df)

print(cleaned_df)

Output:

   A  B
0  1  4
1  2  5
2  3  6
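
To cross-check this result without the package, the same frame can be reduced with plain pandas. The sketch below uses only built-in pandas calls (DataFrame.T and DataFrame.duplicated); the variable names is_repeat and expected are chosen for the example.

```python
import pandas as pd

df = pd.DataFrame({
    "A": [1, 2, 3],
    "B": [4, 5, 6],
    "C": [1, 2, 3],
})

# Transpose so columns become rows, then flag rows that repeat an earlier one.
is_repeat = df.T.duplicated()

# Keep only the columns that were not flagged as repeats.
expected = df.loc[:, ~is_repeat.to_numpy()]
print(expected)
```

Transposing copies the whole frame and can change dtypes when columns have mixed types, so this check is best suited to small frames like the one above.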

Author

Sushil Poudel Chhetri

Download files

Source Distribution

pandas_dupcol-0.1.2.tar.gz (1.9 kB)

Uploaded Source

Built Distribution

pandas_dupcol-0.1.2-py3-none-any.whl (2.4 kB)

Uploaded Python 3

File details

Details for the file pandas_dupcol-0.1.2.tar.gz.

File metadata

  • Download URL: pandas_dupcol-0.1.2.tar.gz
  • Upload date:
  • Size: 1.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.14.0

File hashes

Hashes for pandas_dupcol-0.1.2.tar.gz

Algorithm    Hash digest
SHA256       887f6db1c435f347d4b24b42b5c545b24e60e278d6904a74052ba568935b7b54
MD5          292bfae8eb7aa2dccd43566312b4296c
BLAKE2b-256  af2c47738b4dccd6a500f3be5bec73e0315f0b703d9fa7f5de6575f4d1b1526a

File details

Details for the file pandas_dupcol-0.1.2-py3-none-any.whl.

File metadata

  • Download URL: pandas_dupcol-0.1.2-py3-none-any.whl
  • Upload date:
  • Size: 2.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.14.0

File hashes

Hashes for pandas_dupcol-0.1.2-py3-none-any.whl

Algorithm    Hash digest
SHA256       1a39bd334abfbd8282bd940d19ae92b6fff41cbbf43bfa687c8b6d92759b61f8
MD5          7e1ca22ccfd560137986da8c6f1a9405
BLAKE2b-256  3a7ebe1835f89542c4789669e78c4b923740f4ddcdff922aedb0a1787f2dd743
