QuickClean

One-line intelligent data cleaning for pandas DataFrames.

import quickclean as qc
df_clean = qc.clean(df)

Lighter than pyjanitor. Smarter than cleaning by hand in pandas.

Why QuickClean?

| Metric | QuickClean | pandas (manual) |
| --- | --- | --- |
| Lines of code | 2 | 30 |
| Time (100K rows) | 0.26s | 0.14s |
| Missing % left | 0.0% | 0.0% |
| Outliers handled | yes | manual |
| Auto dtype fix | yes | manual |
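
For context, the "pandas (manual)" column corresponds to hand-written steps along the following lines. This is only an illustrative sketch, not the code behind the numbers above: `manual_clean` and the sample data are hypothetical, and the 1.5 × IQR capping rule and median/mode imputation are assumed conventions rather than QuickClean's documented behavior.

```python
import pandas as pd
from pandas.api import types as ptypes


def manual_clean(df: pd.DataFrame) -> pd.DataFrame:
    """Hand-written baseline: dedupe, fix dtypes, impute, cap outliers."""
    out = df.drop_duplicates().copy()

    for col in out.columns:
        s = out[col]

        # Dtype fix: promote text columns whose non-missing values all parse as numbers.
        if ptypes.is_object_dtype(s) or ptypes.is_string_dtype(s):
            converted = pd.to_numeric(s, errors="coerce").astype("float64")
            if converted.notna().any() and converted.notna().sum() == s.notna().sum():
                s = converted

        if ptypes.is_numeric_dtype(s) and not ptypes.is_bool_dtype(s):
            # Missing values: median imputation for numeric columns.
            s = s.fillna(s.median())
            # Outliers: cap values at the 1.5 * IQR fences (assumed rule).
            q1, q3 = s.quantile(0.25), s.quantile(0.75)
            iqr = q3 - q1
            s = s.clip(lower=q1 - 1.5 * iqr, upper=q3 + 1.5 * iqr)
        else:
            # Missing values: mode imputation for non-numeric columns.
            modes = s.mode(dropna=True)
            if not modes.empty:
                s = s.fillna(modes.iloc[0])

        out[col] = s

    return out.reset_index(drop=True)


# Hypothetical messy input: numeric strings, gaps, a duplicate row, one outlier.
raw = pd.DataFrame(
    {
        "age": ["25", "31", None, "31", "29", "27", "400"],
        "city": ["Jakarta", "Bandung", "Jakarta", "Bandung", None, "Jakarta", "Bandung"],
    }
)
cleaned = manual_clean(raw)
print(cleaned)
```

Even this reduced version needs a per-column loop and several dtype checks, which is the boilerplate a single `qc.clean(df)` call is meant to replace.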

Installation

pip install quickclean

# With ML-powered imputation (quotes keep shells such as zsh from expanding the brackets)
pip install "quickclean[smart]"

# Everything
pip install "quickclean[full]"

Usage

import quickclean as qc

# One line (df is an existing pandas DataFrame)
df_clean = qc.clean(df)

# With options
df_clean = qc.clean(
    df,
    strategy="smart",       # one of 'fast', 'smart', 'aggressive'
    aggressiveness=0.5,     # float in the range 0.0–1.0
    verbose=True,           # print a cleaning report
    preview=False,          # if True, return only the analysis dict
)

What gets cleaned automatically

  • Missing values — smart imputation (median/mode/KNN/iterative)
  • Outliers — adaptive detection + cap/remove/impute
  • Duplicates — global + subset aware
  • Data types — inference & auto-correction
  • Formatting — snake_case columns, string normalization, date parsing, numeric string conversion
  • Categories — fuzzy harmonization ("Jakarta" = "JAKARTA")
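
To make the formatting and category bullets concrete, here is a small standard-library sketch of two of the ideas: snake_case column names and harmonizing label variants. The helper names `to_snake_case` and `harmonize` are hypothetical, and the matching shown is a simplified stand-in (case- and whitespace-insensitive only) rather than true fuzzy matching; QuickClean's own implementation may differ.

```python
import re
from collections import Counter


def to_snake_case(name: str) -> str:
    """Convert a label such as 'Total Sales' or 'totalSales' to snake_case."""
    name = re.sub(r"([a-z0-9])([A-Z])", r"\1_\2", name.strip())  # split camelCase
    name = re.sub(r"[^0-9A-Za-z]+", "_", name)  # runs of other characters -> "_"
    return name.strip("_").lower()


def _key(value: str) -> str:
    """Comparison key that ignores case and extra whitespace."""
    return " ".join(value.split()).casefold()


def harmonize(values: list[str]) -> list[str]:
    """Replace each variant of a label with its most frequent spelling."""
    spellings: dict[str, Counter] = {}
    for value in values:
        spellings.setdefault(_key(value), Counter())[value] += 1
    canonical = {k: c.most_common(1)[0][0] for k, c in spellings.items()}
    return [canonical[_key(value)] for value in values]


columns = [to_snake_case(c) for c in ["Total Sales", "totalSales", " Unit-Price (USD) "]]
cities = harmonize(["Jakarta", "JAKARTA", "jakarta ", "Jakarta", "Bandung"])
print(columns)
print(cities)
```

Picking the most frequent spelling as the canonical form is one reasonable design choice; a fuzzy matcher would additionally merge near-misses such as typos.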

Performance

| Rows | Time | Throughput |
| --- | --- | --- |
| 10K | 0.04s | 273K rows/s |
| 100K | 0.24s | 412K rows/s |
| 1M | 2.40s | 416K rows/s |
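
Throughput here is simply rows divided by elapsed seconds. The sketch below shows one way such figures could be measured; `time_call` and `rows_per_second` are hypothetical helpers, and `sorted` on a list stands in for the cleaning call so the example runs without QuickClean installed.

```python
import time
from typing import Any, Callable


def rows_per_second(n_rows: int, seconds: float) -> float:
    """Throughput as rows processed per second."""
    return n_rows / seconds


def time_call(fn: Callable[..., Any], *args: Any, repeats: int = 3) -> float:
    """Return the best wall-clock time in seconds over `repeats` calls."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args)
        best = min(best, time.perf_counter() - start)
    return best


# Stand-in workload; in practice one would pass qc.clean and a DataFrame instead.
data = list(range(100_000, 0, -1))
elapsed = time_call(sorted, data)
print(f"{len(data)} rows in {elapsed:.4f}s")

# Arithmetic check against the 1M-row entry: 1,000,000 rows / 2.40 s.
print(round(rows_per_second(1_000_000, 2.40)))
```

Dividing the rounded times in the table reproduces the listed throughput only approximately, presumably because the published figures were computed from unrounded timings.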

Requirements

  • Python >= 3.10
  • pandas >= 1.5.0
  • numpy >= 1.21.0

Optional: scikit-learn (smart imputation), rapidfuzz (fuzzy matching), dateparser (date parsing)
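
Because these dependencies are optional, it can be useful to check which ones are importable in the current environment. The following is a minimal sketch using only the standard library; the `OPTIONAL_MODULES` mapping and `available_features` helper are hypothetical and not part of QuickClean.

```python
from importlib.util import find_spec

# Import names can differ from PyPI names (scikit-learn is imported as sklearn).
OPTIONAL_MODULES = {
    "smart imputation": "sklearn",
    "fuzzy matching": "rapidfuzz",
    "date parsing": "dateparser",
}


def available_features() -> dict[str, bool]:
    """Report which optional features have their backing module installed."""
    return {
        feature: find_spec(module) is not None
        for feature, module in OPTIONAL_MODULES.items()
    }


for feature, installed in available_features().items():
    print(f"{feature}: {'available' if installed else 'missing'}")
```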

Download files

Source Distribution

quickclean-0.1.0.tar.gz (8.4 kB)

Built Distribution

quickclean-0.1.0-py3-none-any.whl (11.7 kB)

File details

Details for the file quickclean-0.1.0.tar.gz.

File metadata

  • Download URL: quickclean-0.1.0.tar.gz
  • Upload date:
  • Size: 8.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for quickclean-0.1.0.tar.gz
| Algorithm | Hash digest |
| --- | --- |
| SHA256 | 5d131aefc92d1ba36d6ccde84a0ec2649e35bc8b8305093555631f516206a311 |
| MD5 | dd421a3da1bae74a14ba49f015a0c1a8 |
| BLAKE2b-256 | 9df04ae00871c586a0e4ffa11627f9d3ba4d15500ac49a5e9e00deea9d67a029 |
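
A downloaded file can be checked against the SHA256 digest listed above with the standard library. This is a minimal sketch; `sha256_of` and `matches` are hypothetical helper names, and the local file path is whatever location the archive was saved to.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path: Path, expected_hex: str) -> bool:
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of(path) == expected_hex.strip().lower()


# Example call (not executed here, since it needs the downloaded archive):
# matches(Path("quickclean-0.1.0.tar.gz"),
#         "5d131aefc92d1ba36d6ccde84a0ec2649e35bc8b8305093555631f516206a311")
```

pip can perform the same check at install time when a requirements file pins hashes and `--require-hashes` is used.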

Provenance

The following attestation bundles were made for quickclean-0.1.0.tar.gz:

Publisher: publish.yml on alphariz/quickclean

File details

Details for the file quickclean-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: quickclean-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 11.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for quickclean-0.1.0-py3-none-any.whl
| Algorithm | Hash digest |
| --- | --- |
| SHA256 | 638093ea10d3c022d7af6ab434252685636d7c4534e48af249a0177b1e2e5da2 |
| MD5 | 6913a14789d8f8749062b96092ded2b7 |
| BLAKE2b-256 | c246948bdfcb8a9dd4c7e2f0767abd8870a04623f475a6e79b3ab3ae6e755542 |

Provenance

The following attestation bundles were made for quickclean-0.1.0-py3-none-any.whl:

Publisher: publish.yml on alphariz/quickclean
