Automated Data Cleaning Library

AutomatedCleaning

AutomatedCleaning is a Python library for automated data cleaning. It helps preprocess and analyze datasets by handling missing values, outliers, spelling errors in categorical values, and more.

Features

  • Supports both large (100+ GB) and small datasets
  • Detects and handles missing values and duplicate records
  • Identifies and corrects spelling errors in categorical values
  • Detects outliers
  • Detects and fixes data imbalance
  • Identifies and corrects skewness in numerical data
  • Checks for correlation and detects multicollinearity
  • Analyzes cardinality in categorical columns
  • Identifies and cleans text columns
  • Detects JSON-type columns
  • Detects and masks columns containing PII
  • Performs univariate, bivariate, and multivariate analysis
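
To make two of the features above concrete, here is a minimal standard-library sketch of spelling correction in categorical values and outlier detection. It is not AutomatedCleaning's implementation: the function names `correct_categories` and `iqr_outliers`, the fuzzy-matching approach via `difflib`, and the 1.5 × IQR rule are all assumptions chosen for illustration.

```python
import difflib
import statistics


def correct_categories(values, known, cutoff=0.8):
    """Map each value to its closest known category, or keep it unchanged.

    Hypothetical helper: assumes a list of valid category names is available.
    """
    corrected = []
    for value in values:
        match = difflib.get_close_matches(value, known, n=1, cutoff=cutoff)
        corrected.append(match[0] if match else value)
    return corrected


def iqr_outliers(numbers, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(numbers, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [x for x in numbers if x < low or x > high]


cities = correct_categories(
    ["Londn", "Paris", "Pariss", "Tokyo"],
    known=["London", "Paris", "Tokyo"],
)
outliers = iqr_outliers([10, 12, 11, 13, 12, 11, 95])

print(cities)    # ['London', 'Paris', 'Paris', 'Tokyo']
print(outliers)  # [95]
```

The `cutoff` and `k` parameters control how aggressive each step is. A library that automates these steps has to pick such thresholds for you, so it is worth checking its output on a sample before trusting it on a full dataset.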

Installation

pip install AutomatedCleaning

Usage

import AutomatedCleaning as ac

# Load the dataset from a CSV file
df = ac.load_data("dataset.csv")

# Run the automated cleaning steps on the loaded data
df_cleaned = ac.clean_data(df)
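
To judge what an automated cleaning pass changed, it can help to profile the raw file independently first. The sketch below uses only the standard library and does not call AutomatedCleaning. The helpers `summarize` and `mask_emails`, the email pattern, and the inline sample data are hypothetical and stand in for a real `dataset.csv`.

```python
import csv
import io
import re

# Simplified email pattern for illustration; real PII detection needs more than this.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def summarize(rows):
    """Count empty cells per column and exact duplicate rows."""
    missing = {}
    seen = set()
    duplicates = 0
    for row in rows:
        for column, value in row.items():
            if value is None or value.strip() == "":
                missing[column] = missing.get(column, 0) + 1
        key = tuple(row.items())
        if key in seen:
            duplicates += 1
        seen.add(key)
    return missing, duplicates


def mask_emails(text):
    """Replace anything that looks like an email address with a placeholder."""
    return EMAIL_RE.sub("[EMAIL]", text)


# Inline sample standing in for a CSV file on disk.
sample = "name,email\nAna,ana@example.com\nBob,\nAna,ana@example.com\n"
rows = list(csv.DictReader(io.StringIO(sample)))

missing, duplicates = summarize(rows)
masked = [mask_emails(row["email"]) for row in rows]

print(missing)     # {'email': 1}
print(duplicates)  # 1
print(masked)      # ['[EMAIL]', '', '[EMAIL]']
```

Comparing these counts before and after cleaning gives a quick sanity check on how many rows and cells were affected.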
