Automated Data Cleaning Library

Project description

AutomatedCleaning

AutomatedCleaning is a Python library for automated data cleaning. It helps preprocess and analyze datasets by handling missing values, outliers, spelling errors, and more.

Features

  • Supports both large (100+ GB) and small datasets
  • Detects and handles missing values and duplicate records
  • Identifies and corrects spelling errors in categorical values
  • Detects outliers
  • Detects and fixes data imbalance
  • Identifies and corrects skewness in numerical data
  • Checks for correlation and detects multicollinearity
  • Analyzes cardinality in categorical columns
  • Identifies and cleans text columns
  • Detects JSON-type columns
  • Performs univariate, bivariate, and multivariate analysis
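To make a few of the checks listed above concrete, here is a minimal sketch of duplicate removal, median imputation, and IQR-based outlier flagging. It uses only the Python standard library and is not AutomatedCleaning's implementation; the function names, the sample rows, and the 1.5 × IQR threshold are assumptions chosen for illustration.

```python
import statistics


def drop_duplicates(rows):
    """Return rows with exact duplicates removed, keeping first occurrences."""
    seen = set()
    unique = []
    for row in rows:
        # Sort by column name only, so mixed value types are never compared.
        key = tuple(sorted(row.items(), key=lambda kv: kv[0]))
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


def fill_missing_with_median(rows, column):
    """Replace None in a numeric column with the median of the observed values."""
    observed = [r[column] for r in rows if r[column] is not None]
    median = statistics.median(observed)
    return [{**r, column: median if r[column] is None else r[column]} for r in rows]


def iqr_outliers(values, k=1.5):
    """Return the values that fall outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Hypothetical sample data: one duplicate row, one missing age, one extreme age.
rows = [
    {"id": 1, "age": 31},
    {"id": 2, "age": None},
    {"id": 2, "age": None},
    {"id": 3, "age": 29},
    {"id": 4, "age": 35},
    {"id": 5, "age": 33},
    {"id": 6, "age": 240},
]

deduped = drop_duplicates(rows)
filled = fill_missing_with_median(deduped, "age")
outliers = iqr_outliers([r["age"] for r in filled])
```

Median imputation and the IQR rule are common defaults because neither is pulled far by the extreme values they are meant to handle; which strategies the library itself applies is not stated in this description.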

Installation

pip install AutomatedCleaning

Usage

import AutomatedCleaning as ac

# Load the dataset from a CSV file
df = ac.load_data("dataset.csv")

# Run the automated cleaning steps on the loaded data
df_cleaned = ac.clean_data(df)
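To try the calls above without a real dataset, one option is to generate a small CSV that contains the kinds of problems the library targets. The sketch below uses only the standard library; the column names, the sample values, and the write_sample_csv helper are hypothetical and not part of AutomatedCleaning.

```python
import csv

# Hypothetical rows with deliberate defects for the cleaner to find.
SAMPLE_ROWS = [
    {"id": "1", "city": "London", "age": "31"},
    {"id": "2", "city": "Lodnon", "age": ""},     # misspelled category, missing age
    {"id": "2", "city": "Lodnon", "age": ""},     # exact duplicate of the row above
    {"id": "3", "city": "Paris", "age": "240"},   # implausible value
]


def write_sample_csv(path="dataset.csv"):
    """Write the sample rows to a CSV file and return the path."""
    with open(path, "w", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(fh, fieldnames=["id", "city", "age"])
        writer.writeheader()
        writer.writerows(SAMPLE_ROWS)
    return path
```

Calling write_sample_csv() creates dataset.csv in the working directory, which matches the file name used in the usage snippet.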

Download files

Source Distribution

automatedcleaning-0.1.5.tar.gz (15.1 kB view details)

Uploaded Source

Built Distribution

automatedcleaning-0.1.5-py3-none-any.whl (13.5 kB view details)

Uploaded Python 3

File details

Details for the file automatedcleaning-0.1.5.tar.gz.

File metadata

  • Download URL: automatedcleaning-0.1.5.tar.gz
  • Upload date:
  • Size: 15.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.11.9

File hashes

Hashes for automatedcleaning-0.1.5.tar.gz
Algorithm Hash digest
SHA256 8c1b0f6e772e97cfbc747c72b7af8b9d911f57d20ca6feb98ede19f2a4c61d1c
MD5 568e7d69801fb8212eec9186d68aee1a
BLAKE2b-256 b52aa6a0c3dd7c08e823a21c3a699e432ad62eaf1bab1f0993c3dea37237277b
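A downloaded file can be checked against the SHA256 digest listed above before installing it. The following is a minimal sketch using the standard library's hashlib; the helper names are hypothetical, and the local path in the commented usage is an assumption about where the file was saved.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Return True if the file's SHA256 digest equals the published one."""
    return sha256_of_file(path) == expected_hex.strip().lower()


# Example usage (assumes the archive was saved to the current directory):
# matches_published_hash(
#     "automatedcleaning-0.1.5.tar.gz",
#     "8c1b0f6e772e97cfbc747c72b7af8b9d911f57d20ca6feb98ede19f2a4c61d1c",
# )
```

pip can perform the same check at install time when a requirements file pins hashes and the install is run with --require-hashes.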

File details

Details for the file automatedcleaning-0.1.5-py3-none-any.whl.

File hashes

Hashes for automatedcleaning-0.1.5-py3-none-any.whl
Algorithm Hash digest
SHA256 3c62031a46c6aacb6bc43fd0241db430513e0db139b241a5a14eb5a78022de61
MD5 42053956867f918378716eb467cd0cfd
BLAKE2b-256 0ac7263d152ccf03995eec86f6104824de9d1cd0aa22892e7babd16fc65b1b73
