Automated Data Cleaning Library

Project description

AutomatedCleaning

AutomatedCleaning is a Python library for automated data cleaning. It helps preprocess and analyze datasets by handling missing values, outliers, spelling errors, and more.

Features

  • Supports both large (100+ GB) and small datasets
  • Detects and handles missing values and duplicate records
  • Identifies and corrects spelling errors in categorical values
  • Detects outliers
  • Detects and fixes data imbalance
  • Identifies and corrects skewness in numerical data
  • Checks for correlation and detects multicollinearity
  • Analyzes cardinality in categorical columns
  • Identifies and cleans text columns
  • Detects JSON-type columns
  • Performs univariate, bivariate, and multivariate analysis
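
To make a few of the checks listed above concrete, here is a minimal sketch in plain Python of counting missing values, dropping duplicate records, and flagging outliers with the interquartile range (IQR) rule. The helper names (`count_missing`, `drop_duplicates`, `iqr_outliers`) are hypothetical and exist only for illustration. This is not AutomatedCleaning's implementation, and the library may use different techniques.

```python
import statistics


def count_missing(rows, column):
    """Count rows where the column is absent, None, or an empty string."""
    return sum(1 for row in rows if row.get(column) in (None, ""))


def drop_duplicates(rows):
    """Remove exact duplicate records, keeping the first occurrence."""
    seen = set()
    unique = []
    for row in rows:
        # A sorted tuple of items is a hashable fingerprint of the record.
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


def iqr_outliers(values, k=1.5):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


if __name__ == "__main__":
    records = [
        {"age": 30, "city": "Pune"},
        {"age": None, "city": "Pune"},
        {"age": 30, "city": "Pune"},
    ]
    print(count_missing(records, "age"))
    print(len(drop_duplicates(records)))
    print(iqr_outliers([10, 11, 12, 13, 14, 15, 100]))
```

The IQR rule is one common convention for outlier detection; the threshold `k=1.5` is an assumption chosen for the sketch.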

Installation

pip install AutomatedCleaning

Usage

import AutomatedCleaning as ac

# Load the dataset from a CSV file
df = ac.load_data("dataset.csv")
# Run the automated cleaning steps on the loaded data
df_cleaned = ac.clean_data(df)
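
The usage above expects a file named dataset.csv. To try it without a dataset at hand, the following sketch writes a small sample file using only the standard library. The helper name `write_sample_csv`, the column names, and the values are all made up for illustration. The rows deliberately include a blank value, a repeated record, a misspelled category, and an implausible number, but how the library treats each of these is not specified here.

```python
import csv


def write_sample_csv(path="dataset.csv"):
    """Write a tiny, intentionally messy CSV file and return the row count."""
    rows = [
        {"name": "Alice", "age": "34", "city": "London"},
        {"name": "Bob", "age": "", "city": "Londn"},        # blank age, misspelled city
        {"name": "Alice", "age": "34", "city": "London"},   # exact duplicate of row 1
        {"name": "Carol", "age": "290", "city": "Paris"},   # implausible age
    ]
    with open(path, "w", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(fh, fieldnames=["name", "age", "city"])
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)


if __name__ == "__main__":
    print(write_sample_csv())
```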

Download files

Source Distribution

AutomatedCleaning-0.1.7.tar.gz (14.4 kB view details)

Uploaded Source

Built Distribution

AutomatedCleaning-0.1.7-py3-none-any.whl (13.4 kB view details)

Uploaded Python 3

File details

Details for the file AutomatedCleaning-0.1.7.tar.gz.

File metadata

  • Download URL: AutomatedCleaning-0.1.7.tar.gz
  • Size: 14.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.11.9

File hashes

Hashes for AutomatedCleaning-0.1.7.tar.gz
Algorithm Hash digest
SHA256 1b681639eb7ebb7f4c2a4de6c5a7272a0e59e74210973145eb131e23b3ac1a8d
MD5 3eaee523b07b55876f628a8b32c84f25
BLAKE2b-256 aa897c9bf32b462bd1b94d5c1bdee97f8c3b32da6a6580b79bff0aec562356a9
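
A downloaded file can be checked against the SHA256 digest listed above. The following is a minimal sketch using Python's standard `hashlib` module. The helper names `sha256_of_file` and `matches_published_hash` are hypothetical, and the file is assumed to have been downloaded to the local path passed in.

```python
import hashlib

# SHA256 digest copied from the hash table for AutomatedCleaning-0.1.7.tar.gz.
EXPECTED_SHA256 = "1b681639eb7ebb7f4c2a4de6c5a7272a0e59e74210973145eb131e23b3ac1a8d"


def sha256_of_file(path, chunk_size=65536):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex=EXPECTED_SHA256):
    """Return True if the file's SHA256 digest equals the expected digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, `matches_published_hash("AutomatedCleaning-0.1.7.tar.gz")` should return True for an intact copy of the source distribution.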

File details

Details for the file AutomatedCleaning-0.1.7-py3-none-any.whl.

File hashes

Hashes for AutomatedCleaning-0.1.7-py3-none-any.whl
Algorithm Hash digest
SHA256 b9dd24973cccf8e3eaa057a50f7a657fab62d08991f95ff4b0e245da05e7b639
MD5 0c54c87da8440558a6b9cd6ef6d33aac
BLAKE2b-256 9cdf93a61d80010abe7e4c4fc5cc892e7ee9bd01d65ea16f1038945951c71d00
