Automated Data Cleaning Library

Project description

AutomatedCleaning

AutomatedCleaning is a Python library for automated data cleaning. It helps preprocess and analyze datasets by handling missing values, outliers, spelling corrections, and more.

Features

  • Supports both large (100+ GB) and small datasets
  • Detects and handles missing values and duplicate records
  • Identifies and corrects spelling errors in categorical values
  • Detects outliers
  • Detects and fixes data imbalance
  • Identifies and corrects skewness in numerical data
  • Checks for correlation and detects multicollinearity
  • Analyzes cardinality in categorical columns
  • Identifies and cleans text columns
  • Detects JSON-type columns
  • Performs univariate, bivariate, and multivariate analysis
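To make two of the features above concrete, here is a minimal standard-library sketch of spelling correction for categorical values and outlier detection. It is an illustration only, not AutomatedCleaning's implementation: the function names `correct_categories` and `iqr_outliers`, the 0.8 similarity cutoff, and the 1.5 × IQR rule are all assumptions chosen for the example.

```python
import difflib
import statistics


def correct_categories(values, known_labels, cutoff=0.8):
    """Map each value to its closest known label; leave unmatched values unchanged.

    Hypothetical helper: fuzzy matching via difflib is an assumed technique,
    not necessarily what AutomatedCleaning does internally.
    """
    corrected = []
    for value in values:
        matches = difflib.get_close_matches(value, known_labels, n=1, cutoff=cutoff)
        corrected.append(matches[0] if matches else value)
    return corrected


def iqr_outliers(numbers, k=1.5):
    """Return values lying more than k * IQR below Q1 or above Q3.

    Hypothetical helper: the IQR rule is one common outlier test among many.
    """
    q1, _, q3 = statistics.quantiles(numbers, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [x for x in numbers if x < low or x > high]


cities = ["London", "Lodnon", "Paris", "Pariss", "Berlin"]
fixed = correct_categories(cities, ["London", "Paris", "Berlin"])
print(fixed)     # ['London', 'London', 'Paris', 'Paris', 'Berlin']

outliers = iqr_outliers([10, 12, 11, 13, 12, 11, 95])
print(outliers)  # [95]
```

Both helpers work on plain Python lists so the example runs without extra dependencies; with a real dataset the same ideas would normally be applied column by column.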

Installation

pip install AutomatedCleaning

Usage

import AutomatedCleaning as ac

# Load the dataset from a CSV file
df = ac.load_data("dataset.csv")

# Run the automated cleaning steps on the loaded data
df_cleaned = ac.clean_data(df)

Download files

Source Distribution

automatedcleaning-0.1.4.tar.gz (15.1 kB view details)

Uploaded Source

Built Distribution

automatedcleaning-0.1.4-py3-none-any.whl (13.4 kB view details)

Uploaded Python 3

File details

Details for the file automatedcleaning-0.1.4.tar.gz.

File metadata

  • Download URL: automatedcleaning-0.1.4.tar.gz
  • Upload date:
  • Size: 15.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.11.9

File hashes

Hashes for automatedcleaning-0.1.4.tar.gz

Algorithm     Hash digest
SHA256        bbe9064aac1279e4011441ff05392258e2d6d6f0671947ecf88a6ba50ab89117
MD5           cabf0d78d404fce8a0b3dbd00e74d665
BLAKE2b-256   43dd19d7f3cab91930ff7271cdf41d3463158aec231abfd82b82b446e7c453fe
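
A downloaded file can be checked against the published SHA256 digest before installing it. The sketch below uses only Python's `hashlib`; the helper names `sha256_of_file` and `matches_published_hash` are hypothetical, and the local file path in the usage comment is an assumption about where the download was saved.

```python
import hashlib

# SHA256 digest listed above for automatedcleaning-0.1.4.tar.gz
EXPECTED_SHA256 = "bbe9064aac1279e4011441ff05392258e2d6d6f0671947ecf88a6ba50ab89117"


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Compare a file's SHA256 digest with a published hex digest (case-insensitive)."""
    return sha256_of_file(path) == expected_hex.strip().lower()


# Usage (assumes the archive was saved to the current directory):
# matches_published_hash("automatedcleaning-0.1.4.tar.gz", EXPECTED_SHA256)
```

pip can perform the same check itself when a requirements file pins the digest with `--hash=sha256:...` and is installed with `--require-hashes`.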

File details

Details for the file automatedcleaning-0.1.4-py3-none-any.whl.

File hashes

Hashes for automatedcleaning-0.1.4-py3-none-any.whl

Algorithm     Hash digest
SHA256        42f77e3901006cde234dca5a2bd63e4f95e844656a9b92ffa0db96635f453362
MD5           2d153da6262ba5cf48ec4010ef0d630f
BLAKE2b-256   03ce6e2ce68b0a89848ee78d172c8edcf6f6478e79010c36f683e39ca5a2acc5
