Automated Data Cleaning Library

AutomatedCleaning

AutomatedCleaning is a Python library for automated data cleaning. It helps preprocess and analyze datasets by handling missing values, outliers, spelling errors, and more.

Features

  • Supports both large (100+ GB) and small datasets
  • Detects and handles missing values and duplicate records
  • Identifies and corrects spelling errors in categorical values
  • Detects outliers
  • Detects and fixes data imbalance
  • Identifies and corrects skewness in numerical data
  • Checks for correlation and detects multicollinearity
  • Analyzes cardinality in categorical columns
  • Identifies and cleans text columns
  • Detects JSON-type columns
  • Performs univariate, bivariate, and multivariate analysis
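
To make a few of these checks concrete, here is a minimal standard-library sketch of missing-value counting, duplicate removal, and outlier detection. It is not AutomatedCleaning's implementation: the helper names (count_missing, drop_duplicates, iqr_outliers) are hypothetical, and the 1.5 × IQR rule is an assumed, commonly used outlier criterion that the library may or may not apply.

```python
import statistics


def count_missing(rows, column):
    """Count rows where `column` is absent, None, or an empty string."""
    return sum(1 for row in rows if row.get(column) in (None, ""))


def drop_duplicates(rows):
    """Remove exact duplicate rows, keeping the first occurrence of each."""
    seen = set()
    unique = []
    for row in rows:
        # Sort items by key so key order does not affect the comparison.
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


def iqr_outliers(values, k=1.5):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR] (assumed rule)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Hypothetical sample data, for illustration only.
rows = [
    {"city": "Paris", "age": 31},
    {"city": "Paris", "age": 31},
    {"city": None, "age": 45},
]
print(count_missing(rows, "city"))
print(drop_duplicates(rows))
print(iqr_outliers([1, 2, 3, 4, 5, 6, 7, 8, 100]))
```

The sketch works on plain lists of dictionaries so that it runs without third-party packages; a real pipeline would more likely operate on a dataframe.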

Installation

pip install AutomatedCleaning

Usage

import AutomatedCleaning as ac
# Load the dataset from a CSV file
df = ac.load_data("dataset.csv")
# Run the automated cleaning steps on the loaded data
df_cleaned = ac.clean_data(df)

Download files

Source Distribution

AutomatedCleaning-0.1.6.tar.gz (14.4 kB view details)

Uploaded Source

Built Distribution

AutomatedCleaning-0.1.6-py3-none-any.whl (13.4 kB view details)

Uploaded Python 3

File details

Details for the file AutomatedCleaning-0.1.6.tar.gz.

File metadata

  • Download URL: AutomatedCleaning-0.1.6.tar.gz
  • Upload date:
  • Size: 14.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.11.9

File hashes

Hashes for AutomatedCleaning-0.1.6.tar.gz
Algorithm    Hash digest
SHA256       82904938d7eb0a75c5396983a0b523da47bed2649b56b3175b1456315145f433
MD5          a8c97675289d9abcc64e681bab621528
BLAKE2b-256  136fd7d96d0f4f74c6b0ef4d44ef3c90c68f3f3c534484767b8625dbed567390
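
A downloaded file can be checked against the published SHA256 digest before installing it. The following is a minimal sketch using Python's standard hashlib module. The function names and the local file path are assumptions for illustration; the expected digest is the SHA256 value listed above for AutomatedCleaning-0.1.6.tar.gz.

```python
import hashlib

# SHA256 digest published above for AutomatedCleaning-0.1.6.tar.gz.
EXPECTED_SHA256 = "82904938d7eb0a75c5396983a0b523da47bed2649b56b3175b1456315145f433"


def sha256_of_file(path, chunk_size=1 << 20):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected=EXPECTED_SHA256):
    """Return True if the file's SHA256 digest equals the expected value."""
    return sha256_of_file(path) == expected.lower()


# Example call (hypothetical path; assumes the archive was downloaded
# to the current directory):
# matches_published_hash("AutomatedCleaning-0.1.6.tar.gz")
```

Reading in chunks keeps memory use constant regardless of file size. pip can perform the same check itself when a requirements file pins hashes and the --require-hashes option is used.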

File details

Details for the file AutomatedCleaning-0.1.6-py3-none-any.whl.

File hashes

Hashes for AutomatedCleaning-0.1.6-py3-none-any.whl
Algorithm    Hash digest
SHA256       0599c7cde779dde8bc5006593973d335178ec32e5e706f47c2ce81a8cb5e8055
MD5          359fc556761c1d7c330625575e6371c2
BLAKE2b-256  757be44f5fe7f1167858ab2a923ec5a3fc4d05b7597a36c159631e4e737098a0
