Automated Data Cleaning Library

AutomatedCleaning

AutomatedCleaning is a Python library for automated data cleaning. It helps preprocess and analyze datasets by handling missing values, outliers, spelling errors, and more.

Features

  • Supports both large (100+ GB) and small datasets
  • Detects and handles missing values and duplicate records
  • Identifies and corrects spelling errors in categorical values
  • Performs outlier detection and treatment
  • Detects and fixes data imbalance
  • Identifies and corrects skewness in numerical data
  • Checks for correlation and detects multicollinearity
  • Analyzes cardinality in categorical columns
  • Identifies and cleans text columns
  • Detects JSON-type columns
  • Performs univariate, bivariate, and multivariate analysis
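
To make a few of the features above concrete, here is a minimal, dependency-free sketch of duplicate removal, missing-value imputation, IQR-based outlier clipping, and fuzzy spelling correction. It is not AutomatedCleaning's implementation: every function name, the median and IQR strategies, and the similarity cutoff are assumptions chosen for illustration.

```python
import difflib
import statistics


def drop_duplicates(rows):
    """Remove exact duplicate records (dicts), keeping the first occurrence."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(sorted(row.items()))  # hashable fingerprint of the record
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


def fill_missing_with_median(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    median = statistics.median(observed)
    return [median if v is None else v for v in values]


def clip_outliers_iqr(values, k=1.5):
    """Clip values outside [Q1 - k*IQR, Q3 + k*IQR] to the nearest fence."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [min(max(v, low), high) for v in values]


def correct_spelling(values, vocabulary, cutoff=0.8):
    """Map each value to its closest vocabulary entry, if one is similar enough."""
    corrected = []
    for value in values:
        match = difflib.get_close_matches(value, vocabulary, n=1, cutoff=cutoff)
        corrected.append(match[0] if match else value)
    return corrected


if __name__ == "__main__":
    print(drop_duplicates([{"id": 1}, {"id": 1}, {"id": 2}]))
    print(fill_missing_with_median([1, None, 3]))
    print(clip_outliers_iqr([10, 11, 12, 13, 14, 15, 16, 17, 1000]))
    print(correct_spelling(["Londn", "Paris"], ["London", "Paris"]))
```

Clipping keeps every row and only caps extreme values, whereas dropping outliers shrinks the dataset; which treatment the library applies is not stated here.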

Installation

pip install AutomatedCleaning
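
After installing, one way to confirm which version ended up in the environment is to query the package metadata from Python. This sketch uses only the standard library; the helper name `installed_version` is hypothetical and not part of AutomatedCleaning.

```python
from importlib import metadata


def installed_version(distribution):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Pass the same name that was given to pip.
    print(installed_version("AutomatedCleaning"))
```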

## Usage
```python
import AutomatedCleaning as ac

# Load the dataset, then run the automated cleaning on it.
df = ac.load_data("dataset.csv")
df_cleaned = ac.clean_data(df)
```

Source Distribution

automatedcleaning-0.1.2.tar.gz (22.6 kB)

Uploaded Source

Built Distribution

automatedcleaning-0.1.2-py3-none-any.whl (25.1 kB)

Uploaded Python 3

File details

Details for the file automatedcleaning-0.1.2.tar.gz.

File metadata

  • Download URL: automatedcleaning-0.1.2.tar.gz
  • Size: 22.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.11.9

File hashes

Hashes for automatedcleaning-0.1.2.tar.gz
Algorithm Hash digest
SHA256 71d55e1c77f881a7b1e3d124cdca9580c779b3daf4e89392329db5713907eda8
MD5 d1feb4562c646dab483eb53eac71bf74
BLAKE2b-256 84a86dde3fa9b78c350f15c349c1f36040fc6ce247dbab2a994f74bc4052c0a5
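
A downloaded file can be checked against the published SHA256 digest with Python's standard `hashlib` module. The sketch below is one way to do that; the helper names `sha256_of_file` and `matches_published_hash` are hypothetical, and the local file path in the usage example is an assumption.

```python
import hashlib
import hmac


def sha256_of_file(path, chunk_size=1 << 20):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Return True if the file's SHA256 digest equals the expected hex string."""
    return hmac.compare_digest(sha256_of_file(path), expected_hex.strip().lower())


if __name__ == "__main__":
    import sys

    # Usage: python verify.py <downloaded file> <published SHA256 digest>
    file_path, published = sys.argv[1], sys.argv[2]
    print("OK" if matches_published_hash(file_path, published) else "MISMATCH")
```

pip can perform the same check at install time when hashes are pinned in a requirements file and `--require-hashes` is used.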

File details

Details for the file automatedcleaning-0.1.2-py3-none-any.whl.

File hashes

Hashes for automatedcleaning-0.1.2-py3-none-any.whl
Algorithm Hash digest
SHA256 9962e3b35d607a8496a047e6aff95bcfdeb37a874ab00a519c250baba52ea783
MD5 74031e4809ebf7fd1a2e7d23a800feaf
BLAKE2b-256 c6cab5182816b243c31c318721f418534dbd90b1e942708ae17f21efb8e4804d
