Automated Data Cleaning Library

Project description

AutomatedCleaning

AutomatedCleaning is a Python library for automated data cleaning. It helps preprocess and analyze datasets by handling missing values, outliers, spelling errors, and more.

Features

  • Supports both large (100+ GB) and small datasets
  • Detects and handles missing values and duplicate records
  • Identifies and corrects spelling errors in categorical values
  • Detects outliers
  • Detects and fixes data imbalance
  • Identifies and corrects skewness in numerical data
  • Checks for correlation and detects multicollinearity
  • Analyzes cardinality in categorical columns
  • Identifies and cleans text columns
  • Detects JSON-type columns
  • Performs univariate, bivariate, and multivariate analysis
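
To make a few of the checks listed above concrete, here is a minimal standard-library sketch of missing-value counting, duplicate removal, and outlier detection. It is not AutomatedCleaning's implementation: the function names (count_missing, drop_duplicates, iqr_outliers), the row format (a list of dicts), and the choice of the 1.5 × IQR rule are all assumptions made for illustration.

```python
import statistics


def count_missing(rows, column):
    """Count rows where `column` is absent, None, or an empty string."""
    return sum(1 for row in rows if row.get(column) in (None, ""))


def drop_duplicates(rows):
    """Return rows with exact duplicates removed, keeping first occurrences."""
    seen = set()
    unique = []
    for row in rows:
        # Sort the items so key order inside a dict does not matter.
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


def iqr_outliers(values, k=1.5):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR] (hypothetical rule choice)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Hypothetical sample data, not taken from the project.
rows = [
    {"city": "Paris", "age": 31},
    {"city": "Paris", "age": 31},  # exact duplicate of the first row
    {"city": "", "age": 29},       # missing city
    {"city": "Rome", "age": 30},
]
ages = [29, 30, 30, 31, 32, 120]

print(count_missing(rows, "city"))
print(len(drop_duplicates(rows)))
print(iqr_outliers(ages))
```

A real pipeline for datasets in the 100+ GB range would need a chunked or out-of-core approach; the sketch only shows the logic of each check on in-memory data.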

Installation

pip install AutomatedCleaning

Usage

import AutomatedCleaning as ac

# Load the dataset from a CSV file
df = ac.load_data("dataset.csv")

# Run the automated cleaning steps on the loaded data
df_cleaned = ac.clean_data(df)

Download files

Source Distribution

automatedcleaning-0.1.9.tar.gz (16.5 kB view details)

Uploaded Source

Built Distribution

automatedcleaning-0.1.9-py3-none-any.whl (15.0 kB view details)

Uploaded Python 3

File details

Details for the file automatedcleaning-0.1.9.tar.gz.

File metadata

  • Download URL: automatedcleaning-0.1.9.tar.gz
  • Upload date:
  • Size: 16.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.11.9

File hashes

Hashes for automatedcleaning-0.1.9.tar.gz
Algorithm Hash digest
SHA256 b1d691ede99fb597b1afacaa9fd4fb4818dc3f9407b61b6b2aaaa061965f3508
MD5 335f36eb168895cd80d73e91b5b8f1ec
BLAKE2b-256 be2901dc01612c36c1ed641508d534c45bf3e729026fd1193fe5a194d8b3d9c5
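
One way to use the digests above is to check a downloaded file against the published SHA256 value before installing it. The following is a minimal sketch using Python's hashlib; the helper names (sha256_of_file, matches_expected) and the local file path in the usage comment are assumptions for illustration, not part of the package.

```python
import hashlib

# SHA256 digest listed above for automatedcleaning-0.1.9.tar.gz
EXPECTED_SHA256 = "b1d691ede99fb597b1afacaa9fd4fb4818dc3f9407b61b6b2aaaa061965f3508"


def sha256_of_file(path, chunk_size=1 << 20):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops once read() returns empty bytes.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_SHA256):
    """Return True if the file's SHA256 digest equals the expected value."""
    return sha256_of_file(path) == expected.lower()


# Usage, assuming the archive was saved in the current directory:
# print(matches_expected("automatedcleaning-0.1.9.tar.gz"))
```

pip can perform the same check itself when a requirements file pins hashes and the install is run with --require-hashes.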

File details

Details for the file automatedcleaning-0.1.9-py3-none-any.whl.

File hashes

Hashes for automatedcleaning-0.1.9-py3-none-any.whl
Algorithm Hash digest
SHA256 a12bb59fdb7958594b19bffe60fb2526bd74944c38269313b14bd3d8c668fd7a
MD5 a35bf798491a0390eb012c9620010a97
BLAKE2b-256 09beae5e7e2110c6ebcc757937cca63633e1b5132618f434d53b8e7a3c5114bf
