Automated Data Cleaning Library

Project description

AutomatedCleaning

AutomatedCleaning is a Python library for automated data cleaning. It helps preprocess and analyze datasets by handling missing values, outliers, spelling errors, and more.

Features

  • Supports both large (100+ GB) and small datasets
  • Detects and handles missing values and duplicate records
  • Identifies and corrects spelling errors in categorical values
  • Detects outliers
  • Detects and fixes data imbalance
  • Identifies and corrects skewness in numerical data
  • Checks for correlation and detects multicollinearity
  • Analyzes cardinality in categorical columns
  • Identifies and cleans text columns
  • Detects JSON-type columns
  • Performs univariate, bivariate, and multivariate analysis
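To make a few of the checks listed above concrete, here is a minimal standard-library sketch of missing-value counting, duplicate removal, IQR-based outlier detection, and a skewness estimate. It is not AutomatedCleaning's implementation, and the source does not say which methods the library uses. The function names (`count_missing`, `drop_duplicate_rows`, `iqr_outliers`, `skewness`) and the 1.5 × IQR rule are assumptions chosen for illustration.

```python
import statistics


def count_missing(rows, column):
    """Count rows where `column` is absent, None, or an empty string."""
    return sum(1 for row in rows if row.get(column) in (None, ""))


def drop_duplicate_rows(rows):
    """Return rows with exact duplicates removed, keeping first occurrences."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(sorted(row.items()))  # order-independent, hashable key
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


def iqr_outliers(values, k=1.5):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR] (illustrative rule)."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


def skewness(values):
    """Population skewness: third central moment over variance ** 1.5."""
    n = len(values)
    mean = statistics.fmean(values)
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5 if m2 else 0.0


# Hypothetical sample data, for illustration only.
rows = [
    {"city": "Pune", "age": 31},
    {"city": "Pune", "age": 31},  # exact duplicate of the first row
    {"city": "", "age": 45},      # missing city
]
ages = [10, 12, 11, 13, 12, 11, 95]

print(count_missing(rows, "city"))     # number of rows with no city
print(len(drop_duplicate_rows(rows)))  # rows left after de-duplication
print(iqr_outliers(ages))              # values flagged as outliers
print(skewness(ages) > 0)              # True when the data is right-skewed
```

A positive skewness value indicates a long right tail, which is the kind of column a skewness-correction step would typically target.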

Installation

pip install AutomatedCleaning

Usage

import AutomatedCleaning as ac

# Load the dataset from a CSV file
df = ac.load_data("dataset.csv")

# Run the automated cleaning steps on the loaded data
df_cleaned = ac.clean_data(df)


Download files

Source Distribution

automatedcleaning-0.1.3.tar.gz (22.6 kB)

Uploaded Source

Built Distribution

automatedcleaning-0.1.3-py3-none-any.whl (25.1 kB)

Uploaded Python 3

File details

Details for the file automatedcleaning-0.1.3.tar.gz.

File metadata

  • Download URL: automatedcleaning-0.1.3.tar.gz
  • Upload date:
  • Size: 22.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.11.9

File hashes

Hashes for automatedcleaning-0.1.3.tar.gz
Algorithm Hash digest
SHA256 e75ec3abdf3cdbf734979820c51ed1d74340c3691de8d55ba401baf3f315a5be
MD5 1dca9d6f35d25000afb4d10564cd99eb
BLAKE2b-256 4d91613db9610f7c3f76910a5be3ca1f896d768eaf6ec5c083dcff8eca474c03
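A downloaded file can be checked against the SHA256 digest listed above. The following is a minimal sketch using Python's standard `hashlib` module; the helper names `sha256_of_file` and `matches_published_hash` are hypothetical, and the local path in the commented usage is an assumption about where the file was saved.

```python
import hashlib

# SHA256 digest published above for automatedcleaning-0.1.3.tar.gz
EXPECTED_SDIST_SHA256 = (
    "e75ec3abdf3cdbf734979820c51ed1d74340c3691de8d55ba401baf3f315a5be"
)


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()


# Usage (assumes the sdist was saved to the current directory):
# ok = matches_published_hash(
#     "automatedcleaning-0.1.3.tar.gz", EXPECTED_SDIST_SHA256
# )
# print("hash matches" if ok else "hash mismatch, do not install")
```

Reading in chunks keeps memory use flat regardless of file size. pip can also enforce digests itself through its hash-checking mode, with `--hash` entries in a requirements file.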

File details

Details for the file automatedcleaning-0.1.3-py3-none-any.whl.

File hashes

Hashes for automatedcleaning-0.1.3-py3-none-any.whl
Algorithm Hash digest
SHA256 75af5b867c11aac93e2e27a6b22f53d2c701ab8da69e6532456bb37d2989567d
MD5 9d79d8a2887b5ead17da3bee056611b9
BLAKE2b-256 2d434ecbc4f27b5e6bd414d761107ff295de39e3bd9fbee7b4702eebab3e27a0
