Automated Data Cleaning Library

Project description

AutomatedCleaning

AutomatedCleaning is a Python library for automated data cleaning. It helps preprocess and analyze datasets by handling missing values, outliers, spelling corrections, and more.

Features

  • Supports both large (100+ GB) and small datasets
  • Detects and handles missing values and duplicate records
  • Identifies and corrects spelling errors in categorical values
  • Detects outliers
  • Detects and fixes data imbalance
  • Identifies and corrects skewness in numerical data
  • Checks for correlation and detects multicollinearity
  • Analyzes cardinality in categorical columns
  • Identifies and cleans text columns
  • Detects JSON-type columns
  • Performs univariate, bivariate, and multivariate analysis
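To make two of the features above concrete, here is a minimal standard-library sketch of spelling correction for categorical values and outlier detection. It is not AutomatedCleaning's implementation, which this page does not describe; the function names `correct_category` and `iqr_outliers`, the similarity cutoff, and the use of the 1.5 × IQR rule are all assumptions made for illustration.

```python
import difflib
import statistics


def correct_category(value, known_categories, cutoff=0.8):
    """Map a possibly misspelled label to its closest known category.

    Hypothetical helper: returns the input unchanged when nothing in
    known_categories is similar enough.
    """
    matches = difflib.get_close_matches(value, known_categories, n=1, cutoff=cutoff)
    return matches[0] if matches else value


def iqr_outliers(values, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR].

    Hypothetical helper using the common interquartile-range rule.
    """
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Example data invented for this sketch.
cities = ["London", "Lodnon", "Paris"]
print([correct_category(c, ["London", "Paris"]) for c in cities])
print(iqr_outliers([10, 12, 11, 13, 12, 11, 95]))
```

A fixed list of known categories keeps the sketch short; a real cleaner would more likely infer the canonical spellings from value frequencies in the column.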

Installation

pip install AutomatedCleaning

Usage

import AutomatedCleaning as ac

# Load the dataset from a CSV file
df = ac.load_data("dataset.csv")
# Run the automated cleaning steps on the loaded data
df_cleaned = ac.clean_data(df)
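To try the usage snippet without a dataset at hand, a small and deliberately messy CSV can be generated with the standard library. This is only a sketch: the helper `write_sample_csv`, the column names, and the values are invented for illustration, and the file is written to the system temp directory rather than the working directory.

```python
import csv
import os
import tempfile


def write_sample_csv(path):
    """Write a small CSV with typical data-quality problems; return the row count."""
    rows = [
        {"id": "1", "city": "London", "age": "34", "income": "52000"},
        # Misspelled category and a missing age.
        {"id": "2", "city": "Lodnon", "age": "", "income": "48000"},
        # Extreme income value.
        {"id": "3", "city": "Paris", "age": "29", "income": "1000000"},
        # Exact duplicate of the previous row.
        {"id": "3", "city": "Paris", "age": "29", "income": "1000000"},
    ]
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=["id", "city", "age", "income"])
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)


sample_path = os.path.join(tempfile.gettempdir(), "dataset.csv")
n_rows = write_sample_csv(sample_path)
print(f"wrote {n_rows} rows to {sample_path}")
```

Passing `sample_path` to `ac.load_data` in place of `"dataset.csv"` gives the cleaner a missing value, a duplicate record, a misspelled category, and an extreme value to work on.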

Download files

Source Distribution

automatedcleaning-1.0.tar.gz (17.6 kB view details)

Uploaded Source

Built Distributions

automatedcleaning-1.0.0-py3-none-any.whl (16.1 kB view details)

Uploaded Python 3

automatedcleaning-1.0-py3-none-any.whl (16.1 kB view details)

Uploaded Python 3

File details

Details for the file automatedcleaning-1.0.tar.gz.

File metadata

  • Download URL: automatedcleaning-1.0.tar.gz
  • Size: 17.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.11.9

File hashes

Hashes for automatedcleaning-1.0.tar.gz
Algorithm    Hash digest
SHA256       58826ac1bb8a34de15d7c768f80f15e1aa7cce3b72cdb8cdc17dd79f07d19d8f
MD5          31d890b80ad66ce0e09e336a2288010d
BLAKE2b-256  ad9c85e72c8c9880de570f6e9933092331020d601e9f349f8acaf8a6c85cbc61

See more details on using hashes here.
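A downloaded file can be checked against its published SHA256 digest with Python's `hashlib`. The sketch below assumes the source distribution was saved under its original file name in the current directory; the helper names `sha256_of_file` and `matches_expected` are hypothetical, and the expected digest is the one listed above for automatedcleaning-1.0.tar.gz.

```python
import hashlib
import os

# SHA256 digest published for automatedcleaning-1.0.tar.gz.
EXPECTED_SHA256 = "58826ac1bb8a34de15d7c768f80f15e1aa7cce3b72cdb8cdc17dd79f07d19d8f"


def sha256_of_file(path, chunk_size=65536):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected_hex):
    """Return True if the file's SHA256 digest equals expected_hex (case-insensitive)."""
    return sha256_of_file(path) == expected_hex.lower()


# Assumed local path; the check is skipped if the file is not present.
archive = "automatedcleaning-1.0.tar.gz"
if os.path.exists(archive):
    print("hash OK" if matches_expected(archive, EXPECTED_SHA256) else "hash MISMATCH")
```

pip can perform the same comparison at install time through its hash-checking mode (`--require-hashes` with a requirements file that pins `--hash=sha256:...`).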

File details

Details for the file automatedcleaning-1.0.0-py3-none-any.whl.

File hashes

Hashes for automatedcleaning-1.0.0-py3-none-any.whl
Algorithm    Hash digest
SHA256       74684764be7b1e8d757a8f866dfbae3128ebe4fca0f6d85caea1dc63a0a0c0b9
MD5          56e5d397412920b33f69cf7d9f16909d
BLAKE2b-256  729db0ae594798a21ec301a3ea5d68d55916517c79b2fb0f2572ad27259a87a4

File details

Details for the file automatedcleaning-1.0-py3-none-any.whl.

File hashes

Hashes for automatedcleaning-1.0-py3-none-any.whl
Algorithm    Hash digest
SHA256       38cd2069a3bb45213e1aafdce5e880a8d2740ac495b4a68e7bd3defda14b0b7d
MD5          2d94006f919c0b0101022783295d6f9b
BLAKE2b-256  c342fb6de9778e902cdf6451aa912db4420780fa58a81ceb1cf256388cf292c3
