AutoClean - Python Package for Automated Preprocessing & Cleaning of Datasets

Project description

AutoClean

Python Package for Automated Dataset Preprocessing & Cleaning

pip install auto-clean

:thought_balloon: Read more about how the AutoClean algorithm works in my Medium article Automated Data Cleaning with Python.

Description

It is well known among data scientists that data cleaning and preprocessing make up a major part of a data science project. And, in all honesty, it is rarely the most exciting part.

:white_check_mark: AutoClean saves you time on much of this work by performing the preprocessing in an automated manner!

AutoClean supports:

:point_right: Various imputation methods for missing values
:point_right: Handling of outliers
:point_right: Encoding of categorical data (OneHot, Label)
:point_right: Extraction of datetime values
:point_right: and more!
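
To make the steps listed above more concrete, here is a minimal sketch of what they typically look like when written by hand with pandas. This is not AutoClean's implementation or API. The toy dataset, the column names, the `clean` helper, and the specific choices (median and mode imputation, clipping at 1.5 times the interquartile range) are all assumptions made for illustration.

```python
import pandas as pd

# Hypothetical toy dataset; column names and values are made up for illustration.
df = pd.DataFrame({
    "age": [25.0, None, 31.0, 29.0, 120.0],
    "city": ["Berlin", "Paris", None, "Berlin", "Berlin"],
    "signup": ["2021-01-05", "2021-02-10", "2021-03-15", "2021-04-20", "2021-05-25"],
})


def clean(df: pd.DataFrame) -> pd.DataFrame:
    """Illustrative manual cleaning steps (not AutoClean's actual logic)."""
    out = df.copy()  # leave the input DataFrame untouched

    # 1) Impute missing values: median for the numeric column,
    #    most frequent value for the categorical column.
    out["age"] = out["age"].fillna(out["age"].median())
    out["city"] = out["city"].fillna(out["city"].mode().iloc[0])

    # 2) Handle outliers: clip values outside 1.5 * IQR of the quartiles.
    q1, q3 = out["age"].quantile([0.25, 0.75])
    iqr = q3 - q1
    out["age"] = out["age"].clip(lower=q1 - 1.5 * iqr, upper=q3 + 1.5 * iqr)

    # 3) Extract datetime components into separate numeric columns.
    signup = pd.to_datetime(out.pop("signup"))
    out["signup_year"] = signup.dt.year
    out["signup_month"] = signup.dt.month
    out["signup_day"] = signup.dt.day

    # 4) One-hot encode the categorical column.
    out = pd.get_dummies(out, columns=["city"])
    return out


cleaned = clean(df)
print(cleaned)
```

Each of these steps needs per-column decisions when done manually, which is the kind of repetitive work an automated pipeline aims to take over.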

As an example, the following sample dataset will be passed through the AutoClean pipeline:

The output of AutoClean looks as follows, with the various adjustments highlighted:


Source Distribution

py-AutoClean-0.0.1a0.tar.gz (8.9 kB view details)

Uploaded Source

File details

Details for the file py-AutoClean-0.0.1a0.tar.gz.

File metadata

  • Download URL: py-AutoClean-0.0.1a0.tar.gz
  • Upload date:
  • Size: 8.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.5.0 importlib_metadata/3.10.0 pkginfo/1.7.0 requests/2.25.1 requests-toolbelt/0.9.1 tqdm/4.59.0 CPython/3.8.8

File hashes

Hashes for py-AutoClean-0.0.1a0.tar.gz
Algorithm Hash digest
SHA256 ba5f8674602455e63e0068c3377c0961e53f5b09fb31fdc4673e5c145aa6682c
MD5 16b72f1e7c7bfa11a412d9b8b2a26928
BLAKE2b-256 68d0e93bc04196b15a3716da61a2b78677ff460df47c87b514cd66f6c733b4db
