An easy-to-use pre-processing utility for machine learning.

Project description

Data PreProcessing

EasyPreProcessing is a Python module comprising data pre-processing helper functions, intended mainly for data science and machine learning.

Many common machine learning tasks performed during feature engineering can be done in a single line of code with this library.

What functionalities are currently available?

  • Handling missing values
  • Encoding categorical variables
  • Handling DateTime features
  • Handling empty/blank columns
  • Display correlation metrics
  • Standardize dataset
  • Oversampling
  • Clustering (KMeans)
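
To make the first two items concrete, here is a minimal plain-Python sketch of what imputing missing values and encoding categorical variables typically mean. It uses only the standard library and is not the library's implementation: the function names `impute_numerical`, `impute_categorical`, and `label_encode` are hypothetical, and the choice of mean/most-frequent filling is an assumption, since this page does not state which strategy EasyPreProcessing uses.

```python
from collections import Counter
from statistics import mean


def impute_numerical(values):
    """Replace None entries with the mean of the observed values (assumed strategy)."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def impute_categorical(values):
    """Replace None entries with the most frequent observed category (assumed strategy)."""
    observed = [v for v in values if v is not None]
    fill = Counter(observed).most_common(1)[0][0]
    return [fill if v is None else v for v in values]


def label_encode(values):
    """Map each distinct category to an integer, assigned in sorted order."""
    mapping = {cat: idx for idx, cat in enumerate(sorted(set(values)))}
    return [mapping[v] for v in values], mapping


# Hypothetical toy columns, for illustration only.
ages = impute_numerical([20, None, 40])
cities = impute_categorical(["Paris", None, "Paris", "Oslo"])
codes, mapping = label_encode(cities)
```

After this runs, `ages` has its gap filled with the mean of 20 and 40, and `cities` has its gap filled with the most common value before being turned into integer codes.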

Installing

Install with pip:

pip install easypreprocessing

To see details of all the available functionality:

from easypreprocessing import EasyPreProcessing
prep = EasyPreProcessing('filename.csv')
prep.info()

Sample Template

Below is a sample of preprocessing code using this library.

from easypreprocessing import EasyPreProcessing
prep = EasyPreProcessing('filename_here.csv')
prep.output = 'output_variable_here'

prep.remove_blank()         # Remove blank or empty columns
prep.missing_values         # Display missing values
prep.categorical.impute()   # Fill missing values for categorical variables
prep.numerical.impute()     # Fill missing values for numerical variables
prep.categorical.encode()   # Convert categorical features to numerical
prep.standardize()          # Standardize dataset
X_train, X_test, y_train, y_test = prep.split()
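
For readers unfamiliar with the last two steps of the template, the following standard-library sketch shows what standardizing a column and splitting rows into train and test sets usually involve. It is an illustration under stated assumptions, not the code behind `prep.standardize()` or `prep.split()`: the helper names, the 25% test fraction, the shuffling, and the `seed` parameter are all hypothetical, because this page does not document the library's defaults.

```python
import random
from statistics import mean, pstdev


def standardize(column):
    """Rescale a numeric column to zero mean and unit variance (z-scores)."""
    mu = mean(column)
    sigma = pstdev(column)
    if sigma == 0:
        # A constant column carries no spread; map every value to 0.
        return [0.0 for _ in column]
    return [(x - mu) / sigma for x in column]


def train_test_split(X, y, test_size=0.25, seed=0):
    """Shuffle row indices, then hold out a fraction of the rows for testing."""
    indices = list(range(len(X)))
    random.Random(seed).shuffle(indices)
    n_test = max(1, round(len(X) * test_size))
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    X_train = [X[i] for i in train_idx]
    X_test = [X[i] for i in test_idx]
    y_train = [y[i] for i in train_idx]
    y_test = [y[i] for i in test_idx]
    # Same return order as the template above.
    return X_train, X_test, y_train, y_test


# Hypothetical toy data, for illustration only.
z_scores = standardize([1.0, 2.0, 3.0, 4.0])
X = [[i] for i in range(8)]
y = list(range(8))
X_train, X_test, y_train, y_test = train_test_split(X, y)
```

The split keeps each feature row paired with its target value, which is the property that matters when the four returned pieces are passed to a model.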

Download files

Source Distribution

easypreprocessing-1.0.4.tar.gz (5.8 kB)

Uploaded Source

Built Distribution

easypreprocessing-1.0.4-py3-none-any.whl (9.7 kB)

Uploaded Python 3

File details

Details for the file easypreprocessing-1.0.4.tar.gz.

File metadata

  • Download URL: easypreprocessing-1.0.4.tar.gz
  • Upload date:
  • Size: 5.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.2 importlib_metadata/4.8.0 pkginfo/1.7.1 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.62.2 CPython/3.6.13

File hashes

Hashes for easypreprocessing-1.0.4.tar.gz
Algorithm     Hash digest
SHA256        d5bfbb2296412d5a93d58e82d0b9c2ddae03e61dc2824d1ba1faacda14cfc4e4
MD5           d632a898ff6fabeb6b5c6543c9cb9db3
BLAKE2b-256   110b27872c2c2ebf6f9151757556f1f96aaf880a495d3046f878a9e1a728f244

File details

Details for the file easypreprocessing-1.0.4-py3-none-any.whl.

File metadata

  • Download URL: easypreprocessing-1.0.4-py3-none-any.whl
  • Upload date:
  • Size: 9.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.2 importlib_metadata/4.8.0 pkginfo/1.7.1 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.62.2 CPython/3.6.13

File hashes

Hashes for easypreprocessing-1.0.4-py3-none-any.whl
Algorithm     Hash digest
SHA256        37b8a1ea604436d521d59ba42f8e7a8428b643d30762752af1cb1fdc8ce8ac36
MD5           b96bdcae87c8d9b787161d0609c5a08f
BLAKE2b-256   f8c57e8df128c9ad1fbb4829af0773ce96eca92f36e69876f70275cf1bc6ccb9
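
The published digests can be used to check that a downloaded file is intact. The sketch below shows one way to do that with Python's standard `hashlib` module. The helper names `sha256_of_file` and `matches_published_digest` are hypothetical, and the local file path in the commented usage line is an assumption about where the download was saved.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, expected_hex):
    """Compare a file's SHA256 digest with a published hex string."""
    return sha256_of_file(path) == expected_hex.strip().lower()


# Example usage (assumes the sdist was saved in the current directory):
# matches_published_digest(
#     "easypreprocessing-1.0.4.tar.gz",
#     "d5bfbb2296412d5a93d58e82d0b9c2ddae03e61dc2824d1ba1faacda14cfc4e4",
# )
```

A `False` result means the local file differs from the one the digest was computed for and should not be installed.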
