Simple tool that assists with preprocessing pandas dataframes for Machine Learning.

Project description

Grimlock

In machine learning, preprocessing your data often takes far more time than actually building a model. Enter grimlock.

grimlock fills in your missing values and handles categorical data encoding. Feature scaling is coming soon.

Installation

Provided you already have NumPy, SciPy, scikit-learn, and pandas installed, the grimlock package is pip-installable:

$ pip install grimlock

Cleaning Missing Data

A mesh of pandas' DataFrame.fillna() and scikit-learn's Imputer.

from grimlock import clean_missing
clean_missing(dataframe, column, clean_type='zero')

Parameters

  • dataframe: dataframe variable
  • column: column name (string)
  • clean_type: 'zero' (default), 'mean', 'mode', 'most_frequent' (string)
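
To make the clean_type options concrete, here is a minimal pandas-only sketch of what each strategy plausibly does. It is not grimlock's source code: the function name fill_missing_sketch is hypothetical, and it assumes that 'mode' and 'most_frequent' both select the most common non-missing value and that a new dataframe is returned rather than the input being modified in place.

```python
import numpy as np
import pandas as pd


def fill_missing_sketch(dataframe, column, clean_type="zero"):
    """Hypothetical stand-in for clean_missing; not grimlock's actual code."""
    series = dataframe[column]
    if clean_type == "zero":
        fill_value = 0
    elif clean_type == "mean":
        # Series.mean() skips NaN values by default.
        fill_value = series.mean()
    elif clean_type in ("mode", "most_frequent"):
        # Assumption: both names pick the most common non-missing value.
        fill_value = series.mode().iloc[0]
    else:
        raise ValueError(f"unsupported clean_type: {clean_type!r}")
    # Assumption: work on a copy so the caller's dataframe is left untouched.
    result = dataframe.copy()
    result[column] = series.fillna(fill_value)
    return result


df = pd.DataFrame({"age": [20.0, np.nan, 40.0, 40.0]})
zero_filled = fill_missing_sketch(df, "age")
mean_filled = fill_missing_sketch(df, "age", clean_type="mean")
mode_filled = fill_missing_sketch(df, "age", clean_type="most_frequent")
```

With this sample column, the missing entry becomes 0 under 'zero', the average of the three known values under 'mean', and 40.0 under 'most_frequent'. Whether grimlock itself returns a copy or edits the dataframe in place is not stated above, so check the result before relying on either behavior.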

Convert Categorical

Quick conversion for categorical features (non-ordinal)

from grimlock import convert_categorical
convert_categorical(dataframe, column, target_column)

Parameters

  • dataframe: dataframe variable
  • column: column name (string)
  • target_column: target column name (string)
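
The description above does not say which encoding is used or what role target_column plays, so the following is only a sketch of one common approach for non-ordinal features: one-hot encoding with pandas.get_dummies. The function name one_hot_sketch is hypothetical, and it assumes that target_column is the label column, which is left unencoded and moved to the end.

```python
import pandas as pd


def one_hot_sketch(dataframe, column, target_column):
    """Hypothetical stand-in for convert_categorical; not grimlock's actual code."""
    if column == target_column:
        raise ValueError("column and target_column must differ")
    # One indicator column per category, so no ordering is implied.
    encoded = pd.get_dummies(dataframe, columns=[column], dtype=int)
    # Assumption: the target column is kept as-is and placed last.
    feature_columns = [name for name in encoded.columns if name != target_column]
    return encoded[feature_columns + [target_column]]


df = pd.DataFrame(
    {"color": ["red", "blue", "red"], "price": [10, 12, 9], "sold": [1, 0, 1]}
)
encoded = one_hot_sketch(df, "color", "sold")
```

Here the color column is replaced by color_blue and color_red indicator columns, while price and sold keep their original values. grimlock's real output layout may differ.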

Feature Scaling

Coming soon.

Download files

Download the file for your platform.

Source Distribution

grimlock-0.0.1.tar.gz (2.3 kB)

Uploaded Source

Built Distribution

grimlock-0.0.1-py3-none-any.whl (2.8 kB)

Uploaded Python 3
