Simple tool that assists with preprocessing pandas dataframes for Machine Learning.
Grimlock
We all know that in machine learning, preprocessing your data takes far more time than actually building a model. Enter grimlock.
grimlock fixes missing values and handles data encoding; feature scaling is coming soon.
Installation
Provided you already have NumPy, SciPy, scikit-learn, and pandas installed, the grimlock package is pip-installable:
$ pip install grimlock
Cleaning Missing Data
Fills missing values in a single column, combining the behavior of pandas DataFrame.fillna() and the scikit-learn Imputer.
from grimlock import clean_missing
clean_missing(dataframe, column, clean_type='zero')
Parameters
- dataframe: dataframe variable
- column: column name (string)
- clean_type: 'zero' (default), 'mean', 'mode', 'most_frequent' (string)
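To make the clean_type options concrete, here is a minimal sketch of what each one could plausibly do, written with plain pandas. It is not grimlock's implementation: the function name fill_missing_sketch is hypothetical, and treating 'mode' and 'most_frequent' as the same strategy, as well as returning a copy rather than modifying in place, are assumptions.

```python
import pandas as pd


def fill_missing_sketch(dataframe, column, clean_type="zero"):
    """Return a copy of `dataframe` with NaNs in `column` filled.

    Hypothetical stand-in for illustration only; grimlock's own
    behavior (e.g. in-place vs. copy) is not documented here.
    """
    series = dataframe[column]
    if clean_type == "zero":
        fill_value = 0
    elif clean_type == "mean":
        fill_value = series.mean()  # NaNs are skipped by default
    elif clean_type in ("mode", "most_frequent"):
        # Assumption: both names mean "the most common value".
        fill_value = series.mode().iloc[0]
    else:
        raise ValueError(f"unknown clean_type: {clean_type!r}")

    result = dataframe.copy()
    result[column] = series.fillna(fill_value)
    return result


df = pd.DataFrame({"age": [20.0, None, 30.0, 30.0]})
filled_zero = fill_missing_sketch(df, "age")
filled_mean = fill_missing_sketch(df, "age", clean_type="mean")
filled_mode = fill_missing_sketch(df, "age", clean_type="mode")
```

In this sketch the original df is left untouched, so the three strategies can be compared side by side before committing to one.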
Convert Categorical
Quick conversion for categorical features (non-ordinal)
from grimlock import convert_categorical
convert_categorical(dataframe, column, target_column)
Parameters
- dataframe: dataframe variable
- column: column name (string)
- target_column: target column name (string)
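For readers unfamiliar with non-ordinal encoding, the sketch below shows one common approach, one-hot encoding with pandas get_dummies. It is an illustration under assumptions, not grimlock's code: the name one_hot_sketch is hypothetical, and since the role of target_column is not described above, the sketch only guesses that the target is left unencoded and moved to the last position.

```python
import pandas as pd


def one_hot_sketch(dataframe, column, target_column):
    """One-hot encode `column` and keep `target_column` last.

    Hypothetical illustration; how grimlock actually uses
    `target_column` is an assumption here.
    """
    # Replace `column` with one indicator column per category.
    encoded = pd.get_dummies(dataframe, columns=[column])
    # Assumption: the target stays unencoded and goes last.
    feature_columns = [c for c in encoded.columns if c != target_column]
    return encoded[feature_columns + [target_column]]


df = pd.DataFrame(
    {
        "color": ["red", "blue", "red"],
        "size": [1, 2, 3],
        "label": [1, 0, 1],
    }
)
encoded = one_hot_sketch(df, "color", "label")
```

One-hot encoding avoids implying an order among categories, which is why it suits non-ordinal features; the trade-off is one extra column per distinct value.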
Feature Scaling
coming soon