Prepnet

Reconstructable preprocessor library.

The library is built around two concepts:

  • Every preprocess can be saved as a pickle.
  • Preprocesses are reconstructable, for use in feature analysis.
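
To make these two concepts concrete, here is a minimal sketch in plain Python. It does not use prepnet; the function names `fit_standardizer`, `encode`, and `decode` are hypothetical and chosen only for illustration. The idea shown is that a preprocess keeps the statistics it fitted, so its state can be pickled and later used to map encoded values back to the original scale.

```python
import pickle
import statistics


def fit_standardizer(values):
    # Keep the fitted statistics so the transform can be undone later.
    return {"mean": statistics.fmean(values), "std": statistics.pstdev(values)}


def encode(values, params):
    # Scale to zero mean and unit variance.
    return [(v - params["mean"]) / params["std"] for v in values]


def decode(values, params):
    # Invert encode() using the stored statistics.
    return [v * params["std"] + params["mean"] for v in values]


raw = [5.1, 4.9, 4.7, 4.6, 5.0]
params = fit_standardizer(raw)
encoded = encode(raw, params)

# Round-trip the fitted state through pickle, as if saved and reloaded.
restored = pickle.loads(pickle.dumps(params))
reconstructed = decode(encoded, restored)
```

Because the fitted state survives the pickle round trip, `reconstructed` matches `raw` up to floating-point error, which is what allows a feature to be inspected in its original units after preprocessing.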

Example

For a simple example, see examples/01_iris.ipynb. The following excerpt from that example preprocesses the iris dataset with prepnet.

import pandas as pd
import prepnet
from sklearn import datasets

# Load dataset.
iris = datasets.load_iris()
df = pd.DataFrame(iris.data, columns=iris.feature_names)
df['target'] = iris.target_names[iris.target]

# Standardize using the mean and standard deviation, then split into 5 folds.
context = prepnet.FunctionalContext()
with context.enter('normalize'):
    # All preprocess methods support method chaining.
    context[
        'sepal length (cm)',
        'sepal width (cm)',
        'petal length (cm)',
        'petal width (cm)',
    ].standardize()

# context.post is always executed after the other preprocesses.
with context.enter('post'):
    context.split()

# Convert the prepnet.DataFrameArray into a Python list.
preprocessed_df_list = list(context.encode(df))
# Concatenate the first 4 elements to form the training dataset.
train_df = pd.concat(preprocessed_df_list[:4], axis=0)
# Use the last element as the test dataset.
test_df = preprocessed_df_list[-1]
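
The train/test selection above relies on the split producing five pieces. As a rough sketch of that idea in plain Python, without prepnet or pandas, the hypothetical `split_folds` helper below cuts a list of rows into contiguous folds. Whether prepnet's `split()` shuffles or stratifies is not stated here, so the contiguous behaviour is an assumption.

```python
def split_folds(rows, n_folds=5):
    # Assign rows to contiguous, near-equal folds (assumption: no shuffling).
    fold_size, remainder = divmod(len(rows), n_folds)
    folds, start = [], 0
    for i in range(n_folds):
        end = start + fold_size + (1 if i < remainder else 0)
        folds.append(rows[start:end])
        start = end
    return folds


rows = list(range(150))  # stand-in for the 150 rows of the iris dataset
folds = split_folds(rows)

# Mirror the example: first four folds for training, last fold for testing.
train_rows = [row for fold in folds[:4] for row in fold]
test_rows = folds[-1]
```

With 150 rows this gives 120 training rows and 30 test rows. Note that the iris data is ordered by species, so a contiguous split like this one would put a single species in the last fold; a practical split would normally shuffle first.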

The normalize step of the preprocessor context above can also be disabled easily:

new_context = context.disable()
# Encode with the context returned by disable().
preprocessed_df_list = list(new_context.encode(df))
# Concatenate the first 4 elements to form the training dataset.
nonnorm_train_df = pd.concat(preprocessed_df_list[:4], axis=0)
# Use the last element as the test dataset.
nonnorm_test_df = preprocessed_df_list[-1]
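
The idea of switching off a named stage can be sketched independently of prepnet. The `StagePipeline` class below is hypothetical and is not prepnet's implementation; it assumes that disabling a stage returns a new pipeline and leaves the original unchanged, which the source does not confirm.

```python
class StagePipeline:
    """Hypothetical named-stage pipeline (not prepnet's implementation)."""

    def __init__(self):
        self.stages = {}       # stage name -> function, in insertion order
        self.disabled = set()  # names of stages to skip

    def add(self, name, func):
        self.stages[name] = func
        return self

    def disable(self, name):
        # Return a copy so the original pipeline stays usable for comparison.
        clone = StagePipeline()
        clone.stages = dict(self.stages)
        clone.disabled = self.disabled | {name}
        return clone

    def encode(self, values):
        # Apply every enabled stage in order.
        for name, func in self.stages.items():
            if name not in self.disabled:
                values = func(values)
        return values


pipeline = StagePipeline()
pipeline.add("normalize", lambda xs: [x / max(xs) for x in xs])
pipeline.add("post", lambda xs: sorted(xs))

raw = [4.0, 2.0, 8.0]
with_norm = pipeline.encode(raw)
without_norm = pipeline.disable("normalize").encode(raw)
```

Keeping both pipelines side by side makes the difference between two preprocessed datasets explicit: it is exactly the set of disabled stage names.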

Does this sound familiar?

Boss: Hey, what's the difference between the new results and the old ones?

Someone: Well, some preprocesses are different.

Boss: Okay. Let me see the dataset.

Someone: Yes, sir. It's this and this.

Boss: What's the difference between the two datasets? The values differ only slightly. What changed in the preprocessing?

Someone: Well, I just don't know.

Boss: Why? The dataset contains a commit ID, and you're managing the source code with git.

Someone: Even if I knew which version of the code the dataset was created from, I probably commented out some of the details when I preprocessed it...

Boss: Hey you...

Install

pip install prepnet

or

git clone https://github.com/elda27/prepnet
cd prepnet
python setup.py install

Test

python -m pytest --cov=prepnet
