Imperio is a Python package for feature engineering, inspired by scikit-learn.
Project description
Imperio provides a set of feature transformers that make your data easier for machine learning algorithms to learn from.
This version of imperio includes the following feature transformation methods:
Box-Cox (BoxCoxTransformer).
Clusterize (ClusterizeTransformer).
Combinator (CombinatorTransformer).
Frequency Imputation Transformer (FrequencyImputationTransformer).
Log Transformer (LogTransformer).
Smoothing (SmoothingTransformer).
Spatial-Sign Transformer (SpatialSignTransformer).
Target Imputation Transformer (TargetImputationTransformer).
Whitening (WhiteningTransformer).
Yeo-Johnson Transformer (YeoJohnsonTransformer).
ZCA (ZCATransformer).
All these methods work like normal sklearn transformers. They have fit, transform and fit_transform functions implemented.
Additionally, every imperio transformer has an apply function that applies the transformation to a pandas DataFrame.
How to use imperio
To use a transformer from imperio, import it as follows:
`from imperio import <class name>`
The class names are listed above in parentheses.
Next, create an instance of the transformer (Box-Cox is used here as an example).
`method = BoxCoxTransformer()`
First, fit the transformer by passing it a feature matrix (X) and the target array (y). The y argument is actually used only by Target Imputation.
`method.fit(X, y)`
After fitting the transformer, you can use it to transform new data with the transform function. Pass only the feature matrix (X) to transform.
`X_transformed = method.transform(X)`
You can also fit and transform the data in one step using the fit_transform function.
`X_transformed = method.fit_transform(X)`
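The fit, transform and fit_transform steps above can be illustrated without imperio installed. The sketch below uses a hypothetical stand-in class, `ToyFrequencyEncoder`, which is not part of imperio and only mirrors the call pattern described here. It assumes a simple frequency encoding (each category is replaced by its relative frequency in the fitted data) and maps unseen categories to 0.0; imperio's own transformers may behave differently.

```python
from collections import Counter


class ToyFrequencyEncoder:
    """Hypothetical stand-in, NOT imperio code: replaces each category
    with the relative frequency it had in the data passed to fit()."""

    def fit(self, X, y=None):
        # X is a list of rows (lists). y is accepted but ignored,
        # mirroring the fit(X, y) call pattern described above.
        n_rows = len(X)
        n_cols = len(X[0])
        self.frequencies_ = [
            {value: count / n_rows
             for value, count in Counter(row[j] for row in X).items()}
            for j in range(n_cols)
        ]
        return self

    def transform(self, X):
        # Categories never seen during fit() map to 0.0 (an assumption).
        return [
            [self.frequencies_[j].get(value, 0.0)
             for j, value in enumerate(row)]
            for row in X
        ]

    def fit_transform(self, X, y=None):
        # Fit on X, then transform the same X in one call.
        return self.fit(X, y).transform(X)


X_train = [["red"], ["red"], ["blue"], ["green"]]
X_new = [["blue"], ["red"], ["purple"]]

method = ToyFrequencyEncoder()
method.fit(X_train)                          # learn frequencies from training data
X_new_transformed = method.transform(X_new)  # reuse them on new data
X_train_transformed = ToyFrequencyEncoder().fit_transform(X_train)
```

The point of the split is that statistics are learned once in fit and then reused unchanged in transform, so new data is encoded with the training frequencies rather than its own.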
You can also apply a transformation directly to a pandas DataFrame, choosing the columns that you want to change.
`new_df = method.apply(df, 'target', ['col1', 'col2'])`
Some advice.
Use `FrequencyImputationTransformer` and `TargetImputationTransformer` for categorical features.
Use `BoxCoxTransformer` and `YeoJohnsonTransformer` for numerical features to normalize a feature distribution.
Use `SpatialSignTransformer` on normalized data to bring outliers in line with the rest of the samples.
Use `CombinatorTransformer` to combine different transformers on categorical and numerical columns separately.
With love from Sigmoid.
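To make the advice on `BoxCoxTransformer` and `SpatialSignTransformer` more concrete, here is a minimal pure-Python sketch of the standard textbook definitions of the two transforms. The helper names `box_cox` and `spatial_sign` are hypothetical and are not imperio functions; imperio's implementations (for example, how the Box-Cox parameter is chosen) may differ.

```python
import math


def box_cox(x, lam):
    """Standard Box-Cox transform of one positive value (illustrative helper)."""
    if x <= 0:
        raise ValueError("Box-Cox is only defined for positive values")
    if lam == 0:
        return math.log(x)            # the lam = 0 case is the natural log
    return (x ** lam - 1.0) / lam


def spatial_sign(row):
    """Standard spatial sign: scale a row to unit Euclidean length."""
    norm = math.sqrt(sum(v * v for v in row))
    if norm == 0.0:
        return list(row)              # leave the zero vector unchanged
    return [v / norm for v in row]


# A right-skewed feature becomes evenly spaced after the log case of Box-Cox.
skewed = [1.0, 10.0, 100.0, 1000.0]
logged = [box_cox(v, 0) for v in skewed]

# An outlier pointing in the same direction as an inlier lands on the same point.
inlier = spatial_sign([3.0, 4.0])
outlier = spatial_sign([300.0, 400.0])
```

Because the spatial sign keeps only the direction of each sample, an extreme value cannot dominate by magnitude alone, which is why it is usually applied after the features have been centered and scaled.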
We are open to feedback. Please send your impressions to papaluta.vasile@isa.utm.md