imperio
Imperio is a Python package for feature engineering, inspired by scikit-learn. It contains a set of feature transformers that make your data easier for machine learning algorithms to learn from.
This version of imperio provides the following feature transformation methods:
- Box-Cox (BoxCoxTransformer).
- Clusterize (ClusterizeTransformer).
- Combinator (CombinatorTransformer).
- Frequency Imputation Transformer (FrequencyImputationTransformer).
- Log Transformer (LogTransformer).
- Smoothing (SmoothingTransformer).
- Spatial-Sign Transformer (SpatialSignTransformer).
- Target Imputation Transformer (TargetImputationTransformer).
- Whitening (WhiteningTransformer).
- Yeo-Johnson Transformer (YeoJohnsonTransformer).
- ZCA (ZCATransformer).
All these methods work like regular sklearn transformers. They implement the fit, transform and fit_transform functions.
Additionally, every imperio transformer has an apply function, which applies the transformation to a pandas DataFrame.
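To make the fit / transform / fit_transform convention concrete, here is a minimal pure-Python sketch. `ToyLogTransformer` is a hypothetical class written only for illustration; it is not imperio's LogTransformer and makes no claim about how imperio implements it. It only shows the pattern: fit learns state from the data, transform reuses that state.

```python
import math


class ToyLogTransformer:
    """Hypothetical stand-in for illustration, NOT imperio's implementation."""

    def fit(self, X, y=None):
        # Learn one shift per column so that every shifted training value
        # is at least 1, which keeps the logarithm defined and non-negative.
        n_cols = len(X[0])
        self.shifts_ = [1.0 - min(row[j] for row in X) for j in range(n_cols)]
        return self  # returning self allows chaining, as sklearn does

    def transform(self, X):
        # Reuse the shifts learned in fit; nothing is re-estimated here.
        return [
            [math.log(value + shift) for value, shift in zip(row, self.shifts_)]
            for row in X
        ]

    def fit_transform(self, X, y=None):
        # Convenience method: fit on X, then transform the same X.
        return self.fit(X, y).transform(X)


X = [[0.0, 10.0], [1.0, 20.0], [3.0, 40.0]]
X_transformed = ToyLogTransformer().fit_transform(X)
```

The y argument is accepted but ignored here, which mirrors how most of the transformers listed above are described as treating it.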
How to use imperio
To use a transformer from imperio, import it as follows:
from imperio import BoxCoxTransformer
The class names are given in parentheses in the list above.
Next, create an object of the transformer (Box-Cox is used as an example).
method = BoxCoxTransformer()
First, fit the transformer by passing it a feature matrix (X) and the target array (y). NOTE: the y argument is actually used only by Target-Imputation.
method.fit(X, y)
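The note about y can be illustrated with a small sketch of target-based encoding, where each category is replaced by the mean target value observed for it. This is one common form of the technique and is assumed here for illustration; the helper names `fit_target_means` and `encode` are hypothetical and are not part of imperio, whose TargetImputationTransformer may differ in its details.

```python
from collections import defaultdict


def fit_target_means(column, y):
    """Map each category to the mean of y over the rows where it occurs."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for category, target in zip(column, y):
        sums[category] += target
        counts[category] += 1
    return {category: sums[category] / counts[category] for category in sums}


def encode(column, means, default):
    # Categories unseen during fitting fall back to a default value.
    return [means.get(category, default) for category in column]


colors = ["red", "blue", "red", "green", "blue"]
y = [1, 0, 0, 1, 1]

means = fit_target_means(colors, y)       # this step needs y
global_mean = sum(y) / len(y)
encoded = encode(colors + ["purple"], means, default=global_mean)  # this one does not
```

Fitting needs the target, while encoding new data only needs the learned mapping, which is why transform takes X alone.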
After fitting the transformer, you can use it to transform new data with the transform function. Pass only the feature matrix (X) to transform.
X_transformed = method.transform(X)
You can also fit and transform the data in a single step using the fit_transform function.
X_transformed = method.fit_transform(X)
You can also apply a transformation directly to a pandas DataFrame, choosing the columns that you want to change.
new_df = method.apply(df, 'target', ['col1', 'col2'])
Some advice:
- Use FrequencyImputationTransformer or TargetImputationTransformer for categorical features.
- Use BoxCoxTransformer or YeoJohnsonTransformer for numerical features to normalize a feature distribution.
- Use SpatialSignTransformer on normalized data to bring outliers closer to the normal samples.
- Use CombinatorTransformer to combine different transformers on categorical and numerical columns separately.
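The ideas behind three of these recommendations can be sketched in a few lines of plain Python. The functions below are hypothetical helpers written for illustration, not imperio's code: `frequency_encode` assumes that frequency imputation means replacing a category with its relative frequency, `box_cox` uses the standard Box-Cox formula with a hand-picked lambda rather than a fitted one, and `spatial_sign` divides a sample by its Euclidean norm.

```python
import math
from collections import Counter


def frequency_encode(column):
    """Replace each category with its relative frequency in the column."""
    counts = Counter(column)
    total = len(column)
    return [counts[category] / total for category in column]


def box_cox(x, lam):
    """Box-Cox transform of a single positive value for a fixed lambda."""
    if x <= 0:
        raise ValueError("Box-Cox is only defined for positive values")
    if lam == 0:
        return math.log(x)
    return (x ** lam - 1.0) / lam


def spatial_sign(row):
    """Project a sample onto the unit sphere by dividing by its norm."""
    norm = math.sqrt(sum(value * value for value in row))
    if norm == 0.0:
        return list(row)  # the zero vector has no direction; leave it as is
    return [value / norm for value in row]


freqs = frequency_encode(["a", "b", "a", "a"])
compressed = [box_cox(x, lam=0.5) for x in [1.0, 4.0, 100.0]]
on_sphere = spatial_sign([3.0, 4.0])
```

Box-Cox requires strictly positive inputs, whereas Yeo-Johnson is defined for zero and negative values as well, which is the usual reason to choose one over the other. After the spatial sign step every non-zero sample has norm 1, so an extreme sample cannot lie farther from the origin than any other.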
With <3 from Sigmoid!
We are open to feedback. Please send your impressions to vladimir.stojoc@gmail.com
File details
Details for the file imperio-0.1.5.tar.gz.
File metadata
- Download URL: imperio-0.1.5.tar.gz
- Upload date:
- Size: 13.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.10.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 93a0686de50e1a1ff3fe25be66c18c40fbac56f41e17cc044e7b7398ba67e683 |
| MD5 | 7d6c5d92476e32610121f53a8ce5c25d |
| BLAKE2b-256 | 44c7c6222d0ed638e9ca0a111fe1d23fc2325c922dc8cd8b93acc98ab99ee12a |
File details
Details for the file imperio-0.1.5-py3-none-any.whl.
File metadata
- Download URL: imperio-0.1.5-py3-none-any.whl
- Upload date:
- Size: 24.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.10.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 3377a04d1e826fd0356564c3a6afc2fe79e0655e4e269ea29ca61e77cea7ac55 |
| MD5 | 53c003ab9289e4b4d5a003d6fbc0248b |
| BLAKE2b-256 | e6527ead831c549633ab45574bde68d3da489aaf8ad253f53d462281b7c5e50d |