Feature engineering package with Scikit-learn's fit transform functionality

Feature Engine

Supported Python versions: 3.6, 3.7, 3.8

Feature-engine is a Python library with multiple transformers to engineer features for use in machine learning models. Feature-engine's transformers follow Scikit-learn's functionality, with fit() and transform() methods that first learn the transformation parameters from the data and then transform the data.
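The fit() / transform() pattern described above can be illustrated with a minimal, dependency-free sketch. The class MedianImputerSketch, its imputer_dict_ attribute and the dict-of-lists data layout are hypothetical and chosen only for illustration; this is not Feature-engine's implementation, which works on pandas DataFrames.

```python
from statistics import median


class MedianImputerSketch:
    """Hypothetical transformer illustrating the fit()/transform() contract."""

    def __init__(self, variables):
        self.variables = variables  # names of the columns to impute

    def fit(self, data):
        # Learn one parameter per variable: the median of its observed values.
        self.imputer_dict_ = {
            var: median(v for v in data[var] if v is not None)
            for var in self.variables
        }
        return self

    def transform(self, data):
        # Apply the learned medians to a copy; the input is left unmodified.
        out = {col: list(values) for col, values in data.items()}
        for var, fill in self.imputer_dict_.items():
            out[var] = [fill if v is None else v for v in out[var]]
        return out


# Parameters are learned on the training data only...
train = {"age": [20, None, 40, 30], "fare": [7.5, 8.0, None, 9.0]}
test = {"age": [None, 55], "fare": [10.0, None]}

imputer = MedianImputerSketch(variables=["age", "fare"]).fit(train)

# ...and then reused to transform any other data set.
test_imputed = imputer.transform(test)
```

Separating the two steps means the values learned from the training set are the ones applied to new data, which avoids leaking information from the test set into the transformation.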

Feature-engine features in the following resources:

  • Feature Engineering for Machine Learning, Online Course
  • Python Feature Engineering Cookbook


Feature-engine's transformers currently include functionality for:

  • Missing data imputation
  • Categorical variable encoding
  • Outlier removal
  • Discretisation
  • Numerical Variable Transformation

Imputing Methods

  • MeanMedianImputer
  • RandomSampleImputer
  • EndTailImputer
  • AddNaNBinaryImputer
  • CategoricalVariableImputer
  • FrequentCategoryImputer
  • ArbitraryNumberImputer

Encoding Methods

  • CountFrequencyCategoricalEncoder
  • OrdinalCategoricalEncoder
  • MeanCategoricalEncoder
  • WoERatioCategoricalEncoder
  • OneHotCategoricalEncoder
  • RareLabelCategoricalEncoder

Outlier Handling methods

  • Winsorizer
  • ArbitraryOutlierCapper
  • OutlierTrimmer

Discretisation methods

  • EqualFrequencyDiscretiser
  • EqualWidthDiscretiser
  • DecisionTreeDiscretiser

Variable Transformation methods

  • LogTransformer
  • ReciprocalTransformer
  • PowerTransformer
  • BoxCoxTransformer
  • YeoJohnsonTransformer


pip install feature_engine


git clone


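from feature_engine.categorical_encoders import RareLabelCategoricalEncoder

# data is a pandas DataFrame that contains the 'Cabin' and 'Age' columns
rare_encoder = RareLabelCategoricalEncoder(tol=0.05, n_categories=5, variables=['Cabin', 'Age'])
rare_encoder.fit(data)  # learn the frequent categories of each variable
data_encoded = rare_encoder.transform(data)  # group the remaining categories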
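To make the idea behind the rare-label encoder in the example above more concrete, here is a minimal pure-Python sketch of rare-label grouping. The helper names find_frequent_labels and group_rare_labels are hypothetical, and the details (a label counts as frequent when its share is at least tol, variables with fewer than n_categories distinct labels are left untouched, and unseen labels are grouped as "Rare") are assumptions made for illustration, not a description of the library's internals.

```python
from collections import Counter


def find_frequent_labels(values, tol=0.05, n_categories=5):
    """Return the labels whose relative frequency is at least tol."""
    counts = Counter(values)
    if len(counts) < n_categories:
        # Assumption: with too few categories, all of them count as frequent.
        return set(counts)
    total = len(values)
    return {label for label, n in counts.items() if n / total >= tol}


def group_rare_labels(values, frequent, replacement="Rare"):
    """Replace every label that is not in `frequent` with `replacement`."""
    return [v if v in frequent else replacement for v in values]


# "Fit": learn the frequent labels from the training column.
train_cabin = ["A"] * 50 + ["B"] * 30 + ["C"] * 15 + ["D"] * 3 + ["E"] * 2
frequent = find_frequent_labels(train_cabin, tol=0.05, n_categories=5)

# "Transform": apply the learned labels to new values.
encoded = group_rare_labels(["A", "D", "E", "C", "Z"], frequent)
```

Here "D" and "E" each cover less than 5% of the training values, so they are grouped together with the unseen label "Z".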

See more usage examples in the Jupyter notebooks in the example folder of this repository, or in the documentation.


Local Setup Steps

  • Clone the repo and cd into it
  • Run pip install tox
  • Run tox; if the tests pass, your local setup is complete

Opening Pull Requests

PRs are welcome! Please make sure the CI tests pass on your branch.


License: BSD 3-Clause



Much of the engineering and encoding functionality is inspired by this series of articles from the 2009 KDD competition.

To learn more about the rationale, functionality, pros and cons of each imputer, encoder and transformer, refer to the Feature Engineering for Machine Learning, Online Course.

For a summary of the methods, check this presentation and this article.

To stay alert to the latest releases, sign up at trainindata.

Files for feature-engine, version 0.4.1
  • feature_engine-0.4.1-py3-none-any.whl (26.7 kB), wheel, Python version py3
  • feature_engine-0.4.1.tar.gz (21.6 kB), source