Skip to main content

Feature engineering package with Scikit-learn's fit transform functionality

Project description

Feature Engine

PythonVersion License PyPI version Conda CircleCI Documentation Status Join the chat at Sponsorship Downloads Downloads DOI DOI first-timers-only

Feature-engine is a Python library with multiple transformers to engineer and select features for use in machine learning models. Feature-engine's transformers follow Scikit-learn's functionality with fit() and transform() methods to learn the transforming parameters from the data and then transform it.

Feature-engine features in the following resources

Blogs about Feature-engine

En Español


Current Feature-engine's transformers include functionality for:

  • Missing Data Imputation
  • Categorical Encoding
  • Discretisation
  • Outlier Capping or Removal
  • Variable Transformation
  • Variable Creation
  • Variable Selection
  • Datetime Features
  • Time Series
  • Preprocessing
  • Scikit-learn Wrappers

Imputation Methods

  • MeanMedianImputer
  • RandomSampleImputer
  • EndTailImputer
  • AddMissingIndicator
  • CategoricalImputer
  • ArbitraryNumberImputer
  • DropMissingData

Encoding Methods

  • OneHotEncoder
  • OrdinalEncoder
  • CountFrequencyEncoder
  • MeanEncoder
  • WoEEncoder
  • PRatioEncoder
  • RareLabelEncoder
  • DecisionTreeEncoder

Discretisation methods

  • EqualFrequencyDiscretiser
  • EqualWidthDiscretiser
  • DecisionTreeDiscretiser
  • ArbitraryDiscreriser

Outlier Handling methods

  • Winsorizer
  • ArbitraryOutlierCapper
  • OutlierTrimmer

Variable Transformation methods

  • LogTransformer
  • LogCpTransformer
  • ReciprocalTransformer
  • ArcsinTransformer
  • PowerTransformer
  • BoxCoxTransformer
  • YeoJohnsonTransformer

Variable Creation:

  • MathFeatures
  • RelativeFeatures
  • CyclicalFeatures

Feature Selection:

  • DropFeatures
  • DropConstantFeatures
  • DropDuplicateFeatures
  • DropCorrelatedFeatures
  • SmartCorrelationSelection
  • ShuffleFeaturesSelector
  • SelectBySingleFeaturePerformance
  • SelectByTargetMeanPerformance
  • RecursiveFeatureElimination
  • RecursiveFeatureAddition
  • DropHighPSIFeatures


  • DatetimeFeatures

Time Series

  • LagFeatures
  • WindowFeatures
  • ExpandingWindowFeatures


  • MatchVariables


  • SklearnTransformerWrapper


From PyPI using pip:

pip install feature_engine

From Anaconda:

conda install -c conda-forge feature_engine

Or simply clone it:

git clone

Example Usage

>>> import pandas as pd
>>> from feature_engine.encoding import RareLabelEncoder

>>> data = {'var_A': ['A'] * 10 + ['B'] * 10 + ['C'] * 2 + ['D'] * 1}
>>> data = pd.DataFrame(data)
>>> data['var_A'].value_counts()
A    10
B    10
C     2
D     1
Name: var_A, dtype: int64
>>> rare_encoder = RareLabelEncoder(tol=0.10, n_categories=3)
>>> data_encoded = rare_encoder.fit_transform(data)
>>> data_encoded['var_A'].value_counts()
A       10
B       10
Rare     3
Name: var_A, dtype: int64

Find more examples in our Jupyter Notebook Gallery or in the documentation.


Details about how to contribute can be found in the Contribute Page


  • Fork the repo
  • Clone your fork into your local computer: git clone<YOURUSERNAME>/feature_engine.git
  • navigate into the repo folder cd feature_engine
  • Install Feature-engine as a developer: pip install -e .
  • Optional: Create and activate a virtual environment with any tool of choice
  • Install Feature-engine dependencies: pip install -r requirements.txt and pip install -r test_requirements.txt
  • Create a feature branch with a meaningful name for your feature: git checkout -b myfeaturebranch
  • Develop your feature, tests and documentation
  • Make sure the tests pass
  • Make a PR

Thank you!!


Feature-engine documentation is built using Sphinx and is hosted on Read the Docs.

To build the documentation make sure you have the dependencies installed: from the root directory: pip install -r docs/requirements.txt.

Now you can build the docs using: sphinx-build -b html docs build


BSD 3-Clause

Sponsor us

Sponsor us and support further our mission to democratize machine learning and programming tools through open-source software.

Project details

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

feature_engine-1.4.1.tar.gz (158.1 kB view hashes)

Uploaded source

Built Distribution

feature_engine-1.4.1-py2.py3-none-any.whl (276.6 kB view hashes)

Uploaded py2 py3

Supported by

AWS AWS Cloud computing Datadog Datadog Monitoring Facebook / Instagram Facebook / Instagram PSF Sponsor Fastly Fastly CDN Google Google Object Storage and Download Analytics Huawei Huawei PSF Sponsor Microsoft Microsoft PSF Sponsor NVIDIA NVIDIA PSF Sponsor Pingdom Pingdom Monitoring Salesforce Salesforce PSF Sponsor Sentry Sentry Error logging StatusPage StatusPage Status page