An open source python library for automated feature engineering based on Genetic Programming

Evolutionary Forest

Introduction

Feature engineering is a long-standing issue that has plagued machine learning practitioners for many years. Deep learning techniques have significantly reduced the need for manual feature engineering in recent years. However, a critical issue is that the features discovered by deep learning methods are difficult to interpret.

In the domain of interpretable machine learning, genetic programming has proven to be a promising method for automated feature construction, as it can improve the performance of traditional machine learning systems while maintaining similar interpretability. Nonetheless, practitioners rarely mention this potent method. We believe the main reason is the lack of a mature package that can automatically build features with a genetic programming algorithm. We therefore propose this package, with the goal of providing a powerful feature construction tool for enhancing existing state-of-the-art machine learning algorithms, particularly decision-tree-based algorithms.
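To make the idea of an interpretable constructed feature concrete, here is a minimal sketch of how genetic programming can represent a feature as a small expression tree over the input columns. It is an illustration only, not this package's implementation: the names `PRIMITIVES`, `random_tree`, `tree_height`, `evaluate`, and `to_string` are hypothetical, and the primitive set is assumed to contain just addition, subtraction, and multiplication.

```python
import operator
import random

# Hypothetical primitive set for illustration: name -> (callable, arity).
PRIMITIVES = {
    "add": (operator.add, 2),
    "sub": (operator.sub, 2),
    "mul": (operator.mul, 2),
}


def random_tree(n_features, max_height, rng):
    """Grow a random expression tree; an int leaf is an input feature index."""
    if max_height == 0 or rng.random() < 0.3:
        return rng.randrange(n_features)
    name = rng.choice(sorted(PRIMITIVES))
    arity = PRIMITIVES[name][1]
    children = tuple(
        random_tree(n_features, max_height - 1, rng) for _ in range(arity)
    )
    return (name,) + children


def tree_height(tree):
    """Height of a tree; a single leaf has height 0."""
    if isinstance(tree, int):
        return 0
    return 1 + max(tree_height(child) for child in tree[1:])


def evaluate(tree, row):
    """Compute the constructed feature's value for one sample (a list of floats)."""
    if isinstance(tree, int):
        return row[tree]
    func = PRIMITIVES[tree[0]][0]
    return func(*(evaluate(child, row) for child in tree[1:]))


def to_string(tree):
    """Render the tree as a readable formula, which is what keeps it interpretable."""
    if isinstance(tree, int):
        return f"x{tree}"
    args = ", ".join(to_string(child) for child in tree[1:])
    return f"{tree[0]}({args})"


if __name__ == "__main__":
    rng = random.Random(0)
    tree = random_tree(n_features=4, max_height=3, rng=rng)
    row = [0.5, -1.0, 2.0, 3.0]
    print(to_string(tree), "=", evaluate(tree, row))
```

A genetic programming search would evolve a population of such trees, keeping those whose outputs help a downstream learner. Because each tree prints as an explicit formula, the resulting feature can be read and audited directly.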

Features

  • A powerful feature construction tool for generating interpretable machine learning features.

  • A reliable machine learning model with strong performance on small datasets.

Installation

From PyPI:

pip install -U evolutionary_forest

From GitHub (Latest Code):

pip install git+https://github.com/hengzhe-zhang/EvolutionaryForest.git
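After either command, one way to confirm that the package is visible to the current interpreter is to query the installed distribution metadata. This is a small sketch using only the standard library; the helper name `installed_version` is hypothetical, and the distribution name is assumed to match the one used in the `pip` command above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version string, or None if the distribution is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("evolutionary_forest")
    if found is None:
        print("evolutionary_forest is not installed in this environment")
    else:
        print("evolutionary_forest", found)
```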

Supported Algorithms

Example

An example of usage:

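from sklearn.datasets import load_diabetes
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from evolutionary_forest.forest import EvolutionaryForestRegressor

# Hold out 20% of the diabetes dataset for testing.
X, y = load_diabetes(return_X_y=True)
x_train, x_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
r = EvolutionaryForestRegressor(max_height=3, normalize=True, select='AutomaticLexicase',
                                gene_num=10, boost_size=100, n_gen=20, n_pop=200, cross_pb=1,
                                base_learner='Random-DT', verbose=True)
r.fit(x_train, y_train)
# Report R^2 on the held-out split.
print(r2_score(y_test, r.predict(x_test)))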
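The number printed by the example is the coefficient of determination, R^2 = 1 - SS_res / SS_tot. As a rough guide to reading it, the sketch below recomputes the same quantity in plain Python; the function name `r2` is hypothetical, and the sketch does not handle the constant-target edge case that scikit-learn treats specially.

```python
def r2(y_true, y_pred):
    """Coefficient of determination: 1 - (residual sum of squares / total sum of squares)."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    # Assumes y_true is not constant; otherwise ss_tot is zero.
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    print(r2([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))  # perfect predictions
    print(r2([1.0, 2.0, 3.0], [2.0, 2.0, 2.0]))  # always predicting the mean
```

A score of 1.0 means the predictions match the targets exactly, 0.0 means the model does no better than always predicting the mean of the test targets, and negative values mean it does worse.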

An example of improvements brought about by constructed features:

https://raw.githubusercontent.com/zhenlingcn/EvolutionaryForest/master/docs/constructed_features.png

Tutorials

Here are some notebook examples of using Evolutionary Forest:

Documentation

Tutorial: English Version | Chinese Version (中文版本)

Credits

This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.

Citation

Please cite our paper if you find it helpful :)

@article{zhang2021evolutionary,
  title={An Evolutionary Forest for Regression},
  author={Zhang, Hengzhe and Zhou, Aimin and Zhang, Hu},
  journal={IEEE Transactions on Evolutionary Computation},
  volume={26},
  number={4},
  pages={735--749},
  year={2021},
  publisher={IEEE}
}

@article{zhang2023sr,
  title={SR-Forest: A Genetic Programming based Heterogeneous Ensemble Learning Method},
  author={Zhang, Hengzhe and Zhou, Aimin and Chen, Qi and Xue, Bing and Zhang, Mengjie},
  journal={IEEE Transactions on Evolutionary Computation},
  year={2023},
  publisher={IEEE}
}

History

0.1.0 (2021-05-22)

  • First release on PyPI.
