An open source python library for automated feature engineering based on Genetic Programming

Evolutionary Forest

Introduction

Feature engineering is a long-standing challenge for machine learning practitioners. In recent years, deep learning techniques have significantly reduced the need for manual feature engineering. However, the features learned by deep learning methods are difficult to interpret.

In the domain of interpretable machine learning, genetic programming has proven to be a promising method for automated feature construction, as it can improve the performance of traditional machine learning systems while maintaining similar interpretability. Nonetheless, practitioners rarely mention this method. We believe the main reason is the lack of a mature package that can automatically build features with a genetic programming algorithm. We therefore propose this package, with the goal of providing a powerful feature construction tool for enhancing existing state-of-the-art machine learning algorithms, particularly decision-tree-based algorithms.
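To make the idea of interpretable feature construction concrete, here is a minimal sketch using only the Python standard library. It is not this package's implementation: it only samples random expression trees (the representation that genetic programming evolves) and scores each one by its correlation with the target, omitting selection, crossover, and mutation. The primitive set, the fitness function, the synthetic data, and all function names are assumptions made for illustration.

```python
import operator
import random

# Primitive set: binary operators a constructed feature may use (an assumption
# for this sketch; the library's actual primitive set may differ).
PRIMITIVES = {"add": operator.add, "sub": operator.sub, "mul": operator.mul}


def random_tree(n_features, max_height, rng):
    """Grow a random expression tree; leaves are feature indices."""
    if max_height == 0 or rng.random() < 0.3:
        return rng.randrange(n_features)
    name = rng.choice(sorted(PRIMITIVES))
    return (name,
            random_tree(n_features, max_height - 1, rng),
            random_tree(n_features, max_height - 1, rng))


def evaluate(tree, row):
    """Compute the constructed feature value for one sample."""
    if isinstance(tree, int):
        return row[tree]
    name, left, right = tree
    return PRIMITIVES[name](evaluate(left, row), evaluate(right, row))


def to_string(tree):
    """Render the tree as a readable formula, e.g. mul(x0, x1)."""
    if isinstance(tree, int):
        return f"x{tree}"
    name, left, right = tree
    return f"{name}({to_string(left)}, {to_string(right)})"


def abs_correlation(values, target):
    """Absolute Pearson correlation, used here as a simple fitness score."""
    n = len(values)
    mean_v, mean_t = sum(values) / n, sum(target) / n
    cov = sum((v - mean_v) * (t - mean_t) for v, t in zip(values, target))
    var_v = sum((v - mean_v) ** 2 for v in values)
    var_t = sum((t - mean_t) ** 2 for t in target)
    if var_v == 0 or var_t == 0:
        return 0.0
    return abs(cov) / (var_v * var_t) ** 0.5


rng = random.Random(0)
X = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(200)]
y = [row[0] * row[1] for row in X]  # synthetic target with an interaction

# Sample candidate features and keep the one most correlated with the target.
population = [random_tree(3, 2, rng) for _ in range(300)]
scores = [abs_correlation([evaluate(t, row) for row in X], y) for t in population]
best_score, best_tree = max(zip(scores, map(to_string, population)))
print(best_tree, round(best_score, 3))
```

Because every candidate is a small formula over the original columns, the selected feature can be read directly, which is the interpretability property described above.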

Features

  • A powerful feature construction tool for generating interpretable machine learning features.

  • A reliable machine learning model that performs well on small datasets.

Installation

From PyPI:

pip install -U evolutionary_forest

From GitHub (Latest Code):

pip install git+https://github.com/hengzhe-zhang/EvolutionaryForest.git

Supported Algorithms

Example

An example of usage:

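from sklearn.datasets import load_diabetes
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from evolutionary_forest.forest import EvolutionaryForestRegressor

# Hold out 20% of the diabetes dataset for testing
X, y = load_diabetes(return_X_y=True)
x_train, x_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
r = EvolutionaryForestRegressor(max_height=3, normalize=True, select='AutomaticLexicase',
                                gene_num=10, boost_size=100, n_gen=20, n_pop=200, cross_pb=1,
                                base_learner='Random-DT', verbose=True)
r.fit(x_train, y_train)
# Report R^2 on the held-out test set
print(r2_score(y_test, r.predict(x_test)))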

An example of improvements brought about by constructed features:

https://raw.githubusercontent.com/zhenlingcn/EvolutionaryForest/master/docs/constructed_features.png
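As a toy illustration of why a constructed feature can help a tree-based learner, the sketch below compares the best single-threshold split (a decision stump) on each raw feature with the best split on a hand-built product feature. It does not reproduce the figure above or use this package; the data, the product feature, and the helper names are assumptions chosen for illustration.

```python
def sse(values):
    """Sum of squared errors around the mean."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_stump_error(feature, target):
    """Lowest SSE achievable by one threshold split on a single feature."""
    pairs = sorted(zip(feature, target))
    best = sse(target)  # error with no split at all
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # a threshold cannot separate equal feature values
        left = [t for _, t in pairs[:i]]
        right = [t for _, t in pairs[i:]]
        best = min(best, sse(left) + sse(right))
    return best


# Toy data: the target depends only on the sign of the interaction x0 * x1.
X = [(-2, -1), (-1, -2), (-2, 1), (-1, 2), (1, -2), (2, -1), (1, 2), (2, 1)]
y = [1.0 if a * b > 0 else -1.0 for a, b in X]

# Best stump on each raw feature versus on the constructed feature x0 * x1.
raw_errors = [best_stump_error([row[j] for row in X], y) for j in range(2)]
constructed_error = best_stump_error([a * b for a, b in X], y)
print(min(raw_errors), constructed_error)  # prints: 8.0 0.0
```

On this data no single split on a raw feature reduces the error, while one split on the product feature fits the target exactly. Real datasets are rarely this clean, so the gain from constructed features will generally be smaller.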

Tutorials

Here are some notebook examples of using Evolutionary Forest:

Documentation

Tutorial: English Version | 中文版本

Credits

This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.

Citation

Please cite our paper if you find it helpful :)

@article{zhang2021evolutionary,
  title={An Evolutionary Forest for Regression},
  author={Zhang, Hengzhe and Zhou, Aimin and Zhang, Hu},
  journal={IEEE Transactions on Evolutionary Computation},
  volume={26},
  number={4},
  pages={735--749},
  year={2021},
  publisher={IEEE}
}

@article{zhang2023sr,
  title={SR-Forest: A Genetic Programming based Heterogeneous Ensemble Learning Method},
  author={Zhang, Hengzhe and Zhou, Aimin and Chen, Qi and Xue, Bing and Zhang, Mengjie},
  journal={IEEE Transactions on Evolutionary Computation},
  year={2023},
  publisher={IEEE}
}

History

0.1.0 (2021-05-22)

  • First release on PyPI.
