Evolutionary structural learning framework FEDOT
This repository contains FEDOT, a framework for automated modeling and machine learning. It uses an evolutionary approach to automatically build composite models of real-world processes.
Composite models are models with a heterogeneous graph-based structure that can consist of ML models, domain-specific models, equation-based models, statistical models, and even other composite models. Composite modelling makes it possible to obtain efficient multi-scale solutions for various applied problems.
FEDOT can be used for classification, regression, clustering, time series forecasting, and other similar tasks. Solutions for other problems (e.g. Bayesian generation of synthetic data) can also be built using Fedot.Core.
The project is maintained by the research team of Natural Systems Simulation Lab, which is a part of the National Center for Cognitive Research of ITMO University.
Installation
Common installation:
$ pip install fedot
To work with the FEDOT source code:
$ git clone https://github.com/nccr-itmo/FEDOT.git
$ cd FEDOT
$ pip install -r requirements.txt
$ pytest -s test
FEDOT features
Generation of high-quality variable-shaped machine learning pipelines for various tasks: binary/multiclass classification, regression, clustering, time series forecasting;
Structural learning of composite models of different natures (hybrid, Bayesian, deep learning, etc.) using custom metrics;
Seamless integration of custom models (including domain-specific ones), frameworks and algorithms into pipelines;
Benchmarking utilities that can run real-world cases (ready-to-use examples are provided for credit scoring, sea surface height forecasting, oil production forecasting, etc.), state-of-the-art datasets (like PMLB) and synthetic data.
How to use
The main purpose of FEDOT is to identify a suitable composite model for a given dataset. The model is obtained via an optimization process (we also call it ‘composing’). First, prepare the datasets for fitting and validation and specify the task that you are going to solve:
task = Task(TaskTypesEnum.classification)
dataset_to_compose = InputData.from_csv(train_file_path, task=task)
dataset_to_validate = InputData.from_csv(test_file_path, task=task)
Then, choose the set of models that can be included in the composite model and the metric function to optimize:
available_model_types, _ = ModelTypesRepository().suitable_model(task_type=task.task_type)
metric_function = MetricsRepository().metric_by_id(ClassificationMetricsEnum.ROCAUC)
Next, specify the requirements for the composer. In this case, GPComposer, which is based on an evolutionary algorithm, is chosen.
composer_requirements = GPComposerRequirements(
    primary=available_model_types,
    secondary=available_model_types, max_arity=3,
    max_depth=3, pop_size=20, num_of_generations=20,
    crossover_prob=0.8, mutation_prob=0.8, max_lead_time=20)
composer = GPComposer()
Now you can run the optimization and obtain a composite model:
chain_evo_composed = composer.compose_chain(data=dataset_to_compose,
                                            initial_chain=None,
                                            composer_requirements=composer_requirements,
                                            metrics=metric_function,
                                            is_visualise=False)
Finally, you can test the resulting model on the validation dataset:
roc_on_valid_evo_composed = calculate_validation_metric(chain_evo_composed,
                                                        dataset_to_validate)
print(f'Composed ROC AUC is {round(roc_on_valid_evo_composed, 3)}')
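The helper calculate_validation_metric is not defined in the snippet above. Judging by the variable names and the printed message, it presumably compares the chain's predictions on the validation data with the true labels and returns ROC AUC. The chain's prediction API is not shown here, so the sketch below illustrates only the metric itself in plain Python; the function name roc_auc and the sample labels and scores are assumptions made for illustration, not part of FEDOT.

```python
from typing import Sequence


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """ROC AUC as the share of (positive, negative) pairs ranked correctly."""
    positives = [s for y, s in zip(y_true, scores) if y == 1]
    negatives = [s for y, s in zip(y_true, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("ROC AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive sample scored above the negative one
            elif p == n:
                wins += 0.5   # a tie counts as half a win
    return wins / (len(positives) * len(negatives))


# Hypothetical labels and predicted probabilities, for illustration only.
labels = [0, 0, 1, 1]
predicted = [0.1, 0.4, 0.35, 0.8]
print(f"ROC AUC is {round(roc_auc(labels, predicted), 3)}")  # ROC AUC is 0.75
```

The pairwise form is quadratic in the number of samples, which is fine for a small validation set; a rank-based implementation is the usual choice for larger ones.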
Extended examples:
Credit scoring problem, i.e. binary classification task
Time series forecasting, i.e. random process regression
Also, several video tutorials are available (in Russian).
Project structure
The latest stable release of FEDOT is on the master branch. If you plan to contribute code, make sure you are looking at and working on the up-to-date code.
The repository includes the following directories:
The core package contains the main classes and scripts. It is the core of the FEDOT framework.
The examples package includes several how-to-use cases where you can start exploring how FEDOT works.
All unit tests can be found in the test directory.
The documentation sources are in the docs directory.
You can also check the benchmarking repository, which was developed to compare FEDOT against well-known AutoML frameworks.
Basic Concepts
The main process in FEDOT is composing, which produces composite models.
The Composer is a block that takes meta-requirements and an evolutionary algorithm as the optimizer, and generates different chains of models to find the most appropriate solution for the case.
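To give a feel for what evolutionary optimization means here, the following is a minimal sketch of a generic evolutionary loop with elitism, tournament selection, crossover and mutation. It is not the GPComposer algorithm: FEDOT evolves graph-shaped chains of models, while this toy evolves bit strings. The function evolve and the genome representation are hypothetical; only the parameter names pop_size, num_of_generations, crossover_prob and mutation_prob are borrowed from the GPComposerRequirements call in the usage example.

```python
import random
from typing import Callable, List

Genome = List[int]


def evolve(fitness: Callable[[Genome], float], genome_len: int = 12,
           pop_size: int = 20, num_of_generations: int = 20,
           crossover_prob: float = 0.8, mutation_prob: float = 0.8,
           seed: int = 0) -> Genome:
    """Toy evolutionary search over bit strings (an illustration, not FEDOT code)."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(genome_len)]
                  for _ in range(pop_size)]
    for _ in range(num_of_generations):
        best = max(population, key=fitness)
        offspring = [best[:]]  # elitism: the best candidate survives unchanged
        while len(offspring) < pop_size:
            # Tournament selection: the fittest of three random candidates wins.
            parent_a = max(rng.sample(population, 3), key=fitness)
            parent_b = max(rng.sample(population, 3), key=fitness)
            child = parent_a[:]
            if rng.random() < crossover_prob:
                cut = rng.randrange(1, genome_len)  # one-point crossover
                child = parent_a[:cut] + parent_b[cut:]
            if rng.random() < mutation_prob:
                pos = rng.randrange(genome_len)     # flip one random bit
                child[pos] = 1 - child[pos]
            offspring.append(child)
        population = offspring
    return max(population, key=fitness)


# Maximise the number of ones in the bit string.
best = evolve(fitness=sum)
print(best, sum(best))
```

In a composer, the genome would be a chain structure and the fitness would be the chosen quality metric on the data, but the select, recombine, mutate cycle has the same shape.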
The result of composing, and the basic object the user works with, is the Chain. A Chain is the tree-based structure of any composite model. It keeps the information about node relations and everything related to chain properties and restructuring.
Any chain has two kinds of nodes:
- Primary nodes are the edge (leaf) nodes of the tree, where the initial case data is located.
- Secondary nodes are all other nodes, which transform data during composing and fitting, including the root node with the result data.
Every node holds a Model, which can be an ML model or any other kind of model.
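The relationship between primary nodes, secondary nodes and their models can be illustrated with a small tree structure. This is a sketch under stated assumptions, not FEDOT's Chain or node classes: ToyNode, its inputs field, depth, and the use of plain functions as models are all hypothetical stand-ins.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Sequence

# Hypothetical stand-in: a "model" is just a function from numbers to a number.
Model = Callable[[Sequence[float]], float]


@dataclass
class ToyNode:
    model: Model
    inputs: List["ToyNode"] = field(default_factory=list)

    @property
    def is_primary(self) -> bool:
        # Primary (leaf) nodes have no input nodes and see the initial data.
        return not self.inputs

    def predict(self, data: Sequence[float]) -> float:
        if self.is_primary:
            return self.model(data)
        # Secondary nodes apply their model to the outputs of their inputs.
        return self.model([node.predict(data) for node in self.inputs])


def depth(node: ToyNode) -> int:
    """Number of levels from this node down to its deepest leaf."""
    if node.is_primary:
        return 1
    return 1 + max(depth(child) for child in node.inputs)


mean_node = ToyNode(model=lambda xs: sum(xs) / len(xs))  # primary node
max_node = ToyNode(model=max)                            # primary node
root = ToyNode(model=sum, inputs=[mean_node, max_node])  # secondary (root) node

print(root.predict([1.0, 2.0, 3.0]))  # mean 2.0 + max 3.0 = 5.0
print(depth(root))                    # 2
```

Constraints such as max_depth and max_arity from the usage example would, in this picture, bound the value of depth and the length of inputs for every node.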
The referenced papers:
Kalyuzhnaya A. V. et al. Automatic evolutionary learning of composite models with knowledge enrichment //Proceedings of the 2020 Genetic and Evolutionary Computation Conference Companion. – 2020. – P. 43-44.
Kovalchuk S. V. et al. A conceptual approach to complex model management with generalized modelling patterns and evolutionary identification //Complexity. – 2018. – V. 2018.
Nikitin N. O. et al. Deadline-driven approach for multi-fidelity surrogate-assisted environmental model calibration: SWAN wind wave model case study //Proceedings of the Genetic and Evolutionary Computation Conference Companion. – 2019. – P. 1583-1591.
Vychuzhanin P., Nikitin N. O., Kalyuzhnaya A. V. Robust Ensemble-Based Evolutionary Calibration of the Numerical Wind Wave Model //International Conference on Computational Science. – Springer, Cham, 2019. – P. 614-627.
Nikitin N. O. et al. Evolutionary ensemble approach for behavioral credit scoring //International Conference on Computational Science. – Springer, Cham, 2018. – P. 825-831.
Current R&D and future plans
At the moment, we are running an extensive set of experiments to determine the most suitable approaches for evolutionary chain optimization, hyperparameter tuning, benchmarking, etc. Case studies from different subject areas (metocean science, oil production, seismic, robotics, economics, etc.) are in progress. Various features are planned: multi-data chains, Bayesian network optimization, involvement of domain-specific and equation-based models, model export and atomization, interpretable surrogate models, etc.
Any support and contribution are welcome.
Documentation
The documentation is available in the FEDOT.Docs repository.
The description and source code of the underlying algorithms are available in the FEDOT.Algs repository and its wiki pages (in Russian).
The FEDOT API reference is also available on Read the Docs.
Contribution Guide
The contribution guide is available in the repository.
Acknowledgements
We acknowledge the contributors for their important impact and the participants of the numerous scientific conferences and workshops for their valuable advice and suggestions.
Citation
@inproceedings{kalyuzhnaya2020automatic,
  title={Automatic evolutionary learning of composite models with knowledge enrichment},
  author={Kalyuzhnaya, Anna V and Nikitin, Nikolay O and Vychuzhanin, Pavel and Hvatov, Alexander and Boukhanovsky, Alexander},
  booktitle={Proceedings of the 2020 Genetic and Evolutionary Computation Conference Companion},
  pages={43--44},
  year={2020}}