Evolutionary structural learning framework FEDOT

FEDOT


This repository contains FEDOT, a framework for automated modeling and machine learning. It can build composite models for different real-world processes in an automated way using an evolutionary approach.

Composite models are models with a heterogeneous graph-based structure that can consist of ML models, domain-specific models, equation-based models, statistical models, and even other composite models. Composite modelling makes it possible to obtain efficient multi-scale solutions for various applied problems.

FEDOT can be used for classification, regression, clustering, time series forecasting, and other similar tasks. Derived solutions for other problems (e.g. Bayesian generation of synthetic data) can also be built using Fedot.Core.

The project is maintained by the research team of the Natural Systems Simulation Lab, which is a part of the National Center for Cognitive Research of ITMO University.

Installation

Common installation:

$ pip install fedot

To work with the FEDOT source code:

$ git clone https://github.com/nccr-itmo/FEDOT.git
$ cd FEDOT
$ pip install -r requirements.txt
$ pytest -s test

FEDOT features

  • Generation of high-quality, variable-shaped machine learning pipelines for various tasks: binary/multiclass classification, regression, clustering, time series forecasting;

  • Structural learning of composite models of different nature (hybrid, Bayesian, deep learning, etc.) using custom metrics;

  • Seamless integration of custom models (including domain-specific ones), frameworks and algorithms into pipelines;

  • Benchmarking utilities that can run real-world cases (ready-to-use examples are provided for credit scoring, sea surface height forecasting, oil production forecasting, etc.), state-of-the-art datasets (like PMLB) and synthetic data.

How to use

The main purpose of FEDOT is to identify a suitable composite model for a given dataset. The model is obtained via an optimization process (we also call it ‘composing’). First, prepare the datasets for fitting and validation and specify the task that you are going to solve:

task = Task(TaskTypesEnum.classification)
dataset_to_compose = InputData.from_csv(train_file_path, task=task)
dataset_to_validate = InputData.from_csv(test_file_path, task=task)
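If you only have a single CSV file, the two input files can be produced beforehand. The following is a minimal sketch using only the Python standard library; the function name split_csv, the 25% test share and the file names in the usage note are illustrative assumptions, not part of FEDOT.

```python
import csv
import random


def split_csv(source_path, train_path, test_path, test_share=0.25, seed=42):
    """Split a CSV file with a header row into train and test files.

    Hypothetical helper for illustration only; it is not part of FEDOT.
    Returns the number of rows written to the train and test files.
    """
    with open(source_path, newline="") as src:
        reader = csv.reader(src)
        header = next(reader)
        rows = list(reader)

    # Shuffle with a fixed seed so that the split is reproducible.
    random.Random(seed).shuffle(rows)
    n_test = int(len(rows) * test_share)
    test_rows, train_rows = rows[:n_test], rows[n_test:]

    # Write both parts, repeating the header row in each file.
    for path, part in ((train_path, train_rows), (test_path, test_rows)):
        with open(path, "w", newline="") as dst:
            writer = csv.writer(dst)
            writer.writerow(header)
            writer.writerows(part)
    return len(train_rows), len(test_rows)
```

For example, split_csv('data.csv', 'train.csv', 'test.csv') writes two files whose paths can then play the role of train_file_path and test_file_path above.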

Then, choose the set of models that can be included in the composite model and the metric function to optimize:

available_model_types, _ = ModelTypesRepository().suitable_model(task_type=task.task_type)
metric_function = MetricsRepository().metric_by_id(ClassificationMetricsEnum.ROCAUC)

Next, specify the requirements for the composer. In this case, GPComposer, which is based on an evolutionary algorithm, is chosen.

composer_requirements = GPComposerRequirements(
  primary=available_model_types,
  secondary=available_model_types, max_arity=3,
  max_depth=3, pop_size=20, num_of_generations=20,
  crossover_prob=0.8, mutation_prob=0.8, max_lead_time=20)
composer = GPComposer()

Now you can run the optimization and obtain a composite model:

chain_evo_composed = composer.compose_chain(data=dataset_to_compose,
                                            initial_chain=None,
                                            composer_requirements=composer_requirements,
                                            metrics=metric_function,
                                            is_visualise=False)
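To give an intuition for what parameters such as pop_size, num_of_generations, crossover_prob and mutation_prob control, here is a toy evolutionary loop over bit strings. It is a generic sketch under stated assumptions (tournament selection, one-point crossover, single-gene mutation, elitism) and does not reproduce the GPComposer implementation, which evolves chains of models rather than bit strings.

```python
import random


def evolve(fitness, genome_len, pop_size=20, num_of_generations=20,
           crossover_prob=0.8, mutation_prob=0.8, seed=0):
    """Toy evolutionary search over bit strings (genome_len must be >= 2).

    Illustration only: the operators here are assumptions and are not
    taken from FEDOT.
    """
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(genome_len)]
                  for _ in range(pop_size)]
    best = max(population, key=fitness)

    def tournament():
        # Pick the fitter of two randomly chosen individuals.
        a, b = rng.sample(population, 2)
        return a if fitness(a) >= fitness(b) else b

    for _ in range(num_of_generations):
        offspring = [best[:]]  # elitism: the best individual always survives
        while len(offspring) < pop_size:
            child = tournament()[:]
            if rng.random() < crossover_prob:
                # One-point crossover with a second selected parent.
                other = tournament()
                point = rng.randrange(1, genome_len)
                child = child[:point] + other[point:]
            if rng.random() < mutation_prob:
                # Flip one randomly chosen gene.
                idx = rng.randrange(genome_len)
                child[idx] = 1 - child[idx]
            offspring.append(child)
        population = offspring
        best = max(population, key=fitness)
    return best
```

Calling evolve(sum, 12) searches for a 12-bit string with as many ones as possible. Because the best individual is carried over unchanged, the best fitness never decreases from one generation to the next.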

Finally, you can test the resulting model on the validation dataset:

roc_on_valid_evo_composed = calculate_validation_metric(chain_evo_composed,
                                                        dataset_to_validate)
print(f'Composed ROC AUC is {round(roc_on_valid_evo_composed, 3)}')
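For readers unfamiliar with the metric, ROC AUC can be read as the share of (positive, negative) sample pairs in which the positive sample receives the higher score. The sketch below computes it directly from that definition; it is a plain-Python illustration with a hypothetical function name and is not how calculate_validation_metric is implemented.

```python
def roc_auc(labels, scores):
    """ROC AUC as the share of correctly ranked (positive, negative) pairs.

    Ties between a positive and a negative score count as half a pair.
    Illustration only; quadratic in the number of samples.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("ROC AUC needs at least one sample of each class")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Three of the four (positive, negative) pairs are ranked correctly.
print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # prints 0.75
```

A value of 0.5 corresponds to random ranking and 1.0 to a perfect separation of the two classes.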

Extended examples:

Also, several video tutorials are available (in Russian).

Project structure

The latest stable release of FEDOT is on the master branch. If you plan to contribute, make sure you are looking at and working on the current code.

The repository includes the following directories:

  • The core package contains the main classes and scripts. It is the core of the FEDOT framework.

  • The examples package includes several how-to use cases where you can start to discover how FEDOT works.

  • All unit tests are located in the test directory.

  • The documentation sources are in the docs directory.

You can also check the benchmarking repository, which was developed to compare FEDOT against well-known AutoML frameworks.

Basic Concepts

The main process in FEDOT is composing, which produces composite models.

The Composer is a block that takes meta-requirements and an evolutionary algorithm as the optimizer and generates different chains of models to find the most appropriate solution for the case.

The result of composing, and the basic object the user works with, is the Chain. A Chain is the tree-based structure of a composite model. It stores the relations between nodes and everything related to the chain's properties and restructuring.

Any chain has two kinds of nodes:

  • Primary nodes are the leaf nodes of the tree, where the initial case data is located.

  • Secondary nodes are all other nodes, which transform data during composing and fitting, including the root node that holds the resulting data.

Every node holds a Model, which can be an ML model or any other kind of model.
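The tree structure described above can be sketched with a small data class. This is an illustration only: the class name Node, the field name inputs and the model names are hypothetical and do not mirror the FEDOT classes. The sketch also assumes that limits such as max_depth and max_arity bound the height of the tree and the number of inputs per node.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    model: str                                   # name of the model held by the node
    inputs: List["Node"] = field(default_factory=list)

    @property
    def is_primary(self) -> bool:
        # Primary (leaf) nodes have no input nodes and read the case data directly.
        return not self.inputs


def depth(node: Node) -> int:
    """Number of levels from this node down to its deepest primary node."""
    if node.is_primary:
        return 1
    return 1 + max(depth(child) for child in node.inputs)


def arity(node: Node) -> int:
    """Largest number of inputs of any node in the subtree."""
    return max([len(node.inputs)] + [arity(child) for child in node.inputs])


# Two primary nodes feed a secondary root node that produces the final result.
root = Node("logit", inputs=[Node("xgboost"), Node("knn")])
print(root.is_primary, depth(root), arity(root))  # prints: False 2 2
```

With such helpers, checking whether a candidate tree respects the limits reduces to comparing depth(root) and arity(root) against the chosen bounds.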

The referenced papers:

  • Kalyuzhnaya A. V. et al. Automatic evolutionary learning of composite models with knowledge enrichment //Proceedings of the 2020 Genetic and Evolutionary Computation Conference Companion. – 2020. – P. 43-44.

  • Kovalchuk S. V. et al. A conceptual approach to complex model management with generalized modelling patterns and evolutionary identification //Complexity. – 2018. – V. 2018.

  • Nikitin N. O. et al. Deadline-driven approach for multi-fidelity surrogate-assisted environmental model calibration: SWAN wind wave model case study //Proceedings of the Genetic and Evolutionary Computation Conference Companion. – 2019. – P. 1583-1591.

  • Vychuzhanin P., Nikitin N. O., Kalyuzhnaya A. V. Robust Ensemble-Based Evolutionary Calibration of the Numerical Wind Wave Model //International Conference on Computational Science. – Springer, Cham, 2019. – P. 614-627.

  • Nikitin N. O. et al. Evolutionary ensemble approach for behavioral credit scoring //International Conference on Computational Science. – Springer, Cham, 2018. – P. 825-831.

Current R&D and future plans

At the moment, we are running an extensive set of experiments to determine the most suitable approaches for evolutionary chain optimization, hyperparameter tuning, benchmarking, etc. Case studies from different subject areas (metocean science, oil production, seismic, robotics, economics, etc.) are in progress. Various features are planned: multi-data chains, Bayesian network optimization, involvement of domain-specific and equation-based models, model export and atomization, interpretable surrogate models, etc.

Any support and contributions are welcome.

Documentation

The documentation is available in the FEDOT.Docs repository.

The description and source code of the underlying algorithms are available in the FEDOT.Algs repository and its wiki pages (in Russian).

The FEDOT API reference is also available on Read the Docs.

Contribution Guide

  • The contribution guide is available in the repository.

Acknowledgements

We thank the contributors for their important contributions and the participants of numerous scientific conferences and workshops for their valuable advice and suggestions.

Citation

@inproceedings{kalyuzhnaya2020automatic,
  title={Automatic evolutionary learning of composite models with knowledge enrichment},
  author={Kalyuzhnaya, Anna V and Nikitin, Nikolay O and Vychuzhanin, Pavel and Hvatov, Alexander and Boukhanovsky, Alexander},
  booktitle={Proceedings of the 2020 Genetic and Evolutionary Computation Conference Companion},
  pages={43--44},
  year={2020}
}
