Automated machine learning framework for composite pipelines

FEDOT is an open-source framework for automated modeling and machine learning (AutoML) problems. This framework is distributed under the 3-Clause BSD license.

It provides automatic generative design of machine learning pipelines for various real-world problems. The core of FEDOT is based on an evolutionary approach and supports classification (binary and multiclass), regression, clustering, and time series prediction problems.
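To make the idea of an evolutionary search over pipelines more concrete, here is a minimal toy sketch. It is not FEDOT's optimizer: the operation names, the `toy_fitness` function, and the mutation rules are all hypothetical stand-ins, and a real run would score candidates with a validation metric rather than a formula. It shows a mutation-only loop with elitist selection over simple linear pipelines.

```python
import random

# Hypothetical operation names, used only for illustration.
OPERATIONS = ["scaling", "pca", "rf", "knn", "logit"]


def toy_fitness(pipeline):
    """Stand-in for a validation metric: rewards distinct operations, penalizes length."""
    return len(set(pipeline)) - 0.1 * len(pipeline)


def mutate(pipeline, rng):
    """Return a copy with one operation replaced, appended, or removed."""
    child = list(pipeline)
    action = rng.choice(["replace", "add", "remove"])
    if action == "add" or (action == "remove" and len(child) == 1):
        # A one-element pipeline is never emptied; it grows instead.
        child.append(rng.choice(OPERATIONS))
    elif action == "remove":
        child.pop(rng.randrange(len(child)))
    else:
        child[rng.randrange(len(child))] = rng.choice(OPERATIONS)
    return child


def evolve(generations=30, population_size=8, seed=0):
    """Run a mutation-only evolutionary loop and return the best pipeline found."""
    rng = random.Random(seed)
    population = [[rng.choice(OPERATIONS)] for _ in range(population_size)]
    for _ in range(generations):
        offspring = [mutate(parent, rng) for parent in population]
        # Elitist selection: keep the best individuals among parents and offspring.
        ranked = sorted(population + offspring, key=toy_fitness, reverse=True)
        population = ranked[:population_size]
    return population[0]


best = evolve()
print(best, toy_fitness(best))
```

Because selection keeps parents alongside offspring, the best fitness in the population can never decrease from one generation to the next.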

The structure of the AutoML workflow in FEDOT

The key feature of the framework is the complex management of interactions between various blocks of pipelines. It is represented as a graph that defines connections between data preprocessing and model blocks.
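As a rough illustration of a pipeline represented as a graph, the sketch below stores a small pipeline as a plain dictionary and derives a valid execution order from it. It uses only the Python standard library and none of FEDOT's own classes; the block names and the helper functions are assumptions made for the example.

```python
from graphlib import TopologicalSorter

# Each key is a block; its value is the set of blocks whose output it consumes.
# Block names are illustrative only.
pipeline_graph = {
    "scaling": set(),
    "pca": {"scaling"},
    "rf": {"scaling"},
    "knn": {"pca"},
    "logit": {"rf", "knn"},
}


def execution_order(graph):
    """Return the blocks in an order where every block follows its inputs."""
    return list(TopologicalSorter(graph).static_order())


def final_blocks(graph):
    """Return the blocks that no other block consumes, i.e. the pipeline outputs."""
    consumed = set().union(*graph.values())
    return sorted(set(graph) - consumed)


order = execution_order(pipeline_graph)
print(order)
print(final_blocks(pipeline_graph))
```

Here the preprocessing block feeds two branches that are merged by a final model, which is the kind of composite structure a linear chain of steps cannot express.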

The project is maintained by the research team of the Natural Systems Simulation Lab, which is a part of the National Center for Cognitive Research of ITMO University.

More details about FEDOT are available in the following video:

Introducing Fedot

FEDOT concepts

Installation

  • Package installer for Python pip

The simplest way to install FEDOT is using pip:

$ pip install fedot

Installation with optional dependencies for image and text processing, and for DNNs:

$ pip install fedot[extra]

  • Docker container

Available Docker images can be found here.
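After a pip installation, one quick way to confirm that the package is visible to the current interpreter is to query its metadata. The helper name `installed_version` below is made up for this sketch; it relies only on the standard `importlib.metadata` module.

```python
from importlib import metadata


def installed_version(package="fedot"):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


version = installed_version()
print(f"fedot {version}" if version else "fedot is not installed")
```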

How to Use

FEDOT provides a high-level API that allows you to use its capabilities in a simple way. The API can be used for classification, regression, and time series forecasting problems.

To use the API, follow these steps:

  1. Import the Fedot class:

from fedot.api.main import Fedot

  2. Initialize the Fedot object and define the type of modeling problem. It provides a fit/predict interface:

  • Fedot.fit() begins the optimization and returns the resulting composite pipeline;

  • Fedot.predict() predicts target values for the given input data using an already fitted pipeline;

  • Fedot.get_metrics() estimates the quality of predictions using selected metrics.

NumPy arrays, Pandas DataFrames, and file paths can be used as sources of input data. In the case below, x_train, y_train, x_test and y_test are numpy.ndarray objects:

model = Fedot(problem='classification', timeout=5, preset='best_quality', n_jobs=-1)
model.fit(features=x_train, target=y_train)
prediction = model.predict(features=x_test)
metrics = model.get_metrics(target=y_test)
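The snippet above assumes that the arrays already exist. A minimal sketch of an end-to-end script could look as follows. The synthetic data, the helper names `make_toy_classification_data` and `run_fedot`, and the short `timeout` value (assumed here to be a time budget in minutes) are illustrative choices and not part of FEDOT.

```python
import numpy as np


def make_toy_classification_data(n_samples=200, n_features=4, test_share=0.25, seed=0):
    """Build a synthetic binary classification set and split it into train and test parts."""
    rng = np.random.default_rng(seed)
    features = rng.normal(size=(n_samples, n_features))
    # The label depends on the sign of the sum of the first two features.
    target = (features[:, 0] + features[:, 1] > 0).astype(int)
    n_test = int(n_samples * test_share)
    return features[n_test:], target[n_test:], features[:n_test], target[:n_test]


def run_fedot(x_train, y_train, x_test, y_test, timeout=1):
    """Fit FEDOT on the training part and evaluate it on the test part."""
    # Imported lazily so that the data helper works without FEDOT installed.
    from fedot.api.main import Fedot

    # timeout is assumed to be a time budget in minutes.
    model = Fedot(problem='classification', timeout=timeout)
    model.fit(features=x_train, target=y_train)
    prediction = model.predict(features=x_test)
    metrics = model.get_metrics(target=y_test)
    return prediction, metrics


x_train, y_train, x_test, y_test = make_toy_classification_data()
print(x_train.shape, y_train.shape, x_test.shape, y_test.shape)
```

With FEDOT installed, calling `prediction, metrics = run_fedot(x_train, y_train, x_test, y_test)` runs the same fit, predict, and get_metrics sequence on this toy data.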

More information about the API is available in the documentation section and advanced approaches are in the advanced section.

Examples & Tutorials

Jupyter notebooks with tutorials are located in the examples repository. There you can find the following guides:

Notebooks are issued with the corresponding release versions (the default version is ‘latest’).

Also, external examples are available:

Extended examples:

Several video tutorials are also available (in Russian).

Publications About FEDOT

We also published several posts devoted to different aspects of the framework:

In English:

In Russian:

  • How AutoML helps to create composite AI models: structural learning and the FEDOT framework - habr.com

  • Time series forecasting with AutoML - habr.com

  • How we “turned the rivers back” at Emergency DataHack 2021 by combining hydrology and AutoML - habr.com

  • Clean AutoML for “dirty” data: how and why to automate tabular data preprocessing in machine learning - ODS blog

  • FEDOT, a framework for automated machine learning (Highload++ 2022 conference) - presentation

  • On tuning the hyperparameters of machine learning model ensembles - habr.com

In Chinese:

  • Generative automated machine learning system (presentation at the “Open Innovations 2.0” conference) - youtube.com

Project Structure

The latest stable release of FEDOT is in the master branch.

The repository includes the following directories:

  • The core package contains the main classes and scripts. It is the core of the FEDOT framework.

  • The examples package includes several how-to-use cases where you can start to discover how FEDOT works.

  • All unit and integration tests are located in the test directory.

  • The sources of the documentation are in the docs directory.

Current R&D and future plans

Currently, we are working on new features and trying to improve the performance and the user experience of FEDOT. The major ongoing tasks and plans:

  • Implementation of meta-learning based on GNN and RL (see MetaFEDOT).

  • Improvement of the optimisation-related algorithms implemented in GOLEM.

  • Support for more complicated pipeline design patterns, especially for time series forecasting.

Any contribution is welcome. Our R&D team is open for cooperation with other scientific teams as well as with industrial partners.

Documentation

A detailed FEDOT API description is available on Read the Docs.

Contribution Guide

  • The contribution guide is available in this repository.

Acknowledgments

We acknowledge the contributors for their important impact and the participants of numerous scientific conferences and workshops for their valuable advice and suggestions.

Side Projects

  • The optimisation core is implemented in the GOLEM repository.

  • The prototype of the web-GUI for FEDOT is available in the FEDOT.WEB repository.

  • The prototype of FEDOT-based meta-AutoML is available in the MetaFEDOT repository.

Contacts

Supported by

Citation

@article{nikitin2021automated,
  title = {Automated evolutionary approach for the design of composite machine learning pipelines},
  author = {Nikolay O. Nikitin and Pavel Vychuzhanin and Mikhail Sarafanov and Iana S. Polonskaia and Ilia Revin and Irina V. Barabanova and Gleb Maximov and Anna V. Kalyuzhnaya and Alexander Boukhanovsky},
  journal = {Future Generation Computer Systems},
  year = {2021},
  issn = {0167-739X},
  doi = {https://doi.org/10.1016/j.future.2021.08.022}
}

@inproceedings{polonskaia2021multi,
  title = {Multi-Objective Evolutionary Design of Composite Data-Driven Models},
  author = {Polonskaia, Iana S. and Nikitin, Nikolay O. and Revin, Ilia and Vychuzhanin, Pavel and Kalyuzhnaya, Anna V.},
  booktitle = {2021 IEEE Congress on Evolutionary Computation (CEC)},
  year = {2021},
  pages = {926-933},
  doi = {10.1109/CEC45853.2021.9504773}
}

Other papers are listed on ResearchGate.
