Fast and customizable framework for automatic ML model creation (AutoML)
LightAutoML (LAMA) allows you to create machine learning models using just a few lines of code, or build your own custom pipeline using ready blocks. It supports tabular, time series, image and text data.
Authors: Alexander Ryzhkov, Anton Vakhrushev, Dmitry Simakov, Rinchin Damdinov, Vasilii Bunakov, Alexander Kirilin, Pavel Shvets.
Quick tour
There are two ways to solve machine learning problems using LightAutoML:
- Ready-to-use preset:

    from lightautoml.automl.presets.tabular_presets import TabularAutoML
    from lightautoml.tasks import Task

    automl = TabularAutoML(task = Task(name = 'binary', metric = 'auc'))
    oof_preds = automl.fit_predict(train_df, roles = {'target': 'my_target', 'drop': ['column_to_drop']}).data
    test_preds = automl.predict(test_df).data
- As a framework:

  LightAutoML has many ready-to-use parts and extensive customization options. To learn more, check out the Resources section.
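The preset example stores the result of `fit_predict` in a variable named `oof_preds`, which points to out-of-fold predictions: each training row is scored by a model that did not see that row during fitting. The sketch below illustrates only that idea with the standard library. The function name `out_of_fold_predictions`, the fold assignment by index, and the mean-target "model" are all illustrative assumptions, not LightAutoML's implementation.

```python
from statistics import mean


def out_of_fold_predictions(targets, n_folds=3):
    """Predict each row with a model that never saw that row.

    The "model" is just the mean target of the training folds, a
    stand-in (assumption) for whatever a real pipeline would fit.
    """
    n = len(targets)
    preds = [None] * n
    for fold in range(n_folds):
        # Rows whose index falls into this fold are held out.
        valid_idx = [i for i in range(n) if i % n_folds == fold]
        train_idx = [i for i in range(n) if i % n_folds != fold]
        fold_model = mean(targets[i] for i in train_idx)
        for i in valid_idx:
            preds[i] = fold_model
    return preds


targets = [0, 1, 1, 0, 1, 0]
oof = out_of_fold_predictions(targets, n_folds=3)
print(oof)
```

Because every prediction comes from a model trained without the corresponding row, scores computed on such predictions give a less optimistic estimate than scores computed on in-sample predictions.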
Resources
Kaggle kernel examples of LightAutoML usage:
- Tabular Playground Series April 2021 competition solution
- Titanic competition solution (80% accuracy)
- Titanic 12-code-lines competition solution (78% accuracy)
- House prices competition solution
- Natural Language Processing with Disaster Tweets solution
- Tabular Playground Series March 2021 competition solution
- Tabular Playground Series February 2021 competition solution
- Interpretable WhiteBox solution
- Custom ML pipeline elements inside existing ones
- Tabular Playground Series November 2022 competition solution with Neural Networks
Google Colab tutorials and other examples:
- Tutorial_1_basics.ipynb - get started with LightAutoML on tabular data.
- Tutorial_2_WhiteBox_AutoWoE.ipynb - creating interpretable models.
- Tutorial_3_sql_data_source.ipynb - shows how to use LightAutoML presets (both standalone and time utilized variants) for solving ML tasks on tabular data from an SQL database instead of CSV.
- Tutorial_4_NLP_Interpretation.ipynb - example of using TabularNLPAutoML preset, LimeTextExplainer.
- Tutorial_5_uplift.ipynb - shows how to use LightAutoML for an uplift-modeling task.
- Tutorial_6_custom_pipeline.ipynb - shows how to create your own pipeline from specified blocks: pipelines for feature generation and feature selection, ML algorithms, hyperparameter optimization etc.
- Tutorial_7_ICE_and_PDP_interpretation.ipynb - shows how to obtain local and global interpretation of model results using ICE and PDP approaches.
- Tutorial_8_CV_preset.ipynb - example of using TabularCVAutoML preset in CV multi-class classification task.
- Tutorial_9_neural_networks.ipynb - example of using Tabular preset with neural networks.
- Tutorial_10_relational_data_with_star_scheme.ipynb - example of using Tabular preset with neural networks.
- Tutorial_11_time_series.ipynb - example of using Tabular preset with timeseries data.
Note 1: in production there is no need to use the profiler (which increases run time and memory consumption), so please do not turn it on. It is off by default.
Note 2: to look at this report after the run, comment out the last line of the demo, which contains the report deletion command.
Courses, videos
- LightAutoML crash courses:
- Video guides:
  - (Russian) LightAutoML webinar for Sberloga community (Alexander Ryzhkov, Dmitry Simakov)
  - (Russian) LightAutoML hands-on tutorial in Kaggle Kernels (Alexander Ryzhkov)
  - (English) Automated Machine Learning with LightAutoML: theory and practice (Alexander Ryzhkov)
  - (English) LightAutoML framework general overview, benchmarks and advantages for business (Alexander Ryzhkov)
  - (English) LightAutoML practical guide - ML pipeline presets overview (Dmitry Simakov)
- Articles about LightAutoML:
Installation
To install the LAMA framework on your machine from PyPI:
# Base functionality:
pip install -U lightautoml
# For partial installation use corresponding option
# Extra dependencies: [nlp, cv, report] or use 'all' to install all dependencies
pip install -U lightautoml[nlp]
# Or extra dependencies with specific version
pip install 'lightautoml[all]==0.4.0'
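After running one of the commands above, it can be useful to confirm which version ended up in the active environment. The following is a minimal sketch using only the standard library; the helper name `installed_version` is an illustrative assumption, and the check only reads package metadata without importing the package.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


found = installed_version("lightautoml")
if found is None:
    print("lightautoml is not installed in this environment")
else:
    print(f"lightautoml {found} is installed")
```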
Additionally, run the following commands to enable PDF report generation:
# MacOS
brew install cairo pango gdk-pixbuf libffi
# Debian / Ubuntu
sudo apt-get install build-essential libcairo2 libpango-1.0-0 libpangocairo-1.0-0 libgdk-pixbuf2.0-0 libffi-dev shared-mime-info
# Fedora
sudo yum install redhat-rpm-config libffi-devel cairo pango gdk-pixbuf2
# Windows
# follow this tutorial https://weasyprint.readthedocs.io/en/stable/install.html#windows
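To see whether the system libraries from the commands above are visible to Python, a rough check with `ctypes.util.find_library` can help. This is a sketch under assumptions: the short library names in `PDF_REPORT_LIBRARIES` are guesses derived from the package names listed above and may differ between platforms, and a missing entry here does not prove that report generation will fail.

```python
from ctypes.util import find_library

# Assumed short names for the shared libraries behind the packages above;
# adjust them for your platform if the lookup reports false negatives.
PDF_REPORT_LIBRARIES = ["cairo", "pango-1.0", "pangocairo-1.0", "gdk_pixbuf-2.0", "ffi"]


def find_missing_libraries(names):
    """Return the names that the system library lookup could not resolve."""
    return [name for name in names if find_library(name) is None]


missing = find_missing_libraries(PDF_REPORT_LIBRARIES)
if missing:
    print("Not found:", ", ".join(missing))
else:
    print("All listed libraries were found")
```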
Advanced features
GPU and Spark pipelines
Full GPU and Spark pipelines for LightAutoML are currently available for developer testing (still in progress). The code and tutorials for:
- GPU pipeline is available here
- Spark pipeline is available here
Contributing to LightAutoML
If you are interested in contributing to LightAutoML, please read the Contributing Guide to get started.
Support and feature requests
- Seek prompt advice in the Telegram group.
- Open bug reports and feature requests on GitHub issues.
Citation
If you mention LightAutoML in your publications, please cite our paper: Vakhrushev, et al. "LightAutoML: AutoML Solution for a Large Financial Services Ecosystem" arXiv:2109.01528, 2021.
BibTeX entry:
@article{vakhrushev2021lightautoml,
title={Lightautoml: Automl solution for a large financial services ecosystem},
author={Vakhrushev, Anton and Ryzhkov, Alexander and Savchenko, Maxim and Simakov, Dmitry and Damdinov, Rinchin and Tuzhilin, Alexander},
journal={arXiv preprint arXiv:2109.01528},
year={2021}
}
License
This project is licensed under the Apache License, Version 2.0. See LICENSE file for more details.
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
File details
Details for the file lightautoml-0.4.2.tar.gz.
File metadata
- Download URL: lightautoml-0.4.2.tar.gz
- Upload date:
- Size: 299.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/2.1.1 CPython/3.10.16 Linux/6.8.0-55-generic
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 6cfbe07be7d44035487f55e2ac8bc5eb6cea54ecfdfd89f417d18fd9d30b14df |
| MD5 | a0cfb063bb66d2719210571a5f8f2f52 |
| BLAKE2b-256 | 1e3845acae7744b205270d0d4d19023bab09018c816de5b43bf98354db5791ca |
File details
Details for the file lightautoml-0.4.2-py3-none-any.whl.
File metadata
- Download URL: lightautoml-0.4.2-py3-none-any.whl
- Upload date:
- Size: 412.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/2.1.1 CPython/3.10.16 Linux/6.8.0-55-generic
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 868a92d72d1ac5b584bb9629db55a9806930dc6745dbb50b3857d92ab66fbe1e |
| MD5 | 88e58937f3919de2b502842d10a482d4 |
| BLAKE2b-256 | 9ab5ff7160e58a315ae82e624d27297ce2ff0407e85547e87842be312fcb3b22 |
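The SHA256 digests listed in the file details can be compared against a file you downloaded yourself. The sketch below shows one way to do that with `hashlib`; the helper names `sha256_of_file` and `matches_expected` are illustrative assumptions, and the demo hashes a temporary file rather than a real LightAutoML archive.

```python
import hashlib
import os
import tempfile


def sha256_of_file(path, chunk_size=65536):
    """Hash a file in chunks so large archives need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected_hex):
    """Compare the file's SHA256 with a published hex digest, ignoring case."""
    return sha256_of_file(path) == expected_hex.strip().lower()


# Demo on a throwaway file; for a real check, pass the path of the
# downloaded archive and the SHA256 value published for that file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"example bytes")
    demo_path = tmp.name
try:
    expected = hashlib.sha256(b"example bytes").hexdigest()
    print(matches_expected(demo_path, expected))
finally:
    os.remove(demo_path)
```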