Python package for streaming data and incremental learning

SAIL

License: MIT. Code style: black.

SAIL is a library for experimenting with stream processing engines (SPEs) and incremental machine learning (IML) models. Its main features are:

  • A common interface for incremental models from libraries such as scikit-learn, PyTorch, Keras and River.
  • Distributed computing for model selection, ensembling, etc.
  • Hyperparameter optimization for incremental models.
  • Interfaces and pipelines that run incremental models in both offline and online learning.
  • A robust framework for constructing an end-to-end AutoML pipeline whose pipeline strategies enable data ingestion, feature engineering, model selection, incremental training, and data drift monitoring on streaming data.
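
To make the idea of a common interface concrete, here is a minimal sketch in plain Python. It is not SAIL's actual API: the class names (`IncrementalModel`, `PerSampleAdapter`, `RunningMeanOne`, `LastValueBatch`) and the helper `train_on_stream` are hypothetical and exist only for illustration. The only facts borrowed from real libraries are the method naming styles: River models learn one sample at a time (`learn_one` / `predict_one`), while scikit-learn's incremental estimators learn from mini-batches (`partial_fit`).

```python
from typing import List, Protocol, Sequence, Tuple


class IncrementalModel(Protocol):
    """Hypothetical common interface: learn from a mini-batch, then predict."""

    def partial_fit(self, X: Sequence[float], y: Sequence[float]) -> None: ...

    def predict(self, X: Sequence[float]) -> List[float]: ...


class RunningMeanOne:
    """Toy per-sample learner using River-style method names."""

    def __init__(self) -> None:
        self.n = 0
        self.mean = 0.0

    def learn_one(self, x: float, y: float) -> None:
        # Incremental mean update: no past samples are stored.
        self.n += 1
        self.mean += (y - self.mean) / self.n

    def predict_one(self, x: float) -> float:
        return self.mean


class PerSampleAdapter:
    """Exposes a per-sample learner through the mini-batch interface."""

    def __init__(self, model: RunningMeanOne) -> None:
        self.model = model

    def partial_fit(self, X: Sequence[float], y: Sequence[float]) -> None:
        for xi, yi in zip(X, y):
            self.model.learn_one(xi, yi)

    def predict(self, X: Sequence[float]) -> List[float]:
        return [self.model.predict_one(xi) for xi in X]


class LastValueBatch:
    """Toy learner that already speaks partial_fit/predict."""

    def __init__(self) -> None:
        self.last = 0.0

    def partial_fit(self, X: Sequence[float], y: Sequence[float]) -> None:
        if len(y) > 0:
            self.last = y[-1]

    def predict(self, X: Sequence[float]) -> List[float]:
        return [self.last for _ in X]


Batch = Tuple[Sequence[float], Sequence[float]]


def train_on_stream(model: IncrementalModel, batches: Sequence[Batch]) -> IncrementalModel:
    """Feed mini-batches to any model that satisfies the common interface."""
    for X, y in batches:
        model.partial_fit(X, y)
    return model


batches = [([1.0, 2.0], [2.0, 4.0]), ([3.0], [6.0])]
models = [PerSampleAdapter(RunningMeanOne()), LastValueBatch()]
for m in models:
    train_on_stream(m, batches)
predictions = [m.predict([4.0]) for m in models]
print(predictions)
```

The training loop never needs to know which style of model it is driving, which is the property a shared interface is meant to provide.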

Documentation

See the SAIL Wiki for full documentation, installation guide, operational details and other information.

Architecture

SAIL Pipeline

SAIL Model Framework

Differences from River and other existing incremental machine learning libraries

SAIL builds on existing machine learning libraries such as River and scikit-learn and provides a common set of APIs for running their models as backends. In particular, River provides only minimal utilities for deep learning and does not focus on models developed with PyTorch or Keras. In addition, models in SAIL are parallelized using Ray. This parallelization brings three major advantages that are particularly important for incremental models on high-volume, high-velocity data:

  • Faster computation times for ensemble models.
  • Faster computation times for ensembles of forecasts.
  • A clean interface for developing AutoML algorithms for incremental models.
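
The structure behind the first two points can be sketched as follows. This example deliberately does not use Ray: the standard library's `concurrent.futures.ThreadPoolExecutor` stands in for it so the snippet runs without extra dependencies. Threads only illustrate the pattern here and give no real speedup for pure-Python work. The `ExpSmoother` class and the two helper functions are hypothetical names, not part of SAIL.

```python
from concurrent.futures import ThreadPoolExecutor
from statistics import fmean
from typing import List, Optional, Sequence


class ExpSmoother:
    """Toy incremental forecaster: exponentially weighted mean of the targets."""

    def __init__(self, alpha: float) -> None:
        self.alpha = alpha
        self.level: Optional[float] = None

    def partial_fit(self, y_batch: Sequence[float]) -> "ExpSmoother":
        for y in y_batch:
            if self.level is None:
                self.level = y
            else:
                self.level = self.alpha * y + (1 - self.alpha) * self.level
        return self

    def predict(self) -> float:
        if self.level is None:
            raise ValueError("model has not seen any data yet")
        return self.level


def update_ensemble(members: List[ExpSmoother], y_batch: Sequence[float],
                    pool: ThreadPoolExecutor) -> None:
    # Members do not share state, so their updates can run concurrently.
    # list(...) forces all submitted updates to finish before returning.
    list(pool.map(lambda m: m.partial_fit(y_batch), members))


def ensemble_forecast(members: List[ExpSmoother]) -> float:
    # Combine the individual forecasts by simple averaging.
    return fmean(m.predict() for m in members)


members = [ExpSmoother(alpha) for alpha in (0.0, 0.5, 1.0)]
with ThreadPoolExecutor(max_workers=3) as pool:
    for batch in ([10.0, 10.0], [20.0]):
        update_ensemble(members, batch, pool)

forecast = ensemble_forecast(members)
print(forecast)
```

Because each member is updated independently on every mini-batch, the per-batch cost of the ensemble is bounded by its slowest member once the updates are genuinely distributed, which is the role Ray plays in SAIL.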

Spark vs Ray for incremental models.

SAIL could also have been parallelized using Spark. However, Ray was preferred because it keeps the stream processing engine and the machine learning tasks independent: the data can then be handled efficiently with Pandas, NumPy, etc. This independence also allows other SPEs such as Flink or Storm to be used without changing the parallelization framework for IML models.
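
A minimal sketch of this separation, under stated assumptions: the stream source is modeled as a plain Python iterator, and the learner only ever sees in-memory mini-batches. `fake_engine_source`, `mini_batches` and `SlopeThroughOrigin` are hypothetical names for illustration. In a real deployment the iterator would be backed by whichever SPE is in use, and the learner would be a SAIL model.

```python
from itertools import islice
from typing import Iterable, Iterator, List, Tuple

Record = Tuple[float, float]  # (feature, target)


def fake_engine_source() -> Iterator[Record]:
    """Stand-in for records delivered by any stream processing engine."""
    for i in range(5):
        yield (float(i), 2.0 * i)


def mini_batches(source: Iterable[Record], size: int) -> Iterator[List[Record]]:
    """Group an arbitrary record stream into lists of at most `size` records."""
    it = iter(source)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


class SlopeThroughOrigin:
    """Toy incremental regressor y = w * x, fit from running sums."""

    def __init__(self) -> None:
        self.sxy = 0.0
        self.sxx = 0.0

    def partial_fit(self, batch: List[Record]) -> None:
        for x, y in batch:
            self.sxy += x * y
            self.sxx += x * x

    @property
    def w(self) -> float:
        return self.sxy / self.sxx if self.sxx else 0.0


model = SlopeThroughOrigin()
batch_sizes = []
for batch in mini_batches(fake_engine_source(), size=2):
    batch_sizes.append(len(batch))
    model.partial_fit(batch)

print(batch_sizes, model.w)
```

Swapping the source for a different engine changes only the iterator. The batching and learning code stays the same.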

🛠 Installation

SAIL is intended to work with Python 3.8 and above. You can install the latest version from GitHub as follows:

git clone https://github.com/IBM/sail.git
cd sail
pip install -e ".[OPTION]"

Supported values for OPTION:

  • tensorflow
  • tensorflow_arm64
  • pytorch
  • river
  • ray
  • dev
  • tests
  • examples : to run notebooks and examples
  • all : all of the above
  • all_arm64 : Apple ARM64 version of all of the above
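
After installing with one or more options, it can be useful to check which backends are actually importable in the current environment. The sketch below uses only the standard library. The mapping from option name to import name is an assumption based on the usual import names of these packages (`tensorflow`, `torch`, `river`, `ray`); it is not taken from SAIL's packaging metadata.

```python
from importlib.util import find_spec
from typing import Dict

# Assumed mapping from install option to top-level import name.
EXTRA_TO_MODULE = {
    "tensorflow": "tensorflow",
    "pytorch": "torch",
    "river": "river",
    "ray": "ray",
}


def available_backends(mapping: Dict[str, str] = EXTRA_TO_MODULE) -> Dict[str, bool]:
    """Report, per option, whether its module can be found without importing it."""
    return {extra: find_spec(module) is not None for extra, module in mapping.items()}


status = available_backends()
for extra, ok in status.items():
    print(f"{extra:12s} {'installed' if ok else 'missing'}")
```

`find_spec` only locates the module and does not import it, so the check stays fast even for heavy packages.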

✍️ Examples and Notebooks

Examples and notebooks are located in the examples and notebook directories, respectively. Run the command below to install the packages needed to run the examples.

pip install -e ".[examples]"

Recognition

SAIL has been identified by the European Commission Innovation Radar as an innovation with market potential that can contribute to the UN Sustainable Development Goals. More details here.

Acknowledgment

This project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 957345 for the MORE project.
