
scikit-learn compatible Python toolbox for machine learning with time series



sktime

sktime is a Python toolbox for machine learning with time series. We currently support:

  • Forecasting,

  • Time series classification,

  • Time series regression.

sktime provides dedicated time series algorithms and scikit-learn compatible tools for building, tuning, and evaluating composite models.

For deep learning methods, see our companion package: sktime-dl.


Installation

The package is available via PyPI using:

pip install sktime

The package is actively being developed and some features may not be stable yet.

Development Version

To install the development version, please see our advanced installation instructions.


Quickstart

Forecasting

import numpy as np
from sktime.datasets import load_airline
from sktime.forecasting.theta import ThetaForecaster
from sktime.forecasting.model_selection import temporal_train_test_split
from sktime.performance_metrics.forecasting import smape_loss

y = load_airline()
y_train, y_test = temporal_train_test_split(y)
fh = np.arange(1, len(y_test) + 1)  # forecasting horizon
forecaster = ThetaForecaster()
forecaster.fit(y_train)
y_pred = forecaster.predict(fh)
smape_loss(y_test, y_pred)
>>> 0.1722386848882188

For more, check out the forecasting tutorial.
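
The score in the example above is a symmetric mean absolute percentage error (sMAPE). As a rough illustration of what such a score measures, here is a minimal sketch in plain Python. It assumes the common definition mean(2 * |y - y_hat| / (|y| + |y_hat|)); the helper name `smape` is hypothetical and this is not sktime's own `smape_loss` implementation, which may differ in details.

```python
def smape(y_true, y_pred):
    """Symmetric mean absolute percentage error (hypothetical helper).

    Assumes the common definition mean(2 * |y - y_hat| / (|y| + |y_hat|)).
    A pair where both values are zero would divide by zero; this sketch
    does not handle that case.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    terms = [
        2.0 * abs(t - p) / (abs(t) + abs(p))
        for t, p in zip(y_true, y_pred)
    ]
    return sum(terms) / len(terms)


# Two observations, each forecast off by 10 percent of the true value.
score = smape([100.0, 200.0], [110.0, 180.0])
print(score)
```

Under this definition the score is 0 for a perfect forecast and at most 2, so lower is better.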

Time Series Classification

from sktime.datasets import load_arrow_head
from sktime.classification.compose import TimeSeriesForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

X, y = load_arrow_head(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y)
classifier = TimeSeriesForestClassifier()
classifier.fit(X_train, y_train)
y_pred = classifier.predict(X_test)
accuracy_score(y_test, y_pred)
>>> 0.7924528301886793

For more, check out the time series classification tutorial.


Documentation


API Overview

sktime is a unified toolbox for machine learning with time series. Time series give rise to multiple learning tasks (e.g. forecasting and time series classification). The goal of sktime is to provide all the tools necessary to solve these tasks, including dedicated time series algorithms as well as tools for building, tuning and evaluating composite models.

Many of these tasks are related, and an algorithm that can solve one of them can often be re-used to help solve another, an idea called reduction. sktime’s unified interface makes it easy to adapt an algorithm for one task to another.

For example, to use a regression algorithm to solve a forecasting task, we can simply write:

import numpy as np
from sktime.datasets import load_airline
from sktime.forecasting.compose import ReducedRegressionForecaster
from sklearn.ensemble import RandomForestRegressor
from sktime.forecasting.model_selection import temporal_train_test_split
from sktime.performance_metrics.forecasting import smape_loss

y = load_airline()
y_train, y_test = temporal_train_test_split(y)
fh = np.arange(1, len(y_test) + 1)  # forecasting horizon
regressor = RandomForestRegressor()
forecaster = ReducedRegressionForecaster(regressor)
forecaster.fit(y_train)
y_pred = forecaster.predict(fh)
smape_loss(y_test, y_pred)

For more details, check out our paper.
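
To give a feel for what reducing forecasting to regression can involve, here is a minimal sketch in plain Python. It assumes a sliding-window scheme with recursive multi-step prediction, which is one common way to do such a reduction; it is not a description of how ReducedRegressionForecaster is implemented. The names `make_windows`, `recursive_forecast` and `LastValuePlusDrift` are hypothetical and exist only for this illustration.

```python
class LastValuePlusDrift:
    """Toy tabular regressor with a scikit-learn-style fit/predict interface.

    Predicts the last feature of each row plus the mean offset seen in training.
    """

    def fit(self, X, y):
        self.offset_ = sum(t - row[-1] for row, t in zip(X, y)) / len(y)
        return self

    def predict(self, X):
        return [row[-1] + self.offset_ for row in X]


def make_windows(series, window_length):
    """Turn a series into a tabular problem: each window predicts the next value."""
    n_rows = len(series) - window_length
    X = [series[i:i + window_length] for i in range(n_rows)]
    y = [series[i + window_length] for i in range(n_rows)]
    return X, y


def recursive_forecast(series, regressor, window_length, n_steps):
    """Fit the regressor on windows, then feed each prediction back in as input."""
    X, y = make_windows(series, window_length)
    regressor.fit(X, y)
    history = list(series)
    predictions = []
    for _ in range(n_steps):
        window = history[-window_length:]
        next_value = regressor.predict([window])[0]
        predictions.append(next_value)
        history.append(next_value)
    return predictions


series = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
forecast = recursive_forecast(series, LastValuePlusDrift(), window_length=3, n_steps=3)
print(forecast)
```

The toy regressor only keeps the sketch dependency-free. The point of the reduction is that any regressor exposing `fit` and `predict` on tabular data could take its place.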

Currently, sktime provides:

  • State-of-the-art algorithms for time series classification and regression, ported from the Java-based tsml toolkit, as well as forecasting,

  • Transformers, including single-series transformations (e.g. detrending or deseasonalization) and series-as-features transformations (e.g. feature extractors), as well as tools to compose different transformers,

  • Pipelining,

  • Tuning,

  • Ensembling, such as a fully customisable random forest for time series classification and regression, as well as ensembling for multivariate problems.

For a list of implemented methods, see our estimator overview.
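
As an illustration of the single-series transformations mentioned in the list above, here is a minimal sketch of a linear detrender in plain Python. The class `LinearDetrender` and its `fit`, `transform` and `inverse_transform` methods are hypothetical and written only for this example; they are not sktime's transformer classes, whose names and signatures may differ.

```python
class LinearDetrender:
    """Toy single-series transformer: removes a least-squares linear trend."""

    def fit(self, y):
        n = len(y)
        if n < 2:
            raise ValueError("need at least two observations to fit a trend")
        t_mean = (n - 1) / 2
        y_mean = sum(y) / n
        cov = sum((t - t_mean) * (v - y_mean) for t, v in enumerate(y))
        var = sum((t - t_mean) ** 2 for t in range(n))
        self.slope_ = cov / var
        self.intercept_ = y_mean - self.slope_ * t_mean
        return self

    def _trend(self, n):
        # Fitted trend at time points 0, 1, ..., n - 1.
        return [self.intercept_ + self.slope_ * t for t in range(n)]

    def transform(self, y):
        """Subtract the fitted trend, leaving the residual series."""
        return [v - tr for v, tr in zip(y, self._trend(len(y)))]

    def inverse_transform(self, residuals):
        """Add the fitted trend back onto a residual series."""
        return [r + tr for r, tr in zip(residuals, self._trend(len(residuals)))]


y = [1.0, 3.0, 5.0, 7.0]
detrender = LinearDetrender().fit(y)
residuals = detrender.transform(y)
restored = detrender.inverse_transform(residuals)
print(residuals, restored)
```

Keeping an inverse transformation around is what would let a model fitted on the detrended series produce predictions on the original scale.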

In addition, sktime includes an experimental high-level API that unifies multiple learning tasks, partially inspired by the APIs of mlr and openML.


Development Roadmap

sktime is under active development. We’re looking for new contributors, and all contributions are welcome! Planned work includes:

  1. Multivariate/panel forecasting based on a modified pysf API,

  2. Unsupervised learning, including time series clustering,

  3. Time series annotation, including segmentation and outlier detection,

  4. Specialised data container for efficient handling of time series/panel data in a modelling workflow and separation of time series meta-data,

  5. Probabilistic modelling framework for time series, including survival and point process models based on an adapted skpro interface.

For more details, read this issue.


How to contribute

For former and current contributors, see our overview.


How to cite sktime

If you use sktime in a scientific publication, we would appreciate citations to the following paper:

Markus Löning, Anthony Bagnall, Sajaysurya Ganesh, Viktor Kazakov, Jason Lines, Franz Király (2019): “sktime: A Unified Interface for Machine Learning with Time Series”

Bibtex entry:

@inproceedings{sktime,
    author = {L{\"{o}}ning, Markus and Bagnall, Anthony and Ganesh, Sajaysurya and Kazakov, Viktor and Lines, Jason and Kir{\'{a}}ly, Franz J},
    booktitle = {Workshop on Systems for ML at NeurIPS 2019},
    title = {{sktime: A Unified Interface for Machine Learning with Time Series}},
    date = {2019},
}

