
scikit-learn-inspired time series

Project description

skits

A library for SciKit-learn-Inspired Time Series models.

The primary goal of this library is to allow one to train time series prediction models using a similar API to scikit-learn. Consequently, similar to scikit-learn, this library consists of preprocessors, feature_extractors, and pipelines.

Installation

Install with pip:

pip install skits

Preprocessors

The preprocessors expect to receive time series data and store enough state about the series to fully invert their transforms. The following example shows how to create a DifferenceTransformer, transform data, and then invert the transform to recover the original series. The DifferenceTransformer subtracts from each point the point that sits period steps earlier in the series.

import numpy as np
from skits.preprocessing import DifferenceTransformer

y = np.random.random(10)
# scikit-learn expects 2D design matrices,
# so we duplicate the time series.
X = y[:, np.newaxis] 

dt = DifferenceTransformer(period=2)

Xt = dt.fit_transform(X, y)
X_inv = dt.inverse_transform(Xt)

assert np.allclose(X, X_inv)
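To make the differencing concrete, here is a minimal sketch of the same idea in plain NumPy (this is not the skits implementation): the transform subtracts the value period steps back, and inversion rebuilds the series cumulatively from the first period stored values.

```python
import numpy as np

period = 2
y = np.arange(10, dtype=float)

# Difference: each point minus the point `period` steps earlier.
diff = y[period:] - y[:-period]

# Invert: the first `period` original values must be stored to seed
# the reconstruction; each later point is then recovered cumulatively.
recovered = np.empty_like(y)
recovered[:period] = y[:period]
for i in range(period, len(y)):
    recovered[i] = diff[i - period] + recovered[i - period]

assert np.allclose(recovered, y)
```

This is why the preprocessors hold on to fitted state: without the first period values, the inverse transform is underdetermined.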

Feature Extractors

After all preprocessing transformations are completed, multiple features may be built out of the time series. These are built via feature extractors, which one should combine into a single FeatureUnion. Current features include autoregressive, seasonal, and integrated features (covering the AR and I of ARIMA models).
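As a rough illustration of what an autoregressive feature extractor produces, here is a hand-rolled sketch in plain NumPy (this is not the skits implementation, and `lag_features` is a hypothetical helper): column j of the design matrix is the series lagged by j + 1 steps, with NaNs where no history exists yet, which is why an imputer typically follows the feature union in the pipelines below.

```python
import numpy as np

def lag_features(y, num_lags):
    """Build a design matrix whose column j is y shifted back by
    j + 1 steps; rows without enough history are left as NaN."""
    n = len(y)
    X = np.full((n, num_lags), np.nan)
    for j in range(num_lags):
        X[j + 1:, j] = y[:n - (j + 1)]
    return X

y = np.arange(5, dtype=float)
X = lag_features(y, num_lags=2)
# First column lags the series by 1 step, second by 2 steps.
```

A regressor fit on such a matrix is, in effect, an AR(num_lags) model.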

Pipelines

There are two types of pipelines. The ForecasterPipeline is for forecasting time series (duh). Specifically, one should build this pipeline with a regressor as the final step so that it can make predictions. The functionality is similar to a regular scikit-learn pipeline. Differences include the addition of a forecast() method, along with a to_scale keyword argument to predict() that returns predictions on the same scale as the original data.

These classes are likely subject to change, as they are fairly hacky right now. For example, one must transform both X and y for every transformation that comes before a DifferenceTransformer. The pipeline handles this, but one must prefix the names of those steps with pre_.

Anywho, here's an example:

import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import FeatureUnion

from skits.pipeline import ForecasterPipeline
from skits.preprocessing import ReversibleImputer
from skits.feature_extraction import (AutoregressiveTransformer, 
                                      SeasonalTransformer)

steps = [
    ('pre_scaling', StandardScaler()),
    ('features', FeatureUnion([
        ('ar_transformer', AutoregressiveTransformer(num_lags=3)),
        ('seasonal_transformer', SeasonalTransformer(seasonal_period=20)),
    ])),
    ('post_features_imputer', ReversibleImputer()),
    ('regressor', LinearRegression(fit_intercept=False))
]

l = np.linspace(0, 1, 101)
y = 5*np.sin(2 * np.pi * 5 * l) + np.random.normal(0, 1, size=101)
X = y[:, np.newaxis]

pipeline = ForecasterPipeline(steps)

pipeline.fit(X, y)
y_pred = pipeline.predict(X, to_scale=True, refit=True)

And this ends up looking like:

import matplotlib.pyplot as plt

plt.plot(y, lw=2)
plt.plot(y_pred, lw=2)
plt.legend(['y_true', 'y_pred'], bbox_to_anchor=(1, 1));

(figure: predicted vs. true series)

And forecasting looks like:

start_idx = 70
plt.plot(y, lw=2);
plt.plot(pipeline.forecast(y[:, np.newaxis], start_idx=start_idx), lw=2);
ax = plt.gca();
ylim = ax.get_ylim();
plt.plot((start_idx, start_idx), ylim, lw=4);
plt.ylim(ylim);
plt.legend(['y_true', 'y_pred', 'forecast start'], bbox_to_anchor=(1, 1));

(figure: forecast vs. true series, with the forecast start marked)

