Scalable machine learning based time series forecasting

mlforecast

Install

PyPI

pip install mlforecast

If you want to perform distributed training, you can instead use pip install mlforecast[distributed], which will also install dask. Note that you’ll also need to install either LightGBM or XGBoost.

conda-forge

conda install -c conda-forge mlforecast

Note that this installation comes with the required dependencies for the local interface. If you want to perform distributed training, you must install dask (conda install -c conda-forge dask) and either LightGBM or XGBoost.

How to use

The following is a very basic overview. For a more detailed description, see the documentation.

Store your time series in a pandas dataframe in long format, that is, each row represents an observation for a specific series and timestamp.

from mlforecast.utils import generate_daily_series

series = generate_daily_series(
    n_series=20,
    max_length=100,
    n_static_features=1,
    static_as_categorical=False,
    with_trend=True
)
series.head()
unique_id  ds                  y  static_0
id_00      2000-01-01   1.751917        72
id_00      2000-01-02   9.196715        72
id_00      2000-01-03  18.577788        72
id_00      2000-01-04  24.520646        72
id_00      2000-01-05  33.418028        72
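
As a rough illustration of what "long format" means, the sketch below reshapes a small wide-format collection (one list of values per series) into one row per series and timestamp. It uses only the standard library. The data, the start date, and the names `wide` and `long_rows` are hypothetical and are not part of mlforecast.

```python
from datetime import date, timedelta

# Hypothetical wide-format data: one list of daily values per series.
wide = {
    "id_00": [1.0, 2.0, 3.0],
    "id_01": [10.0, 20.0],
}
start = date(2000, 1, 1)  # assumed first timestamp for every series

# Long format: one row per (series, timestamp) observation.
long_rows = [
    {"unique_id": uid, "ds": start + timedelta(days=i), "y": y}
    for uid, values in wide.items()
    for i, y in enumerate(values)
]

for row in long_rows:
    print(row["unique_id"], row["ds"].isoformat(), row["y"])
```

Series of different lengths need no padding in this layout, since each observation is its own row. A list of dictionaries like this can be passed to pandas.DataFrame to obtain a dataframe with the same columns.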

Next, define your models. For the local interface, these can be any regressors that follow the scikit-learn API. For distributed training, use LGBMForecast or XGBForecast.

import lightgbm as lgb
import xgboost as xgb
from sklearn.ensemble import RandomForestRegressor

models = [
    lgb.LGBMRegressor(),
    xgb.XGBRegressor(),
    RandomForestRegressor(random_state=0),
]

Now instantiate a Forecast object with the models and the features that you want to use. The features can be lags, transformations on the lags, and date features. The lag transformations are defined as numba-jitted functions that transform an array. If a transformation takes additional arguments, supply a tuple (transform_func, arg1, arg2, …).

from mlforecast import Forecast
from window_ops.expanding import expanding_mean
from window_ops.rolling import rolling_mean

fcst = Forecast(
    models=models,
    freq='D',
    lags=[7, 14],
    lag_transforms={
        1: [expanding_mean],
        7: [(rolling_mean, 7)]
    },
    date_features=['dayofweek'],
    differences=[1],
)
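
To make the feature definitions more concrete, here is a plain-Python sketch of what a lag, an expanding mean over a lag, a rolling mean over a lag, and a day-of-week feature compute for a single series. It is not how mlforecast or window_ops implement them (those operate on numba-jitted arrays). The helper names, the use of None for missing values, and the small lags and window are assumptions chosen for readability.

```python
from datetime import date, timedelta

def lag(values, k):
    """Shift a series by k steps; the first k positions have no value."""
    return ([None] * k + values)[:len(values)]

def expanding_mean(values):
    """Mean of all non-missing values seen so far at each position."""
    out, total, count = [], 0.0, 0
    for v in values:
        if v is None:
            out.append(None)
            continue
        total += v
        count += 1
        out.append(total / count)
    return out

def rolling_mean(values, window_size):
    """Mean of the last `window_size` values; None until a full window exists."""
    out = []
    for i in range(len(values)):
        window = values[max(0, i - window_size + 1): i + 1]
        if len(window) < window_size or None in window:
            out.append(None)
        else:
            out.append(sum(window) / window_size)
    return out

y = [1.0, 2.0, 3.0, 4.0, 5.0]
dates = [date(2000, 1, 1) + timedelta(days=i) for i in range(len(y))]

# Analogous to lag_transforms={1: [expanding_mean], 2: [(rolling_mean, 2)]}
lag1_expanding = expanding_mean(lag(y, 1))
lag2_rolling = rolling_mean(lag(y, 2), 2)
# Day of week with Monday=0, the same convention pandas uses for dayofweek
dayofweek = [d.weekday() for d in dates]

print(lag1_expanding)  # [None, 1.0, 1.5, 2.0, 2.5]
print(lag2_rolling)    # [None, None, None, 1.5, 2.5]
print(dayofweek)       # [5, 6, 0, 1, 2]
```

Every feature at position t uses only values from before t, which is why the transformations are applied to lagged series rather than to the raw values.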

To compute the features and train the models call fit on your Forecast object. Here you have to specify the columns that:

  • Identify each series (id_col). If the series identifier is the index, you can specify id_col='index'.
  • Contain the timestamps (time_col). These can also be integers if your data doesn’t have timestamps.
  • Are the series values (target_col).

fcst.fit(series, id_col='index', time_col='ds', target_col='y', static_features=['static_0'])
Forecast(models=[LGBMRegressor, XGBRegressor, RandomForestRegressor], freq=<Day>, lag_features=['lag-7', 'lag-14', 'expanding_mean_lag-1', 'rolling_mean_lag-7_window_size-7'], date_features=['dayofweek'], num_threads=1)

To get the forecasts for the next 14 days call predict(14) on the forecast object. This will automatically handle the updates required by the features using a recursive strategy.

predictions = fcst.predict(14)

import matplotlib.pyplot as plt
import pandas as pd

fig, ax = plt.subplots(nrows=2, ncols=2, figsize=(12, 6), gridspec_kw=dict(hspace=0.3))
for i, (cat, axi) in enumerate(zip(series.index.categories, ax.flat)):
    pd.concat([series.loc[cat, ['ds', 'y']], predictions.loc[cat]]).set_index('ds').plot(ax=axi)
    axi.set(title=cat, xlabel=None)
    if i % 2 == 0:
        axi.legend().remove()
    else:
        axi.legend(bbox_to_anchor=(1.01, 1.0))
fig.savefig('figs/index.png', bbox_inches='tight')
plt.close()
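
The recursive strategy mentioned above can be sketched in a few lines: predict one step ahead, append that prediction to the history, recompute the features, and repeat. The sketch below also shows how first differences (as in differences=[1]) are undone to return forecasts on the original scale. It is a simplified illustration and not mlforecast's implementation. `toy_model` is a hypothetical stand-in for a fitted regressor, and the two lag features are an assumption made for brevity.

```python
def difference(values, d=1):
    """Subtract the value d steps earlier; the first d positions are dropped."""
    return [values[i] - values[i - d] for i in range(d, len(values))]

def toy_model(features):
    """Hypothetical stand-in for a fitted regressor: averages its lag features."""
    return sum(features) / len(features)

def recursive_forecast(history, horizon):
    """Forecast `horizon` steps ahead, feeding each prediction back as an input."""
    diffs = difference(history, 1)
    last_level = history[-1]
    forecasts = []
    for _ in range(horizon):
        features = diffs[-2:]            # lag-1 and lag-2 of the differenced series
        next_diff = toy_model(features)  # one-step-ahead prediction
        diffs.append(next_diff)          # the prediction becomes an input for the next step
        last_level += next_diff          # undo the differencing
        forecasts.append(last_level)
    return forecasts

history = [1.0, 3.0, 5.0, 7.0]
print(recursive_forecast(history, 3))  # [9.0, 11.0, 13.0]
```

Because each step consumes earlier predictions, errors can accumulate over long horizons. That is the usual trade-off of a recursive strategy compared with training one model per horizon.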
