The practitioner's time series forecasting library

Scalecast

About

Scalecast helps you forecast time series. It is especially useful in the following cases:

  • Series that need dynamic transformations to extract signal
  • Series with missing values
  • Establishing proof-of-concept experiments with model validation
  • Academic research
  • Working in internal production and sandbox environments

Here is how to initialize its main object:

from scalecast.Forecaster import Forecaster

f = Forecaster(
    y = array_of_values,
    current_dates = array_of_dates,
    future_dates = fcst_horizon_length, # length of the forecast horizon
    test_length = 0, # do you want to test all models? if so, on how many or what percent of observations?
    cis = False, # evaluate conformal confidence intervals for all models?
    metrics = ['rmse','mape','mae','r2'], # what metrics to evaluate over the validation/test sets?
)
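For intuition, the four default metrics can be written in a few lines of plain Python. This is a sketch of what the metric names mean, not scalecast's internal implementation:

```python
def rmse(a, p):
    # root mean squared error
    return (sum((ai - pi) ** 2 for ai, pi in zip(a, p)) / len(a)) ** 0.5

def mae(a, p):
    # mean absolute error
    return sum(abs(ai - pi) for ai, pi in zip(a, p)) / len(a)

def mape(a, p):
    # mean absolute percentage error (actuals must be nonzero)
    return sum(abs((ai - pi) / ai) for ai, pi in zip(a, p)) / len(a)

def r2(a, p):
    # coefficient of determination
    mean_a = sum(a) / len(a)
    ss_res = sum((ai - pi) ** 2 for ai, pi in zip(a, p))
    ss_tot = sum((ai - mean_a) ** 2 for ai in a)
    return 1 - ss_res / ss_tot

actual = [10.0, 12.0, 14.0]
pred = [11.0, 11.0, 15.0]
print(round(rmse(actual, pred), 3))  # 1.0
```

RMSE and MAE are in the units of the series, MAPE is scale-free, and R2 compares the model against a mean-only baseline.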

Uniform ML modeling (with models from a diverse set of libraries, including scikit-learn, statsmodels, and tensorflow), reporting, and data visualizations are offered through the Forecaster and MVForecaster interfaces. Data storage and processing become easy because all applicable data, predictions, and many derived metrics are contained in a few objects, with extensive customization available through different modules. Feature requests and issue reporting are welcome! Don't forget to leave a star! ⭐

Documentation

Popular Features

  1. Easy LSTM Modeling: setting up an LSTM model for time series with tensorflow is hard. With scalecast, it's easy. Many tutorials and Kaggle notebooks designed for those getting to know the model use scalecast (see the article).
f.set_estimator('lstm')
f.manual_forecast(
    lags=36,
    batch_size=32,
    epochs=15,
    validation_split=.2,
    activation='tanh',
    optimizer='Adam',
    learning_rate=0.001,
    lstm_layer_sizes=(100,)*3,
    dropout=(0,)*3,
)
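`lags=36` means each training example is built from the 36 observations preceding the target. A pure-Python sketch of that windowing (the actual input shaping is handled inside scalecast and tensorflow):

```python
def make_windows(series, lags):
    """Turn a 1-D series into (X, y) pairs where each X row
    holds the `lags` values immediately preceding the target y."""
    X, y = [], []
    for i in range(lags, len(series)):
        X.append(series[i - lags:i])
        y.append(series[i])
    return X, y

series = list(range(10))  # 0..9
X, y = make_windows(series, lags=3)
print(X[0], y[0])  # [0, 1, 2] 3
```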
  2. Auto lag, trend, and seasonality selection:
f.auto_Xvar_select( # iterate through different combinations of covariates
    estimator = 'lasso', # what estimator?
    alpha = .2, # estimator hyperparams?
    monitor = 'ValidationMetricValue', # what metric to monitor to make decisions?
    cross_validate = True, # cross validate
    cvkwargs = {'k':3}, # 3 folds
)
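Conceptually, `auto_Xvar_select` tries candidate covariate sets and keeps the one with the best monitored metric. A toy sketch of that search loop (the names and scorer here are illustrative, not scalecast's internals):

```python
def select_features(candidates, score_fn):
    # pick the candidate feature set with the lowest validation error
    best_set, best_score = None, float("inf")
    for feats in candidates:
        score = score_fn(feats)
        if score < best_score:
            best_set, best_score = feats, score
    return best_set, best_score

# toy scorer: pretend each feature helps a little and trend helps a lot
def toy_score(feats):
    return 10 - 2 * ("trend" in feats) - len(feats)

candidates = [("lags",), ("lags", "trend"), ("lags", "seasonality")]
best, score = select_features(candidates, toy_score)
print(best)  # ('lags', 'trend')
```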
  3. Hyperparameter tuning using grid search and time series cross validation:
from scalecast import GridGenerator

GridGenerator.get_example_grids()
models = ['ridge','lasso','xgboost','lightgbm','knn']
f.tune_test_forecast(
    models,
    limit_grid_size = .2,
    feature_importance = True, # save pfi feature importance for each model?
    cross_validate = True, # cross validate? if False, uses a separate validation set that the user can specify
    rolling = True, # rolling time series cross validation?
    k = 3, # how many folds?
)
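With `rolling=True`, each fold trains on a fixed-size window that slides forward in time and validates on the block immediately after it, so validation data is never older than training data. A sketch of how such fold indices can be generated (illustrative, not scalecast's exact splitter):

```python
def rolling_folds(n_obs, k, val_size):
    """Yield (train_indices, val_indices) for k rolling folds,
    always validating on the block right after the training window."""
    train_size = n_obs - k * val_size
    folds = []
    for i in range(k):
        start = i * val_size
        train = list(range(start, start + train_size))
        val = list(range(start + train_size, start + train_size + val_size))
        folds.append((train, val))
    return folds

folds = rolling_folds(n_obs=12, k=3, val_size=2)
print(folds[0])  # ([0, 1, 2, 3, 4, 5], [6, 7])
```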
  4. Plotting results: plot test predictions, forecasts, fitted values, and more.
import matplotlib.pyplot as plt

fig, ax = plt.subplots(2,1, figsize = (12,6))
f.plot_test_set(models=models,order_by='TestSetRMSE',ax=ax[0])
f.plot(models=models,order_by='TestSetRMSE',ax=ax[1])
plt.show()
  5. Pipelines that include transformations, reverting, and backtesting:
from scalecast import GridGenerator
from scalecast.Pipeline import Transformer, Reverter, Pipeline
from scalecast.util import find_optimal_transformation, backtest_metrics

def forecaster(f):
    models = ['ridge','lasso','xgboost','lightgbm','knn']
    f.tune_test_forecast(
        models,
        limit_grid_size = .2, # randomized grid search on 20% of original grid sizes
        feature_importance = True, # save pfi feature importance for each model?
        cross_validate = True, # cross validate? if False, uses a separate validation set that the user can specify
        rolling = True, # rolling time series cross validation?
        k = 3, # how many folds?
    )

transformer, reverter = find_optimal_transformation(f) # just one of several ways to select transformations for your series

pipeline = Pipeline(
    steps = [
        ('Transform',transformer),
        ('Forecast',forecaster),
        ('Revert',reverter),
    ]
)

f = pipeline.fit_predict(f)
backtest_results = pipeline.backtest(f)
metrics = backtest_metrics(backtest_results)
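A backtest re-runs the forecast from several earlier origins and scores each run against what actually happened. Here is a minimal sketch of the idea with a naive last-value forecaster (purely illustrative; `pipeline.backtest` handles this, plus the transformations, for you):

```python
def naive_forecast(history, horizon):
    # forecast the last observed value forward
    return [history[-1]] * horizon

def backtest(series, n_iter, horizon):
    """Score a forecaster from n_iter successively earlier origins."""
    errors = []
    for i in range(n_iter):
        cutoff = len(series) - horizon - i
        train, test = series[:cutoff], series[cutoff:cutoff + horizon]
        preds = naive_forecast(train, horizon)
        mae = sum(abs(a - p) for a, p in zip(test, preds)) / horizon
        errors.append(mae)
    return errors

series = [1, 2, 3, 4, 5, 6, 7, 8]
print(backtest(series, n_iter=2, horizon=2))  # [1.5, 1.5]
```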
  6. Model stacking: There are two ways to stack models with scalecast: with the StackingRegressor from scikit-learn or with scalecast's own stacking procedure.
from scalecast.auxmodels import auto_arima

f.set_estimator('lstm')
f.manual_forecast(
    lags=36,
    batch_size=32,
    epochs=15,
    validation_split=.2,
    activation='tanh',
    optimizer='Adam',
    learning_rate=0.001,
    lstm_layer_sizes=(100,)*3,
    dropout=(0,)*3,
)

f.set_estimator('prophet')
f.manual_forecast()

auto_arima(f)

# stack previously evaluated models
f.add_signals(['lstm','prophet','arima'])
f.set_estimator('catboost')
f.manual_forecast()
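The idea behind `add_signals` is that the base models' predictions become input features for the final estimator. As a simplified stand-in for that procedure, here is a weighted ensemble whose weights come from each base model's validation error (this is not scalecast's stacking implementation, just the underlying intuition):

```python
def inverse_error_weights(errors):
    # lower validation error -> higher weight; weights sum to 1
    inv = [1 / e for e in errors]
    total = sum(inv)
    return [w / total for w in inv]

def ensemble(predictions, weights):
    # combine per-model forecasts into one weighted forecast
    return [
        sum(w * preds[t] for w, preds in zip(weights, predictions))
        for t in range(len(predictions[0]))
    ]

# toy forecasts from three 'models' with validation errors 1, 2, 2
preds = [[10.0, 11.0], [12.0, 13.0], [8.0, 9.0]]
w = inverse_error_weights([1.0, 2.0, 2.0])
print(w)  # [0.5, 0.25, 0.25]
print(ensemble(preds, w))  # [10.0, 11.0]
```

A true stacked model, as in the snippet above, instead fits a meta estimator (catboost there) on the signals, letting it learn how to combine them.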
  7. Multivariate modeling and multivariate pipelines:
from scalecast.MVForecaster import MVForecaster
from scalecast.Pipeline import MVPipeline
from scalecast.util import find_optimal_transformation, backtest_metrics
from scalecast import GridGenerator

GridGenerator.get_mv_grids()

def mvforecaster(mvf):
    models = ['ridge','lasso','xgboost','lightgbm','knn']
    mvf.tune_test_forecast(
        models,
        limit_grid_size = .2, # randomized grid search on 20% of original grid sizes
        cross_validate = True, # cross validate? if False, uses a separate validation set that the user can specify
        rolling = True, # rolling time series cross validation?
        k = 3, # how many folds?
    )

mvf = MVForecaster(f1,f2,f3) # can take N Forecaster objects

transformer1, reverter1 = find_optimal_transformation(f1)
transformer2, reverter2 = find_optimal_transformation(f2)
transformer3, reverter3 = find_optimal_transformation(f3)

pipeline = MVPipeline(
    steps = [
        ('Transform',[transformer1,transformer2,transformer3]),
        ('Forecast',mvforecaster),
        ('Revert',[reverter1,reverter2,reverter3])
    ]
)

f1, f2, f3 = pipeline.fit_predict(f1, f2, f3)
backtest_results = pipeline.backtest(f1, f2, f3)
metrics = backtest_metrics(backtest_results)
  8. Transfer Learning: Train a model in one Forecaster object and use that model to make predictions on the data in a separate Forecaster object.
from scalecast.Forecaster import Forecaster
from scalecast.util import infer_apply_Xvar_selection

f = Forecaster(...)
f.auto_Xvar_select()
f.set_estimator('xgboost')
f.cross_validate()
f.auto_forecast()

f_new = Forecaster(...) # different series than f
f_new = infer_apply_Xvar_selection(infer_from=f,apply_to=f_new)
f_new.transfer_predict(transfer_from=f,model='xgboost') # transfers the xgboost model from f to f_new
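The key point is that the model is fit once on the first series and only applied, never refit, on the second. A toy sketch of the concept with a hand-rolled AR(1)-style model (the function names here are illustrative; `transfer_predict` does this for real scalecast models):

```python
def fit_ar1(series):
    # least-squares slope through the origin: y[t] ~ phi * y[t-1]
    num = sum(series[t] * series[t - 1] for t in range(1, len(series)))
    den = sum(series[t - 1] ** 2 for t in range(1, len(series)))
    return num / den

def transfer_predict(phi, new_series, horizon):
    # apply a coefficient fit on one series to a different series
    preds, last = [], new_series[-1]
    for _ in range(horizon):
        last = phi * last
        preds.append(last)
    return preds

phi = fit_ar1([1, 2, 4, 8, 16])         # doubling series -> phi == 2.0
print(transfer_predict(phi, [3.0], 3))  # [6.0, 12.0, 24.0]
```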

Installation

Required Installations

  • The uv package manager is recommended
  • Only the base package is needed to get started:
    • pip install --upgrade scalecast
    • uv pip install --upgrade scalecast (recommended)

Optional Installations

  • shap: feature importance (known issue with Python 3.11+)
  • tf: tensorflow for rnn/lstm models (on Mac, you may need to run uv pip install tensorflow-macos tensorflow-metal)
  • darts: theta
  • greykite: silverkite model
  • prophet: prophet model
  • tbats: tbats
  • lgbm: lightgbm

Install these by running:

uv pip install "scalecast[list_optional_dependencies]"

For example, install tensorflow and darts using:

uv pip install "scalecast[tf,darts]"

Please note that the optional dependencies may not be tested before new releases.

Papers that use scalecast

Udemy Course

Scalecast: Machine Learning & Deep Learning

Blog posts and notebooks

Forecasting with Different Model Types

Transforming and Reverting

Confidence Intervals

Dynamic Validation

Model Input Selection

Scaled Forecasting on Many Series

Transfer Learning

Contributing

How to cite scalecast

@misc{scalecast,
  title = {{scalecast}},
  author = {Michael Keith},
  year = {2024},
  version = {<your version>},
  url = {https://scalecast.readthedocs.io/en/latest/},
}
