The practitioner's time series forecasting library
Scalecast
About
Scalecast helps you forecast time series. Here is how to initialize its main object:

```python
from scalecast.Forecaster import Forecaster

f = Forecaster(
    y = array_of_values,
    current_dates = array_of_dates,
    future_dates = fcst_horizon_length,
    test_length = 0, # do you want to test all models? if so, on how many or what percent of observations?
    cis = False, # evaluate conformal confidence intervals for all models?
    metrics = ['rmse','mape','mae','r2'], # what metrics to evaluate over the validation/test sets?
)
```
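For concreteness, the constructor arguments map naturally onto a pandas Series: `y` takes the observed values, `current_dates` their index, and `future_dates` an integer horizon. A minimal sketch with synthetic data (plain numpy/pandas only; the series and horizon here are invented for illustration):

```python
import numpy as np
import pandas as pd

# Synthetic monthly series standing in for real data (illustrative only).
rng = np.random.default_rng(0)
dates = pd.date_range("2020-01-01", periods=120, freq="MS")
values = 100 + 0.5 * np.arange(120) + rng.normal(0, 2, size=120)

array_of_values = pd.Series(values, index=dates)  # the y argument
array_of_dates = array_of_values.index            # the current_dates argument
fcst_horizon_length = 12                          # forecast 12 periods past the last observed date
```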
Uniform ML modeling (with models from a diverse set of libraries, including scikit-learn, statsmodels, and tensorflow), reporting, and data visualizations are offered through the Forecaster and MVForecaster interfaces. Data storage and processing then become easy, as all applicable data, predictions, and many derived metrics are contained in a few objects, with extensive customization available through different modules. Feature requests and issue reporting are welcome! Don't forget to leave a star! ⭐
Documentation
Popular Features
- Easy LSTM Modeling: setting up an LSTM model for time series using tensorflow is hard. Using scalecast, it's easy. Many tutorials and Kaggle notebooks designed for those getting to know the model use scalecast (see the article).
```python
f.set_estimator('lstm')
f.manual_forecast(
    lags=36,
    batch_size=32,
    epochs=15,
    validation_split=.2,
    activation='tanh',
    optimizer='Adam',
    learning_rate=0.001,
    lstm_layer_sizes=(100,)*3,
    dropout=(0,)*3,
)
```
- Auto lag, trend, and seasonality selection:
```python
f.auto_Xvar_select( # iterate through different combinations of covariates
    estimator = 'lasso', # what estimator?
    alpha = .2, # estimator hyperparams?
    monitor = 'ValidationMetricValue', # what metric to monitor to make decisions?
    cross_validate = True, # cross validate
    cvkwargs = {'k':3}, # 3 folds
)
```
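Under the hood, this kind of selection amounts to fitting the estimator on candidate feature sets and keeping the one with the best validation score. A rough illustration of the idea in plain numpy/scikit-learn (a hand-rolled sketch, not scalecast's actual implementation), picking a lag count for a lasso by holdout RMSE:

```python
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.metrics import mean_squared_error

# Synthetic seasonal series (period 12) plus noise.
rng = np.random.default_rng(1)
y = np.sin(np.arange(200) * 2 * np.pi / 12) + rng.normal(0, 0.1, 200)

def make_lag_matrix(y, n_lags):
    # Column j holds y lagged by j+1 periods; target is the aligned y.
    X = np.column_stack([y[n_lags - j : len(y) - j] for j in range(1, n_lags + 1)])
    return X, y[n_lags:]

best_lags, best_rmse = None, np.inf
for n_lags in (3, 6, 12, 24):
    X, target = make_lag_matrix(y, n_lags)
    split = int(len(target) * 0.8)  # hold out the last 20% for validation
    model = Lasso(alpha=0.2).fit(X[:split], target[:split])
    rmse = mean_squared_error(target[split:], model.predict(X[split:])) ** 0.5
    if rmse < best_rmse:
        best_lags, best_rmse = n_lags, rmse
```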
- Hyperparameter tuning using grid search and time series cross validation:
```python
from scalecast import GridGenerator

GridGenerator.get_example_grids()
models = ['ridge','lasso','xgboost','lightgbm','knn']
f.tune_test_forecast(
    models,
    limit_grid_size = .2,
    feature_importance = True, # save pfi feature importance for each model?
    cross_validate = True, # cross validate? if False, uses a separate validation set that the user can specify
    rolling = True, # rolling time series cross validation?
    k = 3, # how many folds?
)
```
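The time series cross validation that `cross_validate=True` refers to can be illustrated with scikit-learn's `TimeSeriesSplit`, which always trains on the past and validates on the future. This is a standalone sketch of the concept, not scalecast's internal code (pass `max_train_size` to `TimeSeriesSplit` to make the window roll rather than expand):

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_absolute_error

# Synthetic regression problem standing in for a lagged time series design matrix.
rng = np.random.default_rng(2)
X = rng.normal(size=(100, 4))
y = X @ np.array([0.5, -1.0, 0.3, 0.0]) + rng.normal(0, 0.1, 100)

scores = {}
for alpha in (0.01, 0.1, 1.0):  # a tiny hyperparameter grid
    fold_errs = []
    for train_idx, val_idx in TimeSeriesSplit(n_splits=3).split(X):  # k = 3 folds
        model = Ridge(alpha=alpha).fit(X[train_idx], y[train_idx])
        fold_errs.append(mean_absolute_error(y[val_idx], model.predict(X[val_idx])))
    scores[alpha] = np.mean(fold_errs)  # average error across folds

best_alpha = min(scores, key=scores.get)
```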
- Plotting results: plot test predictions, forecasts, fitted values, and more.
```python
import matplotlib.pyplot as plt

fig, ax = plt.subplots(2, 1, figsize = (12,6))
f.plot_test_set(models=models, order_by='TestSetRMSE', ax=ax[0])
f.plot(models=models, order_by='TestSetRMSE', ax=ax[1])
plt.show()
```
- Pipelines that include transformations, reverting, and backtesting:
```python
from scalecast import GridGenerator
from scalecast.Pipeline import Transformer, Reverter, Pipeline
from scalecast.util import find_optimal_transformation, backtest_metrics

def forecaster(f):
    models = ['ridge','lasso','xgboost','lightgbm','knn']
    f.tune_test_forecast(
        models,
        limit_grid_size = .2, # randomized grid search on 20% of original grid sizes
        feature_importance = True, # save pfi feature importance for each model?
        cross_validate = True, # cross validate? if False, uses a separate validation set that the user can specify
        rolling = True, # rolling time series cross validation?
        k = 3, # how many folds?
    )

transformer, reverter = find_optimal_transformation(f) # just one of several ways to select transformations for your series

pipeline = Pipeline(
    steps = [
        ('Transform',transformer),
        ('Forecast',forecaster),
        ('Revert',reverter),
    ]
)

f = pipeline.fit_predict(f)
backtest_results = pipeline.backtest(f)
metrics = backtest_metrics(backtest_results)
```
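The Transformer/Reverter pairing guarantees that whatever is done to the series before modeling is undone on the way out. Here is the idea in miniature, using plain numpy with first differencing (not the scalecast classes themselves):

```python
import numpy as np

y = np.array([10.0, 12.0, 15.0, 14.0, 18.0])

# Transform: first-difference the series to remove trend,
# remembering the first value so the step is invertible.
first_val = y[0]
diffed = np.diff(y)

# ... a model would be fit on `diffed` here ...

# Revert: the stored first value plus a cumulative sum recovers the original.
reverted = np.concatenate([[first_val], first_val + np.cumsum(diffed)])
assert np.allclose(reverted, y)
```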
- Model stacking: there are two ways to stack models with scalecast: with `StackingRegressor` from scikit-learn, or using scalecast's own stacking procedure.
```python
from scalecast.auxmodels import auto_arima

f.set_estimator('lstm')
f.manual_forecast(
    lags=36,
    batch_size=32,
    epochs=15,
    validation_split=.2,
    activation='tanh',
    optimizer='Adam',
    learning_rate=0.001,
    lstm_layer_sizes=(100,)*3,
    dropout=(0,)*3,
)

f.set_estimator('prophet')
f.manual_forecast()

auto_arima(f)

# stack previously evaluated models
f.add_signals(['lstm','prophet','arima'])
f.set_estimator('catboost')
f.manual_forecast()
```
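The scikit-learn route mentioned above uses `StackingRegressor`, which fits a meta-learner on the base models' cross-validated predictions. A standalone sketch in plain scikit-learn with synthetic data (scalecast can wrap scikit-learn-compatible regressors like this one; see its docs for how to register custom estimators):

```python
import numpy as np
from sklearn.ensemble import StackingRegressor
from sklearn.linear_model import Ridge, Lasso
from sklearn.neighbors import KNeighborsRegressor

# Synthetic regression data standing in for a lagged time series design matrix.
rng = np.random.default_rng(3)
X = rng.normal(size=(120, 3))
y = X @ np.array([1.0, -0.5, 0.2]) + rng.normal(0, 0.1, 120)

stack = StackingRegressor(
    estimators=[("ridge", Ridge()), ("knn", KNeighborsRegressor())],
    final_estimator=Lasso(alpha=0.01),  # meta-learner fit on base-model predictions
)
stack.fit(X[:100], y[:100])   # train on the first 100 observations
preds = stack.predict(X[100:])  # predict the last 20
```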
- Multivariate modeling and multivariate pipelines:
```python
from scalecast.MVForecaster import MVForecaster
from scalecast.Pipeline import MVPipeline
from scalecast.util import find_optimal_transformation, backtest_metrics
from scalecast import GridGenerator

GridGenerator.get_mv_grids()

def mvforecaster(mvf):
    models = ['ridge','lasso','xgboost','lightgbm','knn']
    mvf.tune_test_forecast(
        models,
        limit_grid_size = .2, # randomized grid search on 20% of original grid sizes
        cross_validate = True, # cross validate? if False, uses a separate validation set that the user can specify
        rolling = True, # rolling time series cross validation?
        k = 3, # how many folds?
    )

mvf = MVForecaster(f1,f2,f3) # can take N Forecaster objects

transformer1, reverter1 = find_optimal_transformation(f1)
transformer2, reverter2 = find_optimal_transformation(f2)
transformer3, reverter3 = find_optimal_transformation(f3)

pipeline = MVPipeline(
    steps = [
        ('Transform',[transformer1,transformer2,transformer3]),
        ('Forecast',mvforecaster),
        ('Revert',[reverter1,reverter2,reverter3]),
    ]
)

f1, f2, f3 = pipeline.fit_predict(f1, f2, f3)
backtest_results = pipeline.backtest(f1, f2, f3)
metrics = backtest_metrics(backtest_results)
```
- Transfer Learning (new with 0.19.0): train a model in one `Forecaster` object and use that model to make predictions on the data in a separate `Forecaster` object.
```python
from scalecast.Forecaster import Forecaster
from scalecast.util import infer_apply_Xvar_selection

f = Forecaster(...) # original series
f.auto_Xvar_select()
f.set_estimator('xgboost')
f.cross_validate()
f.auto_forecast()

f_new = Forecaster(...) # different series than f
f_new = infer_apply_Xvar_selection(infer_from=f, apply_to=f_new)
f_new.transfer_predict(transfer_from=f, model='xgboost') # transfers the xgboost model from f to f_new
```
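Conceptually, transfer prediction reuses a model fit on one series to forecast another, so the new series only needs the same feature recipe applied to it, not a refit. A bare-bones illustration with scikit-learn and hand-built lag features (synthetic data, not the scalecast API):

```python
import numpy as np
from sklearn.linear_model import Ridge

def lag_features(y, n_lags=4):
    # Same feature recipe for any series: columns are y lagged 1..n_lags periods.
    X = np.column_stack([y[n_lags - j : len(y) - j] for j in range(1, n_lags + 1)])
    return X, y[n_lags:]

rng = np.random.default_rng(4)
series_a = np.cumsum(rng.normal(0.1, 1, 300))  # series the model is trained on
series_b = np.cumsum(rng.normal(0.1, 1, 100))  # new, unseen series

Xa, ya = lag_features(series_a)
model = Ridge().fit(Xa, ya)       # train once on series A

Xb, yb = lag_features(series_b)   # apply the identical recipe to series B
preds = model.predict(Xb)         # reuse the fitted model, no refitting
```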
Installation
Required Installations
- uv is the recommended installer.
- Only the base package is needed to get started:

```shell
pip install --upgrade scalecast
# or, with uv (recommended):
uv pip install --upgrade scalecast
```
Optional Installations
- shap: feature importance (known issue with Python 3.11+)
- tf: tensorflow for rnn/lstm models (for Mac, you may need to run `uv pip install tensorflow-macos tensorflow-metal`)
- darts: theta
- greykite: silverkite model
- prophet: prophet model
- tbats: tbats
Install these by using:

```shell
uv pip install scalecast[list_optional_dependencies]
```

For example, install tensorflow and darts using:

```shell
uv pip install scalecast[tf,darts]
```
Please note that the optional dependencies may not be tested before new releases.
Papers that use scalecast
- Post-covid customer service behavior forecasting using machine learning techniques
- Application of ANN and traditional ML algorithms in modelling compost production under different climatic conditions
- Reservoir Computing Solutions for Streamflow Modeling and Prediction in Real World Scenarios
- LSTM-based recurrent neural network provides effective short term flu forecasting
- Implementing an Energy Trading Strategy Using Forecasting of Energy Prices and Production
- Modelamiento predictivo del número de visitantes en un centro comercial (Predictive modeling of visitor counts at a shopping center)
Udemy Course
Scalecast: Machine Learning & Deep Learning
Blog posts and notebooks
Forecasting with Different Model Types
- Sklearn Univariate
- Sklearn Multivariate
- RNN
- ARIMA
- Theta
- VECM
- Stacking
- Other Notebooks
Transforming and Reverting
Confidence Intervals
- Easy Distribution-Free Conformal Intervals for Time Series
- Dynamic Conformal Intervals for any Time Series Model
- Notebook 1
- Notebook 2
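The conformal intervals scalecast evaluates (enabled with `cis=True`) follow the split-conformal recipe: hold out a calibration set, take a quantile of the absolute residuals there, and widen every new prediction by that amount. A tiny numpy sketch of that recipe (illustrative numbers only, not scalecast's implementation):

```python
import numpy as np

# Pretend calibration set: held-out actuals and out-of-sample predictions.
rng = np.random.default_rng(5)
actuals = rng.normal(size=100)
preds = actuals + rng.normal(0, 0.5, 100)

# Interval half-width: the (1 - alpha) quantile of absolute calibration residuals.
alpha = 0.05  # for a nominal 95% interval
q = np.quantile(np.abs(actuals - preds), 1 - alpha)

# Every new point forecast gets the same distribution-free band.
new_preds = np.array([0.2, -0.1, 0.4])
lower, upper = new_preds - q, new_preds + q
```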
Dynamic Validation
Model Input Selection
- Variable Reduction Techniques for Time Series
- Auto Model Specification with ML Techniques for Time Series
- Notebook 1
- Notebook 2
Scaled Forecasting on Many Series
Transfer Learning
Anomaly Detection
Contributing
- Contributing.md
- Want something that's not listed? Open an issue!
How to cite scalecast
```bibtex
@misc{scalecast,
  title = {{scalecast}},
  author = {Michael Keith},
  year = {2024},
  version = {<your version>},
  url = {https://scalecast.readthedocs.io/en/latest/},
}
```
File details
Details for the file scalecast-0.20.0.tar.gz (source distribution).

- Size: 119.6 kB
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.14

| Algorithm | Hash digest |
|---|---|
| SHA256 | e4b00e1f00ab34946e4c2407c0e01d337ed5296cf5757ac7e2cd94fd17371919 |
| MD5 | 0acb7ed69b667d5e56def349c89c6eb9 |
| BLAKE2b-256 | bc796eae9cab0b6390da80af58c575d0143a8170595cef3fbf7830ec8d866f01 |
File details
Details for the file scalecast-0.20.0-py3-none-any.whl (built distribution, Python 3).

- Size: 122.7 kB
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.14

| Algorithm | Hash digest |
|---|---|
| SHA256 | b67e00349077669cc93dd3bbe07f58fa5ea1a2ebd07e1a834de55ae6cc313a4a |
| MD5 | 56894375781065a2c8751742c70431ad |
| BLAKE2b-256 | 10d4ce36bb39404eb04d5be4cf86968206a8a118e28ea7657647236b5a9c2471 |