Time series forecasting suite using statistical models

Statistical ⚡️ Forecast
Lightning fast forecasting with statistical and econometric models
StatsForecast offers a collection of widely used univariate time series forecasting models, including exponential smoothing and automatic ARIMA modeling optimized for high performance using numba.
🔥 Features
- Fastest and most accurate auto_arima in Python and R.
- New!: Distributed computation in clusters with ray.
- New!: Good Ol' sklearn syntax with AutoARIMA().fit(y).predict(h=7).
- New!: Inclusion of exogenous variables and prediction intervals.
- Out of the box implementation of exponential smoothing, croston, seasonal naive, random walk with drift and tbs.
- 20x faster than pmdarima.
- 1.5x faster than R.
- 500x faster than Prophet.
- Compiled to high performance machine code through numba.
- 1,000,000 series in 30 min with ray.
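For readers unfamiliar with the simpler baselines named in the list, here is a minimal pure-Python sketch of two of them, seasonal naive and random walk with drift. The function names `seasonal_naive_sketch` and `rw_drift_sketch` are hypothetical and only illustrate the textbook definitions; they are not the library's numba implementations.

```python
from typing import List


def seasonal_naive_sketch(y: List[float], h: int, season_length: int) -> List[float]:
    """Repeat the last observed season over the next h steps (illustrative only)."""
    if len(y) < season_length:
        raise ValueError("need at least one full season of history")
    last_season = y[-season_length:]
    return [last_season[i % season_length] for i in range(h)]


def rw_drift_sketch(y: List[float], h: int) -> List[float]:
    """Extend the last value along the average historical step (illustrative only)."""
    if len(y) < 2:
        raise ValueError("need at least two observations")
    drift = (y[-1] - y[0]) / (len(y) - 1)
    return [y[-1] + drift * step for step in range(1, h + 1)]


# Made-up history with a season of length 3
history = [10.0, 20.0, 30.0, 12.0, 22.0, 35.0]
print(seasonal_naive_sketch(history, h=4, season_length=3))  # [12.0, 22.0, 35.0, 12.0]
print(rw_drift_sketch(history, h=2))                         # [40.0, 45.0]
```

Baselines like these are cheap to compute, which is why they are commonly used as reference points when judging a more elaborate model.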
Missing something? Please open an issue or write us in
📖 Why?
Current Python alternatives for statistical models are slow and inaccurate. So we created a library that can be used to forecast in production environments or as benchmarks. StatsForecast includes an extensive battery of models that can efficiently fit thousands of time series.
🔬 Accuracy
We compared accuracy and speed against: pmdarima, Rob Hyndman's forecast package and Facebook's Prophet. We used the Daily, Hourly and Weekly data from the M4 competition.
The following table summarizes the results. Our auto_arima achieves the lowest MASE on all three datasets and the shortest run time on Daily and Hourly; on Weekly, the original R implementation is faster (0.22 vs. 0.42).
| dataset | metric | nixtla | pmdarima [1] | auto_arima_r | prophet |
|---|---|---|---|---|---|
| M4-Daily | MASE | 3.26 | 3.35 | 4.46 | 14.26 |
| M4-Daily | time | 1.41 | 27.61 | 1.81 | 514.33 |
| M4-Hourly | MASE | 0.92 | --- | 1.02 | 1.78 |
| M4-Hourly | time | 12.92 | --- | 23.95 | 17.27 |
| M4-Weekly | MASE | 2.34 | 2.47 | 2.58 | 7.29 |
| M4-Weekly | time | 0.42 | 2.92 | 0.22 | 19.82 |
[1] The model auto_arima from pmdarima had problems with Hourly data. An issue was opened in their repo.
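As a reminder of what the MASE rows measure, the sketch below follows the standard definition of the metric: the out-of-sample mean absolute error divided by the in-sample mean absolute error of a seasonal naive forecast. The function name `mase` and the toy numbers are assumptions for illustration; the seasonal period used for each M4 group is not stated here, so it is left as a parameter.

```python
from typing import Sequence


def mase(y_train: Sequence[float], y_test: Sequence[float],
         y_pred: Sequence[float], season_length: int = 1) -> float:
    """Mean absolute scaled error (standard definition, illustrative sketch)."""
    if len(y_test) != len(y_pred):
        raise ValueError("y_test and y_pred must have the same length")
    if len(y_train) <= season_length:
        raise ValueError("training series must be longer than season_length")
    # In-sample absolute errors of the seasonal naive forecast
    naive_errors = [
        abs(y_train[t] - y_train[t - season_length])
        for t in range(season_length, len(y_train))
    ]
    scale = sum(naive_errors) / len(naive_errors)
    if scale == 0:
        raise ValueError("scale is zero; the training series is constant")
    # Out-of-sample mean absolute error of the forecast being evaluated
    mae = sum(abs(a - f) for a, f in zip(y_test, y_pred)) / len(y_test)
    return mae / scale


# Toy data: the naive in-sample error is 2.0 and the forecast MAE is 1.5
score = mase([1, 3, 5, 7, 9], y_test=[11, 13], y_pred=[10, 15])
print(score)  # 0.75
```

A value below 1 means the forecast beats the in-sample naive benchmark on average; lower is better.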
The following table summarizes the data details.
| group | n_series | mean_length | std_length | min_length | max_length |
|---|---|---|---|---|---|
| Daily | 4,227 | 2,371 | 1,756 | 107 | 9,933 |
| Hourly | 414 | 901 | 127 | 748 | 1,008 |
| Weekly | 359 | 1,035 | 707 | 93 | 2,610 |
⏲ Computational efficiency
We measured the computational time against the number of time series. The following graph shows the results. As we can see, the fastest model is our auto_arima.
Nixtla vs Prophet
You can reproduce the results here.
External regressors
Results with external regressors are qualitatively similar to those reported above. You can find the complete experiments here.
👾 Less code
📖 Documentation
Here is a link to the documentation.
🧬 Getting Started 
💻 Installation
PyPI
You can install the released version of StatsForecast from the Python package index with:
pip install statsforecast
(Installing inside a Python virtual environment or a conda environment is recommended.)
Conda
You can also install the released version of StatsForecast from conda with:
conda install -c conda-forge statsforecast
(Installing inside a Python virtual environment or a conda environment is recommended.)
Dev Mode
If you want to make some modifications to the code and see the effects in real time (without reinstalling), follow the steps below:
git clone https://github.com/Nixtla/statsforecast.git
cd statsforecast
pip install -e .
🧬 How to use
import numpy as np
import pandas as pd
from IPython.display import display, Markdown
import matplotlib.pyplot as plt
from statsforecast import StatsForecast
from statsforecast.models import seasonal_naive, auto_arima
from statsforecast.utils import AirPassengers

# Hold out the last 12 months for evaluation
horizon = 12
ap_train = AirPassengers[:-horizon]
ap_test = AirPassengers[-horizon:]

# Long format: one row per timestamp, indexed by unique_id
series_train = pd.DataFrame(
    {
        'ds': pd.date_range(start='1949-01-01', periods=ap_train.size, freq='M'),
        'y': ap_train
    },
    index=pd.Index([0] * ap_train.size, name='unique_id')
)

# Each model is given together with its season length (12 for monthly data)
fcst = StatsForecast(
    series_train,
    models=[(auto_arima, 12), (seasonal_naive, 12)],
    freq='M',
    n_jobs=1
)
forecasts = fcst.forecast(12, level=(80, 95))
forecasts['y_test'] = ap_test

# Plot history, held-out values, point forecasts and prediction intervals
fig, ax = plt.subplots(1, 1, figsize = (20, 7))
df_plot = pd.concat([series_train, forecasts]).set_index('ds')
df_plot[['y', 'y_test', 'auto_arima_season_length-12_mean', 'seasonal_naive_season_length-12']].plot(ax=ax, linewidth=2)
ax.fill_between(df_plot.index,
                df_plot['auto_arima_season_length-12_lo-80'],
                df_plot['auto_arima_season_length-12_hi-80'],
                alpha=.35,
                color='green',
                label='auto_arima_level_80')
ax.fill_between(df_plot.index,
                df_plot['auto_arima_season_length-12_lo-95'],
                df_plot['auto_arima_season_length-12_hi-95'],
                alpha=.2,
                color='green',
                label='auto_arima_level_95')
ax.set_title('AirPassengers Forecast', fontsize=22)
ax.set_ylabel('Monthly Passengers', fontsize=20)
ax.set_xlabel('Timestamp [t]', fontsize=20)
ax.legend(prop={'size': 15})
ax.grid()
for label in (ax.get_xticklabels() + ax.get_yticklabels()):
    label.set_fontsize(20)
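The level=(80, 95) argument above requests 80% and 95% prediction intervals, which appear as the lo-80/hi-80 and lo-95/hi-95 columns. As a rough illustration of what such a level means, the sketch below builds a central interval under a Gaussian assumption. The helper `gaussian_interval` and the mean and sigma values are made up for the example; this is not necessarily how the library computes its intervals.

```python
from statistics import NormalDist
from typing import Tuple


def gaussian_interval(mean: float, sigma: float, level: float) -> Tuple[float, float]:
    """Central interval covering `level` percent of a Normal(mean, sigma)."""
    if not 0 < level < 100:
        raise ValueError("level must be strictly between 0 and 100")
    # An 80% central interval leaves 10% in each tail, so use the 0.90 quantile
    z = NormalDist().inv_cdf(0.5 + level / 200)
    return mean - z * sigma, mean + z * sigma


# Hypothetical point forecast and forecast standard deviation
for level in (80, 95):
    lo, hi = gaussian_interval(mean=450.0, sigma=20.0, level=level)
    print(f"level {level}: [{lo:.1f}, {hi:.1f}]")
```

A higher level always gives a wider interval around the same point forecast, which is why the 95% band in the plot encloses the 80% band.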
Adding external regressors
series_train['trend'] = np.arange(1, ap_train.size + 1)
series_train['intercept'] = np.ones(ap_train.size)
series_train['month'] = series_train['ds'].dt.month
series_train = pd.get_dummies(series_train, columns=['month'], drop_first=True)
display(series_train.head())
| unique_id | ds | y | trend | intercept | month_2 | month_3 | month_4 | month_5 | month_6 | month_7 | month_8 | month_9 | month_10 | month_11 | month_12 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1949-01-31 00:00:00 | 112 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 0 | 1949-02-28 00:00:00 | 118 | 2 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 0 | 1949-03-31 00:00:00 | 132 | 3 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 0 | 1949-04-30 00:00:00 | 129 | 4 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 0 | 1949-05-31 00:00:00 | 121 | 5 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
xreg_test = pd.DataFrame(
    {
        'ds': pd.date_range(start='1960-01-01', periods=ap_test.size, freq='M')
    },
    index=pd.Index([0] * ap_test.size, name='unique_id')
)
xreg_test['trend'] = np.arange(133, ap_test.size + 133)
xreg_test['intercept'] = np.ones(ap_test.size)
xreg_test['month'] = xreg_test['ds'].dt.month
xreg_test = pd.get_dummies(xreg_test, columns=['month'], drop_first=True)
fcst = StatsForecast(
    series_train,
    models=[(auto_arima, 12), (seasonal_naive, 12)],
    freq='M',
    n_jobs=1
)
forecasts = fcst.forecast(12, xreg=xreg_test, level=(80, 95))
forecasts['y_test'] = ap_test
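One detail worth noting in the example above: the future frame covers all twelve months, so pd.get_dummies produces the same month_2 to month_12 columns as the training frame. With a shorter horizon, pd.get_dummies would only create columns for the months actually present, and the regressor columns would no longer line up. The pure-Python sketch below shows one way to build the rows with a fixed column set; the helper `month_regressors` is hypothetical and not part of the library.

```python
from typing import Dict, List


def month_regressors(months: List[int], first_trend: int) -> List[Dict[str, int]]:
    """Build trend, intercept and month dummies with a fixed set of columns.

    Month 1 is the dropped reference level, mirroring drop_first=True.
    """
    rows = []
    for offset, month in enumerate(months):
        if not 1 <= month <= 12:
            raise ValueError(f"invalid month: {month}")
        row = {"trend": first_trend + offset, "intercept": 1}
        # Always emit month_2 .. month_12, even for months not in the horizon
        for m in range(2, 13):
            row[f"month_{m}"] = int(month == m)
        rows.append(row)
    return rows


# A 3-step horizon starting in January, continuing a 132-point training trend
future = month_regressors([1, 2, 3], first_trend=133)
print(list(future[0]))
print(future[1]["trend"], future[1]["month_2"], future[1]["month_3"])  # 134 1 0
```

The trend starts at 133 because the training series holds 132 observations, matching np.arange(133, ap_test.size + 133) in the example above.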
🔨 How to contribute
See CONTRIBUTING.md.
📃 References
- The auto_arima model is based on (translated from) the R implementation included in the forecast package developed by Rob Hyndman.
Contributors ✨
Thanks goes to these wonderful people (emoji key):
- fede 💻
- José Morales 💻 🚧
- Sugato Ray 💻
- Jeff Tackes 🐛
- darinkist 🤔
- Alec Helyar 💬
- Dave Hirschfeld 💬
- mergenthaler 💻
- Kin 💻
- Yasslight90 🤔
- asinig 🤔
This project follows the all-contributors specification. Contributions of any kind welcome!