Deep Learning for Time Series Forecasting
Nixtla
Neural 🧠 Forecast
Deep Learning for time series
State-of-the-art time series forecasting for PyTorch.
NeuralForecast
is a Python library for time series forecasting with deep learning.
It provides dataset loading utilities, evaluation functions, and PyTorch implementations of state-of-the-art deep learning forecasting models.
📖 Documentation
Here is a link to the documentation.
Installation
Stable version
This code is a work in progress, any contributions or issues are welcome on GitHub at: https://github.com/Nixtla/neuralforecast.
PyPI
You can install the released version of NeuralForecast
from the Python package index with:
pip install neuralforecast
(Installing inside a Python virtual environment or a conda environment is recommended.)
Conda
Alternatively, you can install the released version of NeuralForecast
from conda with:
conda install -c nixtla neuralforecast
(Installing inside a Python virtual environment or a conda environment is recommended.)
Development version
If you want to make some modifications to the code and see the effects in real time (without reinstalling), follow the steps below:
git clone https://github.com/Nixtla/neuralforecast.git
cd neuralforecast
pip install -e .
Getting started
Import libraries
import pandas as pd
import pytorch_lightning as pl
from neuralforecast.data.datasets.long_horizon import LongHorizon
from neuralforecast.data.tsdataset import WindowsDataset
from neuralforecast.data.tsloader import TimeSeriesLoader
from neuralforecast.experiments.utils import get_mask_dfs
from neuralforecast.models.nhits.nhits import NHITS
Dataset
Y_df, _, _ = LongHorizon.load('data', 'ILI')
Y_df['ds'] = pd.to_datetime(Y_df['ds'])
Y_df.head()
| | unique_id | ds | y |
|---|---|---|---|
| 0 | % WEIGHTED ILI | 2002-01-01 | -0.421499 |
| 1 | % WEIGHTED ILI | 2002-01-08 | -0.331239 |
| 2 | % WEIGHTED ILI | 2002-01-15 | -0.342763 |
| 3 | % WEIGHTED ILI | 2002-01-22 | -0.199782 |
| 4 | % WEIGHTED ILI | 2002-01-29 | -0.218426 |
Split train/test sets
output_size = 24
Y_df_test = Y_df.groupby('unique_id').tail(output_size)
Y_df_train = Y_df.drop(Y_df_test.index)
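The groupby-tail split above reserves the last `output_size` observations of every series for testing. A minimal sketch on a synthetic frame (the frame and its values are made up for illustration; only the split logic matches the tutorial):

```python
import pandas as pd

output_size = 3  # forecast horizon; the tutorial above uses 24

# two toy series, 10 weekly observations each
Y_df = pd.DataFrame({
    'unique_id': ['A'] * 10 + ['B'] * 10,
    'ds': list(pd.date_range('2020-01-01', periods=10, freq='W')) * 2,
    'y': range(20),
})

# same split as the tutorial: last `output_size` rows per series go to test
Y_df_test = Y_df.groupby('unique_id').tail(output_size)
Y_df_train = Y_df.drop(Y_df_test.index)

# every series contributes exactly `output_size` rows to the test set
assert (Y_df_test.groupby('unique_id').size() == output_size).all()
assert len(Y_df_train) + len(Y_df_test) == len(Y_df)
```

Because the split is done per `unique_id`, no series leaks its held-out tail into training, regardless of how many series the frame contains.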
Define TimeSeriesDataset and TimeSeriesLoader
input_size = 5 * output_size
train_mask_df, val_mask_df, _ = get_mask_dfs(Y_df=Y_df_train,
ds_in_val=5 * output_size,
ds_in_test=0)
train_mask_df.query('unique_id == "OT"').set_index('ds').plot()
val_mask_df.query('unique_id == "OT"').set_index('ds').plot()
train_dataset = WindowsDataset(Y_df_train,
input_size=input_size,
output_size=output_size,
mask_df=train_mask_df)
val_dataset = WindowsDataset(Y_df_train,
input_size=input_size,
output_size=output_size,
mask_df=val_mask_df)
train_loader = TimeSeriesLoader(train_dataset, batch_size=1,
n_windows=256,
shuffle=True)
val_loader = TimeSeriesLoader(val_dataset, batch_size=1)
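`WindowsDataset` serves (input, output) windows sliced from each series. Independently of its internals, the windowing idea can be sketched with plain NumPy (sizes here are toy values, not the tutorial's):

```python
import numpy as np

input_size, output_size = 4, 2
y = np.arange(10, dtype=float)  # one toy series

# every position where a full input window and a full output window fit
windows = [
    (y[i:i + input_size], y[i + input_size:i + input_size + output_size])
    for i in range(len(y) - input_size - output_size + 1)
]

x0, t0 = windows[0]
# x0 holds the first `input_size` points, t0 the `output_size` points after them
```

The model never sees a target window during training that extends past the series; the mask DataFrames built with `get_mask_dfs` additionally restrict which windows belong to the train and validation splits.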
Define model
model = NHITS(n_time_in=input_size, n_time_out=output_size,
n_x=0, n_s=0, n_s_hidden=[0], n_x_hidden=[0],
shared_weights=False, initialization='lecun_normal',
activation='ReLU', stack_types=3 * ['identity'],
n_blocks= 3 * [1], n_layers= 9 * [2], n_theta_hidden=3 * [[512, 512]],
n_pool_kernel_size=3 * [8], n_freq_downsample=[168, 24, 1],
pooling_mode='max', interpolation_mode='linear',
batch_normalization=False, dropout_prob_theta=0,
learning_rate=1e-3, lr_decay=0.5, lr_decay_step_size=5, weight_decay=0,
loss_train='MAE', loss_hypar=0, loss_valid='MAE',
frequency='W-TUE', seasonality=52, random_seed=0)
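The `n_pool_kernel_size`, `pooling_mode`, and `interpolation_mode` arguments control N-HiTS's multi-rate sampling and hierarchical interpolation. A rough NumPy sketch of that idea (illustration only, not the NHITS implementation): a block max-pools its input to a coarse rate, and its low-dimensional forecast is interpolated back up to the full horizon.

```python
import numpy as np

def max_pool_1d(x, kernel):
    # non-overlapping max pooling; assumes len(x) is divisible by `kernel`
    return x.reshape(-1, kernel).max(axis=1)

def linear_interp(coarse, out_len):
    # upsample a coarse forecast to `out_len` evenly spaced points
    xp = np.linspace(0.0, 1.0, num=len(coarse))
    xq = np.linspace(0.0, 1.0, num=out_len)
    return np.interp(xq, xp, coarse)

x = np.arange(24, dtype=float)
coarse = max_pool_1d(x, kernel=8)   # 24 points -> 3 points (kernel 8, as above)
dense = linear_interp(coarse, 24)   # 3 points -> 24-point forecast
```

Stacks with large kernels and heavy downsampling specialize in low-frequency structure, while the `n_freq_downsample=[168, 24, 1]` schedule lets later stacks model progressively finer detail.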
Train model with early stopping
early_stopping = pl.callbacks.EarlyStopping(monitor="val_loss",
min_delta=1e-4,
patience=3, verbose=False,mode="min")
trainer = pl.Trainer(max_epochs=10,
max_steps=100,
check_val_every_n_epoch=1,
callbacks=[early_stopping])
trainer.fit(model, train_loader, val_loader)
GPU available: False, used: False
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
| Name | Type | Params
---------------------------------
0 | model | _NHITS | 1.0 M
---------------------------------
1.0 M Trainable params
0 Non-trainable params
1.0 M Total params
4.042 Total estimated model params size (MB)
Y_df_forecast = model.forecast(Y_df_train)
Y_df_forecast.rename(columns={'y': 'y_hat'}, inplace=True)
Y_df_forecast.head()
INFO:root:Train Validation splits
INFO:root: ds
min max
unique_id sample_mask
% WEIGHTED ILI 0 2002-01-01 2020-01-14
1 2020-01-21 2020-06-30
%UNWEIGHTED ILI 0 2002-01-01 2020-01-14
1 2020-01-21 2020-06-30
AGE 0-4 0 2002-01-01 2020-01-14
1 2020-01-21 2020-06-30
AGE 5-24 0 2002-01-01 2020-01-14
1 2020-01-21 2020-06-30
ILITOTAL 0 2002-01-01 2020-01-14
1 2020-01-21 2020-06-30
NUM. OF PROVIDERS 0 2002-01-01 2020-01-14
1 2020-01-21 2020-06-30
OT 0 2002-01-01 2020-01-14
1 2020-01-21 2020-06-30
INFO:root:
Total data 6762 time stamps
Available percentage=100.0, 6762 time stamps
Insample percentage=2.48, 168 time stamps
Outsample percentage=97.52, 6594 time stamps
| | unique_id | ds | y_hat |
|---|---|---|---|
| 0 | % WEIGHTED ILI | 2020-01-21 | -0.456533 |
| 1 | % WEIGHTED ILI | 2020-01-28 | -0.310985 |
| 2 | % WEIGHTED ILI | 2020-02-04 | -0.114892 |
| 3 | % WEIGHTED ILI | 2020-02-11 | -0.071604 |
| 4 | % WEIGHTED ILI | 2020-02-18 | 0.285519 |
Y_df_plot = Y_df_test.merge(Y_df_forecast, how='left', on=['unique_id', 'ds'])
Y_df_plot.query('unique_id == "OT"').set_index('ds').plot()
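Beyond the plot, the merged frame makes point accuracy easy to score. A sketch with made-up numbers (the merge and the MAE are ordinary pandas; only the column names follow the tutorial):

```python
import pandas as pd

# toy stand-ins for the tutorial's Y_df_test and Y_df_forecast
Y_df_test = pd.DataFrame({
    'unique_id': ['OT'] * 3,
    'ds': pd.date_range('2020-01-21', periods=3, freq='W-TUE'),
    'y': [1.0, 2.0, 3.0],
})
Y_df_forecast = Y_df_test.assign(y=[1.5, 1.5, 3.5]).rename(columns={'y': 'y_hat'})

# same merge as above, then mean absolute error over the horizon
Y_df_plot = Y_df_test.merge(Y_df_forecast, how='left', on=['unique_id', 'ds'])
mae = (Y_df_plot['y'] - Y_df_plot['y_hat']).abs().mean()
```

MAE matches `loss_train='MAE'` used when defining the model, so train and test are judged on the same criterion.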
Current available models
-
Neural Hierarchical Interpolation for Time Series Forecasting (N-HiTS): A new model for long-horizon forecasting which incorporates novel hierarchical interpolation and multi-rate data sampling techniques to specialize blocks of its architecture to different frequency bands of the time series signal. It achieves SoTA performance on several benchmark datasets, outperforming current Transformer-based models by more than 25%.
-
Exponential Smoothing Recurrent Neural Network (ES-RNN): A hybrid model that combines the expressivity of nonlinear models to capture the trends while normalizing with a Holt-Winters-inspired model for the levels and seasonalities. This model is the winner of the M4 forecasting competition.
-
Neural Basis Expansion Analysis (N-BEATS): A model from Element AI (Yoshua Bengio's lab) that has proven to achieve state-of-the-art performance on large-scale benchmark forecasting datasets like Tourism, M3, and M4. The model is fast to train and has an interpretable configuration.
-
Neural Basis Expansion Analysis with Exogenous Variables (N-BEATSx): The neural basis expansion with exogenous variables is an extension to the original N-BEATS that allows it to include time dependent covariates.
License
This project is licensed under the GPLv3 License - see the LICENSE file for details.
How to contribute
See CONTRIBUTING.md.
How to cite
If you use NeuralForecast
in a scientific publication, we encourage you to add
the following references to the related papers:
@article{neuralforecast_arxiv,
author = {XXXX},
title = {{NeuralForecast: Deep Learning for Time Series Forecasting}},
journal = {arXiv preprint arXiv:XXX.XXX},
year = {2022}
}