
Deep Learning for Time Series Forecasting


Nixtla

Neural 🧠 Forecast

Deep learning for time series

State-of-the-art time series forecasting for PyTorch.

NeuralForecast is a Python library for time series forecasting with deep learning. It provides dataset-loading utilities, evaluation functions, and PyTorch implementations of state-of-the-art deep learning forecasting models.

📖 Documentation

Here is a link to the documentation.

Installation

Stable version

This code is a work in progress; contributions and issues are welcome on GitHub at: https://github.com/Nixtla/neuralforecast.

PyPI

You can install the released version of NeuralForecast from the Python package index with:

pip install neuralforecast

(Installing inside a Python virtual environment or a conda environment is recommended.)

Conda

You can also install the released version of NeuralForecast from conda with:

conda install -c nixtla neuralforecast

(Installing inside a Python virtual environment or a conda environment is recommended.)

Development version

If you want to modify the code and see the effects in real time (without reinstalling), follow these steps:

git clone https://github.com/Nixtla/neuralforecast.git
cd neuralforecast
pip install -e .

Getting started

Import libraries

import pandas as pd
import pytorch_lightning as pl
from neuralforecast.data.datasets.long_horizon import LongHorizon
from neuralforecast.data.tsdataset import WindowsDataset
from neuralforecast.data.tsloader import TimeSeriesLoader
from neuralforecast.experiments.utils import get_mask_dfs
from neuralforecast.models.nhits.nhits import NHITS

Dataset

Y_df, _, _ = LongHorizon.load('data', 'ILI')
Y_df['ds'] = pd.to_datetime(Y_df['ds'])
Y_df.head()
   unique_id        ds          y
0  % WEIGHTED ILI   2002-01-01  -0.421499
1  % WEIGHTED ILI   2002-01-08  -0.331239
2  % WEIGHTED ILI   2002-01-15  -0.342763
3  % WEIGHTED ILI   2002-01-22  -0.199782
4  % WEIGHTED ILI   2002-01-29  -0.218426

Split train/test sets

output_size = 24
Y_df_test = Y_df.groupby('unique_id').tail(output_size)
Y_df_train = Y_df.drop(Y_df_test.index)
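The tail-based split keeps the last `output_size` observations of every series for testing. A minimal sketch of the same pandas pattern on toy data (the values and series names here are illustrative):

```python
import pandas as pd

# Toy frame with two series of five observations each
df = pd.DataFrame({
    'unique_id': ['a'] * 5 + ['b'] * 5,
    'y': range(10),
})
output_size = 2

# The last `output_size` rows of each series go to the test set
df_test = df.groupby('unique_id').tail(output_size)
df_train = df.drop(df_test.index)
```

Because the split is per series, every `unique_id` contributes exactly `output_size` rows to the test set.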

Define WindowsDataset and TimeSeriesLoader

input_size = 5 * output_size
train_mask_df, val_mask_df, _ = get_mask_dfs(Y_df=Y_df_train, 
                                             ds_in_val=5 * output_size,
                                             ds_in_test=0)
train_mask_df.query('unique_id == "OT"').set_index('ds').plot()
[figure: training mask for series OT]

val_mask_df.query('unique_id == "OT"').set_index('ds').plot()
[figure: validation mask for series OT]

train_dataset = WindowsDataset(Y_df_train, 
                               input_size=input_size,
                               output_size=output_size,
                               mask_df=train_mask_df)
val_dataset = WindowsDataset(Y_df_train, 
                             input_size=input_size,
                             output_size=output_size,
                             mask_df=val_mask_df)
train_loader = TimeSeriesLoader(train_dataset, batch_size=1, 
                                n_windows=256,
                                shuffle=True)
val_loader = TimeSeriesLoader(val_dataset, batch_size=1)
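WindowsDataset samples fixed-length windows of `input_size + output_size` steps from each series, and `n_windows` controls how many windows each batch draws. A simplified NumPy sketch of the sliding-window idea (an illustration, not the library's actual implementation):

```python
import numpy as np

series = np.arange(10, dtype=float)
input_size, output_size = 4, 2
window_len = input_size + output_size

# Every contiguous window of length input_size + output_size:
# the first 4 steps are the model input, the last 2 are the target.
windows = np.stack([series[i:i + window_len]
                    for i in range(len(series) - window_len + 1)])
```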

Define model

model = NHITS(n_time_in=input_size, n_time_out=output_size,
              n_x=0, n_s=0, n_s_hidden=[0], n_x_hidden=[0],
              shared_weights=False, initialization='lecun_normal',
              activation='ReLU', stack_types=3 * ['identity'],
              n_blocks=3 * [1], n_layers=9 * [2], n_theta_hidden=3 * [[512, 512]],
              n_pool_kernel_size=3 * [8], n_freq_downsample=[168, 24, 1],
              pooling_mode='max', interpolation_mode='linear',
              batch_normalization=False, dropout_prob_theta=0,
              learning_rate=1e-3, lr_decay=0.5, lr_decay_step_size=5, weight_decay=0,
              loss_train='MAE', loss_hypar=0, loss_valid='MAE',
              frequency='W-TUE', seasonality=52, random_seed=0)
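The `n_pool_kernel_size` and `interpolation_mode` arguments drive N-HiTS's multi-rate sampling: each stack max-pools its input at a given kernel size and linearly interpolates its coarse forecast back to full resolution. A simplified NumPy sketch of that idea (not the library's actual code):

```python
import numpy as np

x = np.arange(16, dtype=float)
kernel = 8  # cf. n_pool_kernel_size

# Multi-rate sampling: max-pool so the stack sees a coarse view of the signal
pooled = x.reshape(-1, kernel).max(axis=1)

# cf. interpolation_mode='linear': expand the coarse signal
# back to the original resolution
upsampled = np.interp(np.linspace(0, len(pooled) - 1, num=len(x)),
                      np.arange(len(pooled)), pooled)
```

Stacks with larger kernels specialize in low-frequency structure; the stack with kernel size 1 keeps the full-resolution signal.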

Train model with early stopping

early_stopping = pl.callbacks.EarlyStopping(monitor="val_loss", 
                                            min_delta=1e-4, 
                                            patience=3, verbose=False, mode="min")

trainer = pl.Trainer(max_epochs=10, 
                     max_steps=100,
                     check_val_every_n_epoch=1,
                     callbacks=[early_stopping])

trainer.fit(model, train_loader, val_loader)
GPU available: False, used: False
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs

  | Name  | Type   | Params
---------------------------------
0 | model | _NHITS | 1.0 M 
---------------------------------
1.0 M     Trainable params
0         Non-trainable params
1.0 M     Total params
4.042     Total estimated model params size (MB)
Y_df_forecast = model.forecast(Y_df_train)
Y_df_forecast.rename(columns={'y': 'y_hat'}, inplace=True)
Y_df_forecast.head()
   unique_id        ds          y_hat
0  % WEIGHTED ILI   2020-01-21  -0.456533
1  % WEIGHTED ILI   2020-01-28  -0.310985
2  % WEIGHTED ILI   2020-02-04  -0.114892
3  % WEIGHTED ILI   2020-02-11  -0.071604
4  % WEIGHTED ILI   2020-02-18   0.285519
Y_df_plot = Y_df_test.merge(Y_df_forecast, how='left', on=['unique_id', 'ds'])
Y_df_plot.query('unique_id == "OT"').set_index('ds').plot()
[figure: actuals vs. forecasts for series OT]
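Beyond eyeballing the plot, the merged frame makes it easy to score the forecasts. A sketch computing MAE with plain pandas on toy values (the column names follow the merge above; the numbers are made up):

```python
import pandas as pd

# Toy merged frame of actuals and forecasts (illustrative values)
Y_df_plot = pd.DataFrame({
    'unique_id': ['OT'] * 3,
    'y': [1.0, 2.0, 3.0],
    'y_hat': [1.1, 1.9, 3.2],
})

# Mean absolute error over the test horizon
mae = (Y_df_plot['y'] - Y_df_plot['y_hat']).abs().mean()
```

Grouping by `unique_id` before averaging gives a per-series score instead of a global one.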

Current available models

  • Neural Hierarchical Interpolation for Time Series Forecasting (N-HiTS): A new model for long-horizon forecasting that incorporates novel hierarchical interpolation and multi-rate data sampling techniques to specialize blocks of its architecture to different frequency bands of the time-series signal. It achieves SoTA performance on several benchmark datasets, outperforming current Transformer-based models by more than 25%.

  • Exponential Smoothing Recurrent Neural Network (ES-RNN): A hybrid model that combines the expressivity of non-linear models to capture trends with a Holt-Winters-inspired normalization for levels and seasonalities. This model won the M4 forecasting competition.

  • Neural Basis Expansion Analysis (N-BEATS): A model from Element AI (Yoshua Bengio's lab) that has proven to achieve state-of-the-art performance on large-scale benchmark forecasting datasets like Tourism, M3, and M4. The model is fast to train and has an interpretable configuration.

  • Neural Basis Expansion Analysis with Exogenous Variables (N-BEATSx): An extension to the original N-BEATS that allows it to include time-dependent covariates.

License

This project is licensed under the GPLv3 License - see the LICENSE file for details.

How to contribute

See CONTRIBUTING.md.

How to cite

If you use NeuralForecast in a scientific publication, we encourage you to add the following references to the related papers:

@article{neuralforecast_arxiv,
  author  = {XXXX},
  title   = {{NeuralForecast: Deep Learning for Time Series Forecasting}},
  journal = {arXiv preprint arXiv:XXX.XXX},
  year    = {2022}
}
