Time series forecasting with Gaussian Processes

Time series forecasting with Gaussian Processes

Related Publication

The theoretical description of the algorithm implemented in this software and empirical results can be found in:

"Time series forecasting with Gaussian Processes needs priors"
ECML PKDD 2021: Proc. Machine Learning and Knowledge Discovery in Databases. Applied Data Science Track pp 103–117
Giorgio Corani, Alessio Benavoli, Marco Zaffalon

gpforecast package

The software includes a small package that builds the gaussian process and uses it to produce predictions. The package heavily relies on GPy. A convenience script can be used to run the GP over collections of timeseries.

Installation

gpforecast requires Python 3.8+ to run. Create a new conda environment named ENVNAME, install Python 3.8

conda create --name ENVNAME python=3.8


Activate the new environment to use it

conda activate ENVNAME


Install gpforecast with dependencies

pip install gpforecast


Tutorial [Python]

from gpforecast import GP, get_sample_ts

# STEP 0. Get a sample time series (Monthly or Quarterly time series from M3)
f = 'quarterly'  # 'monthly' or 'quarterly'
sample = get_sample_ts(frequency=f)

# STEP 1. Construct our model
g = GP(frequency=f, priors=True)

# STEP 2. Build our model
g.build(sample['Y'])

# STEP 3. Forecast
res = g.forecast(len(sample['YY']), level=[80,95])

# STEP 4. Evaluating forecast accuracy {MAE, CRPS, LL}, you could implement your own!
mean = res.PointForecast
u95  = res.Hi95
acc = g.compute_indicators(sample['YY'], mean, u95, level=95)

# STEP 5. Forecasts Plot
import matplotlib.pyplot as plt

def forecast_plot(train, res):
plt.style.use('ggplot')
plt.figure(figsize=(16,8))
plt.plot(list(range(1,len(train)+1)), train.reshape(-1,), color='black')
plt.plot(list(range(len(train)+1,len(train)+len(res.PointForecast)+1)),
res.PointForecast, color='blue')
levels = []
for i in res.columns:
if 'Lo' in i: levels.append(int(i[2:]))
for i, l in enumerate(sorted(levels)):
plt.fill_between(list(range(len(train)+1,len(train)+len(res.PointForecast)+1)),
res['Lo'+str(l)], res['Hi'+str(l)], color='blue', alpha=0.3-0.1*i)
plt.title('Forecasts from GP', loc='left', fontsize=16)
plt.xlabel("Time", fontsize=14)
plt.gca().tick_params(axis='both', which='major', labelsize=12)

forecast_plot(sample['Y'], res)


Tutorial [R]

gpforecast could be used in R thanks to Reticulate

install.packages("Reticulate")

library(reticulate)
use_virtualenv("ENVNAME")           # python environment with gpforecast
gpforecast <- import("gpforecast")  # load Python module
# STEP 0. Get a sample time series (Monthly or Quarterly time series from M3)
f = 'monthly'  # 'monthly' or 'quarterly'
sample = gpforecast$get_sample_ts(frequency=f) # STEP 1. Construct our model g <- gpforecast$GP(frequency=f, priors=TRUE)

# STEP 2. Build our model
g$build(sample$Y)

# STEP 3. Forecast
res <- g$forecast(length(sample$YY), level=as.integer(c(80,95)))

# STEP 4. Evaluating forecast accuracy {MAE, CRPS, LL}, you could implement your own!
mean <- as.array(as.vector(t(res['PointForecast'])))
u95  <- as.array(as.vector(t(res['Hi95'])))
acc <- g$compute_indicators(sample$YY, mean, u95, level=as.integer(95))

# STEP 5. Forecasts Plot
library(ggplot2)

forecast_plot <- function(train, res) {
options(repr.plot.width = 16, repr.plot.height = 8)
train_df <- data.frame(
X = 1:length(train),
Y = c(as.vector(t(train))))
res$h = 1:length(res[,'PointForecast']) + length(sample$Y)
p <- ggplot() +
geom_line(data = train_df, aes(X, Y), color = "black", size = 0.8) +
geom_line(data = res, aes(h, res[,'PointForecast']), color = "blue", size = 0.8)
levels <- list()
l <-1
for (i in colnames(res)) {
if (grepl('Lo', i, fixed = TRUE)) {
levels[l] <- substr(i,3,nchar(i))
l <- l + 1
}
}
levels <- sort(as.numeric(levels))
for (i in 1:length(levels)) {
lo <- paste0('Lo',levels[i])
hi <- paste0('Hi',levels[i])
p <- p +
geom_ribbon(data = res,
aes_string(x='h', ymin=lo, ymax=hi),
alpha=0.4-0.1*i, fill = "blue")
}
p <- p + ggtitle('Forecasts from GP') + xlab('Time') +
theme(
axis.title.y=element_blank(),
plot.title = element_text(size = 20),
axis.title = element_text(size = 18),
axis.text = element_text(size = 14))
return (p)
}

forecast_plot(sample\$Y, res)


IDSIA - Istituto Dalle Molle di Studi sull'Intelligenza Artificiale

Lugano, Switzerland idsia.ch

Project details

Uploaded Source
Uploaded Python 3