Skip to main content

Fast and easy gaussian process regression for time series prediction

Project description

makeprediction logo

MakePrediction is a package for building Gaussian process models in Python. It was originally created by [Hanany Tolba].

  • MakePrediction is an open source project. If you have relevant skills and are interested in contributing then please do contact us (hananytolba@yahoo.com).*

Gaussian process regression (GPR):

The advantages of Gaussian processes are:

  • The prediction interpolates the observations.
  • The prediction is probabilistic (Gaussian) so that one can compute empirical confidence intervals and decide based on those if one should refit (online fitting, adaptive fitting) the prediction in some region of interest.
  • Versatile: different kernels can be specified. Common kernels are provided, but it is also possible to specify custom kernels.

In addition to standard scikit-learn estimator API,

  • The methods proposed here are much faster than standard scikit-learn estimator API.
  • The prediction method here (predict) is very complete compared to scikit-learn gaussian process API with many options such as: the sparse context and the automatic online update of prediction.

What does makeprediction do?

  • Modelling and analysis time series.

  • Automatic time-series prediction (forecasting) using Gaussian processes model.

  • Real-Time time series prediction.

  • Deploy on production the fitted (or saved) makeprediction model.

Applications:

  • Energy consumption prediction.
  • Energy demand prediction.
  • Stock price prediction.
  • Stock market prediction.
  • ...

Latest release from PyPI

  • pip install makeprediction

Latest source from GitHub

Be aware that the master branch may change regularly, and new commits may break your code.

MakePrediction GitHub repository, run:

  • pip install .

Example

Here is a simple example:

from makeprediction.quasigp import QuasiGPR as qgpr
from makeprediction.invtools import date2num
from makeprediction.kernels import *
import datetime
import pandas as pd
import numpy as np

#generate time series
###############################
  
x = pd.date_range(start = datetime.datetime(2021,1,1), periods=1000, freq = '3s' )
time2num = date2num(x)

# f(x)
f = lambda dt:  100*np.sin(2*np.pi*dt/500)*np.sin(2*np.pi*dt/3003)  + 500
# f(x) + noise
y = f(time2num) + 7*np.random.randn(x.size)

# split time serie into train and test
trainSize = int(x.size *.7)
xtrain,ytrain = x[:trainSize], y[:trainSize]
xtest,ytest = x[trainSize:], y[trainSize:]

# Create an instance of the class qgpr as model and plot it with plotly:
#########################################
model = qgpr(xtrain,ytrain, RBF()) 
model.plotly()
makeprediction logo
#fit the model
model.fit()
#predict with model and plot
model.predict(xtest)
model.plotly(ytest)
makeprediction logo
#Online prediction with update
ypred = []
for i in range(xtest.size):
    yp,_ = model.predict(xtest[i],return_value = True)
    ypred.append(yp)
    data = {'x_update': xtest[i], 'y_update': ytest[i],}
    model.update(**data)


#plot 

import matplotlib.pyplot as plt
plt.figure(figsize = (10,5))
plt.plot(xtest,ytest,'b', label ='Test')
plt.plot(xtest,ypred,'r',label='Prediction')
plt.legend()
makeprediction logo

The previous prediction with updating, can be obtained simply by the "predict" method as follows:

#prediction with update 
model.predict(xtest,ytest[:-1])
#And ploly 
model.plotly(ytest)
makeprediction logo
# Errors of prediction
model.score(ytest)

{'train_errors': {'MAE': 5.525659848832947,
  'MSE': 48.75753482298262,
  'R2': 0.9757047695585449},
 'test_errors': {'MAE': 6.69916209795533,
  'MSE': 68.7186589422385,
  'R2': 0.9816696384584944}}

Save the model:

model_path = 'saved_model'
model.save(model_path)

Deployement of makeprediction model

Now we are going to simulate the behavior of data that arrives continuously in realtime and stored in a database. We will create a 'csv' file named 'live_db.csv' that automatically grows every 3 seconds with a new line.

Create a 'realtime_db.py' file and copy the following code into it

from makeprediction.ts_generation import rtts
import numpy as np
def func(t):
    f_t  = 100*np.sin(2*np.pi*t/500)*np.sin(2*np.pi*t/3003)  + 500  + 7*np.random.randn(1)[0]
    return f_t

if __name__ == '__main__':
    rtts(function = func,step = 3,filename = 'live_db.csv')

You can notice that the function 'func' is the same as the one that generates the time series that we have noted "f".

Now in a terminal type:

python realtime_db.py 

or

python3 realtime_db.py 

or in a new Jupyter notepad

!python realtime_db.py 

according to your preference

Load the model

from makeprediction.quasigp import QuasiGPR as qgpr
model_path = 'saved_model'
#load the model
loaded_model = qgpr()
loaded_model = loaded_model.load(model_path)

Deployement

loaded_model.deploy2dashbord('live_db.csv')
Dash is running on http://127.0.0.1:9412/

INFO:makeprediction.quasigp:Dash is running on http://127.0.0.1:9412/

 * Serving Flask app "makeprediction.quasigp" (lazy loading)
 * Environment: production
   WARNING: This is a development server. Do not use it in a production deployment.
   Use a production WSGI server instead.
 * Debug mode: off
INFO:werkzeug: * Running on http://127.0.0.1:9412/ (Press CTRL+C to quit)
INFO:werkzeug:127.0.0.1 - - [18/Feb/2021 14:22:51] "POST /_dash-update-component HTTP/1.1" 200 -
INFO:werkzeug:127.0.0.1 - - [18/Feb/2021 14:22:51] "POST /_dash-update-component HTTP/1.1" 200 -
INFO:werkzeug:127.0.0.1 - - [18/Feb/2021 14:22:51] "POST /_dash-update-component HTTP/1.1" 200 -
makeprediction logo makeprediction logo

Demo

All files of this demo are in the 'demos' directory.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

makeprediction-2.1.0.tar.gz (134.8 kB view hashes)

Uploaded Source

Built Distribution

makeprediction-2.1.0-py3-none-any.whl (134.8 kB view hashes)

Uploaded Python 3

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page