Skip to main content

Multi-Output Gaussian Process ToolKit

Project description

Multi-Output Gaussian Process Toolkit

API Documentation - Tutorials & Examples

The Multi-Output Gaussian Process Toolkit is a Python toolkit for training and interpreting Gaussian process models with multiple data channels. It builds upon GPflow and TensorFlow to provide an easy way to train multi-output models effectively and interpret their results. The main authors are Taco de Wolff, Alejandro Cuevas, and Felipe Tobar as part of the Center for Mathematical Modelling at the University of Chile.

Installation

With Anaconda installed on your system, open a command prompt and create a virtual environment:

conda create -n myenv python=3.7
conda activate myenv

where myenv is the name of your environment, and where the version of Python could be 3.6 or above. In order to use TensorFlow on the GPU, the easiest way is to install TensorFlow through conda (and not pip) before we install this toolkit. If you will be using the CPU you can skip this step.

conda install tensorflow-gpu

Next we will install this toolkit and automatically install the necessary dependencies such as GPflow2 and TensorFlow2.

pip install mogptk

See Tutorials & Examples to get started.

Introduction

This repository provides a toolkit to perform multi-output GP regression with kernels that are designed to utilize correlation information among channels in order to better model signals. The toolkit is mainly targeted to time-series, and includes plotting functions for the case of single input with multiple outputs (time series with several channels).

The main kernel corresponds to Multi Output Spectral Mixture Kernel, which correlates every pair of data points (irrespective of their channel of origin) to model the signals. This kernel is specified in detail in the following publication: G. Parra, F. Tobar, Spectral Mixture Kernels for Multi-Output Gaussian Processes, Advances in Neural Information Processing Systems, 2017. Proceedings link: http://papers.nips.cc/paper/7245-spectral-mixture-kernels-for-multi-output-gaussian-processes

The kernel learns the cross-channel correlations of the data, so it is particularly well-suited for the task of signal reconstruction in the event of sporadic data loss. All other included kernels can be derived from the Multi Output Spectral Mixture kernel by restricting some parameters or applying some transformations.

One of the main advantages of the present toolkit is the GPU support, which enables the user to train models through TensorFlow, speeding computations significantly. It also includes sparse-variational GP regression functionality, to decrease computation time even further.

Tutorials

00 - Quick Start: Short notebook showing the basic use of the toolkit.

01 - Data Loading: Functionality to load CSVs and DataFrames while using formatters for dates.

02 - Data Preparation: Handle data, removing observations to simulate sensor failure and apply tranformations to the data.

03 - Parameter Estimation: Parameter initialization using different methods, for single output regression using spectral mixture kernel and multioutput case using MOSM kernel.

04 - Model Training: Training of models while keeping certain parameters fixed.

05 - Error Metrics Obtain different metrics to compare models.

Examples

Currency Exchange: Model training, interpretation and comparison on a dataset of 11 currency exchange rates (against the dollar) from 2017 and 2018. These 11 channels are fitted with the MOSM, SM-LMC, CSM, and CONV kernels and their results are compared and interpreted.

Gold, Oil, NASDAQ, USD-index: The commodity indices for gold and oil, together with the indices for the NASDAQ and the USD against a basket of other currencies, we train multiple models to find correlations between the macro economic indicators.

Human Activity Recognition: Using the Inertial Measurement Unit (IMU) of an Apple iPhone 4, the accelerometer, gyroscope and magnetometer 3D data were recorded for different activities resulting in nine channels.

Seasonal C02 and Airline passangers: Regression for 2 datasets using a single output spectral mixture, first the Mauna Loa C02 concentration and the second the passangers in a airline.

Documentation

See the API documentation for documentation of our toolkit, including usage and examples of functions and classes.

Authors

  • Taco de Wolff
  • Alejandro Cuevas
  • Felipe Tobar

Contributing

We accept and encourage contributions to the toolkit in the form of pull requests (PRs), bug reports and discussions (GitHub issues). TODO

Citing

TODO

License

Released under the MIT license.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Files for mogptk, version 0.1.2
Filename, size File type Python version Upload date Hashes
Filename, size mogptk-0.1.2-py3-none-any.whl (55.7 kB) File type Wheel Python version py3 Upload date Hashes View
Filename, size mogptk-0.1.2.tar.gz (48.3 kB) File type Source Python version None Upload date Hashes View

Supported by

Pingdom Pingdom Monitoring Google Google Object Storage and Download Analytics Sentry Sentry Error logging AWS AWS Cloud computing DataDog DataDog Monitoring Fastly Fastly CDN DigiCert DigiCert EV certificate StatusPage StatusPage Status page