Skip to main content

Multi-Output Gaussian Process ToolKit

Project description

Multi-Output Gaussian Process Toolkit

Paper - API Documentation - Tutorials & Examples

The Multi-Output Gaussian Process Toolkit is a Python toolkit for training and interpreting Gaussian process models with multiple data channels. It builds upon PyTorch to provide an easy way to train multi-output models effectively on CPUs and GPUs. The main authors are Taco de Wolff, Alejandro Cuevas, and Felipe Tobar as part of the Center for Mathematical Modelling at the University of Chile.

Installation

With Anaconda installed on your system, open a command prompt and create a virtual environment:

conda create -n myenv python=3.7
conda activate myenv

where myenv is the name of your environment, and where the version of Python could be 3.6 or above. Next we will install this toolkit and automatically install the necessary dependencies such as PyTorch.

pip install mogptk

In order to upgrade to a new version of MOGPTK or any of its dependencies, use --upgrade as follows:

pip install --upgrade mogptk

For developers of the library or for users who need the latest changes, we recommend cloning the git master or develop branch and to use the following command inside the repository folder:

pip install --upgrade -e .

See Tutorials & Examples to get started.

Introduction

This repository provides a toolkit to perform multi-output GP regression with kernels that are designed to utilize correlation information among channels in order to better model signals. The toolkit is mainly targeted to time-series, and includes plotting functions for the case of single input with multiple outputs (time series with several channels).

The main kernel corresponds to Multi Output Spectral Mixture Kernel, which correlates every pair of data points (irrespective of their channel of origin) to model the signals. This kernel is specified in detail in the following publication: G. Parra, F. Tobar, Spectral Mixture Kernels for Multi-Output Gaussian Processes, Advances in Neural Information Processing Systems, 2017. Proceedings link: https://papers.nips.cc/paper/7245-spectral-mixture-kernels-for-multi-output-gaussian-processes

The kernel learns the cross-channel correlations of the data, so it is particularly well-suited for the task of signal reconstruction in the event of sporadic data loss. All other included kernels can be derived from the Multi Output Spectral Mixture kernel by restricting some parameters or applying some transformations.

One of the main advantages of the present toolkit is the GPU support, which enables the user to train models through PyTorch, speeding computations significantly. It also includes sparse-variational GP regression functionality to decrease computation time even further.

See MOGPTK: The Multi-Output Gaussian Process Toolkit for our publication in Neurocomputing.

Implementation

Implemented models:

  • Exact
  • Snelson (E. Snelson, Z. Ghahramani, "Sparse Gaussian Processes using Pseudo-inputs", 2005)
  • OpperArchambeau (M. Opper, C. Archambeau, "The Variational Gaussian Approximation Revisited", 2009)
  • Titsias (Titsias, "Variational learning of induced variables in sparse Gaussian processes", 2009)
  • Hensman (J. Hensman, et al., "Scalable Variational Gaussian Process Classification", 2015)

Implemented likelihoods:

  • Gaussian
  • Student-T
  • Exponential
  • Laplace
  • Bernoulli
  • Beta
  • Gamma
  • Poisson
  • Weibull
  • Log-Logistic
  • Log-Gaussian
  • Chi
  • Chi-Squared

Tutorials

00 - Quick Start: Short notebook showing the basic use of the toolkit.

01 - Data Loading: Functionality to load CSVs and DataFrames while using formatters for dates.

02 - Data Preparation: Handle data, removing observations to simulate sensor failure and apply tranformations to the data.

03 - Parameter Initialization: Parameter initialization using different methods, for single output regression using spectral mixture kernel and multioutput case using MOSM kernel.

04 - Model Training: Training of models while keeping certain parameters fixed.

05 - Error Metrics Obtain different metrics in order to compare models.

06 - Custom Kernels and Mean Functions Use or create custom kernels as well as training custom mean functions.

Examples

Airline passengers: Regression using a single output spectral mixture on the yearly number of passengers of an airline.

Seasonal CO2 of Mauna-Loa: Regression using a single output spectral mixture on the CO2 concentration at Mauna-Loa throughout many years.

Currency Exchange: Model training, interpretation and comparison on a dataset of 11 currency exchange rates (against the dollar) from 2017 and 2018. These 11 channels are fitted with the MOSM, SM-LMC, CSM, and CONV kernels and their results are compared and interpreted.

Gold, Oil, NASDAQ, USD-index: The commodity indices for gold and oil, together with the indices for the NASDAQ and the USD against a basket of other currencies, we train multiple models to find correlations between the macro economic indicators.

Human Activity Recognition: Using the Inertial Measurement Unit (IMU) of an Apple iPhone 4, the accelerometer, gyroscope and magnetometer 3D data were recorded for different activities resulting in nine channels.

Bramblemet tidal waves: Tidal wave data set of four locations in the south of England. We model the tidal wave periods of approximately 12.5 hours using different multi-output Gaussian processes.

Documentation

See the API documentation for documentation of our toolkit, including usage and examples of functions and classes.

Authors

  • Taco de Wolff
  • Alejandro Cuevas
  • Felipe Tobar

Users

This is a list of users of this toolbox, feel free to add your project!

Contributing

We accept and encourage contributions to the toolkit in the form of pull requests (PRs), bug reports and discussions (GitHub issues). It is adviced to start an open discussion before proposing large PRs. For small PRs we suggest that they address only one issue or add one new feature. All PRs should keep documentation and notebooks up to date.

Citing

Please use our publication at arXiv to cite our toolkit: MOGPTK: The Multi-Output Gaussian Process Toolkit. We recommend the following BibTeX entry:

@article{mogptk,
    author = {T. {de Wolff} and A. {Cuevas} and F. {Tobar}},
    title = {{MOGPTK: The Multi-Output Gaussian Process Toolkit}},
    journal = "Neurocomputing",
    year = "2020",
    issn = "0925-2312",
    doi = "https://doi.org/10.1016/j.neucom.2020.09.085",
    url = "https://github.com/GAMES-UChile/mogptk"
}

License

Released under the MIT license.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

mogptk-0.3.0.tar.gz (77.7 kB view details)

Uploaded Source

Built Distribution

mogptk-0.3.0-py3-none-any.whl (124.7 kB view details)

Uploaded Python 3

File details

Details for the file mogptk-0.3.0.tar.gz.

File metadata

  • Download URL: mogptk-0.3.0.tar.gz
  • Upload date:
  • Size: 77.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.1 CPython/3.9.12

File hashes

Hashes for mogptk-0.3.0.tar.gz
Algorithm Hash digest
SHA256 849e06481391a65f23bf4754f334c3f4270613cec8a8555a95ab3804896b5ec3
MD5 3376c8d2194c24cf8de2060b1b98caab
BLAKE2b-256 4c7ba6e0175132aa949859eac6b58e664659e1c984b8d9b537698623f028a501

See more details on using hashes here.

Provenance

File details

Details for the file mogptk-0.3.0-py3-none-any.whl.

File metadata

  • Download URL: mogptk-0.3.0-py3-none-any.whl
  • Upload date:
  • Size: 124.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.1 CPython/3.9.12

File hashes

Hashes for mogptk-0.3.0-py3-none-any.whl
Algorithm Hash digest
SHA256 935394fa2d7fb16e3f165f11514f17b28ccb1918b0ce9ffe6435f8809e81bacd
MD5 e9db4f0874335655f7b55ee8257fdb12
BLAKE2b-256 295429d44494836ffdcaa14faa1d39c15e47335b0f53bee027fcb0f19b27dbc9

See more details on using hashes here.

Provenance

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page