Multi-Output Gaussian Process ToolKit

Paper - API Documentation - Tutorials & Examples

The Multi-Output Gaussian Process Toolkit is a Python toolkit for training and interpreting Gaussian process models with multiple data channels. It builds upon GPflow and TensorFlow to provide an easy way to train multi-output models effectively and interpret their results. The main authors are Taco de Wolff, Alejandro Cuevas, and Felipe Tobar as part of the Center for Mathematical Modelling at the University of Chile.

Installation

With Anaconda installed on your system, open a command prompt and create a virtual environment:

conda create -n myenv python=3.7
conda activate myenv

Here myenv is the name of your environment, and the Python version can be 3.6 or above. To use TensorFlow on the GPU, the easiest way is to install TensorFlow through conda (and not pip) before installing this toolkit. If you will only be using the CPU, you can skip this step.

conda install tensorflow-gpu

Next, install this toolkit. This automatically installs the necessary dependencies, such as GPflow 2 and TensorFlow 2.

pip install mogptk

See Tutorials & Examples to get started.

Introduction

This repository provides a toolkit to perform multi-output GP regression with kernels designed to exploit correlation information among channels in order to better model signals. The toolkit is mainly targeted at time series, and includes plotting functions for the case of a single input with multiple outputs (time series with several channels).

The main kernel is the Multi-Output Spectral Mixture (MOSM) kernel, which correlates every pair of data points (irrespective of their channel of origin) to model the signals. This kernel is specified in detail in the following publication: G. Parra, F. Tobar, Spectral Mixture Kernels for Multi-Output Gaussian Processes, Advances in Neural Information Processing Systems, 2017. Proceedings link: http://papers.nips.cc/paper/7245-spectral-mixture-kernels-for-multi-output-gaussian-processes

The kernel learns the cross-channel correlations of the data, so it is particularly well-suited for the task of signal reconstruction in the event of sporadic data loss. All other included kernels can be derived from the Multi Output Spectral Mixture kernel by restricting some parameters or applying some transformations.
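For intuition, the simplest member of this kernel family is the single-output spectral mixture kernel, k(tau) = sum_q w_q exp(-2 pi^2 tau^2 v_q) cos(2 pi tau mu_q). The following is a minimal pure-Python sketch of that formula, not the toolkit's implementation; the function names and parameter values are hypothetical and chosen only for illustration.

```python
import math

def spectral_mixture_kernel(tau, weights, means, variances):
    """Single-output spectral mixture kernel evaluated at lag tau.

    Each component q contributes
    w_q * exp(-2 * pi^2 * tau^2 * v_q) * cos(2 * pi * tau * mu_q).
    """
    return sum(
        w * math.exp(-2.0 * math.pi ** 2 * tau ** 2 * v)
        * math.cos(2.0 * math.pi * tau * mu)
        for w, mu, v in zip(weights, means, variances)
    )

def gram_matrix(xs, weights, means, variances):
    """Covariance matrix over all pairs of one-dimensional inputs."""
    return [
        [spectral_mixture_kernel(a - b, weights, means, variances) for b in xs]
        for a in xs
    ]

# Illustrative (assumed) parameters for a mixture with Q = 2 components.
weights = [1.0, 0.5]      # component magnitudes w_q
means = [0.1, 0.5]        # spectral means mu_q (frequencies)
variances = [0.01, 0.02]  # spectral variances v_q

xs = [0.0, 0.5, 1.0, 1.5]
K = gram_matrix(xs, weights, means, variances)
```

At lag zero the kernel equals the sum of the weights, and the matrix is symmetric because the kernel depends only on the lag through even functions.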

One of the main advantages of this toolkit is its GPU support, which lets the user train models through TensorFlow and speeds up computations significantly. It also includes sparse-variational GP regression functionality to decrease computation time even further.

See MOGPTK: The Multi-Output Gaussian Process Toolkit for our publication on arXiv.

Tutorials

00 - Quick Start: Short notebook showing the basic use of the toolkit.

01 - Data Loading: Functionality to load CSVs and DataFrames while using formatters for dates.

02 - Data Preparation: Handle data, remove observations to simulate sensor failure, and apply transformations to the data.

03 - Parameter Initialization: Parameter initialization using different methods, for single-output regression using the spectral mixture kernel and for the multi-output case using the MOSM kernel.

04 - Model Training: Training of models while keeping certain parameters fixed.

05 - Error Metrics: Obtain different metrics to compare models.
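Two of the ideas covered by these tutorials, removing observations to simulate sensor failure and scoring predictions with error metrics, can be sketched without the toolkit. The example below is a hedged illustration in plain Python: the helper names are hypothetical, linear interpolation stands in for a trained model, and MAE, RMSE and MAPE are assumed here as common choices of metric rather than taken from the toolkit's API.

```python
import math

def remove_range(xs, ys, start, end):
    """Split observations into kept and removed parts for start <= x <= end."""
    kept = [(x, y) for x, y in zip(xs, ys) if not (start <= x <= end)]
    removed = [(x, y) for x, y in zip(xs, ys) if start <= x <= end]
    return kept, removed

def interpolate(kept, x):
    """Linear interpolation between neighbouring kept points.

    This is only a placeholder predictor standing in for a trained model.
    """
    for (x0, y0), (x1, y1) in zip(kept, kept[1:]):
        if x0 <= x <= x1:
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
    raise ValueError("x lies outside the kept observations")

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def mape(y_true, y_pred):
    """Mean absolute percentage error (requires non-zero true values)."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)

# One synthetic channel: y = x^2 sampled at x = 0, 1, ..., 9.
xs = [float(i) for i in range(10)]
ys = [x ** 2 for x in xs]

# Simulate a sensor failure between x = 3 and x = 5.
kept, removed = remove_range(xs, ys, 3.0, 5.0)

# Score the placeholder predictions on the removed observations.
y_true = [y for _, y in removed]
y_pred = [interpolate(kept, x) for x, _ in removed]
scores = {"MAE": mae(y_true, y_pred), "RMSE": rmse(y_true, y_pred), "MAPE": mape(y_true, y_pred)}
```

Scoring on the removed observations rather than on the training points measures how well a model reconstructs the missing segment, which is the signal-reconstruction setting described in the introduction.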

Examples

Currency Exchange: Model training, interpretation and comparison on a dataset of 11 currency exchange rates (against the dollar) from 2017 and 2018. These 11 channels are fitted with the MOSM, SM-LMC, CSM, and CONV kernels and their results are compared and interpreted.

Gold, Oil, NASDAQ, USD-index: Using the commodity indices for gold and oil, together with the indices for the NASDAQ and the USD against a basket of other currencies, we train multiple models to find correlations between these macroeconomic indicators.

Human Activity Recognition: Using the Inertial Measurement Unit (IMU) of an Apple iPhone 4, the accelerometer, gyroscope and magnetometer 3D data were recorded for different activities resulting in nine channels.

Seasonal CO2 and Airline passengers: Regression for two datasets using a single-output spectral mixture kernel: first the Mauna Loa CO2 concentration, and second the number of passengers of an airline.

Documentation

See the API documentation for details on our toolkit, including usage and examples of functions and classes.

Authors

  • Taco de Wolff
  • Alejandro Cuevas
  • Felipe Tobar

Users

This is a list of users of this toolkit. Feel free to add your project!

Contributing

We accept and encourage contributions to the toolkit in the form of pull requests (PRs), bug reports and discussions (GitHub issues). It is advised to start an open discussion before proposing large PRs. Small PRs should address only one issue or add one new feature. All PRs should keep documentation and notebooks up to date.

Citing

Please use our publication on arXiv to cite our toolkit: MOGPTK: The Multi-Output Gaussian Process Toolkit. We recommend the following BibTeX entry:

@article{mogptk,
       author = {T. {de Wolff} and A. {Cuevas} and F. {Tobar}},
        title = {{MOGPTK: The Multi-Output Gaussian Process Toolkit}},
      journal = {arXiv e-prints},
         year = {2020},
          eid = {arXiv:2002.03471},
        pages = {arXiv:2002.03471},
archivePrefix = {arXiv},
       eprint = {2002.03471},
 primaryClass = {stat.ML},
          url = {https://github.com/GAMES-UChile/mogptk},
}

License

Released under the MIT license.
