Multi-Output Gaussian Process Toolkit

Paper - API Documentation - Tutorials & Examples

The Multi-Output Gaussian Process Toolkit is a Python toolkit for training and interpreting Gaussian process models with multiple data channels. It builds upon PyTorch to provide an easy way to train multi-output models effectively on CPUs and GPUs. The main authors are Taco de Wolff, Alejandro Cuevas, and Felipe Tobar as part of the Center for Mathematical Modelling at the University of Chile.

Installation

With Anaconda installed on your system, open a command prompt and create a virtual environment:

conda create -n myenv python=3.7
conda activate myenv

where myenv is the name of your environment and the Python version can be 3.6 or above. Next, install the toolkit, which automatically installs the necessary dependencies such as PyTorch:

pip install mogptk

In order to upgrade to a new version of MOGPTK or any of its dependencies, use --upgrade as follows:

pip install --upgrade mogptk
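To confirm which version is present after installing or upgrading, a small check along the following lines may help. This is a minimal sketch using only the Python standard library; the helper name installed_version is made up for illustration and is not part of MOGPTK, and importlib.metadata requires Python 3.8 or newer.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if it is not installed.

    Hypothetical helper for illustration; not part of MOGPTK.
    """
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Prints the installed MOGPTK version, or None when the package is missing.
print(installed_version("mogptk"))
```

A result of None suggests that pip installed the package into a different environment than the one currently active.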

For developers of the library or for users who need the latest changes, we recommend cloning the master or develop branch from git and running the following command inside the repository folder:

pip install --upgrade -e .

See Tutorials & Examples to get started.

Introduction

This repository provides a toolkit to perform multi-output GP regression with kernels that are designed to utilize correlation information among channels in order to better model signals. The toolkit is mainly targeted at time series and includes plotting functions for the case of a single input with multiple outputs (time series with several channels).

The main kernel is the Multi-Output Spectral Mixture (MOSM) kernel, which correlates every pair of data points (irrespective of their channel of origin) to model the signals. This kernel is specified in detail in the following publication: G. Parra, F. Tobar, Spectral Mixture Kernels for Multi-Output Gaussian Processes, Advances in Neural Information Processing Systems, 2017. Proceedings link: https://papers.nips.cc/paper/7245-spectral-mixture-kernels-for-multi-output-gaussian-processes

The kernel learns the cross-channel correlations of the data, so it is particularly well-suited for the task of signal reconstruction in the event of sporadic data loss. All other included kernels can be derived from the Multi Output Spectral Mixture kernel by restricting some parameters or applying some transformations.
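To give a feel for what a spectral mixture kernel computes, the sketch below evaluates the standard single-output spectral mixture kernel, k(tau) = sum_q w_q exp(-2 pi^2 tau^2 v_q) cos(2 pi mu_q tau), in plain Python. This is an assumption-laden illustration of the single-output case only: it is not the multi-output kernel from the publication above and not the toolkit's implementation, and the function and parameter names (weights, means, variances) are chosen here for readability.

```python
import math


def spectral_mixture_kernel(tau, weights, means, variances):
    """Evaluate a 1-D spectral mixture kernel at lag `tau`.

    Each component q contributes weights[q] times a Gaussian envelope
    (controlled by variances[q]) times a cosine at frequency means[q].
    Illustrative sketch; names are not taken from MOGPTK.
    """
    return sum(
        w * math.exp(-2.0 * math.pi ** 2 * tau ** 2 * v) * math.cos(2.0 * math.pi * mu * tau)
        for w, mu, v in zip(weights, means, variances)
    )


def gram_matrix(xs, weights, means, variances):
    """Build the covariance matrix for inputs `xs` as a list of lists."""
    return [
        [spectral_mixture_kernel(a - b, weights, means, variances) for b in xs]
        for a in xs
    ]


# Two components: a slow and a faster oscillation (arbitrary example values).
K = gram_matrix([0.0, 1.0, 2.5], [1.0, 0.5], [0.1, 0.3], [0.01, 0.02])
```

At lag zero the kernel reduces to the sum of the weights, so the diagonal of K gives the total signal variance under this parametrization.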

One of the main advantages of the present toolkit is the GPU support, which enables the user to train models through PyTorch, speeding computations significantly. It also includes sparse-variational GP regression functionality to decrease computation time even further.
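Whether a GPU is actually visible to PyTorch can be checked before training. The sketch below relies only on PyTorch's own torch.cuda.is_available() and falls back to the CPU when PyTorch is not importable; the helper name pick_device is hypothetical, and how MOGPTK itself selects a device is described in its API documentation rather than here.

```python
def pick_device():
    """Return "cuda" if PyTorch reports a usable GPU, otherwise "cpu".

    Hypothetical helper for illustration; not part of MOGPTK.
    """
    try:
        import torch
    except ImportError:
        # PyTorch is not installed in this environment, so only the CPU is an option.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


print(pick_device())
```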

See MOGPTK: The Multi-Output Gaussian Process Toolkit for our publication in Neurocomputing.

Tutorials

00 - Quick Start: Short notebook showing the basic use of the toolkit.

01 - Data Loading: Functionality to load CSVs and DataFrames while using formatters for dates.

02 - Data Preparation: Handle data, remove observations to simulate sensor failure, and apply transformations to the data.

03 - Parameter Initialization: Parameter initialization using different methods, for single-output regression using the spectral mixture kernel and for the multi-output case using the MOSM kernel.

04 - Model Training: Training of models while keeping certain parameters fixed.

05 - Error Metrics: Obtain different metrics in order to compare models.

06 - Custom Kernels and Mean Functions: Use or create custom kernels as well as training custom mean functions.
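The ideas behind tutorials 02 and 05 can be sketched without the toolkit: hold out a range of observations to mimic a sensor failure, then score predictions on the held-out points. The helper names below (split_out_range, mae, rmse) and the prediction values are hypothetical placeholders in plain Python, not MOGPTK functions or real results; the notebooks show the toolkit's own way of doing this.

```python
import math


def split_out_range(xs, ys, start, end):
    """Split observations into (kept, removed), removing those with start <= x <= end."""
    kept = [(x, y) for x, y in zip(xs, ys) if not (start <= x <= end)]
    removed = [(x, y) for x, y in zip(xs, ys) if start <= x <= end]
    return kept, removed


def mae(y_true, y_pred):
    """Mean absolute error between two equally long sequences."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error between two equally long sequences."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


xs = [0, 1, 2, 3, 4, 5]
ys = [0.0, 1.0, 4.0, 9.0, 16.0, 25.0]

# Simulate a sensor outage between x = 2 and x = 3.
kept, removed = split_out_range(xs, ys, 2, 3)
held_out = [y for _, y in removed]

# Placeholder predictions standing in for a trained model's output.
predictions = [5.0, 7.0]
print(mae(held_out, predictions), rmse(held_out, predictions))
```

Scoring only the removed range measures how well a model reconstructs the gap, which is the signal reconstruction use case described in the introduction.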

Examples

Airline passengers: Regression using a single output spectral mixture on the yearly number of passengers of an airline.

Seasonal CO2 of Mauna-Loa: Regression using a single output spectral mixture on the CO2 concentration at Mauna-Loa throughout many years.

Currency Exchange: Model training, interpretation and comparison on a dataset of 11 currency exchange rates (against the dollar) from 2017 and 2018. These 11 channels are fitted with the MOSM, SM-LMC, CSM, and CONV kernels and their results are compared and interpreted.

Gold, Oil, NASDAQ, USD-index: Using the commodity indices for gold and oil, together with the indices for the NASDAQ and the USD against a basket of other currencies, we train multiple models to find correlations between the macroeconomic indicators.

Human Activity Recognition: Accelerometer, gyroscope and magnetometer 3D data were recorded with the Inertial Measurement Unit (IMU) of an Apple iPhone 4 for different activities, resulting in nine channels.

Bramblemet tidal waves: Tidal wave data set of four locations in the south of England. We model the tidal wave periods of approximately 12.5 hours using different multi-output Gaussian processes.

Documentation

See the API documentation for documentation of our toolkit, including usage and examples of functions and classes.

Authors

  • Taco de Wolff
  • Alejandro Cuevas
  • Felipe Tobar

Users

This is a list of users of this toolbox; feel free to add your project!

Contributing

We accept and encourage contributions to the toolkit in the form of pull requests (PRs), bug reports and discussions (GitHub issues). It is advised to start an open discussion before proposing large PRs. Small PRs should address only one issue or add only one new feature. All PRs should keep documentation and notebooks up to date.

Citing

Please use our publication at arXiv to cite our toolkit: MOGPTK: The Multi-Output Gaussian Process Toolkit. We recommend the following BibTeX entry:

@article{mogptk,
    author = {T. {de Wolff} and A. {Cuevas} and F. {Tobar}},
    title = {{MOGPTK: The Multi-Output Gaussian Process Toolkit}},
    journal = "Neurocomputing",
    year = "2020",
    issn = "0925-2312",
    doi = "https://doi.org/10.1016/j.neucom.2020.09.085",
    url = "https://github.com/GAMES-UChile/mogptk"
}

License

Released under the MIT license.
