
GPie


Gaussian Process tiny explorer

  • simple: an intuitive syntax inspired by scikit-learn
  • powerful: a compact core of expressive abstractions
  • extensible: a modular design for effortless composition
  • lightweight: a minimal set of dependencies {standard library, numpy, scipy}

This is an ongoing research project with many parts still under construction, so please expect bugs and sharp edges.

Features

  • several "avant-garde" kernels, such as the spectral kernel and the neural kernel, allow for exploration of new ideas
  • each kernel implements an anisotropic variant besides the isotropic one to support automatic relevance determination
  • a full-fledged toolkit of kernel operators enables all sorts of "kernel engineering", for example, handcrafting composite kernels based on expert knowledge or exploiting the special structure of a dataset
  • core computations such as likelihood and gradient are carefully formulated for speed and stability (see the sketch after this list)
  • sampling inference embraces a probabilistic perspective in learning and prediction to promote robustness
  • the Bayesian optimizer offers a principled strategy to globally optimize expensive, black-box objectives
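
As an illustration of the numerical care mentioned above, the log marginal likelihood of a Gaussian process is typically computed via a Cholesky factorization rather than an explicit inverse or determinant. Below is a minimal NumPy/SciPy sketch of this standard formulation; it is illustrative only, not necessarily GPie's exact internals.

import numpy as np
from scipy.linalg import cho_factor, cho_solve

def log_marginal_likelihood(K, y, noise=1e-8):
    # K: (n, n) kernel matrix; y: (n,) targets
    n = y.shape[0]
    # jitter the diagonal so the factorization stays well conditioned
    L = cho_factor(K + noise * np.eye(n), lower=True)
    # alpha = K^{-1} y via two triangular solves, never an explicit inverse
    alpha = cho_solve(L, y)
    # log|K| = 2 * sum(log(diag(L))), avoiding an explicit determinant
    return (-0.5 * y @ alpha
            - np.sum(np.log(np.diag(L[0])))
            - 0.5 * n * np.log(2.0 * np.pi))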

Functionality

  • kernel functions
    • white kernel
    • constant kernel
    • radial basis function kernel
    • rational quadratic kernel
    • Matérn kernel
      • Ornstein-Uhlenbeck kernel
    • periodic kernel
    • spectral kernel
    • neural kernel
  • kernel operators (see the sketch after this list)
    • Hadamard (element-wise)
      • sum
      • product
      • exponentiation
    • Kronecker
      • sum
      • product
  • Gaussian process
    • regression
    • classification
  • t process
    • regression
    • classification
  • Bayesian optimizer
    • surrogate: Gaussian process, t process
    • acquisition: PI, EI, LCB, ES, KG
  • sampling inference
    • Markov chain Monte Carlo
      • Metropolis-Hastings
      • Hamiltonian + no-U-turn
    • simulated annealing
  • variational inference

Note: parts of the project marked in italics in the rendered README are under construction; see the roadmap below.
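
As a taste of the Hadamard operators above, kernels combine through overloaded arithmetic, in the same style as the examples below. In this minimal sketch, the import path and the exponentiation syntax are assumptions; check the documentation for the exact API.

# import path assumed; adjust per the documentation
from gpie import RBFKernel, PeriodicKernel, WhiteKernel

# Hadamard sum: a smooth signal plus white noise
k_sum = RBFKernel(l=1.0) + 0.1**2 * WhiteKernel()
# Hadamard product: a periodic pattern whose shape drifts over time
k_prod = RBFKernel(l=10.0) * PeriodicKernel(p=1.0, l=1.0)
# Hadamard exponentiation (syntax assumed): raise a kernel to a power
k_pow = RBFKernel(l=1.0) ** 2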

Examples

Gaussian process regression on Mauna Loa CO2

In this example, we use a Gaussian process to model the concentration of CO2 at Mauna Loa as a function of time.

# assumes GaussianProcessRegressor and the kernels below are imported from gpie
# (exact import paths per the documentation)

# handcraft a composite kernel based on expert knowledge
# long-term trend
k1 = 30.0**2 * RBFKernel(l=200.0)
# seasonal variations
k2 = 3.0**2 * RBFKernel(l=200.0) * PeriodicKernel(p=1.0, l=1.0)
# medium-term irregularities
k3 = 0.5**2 * RationalQuadraticKernel(m=0.8, l=1.0)
# noise
k4 = 0.1**2 * RBFKernel(l=0.1) + 0.2**2 * WhiteKernel()
# composite kernel
kernel = k1 + k2 + k3 + k4
# train GPR on the historical data (X: measurement times, y: CO2 concentrations)
gpr = GaussianProcessRegressor(kernel=kernel)
gpr.fit(X, y)

[figure] In the plot, scattered dots represent historical observations, and the shaded area shows the predictive interval (μ - σ, μ + σ) prophesied by a Gaussian process regressor trained on the historical data.

Sampling inference for Gaussian process regression

Here we use a synthesized dataset for ease of illustration and investigate sampling inference techniques such as Markov chain Monte Carlo. Since a Gaussian process defines the predictive distribution, we can get a sense of it by sampling from its prior distribution (before seeing the training set) and posterior distribution (after seeing the training set).

# with the current hyperparameter configuration,
# ... what is the prior distribution p(y_test)
y_prior = gpr.prior_predictive(X, n_samples=6)
# ... what is the posterior distribution p(y_test|y_train)
y_posterior = gpr.posterior_predictive(X, n_samples=4)

[figures: samples from the prior predictive distribution and the posterior predictive distribution]
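
These posterior samples can also summarize the predictive interval (μ - σ, μ + σ) shown in the first example. A minimal sketch follows, reusing gpr and X from above; the shape of the returned array, (n_samples, n_points), is an assumption.

import numpy as np

# draw many functions from the posterior predictive distribution
y_samples = gpr.posterior_predictive(X, n_samples=1000)
mu = y_samples.mean(axis=0)    # pointwise predictive mean
sigma = y_samples.std(axis=0)  # pointwise predictive standard deviation
lower, upper = mu - sigma, mu + sigma  # the shaded band in the plots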

We can also sample from the posterior distribution of a hyperparameter, which characterizes its uncertainty beyond a single point estimate such as MLE or MAP.

# invoke the MCMC sampler to draw hyperparameter values from the posterior distribution
hyper_posterior = gpr.hyper_posterior(n_samples=10000)

[figure: posterior distribution of a hyperparameter]

Bayesian optimization

We demonstrate a simple example of Bayesian optimization. It starts by exploring the objective function globally and shifts to exploiting "promising areas" as more observations are made.
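
The snippet below assumes a user-supplied objective f, bounds b, starting point x0, and callback. For concreteness, a toy setup might look like the following; the bounds format and the callback signature are assumptions, so check the API for the exact conventions.

import numpy as np

def f(x):
    # toy 1-D multimodal objective to be minimized
    x = np.asarray(x).ravel()[0]
    return np.sin(3.0 * x) + 0.5 * (x - 1.0) ** 2

b = np.array([[-2.0, 4.0]])  # one (lower, upper) pair per dimension (format assumed)
x0 = np.array([0.0])         # initial query point

def callback(result):
    # record intermediate results (signature assumed)
    print(result)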

# number of evaluations
n_evals = 10
# surrogate model (Gaussian process)
surrogate = GaussianProcessRegressor(1.0 * MaternKernel(d=5, l=1.0) +
                                     1.0 * WhiteKernel())
# Bayesian optimizer
bayesopt = BayesianOptimizer(fun=f, bounds=b, x0=x0, n_evals=n_evals,
                             acquisition='lcb', surrogate=surrogate)
bayesopt.minimize(callback=callback)

[figure: progress of the Bayesian optimizer]

Backend

GPie makes extensive use of de facto standard scientific computing packages in Python:

  • numpy: linear algebra, stochastic sampling
  • scipy: gradient-based optimization, stochastic sampling

Installation

GPie requires Python 3.6 or greater. The easiest way to install GPie is from a prebuilt wheel using pip:

pip install --upgrade gpie

You can also install from source to try out the latest features (requires pep517>=0.8.0 and setuptools>=40.9.0):

pip install --upgrade git+https://github.com/zackxzhang/gpie

Roadmap

  • implement Hamiltonian Monte Carlo and the no-U-turn sampler
  • add a demo on characteristics of different kernels
  • add a demo of quantified Occam's razor
  • implement Kronecker operators for scalable learning on grid data
  • replace Cholesky decomposition with Krylov subspace methods for speed
  • ...
