Python tools for the CSRK Rust Gaussian Process crate

(image interpolated from scan of Nighthawks by Edward Hopper -- 1942 -- public domain)

Gaussian Process Regression with Compactly Supported Radial Kernel

csrk is a Rust crate for large-scale Gaussian Process regression using compactly supported Wendland kernels, spatial hashing, and sparse LDL^T factorization. It enables training on tens of thousands of points and fast sampling and evaluation of GP realizations with near-constant per-query cost.

It is a CPU-based code implementing the Wendland kernels (piecewise polynomial kernels with compact support), using the sprs crate for sparse matrix operations and the sprs-ldl crate for the sparse Cholesky (LDL^T) decomposition.

This Python API calls out to that Rust crate and instantiates a hybrid Python/Rust Gaussian Process, which supports most of the methods and features of the Rust implementation.

Installation

Installing from the source distribution may require the Rust toolchain (the rustc compiler and the Cargo build tool).

From PyPI

pip install csrk

From source

git clone https://gitlab.com/xevra/csrk-py
cd csrk-py
pip install .

Example

from csrk import HybridGP

gp = HybridGP(
    x_train,    # NumPy array of shape (n_pts, n_dimensions) -- Array of arrays of points
    y_train,    # NumPy array of shape (n_pts,) -- Array of values
    y_err,      # NumPy array of shape (n_pts,) -- Training error -- how reliable is each point?
    scale,      # NumPy array of shape (n_dimensions,) -- determines sparsity
    whitenoise, # float -- Can safely be zero for non-noisy data
    order,      # int, one of (0, 1, 2, 3) -- Try 1 first
    )

y_evals = gp.predict_mean(x_evals)
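
As a minimal sketch of how the inputs above might be assembled, the following builds synthetic training data with the documented shapes. The helper names make_training_data and fit_and_predict, the toy target function, and the particular values chosen for scale, whitenoise, and order are assumptions for illustration; only the HybridGP constructor arguments and predict_mean come from the example above.

```python
import numpy as np


def make_training_data(n_pts=200, n_dimensions=2, seed=0):
    """Build synthetic inputs with the shapes listed in the example (illustrative only)."""
    rng = np.random.default_rng(seed)
    # Training locations, shape (n_pts, n_dimensions)
    x_train = rng.uniform(0.0, 1.0, size=(n_pts, n_dimensions))
    # A smooth toy target, shape (n_pts,); any scalar function would do
    y_train = np.sin(2.0 * np.pi * x_train).sum(axis=1)
    # Per-point training error, shape (n_pts,); value chosen arbitrarily
    y_err = np.full(n_pts, 1e-3)
    # Per-dimension kernel scale, shape (n_dimensions,); value chosen arbitrarily
    scale = np.full(n_dimensions, 0.2)
    return x_train, y_train, y_err, scale


def fit_and_predict(x_evals, whitenoise=0.0, order=1):
    """Hypothetical helper: train on the synthetic data and evaluate the mean."""
    from csrk import HybridGP  # imported lazily so the data helper works without csrk

    x_train, y_train, y_err, scale = make_training_data(n_dimensions=x_evals.shape[1])
    gp = HybridGP(x_train, y_train, y_err, scale, whitenoise, order)
    return gp.predict_mean(x_evals)
```

Smaller values of scale shrink the kernel support and so make the kernel matrix sparser; the value 0.2 here is only a starting point for experimentation.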

Features

  • compactly supported Wendland kernels
  • sparse kernel construction via spatial hashing
  • scalable sparse LDL^T training
  • serialization in HDF5
  • no dependency on scikit-sparse
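
To illustrate the spatial-hashing idea in the list above, here is a minimal Python sketch. It is not the crate's implementation: the function name neighbor_pairs and the use of a single isotropic support radius (rather than a per-dimension scale) are assumptions made to keep the example short. Points are binned into cells whose side equals the support radius, so only points in the same or adjacent cells can interact.

```python
from collections import defaultdict
from itertools import product

import numpy as np


def neighbor_pairs(x, support):
    """Return index pairs (i, j), i < j, with Euclidean distance below `support`.

    Hypothetical helper: bins points into cubic cells of side `support`, so each
    point is compared only against points in its own and adjacent cells.
    """
    keys = [tuple(k) for k in np.floor(x / support).astype(int).tolist()]
    cells = defaultdict(list)
    for i, key in enumerate(keys):
        cells[key].append(i)

    # All offsets to the 3**d cells surrounding (and including) a given cell
    offsets = list(product((-1, 0, 1), repeat=x.shape[1]))
    pairs = []
    for i, key in enumerate(keys):
        for off in offsets:
            neighbor_key = tuple(k + o for k, o in zip(key, off))
            for j in cells.get(neighbor_key, ()):
                if i < j and np.linalg.norm(x[i] - x[j]) < support:
                    pairs.append((i, j))
    return pairs
```

Each returned pair corresponds to one nonzero off-diagonal entry of a compactly supported kernel matrix, so the matrix can be assembled directly in sparse form without visiting all n^2 pairs.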

For more on the performance of the Rust GP and an overview of the modules and algorithms, see the Rust crate's GitLab repository.

Motivation

The Wendland kernels can be used for training and evaluating GPR interpolators in O(n * m), for n evaluation points and m nearest neighbors. The same sparsity carries over to the Cholesky decomposition of the kernel matrix.

When done properly, this process conserves both compute time and computer memory, as large (n x n) arrays are never allocated.
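
The effect of compact support can be seen in a small sketch. The function below evaluates the q = 1 piecewise polynomial kernel in the form given by Rasmussen and Williams, k(r) = (1 - r)_+^(j+1) ((j+1) r + 1) with j = floor(D/2) + 2; whether this matches the crate's order-1 kernel exactly is an assumption, as are the names wendland_c2 and kernel_fill_fraction. The demonstration deliberately builds a dense matrix, which is only acceptable here because the example is tiny.

```python
import numpy as np


def wendland_c2(r, n_dimensions):
    """Compactly supported piecewise polynomial kernel (q = 1); zero for r >= 1."""
    j = n_dimensions // 2 + 2  # floor(D/2) + q + 1 with q = 1
    r = np.asarray(r, dtype=float)
    # Clipping (1 - r) at zero implements the (.)_+ truncation
    return np.clip(1.0 - r, 0.0, None) ** (j + 1) * ((j + 1) * r + 1.0)


def kernel_fill_fraction(x, scale):
    """Fraction of nonzero kernel entries (dense matrix, for illustration only)."""
    # Scaled pairwise differences, shape (n, n, d)
    diff = (x[:, None, :] - x[None, :, :]) / scale
    r = np.sqrt((diff ** 2).sum(axis=-1))
    k = wendland_c2(r, x.shape[1])
    return np.count_nonzero(k) / k.size


rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, size=(200, 2))
print(kernel_fill_fraction(x, np.array([0.1, 0.1])))
```

With a scale that is small compared with the extent of the data, only a few percent of the entries are nonzero, which is what allows the sparse storage and factorization described above.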

Previously, I developed a separate Python library for doing this (gaussian-process-api). However, despite using sparse matrices in the Cholesky decomposition (via scikit-sparse) and a C extension backend for the kernel evaluations, that library still constructs dense intermediate arrays, which can strain memory resources.

By writing a new module in Rust, I would like to remove the dependency on scikit-sparse and avoid storing large dense matrices in memory at any point throughout the computation, allowing for a lightweight and fast Gaussian Process regression implementation.

Contributing

I am open to suggestions and pull requests.

Acknowledgements

My background in Gaussian Processes and the Wendland kernels comes from Rasmussen and Williams (2005). I thank the authors for publishing openly without charge.

My own work in implementing sparse Gaussian Processes for signal-to-noise estimation for binary black hole merger detectability with a single LIGO detector is briefly summarized in Appendix A of Delfavero et al. (2023).

I would like to acknowledge the work done by Esmaeilbeigi et al. (2025) for putting the advantages and limitations of the Wendland kernel, which I have stumbled through in practical implementations, into the vocabulary of higher mathematics.

I would also like to thank Nicolas Posner, who has accompanied my introduction to Rust, and whose blog and contributions to other modules encouraged me to learn Rust.

I would also like to thank Nick Fotopoulos for a thorough and constructive code review of the Rust crate!

