
Calibrated anomaly detection and probabilistic hyperdimensional computing in JAX. Conformal classifier, regressor, and one-class anomaly detector with finite-sample coverage; closed-form Gaussian and Dirichlet hypervectors; pytree-native end-to-end.


bayes-hdc — probabilistic hyperdimensional computing in JAX


Documentation · Quickstart in Colab · Examples · Benchmarks · Discussions

Hyperdimensional computing (HDC, also known as vector symbolic architectures) represents data as ~10,000-dimensional vectors combined with cheap elementwise algebra: fast, noise-robust, trivially parallel, and a natural fit for edge hardware. Its weak spot is that predictions come out as raw similarity scores with no notion of confidence. bayes-hdc is the first general-purpose library to fix that: hypervectors that carry distributions, calibrated probabilities, and conformal prediction with finite-sample coverage guarantees. It is JAX end to end — every type is a pytree, so jit, vmap, grad, and pmap compose with everything.

pip install git+https://github.com/rlogger/bayes-hdc   # not yet on PyPI
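To make "cheap elementwise algebra" concrete, here is a minimal sketch of binding, bundling, and similarity lookup in plain NumPy. It assumes bipolar (MAP-style) hypervectors and uses hypothetical helper names (`random_hv`, `bind`, `bundle`, `cosine`); it illustrates the general HDC idea and is not the bayes-hdc API.

```python
import numpy as np

D = 10_000                       # hypervector dimensionality
rng = np.random.default_rng(0)

def random_hv():
    """Random bipolar hypervector with entries in {-1, +1}."""
    return rng.choice(np.array([-1, 1]), size=D)

def bind(a, b):
    """Bind by elementwise multiplication (self-inverse for bipolar vectors)."""
    return a * b

def bundle(*hvs):
    """Bundle by elementwise majority vote (sign of the sum)."""
    return np.sign(np.sum(hvs, axis=0))

def cosine(a, b):
    """Cosine similarity between two hypervectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Roles and fillers are independent random hypervectors.
country, currency, capital = random_hv(), random_hv(), random_hv()
mexico, peso, mexico_city = random_hv(), random_hv(), random_hv()

# One hypervector stores the whole record as a bundle of role-filler bindings.
record = bundle(bind(country, mexico),
                bind(currency, peso),
                bind(capital, mexico_city))

# Unbinding with a role yields a noisy copy of its filler.
recovered = bind(record, currency)
sim_peso = cosine(recovered, peso)       # clearly above chance
sim_mexico = cosine(recovered, mexico)   # close to zero
```

With three bound pairs the majority vote agrees with each component in about three quarters of the coordinates, so the recovered filler stays well separated from unrelated vectors even though the record is a single vector.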

Anomaly detection with a guaranteed false-positive rate

The headline use case: one-class anomaly detection where the false-positive rate is guaranteed at your target alpha — finite-sample, distribution-free, not tuned by hand. No other HDC library ships this.

[Figure: Conformal anomaly detection: empirical false-positive rate tracks the target alpha]

Copy-paste runnable:

import numpy as np
from bayes_hdc.sklearn import HDAnomalyDetector

rng = np.random.default_rng(0)
X_normal = rng.normal(size=(500, 16)).astype("float32")        # fit on normal data only
X_test   = np.vstack([rng.normal(size=(50, 16)),
                      rng.normal(loc=6.0, size=(50, 16))]).astype("float32")

det = HDAnomalyDetector(alpha=0.05).fit(X_normal)
labels = det.predict(X_test)        # +1 inlier / -1 outlier; marginal FP rate <= alpha
pvals  = det.score_samples(X_test)  # split-conformal p-values
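The guarantee rests on split-conformal p-values. The sketch below shows that mechanism in plain NumPy under stated assumptions: the nonconformity score is a simple distance from the origin (a stand-in, not the hypervector score the library uses), and `conformal_pvalues` and `score` are hypothetical names rather than bayes-hdc functions.

```python
import numpy as np

def conformal_pvalues(cal_scores, test_scores):
    """Split-conformal p-values; a higher score means more anomalous."""
    cal = np.sort(np.asarray(cal_scores, dtype=float))
    test = np.asarray(test_scores, dtype=float)
    # Count calibration scores that are >= each test score.
    n_ge = cal.size - np.searchsorted(cal, test, side="left")
    return (1.0 + n_ge) / (cal.size + 1.0)

def score(X):
    """Stand-in nonconformity score: Euclidean distance from the origin."""
    return np.linalg.norm(X, axis=1)

rng = np.random.default_rng(0)
cal_scores = score(rng.normal(size=(500, 16)))   # held-out normal data only
p_normal = conformal_pvalues(cal_scores, score(rng.normal(size=(2000, 16))))
p_outlier = conformal_pvalues(cal_scores, score(rng.normal(loc=6.0, size=(200, 16))))

alpha = 0.05
fp_rate = float(np.mean(p_normal <= alpha))          # near alpha, up to sampling noise
detection_rate = float(np.mean(p_outlier <= alpha))  # fraction of shifted points flagged
```

Flagging a point when its p-value is at most alpha bounds the marginal false-positive rate by alpha whenever the calibration and test inliers are exchangeable, whatever score function is used.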

The JAX-native pipeline underneath (custom encoders, fit_anomaly_pipeline, Benjamini-Hochberg FDR control across a batch of queries) is walked through in tutorials/02_anomaly_detection.py. On one-class versions of three small standard datasets it has the best AUROC on two of three against IsolationForest, LOF, and OneClassSVM, while holding the false-positive rate at the target — a knob none of those baselines have. Numbers and harness: BENCHMARKS.md.
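For readers unfamiliar with Benjamini-Hochberg, the following is a minimal NumPy sketch of the step-up procedure applied to a batch of p-values. The function name `benjamini_hochberg` and its `q` parameter are illustrative assumptions; the tutorial's own pipeline may expose this differently.

```python
import numpy as np

def benjamini_hochberg(pvals, q=0.1):
    """Return a boolean mask of rejected hypotheses at FDR level q."""
    p = np.asarray(pvals, dtype=float)
    m = p.size
    order = np.argsort(p)                          # indices of p-values, ascending
    thresholds = q * np.arange(1, m + 1) / m       # i * q / m for rank i
    passed = p[order] <= thresholds
    reject = np.zeros(m, dtype=bool)
    if passed.any():
        k = int(np.max(np.nonzero(passed)[0]))     # largest rank under its threshold
        reject[order[: k + 1]] = True              # reject everything up to that rank
    return reject

# Five queries; the three smallest p-values are flagged at q = 0.05.
pvals = [0.045, 0.001, 0.5, 0.012, 0.025]
flagged = benjamini_hochberg(pvals, q=0.05)
```

Unlike thresholding each p-value at alpha, this controls the expected fraction of false alarms among the flagged queries, which is usually the more useful quantity when scoring a whole batch at once.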

Calibrated probabilities and prediction sets

Hypervectors can carry distributions (GaussianHV, DirichletHV) with closed-form moment propagation through bind and bundle, and any classifier's outputs can be wrapped with temperature scaling and split-conformal sets:

from bayes_hdc import TemperatureCalibrator, ConformalClassifier

# Assumed inputs: logits_cal, probs_cal, y_cal come from a held-out calibration split
# (probs_cal holds class probabilities for that split); logits_test, y_test are test data.
probs = TemperatureCalibrator.create().fit(logits_cal, y_cal).calibrate(logits_test)

conformal = ConformalClassifier.create(alpha=0.1).fit(probs_cal, y_cal)
sets      = conformal.predict_set(probs)        # (n, k) bool mask
coverage  = conformal.coverage(probs, y_test)   # >= 1-alpha in expectation (marginal)
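As a rough picture of what a split-conformal classifier does, here is a NumPy sketch using the common "one minus true-class probability" score. The score choice and the names `conformal_threshold`, `predict_set`, and `synthetic_probs` are assumptions made for illustration; the source does not say which score `ConformalClassifier` uses.

```python
import numpy as np

def conformal_threshold(probs_cal, y_cal, alpha=0.1):
    """Calibrate the score threshold from held-out probabilities and labels."""
    probs_cal = np.asarray(probs_cal, dtype=float)
    y_cal = np.asarray(y_cal)
    n = y_cal.size
    scores = 1.0 - probs_cal[np.arange(n), y_cal]   # low score = confident and correct
    rank = int(np.ceil((n + 1) * (1.0 - alpha)))    # finite-sample corrected rank
    if rank > n:
        return np.inf                               # too few points: include every class
    return float(np.sort(scores)[rank - 1])

def predict_set(probs, qhat):
    """(n, k) boolean mask of classes whose score is within the threshold."""
    return (1.0 - np.asarray(probs, dtype=float)) <= qhat

rng = np.random.default_rng(0)

def synthetic_probs(n, k=5):
    """Toy classifier output: softmax of noisy logits favoring the true class."""
    y = rng.integers(0, k, size=n)
    logits = rng.normal(size=(n, k))
    logits[np.arange(n), y] += 2.0
    exp = np.exp(logits - logits.max(axis=1, keepdims=True))
    return exp / exp.sum(axis=1, keepdims=True), y

probs_cal, y_cal = synthetic_probs(500)
probs_test, y_test = synthetic_probs(2000)
qhat = conformal_threshold(probs_cal, y_cal, alpha=0.1)
sets = predict_set(probs_test, qhat)
coverage = float(sets[np.arange(y_test.size), y_test].mean())  # near 1 - alpha
```

The guarantee is marginal: averaged over calibration and test draws, the true label lands in the set with probability at least 1 - alpha, but coverage on any single run fluctuates around that level.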

The scikit-learn wrapper covers classification too — it encodes internally and slots into pipelines, cross_val_score, and GridSearchCV unchanged:

from bayes_hdc.sklearn import HDClassifier

HDClassifier(encoder="kernel").fit(X_train, y_train).predict_proba(X_test)

Benchmarks

Standard HDC datasets, 5 seeds, both encoders tuned with the same bandwidth search on identical splits (UCI-HAR uses the official subject-disjoint split). Full protocol and the anomaly table: BENCHMARKS.md.

| Dataset | bayes-hdc accuracy | TorchHD accuracy (tuned) | bayes-hdc ECE, raw → calibrated | Coverage @ α=0.1 |
|---|---|---|---|---|
| ISOLET | 0.895 ± 0.004 | 0.882 ± 0.006 | 0.845 → 0.022 | 0.901 |
| UCI-HAR | 0.849 ± 0.006 | 0.871 ± 0.005 | 0.633 → 0.031 | 0.904 |
| EMG gestures | 0.944 ± 0.014 | 0.892 ± 0.005 | 0.618 → 0.045 | 0.947 |

Accuracy is competitive — ahead on two, behind on one — and the right columns are the point: calibrated probabilities and coverage at the target, which the deterministic libraries don't provide. Every number reproduces from a committed script with embedded provenance (make bench-canonical).
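The ECE column is expected calibration error. A minimal sketch of the usual equal-width binned estimator follows; the bin count and the name `expected_calibration_error` are assumptions, and the benchmark harness may bin differently.

```python
import numpy as np

def expected_calibration_error(probs, y, n_bins=10):
    """Binned ECE: weighted mean gap between confidence and accuracy."""
    probs = np.asarray(probs, dtype=float)
    y = np.asarray(y)
    conf = probs.max(axis=1)                              # top-class confidence
    correct = (probs.argmax(axis=1) == y).astype(float)   # 1.0 where prediction is right
    # Equal-width bins over [0, 1]; confidence 1.0 goes into the last bin.
    bins = np.minimum((conf * n_bins).astype(int), n_bins - 1)
    ece = 0.0
    for b in range(n_bins):
        mask = bins == b
        if mask.any():
            gap = abs(correct[mask].mean() - conf[mask].mean())
            ece += mask.mean() * gap                      # weight by bin occupancy
    return float(ece)

# Two predictions at 0.9 confidence (one wrong) and two at 0.6 (both right).
probs = [[0.9, 0.1], [0.9, 0.1], [0.6, 0.4], [0.6, 0.4]]
labels = [0, 1, 0, 0]
ece = expected_calibration_error(probs, labels)
```

In this toy case each bin holds half the samples and is off by 0.4, so the estimate is 0.4. A raw value near 0.8, as in the ISOLET row, means the reported confidences are far from the observed accuracy until they are recalibrated.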

In the HDC library landscape

The deterministic substrate (eight VSA models: BSC, MAP, HRR, FHRR, BSBC, CGR, MCR, VTB) is comparable to TorchHD and HoloVec; the differentiation is the probabilistic and uncertainty-quantification layer.

| Library | Backend | VSA models | Probabilistic / UQ | Differentiable |
|---|---|---|---|---|
| TorchHD | PyTorch | 8 | | partial |
| HoloVec | NumPy / PyTorch / JAX | 8 | | partial |
| hdlib | NumPy | generic | | |
| vsapy | NumPy | 6 | | |
| NengoSPA | Nengo (spiking) | 3 | | |
| bayes-hdc | JAX | 8 | Gaussian/Dirichlet HVs, conformal classifier + regressor + anomaly detector | end-to-end |

Design rationale and per-primitive paper attributions: DESIGN.md · docs/LITERATURE_AUDIT.md.
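To illustrate what closed-form moment propagation can look like for Gaussian hypervectors, the sketch below tracks a mean and variance per coordinate through an elementwise product (bind) and a sum (bundle). It assumes independent coordinates and a multiply-style bind; the function names are hypothetical and the library's GaussianHV may define its operations differently.

```python
import numpy as np

def bind_gaussian(mu_a, var_a, mu_b, var_b):
    """Mean and variance of the elementwise product of independent Gaussians."""
    mean = mu_a * mu_b
    var = var_a * var_b + var_a * mu_b**2 + var_b * mu_a**2
    return mean, var

def bundle_gaussian(mus, variances):
    """Mean and variance of the elementwise sum of independent Gaussians."""
    return np.sum(mus, axis=0), np.sum(variances, axis=0)

mu_a, var_a = np.array([1.0, -2.0, 0.5, 0.0]), np.array([0.1, 0.2, 0.3, 0.4])
mu_b, var_b = np.array([2.0, 1.0, -1.0, 3.0]), np.array([0.5, 0.1, 0.2, 0.3])

bound_mean, bound_var = bind_gaussian(mu_a, var_a, mu_b, var_b)
sum_mean, sum_var = bundle_gaussian([mu_a, mu_b], [var_a, var_b])

# Monte Carlo check of the bind formulas.
rng = np.random.default_rng(0)
a = rng.normal(mu_a, np.sqrt(var_a), size=(200_000, 4))
b = rng.normal(mu_b, np.sqrt(var_b), size=(200_000, 4))
mc_mean, mc_var = (a * b).mean(axis=0), (a * b).var(axis=0)
```

The product of two Gaussians is not itself Gaussian, so formulas like these give exact first and second moments rather than an exact distribution; treating the result as Gaussian again is a moment-matching approximation.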

Examples

  • emg_gesture_recognition.py: sEMG gestures with calibrated per-gesture probabilities
  • anomaly_detection_intrusion.py: network intrusion flags at a guaranteed FP rate
  • vision_action_policy.py: vision-action policy with per-DOF conformal intervals and abstention
  • kanerva_example.py: "What's the Dollar of Mexico?" role-filler analogy

Sixteen more in examples/, and two worked tutorials in tutorials/.

Status

Alpha (0.5.0a0): the API may shift before 1.0. 666 tests at 93% line coverage run on Ubuntu and macOS across Python 3.9–3.13 on every push; they verify the VSA algebraic laws on randomized inputs, gradient correctness against finite differences, and the conformal coverage and FDR guarantees directly. Sharp edges: CI runs on CPU only, so GPU/TPU paths are not exercised on accelerator hardware; the variational-training API is the most likely to change; and bayes_hdc.sklearn needs scikit-learn installed separately.

Pure Python on top of jax + numpy; no compiled extensions.
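As an illustration of what checking algebraic laws on randomized inputs can mean, here is a small NumPy sketch for a bipolar multiply-style bind. It is a hypothetical stand-in written for this page, not code from the project's test suite.

```python
import numpy as np

rng = np.random.default_rng(42)
D = 2048

def rand_hv():
    """Random bipolar hypervector with entries in {-1.0, +1.0}."""
    return rng.choice(np.array([-1.0, 1.0]), size=D)

def bind(a, b):
    """Elementwise-multiplication bind."""
    return a * b

def check_laws(trials=100):
    """Assert core bind laws on freshly sampled hypervectors; return True if all hold."""
    for _ in range(trials):
        a, b, c = rand_hv(), rand_hv(), rand_hv()
        assert np.array_equal(bind(a, b), bind(b, a))                    # commutative
        assert np.array_equal(bind(bind(a, b), c), bind(a, bind(b, c)))  # associative
        assert np.array_equal(bind(bind(a, b), b), a)                    # self-inverse
        assert np.array_equal(bind(a, b + c), bind(a, b) + bind(a, c))   # distributes over sum
        assert np.isclose(bind(a, b) @ bind(a, c), b @ c)                # preserves dot products
    return True

laws_hold = check_laws()
```

Because the entries are exactly plus or minus one, every identity here holds exactly in floating point, which is why the checks can use equality rather than tolerances.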

Contributing

Good first issues are scoped and mentored. Setup and style: CONTRIBUTING.md; paths to maintainership: COMMUNITY.md. Questions and show-and-tell go in Discussions. If the library is useful to you, consider starring the repo — it genuinely helps others find it.

Citing

@software{bayeshdc2026,
  author  = {R.S.},
  title   = {bayes-hdc: Calibrated, Differentiable Hyperdimensional Computing in JAX},
  url     = {https://github.com/rlogger/bayes-hdc},
  version = {0.5.0a0},
  year    = {2026}
}

Or use the "Cite this repository" button (backed by CITATION.cff).

License

MIT. See also: JAX · TorchHD · awesome-jax · Kleyko et al.'s HDC/VSA surveys.


