Calibrated anomaly detection and probabilistic hyperdimensional computing in JAX. Conformal classifier, regressor, and one-class anomaly detector with finite-sample coverage; closed-form Gaussian and Dirichlet hypervectors; pytree-native end-to-end.

bayes-hdc — probabilistic hyperdimensional computing in JAX

Documentation · Quickstart in Colab · Examples · Benchmarks · Discussions

Hyperdimensional computing (HDC, also known as vector symbolic architectures) represents data as ~10,000-dimensional vectors combined with cheap elementwise algebra: fast, noise-robust, trivially parallel, and a natural fit for edge hardware. Its weak spot is that predictions come out as raw similarity scores with no notion of confidence. bayes-hdc is the first general-purpose library to fix that: hypervectors that carry distributions, calibrated probabilities, and conformal prediction with finite-sample coverage guarantees. It is JAX end to end — every type is a pytree, so jit, vmap, grad, and pmap compose with everything.
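To make the "cheap elementwise algebra" concrete, here is a minimal, dependency-free sketch of binding and bundling with bipolar hypervectors (a MAP-style model). It is an illustration only, not the bayes-hdc API: the helper names `random_hv`, `bind`, `bundle`, and `cosine` are hypothetical, and the library's own operations run on JAX arrays.

```python
import random

D = 10_000  # hypervector dimensionality, matching the ~10,000 quoted above

def random_hv(rng):
    """Random bipolar hypervector with entries in {-1, +1}."""
    return [rng.choice((-1, 1)) for _ in range(D)]

def bind(a, b):
    """Elementwise product: associates two hypervectors (self-inverse for bipolar)."""
    return [x * y for x, y in zip(a, b)]

def bundle(*hvs):
    """Elementwise sum followed by sign: superposes hypervectors into one."""
    return [1 if sum(col) >= 0 else -1 for col in zip(*hvs)]

def cosine(a, b):
    """Cosine similarity; for bipolar vectors both norms equal sqrt(D)."""
    return sum(x * y for x, y in zip(a, b)) / D

rng = random.Random(0)
country, currency = random_hv(rng), random_hv(rng)   # role vectors
mexico, peso = random_hv(rng), random_hv(rng)        # filler vectors
noise = random_hv(rng)                               # unrelated content

# Record = (country * mexico) + (currency * peso) + noise
record = bundle(bind(country, mexico), bind(currency, peso), noise)

# Unbinding with the role vector recovers a noisy copy of its filler.
query = bind(record, currency)
sim_peso = cosine(query, peso)       # clearly positive
sim_mexico = cosine(query, mexico)   # close to zero
```

The recovered vector is only similar to `peso`, not equal to it, which is exactly the raw-similarity output that the calibration layer described here is meant to turn into a confidence statement.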

pip install --pre bayes-hdc                              # --pre allows the 0.5.0a1 pre-release
pip install git+https://github.com/rlogger/bayes-hdc   # or the development version from GitHub

Anomaly detection with a guaranteed false-positive rate

The headline use case: one-class anomaly detection where the false-positive rate is guaranteed at your target alpha — finite-sample, distribution-free, not tuned by hand. No other HDC library ships this.

[Figure: conformal anomaly detection; the empirical false-positive rate tracks the target alpha.]

Copy-paste runnable:

import numpy as np
from bayes_hdc.sklearn import HDAnomalyDetector

rng = np.random.default_rng(0)
X_normal = rng.normal(size=(500, 16)).astype("float32")        # fit on normal data only
X_test   = np.vstack([rng.normal(size=(50, 16)),
                      rng.normal(loc=6.0, size=(50, 16))]).astype("float32")

det = HDAnomalyDetector(alpha=0.05).fit(X_normal)
labels = det.predict(X_test)        # +1 inlier / -1 outlier; marginal FP rate <= alpha
pvals  = det.score_samples(X_test)  # split-conformal p-values
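For readers new to conformal anomaly detection, the following sketch shows how a split-conformal p-value is usually computed from held-out calibration scores. It assumes a nonconformity score where larger means more anomalous; the score bayes-hdc actually uses is not stated here, and `conformal_pvalue` and `is_outlier` are hypothetical names.

```python
def conformal_pvalue(cal_scores, test_score):
    """Split-conformal p-value: rank of the test score among calibration scores.

    Scores are nonconformity scores (larger = more anomalous).
    """
    n = len(cal_scores)
    at_least = sum(1 for s in cal_scores if s >= test_score)
    return (at_least + 1) / (n + 1)

def is_outlier(cal_scores, test_score, alpha):
    """Flag a point as an outlier when its p-value is at most alpha."""
    return conformal_pvalue(cal_scores, test_score) <= alpha

# Hypothetical nonconformity scores from held-out normal data: 1, 2, ..., 99.
cal = list(range(1, 100))
p_typical = conformal_pvalue(cal, 50)    # 50 scores are >= 50, so (50 + 1) / 100
p_extreme = conformal_pvalue(cal, 500)   # no score is >= 500, so (0 + 1) / 100
```

If the test point is exchangeable with the calibration data, the probability that its p-value falls at or below alpha is at most alpha, which is where the marginal false-positive bound comes from.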

The JAX-native pipeline underneath (custom encoders, fit_anomaly_pipeline, Benjamini-Hochberg FDR control across a batch of queries) is walked through in tutorials/02_anomaly_detection.py. On one-class versions of three small standard datasets it has the best AUROC on two of three against IsolationForest, LOF, and OneClassSVM, while holding the false-positive rate at the target — a knob none of those baselines have. Numbers and harness: BENCHMARKS.md.
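The Benjamini-Hochberg step mentioned above can be sketched in a few lines. This is the textbook step-up procedure applied to a batch of p-values, written without dependencies; it is not the library's implementation, and the function name `benjamini_hochberg` and the example p-values are made up for illustration.

```python
def benjamini_hochberg(pvals, q):
    """Return the sorted indices of hypotheses rejected at FDR level q."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    # Find the largest 1-based rank k with p_(k) <= k * q / m.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if pvals[idx] <= rank * q / m:
            k_max = rank
    # Reject every hypothesis ranked at or below k_max.
    return sorted(order[:k_max])

# Hypothetical p-values for a batch of six queries.
pvals = [0.001, 0.8, 0.012, 0.6, 0.03, 0.45]
flagged = benjamini_hochberg(pvals, q=0.1)   # indices flagged as anomalies
```

Unlike thresholding each p-value at alpha, this controls the expected fraction of false alarms among the flagged queries rather than the per-query false-positive rate.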

Calibrated probabilities and prediction sets

Hypervectors can carry distributions (GaussianHV, DirichletHV) with closed-form moment propagation through bind and bundle, and any classifier's outputs can be wrapped with temperature scaling and split-conformal sets:

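from bayes_hdc import TemperatureCalibrator, ConformalClassifier

# logits_cal, y_cal: held-out calibration split; logits_test, y_test: test split
calibrator = TemperatureCalibrator.create().fit(logits_cal, y_cal)
probs_cal  = calibrator.calibrate(logits_cal)   # calibrated probabilities, calibration split
probs      = calibrator.calibrate(logits_test)  # calibrated probabilities, test split

conformal = ConformalClassifier.create(alpha=0.1).fit(probs_cal, y_cal)
sets      = conformal.predict_set(probs)        # (n, k) bool mask
coverage  = conformal.coverage(probs, y_test)   # >= 1-alpha in expectation (marginal)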
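As a rough picture of what a split-conformal prediction set involves, here is a dependency-free sketch using the common score s = 1 - p[true class]. Whether ConformalClassifier uses this score or another one is an assumption left open; `conformal_threshold` and `predict_set` below are hypothetical helpers, and the numbers are invented.

```python
import math

def conformal_threshold(probs_cal, y_cal, alpha):
    """Quantile of scores s_i = 1 - p_i[y_i], with the finite-sample correction."""
    scores = sorted(1.0 - p[y] for p, y in zip(probs_cal, y_cal))
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))   # 1-based rank of the quantile
    if k > n:
        return 1.0  # too few calibration points: every class gets included
    return scores[k - 1]

def predict_set(probs, threshold):
    """Boolean mask over classes: include class j when 1 - p[j] <= threshold."""
    return [(1.0 - pj) <= threshold for pj in probs]

# Hypothetical calibrated probabilities for a 3-class problem.
probs_cal = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1], [0.3, 0.3, 0.4], [0.6, 0.3, 0.1]]
y_cal = [0, 1, 2, 1]

qhat = conformal_threshold(probs_cal, y_cal, alpha=0.25)
mask = predict_set([0.5, 0.45, 0.05], qhat)   # classes 0 and 1 are kept
```

The `(n + 1)` in the rank is what gives the finite-sample guarantee: with exchangeable data, the true label lands in the set with probability at least 1 - alpha, averaged over calibration and test draws.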

The scikit-learn wrapper covers classification too — it encodes internally and slots into pipelines, cross_val_score, and GridSearchCV unchanged:

from bayes_hdc.sklearn import HDClassifier

HDClassifier(encoder="kernel").fit(X_train, y_train).predict_proba(X_test)
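The closed-form moment propagation mentioned at the top of this section can be illustrated for the simplest case: independent Gaussian components, binding as elementwise product, and bundling as elementwise sum. This is a sketch under those assumptions, not the GaussianHV implementation; `bind_moments` and `bundle_moments` are hypothetical names.

```python
def bind_moments(mu_a, var_a, mu_b, var_b):
    """Mean and variance of the elementwise product of independent Gaussians."""
    mean = [ma * mb for ma, mb in zip(mu_a, mu_b)]
    # Var(XY) = mu_a^2 var_b + mu_b^2 var_a + var_a var_b for independent X, Y.
    var = [ma**2 * vb + mb**2 * va + va * vb
           for ma, va, mb, vb in zip(mu_a, var_a, mu_b, var_b)]
    return mean, var

def bundle_moments(mu_a, var_a, mu_b, var_b):
    """Mean and variance of the elementwise sum of independent Gaussians."""
    mean = [ma + mb for ma, mb in zip(mu_a, mu_b)]
    var = [va + vb for va, vb in zip(var_a, var_b)]
    return mean, var

# Two tiny hypothetical hypervectors given as per-component mean and variance.
mu_a, var_a = [1.0, -1.0, 2.0], [0.25, 0.25, 1.0]
mu_b, var_b = [-1.0, 3.0, 0.5], [0.5, 1.0, 0.0]

bound_mean, bound_var = bind_moments(mu_a, var_a, mu_b, var_b)
sum_mean, sum_var = bundle_moments(mu_a, var_a, mu_b, var_b)
```

The sum of independent Gaussians is again Gaussian, so the bundle result is exact. The product is not Gaussian; the mean and variance above are exact, and treating the result as Gaussian is a moment-matching approximation.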

Benchmarks

Standard HDC datasets, 5 seeds, both encoders tuned with the same bandwidth search on identical splits (UCI-HAR uses the official subject-disjoint split). Full protocol and the anomaly table: BENCHMARKS.md.

| Dataset | bayes-hdc accuracy | TorchHD accuracy (tuned) | bayes-hdc ECE, raw → calibrated | Coverage @ α=0.1 |
|---|---|---|---|---|
| ISOLET | 0.895 ± 0.004 | 0.882 ± 0.006 | 0.845 → 0.022 | 0.901 |
| UCI-HAR | 0.849 ± 0.006 | 0.871 ± 0.005 | 0.633 → 0.031 | 0.904 |
| EMG gestures | 0.944 ± 0.014 | 0.892 ± 0.005 | 0.618 → 0.045 | 0.947 |

Accuracy is competitive — ahead on two, behind on one — and the right columns are the point: calibrated probabilities and coverage at the target, which the deterministic libraries don't provide. Every number reproduces from a committed script with embedded provenance (make bench-canonical).
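For reference, expected calibration error (ECE) is typically computed by binning predictions by confidence and averaging the gap between accuracy and confidence in each bin. The sketch below assumes ten equal-width bins and top-class confidence; the exact binning used for the benchmark numbers is defined in BENCHMARKS.md, not here, and the data is invented.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """ECE: bin-size-weighted mean of |accuracy - mean confidence| per bin."""
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)  # conf == 1.0 goes in the last bin
        bins[idx].append((conf, ok))
    n = len(confidences)
    ece = 0.0
    for members in bins:
        if not members:
            continue  # empty bins contribute nothing
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += len(members) / n * abs(accuracy - avg_conf)
    return ece

# Hypothetical top-class confidences and whether each prediction was right.
confidences = [0.95, 0.95, 0.95, 0.95, 0.55, 0.55]
correct     = [True, True, False, False, True, False]
ece = expected_calibration_error(confidences, correct)
```

In this toy case the model claims 95% confidence on predictions that are right only half the time, so the ECE is large (about 0.32); a well-calibrated model would have confidence close to accuracy in every bin.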

In the HDC library landscape

The deterministic substrate (eight VSA models: BSC, MAP, HRR, FHRR, BSBC, CGR, MCR, VTB) is comparable to TorchHD and HoloVec; the differentiation is the probabilistic and uncertainty-quantification layer.

| Library | Backend | VSA models | Probabilistic / UQ | Differentiable |
|---|---|---|---|---|
| TorchHD | PyTorch | 8 | | partial |
| HoloVec | NumPy / PyTorch / JAX | 8 | | partial |
| hdlib | NumPy | generic | | |
| vsapy | NumPy | 6 | | |
| NengoSPA | Nengo (spiking) | 3 | | |
| bayes-hdc | JAX | 8 | Gaussian/Dirichlet HVs, conformal classifier + regressor + anomaly detector | end-to-end |

Design rationale and per-primitive paper attributions: DESIGN.md · docs/LITERATURE_AUDIT.md.

Examples

emg_gesture_recognition.py: sEMG gestures with calibrated per-gesture probabilities
anomaly_detection_intrusion.py: network intrusion flags at a guaranteed FP rate
vision_action_policy.py: vision-action policy with per-DOF conformal intervals and abstention
kanerva_example.py: "What's the Dollar of Mexico?" role-filler analogy

Sixteen more in examples/, and two worked tutorials in tutorials/.

Status

Alpha (0.5.0a1): the API may shift before 1.0. 666 tests at 93% line coverage run on Ubuntu and macOS across Python 3.9–3.13 on every push; tests verify the VSA algebraic laws on randomized inputs, gradient correctness against finite differences, and the conformal coverage and FDR guarantees directly. Sharp edges: CI runs on CPU only, so GPU/TPU execution is not exercised there; the variational-training API is the most likely to change; and bayes_hdc.sklearn needs scikit-learn installed separately.

Pure Python on top of jax + numpy; no compiled extensions.

Contributing

Good first issues are scoped and mentored. Setup and style: CONTRIBUTING.md; paths to maintainership: COMMUNITY.md. Questions and show-and-tell go in Discussions. If the library is useful to you, consider starring the repo — it genuinely helps others find it.

Citing

@software{bayeshdc2026,
  author  = {Singh, Rajdeep},
  title   = {bayes-hdc: Calibrated, Differentiable Hyperdimensional Computing in JAX},
  url     = {https://github.com/rlogger/bayes-hdc},
  version = {0.5.0a1},
  year    = {2026}
}

Or use the "Cite this repository" button (backed by CITATION.cff).

License

MIT. See also: JAX · TorchHD · awesome-jax · Kleyko et al.'s HDC/VSA surveys.
