Calibrated anomaly detection and probabilistic hyperdimensional computing in JAX. Conformal classifier, regressor, and one-class anomaly detector with finite-sample coverage; closed-form Gaussian and Dirichlet hypervectors; pytree-native end-to-end.
Hyperdimensional computing (HDC, also known as vector symbolic architectures) represents data as ~10,000-dimensional vectors combined with cheap elementwise algebra: fast, noise-robust, trivially parallel, and a natural fit for edge hardware. Its weak spot is that predictions come out as raw similarity scores with no notion of confidence. bayes-hdc is the first general-purpose library to fix that: hypervectors that carry distributions, calibrated probabilities, and conformal prediction with finite-sample coverage guarantees. It is JAX end to end: every type is a pytree, so jit, vmap, grad, and pmap compose with every object the library exposes.
pip install git+https://github.com/rlogger/bayes-hdc
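To make the "cheap elementwise algebra" concrete, here is a minimal NumPy sketch of binding and bundling on random bipolar hypervectors (the MAP-style model). It does not use the bayes-hdc API; the helper names `random_hv`, `bind`, `bundle`, and `cosine` are illustrative assumptions, not library functions.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 10_000  # hypervector dimensionality


def random_hv():
    # Random bipolar hypervector with entries in {-1, +1}.
    return rng.choice(np.array([-1, 1]), size=D)


def bind(a, b):
    # Elementwise product; self-inverse for bipolar vectors because b * b == 1.
    return a * b


def bundle(*hvs):
    # Elementwise majority vote (sign of the sum); no ties for an odd count.
    return np.sign(np.sum(hvs, axis=0))


def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


a, b, c = random_hv(), random_hv(), random_hv()
s = bundle(a, b, c)

sim_member = cosine(s, a)            # clearly positive: a is part of the bundle
sim_random = cosine(s, random_hv())  # close to zero: unrelated vector
recovered = bind(bind(a, b), b)      # unbinding with b gives back a exactly
```

The raw similarities computed here are exactly the uncalibrated scores the paragraph above refers to: they rank candidates but say nothing about confidence.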
Anomaly detection with a guaranteed false-positive rate
The headline use case: one-class anomaly detection where the false-positive
rate is bounded by your target alpha. The bound is marginal, finite-sample, and
distribution-free (it assumes test inliers are exchangeable with the data used
to fit), not tuned by hand. No other HDC library ships this.
Copy-paste runnable:
import numpy as np
from bayes_hdc.sklearn import HDAnomalyDetector
rng = np.random.default_rng(0)
X_normal = rng.normal(size=(500, 16)).astype("float32") # fit on normal data only
X_test = np.vstack([rng.normal(size=(50, 16)),
                    rng.normal(loc=6.0, size=(50, 16))]).astype("float32")
det = HDAnomalyDetector(alpha=0.05).fit(X_normal)
labels = det.predict(X_test) # +1 inlier / -1 outlier; marginal FP rate <= alpha
pvals = det.score_samples(X_test) # split-conformal p-values
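For intuition about where the false-positive bound comes from, the following is a sketch of the standard split-conformal p-value, written in plain NumPy. It is not the library's implementation: the function name `conformal_pvalues` and the distance-from-origin score are assumptions chosen for illustration, and HDAnomalyDetector may compute its scores differently.

```python
import numpy as np


def conformal_pvalues(cal_scores, test_scores):
    # Higher score = more anomalous.
    # p = (1 + number of calibration scores >= test score) / (n + 1)
    cal = np.sort(np.asarray(cal_scores, dtype=float))
    test = np.asarray(test_scores, dtype=float)
    n = cal.size
    n_greater_equal = n - np.searchsorted(cal, test, side="left")
    return (1.0 + n_greater_equal) / (n + 1.0)


rng = np.random.default_rng(0)

# Hypothetical nonconformity score: Euclidean distance from the origin.
cal_scores = np.linalg.norm(rng.normal(size=(999, 16)), axis=1)
inlier_scores = np.linalg.norm(rng.normal(size=(2000, 16)), axis=1)
outlier_scores = np.linalg.norm(rng.normal(loc=6.0, size=(200, 16)), axis=1)

alpha = 0.05
# Flag a point as an outlier when its p-value is at most alpha.
fp_rate = float(np.mean(conformal_pvalues(cal_scores, inlier_scores) <= alpha))
detection_rate = float(np.mean(conformal_pvalues(cal_scores, outlier_scores) <= alpha))
```

Because an exchangeable inlier's score is equally likely to land at any rank among the calibration scores, its p-value falls at or below alpha with probability at most alpha. That is the marginal guarantee, and it holds for any score function.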
The JAX-native pipeline underneath (custom encoders, fit_anomaly_pipeline,
Benjamini-Hochberg FDR control across a batch of queries) is walked through
in tutorials/02_anomaly_detection.py.
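As a reference point for the batch setting, here is a sketch of the textbook Benjamini-Hochberg step-up procedure applied to a vector of p-values. The function name `benjamini_hochberg` and the example p-values are made up for illustration; how fit_anomaly_pipeline exposes FDR control is not shown in this README.

```python
import numpy as np


def benjamini_hochberg(pvals, q=0.1):
    # Returns a boolean mask: True where the null ("this is an inlier") is rejected.
    p = np.asarray(pvals, dtype=float)
    m = p.size
    order = np.argsort(p)
    thresholds = q * np.arange(1, m + 1) / m  # q * rank / m
    below = p[order] <= thresholds
    reject = np.zeros(m, dtype=bool)
    if below.any():
        k = np.max(np.nonzero(below)[0])  # largest rank that meets its threshold
        reject[order[: k + 1]] = True     # reject everything up to that rank
    return reject


# Illustrative p-values for a batch of six queries.
pvals = np.array([0.001, 0.008, 0.039, 0.041, 0.27, 0.6])
flags = benjamini_hochberg(pvals, q=0.05)
```

Unlike thresholding each p-value at alpha, this controls the expected fraction of false alarms among the flagged queries, which is usually the quantity of interest when many queries are screened at once.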
On one-class versions of three small standard datasets, the detector has the
best AUROC on two of three against IsolationForest, LOF, and OneClassSVM, while
holding the false-positive rate at the target, a knob none of those baselines
offers. Numbers and harness: BENCHMARKS.md.
Calibrated probabilities and prediction sets
Hypervectors can carry distributions (GaussianHV, DirichletHV) with
closed-form moment propagation through bind and bundle, and any classifier's
outputs can be wrapped with temperature scaling and split-conformal sets:
from bayes_hdc import TemperatureCalibrator, ConformalClassifier
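# Assumed inputs: logits_cal, y_cal (held-out calibration split); logits_test, y_test (test split).
calibrator = TemperatureCalibrator.create().fit(logits_cal, y_cal)
probs_cal = calibrator.calibrate(logits_cal)  # calibrated probabilities, calibration split
probs = calibrator.calibrate(logits_test)     # calibrated probabilities, test split
conformal = ConformalClassifier.create(alpha=0.1).fit(probs_cal, y_cal)
sets = conformal.predict_set(probs) # (n, k) bool mask
coverage = conformal.coverage(probs, y_test) # >= 1-alpha in expectation (marginal)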
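What the two wrappers do can be sketched in plain NumPy. This is an illustration under stated assumptions, not the library's code: temperature is fitted by grid search on the negative log-likelihood, the conformal score is one minus the true-class probability, the data are synthetic, and every function name below is hypothetical.

```python
import numpy as np


def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def fit_temperature(logits, y, grid=np.linspace(0.05, 10.0, 400)):
    # Choose the temperature that minimizes negative log-likelihood on held-out data.
    def nll(t):
        p = softmax(logits / t)
        return -np.mean(np.log(p[np.arange(len(y)), y] + 1e-12))

    return float(grid[np.argmin([nll(t) for t in grid])])


def conformal_threshold(probs_cal, y_cal, alpha):
    # Nonconformity score: 1 - probability assigned to the true class.
    scores = np.sort(1.0 - probs_cal[np.arange(len(y_cal)), y_cal])
    n = len(scores)
    k = int(np.ceil((n + 1) * (1 - alpha)))  # conformal rank
    return np.inf if k > n else scores[k - 1]


def predict_set(probs, qhat):
    # (n, k) boolean mask: keep every class whose score is within the threshold.
    return (1.0 - probs) <= qhat


# Synthetic, deliberately overconfident logits (the well-calibrated scale is 2 * z).
rng = np.random.default_rng(0)
n, k = 3000, 5
y = rng.integers(0, k, size=n)
z = rng.normal(size=(n, k))
z[np.arange(n), y] += 2.0
logits = 5.0 * z

# Separate splits for fitting the temperature, conformal calibration, and testing.
fit_idx, cal_idx, test_idx = np.split(np.arange(n), 3)
T = fit_temperature(logits[fit_idx], y[fit_idx])
probs_cal = softmax(logits[cal_idx] / T)
probs_test = softmax(logits[test_idx] / T)

alpha = 0.1
qhat = conformal_threshold(probs_cal, y[cal_idx], alpha)
sets = predict_set(probs_test, qhat)
coverage = float(sets[np.arange(len(test_idx)), y[test_idx]].mean())
```

The sketch uses a separate split for the conformal step so that the calibration scores stay exchangeable with the test scores; the coverage guarantee is marginal, so the empirical value fluctuates around 1 - alpha from run to run.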
The scikit-learn wrapper covers classification too — it encodes internally
and slots into pipelines, cross_val_score, and GridSearchCV unchanged:
from bayes_hdc.sklearn import HDClassifier
HDClassifier(encoder="kernel").fit(X_train, y_train).predict_proba(X_test)
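The closed-form moment propagation mentioned earlier in this section can be illustrated for the simplest case. The sketch below assumes independent elementwise Gaussians, binding as an elementwise product, and bundling as an unnormalized elementwise sum; whether GaussianHV uses exactly these conventions is an assumption, and the function names are hypothetical. A Monte Carlo estimate is computed alongside for comparison.

```python
import numpy as np


def bind_moments(mu_x, var_x, mu_y, var_y):
    # Product of independent variables:
    # E[xy] = mu_x * mu_y
    # Var[xy] = var_x * var_y + var_x * mu_y**2 + var_y * mu_x**2
    mean = mu_x * mu_y
    var = var_x * var_y + var_x * mu_y**2 + var_y * mu_x**2
    return mean, var


def bundle_moments(mu_x, var_x, mu_y, var_y):
    # Sum of independent variables: means add, variances add.
    return mu_x + mu_y, var_x + var_y


rng = np.random.default_rng(0)
D, n_samples = 8, 200_000
mu_x, mu_y = rng.uniform(-1, 1, size=D), rng.uniform(-1, 1, size=D)
var_x, var_y = rng.uniform(0.1, 0.5, size=D), rng.uniform(0.1, 0.5, size=D)

# Sample both hypervectors and compare empirical moments with the closed form.
x = rng.normal(mu_x, np.sqrt(var_x), size=(n_samples, D))
y = rng.normal(mu_y, np.sqrt(var_y), size=(n_samples, D))

bind_mean, bind_var = bind_moments(mu_x, var_x, mu_y, var_y)
mc_mean, mc_var = (x * y).mean(axis=0), (x * y).var(axis=0)
```

The mean and variance of the product are exact under independence, but the product itself is no longer Gaussian, so representing the bound vector as a Gaussian is a moment-matching approximation. The sum of independent Gaussians stays Gaussian.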
Benchmarks
Standard HDC datasets, 5 seeds, both encoders tuned with the same bandwidth search on identical splits (UCI-HAR uses the official subject-disjoint split). Full protocol and the anomaly table: BENCHMARKS.md.
| Dataset | bayes-hdc accuracy | TorchHD accuracy (tuned) | bayes-hdc ECE, raw → calibrated | Coverage @ α=0.1 |
|---|---|---|---|---|
| ISOLET | 0.895 ± 0.004 | 0.882 ± 0.006 | 0.845 → 0.022 | 0.901 |
| UCI-HAR | 0.849 ± 0.006 | 0.871 ± 0.005 | 0.633 → 0.031 | 0.904 |
| EMG gestures | 0.944 ± 0.014 | 0.892 ± 0.005 | 0.618 → 0.045 | 0.947 |
Accuracy is competitive — ahead on two, behind on one — and the right
columns are the point: calibrated probabilities and coverage at the target,
which the deterministic libraries don't provide. Every number reproduces
from a committed script with embedded provenance (make bench-canonical).
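For readers unfamiliar with the ECE column, here is a sketch of the usual expected calibration error with equal-width confidence bins. The choice of 15 bins and the function name are assumptions; the exact binning used by the benchmark harness is specified in BENCHMARKS.md, not here.

```python
import numpy as np


def expected_calibration_error(probs, y, n_bins=15):
    # Weighted average, over confidence bins, of |accuracy - mean confidence|.
    conf = probs.max(axis=1)
    correct = (probs.argmax(axis=1) == y).astype(float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    # Bin b covers (edges[b], edges[b + 1]]; bin 0 also includes 0.
    bin_idx = np.digitize(conf, edges[1:-1], right=True)
    ece = 0.0
    for b in range(n_bins):
        mask = bin_idx == b
        if mask.any():
            ece += mask.mean() * abs(correct[mask].mean() - conf[mask].mean())
    return float(ece)


# Four predictions at 90% confidence, only half of them correct.
probs = np.array([[0.9, 0.1]] * 4)
y = np.array([0, 0, 1, 1])
ece_overconfident = expected_calibration_error(probs, y)

# Fully confident and always right: no calibration gap.
ece_perfect = expected_calibration_error(np.array([[1.0, 0.0]] * 3), np.array([0, 0, 0]))
```

In the first toy case the gap between 90% confidence and 50% accuracy gives an ECE of 0.4, the same kind of overconfidence that the raw columns in the table report before calibration.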
In the HDC library landscape
The deterministic substrate (eight VSA models: BSC, MAP, HRR, FHRR, BSBC, CGR, MCR, VTB) is comparable to TorchHD and HoloVec; the differentiation is the probabilistic and uncertainty-quantification layer.
| Library | Backend | VSA models | Probabilistic / UQ | Differentiable |
|---|---|---|---|---|
| TorchHD | PyTorch | 8 | — | partial |
| HoloVec | NumPy / PyTorch / JAX | 8 | — | partial |
| hdlib | NumPy | generic | — | — |
| vsapy | NumPy | 6 | — | — |
| NengoSPA | Nengo (spiking) | 3 | — | — |
| bayes-hdc | JAX | 8 | Gaussian/Dirichlet HVs, conformal classifier + regressor + anomaly detector | end-to-end |
Design rationale and per-primitive paper attributions:
DESIGN.md · docs/LITERATURE_AUDIT.md.
Examples
| Example | What it shows |
|---|---|
| emg_gesture_recognition.py | sEMG gestures with calibrated per-gesture probabilities |
| anomaly_detection_intrusion.py | network intrusion flags at a guaranteed FP rate |
| vision_action_policy.py | vision-action policy with per-DOF conformal intervals and abstention |
| kanerva_example.py | "What's the Dollar of Mexico?" role-filler analogy |
Sixteen more in examples/, and two worked tutorials
in tutorials/.
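The "Dollar of Mexico" analogy is compact enough to sketch in a few lines of NumPy. This is an independent illustration with bipolar vectors, not the contents of kanerva_example.py; the role and filler names and the helper functions are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 10_000


def hv():
    # Random bipolar hypervector.
    return rng.choice(np.array([-1, 1]), size=D)


# Roles and fillers each get their own random hypervector.
NAME, CAPITAL, MONEY = hv(), hv(), hv()
fillers = {
    key: hv()
    for key in ["USA", "Washington", "Dollar", "Mexico", "MexicoCity", "Peso"]
}


def record(name, capital, money):
    # Bundle (majority vote) of three role-filler bindings.
    return np.sign(
        NAME * fillers[name] + CAPITAL * fillers[capital] + MONEY * fillers[money]
    )


usa = record("USA", "Washington", "Dollar")
mexico = record("Mexico", "MexicoCity", "Peso")

# Binding the two records pairs up fillers that share a role.
mapping = usa * mexico
query = fillers["Dollar"] * mapping


def nearest(v):
    # Clean-up memory: return the filler with the largest dot product.
    return max(fillers, key=lambda key: float(v @ fillers[key]))


answer = nearest(query)
```

The query vector is noisy, but its similarity to the Peso vector stands far above the near-zero similarity to every other filler, which is why a nearest-neighbor lookup recovers the answer.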
Status
Alpha (0.5.0a1): the API may shift before 1.0. 666 tests at 93% line
coverage run on Ubuntu and macOS across Python 3.9–3.13 on every push;
tests verify the VSA algebraic laws on randomized inputs, gradient correctness
against finite differences, and the conformal coverage and FDR guarantees
directly. Sharp edges: CI runs on CPU only, so GPU/TPU execution is not
exercised there; the variational-training API is the most likely to change;
and bayes_hdc.sklearn needs scikit-learn installed separately.
Pure Python on top of jax + numpy; no compiled extensions.
Contributing
Good first issues
are scoped and mentored. Setup and style: CONTRIBUTING.md;
paths to maintainership: COMMUNITY.md. Questions and
show-and-tell go in Discussions.
If the library is useful to you, consider starring the repo — it genuinely
helps others find it.
Citing
@software{bayeshdc2026,
  author  = {Singh, Rajdeep},
  title   = {bayes-hdc: Calibrated, Differentiable Hyperdimensional Computing in JAX},
  url     = {https://github.com/rlogger/bayes-hdc},
  version = {0.5.0a1},
  year    = {2026}
}
Or use the "Cite this repository" button (backed by CITATION.cff).
License
MIT. See also: JAX · TorchHD · awesome-jax · Kleyko et al.'s HDC/VSA surveys.