Behavioral Trust Clustering: a thermodynamic governance layer for production LLMs

snc-core

Behavioral Trust Clustering — a thermodynamic governance layer for production language models.

snc-core wraps any decoder-only LLM with an inference-time governance layer that cuts the hallucination rate among emitted answers by 52% in relative terms on the official HumanEval benchmark with Qwen2.5-Coder-7B (16.5% → 7.8%, $z = 2.12$, $p < 0.05$). The method is model-agnostic and retraining-free, and it exposes a single decision threshold $\theta$ that traces an interpretable Pareto frontier between coverage and precision.

The companion paper is included at paper/snc-trust-layer-paper.pdf and archived on Zenodo (DOI in the Citation section below).

Headline result

On the full HumanEval benchmark ($n = 164$) with Qwen2.5-Coder-7B:

Configuration	pass@1	hallucinations (of emitted answers)	net precision	$z$-stat (halluc. vs. vanilla)
Vanilla baseline	137/164 (83.5%)	27/164 (16.5%)	83.54%	n/a
Hybrid @ $\theta = 0.50$	136/164 (82.9%)	19/155 (12.3%)	87.74%	$+1.07$
Hybrid @ $\theta = 0.55$	132/164 (80.5%)	15/147 (10.2%)	89.80%	$+1.61$
Hybrid @ $\theta = 0.65$	106/164 (64.6%)	9/115 (7.8%)	92.17%	$+2.12$ (significant, $p < 0.05$)

At the conservative threshold ($\theta = 0.65$) the hallucination rate among emitted answers falls by 52% in relative terms (16.5% → 7.8%), which is statistically significant at the 5% level. Five vanilla failures are recovered (HE/91, /102, /123, /144, /160). The nine residual failures correspond to adversarial mode collapse; see paper Section 4.5.
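The $z$ values in the table can be checked by hand. The sketch below assumes they come from a pooled two-proportion z-test that compares vanilla hallucinations (27 of 164) with hybrid hallucinations among emitted answers. The emitted counts 155, 147, and 115 are inferred from the net precision column (136/155, 132/147, 106/115), and the function name is illustrative rather than part of snc-core.

```python
import math

def two_proportion_z(x1: int, n1: int, x2: int, n2: int) -> float:
    """Pooled two-proportion z statistic for H0: p1 == p2."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

# Vanilla: 27 hallucinations in 164 answers (nothing is withheld).
# Hybrid: hallucinations among emitted answers; denominators are inferred, see above.
for theta, wrong, emitted in [(0.50, 19, 155), (0.55, 15, 147), (0.65, 9, 115)]:
    z = two_proportion_z(27, 164, wrong, emitted)
    print(f"theta={theta:.2f}  z={z:+.2f}")
```

Under these assumptions the script reproduces the three reported statistics; only the last exceeds the one-sided 5% critical value of 1.645 (and the two-sided value of 1.96).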

Why

LLMs trained on next-token prediction confidently produce incorrect outputs. In regulated industries (banking, healthcare, legal compliance) the binding constraint is not raw accuracy but known precision conditional on emission. A model that abstains on 10% of queries and is correct on the other 90% is qualitatively different from one that is silently wrong on 10%, even at the same headline accuracy. snc-core converts a fraction of the second class into the first.

Install

pip install snc-core              # core (stdlib only, includes Ollama backend)
pip install "snc-core[openai]"    # + OpenAI-compatible backend (quoted so shells such as zsh do not expand the brackets)
pip install "snc-core[test]"      # + pytest for development

Python 3.9+. No mandatory dependencies beyond the standard library.

Quick start

from snc_core import HybridLayer, Decision
from snc_core.adapters import OllamaBackend

backend = OllamaBackend(model="qwen2.5-coder:7b")
hybrid = HybridLayer(backend, k=5, threshold=0.65, temperature=0.8)

result = hybrid.query("What is 17 * 24?")
if result.action == Decision.ADMIT:
    print(f"Answer: {result.answer}")
    print(f"Trust: {result.decision.trust:.3f}")
else:
    print("I do not know.")

The same interface works for any backend that satisfies the LLMBackend protocol:

from snc_core.adapters import OpenAIBackend

backend = OpenAIBackend(model="gpt-4o-mini", api_key="sk-...")
hybrid = HybridLayer(backend, k=5, threshold=0.65)

How it works

The layer composes three signals (full derivation in the paper).

Layer 1: confidence elicitation. A system prompt instructs the model to report its self-assessed confidence in the canonical form CONFIDENCE: <0..1> under an asymmetric utility function (correct = +1, wrong = −3, empty = 0).
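Under the stated utilities, answering with a probability $p$ of being correct has expected utility $p \cdot (+1) + (1 - p) \cdot (-3) = 4p - 3$, which beats an empty answer (utility 0) only when $p > 0.75$. The sketch below parses the canonical confidence line and evaluates that expectation. The regular expression and both function names are illustrative assumptions; the parser inside snc-core may differ.

```python
import re
from typing import Optional

# Matches a line such as "CONFIDENCE: 0.85" anywhere in the model output.
_CONFIDENCE_RE = re.compile(r"CONFIDENCE:\s*([01](?:\.\d+)?)", re.IGNORECASE)

def parse_confidence(text: str) -> Optional[float]:
    """Return the self-reported confidence in [0, 1], or None if absent or out of range."""
    match = _CONFIDENCE_RE.search(text)
    if match is None:
        return None
    value = float(match.group(1))
    return value if 0.0 <= value <= 1.0 else None

def expected_utility(p: float, reward: float = 1.0, penalty: float = -3.0) -> float:
    """Expected utility of answering when the answer is correct with probability p."""
    return p * reward + (1.0 - p) * penalty

reply = "def add(a, b):\n    return a + b\nCONFIDENCE: 0.85"
p = parse_confidence(reply)
print(p, expected_utility(p) > 0.0)  # answering pays off only when p > 0.75
```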

Behavioral clustering. $K = 5$ candidates are sampled at temperature $0.8$. They are clustered by output equivalence on probe inputs extracted automatically from the test specification. Two implementations of the same algorithm in different syntactic forms collapse into the same cluster.
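A minimal sketch of clustering by output equivalence, assuming each candidate is already available as a Python callable and each probe is a tuple of positional arguments. snc-core extracts probes from the test specification automatically; the helper names and data layout here are hypothetical.

```python
from collections import defaultdict
from typing import Callable, Hashable, List, Sequence, Tuple

def behavior_signature(func: Callable, probes: Sequence[Tuple]) -> Tuple[Hashable, ...]:
    """Run func on every probe and record the result, or the exception type if it raises."""
    signature = []
    for args in probes:
        try:
            signature.append(("ok", repr(func(*args))))
        except Exception as exc:  # a crash is also observable behavior
            signature.append(("error", type(exc).__name__))
    return tuple(signature)

def cluster_by_behavior(candidates: Sequence[Callable], probes: Sequence[Tuple]) -> List[List[int]]:
    """Group candidate indices whose signatures agree on all probes, largest cluster first."""
    clusters = defaultdict(list)
    for index, func in enumerate(candidates):
        clusters[behavior_signature(func, probes)].append(index)
    return sorted(clusters.values(), key=len, reverse=True)

candidates = [
    lambda xs: sum(xs),
    lambda xs: sum(x for x in xs),  # different syntax, same behavior
    lambda xs: max(xs),             # different behavior (and raises on an empty list)
]
probes = [([1, 2, 3],), ([5],), ([],)]
print(cluster_by_behavior(candidates, probes))
```

The first two candidates land in one cluster because they agree on every probe, even though their source text differs. Note that the probe `[5]` alone would not separate `sum` from `max`, which is why the choice of probes matters.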

Trust thermodynamics. The trust score is

$$T = \mathrm{PPV} \cdot \exp(-\sigma_{\mathrm{calib}} \cdot T_{\mathrm{comp}})$$

with computational temperature $T_{\mathrm{comp}}$ adaptive to PPV by default. The score reduces to PPV under perfect inter-sample agreement and discounts toward zero as candidates diverge. The decision is a comparison against the user-supplied threshold $\theta$.
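The formula can be evaluated directly. The sketch below is a standalone illustration, not the package's `trust_thermodynamic` function, whose signature is not shown here. It assumes a fixed $T_{\mathrm{comp}} = 1$ instead of the adaptive default, treats $\sigma_{\mathrm{calib}}$ as a non-negative dispersion that is zero under perfect agreement, and assumes the layer admits when $T \ge \theta$.

```python
import math

def trust_score(ppv: float, sigma_calib: float, t_comp: float) -> float:
    """T = PPV * exp(-sigma_calib * t_comp), following the formula above."""
    if not 0.0 <= ppv <= 1.0:
        raise ValueError("ppv must lie in [0, 1]")
    if sigma_calib < 0.0 or t_comp < 0.0:
        raise ValueError("sigma_calib and t_comp must be non-negative")  # assumption
    return ppv * math.exp(-sigma_calib * t_comp)

theta = 0.65
for sigma in (0.0, 0.2, 0.6):
    t = trust_score(ppv=0.9, sigma_calib=sigma, t_comp=1.0)
    decision = "ADMIT" if t >= theta else "ABSTAIN"  # assumed direction of the comparison
    print(f"sigma={sigma:.1f}  T={t:.3f}  {decision}")
```

With zero dispersion the score equals PPV; as the dispersion grows, the same PPV is discounted until it falls below the threshold.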

The closed-form score admits an exact thermodynamic phase diagram with universal order parameter $X = T_{\mathrm{comp}} \cdot \sigma_{\mathrm{calib}}$ and critical line $X_c = \ln(\mathrm{PPV}/\theta)$ — see paper Section 3.6.
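The critical line follows from the admission condition in one step. Assuming the layer admits when $T \ge \theta$ (the direction of the comparison is an assumption here) and that both PPV and $\theta$ are positive:

$$T \ge \theta \iff \mathrm{PPV} \cdot e^{-X} \ge \theta \iff e^{-X} \ge \frac{\theta}{\mathrm{PPV}} \iff X \le \ln\left(\frac{\mathrm{PPV}}{\theta}\right) = X_c$$

For example, with $\mathrm{PPV} = 0.9$ and $\theta = 0.65$, $X_c = \ln(0.9/0.65) \approx 0.325$. When $\mathrm{PPV} < \theta$, $X_c$ is negative, so no non-negative $X$ leads to admission.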

Public API

`HybridLayer`

The primary class. Wraps any LLMBackend and exposes a query(prompt) -> HybridResult method.

HybridLayer(
    backend: LLMBackend,
    k: int = 5,
    threshold: float = 0.5,
    temperature: float = 0.8,
    max_tokens: int = 400,
    system_prompt: str = LAYER1_SYSTEM_PROMPT_EN,
    behavior_extractor: Optional[Callable[[str], Tuple]] = None,
    t_comp: Optional[float] = None,
)

`behavioral_governance`

Apply the governor offline to a population of pre-computed candidates. Useful for replay analysis, threshold sweeps over cached generations, and unit testing.

`trust_thermodynamic`

The closed-form trust score, exposed for direct inspection or composition.

Backends

OllamaBackend(model, base_url, request_timeout) — Ollama-served local models
OpenAIBackend(model, api_key, base_url) — OpenAI-compatible APIs (also vLLM, LMStudio, OpenRouter)
CallableBackend(func) — wrap any user-defined callable

To add a backend, implement the LLMBackend protocol from snc_core.adapters.base.

Tuning the threshold

The threshold $\theta$ is the only operational hyperparameter. Three regimes have been characterized empirically on HumanEval:

Regime	$\theta$	Use case	Result on HumanEval
Aggressive	0.50	Internal tooling, downstream review cheap	95% coverage, 12.3% halluc, 87.74% precision
Balanced	0.55	Customer-facing, false positives visible	90% coverage, 10.2% halluc, 89.80% precision
Conservative	0.65	Banking, healthcare, legal — high-cost errors	70% coverage, 7.8% halluc, 92.17% precision

Calibrate against a small held-out set with the operator's empirical cost ratios.
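One way to carry out this calibration is a grid search that minimizes expected cost per query. The sketch below assumes each held-out record is a pair of trust score and correctness flag, that a wrong emitted answer and an abstention each have an operator-supplied cost, and that answers are admitted when the score is at least $\theta$. The data and function names are illustrative and not part of snc-core.

```python
from typing import List, Sequence, Tuple

Record = Tuple[float, bool]  # (trust score, was the answer correct?)

def expected_cost(records: Sequence[Record], theta: float,
                  cost_wrong: float, cost_abstain: float) -> float:
    """Mean cost per query: admitted-but-wrong answers and abstentions are penalized."""
    total = 0.0
    for trust, correct in records:
        if trust >= theta:
            total += 0.0 if correct else cost_wrong
        else:
            total += cost_abstain
    return total / len(records)

def calibrate_threshold(records: Sequence[Record], grid: Sequence[float],
                        cost_wrong: float, cost_abstain: float) -> float:
    """Return the grid threshold with the lowest expected cost (ties go to the lower theta)."""
    return min(grid, key=lambda theta: (expected_cost(records, theta, cost_wrong, cost_abstain), theta))

# Illustrative held-out set; real scores would come from cached generations.
held_out: List[Record] = [(0.92, True), (0.81, True), (0.70, True), (0.62, False),
                          (0.58, True), (0.53, False), (0.47, False), (0.40, False)]
grid = [0.50, 0.55, 0.65]

print(calibrate_threshold(held_out, grid, cost_wrong=10.0, cost_abstain=1.0))  # errors are expensive
print(calibrate_threshold(held_out, grid, cost_wrong=1.0, cost_abstain=2.0))   # abstentions are expensive
```

When errors cost ten times an abstention, the conservative threshold wins on this toy set; when abstentions cost more than errors, the aggressive one does.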

Reproducing the paper results

The full experimental pipeline is included under benchmarks/:

# 1. Smoke test
python benchmarks/01_smoke_test.py

# 2. Hybrid wrapper validation on small probe set
python benchmarks/02_snc_qwen.py

# 3. HumanEval full benchmark with threshold sweep
python benchmarks/06_humaneval_full.py

All experiments use seed 42. The candidate cache is preserved as JSONL for offline analysis. Expected wall-clock time on a CPU-only consumer workstation: approximately 8–10 hours for the full HumanEval evaluation.

Citation

@article{culotta2026btc,
  title  = {Behavioral Trust Clustering: A Thermodynamic Governance Layer for Production LLMs},
  author = {Culotta, Daniel},
  year   = {2026},
  doi    = {10.5281/zenodo.PLACEHOLDER},
  url    = {https://doi.org/10.5281/zenodo.PLACEHOLDER}
}

Limitations

The package halves but does not eliminate hallucinations. The residual failure mode, adversarial mode collapse, occurs when a majority of stochastic candidates make the same systematic error. We identify nine such cases in HumanEval (paper Appendix B). Mitigation requires external information — typically a property-based test that exercises the systematic error.

The token cost of the hybrid configuration is approximately $K$ times the vanilla cost, modulo savings from clustering and short candidate emissions. On HumanEval the empirical overhead was $2.27\times$.

The behavioral clustering relies on probe inputs that exercise the relevant equivalence. For tasks under-determined by their test specification, the method degrades to structural clustering.

License

MIT. See LICENSE.

Changelog

See CHANGELOG.md.

Version 0.4.0, released May 4, 2026.

