Lightweight hallucination risk scoring for LLM outputs

hallx

Lightweight hallucination-risk scoring for production LLM pipelines.

Overview

Area	What Hallx Provides
Risk output	`confidence` (`0.0` to `1.0`) and `risk_level` (`high`, `medium`, `low`)
Diagnostics	`issues` list for tracing weak signals and policy failures
Actionability	`recommendation` payload (`action`, `suggested_temperature`, `suggestions`)
API modes	Sync and async checks
Integrations	Adapter-based and callable-based workflows
Operations	Feedback storage and calibration reporting

Hallx is designed as a guardrail layer before downstream actions such as API responses, automation steps, and database writes.

Installation

pip install hallx

Development install:

pip install -e ".[dev]"

Quick Start

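from hallx import Hallx

# Balanced profile; strict mode (blocking behavior) is off.
checker = Hallx(profile="balanced", strict=False)

# Score a response against its supporting context and the expected JSON schema.
result = checker.check(
    prompt="Summarize refund policy",
    response={"summary": "Refunds are allowed within 30 days."},
    context=["Refunds are allowed within 30 days of purchase."],
    schema={
        "type": "object",
        "properties": {"summary": {"type": "string"}},
        "required": ["summary"],
        "additionalProperties": False,
    },
)

print(result.confidence, result.risk_level)  # 0.0 to 1.0, and high/medium/low
print(result.scores)
print(result.issues)          # weak signals and policy failures
print(result.recommendation)  # action, suggested_temperature, suggestions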

Scoring Model

Hallx uses heuristic risk scoring across three signals:

Signal	Description
`schema`	JSON schema validity and null-injection checks
`consistency`	Stability across repeated generations
`grounding`	Claim-context alignment and source-integrity checks
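
To give a feel for the `consistency` signal, the sketch below scores stability across repeated generations as the mean pairwise string similarity. This is an illustration under assumptions, not Hallx's implementation: the helper name `consistency_score` and the choice of `difflib.SequenceMatcher` are hypothetical.

```python
from difflib import SequenceMatcher
from itertools import combinations


def consistency_score(generations: list[str]) -> float:
    """Mean pairwise similarity in [0.0, 1.0] (hypothetical helper, not Hallx's API)."""
    if len(generations) < 2:
        # Nothing to compare against. Assumption: treat a single run as stable;
        # a real checker might instead mark the check as skipped.
        return 1.0
    ratios = [
        SequenceMatcher(None, a, b).ratio()
        for a, b in combinations(generations, 2)
    ]
    return sum(ratios) / len(ratios)


stable = consistency_score(["Refunds within 30 days."] * 3)
unstable = consistency_score(
    ["Refunds within 30 days.", "No refunds are offered.", "Store credit only."]
)
print(stable, unstable)
```

Identical generations score `1.0`; the more the runs diverge, the lower the score.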

Confidence formula:

confidence = clamp(
  schema_score * w_schema +
  consistency_score * w_consistency +
  grounding_score * w_grounding,
  0.0, 1.0
)

Default (balanced) weights:

Weight	Value
`w_schema`	`0.34`
`w_consistency`	`0.33`
`w_grounding`	`0.33`
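
The formula and the balanced weights above can be worked through in a few lines of Python. This is a minimal sketch of the arithmetic only; the names `clamp`, `BALANCED_WEIGHTS`, and `confidence` are illustrative and are not taken from the Hallx source.

```python
from typing import Optional


def clamp(value: float, low: float = 0.0, high: float = 1.0) -> float:
    """Restrict value to the closed interval [low, high]."""
    return max(low, min(high, value))


# Default (balanced) weights from the table above.
BALANCED_WEIGHTS = {"schema": 0.34, "consistency": 0.33, "grounding": 0.33}


def confidence(scores: dict, weights: Optional[dict] = None) -> float:
    """Weighted sum of the three signal scores, clamped to [0.0, 1.0]."""
    weights = BALANCED_WEIGHTS if weights is None else weights
    total = sum(scores[name] * weights[name] for name in weights)
    return clamp(total)


# 1.0 * 0.34 + 0.8 * 0.33 + 0.6 * 0.33 = 0.802
example = confidence({"schema": 1.0, "consistency": 0.8, "grounding": 0.6})
print(round(example, 3))
```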

Risk mapping:

Confidence range	Risk
`< 0.40`	`high`
`>= 0.40` and `< 0.75`	`medium`
`>= 0.75`	`low`
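
The risk mapping reads as a pair of ordered threshold checks. A minimal sketch, assuming the boundaries in the table; the function name `risk_level` is illustrative rather than part of the documented API:

```python
def risk_level(confidence: float) -> str:
    """Map a confidence in [0.0, 1.0] to the risk labels from the table above."""
    if confidence < 0.40:
        return "high"
    if confidence < 0.75:
        return "medium"
    return "low"


for value in (0.39, 0.40, 0.74, 0.75):
    print(value, risk_level(value))
```

Note that both boundaries belong to the safer band: `0.40` is `medium` and `0.75` is `low`.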

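Note: skipped checks are penalized by default to avoid over-trusting partial analysis. The size of the penalty depends on the profile (see the skip penalty column under Safety Profiles).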

Safety Profiles

Profile	Goal	Default `consistency_runs`	Skip penalty
`fast`	lower latency	2	0.15
`balanced`	general-purpose	3	0.25
`strict`	stronger scrutiny	4	0.40

from hallx import Hallx

checker = Hallx(profile="strict")

You can override `weights`, `consistency_runs`, and `skip_penalty` as needed.
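
One way to picture profiles plus overrides is as a table of defaults with selected fields replaced. The sketch below models only the values in the profile table; `ProfileSettings`, `PROFILES`, and `resolve_profile` are hypothetical names, and how Hallx itself accepts overrides is not shown here.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ProfileSettings:
    consistency_runs: int
    skip_penalty: float


# Defaults copied from the Safety Profiles table.
PROFILES = {
    "fast": ProfileSettings(consistency_runs=2, skip_penalty=0.15),
    "balanced": ProfileSettings(consistency_runs=3, skip_penalty=0.25),
    "strict": ProfileSettings(consistency_runs=4, skip_penalty=0.40),
}


def resolve_profile(name: str, **overrides) -> ProfileSettings:
    """Return the named profile with any overridden fields applied."""
    try:
        base = PROFILES[name]
    except KeyError:
        raise ValueError(f"unknown profile: {name!r}") from None
    # dataclasses.replace raises TypeError for field names that do not exist.
    return replace(base, **overrides)


custom = resolve_profile("strict", consistency_runs=5)
print(custom)
```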

Workflow

1. Collect prompt, optional context, optional schema.
2. Generate a model response through an adapter or callable.
3. Run schema, consistency, and grounding checks.
4. Aggregate scores into `confidence` and `risk_level`.
5. Apply policy (proceed or retry) using `recommendation` metadata.
6. Optionally record reviewed outcomes for calibration.
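
The generate, check, and policy steps can be sketched as a small retry loop. Everything here is a stand-in so the example runs without a model or Hallx installed: `run_with_retries`, `fake_generate`, and `fake_check` are hypothetical, the result is modeled as a plain dict, and the action strings `"proceed"` and `"retry"` are assumed from the wording of the policy step.

```python
from typing import Callable


def run_with_retries(
    generate: Callable[[str, float], str],
    check: Callable[[str, str], dict],
    prompt: str,
    temperature: float = 0.7,
    max_attempts: int = 3,
) -> tuple:
    """Generate, check, and retry while the recommended action is not "proceed"."""
    for _ in range(max_attempts):
        response = generate(prompt, temperature)
        result = check(prompt, response)
        recommendation = result["recommendation"]
        if recommendation["action"] == "proceed":
            return response, result
        # Use the suggested temperature for the next attempt, if one is given.
        temperature = recommendation.get("suggested_temperature", temperature)
    raise RuntimeError(f"no acceptable response after {max_attempts} attempts")


def fake_generate(prompt: str, temperature: float) -> str:
    # Stand-in model: only answers carefully at low temperature.
    return "grounded answer" if temperature <= 0.2 else "speculative answer"


def fake_check(prompt: str, response: str) -> dict:
    # Stand-in checker returning the fields the loop relies on.
    ok = response.startswith("grounded")
    return {
        "risk_level": "low" if ok else "high",
        "recommendation": {
            "action": "proceed" if ok else "retry",
            "suggested_temperature": 0.2,
        },
    }


response, result = run_with_retries(fake_generate, fake_check, "Summarize refund policy")
print(response, result["risk_level"])
```

Bounding the loop with `max_attempts` keeps a persistently high-risk prompt from retrying forever; the caller decides what to do when the loop gives up.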

Adapters

Provider adapter
OpenAI
Anthropic
Gemini
OpenRouter
Perplexity
Grok
HuggingFace
Ollama

Samples

Sample	Purpose
`samples/basic_sync.py`	minimal sync workflow
`samples/async_openai_adapter.py`	async provider check with context
`samples/async_openai_adapter_no_context.py`	no-context behavior and weighting example
`samples/retry_strategy.py`	recommendation-driven retry policy
`samples/strict_mode.py`	strict blocking behavior
`samples/feedback_calibration.py`	local feedback storage and calibration report
`samples/async_openai_feedback_calibration.py`	async generation + feedback in one loop

Feedback Storage and Calibration

from hallx import Hallx

checker = Hallx(feedback_db_path="/var/lib/myapp/hallx-feedback.sqlite3")

result = checker.check(prompt="p", response="r", context=["c"])
checker.record_outcome(
    result=result,
    label="hallucinated",  # aliases: safe -> correct, unsafe -> hallucinated
    metadata={"reviewer": "qa-team"},
    prompt="p",
    response_excerpt="r",
)

report = checker.calibration_report(window_days=30)
print(report["suggested_threshold"], report["threshold_metrics"])
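
To make the idea of a suggested threshold concrete, the sketch below picks the confidence cut-off that best separates reviewed outcomes. It is not how `calibration_report` computes `suggested_threshold`, which this README does not specify; `suggest_threshold` and the sample data are made up for illustration.

```python
def suggest_threshold(outcomes: list) -> tuple:
    """Return (threshold, accuracy) for the cut-off that best separates labels.

    Each outcome is (confidence, label) with label "correct" or "hallucinated".
    A response is predicted correct when confidence >= threshold.
    """
    if not outcomes:
        raise ValueError("need at least one reviewed outcome")
    candidates = sorted({conf for conf, _ in outcomes})
    best_threshold, best_accuracy = candidates[0], -1.0
    for threshold in candidates:
        hits = sum(
            (conf >= threshold) == (label == "correct")
            for conf, label in outcomes
        )
        accuracy = hits / len(outcomes)
        # Strict comparison keeps the lowest threshold among ties.
        if accuracy > best_accuracy:
            best_threshold, best_accuracy = threshold, accuracy
    return best_threshold, best_accuracy


reviewed = [
    (0.92, "correct"),
    (0.81, "correct"),
    (0.66, "hallucinated"),
    (0.58, "correct"),
    (0.35, "hallucinated"),
]
threshold, accuracy = suggest_threshold(reviewed)
print(threshold, accuracy)
```

With only a handful of reviewed outcomes the result is noisy, which is one reason to review the report over a rolling window rather than once.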

Default DB path resolution:

Environment	Default path
Env override	`HALLX_FEEDBACK_DB`
Windows	`%LOCALAPPDATA%\hallx\feedback.sqlite3` (fallback `%APPDATA%`)
macOS	`~/Library/Application Support/hallx/feedback.sqlite3`
Linux/servers	`$XDG_DATA_HOME/hallx/feedback.sqlite3` or `~/.local/share/hallx/feedback.sqlite3`
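
The resolution order in the table can be sketched as a single function. This is an approximation written from the table alone, not Hallx's code: `default_feedback_db` is a hypothetical name, its parameters exist only to make the logic testable, and raising when neither Windows variable is set is an assumption.

```python
import os
import sys
from pathlib import Path, PurePosixPath, PureWindowsPath
from typing import Mapping, Optional


def default_feedback_db(
    env: Optional[Mapping[str, str]] = None,
    platform: Optional[str] = None,
    home: Optional[str] = None,
) -> str:
    """Resolve the feedback DB path following the table above."""
    env = os.environ if env is None else env
    platform = sys.platform if platform is None else platform
    home = str(Path.home()) if home is None else home

    # 1. An explicit override always wins.
    override = env.get("HALLX_FEEDBACK_DB")
    if override:
        return override

    # 2. Windows: %LOCALAPPDATA%, falling back to %APPDATA%.
    if platform.startswith("win"):
        base = env.get("LOCALAPPDATA") or env.get("APPDATA")
        if not base:
            # Assumption: the table does not say what happens here.
            raise RuntimeError("neither LOCALAPPDATA nor APPDATA is set")
        return str(PureWindowsPath(base, "hallx", "feedback.sqlite3"))

    # 3. macOS: per-user Application Support directory.
    if platform == "darwin":
        return str(
            PurePosixPath(home, "Library", "Application Support", "hallx", "feedback.sqlite3")
        )

    # 4. Linux/servers: $XDG_DATA_HOME, else ~/.local/share.
    base = env.get("XDG_DATA_HOME") or str(PurePosixPath(home, ".local", "share"))
    return str(PurePosixPath(base, "hallx", "feedback.sqlite3"))


print(default_feedback_db(env={}, platform="linux", home="/home/u"))
```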

Production Notes

Recommendation	Why
Enable strict mode on sensitive paths	block high-risk responses before side effects
Log `confidence`, `risk_level`, `issues`	support auditing and threshold tuning
Use calibration report regularly	adjust thresholds with real reviewed outcomes
Keep context quality high	grounding quality depends on evidence quality
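
For the logging recommendation, one option is to emit each check as a single JSON line so it can be filtered and aggregated later. A minimal sketch using only the standard library; the logger name `hallx.audit`, the helper `audit_record`, the `route` field, and the sample issue string are all invented for illustration.

```python
import json
import logging

logger = logging.getLogger("hallx.audit")  # hypothetical logger name


def audit_record(confidence: float, risk_level: str, issues: list, **extra) -> str:
    """Serialize one check result as a JSON line for audit logs."""
    record = {
        "confidence": round(confidence, 4),
        "risk_level": risk_level,
        "issues": list(issues),
        **extra,  # request-level metadata such as a route or request id
    }
    return json.dumps(record, sort_keys=True)


line = audit_record(
    0.62,
    "medium",
    ["grounding: weak context overlap"],  # made-up issue text
    route="/refunds",
)
logger.info(line)
print(line)
```

Keeping `confidence` numeric (rather than formatting it into the message) makes it straightforward to plot distributions when tuning thresholds.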

Known Limitations

Hallx is heuristic and does not provide formal factual guarantees.
High confidence can still be wrong if context is missing, stale, or incorrect.
Similarity-based checks can miss nuanced semantic contradictions.
High-stakes domains should combine Hallx with domain validators and human review.
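
The similarity limitation is easy to reproduce with a toy measure. The sketch below uses Jaccard overlap of whitespace tokens, which is an illustrative stand-in and not the similarity Hallx uses: a claim that directly contradicts its context still scores close to a perfect match.

```python
def token_overlap(a: str, b: str) -> float:
    """Jaccard similarity over lowercase whitespace tokens."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    if not tokens_a and not tokens_b:
        return 1.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


context = "Refunds are allowed within 30 days of purchase."
claim = "Refunds are not allowed within 30 days of purchase."

# 8 shared tokens out of 9 distinct tokens, despite the opposite meaning.
print(round(token_overlap(context, claim), 2))
```

A single negation changes one token out of nine, so overlap alone cannot separate agreement from contradiction here.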

Release history

Version	Released
1.0.3	Mar 30, 2026
1.0.2	Mar 30, 2026
1.0.1	Mar 30, 2026
1.0.0 (this version)	Mar 28, 2026

Download files

Source Distribution

hallx-1.0.0.tar.gz (25.7 kB)

Uploaded Mar 28, 2026 Source

Built Distribution

hallx-1.0.0-py3-none-any.whl (27.3 kB)

Uploaded Mar 28, 2026 Python 3

File details

Details for the file hallx-1.0.0.tar.gz.

File metadata

Download URL: hallx-1.0.0.tar.gz
Upload date: Mar 28, 2026
Size: 25.7 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.15

File hashes

Hashes for hallx-1.0.0.tar.gz
Algorithm	Hash digest
SHA256	`8f55d446ff821e9ad805f6f98734fa2ad6c5175d8f43dae172a3d195bec6fbce`
MD5	`998fe90af9c8e0320f93735c0afaa7ec`
BLAKE2b-256	`b026fd150a67c9e9eaa1b5a60ce3a9484f74dd2c32e15c24d2aaf97762bdd8b1`

File details

Details for the file hallx-1.0.0-py3-none-any.whl.

File metadata

Download URL: hallx-1.0.0-py3-none-any.whl
Upload date: Mar 28, 2026
Size: 27.3 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.15

File hashes

Hashes for hallx-1.0.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`b2630a5669173bc1185b9314a12d68304e6b0d393da9e4b18fc38d30bd448b67`
MD5	`1ca673dc6670f0ec17b4eb0ba13f2c43`
BLAKE2b-256	`ff1ae841990de6319b5cdf947d6a047eaeb98429520ee62ccbb9a4789d058b32`
