DFAH (Decision-Faithfulness Assessment Harness) determinism + faithfulness eval harness for Evidentia — dev-time AI-output quality gates
evidentia-eval
Dev-time AI-output quality eval harness for Evidentia.
Hosts the DFAH (Decision-Faithfulness Assessment Harness), the auditor-defensible numerical proof layer that validates that LLM-driven artifact production is deterministic, replay-equivalent, and faithful to its source policy clauses.
Why this package exists (v0.10.5 P9 extraction)
The DFAH harness was originally bundled into evidentia-ai (the
risk-statement generator + control explainer package). That
conflated two very different deployment surfaces:
- evidentia-ai: PRODUCTION runtime. Needed in air-gap installs to actually generate risk statements.
- evidentia-eval: DEVELOPMENT-time evaluation. NOT needed in air-gap installs; only fires when a CI pipeline runs a determinism / faithfulness gate before tagging a release.
Extracting the eval harness lets air-gap installs of
evidentia-ai skip the optional sentence-transformers stack
entirely (it now lives behind evidentia-eval[faithfulness-semantic]
instead of evidentia-ai[eval-faithfulness]).
Quick start
```bash
# Stdlib Jaccard baseline (no extra needed; <10 MB install)
pip install evidentia-eval

# Optional semantic-similarity faithfulness (~250 MB extra
# for sentence-transformers + numpy + model cache on first use)
pip install 'evidentia-eval[faithfulness-semantic]'
```
CLI verbs:
```bash
# Smoke test against a deterministic stub generator (no LLM
# tokens burned)
evidentia eval stub-smoke

# Real-LLM determinism gate against the risk-statement generator
evidentia eval risk-determinism --gap-report gaps.json \
    --system-context ctx.yaml \
    --fail-on-determinism-rate-below 0.95

# Verify a previously-signed eval bundle
evidentia eval verify path/to/eval-output.json
```
The CLI verbs live in evidentia.cli.eval (the meta-package);
this package contributes the underlying library.
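
To make the idea behind a threshold such as --fail-on-determinism-rate-below 0.95 concrete, here is a minimal stdlib-only sketch of one plausible reading of a "modal-output pass rate": hash each normalized run output and report the share of runs that match the most common hash. This is not the evidentia-eval implementation. The helper names (normalize, sha256_hex, modal_pass_rate, gate_exit_code) are hypothetical, and the whitespace-only normalization is an assumption made for illustration.

```python
import hashlib
import re
from collections import Counter


def normalize(text: str) -> str:
    """Collapse whitespace runs and trim (an assumed, simplified normalization)."""
    return re.sub(r"\s+", " ", text).strip()


def sha256_hex(text: str) -> str:
    """SHA-256 hex digest of the normalized text."""
    return hashlib.sha256(normalize(text).encode("utf-8")).hexdigest()


def modal_pass_rate(outputs: list[str]) -> float:
    """Fraction of runs whose hash equals the most common (modal) hash."""
    if not outputs:
        raise ValueError("need at least one output")
    counts = Counter(sha256_hex(o) for o in outputs)
    _, modal_count = counts.most_common(1)[0]
    return modal_count / len(outputs)


def gate_exit_code(rate: float, threshold: float) -> int:
    """Return 0 when the rate meets the threshold, 1 when it falls below it."""
    return 0 if rate >= threshold else 1


# Three runs of the same prompt: two differ only in whitespace, one differs in content.
runs = ["Risk: data loss.", "Risk:  data loss.\n", "Risk: data leak."]
rate = modal_pass_rate(runs)
print(round(rate, 3), gate_exit_code(rate, 0.95))  # prints: 0.667 1
```

In this sketch two of the three runs normalize to the same text, so the rate is 2/3 and a 0.95 threshold would fail the gate.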
Public API
| Symbol | Purpose |
|---|---|
| DFAHarness | Owns the run loop + audit emit |
| EvalResult | Top-level harness output (JSON-serializable, Sigstore-signable) |
| EvalSample | One prompt's inputs (immutable; audit-trail-stable) |
| DeterminismResult | Per-prompt determinism outcome |
| ReplayResult | Per-prompt replay-equivalence outcome |
| FaithfulnessResult | Per-claim faithfulness outcome |
| PromptFaithfulnessResult | Aggregated per-prompt faithfulness |
| faithfulness_score | Stdlib Jaccard token-overlap baseline |
| faithfulness_score_semantic | Sentence-transformers path (optional extra) |
| determinism_score | Computes the modal-output pass rate |
| replay_equivalent | Binary replay-equivalence check |
| extract_claims | Atomic-claim extraction from generated artifacts |
| normalize_for_determinism | Canonical whitespace + punctuation normalization |
| hash_output | SHA-256 hex of normalized output |
| sign_eval_result | Sigstore-sign an EvalResult JSON |
| verify_eval_result | Verify a previously-signed eval bundle |
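
For readers unfamiliar with a Jaccard token-overlap baseline, the following stdlib-only sketch shows the general technique: tokenize a generated claim and its source clause into sets, then divide the size of the intersection by the size of the union. It is an illustration of the metric, not the package's faithfulness_score. The function names (tokens, jaccard), the regex tokenizer, and the choice to score two empty inputs as 0.0 are all assumptions.

```python
import re


def tokens(text: str) -> set[str]:
    """Lowercase alphanumeric word tokens (the tokenizer is an assumption)."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(claim: str, source: str) -> float:
    """Jaccard similarity |A ∩ B| / |A ∪ B| over the two token sets."""
    a, b = tokens(claim), tokens(source)
    if not a and not b:
        # No tokens on either side: treated here as no supporting evidence.
        return 0.0
    return len(a & b) / len(a | b)


claim = "Passwords must rotate every 90 days"
source = "User passwords must rotate every 90 days"
print(round(jaccard(claim, source), 3))  # prints: 0.857
```

The claim shares six of the seven distinct tokens in the union, giving 6/7. A token-overlap score like this is cheap and dependency-free, but it does not capture paraphrase, which is the gap a semantic-similarity path would address.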
Backward-compat shim
For external scripts that still use from evidentia_ai.eval import ..., evidentia-ai ships a deprecation shim that
re-exports from evidentia_eval. The shim warns once at import
time and is scheduled for removal in v0.12.0.
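
As a rough illustration of the warn-once pattern such a shim could follow, here is a minimal sketch using only the standard warnings module. It is not the actual evidentia-ai shim. The helper name warn_deprecated_import, the message text, and the state dict are hypothetical, and the re-export itself is shown only as a comment so the sketch runs without either package installed.

```python
import warnings

_MESSAGE = (
    "evidentia_ai.eval is deprecated; import from evidentia_eval instead "
    "(shim scheduled for removal in v0.12.0)"
)
_state = {"warned": False}


def warn_deprecated_import() -> bool:
    """Emit the deprecation warning on the first call only; return True if emitted."""
    if _state["warned"]:
        return False
    _state["warned"] = True
    # stacklevel=2 attributes the warning to the caller rather than this helper.
    warnings.warn(_MESSAGE, DeprecationWarning, stacklevel=2)
    return True


# In a real shim module this call would sit at module top level, followed by
# the re-export (for example: from evidentia_eval import *).
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    first = warn_deprecated_import()
    second = warn_deprecated_import()

print(first, second, len(caught))  # prints: True False 1
```

New code should import from evidentia_eval directly so that nothing breaks when the shim is removed.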
License
Apache-2.0. See the workspace root LICENSE file.