Lightweight RAG provenance middleware. Verifies that every claim in an LLM response is grounded in a retrieved source, without an LLM call.
The problem
Every RAG pipeline has the same failure mode. The LLM takes five retrieved chunks, ignores three of them, and generates a response that cites facts from nowhere. Your retriever did its job. Your prompt did its job. The output still contains unsourced claims and you have no way to know until a user catches it.
Existing tools don't solve this at runtime:
- RAGAS evaluates offline. It can't catch a hallucination before it reaches a user.
- LLM guardrails handle safety and policy enforcement well - toxicity, jailbreaks, off-topic content. Their provenance validators strip unsupported sentences but don't return a structured claim→URL map, a compliance rate, or a source allowlist.
- Prompt engineering reduces the problem. It doesn't eliminate it.
Dokis sits inline - between your retriever and your LLM response going out - and enforces provenance in real time.
How it works
Dokis does exactly two things:
1. Pre-retrieval enforcement. Strip chunks whose source URL is not on your allowlist before they enter the prompt.
2. Post-generation auditing. Split the response into atomic claim sentences. Match each claim to the chunk it came from using BM25 lexical scoring. Build a claim → chunk → URL provenance map. Compute a compliance rate. Flag anything below your threshold.
No LLM call. No API key. No network request after startup. Deterministic output.
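The post-generation audit in step 2 can be pictured with a small, dependency-free sketch. This illustrates the technique and is not Dokis's own code: `tokenize`, `split_claims`, `bm25_scores`, and `audit_sketch` are hypothetical names, the sentence splitter is a naive regex, chunks are assumed to be `(text, url)` pairs, and the threshold is applied to the raw BM25 score with an arbitrary value, because the normalisation Dokis applies is not described here.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase word tokens; deliberately naive."""
    return re.findall(r"\w+", text.lower())


def split_claims(response):
    """Split a response into sentences on '.', '!' or '?' followed by whitespace."""
    return [p for p in re.split(r"(?<=[.!?])\s+", response.strip()) if p]


def bm25_scores(query_tokens, docs_tokens, k1=1.5, b=0.75):
    """Okapi BM25 score of one query against every document."""
    n_docs = len(docs_tokens)
    avg_len = sum(len(d) for d in docs_tokens) / n_docs
    # Document frequency: the number of documents containing each term.
    df = Counter(term for d in docs_tokens for term in set(d))
    scores = []
    for doc in docs_tokens:
        tf = Counter(doc)
        score = 0.0
        for term in set(query_tokens):
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def audit_sketch(chunks, response, threshold=1.0):
    """chunks: list of (text, url). Returns (provenance_map, violations, rate)."""
    claims = split_claims(response)
    if not chunks:
        return {}, claims, 0.0
    docs_tokens = [tokenize(text) for text, _ in chunks]
    provenance, violations, supported = {}, [], 0
    for claim in claims:
        scores = bm25_scores(tokenize(claim), docs_tokens)
        best = max(range(len(scores)), key=scores.__getitem__)
        if scores[best] >= threshold:
            provenance[claim] = chunks[best][1]  # claim -> best chunk -> URL
            supported += 1
        else:
            violations.append(claim)
    # Returning 0.0 for a response with no claims is a choice of this sketch.
    rate = supported / len(claims) if claims else 0.0
    return provenance, violations, rate


chunks = [
    ("Aspirin inhibits the COX enzymes and reduces fever.", "https://example.org/aspirin"),
    ("Ibuprofen is a nonsteroidal anti-inflammatory drug.", "https://example.org/ibuprofen"),
]
response = "Aspirin inhibits COX enzymes. Bananas grow on Mars."
provenance, violations, rate = audit_sketch(chunks, response)
```

Here the first sentence shares four terms with the first chunk and is mapped to its URL, while the second shares no terms with any chunk and is reported as a violation, so half of the claims are supported.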
Quickstart
Zero config
import dokis
result = dokis.audit(query, chunks, response)
print(result.compliance_rate) # 0.91
print(result.passed) # True
print(result.provenance_map) # {"Aspirin inhibits...": "https://pubmed.com/1"}
print(result.violations) # claims with no source
With config
import dokis

config = dokis.Config(
    allowed_domains = ["pubmed.ncbi.nlm.nih.gov", "cochrane.org"],
    min_citation_rate = 0.85,
    claim_threshold = 0.3,
)
clean_chunks = dokis.filter(raw_chunks, config)
response = llm.invoke(build_prompt(query, clean_chunks))
result = dokis.audit(query, clean_chunks, response, config=config)
if not result.passed:
    raise dokis.ComplianceViolation(result)
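The allowlist step performed by `dokis.filter` can be approximated with the standard library. The sketch below is an assumption-laden illustration, not the library's code: `is_allowed` and `filter_chunks` are hypothetical names, chunks are assumed to be dicts with a `url` key, and subdomains of an allowed domain are treated as allowed, which Dokis may or may not do.

```python
from urllib.parse import urlparse


def is_allowed(url, allowed_domains):
    """True if the URL's host equals an allowed domain or is a subdomain of one."""
    host = (urlparse(url).hostname or "").lower()
    return any(
        host == domain or host.endswith("." + domain)
        for domain in (d.lower() for d in allowed_domains)
    )


def filter_chunks(chunks, allowed_domains):
    """Split chunks into those kept and the URLs that were blocked."""
    kept, blocked = [], []
    for chunk in chunks:
        if is_allowed(chunk["url"], allowed_domains):
            kept.append(chunk)
        else:
            blocked.append(chunk["url"])
    return kept, blocked


raw_chunks = [
    {"text": "...", "url": "https://pubmed.ncbi.nlm.nih.gov/12345/"},
    {"text": "...", "url": "https://random-blog.example.com/post"},
    # Look-alike host: the allowed domain appears as a prefix, not a suffix.
    {"text": "...", "url": "https://pubmed.ncbi.nlm.nih.gov.evil.example/1"},
]
kept, blocked = filter_chunks(raw_chunks, ["pubmed.ncbi.nlm.nih.gov", "cochrane.org"])
```

Comparing parsed hostnames rather than searching the URL string is what rejects the look-alike host in the third chunk. Note that an empty allowlist blocks everything in this sketch; how Dokis treats `allowed_domains = []` is not covered here.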
LangChain - two lines
import dokis
from dokis.adapters.langchain import ProvenanceRetriever

retriever = ProvenanceRetriever(
    base_retriever=your_existing_retriever,
    config=dokis.Config(allowed_domains=["pubmed.ncbi.nlm.nih.gov"]),
)
docs = retriever.invoke(query)
LlamaIndex
import dokis
from dokis.adapters.llamaindex import ProvenanceQueryEngine

engine = ProvenanceQueryEngine(
    base_engine=your_existing_engine,
    chunks=source_chunks,
    config=dokis.Config(min_citation_rate=0.80),
)
response = engine.query("What reduces fever?")
result = response.metadata["provenance"]
CLI
dokis audit input.json
dokis audit input.json --config provenance.toml
cat input.json | dokis audit -
Reusable middleware (production pattern)
from dokis import ProvenanceMiddleware, Config

mw = ProvenanceMiddleware(Config(
    allowed_domains = ["pubmed.ncbi.nlm.nih.gov", "cochrane.org"],
    min_citation_rate = 0.85,
    matcher = "bm25",
    claim_threshold = 0.3,
))
result = mw.audit(query, chunks, response)
Async
result = await mw.aaudit(query, chunks, response)
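One way the async entry point might be used is to audit several responses concurrently. The sketch below only relies on the `aaudit(query, chunks, response)` call shape shown above; `StubMiddleware` and `audit_many` are hypothetical stand-ins so the example runs without Dokis installed, and the stub's return value is not what Dokis returns.

```python
import asyncio


class StubMiddleware:
    """Stand-in exposing the same aaudit(query, chunks, response) call shape."""

    async def aaudit(self, query, chunks, response):
        await asyncio.sleep(0)  # yield control, as real async work would
        return {"query": query, "passed": bool(chunks)}


async def audit_many(mw, items):
    """Audit several (query, chunks, response) triples concurrently."""
    return await asyncio.gather(*(mw.aaudit(q, c, r) for q, c, r in items))


items = [("q1", ["chunk"], "answer one"), ("q2", [], "answer two")]
results = asyncio.run(audit_many(StubMiddleware(), items))
```

`asyncio.gather` returns results in the order the awaitables were passed, so each result lines up with its input triple. With a real `ProvenanceMiddleware` in place of the stub, how much concurrency this buys depends on how `aaudit` is implemented.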
Installation
pip install dokis               # BM25 default, zero cold start
pip install "dokis[semantic]"   # adds SentenceTransformer matching
pip install "dokis[nltk]"       # adds NLTK sentence splitting
pip install "dokis[langchain]"  # adds LangChain ProvenanceRetriever
pip install "dokis[llamaindex]" # adds LlamaIndex ProvenanceQueryEngine
Configuration
dokis.Config(
    allowed_domains = [],
    min_citation_rate = 0.80,
    claim_threshold = 0.35,
    extractor = "regex", # "regex" | "nltk" | "llm"
    matcher = "bm25", # "bm25" | "semantic"
    model = "all-MiniLM-L6-v2",
    fail_on_violation = False,
    domain = None,
)
claim_threshold by matcher:
- matcher="bm25": normalised per-query BM25 score. Recommended: 0.3–0.5.
- matcher="semantic": cosine similarity. Recommended: 0.65–0.85.
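To make the semantic threshold concrete, here is a minimal sketch of cosine similarity with a cut-off. The names `cosine_similarity` and `is_supported` are hypothetical, and the two-dimensional vectors are toy stand-ins for the embeddings a model such as all-MiniLM-L6-v2 would produce; this is not how Dokis computes or stores embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def is_supported(claim_vec, chunk_vecs, claim_threshold=0.65):
    """Return (supported, best_score) for a claim against all chunk vectors."""
    best = max((cosine_similarity(claim_vec, v) for v in chunk_vecs), default=0.0)
    return best >= claim_threshold, best


claim = [1.0, 0.0]
chunk_vectors = [[3.0, 4.0], [0.0, 1.0]]  # similarities to the claim: 0.6 and 0.0

strict = is_supported(claim, chunk_vectors, claim_threshold=0.65)
lenient = is_supported(claim, chunk_vectors, claim_threshold=0.5)
```

The same best score of 0.6 is rejected at a threshold of 0.65 and accepted at 0.5, which is why the threshold needs to be tuned for the matcher in use: the BM25 and cosine scales are not interchangeable.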
Load from TOML:
# method is named from_yaml for backwards compatibility - pass a .toml file
config = dokis.Config.from_yaml("provenance.toml")
The result object
result.compliance_rate # float
result.passed # bool
result.violations # list[Claim]
result.provenance_map # dict[claim_text, source_url]
result.blocked_sources # list[str]
result.claims # list[Claim]
claim.text # str
claim.supported # bool
claim.confidence # float - always set, even when False
claim.source_url # str | None
claim.source_chunk # Chunk | None
record = result.model_dump_json() # fully JSON-serialisable
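As an illustration of consuming these fields, the sketch below turns a result into a few log lines. `summarise` is a hypothetical helper, and the `SimpleNamespace` objects are stand-ins that only mimic the attribute names listed above; a real result would come from `dokis.audit(...)`.

```python
from types import SimpleNamespace


def summarise(result):
    """Build a short, human-readable report from an audit result."""
    lines = [f"compliance={result.compliance_rate:.2f} passed={result.passed}"]
    for claim in result.violations:
        lines.append(f"UNSUPPORTED ({claim.confidence:.2f}): {claim.text}")
    for url in result.blocked_sources:
        lines.append(f"BLOCKED: {url}")
    return lines


# Stand-in objects with the documented attribute names (not real Dokis types).
unsupported_claim = SimpleNamespace(
    text="Bananas cure headaches within minutes of eating them.",
    supported=False,
    confidence=0.12,
    source_url=None,
    source_chunk=None,
)
fake_result = SimpleNamespace(
    compliance_rate=0.5,
    passed=False,
    violations=[unsupported_claim],
    blocked_sources=["https://random-blog.example.com/post"],
)
report = summarise(fake_result)
```

Because `confidence` is set even for unsupported claims, a report like this can show how far below the threshold each violation fell.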
Benchmarks
Measured on Python 3.12. Medians over 10 warm runs.
Cold start
| Matcher | Cold start | What loads |
|---|---|---|
| bm25 (default) | ~0 ms | Nothing - pure Python |
| semantic | ~1,666 ms | all-MiniLM-L6-v2 (~80 MB) |
Per-call audit latency (5 chunks, 3 claims)
| Matcher | Median | p95 |
|---|---|---|
| bm25 (default) | 0.96 ms | 1.29 ms |
| semantic | 21.99 ms | 31.45 ms |
BM25 is 23× faster per audit call. The BM25 index is cached per chunk set - repeated calls against the same chunks stay sub-millisecond.
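Figures like these can be collected with a small timing harness. The sketch below is one plausible way to do it, not the script behind the numbers above: `measure` is a hypothetical helper, the p95 uses the nearest-rank method (the method used for the published figures is not stated), and the workload is a placeholder for an audit call.

```python
import math
import statistics
import time


def measure(fn, runs=10, warmup=1):
    """Return (median_ms, p95_ms) over `runs` timed calls after warm-up calls."""
    for _ in range(warmup):
        fn()  # warm caches so the timed runs are "warm"
    samples_ms = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples_ms.append((time.perf_counter() - start) * 1000)
    samples_ms.sort()
    # Nearest-rank p95: smallest sample with at least 95% of samples at or below it.
    rank = max(1, math.ceil(0.95 * len(samples_ms)))
    return statistics.median(samples_ms), samples_ms[rank - 1]


# Placeholder workload; swap in a call such as mw.audit(query, chunks, response).
median_ms, p95_ms = measure(lambda: sum(range(1000)))

# The speed-up quoted above follows from the two medians in the table.
speedup = 21.99 / 0.96
```

With only 10 runs, the nearest-rank p95 is simply the slowest sample, so more runs are needed for a meaningful tail estimate. The ratio of the two medians is about 22.9, which rounds to the 23× quoted above.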
Install footprint
| pip install dokis | pip install dokis[semantic] |
|---|---|
| ~42 MB (pydantic + numpy + bm25s) | ~135 MB (+ model weights) |
Accuracy (5 grounded + 5 ungrounded claims)
| Matcher | Grounded detected | Ungrounded rejected |
|---|---|---|
| bm25 (default) | 5/5 | 4/4 ✦ |
| semantic | 5/5 | 4/4 ✦ |
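✦ One of the five ungrounded claims was 7 words long, below the 8-word minimum, so it was filtered out before matching. Both matchers rejected all four ungrounded claims that reached them, so the effective ungrounded rejection rate is 100% for both.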
Comparison
| | Dokis | RAGAS | LLM guardrails |
|---|---|---|---|
| Runtime enforcement | ✅ | ❌ offline only | ✅ |
| No LLM call needed | ✅ | ❌ | partial ✦ |
| Per-claim provenance map | ✅ | partial | partial ✧ |
| Source allowlisting | ✅ | ❌ | ❌ |
| Compliance rate per response | ✅ | ❌ | ❌ |
| LangChain integration | ✅ drop-in retriever | ✅ evaluation wrapper | varies |
| JSON-serialisable audit log | ✅ per-response | ❌ | ❌ |
| Cold start | ~0 ms | - | varies |
| Core install size | ~42 MB | - | - |
✦ ProvenanceEmbeddings uses no LLM call. ProvenanceLLM requires one.
✧ Guardrails strips unsupported sentences from the response. Dokis returns a structured claim→URL map you can store and query.
Examples
Three working demos in dokis-examples:
- 01 - Local files - txt files + BM25 + Ollama
- 02 - Chroma vector store - Chroma + nomic-embed-text + Ollama
- 03 - Live web search - Serper API + domain allowlisting + Ollama
Core dependencies
pip install dokis installs exactly three packages: pydantic>=2.0, numpy>=1.26, bm25s>=0.2.
License
MIT