Open-world fact verification for AI claims (sibling to halluguard, which handles closed-world).

truthcheck

Open-world fact verification for AI claims, the web-search complement to halluguard.

Status: v0.1, working. The pipeline ships with an Exa search backend, an NLI verifier (with a lexical fallback when sentence-transformers is not installed), a SQLite cache, and an atomic claim splitter. Sibling to adaptmem, halluguard, and claimcheck.

The problem this solves

halluguard answers: "Is this claim supported by the documents I gave you?"

That's enough when you control the corpus (your shop's catalog, your company's internal docs, your codebase). It is not enough when:

An LLM cites a figure ("Türkiye nüfusu 85 milyon", Turkish for "Türkiye's population is 85 million").
An LLM dates an event ("Bitcoin halving was in May 2024").
An LLM names a person ("Alice Novak is the lead developer of Project X").
An LLM repeats a recent news fact ("OpenAI released o4-mini in March 2026").

Halluguard can't answer because the ground truth lives on the open web, not in the user's corpus. That's truthcheck's job.

Design constraints

Stay composable. Truthcheck is a sibling, not a replacement.
- halluguard.Guard.check(answer) → corpus-grounded verdict
- truthcheck.WebFactChecker.check(claim) → open-web verdict
- Caller decides which to invoke (or both, in series).
Never silently dilute halluguard's positioning. Halluguard says "no LLM, no internet, deterministic." Truthcheck explicitly says "yes LLM (probably), yes internet, probabilistic." Honest naming.
Backend-agnostic. Brave Search, Exa, Bing, DuckDuckGo, an internal corporate Confluence or Notion index: anything that returns ranked snippets should plug in.
Cost-aware. Web search APIs cost money. Truthcheck must:
- tell the caller a USD estimate per claim before issuing requests
- cache aggressively (claim text → result, TTL configurable)
- support dry_run=True to preview without API spend.
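
The cost-aware constraints above (cache keyed on claim text, configurable TTL, a preview before spending) could be sketched roughly as follows. This is a minimal illustration and not truthcheck's actual cache: the `ClaimCache` class, the table layout, the `estimate_cost_usd` helper, and the per-search price are all hypothetical placeholders.

```python
import hashlib
import json
import sqlite3
import time


class ClaimCache:
    """Claim-text -> result cache with a TTL (hypothetical, not truthcheck's schema)."""

    def __init__(self, path=":memory:", ttl_seconds=86_400):
        self.ttl = ttl_seconds
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS verdicts "
            "(key TEXT PRIMARY KEY, payload TEXT, stored_at REAL)"
        )

    @staticmethod
    def _key(claim):
        # Normalise case and whitespace so trivially different spellings share an entry.
        normalised = " ".join(claim.lower().split())
        return hashlib.sha256(normalised.encode("utf-8")).hexdigest()

    def get(self, claim, now=None):
        now = time.time() if now is None else now
        row = self.db.execute(
            "SELECT payload, stored_at FROM verdicts WHERE key = ?",
            (self._key(claim),),
        ).fetchone()
        if row is None or now - row[1] > self.ttl:
            return None  # cache miss, or the entry is older than the TTL
        return json.loads(row[0])

    def put(self, claim, result, now=None):
        now = time.time() if now is None else now
        self.db.execute(
            "INSERT OR REPLACE INTO verdicts VALUES (?, ?, ?)",
            (self._key(claim), json.dumps(result), now),
        )
        self.db.commit()


def estimate_cost_usd(claims, cache, price_per_search=0.001, searches_per_claim=1):
    """Dry-run estimate: only claims missing from the cache would trigger paid searches.

    The default price is a placeholder, not a quoted rate from any provider.
    """
    uncached = [c for c in claims if cache.get(c) is None]
    return len(uncached) * searches_per_claim * price_per_search
```

Passing `now` explicitly keeps expiry testable without sleeping; a real implementation would likely also key on backend and query parameters.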

Sketch of the API

import os

from truthcheck import WebFactChecker

checker = WebFactChecker(
    backend="exa",                       # default; "brave" also supported
    api_key=os.environ["EXA_API_KEY"],
    trusted_domains=["wikipedia.org", "*.gov", "*.edu"],
    cache_dir="~/.cache/truthcheck",
)

verdict = checker.check(
    claim="Türkiye nüfusu 85 milyon",
    n_sources=5,
)
# Verdict {
#   status: SUPPORTED | UNSUPPORTED | CONTRADICTED | INCONCLUSIVE,
#   confidence: float in [0.0, 1.0],
#   sources: [
#     Source(url="https://www.worldometers.info/...", snippet="...", score=0.91),
#     Source(url="https://en.wikipedia.org/wiki/Demographics_of_Turkey", ...),
#     ...
#   ],
#   atomic_claims: ["country: Türkiye", "metric: population", "value: 85 million"],
#   cost_usd: 0.0007,
#   cache_hit: False,
# }

v0.1 decisions (closed)

Default backend: Exa (Brave's free tier was removed)
Splitter: regex-based and deterministic; spaCy or LLM splitting deferred to v0.2
Verifier: NLI cross-encoder; lexical fallback when sentence-transformers absent
Cache: SQLite under ~/.cache/truthcheck
Contradiction: INCONCLUSIVE + all sources surfaced
Recency: as_of timestamp stamped on every verdict
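
One way to read the contradiction and recency decisions above is sketched below. It assumes the verifier emits one NLI-style label per source, and it interprets "Contradiction: INCONCLUSIVE" as "confident sources disagree with each other"; that interpretation, the `SourceJudgement` type, the `aggregate` function, and the 0.7 threshold are assumptions made for illustration, not the shipped logic.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class SourceJudgement:
    url: str
    label: str    # "entailment", "contradiction", or "neutral"
    score: float  # verifier confidence in that label, 0.0 to 1.0


def aggregate(judgements, threshold=0.7):
    """Fold per-source labels into one status (hypothetical rule, see lead-in)."""
    confident = [j for j in judgements if j.score >= threshold]
    supports = [j for j in confident if j.label == "entailment"]
    contradicts = [j for j in confident if j.label == "contradiction"]

    if supports and contradicts:
        status = "INCONCLUSIVE"   # sources disagree: do not pick a side
    elif supports:
        status = "SUPPORTED"
    elif contradicts:
        status = "CONTRADICTED"
    else:
        status = "UNSUPPORTED"    # nothing confident either way

    return {
        "status": status,
        "sources": list(judgements),  # every source is surfaced, not only the winners
        "as_of": datetime.now(timezone.utc).isoformat(),  # when the check ran
    }
```

Returning all sources alongside the status lets the caller inspect the disagreement directly instead of trusting a single collapsed label.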

Open for v0.2

Turkish / multilingual NLI model
spaCy or small LLM for compound claim splitting
DDG / SearXNG backend (no API key)
Redis cache backend

Composition with the cluster

                       answer + corpus → halluguard.Guard.check()
                                                │
                                                ▼
                                    SUPPORTED?  yes ─→ trust=high, done
                                       │
                                       no (claim isn't in corpus)
                                       │
                                       ▼
                              answer claims → truthcheck.WebFactChecker.check()
                                                │
                                                ▼
                                       open-web verdict

Bigger picture: the cluster gives the consumer a "documents → halluguard, world → truthcheck" pipeline, so closed-world and open-world claims can both be verified through one call site (a future helper in claimcheck).
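
The routing in the diagram can be sketched as a small function. Because the claimcheck helper does not exist yet and the return types of `Guard.check` and `WebFactChecker.check` are not specified here, the checkers are passed in as plain callables returning status strings; the `verify` function and the stub checkers in the usage lines are hypothetical.

```python
def verify(answer, corpus_check, web_check, split_claims):
    """Try the corpus first; fall back to the open web only when it cannot support the answer.

    corpus_check(answer) -> status string   (stands in for halluguard.Guard.check)
    web_check(claim)     -> status string   (stands in for truthcheck.WebFactChecker.check)
    split_claims(answer) -> list of claims  (stands in for the atomic claim splitter)
    """
    corpus_status = corpus_check(answer)
    if corpus_status == "SUPPORTED":
        # Corpus-grounded and deterministic: no web spend needed.
        return {"route": "corpus", "trust": "high", "verdicts": [corpus_status]}

    # The claim is not in the corpus, so each atomic claim goes to the open web.
    web_statuses = [web_check(claim) for claim in split_claims(answer)]
    return {"route": "web", "trust": "probabilistic", "verdicts": web_statuses}


# Usage with stub checkers standing in for the real libraries.
result = verify(
    "Bitcoin halving was in May 2024",
    corpus_check=lambda answer: "UNSUPPORTED",
    web_check=lambda claim: "INCONCLUSIVE",
    split_claims=lambda answer: [answer],
)
```

Injecting the checkers keeps the two libraries decoupled, which matches the "sibling, not a replacement" constraint: either one can be swapped or skipped by the caller.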

What this repo is NOT

Not a replacement for halluguard. Halluguard handles the case where you have a corpus. Don't use truthcheck where halluguard fits.
Not a search engine. It's a verification layer that uses search engines as a substrate. Bring your own backend.
Not a fact-database. It doesn't ship knowledge graphs. Every verdict is computed at request time against live sources.
Not a guarantee. Open-world fact-checking is an active research area; FEVER state-of-the-art is around 75% F1. Truthcheck reports confidence, never asserts truth.
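
"Bring your own backend" might look like the sketch below: anything with a `search` method that returns ranked snippets qualifies. The `SearchBackend` protocol, the `Snippet` type, and the `InMemoryBackend` class are hypothetical names chosen for illustration; truthcheck's real plug-in interface may differ.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass
class Snippet:
    url: str
    text: str
    score: float  # backend-specific relevance, higher is better


class SearchBackend(Protocol):
    """Structural interface: any object with this method can act as a backend."""

    def search(self, query: str, n: int) -> list[Snippet]: ...


class InMemoryBackend:
    """Toy backend over a dict of url -> text, ranked by query-term overlap."""

    def __init__(self, documents):
        self.documents = documents

    def search(self, query, n):
        terms = set(query.lower().split())
        hits = []
        for url, text in self.documents.items():
            words = set(text.lower().split())
            overlap = len(terms & words) / len(terms) if terms else 0.0
            if overlap > 0:
                hits.append(Snippet(url, text, overlap))
        # Best match first, truncated to the requested number of sources.
        return sorted(hits, key=lambda s: s.score, reverse=True)[:n]


backend: SearchBackend = InMemoryBackend({
    "https://wiki.example/halving": "bitcoin halving april 2024",
    "https://wiki.example/cats": "cats are nice",
})
top = backend.search("bitcoin halving", n=5)
```

A structural `Protocol` means an internal wiki index or an offline test double can be plugged in without inheriting from any truthcheck base class.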

License

MIT

Install

pip install "truthcheck[brave]"   # Brave backend
pip install "truthcheck[nli]"     # NLI verifier (sentence-transformers)

Set the EXA_API_KEY or BRAVE_API_KEY environment variable before use.

This is a draft. Atakan to review, sharpen the open questions, and decide whether to push public + commit to the v0.1 milestone.

This version

0.1.0

Apr 27, 2026

nakata-truthcheck 0.1.0
