Python utilities for veritasai (edit this description)
VeritasAI
Multilingual fact checker that uses provided knowledge bases to make sure that its claims are reliable.
VeritasAI is a fact verification toolkit designed for multilingual settings and domains where traceable evidence matters (e.g., medical). It works with your own knowledge base—raw text is sufficient—and verifies that any claim produced by the pipeline can be traced back to a specific line or paragraph in that corpus.
Alpha notice: the API may change. Contributions are not currently accepted; feel free to open issues.
Table of Contents
- Features
- Installation
- Requirements
- Quickstart
- Usage & API Overview
- Preparing a Knowledge Base
- Project Status
- Roadmap
- Limitations
- Why VeritasAI vs Alternatives
- Documentation & Support
- License
- Citation
Features
- Multilingual: works across languages as long as a relevant knowledge base is supplied.
- Evidence-grounded: every verified claim links back to a line/paragraph in your corpus.
- Modular pipeline: separate components for claim extraction, evidence retrieval, and claim verification.
- GPU-first: CUDA support for efficient large-model inference.
- No telemetry: VeritasAI does not collect any usage data.
Benchmarks and end-to-end demos are coming; see the Colab links once they are published.
Installation
pip install veritasai
Optional heavy dependencies (usually auto-installed)
torch, transformers, accelerate, bitsandbytes, peft, spacy, sentence-transformers
Requirements
- Python: 3.12
- OS: Linux, macOS, or Windows
- Hardware: NVIDIA GPU with CUDA
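To check a machine against the requirements above before installing, something like the following sketch may help. It is not part of VeritasAI: `check_environment` is a hypothetical helper, and treating 3.12 as a minimum version rather than an exact pin is an assumption. The CUDA check relies on `torch.cuda.is_available()` and is skipped when torch is not installed.

```python
import importlib.util
import platform
import sys


def check_environment(min_python=(3, 12)):
    """Summarize how this machine compares with the listed requirements."""
    report = {
        "python": platform.python_version(),
        # Assumption: 3.12 is read as a minimum version, not an exact pin.
        "python_ok": sys.version_info[:2] >= min_python,
        "os": platform.system(),
        "torch_installed": importlib.util.find_spec("torch") is not None,
        "cuda_available": None,  # stays None when torch cannot be imported
    }
    if report["torch_installed"]:
        try:
            import torch  # imported lazily so the check also runs without torch

            report["cuda_available"] = torch.cuda.is_available()
        except ImportError:
            report["torch_installed"] = False
    return report


if __name__ == "__main__":
    for key, value in check_environment().items():
        print(f"{key}: {value}")
```

A `cuda_available` value of `False` or `None` suggests the GPU requirement is not met on that machine.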
Quickstart
A full Colab notebook will be linked here. For now, here’s a minimal illustrative example using the high-level wrapper.
Note: function names below reflect the intended design; adjust if your local API differs.
from veritasai import veritasai # wrapper that runs extraction → retrieval → verification
# A very small raw-text knowledge base (list of passages or documents)
kb = [
"Aspirin (acetylsalicylic acid) can help reduce fever and relieve minor aches.",
"For adults, typical oral doses of ibuprofen are 200–400 mg every 4–6 hours as needed.",
"Type 2 diabetes is characterized by insulin resistance."
]
# Initialize the pipeline (no extra config required)
vai = veritasai()
texts = ["Aspirin reduces fever in adults."]
result = vai.extract_claims(texts, kb, top_n=1, score_limit=0.5)  # returns a structured object with claims, evidence, and verdicts
# Inspect the claims from the first document
first = result[0]
print(first)
Usage & API Overview
Core classes
VeritasAI exposes three core building blocks (plus a wrapper). Typical usage wires them in sequence:
- claim_extractor.py: finds atomic claims in raw text.
- evidence_retriever.py: retrieves candidate evidence spans from the knowledge base.
- claim_verifier.py: classifies each (claim, evidence) pair (e.g., SUPPORTED/REFUTED/INSUFFICIENT).
Wrapper pipeline
- veritasai: the high-level wrapper shown in the Quickstart.
Using individual functions
Example sketch:
from veritasai import claim_extractor, claim_verifier, evidence_retriever
# A very small raw-text knowledge base (list of passages or documents)
kb = [
"Aspirin (acetylsalicylic acid) can help reduce fever and relieve minor aches.",
"For adults, typical oral doses of ibuprofen are 200–400 mg every 4–6 hours as needed.",
"Type 2 diabetes is characterized by insulin resistance."
]
# Initialize the individual components (no extra config required)
ce = claim_extractor()
cv = claim_verifier()
er = evidence_retriever()
text = "Aspirin reduces fever in adults."
text_out, claims = ce.extract_claims(text, max_new_tokens=512, temperature=0.1, top_p=0.80)
hits = er.evidence_search(claims, kb, top_n=1, score_function=er.cos_sim2, score_limit=0.5)
fact_check = []
for i, claim in enumerate(claims):
    # Collect the evidence sentences retrieved for this claim
    sentences = [hit["sentence"] for hit in hits[i]]
    raw, parsed = cv.verify_claim(claim, sentences)
    fact_check.append({"claim": claim, "evidence": sentences, "labels": parsed})
print(fact_check)
No config needed: the defaults should work out-of-the-box; customize models/parameters as needed once those knobs are exposed.
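The examples pass `top_n` and `score_limit` to the retrieval step together with a cosine-similarity scorer. The sketch below illustrates how such parameters commonly interact, assuming `score_limit` is a minimum similarity and `top_n` caps the number of hits per claim; whether VeritasAI applies them in exactly this way is an assumption. The names `bow`, `cosine`, and `toy_search` are hypothetical and not part of the library, and bag-of-words counts stand in for real embeddings.

```python
import math
import re
from collections import Counter

KB = [
    "Aspirin (acetylsalicylic acid) can help reduce fever and relieve minor aches.",
    "For adults, typical oral doses of ibuprofen are 200–400 mg every 4–6 hours as needed.",
    "Type 2 diabetes is characterized by insulin resistance.",
]


def bow(text):
    """Turn text into a bag-of-words count vector (stand-in for an embedding)."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def toy_search(claim, kb, top_n=1, score_limit=0.5):
    """Return up to top_n passages whose score is at least score_limit."""
    claim_vec = bow(claim)
    scored = [
        {"index": i, "sentence": passage, "score": cosine(claim_vec, bow(passage))}
        for i, passage in enumerate(kb)
    ]
    kept = [hit for hit in scored if hit["score"] >= score_limit]
    return sorted(kept, key=lambda hit: hit["score"], reverse=True)[:top_n]


if __name__ == "__main__":
    claim = "Aspirin reduces fever in adults."
    # With this crude scorer no passage reaches 0.5, so the result is empty.
    print(toy_search(claim, KB, top_n=1, score_limit=0.5))
    # A lower threshold lets the aspirin passage (index 0) through.
    print(toy_search(claim, KB, top_n=1, score_limit=0.2))
```

The empty result at 0.5 reflects the weakness of word-overlap scoring ("reduce" and "reduces" do not match), which is one reason embedding-based scorers are normally used; the point here is only the threshold-then-truncate logic.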
Preparing a Knowledge Base
VeritasAI expects raw text as the knowledge source. You can start simple:
kb = [
"Paragraph 1 ...",
"Paragraph 2 ...",
# ...
]
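If the corpus lives in a plain-text file, one possible way to build such a list is to split on blank lines and remember where each paragraph starts, so that evidence can later be pointed back to a line in the file. This is a sketch under assumptions: paragraphs are separated by blank lines, and `split_paragraphs` and `load_kb` are hypothetical helpers, not VeritasAI functions.

```python
from pathlib import Path


def split_paragraphs(text):
    """Split raw text on blank lines; return (start_line, paragraph) pairs."""
    paragraphs = []
    current, start = [], None
    for lineno, line in enumerate(text.splitlines(), start=1):
        if line.strip():
            if start is None:
                start = lineno  # first non-blank line of this paragraph
            current.append(line.strip())
        elif current:
            paragraphs.append((start, " ".join(current)))
            current, start = [], None
    if current:  # flush the last paragraph if the text has no trailing blank line
        paragraphs.append((start, " ".join(current)))
    return paragraphs


def load_kb(path, encoding="utf-8"):
    """Read a text file and return (kb, start_lines) as parallel lists."""
    pairs = split_paragraphs(Path(path).read_text(encoding=encoding))
    kb = [paragraph for _, paragraph in pairs]
    start_lines = [start for start, _ in pairs]
    return kb, start_lines
```

The `kb` list can then be passed in the same way as the hand-written lists in the examples, while `start_lines[i]` records the 1-based line on which passage `i` begins in the source file.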
Project Status
- Stage: Alpha (API subject to change)
- CLI: None at the moment
- Public API: Not finalized
- Telemetry: None collected
Roadmap
- Public Colab notebooks (Quickstart, evaluation, domain-specific demos)
- Configurable models and retrieval backends
- Evaluation scripts and reproducible benchmarks
- Expanded multilingual tests and domain examples (e.g., clinical, legal)
- Optional CLI once the API stabilizes
Limitations
- Requires a GPU + CUDA for practical performance.
- Quality depends heavily on your knowledge base coverage and cleanliness.
- The verifier may be conservative without sufficient domain evidence.
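Since quality depends on knowledge base cleanliness, a light preprocessing pass before handing passages to the pipeline may help. The following is only a sketch: `clean_kb` is a hypothetical helper, the 20-character minimum is an arbitrary illustrative threshold, and VeritasAI itself may or may not perform similar normalization internally.

```python
import unicodedata


def clean_kb(passages, min_chars=20):
    """Normalize passages, drop very short ones, and remove duplicates."""
    seen = set()
    cleaned = []
    for passage in passages:
        text = unicodedata.normalize("NFC", passage)
        text = " ".join(text.split())  # collapse runs of whitespace
        key = text.casefold()  # duplicates are compared case-insensitively
        if len(text) < min_chars or key in seen:
            continue
        seen.add(key)
        cleaned.append(text)
    return cleaned
```

Cleaning changes passage positions, so it is best applied before any bookkeeping that maps passages back to locations in the original corpus.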
Why VeritasAI vs Alternatives
- Evidence requirement by design: the pipeline prioritizes traceability to your own corpus.
- Modular: you can inspect or replace any stage (extraction / retrieval / verification) as the system evolves.
- Multilingual: leverages modern multilingual embeddings and LMs to work across languages (subject to model availability).
If you’re comparing with generic RAG toolkits: VeritasAI focuses specifically on claim-level verification with explicit evidence spans rather than general question answering.
Documentation & Support
- Issues: use the repository’s Issues tab for bugs/questions.
- Contributing: external contributions aren’t accepted yet—please open an issue to discuss.
- Security: please open a private security advisory in the repo if you discover a vulnerability.
License
MIT. See LICENSE.
Citation
If you use VeritasAI in academic work, please cite it. A BibTeX entry will be provided here.
# (To be added)
README generated on 2025-10-04.