VeritasAI

A multilingual fact checker that verifies claims against a knowledge base you provide, so that every accepted claim is backed by evidence.

VeritasAI is a fact verification toolkit designed for multilingual settings and domains where traceable evidence matters (e.g., medical). It works with your own knowledge base—raw text is sufficient—and verifies that any claim produced by the pipeline can be traced back to a specific line or paragraph in that corpus.

Alpha notice: the API may change. Contributions are not currently accepted; feel free to open issues.


Features

  • Multilingual: works across languages as long as a relevant knowledge base is supplied.
  • Evidence-grounded: every verified claim links back to a line/paragraph in your corpus.
  • Modular pipeline: separate components for claim extraction, evidence retrieval, and claim verification.
  • GPU-first: CUDA support for efficient large-model inference.
  • No telemetry: VeritasAI does not collect any usage data.

Benchmarks and end-to-end demos are coming; see the Colab links once they are published.

Installation

pip install veritasai

Optional heavy dependencies (usually auto-installed)

  • torch, transformers, accelerate, bitsandbytes, peft, spacy, sentence-transformers

Requirements

  • Python: 3.12
  • OS: Linux, macOS, or Windows
  • Hardware: NVIDIA GPU with CUDA
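Before installing large models, it can help to confirm what the environment actually provides. The sketch below is not part of VeritasAI: `environment_report` and `HEAVY_DEPENDENCIES` are hypothetical names, and the list assumes the standard import names of the packages above (for example `sentence_transformers` for sentence-transformers). It only uses the standard library plus `torch.cuda.is_available()` when torch is present.

```python
import importlib.util
import sys

# Import names for the packages listed above (assumption: standard import
# names; the pip package sentence-transformers imports as sentence_transformers).
HEAVY_DEPENDENCIES = [
    "torch", "transformers", "accelerate", "bitsandbytes",
    "peft", "spacy", "sentence_transformers",
]


def environment_report():
    """Describe the interpreter, which heavy packages are importable, and CUDA."""
    installed = {
        name: importlib.util.find_spec(name) is not None
        for name in HEAVY_DEPENDENCIES
    }
    cuda_available = False
    if installed["torch"]:
        try:
            import torch
            cuda_available = bool(torch.cuda.is_available())
        except Exception:
            # A broken torch install should not crash the check.
            cuda_available = False
    return {
        "python": f"{sys.version_info.major}.{sys.version_info.minor}",
        "installed": installed,
        "cuda_available": cuda_available,
    }


report = environment_report()
print(report)
```

If `cuda_available` is false, the pipeline may still import, but inference is likely to be impractically slow.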

Quickstart

A full Colab notebook will be linked here. For now, here’s a minimal illustrative example using the high-level wrapper.
Note: function names below reflect the intended design; adjust if your local API differs.

from veritasai import veritasai  # wrapper that runs extraction → retrieval → verification

# A very small raw-text knowledge base (list of passages or documents)
kb = [
    "Aspirin (acetylsalicylic acid) can help reduce fever and relieve minor aches.",
    "For adults, typical oral doses of ibuprofen are 200–400 mg every 4–6 hours as needed.",
    "Type 2 diabetes is characterized by insulin resistance."
]

# Initialize the pipeline (no extra config required)
vai = veritasai()

texts = ["Aspirin reduces fever in adults."]
# Returns a structured object with claims, evidence, and verdicts
result = vai.extract_claims(texts, kb, top_n=1, score_limit=0.5)

# Inspect the claims from the first document
first = result[0]
print(first)

Usage & API Overview

Core classes

VeritasAI exposes three core building blocks (plus a wrapper). Typical usage wires them in sequence:

  • claim_extractor.py — finds atomic claims in raw text.
  • evidence_retriever.py — retrieves candidate evidence spans from the knowledge base.
  • claim_verifier.py — classifies each (claim, evidence) pair (e.g., SUPPORTED/REFUTED/INSUFFICIENT).
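To make the retrieval stage more concrete, here is a toy, dependency-free sketch of similarity-based evidence search with a `top_n` cut-off and a `score_limit` threshold. It is an illustration only and not VeritasAI's retriever: the real component presumably uses multilingual embeddings, while this stand-in uses bag-of-words cosine similarity. The names `toy_evidence_search`, `bag_of_words`, and `cosine`, and the `index` and `score` keys, are hypothetical; only the `sentence` key mirrors the hit format used elsewhere in this README.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercase the text and count its word tokens."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def toy_evidence_search(claim, kb, top_n=1, score_limit=0.5):
    """Return up to top_n passages whose similarity to the claim is >= score_limit."""
    claim_vec = bag_of_words(claim)
    scored = [
        {"index": i, "sentence": passage,
         "score": cosine(claim_vec, bag_of_words(passage))}
        for i, passage in enumerate(kb)
    ]
    kept = [hit for hit in scored if hit["score"] >= score_limit]
    kept.sort(key=lambda hit: hit["score"], reverse=True)
    return kept[:top_n]


kb = [
    "Aspirin (acetylsalicylic acid) can help reduce fever and relieve minor aches.",
    "For adults, typical oral doses of ibuprofen are 200–400 mg every 4–6 hours as needed.",
    "Type 2 diabetes is characterized by insulin resistance.",
]

# Word overlap gives low scores, so the threshold here is lower than 0.5.
hits = toy_evidence_search("Aspirin reduces fever in adults.", kb, top_n=1, score_limit=0.2)
print(hits)
```

Keeping the passage index alongside each hit is what makes a verdict traceable to a specific entry in the corpus. Note that the toy scorer misses "reduces" versus "reduce"; embedding-based similarity is less sensitive to such surface differences.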

Wrapper pipeline

  • The veritasai wrapper runs the three stages in sequence. The same pipeline can also be assembled from the individual components.
    • Example sketch:

      from veritasai import claim_extractor, claim_verifier, evidence_retriever

      # A very small raw-text knowledge base (list of passages or documents)
      kb = [
          "Aspirin (acetylsalicylic acid) can help reduce fever and relieve minor aches.",
          "For adults, typical oral doses of ibuprofen are 200–400 mg every 4–6 hours as needed.",
          "Type 2 diabetes is characterized by insulin resistance.",
      ]

      # Initialize the individual components (no extra config required)
      ce = claim_extractor()
      cv = claim_verifier()
      er = evidence_retriever()

      text = "Aspirin reduces fever in adults."
      # Step 1: extract atomic claims from the raw text
      text_out, claims = ce.extract_claims(text, max_new_tokens=512, temperature=0.1, top_p=0.80)
      # Step 2: retrieve candidate evidence for each claim (hits[i] belongs to claims[i])
      hits = er.evidence_search(claims, kb, top_n=1, score_function=er.cos_sim2, score_limit=0.5)

      # Step 3: verify each claim against its retrieved evidence sentences
      fact_check = []
      for i, claim in enumerate(claims):
          sentences = [hit["sentence"] for hit in hits[i]]
          raw, parsed = cv.verify_claim(claim, sentences)
          fact_check.append({"claim": claim, "evidence": sentences, "labels": parsed})
      print(fact_check)
      

No configuration is required: the defaults are intended to work out of the box. Model and parameter customization will follow once those options are exposed.
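The loop in the example sketch above collects one dictionary per claim with the keys `claim`, `evidence`, and `labels`. Because a claim with no retrieved evidence cannot be traced to the corpus, it is often worth separating those records before reading the verdicts. The helper below is a hypothetical post-processing step, not part of the package, and the sample records are hand-written rather than real pipeline output (the shape of `labels` is not documented here, so it is left as `None`).

```python
def split_by_evidence(fact_check):
    """Separate records with at least one evidence sentence from those with none."""
    with_evidence = [record for record in fact_check if record["evidence"]]
    without_evidence = [record for record in fact_check if not record["evidence"]]
    return with_evidence, without_evidence


# Hand-written sample in the same shape as the fact_check list built above.
sample = [
    {
        "claim": "Aspirin reduces fever in adults.",
        "evidence": ["Aspirin (acetylsalicylic acid) can help reduce fever and relieve minor aches."],
        "labels": None,
    },
    {"claim": "Aspirin cures diabetes.", "evidence": [], "labels": None},
]

grounded, ungrounded = split_by_evidence(sample)
print("Claims with evidence:", [record["claim"] for record in grounded])
print("Claims without evidence:", [record["claim"] for record in ungrounded])
```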

Preparing a Knowledge Base

VeritasAI expects raw text as the knowledge source. You can start simple:

kb = [
  "Paragraph 1 ...",
  "Paragraph 2 ...",
  # ...
]
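If your corpus lives in a text file, one simple way to build such a list is to split on blank lines and remember where each paragraph started, so that evidence can later be traced back to a line in the source file. This is a minimal sketch using only the standard library; `load_paragraphs` is a hypothetical helper, not a VeritasAI function, and it assumes paragraphs are separated by blank lines.

```python
import tempfile
from pathlib import Path


def load_paragraphs(path):
    """Split a UTF-8 text file into paragraphs separated by blank lines.

    Returns a list of (start_line, paragraph) tuples, with 1-based line numbers.
    """
    records = []
    buffer = []
    start_line = None
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    for line_no, line in enumerate(lines, start=1):
        if line.strip():
            if not buffer:
                start_line = line_no  # first line of a new paragraph
            buffer.append(line.strip())
        elif buffer:
            records.append((start_line, " ".join(buffer)))
            buffer = []
    if buffer:  # flush the last paragraph if the file has no trailing blank line
        records.append((start_line, " ".join(buffer)))
    return records


# Demo with a temporary file standing in for your corpus.
with tempfile.TemporaryDirectory() as tmp:
    corpus = Path(tmp) / "corpus.txt"
    corpus.write_text(
        "Paragraph 1, first line,\nsecond line.\n\nParagraph 2.\n",
        encoding="utf-8",
    )
    records = load_paragraphs(corpus)

kb = [paragraph for _, paragraph in records]
print(records)
```

The `kb` list can be passed wherever a knowledge base is expected, while `records` keeps the mapping from each passage to its starting line.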

Project Status

  • Stage: Alpha (API subject to change)
  • CLI: None at the moment
  • Public API: Not finalized
  • Telemetry: None collected

Roadmap

  • Public Colab notebooks (Quickstart, evaluation, domain-specific demos)
  • Configurable models and retrieval backends
  • Evaluation scripts and reproducible benchmarks
  • Expanded multilingual tests and domain examples (e.g., clinical, legal)
  • Optional CLI once the API stabilizes

Limitations

  • Requires a GPU + CUDA for practical performance.
  • Quality depends heavily on your knowledge base coverage and cleanliness.
  • The verifier may be conservative without sufficient domain evidence.
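Since results depend on knowledge base cleanliness, a light normalization pass before verification can help. The sketch below collapses whitespace, drops near-empty passages, and removes case-insensitive duplicates. `clean_kb` and its `min_chars` threshold are hypothetical choices for illustration, not part of VeritasAI; tune or replace them for your corpus.

```python
import re


def clean_kb(passages, min_chars=20):
    """Normalize whitespace, drop very short passages, and remove duplicates.

    Duplicates are detected case-insensitively; the first occurrence is kept.
    """
    seen = set()
    cleaned = []
    for passage in passages:
        text = re.sub(r"\s+", " ", passage).strip()
        key = text.casefold()
        if len(text) < min_chars or key in seen:
            continue
        seen.add(key)
        cleaned.append(text)
    return cleaned


raw_kb = [
    "Type 2 diabetes is  characterized by\ninsulin resistance.",
    "type 2 diabetes is characterized by insulin resistance.",
    "n/a",
    "   ",
]
print(clean_kb(raw_kb))
```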

Why VeritasAI vs Alternatives

  • Evidence requirement by design: the pipeline prioritizes traceability to your own corpus.
  • Modular: you can inspect or replace any stage (extraction / retrieval / verification) as the system evolves.
  • Multilingual: leverages modern multilingual embeddings and LMs to work across languages (subject to model availability).

If you’re comparing with generic RAG toolkits: VeritasAI focuses specifically on claim-level verification with explicit evidence spans rather than general question answering.

Documentation & Support

  • Issues: use the repository’s Issues tab for bugs/questions.
  • Contributing: external contributions aren’t accepted yet—please open an issue to discuss.
  • Security: please open a private security advisory in the repo if you discover a vulnerability.

License

MIT. See LICENSE.

Citation

If you use VeritasAI in academic work, please cite it. A BibTeX entry will be provided here.

# (To be added)

README generated on 2025-10-04.
