VeritasAI

A multilingual fact checker that verifies claims against a knowledge base you provide, so that every verdict is backed by traceable evidence.

VeritasAI is a fact verification toolkit designed for multilingual settings and domains where traceable evidence matters (e.g., medical). It works with your own knowledge base—raw text is sufficient—and verifies that any claim produced by the pipeline can be traced back to a specific line or paragraph in that corpus.

Alpha notice: the API may change. Contributions are not currently accepted; feel free to open issues.


Features

  • Multilingual: works across languages as long as a relevant knowledge base is supplied.
  • Evidence-grounded: every verified claim links back to a line/paragraph in your corpus.
  • Modular pipeline: separate components for claim extraction, evidence retrieval, and claim verification.
  • GPU-first: CUDA support for efficient large-model inference.
  • No telemetry: VeritasAI does not collect any usage data.

Benchmarks and end-to-end demos are coming; see the Colab links once they are published.

Installation

pip install veritasai

If you don't have a CUDA-enabled PyTorch yet, install one that matches your system:

# Example for CUDA 12.x (adjust per https://pytorch.org/get-started/locally/)
pip install --index-url https://download.pytorch.org/whl/cu121 torch torchvision torchaudio
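
After installing PyTorch, a quick check can confirm whether it can actually see a GPU. The sketch below relies only on the standard library and PyTorch's `torch.cuda.is_available()`; the helper name `cuda_ready` is made up for illustration and is not part of VeritasAI.

```python
import importlib.util


def cuda_ready() -> bool:
    """Return True only if PyTorch is importable and reports a usable CUDA device."""
    # find_spec avoids an ImportError when torch is not installed at all
    if importlib.util.find_spec("torch") is None:
        return False
    import torch

    return bool(torch.cuda.is_available())


if __name__ == "__main__":
    print("CUDA-enabled PyTorch available:", cuda_ready())
```

If this prints `False` on a machine with an NVIDIA GPU, the installed wheel is likely a CPU-only build or does not match the local driver.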

Optional heavy dependencies (usually auto-installed)

  • torch, transformers, accelerate, bitsandbytes, peft, spacy, sentence-transformers

Requirements

  • Python: 3.12
  • OS: Linux, macOS, or Windows
  • Hardware: NVIDIA GPU with CUDA

Quickstart

A full Colab notebook will be linked here. For now, here’s a minimal illustrative example using the high-level wrapper.
Note: function names below reflect the intended design; adjust if your local API differs.

from veritasai import VeritasAI  # wrapper that runs extraction → retrieval → verification

# A very small raw-text knowledge base (list of passages or documents)
kb = [
    "Aspirin (acetylsalicylic acid) can help reduce fever and relieve minor aches.",
    "For adults, typical oral doses of ibuprofen are 200–400 mg every 4–6 hours as needed.",
    "Type 2 diabetes is characterized by insulin resistance."
]

# Initialize the pipeline (no extra config required)
va = VeritasAI(knowledge_base=kb)  # or VeritasAI(kb=kb) depending on your API

text = "Aspirin reduces fever in adults."
result = va.verify(text)  # returns a structured object with claims, evidence, and verdicts

# Inspect the first verified claim
first = result.claims[0]
print("Claim:", first.text)
print("Verdict:", first.verdict)        # e.g., "SUPPORTED", "REFUTED", "NOT ENOUGH INFO"
print("Evidence snippet:", first.evidence.snippet)
print("Evidence source:", first.evidence.source_id)  # index/identifier in your KB

Usage & API Overview

Core classes

VeritasAI exposes three core building blocks (plus a wrapper). Typical usage wires them in sequence:

  • claim_extractor.py — finds atomic claims in raw text.
  • evidence_retriever.py — retrieves candidate evidence spans from the knowledge base.
  • claim_verifier.py — classifies each (claim, evidence) pair (e.g., supported/refuted/unknown).
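
To make the data flow between the three stages concrete, here is a self-contained toy sketch in plain Python. It does not import VeritasAI and does not reflect its models: `toy_extract`, `toy_retrieve`, `toy_verify`, the word-overlap score, and the 0.5 threshold are all assumptions chosen for illustration. Sentence splitting stands in for claim extraction, and word overlap stands in for both retrieval and verification.

```python
import re
from dataclasses import dataclass


@dataclass
class Evidence:
    source_id: int   # index of the passage in the knowledge base (-1 if none matched)
    snippet: str
    score: float     # fraction of claim words found in the passage


def _tokens(text: str) -> set[str]:
    return set(re.findall(r"\w+", text.lower()))


def toy_extract(text: str) -> list[str]:
    """Stand-in for claim extraction: one claim per sentence."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def toy_retrieve(claim: str, kb: list[str]) -> Evidence:
    """Stand-in for retrieval: pick the passage sharing the most claim words."""
    claim_tokens = _tokens(claim)
    best = Evidence(source_id=-1, snippet="", score=0.0)
    for i, passage in enumerate(kb):
        overlap = len(claim_tokens & _tokens(passage)) / max(len(claim_tokens), 1)
        if overlap > best.score:
            best = Evidence(source_id=i, snippet=passage, score=overlap)
    return best


def toy_verify(evidence: Evidence, threshold: float = 0.5) -> str:
    """Stand-in for verification; a real verifier would use a model and could also refute."""
    return "SUPPORTED" if evidence.score >= threshold else "NOT ENOUGH INFO"


kb = [
    "Aspirin (acetylsalicylic acid) can help reduce fever and relieve minor aches.",
    "Type 2 diabetes is characterized by insulin resistance.",
]
text = "Aspirin can reduce fever. Bananas are blue."

for claim in toy_extract(text):
    ev = toy_retrieve(claim, kb)
    print(claim, "->", toy_verify(ev), "| source:", ev.source_id)
```

The point of the sketch is the shape of the hand-off: each claim carries a pointer (`source_id`) to the passage that justified its verdict, which is what makes the result traceable.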

Wrapper pipeline

  • veritasai (wrapper) — orchestrates the three stages end-to-end.
    • Example sketch:

      # Class and method names reflect the intended design and may change.
      # `kb` and `text` are defined as in the Quickstart above.
      from veritasai import ClaimExtractor, EvidenceRetriever, ClaimVerifier, VeritasAI

      extractor = ClaimExtractor()
      retriever = EvidenceRetriever(knowledge_base=kb)
      verifier  = ClaimVerifier()

      # Use individual stages
      claims = extractor.extract(text)
      evidence = retriever.retrieve(claims)
      verdicts = verifier.verify(claims, evidence)

      # Or just use the high-level wrapper
      va = VeritasAI(knowledge_base=kb)
      result = va.verify(text)

No config needed: the defaults should work out of the box. Once model and parameter settings are exposed, you will be able to customize them.

Preparing a Knowledge Base

VeritasAI expects raw text as the knowledge source. You can start simple:

kb = [
  "Paragraph 1 ...",
  "Paragraph 2 ...",
  # ...
]

For larger corpora, consider pre-splitting into passages (e.g., 256–512 tokens) and storing an ID per passage so evidence links are stable. You can load from files or a database and pass a Python list to the pipeline.
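
The pre-splitting step could look like the following sketch. It approximates tokens with whitespace-separated words (a real tokenizer would count differently), and the `Passage` type, the `split_into_passages` helper, and the `doc_id:index` ID scheme are illustrative assumptions, not part of VeritasAI.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Passage:
    passage_id: str  # stable identifier, e.g. "guide:0"
    text: str


def split_into_passages(doc_id: str, text: str, max_words: int = 256) -> list[Passage]:
    """Split a document into consecutive passages of at most max_words words."""
    if max_words <= 0:
        raise ValueError("max_words must be positive")
    words = text.split()
    return [
        Passage(f"{doc_id}:{n}", " ".join(words[start:start + max_words]))
        for n, start in enumerate(range(0, len(words), max_words))
    ]


passages = split_into_passages("guide", "one two three four five", max_words=2)
kb = [p.text for p in passages]           # plain list of strings for the pipeline
ids = [p.passage_id for p in passages]    # parallel list to map evidence back to sources
```

Keeping `ids` parallel to `kb` means an evidence index returned by the pipeline can be translated back to a document and position, even if the corpus is later re-exported in a different order.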

Project Status

  • Stage: Alpha (API subject to change)
  • CLI: None at the moment
  • Public API: Not finalized
  • Telemetry: None collected

Roadmap

  • Public Colab notebooks (Quickstart, evaluation, domain-specific demos)
  • Configurable models and retrieval backends
  • Evaluation scripts and reproducible benchmarks
  • Expanded multilingual tests and domain examples (e.g., clinical, legal)
  • Optional CLI once the API stabilizes

Limitations

  • Requires a GPU + CUDA for practical performance.
  • Quality depends heavily on your knowledge base coverage and cleanliness.
  • The verifier may be conservative (for example, declining to mark a claim as supported) when the knowledge base lacks relevant evidence for the domain.
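
Because results depend on how clean the knowledge base is, a light normalization pass before building the list can help. This is a minimal sketch with a hypothetical helper name (`clean_kb`); it collapses whitespace, drops empty passages, and removes exact duplicates while keeping the original order. Removing passages changes list indices, so it is best done before any passage IDs are assigned.

```python
import unicodedata


def clean_kb(passages: list[str]) -> list[str]:
    """Normalize whitespace and Unicode form, then drop empties and exact duplicates."""
    seen: set[str] = set()
    cleaned: list[str] = []
    for raw in passages:
        # NFC unifies composed/decomposed characters, which matters for multilingual text
        text = " ".join(unicodedata.normalize("NFC", raw).split())
        if text and text not in seen:
            seen.add(text)
            cleaned.append(text)
    return cleaned


raw_kb = ["  Aspirin  reduces fever. ", "", "Aspirin reduces fever.", "Insulin lowers blood glucose."]
print(clean_kb(raw_kb))
```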

Why VeritasAI vs Alternatives

  • Evidence requirement by design: the pipeline prioritizes traceability to your own corpus.
  • Modular: you can inspect or replace any stage (extraction / retrieval / verification) as the system evolves.
  • Multilingual: leverages modern multilingual embeddings and LMs to work across languages (subject to model availability).

If you’re comparing with generic RAG toolkits: VeritasAI focuses specifically on claim-level verification with explicit evidence spans rather than general question answering.

Documentation & Support

  • Issues: use the repository’s Issues tab for bugs/questions.
  • Contributing: external contributions aren’t accepted yet—please open an issue to discuss.
  • Security: please open a private security advisory in the repo if you discover a vulnerability.

License

MIT. See LICENSE.

Citation

If you use VeritasAI in academic work, please cite it. A BibTeX entry will be provided here.

# (To be added)

README generated on 2025-10-04.
