Skip to main content

LoRA training data and adapter management for teaching AI models truthful, hedged responses. Built on SovereignShield's TruthGuard pipeline.

Project description

Veritas — Truth Adapter Training Pipeline

Training data manager and adapter loader for teaching AI models to prefer truthful, hedged responses over confident hallucinations. Built on SovereignShield's TruthGuard pipeline.

What This Does

AI models confidently state things that aren't true. SovereignShield's TruthGuard catches these hallucinations at runtime by blocking answers that contain confident factual claims without tool-backed verification. But blocking bad answers is reactive — the long-term goal is to make the model stop hallucinating in the first place.

Veritas is the training side of that pipeline. It takes everything TruthGuard has collected — blocked claims, verified facts, hedged responses, cited answers — and compiles them into JSONL training pairs. You feed those pairs into a LoRA fine-tuning tool (OpenAI API, HuggingFace, Unsloth), and the model learns to prefer truthful, hedged responses over confident guesses. Over time, the model internalizes the behavior and stops needing TruthGuard to catch it.

Install

pip install veritas-truth-adapter

Quick Start

Python API

from veritas import Veritas

v = Veritas(db_path="truth_guard.db")

# Check data readiness
print(v.stats())

# Export training data when ready
result = v.export("training_data.jsonl")
if result["exported"]:
    print(f"Exported {result['total_pairs']} training pairs")

# Use TruthGuard directly through Veritas
v.start_session("session-001")
v.record_tool_use("SEARCH")
allowed, reason = v.check_answer("The capital of France is Paris.")
v.end_session()

CLI

# Check how much training data is available
veritas stats -d truth_guard.db

# Export training data to JSONL
veritas export -d truth_guard.db -o training_data.jsonl

Training Pair Types

Veritas generates four types of training pairs:

Negative corrections — The model made a confident claim that TruthGuard blocked. The training pair shows the model what it should have said instead (a hedged version of the same claim). This teaches the model to stop making unverified assertions.

Positive verified — The model made a factual claim AND used a verification tool first. The training pair reinforces this behavior — the model gets positive signal for checking its facts before stating them.

Positive hedged — The model expressed appropriate uncertainty ("I believe", "as far as I know") instead of stating something as fact. The training pair rewards this behavior.

Positive cited — The model included a source or reference for its claim. The training pair rewards citing evidence.

How the Pipeline Works

TruthGuard (runtime)          Veritas (training)              Fine-tuning (external)
┌──────────────────┐     ┌─────────────────────┐     ┌──────────────────────────┐
│ AI generates     │     │ Compile blocked      │     │ Feed JSONL into:         │
│ answer           │     │ claims + verified    │     │ - OpenAI fine-tuning API │
│       ↓          │     │ facts + hedged       │     │ - HuggingFace/Unsloth   │
│ Check for        │     │ responses + cited    │     │ - Any JSONL-compatible   │
│ confidence       │────→│ answers into JSONL   │────→│   LoRA trainer          │
│ markers          │     │ training pairs       │     │       ↓                  │
│       ↓          │     │                      │     │ Model learns to prefer   │
│ Block or Allow   │     │ veritas export       │     │ truthful responses       │
│       ↓          │     │                      │     │                          │
│ Log to SQLite    │     │                      │     │                          │
└──────────────────┘     └─────────────────────┘     └──────────────────────────┘

Configuration

v = Veritas(
    db_path="truth_guard.db",   # Path to TruthGuard's SQLite database
    fact_ttl_days=7,            # Verified fact TTL in days (default: 7)
)

# Export with minimum pair threshold
result = v.export(
    output_path="training_data.jsonl",
    min_pairs=10,               # Won't export unless you have at least this many pairs
)

Dependencies

None — Veritas bundles its own copies of TruthGuard and LoRAExporter. Pure Python stdlib only.

Changelog

0.1.1

Security audit patch:

  • TruthGuard: Fact cache lookup now splits answers into sentences before hashing (matches SovereignShield's behavior — individual facts are cached, not full answer blobs).
  • LoRAExporter: _hedge_claim() no longer crashes on empty strings (IndexError guard added).

0.1.0

  • Initial release.

License

BSL 1.1 — See LICENSE

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

veritas_truth_adapter-0.1.1.tar.gz (19.2 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

veritas_truth_adapter-0.1.1-py3-none-any.whl (19.2 kB view details)

Uploaded Python 3

File details

Details for the file veritas_truth_adapter-0.1.1.tar.gz.

File metadata

  • Download URL: veritas_truth_adapter-0.1.1.tar.gz
  • Upload date:
  • Size: 19.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.10

File hashes

Hashes for veritas_truth_adapter-0.1.1.tar.gz
Algorithm Hash digest
SHA256 b12c13653e513bb36837fffcf87a71af012bd82a8e9cc086c7d93b9f6d437101
MD5 6b7551f54dfcc0b490f5030be094f821
BLAKE2b-256 2b11cff50485deb089fa4cfe082db734f533d7627b2f1c129fee7d31e65ed4c9

See more details on using hashes here.

File details

Details for the file veritas_truth_adapter-0.1.1-py3-none-any.whl.

File metadata

File hashes

Hashes for veritas_truth_adapter-0.1.1-py3-none-any.whl
Algorithm Hash digest
SHA256 86bb8d01f15ec9d235bb1cc2af23841fdcd11d26791c219d1dd95b8af3800c6d
MD5 e45c1a00ea7775bf9e9e71d273b09bb2
BLAKE2b-256 7a520bab4cda9ae8554872d226128f96aabecc1aa1d3e3f31f0006402a1630f1

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page