Citation-verified RAG service with deterministic + semantic claim verification.

Axiom Engine

Citation-verified RAG with 6-tier confidence scoring.

Axiom Engine is a retrieval-augmented generation (RAG) service that verifies every cited claim before presenting an answer. Each claim receives a confidence tier from 1 to 6, based on a deterministic (exact-match) check followed by a semantic (LLM) check.

Install

pip install axiom-rag-engine

Or with uv:

uv add axiom-rag-engine

Quick start

From PyPI

# Set required env vars (or create a .env file)
export AXIOM_ENV=development
export TAVILY_API_KEY=your_key   # or use AXIOM_ALLOW_MOCK_SEARCH=true

# Start the server
axiom-rag-engine serve

# In another terminal — send a test query
axiom-rag-engine probe "What are solid-state batteries?"

# Check resolved configuration (secrets redacted)
axiom-rag-engine check-config

From source

git clone https://github.com/FurkhanShaikh/axiom-rag-engine.git
cd axiom-rag-engine
python tasks.py install          # scaffold .env + install deps via uv
# Edit .env — fill in TAVILY_API_KEY for live web search
python tasks.py run              # start FastAPI server at http://localhost:8000
python tasks.py probe "your question"

Configuration

All settings are controlled via environment variables (or a .env file). No code changes required. Run axiom-rag-engine check-config to see the full resolved configuration.

| Variable | Default | Description |
| --- | --- | --- |
| AXIOM_ENV | production | Runtime environment. Set to development to disable auth. |
| AXIOM_API_KEYS | (empty) | Comma-separated API keys. Required when env != development. |
| TAVILY_API_KEY | (empty) | Tavily search API key for live web retrieval. |
| AXIOM_DEFAULT_SYNTHESIZER_MODEL | claude-sonnet-4-5 | LiteLLM model ID for synthesis. |
| AXIOM_DEFAULT_VERIFIER_MODEL | gpt-4o-mini | LiteLLM model ID for semantic verification. |
| AXIOM_RATE_LIMIT | 20/minute | Rate limit per API key or IP. |
| AXIOM_CACHE_TTL_SECONDS | 300 | Response cache TTL. |
| AXIOM_REDIS_URL | (empty) | Optional Redis URL for distributed cache. |
| AXIOM_CORS_ORIGINS | (empty) | Comma-separated allowed CORS origins. |
| AXIOM_DOCS_ENABLED | true | Set false to disable /docs and /redoc. |
| AXIOM_SEMANTIC_VERIFICATION_ENABLED | true | Enable/disable Stage 2 semantic verification. |
| AXIOM_AUDIT_RETENTION | 0 | Retain the last N audit trails in memory for /v1/audits/{id}. |
| AXIOM_LOG_AUDIT_EVENTS | false | Emit each audit event as a structured log line. |
| LOG_FORMAT | text | json for structured log output. |

See .env.example for the full list with comments.
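As an illustration of how these variables resolve, the sketch below reads a few of them from the environment and falls back to the defaults listed in the table. The Settings dataclass, the load_settings helper, and the accepted boolean spellings are assumptions made for this example and are not the package's own code; axiom-rag-engine check-config remains the authoritative view.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for a few AXIOM_* settings (not the package's class)."""
    env: str
    api_keys: tuple[str, ...]
    rate_limit: str
    cache_ttl_seconds: int
    semantic_verification_enabled: bool


def _as_bool(raw: str) -> bool:
    # Assumption: common truthy spellings are accepted; the real parser may differ.
    return raw.strip().lower() in {"1", "true", "yes", "on"}


def load_settings(environ=None) -> Settings:
    """Resolve settings from a mapping, using the defaults documented in the table."""
    env = os.environ if environ is None else environ
    # AXIOM_API_KEYS is comma-separated; blank entries are dropped.
    keys = tuple(k.strip() for k in env.get("AXIOM_API_KEYS", "").split(",") if k.strip())
    return Settings(
        env=env.get("AXIOM_ENV", "production"),
        api_keys=keys,
        rate_limit=env.get("AXIOM_RATE_LIMIT", "20/minute"),
        cache_ttl_seconds=int(env.get("AXIOM_CACHE_TTL_SECONDS", "300")),
        semantic_verification_enabled=_as_bool(
            env.get("AXIOM_SEMANTIC_VERIFICATION_ENABLED", "true")
        ),
    )
```

Passing a plain dict instead of os.environ makes the helper easy to exercise in tests without touching the real environment.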

Architecture

retriever -> scorer -> ranker -> synthesizer -> verifier -+
   ^                    ^                                 |
   |                    +-- (rewrite loop) <--------------+  (Tier 4/5 & loop < max)
   +-- (re-retrieve) <------------------------------------+  (loop exhausted & retries left)

| Module | Responsibility |
| --- | --- |
| Retriever | Web search via Tavily, dedup, HTML strip, paragraph chunking |
| Scorer | Domain authority + content quality scoring (deterministic) |
| Ranker | BM25-based relevance ranking with quality blend |
| Synthesizer | LLM-powered answer generation with strict citation format |
| Verifier | Two-stage verification: mechanical (exact match) + semantic (LLM) |
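The two feedback edges in the diagram can be read as nested loops. The following is a minimal control-flow sketch under stated assumptions: run_pipeline, its callable parameters, and both limits are hypothetical names and values, the retriever, scorer, and ranker are collapsed into a single retrieve stub, and returning the last attempt once all retries are spent is a guess at the fallback behaviour rather than documented fact.

```python
from typing import Callable, Sequence

FAILING_TIERS = {4, 5}  # Misrepresented, Hallucinated


def run_pipeline(
    retrieve: Callable[[], list],
    synthesize: Callable[[list], str],
    verify: Callable[[str, list], Sequence[int]],
    max_rewrites: int = 2,    # assumed limit; the real default is not documented here
    max_retrievals: int = 1,  # assumed limit
):
    """Return (answer, tiers) after applying the rewrite and re-retrieve loops."""
    retrievals_left = max_retrievals
    while True:
        chunks = retrieve()  # retriever -> scorer -> ranker, collapsed into one stub
        rewrites = 0
        while True:
            answer = synthesize(chunks)
            tiers = verify(answer, chunks)
            if not FAILING_TIERS.intersection(tiers):
                return answer, list(tiers)  # every claim landed outside Tier 4/5
            if rewrites >= max_rewrites:
                break  # rewrite loop exhausted
            rewrites += 1
        if retrievals_left == 0:
            # Assumption: hand back the last attempt with its tiers attached.
            return answer, list(tiers)
        retrievals_left -= 1  # re-retrieve and start a fresh rewrite budget
```

Keeping the stages as injected callables lets the loop logic be tested with stubs, independently of any search backend or LLM.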

Verification tiers

| Tier | Label | Meaning |
| --- | --- | --- |
| 1 | Authoritative | Verified against official/primary source |
| 2 | Multi-Source | Verified against multiple independent domains |
| 3 | Model Assisted | Mechanically verified; semantic relied on model knowledge |
| 4 | Misrepresented | Quote exists but claim distorts context |
| 5 | Hallucinated | Quote not found in source chunk |
| 6 | Conflicted | Reserved for future contradiction detection |
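For client code that consumes tier numbers, an enum keeps the labels and values together. The sketch below is illustrative only: the Tier enum, mechanical_check, and needs_rewrite are hypothetical names, and the whitespace normalization in the exact-match check is an assumption about how Stage 1 might compare a quote with its source chunk.

```python
from enum import IntEnum


class Tier(IntEnum):
    """Tier numbers and labels as listed in the table above."""
    AUTHORITATIVE = 1
    MULTI_SOURCE = 2
    MODEL_ASSISTED = 3
    MISREPRESENTED = 4
    HALLUCINATED = 5
    CONFLICTED = 6  # reserved for future contradiction detection


def _normalize(text: str) -> str:
    # Assumption: runs of whitespace are collapsed before comparison.
    return " ".join(text.split())


def mechanical_check(quote: str, chunk: str) -> bool:
    """Stage 1 sketch: True when the quote appears verbatim in the chunk."""
    return _normalize(quote) in _normalize(chunk)


def needs_rewrite(tier: Tier) -> bool:
    """Tier 4 and Tier 5 are the ones that feed the rewrite loop."""
    return tier in (Tier.MISREPRESENTED, Tier.HALLUCINATED)
```

A quote that fails mechanical_check corresponds to Tier 5 (quote not found in source chunk); a quote that passes still needs the semantic stage before a final tier can be assigned.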

CLI reference

axiom-rag-engine serve [--host 0.0.0.0] [--port 8000] [--reload]
axiom-rag-engine probe "question" [--url URL] [--model MODEL] [--debug]
axiom-rag-engine check-config [--format text|json]
axiom-rag-engine audit <request_id> [--url URL] [--api-key KEY] [--json]

Operations

Runtime status

GET /v1/status returns a JSON snapshot of version, uptime, active policy, and configured backends. No secrets are exposed — API keys and Redis URLs are reported as booleans.

curl http://localhost:8000/v1/status | jq .

Combine with axiom-rag-engine check-config to see every AXIOM_* value and where it came from (env var, .env, or built-in default).

Audit trails

Every request emits a full audit trail from each graph node. Two ways to view them:

  1. Retained in-process — set AXIOM_AUDIT_RETENTION=200 to keep the last N trails in memory. Fetch any one by ID:

    # HTTP
    curl -H "X-API-Key: $KEY" http://localhost:8000/v1/audits/<request_id>
    
    # CLI (human-readable event log)
    axiom-rag-engine audit <request_id>
    
  2. Streamed to logs — set AXIOM_LOG_AUDIT_EVENTS=true together with LOG_FORMAT=json to emit one structured line per audit event, ready to forward to a log aggregator.

The retention store is process-local and bounded. For durable history, forward the log stream to your existing log aggregator.
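The HTTP lookup can also be scripted. This is a small sketch using only the standard library; build_audit_request and fetch_audit are hypothetical helper names, and the shape of the returned JSON is not documented here, so the result is handed back as parsed JSON without interpretation.

```python
import json
import urllib.request


def build_audit_request(base_url: str, request_id: str, api_key: str) -> urllib.request.Request:
    """Build the GET request for /v1/audits/<request_id> with the X-API-Key header."""
    url = f"{base_url.rstrip('/')}/v1/audits/{request_id}"
    return urllib.request.Request(url, headers={"X-API-Key": api_key})


def fetch_audit(base_url: str, request_id: str, api_key: str, timeout: float = 10.0):
    """Fetch one retained audit trail; raises urllib.error.HTTPError on a non-2xx reply."""
    req = build_audit_request(base_url, request_id, api_key)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)
```

With the server running and retention enabled, fetch_audit("http://localhost:8000", request_id, key) would mirror the curl call above.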

Metrics dashboard

GET /metrics exposes Prometheus metrics including axiom_pipeline_duration_seconds, per-node and per-model LLM latency histograms, per-model token + USD cost counters (axiom_llm_tokens_total, axiom_llm_cost_usd_total), tier assignment rates, cache hit ratio, and verification-degradation counters.

A ready-to-import Grafana dashboard lives at deploy/grafana/axiom-engine.json. The quickest way to see it wired up end-to-end is the docker-compose stack below.
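For quick checks without Grafana, the /metrics output can be read directly, since it uses the Prometheus text exposition format. The parser below is a deliberately simple sketch: parse_samples is a hypothetical helper, the SAMPLE text is made up for illustration, and the label names model and kind are assumptions, so check the real output for the labels actually emitted.

```python
import re

# One sample line: metric name, optional {labels}, then the value.
_SAMPLE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)'
)
_LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')

# Made-up sample; label names are assumptions, not taken from the real service.
SAMPLE = """\
# TYPE axiom_llm_cost_usd_total counter
axiom_llm_cost_usd_total{model="claude-sonnet-4-5"} 0.0004
axiom_llm_cost_usd_total{model="gpt-4o-mini"} 0.00002
axiom_llm_tokens_total{model="gpt-4o-mini",kind="prompt"} 161
"""


def parse_samples(text: str, metric: str) -> list:
    """Return (labels, value) pairs for every sample line of the given metric."""
    samples = []
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = _SAMPLE_RE.match(line)
        if match and match.group("name") == metric:
            labels = dict(_LABEL_RE.findall(match.group("labels") or ""))
            samples.append((labels, float(match.group("value"))))
    return samples


total_cost = sum(value for _, value in parse_samples(SAMPLE, "axiom_llm_cost_usd_total"))
```

For anything beyond a spot check, a maintained Prometheus client parser is a better choice than a hand-rolled regex.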

Token + cost accounting

Every response includes a usage block with call count, prompt/completion tokens, best-effort USD cost, and a per-model breakdown:

"usage": {
  "calls": 2,
  "prompt_tokens": 2661,
  "completion_tokens": 169,
  "total_tokens": 2830,
  "cost_usd": 0.00042,
  "by_model": {
    "claude-sonnet-4-5": {"calls": 1, "prompt_tokens": 2500, "completion_tokens": 150, "cost_usd": 0.00040}
  }
}

Cost is computed via litellm.completion_cost. Local models (Ollama) and untracked providers report 0.0. Cache hits return usage: null because the current request consumed no tokens.
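A client has to handle both shapes of the field: the usage object shown above and null on a cache hit. The helper below is a sketch of one way to do that; summarize_usage is a hypothetical name, and it relies only on the keys that appear in the example.

```python
from typing import Optional


def summarize_usage(usage: Optional[dict]) -> str:
    """Render a usage block as text; a None value means the response came from cache."""
    if usage is None:
        return "cache hit: no tokens consumed"
    lines = [
        f"{usage['calls']} calls, {usage['total_tokens']} tokens, ${usage['cost_usd']:.5f}"
    ]
    for model, stats in sorted(usage.get("by_model", {}).items()):
        # Per-model entries in the example carry prompt and completion counts only.
        tokens = stats["prompt_tokens"] + stats["completion_tokens"]
        lines.append(f"  {model}: {stats['calls']} calls, {tokens} tokens, ${stats['cost_usd']:.5f}")
    return "\n".join(lines)
```

Because cost is best-effort and reported as 0.0 for local or untracked models, a zero in the output does not necessarily mean the call was free.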

Quick health walk

# Liveness (process alive)
curl -fsS http://localhost:8000/health/live

# Readiness (engine compiled, keys + backend configured)
curl -fsS http://localhost:8000/health/ready

# Full operator snapshot
curl -fsS http://localhost:8000/v1/status | jq .
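The same walk can be scripted for a deployment check. The sketch below uses only the standard library; health_walk and http_status are hypothetical helpers, and treating any status other than 200 as unhealthy is an assumption modelled on the curl -f behaviour above.

```python
import urllib.error
import urllib.request
from typing import Callable

PROBES = ("/health/live", "/health/ready", "/v1/status")


def http_status(url: str, timeout: float = 5.0) -> int:
    """Return the HTTP status code for url, or 0 when the server is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        return exc.code  # the server answered, but with an error status
    except (urllib.error.URLError, OSError):
        return 0  # connection refused, DNS failure, timeout, and similar


def health_walk(base_url: str, fetch: Callable[[str], int] = http_status) -> dict:
    """Map each probe path to True when it answers 200."""
    base = base_url.rstrip("/")
    return {path: fetch(base + path) == 200 for path in PROBES}
```

The fetch parameter exists so the walk can be tested with a stub instead of a live server.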

Development

python tasks.py test             # unit tests (>=70% coverage required)
python tasks.py lint             # ruff + mypy
python tasks.py format           # auto-format
python tasks.py clean            # remove caches + venv

API

  • POST /v1/synthesize: Run the verification pipeline
  • GET /v1/status: Runtime status snapshot
  • GET /v1/audits/{id}: Fetch a retained audit trail
  • GET /health: Liveness probe
  • GET /health/ready: Readiness probe
  • GET /metrics: Prometheus metrics

See the interactive docs at http://localhost:8000/docs when the server is running.
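Calling the pipeline from Python comes down to a JSON POST. The request body schema is not described in this document, so the sketch below takes the payload from the caller; build_synthesize_request is a hypothetical helper, and sending the key in an X-API-Key header is an assumption carried over from the audit example. Consult /docs for the actual request fields.

```python
import json
import urllib.request
from typing import Optional


def build_synthesize_request(
    base_url: str, payload: dict, api_key: Optional[str] = None
) -> urllib.request.Request:
    """Build a JSON POST to /v1/synthesize; the payload fields are up to the caller."""
    headers = {"Content-Type": "application/json"}
    if api_key:
        # Assumption: same header as the audit endpoint; not needed in development mode.
        headers["X-API-Key"] = api_key
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/v1/synthesize",
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )
```

The request can then be sent with urllib.request.urlopen(req) and the response body parsed with json.load.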

Docker

The bundled docker-compose.yml brings up the full stack — Axiom, Ollama, Redis (cache backing store), Prometheus (scraping /metrics), and Grafana with the dashboard pre-provisioned:

docker compose up --build

Once the stack is healthy, each service is reachable on the port published for it in docker-compose.yml.

To run without the observability sidecars, comment out the redis, prometheus, and grafana services.

License

MIT
