CLI tool for running LLM drift detection evaluations

pramana  ·  प्रमाण

Track whether LLM API outputs stay consistent over time.

Crowdsourced drift detection for LLM APIs. Run reproducible evals, compare results over time, catch silent model changes.

License: MIT


The Problem

When you call gpt-5 or claude-sonnet-4-6 today, you might get different behavior than yesterday. Providers update, fine-tune, and swap models behind stable identifiers. These changes are invisible to the caller.

There's no standard way to notice — let alone measure — these changes.

The Fix

git clone https://github.com/syd-ppt/pramana && cd pramana
uv pip install -e ".[dev]"
pramana run --tier cheap --model gpt-5.2
Running cheap suite against gpt-5.2...
✓ 10/10 passed
Pass rate: 100.0%

Same prompts. Same parameters. Deterministic where the provider allows it. Compare across runs and users.
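
As a rough illustration of comparing results across runs, the sketch below groups outputs by prompt and flags prompts whose outputs differ. The `find_drift` helper, the prompt IDs, and the dict layout are hypothetical and are not pramana's actual results format.

```python
from collections import defaultdict


def find_drift(runs):
    """Return the prompt IDs whose outputs differ across runs.

    `runs` is a list of dicts mapping a prompt ID to the raw model output.
    This layout is an assumption for illustration only.
    """
    outputs_by_prompt = defaultdict(set)
    for run in runs:
        for prompt_id, output in run.items():
            outputs_by_prompt[prompt_id].add(output)
    # A prompt has drifted if more than one distinct output was observed.
    return sorted(pid for pid, outs in outputs_by_prompt.items() if len(outs) > 1)


# Two hypothetical runs of the same prompts on different days.
monday = {"reasoning-01": "42", "factual-01": "Paris"}
tuesday = {"reasoning-01": "42", "factual-01": "Paris, France"}

print(find_drift([monday, tuesday]))  # ['factual-01']
```

Exact string comparison only makes sense where the provider honors deterministic parameters; for other providers a looser similarity measure would likely be needed.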


Usage

# See all supported models
pramana models

# Run evals (auto-detects provider from model name)
export OPENAI_API_KEY=sk-...
pramana run --tier cheap --model gpt-4o

# Aliases work too
pramana run --tier cheap --model opus

# Submit to the community dashboard
pramana submit results.json

Tiers:

Tier           Tests  Purpose
cheap          10     Smoke test, CI gates
moderate       25     Regular monitoring
comprehensive  75     Full evaluation

All tiers cover 6 categories: reasoning, factual, instruction following, coding, safety, creative.
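
For the CI-gate use of the cheap tier, one possible approach is to parse the "Pass rate" line from the CLI output shown earlier and fail the build below a threshold. This is a minimal sketch: `parse_pass_rate` and `gate` are hypothetical helpers, and it assumes the output keeps the `Pass rate: 100.0%` format.

```python
import re

# Matches lines such as "Pass rate: 100.0%" from the sample output.
PASS_RATE_RE = re.compile(r"Pass rate:\s*([0-9]+(?:\.[0-9]+)?)%")


def parse_pass_rate(cli_output):
    """Extract the pass rate (as a float percentage) from captured CLI output."""
    match = PASS_RATE_RE.search(cli_output)
    if match is None:
        raise ValueError("no 'Pass rate' line found in output")
    return float(match.group(1))


def gate(cli_output, threshold=100.0):
    """Return an exit code: 0 if the pass rate meets the threshold, else 1."""
    return 0 if parse_pass_rate(cli_output) >= threshold else 1


sample = "Running cheap suite against gpt-5.2...\n✓ 10/10 passed\nPass rate: 100.0%\n"
print(gate(sample))  # 0
```

In a pipeline, the text passed to `gate` could come from `subprocess.run([...], capture_output=True, text=True).stdout`, with the result handed to `sys.exit`.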


Providers

Provider   Temperature  Seed         Reproducibility
OpenAI     ✅ Enforced   ✅ Enforced   High
Anthropic  ✅ Enforced   ❌ Ignored    Low
Google     ✅ Enforced   ✅ Enforced   Medium

For scientific drift detection, use the OpenAI API with explicit API keys. See REPRODUCIBILITY.md.
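
The table can be read as a rule for building request parameters: always pin the temperature, and send a seed only where it is honored. The sketch below illustrates that idea; `SEED_SUPPORTED` and `build_params` are hypothetical names, not pramana's internals, and the defaults mirror the `temperature=0.0, seed=42` values the project documents.

```python
# Seed support per provider, as listed in the Providers table.
SEED_SUPPORTED = {"openai": True, "anthropic": False, "google": True}


def build_params(provider, temperature=0.0, seed=42):
    """Build sampling parameters, omitting the seed where it is ignored."""
    if provider not in SEED_SUPPORTED:
        raise ValueError(f"unknown provider: {provider}")
    params = {"temperature": temperature}
    if SEED_SUPPORTED[provider]:
        params["seed"] = seed
    return params


print(build_params("openai"))     # {'temperature': 0.0, 'seed': 42}
print(build_params("anthropic"))  # {'temperature': 0.0}
```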


How It Works

You run pramana ──► Fixed prompts hit the API ──► Results hashed & stored
                                                         │
Other users run pramana ──► Same prompts ──► Results compared
                                                         │
                                              Drift detected via
                                              output consistency tracking
  • Content-addressable hashing: SHA-256 of (model, prompt, output) for deduplication
  • Deterministic parameters: temperature=0.0, seed=42 set by default (enforced where the provider honors them)
  • No normalization layer: raw API responses, not filtered through LiteLLM
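
The content-addressable hashing idea can be sketched with Python's standard `hashlib`. The serialization used here (a compact JSON array) is an assumption chosen to keep field boundaries unambiguous; pramana's actual encoding may differ, and `content_hash` is a hypothetical name.

```python
import hashlib
import json


def content_hash(model, prompt, output):
    """Return a SHA-256 hex digest of the (model, prompt, output) triple."""
    # A JSON array keeps the three fields distinct, so ("ab", "c", ...) and
    # ("a", "bc", ...) cannot collapse into the same byte string.
    payload = json.dumps(
        [model, prompt, output], ensure_ascii=False, separators=(",", ":")
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


a = content_hash("gpt-4o", "What is 2+2?", "4")
b = content_hash("gpt-4o", "What is 2+2?", "4")
c = content_hash("gpt-4o", "What is 2+2?", "Four")

print(a == b, a == c)  # True False
```

Identical triples produce identical digests, which is what makes deduplication possible; any change in the output yields a different digest.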

Authentication (Optional)

pramana login          # GitHub/Google OAuth
pramana whoami         # Check status
pramana delete         # GDPR: delete all your data

No login required to run evals or submit results. Auth enables personalized tracking.


Development

git clone https://github.com/syd-ppt/pramana && cd pramana
uv pip install -e ".[dev]"
pytest tests/

Backend: pramana-api · Dashboard: pramana.pages.dev


Contributing

  1. Add test cases — append to suites/v1.0/{tier}.jsonl
  2. Add providers — subclass BaseProvider in src/pramana/providers/
  3. Improve assertions — new types in assertions.py

See CONTRIBUTING.md.


pramana (प्रमाण) — Sanskrit for proof, evidence, valid knowledge

Docs · Dashboard · Issues


Download files

Source Distribution

pramana_ai-0.1.0.tar.gz (165.5 kB)

Uploaded Source

Built Distribution

pramana_ai-0.1.0-py3-none-any.whl (35.5 kB)

Uploaded Python 3

File details

Details for the file pramana_ai-0.1.0.tar.gz.

File metadata

  • Download URL: pramana_ai-0.1.0.tar.gz
  • Upload date:
  • Size: 165.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.6.14

File hashes

Hashes for pramana_ai-0.1.0.tar.gz
Algorithm Hash digest
SHA256 81376dad3979d8c3667ae1b3560d3a3170017462ba32f7a17cea67562d9a9f31
MD5 1fa05465b348bfd035432f0f58bd8fda
BLAKE2b-256 11b4b9905dd16ce35dc03301599d5468f3516788ebb009589e8cc17f71ea2432

File details

Details for the file pramana_ai-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: pramana_ai-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 35.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.6.14

File hashes

Hashes for pramana_ai-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 d28f46e00ca1ee5d7c2349792c0ab47a2cfaf059c74f77394f999e5b73af0018
MD5 4a9143b0ba551d84fd9b285e81224a14
BLAKE2b-256 32dfa5245888f1f5f2af7fa74d85462a7ed0e3da3eb37d9bac20d0cd4f3cc2a5
