promptry

Regression protection for LLM pipelines

PyPI · npm · CI · Python 3.10+ · License: MIT

Sentry for prompts. Sentry catches when your code breaks. promptry catches when your prompts break — versions them, runs eval suites in CI, and flags regressions or drift against a baseline. Local-first. No SaaS.

from promptry import track, suite, assert_semantic

# track() content-hashes your prompt and stores a new version if it changed
prompt = track(system_prompt, "rag-qa")
response = llm.chat(system=prompt, ...)

# suites are regular Python functions. run them via CLI or in CI.
@suite("rag-regression")
def test_quality():
    response = my_pipeline("What is photosynthesis?")
    assert_semantic(response, "Converts light into chemical energy")

When a suite regresses against its baseline, promptry reports what changed:

Overall score: 0.910 -> 0.720  REGRESSION

Probable cause:
  -> Prompt changed (v3 -> v4)
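A hint like this can be produced by diffing the metadata stored with each run. The sketch below is a hypothetical reconstruction of that logic (the field names and helper are assumptions, not promptry's actual implementation):

```python
# Hypothetical root-cause hinting: compare the tracked inputs of the
# current run against the baseline run and report whatever changed.
def probable_cause(current: dict, baseline: dict) -> list[str]:
    hints = []
    if current.get("prompt_version") != baseline.get("prompt_version"):
        hints.append(
            f"Prompt changed (v{baseline['prompt_version']} -> v{current['prompt_version']})"
        )
    if current.get("model") != baseline.get("model"):
        hints.append(f"Model changed ({baseline['model']} -> {current['model']})")
    return hints or ["No tracked inputs changed; likely model nondeterminism"]

print(probable_cause(
    {"prompt_version": 4, "model": "gpt-4o-mini"},
    {"prompt_version": 3, "model": "gpt-4o-mini"},
))  # ['Prompt changed (v3 -> v4)']
```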

Install

pip install promptry                       # core
pip install promptry[semantic]             # + semantic assertions (sentence-transformers)
pip install promptry[dashboard]            # + web dashboard
pip install promptry[semantic,dashboard]   # everything

Quick start

promptry init                              # scaffold project + starter eval
promptry run smoke-test --module evals     # run it
PASS test_basic_quality (142ms)
  semantic (0.891) ok

Overall: PASS  score: 0.891
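The `semantic (0.891)` score is the kind of number an embedding-similarity check produces. A minimal sketch of cosine-similarity scoring, using toy vectors in place of the sentence-transformers embeddings the `[semantic]` extra would supply:

```python
import math

# Cosine similarity between two embedding vectors: 1.0 means identical
# direction, 0.0 means orthogonal (unrelated) meaning.
def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

response_vec  = [0.82, 0.41, 0.33]   # stand-in embedding of the response
reference_vec = [0.79, 0.48, 0.30]   # stand-in embedding of the reference

score = cosine(response_vec, reference_vec)
assert score > 0.85   # an assertion passes when the score clears a threshold
```

Real embeddings have hundreds of dimensions, but the pass/fail mechanics are the same: embed both texts, compare, threshold.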

Features

| Feature | What it does |
| --- | --- |
| Prompt versioning | Content-hashed, automatic dedup |
| Eval suites | Semantic, schema, LLM-as-judge, JSON, regex, grounding assertions |
| Regression detection | Compare against baselines, get root-cause hints |
| Drift detection | Catch slow quality degradation over time |
| Model comparison | Statistical comparison against historical baseline (not just snapshots) |
| Cost tracking | Token usage and cost per prompt, aggregated reports |
| Safety templates | 25 starter jailbreak / injection / PII tests; add your own |
| MCP server | Expose everything as tools for Claude, Cursor, VS Code, etc. |
| Dashboard | Web UI for eval history, prompt diffs, model comparison, cost |
| JS/TS client | Ship prompt events from frontend/Node apps |

Dashboard

pip install promptry[dashboard]
promptry dashboard

Dashboard views: Overview, Suite Detail, Prompts, Models, Cost.

How it differs

| | Promptfoo | DeepEval | RAGAS | LangSmith | promptry |
| --- | --- | --- | --- | --- | --- |
| Language | TypeScript | Python | Python | Python + JS | Python + JS |
| Local-first | Yes | Cloud push | Yes | SaaS only | SQLite |
| Prompt versioning | Via git + YAML | No | No | Prompt Hub | Automatic |
| Drift over time | No | No | No | Dashboards | Regression window |
| Root cause hints | No | No | No | No | Yes |
| Safety / red-team | Yes | Yes | No | No | 25 starters |
| MCP server | Plugin | Partial | No | No | Native |
| Vendor | OpenAI-owned | Independent | Independent | LangChain | Independent |
| Cost | Free | Freemium | Free | Freemium | Free |

Honest caveats: Promptfoo has more assertion types and a larger red-team corpus. RAGAS has the gold-standard RAG metrics (faithfulness, context precision, answer relevancy). LangSmith has better multi-user dashboards and deeper LangChain integration. promptry's niche is the combo of local SQLite + automatic versioning + CI-native + MCP server in one Python-first package.

GitHub Action

Run eval suites in CI with one line. On pull requests it posts (or updates) a single comment summarizing the eval: overall score, pass/fail counts, and any regressed tests vs. the previous run. View on Marketplace.

# .github/workflows/eval.yml
name: Eval
on: [push, pull_request]
jobs:
  eval:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      pull-requests: write  # required for PR comments
    steps:
      - uses: actions/checkout@v4
      - uses: bihanikeshav/promptry@v0.6.0
        with:
          suite: rag-regression
          module: evals
          compare: prod  # optional — compare against baseline

Example PR comment on a regression:

## promptry eval: rag-regression

| | Current | Baseline | Delta |
|---|---|---|---|
| Overall score | 0.891 | 0.910 | -0.019 |
| Passed | 8/10 | 9/10 | -1 |
| Status | REGRESSED | PASS | |

**Regressions:**
- `test_photosynthesis_answer`: semantic 0.89 -> 0.72 (-0.17)
- `test_schema_validation`: passed -> **failed**

_Generated by [promptry](https://github.com/bihanikeshav/promptry)_

Subsequent pushes edit the same comment instead of spamming new ones.
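The usual way an action keeps a single comment current is to tag it with a hidden HTML marker and update in place whenever the marker is found. A sketch of that pattern (assumed mechanics; the marker string and helper are hypothetical, not the action's actual code):

```python
# Find-or-update ("upsert") pattern for a bot comment: an invisible HTML
# comment identifies our comment among all PR comments.
MARKER = "<!-- promptry-eval -->"  # hypothetical marker string

def upsert_comment(existing: list[dict], body: str) -> list[dict]:
    body = f"{MARKER}\n{body}"
    for comment in existing:
        if comment["body"].startswith(MARKER):
            comment["body"] = body          # update: PATCH the old comment
            return existing
    existing.append({"body": body})         # create: POST a new comment
    return existing

comments: list[dict] = []
upsert_comment(comments, "Overall: 0.91 PASS")
upsert_comment(comments, "Overall: 0.89 REGRESSED")
assert len(comments) == 1                   # still a single comment
```

Against the real GitHub REST API, the loop would page through the PR's issue comments and issue a PATCH or POST accordingly; the list here stands in for that.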

| Input | Required | Default | Description |
| --- | --- | --- | --- |
| `suite` | Yes | | Eval suite name |
| `module` | Yes | | Python module containing the suite |
| `compare` | No | | Baseline tag to compare against |
| `python-version` | No | `3.12` | Python version |
| `extras` | No | `semantic` | pip extras to install |
| `pr-comment` | No | `true` | Post/update a PR comment with results |
| `github-token` | No | `${{ github.token }}` | Token used to post PR comments |

MCP server

claude mcp add promptry -- promptry mcp    # Claude Code

Works with Claude Desktop, Cursor, Windsurf, VS Code. See full setup.

Documentation

The full guide covers all assertions, cost tracking, model comparison, safety templates, notifications, storage modes, JS client, CLI reference, MCP setup, and config options.

Honest caveats

  • Early-stage. v0.7, solo-maintained, small user base. API is stable but bus-factor is one. Issues welcome.
  • "No API keys" applies to the framework only. SQLite storage and the CLI need nothing. assert_llm, assert_grounded, and cost tracking all need your own LLM provider key.
  • Drift detection is a rolling-window regression on scores. Works for steady degradation over a configurable window (default 30 runs). It is not a formal hypothesis test — see drift detection docs for exactly what it does and does not do.
  • Safety templates are starters, not comprehensive coverage. 25 curated prompts across 6 categories. For serious red-teaming look at garak or PyRIT. Bring your own templates via templates.toml.
  • Cost tracking uses hardcoded rate tables. Fine for rough estimates; won't reflect batching discounts, prompt caching, or provider price changes. Reconcile against invoices for finance.
  • Auto-instrumentation is opt-in. promptry.integrations.openai and .litellm wrap clients automatically; otherwise you add track() manually. Explicit by default.
  • No hosted multi-user UI. For that, look at LangSmith or Arize.
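The rolling-window regression behind drift detection can be pictured as fitting a least-squares slope to the last N overall scores and flagging a sustained negative trend. A sketch under that assumption (promptry's exact thresholds and method may differ; see its drift docs):

```python
# Least-squares slope of the last `window` scores against run index.
# A clearly negative slope suggests steady degradation rather than noise.
def drift_slope(scores: list[float], window: int = 30) -> float:
    recent = scores[-window:]
    n = len(recent)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(recent) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, recent))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var  # score change per run

# Ten runs drifting from 0.92 toward 0.83: the slope is clearly negative.
scores = [0.92 - 0.01 * i for i in range(10)]
assert drift_slope(scores) < -0.005
```

A single bad run barely moves the slope, which is what lets a windowed fit separate drift from ordinary run-to-run noise.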

License

MIT

Download files

Download the file for your platform.

Source Distribution

promptry-0.7.0.tar.gz (297.1 kB)

Uploaded Source

Built Distribution


promptry-0.7.0-py3-none-any.whl (266.9 kB)

Uploaded Python 3

File details

Details for the file promptry-0.7.0.tar.gz.

File metadata

  • Download URL: promptry-0.7.0.tar.gz
  • Size: 297.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for promptry-0.7.0.tar.gz
| Algorithm | Hash digest |
| --- | --- |
| SHA256 | `dc2a37aaa75392b51612bb5e67a6ce79d881e639df23fd8f12a079a9c282e2f6` |
| MD5 | `c7ccac7e6e93abe810a1016fb0772ee3` |
| BLAKE2b-256 | `74fa93dab82613d0f4166d549a647404d80b9e3e523a5bb1afb242df9f89ab81` |


Provenance

The following attestation bundles were made for promptry-0.7.0.tar.gz:

Publisher: publish-pypi.yml on bihanikeshav/promptry

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file promptry-0.7.0-py3-none-any.whl.

File metadata

  • Download URL: promptry-0.7.0-py3-none-any.whl
  • Size: 266.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for promptry-0.7.0-py3-none-any.whl
| Algorithm | Hash digest |
| --- | --- |
| SHA256 | `e0bf986cbf67e28d18bfe16f6433f473474c8755b39e417cac341e6f76d5b871` |
| MD5 | `dcba3a4a0c5e6086bdf0e8f540eb6a2f` |
| BLAKE2b-256 | `bac1871eb6c1382b484d04febac16cedf499276f12b6a9d7fc128f1cb492e164` |


Provenance

The following attestation bundles were made for promptry-0.7.0-py3-none-any.whl:

Publisher: publish-pypi.yml on bihanikeshav/promptry

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
