An agent regression firewall: replay saved agent traces and flag regressions by checking requirements, not text diffs. PASS / FAIL / UNCERTAIN.

Project description

reqfence

An agent regression firewall. When you change a prompt, model, or tool, reqfence replays saved agent traces and flags regressions by checking whether outputs still satisfy their requirements — not by text-diffing. Every check returns PASS / FAIL / UNCERTAIN.

reqfence is a standalone package. It does not depend on ariadx or fie-sdk: the requirement critic and trace schema are vendored and dependency-cleaned. This release is Milestone 1 (the CLI).

Why two tiers

The Milestone 0 derisking experiment (🟢 GREEN) confirmed that a text diff can't tell a harmless reword from a confidently wrong answer. reqfence therefore checks developer-declared requirements in two tiers. Each requirement has exactly one owning tier, determined by whether it can be checked deterministically:

Tier	Owns	Role
Deterministic (`checks.py`)	every checkable item (JSON-valid, field-present, tool-called, word-count, …)	Primary hard gate, ~100% precision by construction
Semantic (`semantic.py`)	only the uncheckable items (factual correctness)	Catches confidently-wrong outputs; abstains (UNCERTAIN) when the judge isn't unanimous

The semantic judge is never asked to grade a checkable item — that alone removed the false alarms an earlier "grade everything" design produced (the LLM can't reliably count words). See RESULTS.md.
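The abstention rule for the semantic tier can be sketched as a unanimity vote. This is a minimal illustration, not the code in semantic.py: the `Verdict` enum, the `judge_by_unanimity` name, and the choice to abstain on an empty vote list are all assumptions made for the example.

```python
from enum import Enum


class Verdict(str, Enum):
    PASS = "PASS"
    FAIL = "FAIL"
    UNCERTAIN = "UNCERTAIN"


def judge_by_unanimity(votes: list[Verdict]) -> Verdict:
    """Return the shared verdict only when every vote agrees; otherwise abstain."""
    if not votes:
        # Assumption: with no judge votes at all, abstain rather than guess.
        return Verdict.UNCERTAIN
    first = votes[0]
    if all(vote == first for vote in votes):
        return first
    # Any disagreement between judge samples means UNCERTAIN.
    return Verdict.UNCERTAIN


unanimous = judge_by_unanimity([Verdict.FAIL, Verdict.FAIL, Verdict.FAIL])
split = judge_by_unanimity([Verdict.PASS, Verdict.FAIL, Verdict.PASS])
print(unanimous.value, split.value)
```

Requiring agreement trades recall for precision: a split vote produces UNCERTAIN instead of a possibly false FAIL.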

Final verdict (engine.py, schema.combine): each requirement resolves to one PASS/FAIL/UNCERTAIN; the candidate FAILs if any requirement fails, PASSes iff all pass, else UNCERTAIN. A semantic UNCERTAIN never fails the build; a deterministic FAIL always does.
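The verdict rule above can be written in a few lines. This is a sketch of the stated rule only, not the package's actual schema.combine; verdicts are plain strings here, and treating an empty requirement list as UNCERTAIN is an assumption.

```python
def combine(verdicts: list[str]) -> str:
    """Fold per-requirement verdicts into one candidate verdict."""
    # Any single failing requirement fails the candidate.
    if any(v == "FAIL" for v in verdicts):
        return "FAIL"
    # PASS only when there is at least one requirement and all of them pass.
    if verdicts and all(v == "PASS" for v in verdicts):
        return "PASS"
    # Otherwise at least one requirement abstained (or none were declared).
    return "UNCERTAIN"


print(combine(["PASS", "PASS"]))
print(combine(["PASS", "UNCERTAIN"]))
print(combine(["UNCERTAIN", "FAIL"]))
```

Because FAIL is tested first, an abstaining semantic requirement can never mask a deterministic failure.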

Install

pip install -e ".[groq]"     # or ".[anthropic]"; core installs with just pydantic+click

Python ≥ 3.11 (uses stdlib tomllib).

The three commands

`reqfence init`

Scaffolds reqfence.toml + empty fixtures.jsonl / candidates.jsonl.

`reqfence record` — save a baseline

Stores a frozen baseline trace + its developer-declared requirement checklist. Ingests an already-captured trace (it does not execute an agent):

# requirements.json: [{"id":"json","desc":"valid JSON","check":{"type":"valid_json"}}, ...]
reqfence record --id weather --task "Return weather as JSON" \
  --requirements requirements.json --from-trace baseline_trace.json
# or convert a framework trace:
reqfence record --id t1 --task "..." --requirements reqs.json --from-langgraph messages.json
reqfence record --id t1 --task "..." --requirements reqs.json --from-openai steps.json --openai-format run_steps
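A requirements file for `--requirements` could be generated with a short script like the one below. The `id` / `desc` / `check.type` shape is taken from the comment in the example above; the particular requirements, their descriptions, and the assumption that these three check types need no extra parameters are illustrative only.

```python
import json
from pathlib import Path

# Hypothetical checklist for the "weather" baseline. The check type names come
# from the catalog; the ids and descriptions are made up for illustration.
requirements = [
    {"id": "json", "desc": "valid JSON", "check": {"type": "valid_json"}},
    {"id": "no-tool-error", "desc": "no tool call returned an error",
     "check": {"type": "no_tool_error"}},
    {"id": "facts", "desc": "reported weather is factually correct",
     "check": {"type": "semantic"}},
]


def write_requirements(path: str) -> Path:
    """Serialize the checklist as a JSON array and return the written path."""
    target = Path(path)
    target.write_text(json.dumps(requirements, indent=2), encoding="utf-8")
    return target


written = write_requirements("requirements.json")
print(f"wrote {len(requirements)} requirements to {written}")
```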

`reqfence check` — gate a change

Replays candidate traces against baselines, runs both tiers, prints a per-requirement table, and exits non-zero if any FAIL (UNCERTAIN does not):

reqfence check                       # uses paths from reqfence.toml
reqfence check --no-semantic         # deterministic gate only (no API key needed)

The semantic tier runs only when it is enabled and a key is available (GROQ_API_KEY or ANTHROPIC_API_KEY). Keys are read from environment variables; for convenience, check also loads a nearby .env file, but it never prints or writes its contents.
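The gating behaviour of check (a per-requirement table, non-zero exit only on FAIL) can be sketched as follows. The table layout, the `render_and_exit_code` name, and the specific exit code 1 are assumptions; the source only states that the exit status is non-zero when any requirement FAILs.

```python
def render_and_exit_code(results: dict[str, str]) -> tuple[str, int]:
    """Build a plain-text verdict table and the process exit code."""
    width = max((len(req_id) for req_id in results), default=0)
    rows = [f"{req_id:<{width}}  {verdict}" for req_id, verdict in results.items()]
    # Only FAIL blocks the build; UNCERTAIN is reported but exits 0.
    code = 1 if "FAIL" in results.values() else 0
    return "\n".join(rows), code


table, code = render_and_exit_code(
    {"json": "PASS", "word-limit": "FAIL", "facts": "UNCERTAIN"}
)
print(table)
print("exit code:", code)
```

In a CI script, the returned code would typically be passed to `sys.exit` so the pipeline step fails only on a hard regression.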

Requirement checks (catalog)

Core six (the reliable gate, unit-tested for precision): valid_json, contains_substring (+ regex), max_words, contains_field, tool_called, no_tool_error. Extended (thin, tested): min_words, min_sources, json_array_len, file_written. Special: semantic — always abstains deterministically; only the LLM tier judges it.
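To make the catalog concrete, here is a sketch of what three of the deterministic checks and the special semantic check might look like. These are not the functions in checks.py: the function names, signatures, and parameter names (`limit`, `field`) are hypothetical, and the word count uses simple whitespace splitting.

```python
import json


def check_valid_json(output: str) -> str:
    """PASS when the output parses as JSON."""
    try:
        json.loads(output)
    except json.JSONDecodeError:
        return "FAIL"
    return "PASS"


def check_max_words(output: str, limit: int) -> str:
    """PASS when the whitespace-separated word count is within the limit."""
    return "PASS" if len(output.split()) <= limit else "FAIL"


def check_contains_field(output: str, field: str) -> str:
    """PASS when the output is a JSON object with the given top-level key."""
    try:
        data = json.loads(output)
    except json.JSONDecodeError:
        return "FAIL"
    return "PASS" if isinstance(data, dict) and field in data else "FAIL"


def check_semantic(output: str) -> str:
    """The deterministic tier always abstains; only the LLM tier judges this."""
    return "UNCERTAIN"


sample = '{"city": "Oslo", "temp_c": 4}'
print(check_valid_json(sample), check_contains_field(sample, "temp_c"))
print(check_max_words("three short words", limit=2), check_semantic(sample))
```

Each deterministic check returns only PASS or FAIL, which is what lets the first tier act as a hard gate.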

Fixtures format

Versioned JSONL, one record per line (fixtures.jsonl = baselines + checklists, candidates.jsonl = labeled candidate traces). The Milestone 0 benchmark is migrated in under fixtures/ via python scripts/migrate_m0.py. The format is a first-class artifact designed to grow.
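Reading a one-record-per-line file takes only a few lines of Python. The sketch below shows the general JSONL pattern; the record fields in the sample (`version`, `id`, `task`, `requirements`) are guesses based on the record flags above, not the documented fixtures schema.

```python
import json
from typing import Iterator


def read_jsonl(text: str) -> Iterator[dict]:
    """Yield one parsed record per non-empty line, reporting bad lines by number."""
    for line_no, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        try:
            yield json.loads(line)
        except json.JSONDecodeError as exc:
            raise ValueError(f"line {line_no}: invalid JSON") from exc


# Hypothetical fixture content; the real field names may differ.
sample = (
    '{"version": 1, "id": "weather", "task": "Return weather as JSON", '
    '"requirements": [{"id": "json", "desc": "valid JSON", '
    '"check": {"type": "valid_json"}}]}\n'
    "\n"
    '{"version": 1, "id": "t1", "task": "...", "requirements": []}\n'
)

records = list(read_jsonl(sample))
print([record["id"] for record in records])
```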

Tests

pip install -e ".[dev]" && pytest      # 26 tests: checks, union/abstention, fixtures, CLI

Project details

This version

0.1.0

Jul 3, 2026

Download files

Source Distribution

reqfence-0.1.0.tar.gz (29.5 kB)

Uploaded Jul 3, 2026 Source

Built Distribution

reqfence-0.1.0-py3-none-any.whl (26.7 kB)

Uploaded Jul 3, 2026 Python 3

File details

Details for the file reqfence-0.1.0.tar.gz.

File metadata

Download URL: reqfence-0.1.0.tar.gz
Upload date: Jul 3, 2026
Size: 29.5 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.0

File hashes

Hashes for reqfence-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`0119255eca8dd9480ecc3e8f7ce30b5b9d6f77673278c0343e33da779b261528`
MD5	`6271e34eaede0ffd80bc7505348b40c2`
BLAKE2b-256	`cd090cc4663a28982c54e18514fc8dca085e8160e1415652606348ec2cc689e0`

File details

Details for the file reqfence-0.1.0-py3-none-any.whl.

File metadata

Download URL: reqfence-0.1.0-py3-none-any.whl
Upload date: Jul 3, 2026
Size: 26.7 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.0

File hashes

Hashes for reqfence-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`397c16bbceed2ff7191355499df7e01485e13721a46cb9491d8b8330360065fc`
MD5	`e515c97ee8641a0683dae58fac7085c2`
BLAKE2b-256	`7a46e82bd8ff7a330e4ba33c754b93f887e92e778366953fe1cb12d39c65275c`
