ADRA — Adversarial Dev Review Agent: a deterministic-first, adversarial-validation engine for the software lifecycle (PR review, validation/refutation experiments, auto-docs, human escalation).
ADRA — Adversarial Dev Review Agent
A client-agnostic, deterministic-first, adversarial-validation engine that supports the software lifecycle: it reviews PRs, designs and runs validation/refutation experiments, writes documentation back, and escalates to a human exactly where a senior engineer would. Governed by LLM-as-Judge + a blocking adversarial critic, grounded by deterministic tools, with immutable provenance. Runs offline with no API key.
pip install adra · Python ≥ 3.11 · Apache-2.0 · status: v0.04.000
What's different
The AI-code-review market splits in two, and both miss the same spot:
- Reviewers (CodeRabbit, Greptile, Qodo, Korbit, Sourcery, Bito) feed linters into an LLM, but the model's prose is the verdict, so hallucinated and "consistently-stated-but-false" findings leak through; the deterministic signals are inputs, never the gate.
- Autonomous coders (Devin, OpenHands, SWE-agent, Sweep) write code and treat "tests pass" as success rather than adversarially trying to prove the change wrong.
ADRA occupies the gap: a deterministic spine (git / CI / static analysis / SQL probes) that grounds a blocking adversarial critic whose job is to refute, not bless, each artifact — every finding carrying its evidence, with disciplined human escalation when nothing deterministic backs the verdict. Existing tools generate opinions; ADRA generates proofs and refutations, and escalates when it can't.
The six capabilities
| Skill | What it does |
|---|---|
| `code_review` | Review a diff: language/leak scan + test-discoverability + exact CI command + semantic findings |
| `pr_eval` | Evaluate a PR: merge-base health → bundle validate → conformance → verdict + PR body |
| `experiment` | Hypothesis-driven validation experiment: SQL-warehouse probes + synthesis |
| `improve` | Minimum-functional improvement proposal (prune filler, smallest safe diff) |
| `document` | Turn a run record into a PR page / experiment page / methodology-history row |
| `decide` | Route analysis: candidate routes + trade-offs + recommendation (human-owned) |
Each skill is the same loop, differing only by its domain prompt and deterministic tools.
Why deterministic-first
Tools (git, the exact CI command, bundle validate, language scan, SQL probe) run
first and become both the grounding the model may not contradict and the evidence in
the provenance log. Because the deterministic floor carries the verdict, the whole loop
runs — and the test suite passes — offline with no API key. Connecting a real provider
adds the semantic layer on top.
Architecture
intake ─▶ plan ─▶ ground (deterministic tools) ─▶ generate ─▶ CRITIC ─┐
▲ revise ◀──────┘
└── accepted / escalate ─▶ artifacts + run record
- `adra/state.py`: the typed domain model (`Severity`, `Finding`, `ToolResult`, `CriticVerdict`, `RunState`). One contract end to end.
- `adra/rubric.py`: the shared adversarial rubric (criteria as typed data); drives both the deterministic critic and the critic prompt, so "what we check" never drifts.
- `adra/orchestrator.py`: the hand-rolled, framework-free state machine.
- `adra/critic.py`: deterministic red-team pass (rubric-driven) + LLM semantic attacks.
- `adra/judge.py`: rubric scoring with swap-and-average + reference anchoring.
- `adra/llm.py`: the tiny ADRA-owned `ChatModel` seam: `mock` (offline) or any real provider via pydantic-ai (`provider:model`, config-only). No LangChain/LangGraph.
- `adra/tools/`: each returns a `ToolResult` (git / CI / bundle / lang / discovery / sql).
- `adra/skills/`: the `Skill` base + the six skills.
- `adra/clients/`: client governance suites (the bundled fictional Northwind Data Platform); selectable via `ADRA_CLIENT_DIR`.
- `adra/provenance.py`: the immutable run record (the deep change-history layer).
- `cli/`: the `adra` command.
- `refs/`: annotated bibliography + papers.
- `docs/`: deep docs + engine ADRs (`docs/adr/`).
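
Swap-and-average is a common way to reduce position bias in pairwise LLM judging: the judge scores the pair in both orders and the two results are averaged. The sketch below assumes a pairwise judge that compares a candidate against a reference; how `adra/judge.py` actually scores is not shown here, and `swap_and_average`, `ScorePair`, and the toy `biased_judge` are hypothetical.

```python
from typing import Callable

# score_pair(first, second) returns the judge's preference for `first`, in [0, 1].
ScorePair = Callable[[str, str], float]


def swap_and_average(score_pair: ScorePair, candidate: str, reference: str) -> float:
    """Score the pair in both orders and average, cancelling first-position bias."""
    forward = score_pair(candidate, reference)          # candidate shown first
    backward = 1.0 - score_pair(reference, candidate)   # candidate shown second
    return (forward + backward) / 2.0


def biased_judge(first: str, second: str) -> float:
    """Toy judge: a fixed 'true' preference plus a 0.2 bonus for whatever is first."""
    true_preference = {"candidate": 0.6, "reference": 0.4}
    return min(1.0, true_preference[first] + 0.2)


if __name__ == "__main__":
    single = biased_judge("candidate", "reference")
    debiased = swap_and_average(biased_judge, "candidate", "reference")
    print(f"single pass: {single:.2f}, swap-and-average: {debiased:.2f}")
```

In this toy setup the single pass overstates the candidate because it was shown first, while the averaged score recovers the underlying preference.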
Quickstart (offline, no key)
python -m venv .venv && . .venv/bin/activate # Windows: .venv\Scripts\activate
pip install -e ".[dev]"
pytest -q # 11 passing, fully offline
python scripts/demo_offline.py # end-to-end demo, all six skills
Expected: the stale-base PR is blocked + escalated (12 commits behind, a notebook
deletion, a .yml → .yml.t rename dropping a bundle resource, bundle validate failing);
the language/leak review is blocked; the clean experiment / improve / document / decide
runs are accepted.
adra review path/to.diff --ci-command 'python -m coverage run -m unittest discover -s . -p "test*.py"'
adra decide "Raise the refresh cadence" "edit the shared CI template" "change it in the owning repo"
Enable a real provider
pip install -e ".[llm]" # pydantic-ai; the seam is provider-agnostic
export ANTHROPIC_API_KEY=... # or put it in .env (any provider's key works)
export ADRA_PROVIDER=anthropic # adding a provider is config only: ADRA_PROVIDER / ADRA_MODEL / ADRA_MODEL_<ROLE>
adra pr-eval --source task/123/x --repo /path/to/repo --external
--external (or ADRA_ALLOW_EXTERNAL=1) lets the tools actually run git / the CI command.
Default is dry-run / read-only.
Client-agnostic grounding
A client = a governance suite (conventions, ADRs, CI standards, glossary, incident cases)
the engine grounds on. ADRA ships a complete, fictional client — Northwind Data
Platform — under adra/clients/synthetic/northwind/. Point ADRA at any client:
export ADRA_CLIENT_DIR=/path/to/your/standards # or Settings(client_dir=...)
The rubric references the suite by id and the prompts cite it — the engine code does not change per client.
Connectors & emulator
The engine grounds through one connector Protocol so the same skills run against a real
platform or a synthetic one:
- Real: GitHub (PRs/reviews/issues/contents via a thin `httpx` REST v3 client), Azure DevOps (REST 7.1), Databricks (`databricks-sdk` + bundles CLI), Azure (`azure-identity` + monitor / health).
- Emulator: a self-contained platform (synthetic git repos + PRs + wiki + boards + CI + a SQLite warehouse) so the full flow runs offline.

The GitHub, Azure DevOps, Databricks, and Azure connectors are implemented; the offline emulator runs the full flow with no external calls. Each is enabled by its `pip install adra[...]` extra and exercised read-only by default.
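
The "one connector Protocol" idea can be illustrated with `typing.Protocol`. Everything named here (`Connector`, its two methods, `EmulatorConnector`, `summarize`) is hypothetical; ADRA's real protocol surface is not shown in this README.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class Connector(Protocol):
    """Hypothetical read-only surface shared by real and emulated platforms."""

    def list_open_prs(self) -> list[dict]: ...

    def get_diff(self, pr_id: int) -> str: ...


class EmulatorConnector:
    """In-memory stand-in: satisfies Connector structurally, no network calls."""

    def __init__(self) -> None:
        self._prs = {
            1: {"id": 1, "title": "Add nightly job", "diff": "+ schedule: nightly\n"},
        }

    def list_open_prs(self) -> list[dict]:
        return [{"id": p["id"], "title": p["title"]} for p in self._prs.values()]

    def get_diff(self, pr_id: int) -> str:
        return self._prs[pr_id]["diff"]


def summarize(connector: Connector) -> list[str]:
    """Skill-side code depends only on the protocol, never on a concrete platform."""
    lines = []
    for pr in connector.list_open_prs():
        changed = len(connector.get_diff(pr["id"]).splitlines())
        lines.append(f'#{pr["id"]} {pr["title"]} ({changed} changed lines)')
    return lines
```

A real connector would implement the same two methods over its platform's API, and `summarize` would run unchanged against either.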
Security model
Deterministic floor (tools are ground truth; the LLM cannot overturn a blocker) · read-only
by default (writes require --external and explicit human confirmation) · human gates on
PR create / push / merge and any risk claim · English-only + AI-authorship-leak scan on
anything written to disk · immutable provenance for every run. The agent reads untrusted
repo/PR/issue content — a dual-LLM / capability split + sandboxed, egress-filtered execution are
planned for the connector phase (not yet implemented; OWASP LLM/Agentic Top-10).
Two-repo layout
ADRA is the public-destined OSS engine (this repo) — no secrets, ever. A separate private ADRA Console (a private web app + backend) consumes this engine for experiments and real connections behind access control. The engine is the serious tool you can run anywhere with your own tokens; the console is the connected instance.
Extending
- New criterion: add a `RubricItem` to `adra/rubric.py` (it shows up in the critic prompt and, if `kind="deterministic"`, wire its check in `critic.py`).
- New capability: add a `Skill` subclass + a `prompts/<skill>.md`, register it in `adra/skills/__init__.py`, add a `Node`.
- New tool: a function returning a `ToolResult`; call it from a skill's `ground`.
- New provider: config only. Set `ADRA_PROVIDER` / `ADRA_MODEL` / `ADRA_MODEL_<ROLE>` (pydantic-ai resolves the `provider:model`); no new code.
Status
v0.04.000: the engine is complete and green offline, and the GitHub / Azure DevOps / Databricks / Azure connectors plus the offline emulator are implemented. Multi-industry synthetic clients and the web console are the next phases. The version stays at 0.x while the connectors remain partly untested against live services.
License
Apache-2.0 — see LICENSE.
References
refs/ holds an annotated bibliography (refs/README.md) + BibTeX (refs/references.bib) + the core papers (ReAct, Reflexion, Self-Refine, Constitutional AI, LLM-as-judge bias, agentic security / CaMeL, NIST AI RMF, OWASP LLM/Agentic, provenance, code-review agents).