ADRA — Adversarial Dev Review Agent: a deterministic-first, adversarial-validation engine for the software lifecycle (PR review, validation/refutation experiments, auto-docs, human escalation).

ADRA — Adversarial Dev Review Agent

A client-agnostic, deterministic-first, adversarial-validation engine that supports the software lifecycle: it reviews PRs, designs and runs validation/refutation experiments, writes documentation back, and escalates to a human exactly where a senior engineer would. Governed by LLM-as-Judge + a blocking adversarial critic, grounded by deterministic tools, with immutable provenance. Runs offline with no API key.

pip install adra · Python ≥ 3.11 · Apache-2.0 · status: v0.04.000


What's different

The AI-code-review market splits in two, and both miss the same spot:

  • Reviewers (CodeRabbit, Greptile, Qodo, Korbit, Sourcery, Bito) feed linter output into an LLM, but the model's prose is the verdict, so hallucinated and "consistently-stated-but-false" findings leak through; the deterministic signals are inputs, never the gate.
  • Autonomous coders (Devin, OpenHands, SWE-agent, Sweep) write code and treat "tests pass" as success rather than adversarially trying to prove the change wrong.

ADRA occupies the gap: a deterministic spine (git / CI / static analysis / SQL probes) that grounds a blocking adversarial critic whose job is to refute, not bless, each artifact — every finding carrying its evidence, with disciplined human escalation when nothing deterministic backs the verdict. Existing tools generate opinions; ADRA generates proofs and refutations, and escalates when it can't.

The six capabilities

Skill        What it does
code_review  Review a diff: language/leak scan + test-discoverability + exact CI command + semantic findings
pr_eval      Evaluate a PR: merge-base health → bundle validate → conformance → verdict + PR body
experiment   Hypothesis-driven validation experiment: SQL-warehouse probes + synthesis
improve      Minimum-functional improvement proposal (prune filler, smallest safe diff)
document     Turn a run record into a PR page / experiment page / methodology-history row
decide       Route analysis: candidate routes + trade-offs + recommendation (human-owned)

Each skill is the same loop, differing only by its domain prompt and deterministic tools.

Why deterministic-first

Tools (git, the exact CI command, bundle validate, language scan, SQL probe) run first and become both the grounding the model may not contradict and the evidence in the provenance log. Because the deterministic floor carries the verdict, the whole loop runs — and the test suite passes — offline with no API key. Connecting a real provider adds the semantic layer on top.

Architecture

intake ─▶ plan ─▶ ground (deterministic tools) ─▶ generate ─▶ CRITIC ─┐
                                                      ▲   revise ◀──────┘
                                                      └── accepted / escalate ─▶ artifacts + run record
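
The generate / critic / revise part of the diagram can be sketched as a small loop. This is an illustration only: the run_loop function, its callable signatures, and the round limit are assumptions, and the intake, plan, and ground stages are left out.

```python
from typing import Callable

# Hypothetical signatures for illustration only; ADRA's orchestrator
# (adra/orchestrator.py) is not reproduced here.
Generate = Callable[[str, list[str]], str]  # (task, critic objections) -> draft
Critic = Callable[[str], list[str]]         # draft -> objections (empty = accepted)


def run_loop(task: str, generate: Generate, critic: Critic,
             max_rounds: int = 3) -> tuple[str, str]:
    """Generate, attack, revise; escalate to a human if the critic never relents."""
    objections: list[str] = []
    draft = ""
    for _ in range(max_rounds):
        draft = generate(task, objections)
        objections = critic(draft)
        if not objections:
            return "accepted", draft
    return "escalate", draft


# Toy generator and critic: the critic demands that evidence be cited.
def toy_generate(task: str, objections: list[str]) -> str:
    return f"{task} [evidence: ci.log]" if objections else task


def toy_critic(draft: str) -> list[str]:
    return [] if "[evidence:" in draft else ["finding carries no evidence"]


status, artifact = run_loop("rename drops a bundle resource", toy_generate, toy_critic)
print(status, "|", artifact)
```

The first draft is rejected for lacking evidence, the revision is accepted, and a critic that keeps objecting past max_rounds ends in escalation rather than a silent pass.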
  • adra/state.py — the typed domain model (Severity, Finding, ToolResult, CriticVerdict, RunState). One contract end to end.
  • adra/rubric.py — the shared adversarial rubric (criteria as typed data); drives both the deterministic critic and the critic prompt, so "what we check" never drifts.
  • adra/orchestrator.py — the hand-rolled, framework-free state machine.
  • adra/critic.py — deterministic red-team pass (rubric-driven) + LLM semantic attacks.
  • adra/judge.py — rubric scoring with swap-and-average + reference anchoring.
  • adra/llm.py — the tiny ADRA-owned ChatModel seam: mock (offline) | any real provider via pydantic-ai (provider:model, config-only). No LangChain/LangGraph.
  • adra/tools/ — each returns a ToolResult (git / CI / bundle / lang / discovery / sql).
  • adra/skills/ — the Skill base + the six skills.
  • adra/clients/ — client governance suites (the bundled fictional Northwind Data Platform); selectable via ADRA_CLIENT_DIR.
  • adra/provenance.py — the immutable run record (the deep change-history layer).
  • cli/ — the adra command. refs/ — annotated bibliography + papers. docs/ — deep docs + engine ADRs (docs/adr/).
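
As an illustration of the swap-and-average idea mentioned for adra/judge.py, here is a sketch of the technique as it is commonly described for pairwise LLM judging: score both presentation orders and average, so a bias toward the first position cancels. The PairJudge signature and the toy judge are assumptions; how ADRA applies the technique to rubric scoring is not shown in this description.

```python
from typing import Callable

# Hypothetical judge signature: returns a preference for the FIRST argument
# in [0, 1]. A real judge would be an LLM call; this is a stand-in.
PairJudge = Callable[[str, str], float]


def swap_and_average(judge: PairJudge, a: str, b: str) -> float:
    """Preference for `a` over `b`, averaged over both presentation orders."""
    forward = judge(a, b)         # a shown first
    backward = 1.0 - judge(b, a)  # b shown first; convert back to "preference for a"
    return (forward + backward) / 2.0


# A deliberately biased toy judge: it prefers the longer answer, then adds
# 0.2 to whichever answer is presented first.
def biased_judge(first: str, second: str) -> float:
    if len(first) == len(second):
        base = 0.5
    else:
        base = 0.7 if len(first) > len(second) else 0.3
    return min(1.0, base + 0.2)


tie = swap_and_average(biased_judge, "abcd", "wxyz")
print(round(biased_judge("abcd", "wxyz"), 2), round(tie, 2))
```

For two equally long answers the single-order score is skewed toward the first position, while the swapped average comes back to an even split.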

Quickstart (offline, no key)

python -m venv .venv && . .venv/bin/activate         # Windows: .venv\Scripts\activate
pip install -e ".[dev]"
pytest -q                                            # 11 passing, fully offline
python scripts/demo_offline.py                       # end-to-end demo, all six skills

Expected: the stale-base PR is blocked + escalated (12 commits behind, a notebook deletion, a .yml → .yml.t rename dropping a bundle resource, bundle validate failing); the language/leak review is blocked; the clean experiment / improve / document / decide runs are accepted.

adra review path/to.diff --ci-command 'python -m coverage run -m unittest discover -s . -p "test*.py"'
adra decide "Raise the refresh cadence" "edit the shared CI template" "change it in the owning repo"

Enable a real provider

pip install -e ".[llm]"                  # pydantic-ai; the seam is provider-agnostic
export ANTHROPIC_API_KEY=...             # or put it in .env (any provider's key works)
export ADRA_PROVIDER=anthropic           # adding a provider is config only: ADRA_PROVIDER / ADRA_MODEL / ADRA_MODEL_<ROLE>
adra pr-eval --source task/123/x --repo /path/to/repo --external

--external (or ADRA_ALLOW_EXTERNAL=1) lets the tools actually run git / the CI command. Default is dry-run / read-only.

Client-agnostic grounding

A client = a governance suite (conventions, ADRs, CI standards, glossary, incident cases) the engine grounds on. ADRA ships a complete, fictional client — Northwind Data Platform — under adra/clients/synthetic/northwind/. Point ADRA at any client:

export ADRA_CLIENT_DIR=/path/to/your/standards   # or Settings(client_dir=...)

The rubric references the suite by id and the prompts cite it — the engine code does not change per client.

Connectors & emulator

The engine grounds through one connector Protocol so the same skills run against a real platform or a synthetic one:

  • Real: GitHub (PRs/reviews/issues/contents via a thin httpx REST v3 client), Azure DevOps (REST 7.1), Databricks (databricks-sdk + bundles CLI), Azure (azure-identity + monitor / health).
  • Emulator: a self-contained platform (synthetic git repos + PRs + wiki + boards + CI + a SQLite warehouse) so the full flow runs offline.

The GitHub, Azure DevOps, Databricks, and Azure connectors are implemented; the offline emulator runs the full flow with no external calls. Each is enabled by its pip install adra[...] extra and exercised read-only by default.
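
The single-Protocol idea can be illustrated with typing.Protocol. The Connector interface, its get_pr_diff method, and the EmulatorConnector class below are hypothetical names chosen for this sketch; ADRA's real connector Protocol may expose different methods.

```python
from typing import Protocol


class Connector(Protocol):
    """Hypothetical connector interface; ADRA's real Protocol may differ."""

    def get_pr_diff(self, pr_id: str) -> str: ...


class EmulatorConnector:
    """Offline stand-in backed by an in-memory dict instead of a live platform."""

    def __init__(self, diffs: dict[str, str]) -> None:
        self._diffs = diffs

    def get_pr_diff(self, pr_id: str) -> str:
        return self._diffs[pr_id]


def changed_files(connector: Connector, pr_id: str) -> list[str]:
    """Skill-side code depends only on the Protocol, never on a concrete platform."""
    diff = connector.get_pr_diff(pr_id)
    prefix = "+++ b/"
    return [line[len(prefix):] for line in diff.splitlines() if line.startswith(prefix)]


emulator = EmulatorConnector({"42": "--- a/job.yml\n+++ b/job.yml\n@@ -1 +1 @@\n-x\n+y\n"})
print(changed_files(emulator, "42"))  # ['job.yml']
```

Because the Protocol is structural, a real connector only needs to provide the same method; nothing in changed_files changes when the emulator is swapped out.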

Security model

Deterministic floor (tools are ground truth; the LLM cannot overturn a blocker) · read-only by default (writes require --external and explicit human confirmation) · human gates on PR create / push / merge and any risk claim · English-only + AI-authorship-leak scan on anything written to disk · immutable provenance for every run. The agent reads untrusted repo/PR/issue content — a dual-LLM / capability split + sandboxed, egress-filtered execution are planned for the connector phase (not yet implemented; OWASP LLM/Agentic Top-10).

Two-repo layout

ADRA is the public-destined OSS engine (this repo) — no secrets, ever. A separate private ADRA Console (a private web app + backend) consumes this engine for experiments and real connections behind access control. The engine is the serious tool you can run anywhere with your own tokens; the console is the connected instance.

Extending

  • New criterion: add a RubricItem to adra/rubric.py; it shows up in the critic prompt, and if kind="deterministic" you also wire its check in critic.py.
  • New capability: add a Skill subclass + a prompts/<skill>.md, register it in adra/skills/__init__.py, add a Node.
  • New tool: a function returning a ToolResult; call it from a skill's ground.
  • New provider: config only — set ADRA_PROVIDER / ADRA_MODEL / ADRA_MODEL_<ROLE> (pydantic-ai resolves the provider:model); no new code.
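
To make the "new tool" step concrete, here is a sketch of a function that returns a ToolResult. Both the ToolResult fields and the todo_scan tool are hypothetical examples; the real ToolResult is defined in adra/state.py and may look different.

```python
from dataclasses import dataclass, field


# Hypothetical stand-in; the real ToolResult is defined in adra/state.py
# and its fields may differ.
@dataclass(frozen=True)
class ToolResult:
    tool: str
    ok: bool
    evidence: list[str] = field(default_factory=list)


def todo_scan(diff_text: str) -> ToolResult:
    """Example tool: flag added lines that introduce a TODO marker."""
    hits = [
        line[1:].strip()
        for line in diff_text.splitlines()
        # Added lines start with "+"; "+++" is the file header, not content.
        if line.startswith("+") and not line.startswith("+++") and "TODO" in line
    ]
    return ToolResult(tool="todo_scan", ok=not hits, evidence=hits)


result = todo_scan("+++ b/app.py\n+x = 1  # TODO remove\n-y = 2\n")
print(result.ok, result.evidence)  # False ['x = 1  # TODO remove']
```

A skill's ground step would call such a function and keep the returned evidence alongside the verdict, so the finding can be traced back to the exact lines that triggered it.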

Status

v0.04.000: the engine is complete and green offline, and the GitHub / Azure DevOps / Databricks / Azure connectors plus the offline emulator are implemented. Multi-industry synthetic clients and the web console are the next phases. The version stays at 0.x while the connectors remain partly untested against live services.

License

Apache-2.0 — see LICENSE.

References

refs/ holds an annotated bibliography (refs/README.md), BibTeX (refs/references.bib), and the core papers (ReAct, Reflexion, Self-Refine, Constitutional AI, LLM-as-judge bias, agentic security / CaMeL, NIST AI RMF, OWASP LLM/Agentic, provenance, code-review agents).
