Who guards the agents? A framework for orchestrating AI coding agents through verified implementation phases.

Juvenal

Quis custodiet ipsos custodes? — Who guards the agents?

Juvenal is a framework for orchestrating AI coding agents through verified implementation phases. It prevents agents from cheating on success criteria by separating implementation from verification, and it helps agents tackle complex projects by breaking the work into phases.

How It Works

A non-agentic Python script orchestrates AI coding agents (Claude or Codex) through alternating steps:

  1. Implementation — an agent executes a prompt to build/modify code
  2. Verification — separate checkers (scripts, agents, or both) verify the work
  3. Bounce — if verification fails, the pipeline bounces back (to a configurable target phase or the most recent implement phase) with failure context injected. A global bounce limit (max_bounces) prevents infinite loops.

The implementing agent and the checking agent are separate processes, so the implementer can't cheat by weakening tests.
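
The implement / verify / bounce loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Juvenal's actual code: `Phase`, `Checker`, `run_pipeline`, and `bounce_target` are hypothetical names, and each phase here bundles its implementation step with its own checkers.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple

# A checker returns (passed, reason). Hypothetical type, for illustration only.
Checker = Callable[[], Tuple[bool, str]]


@dataclass
class Phase:
    id: str
    # Called with the failure context from the last failed check (None on a fresh run).
    implement: Callable[[Optional[str]], None]
    checkers: List[Checker] = field(default_factory=list)
    # Phase id to bounce back to on failure; None means "re-run this phase".
    bounce_target: Optional[str] = None


def run_pipeline(phases: List[Phase], max_bounces: int = 3) -> bool:
    """Run phases in order; return False once the global bounce limit is exceeded."""
    index_of = {phase.id: i for i, phase in enumerate(phases)}
    bounces = 0
    failure_context: Optional[str] = None
    i = 0
    while i < len(phases):
        phase = phases[i]
        phase.implement(failure_context)  # step 1: implementation
        failure_context = None
        failed_reason: Optional[str] = None
        for check in phase.checkers:  # step 2: verification
            passed, reason = check()
            if not passed:
                failed_reason = reason
                break
        if failed_reason is None:
            i += 1  # all checkers passed, move on
            continue
        bounces += 1  # step 3: bounce
        if bounces > max_bounces:
            return False
        failure_context = failed_reason  # injected into the next implementation run
        if phase.bounce_target is not None:
            i = index_of[phase.bounce_target]
    return True
```

The bounce counter is shared across all phases, which is what makes the limit global: a workflow that keeps failing in different places still terminates.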

Other Such Frameworks

Juvenal is conceptually similar to ralph, but it works slightly better for my exact purposes and reinventing the wheel is cheap now!

Install

# From PyPI
pip install juvenal

# From a source checkout, with development extras
pip install -e ".[dev]"

Claude Code Skill

Juvenal ships as a Claude Code plugin, so you can use it directly from Claude Code with /juvenal.

Install the plugin

From the marketplace (pending approval):

/plugin install juvenal

From source (works now):

claude --plugin-dir /path/to/juvenal/plugin

Usage

Once installed, invoke the skill in Claude Code:

/juvenal add authentication to the Flask app

Claude will create a Juvenal workflow for your goal and run it. You can also ask for help with workflow formats or run existing workflows.

Quick Start

# Scaffold a workflow
juvenal init my-project

# Run a workflow
juvenal run workflow.yaml

# Generate a workflow from a goal
juvenal plan "implement a REST API with tests" -o workflow.yaml

# Plan and immediately run
juvenal do "add authentication to the Flask app"

Workflow Formats

YAML

name: "my-workflow"
backend: claude
max_bounces: 999

phases:
  - id: implement
    prompt: "Implement the feature."
    checkers:
      - type: script
        run: "pytest tests/ -x"
      - type: agent
        role: tester

Directory Convention

my-workflow/
  phases/
    01-setup/
      prompt.md            # implementation prompt
      check-build.sh       # script checker (exit 0 = pass)
      check-quality.md     # agent checker
    02-implement/
      prompt.md
      check-tests.sh       # paired with .md = composite
      check-tests.md       # gets {script_output} injected
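
The pairing rule in this layout (a `check-*.sh` and a `check-*.md` with the same stem form one composite checker) can be sketched as follows. This is not Juvenal's discovery code: `discover_checkers` and `discover_phases` are hypothetical helpers, and sorting phase directories by name is an assumption based on the numeric prefixes.

```python
from pathlib import Path
from typing import Dict, List, Tuple


def discover_checkers(phase_dir: Path) -> Dict[str, str]:
    """Map each check-* stem to 'script', 'agent', or 'composite'."""
    scripts = {p.stem for p in phase_dir.glob("check-*.sh")}
    agents = {p.stem for p in phase_dir.glob("check-*.md")}
    kinds: Dict[str, str] = {}
    for stem in sorted(scripts | agents):
        if stem in scripts and stem in agents:
            kinds[stem] = "composite"  # .sh paired with .md of the same name
        elif stem in scripts:
            kinds[stem] = "script"
        else:
            kinds[stem] = "agent"
    return kinds


def discover_phases(workflow_dir: Path) -> List[Tuple[str, Dict[str, str]]]:
    """List (phase name, checkers) for every phase directory that has a prompt.md."""
    phases_root = workflow_dir / "phases"
    result: List[Tuple[str, Dict[str, str]]] = []
    # Assumption: the numeric prefix (01-, 02-, ...) defines execution order.
    for phase_dir in sorted(d for d in phases_root.iterdir() if d.is_dir()):
        if (phase_dir / "prompt.md").is_file():
            result.append((phase_dir.name, discover_checkers(phase_dir)))
    return result
```

For the tree shown above, `discover_phases(Path("my-workflow"))` would report a script and an agent checker for `01-setup` and a single composite checker for `02-implement`.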

Bare Markdown

phases/
  01-setup.md              # single phase, default tester checker

Checker Types

Type       Description
script     Shell command; exit 0 = PASS, nonzero = FAIL
agent      AI agent that emits VERDICT: PASS or VERDICT: FAIL: reason
composite  Script runs first, output fed to agent via {script_output}
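
As a rough sketch of how these three checker types could be evaluated, the snippet below maps an exit code to a pass/fail result, parses a `VERDICT:` line from agent output, and fills `{script_output}` into a composite prompt. The function names are hypothetical, and two behaviours are assumptions rather than documented Juvenal behaviour: the last `VERDICT:` line wins, and output without any verdict counts as a failure.

```python
import re
import subprocess
from typing import Tuple

# Matches "VERDICT: PASS" or "VERDICT: FAIL: reason" on a line of its own.
VERDICT_RE = re.compile(
    r"^VERDICT:[ \t]*(PASS|FAIL)(?::[ \t]*(.*))?[ \t]*$", re.MULTILINE
)


def run_script_checker(command: str, timeout: float = 600.0) -> Tuple[bool, str]:
    """Run a shell command; exit code 0 is PASS, anything else is FAIL."""
    proc = subprocess.run(
        command, shell=True, capture_output=True, text=True, timeout=timeout
    )
    return proc.returncode == 0, proc.stdout + proc.stderr


def parse_verdict(agent_output: str) -> Tuple[bool, str]:
    """Return (passed, reason) from the last VERDICT line in the agent's output."""
    matches = VERDICT_RE.findall(agent_output)
    if not matches:
        # Assumption: a missing verdict is treated as a failure.
        return False, "no VERDICT line found"
    verdict, reason = matches[-1]
    return verdict == "PASS", reason.strip()


def build_composite_prompt(template: str, script_output: str) -> str:
    """Inject the script's output into the agent checker's prompt template."""
    return template.replace("{script_output}", script_output)
```

Using plain string replacement for `{script_output}` rather than `str.format` avoids errors when the prompt template contains other braces, such as code samples.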

Built-in Roles

Agent checkers can use built-in verification personas:

  • tester — runs tests, checks for build errors
  • architect — validates design, checks for circular dependencies
  • pm — confirms requirements are met, no TODOs remain
  • senior-tester — checks test integrity, looks for cheating
  • senior-engineer — reviews code quality, completeness, security

CLI

juvenal run <workflow> [--resume] [--phase X] [--max-retries N] [--backend claude|codex] [--dry-run]
juvenal plan "goal" [-o output.yaml] [--backend claude|codex]
juvenal do "goal" [--backend claude|codex] [--max-retries N]
juvenal status [--state-file path]
juvenal init [directory] [--template name]

License

MIT

This version

0.6.0
