AI agent orchestration framework — build, share, and run YAML pipelines
Project description
aqm
An orchestration framework where multiple AI agents pass tasks through explicit queues in sequence.
Build pipelines in YAML. Share them with anyone. Run them locally.
reject critical
┌──────────────────────┐ ┌──────────────────┐
▼ │ ▼ │
[planner] ──► [reviewer] ──┴─approve──► [developer] ──► [qa]
Multi-LLM Support
aqm supports multiple LLM providers in the same pipeline. Mix and match runtimes per agent — use the best model for each role. All runtimes use CLI subprocess calls only (no API keys or SDK setup needed).
| Runtime | Provider | CLI | Install |
|---|---|---|---|
claude |
Anthropic | claude CLI (auto-selects text or code mode) |
npm i -g @anthropic-ai/claude-code && claude login |
gemini |
gemini -p <prompt> (headless mode) |
npm i -g @google/gemini-cli |
|
codex |
OpenAI | codex exec <prompt> (non-interactive) |
npm i -g @openai/codex |
Claude auto-mode: When an agent has
mcpservers orclaude_code_flags, Claude automatically runs in Code mode (full tool access). Otherwise it runs in text-only mode (fast, no tools).
Example — Multi-provider pipeline:
agents:
- id: planner
runtime: gemini # Gemini for fast planning
model: gemini-2.5-flash
system_prompt: "Plan the work: {{ input }}"
handoffs:
- to: developer
condition: always
- id: developer
runtime: claude # Auto Code mode (has mcp)
mcp:
- server: github
system_prompt: "Implement: {{ input }}"
handoffs:
- to: reviewer
condition: always
- id: reviewer
runtime: codex # Codex for code review
model: o4-mini
system_prompt: "Review: {{ input }}"
Note:
runtimeis a required field — every agent must explicitly declare which provider to use.
Why aqm?
Multi-agent frameworks have come in two flavors:
- Code-based (LangGraph, AutoGen) — hard to share pipelines
- Cloud platforms (CrewAI Studio, Vertex AI) — costly, vendor lock-in
aqm takes a different path:
Declare pipelines in a single YAML file. Anyone can create them, anyone can use them — an open ecosystem.
# Use a community-built pipeline instantly
aqm pull software-dev-pipeline
# Share your pipeline with the ecosystem (creates a PR)
aqm publish --name my-pipeline
| Existing approach | aqm |
|---|---|
| Single agent does everything | Role separation per agent, handoff via queues |
| Direct calls between agents | Queue is the contract — loose coupling |
| Pipelines defined in code | Declared in YAML, shareable |
| Context lives only in memory | File-based context (context.md) — human-inspectable |
| Results always accepted | approve/reject gate is a first-class feature |
| External infrastructure required | SQLite default, runs after pip install |
Install
pip install aqm
Requires Python 3.11+.
Quick Start
# Initialize in an existing project
cd my-project
aqm init
# Choose your setup method:
# [1] AI-generate — describe what you want, select a model, Claude builds the YAML
# [2] Create default template — basic planner→executor pipeline
# [3] Pull from registry — install a community pipeline
# Run a pipeline
aqm run "Add JWT authentication to login"
# Open the dashboard
aqm serve
# → http://localhost:8000
No Redis. No Docker. No cloud account. Just SQLite.
How It Works
1. Declare agents and connections in YAML
# .aqm/agents.yaml
agents:
- id: planner
name: Planning Agent
model: claude-opus-4-6
runtime: claude
mcp:
- server: github
- server: filesystem
system_prompt: |
You are a software planner.
Analyze the requirements and write a detailed specification.
handoffs:
- to: reviewer
task: review_spec
condition: always
- id: reviewer
name: Review Agent
model: claude-opus-4-6
runtime: claude
system_prompt: |
Review the specification. Decide approve or reject.
If rejecting, always include the reason.
handoffs:
- to: developer
task: implement
condition: on_approve
- to: planner
task: revise_spec
condition: on_reject
payload: "{{ output }}\nREJECT_REASON: {{ reject_reason }}"
- id: developer
name: Development Agent
runtime: claude
mcp:
- server: github
- server: postgres
handoffs:
- to: qa
task: test_implementation
condition: always
- id: qa
name: QA Agent
runtime: claude
mcp:
- server: browsertools
- server: sentry
handoffs:
- to: planner
task: rethink_spec
condition: "severity == critical"
- to: developer
task: fix_bugs
condition: "severity in [major, minor]"
2. Tasks move between agents through queues
Every task gets a unique ID (T-A3F2B1). Each agent picks up its assigned task from the queue, processes it, and passes the result to the next agent.
T-A3F2B1 created
→ [planner] writes specification
→ [reviewer] REJECTED — "missing security requirements"
→ [planner] revises spec (with reject reason)
→ [reviewer] APPROVED
→ [developer] implements code
→ [qa] tests pass → done
Handoffs support three routing strategies:
- Static — fixed target (
to: reviewer) - Fan-out — multiple targets in parallel (
to: qa, docs, deploy) - Agent-decided — agent chooses at runtime (
condition: auto+HANDOFF: agent_idin output)
3. Context accumulates in files
Each time a task passes through an agent, the result is appended to context.md. The next agent reads this file to understand the full history.
.aqm/tasks/T-A3F2B1/
├── context.md ← Full history (human-readable)
├── stage_01_planner.md
├── stage_02_reviewer.md
└── current_payload.md
4. MCP connects agents to the outside world
Attach MCP servers to any agent so it can act, not just generate text.
| Agent | MCP Connection | Capabilities |
|---|---|---|
| Developer | github, filesystem | Read/write files, create PRs |
| Designer | figma, slack | Read designs, send notifications |
| QA | browsertools, sentry | E2E browser tests, error lookup |
| Analyst | postgres, bigquery | Direct DB queries |
CLI Reference
aqm init
Initialize .aqm/ in the current project directory with an interactive setup wizard.
aqm init
aqm init --path /path/to/project
The wizard offers three setup methods:
| Method | Description |
|---|---|
| [1] AI-generate | Describe your desired pipeline in plain language, select a Claude model (Opus 4.6 default), and Claude generates the YAML — with interactive Q&A, project analysis, and auto-validation. |
| [2] Default template | Start with a basic planner→executor pipeline. Edit agents.yaml to customize. |
| [3] Pull from registry | Browse and install pipelines from GitHub registry or local registry. |
AI-generate example:
$ cd my-react-app
$ aqm init
How would you like to set up your pipeline?
[1] AI-generate from description
[2] Create default template
[3] Pull from registry
Choice: 1
Select AI model:
[1] Opus 4.6 (most capable) (default)
[2] Sonnet 4.6 (fast & capable)
[3] Haiku 4.5 (fastest)
Model: 1
Analyzing project at /Users/you/my-react-app...
Project analysis:
- Language: TypeScript
- Framework: React 18 + Next.js 14
- Package manager: pnpm
- Testing: Vitest + Playwright
- CI/CD: GitHub Actions
- Styling: Tailwind CSS
Pipeline description: Feature development pipeline with code review
and automated testing
Generating agents.yaml with Claude (project analysis + YAML spec reference)...
Generated agents.yaml:
─────────────────────────────────────
apiVersion: aqm/v0.1
agents:
- id: planner
system_prompt: |
You are a senior Next.js/React architect...
...
- id: developer
runtime: claude
mcp:
- github
...
- id: qa
runtime: claude
system_prompt: |
Run Vitest unit tests and Playwright e2e tests...
...
─────────────────────────────────────
[1] Use this pipeline [2] Regenerate [3] Use default template
Choice: 1
✓ .aqm/ initialized with AI-generated pipeline
Creates:
.aqm/
├── agents.yaml ← Pipeline configuration (edit this)
├── tasks/ ← Task context directories (auto-generated)
└── queue.db ← SQLite queue (auto-generated)
aqm run
Create a task and run it through the pipeline.
# Basic usage
aqm run "Build a login feature"
# Specify starting agent
aqm run "Fix the payment bug" --agent developer
# With priority
aqm run --priority high "Fix critical security issue"
# Run in parallel (don't wait for other tasks to finish)
aqm run --parallel "Add documentation"
# With verbose logging
aqm -v run "Add user registration"
# Run a specific pipeline
aqm run --pipeline code-review "Review PR #42"
Execution modes:
| Mode | Behavior |
|---|---|
| Sequential (default) | Waits for running tasks to finish before starting |
Parallel (--parallel) |
Starts immediately alongside other running tasks |
Warning: In parallel mode, multiple agents may modify the same files simultaneously. The last write wins — use
git diffto review changes if conflicts occur.
Priority levels:
| Priority | Description |
|---|---|
critical |
Runs before all other pending tasks |
high |
Higher priority than normal |
normal |
Default priority |
low |
Runs after higher-priority tasks |
Output:
✓ Task created: T-A3F2B1 [high]
Starting agent: planner
stage 1 planner → Feature specification written...
stage 2 reviewer → APPROVED...
stage 3 developer → Code implemented...
✓ Completed T-A3F2B1
If a human gate is encountered:
⏸ Awaiting gate T-A3F2B1
Proceed with 'aqm approve T-A3F2B1' or
'aqm reject T-A3F2B1 -r "reason"'.
aqm fix
Follow-up on a previous task. Carries over the full context.md so agents understand the previous work and can make targeted corrections.
aqm fix T-A3F2B1 "The login button color should be blue, not red"
aqm fix T-A3F2B1 "Authentication fails on mobile"
aqm fix T-A3F2B1 "Update API endpoint to use v2" --agent developer
Output:
✓ Fix task created: T-C4D5E6 (from T-A3F2B1)
Starting agent: planner
stage 1 planner → Analyzed previous context, fixing color...
stage 2 developer → Updated button color to blue...
✓ Completed T-C4D5E6
Use fix for any follow-up — bug reports, corrections, refinements. One command covers all iteration.
aqm status
View task status.
# Summary of all tasks
aqm status
# Detailed view of a specific task
aqm status T-A3F2B1
Detailed output:
T-A3F2B1 Add JWT authentication
Status: awaiting_gate
Current agent: reviewer
Created: 2026-03-24 10:22
Stages: 2
stage 1: planner
stage 2: reviewer [rejected] (missing security requirements)
aqm list
List tasks with optional status filtering.
# All tasks
aqm list
# Filter by status
aqm list --filter completed
aqm list --filter failed
aqm list --filter awaiting_gate
aqm list --filter pending
aqm approve
Approve a task waiting at a human gate.
aqm approve T-A3F2B1
aqm approve T-A3F2B1 -r "Looks good, proceed with implementation"
aqm reject
Reject a task waiting at a human gate. Reason is required.
aqm reject T-A3F2B1 -r "Missing error handling for edge cases"
aqm agents
Display all agents and their handoff graph.
aqm agents
Output:
Agent Pipeline
planner (Planning Agent) [api] (MCP: github, filesystem)
→ reviewer (always)
reviewer (Review Agent) [api]
gate: llm
→ developer (on_approve)
→ planner (on_reject)
developer (Development Agent) [claude_code] (MCP: github)
→ qa (always)
aqm context
Print the full context.md for a task.
aqm context T-A3F2B1
aqm cancel
Cancel a running or pending task.
aqm cancel T-A3F2B1
aqm cancel T-A3F2B1 -r "Requirements changed"
For in_progress tasks, the pipeline stops at the next stage boundary. Completed stages are preserved — use git diff to review any code changes made before cancellation.
aqm priority
Change the priority of an existing task.
aqm priority T-A3F2B1 critical
aqm priority T-A3F2B1 low
Higher-priority tasks are executed first when the queue has multiple pending tasks.
aqm pipeline
Manage multiple pipelines in a project.
aqm pipeline list # List all pipelines
aqm pipeline create <name> # Create interactively
aqm pipeline create <name> --ai # AI-generate
aqm pipeline create <name> --template # Default template
aqm pipeline edit [name] # Edit a pipeline with AI
aqm pipeline default [name] # Get or set default pipeline
aqm pipeline delete <name> # Delete a pipeline
aqm serve
Launch the local web dashboard.
aqm serve
aqm serve --port 3000
aqm serve --host 0.0.0.0 --port 8080
Requires: pip install aqm[serve]
The web dashboard provides all CLI features with a visual interface:
| Page | Features |
|---|---|
Tasks (/) |
Run pipeline, task list with stats, fix follow-ups, cancel tasks |
Agents (/agents) |
D3.js directed graph with SVG connection lines, condition labels, gate badges, fan-out visualization |
Registry (/registry) |
Search pipelines (GitHub + local + bundled), pull, publish |
Validate (/validate) |
JSON Schema validation with error details and fix suggestions |
Task Detail (/tasks/{id}) |
Stage timeline, SSE real-time progress, gate approve/reject, context.md viewer |
Additional web-only features:
- Real-time pipeline progress via Server-Sent Events (SSE)
- Stale task recovery —
in_progresstasks from a crashed server are automatically marked asstalledon restart - Interactive agent diagram — auto-layout with dagre, color-coded edges (green=always, red=on_reject, purple=auto)
aqm pull
Pull a pipeline from the registry and install it into .aqm/agents.yaml.
Searches in order: GitHub registry → local registry.
# Pull from GitHub registry (default)
aqm pull software-dev-pipeline
# Pull from a custom registry repo
aqm pull my-pipeline --repo myorg/my-registry
# Offline mode — skip GitHub, use local/bundled only
aqm pull software-feature-pipeline --offline
aqm publish
Share your pipeline by creating a PR to the GitHub registry.
# Publish to GitHub registry (creates a PR)
aqm publish --name "my-pipeline" --description "Custom workflow"
# Publish to a custom registry repo
aqm publish --name "my-pipeline" --repo myorg/my-registry
# Save to local registry only (no PR)
aqm publish --name "my-pipeline" --local
Requires GitHub CLI (gh) for GitHub publishing.
Local publish (--local) works without it.
aqm search
Search for available pipelines across all sources.
# List all available pipelines
aqm search
# Filter by keyword
aqm search "code review"
# Offline search (local + bundled only)
aqm search --offline
agents.yaml Complete Reference
The agents.yaml file is the single source of truth for your pipeline. It defines all agents, their roles, how they connect, and how tasks flow between them.
Top-Level Structure
# .aqm/agents.yaml
params: # (optional) Parameterization for reusability
model: claude-sonnet-4-20250514
project_path:
type: string
required: true
description: "Path to the project root"
imports: # (optional) Import agents from external files
- from: ./shared/reviewers.yaml
agents: [code_reviewer]
agents:
- id: ... # First agent (pipeline entry point)
- id: ... # Second agent
- id: ... # ...
The file has three top-level keys: params (variable declarations), imports (external agent files), and agents (the pipeline definition). The first non-abstract agent is used as the default starting point.
Params — Parameterization for Reusability
Params make pipelines portable. Declare variables with types and defaults, then reference them anywhere using ${{ params.var_name }} syntax. Add prompt and auto_detect to create interactive setup flows.
params:
# Shorthand — just a default value
model: claude-sonnet-4-20250514
# Full declaration
project_path:
type: string # string | number | boolean
required: true
description: "Path to the project root"
# Interactive param with auto-detection
primary_color:
type: string
required: true
description: "Primary brand color hex code"
prompt: "What is the primary brand color?" # question shown during aqm run
auto_detect: "Analyze CSS/config files for primary color" # LLM instruction for auto-fill
max_retries:
type: number
default: 3
agents:
- id: developer
model: ${{ params.model }}
mcp:
- server: filesystem
args: ["${{ params.project_path }}"]
Interactive setup with prompt + auto_detect:
When you pull a shared pipeline, params can ask questions interactively:
params:
primary_color:
type: string
required: true
prompt: "What is the primary brand color?"
auto_detect: "Analyze the project's CSS/tailwind config and extract the primary color"
project_name:
type: string
required: true
prompt: "What is the project name?"
auto_detect: "Read package.json or pyproject.toml and extract the project name"
When you run aqm run, each unresolved param with a prompt shows an interactive setup:
? What is the primary brand color?
[1] Enter manually
[2] Auto-detect from project
Choice [1]: 2
Detecting...
Detected: #3B82F6
Use this value? [Y/n]: Y
This makes pipelines truly portable — pull a design pipeline, and it asks the right questions for your project.
Override params at runtime:
# Via CLI flags (skips interactive prompts)
aqm run "Build feature" --param model=claude-opus-4-6 --param project_path=/my/project
# Via overrides file (.aqm/params.yaml)
echo "model: claude-opus-4-6" > .aqm/params.yaml
echo "project_path: /my/project" >> .aqm/params.yaml
aqm run "Build feature"
Resolution priority: CLI flags > params.yaml file > interactive prompt > default values.
This is what makes the "pull and customize" workflow possible:
aqm pull software-dev-pipeline
# Interactive prompts fill in project-specific values
aqm run "Build login feature"
Imports — Reuse Agents Across Pipelines
Import agent definitions from external YAML files to avoid duplication:
# .aqm/agents.yaml
imports:
- from: ./shared/reviewer.yaml # relative path
agents: [security_reviewer] # import specific agents (optional — omit to import all)
agents:
- id: planner
runtime: claude
handoffs:
- to: security_reviewer
condition: always
# .aqm/shared/reviewer.yaml
agents:
- id: security_reviewer
runtime: claude
system_prompt: "Review for security vulnerabilities: {{ input }}"
gate:
type: llm
prompt: "Are there any security issues?"
Extends — Agent Inheritance
Define a base agent and extend it to create specialized variants:
agents:
- id: base_reviewer
abstract: true # Not instantiated — only used as a base
runtime: claude
gate:
type: llm
system_prompt: "Review: {{ input }}"
- id: code_reviewer
extends: base_reviewer # Inherits runtime, gate from parent
system_prompt: "Review this CODE for bugs and style: {{ input }}"
- id: security_reviewer
extends: base_reviewer
system_prompt: "Review for security vulnerabilities: {{ input }}"
abstract: trueagents are excluded from the pipeline (base-only)- Child fields override parent fields (shallow merge)
- Works with imports: import a base, extend it locally
Agent Definition
Each agent supports the following fields:
agents:
- id: planner # (required) Unique identifier, used in handoff routing
name: Planning Agent # (required) Display name for CLI output and dashboard
runtime: claude # (required) claude | gemini | codex
model: claude-sonnet-4-20250514 # (optional) Model to use, omit for CLI default
system_prompt: | # (optional) Jinja2 template for the system prompt
You are a software planner.
Analyze: {{ input }}
handoffs: # (optional) List of handoff rules
- to: reviewer
task: review_spec
condition: always
gate: # (optional) Quality gate configuration
type: llm
prompt: "Is this ready?"
mcp: # (optional) MCP server connections
- server: github
claude_code_flags: # (optional) Extra CLI flags, claude_code runtime only
- "--allowedTools"
- "Edit,Write,Bash,Read"
Field Reference
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
id |
string |
Yes | — | Unique identifier. Used as handoff target. Must not duplicate. |
name |
string |
Yes | — | Human-readable display name. |
runtime |
"claude" | "gemini" | "codex" |
Yes | — | Execution runtime. See Runtime section. |
model |
string |
No | CLI default | Claude model ID (e.g. claude-opus-4-6, claude-sonnet-4-20250514). |
system_prompt |
string |
No | "" |
Jinja2 template. Available variables: {{ input }}, {{ output }}. |
handoffs |
list[Handoff] |
No | [] |
Where to send results after this agent completes. |
gate |
GateConfig |
No | null |
Quality gate evaluated before handoff routing. |
mcp |
list[MCPServer] |
No | [] |
MCP servers to attach to this agent. |
claude_code_flags |
list[string] |
No | null |
Additional CLI flags passed to claude. Only used with claude_code runtime. |
abstract |
boolean |
No | false |
If true, agent is a base template only — excluded from pipeline execution. |
extends |
string |
No | null |
ID of a parent agent to inherit fields from (shallow merge). |
Runtime
runtime is required — every agent must declare its provider.
| Value | Provider | Description | Use Case |
|---|---|---|---|
claude |
Anthropic | Uses Claude CLI. Auto-selects text-only or Code mode (with tools/MCP). | All tasks — planning, review, implementation |
gemini |
Runs gemini -p <prompt> in headless mode. System prompt via GEMINI_SYSTEM_MD env var. |
Text generation, planning, review | |
codex |
OpenAI | Runs codex exec <prompt> in non-interactive mode with --full-auto. |
Code generation, implementation, review |
Claude models:
claude-opus-4-6— Most capable, best for complex reasoningclaude-sonnet-4-20250514— Balanced speed and qualityclaude-haiku-4-5-20251001— Fastest, best for simple tasks
Gemini models:
gemini-3.1-pro-preview— Most capable (latest, Feb 2026)gemini-2.5-pro— Strong reasoning, stablegemini-2.5-flash— Fast and capable (default)gemini-2.5-flash-lite— Fastest, cheapest
OpenAI Codex models:
o4-mini— Fast code generation (default)o3— Strong reasoninggpt-4.1— General purpose
Handoff
Handoffs define how tasks flow from one agent to the next.
handoffs:
- to: reviewer # (required) Target agent ID
task: review_spec # (optional) Task label, default: ""
condition: always # (optional) When to trigger, default: "always"
payload: "{{ output }}" # (optional) Data to pass, default: "{{ output }}"
Handoff Fields
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
to |
string |
Yes | — | Target agent id. Comma-separated for fan-out (e.g. "qa, docs"). Ignored when condition: auto. |
task |
string |
No | "" |
Label describing what the next agent should do. |
condition |
string |
No | "always" |
When this handoff triggers. See conditions below. |
payload |
string |
No | "{{ output }}" |
Jinja2 template for data passed to the next agent. |
Condition Values
| Condition | Triggers When |
|---|---|
always |
Always triggers (no gate needed) |
on_approve |
Gate decision is approved |
on_reject |
Gate decision is rejected |
on_pass |
No gate exists, or gate approved |
auto |
Agent decides at runtime — parses HANDOFF: <id> from agent output |
| Custom expression | Expression evaluates to true (e.g. "severity == critical", "severity in [major, minor]") |
Three Routing Strategies
1. Static routing — fixed target, simplest form:
handoffs:
- to: reviewer
condition: always
2. Fan-out — send to multiple agents simultaneously:
handoffs:
- to: qa, docs, deploy # all three run in parallel as child tasks
condition: on_approve
The first target continues in the current task; additional targets spawn independent child tasks that run concurrently.
3. Agent-decided routing (auto) — the agent itself chooses where to route:
# Triage agent analyzes the input and decides which specialist to hand off to
- id: triage
name: Triage Agent
runtime: text
system_prompt: |
Analyze the request and decide which agent should handle it.
End your response with: HANDOFF: <agent_id>
Available agents: developer, designer, analyst
handoffs:
- to: "*" # 'to' is ignored when condition is auto
condition: auto
The agent includes a HANDOFF: directive in its output:
This request requires code changes to the payment module.
HANDOFF: developer
Multiple targets are also supported:
This needs both code changes and documentation updates.
HANDOFF: developer, docs
Note: When using
auto, the agent must includeHANDOFF: <agent_id>in its output. If the directive is missing, the handoff is skipped and a warning is logged.
Payload Template Variables
| Variable | Description |
|---|---|
{{ output }} |
Current agent's output text |
{{ input }} |
Current agent's input text |
{{ reject_reason }} |
Gate rejection reason (empty if approved) |
{{ gate_result }} |
Gate decision: "approved" or "rejected" |
Example — Passing reject reason back to planner:
handoffs:
- to: planner
task: revise_spec
condition: on_reject
payload: "REJECTED: {{ reject_reason }}\nOriginal plan: {{ output }}"
Example — Intelligent triage with fan-out:
agents:
- id: triage
name: Triage Agent
runtime: claude
system_prompt: |
Analyze this customer request. Determine which teams should handle it.
If multiple teams are needed, list them all.
End with: HANDOFF: team1, team2
handoffs:
- to: "*"
condition: auto
- id: billing
name: Billing Agent
runtime: claude
system_prompt: "Handle billing issues: {{ input }}"
- id: technical
name: Technical Agent
runtime: claude
system_prompt: "Investigate technical issues: {{ input }}"
- id: account
name: Account Agent
runtime: claude
system_prompt: "Handle account issues: {{ input }}"
Gate
Gates evaluate an agent's output before handoff routing occurs. If a gate exists, handoff conditions like on_approve / on_reject are resolved based on the gate result.
gate:
type: llm # (required) llm | human
prompt: "Is this plan actionable?" # (optional) Extra evaluation criteria for LLM gate
model: claude-sonnet-4-20250514 # (optional) Model for LLM gate evaluation
Gate Fields
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
type |
"llm" | "human" |
No | "llm" |
Gate type. |
prompt |
string |
No | "" |
Additional evaluation criteria. Jinja2 template (variables: {{ output }}, {{ input }}). |
model |
string |
No | claude-sonnet-4-20250514 |
Model used for LLM gate evaluation. |
Gate Types
| Type | Behavior |
|---|---|
llm |
Claude CLI automatically evaluates the output and returns {"decision": "approved"/"rejected", "reason": "..."}. Pipeline continues immediately. |
human |
Pipeline pauses and waits for manual approval. Resume with aqm approve <task-id> or aqm reject <task-id> -r "reason". |
MCP Servers
Attach Model Context Protocol servers to give agents real-world capabilities.
mcp:
# Simple format — auto-resolves to npx -y @modelcontextprotocol/server-{name}
- server: github
# With arguments
- server: filesystem
args: ["/path/to/dir"]
# Custom server with full configuration
- server: custom-db
command: node
args: ["./my-mcp-server.js"]
env:
DATABASE_URL: "postgres://..."
API_KEY: "sk-..."
MCP Server Fields
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
server |
string |
Yes | — | Server name. Used as identifier and for auto-resolution. |
command |
string |
No | "npx" |
Command to launch the server. |
args |
list[string] |
No | [] |
Arguments passed to the command. |
env |
dict[string, string] |
No | null |
Environment variables for the server process. |
Auto-resolution: When only server name is provided (no command), it resolves to:
npx -y @modelcontextprotocol/server-{name} [args...]
Common MCP servers:
| Server Name | Capabilities |
|---|---|
github |
Read/write repos, create PRs, manage issues |
filesystem |
Read/write local files and directories |
postgres |
Execute SQL queries against PostgreSQL |
slack |
Send messages, read channels |
browsertools |
Browser automation, E2E testing |
sentry |
Error tracking and lookup |
claude_code_flags
Extra CLI flags passed directly to the claude command. Only applies when runtime: claude.
claude_code_flags:
- "--allowedTools"
- "Edit,Write,Bash,Read"
This is useful for restricting which tools an agent can use, or passing other Claude Code CLI options.
Complete Example
# .aqm/agents.yaml
agents:
- id: planner
name: Planning Agent
runtime: claude
model: claude-sonnet-4-20250514
system_prompt: |
You are a software planner.
Analyze the requirements and write a detailed specification.
Requirements: {{ input }}
handoffs:
- to: reviewer
task: review_spec
condition: always
payload: "{{ output }}"
- id: reviewer
name: Review Agent
runtime: claude
model: claude-sonnet-4-20250514
system_prompt: |
Review this specification. Decide approve or reject.
If rejecting, always include the reason.
Spec: {{ input }}
gate:
type: llm
prompt: "Is this spec clear, complete, and ready for implementation?"
model: claude-sonnet-4-20250514
handoffs:
- to: developer
task: implement
condition: on_approve
payload: "{{ output }}"
- to: planner
task: revise_spec
condition: on_reject
payload: "REJECTED: {{ reject_reason }}\nOriginal: {{ output }}"
- id: developer
name: Development Agent
runtime: claude
model: claude-sonnet-4-20250514
mcp:
- server: filesystem
args: ["/path/to/project"]
system_prompt: |
Implement the approved specification.
Plan: {{ input }}
claude_code_flags:
- "--allowedTools"
- "Edit,Write,Bash,Read"
handoffs:
- to: qa
task: test_implementation
condition: always
- id: qa
name: QA Agent
runtime: claude
mcp:
- server: browsertools
system_prompt: |
Run tests and evaluate quality. If issues are found,
decide severity and route accordingly.
End with: HANDOFF: <agent_id>
handoffs:
# Agent decides based on analysis — no static condition needed
- to: "*"
condition: auto
In this example, the QA agent analyzes test results and autonomously decides:
HANDOFF: developerfor minor bugsHANDOFF: plannerfor critical design issuesHANDOFF: developer, docsif both code and docs need updates
Pipelines for Any Domain
| Domain | Pipeline Example |
|---|---|
| Software Dev | Planning → Review → Implementation → QA → PR |
| Content Creation | Research → Draft → Edit → Publish |
| Legal Documents | Extract → Summarize → Risk Flag → Approval |
| Data Analysis | Collect → Clean → Analyze → Report |
| Customer Support | Classify → Lookup → Respond → Escalate |
Comparison with Existing Frameworks
| LangGraph | CrewAI | OpenSWE | aqm | |
|---|---|---|---|---|
| Pipeline definition | Python code | Python code | Code | YAML |
| Pipeline sharing | ❌ | Paid platform | ❌ | Open registry |
| Explicit queue | ❌ | ❌ | ❌ | SQLite default |
| Approve/Reject gate | Interrupt pattern | ❌ | ❌ | First-class |
| Reverse feedback loop | Manual | Limited | ❌ | Built-in |
| Fan-out (parallel branches) | Manual | ❌ | ❌ | Declarative |
| Agent-decided routing | Manual | ❌ | ❌ | condition: auto |
| File-based context | ❌ | ❌ | ❌ | context.md |
| MCP agent connection | Manual | ❌ | ❌ | Declarative |
| Local/offline | ❌ | ❌ | ❌ | Default |
| Web dashboard | ❌ | Paid | ❌ | Built-in (all CLI features) |
| Task cancellation | Manual | ❌ | ❌ | aqm cancel + web UI |
Design Principles
- The ecosystem is the product — The YAML sharing ecosystem is the core value, not the execution engine
- Queue is the contract — No direct calls between agents. Communication only through queues
- Context is a file — Accumulated in human-readable
context.md - Local first — SQLite default. Runs without external infrastructure
- Declarative first — Define in YAML, code is the escape hatch
- Gate is first-class — approve/reject is a core feature
Documentation
| Document | Description |
|---|---|
| Core Concepts | Task, Queue, Handoff, Gate, Condition, Context, Pipeline — with LangGraph/CrewAI comparison tables |
| YAML Specification | Independent agents.yaml format spec (apiVersion: aqm/v0.1), field reference, processing order, versioning policy |
| JSON Schema | Machine-readable schema for validation and IDE autocomplete |
| Seed Pipelines | 10 ready-to-use pipelines in the registry |
| Contributing | How to contribute pipelines (equal to code!), submission template, review process |
Validate your pipeline against the spec:
aqm validate .aqm/agents.yaml
Architecture
aqm/
├── aqm/
│ ├── core/
│ │ ├── task.py # Task, StageRecord, TaskStatus
│ │ ├── agent.py # AgentDefinition, params, extends, imports
│ │ ├── pipeline.py # Pipeline execution loop
│ │ ├── gate.py # LLMGate / HumanGate
│ │ ├── context_file.py # File-based context accumulation
│ │ ├── context.py # Prompt builder
│ │ └── project.py # Project root detection
│ ├── queue/
│ │ ├── base.py # AbstractQueue interface
│ │ ├── sqlite.py # SQLiteQueue (default)
│ │ └── file.py # FileQueue (testing)
│ ├── runtime/
│ │ ├── base.py # AbstractRuntime interface
│ │ ├── text.py # Claude CLI runtime (claude_text)
│ │ ├── claude_code.py # Claude Code CLI runtime (claude_code)
│ │ ├── gemini.py # Google Gemini CLI runtime (gemini_cli)
│ │ └── codex.py # OpenAI Codex CLI runtime (codex_cli)
│ ├── web/
│ │ ├── app.py # FastAPI app factory
│ │ ├── templates.py # Shared CSS/layout/helpers
│ │ ├── pages/ # Page renderers (dashboard, agents, registry, validate, task_detail)
│ │ └── api/ # REST + SSE endpoints (tasks, registry, validate)
│ ├── registry.py # GitHub-based pipeline registry
│ └── cli.py # Click CLI
├── schema/
│ └── agents-schema.json # JSON Schema for agents.yaml
├── docs/
│ ├── concepts.md # Core concepts guide
│ ├── spec.md # YAML format specification
│ └── competitive-analysis.md
└── tests/
Multiple Pipelines
A single project can have multiple pipeline configurations. Each pipeline lives in .aqm/pipelines/<name>.yaml.
Managing Pipelines
# List all pipelines
aqm pipeline list
# Create a new pipeline (interactive: template or AI-generate)
aqm pipeline create code-review
aqm pipeline create marketing --ai # AI-generate
aqm pipeline create basic --template # Default template
# Edit an existing pipeline with AI
aqm pipeline edit code-review
# Set the default pipeline
aqm pipeline default code-review
# Delete a pipeline
aqm pipeline delete old-pipeline
Running a Specific Pipeline
# Run the default pipeline
aqm run "Build login feature"
# Run a specific pipeline
aqm run --pipeline code-review "Review PR #42"
aqm run --pipeline marketing "Launch campaign for Q2"
# View agents for a specific pipeline
aqm agents --pipeline code-review
Web Dashboard
The web dashboard (aqm serve) includes a pipeline selector dropdown when multiple pipelines are available. Select a pipeline to:
- View its agents and handoff graph
- Run tasks through the selected pipeline
- Pipeline selection is available on both the Tasks and Agents pages
Migration from Single Pipeline
If you have an existing .aqm/agents.yaml, it is automatically migrated to .aqm/pipelines/default.yaml on first use. No manual action needed.
Roadmap
v0.1 — Core
- SQLite-based task queue
- YAML agent declarations
- Handoff routing (static, fan-out, agent-decided)
- LLM / Human approval gates
- File-based context accumulation (context.md)
- Claude Code runtime with MCP support
- Local web UI dashboard
- Fan-out: parallel child tasks from comma-separated targets
-
condition: auto: agent-decided routing viaHANDOFF:directive -
params: parameterization with${{ params.X }}for portable pipelines -
extends/abstract: agent inheritance for DRY definitions -
imports: reuse agents across pipelines from external files - 10 seed pipelines covering software, content, legal, data, and more
- Interactive params:
prompt+auto_detectfor guided pipeline setup - Follow-up tasks:
aqm fixfor multi-turn iteration with context carry-over - Task cancellation:
aqm cancelwith graceful pipeline stop - Full web dashboard: all CLI features in browser with D3.js agent diagram
- SSE real-time pipeline progress streaming
- Stale task recovery on server restart
v0.2 — Connections
- Enhanced per-agent MCP server support
- GitHub / Slack webhook triggers
- Context summarization (prevent token explosion)
v0.3 — Ecosystem
- GitHub-based pipeline registry (
aqm publish / pull / search) - registry.aqm.dev launch (dedicated registry server)
- YAML version control and forking
v1.0 — Stabilization
- Task dependencies (DAG)
- Redis / Postgres queue backends
- Pipeline execution history and analytics
Community
Join the aqm community on Discord to share pipelines, ask questions, and get help:
Contributing
git clone https://github.com/aqm-framework/aqm
cd aqm
pip install -e ".[dev,serve]"
pytest tests/
We value pipeline contributions equally to code contributions. Creating a useful YAML pipeline, improving documentation, or fixing a bug — all are welcome.
See CONTRIBUTING.md for the full guide: setup, architecture overview, pipeline submission template, code style, and review process.
License
MIT
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file aqm-1.2.12.tar.gz.
File metadata
- Download URL: aqm-1.2.12.tar.gz
- Upload date:
- Size: 124.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.2
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
ced454b62854d9bf823907172849c8a6c619b9fc4024035685b1958e075794c7
|
|
| MD5 |
1b3ce374cfe0cebcc460c42fe46a9233
|
|
| BLAKE2b-256 |
959407c290846743c7af6faecd8621a5a9b9bca4127a51e53aec4d0ba3165e26
|
File details
Details for the file aqm-1.2.12-py3-none-any.whl.
File metadata
- Download URL: aqm-1.2.12-py3-none-any.whl
- Upload date:
- Size: 114.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.2
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
e0fe569e35a43a19bb1878fe564f93c18a8850016b77eaa716e05ff32d1f326e
|
|
| MD5 |
293b94845ea946ce2c280821625548db
|
|
| BLAKE2b-256 |
b195a284e54cf744e1de6d34e9b997b6116f548865f0ba0ee29f7dd82d08ae65
|