AI agent orchestration framework — build, share, and run YAML pipelines

aqm | 한국어 (Korean)

Build AI agent teams in YAML. No code. No API keys. Just pipelines.

An orchestration framework where multiple AI agents pass tasks through explicit queues — or discuss in real-time sessions until consensus. Define once, run anywhere, share with anyone.

  [user] ──input──► [planner] ──► [reviewer] ──approve──► [design_session] ──► [implementer]
                        ▲              │                    ┌──┬──┬──┐
                        └── reject ────┘                    ▼  ▼  ▼  ▼  round-robin
                        └── ask user ──►[user]             [arch][sec][fe]  until consensus

Why aqm?

A single AI agent writes code and reviews it with the same bias. It can't catch its own blind spots.

aqm gives you a team — each agent has a dedicated role, a separate prompt, and optionally a different LLM. A quality gate rejects bad output automatically. A session lets agents debate before deciding.

# One YAML file. That's the entire pipeline.
agents:
  - id: developer
    runtime: claude
    system_prompt: "Implement: {{ input }}"
    handoffs: [{ to: reviewer }]

  - id: reviewer
    runtime: gemini                    # Different LLM catches different bugs
    system_prompt: "Review for security: {{ input }}"
    gate:
      type: llm
      prompt: "Is this production-ready?"
      max_retries: 3                   # Auto-reject → retry up to 3 times
    handoffs:
      - { to: deployer, condition: on_approve }
      - { to: developer, condition: on_reject }

  - id: deployer
    runtime: claude
    context_strategy: none             # 85% token savings — no context needed
    system_prompt: "Deploy: {{ input }}"

pip install aqm && aqm init && aqm run "Add JWT authentication"

What makes aqm different

| Problem | Single Agent | aqm |
| --- | --- | --- |
| Same LLM reviews its own code | One bias, one perspective | Cross-LLM verification (Claude writes, Gemini reviews) |
| No forced quality checks | Agent says "looks good" to itself | Quality gates auto-reject and retry |
| Context window explodes at scale | Everything in one conversation | 5 context strategies, 55-85% token savings |
| Can't standardize team processes | Every run is ad-hoc | YAML pipelines: version-controlled, shareable |
| Complex tasks lose track of progress | No built-in task tracking | Chunk decomposition: agents break work into trackable units |
| Expensive API costs | Per-token API billing adds up | CLI-based: uses your existing CLI subscriptions, no extra API fees |
| Setup overhead | API keys, SDKs, env configs | Zero config: uses CLI tools you already have |

Install

pip install aqm

Requires Python 3.11+. At least one LLM CLI must be installed:

| Runtime | Provider | Install |
| --- | --- | --- |
| `claude` | Anthropic | `npm i -g @anthropic-ai/claude-code && claude login` |
| `gemini` | Google | `npm i -g @google/gemini-cli` |
| `codex` | OpenAI | `npm i -g @openai/codex` |

No API keys or SDK setup needed — aqm runs CLI tools as subprocesses. You pay for the CLI subscriptions you already have, not per-token API fees.
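As a rough illustration of what "running a CLI tool as a subprocess" can look like, here is a minimal Python sketch. The `run_cli` helper and the `echo_cmd` stand-in command are hypothetical and are not taken from aqm's runtime code; a real runtime would invoke `claude`, `gemini`, or `codex` with whatever arguments that tool expects.

```python
import subprocess
import sys


def run_cli(command: list[str], prompt: str, timeout: float = 600.0) -> str:
    """Send `prompt` to a CLI tool on stdin and return what it prints to stdout."""
    result = subprocess.run(
        command,
        input=prompt,
        capture_output=True,
        text=True,
        timeout=timeout,  # raises subprocess.TimeoutExpired if the tool hangs
        check=True,       # raises subprocess.CalledProcessError on a non-zero exit
    )
    return result.stdout


# Hypothetical stand-in for an LLM CLI: a child Python process that upper-cases stdin.
echo_cmd = [sys.executable, "-c", "import sys; print(sys.stdin.read().upper())"]
reply = run_cli(echo_cmd, "add jwt authentication")
print(reply.strip())
```

Because the child process handles its own login and billing, the parent only needs the tool to be on the PATH, which is consistent with the "no API keys" claim above.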

Quick Start

cd my-project
aqm init                              # Interactive setup wizard
aqm run "Add JWT authentication"       # Run pipeline
aqm serve                              # Web dashboard at localhost:8000

Real-World Examples

Example 1: Code Review Pipeline

Every PR goes through planning, implementation, review, and testing — automatically.

agents:
  - id: planner
    runtime: gemini
    system_prompt: "Break this into implementation steps: {{ input }}"
    handoffs: [{ to: developer }]

  - id: developer
    runtime: claude
    mcp: [{ server: github }]
    system_prompt: "Implement the plan: {{ input }}"
    handoffs: [{ to: reviewer }]

  - id: reviewer
    runtime: gemini                    # Different LLM = different perspective
    system_prompt: "Review for bugs and security issues: {{ input }}"
    gate:
      type: llm
      prompt: "Is this code production-ready? Check OWASP Top 10."
      max_retries: 3
    handoffs:
      - { to: qa, condition: on_approve }
      - { to: developer, condition: on_reject }

  - id: qa
    runtime: claude
    context_strategy: last_only        # Only needs reviewer's output → 55% fewer tokens
    system_prompt: "Write tests for: {{ input }}"

aqm run "Add user preferences with database, API, and frontend"

Example 2: Architecture Decision Session

Multiple experts debate until they agree — like a real design meeting.

agents:
  - id: architect
    runtime: claude
    system_prompt: |
      You are a software architect. Discuss: {{ input }}
      Previous discussion: {{ transcript }}

  - id: security
    runtime: gemini
    system_prompt: |
      You are a security expert. Focus on threats: {{ input }}
      Previous discussion: {{ transcript }}

  - id: design_session
    type: session
    participants: [architect, security]
    max_rounds: 5
    consensus:
      method: vote
      keyword: "VOTE: AGREE"
      require: all
    summary_agent: architect
    handoffs: [{ to: developer }]

── Round 1 ──
  [architect] JWT for stateless scaling. Token rotation every 15min...
  [security] Token revocation is the weak point. Consider hybrid...
── Round 2 ──
  [architect] Agreed — hybrid with Redis blacklist. VOTE: AGREE  ✓
  [security] Redis approach works. VOTE: AGREE  ✓
✓ Consensus reached (round 2)

Example 3: Human-in-the-Loop Deployment

AI does the work, but humans approve the critical steps.

agents:
  - id: developer
    runtime: claude
    human_input:
      mode: before
      prompt: "What features do you want? Any constraints?"
    system_prompt: "Build: {{ input }}"
    handoffs: [{ to: deployer }]

  - id: deployer
    runtime: claude
    gate: { type: human }              # Pipeline pauses for manual approval
    system_prompt: "Deploy: {{ input }}"

aqm run "Refactor auth module"
# → Developer asks for your input first
# → After coding, pipeline pauses at deployer
aqm approve T-ABC123 -r "LGTM, deploy to staging"

Features

Multi-LLM Runtimes

Mix providers per agent. Claude writes code, Gemini reviews it, Codex tests it.

agents:
  - id: planner
    runtime: gemini
    model: gemini-2.5-flash
    system_prompt: "Plan: {{ input }}"
    handoffs: [{ to: developer }]

  - id: developer
    runtime: claude
    mcp: [{ server: github }]         # Auto Code mode
    system_prompt: "Implement: {{ input }}"

Conversational Sessions

Session nodes let multiple agents discuss in rounds until consensus — like a meeting.

agents:
  - id: design_review
    type: session
    participants: [architect, frontend, security]
    turn_order: round_robin           # or: moderator
    max_rounds: 5
    consensus:
      method: vote                    # or: moderator_decides
      keyword: "VOTE: AGREE"
      require: all                    # or: majority
    summary_agent: architect
    handoffs: [{ to: implementer }]

Consensus methods:

| Method | How It Works |
| --- | --- |
| `vote` | Each agent includes the keyword in their output. Consensus when `all` or `majority` agree. |
| `moderator_decides` | Only the `summary_agent` can declare consensus. |

Each session writes its meeting minutes to transcript.md. Session nodes and regular (batch) agents can be mixed freely in one pipeline: batch → session → batch.
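The `vote` method can be pictured as a keyword count over the latest round of outputs. The sketch below is an assumption about how such a check might work, not aqm's implementation; `has_consensus` is a hypothetical name, and "majority" is taken here to mean strictly more than half.

```python
def has_consensus(
    latest_outputs: dict[str, str],
    keyword: str = "VOTE: AGREE",
    require: str = "all",
) -> bool:
    """Return True when enough participants included the keyword this round."""
    total = len(latest_outputs)
    if total == 0:
        return False
    votes = sum(keyword in text for text in latest_outputs.values())
    if require == "all":
        return votes == total
    if require == "majority":
        return votes > total / 2  # assumption: strictly more than half
    raise ValueError(f"unknown require value: {require!r}")


# The two rounds from the Architecture Decision Session example.
round_1 = {
    "architect": "JWT for stateless scaling. Token rotation every 15min...",
    "security": "Token revocation is the weak point. Consider hybrid...",
}
round_2 = {
    "architect": "Agreed, hybrid with Redis blacklist. VOTE: AGREE",
    "security": "Redis approach works. VOTE: AGREE",
}
print(has_consensus(round_1), has_consensus(round_2))
```

A session loop would presumably call a check like this after every round and stop at `max_rounds` if it never returns True.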

Chunk Decomposition

Break tasks into trackable work units. Agents manage chunks via output directives.

- id: build_session
  type: session
  participants: [pm, dev]
  consensus:
    require_chunks_done: true         # All chunks must be done
  chunks:
    enabled: true
    initial:
      - "Set up project structure"
      - "Implement auth flow"
      - "Add unit tests"

Agent directives:

CHUNK_ADD: Implement drag-and-drop     → adds new chunk
CHUNK_DONE: C-001                      → marks chunk complete
CHUNK_REMOVE: C-003                    → removes chunk

The {{ chunks }} template variable injects a chunk status table into prompts. Chunk state is stored in chunks.json.
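To make the directive mechanism concrete, here is a small Python sketch that scans agent output for the three directives and updates an in-memory board. `ChunkBoard`, its fields, and the `pending`/`done` status strings are hypothetical; the real `ChunkManager` may number, validate, and persist chunks differently.

```python
import re
from dataclasses import dataclass, field

# One directive per line: "CHUNK_ADD: text", "CHUNK_DONE: C-001", "CHUNK_REMOVE: C-003".
DIRECTIVE = re.compile(
    r"^(CHUNK_ADD|CHUNK_DONE|CHUNK_REMOVE):[ \t]*(\S.*?)[ \t]*$", re.MULTILINE
)


@dataclass
class ChunkBoard:
    chunks: dict[str, dict[str, str]] = field(default_factory=dict)
    next_number: int = 1

    def add(self, description: str) -> str:
        chunk_id = f"C-{self.next_number:03d}"
        self.next_number += 1
        self.chunks[chunk_id] = {"description": description, "status": "pending"}
        return chunk_id

    def apply(self, agent_output: str) -> None:
        """Apply every directive found in one agent response, in order."""
        for kind, argument in DIRECTIVE.findall(agent_output):
            if kind == "CHUNK_ADD":
                self.add(argument)
            elif kind == "CHUNK_DONE" and argument in self.chunks:
                self.chunks[argument]["status"] = "done"
            elif kind == "CHUNK_REMOVE":
                self.chunks.pop(argument, None)

    def all_done(self) -> bool:
        return all(c["status"] == "done" for c in self.chunks.values())


board = ChunkBoard()
for description in ["Set up project structure", "Implement auth flow", "Add unit tests"]:
    board.add(description)
board.apply(
    "Structure is ready.\n"
    "CHUNK_DONE: C-001\n"
    "CHUNK_ADD: Implement drag-and-drop\n"
    "CHUNK_REMOVE: C-003\n"
)
print(board.chunks)
```

A check like `all_done()` is the kind of condition `require_chunks_done: true` would gate consensus on.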

CLI:

aqm chunks list T-ABC123
aqm chunks add T-ABC123 "New feature"
aqm chunks done T-ABC123 C-001
aqm chunks remove T-ABC123 C-002

Web API: CRUD at /api/tasks/{id}/chunks with SSE chunk_update events.

Context Strategy (Token Optimization)

Each agent has a context_strategy that controls what {{ context }} contains. Narrower strategies save tokens by avoiding redundant context injection.

agents:
  - id: planner
    context_strategy: both            # Full visibility (default)

  - id: developer
    context_strategy: last_only       # Only previous stage → 55% savings
    context_window: 1

  - id: deployer
    context_strategy: none            # No context → 85% savings

| Strategy | `{{ context }}` Contains | Token Savings | Use Case |
| --- | --- | --- | --- |
| `both` (default) | Shared context.md + agent's private notes | 0% (baseline) | Full visibility, backward-compatible |
| `shared` | Smart-windowed shared context.md | About the same as `both` | Agents that need pipeline history |
| `last_only` | Only the most recent stage output | ~55% | Agents that only need the previous step |
| `own` | Agent's private `agent_{id}.md` only | ~85% | Focused agents with their own notes |
| `none` | Empty (no context injected) | ~85% | Self-contained agents with no context needed |
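The table can be read as a simple selection function over two sources of context. The following Python sketch is one possible reading; `build_context` is a hypothetical name, stage outputs and private notes are modeled as plain strings, and the smart windowing and summarization of older stages are deliberately left out.

```python
def build_context(strategy: str, stage_outputs: list[str], private_notes: str) -> str:
    """Pick what would be injected as {{ context }} for a given strategy."""
    shared = "\n\n".join(stage_outputs)  # stands in for the shared context.md
    if strategy == "both":
        return "\n\n".join(part for part in (shared, private_notes) if part)
    if strategy == "shared":
        return shared
    if strategy == "last_only":
        return stage_outputs[-1] if stage_outputs else ""
    if strategy == "own":
        return private_notes  # stands in for agent_{id}.md
    if strategy == "none":
        return ""
    raise ValueError(f"unknown context_strategy: {strategy!r}")


stages = ["plan: three steps", "code: auth.py written", "review: approved"]
notes = "remember to rotate tokens"
for name in ("both", "shared", "last_only", "own", "none"):
    print(name, len(build_context(name, stages, notes)))
```

The shorter the returned string, the fewer tokens the agent's prompt carries, which is where the savings in the table come from.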

Benchmarked on a 10-agent pipeline (see tests/bench_token_efficiency.py):

Strategy      Total Tokens   Savings
both              12,233        0%
last_only          5,504       55%
none               1,873       85%
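The savings column follows directly from the token totals, with `both` as the baseline. A short Python check of that arithmetic (the variable names are illustrative):

```python
# Token totals copied from the benchmark table above.
totals = {"both": 12_233, "last_only": 5_504, "none": 1_873}

baseline = totals["both"]
savings = {
    name: round(100 * (1 - tokens / baseline))  # percent saved relative to `both`
    for name, tokens in totals.items()
}
print(savings)
```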

Handoff Routing

Three strategies for task flow:

# Static — fixed target
handoffs:
  - to: reviewer
    condition: always

# Fan-out — multiple targets in parallel
handoffs:
  - to: qa, docs, deploy
    condition: on_approve

# Agent-decided — agent picks target at runtime
handoffs:
  - to: "*"
    condition: auto    # Agent includes HANDOFF: <id> in output

Conditions: always, on_approve, on_reject, on_pass, auto, or an expression such as severity == critical.

Payload variables: {{ output }}, {{ input }}, {{ reject_reason }}, {{ gate_result }}
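The three routing styles can be sketched as a single resolver that turns handoff rules into a list of target agent IDs. This is a simplified illustration under stated assumptions: `resolve_targets` is a hypothetical function, the gate result is assumed to be the string `approved` or `rejected`, and `on_pass` and expression conditions are not modeled.

```python
import re
from typing import Optional

# Agent-decided routing: the agent writes "HANDOFF: <id>" on its own line.
HANDOFF_DIRECTIVE = re.compile(r"^HANDOFF:[ \t]*(\S+)", re.MULTILINE)


def resolve_targets(
    handoffs: list[dict], gate_result: Optional[str], agent_output: str
) -> list[str]:
    """Return the agent IDs that should receive the task next."""
    targets: list[str] = []
    for rule in handoffs:
        condition = rule.get("condition", "always")
        if condition == "auto":
            match = HANDOFF_DIRECTIVE.search(agent_output)
            if match:
                targets.append(match.group(1))
            continue
        fires = (
            condition == "always"
            or (condition == "on_approve" and gate_result == "approved")
            or (condition == "on_reject" and gate_result == "rejected")
        )
        if fires:
            # Fan-out: a comma-separated `to` value names several targets.
            targets.extend(part.strip() for part in rule["to"].split(","))
    return targets


rules = [
    {"to": "qa, docs, deploy", "condition": "on_approve"},
    {"to": "developer", "condition": "on_reject"},
]
print(resolve_targets(rules, "approved", ""))
print(resolve_targets(rules, "rejected", ""))
```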

Human Input (Human-in-the-Loop)

agents:
  - id: planner
    human_input:
      mode: before           # Ask before agent runs
      prompt: "What specific features do you want?"

  - id: developer
    human_input: true        # Shorthand: agent can ask mid-execution via HUMAN_INPUT: <question>

Modes:

| Mode | Behavior |
| --- | --- |
| `before` | Always pause and ask the user before the agent runs. |
| `on_demand` | Agent requests input via `HUMAN_INPUT: <question>` directives in output. |
| `both` | Combines both modes. |

Gates (Quality Control)

gate:
  type: llm              # LLM auto-evaluates → approved/rejected
  prompt: "Is this production-ready?"
  max_retries: 3         # Reject → retry up to 3 times, then fail

gate:
  type: human            # Pauses pipeline → aqm approve/reject
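The reject-and-retry behavior of an LLM gate can be sketched as a bounded loop. In the Python sketch below, `run_with_gate`, `stub_agent`, and `stub_gate` are hypothetical stand-ins, and it assumes `max_retries: 3` means one initial attempt plus up to three retries, with the rejection reason fed back into the next prompt.

```python
from typing import Callable


def run_with_gate(
    agent: Callable[[str], str],
    gate: Callable[[str], tuple[bool, str]],
    task: str,
    max_retries: int = 3,
) -> str:
    """Run the agent, re-running it with the reject reason until the gate approves."""
    prompt = task
    for _attempt in range(1 + max_retries):
        output = agent(prompt)
        approved, reason = gate(output)
        if approved:
            return output
        prompt = f"{task}\n\nPrevious attempt was rejected: {reason}"
    raise RuntimeError(f"gate rejected the output after {max_retries} retries")


calls: list[str] = []


def stub_agent(prompt: str) -> str:
    calls.append(prompt)
    return f"draft {len(calls)}"


def stub_gate(output: str) -> tuple[bool, str]:
    # Pretend the evaluator only accepts the third draft.
    return (output == "draft 3", "needs tests")


result = run_with_gate(stub_agent, stub_gate, "Add JWT authentication")
print(result, len(calls))
```

A human gate would replace the `gate` callable with a pause that waits for `aqm approve` or `aqm reject`.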

Task Restart & Recovery

Resume failed or completed tasks from any stage — no need to start over.

How it works:

- Before each stage, aqm snapshots all context files (context.md, agent notes, transcripts).
- On failure, partial output from the runtime is preserved.
- aqm restart rolls back context to the chosen stage and re-executes from there.

# Restart from the failed stage (auto-detected)
aqm restart T-A3F2B1

# Restart from a specific stage
aqm restart T-A3F2B1 --from-stage 3

# Re-run everything from scratch
aqm restart T-A3F2B1 --from-stage 1

Works for failed, completed, stalled, and cancelled tasks. The web dashboard also provides a restart button with stage selection.

| Event | Action |
| --- | --- |
| Before each stage | Context files snapshotted to `snapshots/stage_N/` |
| Task completes successfully | All snapshots cleaned up |
| Task fails | Snapshots preserved for restart |
| `aqm restart --from-stage N` | Context restored from snapshot, stages truncated, pipeline resumes |
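The snapshot-and-restore cycle in the table can be approximated with plain file copies. This Python sketch assumes every context file is a top-level `.md` file in a per-task directory; the `snapshot` and `restore` helpers are hypothetical, and truncating the stage history is not modeled.

```python
import shutil
import tempfile
from pathlib import Path


def snapshot(task_dir: Path, stage: int) -> Path:
    """Copy the current context files into snapshots/stage_N/ before a stage runs."""
    dest = task_dir / "snapshots" / f"stage_{stage}"
    dest.mkdir(parents=True, exist_ok=True)
    for path in list(task_dir.glob("*.md")):
        shutil.copy2(path, dest / path.name)
    return dest


def restore(task_dir: Path, stage: int) -> None:
    """Roll the context files back to how they looked before the given stage."""
    source = task_dir / "snapshots" / f"stage_{stage}"
    for path in list(task_dir.glob("*.md")):
        path.unlink()
    for path in list(source.glob("*.md")):
        shutil.copy2(path, task_dir / path.name)


with tempfile.TemporaryDirectory() as tmp:
    task_dir = Path(tmp)
    context = task_dir / "context.md"
    context.write_text("stage 1 output\n")
    snapshot(task_dir, 2)  # taken just before stage 2 starts
    context.write_text("stage 1 output\nbroken stage 2 output\n")
    restore(task_dir, 2)   # what a restart from stage 2 would do first
    restored_text = context.read_text()

print(restored_text)
```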

MCP Servers

Give agents real-world capabilities via Model Context Protocol.

mcp:
  - server: github
  - server: filesystem
    args: ["/path/to/dir"]
  - server: custom-db
    command: node
    args: ["./mcp-server.js"]
    env: { DATABASE_URL: "postgres://..." }

Params (Portable Pipelines)

params:
  model: claude-sonnet-4-20250514
  project_path:
    type: string
    required: true
    prompt: "Project root path?"

agents:
  - id: dev
    model: ${{ params.model }}

Override: aqm run "task" --param model=claude-opus-4-6
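Parameter references of the form `${{ params.name }}` can be resolved with a small substitution pass. The Python sketch below is an assumption about the precedence (command-line override first, then the declared value); `resolve_params` is a hypothetical name, and a declared parameter given as a mapping (such as `project_path` above) is treated as having no default value.

```python
import re

PARAM_REF = re.compile(r"\$\{\{\s*params\.(\w+)\s*\}\}")


def resolve_params(text: str, declared: dict, overrides: dict[str, str]) -> str:
    """Replace every ${{ params.name }} reference in `text` with its value."""

    def lookup(match: re.Match) -> str:
        name = match.group(1)
        if name in overrides:  # e.g. supplied via --param name=value
            return overrides[name]
        spec = declared.get(name)
        if spec is None or isinstance(spec, dict):
            raise KeyError(f"no value supplied for param: {name}")
        return str(spec)

    return PARAM_REF.sub(lookup, text)


declared = {
    "model": "claude-sonnet-4-20250514",
    "project_path": {"type": "string", "required": True, "prompt": "Project root path?"},
}
print(resolve_params("${{ params.model }}", declared, {}))
print(resolve_params("${{ params.model }}", declared, {"model": "claude-opus-4-6"}))
```

In aqm itself, a required parameter without a value would presumably trigger its `prompt` rather than an error; the sketch raises instead to stay non-interactive.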

Imports / Extends

imports:
  - from: ./shared/reviewers.yaml
    agents: [security_reviewer]

agents:
  - id: base_reviewer
    abstract: true
    runtime: claude
    gate: { type: llm }

  - id: code_reviewer
    extends: base_reviewer
    system_prompt: "Review code: {{ input }}"
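Inheritance via `extends` can be thought of as a dictionary merge in which the child's fields win. The Python sketch below assumes a shallow merge and no cycle detection; `resolve_extends` is a hypothetical helper, and aqm's actual merge rules (for example for nested `gate` or `handoffs` values) may differ.

```python
def resolve_extends(agents: list[dict]) -> list[dict]:
    """Flatten `extends` chains and drop abstract (template-only) agents."""
    by_id = {agent["id"]: agent for agent in agents}

    def flatten(agent: dict) -> dict:
        parent_id = agent.get("extends")
        if parent_id is None:
            return dict(agent)
        merged = flatten(by_id[parent_id])  # start from the fully resolved parent
        merged.update(agent)                # child fields override inherited ones
        merged.pop("abstract", None)        # abstractness is not inherited
        merged.pop("extends", None)
        return merged

    return [flatten(agent) for agent in agents if not agent.get("abstract")]


agents = [
    {"id": "base_reviewer", "abstract": True, "runtime": "claude", "gate": {"type": "llm"}},
    {"id": "code_reviewer", "extends": "base_reviewer", "system_prompt": "Review code: {{ input }}"},
]
resolved = resolve_extends(agents)
print(resolved)
```

Here `code_reviewer` ends up with the parent's `runtime` and `gate` plus its own `system_prompt`, while `base_reviewer` is never executed.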

Pipeline Registry (Share & Discover)

aqm search "code review"              # Find community pipelines
aqm pull security-audit               # Install in one command
aqm publish --name my-pipeline        # Share yours

CLI Reference

# Setup
aqm init                              # Interactive setup wizard
aqm validate                          # Validate agents.yaml
aqm agents                            # Show agent graph

# Run
aqm run "Add JWT auth"                # Run default pipeline
aqm run "Fix bug" --agent bug_fixer   # Start from specific agent
aqm run "Build API" --pipeline backend # Named pipeline
aqm run "Task" --param model=opus     # Override parameters

# Manage
aqm list                              # List all tasks
aqm status T-ABC123                   # Task details
aqm cancel T-ABC123                   # Cancel task
aqm fix T-ABC123 "Fix the color"      # Follow-up with context
aqm restart T-ABC123                  # Restart from failed stage
aqm restart T-ABC123 --from-stage 2   # Restart from specific stage

# Gates & Human Input
aqm approve T-ABC123                  # Approve gate
aqm reject T-ABC123 -r "Needs tests"  # Reject gate
aqm human-input T-ABC123 "response"   # Answer agent's question

# Chunks
aqm chunks list T-ABC123              # Status table
aqm chunks done T-ABC123 C-001        # Mark done

# Pipelines
aqm pipeline list                     # List pipelines
aqm pipeline create review --ai       # AI-generate
aqm pipeline default review           # Set default

# Registry
aqm search "code review"              # Search
aqm pull code-review-pipeline         # Install
aqm publish --name my-pipeline        # Share

# Dashboard
aqm serve                             # Web UI at localhost:8000

agents.yaml Reference

Entry Point (Auto-Routing)

entry_point: auto    # LLM picks the best agent based on user input
# entry_point: first  # (default) Always start with the first agent

Agent Definition

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| `id` | `string` | (none) | Unique identifier (required) |
| `name` | `string` | `""` | Display name (auto-generated from id if empty) |
| `type` | `"agent"` \| `"session"` | `"agent"` | Node type |
| `runtime` | `"claude"` \| `"gemini"` \| `"codex"` | (none) | Required for `type: agent` |
| `model` | `string` | CLI default | Model override |
| `system_prompt` | `string` | `""` | Jinja2 template: `{{ input }}`, `{{ context }}`, `{{ transcript }}`, `{{ chunks }}` |
| `context_strategy` | `"none"` \| `"last_only"` \| `"own"` \| `"shared"` \| `"both"` | `"both"` | What context to inject (token optimization) |
| `context_window` | `int` | `3` | Recent stages in full; older stages summarized (0 = all) |
| `human_input` | `boolean` \| `object` | `null` | Human-in-the-loop input (`before`, `on_demand`, `both`) |
| `handoffs` | `list[Handoff]` | `[]` | Routing rules |
| `gate` | `object` | `null` | Quality gate |
| `mcp` | `list[MCPServer]` | `[]` | MCP server connections |
| `claude_code_flags` | `list[string]` | `null` | Extra CLI flags for Claude |
| `abstract` | `boolean` | `false` | Template-only agent (not executed) |
| `extends` | `string` | `null` | Parent agent ID for inheritance |

Handoff Fields

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| `to` | `string` | (none) | Target agent ID, or comma-separated for fan-out (`"qa, docs"`) |
| `task` | `string` | `""` | Task name label |
| `condition` | `string` | `"always"` | `always`, `on_approve`, `on_reject`, `on_pass`, `auto`, or expression |
| `payload` | `string` | `"{{ output }}"` | Jinja2 template: `{{ output }}`, `{{ input }}`, `{{ reject_reason }}`, `{{ gate_result }}` |

Session Fields (type: session)

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| `participants` | `list[string]` | (none) | Agent IDs (required) |
| `turn_order` | `"round_robin"` \| `"moderator"` | `"round_robin"` | Turn ordering |
| `max_rounds` | `int` | `10` | Hard limit |
| `consensus.method` | `"vote"` \| `"moderator_decides"` | `"vote"` | How to detect agreement |
| `consensus.keyword` | `string` | `"VOTE: AGREE"` | Agreement signal |
| `consensus.require` | `"all"` \| `"majority"` | `"all"` | Threshold |
| `consensus.require_chunks_done` | `boolean` | `false` | Gate on chunk completion |
| `summary_agent` | `string` | `null` | Final summary producer |
| `chunks.enabled` | `boolean` | `true` | Enable chunk tracking |
| `chunks.initial` | `list[string]` | `[]` | Seed chunks |

config.yaml Reference

Project-level configuration at .aqm/config.yaml. All fields are optional.

pipeline:
  max_stages: 20
gate:
  model: claude-sonnet-4-20250514
  timeout: 120
timeouts:
  claude: 600
  gemini: 600
  codex: 600

Comparison

| Feature | LangGraph | CrewAI | AutoGen | aqm |
| --- | --- | --- | --- | --- |
| Pipeline definition | Python | Python + YAML | Python | YAML only |
| Pipeline sharing | ❌ | Paid | ❌ | Open registry |
| Multi-agent discussion | ❌ | ❌ | Group chat | Session nodes + consensus voting |
| Task decomposition | ❌ | ❌ | ❌ | Chunk tracking |
| Context optimization | ❌ | Auto-summarize | ❌ | 5 strategies (55-85% savings) |
| Multi-LLM | LangChain | LiteLLM | Multiple | CLI subprocess (no API keys) |
| Cost model | Per-token API | Per-token API | Per-token API | CLI subscription (no extra fees) |
| Human-in-the-loop | Middleware | Webhooks | HumanProxy | First-class per-agent config |
| Quality gates | ❌ | Callbacks | ❌ | LLM + Human gates |
| Auto entry routing | ❌ | ❌ | ❌ | LLM-based `entry_point: auto` |
| Fan-out parallel | Manual | Manual | ❌ | Declarative |
| Real-time streaming | ❌ | ❌ | ❌ | Token-level SSE |
| Web dashboard | Paid | Paid | ❌ | Built-in (free) |

Architecture

aqm/
├── core/
│   ├── agent.py          # AgentDefinition, ConsensusConfig, ChunksConfig, HumanInputConfig
│   ├── pipeline.py       # Pipeline loop + _run_session() + context strategy
│   ├── chunks.py         # Chunk model, ChunkManager, directive parser
│   ├── task.py           # Task, StageRecord, TaskStatus
│   ├── gate.py           # LLMGate / HumanGate
│   ├── context_file.py   # context.md + agent_{id}.md + transcript.md + smart windowing
│   ├── context.py        # Jinja2 prompt builder
│   ├── config.py         # ProjectConfig (.aqm/config.yaml)
│   └── project.py        # Project root detection
├── queue/
│   ├── base.py           # AbstractQueue interface
│   ├── sqlite.py         # SQLiteQueue (production)
│   └── file.py           # FileQueue (testing)
├── runtime/
│   ├── base.py           # AbstractRuntime interface
│   ├── claude_code.py    # Claude Code (with MCP, token streaming)
│   ├── gemini.py         # Gemini CLI
│   └── codex.py          # Codex CLI
├── web/
│   ├── app.py            # FastAPI app factory
│   ├── templates.py      # Shared CSS/layout/helpers
│   ├── pages/            # Page renderers (dashboard, agents, registry, validate, task_detail)
│   └── api/              # REST + chunk + SSE + human input endpoints
├── registry.py           # GitHub pipeline registry
└── cli.py                # Click CLI

Community

Discord | Registry | JSON Schema

Contributing

git clone https://github.com/aqm-framework/aqm
cd aqm
pip install -e ".[dev,serve]"
pytest tests/

Pipeline contributions are valued as highly as code contributions. See CONTRIBUTING.md.

License

MIT

Version 1.3.0, released Mar 26, 2026.
