Graph control for AI-agent workflows.
AgentProp
Graph control for agent workflows.
AgentProp studies AI-agent workflows as directed weighted graphs. Agents, tools, context packets, verifier calls, terminal commands, and failure states become nodes and edges in a graph that can be measured, simulated, and controlled.
The research wedge is simple:
- Metric dimension is the core contribution: framing verifier placement as a resolving set makes failure localization a provable property — if resolving coverage is 1.0, every distinct failure produces a unique signature and any single faulty node is uniquely identifiable. With fault-tolerant metric dimension, this holds even if one verifier itself fails. No weighted-heuristic placement can promise this.
- Quality cascade models how correctness and compression propagate, so context allocation follows the quality actually reaching each node.
- Randomized Zero Forcing (RZF) is a secondary, scoped result: process-based RZF centrality helps on large workflows where static centrality misjudges reachability; on small graphs (under ~15 nodes) classical centrality is competitive. Reported honestly, not as a universal win.
- Runtime control turns those ideas into actions: verify, retry, stop, switch strategy, or send more context.
AgentProp is not another agent orchestrator. It wraps a workflow you already
have: at each step your agent proposes work, and the controller inspects the
accumulated ExecutionEvent history and decides what happens next.
task ─► ┌─ AgentProp control loop ──────────────────────┐
        │ ┌────────┐  propose   ┌─────────────────────┐ │
        │ │  your  │ ─────────► │ Stopping Controller │ │
        │ │ agent  │ ◄───────── │ CONTINUE/VERIFY/    │ │ ─► result +
        │ └────────┘  decision  │ SWITCH/FINALIZE     │ │    decision trace
        │     └─ ExecutionEvent ┴─────────────────────┘ │
        │  (tokens, exit code, verifier_passed, ...)    │
        └───────────────────────────────────────────────┘
Every decision is logged, so the trace is auditable. The only contract your
harness must satisfy is emitting one ExecutionEvent per step. AgentProp ships
dependency-light adapters for LangGraph, AutoGen, CrewAI, OpenAI Agents, and
LlamaIndex (see framework integrations), and
controls any other harness that can return an ExecutionEvent.
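The loop described above can be sketched in a few lines. This is a toy illustration only, not AgentProp's controller: `StepEvent`, `Decision`, `decide`, and the `verify_every` threshold are hypothetical names invented here, and the rules are assumptions chosen to show how a policy over accumulated events might map to the four decisions in the diagram.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class Decision(Enum):
    CONTINUE = "continue"
    VERIFY = "verify"
    SWITCH = "switch"
    FINALIZE = "finalize"


@dataclass
class StepEvent:
    """Hypothetical stand-in for one per-step event (not agentprop's ExecutionEvent)."""
    step: int
    tokens_used: int
    verifier_run: bool = False
    verifier_passed: bool = False
    error_signature: Optional[str] = None


def decide(history: List[StepEvent], token_budget: int, verify_every: int = 3) -> Decision:
    """Toy policy: look at the whole event history and pick the next action."""
    last = history[-1]
    if last.verifier_run and last.verifier_passed:
        return Decision.FINALIZE  # verified success: stop
    if sum(e.tokens_used for e in history) >= token_budget:
        return Decision.FINALIZE  # budget exhausted: stop spending
    errors = [e.error_signature for e in history if e.error_signature]
    if len(errors) >= 2 and errors[-1] == errors[-2]:
        return Decision.SWITCH  # the two most recent failures share a signature
    steps_since_verify = 0
    for event in reversed(history):
        if event.verifier_run:
            break
        steps_since_verify += 1
    if steps_since_verify >= verify_every:
        return Decision.VERIFY  # too many unverified steps in a row
    return Decision.CONTINUE


# One event per step; every decision is appended to an auditable trace.
events = [
    StepEvent(1, 10_000),
    StepEvent(2, 10_000, verifier_run=True, error_signature="AssertionError:test_edge_case"),
    StepEvent(3, 10_000, verifier_run=True, error_signature="AssertionError:test_edge_case"),
    StepEvent(4, 10_000, verifier_run=True, verifier_passed=True),
]
history: List[StepEvent] = []
trace: List[Decision] = []
for event in events:
    history.append(event)
    trace.append(decide(history, token_budget=120_000))

print([d.name for d in trace])  # ['CONTINUE', 'CONTINUE', 'SWITCH', 'FINALIZE']
```

The point of the sketch is the shape of the contract: the harness only reports what happened at each step, and all control logic lives in a function of the history.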
Why metric dimension matters (intuition): a workflow only fails usefully if you can tell which node failed. With verifiers placed badly, a bad planner output and a bad tester output can produce the same observable signature — so you cannot route a fix. A resolving set guarantees each node's vector of distances to the verifiers is unique, giving every distinct failure a distinct fingerprint. See verifier semantics.
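The fingerprint argument can be checked by hand on a four-node chain. The sketch below is not AgentProp's implementation: it assumes an undirected, unweighted graph with hop-count distances, and `hop_distances`, `signatures`, `is_resolving`, and `is_fault_tolerant` are illustrative names introduced here.

```python
from collections import deque
from typing import Dict, List, Tuple

Graph = Dict[str, List[str]]


def hop_distances(adj: Graph, source: str) -> Dict[str, int]:
    """Breadth-first hop counts from `source` to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def signatures(adj: Graph, verifiers: List[str]) -> Dict[str, Tuple[float, ...]]:
    """Each node's vector of distances to the verifiers (inf if unreachable)."""
    tables = [hop_distances(adj, v) for v in verifiers]
    return {node: tuple(t.get(node, float("inf")) for t in tables) for node in adj}


def is_resolving(adj: Graph, verifiers: List[str]) -> bool:
    """True when no two nodes share a distance vector."""
    sigs = signatures(adj, verifiers)
    return len(set(sigs.values())) == len(sigs)


def is_fault_tolerant(adj: Graph, verifiers: List[str]) -> bool:
    """True when the set still resolves after losing any single verifier."""
    return is_resolving(adj, verifiers) and all(
        is_resolving(adj, verifiers[:i] + verifiers[i + 1:])
        for i in range(len(verifiers))
    )


# Undirected chain: planner - coder - tester - reviewer (an assumed toy workflow).
chain: Graph = {
    "planner": ["coder"],
    "coder": ["planner", "tester"],
    "tester": ["coder", "reviewer"],
    "reviewer": ["tester"],
}

print(signatures(chain, ["coder"]))   # planner and tester both get (1,)
print(is_resolving(chain, ["coder"]))                     # False
print(is_resolving(chain, ["planner"]))                   # True
print(is_fault_tolerant(chain, ["planner", "reviewer"]))  # True
print(is_fault_tolerant(chain, ["planner", "coder"]))     # False
```

A verifier at the coder sees the planner and the tester at the same distance, which is the collision described above. Moving it to an endpoint separates all four nodes, and placing one at each endpoint keeps them separated even if either verifier is lost.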
Early Signal
On one Terminal-Bench 2.1 smoke task using Harbor's codex agent with
gpt-5.5, the AgentProp A2 controller preserved success while reducing spend:
| Task | Arm | Result | Tokens | Cost | Time |
|---|---|---|---|---|---|
| regex-log | A0 raw Codex | pass | 123,731 | $0.333551 | 203.8s |
| regex-log | A2 AgentProp control | pass | 81,949 | $0.196834 | 173.6s |
That is 33.8% fewer tokens, 41.0% lower cost, and 14.8% less wall time on a pass-preserving comparison. This is a single-task early signal, not a benchmark claim; the point is that AgentProp can already act as a spend-aware controller around live coding-agent execution.
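The percentages follow directly from the two table rows. A minimal check, where `pct_reduction` and the dictionary keys are names chosen for this example:

```python
def pct_reduction(baseline: float, controlled: float) -> float:
    """Relative saving of `controlled` against `baseline`, in percent."""
    return 100.0 * (baseline - controlled) / baseline


# Values copied from the A0 and A2 rows of the table.
a0 = {"tokens": 123_731, "cost_usd": 0.333551, "seconds": 203.8}
a2 = {"tokens": 81_949, "cost_usd": 0.196834, "seconds": 173.6}

savings = {key: round(pct_reduction(a0[key], a2[key]), 1) for key in a0}
print(savings)  # {'tokens': 33.8, 'cost_usd': 41.0, 'seconds': 14.8}
```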
What Is Implemented
- Directed weighted AgentGraph with JSON validation, NetworkX conversion, and Graphviz export.
- Propagation models: Independent Cascade, Linear Threshold, Bootstrap Percolation, deterministic Zero Forcing, Randomized Zero Forcing, learned propagation, and Quality Cascade.
- Graph algorithms for seed selection, pruning, bottlenecks, k-core, bridges, articulation points, centrality, verifier placement, and resolving coverage.
- Metric-dimension verifier placement, including fault-tolerant resolving coverage for single-verifier failure.
- RZF process-based centrality for seed selection and scaling studies.
- Runtime controllers for graph-node execution, terminal-loop control, verifier forcing, local-pass distrust, retry/stop/switch decisions, and category-conditioned bandit policies.
- ControlSession, a small public facade that starts with graph analysis, observes real execution events, returns control decisions, and saves traces.
- Optional ML/DL/RL baselines: learned seed scorers, torch GNNs, Q-learning, REINFORCE, PPO, and artifact/checkpoint tooling.
- Coding-agent integration helpers for Codex, Claude Code, FastMCP tools, and framework adapters.
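To make one of the listed propagation models concrete, here is a textbook-style sketch of Independent Cascade on a directed weighted graph, in the sense of Kempe, Kleinberg, and Tardos: each newly activated node gets one chance to activate each inactive out-neighbor, with probability equal to the edge weight. It does not use AgentProp's classes; `independent_cascade`, `expected_spread`, and the edge weights are assumptions made for illustration.

```python
import random
from typing import Dict, Iterable, List, Set, Tuple

Edges = Dict[Tuple[str, str], float]  # (source, target) -> activation probability


def independent_cascade(edges: Edges, seeds: Iterable[str], rng: random.Random) -> Set[str]:
    """Run one cascade and return the set of activated nodes."""
    out: Dict[str, List[Tuple[str, float]]] = {}
    for (u, v), p in edges.items():
        out.setdefault(u, []).append((v, p))
    active = set(seeds)
    frontier = sorted(active)  # sorted so a fixed RNG seed gives a fixed run
    while frontier:
        next_frontier = []
        for u in frontier:
            for v, p in out.get(u, []):
                # One activation attempt per (newly active node, edge).
                if v not in active and rng.random() < p:
                    active.add(v)
                    next_frontier.append(v)
        frontier = next_frontier
    return active


def expected_spread(edges: Edges, seeds: Iterable[str], trials: int = 1000, seed: int = 0) -> float:
    """Monte Carlo estimate of the mean number of activated nodes."""
    rng = random.Random(seed)
    seed_list = list(seeds)
    total = sum(len(independent_cascade(edges, seed_list, rng)) for _ in range(trials))
    return total / trials


# Illustrative weights only; not taken from any AgentProp workflow file.
workflow: Edges = {
    ("planner", "coder"): 1.0,
    ("coder", "tester"): 0.5,
    ("tester", "reviewer"): 0.0,
}
print(expected_spread(workflow, ["planner"]))  # close to 2.5: planner, coder, and tester half the time
```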
Install
python -m pip install agentprop
For development:
python -m pip install -e ".[dev]"
python -m pytest
Optional extras:
python -m pip install -e ".[dl]" # torch-backed graph models
python -m pip install -e ".[rl]" # Gymnasium-compatible RL experiments
python -m pip install -e ".[mcp]" # FastMCP server for editor-agent tools
Quick Start
Analyze a built-in workflow:
agentprop analyze planner_coder_tester_reviewer
Recommend context seed nodes under the RZF propagation model:
agentprop optimize planner_coder_tester_reviewer \
--budget 2 \
--algorithm greedy \
--model rzf
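For intuition about what a greedy, budgeted seed selection does, the sketch below repeatedly adds the node with the largest marginal gain in spread. It is not the code behind `agentprop optimize`: deterministic reachability stands in for the propagation model, and `reachable`, `coverage`, `greedy_seeds`, and the toy graph are hypothetical.

```python
from typing import Callable, Dict, List, Set

Graph = Dict[str, List[str]]


def reachable(adj: Graph, seeds: List[str]) -> Set[str]:
    """All nodes reachable from the seeds along directed edges."""
    seen = set(seeds)
    stack = list(seeds)
    while stack:
        u = stack.pop()
        for v in adj.get(u, []):
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen


def coverage(adj: Graph, seeds: List[str]) -> float:
    """Stand-in spread function: number of nodes the seeds can reach."""
    return float(len(reachable(adj, seeds)))


def greedy_seeds(adj: Graph, budget: int, spread: Callable[[Graph, List[str]], float]) -> List[str]:
    """Pick up to `budget` seeds, each maximizing marginal gain in `spread`."""
    chosen: List[str] = []
    for _ in range(budget):
        base = spread(adj, chosen)
        best, best_gain = None, 0.0
        for node in sorted(adj):  # sorted for a deterministic tie-break
            if node in chosen:
                continue
            gain = spread(adj, chosen + [node]) - base
            if gain > best_gain:
                best, best_gain = node, gain
        if best is None:  # no remaining node adds anything
            break
        chosen.append(best)
    return chosen


# Two disconnected pipelines (an assumed toy graph).
toy: Graph = {
    "planner": ["coder"],
    "coder": ["tester"],
    "tester": [],
    "researcher": ["writer"],
    "writer": [],
}
print(greedy_seeds(toy, budget=2, spread=coverage))  # ['planner', 'researcher']
```

With a stochastic model, `spread` would be an expected value estimated over repeated simulations; the greedy loop itself stays the same.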
Compare graph propagation policies:
PYTHONPATH=src:. python experiments/run_benchmark.py \
--workflows chain planner_coder_tester_reviewer research_writer_verifier \
--algorithms rzf-centrality greedy betweenness pagerank random \
--models quality-cascade independent-cascade \
--budget 2 --trials 50 --decay --decay-seed 0 \
--out-dir results/my_run
Generate verifier-placement evidence:
PYTHONPATH=src:. python experiments/verifier_placement_evidence.py
Run the RZF scaling study:
PYTHONPATH=src:. python experiments/rzf_scaling_study.py
Both scripts are deterministic and print an expected-output block at the top of
the source so you can confirm you reproduced the published numbers (metric
dimension reaching a resolving set at lower budget k, and RZF leading on large
graphs). The headline figures are summarized in
reproducible results.
Run a key-free control-layer demo:
agentprop control-demo --demo terminal --out-dir reports/control-demo
The demo writes trace.jsonl, summary.json, and report.md. The trace starts
with graph analysis, then records runtime events, features, decisions, and the
final outcome.
Use the runtime control facade from Python:
from agentprop.runtime import ControlSession, ExecutionEvent

session = ControlSession.start(
    "planner_coder_tester_reviewer",
    task_id="task-123",
    category="implementation",
    token_budget=120_000,
    baseline_tokens=180_000,
)

decision = session.observe(
    ExecutionEvent(
        step=1,
        command="pytest -q",
        verifier_run=True,
        verifier_passed=False,
        error_signature="AssertionError:test_edge_case",
        tokens_used=18_000,
    )
)

session.write_artifacts("reports/task-123")
Coding-Agent Integration
AgentProp can be used with Codex CLI, Claude Code, or any MCP-capable editor
agent as a workflow-analysis layer. It does not need model API keys to generate
briefs or run local graph analysis; Codex can keep using codex login, and
Claude Code can use the included skill/MCP-style integration.
agentprop agent-instructions planner_coder_tester_reviewer \
--target codex \
--out reports/codex_agent_brief.md
agentprop agent-instructions planner_coder_tester_reviewer \
--target claude-code \
--out reports/claude_code_agent_brief.md
Use these briefs for everyday implementation/review tasks, or run
agentprop-mcp when a coding agent should call AgentProp tools directly while
designing or debugging a multi-agent workflow.
python -m pip install "agentprop[mcp]"
agentprop-mcp
The MCP server uses FastMCP when the extra is installed and exposes both analysis tools and live control-session tools. See the control layer quickstart and coding-agent integration guide.
The installable agent skill lives at
skills/agentprop-workflow-optimizer:
npx skills add https://github.com/aryan5v/AgentProp --skill agentprop-workflow-optimizer
Research Position
AgentProp sits between graph theory, diffusion models, and agent evaluation. The core hypothesis is that agent workflows should be optimized as communication graphs under quality, cost, and observability constraints, rather than treated as opaque prompt loops.
Key inspirations:
- Jesse Geneson et al., Randomized Zero Forcing: stochastic propagation on directed weighted graphs.
- Jesse Geneson, Metric dimension and pattern avoidance in graphs: resolving sets and graph observability.
- Jesse Geneson and Leslie Hogben, Propagation time for probabilistic zero forcing: expected propagation time as a graph parameter.
- Kempe, Kleinberg, and Tardos, Maximizing the Spread of Influence through a Social Network: influence maximization under cascade models.
- GPTSwarm, DyLAN, and AgentPrune: agent workflows as optimizable, sparse, task-adaptive communication graphs.
See the documentation index, research references, and the literature review for more detail.
Status
AgentProp is public alpha research software. The graph backbone, propagation models, runtime-control APIs, CLI, tests, and experiment scripts are usable, but the benchmark evidence is still early. Treat live-agent results as directional until larger repeated studies are published.
Development
ruff check .
mypy src
pytest
CI runs the same gates on pull requests. AgentProp is released under the Apache 2.0 license.