Graph control for AI-agent workflows.

AgentProp

Graph control for agent workflows.

AgentProp studies AI-agent workflows as directed weighted graphs. Agents, tools, context packets, verifier calls, terminal commands, and failure states become nodes and edges in a graph that can be measured, simulated, and controlled.

The research wedge is simple:

Metric dimension is the core contribution. Framing verifier placement as a resolving set makes failure localization a provable property: if resolving coverage is 1.0, every distinct failure produces a unique signature, so any single faulty node is uniquely identifiable. With fault-tolerant metric dimension, this holds even if one verifier itself fails. Weighted-heuristic placement cannot make this guarantee.
Quality cascade models how correctness and compression propagate, so context allocation follows the quality actually reaching each node.
Randomized Zero Forcing (RZF) is a secondary, scoped result: process-based RZF centrality helps on large workflows where static centrality misjudges reachability, while on small graphs (under ~15 nodes) classical centrality is competitive. We report it as a scoped result, not as a universal win.
Runtime control turns those ideas into actions: verify, retry, stop, switch strategy, or send more context.

AgentProp is not another agent orchestrator. It wraps a workflow you already have: at each step, your agent proposes work, and the controller inspects the accumulated ExecutionEvent history and decides what happens next.

   task ─► ┌─ AgentProp control loop ───────────────────────┐
           │  ┌────────┐  propose   ┌─────────────────────┐  │
           │  │  your  │ ─────────► │ Stopping Controller │  │
           │  │ agent  │ ◄───────── │ CONTINUE/VERIFY/    │  │ ─► result +
           │  └────────┘  decision  │ SWITCH/FINALIZE     │  │    decision trace
           │      └─ ExecutionEvent ┴─────────────────────┘  │
           │     (tokens, exit code, verifier_passed, ...)   │
           └─────────────────────────────────────────────────┘

Every decision is logged, so the trace is auditable. The only contract your harness must satisfy is emitting one ExecutionEvent per step. AgentProp ships dependency-light adapters for LangGraph, AutoGen, CrewAI, OpenAI Agents, and LlamaIndex (see framework integrations), and controls any other harness that can return an ExecutionEvent.
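The loop above can be sketched in a few lines. This is a toy illustration under stated assumptions, not AgentProp's implementation: `StepEvent`, `Decision`, `toy_controller`, and `run` are hypothetical names, and the policy thresholds are invented for the example. Only the decision labels (CONTINUE, VERIFY, SWITCH, FINALIZE) and the "one event per step" contract come from the description above.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    CONTINUE = "continue"
    VERIFY = "verify"
    SWITCH = "switch"
    FINALIZE = "finalize"


@dataclass
class StepEvent:
    """Hypothetical stand-in for an execution event; not agentprop's ExecutionEvent."""
    step: int
    tokens_used: int
    verifier_run: bool = False
    verifier_passed: bool = False


def toy_controller(history, token_budget):
    """Invented policy: decide from the accumulated event history."""
    last = history[-1]
    spent = sum(e.tokens_used for e in history)
    if last.verifier_run and last.verifier_passed:
        return Decision.FINALIZE          # verified success: stop
    if spent >= token_budget:
        return Decision.FINALIZE          # budget exhausted: stop
    failures = sum(1 for e in history if e.verifier_run and not e.verifier_passed)
    if failures >= 2:
        return Decision.SWITCH            # repeated failures: change strategy
    if not last.verifier_run:
        return Decision.VERIFY            # unverified work: ask for a check
    return Decision.CONTINUE


def run(events, token_budget):
    """Feed one event per step to the controller and log every decision."""
    history, trace = [], []
    for event in events:
        history.append(event)
        decision = toy_controller(history, token_budget)
        trace.append((event.step, decision))
        if decision is Decision.FINALIZE:
            break
    return trace


events = [
    StepEvent(1, 10_000),
    StepEvent(2, 5_000, verifier_run=True, verifier_passed=False),
    StepEvent(3, 8_000, verifier_run=True, verifier_passed=True),
]
for step, decision in run(events, token_budget=100_000):
    print(step, decision.name)
```

The returned trace is the auditable part: every step maps to exactly one logged decision, and the loop ends as soon as the controller finalizes.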

Why metric dimension matters (intuition): a failure is only actionable if you can tell which node failed. With badly placed verifiers, a bad planner output and a bad tester output can produce the same observable signature, so you cannot route a fix. A resolving set guarantees that each node's vector of distances to the verifiers is unique, giving every distinct failure a distinct fingerprint. See verifier semantics.
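A small worked example makes the fingerprint idea concrete. The sketch below is a minimal illustration and does not use AgentProp's API: it assumes an undirected, unweighted toy graph (AgentProp's graphs are directed and weighted), and the function names `bfs_distances`, `signatures`, `is_resolving`, and `is_fault_tolerant` are made up for this example.

```python
from collections import deque


def bfs_distances(adj, source):
    """Hop distances from source to every reachable node (unweighted BFS)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in adj[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


def signatures(adj, verifiers):
    """Map each node to its tuple of distances to the verifier nodes."""
    per_verifier = [bfs_distances(adj, v) for v in verifiers]
    return {n: tuple(d.get(n, float("inf")) for d in per_verifier) for n in adj}


def is_resolving(adj, verifiers):
    """True when every node has a distinct distance signature."""
    sigs = signatures(adj, verifiers)
    return len(set(sigs.values())) == len(sigs)


def is_fault_tolerant(adj, verifiers):
    """True when the set stays resolving after losing any single verifier."""
    return all(
        is_resolving(adj, [v for v in verifiers if v != dropped])
        for dropped in verifiers
    )


# Toy workflow: the coder exchanges messages with planner, tester, and reviewer.
adj = {
    "planner": ["coder"],
    "coder": ["planner", "tester", "reviewer"],
    "tester": ["coder"],
    "reviewer": ["coder"],
}

print(signatures(adj, ["reviewer"]))            # planner and tester both get (2,)
print(is_resolving(adj, ["reviewer"]))          # False
print(is_resolving(adj, ["reviewer", "tester"]))  # True
print(is_fault_tolerant(adj, ["reviewer", "tester"]))             # False
print(is_fault_tolerant(adj, ["planner", "tester", "reviewer"]))  # True
```

With a single verifier at the reviewer, planner and tester sit at the same distance and are indistinguishable. Adding the tester as a second verifier separates them, and a third verifier keeps the signatures unique even if any one verifier is lost.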

Early Signal

On one Terminal-Bench 2.1 smoke task using Harbor's codex agent with gpt-5.5, the AgentProp A2 controller preserved success while reducing spend:

Task	Arm	Result	Tokens	Cost	Time
`regex-log`	A0 raw Codex	pass	123,731	$0.333551	203.8s
`regex-log`	A2 AgentProp control	pass	81,949	$0.196834	173.6s

That is 33.8% fewer tokens, 41.0% lower cost, and 14.8% less wall time on a pass-preserving comparison. This is a single-task early signal, not a benchmark claim; the point is that AgentProp can already act as a spend-aware controller around live coding-agent execution.
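The percentages follow directly from the two table rows. As a quick check, the snippet below recomputes them; `reduction_pct` is a helper defined only for this example.

```python
def reduction_pct(baseline, controlled):
    """Percentage reduction of `controlled` relative to `baseline`, to one decimal."""
    return round(100 * (baseline - controlled) / baseline, 1)


tokens = reduction_pct(123_731, 81_949)       # A0 tokens vs A2 tokens
cost = reduction_pct(0.333551, 0.196834)      # A0 cost vs A2 cost, in USD
wall_time = reduction_pct(203.8, 173.6)       # A0 time vs A2 time, in seconds

print(tokens, cost, wall_time)  # 33.8 41.0 14.8
```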

What Is Implemented

Directed weighted AgentGraph with JSON validation, NetworkX conversion, and Graphviz export.
Propagation models: Independent Cascade, Linear Threshold, Bootstrap Percolation, deterministic Zero Forcing, Randomized Zero Forcing, learned propagation, and Quality Cascade.
Graph algorithms for seed selection, pruning, bottlenecks, k-core, bridges, articulation points, centrality, verifier placement, and resolving coverage.
Metric-dimension verifier placement, including fault-tolerant resolving coverage for single-verifier failure.
RZF process-based centrality for seed selection and scaling studies.
Runtime controllers for graph-node execution, terminal-loop control, verifier forcing, local-pass distrust, retry/stop/switch decisions, and category-conditioned bandit policies.
ControlSession, a small public facade that starts with graph analysis, observes real execution events, returns control decisions, and saves traces.
Optional ML/DL/RL baselines: learned seed scorers, torch GNNs, Q-learning, REINFORCE, PPO, and artifact/checkpoint tooling.
Coding-agent integration helpers for Codex, Claude Code, FastMCP tools, and framework adapters.
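For readers new to the propagation models listed above, here is a minimal sketch of Independent Cascade in the sense of Kempe, Kleinberg, and Tardos: each newly activated node gets one chance to activate each inactive out-neighbor, succeeding with the edge's probability. It is written from the textbook definition, not taken from AgentProp's code; the function names, the edge format, and the toy probabilities are assumptions made for this example.

```python
import random


def independent_cascade(edges, seeds, rng):
    """One IC run. `edges` maps node -> list of (neighbor, activation probability)."""
    active = set(seeds)
    frontier = list(seeds)
    while frontier:
        next_frontier = []
        for node in frontier:
            # Each newly active node tries each inactive out-neighbor exactly once.
            for neighbor, prob in edges.get(node, []):
                if neighbor not in active and rng.random() < prob:
                    active.add(neighbor)
                    next_frontier.append(neighbor)
        frontier = next_frontier
    return active


def expected_spread(edges, seeds, trials=1000, seed=0):
    """Monte Carlo estimate of the mean number of activated nodes."""
    rng = random.Random(seed)
    total = sum(len(independent_cascade(edges, seeds, rng)) for _ in range(trials))
    return total / trials


# Toy workflow with invented edge probabilities.
edges = {
    "planner": [("coder", 1.0)],
    "coder": [("tester", 1.0)],
    "tester": [("reviewer", 0.5)],
}

print(expected_spread(edges, ["planner"]))   # close to 3.5
print(expected_spread(edges, ["reviewer"]))  # 1.0, since the reviewer has no out-edges
```

Seed selection under a budget then amounts to choosing the seed set that maximizes this estimate, which is the setting the greedy algorithm of Kempe et al. addresses.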

Install

python -m pip install agentprop

For development:

python -m pip install -e ".[dev]"
python -m pytest

Optional extras:

python -m pip install -e ".[dl]"  # torch-backed graph models
python -m pip install -e ".[rl]"  # Gymnasium-compatible RL experiments
python -m pip install -e ".[mcp]" # FastMCP server for editor-agent tools

Quick Start

Analyze a built-in workflow:

agentprop analyze planner_coder_tester_reviewer

Recommend context seed nodes under the RZF propagation model:

agentprop optimize planner_coder_tester_reviewer \
  --budget 2 \
  --algorithm greedy \
  --model rzf

Compare graph propagation policies:

PYTHONPATH=src:. python experiments/run_benchmark.py \
  --workflows chain planner_coder_tester_reviewer research_writer_verifier \
  --algorithms rzf-centrality greedy betweenness pagerank random \
  --models quality-cascade independent-cascade \
  --budget 2 --trials 50 --decay --decay-seed 0 \
  --out-dir results/my_run

Generate verifier-placement evidence:

PYTHONPATH=src:. python experiments/verifier_placement_evidence.py

Run the RZF scaling study:

PYTHONPATH=src:. python experiments/rzf_scaling_study.py

Both scripts are deterministic and print an expected-output block at the top of the source so you can confirm you reproduced the published numbers (metric dimension reaching a resolving set at lower budget k, and RZF leading on large graphs). The headline figures are summarized in reproducible results.

Run a key-free control-layer demo:

agentprop control-demo --demo terminal --out-dir reports/control-demo

The demo writes trace.jsonl, summary.json, and report.md. The trace starts with graph analysis, then records runtime events, features, decisions, and the final outcome.

Use the runtime control facade from Python:

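from agentprop.runtime import ControlSession, ExecutionEvent

# Start a session for a built-in workflow; the session begins with graph analysis.
session = ControlSession.start(
    "planner_coder_tester_reviewer",
    task_id="task-123",
    category="implementation",
    token_budget=120_000,
    baseline_tokens=180_000,
)
# Report one executed step and receive the controller's decision for it.
decision = session.observe(
    ExecutionEvent(
        step=1,
        command="pytest -q",
        verifier_run=True,
        verifier_passed=False,  # the verifier ran and failed on this step
        error_signature="AssertionError:test_edge_case",
        tokens_used=18_000,
    )
)
# Save the trace and related artifacts to the output directory.
session.write_artifacts("reports/task-123")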

Coding-Agent Integration

AgentProp can be used with Codex CLI, Claude Code, or any MCP-capable editor agent as a workflow-analysis layer. It does not need model API keys to generate briefs or run local graph analysis; Codex can keep using codex login, and Claude Code can use the included skill/MCP-style integration.

agentprop agent-instructions planner_coder_tester_reviewer \
  --target codex \
  --out reports/codex_agent_brief.md

agentprop agent-instructions planner_coder_tester_reviewer \
  --target claude-code \
  --out reports/claude_code_agent_brief.md

Use these briefs for everyday implementation/review tasks, or run agentprop-mcp when a coding agent should call AgentProp tools directly while designing or debugging a multi-agent workflow.

python -m pip install "agentprop[mcp]"
agentprop-mcp

The MCP server uses FastMCP when the extra is installed and exposes both analysis tools and live control-session tools. See the control layer quickstart and coding-agent integration guide.

The installable agent skill lives at skills/agentprop-workflow-optimizer:

npx skills add https://github.com/aryan5v/AgentProp --skill agentprop-workflow-optimizer

Research Position

AgentProp sits between graph theory, diffusion models, and agent evaluation. The core hypothesis is that agent workflows should be optimized as communication graphs under quality, cost, and observability constraints, rather than treated as opaque prompt loops.

Key inspirations:

Jesse Geneson et al., Randomized Zero Forcing: stochastic propagation on directed weighted graphs.
Jesse Geneson, Metric dimension and pattern avoidance in graphs: resolving sets and graph observability.
Jesse Geneson and Leslie Hogben, Propagation time for probabilistic zero forcing: expected propagation time as a graph parameter.
Kempe, Kleinberg, and Tardos, Maximizing the Spread of Influence through a Social Network: influence maximization under cascade models.
GPTSwarm, DyLAN, and AgentPrune: agent workflows as optimizable, sparse, task-adaptive communication graphs.

See the documentation index, research references, and the literature review for more detail.

Status

AgentProp is public alpha research software. The graph backbone, propagation models, runtime-control APIs, CLI, tests, and experiment scripts are usable, but the benchmark evidence is still early. Treat live-agent results as directional until larger repeated studies are published.

Development

ruff check .
mypy src
pytest

CI runs the same gates on pull requests. AgentProp is released under the Apache 2.0 license.

Release history

0.1.0a3 (pre-release, this version): Jun 5, 2026
0.1.0a1 (pre-release): Jun 1, 2026

Download files

Source Distribution

agentprop-0.1.0a3.tar.gz (1.0 MB)

Uploaded Jun 5, 2026 Source

Built Distribution

agentprop-0.1.0a3-py3-none-any.whl (159.3 kB)

Uploaded Jun 5, 2026 Python 3

File details

Details for the file agentprop-0.1.0a3.tar.gz.

File metadata

Download URL: agentprop-0.1.0a3.tar.gz
Upload date: Jun 5, 2026
Size: 1.0 MB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for agentprop-0.1.0a3.tar.gz
Algorithm	Hash digest
SHA256	`83da4368c79f6fac1033d5b2e033a365b6fac670a93ce16edcd996e597c6d123`
MD5	`b1775e2639079d1903759a0d2abf3540`
BLAKE2b-256	`60c9cdc8dd1e14ed17ff05c4d95fcfd24c5bb7a9969be6effdb314865eb8792c`

File details

Details for the file agentprop-0.1.0a3-py3-none-any.whl.

File metadata

Download URL: agentprop-0.1.0a3-py3-none-any.whl
Upload date: Jun 5, 2026
Size: 159.3 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for agentprop-0.1.0a3-py3-none-any.whl
Algorithm	Hash digest
SHA256	`0d8e2295a9f63d5b29fc204bda9cf4704617deb75814626b25d8f69f52d895d8`
MD5	`9e37be29935cd80a4d694f43c2b0ef96`
BLAKE2b-256	`ad805aa07f149f2857da42f00c0ddaed70b8403f2b38488aa72c49cdba6827bd`
