AgentScaffold

Structured AI-assisted development framework with plan lifecycle, review gates, and continuous improvement.

Stop paying for your AI agent to rediscover your codebase every session.

AgentScaffold is a governance framework and persistent knowledge graph for AI coding agents. It replaces the expensive pattern of agents re-reading files, re-grepping symbols, and re-tracing dependencies from scratch -- with a single tool call that returns exactly what the agent needs.

The Problem

Every time you start a new session with Cursor, Claude Code, Codex, or any AI coding agent, it starts from zero. It reads your files. It greps for imports. It traces call chains. It burns through your token budget and subscription quota just to understand what it already understood yesterday.

On a moderately complex codebase, a single "understand this module" task can cost 12 file reads + 2 grep searches before the agent even starts working. A full plan review pulls in 10+ files. Getting oriented in a new codebase means reading 38+ files.

This is the hidden cost of agentic development: not the coding, but the context building. AgentScaffold addresses this by separating one-time indexing from repeated reasoning work.

The Solution

AgentScaffold builds a knowledge graph of your codebase -- code structure, dependencies, governance artifacts, and session history -- and exposes it through MCP tools that your agent calls instead of reading raw files. Instead of rebuilding context from scratch in every session, the agent retrieves scoped context in one call and moves directly into analysis or implementation.

Measured results from our latest evaluation harness run (79 scenarios, 100% pass rate):

  • Understand a module and its dependents: 12 reads + 2 greps -> 1 tool call (97% fewer tokens, 93% fewer calls)
  • Codebase orientation: 38 file reads -> 2 tool calls (77% fewer tokens, 95% fewer calls)
  • Impact analysis (blast radius): 12 file reads -> 1 tool call (88% fewer tokens, 92% fewer calls)
  • Find all code matching a concept: 8 file reads -> 1 tool call (44% fewer tokens, 88% fewer calls)
  • Full plan review with evidence: 10 file reads -> 1 tool call (90% fewer calls, richer output)

Capability aggregate: 91% average call reduction. 58% average token reduction. 2.8x overall compression.

Capability vs behavioral reality

We report two views so the results are not overstated:

  • Capability efficiency (raw): what the tools can do when selected (58% token and 91% call reduction on average).
  • Behavior-adjusted efficiency: capability gains multiplied by a tool-routing adherence proxy.

In real usage, adjusted values are lower because agents do not always choose tools consistently; replay-based evaluation captures that behavior directly.

Current harness outputs:

  • Raw capability: 58.3% token reduction, 91.4% call reduction
  • Behavioral (replay-adjusted): 43.7% token reduction, 68.5% call reduction
  • Quality-adjusted behavioral: 39.4% token reduction, 61.7% call reduction

Behavioral and quality-adjusted values come from replay traces (observed tool-call sequences + quality parity checks), not just phrase-level intent matching.

Every tool call your agent doesn't make is money you don't spend on API tokens or subscription overages. And because the governance framework catches flawed assumptions and missing edge cases before implementation, you also spend less time fixing bugs that should never have been written.

What It Does

AgentScaffold combines two capabilities that are rarely integrated in a single tool:

1. Agent Governance Framework

A structured development workflow that teaches your AI agent to follow a plan lifecycle with quality gates:

  • Plan lifecycle: Draft -> Review -> Ready -> In Progress -> Complete
  • Adversarial reviews: Devil's advocate, expansion analysis, domain-specific reviews -- all run before a single line of code is written
  • Interface contracts: Formal declarations of module boundaries, versioned and tracked
  • Retrospectives: Post-execution learning that feeds back into the process
  • Session tracking: State files that persist context across chat sessions

Think of it as a virtual sprint team. Most AI agents work alone -- they take instructions and start coding. AgentScaffold puts your agent on a team. Before it writes a single line of code, the plan faces a devil's advocate who asks "what if this breaks?", an expansion reviewer who asks "what did you miss?", and a domain expert -- a quant architect, a UX designer, a security engineer -- who pressure-tests the approach through the lens of your specific domain. These adversarial reviews catch flawed assumptions, missing edge cases, and architectural blind spots before they become bugs in production.

After implementation, the sprint continues. A post-implementation review verifies what was built against what was planned. A retrospective captures what worked, what didn't, and what to do differently. Those findings flow into the learnings tracker, which feeds back into the agent's rules and templates -- so the next sprint starts sharper than the last. This is the same continuous improvement loop that makes experienced engineering teams get better over time, applied to your AI agent.

The result: tighter plans that survive expert scrutiny, more robust implementations with edge cases identified up front, and a codebase that accumulates institutional knowledge rather than losing it between sessions.

2. Persistent Knowledge Graph

A KuzuDB-backed graph that indexes your codebase once and serves it to agents instantly:

  • Code structure: Functions, classes, methods, interfaces, import chains, call graphs -- across Python, TypeScript, Go, Rust, Java, C, and C++
  • Governance artifacts: Plans, contracts, learnings, review findings linked to the code they reference
  • Community detection: Leiden algorithm clustering identifies tightly coupled modules
  • Semantic search: Hybrid search combining structural graph queries with vector embeddings
  • Incremental indexing: SHA-256 content hashing means only changed files are re-processed
  • Contract drift detection: Automatically surfaces methods declared in contracts but missing from code

The graph is exposed via MCP tools that any compatible agent can call, or through the CLI for direct use.
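The incremental indexing step can be pictured with a small sketch: hash each file's contents and re-process only files whose hash changed since the last run. The helper names here are hypothetical; the real indexer's internals may differ.

```python
import hashlib
from pathlib import Path

def content_hash(path: Path) -> str:
    # SHA-256 of the file's bytes identifies its current content.
    return hashlib.sha256(path.read_bytes()).hexdigest()

def files_to_reindex(paths, previous_hashes):
    """Return only the files whose content changed since the last index run.

    previous_hashes maps str(path) -> hash recorded at the previous run;
    it is updated in place so the next run sees the new state.
    """
    changed = []
    for path in paths:
        digest = content_hash(path)
        if previous_hashes.get(str(path)) != digest:
            changed.append(path)
            previous_hashes[str(path)] = digest
    return changed
```

An unchanged file hashes to the same digest and is skipped, so repeated indexing of a stable repo does no re-processing work.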

Quick Start

pip install agentscaffold
cd my-project
scaffold init
scaffold index          # Build the knowledge graph

The init command scaffolds your project with:

  • docs/ai/ -- templates, prompts, standards, state files
  • AGENTS.md -- rules your AI agent follows automatically
  • .cursor/rules.md -- Cursor-specific rules
  • scaffold.yaml -- your project's framework configuration
  • justfile + Makefile -- task runner shortcuts
  • .github/workflows/ -- CI with security scanning

The index command builds the knowledge graph at .scaffold/graph.db, enabling search, reviews, impact analysis, and session memory.

Async freshness (low-latency graph updates for MCP)

AgentScaffold supports an async freshness mode for MCP usage. Instead of blocking a tool call to re-index, the request path runs a cheap freshness check and returns immediately. If the graph looks stale, a background incremental refresh is scheduled (with debounce and single-flight locking) while the agent continues working.

Why this design matters:

  • Keeps MCP interactions in milliseconds/seconds instead of minutes on large repos
  • Avoids duplicate refresh jobs under parallel tool usage
  • Surfaces explicit freshness metadata (fresh, stale, unknown, refreshing) so agents can reason about confidence
  • Preserves strict governance by allowing gate transitions to defer when freshness is required and not yet restored
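The debounce-plus-single-flight behavior can be sketched as follows. This is a minimal illustration with invented names, not AgentScaffold's actual implementation:

```python
import threading
import time

class RefreshScheduler:
    """Run at most one background refresh at a time, with a debounce window."""

    def __init__(self, refresh_fn, debounce_seconds=120.0):
        self.refresh_fn = refresh_fn
        self.debounce_seconds = debounce_seconds
        self._lock = threading.Lock()
        self._in_flight = False
        self._last_scheduled = float("-inf")

    def maybe_schedule(self):
        """Called on the request path when the graph looks stale; never blocks."""
        with self._lock:
            now = time.monotonic()
            # Debounce: ignore triggers that arrive within the window.
            if now - self._last_scheduled < self.debounce_seconds:
                return False
            # Single-flight: at most one refresh job may be in flight.
            if self._in_flight:
                return False
            self._in_flight = True
            self._last_scheduled = now
        threading.Thread(target=self._run, daemon=True).start()
        return True

    def _run(self):
        try:
            self.refresh_fn()
        finally:
            with self._lock:
                self._in_flight = False
```

The request path only calls `maybe_schedule` and returns immediately; parallel tool calls that all notice staleness collapse into a single background refresh.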

Configure in scaffold.yaml:

freshness:
  async_enabled: true
  debounce_seconds: 120
  gate_strict: false
  background_queue_enabled: true

Install with language support

pip install agentscaffold[graph]              # Python, JS, TS
pip install agentscaffold[graph-all-languages] # + Go, Rust, Java, C, C++
pip install agentscaffold[all]                # Everything

How Agents Use It

MCP Tools (for AI agents)

When you run scaffold mcp, these tools become available to your agent.

Interaction Modes

AgentScaffold supports two complementary ways of working:

  • Natural-language + MCP (interactive): describe intent conversationally and let the agent route to the right governance/graph workflow.
  • Structural CLI commands (explicit/automation): use direct scaffold commands for deterministic setup, verification, CI, and fallback.

Teams typically get the best experience with NL+MCP for day-to-day flow, then use explicit CLI commands for verification (scaffold validate, scaffold graph verify, scaffold index --incremental).

If you used the governance framework before knowledge graph integration, see docs/migrating-governance-to-nl-mcp.md for a command-first -> hybrid -> NL-first transition path.

You don't need to memorize tool names. AgentScaffold teaches the agent how to interpret user intent in natural conversation, map that intent to the right MCP workflow, and only fall back to direct reads/search when tool output is insufficient. Say "let's review plan 42" and the agent routes to scaffold_prepare_review. Say "where did we leave off?" and it routes to scaffold_orient. Run scaffold agents cursor (or windsurf, claude) to generate platform-specific rules that wire this behavior into your IDE.
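The routing idea can be illustrated with a toy sketch. The tool names are AgentScaffold MCP tools named in this document; the keyword-matching rules are invented for illustration and are not the actual rule set the framework generates:

```python
# Toy intent router: maps conversational phrases to MCP tool names.
ROUTES = [
    (("review", "plan"), "scaffold_prepare_review"),
    (("where", "leave off"), "scaffold_orient"),
    (("blast radius",), "scaffold_impact"),
    (("compare", "plans"), "scaffold_compare_plans"),
]

def route_intent(message: str):
    """Return the first tool whose keywords all appear in the message, else None."""
    text = message.lower()
    for keywords, tool in ROUTES:
        if all(k in text for k in keywords):
            return tool
    return None
```

In practice the generated platform rules teach the agent richer intent mapping than keyword lookup, but the shape is the same: conversational intent in, tool selection out.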

Composite tools -- single calls that replace entire multi-step workflows:

  • scaffold_prepare_review -- replaces reading the plan, contracts, learnings, and source to prepare a full adversarial review
  • scaffold_prepare_implementation -- replaces tracing dependencies, checking contracts, and verifying readiness before coding
  • scaffold_orient -- replaces reading 38+ files to understand project state, blockers, and next steps
  • scaffold_decision_context -- replaces tracing the full decision chain (ADRs, spikes, studies) behind a plan
  • scaffold_staleness_check -- replaces manually comparing plan dates, file changes, and overlapping completed work
  • scaffold_compare_plans -- replaces reading two plans and their file impacts to identify conflicts
  • scaffold_prepare_retro -- replaces gathering verification results, study outcomes, and retro insights
  • scaffold_find_studies -- replaces searching study files by topic, tags, or outcome
  • scaffold_find_adrs -- replaces searching architecture decision records by topic or status

Use composite tools by default for common workflows; use granular tools when you need targeted control.

Granular tools -- building blocks for custom queries:

  • scaffold_context -- replaces reading 12+ files to understand a symbol, its callers, and its layer
  • scaffold_impact -- replaces manually tracing imports and grep-searching for consumers
  • scaffold_search -- replaces multiple grep passes to find code by concept
  • scaffold_review_context -- replaces reading plan files, contracts, and source to prepare a single review type
  • scaffold_stats -- replaces scanning the entire directory tree to understand codebase shape
  • scaffold_validate -- replaces running separate staleness checks and contract verification
  • scaffold_query -- write ad-hoc Cypher queries against the knowledge graph

CLI (for humans)

scaffold plan create my-feature        # Create a plan from template
scaffold plan lint --plan 001          # Validate plan structure
scaffold plan status                   # Dashboard of all plans
scaffold validate                      # Run all enforcement checks
scaffold retro check                   # Find missing retrospectives
scaffold agents generate               # Regenerate AGENTS.md
scaffold agents cursor                 # Regenerate .cursor/rules.md
scaffold import chat.json --format chatgpt  # Import conversation
scaffold ci setup                      # Generate CI workflows
scaffold metrics                       # Plan analytics
scaffold graph search "data routing"   # Hybrid search
scaffold graph verify                  # Graph accuracy check
scaffold review brief 42               # Pre-review brief for plan 42
scaffold review challenges 42          # Adversarial challenges with evidence
scaffold session start --plan 42       # Start a tracked coding session

Execution Profiles

Interactive (default): Human + AI agent in an IDE conversation. The agent follows AGENTS.md, asks questions when uncertain.

Semi-Autonomous (opt-in): Agent invoked from CLI/CI without a human present. Adds session tracking, safety boundaries, notification hooks, structured PR output, and cautious execution rules.

Both profiles coexist in the same AGENTS.md. The agent self-selects based on invocation context.
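Profile self-selection can be sketched by inspecting the invocation environment. The environment variables and heuristics below are illustrative assumptions, not AgentScaffold's actual detection logic:

```python
import os
import sys

def select_profile(env=None) -> str:
    """Pick an execution profile from invocation context.

    Illustrative heuristic: a CI environment or a non-interactive stdin
    suggests no human is present, so the cautious profile applies.
    """
    env = os.environ if env is None else env
    if env.get("CI") == "true":
        return "semi-autonomous"
    if not sys.stdin.isatty():
        return "semi-autonomous"
    return "interactive"
```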

Rigor Levels

  • Minimal: Lightweight gates for prototypes and small projects
  • Standard: Full plan lifecycle with reviews, contracts, and retrospectives
  • Strict: All gates enforced, all plans require approval

Domain Packs

The governance framework is domain-aware. Domain packs teach the adversarial reviewers to think like specialists in your field -- a trading pack adds a quant architect who challenges risk assumptions and position sizing logic, a webapp pack adds a UX reviewer who flags accessibility gaps and performance regressions. Each pack includes tailored review prompts, implementation standards, and approval gates specific to the domain:

  • trading -- Quantitative finance, RL, traceability
  • webapp -- UX/UI, accessibility, performance budgets
  • mlops -- Model lifecycle, experiment tracking, drift detection
  • data-engineering -- Pipeline quality, schema evolution, SLAs
  • api-services -- API design, backward compatibility, contract testing
  • infrastructure -- IaC, deployment safety, cost analysis
  • mobile -- Platform guidelines, offline-first, app store compliance
  • game-dev -- Game loops, ECS, frame budgets
  • embedded -- Memory constraints, real-time deadlines, OTA safety
  • research -- Reproducibility, statistical rigor, experiment protocol

This keeps governance strict where risk is high and lightweight where speed matters, without rewriting the core framework.

scaffold domains add trading
scaffold domains add webapp

Documentation

Full documentation is in docs/.

License

MIT
