Autonomous metric-driven agentic coding framework

Maintainers

zhanglpg

Project description

AutoForge

Autonomous metric-driven agentic coding framework. Generalizes the pattern:

measure → agent acts → re-measure → iterate until target met

AutoForge wraps any code-quality metric into an iterative improvement loop driven by an AI coding agent (e.g., Claude Code). Define a target metric, set budget limits, and let the framework orchestrate measurement, agent action, regression testing, and git management automatically.
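The loop pattern above can be sketched in a few lines of Python. This is only an illustration of the control flow, not AutoForge's runner: `improve_until_target`, `measure`, and `act` are hypothetical names, and the "agent" is a plain callable standing in for the external coding agent.

```python
from typing import Callable


def improve_until_target(
    measure: Callable[[], float],
    act: Callable[[float], None],
    target: float,
    max_iterations: int,
    minimize: bool = True,
) -> tuple[float, int]:
    """Run measure -> act -> re-measure until the target or the budget is hit.

    Returns the final metric value and the number of iterations used.
    """

    def target_met(value: float) -> bool:
        return value <= target if minimize else value >= target

    value = measure()  # baseline measurement
    iterations = 0
    while not target_met(value) and iterations < max_iterations:
        act(value)         # stand-in for the coding agent
        value = measure()  # re-measure after the change
        iterations += 1
    return value, iterations


# Toy usage: a fake "complexity" score that the fake agent lowers by 1.0 per call.
state = {"score": 6.0}
final, used = improve_until_target(
    measure=lambda: state["score"],
    act=lambda current: state.update(score=current - 1.0),
    target=3.0,
    max_iterations=10,
)
print(final, used)  # 3.0 3
```

The `minimize` flag mirrors the minimize/maximize direction mentioned later in this README; validation, git handling, and stall detection are deliberately left out of this sketch.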

Features

Metric-driven iteration loop — measure, act, validate, repeat until the target is reached or budget is exhausted.
Pluggable metric adapters — bring your own measurement tool. Built-in adapters for code complexity (NCS via complexity-accounting) and test quality (coverage + assertion analysis).
Budget management — hard limits on iterations, tokens, and wall-clock time, plus automatic stall detection when improvements plateau.
Regression guard — runs your test suite between iterations and enforces constraint metrics so improvements never break existing behavior.
Git integration — automatic branch creation, per-iteration commits, and rollback on failed iterations.
YAML workflow configs — declarative workflow definitions that specify metrics, budgets, constraints, agent prompts, and language-specific tooling.
Reporting — JSON and Markdown run reports with health dashboards.
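To make the budget and stall-detection idea concrete, here is a minimal sketch. It is not AutoForge's BudgetManager: the class name `SimpleBudget`, the `stall_window` and `min_improvement` parameters, and the order of the checks are all assumptions chosen for illustration.

```python
import time
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SimpleBudget:
    """Hypothetical budget tracker for illustration; not AutoForge's BudgetManager."""

    max_iterations: int
    max_tokens: int
    max_seconds: float
    stall_window: int = 3          # assumed: iterations to look back over
    min_improvement: float = 0.01  # assumed: smallest change that counts as progress
    tokens_used: int = 0
    history: list[float] = field(default_factory=list)
    started_at: float = field(default_factory=time.monotonic)

    def record(self, metric_value: float, tokens: int) -> None:
        """Log the metric value and token cost of one finished iteration."""
        self.history.append(metric_value)
        self.tokens_used += tokens

    def stalled(self) -> bool:
        """True when the metric barely moved over the last stall_window iterations."""
        if len(self.history) < self.stall_window + 1:
            return False
        recent = self.history[-(self.stall_window + 1):]
        return abs(recent[0] - recent[-1]) < self.min_improvement

    def stop_reason(self) -> Optional[str]:
        """Return why the loop should stop, or None if it may continue."""
        if len(self.history) >= self.max_iterations:
            return "iterations"
        if self.tokens_used >= self.max_tokens:
            return "tokens"
        if time.monotonic() - self.started_at >= self.max_seconds:
            return "time"
        if self.stalled():
            return "stall"
        return None


budget = SimpleBudget(max_iterations=10, max_tokens=100_000, max_seconds=600)
for value in [5.0, 4.2, 4.2, 4.2, 4.2]:
    budget.record(value, tokens=1_000)
print(budget.stop_reason())  # stall
```

In this sketch the hard limits are checked before the stall test, so a run that is both over budget and stalled reports the budget as the reason.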

Prerequisites

AutoForge requires a local AI coding agent to perform the actual code modifications. The agent is configurable — it defaults to Claude Code but can be changed per workflow or at the command line.

The configured agent binary must be installed and available on your PATH. AutoForge checks this at startup and fails fast with a clear error if the agent cannot be found.
By default, AutoForge invokes claude --print --output-format json -p "<prompt>" as a subprocess each iteration.
To use a different agent, set agent.command in your workflow YAML or use the --agent-command CLI flag (see Agent Integration).

AutoForge itself does not call the Claude API directly — it orchestrates the iteration loop (measure, budget, git, regression) and delegates code changes to the local agent process.

Installation

pip install open-autoforge

For development:

git clone https://github.com/zhanglpg/autoforge.git
cd autoforge
pip install -e ".[dev]"

Requires Python 3.10+.

Quick Start

Run a workflow

# Reduce code complexity in ./src to a target NCS of 3.0
autoforge run complexity_refactor --path ./src --target 3.0

# Improve test quality to 80% score
autoforge run test_quality --path ./src --target 80.0

Check project health

# Run all metric adapters and show a health dashboard
autoforge health --path ./src

# Output as JSON
autoforge health --path ./src --format json

List available workflows and adapters

autoforge list

CLI Reference

`autoforge run <workflow>`

Execute a metric-driven improvement workflow.

Flag	Description
`--path, -p`	Target path to improve (default: repo root)
`--repo, -r`	Repository root (default: `.`)
`--target, -t`	Target metric value to achieve
`--adapter, -a`	Metric adapter override
`--config, -c`	Path to a custom workflow YAML
`--max-iterations`	Override max iteration count
`--max-tokens`	Override max token budget
`--max-time`	Override max wall-clock time (minutes)
`--test-command`	Custom test command for regression guard
`--skip-tests`	Skip test validation between iterations
`--skip-git`	Skip git branch/commit management
`--dry-run`	Measure only, don't run the agent
`--agent-command`	Custom agent command (overrides workflow `agent.command`; used as-is)
`--output, -o`	Output directory for reports

`autoforge health`

Run all (or specified) metric adapters and produce a health dashboard.

Flag	Description
`--path, -p`	Target path to analyze
`--repo, -r`	Repository root (default: `.`)
`--adapters`	Comma-separated adapter names
`--format, -f`	Output format: `text` or `json`
`--output, -o`	Save output to file

`autoforge list`

List all registered workflows and adapters.

Architecture

┌─────────────────────────────────────────────────┐
│                  WorkflowRunner                  │
│         (measure → act → validate loop)          │
├────────────┬────────────┬───────────┬────────────┤
│ BudgetMgr  │  GitMgr    │ Regression│  Reporting │
│ (limits,   │  (branch,  │ Guard     │  (JSON,    │
│  stall     │   commit,  │ (tests,   │   markdown,│
│  detect)   │   rollback)│  checks)  │   health)  │
├────────────┴────────────┴───────────┴────────────┤
│              MetricAdapter (pluggable)            │
│     complexity · test_quality · your own ...      │
└──────────────────────────────────────────────────┘

WorkflowRunner — orchestrates the iteration loop: measure baseline, invoke agent, re-measure, validate, commit or rollback.
Agent (Claude Code) — the external coding agent invoked as a subprocess each iteration. The runner constructs a prompt with the current metric value, target, and priority files, then calls claude --print --output-format json. Use --agent-command to substitute a different agent.
BudgetManager — enforces iteration/token/time limits and detects improvement stalls. Token usage is parsed best-effort from the agent's JSON output or stderr.
GitManager — creates feature branches, commits per iteration, rolls back failed iterations.
RegressionGuard — runs tests and checks constraint metrics between iterations.
MetricAdapter — protocol for plugging in any measurement tool. Adapters normalize tool output into a standard MetricResult and identify priority files for the agent to focus on.

Agent Integration

AutoForge follows a subprocess-based agent model: the framework handles everything except the actual code changes, which are delegated to a local coding agent.

Default: Claude Code

Each iteration, the WorkflowRunner:

1. Calls adapter.identify_targets() to find priority files needing improvement.
2. Builds a structured prompt containing: iteration number, current metric value, target, direction (minimize/maximize), priority file list, and any system_prompt_addendum from the workflow YAML.
3. Writes the prompt to a temp file and invokes:

   claude --print --output-format json -p "$(cat <prompt_file>)"

4. Parses token usage from the agent's JSON output (or stderr) for budget tracking.

The agent runs inside the repository working directory and directly modifies files on disk. After the agent returns, AutoForge re-measures, validates (tests + constraints), and commits or rolls back.
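As an illustration of the prompt-building and invocation steps, the sketch below assembles a prompt from the fields listed above and builds the argument list for the default command. The function names `build_prompt` and `build_argv` and the exact prompt layout are assumptions; only the command-line flags are taken from this README. The command is constructed but not executed.

```python
def build_prompt(
    iteration: int,
    current: float,
    target: float,
    direction: str,
    priority_files: list[str],
    addendum: str = "",
) -> str:
    """Assemble a plain-text prompt from the fields the runner is said to include."""
    lines = [
        f"Iteration: {iteration}",
        f"Current metric value: {current}",
        f"Target: {target}",
        f"Direction: {direction}",
        "Priority files:",
        *[f"- {path}" for path in priority_files],
    ]
    if addendum:
        lines += ["", addendum.strip()]
    return "\n".join(lines)


def build_argv(prompt: str, agent_command: str = "claude") -> list[str]:
    """Build the argument list for the default invocation quoted in this README."""
    # Passing a list to subprocess avoids shell quoting issues with the prompt text.
    return [agent_command, "--print", "--output-format", "json", "-p", prompt]


prompt = build_prompt(
    iteration=2,
    current=4.5,
    target=3.0,
    direction="minimize",
    priority_files=["src/a.py", "src/b.py"],
    addendum="Prioritize extracting helper functions.",
)
print(build_argv(prompt)[:5])
```

The resulting list could be handed to `subprocess.run(..., cwd=repo_root)` so that the agent runs inside the repository working directory.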

Custom Agents

Use --agent-command to substitute any command that accepts a prompt on stdin or as an argument:

# Use a custom script
autoforge run complexity_refactor --path ./src --target 3.0 \
  --agent-command "python my_agent.py"

# Use a different CLI tool
autoforge run complexity_refactor --path ./src --target 3.0 \
  --agent-command "aider --message"
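For the custom-script case, a file such as my_agent.py could start from something like the sketch below. It is hypothetical: how AutoForge actually passes the prompt to a custom command is not specified beyond "on stdin or as an argument", so the sketch accepts both, and the function names are illustrative.

```python
import io
import sys


def read_prompt(argv: list[str], stdin) -> str:
    """Prefer a prompt passed as arguments; fall back to reading stdin."""
    if len(argv) > 1:
        return " ".join(argv[1:])
    return stdin.read()


def main(argv: list[str], stdin) -> int:
    """Return a process exit code: 0 on success, 1 when no prompt arrived."""
    prompt = read_prompt(argv, stdin)
    if not prompt.strip():
        print("no prompt received", file=sys.stderr)
        return 1
    # A real agent would edit files in the current working directory here.
    print(f"received prompt of {len(prompt)} characters")
    return 0


# Demonstration with in-memory input; a real script would instead end with
# raise SystemExit(main(sys.argv, sys.stdin)).
exit_code = main(["my_agent.py", "Reduce nesting in utils.py"], io.StringIO(""))
print(exit_code)
```

Returning a non-zero exit code when something goes wrong gives the calling process a simple way to detect a failed iteration.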

Workflow YAML Agent Config

Each workflow YAML can include an agent section to configure the agent binary and provide domain-specific instructions:

agent:
  command: "claude"  # Agent binary (default: claude). Change to use a different agent.
  skill: "refactor-complexity"
  system_prompt_addendum: |
    You are performing complexity-driven iterative refactoring.
    Prioritize extracting helper functions and reducing nesting depth.

Fail-Fast Validation

AutoForge verifies the agent binary exists on PATH before starting any iterations. If the agent is not found, the run fails immediately with a clear error message suggesting either installing the agent or using --agent-command.
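A fail-fast check of this kind can be written with the standard library's `shutil.which`. The sketch below is an assumption about how such a check might look, not AutoForge's code; `require_agent` and `AgentNotFoundError` are hypothetical names.

```python
import shlex
import shutil


class AgentNotFoundError(RuntimeError):
    """Hypothetical error type used only in this sketch."""


def require_agent(agent_command: str = "claude") -> str:
    """Return the resolved path of the agent binary, or raise if it is missing."""
    parts = shlex.split(agent_command)  # "aider --message" -> ["aider", "--message"]
    if not parts:
        raise ValueError("agent command is empty")
    binary = parts[0]
    resolved = shutil.which(binary)
    if resolved is None:
        raise AgentNotFoundError(
            f"Agent binary '{binary}' not found on PATH. "
            "Install it or pass a different --agent-command."
        )
    return resolved


try:
    require_agent("definitely-not-a-real-agent-binary-xyz --message")
except AgentNotFoundError as exc:
    print(exc)
```

Only the first token of the command is looked up, so extra arguments in a custom agent command do not affect the check.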

Built-in Workflows

`complexity_refactor`

Reduces code complexity using complexity-accounting to measure Net Complexity Score (NCS). The agent iteratively refactors mega-functions, dispatch chains, deep nesting, and duplicated logic.

`test_quality`

Improves test suite quality by combining coverage measurement, function gap analysis, and assertion quality scoring. The agent generates missing tests and strengthens existing ones.

Adding a New Adapter

1. Subclass BaseMetricAdapter in src/autoforge/adapters/
2. Implement check_prerequisites(), measure(), identify_targets()
3. Register in src/autoforge/registry.py
4. Create a workflow YAML in src/autoforge/workflows/

See src/autoforge/adapters/complexity.py for a reference implementation.
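The shape of an adapter might look roughly like the self-contained sketch below. The `MetricResult` and `BaseMetricAdapter` classes here are stand-ins defined only so the example runs on its own; their fields and method signatures are assumptions, and the real definitions in the package may differ. `TodoCountAdapter` is a toy metric invented for illustration.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class MetricResult:  # stand-in; the real model lives in models.py
    name: str
    value: float
    per_file: dict[str, float] = field(default_factory=dict)


class BaseMetricAdapter(ABC):  # stand-in exposing the three methods named above
    @abstractmethod
    def check_prerequisites(self) -> bool: ...

    @abstractmethod
    def measure(self, path: Path) -> MetricResult: ...

    @abstractmethod
    def identify_targets(self, result: MetricResult, limit: int = 5) -> list[str]: ...


class TodoCountAdapter(BaseMetricAdapter):
    """Toy metric: number of TODO markers in Python files (lower is better)."""

    def check_prerequisites(self) -> bool:
        return True  # no external tool needed for this toy metric

    def measure(self, path: Path) -> MetricResult:
        counts = {
            str(f): float(f.read_text(encoding="utf-8", errors="ignore").count("TODO"))
            for f in sorted(path.rglob("*.py"))
        }
        return MetricResult("todo_count", sum(counts.values()), counts)

    def identify_targets(self, result: MetricResult, limit: int = 5) -> list[str]:
        # Worst files first, skipping files that have nothing to fix.
        ranked = sorted(result.per_file.items(), key=lambda kv: kv[1], reverse=True)
        return [name for name, count in ranked[:limit] if count > 0]
```

Calling `TodoCountAdapter().measure(Path("src"))` and passing the result to `identify_targets()` would yield the files with the most markers first, which matches the role of priority files described in the Architecture section.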

Project Structure

src/autoforge/
├── __init__.py         # Package version
├── __main__.py         # CLI entry point (run, health, list commands)
├── models.py           # Core data models (MetricResult, WorkflowConfig, RunReport)
├── runner.py           # Workflow runner (measure-act-validate loop)
├── budget.py           # Budget manager (iteration/token/time limits, stall detection)
├── git_manager.py      # Git operations (branch, commit, rollback per iteration)
├── regression.py       # Regression guard (test runner, constraint checking)
├── reporting.py        # Report generation (JSON, markdown, health dashboard)
├── registry.py         # Workflow & adapter registry
├── adapters/
│   ├── base.py         # BaseMetricAdapter ABC
│   ├── complexity.py   # Complexity adapter (NCS)
│   └── test_quality.py # Test quality adapter (TQS)
└── workflows/
    ├── complexity_refactor.yaml
    └── test_quality.yaml

Development

pip install -e ".[dev]"
pytest

License

MIT

Project details

Release history

This version

0.1.1

Mar 26, 2026

0.1.0

Mar 26, 2026

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

open_autoforge-0.1.1.tar.gz (59.3 kB)

Uploaded Mar 26, 2026 Source

Built Distribution

open_autoforge-0.1.1-py3-none-any.whl (36.3 kB)

Uploaded Mar 26, 2026 Python 3

File details

Details for the file open_autoforge-0.1.1.tar.gz.

File metadata

Download URL: open_autoforge-0.1.1.tar.gz
Upload date: Mar 26, 2026
Size: 59.3 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for open_autoforge-0.1.1.tar.gz
Algorithm	Hash digest
SHA256	`e842d33f1da9165f3dc6ee603ae742ff674f16d8b4ac41743dc6f0e6931caa12`
MD5	`3fe9dfe3fa54c906f6ad719a5116c3b9`
BLAKE2b-256	`7e363cdcaeaa141da0f266ba66ca8a2c2d5b1890fedac19bc9470d39a3cb3f1a`


Provenance

The following attestation bundles were made for open_autoforge-0.1.1.tar.gz:

Publisher: publish.yml on zhanglpg/autoforge

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: open_autoforge-0.1.1.tar.gz
- Subject digest: e842d33f1da9165f3dc6ee603ae742ff674f16d8b4ac41743dc6f0e6931caa12
- Sigstore transparency entry: 1185824174
- Sigstore integration time: Mar 26, 2026
Source repository:
- Permalink: zhanglpg/autoforge@e77dc3150c4e96ff7bfa642fc78b02717cbf527c
- Branch / Tag: refs/tags/v0.1.1
- Owner: https://github.com/zhanglpg
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@e77dc3150c4e96ff7bfa642fc78b02717cbf527c
- Trigger Event: release

File details

Details for the file open_autoforge-0.1.1-py3-none-any.whl.

File metadata

Download URL: open_autoforge-0.1.1-py3-none-any.whl
Upload date: Mar 26, 2026
Size: 36.3 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for open_autoforge-0.1.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`3a623e53dc35753da3d63e585dc1d0a9bae0ad539e916fc7451ec5692a00f1d6`
MD5	`7c902db342ad26e8d1aa835e191f0032`
BLAKE2b-256	`ca198df11aaa480a2c06b8ea8c8f5efc92e25d5416c73a9c89e14575fd270077`


Provenance

The following attestation bundles were made for open_autoforge-0.1.1-py3-none-any.whl:

Publisher: publish.yml on zhanglpg/autoforge

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: open_autoforge-0.1.1-py3-none-any.whl
- Subject digest: 3a623e53dc35753da3d63e585dc1d0a9bae0ad539e916fc7451ec5692a00f1d6
- Sigstore transparency entry: 1185824176
- Sigstore integration time: Mar 26, 2026
Source repository:
- Permalink: zhanglpg/autoforge@e77dc3150c4e96ff7bfa642fc78b02717cbf527c
- Branch / Tag: refs/tags/v0.1.1
- Owner: https://github.com/zhanglpg
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@e77dc3150c4e96ff7bfa642fc78b02717cbf527c
- Trigger Event: release

open-autoforge 0.1.1
