Cursor-native multi-agent development framework with 5-role review
harness-flow
Agent writes code. Harness Flow ships products.
L5 autonomous delivery for the vibe coding era — you're the copilot now.
The Problem
AI agents can write code — but they can't ship products. They lack navigation (goal management), traffic rules (quality gates), and a dashcam (audit trail). The bottleneck has shifted from "can AI write code?" to "can AI autonomously deliver?"
Where Harness Flow Fits
The Three Pillars of L5
| Navigation | Traffic Rules | Dashcam |
|---|---|---|
| vision → plan → roadmap | 5-role review + gates + trust | audit trail + learnings + retro |
| AI knows where to go | AI obeys the rules | every decision recorded |
How It Works
flowchart LR
Req["Requirement"] --> Plan["Plan"]
Plan --> PlanReview["5-role\nplan review"]
PlanReview --> Build["Build + CI"]
Build --> CodeReview["5-role\ncode review"]
CodeReview --> Ship["Ship → PR"]
PlanReview -.-|"Architect · PO · Engineer · QA · PM"| CodeReview
style Req fill:#fff,stroke:#222,stroke-width:2px,color:#000
style Plan fill:#fff,stroke:#222,stroke-width:2px,color:#000
style PlanReview fill:#222,stroke:#222,stroke-width:2px,color:#fff
style Build fill:#fff,stroke:#222,stroke-width:2px,color:#000
style CodeReview fill:#222,stroke:#222,stroke-width:2px,color:#fff
style Ship fill:#fff,stroke:#222,stroke-width:2px,color:#000
One requirement in → one PR out. Both plan and code are reviewed by 5 parallel AI reviewers. Findings from 2+ roles on the same issue are flagged [HIGH CONFIDENCE].
Fix-First classifies every review finding:
- AUTO-FIX — high certainty + small blast radius + reversible → fixed immediately
- ASK — security, behavior change, architecture → batched for your decision
Quick Start
0. 10-minute happy path
Step 1 — Install:
pip install harness-flow
Step 2 — Initialize in your project:
cd <YOUR_PROJECT_PATH>
harness init
Step 3 — Open Cursor, type a requirement:
/harness-plan add input validation to the user registration endpoint
That's it — plan, build, 5-role review, and PR. One command.
What you'll see: the agent generates a spec + contract, 5 reviewers challenge the plan in parallel, then the agent implements, runs CI, gets code reviewed by the same 5 roles, and opens a PR — all autonomously.
Deep Dive
Your AI Engineering Team — 5 parallel reviewers
Harness gives you a complete engineering team inside Cursor — each role reviews both your plan and your code:
| Role | Plan Review | Code Review |
|---|---|---|
| Architect | Feasibility, module impact, dependencies | Conformance, layering, coupling, security |
| Product Owner | Vision alignment, user value, acceptance criteria | Requirement coverage, behavioral correctness |
| Engineer | Implementation feasibility, code reuse, tech debt | Code quality, DRY, patterns, performance |
| QA | Test strategy, boundary values, regression risk | Test coverage, edge cases, CI health |
| Project Manager | Task decomposition, parallelism, scope | Scope drift, plan completion, delivery risk |
Not a simulation — these roles run as parallel AI subagents with distinct system prompts, each scoring independently. Findings from 2+ roles are flagged as high confidence.
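As a rough illustration of the cross-role flagging, the sketch below groups findings by an issue key and marks any issue reported by two or more distinct roles. The `Finding` type, its `issue_key` field, and `flag_high_confidence` are hypothetical names; how Harness Flow actually matches findings across reviewers is not described here.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    role: str        # e.g. "architect", "qa"
    issue_key: str   # hypothetical normalized identifier for the issue
    message: str


def flag_high_confidence(findings, min_roles=2):
    """Map each issue key to (sorted roles, high_confidence flag)."""
    roles_by_issue = defaultdict(set)
    for finding in findings:
        roles_by_issue[finding.issue_key].add(finding.role)
    return {
        key: (sorted(roles), len(roles) >= min_roles)
        for key, roles in roles_by_issue.items()
    }


findings = [
    Finding("architect", "auth.py:missing-null-check", "user may be None"),
    Finding("qa", "auth.py:missing-null-check", "no test for a None user"),
    Finding("engineer", "utils.py:unused-import", "os is unused"),
]
flags = flag_high_confidence(findings)
```

Counting distinct roles rather than raw findings keeps one talkative reviewer from promoting its own issue to high confidence.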
Each role can use a different model via [native.role_models] in config. If some reviewers fail, the pipeline continues with available perspectives (graceful degradation).
Contract-Driven Development
Every task starts with a spec + contract — deliverables, acceptance criteria, and risk analysis — reviewed by 5 roles before any code is written.
The contract lives in .harness-flow/tasks/task-NNN/plan.md and serves as the single source of truth. Runtime state is tracked in workflow-state.json alongside it.
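For scripts that need to read a task's contract and state, a small helper along these lines may be enough. It assumes `NNN` is a zero-padded three-digit number and makes no assumption about the keys inside `workflow-state.json`; `task_dir` and `load_task` are hypothetical names, not part of the `harness` CLI.

```python
import json
from pathlib import Path


def task_dir(project_root, task_number):
    """Return the task directory, assuming a zero-padded three-digit id."""
    return Path(project_root) / ".harness-flow" / "tasks" / f"task-{task_number:03d}"


def load_task(project_root, task_number):
    """Read the contract text and the raw workflow state for one task."""
    base = task_dir(project_root, task_number)
    plan_text = (base / "plan.md").read_text(encoding="utf-8")
    state = json.loads((base / "workflow-state.json").read_text(encoding="utf-8"))
    return plan_text, state
```

The state is returned as a plain dict so callers can inspect whatever fields their installed version writes.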
Fix-First Auto-Remediation
Every review finding is classified before presenting it to you:
- AUTO-FIX (high certainty + small blast radius + reversible) → fixed immediately, tests re-run
- ASK (security, behavior change, architecture, low confidence) → batched and presented for your decision
Typical auto-fixes: unused imports, stale comments, missing null checks, naming inconsistencies, obvious N+1 queries.
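The classification rule above can be sketched as a small function. The `Finding` fields, the numeric thresholds, and the use of touched-file count as a proxy for blast radius are all assumptions made for illustration; the real classifier's inputs are not documented here.

```python
from dataclasses import dataclass

# Categories the text says always go to the user for a decision.
ASK_CATEGORIES = {"security", "behavior-change", "architecture"}


@dataclass
class Finding:
    category: str       # e.g. "unused-import", "security"
    certainty: float    # hypothetical 0.0-1.0 reviewer confidence
    files_touched: int  # hypothetical proxy for blast radius
    reversible: bool


def classify(finding, min_certainty=0.9, max_files=1):
    """Return "AUTO-FIX" or "ASK" for one finding."""
    if finding.category in ASK_CATEGORIES:
        return "ASK"
    auto = (
        finding.certainty >= min_certainty
        and finding.files_touched <= max_files
        and finding.reversible
    )
    return "AUTO-FIX" if auto else "ASK"


unused_import = classify(Finding("unused-import", 0.98, 1, True))
security_issue = classify(Finding("security", 0.99, 1, True))
vague_naming = classify(Finding("naming", 0.5, 1, True))
```

Checking the ASK categories first means a security finding is never auto-fixed, however certain the reviewer is.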
Full Audit Trail
Plans, reviews, build logs, gate results — all persisted per task. Every decision is traceable.
.harness-flow/
├── config.toml # project settings (CI command, trunk branch, language)
├── vision.md # product direction (optional)
└── tasks/task-NNN/
    ├── plan.md # spec + contract (scope SSOT)
    ├── handoff-*.json # structured context per phase (plan, build, eval, ship)
    ├── build-rN.md # build log per round
    ├── plan-eval-rN.md # plan review per round
    ├── code-eval-rN.md # code review per round
    ├── ship-metrics.json # delivery metrics (scores, test count, coverage)
    ├── workflow-state.json # canonical task phase / gate / blocker tracking
    └── ... # feedback ledger, intervention audit, etc. (optional)
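Because every round writes its own file, the history of a task can be reconstructed from file names alone. The sketch below scans one task directory for the per-round files listed above; it assumes `rN` means the letter `r` followed by a decimal round number (for example `build-r1.md`), and `rounds_by_kind` is a hypothetical helper.

```python
import re
from pathlib import Path

# File-name patterns taken from the layout above; rN is the round number.
ROUND_PATTERNS = {
    "build": re.compile(r"^build-r(\d+)\.md$"),
    "plan-eval": re.compile(r"^plan-eval-r(\d+)\.md$"),
    "code-eval": re.compile(r"^code-eval-r(\d+)\.md$"),
}


def rounds_by_kind(task_path):
    """Return {kind: sorted round numbers} for one task directory."""
    found = {kind: [] for kind in ROUND_PATTERNS}
    for entry in Path(task_path).iterdir():
        for kind, pattern in ROUND_PATTERNS.items():
            match = pattern.match(entry.name)
            if match:
                found[kind].append(int(match.group(1)))
    return {kind: sorted(nums) for kind, nums in found.items()}
```

A task with two build rounds and one code review would yield `[1, 2]` under `"build"` and `[1]` under `"code-eval"`.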
Installation & Upgrade
| Command | What it does |
|---|---|
| `pip install harness-flow` | Install the CLI |
| `harness init` | Interactive wizard → generates skills, agents, rules into .cursor/ |
| `harness init --force` | Regenerate all artifacts (after config changes or version upgrade) |
| `harness update` | Self-update the package + run config migration |
| `harness update --check` | Check for new version without installing |
All Skills — default: /harness-plan
/harness-plan is the default for most tasks — single-round plan → ship path.
/harness-vision covers everything from vague ideas to clear directions — it auto-detects whether to explore or clarify.
Entry points
| Skill | When to use | What it does |
|---|---|---|
| `/harness-plan` | "I have a requirement" | Refine plan + 5-role review → auto build/eval/ship/retro |
| `/harness-vision` | "I have an idea" or "a direction" | Explore or clarify → structured vision → roadmap/backlog → iterative build/eval/ship loop |
Utility & pipeline skills
| Skill | What it does |
|---|---|
| `/harness-investigate` | Systematic bug investigation: reproduce → hypothesize → verify → minimal fix |
| `/harness-learn` | Memverse knowledge management: store, retrieve, update project learnings |
| `/harness-retro` | Engineering retrospective: commit analytics, hotspot detection, trend tracking |
| `/harness-build` | Implement the contract, run CI, triage failures, write a structured build log |
| `/harness-eval` | 5-role code review (architect + product-owner + engineer + qa + project-manager) |
| `/harness-ship` | Full pipeline: test → review → fix → commit → push → PR |
| `/harness-doc-release` | Documentation sync: detect stale docs after code changes |
Progress & next-step hints
- **harness workflow next**: one machine-readable line for agents/scripts (task id, phase, suggested skill).
- **harness status**: Rich panel for humans ("what to do next" in task language).
- **HARNESS_PROGRESS**: one-line boundary marker emitted by Cursor skills.
Configuration
Project settings live in .harness-flow/config.toml:
| Key | Default | Description |
|---|---|---|
| `workflow.max_iterations` | 3 | Max review iterations per task |
| `workflow.pass_threshold` | 7.0 | Evaluator pass threshold (1-10) |
| `workflow.auto_merge` | true | Auto-merge branch after pass |
| `native.evaluator_model` | "inherit" | Default model for review roles; falls back to IDE default |
| `native.review_gate` | "eng" | Review gate strictness (eng = hard gate, advisory = log only) |
| `native.plan_review_gate` | "auto" | Plan review gate (human / ai / auto) |
| `native.role_models.*` | `{}` | Per-role model overrides; falls back to IDE default |
| `workflow.branch_prefix` | "agent" | Task branch prefix |
CLI reference
| Command | Description |
|---|---|
| `harness init [--name] [--ci] [-y] [--force]` | Initialize project (interactive wizard) |
| `harness status` | Show current task progress |
| `harness gate [--task]` | Check ship-readiness gates |
| `harness update [--check] [--force]` | Self-update + config migration |
| `harness git-preflight [--json]` | Preflight checks (clean tree, branch) |
| `harness save-eval --task <id> [--kind] [--verdict] ...` | Save evaluation results |
| `harness save-build-log --task <id> [--body]` | Save build log |
| `harness git-prepare-branch --task-key <key>` | Create or resume task branch |
| `harness git-sync-trunk [--json]` | Sync feature branch with trunk |
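When driving the CLI from a script, it can help to build the argument lists in one place. The sketch below uses only the commands and flags listed in the table above; it assumes `--task` takes a task identifier as its value, and it returns the `--json` payload untouched because the output schema is not documented here. The function names are hypothetical.

```python
import json
import subprocess


def preflight_command(json_output=True):
    """Build the argv for `harness git-preflight`."""
    cmd = ["harness", "git-preflight"]
    if json_output:
        cmd.append("--json")
    return cmd


def gate_command(task=None):
    """Build the argv for `harness gate`, optionally scoped to a task."""
    cmd = ["harness", "gate"]
    if task is not None:
        cmd += ["--task", str(task)]  # assumption: --task takes a value
    return cmd


def run_preflight():
    """Run the preflight check and return (exit_code, parsed_output).

    The payload is None when stdout is not valid JSON.
    """
    proc = subprocess.run(preflight_command(), capture_output=True, text=True)
    try:
        payload = json.loads(proc.stdout)
    except json.JSONDecodeError:
        payload = None
    return proc.returncode, payload
```

Passing the arguments as a list avoids shell quoting issues when a task identifier contains unusual characters.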
Development
harness init generates 9 skills, 5 subagents, and 4 rules into .cursor/. All task state lives under .harness-flow/ (local-first). Licensed under the MIT License.
pip install -e ".[dev]"
pytest
ruff check src/ tests/