Skip to main content

Cursor-native multi-agent development framework with 5-role review

Project description

中文

harness-flow

Agent writes code. Harness Flow ships products.

L5 autonomous delivery for the vibe coding era — you're the copilot now.

Python PyPI License: MIT

The Problem

AI agents can write code — but they can't ship products. They lack navigation (goal management), traffic rules (quality gates), and a dashcam (audit trail). The bottleneck has shifted from "can AI write code?" to "can AI autonomously deliver?"

Where Harness Flow Fits

The Evolution of Software Development: from manual coding (L0) to AI assistant (L1) to agent mode (L3) to Harness Flow autonomous delivery (L5)

The Three Pillars of L5

Navigation Traffic Rules Dashcam
vision → plan → roadmap 5-role review + gates + trust audit trail + learnings + retro
AI knows where to go AI obeys the rules every decision recorded

How It Works

flowchart LR
  Req["Requirement"] --> Plan["Plan"]
  Plan --> PlanReview["5-role\nplan review"]
  PlanReview --> Build["Build + CI"]
  Build --> CodeReview["5-role\ncode review"]
  CodeReview --> Ship["Ship → PR"]

  PlanReview -.-|"Architect · PO · Engineer · QA · PM"| CodeReview

  style Req fill:#fff,stroke:#222,stroke-width:2px,color:#000
  style Plan fill:#fff,stroke:#222,stroke-width:2px,color:#000
  style PlanReview fill:#222,stroke:#222,stroke-width:2px,color:#fff
  style Build fill:#fff,stroke:#222,stroke-width:2px,color:#000
  style CodeReview fill:#222,stroke:#222,stroke-width:2px,color:#fff
  style Ship fill:#fff,stroke:#222,stroke-width:2px,color:#000

One requirement in → one PR out. Both plan and code are reviewed by 5 parallel AI reviewers. Findings from 2+ roles on the same issue are flagged [HIGH CONFIDENCE].

Fix-First classifies every review finding:

  • AUTO-FIX — high certainty + small blast radius + reversible → fixed immediately
  • ASK — security, behavior change, architecture → batched for your decision

Quick Start

0. 10-minute happy path

Step 1 — Install:

pip install harness-flow

Step 2 — Initialize in your project:

cd <YOUR_PROJECT_PATH>
harness init

Step 3 — Open Cursor, type a requirement:

/harness-plan add input validation to the user registration endpoint

That's it — plan, build, 5-role review, and PR. One command.

What you'll see: the agent generates a spec + contract, 5 reviewers challenge the plan in parallel, then the agent implements, runs CI, gets code reviewed by the same 5 roles, and opens a PR — all autonomously.


Deep Dive

Your AI Engineering Team — 5 parallel reviewers

Harness gives you a complete engineering team inside Cursor — each role reviews both your plan and your code:

Role Plan Review Code Review
Architect Feasibility, module impact, dependencies Conformance, layering, coupling, security
Product Owner Vision alignment, user value, acceptance criteria Requirement coverage, behavioral correctness
Engineer Implementation feasibility, code reuse, tech debt Code quality, DRY, patterns, performance
QA Test strategy, boundary values, regression risk Test coverage, edge cases, CI health
Project Manager Task decomposition, parallelism, scope Scope drift, plan completion, delivery risk

Not a simulation — these roles run as parallel AI subagents with distinct system prompts, each scoring independently. Findings from 2+ roles are flagged as high confidence.

Each role can use a different model via [native.role_models] in config. If some reviewers fail, the pipeline continues with available perspectives (graceful degradation).

Contract-Driven Development

Every task starts with a spec + contract — deliverables, acceptance criteria, and risk analysis — reviewed by 5 roles before any code is written.

The contract lives in .harness-flow/tasks/task-NNN/plan.md and serves as the single source of truth. Runtime state is tracked in workflow-state.json alongside it.

Fix-First Auto-Remediation

Every review finding is classified before presenting it to you:

  • AUTO-FIX (high certainty + small blast radius + reversible) → fixed immediately, tests re-run
  • ASK (security, behavior change, architecture, low confidence) → batched and presented for your decision

Typical auto-fixes: unused imports, stale comments, missing null checks, naming inconsistencies, obvious N+1 queries.

Full Audit Trail

Plans, reviews, build logs, gate results — all persisted per task. Every decision is traceable.

.harness-flow/
├── config.toml              # project settings (CI command, trunk branch, language)
├── vision.md                # product direction (optional)
└── tasks/task-NNN/
    ├── plan.md              # spec + contract (scope SSOT)
    ├── handoff-*.json       # structured context per phase (plan, build, eval, ship)
    ├── build-rN.md          # build log per round
    ├── plan-eval-rN.md      # plan review per round
    ├── code-eval-rN.md      # code review per round
    ├── ship-metrics.json    # delivery metrics (scores, test count, coverage)
    ├── workflow-state.json  # canonical task phase / gate / blocker tracking
    └── ...                  # feedback ledger, intervention audit, etc. (optional)

Installation & Upgrade

Command What it does
pip install harness-flow Install the CLI
harness init Interactive wizard → generates skills, agents, rules into .cursor/
harness init --force Regenerate all artifacts (after config changes or version upgrade)
harness update Self-update the package + run config migration
harness update --check Check for new version without installing

All Skills — default: /harness-plan

/harness-plan is the default for most tasks — single-round plan → ship path.

/harness-vision covers everything from vague ideas to clear directions — it auto-detects whether to explore or clarify.

Entry points

Skill When to use What it does
/harness-plan "I have a requirement" Refine plan + 5-role review → auto build/eval/ship/retro
/harness-vision "I have an idea" or "a direction" Explore or clarify → structured vision → roadmap/backlog → iterative build/eval/ship loop

Utility & pipeline skills

Skill What it does
/harness-investigate Systematic bug investigation: reproduce → hypothesize → verify → minimal fix
/harness-learn Memverse knowledge management: store, retrieve, update project learnings
/harness-retro Engineering retrospective: commit analytics, hotspot detection, trend tracking
/harness-build Implement the contract, run CI, triage failures, write a structured build log
/harness-eval 5-role code review (architect + product-owner + engineer + qa + project-manager)
/harness-ship Full pipeline: test → review → fix → commit → push → PR
/harness-doc-release Documentation sync: detect stale docs after code changes

Progress & next-step hints

  • **harness workflow next** — one machine-readable line for agents/scripts (task id, phase, suggested skill).
  • **harness status** — Rich panel for humans ("what to do next" in task language).
  • **HARNESS_PROGRESS** — one-line boundary marker emitted by Cursor skills.

Configuration

Project settings live in .harness-flow/config.toml:

Key Default Description
workflow.max_iterations 3 Max review iterations per task
workflow.pass_threshold 7.0 Evaluator pass threshold (1-10)
workflow.auto_merge true Auto-merge branch after pass
native.evaluator_model "inherit" Default model for review roles; falls back to IDE default
native.review_gate "eng" Review gate strictness (eng = hard gate, advisory = log only)
native.plan_review_gate "auto" Plan review gate (human / ai / auto)
native.role_models.* {} Per-role model overrides; falls back to IDE default
workflow.branch_prefix "agent" Task branch prefix

CLI reference

Command Description
harness init [--name] [--ci] [-y] [--force] Initialize project (interactive wizard)
harness status Show current task progress
harness gate [--task] Check ship-readiness gates
harness update [--check] [--force] Self-update + config migration
harness git-preflight [--json] Preflight checks (clean tree, branch)
harness save-eval --task <id> [--kind] [--verdict] ... Save evaluation results
harness save-build-log --task <id> [--body] Save build log
harness git-prepare-branch --task-key <key> Create or resume task branch
harness git-sync-trunk [--json] Sync feature branch with trunk

Development

harness init generates 9 skills, 5 subagents, 4 rules into .cursor/. All task state lives under .harness-flow/ (local-first). See MIT License.

pip install -e ".[dev]"
pytest
ruff check src/ tests/

Project details


Release history Release notifications | RSS feed

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

harness_flow-4.1.91.tar.gz (512.7 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

harness_flow-4.1.91-py3-none-any.whl (290.5 kB view details)

Uploaded Python 3

File details

Details for the file harness_flow-4.1.91.tar.gz.

File metadata

  • Download URL: harness_flow-4.1.91.tar.gz
  • Upload date:
  • Size: 512.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for harness_flow-4.1.91.tar.gz
Algorithm Hash digest
SHA256 18d3b855a44dd949c6041ed1485e9b8319d29f6f27493d277f61c39a15196e79
MD5 3045c755ffa9f017343712cded02267b
BLAKE2b-256 dbfb7ed9e0819a79c0eb795baef0a2251866828439a12b29d5c011fd2d3edb48

See more details on using hashes here.

Provenance

The following attestation bundles were made for harness_flow-4.1.91.tar.gz:

Publisher: release.yml on arthaszeng/harness-flow

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file harness_flow-4.1.91-py3-none-any.whl.

File metadata

  • Download URL: harness_flow-4.1.91-py3-none-any.whl
  • Upload date:
  • Size: 290.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for harness_flow-4.1.91-py3-none-any.whl
Algorithm Hash digest
SHA256 cee7f74cfea36f66eacdececf26e523c6d6cf921b45a86173cc0528d58868e2b
MD5 5e05d28fb3905b3b472d3d6edbcfe98b
BLAKE2b-256 c683e8675e77cf1f83f11cc8ddf8db9d0f87c71fb08f3da7978c0976f6a57208

See more details on using hashes here.

Provenance

The following attestation bundles were made for harness_flow-4.1.91-py3-none-any.whl:

Publisher: release.yml on arthaszeng/harness-flow

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page