
Universal harness layer for AI coding agents — one command sets up your repo for any agent (Claude Code, Cursor, Codex, Gemini CLI, Aider, OpenHarness).


harnessforge

One command turns any repo into a project where Claude Code, Cursor, Codex, Gemini CLI, Aider, and any other coding agent show up already knowing the codebase.

uvx harnessforge init



Why this exists

2026 is the year developers still build the harness. 2027 is the year the LLM builds its own harness.

Today, every agent system still needs humans to manually prepare the environment: MCP servers, repo instructions, memory files, test commands, validation scripts, permission rules, browser credentials, and the task-loop scaffolding that holds the whole thing together. OpenAI's own definition of an agent — a system that plans, calls tools, collaborates, and keeps state across multi-step work — depends on every one of those layers being in place before the first plan step runs. MCP has emerged as the common connection layer for tools, data, and workflows. But which tools to connect, what's forbidden, what counts as done — those still get hand-authored, once per project, and then re-authored for the next.

In 2026, developers still spend too much time on that setup. By 2027, I don't think they will. The LLM will inspect the project, understand the task, generate the right harness, connect the right tools, create its own memory, write its own validation scripts, and keep refining the loop until the task is done. The harness layer disappears as a separately authored artifact.

The next big open-source project won't be another coding agent. It will be the universal harness layer that every coding agent can use — one simple framework that lets Claude, OpenAI, Gemini, local models, and future agents download a project, understand its environment, and call tools safely through a common interface. Model-neutral by design, because the model is the part that keeps changing.

harnessforge is the 2026 bridge. A deterministic repo walker plus an opinionated blueprint set. In ~3 seconds, fully local with no network calls, it generates everything your coding agent needs to start fast — AGENTS.md, SOUL.md, TOOLS.md, MEMORY.md, SKILLS/, per-IDE adapter files, blueprint validators, MCP recommendations, forbidden-path rules. Your coding agent stays the brain — harnessforge just lays the ground truth it reads on startup. You commit the output once per repo and every coding agent you use shows up already knowing the codebase. When the next generation of models can build this layer on the fly themselves, harnessforge has done its job and ages out gracefully.


What you get

Run harness init in your repo. ~3 seconds later, you have:

your-repo/
├── AGENTS.md           ← every coding agent reads this (OpenAI Codex CLI convention)
├── SOUL.md             ← personality / tone for this project
├── TOOLS.md            ← which tools / MCPs to use
├── MEMORY.md           ← memory schemas
├── SKILLS/             ← anthropics/skills-compatible procedures
│   ├── chunk-and-embed/SKILL.md
│   ├── retrieve-and-rerank/SKILL.md
│   └── …
├── .claude/CLAUDE.md       ← Claude Code reads this automatically
├── .cursor/rules           ← Cursor reads this automatically
├── .continue/config.json   ← Continue reads this automatically
├── .windsurf/rules         ← Windsurf reads this automatically
├── harness.config.json     ← what blueprint this repo is bound to
└── .harness/
    ├── profile.yaml        ← the canonical machine-readable description
    ├── manifest.json       ← sha256 of every file for safe re-runs
    └── memory_schemas/     ← JSON Schemas the blueprint expects

These aren't placeholder stubs. Here are the first 25 lines of an AGENTS.md that harnessforge init produced for a tiny stock-analysis repo:

# AGENTS.md

> _Generated by harnessforge v0.2.1 · blueprint `finance-agent` v1.0.0._

You are a **finance / market-data analyst agent** working in **portfoliowatch**.

This project is **read-only by default.** You fetch market data, compute
signals, surface insights. You **never** place orders, move money, or
modify positions without an **explicit per-action human-approval gate**
that the user typed "yes" through in this session.

---

## The analyst loop

fetch → compute → screen → flag


- `SKILLS/fetch-market-data` — get prices/quotes/fundamentals; respect rate limits.
- `SKILLS/compute-technicals` — RSI, SMA, MACD, etc. Vectorized; tested against canonical references.
- `SKILLS/screen-positions` — filter the universe by your declared criteria.
- `SKILLS/flag-attention` — surface what changed and why — calibrated, not alarmist.

A Claude Code session opened in that repo reads it and knows the loop, the safety contract, and which skills to invoke — without you typing any of it into the chat.


Install

# No install, run once:
uvx harnessforge init

# Or install globally:
pipx install harnessforge
harness init

# Or pip into a venv:
pip install harnessforge

Init is fully deterministic by default: no LLM call, no network, no API key, ~3 seconds. The optional MCP-server install lets your coding agent call harnessforge verify and harnessforge inspect as typed tools:

pip install "harnessforge[mcp]"   # expose harnessforge itself as an MCP server

Older releases also shipped [anthropic] / [openai] / [gemini] extras that called an LLM during init to "refine" the profile. They still work, but we now recommend against them. If you want LLM-assisted refinement, let your coding agent (Claude Code, Cursor, Codex, Gemini CLI, Aider) do it after init: it has the full repo context and a chat loop, neither of which a one-shot init-time call has. These extras are scheduled for deprecation in 0.3.


Five blueprints in 0.2.x

Pick with --blueprint, or let the recommender choose based on inspection (yfinance deps → finance-agent; langchain/qdrant → rag-agent; airflow → workflow-agent; generic Python → python-cli-app).

| Blueprint | For | Skills it ships |
| --- | --- | --- |
| python-cli-app | Build a Python CLI / library / web API (the default for greenfield Python work) | add-cli-command, add-unit-test, manage-dependency, check-style |
| finance-agent | Market data + portfolio analysis | fetch-market-data, compute-technicals, screen-positions, flag-attention + no_trades_without_gate validator that fails if generated code calls a broker function without an approval check |
| rag-agent | Retrieval-augmented Q&A with citation enforcement | chunk-and-embed, retrieve-and-rerank, answer-with-citations, eval-recall-precision + citation cross-checker |
| support-agent | Customer support: intent → KB → ticket → escalate | classify-intent, retrieve-kb-answer, file-ticket, escalate-if-unresolved + ticket-lineage validator |
| workflow-agent | Multi-step orchestration (Zapier/n8n-style) | decompose-task, call-tool-with-retry, check-result + tool-log + idempotency validators |
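The recommender rule mentioned above (yfinance deps → finance-agent; langchain/qdrant → rag-agent; airflow → workflow-agent; generic Python → python-cli-app) can be pictured as an ordered lookup over a repo's declared dependencies. The sketch below is illustrative only and is not harnessforge's actual recommender: the function name `recommend_blueprint`, the substring matching, and the first-match-wins ordering are all assumptions. Only the dependency-to-blueprint pairs come from this README.

```python
# Hypothetical sketch of a dependency-based blueprint recommender.
# Only the dependency -> blueprint pairs come from the README; the names,
# substring matching, and first-match-wins policy are assumptions.

RULES = [
    (("yfinance",), "finance-agent"),
    (("langchain", "qdrant"), "rag-agent"),
    (("airflow",), "workflow-agent"),
]
DEFAULT_BLUEPRINT = "python-cli-app"  # fallback for generic Python repos


def recommend_blueprint(dependencies):
    """Return the first blueprint whose trigger appears in any dependency name."""
    normalized = [dep.strip().lower() for dep in dependencies]
    for triggers, blueprint in RULES:
        # Substring match so "qdrant-client" or "apache-airflow" still trigger.
        if any(trigger in dep for trigger in triggers for dep in normalized):
            return blueprint
    return DEFAULT_BLUEPRINT


print(recommend_blueprint(["pandas", "yfinance"]))
print(recommend_blueprint(["click", "rich"]))
```

Passing --blueprint explicitly sidesteps any such heuristic, which is the safer choice when a repo mixes several of these dependencies.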

Beyond the catalog you can author project-specific skills:

harness skills add fetch-portfolio-prices \
  --domain --description "Fetch live prices for tickers in positions.json from Polygon."

Domain skills land under SKILLS/domain/<name>/SKILL.md and surface in harness skills list alongside blueprint-shipped skills.
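To make the layout concrete, here is a minimal sketch of how a listing could be derived from the directory structure alone. It assumes blueprint-shipped skills live at SKILLS/<name>/SKILL.md (as in the tree under "What you get") and domain skills at SKILLS/domain/<name>/SKILL.md. The function `list_skills` and the "blueprint" / "domain" labels are hypothetical, and the real harness skills list may read other metadata.

```python
from pathlib import Path


def list_skills(repo_root):
    """Return sorted (name, kind) pairs for every SKILL.md found under SKILLS/."""
    skills_dir = Path(repo_root) / "SKILLS"
    found = []
    for skill_file in skills_dir.glob("**/SKILL.md"):
        parts = skill_file.relative_to(skills_dir).parts
        if len(parts) == 3 and parts[0] == "domain":
            # SKILLS/domain/<name>/SKILL.md is a project-specific skill
            found.append((parts[1], "domain"))
        elif len(parts) == 2:
            # SKILLS/<name>/SKILL.md is assumed to be blueprint-shipped
            found.append((parts[0], "blueprint"))
    return sorted(found)
```

Calling `list_skills(".")` in a bootstrapped repo would return both kinds in one sorted list, mirroring the "alongside" behaviour described above.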


Validation that's actually a contract, not a vibe

harness verify runs blueprint-defined checks and emits a stable JSON contract — the same contract whether you call the CLI or the MCP tool. Your coding agent reads the JSON and self-corrects:

$ harness verify --json
{
  "schema_version": 1,
  "blueprint": "finance-agent",
  "checks": [
    {"name": "structure", "status": "pass", "duration_ms": 1, "messages": []},
    {"name": "tests",     "status": "pass", "duration_ms": 42, "messages": []},
    {"name": "no_trades_without_gate", "status": "pass", "duration_ms": 8, "messages": []}
  ],
  "summary": {"total": 3, "passed": 3, "failed": 0}
}
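A consumer of this contract only needs the fields shown in the sample. The sketch below is one way an agent-side script might pull out the checks that need attention. The helper name `failed_checks` is hypothetical, and the "fail" status value and message text in the example are assumptions, since the sample above only shows passing checks.

```python
import json


def failed_checks(report_text):
    """Return (name, messages) for every check whose status is not "pass"."""
    report = json.loads(report_text)
    if report.get("schema_version") != 1:
        raise ValueError(f"unexpected schema_version: {report.get('schema_version')!r}")
    return [
        (check["name"], check.get("messages", []))
        for check in report.get("checks", [])
        if check.get("status") != "pass"
    ]


# Illustrative report: same shape as the sample above, with one check
# flipped to a failing state. The "fail" value and the message are made up.
example = json.dumps({
    "schema_version": 1,
    "blueprint": "finance-agent",
    "checks": [
        {"name": "structure", "status": "pass", "duration_ms": 1, "messages": []},
        {"name": "no_trades_without_gate", "status": "fail", "duration_ms": 8,
         "messages": ["hypothetical message text"]},
    ],
    "summary": {"total": 2, "passed": 1, "failed": 1},
})

for name, messages in failed_checks(example):
    print(name, messages)
```

Checking `schema_version` first means a future contract change fails loudly instead of being silently misread.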

Exit codes: 0 all pass · 1 failures · 2 config error · 3 not a harness-bootstrapped repo. Drop in CI:

- run: pip install harnessforge
- run: harness sync --check    # fail if generated files drifted from manifest
- run: harness verify --json   # fail if blueprint contract broken
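Outside a CI runner, the same exit codes can be handled from a script. This is a minimal sketch that assumes the harness executable is on PATH. The helper names `describe_exit` and `run_verify` are hypothetical, while the code-to-meaning table is taken from the list above.

```python
import subprocess

# Exit-code meanings as documented above.
EXIT_MEANINGS = {
    0: "all checks passed",
    1: "one or more checks failed",
    2: "config error",
    3: "not a harness-bootstrapped repo",
}


def describe_exit(code):
    """Translate a `harness verify` exit code into a readable message."""
    return EXIT_MEANINGS.get(code, f"unknown exit code {code}")


def run_verify(repo_path="."):
    """Run `harness verify --json` in repo_path; return (exit_code, stdout)."""
    result = subprocess.run(
        ["harness", "verify", "--json"],
        cwd=repo_path,
        capture_output=True,
        text=True,
    )
    return result.returncode, result.stdout
```

A wrapper might call `run_verify()`, print `describe_exit(code)`, and only hand the JSON back to the agent when the code is 1, since codes 2 and 3 point at setup problems rather than failing checks.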

The no_trades_without_gate validator on finance-agent is the spicy one: it static-scans the repo for place_order(, buy(, sell(, etc., and fails the build if any of them sits behind only a config flag instead of a runtime user-approval gate. See docs/concepts/trust-model.md for the full trust boundary (profile.test_command and lint_command are executed as shell commands, the same model as make test / npm test).
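To give a feel for what a static scan of this kind involves, here is a deliberately simplified sketch built on Python's ast module. It is not the shipped validator. The gate name `require_human_approval` is hypothetical, and the rule "a broker call is acceptable only if the same function also calls the gate" is an assumption standing in for whatever the real check does. It also ignores module-level calls and does not check that the gate runs before the order.

```python
import ast

BROKER_CALLS = {"place_order", "buy", "sell"}  # names listed in the README
GATE_CALL = "require_human_approval"           # hypothetical gate function name


def _call_name(node):
    """Return the bare name of a call target, e.g. broker.buy(...) -> "buy"."""
    func = node.func
    if isinstance(func, ast.Name):
        return func.id
    if isinstance(func, ast.Attribute):
        return func.attr
    return None


def ungated_broker_calls(source):
    """Return (function, call, lineno) for broker calls in functions lacking the gate."""
    findings = []
    for func in ast.walk(ast.parse(source)):
        if not isinstance(func, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        calls = [(_call_name(n), n.lineno) for n in ast.walk(func) if isinstance(n, ast.Call)]
        if any(name == GATE_CALL for name, _ in calls):
            continue  # this function asks for approval somewhere; treat as gated
        findings.extend((func.name, name, line) for name, line in calls if name in BROKER_CALLS)
    return findings


example = '''
def rebalance(broker, cfg):
    if cfg.live_trading:  # config flag only
        broker.place_order("AAPL", 10)

def rebalance_gated(broker):
    require_human_approval("buy 10 AAPL")
    broker.place_order("AAPL", 10)
'''

print(ungated_broker_calls(example))
```

Only `rebalance` is reported here: a config flag guards the order, but nothing in the function asks a human.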


Commands

harness init [PATH]                       # inspect → profile → blueprint → render
harness sync [PATH] [--check]             # re-render adapters; --check = drift detect for CI
harness inspect [PATH] [--json|yaml]      # deterministic InspectionReport
harness verify [TARGET] [--json] [--tests|--lint]
harness blueprint {list, show, apply}
harness skills {list, show, add [--domain]}
harness doctor                            # diagnose env: provider keys, extras, repo health
harness mcp                               # run stdio MCP server (5 typed tools)
harness version
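The drift detection behind harness sync --check relies on .harness/manifest.json holding a sha256 for every generated file. A rough sketch of that idea follows. The manifest layout used here (a top-level "files" object mapping relative paths to hex digests) is an assumption, as are the helper names. The real command may also re-render files before comparing.

```python
import hashlib
import json
from pathlib import Path


def sha256_of(path):
    """Hex sha256 digest of a file's bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def drifted_files(repo_root, manifest_path=".harness/manifest.json"):
    """Return relative paths that are missing or whose digest differs from the manifest."""
    root = Path(repo_root)
    # Assumed layout: {"files": {"AGENTS.md": "<sha256 hex>", ...}}
    manifest = json.loads((root / manifest_path).read_text())
    drifted = []
    for rel_path, expected in sorted(manifest["files"].items()):
        target = root / rel_path
        if not target.is_file() or sha256_of(target) != expected:
            drifted.append(rel_path)
    return drifted
```

A CI step built on this would exit non-zero whenever `drifted_files(".")` is non-empty, which is the behaviour the --check flag is described as providing.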

Use with your existing coding agent

| Agent | What it reads | Setup |
| --- | --- | --- |
| Claude Code | .claude/CLAUDE.md | nothing (automatic) |
| Cursor | .cursor/rules | nothing |
| Codex CLI | AGENTS.md | nothing |
| Continue | .continue/config.json | nothing |
| Windsurf | .windsurf/rules | nothing |
| Gemini CLI / Aider | AGENTS.md | nothing |
| Anything that speaks MCP | harness mcp exposes 5 typed tools | add to client config |

For the MCP path:

{
  "mcpServers": {
    "harness": {
      "command": "uvx",
      "args": ["--from", "harnessforge[mcp]", "harness", "mcp"]
    }
  }
}

The server exposes harness_inspect, harness_blueprint_list, harness_skills_list, harness_verify, and harness_profile_read as typed tools your agent can call.


Hero demos (reproducible from this repo)

Three end-to-end demos against real public repos, pinned to specific SHAs:

| Demo | Repo | Blueprint | Run |
| --- | --- | --- | --- |
| FastAPI + RAG | fastapi/full-stack-fastapi-template | rag-agent | examples/hero/fastapi_rag/run.sh |
| Zulip + Support | zulip/zulip | support-agent | examples/hero/zulip_support/run.sh |
| Airflow + Workflow | apache/airflow | workflow-agent | examples/hero/airflow_workflow/run.sh |

Each run.sh shallow-clones the upstream repo at the pinned SHA, runs harnessforge init, and runs harnessforge verify to confirm everything passes. CI runs all three on every push.
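The run.sh scripts themselves are shell. For readers who want the same flow from Python, the sketch below spells out the steps described above. It is an approximation, not a transcription of run.sh: the git sequence (init, fetch --depth 1 of a single commit, checkout FETCH_HEAD) is one common way to shallow-clone at a pinned SHA, and the function names and arguments are hypothetical.

```python
import subprocess


def demo_steps(repo_url, pinned_sha, dest):
    """Return (argv, cwd) pairs: shallow-fetch one commit, then init and verify."""
    return [
        (["git", "init", dest], None),
        (["git", "remote", "add", "origin", repo_url], dest),
        # Fetch only the pinned commit, without history.
        (["git", "fetch", "--depth", "1", "origin", pinned_sha], dest),
        (["git", "checkout", "FETCH_HEAD"], dest),
        (["harnessforge", "init"], dest),
        (["harnessforge", "verify"], dest),
    ]


def run_demo(repo_url, pinned_sha, dest):
    """Execute each step in order, stopping at the first non-zero exit."""
    for argv, cwd in demo_steps(repo_url, pinned_sha, dest):
        subprocess.run(argv, cwd=cwd, check=True)
```

Because `check=True` raises on any failing step, a failed verify surfaces as an exception, which is roughly how a CI job would treat a non-zero exit from run.sh.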


How this compares to alternatives

The "agent infrastructure" space has runtimes (Hermes, OpenClaw, OpenHarness), SDKs (OpenAI Agents SDK, Mastra), and now provisioners (harnessforge). The honest version of the comparison — including when not to use harnessforge — lives in docs/concepts/vs-hermes-openclaw-openharness.md.

The shortest version: if you write your own CLAUDE.md for every repo you start, harnessforge replaces that file with one that's actually project-specific, plus everything the other coding agents you use need to read. That's the comparison most readers are actually making.


Quality bar

  • 186 tests across unit / golden-file / interop / integration tiers, all green on Python 3.11 / 3.12 / 3.13
  • 83% line coverage on src/harness/
  • mypy strict + ruff clean
  • mkdocs --strict builds clean
  • Fresh-venv install verified: pip install harnessforge && harness version works on a clean machine (the v0.2.1 release was held until this passed)
  • Two rounds of real-agent A/B evaluation — Claude Code building the same stock-analysis agent WITH vs. WITHOUT the harness; the diff drove the v0.2 + v0.2.1 designs. See CHANGELOG.md for the per-fix-per-eval breakdown.

Status

harnessforge 0.2.1 — first public release. Following Semantic Versioning.

Feedback welcome via issues — see CONTRIBUTING.md and SECURITY.md. The 5 shipped blueprints are intentionally opinionated; PRs proposing a 6th are encouraged. Sales / browser blueprints land in 0.3 alongside auth-bearing MCP catalog entries.

License

MIT — see LICENSE.
