Harness

Multi-provider coding agent CLI + SDK

State-of-the-art open-source coding agent

CLI + SDK that works with any LLM — Claude, GPT, Gemini, Ollama, or any OpenAI-compatible endpoint.

The only open-source agent to score 100% on Harness-Bench and outperform Claude Code, OpenCode, and pi-mono.

License: MIT · Python 3.12+

Quick Start · Screenshots · Commands · Providers · Features · SDK · Benchmarks · Tutorials · Contributing


⚡ Quick Start

Install

```bash
curl -fsSL https://raw.githubusercontent.com/AgentBoardTT/openharness/main/install.sh | bash
```

Or with pip:

```bash
pip install harness-agent
```

Connect

```bash
harness connect
```

Pick a provider, paste your API key, done. Your key is saved to ~/.harness/config.toml.

Use

```bash
# Interactive REPL
harness

# One-shot command
harness "Fix the authentication bug in auth.py"

# Bypass mode — full auto-approve, no prompts (great for CI/scripts)
harness --permission bypass "Run all tests and fix failures"

# Use a specific model
harness -p openai -m gpt-5.2 "Refactor this function"

# Use a local model (no API key, fully private)
harness -p ollama -m llama3.3 "Write unit tests for utils.py"

# Resume a previous session
harness --session abc123 "Continue where we left off"
```

That's it. You're running a state-of-the-art coding agent.

back to top


📸 Screenshots

REPL Banner

Harness REPL banner showing version, provider, and model

Interactive Command Palette

Type / to open a filterable command palette. Arrow keys navigate, Enter selects, Escape dismisses.

Interactive slash command palette with filtering

Agent Execution

Agent building a tic-tac-toe game with tool calls

Status & Models

/status command showing provider, model, session, and cost

/models command listing available models by provider

back to top


🎛 Interactive REPL Commands

Type / in the REPL to open the command palette, or use any command directly.

| Command | Description |
| --- | --- |
| `/help` | Show available commands and tips |
| `/connect` | Set up or change your API key |
| `/model` | Switch model (e.g. `/model gpt-5.2`) |
| `/models` | List available models |
| `/plan` | Plan implementation with a read-only agent |
| `/review` | Review code changes or a specific file |
| `/team` | Decompose a task and run agents in parallel |
| `/status` | Show provider, model, session, and cost |
| `/cost` | Show token usage and cost for this session |
| `/compact` | Summarize conversation to free up context |
| `/session` | Show or switch session ID |
| `/diff` | Show git diff of changes in working directory |
| `/init` | Create a HARNESS.md project config file |
| `/doctor` | Check your setup (provider, API key, tools) |
| `/permission` | View or change permission mode |
| `/clear` | Clear the screen |

back to top


🔌 Providers

Harness works with every major AI provider — switch with a single flag.

| Provider | Models | How to connect |
| --- | --- | --- |
| Anthropic | Claude Opus 4.6, Sonnet 4.6, Haiku 4.5 | `harness connect` and choose Anthropic |
| OpenAI | GPT-5.2, GPT-4.1, o3, o4-mini, GPT-4o | `harness connect` and choose OpenAI |
| Google | Gemini 2.5 Pro, 2.5 Flash, 2.0 Flash | `harness connect` and choose Google |
| Ollama | Llama, Mistral, Qwen, Phi, etc. | No key needed — runs locally |
| OpenAI-compatible | DeepSeek, Groq, OpenRouter | `--base-url` flag |

```bash
harness models list          # Browse 50+ supported models
harness models info sonnet   # Get details for a specific model
```

back to top


🛠 Features

Built-in Tools

| Tool | What it does |
| --- | --- |
| Read | Read file contents |
| Write | Create or overwrite files |
| Edit | Find-and-replace inside files |
| Bash | Run shell commands |
| Glob | Find files by name pattern |
| Grep | Search inside files with regex |
| Task | Spawn sub-agents for parallel work |
| WebFetch | Pull content from web pages |
| AskUser | Ask you a question mid-task |
| Checkpoint | Save/restore file snapshots |

Sub-Agents

The agent can spin up specialized workers in parallel:

| Agent | Access | Use Case |
| --- | --- | --- |
| general | Full tools | Complex multi-step tasks |
| explore | Read-only | Fast codebase exploration |
| plan | Read-only | Architecture planning |
| review | Read-only | Structured code review |

Permission Modes

You control what the agent can do:

| Mode | Behavior |
| --- | --- |
| `default` | Reads are automatic, writes ask for approval |
| `accept_edits` | File edits are automatic, shell commands ask |
| `plan` | Read-only — nothing gets changed |
| `bypass` | Full auto-approve (for scripts/CI) |
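
The gating these modes imply can be sketched as a small decision function. This is a minimal illustration only; the function, enum, and tool-set names below are hypothetical, not the actual Harness internals:

```python
from enum import Enum

class Decision(Enum):
    ALLOW = "allow"
    ASK = "ask"
    DENY = "deny"

# Hypothetical: tools treated as read-only for gating purposes.
READ_ONLY_TOOLS = {"Read", "Glob", "Grep"}

def decide(mode: str, tool: str) -> Decision:
    if mode == "bypass":
        return Decision.ALLOW              # everything auto-approved
    if tool in READ_ONLY_TOOLS:
        return Decision.ALLOW              # reads are always automatic
    if mode == "plan":
        return Decision.DENY               # read-only: block all mutations
    if mode == "accept_edits":
        # file edits run automatically, shell commands still ask
        return Decision.ASK if tool == "Bash" else Decision.ALLOW
    return Decision.ASK                    # default: writes ask for approval
```
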

Interactive Command Palette

Type / in the REPL to open a filterable dropdown. Arrow keys navigate, typing filters, Enter selects, Escape dismisses. All 16 slash commands are accessible instantly.

Async Steering (Live Message Injection)

While the agent is executing, type a message and press Enter to inject it between turns. The steering channel queues your message and the agent processes it at the next turn boundary — no need to wait for it to finish.
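
The turn-boundary queueing described above can be illustrated with a plain asyncio queue. This is a simplified sketch of the idea, not the actual steering implementation:

```python
import asyncio

async def agent_loop(steering: asyncio.Queue, turns: list[str]) -> list[str]:
    """Process turns, draining injected messages at each turn boundary."""
    transcript = []
    for turn in turns:
        transcript.append(f"agent: {turn}")
        # Turn boundary: pick up anything the user typed while we were working.
        while not steering.empty():
            transcript.append(f"user (injected): {steering.get_nowait()}")
    return transcript

async def main():
    steering = asyncio.Queue()
    steering.put_nowait("skip the auth endpoints")  # injected mid-run
    return await agent_loop(steering, ["plan refactor", "apply edits"])
```

The injected message lands between the two turns rather than interrupting the first one, which is exactly the behavior described above.
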

Context Compaction

When the conversation approaches the model's context limit (85% threshold), Harness automatically summarizes earlier messages while preserving key information. The compacted context targets 50% of the window, giving you room to keep working without starting a new session.
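
The two thresholds reduce to simple arithmetic; a sketch using the 85% trigger and 50% target from this section (the helper names are hypothetical):

```python
def should_compact(used_tokens: int, context_window: int,
                   trigger: float = 0.85) -> bool:
    """Compaction kicks in once usage crosses 85% of the window."""
    return used_tokens >= trigger * context_window

def compaction_budget(context_window: int, target: float = 0.50) -> int:
    """After summarizing, the context is squeezed to ~50% of the window."""
    return int(target * context_window)
```

For a 200k-token window, compaction would trigger at 170k tokens and summarize the conversation down to roughly 100k.
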

MCP (Model Context Protocol)

Connect external tool servers — Jira, Slack, databases, anything with an MCP adapter:

```python
async for msg in harness.run(
    "Search our Jira board",
    mcp_servers={
        "jira": {
            "command": "npx",
            "args": ["-y", "@anthropic/mcp-server-jira"],
            "env": {"JIRA_TOKEN": "..."},
        }
    },
):
    ...
```

Skills

Teach the agent custom workflows by dropping a .md file in .harness/skills/:

```markdown
---
name: deploy
description: Deploy to production
user_invocable: true
---

1. Run the test suite: `pytest tests/ -v`
2. Build the Docker image: `docker build -t myapp .`
3. Push to registry and deploy
```

Hooks

Run your own commands before/after every tool call:

```python
hooks = [
    harness.Hook(
        event=harness.HookEvent.PRE_TOOL_USE,
        command="echo 'About to run {tool_name}'",
        matcher="Bash",
    ),
]

async for msg in harness.run("Fix the tests", hooks=hooks):
    ...
```

Memory

  • Project instructions — Drop a HARNESS.md in your project root
  • Auto-memory — Learnings persist across sessions in ~/.harness/memory/
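
A HARNESS.md is plain markdown; the contents below are only an illustration of the kind of project instructions you might put there, not a required format:

```markdown
# Project instructions

- Run the test suite with `pytest tests/ -v` before declaring a task done.
- Follow the existing code style in `src/`.
- Never commit directly to `main`; open a PR instead.
```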

back to top


🐍 SDK

Use Harness as a Python library to build your own tools on top of it.

Basic Usage

```python
import harness

async for msg in harness.run("Fix the bug in auth.py"):
    match msg:
        case harness.TextMessage(text=t, is_partial=False):
            print(t)
        case harness.ToolUse(name=name):
            print(f"Using tool: {name}")
        case harness.Result(text=t, total_tokens=tok):
            print(f"Done ({tok} tokens): {t}")
```

With Configuration

```python
async for msg in harness.run(
    "Refactor the database module",
    provider="openai",
    model="gpt-4.1",
    permission_mode="accept_edits",
    max_turns=50,
):
    ...
```

Sub-Agent API

```python
from harness.agents.manager import AgentManager

mgr = AgentManager(provider=provider, tools=tools, cwd=".")
result = await mgr.spawn("explore", "Find all API endpoints")

# Parallel execution
results = await mgr.spawn_parallel([
    ("explore", "Find all API endpoints"),
    ("explore", "Find all database models"),
    ("review", "Review the auth module"),
])
```

Steering Channel (Async Message Injection)

Inject messages into the agent loop while it's running:

```python
from harness.core.steering import SteeringChannel

steering = SteeringChannel()

# Start the agent with steering enabled
async for msg in harness.run(
    "Refactor the API layer",
    steering=steering,
):
    print(msg)

# From another coroutine, inject a message between turns:
await steering.send("Actually, skip the auth endpoints")
```

back to top


📊 Benchmark Results

Harness was benchmarked against the leading coding agents on 8 real-world tasks covering multi-file editing, bug fixing, error recovery, refactoring, context understanding, and code analysis.

Overall Scores

| Agent | Claude Opus 4.6 | GPT-5.2 |
| --- | --- | --- |
| Harness | 7/8 (88%) | 8/8 (100%) |
| Claude Code | 7/8 (88%) | — |
| OpenCode | 7/8 (88%) | 7/8 (88%) |
| pi-mono | 7/8 (88%) | 8/8 (100%) |

Harness is the only open-source agent that achieves a perfect score — and it does so across providers, not locked to one.

Per-Task Breakdown (GPT-5.2)

| Task | Harness | OpenCode | pi-mono |
| --- | --- | --- | --- |
| Multi-file editing | PASS (17.5s) | PASS (19.4s) | PASS (26.8s) |
| Error recovery | PASS (5.2s) | PASS (11.7s) | PASS (10.1s) |
| Tool efficiency | PASS (1.8s) | PASS (5.6s) | PASS (9.2s) |
| Context understanding | PASS (9.7s) | FAIL | PASS (41.3s) |
| Project creation | PASS (3.0s) | PASS (7.6s) | PASS (3.8s) |
| Bug fixing | PASS (5.5s) | PASS (12.9s) | PASS (10.0s) |
| Code analysis | PASS (1.9s) | PASS (5.2s) | PASS (2.3s) |
| Refactoring | PASS (6.4s) | PASS (11.7s) | PASS (12.7s) |

Speed

| Agent | Model | Avg per Task | Total (8 tasks) |
| --- | --- | --- | --- |
| Harness | GPT-5.2 | 6.4s | 51.0s |
| Harness | Opus 4.6 | 12.5s | 99.7s |
| Claude Code | Opus 4.6 | 16.4s | 131.5s |
| OpenCode | GPT-5.2 | 10.7s | 85.8s |
| pi-mono | GPT-5.2 | 14.5s | 116.2s |

Harness is roughly 1.7x faster than the next-fastest agent on GPT-5.2 (51.0s vs 85.8s total), and about 25% faster than Claude Code on Opus (99.7s vs 131.5s).

back to top


⚙️ Configuration

Config File

Created automatically by harness connect. Lives at ~/.harness/config.toml:

```toml
[providers.anthropic]
api_key = "sk-ant-..."

[providers.openai]
api_key = "sk-..."
```

Environment Variables

If you prefer env vars, those work too:

```bash
export ANTHROPIC_API_KEY="sk-ant-..."
export OPENAI_API_KEY="sk-..."
export GOOGLE_API_KEY="AIza..."
```

back to top


🏗 Architecture

```
src/harness/
  core/
    engine.py          Top-level run() entry point
    loop.py            Agent loop (provider -> tools -> repeat)
    session.py         JSONL session persistence
    context.py         Context window management + compaction
    config.py          Config loading (env, TOML, HARNESS.md)
    steering.py        Async steering channel for live message injection
  providers/
    anthropic.py       Claude adapter
    openai.py          GPT / OpenAI-compatible adapter
    google.py          Gemini adapter
    ollama.py          Ollama local model adapter
    registry.py        Model catalogue (50+ models)
  tools/               Read, Write, Edit, Bash, Glob, Grep, Task, Web, etc.
  agents/              Sub-agent registry + lifecycle manager
  hooks/               Pre/post tool-use hook system
  mcp/                 MCP client + progressive tool discovery
  skills/              Skill loader (SKILL.md parser)
  memory/              Auto-memory + project instructions
  permissions/         Permission rules engine
  ui/                  Rich terminal output + streaming + diffs
  eval/                SWE-bench, Harness-Bench, metrics, reports
  cli/                 Click CLI entry point + subcommands
```

back to top


🧪 Evaluation

Run Benchmarks

```bash
# Quick validation — 8 tasks, ~$1
harness eval harness-bench --provider anthropic --model sonnet

# SWE-bench Lite — 300 real GitHub issues
harness eval swe-bench --split lite --max-tasks 10

# List benchmarks
harness eval list
```

Available Benchmarks

| Benchmark | Tasks | Description |
| --- | --- | --- |
| Harness-Bench | 8 | Multi-file editing, error recovery, refactoring, analysis |
| SWE-bench Lite | 300 | Curated subset of real GitHub issues |
| SWE-bench Verified | 500 | Human-verified solvable issues |
| SWE-bench Full | 2,294 | Complete benchmark |

back to top


🔧 Development

```bash
git clone https://github.com/AgentBoardTT/openharness.git
cd openharness
uv pip install -e ".[dev]"
uv run pytest tests/ -v
uv run ruff check src/ tests/
```

📚 Tutorials

Learn Harness from zero to enterprise with our step-by-step tutorial series:

| # | Tutorial | Difficulty | Time |
| --- | --- | --- | --- |
| 1 | Getting Started — Your First AI Coding Agent | Beginner | 10 min |
| 2 | The Python SDK — Building AI-Powered Tools | Beginner | 15 min |
| 3 | Permission Modes & Safety Controls | Intermediate | 15 min |
| 4 | Budget Controls & Cost Optimization | Intermediate | 15 min |
| 5 | Policy-as-Code — Governing Agent Behavior | Intermediate | 20 min |
| 6 | Audit Logging & Compliance | Intermediate | 20 min |
| 7 | Sandboxed Execution | Intermediate | 20 min |
| 8 | Sub-Agents & Parallel Execution | Intermediate | 20 min |
| 9 | CI/CD Integration — AI Agent in Your Pipeline | Advanced | 25 min |
| 10 | Enterprise Production Deployment | Advanced | 30 min |

👉 Browse all tutorials

back to top


🤝 Contributing

We'd love your help. Here's where we especially need it:

  • New provider adapters
  • Additional tools
  • Benchmark tasks and evaluation
  • Documentation and examples

📄 License

MIT

The best agent scaffold is an open one.
