Harness
State-of-the-art open-source coding agent
CLI + SDK that works with any LLM — Claude, GPT, Gemini, Ollama, or any OpenAI-compatible endpoint.
The only open-source agent to score 100% on Harness-Bench and outperform Claude Code, OpenCode, and pi-mono.
Quick Start · Screenshots · Commands · Providers · Features · SDK · Benchmarks · Tutorials · Contributing
⚡ Quick Start
Install
```bash
curl -fsSL https://raw.githubusercontent.com/AgentBoardTT/openharness/main/install.sh | bash
```

Or with pip:

```bash
pip install harness-agent
```
Connect
```bash
harness connect
```
Pick a provider, paste your API key, done. Your key is saved to ~/.harness/config.toml.
Where do I get an API key?
- Anthropic (Claude): https://console.anthropic.com/settings/keys
- OpenAI (GPT): https://platform.openai.com/api-keys
- Google (Gemini): https://aistudio.google.com/apikey
Use
```bash
# Interactive REPL
harness

# One-shot command
harness "Fix the authentication bug in auth.py"

# Bypass mode — full auto-approve, no prompts (great for CI/scripts)
harness --permission bypass "Run all tests and fix failures"

# Use a specific model
harness -p openai -m gpt-5.2 "Refactor this function"

# Use a local model (no API key, fully private)
harness -p ollama -m llama3.3 "Write unit tests for utils.py"

# Resume a previous session
harness --session abc123 "Continue where we left off"
```
That's it. You're running a state-of-the-art coding agent.
📸 Screenshots
REPL Banner
Interactive Command Palette
Type / to open a filterable command palette. Arrow keys navigate, Enter selects, Escape dismisses.
Agent Execution
Status & Models
🎛 Interactive REPL Commands
Type / in the REPL to open the command palette, or use any command directly.
| Command | Description |
|---|---|
| `/help` | Show available commands and tips |
| `/connect` | Set up or change your API key |
| `/model` | Switch model (e.g. `/model gpt-5.2`) |
| `/models` | List available models |
| `/plan` | Plan implementation with a read-only agent |
| `/review` | Review code changes or a specific file |
| `/team` | Decompose a task and run agents in parallel |
| `/status` | Show provider, model, session, and cost |
| `/cost` | Show token usage and cost for this session |
| `/compact` | Summarize conversation to free up context |
| `/session` | Show or switch session ID |
| `/diff` | Show git diff of changes in working directory |
| `/init` | Create a HARNESS.md project config file |
| `/doctor` | Check your setup (provider, API key, tools) |
| `/permission` | View or change permission mode |
| `/clear` | Clear the screen |
🔌 Providers
Harness works with every major AI provider — switch with a single flag.
| Provider | Models | How to connect |
|---|---|---|
| Anthropic | Claude Opus 4.6, Sonnet 4.6, Haiku 4.5 | `harness connect` and choose Anthropic |
| OpenAI | GPT-5.2, GPT-4.1, o3, o4-mini, GPT-4o | `harness connect` and choose OpenAI |
| Google | Gemini 2.5 Pro, 2.5 Flash, 2.0 Flash | `harness connect` and choose Google |
| Ollama | Llama, Mistral, Qwen, Phi, etc. | No key needed — runs locally |
| OpenAI-compatible | DeepSeek, Groq, OpenRouter | `--base-url` flag |
```bash
harness models list          # Browse 50+ supported models
harness models info sonnet   # Get details for a specific model
```
🛠 Features
Built-in Tools
| Tool | What it does |
|---|---|
| Read | Read file contents |
| Write | Create or overwrite files |
| Edit | Find-and-replace inside files |
| Bash | Run shell commands |
| Glob | Find files by name pattern |
| Grep | Search inside files with regex |
| Task | Spawn sub-agents for parallel work |
| WebFetch | Pull content from web pages |
| AskUser | Ask you a question mid-task |
| Checkpoint | Save/restore file snapshots |
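To make the Edit row concrete, here is a sketch of the strict find-and-replace semantics that edit tools of this kind commonly enforce (the search string must match exactly once so the edit is unambiguous). This is an assumption about how Harness's Edit tool behaves, not its documented implementation:

```python
# Illustrative sketch of an Edit-style find-and-replace: refuse missing or
# ambiguous matches rather than guessing. Hypothetical, not Harness's code.
def edit_text(content: str, old: str, new: str) -> str:
    """Replace `old` with `new`, requiring exactly one match."""
    count = content.count(old)
    if count == 0:
        raise ValueError("search string not found")
    if count > 1:
        raise ValueError(f"search string matches {count} times; be more specific")
    return content.replace(old, new)
```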
Sub-Agents
The agent can spin up specialized workers in parallel:
| Agent | Access | Use Case |
|---|---|---|
| general | Full tools | Complex multi-step tasks |
| explore | Read-only | Fast codebase exploration |
| plan | Read-only | Architecture planning |
| review | Read-only | Structured code review |
Permission Modes
You control what the agent can do:
| Mode | Behavior |
|---|---|
| `default` | Reads are automatic, writes ask for approval |
| `accept_edits` | File edits are automatic, shell commands ask |
| `plan` | Read-only — nothing gets changed |
| `bypass` | Full auto-approve (for scripts/CI) |
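The four modes above can be read as a small decision function. This sketch is illustrative only: the mode names come from the table, but the function, the `read`/`edit`/`shell` action labels, and the allow/ask/deny outcomes are assumptions, not Harness's actual permissions API:

```python
# Hypothetical sketch of the permission-mode gate described in the table.
def decide(mode: str, action: str) -> str:
    """Map (permission mode, action kind) to 'allow', 'ask', or 'deny'."""
    if mode == "bypass":
        return "allow"                                    # full auto-approve
    if mode == "plan":
        return "allow" if action == "read" else "deny"    # read-only
    if mode == "accept_edits":
        return "ask" if action == "shell" else "allow"    # shell still asks
    # "default": reads are automatic, writes ask for approval
    return "allow" if action == "read" else "ask"
```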
Interactive Command Palette
Type / in the REPL to open a filterable dropdown. Arrow keys navigate, typing filters, Enter selects, Escape dismisses. All 16 slash commands are accessible instantly.
Async Steering (Live Message Injection)
While the agent is executing, type a message and press Enter to inject it between turns. The steering channel queues your message and the agent processes it at the next turn boundary — no need to wait for it to finish.
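The mechanism above — a queue drained at each turn boundary — can be sketched with an `asyncio.Queue`. This is an illustration of the pattern, not Harness's actual `SteeringChannel` implementation:

```python
import asyncio

class Steering:
    """Toy steering channel: messages queue up and are drained between turns."""

    def __init__(self) -> None:
        self._queue: asyncio.Queue = asyncio.Queue()

    async def send(self, text: str) -> None:
        await self._queue.put(text)          # caller injects a message

    def drain(self) -> list:
        """Collect all pending messages without blocking."""
        msgs = []
        while not self._queue.empty():
            msgs.append(self._queue.get_nowait())
        return msgs

async def agent_loop(steering: Steering, turns: int) -> list:
    seen = []
    for _ in range(turns):
        # ... one model/tool turn would run here ...
        seen.extend(steering.drain())        # pick up injected messages at the boundary
    return seen
```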
Context Compaction
When the conversation approaches the model's context limit (85% threshold), Harness automatically summarizes earlier messages while preserving key information. The compacted context targets 50% of the window, giving you room to keep working without starting a new session.
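The two numbers above (85% trigger, 50% target) imply simple arithmetic for when compaction fires and how much room it aims to reclaim. This helper is a sketch of that policy, not Harness's internals:

```python
# Back-of-the-envelope model of the compaction policy described above.
def compaction_plan(used_tokens: int, window: int,
                    trigger: float = 0.85, target: float = 0.50):
    """Return (should_compact, target_token_count)."""
    should_compact = used_tokens >= int(window * trigger)
    return should_compact, int(window * target)
```

For a 200k-token window, compaction would fire at 170k used tokens and summarize down toward 100k.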
MCP (Model Context Protocol)
Connect external tool servers — Jira, Slack, databases, anything with an MCP adapter:
```python
async for msg in harness.run(
    "Search our Jira board",
    mcp_servers={
        "jira": {
            "command": "npx",
            "args": ["-y", "@anthropic/mcp-server-jira"],
            "env": {"JIRA_TOKEN": "..."},
        }
    },
):
    ...
```
Skills
Teach the agent custom workflows by dropping a .md file in .harness/skills/:
```markdown
---
name: deploy
description: Deploy to production
user_invocable: true
---

1. Run the test suite: `pytest tests/ -v`
2. Build the Docker image: `docker build -t myapp .`
3. Push to registry and deploy
```
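A skill file like the one above splits into frontmatter and instructions. Here is a minimal sketch of that parsing, assuming only the simple `key: value` fields shown; Harness's actual loader may handle richer YAML:

```python
# Hypothetical frontmatter parser for a skill .md file.
def parse_skill(text: str):
    """Split a skill file into (metadata dict, instruction body)."""
    _, frontmatter, body = text.split("---", 2)
    meta = {}
    for line in frontmatter.strip().splitlines():
        key, _, value = line.partition(":")
        meta[key.strip()] = value.strip()
    return meta, body.strip()
```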
Hooks
Run your own commands before/after every tool call:
```python
hooks = [
    harness.Hook(
        event=harness.HookEvent.PRE_TOOL_USE,
        command="echo 'About to run {tool_name}'",
        matcher="Bash",
    ),
]

async for msg in harness.run("Fix the tests", hooks=hooks):
    ...
```
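The hook command above interpolates `{tool_name}`. One plausible way to do that substitution is `str.format_map` with a safe fallback for unknown placeholders; both the helper and the idea that other placeholder names exist are assumptions, not documented Harness behavior:

```python
# Hypothetical sketch of hook-command templating.
def render_hook_command(template: str, **context: str) -> str:
    """Fill {placeholders} in a hook command, leaving unknown ones intact."""
    class _Safe(dict):
        def __missing__(self, key):
            return "{" + key + "}"
    return template.format_map(_Safe(context))
```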
Memory
- **Project instructions** — drop a `HARNESS.md` in your project root
- **Auto-memory** — learnings persist across sessions in `~/.harness/memory/`
🐍 SDK
Use Harness as a Python library to build your own tools on top of it.
Basic Usage
```python
import harness

async for msg in harness.run("Fix the bug in auth.py"):
    match msg:
        case harness.TextMessage(text=t, is_partial=False):
            print(t)
        case harness.ToolUse(name=name):
            print(f"Using tool: {name}")
        case harness.Result(text=t, total_tokens=tok):
            print(f"Done ({tok} tokens): {t}")
```
With Configuration
```python
async for msg in harness.run(
    "Refactor the database module",
    provider="openai",
    model="gpt-4.1",
    permission_mode="accept_edits",
    max_turns=50,
):
    ...
```
Sub-Agent API
```python
from harness.agents.manager import AgentManager

mgr = AgentManager(provider=provider, tools=tools, cwd=".")
result = await mgr.spawn("explore", "Find all API endpoints")

# Parallel execution
results = await mgr.spawn_parallel([
    ("explore", "Find all API endpoints"),
    ("explore", "Find all database models"),
    ("review", "Review the auth module"),
])
```
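The fan-out pattern behind `spawn_parallel` can be sketched with `asyncio.gather`, which runs the jobs concurrently and returns results in input order. The worker function here is a stand-in, not Harness's `AgentManager`:

```python
import asyncio

async def run_agent(kind: str, task: str) -> str:
    # Stand-in for a real sub-agent run.
    await asyncio.sleep(0)
    return f"{kind}: {task}"

async def spawn_parallel(jobs):
    """Run all (agent_kind, task) jobs concurrently, preserving input order."""
    return await asyncio.gather(*(run_agent(kind, task) for kind, task in jobs))
```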
Steering Channel (Async Message Injection)
Inject messages into the agent loop while it's running:
```python
from harness.core.steering import SteeringChannel

steering = SteeringChannel()

# Start the agent with steering enabled
async for msg in harness.run(
    "Refactor the API layer",
    steering=steering,
):
    print(msg)

# From another coroutine, inject a message between turns:
await steering.send("Actually, skip the auth endpoints")
```
📊 Benchmark Results
Harness was benchmarked against the leading coding agents on 8 real-world tasks covering multi-file editing, bug fixing, error recovery, refactoring, context understanding, and code analysis.
Overall Scores
| Agent | Claude Opus 4.6 | GPT-5.2 |
|---|---|---|
| Harness | 7/8 (88%) | 8/8 (100%) |
| Claude Code | 7/8 (88%) | — |
| OpenCode | 7/8 (88%) | 7/8 (88%) |
| pi-mono | 7/8 (88%) | 8/8 (100%) |
Harness is the only open-source agent that achieves a perfect score — and it does so across providers, not locked to one.
Per-Task Breakdown (GPT-5.2)
| Task | Harness | OpenCode | pi-mono |
|---|---|---|---|
| Multi-file editing | PASS (17.5s) | PASS (19.4s) | PASS (26.8s) |
| Error recovery | PASS (5.2s) | PASS (11.7s) | PASS (10.1s) |
| Tool efficiency | PASS (1.8s) | PASS (5.6s) | PASS (9.2s) |
| Context understanding | PASS (9.7s) | FAIL | PASS (41.3s) |
| Project creation | PASS (3.0s) | PASS (7.6s) | PASS (3.8s) |
| Bug fixing | PASS (5.5s) | PASS (12.9s) | PASS (10.0s) |
| Code analysis | PASS (1.9s) | PASS (5.2s) | PASS (2.3s) |
| Refactoring | PASS (6.4s) | PASS (11.7s) | PASS (12.7s) |
Speed
| Agent | Model | Avg per Task | Total (8 tasks) |
|---|---|---|---|
| Harness | GPT-5.2 | 6.4s | 51.0s |
| Harness | Opus 4.6 | 12.5s | 99.7s |
| Claude Code | Opus 4.6 | 16.4s | 131.5s |
| OpenCode | GPT-5.2 | 10.7s | 85.8s |
| pi-mono | GPT-5.2 | 14.5s | 116.2s |
On GPT-5.2, Harness is roughly 1.7x faster than the next-fastest agent (51.0s vs 85.8s total), and about 25% faster than Claude Code on Opus (99.7s vs 131.5s).
⚙️ Configuration
Config File
Created automatically by harness connect. Lives at ~/.harness/config.toml:
```toml
[providers.anthropic]
api_key = "sk-ant-..."

[providers.openai]
api_key = "sk-..."
```
Environment Variables
If you prefer env vars, those work too:
```bash
export ANTHROPIC_API_KEY="sk-ant-..."
export OPENAI_API_KEY="sk-..."
export GOOGLE_API_KEY="AIza..."
```
🏗 Architecture
```
src/harness/
  core/
    engine.py      Top-level run() entry point
    loop.py        Agent loop (provider -> tools -> repeat)
    session.py     JSONL session persistence
    context.py     Context window management + compaction
    config.py      Config loading (env, TOML, HARNESS.md)
    steering.py    Async steering channel for live message injection
  providers/
    anthropic.py   Claude adapter
    openai.py      GPT / OpenAI-compatible adapter
    google.py      Gemini adapter
    ollama.py      Ollama local model adapter
    registry.py    Model catalogue (50+ models)
  tools/           Read, Write, Edit, Bash, Glob, Grep, Task, Web, etc.
  agents/          Sub-agent registry + lifecycle manager
  hooks/           Pre/post tool-use hook system
  mcp/             MCP client + progressive tool discovery
  skills/          Skill loader (SKILL.md parser)
  memory/          Auto-memory + project instructions
  permissions/     Permission rules engine
  ui/              Rich terminal output + streaming + diffs
  eval/            SWE-bench, Harness-Bench, metrics, reports
  cli/             Click CLI entry point + subcommands
```
🧪 Evaluation
Run Benchmarks
```bash
# Quick validation — 8 tasks, ~$1
harness eval harness-bench --provider anthropic --model sonnet

# SWE-bench Lite — 300 real GitHub issues
harness eval swe-bench --split lite --max-tasks 10

# List benchmarks
harness eval list
```
Available Benchmarks
| Benchmark | Tasks | Description |
|---|---|---|
| Harness-Bench | 8 | Multi-file editing, error recovery, refactoring, analysis |
| SWE-bench Lite | 300 | Curated subset of real GitHub issues |
| SWE-bench Verified | 500 | Human-verified solvable issues |
| SWE-bench Full | 2,294 | Complete benchmark |
🔧 Development
```bash
git clone https://github.com/AgentBoardTT/openharness.git
cd openharness
uv pip install -e ".[dev]"
uv run pytest tests/ -v
uv run ruff check src/ tests/
```
📚 Tutorials
Learn Harness from zero to enterprise with our step-by-step tutorial series:
| # | Tutorial | Difficulty | Time |
|---|---|---|---|
| 1 | Getting Started — Your First AI Coding Agent | Beginner | 10 min |
| 2 | The Python SDK — Building AI-Powered Tools | Beginner | 15 min |
| 3 | Permission Modes & Safety Controls | Intermediate | 15 min |
| 4 | Budget Controls & Cost Optimization | Intermediate | 15 min |
| 5 | Policy-as-Code — Governing Agent Behavior | Intermediate | 20 min |
| 6 | Audit Logging & Compliance | Intermediate | 20 min |
| 7 | Sandboxed Execution | Intermediate | 20 min |
| 8 | Sub-Agents & Parallel Execution | Intermediate | 20 min |
| 9 | CI/CD Integration — AI Agent in Your Pipeline | Advanced | 25 min |
| 10 | Enterprise Production Deployment | Advanced | 30 min |
🤝 Contributing
We'd love your help. Here's how:
- Bug reports — Open an issue
- Feature requests — Open an issue
- Pull requests — Fork, branch, submit
Areas where we especially need help:
- New provider adapters
- Additional tools
- Benchmark tasks and evaluation
- Documentation and examples
📄 License
The best agent scaffold is an open one.