Production-ready autonomous coding harness using Claude Code SDK
claude-harness
Production-ready autonomous coding harness using Claude Code SDK. Build complete applications autonomously with a two-agent pattern (initializer + coding agents).
Proven Success: Built SHERPA v1.0 - 165 features, production-ready, A- grade quality.
Prerequisites
Required: Install the latest versions of both Claude Code and the Claude Agent SDK:
# Install Claude Code CLI (latest version required)
npm install -g @anthropic-ai/claude-code
# Install Python dependencies
pip install -r requirements.txt
Verify your installations:
claude --version # Should be latest version
pip show claude-code-sdk # Check SDK is installed
OAuth Token: Generate and set your Claude Code OAuth token:
# Generate the token using Claude Code CLI
claude setup-token
# Set the environment variable
export CLAUDE_CODE_OAUTH_TOKEN='your-oauth-token-here'
Installation
# Install from source
cd /path/to/claude-harness
pip install -e .
# Verify installation
claude-harness --version
Quick Start
# Set OAuth token (required)
export CLAUDE_CODE_OAUTH_TOKEN='your-token-here'
# Build a new app
claude-harness --project-dir ./my_project
# Test with limited iterations
claude-harness --project-dir ./my_project --max-iterations 3
# Enhancement mode (existing projects)
claude-harness --mode enhancement --project-dir ./existing-app --spec ./features.txt
Read the full User Guide.
What's New in v3.1.0
Production Reliability Features:
- Triple Timeout Protection - 15/10/120 min timeouts prevent infinite hangs
- Retry + Skip Logic - Auto-retry failed features (3 attempts), skip after max failures
- Loop Detection - Prevents infinite loops and repeated file reads
- Comprehensive Error Logging - Structured error logs in .claude/errors.json
- E2E Validation Enforcement - Commits blocked without E2E tests (CRITICAL BUG FIX)
- MCP Auto-Configuration - Context7 and Puppeteer servers pre-configured
- Security Hooks - Secrets scanning, bash allowlist, filesystem restrictions
Full v3.1.0 changelog.
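The retry and skip behavior listed above can be pictured with a minimal sketch. This is not the harness's actual implementation: `run_features` and the `attempt` callback are hypothetical names, and only the limit of 3 attempts comes from the documented `--max-retries` default. Timeouts and loop detection are not modeled.

```python
from typing import Callable, Dict, List

MAX_RETRIES = 3  # mirrors the documented --max-retries default


def run_features(
    features: List[str],
    attempt: Callable[[str], bool],
    max_retries: int = MAX_RETRIES,
) -> Dict[str, str]:
    """Try each feature up to max_retries times; mark it skipped after that."""
    status: Dict[str, str] = {}
    for name in features:
        status[name] = "skipped"  # stays skipped unless an attempt succeeds
        for _ in range(max_retries):
            if attempt(name):
                status[name] = "passed"
                break
    return status
```

A skipped feature does not stop the run; the loop simply moves on to the next one, which is the property that prevents a single failing feature from blocking the whole build.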
Important Timing Expectations
Warning: This demo takes a long time to run!
- First session (initialization): The agent generates a feature_list.json with 200 test cases. This takes several minutes and may appear to hang. This is normal: the agent is writing out all the features.
- Subsequent sessions: Each coding iteration can take 5-15 minutes depending on complexity.
- Full app: Building all 200 features typically requires many hours of total runtime across multiple sessions.
Tip: The 200 features parameter in the prompts is designed for comprehensive coverage. If you want faster demos, you can modify prompts/initializer_prompt.md to reduce the feature count (e.g., 20-50 features for a quicker demo).
How It Works
Two-Agent Pattern
- Initializer Agent (Session 1): Reads app_spec.txt, creates feature_list.json with 200 test cases, sets up project structure, and initializes git.
- Coding Agent (Sessions 2+): Picks up where the previous session left off, implements features one by one, and marks them as passing in feature_list.json.
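One way to picture the hand-off between the two agents is a check on whether initialization has already happened. The sketch below assumes that the presence of feature_list.json in the project directory is the signal; the harness may decide differently, and `choose_prompt` is a hypothetical helper, not part of the package.

```python
from pathlib import Path


def choose_prompt(project_dir: Path) -> str:
    """Return the prompt file name for the next session (hypothetical helper)."""
    # Assumption: an existing feature_list.json means the initializer already ran.
    if (project_dir / "feature_list.json").exists():
        return "coding_prompt.md"
    return "initializer_prompt.md"
```

The returned names match the files under prompts/ in the project structure.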
Session Management
- Each session runs with a fresh context window
- Progress is persisted via feature_list.json and git commits
- The agent auto-continues between sessions (3 second delay)
- Press Ctrl+C to pause; run the same command to resume
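Because progress lives in feature_list.json, you can check how far a paused run has gotten without starting a session. The following is a rough sketch that assumes the file is a JSON array of objects, each with a boolean "passes" key. That schema is an assumption, not something this README specifies, so adjust the key to match your generated file.

```python
import json
from pathlib import Path
from typing import Tuple


def count_progress(feature_file: Path) -> Tuple[int, int]:
    """Return (passing, total) for a feature list file."""
    # Assumed schema: a JSON array of objects with a boolean "passes" key.
    features = json.loads(feature_file.read_text(encoding="utf-8"))
    passing = sum(1 for feature in features if feature.get("passes") is True)
    return passing, len(features)
```

For example, `count_progress(Path("my_project/feature_list.json"))` would return a pair such as passing count and total count that you can print between sessions.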
Security Model
This demo uses a defense-in-depth security approach (see security.py and client.py):
- OS-level Sandbox: Bash commands run in an isolated environment
- Filesystem Restrictions: File operations restricted to the project directory only
- Bash Allowlist: Only specific commands are permitted:
  - File inspection: ls, cat, head, tail, wc, grep
  - Node.js: npm, node
  - Version control: git
  - Process management: ps, lsof, sleep, pkill (dev processes only)
Commands not in the allowlist are blocked by the security hook.
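To make the allowlist idea concrete, here is a simplified sketch of how such a check could work. It is not the code in security.py: `is_allowed` is a hypothetical function, the command set is copied from the list above, and the extra "dev processes only" rule for pkill is not modeled. The sketch is deliberately conservative and rejects redirections, subshells, and quoted separators.

```python
import shlex

# Command names taken from the allowlist described above.
ALLOWED_COMMANDS = {
    "ls", "cat", "head", "tail", "wc", "grep",
    "npm", "node", "git", "ps", "lsof", "sleep", "pkill",
}
SEPARATORS = {"|", "||", "&&", ";"}
PUNCTUATION = set("();<>|&")


def is_allowed(command: str) -> bool:
    """Return True only if every command in a pipeline or chain is allowlisted."""
    try:
        lexer = shlex.shlex(command, posix=True, punctuation_chars=True)
        lexer.whitespace_split = True
        tokens = list(lexer)
    except ValueError:
        return False  # unbalanced quotes and similar parse errors
    expect_command = True  # the next word must be a command name
    for token in tokens:
        if token in SEPARATORS:
            if expect_command:
                return False  # empty segment, e.g. "| ls"
            expect_command = True
        elif set(token) <= PUNCTUATION:
            return False  # redirections, subshells, background jobs
        elif expect_command:
            if token not in ALLOWED_COMMANDS:
                return False
            expect_command = False
    return not expect_command  # also rejects empty input and trailing separators
```

Checking every segment, not just the first word, matters because a chain like `ls; curl ...` would otherwise slip through on the strength of `ls` alone.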
Project Structure
claude-harness/
├── autonomous_agent.py       # Main entry point
├── agent.py                  # Agent session logic
├── client.py                 # Claude SDK client configuration
├── security.py               # Bash command allowlist and validation
├── progress.py               # Progress tracking utilities
├── prompts.py                # Prompt loading utilities
├── prompts/
│   ├── app_spec.txt          # Application specification
│   ├── initializer_prompt.md # First session prompt
│   └── coding_prompt.md      # Continuation session prompt
└── requirements.txt          # Python dependencies
Generated Project Structure
After running, your project directory will contain:
my_project/
├── feature_list.json         # Test cases (source of truth)
├── app_spec.txt              # Copied specification
├── init.sh                   # Environment setup script
├── claude-progress.txt       # Session progress notes
├── .claude_settings.json     # Security settings
└── [application files]       # Generated application code
Running the Generated Application
After the agent completes (or pauses), you can run the generated application:
cd generations/my_project
# Run the setup script created by the agent
./init.sh
# Or manually (typical for Node.js apps):
npm install
npm run dev
The application will typically be available at http://localhost:3000 or similar (check the agent's output or init.sh for the exact URL).
Command Line Options
| Option | Description | Default |
|---|---|---|
| --project-dir | Directory for the project | ./autonomous_demo_project |
| --mode | Mode: greenfield/enhancement/bugfix | greenfield |
| --spec | Specification file path | None |
| --max-iterations | Max agent iterations | Unlimited |
| --model | Claude model to use | claude-sonnet-4-5-20250929 |
| --session-timeout | Session timeout (minutes) | 120 |
| --stall-timeout | Stall timeout (minutes) | 10 |
| --max-retries | Max retry attempts per feature | 3 |
| --version | Show version and exit | - |
| --help | Show help and exit | - |
Full command reference in the User Guide.
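For readers who want to wrap or script the CLI, the options table maps naturally onto an argparse definition. The sketch below mirrors the option names and defaults from the table only. It is an illustration, not the harness's own parser, and `build_parser` is a hypothetical name. "Unlimited" is represented as None, and --version is omitted because it depends on the installed package.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented options (illustrative only)."""
    parser = argparse.ArgumentParser(prog="claude-harness")
    parser.add_argument("--project-dir", default="./autonomous_demo_project",
                        help="Directory for the project")
    parser.add_argument("--mode", default="greenfield",
                        choices=["greenfield", "enhancement", "bugfix"])
    parser.add_argument("--spec", default=None, help="Specification file path")
    parser.add_argument("--max-iterations", type=int, default=None,
                        help="Max agent iterations (None means unlimited)")
    parser.add_argument("--model", default="claude-sonnet-4-5-20250929")
    parser.add_argument("--session-timeout", type=int, default=120,
                        help="Session timeout in minutes")
    parser.add_argument("--stall-timeout", type=int, default=10,
                        help="Stall timeout in minutes")
    parser.add_argument("--max-retries", type=int, default=3,
                        help="Max retry attempts per feature")
    return parser
```

Parsing `["--project-dir", "./my_project", "--max-iterations", "3"]` with this parser yields the same settings as the second Quick Start command, with every other option left at its default.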
Customization
Changing the Application
Edit prompts/app_spec.txt to specify a different application to build.
Adjusting Feature Count
Edit prompts/initializer_prompt.md and change the "200 features" requirement to a smaller number for faster demos.
Modifying Allowed Commands
Edit security.py to add or remove commands from ALLOWED_COMMANDS.
Troubleshooting
"Appears to hang on first run"
This is normal. The initializer agent is generating 200 detailed test cases, which takes significant time. Watch for [Tool: ...] output to confirm the agent is working.
"Command blocked by security hook"
The agent tried to run a command not in the allowlist. This is the security system working as intended. If needed, add the command to ALLOWED_COMMANDS in security.py.
"OAuth token not set"
Run claude setup-token to generate your token, then ensure CLAUDE_CODE_OAUTH_TOKEN is exported in your shell environment.
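If you launch the harness from a script, a quick pre-flight check can catch a missing token before a long run starts. The sketch below is only a suggestion: `token_problem` is a hypothetical helper, and the placeholder strings it looks for are the example values used earlier in this README.

```python
import os
from typing import Mapping, Optional

TOKEN_VAR = "CLAUDE_CODE_OAUTH_TOKEN"
PLACEHOLDERS = {"your-oauth-token-here", "your-token-here"}


def token_problem(env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return a description of the problem, or None if the token looks set."""
    env = os.environ if env is None else env
    value = env.get(TOKEN_VAR, "").strip()
    if not value:
        return f"{TOKEN_VAR} is not set; run `claude setup-token` and export it."
    if value in PLACEHOLDERS:
        return f"{TOKEN_VAR} still holds the placeholder value from the docs."
    return None
```

This only checks that a value is present; it does not verify that the token is valid.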
License
Internal Anthropic use.