Race coding agents against each other on real tasks
coderace
Race coding agents against each other on real tasks in your repo.
Define a task. Run it against Claude Code, Codex, and Aider. Get a scored comparison table.
Install
pip install coderace
Quick Start
# Create a task template
coderace init fix-auth-bug
# Edit the task file (describe the bug, set test command)
# Then race the agents:
coderace run fix-auth-bug.yaml
# Or race them in parallel (uses git worktrees):
coderace run fix-auth-bug.yaml --parallel
# View results from the last run
coderace results fix-auth-bug.yaml
Task Format
name: fix-auth-bug
description: |
  The login endpoint returns 500 when email contains a plus sign.
  Fix the email validation in auth/validators.py.
repo: .
test_command: pytest tests/test_auth.py -x
lint_command: ruff check .
timeout: 300
agents:
  - claude
  - codex
  - aider
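As a rough illustration of how a parsed task file might be checked before a run, here is a minimal Python sketch. The `Task` dataclass, its defaults, and the `KNOWN_AGENTS` set are assumptions made for this example (the agent names are taken from the Supported Agents table); coderace's own loader may work differently.

```python
from dataclasses import dataclass
from typing import Optional

# Assumption: agent identifiers match the CLI names in the Supported Agents table.
KNOWN_AGENTS = {"claude", "codex", "aider", "gemini"}


@dataclass
class Task:
    """Hypothetical in-memory form of a task file after YAML parsing."""
    name: str
    description: str
    test_command: str
    agents: list
    repo: str = "."                      # assumed default: current directory
    lint_command: Optional[str] = None   # lint is optional per the README
    timeout: int = 300                   # value from the sample task file

    def __post_init__(self):
        unknown = set(self.agents) - KNOWN_AGENTS
        if unknown:
            raise ValueError(f"unknown agents: {sorted(unknown)}")
        if not self.agents:
            raise ValueError("at least one agent is required")
        if self.timeout <= 0:
            raise ValueError("timeout must be positive")


# Usage: build a Task from the dict a YAML parser would produce.
parsed = {
    "name": "fix-auth-bug",
    "description": "The login endpoint returns 500 when email contains a plus sign.",
    "repo": ".",
    "test_command": "pytest tests/test_auth.py -x",
    "lint_command": "ruff check .",
    "timeout": 300,
    "agents": ["claude", "codex", "aider"],
}
task = Task(**parsed)
```

Validating up front means a typo in an agent name fails before any agent is invoked.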
What It Does
For each agent in the task:
- Creates a fresh git branch (coderace/<agent>-<task>)
- Invokes the agent CLI with the task description
- Runs your test command
- Runs your lint command (optional)
- Computes a composite score
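The per-agent steps above can be sketched in Python. This is an illustration under assumptions, not coderace's actual implementation: `branch_name`, `run_step`, and `evaluate` are hypothetical names, branch creation is only noted in a comment, and the lines-changed measurement is left out.

```python
import subprocess
import sys
import time


def branch_name(agent: str, task: str) -> str:
    """Branch naming pattern from the README: coderace/<agent>-<task>."""
    return f"coderace/{agent}-{task}"


def run_step(argv, timeout, cwd="."):
    """Run one command; return (exited 0 within the timeout, elapsed seconds)."""
    start = time.monotonic()
    try:
        proc = subprocess.run(argv, cwd=cwd, timeout=timeout, capture_output=True)
        ok = proc.returncode == 0
    except subprocess.TimeoutExpired:
        ok = False  # a timeout counts as a failed step
    return ok, time.monotonic() - start


def evaluate(agent_argv, test_argv, lint_argv=None, timeout=300.0):
    """Hypothetical flow for one agent: agent run, then tests, then optional lint.

    A real run would first create the branch, e.g.
    `git checkout -b coderace/<agent>-<task>`.
    """
    exit_clean, wall_time = run_step(agent_argv, timeout)
    tests_pass, _ = run_step(test_argv, timeout)
    lint_clean = run_step(lint_argv, timeout)[0] if lint_argv else None
    return {
        "exit_clean": exit_clean,
        "tests_pass": tests_pass,
        "lint_clean": lint_clean,  # None when no lint command is configured
        "wall_time": wall_time,
    }
```

Passing commands as argument lists avoids shell quoting problems when the task description contains quotes or newlines.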
Scoring
| Metric | Weight | Description |
|---|---|---|
| Tests pass | 40% | Did the test command exit 0? |
| Exit clean | 20% | Did the agent itself exit 0 without timeout? |
| Lint clean | 15% | Did the lint command exit 0? |
| Wall time | 15% | Faster is better (normalized across agents) |
| Lines changed | 10% | Fewer is better (normalized across agents) |
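To make the weighting concrete, here is a minimal Python sketch of a composite score. The weights come from the table above; the normalization is an assumption (min-max across agents, with the lowest value scoring 1.0), since the README only says the metrics are "normalized across agents". The function names are hypothetical, and the sketch is not guaranteed to reproduce the numbers in the sample output.

```python
# Weights from the Scoring table.
WEIGHTS = {"tests": 0.40, "exit": 0.20, "lint": 0.15, "time": 0.15, "lines": 0.10}


def lower_is_better(value, all_values):
    """Assumed min-max normalization: lowest value -> 1.0, highest -> 0.0."""
    lo, hi = min(all_values), max(all_values)
    if hi == lo:
        return 1.0  # all agents tied on this metric
    return (hi - value) / (hi - lo)


def composite_scores(results):
    """Map agent name -> score on a 0-100 scale.

    `results` maps agent name to a dict with boolean `tests`, `exit`, `lint`
    and numeric `time` (seconds) and `lines` (lines changed).
    """
    times = [r["time"] for r in results.values()]
    lines = [r["lines"] for r in results.values()]
    scores = {}
    for agent, r in results.items():
        total = (
            WEIGHTS["tests"] * float(r["tests"])
            + WEIGHTS["exit"] * float(r["exit"])
            + WEIGHTS["lint"] * float(r["lint"])
            + WEIGHTS["time"] * lower_is_better(r["time"], times)
            + WEIGHTS["lines"] * lower_is_better(r["lines"], lines)
        )
        scores[agent] = round(100 * total, 1)
    return scores
```

Because tests carry 40% of the weight, an agent that fails the test command cannot score above 60 under this scheme, however fast or small its change.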
Output
Terminal table with Rich formatting:
┌──────┬────────┬───────┬───────┬──────┬──────┬──────────┬───────┐
│ Rank │ Agent  │ Score │ Tests │ Exit │ Lint │ Time (s) │ Lines │
├──────┼────────┼───────┼───────┼──────┼──────┼──────────┼───────┤
│ 1    │ claude │ 85.0  │ PASS  │ PASS │ PASS │ 10.5     │ 42    │
│ 2    │ codex  │ 70.0  │ PASS  │ PASS │ FAIL │ 15.2     │ 98    │
│ 3    │ aider  │ 55.0  │ FAIL  │ PASS │ PASS │ 8.1      │ 31    │
└──────┴────────┴───────┴───────┴──────┴──────┴──────────┴───────┘
Results also saved as JSON in .coderace/<task>-results.json.
Supported Agents
| Agent | CLI | Command |
|---|---|---|
| Claude Code | claude | claude --print --output-format json -p "<task>" |
| Codex | codex | codex --quiet --full-auto -p "<task>" |
| Aider | aider | aider --message "<task>" --yes --no-auto-commits |
| Gemini CLI | gemini | gemini --non-interactive -p "<task>" |
Each agent must be installed and authenticated separately.
Parallel Mode
Use --parallel (or -p) to run all agents simultaneously using git worktrees. Each agent gets its own isolated working directory, so they don't interfere with each other.
coderace run task.yaml --parallel
Sequential mode (default) runs agents one at a time on the same repo.
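The worktree approach can be sketched as follows. This is a hedged illustration, not coderace's code: the `.coderace/worktrees` location, `worktree_add_argv`, and `run_parallel` are assumptions, and `run_one` stands in for whatever runs a single agent inside its worktree. The `git worktree add -b <branch> <path>` command itself is standard Git.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def worktree_add_argv(agent, task, root=".coderace/worktrees"):
    """Build the git command that creates an isolated worktree on a new branch.

    The root directory is a hypothetical choice for this sketch.
    """
    branch = f"coderace/{agent}-{task}"
    path = (Path(root) / f"{agent}-{task}").as_posix()
    return ["git", "worktree", "add", "-b", branch, path]


def run_parallel(agents, task, run_one):
    """Call run_one(agent, task) for every agent concurrently.

    Returns a dict mapping agent name to whatever run_one returned.
    """
    with ThreadPoolExecutor(max_workers=max(1, len(agents))) as pool:
        futures = {agent: pool.submit(run_one, agent, task) for agent in agents}
        return {agent: future.result() for agent, future in futures.items()}
```

Threads are sufficient here because each agent is an external process; the Python side mostly waits on subprocesses.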
Requirements
- Python 3.10+
- Git
- At least one coding agent CLI installed
License
MIT