coderace

Race coding agents against each other on real tasks in your repo.

Define a task. Run it against Claude Code, Codex, Aider, or Gemini CLI. Get a scored comparison table.

Install

pip install coderace

Quick Start

# Create a task template
coderace init fix-auth-bug

# Edit the task file (describe the bug, set test command)
# Then race the agents:
coderace run fix-auth-bug.yaml

# Or race them in parallel (uses git worktrees):
coderace run fix-auth-bug.yaml --parallel

# View results from the last run
coderace results fix-auth-bug.yaml

Task Format

name: fix-auth-bug
description: |
  The login endpoint returns 500 when email contains a plus sign.
  Fix the email validation in auth/validators.py.
repo: .
test_command: pytest tests/test_auth.py -x
lint_command: ruff check .
timeout: 300
agents:
  - claude
  - codex
  - aider

What It Does

For each agent in the task:

  1. Creates a fresh git branch (coderace/<agent>-<task>)
  2. Invokes the agent CLI with the task description
  3. Runs your test command
  4. Runs your lint command (optional)
  5. Computes a composite score

Scoring

Metric         Weight  Description
Tests pass     40%     Did the test command exit 0?
Exit clean     20%     Did the agent itself exit 0 without timeout?
Lint clean     15%     Did the lint command exit 0?
Wall time      15%     Faster is better (normalized across agents)
Lines changed  10%     Fewer is better (normalized across agents)
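As a rough illustration of how these weights could combine, here is a small Python sketch. The weights come from the table above; the normalization (best value divided by each agent's value) is an assumption, since this README does not specify how wall time and lines changed are normalized. The function names and the result dictionary keys are hypothetical.

```python
# Weights from the scoring table, in points out of 100.
WEIGHTS = {"tests": 40.0, "exit": 20.0, "lint": 15.0, "time": 15.0, "lines": 10.0}


def normalize_lower_is_better(values):
    """Map values to [0, 1] so the smallest value scores 1.0.

    Assumed scheme: best / value. coderace may normalize differently.
    """
    best = min(values)
    return [1.0 if v == 0 else best / v for v in values]


def composite_scores(results):
    """Return {agent: score} for a list of per-agent result dicts."""
    times = normalize_lower_is_better([r["time"] for r in results])
    lines = normalize_lower_is_better([r["lines"] for r in results])
    scores = {}
    for r, t, n in zip(results, times, lines):
        score = (
            WEIGHTS["tests"] * r["tests_pass"]
            + WEIGHTS["exit"] * r["exit_clean"]
            + WEIGHTS["lint"] * r["lint_clean"]
            + WEIGHTS["time"] * t
            + WEIGHTS["lines"] * n
        )
        scores[r["agent"]] = round(score, 1)
    return scores
```

Under this assumed scheme, the fastest agent earns the full 15 points for wall time and the agent with the smallest diff earns the full 10 points for lines changed; the other agents earn a proportional share.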

Output

Results are printed as a terminal table, formatted with Rich:

┌──────┬────────┬───────┬───────┬──────┬──────┬──────────┬───────┐
│ Rank │ Agent  │ Score │ Tests │ Exit │ Lint │ Time (s) │ Lines │
├──────┼────────┼───────┼───────┼──────┼──────┼──────────┼───────┤
│  1   │ claude │  85.0 │ PASS  │ PASS │ PASS │     10.5 │    42 │
│  2   │ codex  │  70.0 │ PASS  │ PASS │ FAIL │     15.2 │    98 │
│  3   │ aider  │  55.0 │ FAIL  │ PASS │ PASS │      8.1 │    31 │
└──────┴────────┴───────┴───────┴──────┴──────┴──────────┴───────┘

Results are also saved as JSON in .coderace/<task>-results.json.

Supported Agents

Agent        CLI     Command
Claude Code  claude  claude --print --output-format json -p "<task>"
Codex        codex   codex --quiet --full-auto -p "<task>"
Aider        aider   aider --message "<task>" --yes --no-auto-commits
Gemini CLI   gemini  gemini --non-interactive -p "<task>"

Each agent must be installed and authenticated separately.
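Before running a race, it can help to check which agent CLIs are actually on your PATH. The sketch below is not part of coderace; `AGENT_CLIS` and `installed_agents` are hypothetical names, and using `gemini` as an agent key is an assumption based on the CLI column above. It checks only for installation, since authentication cannot be verified this way.

```python
import shutil

# Agent key -> CLI executable name, taken from the table above.
# The "gemini" key is an assumption; the README only shows the CLI name.
AGENT_CLIS = {
    "claude": "claude",
    "codex": "codex",
    "aider": "aider",
    "gemini": "gemini",
}


def installed_agents(agents, which=shutil.which):
    """Return the agents whose CLI executable is found on PATH.

    `which` is injectable so the lookup can be replaced in tests.
    """
    return [
        agent
        for agent in agents
        if agent in AGENT_CLIS and which(AGENT_CLIS[agent]) is not None
    ]


if __name__ == "__main__":
    print(installed_agents(["claude", "codex", "aider", "gemini"]))
```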

Parallel Mode

Use --parallel (or -p) to run all agents simultaneously using git worktrees. Each agent gets its own isolated working directory, so they don't interfere with each other.

coderace run task.yaml --parallel

Sequential mode (default) runs agents one at a time on the same repo.
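To illustrate the isolation that worktrees provide, the following Python sketch builds one `git worktree add` command per agent. It only constructs the commands and does not run them. The worktree directory layout (`.coderace/worktrees`) and the function name are assumptions for illustration; coderace may place its worktrees elsewhere.

```python
from pathlib import Path


def worktree_commands(agents, task, base_dir=".coderace/worktrees"):
    """Build a `git worktree add -b <branch> <path>` command per agent.

    Each agent gets its own branch and its own directory, so concurrent
    edits never touch the same working tree. base_dir is a hypothetical
    location chosen for this sketch.
    """
    commands = {}
    for agent in agents:
        branch = f"coderace/{agent}-{task}"
        path = (Path(base_dir) / f"{agent}-{task}").as_posix()
        commands[agent] = ["git", "worktree", "add", "-b", branch, path]
    return commands


if __name__ == "__main__":
    for cmd in worktree_commands(["claude", "codex"], "fix-auth-bug").values():
        print(" ".join(cmd))
```

Each command list could be passed to `subprocess.run` from inside the repository; a worktree can be cleaned up afterwards with `git worktree remove <path>`.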

Requirements

  • Python 3.10+
  • Git
  • At least one coding agent CLI installed

License

MIT
