Know what an AI task will cost before you run it

agent-estimate

agent-estimate tells you how long an AI agent will take — and how that compares to doing it yourself.

$ agent-estimate estimate "Implement OAuth 2.0 flow (Google + GitHub)"
| Metric | Value |
| --- | --- |
| Expected case | 75.4m |
| Human-speed equivalent | 190.9m |
| Compression ratio | 2.53x |

One command. Three numbers. Now you know whether to dispatch an agent or do it yourself.

See it in action

Single task — coding

$ agent-estimate estimate "Implement OAuth 2.0 flow (Google + GitHub)"
| Metric | Value |
| --- | --- |
| Task | Implement OAuth 2.0 flow (Google + GitHub) |
| Tier / Agent | M / Claude |
| Base PERT (O/M/P) | 25m / 50m / 90m (E=52.5m) |
| Best case | 44.7m |
| Expected case | 75.4m |
| Worst case | 117.2m |
| Human-speed equivalent | 190.9m |
| Compression ratio | 2.53x |
| Review overhead | +15m (standard) |
| Estimated cost | $1.45 |

A medium coding task. Your agent finishes in ~75 minutes. Doing it yourself? ~3 hours. (Full output)
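The compression ratio in these tables appears to be the human-speed equivalent divided by the expected case. That is an inference from the numbers shown, not a documented formula, and the function name below is illustrative only. A minimal sketch:

```python
def compression_ratio(human_minutes: float, agent_minutes: float) -> float:
    """Human-speed equivalent divided by the agent's expected case.

    Assumption: this is how the reported ratio is derived; it reproduces
    the figures in the examples but is not taken from the tool's source.
    """
    if agent_minutes <= 0:
        raise ValueError("agent_minutes must be positive")
    return human_minutes / agent_minutes


# Figures from the OAuth example above.
print(f"{compression_ratio(190.9, 75.4):.2f}x")  # 2.53x
```

A ratio above 1 means the agent is expected to finish faster than a human working at normal speed.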

Single task — research

$ agent-estimate estimate "Audit dependencies for known CVEs" --type research
| Metric | Value |
| --- | --- |
| Task | Audit dependencies for known CVEs |
| Tier / Agent | S / Claude |
| Base PERT (O/M/P) | 10m / 20m / 30m (E=20m) |
| Expected case | 38m |
| Human-speed equivalent | 99m |
| Compression ratio | 2.61x |
| Estimated cost | $0.55 |

Research tasks have high human-multipliers — pattern matching across hundreds of dependencies is exactly where agents shine. (Full output)

Multi-agent session — 3 tasks in parallel

$ agent-estimate estimate --file tasks.txt

Where tasks.txt contains:

Implement OAuth 2.0 flow (Google + GitHub)
Write unit tests for OAuth flow
Write API reference for auth module

| Task | Tier | Agent | Expected | Human Equiv |
| --- | --- | --- | --- | --- |
| Implement OAuth 2.0 flow | M | Codex | 52.5m | 190.9m |
| Write unit tests for OAuth flow | M | Gemini | 52.5m | 190.9m |
| Write API reference for auth module | L | Claude | 100.8m | 327.6m |

| Metric | Value |
| --- | --- |
| Wave 0 | All 3 tasks in parallel (Claude + Codex + Gemini) |
| Expected case | 131m |
| Human-speed equivalent | 709.5m |
| Compression ratio | 5.42x |
| Estimated cost | $4.84 |

Three agents working in parallel. ~2 hours wall clock vs ~12 hours sequential human work. That's the power of fleet estimation — you see the compression before you commit the compute. (Full output)
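To see where the session numbers come from, the sketch below adds up the per-task human equivalents and takes the longest agent task as a lower bound on parallel wall-clock time. The helper names are hypothetical. The reported 131m expected case is higher than that lower bound; the table does not itemize the difference, so it is not modeled here.

```python
# (agent expected minutes, human-equivalent minutes), copied from the table above.
TASKS = {
    "Implement OAuth 2.0 flow": (52.5, 190.9),
    "Write unit tests for OAuth flow": (52.5, 190.9),
    "Write API reference for auth module": (100.8, 327.6),
}


def sequential_human_minutes(tasks: dict) -> float:
    """A human does the tasks one after another, so the times add up."""
    return sum(human for _agent, human in tasks.values())


def parallel_floor_minutes(tasks: dict) -> float:
    """With one agent per task, the wave cannot finish before its longest task."""
    return max(agent for agent, _human in tasks.values())


human_total = sequential_human_minutes(TASKS)
floor = parallel_floor_minutes(TASKS)
print(f"human, sequential: {human_total:.1f}m")       # 709.4m (reported as 709.5m, likely rounding)
print(f"agents, parallel floor: {floor:.1f}m")        # 100.8m
print(f"reported compression: {709.5 / 131:.2f}x")    # 5.42x
```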

More examples in examples/ — coding S/M, research, documentation, and multi-agent sessions.

Try it now

pip install agent-estimate
agent-estimate estimate "your task description here"

That's it. No config needed — sensible defaults for a 3-agent fleet (Claude, Codex, Gemini).

What it does

agent-estimate produces three-point PERT estimates calibrated for AI agents, not humans:

  • Tier classification — auto-sizes tasks as XS/S/M/L/XL based on complexity signals
  • PERT math — optimistic, most-likely, pessimistic with weighted expected value
  • Human comparison — multiplier per task type (coding, research, docs) so you see the compression
  • METR thresholds — warns when estimates exceed model reliability limits (METR p80 benchmarks)
  • Wave planning — dependency-aware scheduling across multiple agents with parallelism
  • Review overhead — models review cycles as flat additive cost, amortized per agent per wave
  • Modifiers — tune estimates with --spec-clarity, --warm-context, --agent-fit
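The "weighted expected value" in the PERT bullet is consistent with the classic PERT formula, E = (O + 4M + P) / 6, which reproduces the Base PERT rows in the examples above. The sketch below covers only that base calculation. How the tool gets from the base value to the final expected case (for example 52.5m to 75.4m) is not described here and is not modeled.

```python
def pert_expected(optimistic: float, most_likely: float, pessimistic: float) -> float:
    """Classic PERT weighted mean: the most-likely value counts four times."""
    if not optimistic <= most_likely <= pessimistic:
        raise ValueError("expected optimistic <= most_likely <= pessimistic")
    return (optimistic + 4 * most_likely + pessimistic) / 6


print(pert_expected(25, 50, 90))  # 52.5, the coding example's base E
print(pert_expected(10, 20, 30))  # 20.0, the research example's base E
```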

Supported task types

| Type | Flag | What it models |
| --- | --- | --- |
| Coding | (default) | Feature work, bug fixes, refactors — tier-based PERT |
| Research | --type research | Audits, investigations, analysis — flat PERT with depth scaling |
| Documentation | --type documentation | API docs, guides, changelogs |
| Brainstorm | --type brainstorm | Ideation, spikes, design exploration |
| Config/SRE | --type config | Deploys, infra changes, CI/CD work |

Integrations

Claude Code Plugin

/plugin marketplace add kiloloop/agent-estimate
/plugin install agent-estimate@agent-estimate-marketplace

Then use directly in Claude Code sessions:

/estimate Add a login page with OAuth
/estimate --file spec.md
/estimate --issues 1,2,3 --repo myorg/myrepo
/validate-estimate observation.yaml
/calibrate

GitHub Action

- uses: kiloloop/agent-estimate@v0
  with:
    issues: '11,12,14'

Full workflow example

name: Estimate
on:
  pull_request:
    types: [opened, synchronize]

permissions:
  contents: read
  pull-requests: write

jobs:
  estimate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: kiloloop/agent-estimate@v0
        with:
          issues: '11,12,14'
          output-mode: summary+pr-comment

Action inputs and outputs

| Input | Required | Default | Description |
| --- | --- | --- | --- |
| issues | yes | | GitHub issue numbers (comma-separated) |
| repo | no | current repo | GitHub repo (owner/name) |
| format | no | markdown | Output format: markdown or json |
| output-mode | no | summary | summary, pr-comment, step-output, summary+pr-comment |
| config | no | | Path to agent config YAML |
| title | no | Agent Estimate Report | Report title |
| review-mode | no | standard | Review tier: none, standard, complex |
| spec-clarity | no | 1.0 | Spec clarity modifier (0.3-1.3) |
| warm-context | no | 1.0 | Warm context modifier (0.3-1.15) |
| agent-fit | no | 1.0 | Agent fit modifier (0.9-1.2) |
| task-type | no | | Task category: coding, brainstorm, research, config, documentation |
| python-version | no | 3.12 | Python version to use |
| version | no | latest | agent-estimate version to install |
| token | no | ${{ github.token }} | GitHub token |

| Output | Description |
| --- | --- |
| report | Full estimation report content |
| expected-minutes | Expected minutes (when format: json) |
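The three modifier inputs each have a documented range. If you generate workflow inputs programmatically, a small pre-flight check can catch out-of-range values before the action runs. This is a sketch: `check_modifiers` is a hypothetical helper, not part of agent-estimate, and it assumes the documented ranges are inclusive.

```python
# Ranges copied from the inputs table above; treated as inclusive (an assumption).
MODIFIER_RANGES = {
    "spec-clarity": (0.3, 1.3),
    "warm-context": (0.3, 1.15),
    "agent-fit": (0.9, 1.2),
}


def check_modifiers(values: dict) -> list:
    """Return one message per unknown or out-of-range modifier (empty list if all pass)."""
    problems = []
    for name, value in values.items():
        if name not in MODIFIER_RANGES:
            problems.append(f"{name}: unknown modifier")
            continue
        low, high = MODIFIER_RANGES[name]
        if not low <= value <= high:
            problems.append(f"{name}: {value} is outside {low}-{high}")
    return problems


print(check_modifiers({"spec-clarity": 1.0, "warm-context": 1.0, "agent-fit": 1.0}))  # []
print(check_modifiers({"agent-fit": 0.5}))  # ['agent-fit: 0.5 is outside 0.9-1.2']
```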

Codex Skill

The Codex-specific skill lives at .agent/skills/estimate/SKILL.md. The Claude plugin skill lives at skills/estimate/SKILL.md.

Configuration

Agent fleet

Pass a custom config to model your own agent fleet:

agents:
  - name: Claude
    capabilities: [planning, implementation, review]
    parallelism: 2
    cost_per_turn: 0.12
    model_tier: frontier
  - name: Codex
    capabilities: [implementation, debugging, testing]
    parallelism: 3
    cost_per_turn: 0.08
    model_tier: production
settings:
  friction_multiplier: 1.15
  inter_wave_overhead: 0.25
  review_overhead: 0.2
  metr_fallback_threshold: 45.0

agent-estimate estimate "Ship packaging flow" --config ./my_agents.yaml

Default METR thresholds

| Model | p80 threshold |
| --- | --- |
| Opus 4.6 | 90 minutes |
| GPT-5.4 | 60 minutes |
| Gemini 3.1 Pro | 45 minutes |
| Sonnet 4.6 | 30 minutes |
| Haiku 4.5 | 15 minutes |
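The threshold warning can be pictured as a simple comparison between an estimate and the model's p80 limit. The sketch below is illustrative, not the tool's actual logic: the function name is hypothetical, and it assumes that the `metr_fallback_threshold` value from the sample config (45.0) applies to models missing from the table.

```python
# p80 thresholds in minutes, copied from the table above.
P80_THRESHOLD_MINUTES = {
    "Opus 4.6": 90,
    "GPT-5.4": 60,
    "Gemini 3.1 Pro": 45,
    "Sonnet 4.6": 30,
    "Haiku 4.5": 15,
}


def exceeds_threshold(model: str, expected_minutes: float, fallback: float = 45.0) -> bool:
    """True when the estimate is longer than the model's p80 threshold.

    Assumption: models not listed in the table use `fallback`.
    """
    limit = P80_THRESHOLD_MINUTES.get(model, fallback)
    return expected_minutes > limit


# The 75.4m OAuth estimate fits within Opus 4.6's limit but not Sonnet 4.6's.
print(exceeds_threshold("Opus 4.6", 75.4))    # False
print(exceeds_threshold("Sonnet 4.6", 75.4))  # True
```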

Legacy model keys (Opus, GPT-5/5.2/5.3, Gemini 3 Pro, Sonnet) are still supported for backward compatibility.

Note: Estimates are calibrated against Claude Code (Opus 4.6, High thinking) and Codex (GPT-5.4, Extra High thinking). Different thinking levels or model versions may shift estimates — adjust with --spec-clarity and --warm-context modifiers as needed.

Output formats

agent-estimate estimate "Refactor auth pipeline" --format json    # machine-readable
agent-estimate estimate --repo myorg/myrepo --issues 11,12,14     # from GitHub issues
agent-estimate estimate --file tasks.txt                           # from file
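The JSON format makes it possible to call the CLI from a script. The sketch below shells out to `agent-estimate` and parses stdout. It assumes that stdout contains only the JSON document and that `--type` can be combined with `--format json`. The JSON schema is not documented here, so the parsed object is returned as-is without assuming any keys. Both helper names are hypothetical.

```python
import json
import subprocess
from typing import Optional


def build_command(task: str, task_type: Optional[str] = None) -> list:
    """Build the argv list for a single-task estimate with JSON output."""
    cmd = ["agent-estimate", "estimate", task, "--format", "json"]
    if task_type is not None:
        cmd += ["--type", task_type]
    return cmd


def run_estimate(task: str, task_type: Optional[str] = None):
    """Run the CLI and return the parsed JSON (structure not assumed)."""
    result = subprocess.run(
        build_command(task, task_type),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return json.loads(result.stdout)


# Usage, with agent-estimate installed and on PATH:
#   report = run_estimate("Refactor auth pipeline")
#   print(json.dumps(report, indent=2))
```

Passing the command as a list rather than a shell string avoids quoting problems when the task description contains spaces or parentheses.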

Calibration

Validate estimates against observed outcomes and build a calibration database:

agent-estimate validate observation.yaml --db ~/.agent-estimate/calibration.db

Related

OACP — Coordinate the agents you just estimated. Open Agent Coordination Protocol for multi-agent async workflows.

Contributing

See CONTRIBUTING.md for full workflow and expectations.

pip install -e '.[dev]'
ruff check .
pytest -q

License

Apache License 2.0
