Benchmark how agent-ready a code repository is for LLM coding agents.

Project description

agent-readiness

A benchmark for AI agent readiness of a code repository.

You bought the seats. Your team is using Claude Code, Cursor, Copilot, Cline. And the agents keep going off the rails on your codebase.

The model is the variable you can't change. The repo is what you can.

agent-readiness scans a repository and scores how ready it is for AI coding agents — across cognitive load, feedback loops, and flow — then hands you a prioritised punchlist of fixes. Like Lighthouse, but for AI agent readiness instead of page load.

$ agent-readiness scan .

AI Readiness  62 / 100

  Cognitive load     70 / 100
  Feedback loops     40 / 100   ← biggest drag
  Flow & reliability 75 / 100
  Safety             OK

Top friction (fix these first):
  1. test_command.discoverable — no test invocation found in Makefile,
     package.json, or pyproject.toml
  2. agent_docs.present — no AGENTS.md / CLAUDE.md / .cursorrules at root
  3. headless.no_setup_prompts — README mentions "log in to the dashboard"
     during setup; agents can't traverse this

Design principles

Agents are headless. We assume the agent has stdin / stdout / files / git / HTTP and nothing else. No browser, no dashboard, no clickable button. If important state is reachable only through a UI, it's invisible to the agent — and the repo loses points wherever that's true.

This applies to our own tool, too. agent-readiness is fully headless: no required interactive prompts, stable JSON via --json, exit codes that mean things, machine-readable findings.
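Because the findings are machine-readable, downstream tooling can consume them directly. A minimal sketch of post-processing a `--json` run — note the payload shape below is invented for illustration; the tool's actual schema may differ, so inspect real `--json` output before relying on field names:

```python
import json

# Hypothetical --json payload, invented for illustration only.
# The real schema of `agent-readiness scan . --json` may differ.
payload = json.loads("""
{
  "score": 62,
  "pillars": {
    "cognitive_load": 70,
    "feedback_loops": 40,
    "flow_reliability": 75
  },
  "findings": [
    {"id": "test_command.discoverable", "severity": "high"}
  ]
}
""")

# Find the weakest pillar -- the "biggest drag" from the human-readable report.
weakest = min(payload["pillars"], key=payload["pillars"].get)
print(weakest, payload["pillars"][weakest])  # → feedback_loops 40
```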

Code quality counts only where it predicts agent success. Mega-files, ambiguous names, dead code, missing types — those have direct lines to agent failure modes and get measured. We don't reproduce the full SonarQube taxonomy. Other tools do that well.

Run untrusted code in Docker, always. Any check that executes code from the target repo runs inside a sandboxed container. See docs/SANDBOX.md.

What gets measured

See docs/RUBRIC.md for the full definition. Short version:

Pillar              What it captures
Cognitive load      What the agent must absorb to make a correct change.
Feedback loops      How fast and how clear the signal is after a change.
Flow / reliability  Headless walkability, plus how often friction outside the task blocks the agent.
Safety & trust      Secrets, destructive scripts, gitignore hygiene. (A cap, not a weight.)
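"A cap, not a weight" means safety does not feed into the average; it limits the maximum the other pillars can reach. A minimal sketch of that combination rule — the equal weights and the exact cap behaviour here are assumptions for illustration, not the tool's published formula (see docs/RUBRIC.md for the real one):

```python
def overall_score(cognitive: float, feedback: float, flow: float,
                  safety_cap: float) -> float:
    """Combine pillar scores, treating safety as a ceiling, not a weight.

    The equal weighting below is illustrative, not the tool's actual rubric.
    """
    weighted = (cognitive + feedback + flow) / 3
    # Safety caps the result: a repo with leaked secrets can't score high
    # no matter how good its feedback loops are.
    return min(weighted, safety_cap)

print(round(overall_score(70, 40, 75, 100), 1))   # safety OK: average governs
print(round(overall_score(100, 100, 100, 50), 1)) # safety issue: cap governs
```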

This repo's score

Dogfooding: agent-readiness scan . run against this repository itself.

╭─────────────────────────────╮
│  AI Readiness  100.0 / 100  │
╰─────────────────────────────╯
 Cognitive load      100.0  ████████████████████
 Feedback loops      100.0  ████████████████████
 Flow & reliability  100.0  ████████████████████
 Safety              100.0  ████████████████████

No findings. Looking good.

The score is updated after each iteration as part of the development workflow.

Usage

# Static scan (no Docker needed)
agent-readiness scan .
agent-readiness scan . --json
agent-readiness scan . --fail-below 70        # exit 1 if score < 70 (CI gate)
agent-readiness scan . --only feedback        # filter to one pillar
agent-readiness scan . --baseline prev.json   # diff against a previous run
agent-readiness scan . --report report.html   # HTML report (requires jinja2)
agent-readiness scan . --badge badge.svg      # score badge SVG
agent-readiness scan . --sarif findings.sarif # SARIF for GitHub code scanning

# Runtime scan (executes tests inside Docker)
agent-readiness scan . --run

# Other commands
agent-readiness list-checks
agent-readiness explain manifest.detected
agent-readiness init                          # write .agent-readiness.toml
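The `--fail-below` flag turns the scan into a CI gate via its exit code. A sketch of a GitHub Actions step, assuming the package installs from PyPI as `agent-readiness` (the step name and the threshold of 70 are illustrative):

```yaml
- name: Gate on agent readiness
  run: |
    pip install agent-readiness
    agent-readiness scan . --fail-below 70   # exits 1 if score < 70, failing the job
```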

Status

All phases implemented (v0.1–v0.9). 22 checks across 4 pillars, Docker sandbox, HTML + SARIF renderers, CLI surface, plugin API. See docs/PLAN.md for the full roadmap and CHANGELOG.md for per-phase release notes.

Project details


Download files

Download the file for your platform.

Source Distribution

agent_readiness-1.1.0.tar.gz (122.8 kB)

Uploaded Source

Built Distribution

agent_readiness-1.1.0-py3-none-any.whl (102.4 kB)

Uploaded Python 3

File details

Details for the file agent_readiness-1.1.0.tar.gz.

File metadata

  • Download URL: agent_readiness-1.1.0.tar.gz
  • Upload date:
  • Size: 122.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.8

File hashes

Hashes for agent_readiness-1.1.0.tar.gz
Algorithm    Hash digest
SHA256       89aaf3efc734aaaf47109f3ae6226090c4e36e546861890b6301a7ad7f5339ce
MD5          c1ad2b2ba742de0dd39bc74d7a2c62c2
BLAKE2b-256  8877f7cbbcb85b29e114e39d3cb4fb696e03c8fd76d72422b979e40b4f7f4695

File details

Details for the file agent_readiness-1.1.0-py3-none-any.whl.

File metadata

File hashes

Hashes for agent_readiness-1.1.0-py3-none-any.whl
Algorithm    Hash digest
SHA256       245f60a25e3af8ba80e363566c3e256f6856ef41b654b41a1ac64d54136d524d
MD5          89d2164d37fc69679ec61f383cbf132c
BLAKE2b-256  845dce2792937a18f8ee5a370398c09dca25c0c2b0b7538225b1b354c43ef716
