Analyze codebases and generate optimized context files for AI coding agents (Claude Code, Codex, Cursor, Copilot)

agentmd

agentmd analyzes your codebase and generates optimized context files for AI coding agents. Point it at any Python, Swift/Xcode, Rust, Go, TypeScript, or multi-language project and it produces ready-to-use CLAUDE.md, AGENTS.md, .cursorrules, or Copilot instruction files — scored and ranked so your agent starts with the best possible picture of your project.

What's New in 0.6.0

agentmd eval — generate a context file AND measure its performance impact in one command. Closes the GENERATE → MEASURE loop. Integrates with coderace (optional) for automated benchmarking.

What's New in 0.5.0

--tiered mode — generate a directory of context files instead of a single file. Based on the Codified Context paper (arXiv 2602.20478) which showed single-file manifests don't scale past ~1000 lines. Tiered mode automatically detects subsystem boundaries and generates a Tier 1 CLAUDE.md (always loaded) plus per-subsystem Tier 2 files in .agents/.
Subsystem detection — automatically identifies subsystem boundaries from directory structure, package manifests, and source file distribution
Trigger table — Tier 1 CLAUDE.md includes a table mapping directories to their context files so agents know which subsystem context to load

Install

pip install agentmd-gen

Usage

scan — inspect a project

agentmd scan                   # scan current directory
agentmd scan ~/repos/myapp     # scan a specific path
agentmd scan --json            # output as JSON

Prints detected languages, frameworks, package managers, test runners, linters, CI systems, and existing context files.

generate — create agent context files

agentmd generate                          # generate for all supported agents
agentmd generate --agent claude           # Claude Code (CLAUDE.md)
agentmd generate --agent codex            # OpenAI Codex (AGENTS.md)
agentmd generate --agent cursor           # Cursor (.cursorrules)
agentmd generate --agent copilot          # GitHub Copilot (.github/copilot-instructions.md)
agentmd generate --minimal                # lean, essential-only output (recommended)
agentmd generate -m --agent claude        # minimal mode for a single agent
agentmd generate --json                   # output generated content as JSON
agentmd generate --json --minimal         # JSON with "mode": "minimal" metadata
agentmd generate --tiered                 # tiered context (CLAUDE.md + .agents/)
agentmd generate --tiered --force         # overwrite existing tiered files

eval — measure what you generate

Close the GENERATE → MEASURE loop: generate a context file and immediately benchmark its impact on agent performance.

agentmd eval ~/repos/myproject               # generate + benchmark (if coderace installed)
agentmd eval --no-benchmark ~/repos/myproject # generate only, skip benchmarking
agentmd eval --existing ~/repos/myproject    # benchmark existing CLAUDE.md without regenerating
agentmd eval --json ~/repos/myproject        # JSON output for CI
agentmd eval --json --no-benchmark .         # JSON without benchmarking

Benchmarking requires coderace, which is optional. If it is not installed, agentmd still generates the context file and prints an install hint. coderace is never a hard dependency.

Example output:

Generating context file...
  ✓ Analyzed 47 files across 8 source directories
  ✓ Generated CLAUDE.md (52 lines)

Measuring performance impact...
  Running coderace context-eval (this takes 2-3 minutes)

Context File Impact Report
━━━━━━━━━━━━━━━━━━━━━━━━━
  With context: avg score 84 (n=3 tasks)
  Without context: avg score 67 (n=3 tasks)
  Net improvement: +17 points (+25%)

  ✓ Context file improves agent performance

  Saved: CLAUDE.md
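
The "Net improvement" line in the report above is plain arithmetic on the two averages. A minimal sketch, assuming the percentage is measured relative to the no-context baseline; the function name `net_improvement` is hypothetical and not part of agentmd or coderace:

```python
def net_improvement(with_context: float, without_context: float) -> tuple[float, float]:
    """Return (point gain, relative gain in percent) over the no-context baseline."""
    if without_context <= 0:
        raise ValueError("baseline score must be positive")
    points = with_context - without_context
    percent = 100.0 * points / without_context
    return points, percent


# Averages taken from the example report above.
points, percent = net_improvement(84, 67)
print(f"Net improvement: {points:+.0f} points ({percent:+.0f}%)")
```

With the example averages of 84 and 67 this gives a gain of 17 points, about 25% over the baseline, matching the report.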

score — evaluate existing context files

agentmd score                             # score all context files in cwd
agentmd score CLAUDE.md                   # score a specific file
agentmd score --json                      # output scores as JSON

Outputs a score (0–100) broken down by dimension.

Example JSON output:

{
  "file": "CLAUDE.md",
  "total": 84,
  "dimensions": {
    "completeness": 18,
    "specificity": 17,
    "clarity": 16,
    "agent_awareness": 18,
    "freshness": 15
  }
}
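
The JSON form is convenient for scripting, for example to find the dimension that most needs work. A small sketch, assuming `agentmd score --json` emits a single object shaped like the example above (output for several files may be structured differently); `weakest_dimension` is a hypothetical helper, not an agentmd API:

```python
import json

# Sample report copied from the example output above.
REPORT = """
{
  "file": "CLAUDE.md",
  "total": 84,
  "dimensions": {
    "completeness": 18,
    "specificity": 17,
    "clarity": 16,
    "agent_awareness": 18,
    "freshness": 15
  }
}
"""


def weakest_dimension(report: dict) -> tuple[str, int]:
    """Return the lowest-scoring dimension and its score."""
    dims = report["dimensions"]
    name = min(dims, key=dims.get)
    return name, dims[name]


report = json.loads(REPORT)
# In the example, the per-dimension scores add up to the reported total.
assert sum(report["dimensions"].values()) == report["total"]
print(weakest_dimension(report))
```

For the example report the weakest dimension is freshness, at 15 of 20 points.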

diff — compare context files

agentmd diff --agent claude               # diff current file vs freshly generated output
agentmd diff --minimal --agent claude     # diff against minimal-mode output
agentmd diff --json                       # output diff as JSON

drift — detect context drift

agentmd drift                            # check all agent context files in cwd
agentmd drift --agent claude             # check only CLAUDE.md
agentmd drift --minimal --agent claude   # check drift against minimal-mode output
agentmd drift --json                     # machine-readable drift report
agentmd drift --format github            # GitHub workflow command annotations
agentmd drift --format markdown          # PR comment markdown report

Exit codes:

0 = context files are fresh
1 = drift detected (or missing context file)
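
These exit codes make drift checks easy to script outside GitHub Actions, for instance in a pre-commit hook. A hedged sketch: only codes 0 and 1 are documented above, so anything else is treated as unexpected, and both function names are hypothetical:

```python
import subprocess


def describe_drift_exit(code: int) -> str:
    """Map an `agentmd drift` exit code to a human-readable status."""
    if code == 0:
        return "fresh"
    if code == 1:
        return "drift detected or context file missing"
    # Any other code is not documented above; treat it as an error.
    return f"unexpected exit code {code}"


def check_drift(path: str = ".") -> str:
    """Run `agentmd drift` inside `path` and describe the result.

    Assumes the `agentmd` executable is on PATH.
    """
    result = subprocess.run(["agentmd", "drift"], cwd=path, check=False)
    return describe_drift_exit(result.returncode)
```

Calling `check_drift()` from a hook script and failing on anything other than `"fresh"` mirrors what `fail-on-drift` does in the GitHub Action below.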

GitHub Action

Use the published action in your PR workflow:

name: agentmd-drift

on:
  pull_request:
    types: [opened, synchronize, reopened]

jobs:
  drift:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      pull-requests: write
    steps:
      - uses: actions/checkout@v4
      - uses: mikiships/agentmd@v0.4.0
        with:
          agent: claude
          fail-on-drift: "true"
          comment: "true"
          python-version: "3.11"

For reusable workflow usage, see .github/workflows/agentmd-drift.yml.

Minimal Mode

Research (arXiv 2602.11988 "Evaluating AGENTS.md") found that verbose context files can reduce task success rates and increase costs by ~20%. The most valuable information is exact commands to run (build, test, lint). The least valuable: generic tips, style guides, and anti-patterns that agents already know.

--minimal generates only what the agent can't infer itself:

A one-line header
Build, test, and lint commands (highest-value section)
Source and test directory roots

Everything else is omitted. For Claude, a single /compact tip is appended.
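
To make the shape of a minimal file concrete, here is an illustrative sketch that assembles the three parts listed above. The headings, bullet layout, and the `render_minimal` function are assumptions made for this example; the file agentmd actually writes will differ in detail:

```python
def render_minimal(name: str, commands: dict[str, str], src_root: str, test_root: str) -> str:
    """Build a minimal context file: header, commands, and directory roots."""
    lines = [f"# {name}", "", "## Commands"]
    # Exact commands are the highest-value part, so they come first.
    lines += [f"- {label}: `{cmd}`" for label, cmd in commands.items()]
    lines += ["", "## Layout", f"- Source: `{src_root}`", f"- Tests: `{test_root}`"]
    return "\n".join(lines) + "\n"


# Hypothetical project values, for illustration only.
print(render_minimal(
    "myapp",
    {"build": "pip install -e .", "test": "pytest -q", "lint": "ruff check ."},
    src_root="src/",
    test_root="tests/",
))
```

The result is ten lines long, which is the point: nothing in it is something the agent could have inferred without reading the project's tooling.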

agentmd generate --minimal               # recommended for most projects
agentmd generate --minimal --agent claude
agentmd diff --minimal                   # compare existing files against minimal output
agentmd drift --minimal                  # check drift against minimal baseline

Tiered Mode

The Codified Context paper (arXiv 2602.20478) showed that single-file context manifests don't scale past ~1000 lines. Its authors built a three-tier architecture manually for a 108k-line C# project across 283 sessions. --tiered automates that pattern.

--tiered detects subsystem boundaries in your project and generates:

project/
├── CLAUDE.md              # Tier 1: conventions, build/test/lint, trigger table (~30 lines)
└── .agents/
    ├── api.md             # Tier 2: per-subsystem context
    ├── database.md
    └── web.md

The Tier 1 CLAUDE.md includes a trigger table that maps directories to their context files:

## Context Files (load when working in these areas)
| Directory | Context File |
|-----------|-------------|
| api/      | .agents/api.md |
| db/       | .agents/database.md |
| web/      | .agents/web.md |
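
Conceptually, the trigger table is a prefix lookup from the file being edited to the Tier 2 file worth loading. The sketch below illustrates that idea with the directories from the example table; it is not how agentmd or any particular agent implements the lookup, and `context_file_for` is a hypothetical name:

```python
from typing import Optional

# Directory prefixes and context files from the example table above.
TRIGGERS = {
    "api/": ".agents/api.md",
    "db/": ".agents/database.md",
    "web/": ".agents/web.md",
}


def context_file_for(path: str, triggers: dict[str, str]) -> Optional[str]:
    """Return the Tier 2 file whose directory prefix matches `path`, if any."""
    for directory, context_file in triggers.items():
        if path.startswith(directory):
            return context_file
    # No subsystem matched: only the Tier 1 CLAUDE.md applies.
    return None


print(context_file_for("db/models/user.py", TRIGGERS))
print(context_file_for("README.md", TRIGGERS))
```

The first call resolves to `.agents/database.md`; the second returns `None`, meaning the always-loaded Tier 1 file is the only context.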

Projects with fewer than 20 source files or 2000 lines are too small for tiered mode — use generate without --tiered instead.
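
Read literally, the rule above means a project must clear both thresholds to qualify. A sketch of that reading, with hypothetical constant and function names (how agentmd counts source files and lines is not specified here):

```python
MIN_SOURCE_FILES = 20
MIN_SOURCE_LINES = 2000


def large_enough_for_tiered(source_files: int, source_lines: int) -> bool:
    """A project qualifies only if it meets both size thresholds."""
    return source_files >= MIN_SOURCE_FILES and source_lines >= MIN_SOURCE_LINES


print(large_enough_for_tiered(47, 5200))   # many files, many lines
print(large_enough_for_tiered(12, 9000))   # too few files
print(large_enough_for_tiered(30, 1500))   # too few lines
```

Only the first project qualifies; the other two each miss one threshold and should use plain `generate`.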

agentmd generate --tiered                # detect subsystems and generate tiered context
agentmd generate --tiered --force        # overwrite existing files
agentmd generate --tiered --dry-run      # preview without writing

Supported Agents

| Agent | Output file |
|-------|-------------|
| Claude Code | `CLAUDE.md` |
| OpenAI Codex | `AGENTS.md` |
| Cursor | `.cursorrules` |
| GitHub Copilot | `.github/copilot-instructions.md` |

Supported Languages

| Language | Detection | Generators | What's detected |
|----------|-----------|------------|-----------------|
| Python | `requirements.txt`, `pyproject.toml`, `setup.py`, `Pipfile` | All agents | Frameworks (Django, Flask, FastAPI, Starlette, Litestar), pytest, ruff/flake8/mypy, GitHub Actions |
| Swift/Xcode | `.xcodeproj`, `Package.swift` | All agents | SwiftUI/UIKit/AppKit targets, SwiftLint, xcodebuild CI, Swift Package Manager |
| Rust | `Cargo.toml` | All agents | tokio, actix-web, serde, axum, clap, and other common crates; clippy/rustfmt, cargo-based CI |
| Go | `go.mod` | All agents | gin, echo, fiber, cobra, and other common modules; golangci-lint, go test CI |

How Scoring Works

Each context file is evaluated on five dimensions (total: 100 points):

| Dimension | Points | What it measures |
|-----------|--------|------------------|
| Completeness | 20 | All key project facts present (languages, stack, test commands) |
| Specificity | 20 | Concrete details vs. generic boilerplate |
| Clarity | 20 | Readable structure, scannable headings, no walls of text |
| Agent-awareness | 20 | Instructions tailored to the target agent's strengths and quirks |
| Freshness | 20 | Content reflects the current state of the codebase (no stale info) |

Note on freshness scoring (v0.2.0): Earlier versions could report false positives on freshness, penalizing files that referenced current stable versions or recent stable APIs. This has been corrected: the freshness dimension now flags only genuinely stale references (deprecated packages, EOL runtime versions, removed APIs).

Run agentmd score after generating to see where your files land and what to improve.

Part of the Agent Toolkit

agentmd is one of three tools for AI coding agent quality:

coderace — Race coding agents against each other on real tasks. Automated, reproducible, scored comparisons.
agentmd — Generate and score context files for AI coding agents.
agentlint — Lint AI agent git diffs for risky patterns. Static analysis, no LLM required.

Measure (coderace) → Optimize (agentmd) → Guard (agentlint).

Contributing

Fork the repo and create a branch
Run pip install -e ".[dev]" to install dev dependencies
Write tests in tests/unit/
Make sure pytest tests/unit -q passes
Open a PR; CI runs on Python 3.10–3.13

License

Release history

0.6.0 (this version): Mar 12, 2026
0.5.0: Mar 6, 2026
0.4.0: Mar 5, 2026
0.3.0: Mar 4, 2026
0.2.0: Mar 3, 2026

