Reference implementation for the Agent Almanac Code Uniformity benchmark

agent-uniformity

Reference implementation for the Agent Almanac Code Uniformity benchmark. This is the same code path that produced the published numbers: install it, point it at any of the 48 sampled repos, and get matching results.

What this is

A Python package that implements every step of the Agent Almanac Code Uniformity benchmark: function extraction (Python ast + tree-sitter for TS/JS/Go/Rust), AI authorship detection via git blame against Claude-tagged commits, semble-driven similarity scoring, the six per-repo evaluators, and the multi-repo aggregation that produces the headline hypothesis tests.
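As a rough illustration of the first step for Python sources, function extraction with the standard-library ast module can be sketched as below. This is a minimal sketch, not the package's extract.py: the helper name `extract_functions` and the returned dictionary keys are assumptions made for this example, and it applies none of the benchmark's extraction filters.

```python
import ast


def extract_functions(source: str) -> list[dict]:
    """Return name, line span, and source text for every function in `source`.

    Hypothetical helper for illustration; the real extractor may differ.
    """
    tree = ast.parse(source)
    found = []
    for node in ast.walk(tree):
        # Covers plain functions, methods, and their async variants.
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            found.append(
                {
                    "name": node.name,
                    "start": node.lineno,
                    "end": node.end_lineno,
                    "source": ast.get_source_segment(source, node),
                }
            )
    # ast.walk does not promise an order, so sort by position in the file.
    return sorted(found, key=lambda f: f["start"])
```

Each record carries the line span, which is what a later blame step would need to attribute a function's lines to commits.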

The locked task list (48 repos with HEAD SHAs from the inaugural run) ships with the package. You can re-run any single repo against the locked SHA and verify the numbers we published, or re-run the whole sample sequentially and compare your aggregates against ours.

Install

pip install agent-uniformity

Requires Python 3.11 or 3.12 (semble's tree-sitter pin doesn't currently work on 3.13). Installing the package also brings in semble, radon, tree-sitter-language-pack (pinned to <1.8, because newer versions have a broken Python API), and pydantic.
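If you script the install, a small guard can catch an unsupported interpreter before pip does. The sketch below only encodes the version constraint stated above; `is_supported` is a hypothetical helper and not part of the package.

```python
import sys

# Interpreter versions listed as supported in the install notes above.
SUPPORTED = {(3, 11), (3, 12)}


def is_supported(version_info=sys.version_info) -> bool:
    """True when the (major, minor) pair is one of the supported versions."""
    return tuple(version_info[:2]) in SUPPORTED


if __name__ == "__main__":
    if not is_supported():
        raise SystemExit("agent-uniformity needs Python 3.11 or 3.12")
```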

Quick verification — reproduce one repo

agent-uniformity tasks                                     # list the 48 task IDs
agent-uniformity run-one davila7-claude-code-templates --output ./out

This clones the repo at the SHA in tasks/q2-2026.json, runs the analysis pipeline, and writes ./out/davila7-claude-code-templates.json containing every per-function metric (similarity scores, AI ratio, complexity, etc.) and per-repo evaluator scores.
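The clone-and-checkout step can be approximated with plain git commands. The sketch below is an illustration only, not the package's runner: `checkout_commands` and `checkout_at_sha` are hypothetical helpers, and the clone URL and SHA are whatever you pass in.

```python
import subprocess
from pathlib import Path


def checkout_commands(clone_url: str, sha: str, dest: Path) -> list[list[str]]:
    """Build the git invocations that pin a fresh clone to one commit."""
    return [
        ["git", "clone", "--quiet", clone_url, str(dest)],
        # --detach makes it explicit that we land on a commit, not a branch.
        ["git", "-C", str(dest), "checkout", "--quiet", "--detach", sha],
    ]


def checkout_at_sha(clone_url: str, sha: str, dest: Path) -> None:
    """Run the commands in order, raising if git reports a failure."""
    for argv in checkout_commands(clone_url, sha, dest):
        subprocess.run(argv, check=True)
```

Pinning to a SHA rather than a branch is what makes a re-run comparable to the published numbers, since the analyzed tree cannot drift.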

Compare key numbers against agent-uniformity-q2-2026/analysis/summary.csv. Expected variance: ~1% from semble's BM25 non-determinism.
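One way to make that comparison mechanical is a relative-tolerance check. The sketch below assumes the ~1% figure is read as a relative tolerance, and the metric names in any dictionaries you pass are up to you; `within_tolerance` and `mismatches` are hypothetical helpers, and the column layout of summary.csv is not assumed here.

```python
import math


def within_tolerance(ours: float, published: float, rel_tol: float = 0.01) -> bool:
    """True when the two values agree to within the relative tolerance."""
    return math.isclose(ours, published, rel_tol=rel_tol)


def mismatches(
    ours: dict[str, float],
    published: dict[str, float],
    rel_tol: float = 0.01,
) -> list[str]:
    """Names of published metrics that are missing or out of tolerance."""
    return sorted(
        name
        for name, value in published.items()
        if name not in ours or not within_tolerance(ours[name], value, rel_tol)
    )
```

An empty list from `mismatches` means every published metric you supplied was reproduced within the tolerance.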

Run the full benchmark

agent-uniformity run-all --output ./out         # ~6-8 hours sequential on a laptop
agent-uniformity aggregate ./out                # writes ./out/analysis/*.csv
agent-uniformity deep ./out                     # writes H4 + H5 CSVs

Use as a library

from pathlib import Path
from agent_uniformity import analyze_repo, runner

# Run one task by ID
tasks = {t.task_id: t for t in runner.load_tasks()}
result = runner.run_one(tasks["dora-rs-dora"], Path("./out"))
print(result.output.function_count, result.output.repo_ai_ratio_observed)

# Or analyze an arbitrary repo you've already cloned
out = analyze_repo(
    repo=Path("./my-repo"),
    repo_slug="myorg/my-repo",
    base_sha="HEAD",
)
print(out.function_count)

What's in this repo

agent-uniformity/
├── methodology.md             frozen v0.1.0 — pre-registered hypotheses + metrics
├── tasks/q2-2026.json         the 48 sampled repos with locked HEAD SHAs
├── agent_uniformity/
│   ├── extract.py             function extraction (Python ast, tree-sitter)
│   ├── blame.py               git blame + AI commit detection
│   ├── analyze.py             per-repo metrics + 6 evaluators
│   ├── aggregate.py           multi-repo summary CSVs + hypothesis tests
│   ├── deep.py                H4 + H5 deep-pass analyses
│   ├── runner.py              sequential runner (clone → checkout → analyze)
│   ├── schema.py              FunctionFact, RepoOutput pydantic models
│   └── cli.py                 click-based CLI
└── tests/
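
To give a feel for what the blame step computes, the sketch below derives an AI line ratio from `git blame --line-porcelain` output. It is an illustration under assumptions, not the contents of blame.py: the `Co-Authored-By: Claude` trailer is only one plausible detection signal (the actual signals are defined by the methodology), and `is_ai_commit` and `ai_line_ratio` are hypothetical names.

```python
import re

# A porcelain entry starts with: <40-hex sha> <orig line> <final line> [<count>]
HEADER = re.compile(r"^([0-9a-f]{40}) \d+ \d+")


def is_ai_commit(message: str) -> bool:
    """Assumed signal: a Claude co-author trailer in the commit message."""
    return "co-authored-by: claude" in message.lower()


def ai_line_ratio(porcelain: str, ai_shas: set[str]) -> float:
    """Fraction of blamed lines whose commit SHA is in `ai_shas`."""
    total = 0
    ai = 0
    current = None
    for line in porcelain.split("\n"):
        if line.startswith("\t"):
            # Content lines are tab-prefixed; attribute them to the last header.
            if current is not None:
                total += 1
                ai += current in ai_shas
            continue
        match = HEADER.match(line)
        if match:
            current = match.group(1)
    return ai / total if total else 0.0
```

In practice the porcelain text would come from running `git blame --line-porcelain <file>` at the locked SHA, and `ai_shas` from scanning commit messages with a check like `is_ai_commit`.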

Versioning policy

The package version (currently 0.1.0) tracks the methodology version. A change to:

  • the metric definitions
  • the evaluator formulas
  • the hypothesis-test logic
  • the function-extraction filters
  • the AI-commit detection signals

…requires a methodology version bump and a corresponding package release. Old methodology versions remain installable from PyPI; old reports cite specific package versions.
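
Because reports cite specific package versions, it can be worth checking the installed version before re-running a report. The sketch below uses the standard-library importlib.metadata; `installed_matches` is a hypothetical helper, and exact string equality is an assumption about how strictly you want to match.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_matches(cited: str, dist: str = "agent-uniformity") -> bool:
    """True when the installed distribution has exactly the cited version."""
    try:
        return version(dist) == cited
    except PackageNotFoundError:
        # Not installed at all, so it cannot match the cited version.
        return False
```

For a report that cites version 0.1.0, `installed_matches("0.1.0")` tells you whether the environment is the one the report was produced with.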

License

MIT — see LICENSE. Each upstream repository referenced in tasks/q2-2026.json retains its original license.

Citation

Datta, Y. (saucam). (2026). agent-uniformity: reference implementation for the
Agent Almanac Code Uniformity benchmark. https://github.com/saucam/agent-uniformity
