Git forensics tool for analyzing AI-authored code quality
Project description
CodeAssay
Git forensics tool for analyzing AI-authored code quality. Detects AI-generated commits, identifies rework, classifies root causes, measures code turnover rate, and generates quality reports.
Install
pip install codeassay
Quick Start
# Scan a repository
codeassay scan /path/to/repo
# View a report
codeassay report
# Scan multiple repos with combined output
codeassay scan ../repo1 ../repo2 ../repo3
Commands
| Command | Purpose |
|---|---|
codeassay scan <paths> |
Scan repos (supports --with-scorer, --dry-run, --fail-on=turnover-red) |
codeassay report |
Generate quality report (CLI or markdown) |
codeassay commits |
List AI-authored commits (supports --source <glob>, --tool) |
codeassay rework |
List rework events with classification |
codeassay reclassify <commit> <category> |
Override a classification |
codeassay export --format json |
Export raw data |
codeassay dashboard |
Open interactive HTML dashboard in browser |
codeassay config init |
Scaffold .codeassay.toml in the current repo |
codeassay config show |
Print merged effective config |
codeassay detect-test <hash> |
Dry-run detection pipeline against one commit |
codeassay tag [--tool <name>] |
Add AI-Assisted: trailer to a commit (hook or amend) |
codeassay install-hook [--tool <name>] [--mode always|prompt] |
Install prepare-commit-msg hook |
codeassay uninstall-hook |
Remove the hook |
How It Works
AI Commit Detection is layered — configurable in .codeassay.toml:
- User-defined rules (highest priority) — match commits by author regex, branch pattern, commit-message regex, or author-plus-date-window.
- Built-in profiles — shipped detectors for Claude Code, GitHub Copilot, Cursor, Aider, Windsurf, ChatGPT, and Gemini. Enabled by default, disable individually.
- Probabilistic scorer (opt-in) — weights weak signals (diff shape, message structure, commit velocity, emoji, boilerplate headers, file diversity, punctuation) for commits no deterministic rule caught.
First match wins. Run codeassay config init to scaffold a starter config, codeassay config show to inspect the effective config, and codeassay detect-test <hash> to dry-run detection against a single commit.
Per-author fingerprint detection (opt-in) catches AI commits from authors with ≥20 prior commits by comparing each commit against the author's historical baseline on 5 metrics (diff size, comment ratio, identifier entropy, punctuation density, message length). Flags when ≥3 metrics diverge ≥2σ. Enable in .codeassay.toml:
[fingerprint]
enabled = true
min_prior_commits = 20
sigma_threshold = 2.0
min_divergent_metrics = 3
Use codeassay detect-test <hash> to see per-metric Z-scores when tuning thresholds.
To reliably tag AI commits going forward, install a git hook:
codeassay install-hook --tool cursor --mode always
This appends AI-Assisted: cursor to each commit message (idempotent — skips if a Co-Authored-By AI trailer is already present).
Rework Detection traces subsequent commits that modify AI-authored lines using git blame ancestry within a configurable time window (default: 14 days).
Classification categorizes rework into 7 types using commit message keywords and diff shape analysis:
- Bug fix, Misunderstanding, Test failure, Style/convention violation
- Security issue, Incomplete implementation, Over-engineering
Turnover Rate
Turnover Rate measures the fraction of lines added in a lookback window that are subsequently discarded or rewritten. Computed separately for AI and human cohorts; the AI/human ratio is the headline number. CodeAssay ships benchmarks.json with industry baselines (pre-AI 3.3%, 2026 avg 5.7%, healthy target <4%) so your report can show percentile vs industry.
Configure thresholds in .codeassay.toml:
[turnover]
lookback_days = 90
rewrite_window_days = 30
yellow_threshold = 0.04
red_threshold = 0.06
Fail CI builds on excessive turnover:
codeassay scan . --fail-on=turnover-red
Dashboard
Generate an interactive HTML dashboard with charts and visualizations:
codeassay dashboard
Opens a self-contained HTML file in your browser with:
- Summary metric cards (AI commit rate, first-pass success, rework rate, MTTR)
- Rework category doughnut chart with percentages
- Monthly trend line chart (AI commits vs rework events)
- Top rework file hotspots
- Rework by AI tool comparison
The dashboard works offline, requires no server, and is shareable — copy the HTML file to screenshot or embed in publications. Use --no-open to generate without opening the browser, or --output path.html to save to a custom location.
Ignoring Files
Create a .codeassayignore file in your repo root to exclude files from analysis. Uses gitignore-style patterns:
# Exclude documentation and config noise
*.md
.DS_Store
.organization
# Exclude a directory (one level)
docs/*
# Exclude a directory (recursive)
docs/**
Ignored files are filtered from both AI commit tracking and rework detection, giving you cleaner metrics focused on actual code quality.
Data Storage
Scan data is stored in .codeassay/quality.db (SQLite) inside each scanned repo. Query it directly with any SQL tool:
sqlite3 .codeassay/quality.db "SELECT tool, COUNT(*) FROM ai_commits GROUP BY tool"
Claude Code Plugin
Install as a Claude Code plugin to use /codeassay within Claude Code sessions.
License
MIT
Project details
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file codeassay-0.3.0.tar.gz.
File metadata
- Download URL: codeassay-0.3.0.tar.gz
- Upload date:
- Size: 126.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.2
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
d14a4252903213a72c100cbfb5523f0a9e39ceecb72fdde944d8b2212627b91e
|
|
| MD5 |
7751b8608f1bd276cb23bf3f1e5778ad
|
|
| BLAKE2b-256 |
f8c0d96bb2851764b1af3ff5548a3548ca9c0cbe1e58612ccbc297869c34fbdc
|
File details
Details for the file codeassay-0.3.0-py3-none-any.whl.
File metadata
- Download URL: codeassay-0.3.0-py3-none-any.whl
- Upload date:
- Size: 122.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.2
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
fe75f002224d2bc4a5d5b6cefa1d6f181a71044d33a975b98e1b087c38f565cc
|
|
| MD5 |
45dd5df67ddab79501535be2aaa80eaa
|
|
| BLAKE2b-256 |
1acc9a287c54d39984ea548c14ea4b2397148f678ea79b58262ad919bd5c4647
|