Detect file coupling risk in pull requests from git co-change history.
What it is.
couplingguard is a free GitHub Action and GitLab CI integration that detects file coupling risk in pull requests by analyzing your repository's git co-change history. On every PR it posts a collapsible markdown comment with normalized coupling scores for the files you're changing, suggests reviewers from CODEOWNERS, and can optionally fail CI above a configurable risk threshold. Install in 5 lines of YAML. No signup, no hosted service, MIT licensed.
The hidden cost of code review: two files that always break together still ship in the same PR with no one looking at both sides. Your git log has known about this pairing for months — every co-change is a data point. Nobody reads it. couplingguard does, on every PR, before merge.
The leverage point is PR open time, not the post-incident review. AI coding agents (Copilot, Claude Code, Cursor) now routinely land diffs touching 15–30 files at once; coupling risk has never been higher or harder to spot by scrolling a unified diff. This is the cheapest bug-prevention tool you can add to your stack: five lines of YAML, an MIT license, and a comment on every PR.
🎬 The demo above is a real rendered video (source MP4, 24 s, 1080p, 2.1 MB). Built with Remotion; the source composition lives at demo/remotion/. Run npm install && npm run build in that folder to re-render it yourself (npm run build:gif produces the inline-embeddable version above). An accessible static SVG fallback is at assets/animated-demo.svg.
Who is this for?
| If you are… | What couplingguard gives you |
|---|---|
| A platform engineer at a monorepo company | A quantified, CI-enforceable coupling budget that replaces tribal knowledge about "files that always break together" |
| A senior reviewer on AI-generated PRs | A second pair of eyes that flags coupled files the diff doesn't obviously show — before you approve a 25-file Copilot change |
| An OSS maintainer reviewing external contributions | Instant context on which historical owners should weigh in, on top of static CODEOWNERS |
| A DevOps lead enforcing review standards | An opt-in fail_threshold that exits 1 when a PR's coupling density crosses a line you choose |
| A solo developer on a long-running project | A check on your own blind spots: which files in your codebase you've forgotten are coupled |
Install in 5 lines
name: Coupling Guard
on:
  pull_request:
    types: [opened, synchronize, reopened]
permissions:
  contents: read
  pull-requests: write
jobs:
  coupling:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0  # required: couplingguard needs the full git log
      - uses: Meru143/couplingguard@v1
        with:
          github_token: ${{ github.token }}
What the PR comment looks like
Real output from running couplingguard against its own repository
(synthetic PR over 10 commits of real history, captured from
tests/e2e/test_dogfood.py):
🔍 couplingguard — 6 pairs detected, highest risk: 🔴 1.00
| File in PR | Coupled With | Score | Risk | Co-changes |
|---|---|---|---|---|
| pyproject.toml | skills-lock.json | 1.00 | 🔴 High | 2/2 commits |
| tests/integration/test_github_poster.py | tests/integration/test_gitlab_poster.py | 1.00 | 🔴 High | 2/2 commits |
| .gitignore | pyproject.toml | 0.67 | 🟡 Medium | 2/3 commits |
| .gitignore | skills-lock.json | 0.67 | 🟡 Medium | 2/3 commits |
Note the paired integration test files at 1.00 — test_github_poster.py and
test_gitlab_poster.py always land in the same commit because they cover
mirror-image functionality. A reviewer looking only at the GitHub test
file would benefit from knowing the GitLab one almost certainly changed
too.
Illustrative example showing the score-delta line on re-push and
CODEOWNERS-based reviewer suggestions (the names are placeholders — the
real action only suggests usernames that actually appear in your
CODEOWNERS file):
🔍 couplingguard — 2 pairs detected, highest risk: 🔴 0.82
⚠️ Score changed since last push: 🟡 0.45 → 🔴 0.82 ↑
| File in PR | Coupled With | Score | Risk | Co-changes |
|---|---|---|---|---|
| src/payment.py | src/billing.py | 0.82 | 🔴 High | 41/50 commits |
| src/payment.py | tests/test_billing.py | 0.64 | 🟡 Medium | 32/50 commits |

Suggested reviewers for coupled files: @alice, @team-payments
The comment is collapsible (<details>-wrapped) and edits itself on
every push to the PR with a "score changed" line showing the delta.
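
As a rough illustration of how a comment could carry its previous score forward between pushes, here is a minimal Python sketch. The marker format (`<!-- couplingguard:{...} -->`), the `top_score` key, and all function names are assumptions made for this example, not couplingguard's actual internals, and the risk emoji is left out of the delta line for brevity.

```python
from __future__ import annotations

import json
import re

# Hypothetical marker format; the real hidden marker may look different.
MARKER_RE = re.compile(r"<!-- couplingguard:(\{.*?\}) -->")


def embed_marker(body: str, top_score: float) -> str:
    """Append a hidden HTML comment that stores the current top score."""
    payload = json.dumps({"top_score": round(top_score, 2)})
    return f"{body}\n<!-- couplingguard:{payload} -->"


def previous_score(comment_body: str) -> float | None:
    """Read the stored score back out of an earlier comment, if any."""
    match = MARKER_RE.search(comment_body)
    if match is None:
        return None
    return float(json.loads(match.group(1))["top_score"])


def delta_line(old: float | None, new: float) -> str | None:
    """Build a 'score changed' line, or None when nothing changed."""
    if old is None or abs(new - old) < 0.005:
        return None
    arrow = "↑" if new > old else "↓"
    return f"Score changed since last push: {old:.2f} → {new:.2f} {arrow}"
```

On a re-push, the poster would look up its earlier comment, call `previous_score` on the body, and prepend the result of `delta_line` to the freshly rendered comment before editing it in place.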
Inputs
| Input | Type | Default | Description |
|---|---|---|---|
| github_token | string | ${{ github.token }} | Token for PR comment + check |
| gitlab_token | string | "" | Personal access token for GitLab CI |
| lookback_days | number | 90 | Days of history to analyze |
| min_occurrences | number | 3 | Minimum co-change count to include a pair |
| max_pairs | number | 10 | Maximum pairs shown in the comment |
| low_threshold | number | 0.3 | Score boundary 🟢 → 🟡 |
| high_threshold | number | 0.7 | Score boundary 🟡 → 🔴 |
| fail_threshold | string | "" | low/medium/high to fail CI; empty disables |
| exclude | string | "" | Newline-separated glob patterns |
| publish_dashboard | boolean | false | Generate static dashboard + history + badge artifact |
| dry_run | boolean | false | Print comment to stdout; don't post |
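
To make the interaction between low_threshold, high_threshold, and fail_threshold concrete, here is a small Python sketch. The function names are hypothetical, and the gate semantics (fail when the worst pair is at or above the configured level) are an assumption based on the input descriptions above, not a statement of how the Action is implemented.

```python
LEVELS = ("low", "medium", "high")


def classify(score: float, low_threshold: float = 0.3,
             high_threshold: float = 0.7) -> str:
    """Map a 0-1 coupling score to a risk level using the default boundaries."""
    if score >= high_threshold:
        return "high"
    if score >= low_threshold:
        return "medium"
    return "low"


def exit_code(scores: list[float], fail_threshold: str = "") -> int:
    """Return 1 when the worst pair reaches fail_threshold, else 0.

    An empty fail_threshold disables the gate (assumed semantics).
    """
    if not fail_threshold or not scores:
        return 0
    worst = max(LEVELS.index(classify(s)) for s in scores)
    return 1 if worst >= LEVELS.index(fail_threshold) else 0
```

With the defaults, scores of 0.45 and 0.82 classify as medium and high, so `exit_code([0.45, 0.82], "high")` returns 1 while `exit_code([0.45], "high")` returns 0.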
How it works
flowchart TD
A[git log<br/>lookback_days, no-merges] --> B[co-change matrix<br/>file pairs × commit count]
B --> C[normalize<br/>score = co_count / max(count_a, count_b)]
C --> D{filter by<br/>min_occurrences}
D --> E[PR analyzer<br/>keep pairs touching PR files]
E --> F[classify risk<br/>🟢 < 0.3 ≤ 🟡 < 0.7 ≤ 🔴]
F --> G[CODEOWNERS lookup<br/>suggest reviewers]
G --> H[render markdown<br/>+ hidden JSON marker]
H --> I[find existing<br/>PR comment by marker]
I -->|exists| J[edit in place<br/>with delta line]
I -->|new| K[create issue comment]
J --> L[fail_threshold check<br/>exit 0 / 1]
K --> L
L --> M{publish_dashboard?}
M -->|yes| N[append history JSON<br/>+ Chart.js HTML<br/>+ shields.io badge]
M -->|no| O[done]
N --> O
style A fill:#fef3c7,stroke:#f59e0b,color:#000
style C fill:#dbeafe,stroke:#3b82f6,color:#000
style F fill:#fce7f3,stroke:#ec4899,color:#000
style L fill:#dcfce7,stroke:#16a34a,color:#000
The key insight is normalization: raw co-change counts inflate for
old / large files, while co_count / max(count_a, count_b) produces a
0–1 ratio that's comparable across repos of any size and age.
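
The first two stages of the flowchart (git log, then co-change matrix) can be sketched in Python as follows. This is a simplified illustration under stated assumptions: it expects text produced by something like `git log --no-merges --name-only --format=--commit--`, and the `--commit--` sentinel plus both function names are choices made for this example rather than couplingguard's real parser.

```python
from collections import Counter
from itertools import combinations

SENTINEL = "--commit--"  # hypothetical separator printed once per commit


def parse_git_log(text: str) -> list[list[str]]:
    """Split sentinel-delimited `git log --name-only` output into file lists."""
    commits: list[list[str]] = []
    for line in text.splitlines():
        line = line.strip()
        if line == SENTINEL:
            commits.append([])
        elif line and commits:
            commits[-1].append(line)
    return commits


def build_matrix(commits: list[list[str]]) -> tuple[Counter, Counter]:
    """Count per-file changes and per-pair co-changes across commits."""
    file_counts: Counter = Counter()
    pair_counts: Counter = Counter()
    for files in commits:
        unique = sorted(set(files))
        file_counts.update(unique)
        # Every unordered pair in the commit counts as one co-change,
        # which is where the quadratic cost per commit comes from.
        pair_counts.update(combinations(unique, 2))
    return file_counts, pair_counts
```

Sorting the file names before pairing keeps each pair key in a stable order, so `("a.py", "b.py")` and `("b.py", "a.py")` never end up as separate entries.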
Local CLI
pip install couplingguard
couplingguard --repo . --dry-run --lookback-days 90
The CLI uses the same code path as the Action — --dry-run prints
the rendered PR comment to stdout without reaching GitHub, so you
can preview what couplingguard would post against any local repo.
Run couplingguard --help for the full flag list (every Action input
has a matching CLI flag).
GitLab CI
coupling:
  image: python:3.11
  variables:
    GIT_DEPTH: "0"  # required: GitLab clones shallow by default
    GITLAB_TOKEN: ${GITLAB_TOKEN}
  script:
    - pip install couplingguard
    - couplingguard --repo .
  only:
    - merge_requests
CI_SERVER_URL, CI_PROJECT_ID, and CI_MERGE_REQUEST_IID are
auto-set by every GitLab Runner. GITLAB_TOKEN should be a
project access token
with the api scope, stored as a masked CI/CD variable.
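
For readers wondering how those three variables fit together, the Python sketch below builds the merge request notes endpoint of the GitLab REST API (v4) from them. The helper names are hypothetical, and whether couplingguard builds its requests exactly this way is an assumption; only the endpoint shape and the PRIVATE-TOKEN header come from GitLab's public API.

```python
from collections.abc import Mapping
from urllib.parse import quote


def merge_request_notes_url(env: Mapping[str, str]) -> str:
    """Build the MR notes endpoint from GitLab's predefined CI variables."""
    base = env["CI_SERVER_URL"].rstrip("/")
    project = quote(env["CI_PROJECT_ID"], safe="")
    iid = env["CI_MERGE_REQUEST_IID"]
    return f"{base}/api/v4/projects/{project}/merge_requests/{iid}/notes"


def auth_headers(env: Mapping[str, str]) -> dict[str, str]:
    """Authenticate with the token stored as a masked CI/CD variable."""
    return {"PRIVATE-TOKEN": env["GITLAB_TOKEN"]}
```

Inside a pipeline you would pass `os.environ` as `env`; taking a plain mapping keeps the helpers easy to exercise outside CI.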
Permissions
For GitHub Actions, couplingguard needs:
- contents: read to read the git history.
- pull-requests: write to post / edit the comment.
For GitLab CI, the GITLAB_TOKEN needs api scope on the project.
When publish_dashboard: true, the action writes coupling-history.json,
coupling-dashboard.html, and coupling-score.json to the workspace and
uploads them as a GitHub Actions artifact. Nothing is committed back to
your repo unless you add an explicit git commit && git push step yourself.
FAQ
Is couplingguard free?
Yes — entirely. MIT licensed, no paid tier, no signup, no hosted service. The Action runs on your own runner; your code never leaves your CI.
What is file coupling and why should I care?
File coupling is when two files in your repository historically change together. Tightly coupled files almost always need to be modified in the same PR, but reviewers can't see the relationship from the diff alone. Coupling is one of the strongest predictors of regression risk: changing one half of a coupled pair without the other is how production incidents start. Adam Tornhill's Your Code as a Crime Scene covers the research; couplingguard operationalizes it at PR time.
How does normalization work?
A pair where a.py was touched 100 times, b.py 5 times, and both together 5 times is not the same as a pair where both were touched 5 times each. Raw count = 5 in both cases. Normalized:
- 5 / max(100, 5) = 0.05 → noise (file_a changes for many reasons)
- 5 / max(5, 5) = 1.00 → genuine coupling (whenever one changes, so does the other)
The formula is score = co_changes / max(file_a_total_changes, file_b_total_changes). It produces a 0–1 ratio comparable across repos of any size and age.
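
The formula and the two cases above translate directly into a few lines of Python. The function names and the dictionary shapes are assumptions for illustration; only the formula itself and the min_occurrences default of 3 come from this README.

```python
def coupling_score(co_changes: int, total_a: int, total_b: int) -> float:
    """score = co_changes / max(file_a_total_changes, file_b_total_changes)."""
    denominator = max(total_a, total_b)
    return co_changes / denominator if denominator else 0.0


def score_pairs(file_counts: dict[str, int],
                pair_counts: dict[tuple[str, str], int],
                min_occurrences: int = 3) -> dict[tuple[str, str], float]:
    """Score every pair that co-changed at least min_occurrences times."""
    scored = {}
    for (a, b), co_changes in pair_counts.items():
        if co_changes < min_occurrences:
            continue  # too few co-changes to count as a signal
        scored[(a, b)] = coupling_score(co_changes, file_counts[a], file_counts[b])
    return scored


print(coupling_score(5, 100, 5))  # 0.05, the noisy pair
print(coupling_score(5, 5, 5))    # 1.0, the genuinely coupled pair
```

Dividing by the larger of the two totals is what lets a frequently edited file drag the score down: the pair only scores high when both files change mostly together.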
📋 See the coupling cheatsheet for the full math, default thresholds, common couplings to look for, and per-repo-type tuning (solo, small team, monorepo, mature OSS library).
Why does couplingguard need fetch-depth: 0?
Default actions/checkout@v4 does a shallow clone (depth=1). couplingguard needs the full git log to count co-changes across the configurable lookback_days window. If you forget, the Action exits 1 with an actionable error (E001) rather than producing wrong results from a truncated history.
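
One way a shallow clone can be detected is sketched below in Python. It relies on the fact that git records shallow boundaries in a `.git/shallow` file; the function names and the error wording are hypothetical, and couplingguard may well use a different check (for example `git rev-parse --is-shallow-repository`). The sketch also assumes `.git` is a directory, which is not the case for linked worktrees or submodules.

```python
from pathlib import Path


def is_shallow_clone(repo: Path) -> bool:
    """Return True when git has recorded a shallow boundary for this repo."""
    return (repo / ".git" / "shallow").is_file()


def require_full_history(repo: Path) -> None:
    """Abort with a non-zero exit instead of analyzing truncated history."""
    if is_shallow_clone(repo):
        # Hypothetical wording; the real E001 message may differ.
        raise SystemExit(
            'E001: shallow clone detected; set fetch-depth: 0 (GitHub) '
            'or GIT_DEPTH: "0" (GitLab)'
        )
```

Raising `SystemExit` with a string prints the message to stderr and exits with status 1, which matches the fail-fast behaviour described above.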
Does couplingguard work on monorepos?
Yes. For repos with 50+ committers and 10k+ commits in the window:
- Use exclude to drop noisy paths (docs, migrations, generated code, lockfiles).
- Bump min_occurrences to 5+ to filter out rare pairs.
- Lower lookback_days to 60; recent coupling is more actionable than ancient.
The matrix builder is O(commits × avg_files_per_commit²) which is sub-second for ≤50k commits in the lookback window.
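
As an illustration of how a newline-separated exclude value might be applied, here is a minimal Python sketch using the standard library's fnmatch module. The helper names are hypothetical, and fnmatch semantics (where `*` also matches `/`) are an assumption; the glob dialect couplingguard actually accepts may differ.

```python
from fnmatch import fnmatchcase


def parse_patterns(raw: str) -> list[str]:
    """Turn a newline-separated input string into a clean pattern list."""
    return [line.strip() for line in raw.splitlines() if line.strip()]


def is_excluded(path: str, patterns: list[str]) -> bool:
    """Return True when the path matches any exclude pattern."""
    return any(fnmatchcase(path, pattern) for pattern in patterns)


def filter_commit(files: list[str], patterns: list[str]) -> list[str]:
    """Drop excluded paths before they reach the co-change matrix."""
    return [f for f in files if not is_excluded(f, patterns)]
```

Filtering before the matrix is built matters for the cost estimate above: a lockfile that appears in most commits would otherwise pair with every other file in each of them.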
What if my repo has fewer than min_occurrences commits?
The Action posts an informational comment ("not enough git history in lookback window") and exits 0. No false failures on new repos. The threshold under which this kicks in is min_occurrences, which defaults to 3.
How is couplingguard different from CODEOWNERS?
CODEOWNERS encodes static file-ownership: "this team reviews these paths." couplingguard encodes dynamic co-change risk: "these files have historically broken together." The two are complementary — couplingguard reads your CODEOWNERS file and suggests owners of coupled files who aren't already on the PR, on top of GitHub's normal review-request flow.
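
The reviewer suggestion step can be approximated with the Python sketch below. It is deliberately simplified: the pattern matching handles only trailing-slash directories and basic fnmatch globs rather than the full gitignore-style rules CODEOWNERS uses, and all function names are assumptions for this example, not couplingguard's API.

```python
from fnmatch import fnmatchcase


def parse_codeowners(text: str) -> list[tuple[str, list[str]]]:
    """Parse 'pattern owner...' lines, skipping blanks and comments."""
    rules = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        pattern, *owners = line.split()
        rules.append((pattern, owners))
    return rules


def _matches(pattern: str, path: str) -> bool:
    """Simplified matcher: directory prefixes and fnmatch globs only."""
    pattern = pattern.lstrip("/")
    if pattern.endswith("/"):
        return path.startswith(pattern)
    return fnmatchcase(path, pattern) or fnmatchcase(path.rsplit("/", 1)[-1], pattern)


def owners_for(path: str, rules: list[tuple[str, list[str]]]) -> list[str]:
    """Return the owners of a path; the last matching rule wins."""
    owners: list[str] = []
    for pattern, candidates in rules:
        if _matches(pattern, path):
            owners = candidates
    return owners


def suggest_reviewers(coupled_files: list[str],
                      rules: list[tuple[str, list[str]]],
                      already_on_pr: set[str]) -> list[str]:
    """Collect owners of coupled files who are not already on the PR."""
    suggestions: list[str] = []
    for path in coupled_files:
        for owner in owners_for(path, rules):
            if owner not in already_on_pr and owner not in suggestions:
                suggestions.append(owner)
    return suggestions
```

Because only names found in the parsed rules can ever be returned, the suggestions stay limited to usernames that actually appear in the CODEOWNERS file.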
Does it work for AI-coded PRs?
That's the primary use case. AI coding agents (Copilot, Claude Code, Cursor) routinely produce PRs touching 15-30 files at once. A human wrote the PR description, but no human held the entire change in their head as a unified mental model. couplingguard is the cheapest backstop: a comment that surfaces the files the agent should have touched but didn't.
What does couplingguard NOT do?
- ❌ Predict bugs (it's a historical signal, not a model)
- ❌ Replace CODEOWNERS (complementary)
- ❌ Modify your code (read-only on the working tree)
- ❌ Send your code anywhere (analysis runs entirely on your runner)
- ❌ Support Bitbucket or Azure DevOps in v0.1 (GitHub + GitLab only)
How couplingguard compares
| | couplingguard | CodeScene | code-maat | Danger.js | CODEOWNERS |
|---|---|---|---|---|---|
| Posts a comment on every PR | ✅ | ✅ | ❌ (CSV only) | ⚙️ (write your own) | ❌ |
| Normalized co-change scoring | ✅ | ✅ | ❌ | ❌ | ❌ |
| Suggests reviewers from CODEOWNERS | ✅ | ❌ | ❌ | ⚙️ | ❌ |
| Optional CI failure gate | ✅ | ✅ | ❌ | ⚙️ | ❌ |
| Re-push delta line (🟡 0.45 → 🔴 0.82) | ✅ | ❌ | ❌ | ❌ | ❌ |
| GitLab CI support | ✅ | ✅ | ❌ | ✅ | ❌ |
| No hosted service / no signup | ✅ | ❌ | ✅ | ✅ | ✅ |
| Open source | ✅ MIT | ❌ commercial | ✅ GPLv3 | ✅ MIT | (native GitHub feature) |
| Cost | Free | Per-seat license | Free | Free | Free |
| Install effort | 5 lines YAML | Hosted onboarding | CLI + scripting | Framework + scripts | One file |
Bottom line. CODEOWNERS encodes static ownership; couplingguard adds dynamic co-change signal. They're complementary — couplingguard uses CODEOWNERS to suggest better reviewers for the files historically coupled to your PR's files. code-maat (the original normalized co-change CLI from Adam Tornhill's Your Code as a Crime Scene) runs after the fact; couplingguard runs at PR open, when the fix is still cheap.
Limitations
Known constraints in v0.1.1:
- Shallow clones are rejected. Detected and surfaced as error E001 with an actionable message. Add fetch-depth: 0 (GitHub) or GIT_DEPTH: "0" (GitLab).
- PR file cap at 200. PRs touching more than 200 files are truncated with a warning. The pairs analysis is O(200 × matrix_size), so this is a deliberate ceiling.
- No auto-commit of dashboard files. publish_dashboard: true produces an artifact; pushing the score JSON back to main for badge updates is on the v0.2 roadmap.
- GitLab self-managed not officially tested. Should work via CI_SERVER_URL but only verified against gitlab.com.
- Bitbucket / Azure DevOps: not supported yet. Open an issue if you want to vote it up the roadmap.
Contributing
See CONTRIBUTING.md. Bugs → Issues. Security → SECURITY.md.
License
MIT. See LICENSE.