CI quality gate that scores pull requests by how hard they are to review

Maintainers

kirvolque

reviewability

A CI/CD quality gate that scores pull requests by how hard they are to review.

Catch diffs that are too large, too tangled, or too scattered to review safely — before they merge.

It doesn't matter how fast AI generates code — the bottleneck is the human reviewer.

Installation

pip install reviewability

Requires Python 3.12+.

The Idea

A pull request can be hard to review not because the code is poorly written, but because of how the changes are combined. Mixing renames, movements, and logic changes in one PR makes each harder to verify. This is especially common with AI-generated code. Unlike linters, Reviewability does not analyze the code — only how the changes are structured.

When a diff scores low, the typical remedies are splitting it into focused pull requests or deferring non-essential changes.

Reviewability computes metrics at the level of individual hunks, files, and the whole diff, feeding into Reviewability Scores (0.0 = hardest, 1.0 = easiest) with configurable thresholds for what counts as problematic.

Key Concepts

Hunk: a contiguous block of changes within a single file (the smallest unit of analysis)
Metric: a calculated value attached to a hunk, a file, or the whole diff
Score: a float in the range [0.0, 1.0] representing reviewability at the hunk, file, or diff level (0.0 = hardest to review, 1.0 = easiest)

Extensibility

The metric system is designed to be extended:

Add a metric — subclass HunkMetric, FileMetric, or OverallMetric, implement calculate(), register via registry.add()
Adjust scoring — provide a custom ReviewabilityScorer implementation
Adjust thresholds — edit the default config or provide your own reviewability.toml
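
As a rough illustration of the "Add a metric" step, the sketch below uses local stand-in classes instead of the package's real ones. The `HunkMetric` base class shown here, the `calculate()` signature, the `Registry` class, and the `hunk.comment_ratio` metric are all hypothetical; check the package source for the actual interfaces before relying on any of them.

```python
from abc import ABC, abstractmethod


class HunkMetric(ABC):
    """Stand-in base class (hypothetical, not imported from reviewability)."""

    name: str

    @abstractmethod
    def calculate(self, hunk_lines: list[str]) -> float:
        """Return the metric value for one hunk, given its raw diff lines."""


class Registry:
    """Stand-in registry (hypothetical) that collects metric instances."""

    def __init__(self) -> None:
        self.metrics: list[HunkMetric] = []

    def add(self, metric: HunkMetric) -> None:
        self.metrics.append(metric)


class CommentRatio(HunkMetric):
    """Example custom metric: share of added lines that are '#' comments."""

    name = "hunk.comment_ratio"

    def calculate(self, hunk_lines: list[str]) -> float:
        added = [line[1:].strip() for line in hunk_lines if line.startswith("+")]
        if not added:
            return 0.0
        return sum(text.startswith("#") for text in added) / len(added)


registry = Registry()
registry.add(CommentRatio())

hunk = ["+# explain the change", "+x = 1", " y = 2", "-z = 3"]
values = {m.name: m.calculate(hunk) for m in registry.metrics}
print(values)  # {'hunk.comment_ratio': 0.5}
```

The point of the pattern is that a metric is a small named object with a single `calculate()` method, so adding one should not require touching the scorer.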

Usage

# Analyze a range of commits
reviewability HEAD~1 HEAD

# Analyze from stdin
git diff HEAD~1 | reviewability --from-stdin

# Use a custom config
reviewability --config path/to/reviewability.toml HEAD~1 HEAD

# Include per-file and per-hunk breakdowns
reviewability --detailed HEAD~1 HEAD

Output is JSON. Exit code is 0 if the gate passes, 1 if it fails.
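
For scripts that wrap the CLI, a minimal Python sketch of the exit-code contract might look like the following. It relies only on what is stated above (positional commit range, JSON on stdout, exit code 0 or 1). The `run_gate` helper and its `runner` parameter are illustrative names, and the structure of the JSON report is not assumed.

```python
import json
import subprocess
from typing import Any, Callable


def run_gate(
    base: str,
    head: str,
    runner: Callable[..., Any] = subprocess.run,
) -> tuple[bool, Any]:
    """Run `reviewability BASE HEAD` and return (passed, parsed JSON report)."""
    result = runner(
        ["reviewability", base, head],
        capture_output=True,
        text=True,
        check=False,  # exit code 1 means "gate failed", not a crash
    )
    # Raises json.JSONDecodeError if the tool printed something other than JSON.
    report = json.loads(result.stdout)
    return result.returncode == 0, report
```

Calling `run_gate("HEAD~1", "HEAD")` needs the `reviewability` executable on `PATH`. The `runner` argument exists only so the wrapper can be exercised in tests without the tool installed.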

Claude Code Skill

If you use Claude Code, a /reviewability skill is included. It runs the tool on the current diff, summarizes the results, and attempts to address any recommendations directly.

Configuration

All thresholds and limits are configured via a single reviewability.toml file. The tool looks for it in the current directory, or you can specify a path explicitly:

reviewability -c path/to/reviewability.toml HEAD~1 HEAD

If no config file is found, the built-in default config is used. To change the defaults, either edit that built-in file directly or copy it into your project root and modify the copy. A custom config must contain all mandatory fields; it is not merged with the defaults.

# Scores below these thresholds mark hunks/files as problematic
hunk_score_threshold = 0.5
file_score_threshold = 0.5

# Size limits (used for score normalisation)
max_diff_lines = 500
max_hunk_lines = 50

# Gate: fail if overall score drops below this (provisional, based on calibration)
min_overall_score = 0.7

# Optional limits (remove a line to disable that check)
max_problematic_hunks = 3
max_problematic_files = 2
max_file_hunk_count = 5
max_files_changed = 10
max_added_lines = 400

[movement_detection]
hunk_min_lines = 8
file_min_lines = 15
similarity_threshold = 0.95

Movement Detection

Moved code is easy to review — the logic hasn't changed, only the location. The tool detects when a block of code is deleted from one place and inserted elsewhere (accounting for reindentation and package/import changes), and treats those hunks and files as relocations.

Relocations receive a perfect score and are excluded from the size calculations that drive the overall score. A diff that is large only because of relocations is not penalized.
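
To make the idea concrete, here is a simplified sketch of relocation detection using `difflib.SequenceMatcher` from the standard library. It is not the tool's actual algorithm: it only ignores indentation and blank lines, does not handle package/import changes, and the `looks_moved` function is a hypothetical name. The default arguments mirror `hunk_min_lines` and `similarity_threshold` from the sample config.

```python
from difflib import SequenceMatcher


def looks_moved(
    removed: list[str],
    added: list[str],
    min_lines: int = 8,
    similarity_threshold: float = 0.95,
) -> bool:
    """Heuristic: is `added` a relocated copy of `removed`?"""
    # Ignore indentation and blank lines so reindented code still matches.
    old = [line.strip() for line in removed if line.strip()]
    new = [line.strip() for line in added if line.strip()]
    if min(len(old), len(new)) < min_lines:
        return False  # too short to call it a relocation with any confidence
    ratio = SequenceMatcher(None, old, new, autojunk=False).ratio()
    return ratio >= similarity_threshold


block = [f"    step_{i}()" for i in range(10)]
reindented = [f"        step_{i}()" for i in range(10)]
rewritten = [f"    other_{i}()" for i in range(10)]

print(looks_moved(block, reindented))  # True
print(looks_moved(block, rewritten))   # False
```

A minimum block length keeps trivial matches (a lone closing brace, a repeated import) from being counted as moves.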

Metrics

Metrics are calculated at three levels: hunk, file, and overall diff.

Hunk-level

Metric	Description
`hunk.lines_changed`	Total lines added and removed in a hunk
`hunk.added_lines`	Lines added in a hunk
`hunk.removed_lines`	Lines removed in a hunk
`hunk.context_lines`	Unchanged context lines surrounding the change
`hunk.change_balance`	Ratio of added lines to total changed lines (0.0 = pure deletion, 1.0 = pure addition)
`hunk.is_likely_moved`	Whether this hunk is a movement of code from another location
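
The size metrics in this table can be derived directly from a hunk's lines in unified-diff format. The sketch below assumes the input is the hunk body only (no `@@` header and no `+++`/`---` file headers); the `hunk_metrics` function name and the neutral 0.5 balance for a hunk with no changed lines are assumptions, not documented behaviour.

```python
def hunk_metrics(hunk_body: list[str]) -> dict[str, float]:
    """Compute size metrics from the body lines of one unified-diff hunk."""
    added = sum(1 for line in hunk_body if line.startswith("+"))
    removed = sum(1 for line in hunk_body if line.startswith("-"))
    context = sum(1 for line in hunk_body if line.startswith(" "))
    changed = added + removed
    return {
        "hunk.lines_changed": changed,
        "hunk.added_lines": added,
        "hunk.removed_lines": removed,
        "hunk.context_lines": context,
        # Assumption: a hunk with no changed lines gets a neutral 0.5.
        "hunk.change_balance": added / changed if changed else 0.5,
    }


body = [
    " def f():",
    "-    return 1",
    "+    x = 1",
    "+    y = 2",
    "+    return x + y",
    " ",
]
print(hunk_metrics(body)["hunk.change_balance"])  # 0.75
```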

File-level

Metric	Description
`file.lines_changed`	Total lines added and removed across all hunks in a file
`file.added_lines`	Total lines added in a file
`file.removed_lines`	Total lines removed in a file
`file.hunk_count`	Number of separate change regions in a file
`file.max_hunk_lines`	Lines changed in the largest single hunk within a file
`file.is_likely_moved`	Whether this file is a movement from another path
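
File-level size metrics are aggregates over a file's hunks. A minimal sketch, assuming each hunk is already reduced to an `(added, removed)` pair of line counts (the `file_metrics` helper is an illustrative name):

```python
def file_metrics(hunks: list[tuple[int, int]]) -> dict[str, int]:
    """Aggregate per-hunk (added, removed) line counts into file metrics."""
    added = sum(a for a, _ in hunks)
    removed = sum(r for _, r in hunks)
    return {
        "file.lines_changed": added + removed,
        "file.added_lines": added,
        "file.removed_lines": removed,
        "file.hunk_count": len(hunks),
        "file.max_hunk_lines": max((a + r for a, r in hunks), default=0),
    }


# Three hunks: a small edit, a pure addition, and a pure deletion.
print(file_metrics([(3, 1), (10, 0), (0, 6)]))
```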

Overall-level

Metric	Description
`overall.lines_changed`	Total lines changed across the entire diff
`overall.added_lines`	Total lines added across the entire diff
`overall.removed_lines`	Total lines removed across the entire diff
`overall.files_changed`	Number of files changed
`overall.moved_lines`	Total lines in hunks identified as code movements
`overall.change_entropy`	Shannon entropy of the distribution of changes across files
`overall.largest_file_ratio`	Fraction of total diff lines in the most-changed file
`overall.scatter_factor`	Normalized entropy of how changes are distributed across files (0.0 = all in one file, 1.0 = evenly spread)
`overall.problematic_hunk_count`	Hunks with a score below the configured threshold
`overall.problematic_file_count`	Files with a score below the configured threshold
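
The three distribution metrics can be computed from a map of file path to lines changed. The sketch below is one plausible reading of the descriptions above: the logarithm base (2) is an assumption, normalization divides by the maximum entropy for the number of changed files, and `distribution_metrics` is a hypothetical helper name.

```python
import math


def distribution_metrics(lines_per_file: dict[str, int]) -> dict[str, float]:
    """Entropy-based view of how changed lines are spread across files."""
    counts = [n for n in lines_per_file.values() if n > 0]
    total = sum(counts)
    if total == 0:
        return {
            "overall.change_entropy": 0.0,
            "overall.largest_file_ratio": 0.0,
            "overall.scatter_factor": 0.0,
        }
    shares = [n / total for n in counts]
    # Shannon entropy in bits (the log base is an assumption).
    entropy = sum(-p * math.log2(p) for p in shares)
    # Entropy is at most log2(number of files); one file means no scatter.
    max_entropy = math.log2(len(counts)) if len(counts) > 1 else 0.0
    return {
        "overall.change_entropy": entropy,
        "overall.largest_file_ratio": max(shares),
        "overall.scatter_factor": entropy / max_entropy if max_entropy else 0.0,
    }


even = distribution_metrics({"a.py": 10, "b.py": 10, "c.py": 10, "d.py": 10})
focused = distribution_metrics({"a.py": 40})
print(even["overall.scatter_factor"], focused["overall.scatter_factor"])
```

Normalizing by the maximum entropy makes the scatter factor independent of the log base, so only `overall.change_entropy` depends on that assumption.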

Overall Scoring

score = max(0, 1 − effective_size_ratio × (1 + scatter_factor))

effective_size_ratio = (lines_changed − moved_lines) / max_diff_lines   [capped at 1.0]

The score is driven by effective diff size and scatter. Moved lines are excluded from the size count — relocations are easy to review and should not penalize the score.

scatter_factor measures how evenly changes are spread across files (normalized entropy, 0.0 = all in one file, 1.0 = evenly spread). It amplifies the size penalty: a large diff that touches many files evenly scores worse than an equally large diff concentrated in a few files.

A large but focused diff (e.g. a bulk rename in one file) and a small but scattered diff each score better than a diff that is both large and scattered.
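
The formula is short enough to transcribe directly. The sketch below follows the two equations above, with `max_diff_lines` taken from the sample config (500); the `overall_score` function name is illustrative, and the inputs are chosen so the arithmetic is exact.

```python
def overall_score(
    lines_changed: int,
    moved_lines: int,
    scatter_factor: float,
    max_diff_lines: int = 500,
) -> float:
    """Overall reviewability score: 0.0 = hardest, 1.0 = easiest."""
    effective_size_ratio = min(
        1.0, (lines_changed - moved_lines) / max_diff_lines
    )
    return max(0.0, 1.0 - effective_size_ratio * (1.0 + scatter_factor))


# 350 changed lines, 100 of them relocations, all in one file.
print(overall_score(350, 100, scatter_factor=0.0))  # 0.5
# The same size spread evenly across files: the penalty doubles.
print(overall_score(350, 100, scatter_factor=1.0))  # 0.0
# A diff made up entirely of relocations is not penalized.
print(overall_score(600, 600, scatter_factor=1.0))  # 1.0
```

With `min_overall_score = 0.7` from the sample config, the first two diffs would fail the gate and the third would pass.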

Validation

The scoring formula was calibrated against ~2,000 pull requests from 15 permissively licensed open-source repositories. Ground truth labels were derived from review outcomes (change requests, revision cycles, comment density). Metrics that did not improve prediction over a naive size baseline were removed from the formula.

Research

Metrics are informed by peer-reviewed research on code review effectiveness. Most are heuristics derived from research concepts rather than direct paper-defined variables:

Jureczko et al. — Code review effectiveness: an empirical study on selected factors influence (IET Software, 2021) https://doi.org/10.1049/iet-sen.2020.0134
McIntosh et al. — An Empirical Study of the Impact of Modern Code Review Practices on Software Quality (EMSE, 2015) https://doi.org/10.1007/s10664-015-9381-9
Fregnan et al. — First Come First Served: The Impact of File Position on Code Review (EMSE, 2022) https://doi.org/10.1007/s10664-021-10034-0
Uchôa et al. — Predicting Design Impactful Changes in Modern Code Review (MSR, 2020) https://doi.org/10.1145/3379597.3387480
Baum et al. — The Choice of Code Review Process: A Survey on the State of the Practice (EMSE, 2019) https://doi.org/10.1007/s10664-018-9657-6
Hijazi et al. — Using Biometric Data to Measure Code Review Quality (TSE, 2021) https://doi.org/10.1109/TSE.2020.2992169
Olewicki et al. — Towards Better Code Reviews: Using Mutation Testing to Prioritise Code Changes (2024) https://arxiv.org/abs/2402.01860
Barnett et al. — Helping Developers Help Themselves: Automatic Decomposition of Code Review Changesets (ICSE, 2015) https://doi.org/10.1109/ICSE.2015.35
Brito & Valente — RAID — Refactoring-Aware and Intelligent Diffs (2021) https://doi.org/10.1109/ICSME52107.2021.00037
Hu & Pradel — CodeMapper: Mapping and Analyzing Code Changes across Commits (ICSE, 2026)

Release history

Version	Released
0.3.0	Mar 29, 2026
0.2.1	Mar 18, 2026
0.2.0	Mar 17, 2026
0.1.0 (this version)	Mar 17, 2026
0.1.0a3 (pre-release)	Mar 15, 2026
0.1.0a2 (pre-release)	Mar 15, 2026
0.1.0a1 (pre-release)	Mar 14, 2026

Download files

Source Distribution

reviewability-0.1.0.tar.gz (28.1 kB)

Uploaded Mar 17, 2026 Source

Built Distribution

reviewability-0.1.0-py3-none-any.whl (42.1 kB)

Uploaded Mar 17, 2026 Python 3

File details

Details for the file reviewability-0.1.0.tar.gz.

File metadata

Download URL: reviewability-0.1.0.tar.gz
Upload date: Mar 17, 2026
Size: 28.1 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for reviewability-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`362c2e80489cf9ebc217a53c60471d75365d56dfab5974809f34e98d8e668590`
MD5	`e520f8bec0efbe36fe2cd437a5e52285`
BLAKE2b-256	`78d182d7ddf191787bf119d6ef3853b8d2e0112814519b0bd73d80b26d85222e`

Provenance

The following attestation bundles were made for reviewability-0.1.0.tar.gz:

Publisher: publish.yml on Kirvolque/reviewability

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: reviewability-0.1.0.tar.gz
- Subject digest: 362c2e80489cf9ebc217a53c60471d75365d56dfab5974809f34e98d8e668590
- Sigstore transparency entry: 1115430332
- Sigstore integration time: Mar 17, 2026
Source repository:
- Permalink: Kirvolque/reviewability@bf64f02ce303289bd6d4223895ff20b837d05850
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/Kirvolque
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@bf64f02ce303289bd6d4223895ff20b837d05850
- Trigger Event: push

File details

Details for the file reviewability-0.1.0-py3-none-any.whl.

File metadata

Download URL: reviewability-0.1.0-py3-none-any.whl
Upload date: Mar 17, 2026
Size: 42.1 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for reviewability-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`f6296ddfa520df56096def46a1b94eb08c8b4999ca4457945aad48d1c7fe9ed8`
MD5	`8032752d5d6fa792bebd67cba1c8bb7d`
BLAKE2b-256	`f0dc8440a9497c0dc587d5d1e761a4343f28b914a499784a48a95d24b94c29be`

Provenance

The following attestation bundles were made for reviewability-0.1.0-py3-none-any.whl:

Publisher: publish.yml on Kirvolque/reviewability

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: reviewability-0.1.0-py3-none-any.whl
- Subject digest: f6296ddfa520df56096def46a1b94eb08c8b4999ca4457945aad48d1c7fe9ed8
- Sigstore transparency entry: 1115430339
- Sigstore integration time: Mar 17, 2026
Source repository:
- Permalink: Kirvolque/reviewability@bf64f02ce303289bd6d4223895ff20b837d05850
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/Kirvolque
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@bf64f02ce303289bd6d4223895ff20b837d05850
- Trigger Event: push
