Evolutionary Quality Metric for source code

Project description

Evolutionary Quality Metric (EQM)

EQM scores the functions in a git repository by how strongly they have been preserved under pressure to change. Code that survives many opportunities to be modified, especially when callers depended on it remaining stable, scores high. Code that churns frequently or is rarely referenced by the rest of the codebase scores low.

The metric is inspired by purifying selection in molecular evolution.

$$\text{EQM}(f) = \mathrm{LCB}_p\!\left[\mathrm{Beta}(\alpha + k,\; \beta + n - k)\right]$$

Each commit that touches a direct caller of f, or f itself, is a trial (n). If f did not change in that commit, it survived (k). α, β are Beta prior parameters (default 1, 1 — uniform). EQM is the lower credible bound of the resulting Beta posterior, penalising functions with few trials more aggressively than the posterior mean alone.

Quickstart

pip install eqm-score

# Analyze a repository (run once; subsequent commands are sub-second)
eqm analyze /path/to/your/repo

# Print per-function scores (JSONL, one object per function)
eqm score /path/to/your/repo

# Colorized heatmap in the terminal
eqm score /path/to/your/repo --format terminal

# Top 20 most-conserved functions
eqm top /path/to/your/repo --n 20

# Explain a single function's score
eqm explain /path/to/your/repo src/core/processor.py:42

# Debug what trials were counted for a function
eqm debug trials /path/to/your/repo my_module.MyClass.my_method

How scoring works

A trial for function f is triggered when:

A direct caller of f had a nonsynonymous change in a commit (caller pressure), or
f itself had a nonsynonymous change in a commit (direct mutation)

Both triggers in the same commit count as one trial. The trial is synonymous (survived) if f did not change, and nonsynonymous (mutated) if f changed. Change detection first normalizes the function's token sequence before comparing: local variable names, parameter names, string/integer literals, docstrings, and comments are all stripped or replaced with type tokens, so edits confined to them do not register as a change.
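The normalization idea can be sketched with the standard library. This is a minimal illustration, not EQM's implementation: it works on the `ast` tree rather than a token sequence, and the helper name `normalized_dump` is hypothetical.

```python
import ast


def normalized_dump(source: str) -> str:
    """Return a canonical string for the first function defined in `source`."""
    func = next(n for n in ast.walk(ast.parse(source))
                if isinstance(n, (ast.FunctionDef, ast.AsyncFunctionDef)))

    # Drop a leading docstring; comments are already discarded by the parser.
    if ast.get_docstring(func, clean=False) is not None:
        func.body = func.body[1:] or [ast.Pass()]

    # Names bound inside the function: parameters and assignment targets.
    local_names = {n.arg for n in ast.walk(func) if isinstance(n, ast.arg)}
    local_names |= {n.id for n in ast.walk(func)
                    if isinstance(n, ast.Name) and isinstance(n.ctx, ast.Store)}

    placeholders: dict[str, str] = {}

    def placeholder(name: str) -> str:
        # Number local names by first appearance: v0, v1, ...
        return placeholders.setdefault(name, f"v{len(placeholders)}")

    class Normalizer(ast.NodeTransformer):
        def visit_arg(self, node):
            node.arg = placeholder(node.arg)
            return node

        def visit_Name(self, node):
            if node.id in local_names:
                node.id = placeholder(node.id)
            return node

        def visit_Constant(self, node):
            # Keep only the literal's type, e.g. 3 -> "int", "hi" -> "str".
            return ast.Constant(value=type(node.value).__name__)

    return ast.dump(Normalizer().visit(func))


a = "def area(w, h):\n    '''Area.'''\n    scale = 2  # factor\n    return w * h * scale\n"
b = "def area(width, height):\n    k = 10\n    return width * height * k\n"
c = "def area(w, h):\n    scale = 2\n    return w + h * scale\n"

print(normalized_dump(a) == normalized_dump(b))  # True: only names, literal, docstring differ
print(normalized_dump(a) == normalized_dump(c))  # False: the expression changed
```

Under this sketch, `a` and `b` would be a synonymous pair and `a` and `c` a nonsynonymous one.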

The following cases count as a direct caller:

Same-file direct call
From-import cross-file call
Module-attribute cross-file call
Intra-class self/cls

Some known limitations:

self.method() via inheritance or other instances
Star imports (from module import *)
Dynamic dispatch: calls through variables (fn = get_fn(); fn()), getattr, or __call__
Ambiguous global names
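To make two of the supported cases concrete (same-file direct calls and intra-class self/cls calls), here is a rough sketch built on `ast`. The function `same_file_callers` is hypothetical and not part of the EQM API; cross-file resolution is left out entirely.

```python
import ast
from collections import defaultdict


def same_file_callers(source: str) -> dict[str, set[str]]:
    """Map each function defined in `source` to the functions that call it directly."""
    tree = ast.parse(source)
    func_types = (ast.FunctionDef, ast.AsyncFunctionDef)

    # Collect (qualified name, node, enclosing class or None) for each definition.
    defs = []
    for top in tree.body:
        if isinstance(top, func_types):
            defs.append((top.name, top, None))
        elif isinstance(top, ast.ClassDef):
            defs.extend((f"{top.name}.{m.name}", m, top.name)
                        for m in top.body if isinstance(m, func_types))
    known = {name for name, _, _ in defs}

    callers: dict[str, set[str]] = defaultdict(set)
    for qualname, node, class_name in defs:
        for call in ast.walk(node):
            if not isinstance(call, ast.Call):
                continue
            target = call.func
            callee = None
            if isinstance(target, ast.Name):
                callee = target.id                       # helper(x)
            elif (class_name and isinstance(target, ast.Attribute)
                  and isinstance(target.value, ast.Name)
                  and target.value.id in ("self", "cls")):
                callee = f"{class_name}.{target.attr}"   # self.size()
            # Calls through variables, builtins, and inherited methods are
            # not in `known`, so they are skipped.
            if callee in known:
                callers[callee].add(qualname)
    return dict(callers)


src = '''
def helper(x):
    return x + 1

def run(x):
    fn = helper
    return helper(x) + fn(x) + len(str(x))

class Box:
    def size(self):
        return 1

    def report(self):
        return self.size() + self.missing() + helper(0)
'''
print(same_file_callers(src))
```

The `fn(x)` call and `self.missing()` are ignored here, which mirrors the dynamic-dispatch and inheritance limitations listed above.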

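A function with many trials (high n) and a high survival rate gets a score near 1.0. A function with few trials stays close to the prior regardless of its survival rate: its posterior mean remains near the prior mean of 0.5 (maximally uncertain at n = 0), and the lower credible bound keeps its score low.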
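The trial rules in this section can be sketched as a small counting loop. The input shapes are assumptions made for illustration: each commit is reduced to the set of functions with a nonsynonymous change, and `callers_of` maps each function to its direct callers. Neither `Tally` nor `count_trials` comes from the package.

```python
from dataclasses import dataclass


@dataclass
class Tally:
    n: int = 0  # trials
    k: int = 0  # trials survived unchanged (synonymous)


def count_trials(commits: list[set[str]],
                 callers_of: dict[str, set[str]]) -> dict[str, Tally]:
    """Count trials (n) and survivals (k) per function.

    commits: for each commit, the functions with a nonsynonymous change.
    callers_of: function name -> names of its direct callers.
    """
    tallies = {f: Tally() for f in callers_of}
    for changed in commits:
        for f, callers in callers_of.items():
            mutated = f in changed               # direct mutation
            pressured = bool(callers & changed)  # caller pressure
            if mutated or pressured:             # both together are still one trial
                tallies[f].n += 1
                if not mutated:
                    tallies[f].k += 1
    return tallies


callers_of = {"parse": {"load", "main"}, "load": {"main"}, "main": set()}
commits = [{"main"}, {"load", "parse"}, {"load"}, {"main", "parse"}]

for name, tally in count_trials(commits, callers_of).items():
    print(name, tally)
```

In this toy history, `parse` faces four trials and survives two of them, while `main` has no callers and only accumulates trials when it changes itself.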

This library only supports Python at this time, though it's pretty easy to extend!

Concepts

Bernoulli Model

Each trial for function f is a Bernoulli event with unknown survival probability p. We place a Beta conjugate prior over p and observe k synonymous (survival) outcomes in n total trials:

$$\text{Prior:} \quad p \sim \mathrm{Beta}(\alpha, \beta)$$

$$\text{Posterior:} \quad p \mid k, n \sim \mathrm{Beta}(\alpha + k,\; \beta + n - k)$$

Because Beta is conjugate to the Binomial, the posterior parameters update by simple arithmetic. The posterior mean is:

$$\mu = \frac{\alpha + k}{\alpha + \beta + n}$$

where:

α, β — Beta prior parameters (default 1, 1 — uniform; prior mean = 0.5)
n — total trials for f
k — synonymous trials (commits where f survived unchanged)

n	k	μ
0	—	0.500 (no evidence)
5	5	0.857
10	10	0.917
100	100	0.990
10	8	0.750 (mutated 2/10)
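The table values follow directly from the formula for μ. A short check, assuming the default prior α = β = 1 (the function name `posterior_mean` is illustrative):

```python
def posterior_mean(k: int, n: int, alpha: float = 1.0, beta: float = 1.0) -> float:
    """Mean of the Beta(alpha + k, beta + n - k) posterior."""
    return (alpha + k) / (alpha + beta + n)


for n, k in [(0, 0), (5, 5), (10, 10), (100, 100), (10, 8)]:
    print(f"n={n:<4}k={k:<4}mean={posterior_mean(k, n):.3f}")
```

With n = 0 there are no outcomes, so k = 0 and the result is just the prior mean.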

EQM (lower credible bound)

$$\text{EQM}(f) = \mathrm{LCB}_p\!\left[\mathrm{Beta}(\alpha + k,\; \beta + n - k)\right]$$

EQM is the lower credible bound — the p-th quantile of the Beta posterior (default p = 0.05, i.e. the 95% one-sided LCB). This penalises functions with few trials relative to those with many, even if their observed survival rates are identical. As evidence accumulates, the LCB converges toward the true survival rate.
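As an illustration of the quantile definition, the sketch below computes the lower bound with only the standard library. It assumes integer posterior parameters (true for the default Beta(1, 1) prior), which lets the Beta CDF be written as a binomial tail sum, and it inverts the CDF by bisection. The function names are hypothetical, and a library routine such as SciPy's Beta quantile function would be the usual choice in practice.

```python
from math import comb


def beta_cdf_int(x: float, a: int, b: int) -> float:
    """CDF of Beta(a, b) at x for positive integers a and b.

    Uses the identity I_x(a, b) = P(Binomial(a + b - 1, x) >= a).
    """
    m = a + b - 1
    return sum(comb(m, j) * x**j * (1.0 - x) ** (m - j) for j in range(a, m + 1))


def beta_quantile_int(q: float, a: int, b: int, tol: float = 1e-10) -> float:
    """Invert the CDF by bisection on [0, 1]."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if beta_cdf_int(mid, a, b) < q:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def eqm_lcb(k: int, n: int, alpha: int = 1, beta: int = 1, p: float = 0.05) -> float:
    """p-th quantile of the Beta(alpha + k, beta + n - k) posterior."""
    return beta_quantile_int(p, alpha + k, beta + n - k)


# Same observed survival rate (100%), different amounts of evidence.
print(f"{eqm_lcb(5, 5):.3f}")      # 0.607
print(f"{eqm_lcb(100, 100):.3f}")  # 0.971
```

Both functions survived every trial, but the one with only five trials receives a noticeably lower bound.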

EQM is in (0, 1). A score near 1.0 means a function that rarely needed to change, either under caller pressure or on its own. A low score means either little evidence so far (uncertain) or a function that tends to mutate frequently.

EQM does not measure correctness, readability or code style, or test coverage. High-EQM code can be buggy; it's just stable buggy code.

CLI Reference

`eqm analyze`

Build or update the lineage and reference databases for a repository.

eqm analyze REPO_PATH [OPTIONS]

Arguments:

REPO_PATH — path to the git repository to analyze.

Options:

Option	Default	Description
`--ref`	`HEAD`	Git ref to analyze
`--since DATE`	(all time)	Only process commits since this ISO date
`--lang LANGS`	`python`	Comma-separated languages
`--cache PATH`	`.eqm-cache.db`	SQLite cache path
`--workers N`	4	Parallel workers
`--force`	false	Re-analyze already-processed commits

Example:

eqm analyze . --lang python --since 2023-01-01

`eqm score`

Emit per-line EQM scores from the analysis cache.

eqm score REPO_PATH [OPTIONS]

Options:

Option	Default	Description
`--file / -f`	(all)	Restrict to specific file(s)
`--format`	`jsonl`	Output format: `json`, `jsonl`, `terminal`
`--threshold`	0.0	Only emit lines with EQM ≥ threshold

JSON output schema (per line):

{
  "file": "src/foo/bar.py",
  "line": 42,
  "eqm": 0.917,
  "components": {
    "bayesian_survival": 0.917
  },
  "scope": {
    "function": "process_batch",
    "class": "BatchProcessor",
    "module": "foo.bar"
  },
  "scope_uuid": "fn:7a3b9c..."
}

This schema is the contract between v1 and the future VS Code extension. It will be kept stable across minor versions.
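A consumer of this output might look like the following sketch. It relies only on the fields shown in the schema; the sample records are made up, and the `null` class for a module-level function is an assumption rather than documented behaviour.

```python
import json


def load_scores(lines, threshold: float = 0.0) -> list[dict]:
    """Parse JSONL records and keep those with eqm >= threshold."""
    records = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        record = json.loads(line)
        if record["eqm"] >= threshold:
            records.append(record)
    return records


def describe(record: dict) -> str:
    """Format one record as 'module.Class.function (file:line): eqm'."""
    scope = record["scope"]
    parts = [scope.get("module"), scope.get("class"), scope.get("function")]
    name = ".".join(p for p in parts if p)
    return f"{name} ({record['file']}:{record['line']}): {record['eqm']:.3f}"


# Hypothetical sample data shaped like the schema above.
sample = [
    '{"file": "src/foo/bar.py", "line": 42, "eqm": 0.917, '
    '"components": {"bayesian_survival": 0.917}, '
    '"scope": {"function": "process_batch", "class": "BatchProcessor", '
    '"module": "foo.bar"}, "scope_uuid": "fn:example-1"}',
    '{"file": "src/foo/util.py", "line": 7, "eqm": 0.31, '
    '"components": {"bayesian_survival": 0.31}, '
    '"scope": {"function": "slugify", "class": null, '
    '"module": "foo.util"}, "scope_uuid": "fn:example-2"}',
]

for record in load_scores(sample, threshold=0.5):
    print(describe(record))
```

The same `load_scores` function accepts any iterable of lines, so it could read `sys.stdin` when the JSONL output of `eqm score` is piped into it.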

`eqm explain`

Print the score breakdown for a single source line.

eqm explain REPO_PATH FILE:LINE

Example:

eqm explain . src/core/processor.py:42

Prints a JSON object with the full component breakdown, lineage stats, and incoming reference count.

`eqm top`

List the top-N highest-EQM functions or classes.

eqm top REPO_PATH [OPTIONS]

Options:

Option	Default	Description
`--n`	50	Number of entries to show
`--scope`	`function`	Level: `function`, `class`, `module`

`eqm cache info`

Show cache statistics: row counts, DB size, last analysis timestamp.

eqm cache info REPO_PATH

`eqm cache clear`

Wipe all cached analysis data.

eqm cache clear REPO_PATH [--yes]

`eqm version`

Print the installed EQM version.

eqm version

Configuration

EQM reads configuration from pyproject.toml ([tool.eqm]) or .eqm.toml at the repository root. CLI flags take precedence.

[tool.eqm]
languages = ["python"]
exclude = ["tests/", "vendor/", "**/*.generated.*"]

[tool.eqm.weights]
# Beta prior on BayesianSurvival — default Beta(1,1) = uniform prior
# New code (n=0) starts at prior_alpha / (prior_alpha + prior_beta) = 0.5
prior_alpha = 1.0
prior_beta = 1.0
# Lower Credible Bound z-score: EQM = posterior_mean - lcb_z * posterior_std
# 1.645 = one-sided 95% lower bound. Higher values penalise low-n nodes more.
lcb_z = 1.645

[tool.eqm.cache]
path = ".eqm-cache.db"
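The comments on `lcb_z` describe the bound as `posterior_mean - lcb_z * posterior_std`. A minimal sketch of that arithmetic, using the standard mean and variance of a Beta distribution; the function name is hypothetical and the code is not taken from the package:

```python
from math import sqrt


def lcb_normal_approx(k: int, n: int, prior_alpha: float = 1.0,
                      prior_beta: float = 1.0, lcb_z: float = 1.645) -> float:
    """posterior_mean - lcb_z * posterior_std for Beta(prior_alpha + k, prior_beta + n - k)."""
    a = prior_alpha + k
    b = prior_beta + n - k
    mean = a / (a + b)
    std = sqrt(a * b / ((a + b) ** 2 * (a + b + 1)))
    return mean - lcb_z * std


# Same 100% survival rate at increasing n: the bound tightens toward the mean.
for n in (5, 10, 100):
    print(n, round(lcb_normal_approx(n, n), 3))
```

Raising `lcb_z` widens the gap between the mean and the bound, which is what the config comment means by penalising low-n nodes more.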

Development setup

# Requires: Python 3.11+, uv
git clone https://github.com/mskarlin/evolutionarily_quality_metric
cd evolutionarily_quality_metric

uv sync --group dev

# Run the fast test suite
uv run pytest -m "not slow" -x

# Run including end-to-end tests
uv run pytest

License

Apache-2.0

Project details

Maintainers

mskarlinski

This version

0.1.0

May 31, 2026

Download files

Source Distribution

eqm_score-0.1.0.tar.gz (57.6 kB)

Uploaded May 31, 2026 Source

Built Distribution

eqm_score-0.1.0-py3-none-any.whl (47.5 kB)

Uploaded May 31, 2026 Python 3

File details

Details for the file eqm_score-0.1.0.tar.gz.

File metadata

Download URL: eqm_score-0.1.0.tar.gz
Upload date: May 31, 2026
Size: 57.6 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for eqm_score-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`f1c3abdb08528d48817be44d29e8fe7c8bc6946a9d12faa1412e6c73d5c3183d`
MD5	`a479f0c72890dd9e21e9906fc88c80d2`
BLAKE2b-256	`54c0efd7be67fd66f56a1f2819ae8b4cfa0be24cc21e27d25ea3a44146b0b07a`

Provenance

The following attestation bundles were made for eqm_score-0.1.0.tar.gz:

Publisher: release.yml on mskarlin/evolutionarily_quality_metric

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: eqm_score-0.1.0.tar.gz
- Subject digest: f1c3abdb08528d48817be44d29e8fe7c8bc6946a9d12faa1412e6c73d5c3183d
- Sigstore transparency entry: 1685291835
- Sigstore integration time: May 31, 2026
Source repository:
- Permalink: mskarlin/evolutionarily_quality_metric@075eba9acab22595f0e395e63c834d425f0e4e1a
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/mskarlin
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@075eba9acab22595f0e395e63c834d425f0e4e1a
- Trigger Event: push

File details

Details for the file eqm_score-0.1.0-py3-none-any.whl.

File metadata

Download URL: eqm_score-0.1.0-py3-none-any.whl
Upload date: May 31, 2026
Size: 47.5 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for eqm_score-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`75e4eae3bc4cb47b5c627bbcbfa5fc91bd43e3c1cadc70831219329849f862ed`
MD5	`cb645331cbb5934b32e226963dbda89c`
BLAKE2b-256	`07b47b71cd162c6ac330527e5700b4a71a83f88fa70d2de3b18606f6377da406`

Provenance

The following attestation bundles were made for eqm_score-0.1.0-py3-none-any.whl:

Publisher: release.yml on mskarlin/evolutionarily_quality_metric

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: eqm_score-0.1.0-py3-none-any.whl
- Subject digest: 75e4eae3bc4cb47b5c627bbcbfa5fc91bd43e3c1cadc70831219329849f862ed
- Sigstore transparency entry: 1685291916
- Sigstore integration time: May 31, 2026
Source repository:
- Permalink: mskarlin/evolutionarily_quality_metric@075eba9acab22595f0e395e63c834d425f0e4e1a
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/mskarlin
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@075eba9acab22595f0e395e63c834d425f0e4e1a
- Trigger Event: push
