
Adjusted group-aware clinical prediction metrics with Python API and CLI.

metrics-adjuster

metrics-adjuster computes conventional and adjusted group-aware metrics for binary prediction models. It provides a typed Python API, a command-line interface, and an end-to-end synthetic demo for users who want to validate the workflow before running it on their own data.

The codebase is organized around a small functional core: IO lives at the boundaries, public options are captured in typed configuration objects, and tests describe the expected contracts for validation and output shape.

What it computes

For each requested adjusted metric, the package returns the conventional metric and the adjusted metric side by side:

Output file/key    Conventional column    Adjusted column
aTPR               TPR                    aTPR
aPPV               PPV                    aPPV
aNB                NB                     aNB
aHR                HR                     aHR
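
For orientation, the conventional companions in the table can be sketched from a confusion matrix at a single risk threshold. This is a minimal illustration using the textbook definitions of TPR, PPV, and net benefit, not the package's implementation. The function names and the choice of `risk >= threshold` as a positive prediction are assumptions, and HR is left out because this README does not define it.

```python
def confusion_counts(outcomes, risks, threshold):
    """Count TP, FP, FN, TN, treating risk >= threshold as a positive prediction."""
    tp = fp = fn = tn = 0
    for y, r in zip(outcomes, risks):
        predicted_positive = r >= threshold  # assumption: inclusive threshold
        if predicted_positive and y == 1:
            tp += 1
        elif predicted_positive and y == 0:
            fp += 1
        elif y == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def conventional_metrics(outcomes, risks, threshold):
    """Textbook TPR, PPV, and net benefit at a probability threshold in (0, 1)."""
    tp, fp, fn, tn = confusion_counts(outcomes, risks, threshold)
    n = tp + fp + fn + tn
    tpr = tp / (tp + fn) if (tp + fn) else float("nan")
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    # Net benefit weighs false positives by the odds of the threshold.
    nb = tp / n - (fp / n) * (threshold / (1 - threshold))
    return {"TPR": tpr, "PPV": ppv, "NB": nb}


example = conventional_metrics(
    outcomes=[1, 1, 0, 0, 1],
    risks=[0.9, 0.4, 0.6, 0.2, 0.7],
    threshold=0.5,
)
```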

Metrics are computed by group across caller-selected risk-score quantiles. The main pipeline is:

  1. validate typed configuration and input columns
  2. calibrate predicted risk within groups
  3. estimate density ratios against a reference group
  4. compute conventional and adjusted metrics at risk thresholds
  5. optionally bootstrap adjusted-metric uncertainty summaries
  6. optionally render a self-contained HTML report with tables and plots
  7. return metric tables through the API or write one CSV per metric through the CLI
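
To make steps 3 and 4 more concrete, one common way to use density ratios is importance weighting: each row is weighted by an estimated ratio against the reference group, and the metric is computed from the weighted counts. The sketch below shows that general idea for TPR only. Whether metrics-adjuster computes aTPR exactly this way is an assumption, and `weighted_tpr` is a hypothetical helper, not part of the package.

```python
def weighted_tpr(outcomes, risks, weights, threshold):
    """Importance-weighted true positive rate at a risk threshold.

    With all weights equal to 1 this reduces to the conventional TPR.
    """
    flagged = sum(
        w for y, r, w in zip(outcomes, risks, weights) if y == 1 and r >= threshold
    )
    positives = sum(w for y, w in zip(outcomes, weights) if y == 1)
    return flagged / positives if positives else float("nan")


outcomes = [1, 1, 1, 0]
risks = [0.9, 0.3, 0.7, 0.8]

# Unit weights: conventional TPR.
plain = weighted_tpr(outcomes, risks, [1, 1, 1, 1], threshold=0.5)

# Placeholder weights standing in for estimated density ratios.
adjusted = weighted_tpr(outcomes, risks, [2, 1, 1, 1], threshold=0.5)
```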

Installation

With uv

For local development from a checkout:

uv sync --extra dev

Parquet input support is included through pyarrow, so the public CLI/API path can read .parquet and .pq files without extra local setup.

Run commands inside the project environment with uv run:

uv run pytest
uv run metrics-adjuster --help

Install from GitHub into another uv-managed project:

uv add "metrics-adjuster @ git+https://github.com/SaehwanPark/metrics-adjuster.git"

With pip

For local development from a checkout:

python -m pip install -e '.[dev]'

Install from GitHub:

python -m pip install "metrics-adjuster @ git+https://github.com/SaehwanPark/metrics-adjuster.git"

For released versions, prefer installing from PyPI:

python -m pip install metrics-adjuster

Python API

from metrics_adjuster import (
  CalibrationConfig,
  ColumnSpec,
  DensityRatioConfig,
  MetricConfig,
  MetricName,
  adjusted_metrics,
)
from metrics_adjuster.synthetic import generate_synthetic_metrics_data

frame = generate_synthetic_metrics_data(n=600, seed=2026)
config = MetricConfig(
  columns=ColumnSpec(
    group="group",
    response="outcome",
    risk="risk",
    id="patient_id",
  ),
  ref_group="ref",
  quantiles=(0.2, 0.4, 0.6, 0.8),
  metrics=tuple(MetricName),
  calibration=CalibrationConfig(degree=2, cv=False),
  density_ratio=DensityRatioConfig(degree=1, cv=False),
  random_state=2026,
)

result = adjusted_metrics(frame, config)
print(result.metrics["aTPR"])

For an end-user report, call adjusted_metrics_report(...):

from metrics_adjuster import ReportConfig, ReportLabelConfig, adjusted_metrics_report

bundle = adjusted_metrics_report(
  frame,
  config,
  ReportConfig(
    title="Adjusted metrics report",
    x_scale="log_odds",
    labels=ReportLabelConfig(
      columns={"group": "Cohort group"},
      groups={"group": {"ref": "Reference group"}},
      metrics={"aTPR": "True positive rate"},
    ),
  ),
)
html = bundle.html
table = bundle.metric_table

In the first example, result.metrics["aTPR"] contains columns like:

group,quantile,tau,TPR,aTPR
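
The quantile and tau columns suggest that each requested quantile is mapped to a risk threshold. A minimal sketch of that mapping follows, assuming tau is the risk score at the given quantile with linear interpolation. The README does not state whether thresholds are taken from pooled, per-group, or reference-group scores, so the pooled choice here is an assumption, and `quantile_threshold` is a hypothetical helper.

```python
def quantile_threshold(values, q):
    """Quantile of values with linear interpolation between order statistics."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("values must not be empty")
    if not 0 <= q <= 1:
        raise ValueError("q must be in [0, 1]")
    position = q * (len(ordered) - 1)
    lower = int(position)  # floor, since position is non-negative
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] + fraction * (ordered[upper] - ordered[lower])


risks = [0.1, 0.2, 0.3, 0.4, 0.5]
# One threshold per requested quantile, computed on pooled scores (assumption).
thresholds = {q: quantile_threshold(risks, q) for q in (0.2, 0.4, 0.6, 0.8)}
```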

adjusted_metrics returns a MetricFrames object containing:

  • metrics: a dictionary from adjusted metric name to pandas.DataFrame
  • bootstrap: optional raw long-form bootstrap records when bootstrapping is enabled
  • as_dict(): a compatibility-shaped dictionary for older integrations
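
As a rough picture of what long-form bootstrap records can look like, the sketch below resamples rows with replacement, recomputes a conventional TPR each time, and summarizes the results with a percentile interval. It is a generic bootstrap under stated assumptions: the record keys `iteration`, `metric`, and `value` are hypothetical and are not taken from the package's actual bootstrap schema.

```python
import random


def tpr(outcomes, risks, threshold):
    """Conventional TPR: share of positive cases with risk >= threshold."""
    positive_risks = [r for y, r in zip(outcomes, risks) if y == 1]
    if not positive_risks:
        return float("nan")
    return sum(r >= threshold for r in positive_risks) / len(positive_risks)


def bootstrap_tpr_records(outcomes, risks, threshold, n_boot, seed):
    """Return one long-form record per bootstrap iteration."""
    rng = random.Random(seed)  # seeded so repeated runs give the same records
    n = len(outcomes)
    records = []
    for iteration in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # sample rows with replacement
        value = tpr([outcomes[i] for i in idx], [risks[i] for i in idx], threshold)
        records.append({"iteration": iteration, "metric": "TPR", "value": value})
    return records


def percentile_interval(values, lower=0.025, upper=0.975):
    """Rough percentile interval using lower-rank order statistics."""
    ordered = sorted(v for v in values if v == v)  # drop NaN values
    if not ordered:
        raise ValueError("no finite values to summarize")
    last = len(ordered) - 1
    return ordered[int(lower * last)], ordered[int(upper * last)]


outcomes = [1, 1, 1, 1, 0, 1, 1, 0, 1, 1]
risks = [0.9, 0.4, 0.7, 0.6, 0.3, 0.2, 0.8, 0.6, 0.55, 0.1]
records = bootstrap_tpr_records(outcomes, risks, threshold=0.5, n_boot=50, seed=2026)
interval = percentile_interval([r["value"] for r in records])
```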

Compatibility API

Existing callers can keep using the previous function-style entry point:

from metrics_adjuster import compute_adjusted_metrics
from metrics_adjuster.synthetic import generate_synthetic_metrics_data

frame = generate_synthetic_metrics_data(n=600, seed=2026)
result = compute_adjusted_metrics(
  df=frame,
  idvar="patient_id",
  group_col="group",
  ref_group="ref",
  response_col="outcome",
  orig_risk_col="risk",
  metrics=["aTPR"],
  quantiles=[0.2, 0.4, 0.6, 0.8],
  random_state=2026,
)

Historical import shims from the private development repository are preserved under legacy/ for migration research only and are not part of the supported v1 package.

CLI

Run the built-in synthetic demo:

uv run metrics-adjuster demo --output-dir demo_outputs

By default, the CLI computes only aTPR. Pass --metrics aTPR,aPPV,aNB,aHR to compute every supported metric.

Run on your own CSV or Parquet file:

uv run metrics-adjuster run \
  --input path/to/input.csv \
  --output-dir adjusted_metric_outputs \
  --group-col group \
  --ref-group ref \
  --response-col outcome \
  --risk-col risk \
  --id-col patient_id \
  --quantiles 0.2,0.4,0.6,0.8 \
  --metrics aTPR,aPPV,aNB,aHR \
  --seed 2026
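
The --quantiles and --metrics options in the command above take comma-separated values. A minimal sketch of how such values could be parsed and checked is shown here. The helper names are hypothetical, and the rule that quantiles must lie strictly between 0 and 1 is an assumption rather than documented CLI behavior.

```python
SUPPORTED_METRICS = ("aTPR", "aPPV", "aNB", "aHR")


def parse_quantiles(text):
    """Parse '0.2,0.4' into a tuple of floats, rejecting values outside (0, 1)."""
    values = tuple(float(part) for part in text.split(",") if part.strip())
    if not values:
        raise ValueError("at least one quantile is required")
    for value in values:
        if not 0 < value < 1:  # assumption: open interval
            raise ValueError(f"quantile out of range: {value}")
    return values


def parse_metrics(text):
    """Parse 'aTPR,aPPV' into a tuple of names, rejecting unknown metrics."""
    names = tuple(part.strip() for part in text.split(",") if part.strip())
    unknown = [name for name in names if name not in SUPPORTED_METRICS]
    if unknown:
        raise ValueError(f"unsupported metrics: {unknown}")
    return names


quantiles = parse_quantiles("0.2,0.4,0.6,0.8")
metrics = parse_metrics("aTPR,aPPV,aNB,aHR")
```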

The CLI writes one CSV per adjusted metric, such as aTPR.csv and aPPV.csv. Each file includes the conventional companion column as well as the adjusted metric column.
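
Because each file carries both columns, the size of the adjustment can be read off directly. The sketch below parses a metric CSV with the column layout group,quantile,tau,TPR,aTPR and reports adjusted minus conventional per row. The inline rows are made-up placeholders so the example runs on its own; they are not real package output, and `adjustment_gaps` is a hypothetical helper.

```python
import csv
import io

# Placeholder rows for illustration only; replace with open("aTPR.csv").
SAMPLE_CSV = """group,quantile,tau,TPR,aTPR
ref,0.2,0.10,0.90,0.90
other,0.2,0.10,0.85,0.88
"""


def adjustment_gaps(handle, conventional="TPR", adjusted="aTPR"):
    """Return (group, quantile, adjusted - conventional) for each CSV row."""
    gaps = []
    for row in csv.DictReader(handle):
        gap = float(row[adjusted]) - float(row[conventional])
        gaps.append((row["group"], float(row["quantile"]), gap))
    return gaps


gaps = adjustment_gaps(io.StringIO(SAMPLE_CSV))
```

To run this against real output, pass an open file handle for the CSV in the output directory instead of the in-memory sample.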

Add --report to write a self-contained report.html with compact metric tables, calibrated probability density plots, and densities normalized to the reference group:

uv run metrics-adjuster run \
  --input path/to/input.csv \
  --output-dir adjusted_metric_outputs \
  --group-col group \
  --ref-group ref \
  --response-col outcome \
  --risk-col risk \
  --report \
  --report-title "Adjusted metrics report" \
  --report-config-yaml path/to/report.yml

The optional report YAML can provide human-friendly display names and choose plot x-axis scaling:

x_scale: log_odds
labels:
  columns:
    Prior1245: Veteran Priority Group
  groups:
    Prior1245:
      "0": Default Priority
      "99": Priority Group 5
  metrics:
    aTPR: True positive rate

Without uv, use the same commands after installing the package, but drop the uv run prefix.

Numeric reference groups are supported through the public CLI. For example, --ref-group 0 will match an integer-valued group column such as Prior1245.
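
Command-line arguments arrive as strings, so matching --ref-group 0 against an integer column needs some form of coercion. The sketch below shows one simple approach: compare the string against the text form of each distinct group value. `coerce_ref_group` is a hypothetical helper and the matching rule is an assumption; the CLI's actual logic may differ.

```python
def coerce_ref_group(raw, group_values):
    """Return the group value that the CLI string refers to.

    An exact match wins; otherwise a value matches when its string form
    equals the raw text, so "0" matches the integer 0.
    """
    distinct = set(group_values)
    if raw in distinct:
        return raw
    for value in distinct:
        if str(value) == raw:
            return value
    raise ValueError(f"reference group {raw!r} not found in group column")


numeric_ref = coerce_ref_group("0", [0, 99, 0, 99])
text_ref = coerce_ref_group("ref", ["ref", "other"])
```

Note that under this rule a float column holding 0.0 would not match "0", since its string form is "0.0".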

Required input columns

Your input table must include:

  • a group column, such as group, sex, or race
  • a binary outcome column encoded as 0 and 1
  • a predicted risk/probability column
  • optionally, a stable identifier column

The reference group named by ref_group must be present in the group column.
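
These requirements can be checked before running the pipeline. The sketch below validates a list of row dictionaries and returns readable error messages. It illustrates the stated contracts only and is not the package's validation code. The check that risk lies in [0, 1] assumes the risk column holds probabilities, and `validate_rows` is a hypothetical helper.

```python
def validate_rows(rows, group_col, response_col, risk_col, ref_group):
    """Return a list of problems found; an empty list means the input looks valid."""
    if not rows:
        return ["input has no rows"]
    missing = [c for c in (group_col, response_col, risk_col) if c not in rows[0]]
    if missing:
        return [f"missing required columns: {missing}"]
    errors = []
    if any(row[response_col] not in (0, 1) for row in rows):
        errors.append(f"{response_col} must be encoded as 0 and 1")
    if any(not 0 <= row[risk_col] <= 1 for row in rows):  # assumption: probabilities
        errors.append(f"{risk_col} must lie in [0, 1]")
    if ref_group not in {row[group_col] for row in rows}:
        errors.append(f"reference group {ref_group!r} is not present in {group_col}")
    return errors


rows = [
    {"group": "ref", "outcome": 1, "risk": 0.8},
    {"group": "other", "outcome": 0, "risk": 0.3},
]
problems = validate_rows(rows, "group", "outcome", "risk", ref_group="ref")
```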

End-to-end integration demo

The demo script is intentionally outside src/ so it remains a runnable example rather than package code:

uv run python scripts/run_synthetic_integration.py \
  --output-dir demo_outputs \
  --n 600 \
  --seed 2026

It writes:

  • synthetic_metrics_data.csv
  • aTPR.csv with TPR and aTPR
  • additional metric CSVs only when --metrics requests them

Sampled real-data evaluation

For a bounded real-data report review, first prepare an ignored sampled input:

uv run python scripts/prepare_sampled_report_input.py \
  --input /path/to/input.parquet \
  --output-dir data/generated/va-can-2019-prior1245-hosp1y-sample50k \
  --group-col Prior1245 \
  --ref-group 0 \
  --response-col Hosp_1y \
  --risk-col pHosp_1y \
  --sample-size 50000 \
  --bootstrap-iterations 25 \
  --seed 20260521 \
  --report-config-yaml scripts/va_can_report_config.yml

Then reproduce the report through the public CLI:

uv run metrics-adjuster run \
  --input data/generated/va-can-2019-prior1245-hosp1y-sample50k/sample.parquet \
  --output-dir results/va-can-2019-prior1245-hosp1y-sample50k \
  --group-col Prior1245 \
  --ref-group 0 \
  --response-col Hosp_1y \
  --risk-col pHosp_1y \
  --quantiles 0.1,0.3,0.5,0.7,0.9 \
  --metrics aTPR,aPPV,aNB,aHR \
  --bootstrap \
  --n-boot 25 \
  --seed 20260521 \
  --report \
  --report-config-yaml scripts/va_can_report_config.yml

The preparation script reads only the required columns, samples rows deterministically, writes row-level data under ignored data/generated/, and records copy-pasteable commands in its README. Metric CSVs, bootstrap.csv, and report.html are produced only by metrics-adjuster run. By default, the generated CLI report command mirrors the generated-data path under results/; pass --report-output-dir to the preparation script to record a different report destination.
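
Deterministic sampling of this kind usually comes down to seeding a dedicated random number generator. The sketch below shows the general pattern with the standard library. It is not the preparation script's code, and `sample_rows` is a hypothetical helper.

```python
import random


def sample_rows(rows, sample_size, seed):
    """Sample rows without replacement; the same seed always gives the same rows."""
    if sample_size >= len(rows):
        return list(rows)
    rng = random.Random(seed)  # private generator, independent of global state
    indices = sorted(rng.sample(range(len(rows)), sample_size))
    return [rows[i] for i in indices]  # keep the original row order


rows = [{"row_id": i} for i in range(1000)]
first = sample_rows(rows, sample_size=50, seed=20260521)
second = sample_rows(rows, sample_size=50, seed=20260521)
```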

Full user manuals

Development

Preferred setup:

uv sync --extra dev

Common checks:

uv run pytest
uv run ruff check .
uv run mypy src/metrics_adjuster

