Standardised evaluation metrics for epileptic seizure detection and forecasting.

SciTeX Seizure Metrics (scitex-seizure-metrics)

Unified evaluation library for seizure detection and forecasting — sample-based, alarm-based, and the bridge between them.

Full Documentation · uv pip install scitex-seizure-metrics[all]


Problem and Solution

| # | Problem | Solution |
|---|---------|----------|
| 1 | Cross-paper comparison is broken: Cook 2013 reports time-in-warning, Karoly 2017 reports AUROC + Brier, Maturana 2020 reports AUROC + IoC, Kuhlmann 2018 reports AUROC, and Proix 2021 reports IoC + AUC of sensitivity vs proportion-time-in-warning. No two of these can be plotted on the same axis without re-running their methods. | One MetricsReport object carries both regimes through one API; bridge.sample_to_alarm gives analytic bounds when only one side is reported. |
| 2 | Sample- vs alarm-based collapse is documented but untooled: Andrade 2024 showed that 50/56 patients beat chance under sample-based eval but only 6/46 under alarm-based. The community accepts the warning but has no packaged tool to apply both regimes routinely. | detection.evaluate and forecasting.evaluate_stream live in one library and take the same input, so both regimes are reported side by side. |
| 3 | FP/hr lacks a denominator convention: some papers normalise by total recording time, some by interictal-only time, and refractory rules vary or are unstated. | An explicit AlarmPolicy is required by every alarm-aware function. There are no silent defaults, so every reported number is reproducible. |
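
To make problem 3 concrete, here is a minimal sketch of how the same false-alarm count yields two different FP/hr values depending on the denominator. The function `fp_per_hour` and its `excluded_seconds` parameter are hypothetical and are not part of the library; the numbers are made up for illustration.

```python
def fp_per_hour(n_false_alarms, total_seconds, excluded_seconds=0.0,
                fp_denominator="total"):
    """False alarms per hour under an explicitly named denominator.

    `excluded_seconds` is the time removed from the denominator when
    fp_denominator="interictal" (hypothetical parameter; what counts as
    non-interictal time is itself a convention that must be stated).
    """
    if fp_denominator == "total":
        seconds = total_seconds
    elif fp_denominator == "interictal":
        seconds = total_seconds - excluded_seconds
    else:
        raise ValueError(f"unknown fp_denominator: {fp_denominator!r}")
    if seconds <= 0:
        raise ValueError("denominator duration must be positive")
    return n_false_alarms / (seconds / 3600.0)


# Made-up numbers: 24 h recording, 6 h counted as non-interictal, 9 false alarms.
rate_total = fp_per_hour(9, 24 * 3600, fp_denominator="total")
rate_interictal = fp_per_hour(9, 24 * 3600, excluded_seconds=6 * 3600,
                              fp_denominator="interictal")
print(f"total: {rate_total:.3f}/h  interictal: {rate_interictal:.3f}/h")
# prints: total: 0.375/h  interictal: 0.500/h
```

The two rates differ by a third here, which is why a paper that does not state its denominator cannot be compared with one that does.
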
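
Comparison with existing tools

| Tool | Language | Sample-based | Event-based | Forecasting (SPH/SOP) | IoC vs surrogate | Cross-paper converter | Status |
|------|----------|--------------|-------------|-----------------------|------------------|-----------------------|--------|
| timescoring (SzCORE engine, Dan et al. 2024) | Python | | | | | | maintained |
| szcore-evaluation (BIDS wrapper) | Python | | | | | | maintained |
| EPILAB (Direito et al. 2011) | MATLAB | | | | | | last release 2018 |
| PySeizure (2025) | Python | | | | | | early, focused on detection |
| SeizyML (2024) | Python | | | | | | detection scope |
| Andrade et al. 2024 (paper) | | | | | | | research code, not a package |
| scitex-seizure-metrics | Python | | | | | | this repo |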

Supported Metrics

Quick definitions for the metrics and policy knobs that recur throughout the README, the docstrings, and the cited papers.

Sample-based metrics

| Term | Meaning |
|------|---------|
| AUROC | Area Under the Receiver Operating Characteristic curve. Probability the model ranks a random positive window above a random negative window. Threshold-free; insensitive to class prevalence. |
| AUPRC | Area Under the Precision–Recall curve. Threshold-free; sensitive to class prevalence, so it is the value to read on heavily imbalanced seizure data when AUROC looks deceptively high. |
| Brier | Mean squared error between predicted probability and the 0/1 label. Lower is better. Decomposes into reliability − resolution + uncertainty (scitex_seizure_metrics.calibration). |
| MCC | Matthews Correlation Coefficient. A single balanced summary statistic robust to class imbalance; ranges from −1 (anti-correlation) through 0 (chance) to +1 (perfect). |
| Balanced accuracy | (Sensitivity + Specificity) / 2. The accuracy you would get if the prevalence were 50/50. |
| Sensitivity (recall) | Fraction of true seizures detected. Reported at a chosen threshold. |
| Precision (PPV) | Fraction of detections that were true seizures. Drops fast under low prevalence. |
| ECE | Expected Calibration Error. Average gap between predicted probability and observed frequency across bins, weighted by bin size. |
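
The definitions above can be checked by hand on a toy example. The sketch below uses only the standard library and textbook formulas; the function names are illustrative and are not the library's API, and the ten windows are made-up data.

```python
import math


def auroc(y_true, y_score):
    """P(random positive window outranks random negative window); ties count half."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def brier(y_true, y_score):
    """Mean squared error between predicted probability and the 0/1 label."""
    return sum((s - y) ** 2 for y, s in zip(y_true, y_score)) / len(y_true)


def thresholded_metrics(y_true, y_score, threshold=0.5):
    """Confusion-matrix metrics at one threshold (assumes both classes are present)."""
    pred = [int(s >= threshold) for s in y_score]
    tp = sum(y == 1 and p == 1 for y, p in zip(y_true, pred))
    fp = sum(y == 0 and p == 1 for y, p in zip(y_true, pred))
    tn = sum(y == 0 and p == 0 for y, p in zip(y_true, pred))
    fn = sum(y == 1 and p == 0 for y, p in zip(y_true, pred))
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "sensitivity": sensitivity,
        "precision": tp / (tp + fp) if tp + fp else float("nan"),
        "balanced_accuracy": (sensitivity + specificity) / 2,
        "mcc": (tp * tn - fp * fn) / denom if denom else 0.0,
    }


# Toy data: 10 windows, 2 of them ictal (20 % prevalence).
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_score = [0.1, 0.2, 0.1, 0.3, 0.6, 0.2, 0.1, 0.4, 0.7, 0.4]

m = thresholded_metrics(y_true, y_score, threshold=0.5)
print(f"AUROC={auroc(y_true, y_score):.3f}  Brier={brier(y_true, y_score):.3f}")
print(f"sens={m['sensitivity']:.3f}  bal_acc={m['balanced_accuracy']:.4f}  "
      f"mcc={m['mcc']:.3f}")
```

On this toy data AUROC is about 0.91 while sensitivity at threshold 0.5 is only 0.5, a small instance of the gap between threshold-free and thresholded metrics that the table describes.
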

Alarm-based metrics

| Term | Meaning |
|------|---------|
| Alarm | A single binary "warning is on" event derived from a thresholded probability stream + the AlarmPolicy. |
| FP/hr (false-positive rate per hour) | Number of alarms not followed by a seizure within (SPH, SPH + SOP], normalised by the chosen denominator (fp_denominator='total' or 'interictal'). |
| IoC | Improvement over Chance. The signed gap between the model's alarm-based sensitivity and the same statistic recomputed under a chance-baseline alarm generator (scitex_seizure_metrics.surrogates, default Poisson). Significance is read from a surrogate distribution. |
| Time-in-warning (TIW, "proportion time in warning") | Fraction of recording time spent inside an active warning window (between alarm onset and refractory end). The natural denominator that pairs with sensitivity in the Proix 2021 operating curve. |
| Sensitivity vs proportion-time-in-warning | Operating curve introduced by Proix 2021. Plotted instead of sensitivity vs FP/hr when alarm refractory periods make per-hour counts misleading. Same x-axis units as Cook 2013's "time-in-warning" reporting. |
| Beats chance (alarm) | Boolean: is the model's IoC above the surrogate distribution at the configured significance level? Andrade 2024's headline: 50/56 patients beat chance under sample-based eval but only 6/46 under alarm-based. |
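
The alarm-based definitions above can be sketched in a few lines. This is an illustration under stated assumptions, not the library's implementation: an alarm fires at the first sample at or above threshold outside the refractory period, an alarm is a true alarm if a seizure onset falls in (alarm + SPH, alarm + SPH + SOP], and the warning window runs from alarm onset to refractory end. The helper names `derive_alarms` and `score_alarms` and the toy stream are hypothetical.

```python
def derive_alarms(times, proba, threshold, refractory_seconds):
    """Alarm at each supra-threshold sample that is outside the refractory period."""
    alarms = []
    blocked_until = float("-inf")
    for t, p in zip(times, proba):
        if p >= threshold and t >= blocked_until:
            alarms.append(t)
            blocked_until = t + refractory_seconds
    return alarms


def score_alarms(alarms, seizures, sph, sop, refractory_seconds, total_seconds):
    """Sensitivity, false-alarm count, and time-in-warning for a set of alarms."""
    def hit(alarm, seizure):
        return alarm + sph < seizure <= alarm + sph + sop

    true_alarms = [a for a in alarms if any(hit(a, s) for s in seizures)]
    predicted = [s for s in seizures if any(hit(a, s) for a in alarms)]
    # Warning windows cannot overlap because the refractory period blocks re-alarming.
    warning = sum(min(a + refractory_seconds, total_seconds) - a for a in alarms)
    return {
        "sensitivity": len(predicted) / len(seizures),
        "n_false_alarms": len(alarms) - len(true_alarms),
        "time_in_warning_frac": warning / total_seconds,
    }


# Toy stream: 2 h at 60 s cadence, two seizures, one of them preceded by a warning.
times = list(range(0, 7200, 60))
high = {600, 2400, 2460, 2520}            # samples where the model output is high
proba = [0.9 if t in high else 0.1 for t in times]
seizures = [3000, 6600]

alarms = derive_alarms(times, proba, threshold=0.5, refractory_seconds=600)
scores = score_alarms(alarms, seizures, sph=300, sop=600,
                      refractory_seconds=600, total_seconds=7200)
print(alarms)   # [600, 2400]
print(scores)
```

The three consecutive high samples before the first seizure collapse into a single alarm at 2400 s, the alarm at 600 s is false, and the second seizure is missed, giving a sensitivity of 0.5 with one sixth of the recording spent in warning.
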

The AlarmPolicy config knobs (SPH · SOP · cadence · refractory · alarm-threshold · FP-denominator) are documented inline on the dataclass and shown in the forecasting example below — they pin alarm-derivation, not metric definitions.
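
As an illustration of the "no silent defaults" idea, the sketch below defines a hypothetical `PolicySketch` dataclass that reuses the field names shown in this README. It is a stand-in written for this example, not the library's AlarmPolicy, and the validation rules are assumptions.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class PolicySketch:
    """Hypothetical stand-in for AlarmPolicy: no field has a default value."""
    sph_seconds: float
    sop_seconds: float
    cadence_seconds: float
    refractory_seconds: float
    alarm_threshold: float
    fp_denominator: str

    def __post_init__(self):
        if self.fp_denominator not in ("total", "interictal"):
            raise ValueError(
                f"fp_denominator must be 'total' or 'interictal', "
                f"got {self.fp_denominator!r}"
            )
        if self.cadence_seconds <= 0 or self.sop_seconds <= 0:
            raise ValueError("cadence_seconds and sop_seconds must be positive")
        if self.sph_seconds < 0 or self.refractory_seconds < 0:
            raise ValueError("sph_seconds and refractory_seconds must be non-negative")


base = PolicySketch(
    sph_seconds=300, sop_seconds=600, cadence_seconds=60,
    refractory_seconds=600, alarm_threshold=0.5, fp_denominator="interictal",
)

# Cadence ablation: copy the policy, changing exactly one knob per copy.
ablation = [replace(base, cadence_seconds=c) for c in (30, 60, 120, 300)]
print([p.cadence_seconds for p in ablation])   # [30, 60, 120, 300]

# Omitting a knob fails loudly instead of falling back to a hidden default.
try:
    PolicySketch(sph_seconds=300, sop_seconds=600)
except TypeError as err:
    print("rejected:", err)
```

Making the object frozen and default-free means a reported number can always be traced back to one complete, immutable set of alarm-derivation choices.
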

Installation

pip install scitex-seizure-metrics

Demo

from scitex_seizure_metrics import detection, forecasting, AlarmPolicy

# Per-window detection metrics (sensitivity, false-positives/hour, ...)
m = detection.evaluate(y_true=labels, y_pred=preds, fs=256)
print(m["sensitivity"], m["fp_per_hour"])

# Forecasting metrics (Improvement-over-chance, AUROC, alarm count)
f = forecasting.evaluate(
    seizure_times=onsets, alarm_times=alarms, policy=AlarmPolicy.STANDARD
)
print(f["ioc"], f["auroc"])

graph LR
    Labels["per-window y_true / y_pred"] --> Det["detection.evaluate"]
    Onsets["seizure_times + alarm_times"] --> Fore["forecasting.evaluate"]
    Det --> Out["sensitivity / FP-per-hour / latency"]
    Fore --> Out2["IoC / AUROC / alarm count"]

Quick Start

from scitex_seizure_metrics import detection, forecasting, AlarmPolicy

# Detection — per-window classification
rep = detection.evaluate(y_true, y_proba, threshold=0.5, fs=1)
print(rep.roc_auc, rep.pr_auc, rep.brier, rep.mcc)

# Forecasting — continuous stream with explicit alarm policy
policy = AlarmPolicy(
    sph_seconds=300, sop_seconds=600, cadence_seconds=60,
    refractory_seconds=600, alarm_threshold=0.5,
    fp_denominator="interictal",   # Mormann tradition
)
rep = forecasting.evaluate_stream(
    proba, times, seizures, policy,
    total_recording_time=24 * 3600,
)
print(rep.sensitivity, rep.fp_per_hour, rep.ioc, rep.time_in_warning_frac)

See examples/quick_start_detection.py and examples/quick_start_forecasting.py.
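
For readers who want to see what an IoC value measures, here is a minimal sketch of the surrogate idea using only the standard library. It assumes a Poisson-style chance baseline implemented as uniformly random alarm times with the same alarm count as the model, and it ignores the refractory constraint for the surrogates. The function names, the p-value formula, and the toy data are illustrative assumptions, not the library's implementation.

```python
import random


def alarm_sensitivity(alarms, seizures, sph, sop):
    """Fraction of seizures with an alarm whose (SPH, SPH + SOP] window covers them."""
    predicted = sum(
        any(a + sph < s <= a + sph + sop for a in alarms) for s in seizures
    )
    return predicted / len(seizures)


def improvement_over_chance(alarms, seizures, sph, sop, total_seconds,
                            n_surrogate=200, seed=0):
    """Observed sensitivity minus the mean sensitivity of random alarm sets."""
    rng = random.Random(seed)
    observed = alarm_sensitivity(alarms, seizures, sph, sop)
    chance = []
    for _ in range(n_surrogate):
        # Same number of alarms as the model, placed uniformly at random.
        fake = [rng.uniform(0.0, total_seconds) for _ in alarms]
        chance.append(alarm_sensitivity(fake, seizures, sph, sop))
    ioc = observed - sum(chance) / n_surrogate
    # One-sided empirical p-value with the usual +1 correction.
    p_value = (1 + sum(c >= observed for c in chance)) / (n_surrogate + 1)
    return ioc, p_value


# Toy case: 24 h recording, six seizures, each preceded by an alarm 450 s earlier.
seizures = [10000, 25000, 40000, 55000, 70000, 85000]
alarms = [s - 450 for s in seizures]

ioc, p_value = improvement_over_chance(
    alarms, seizures, sph=300, sop=600, total_seconds=24 * 3600
)
print(f"IoC={ioc:.3f}  p={p_value:.4f}")
```

Because six random alarms cover only a few percent of a 24 h recording, the surrogate sensitivity stays close to zero and the perfectly timed toy alarms come out far above chance.
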

Architecture

flowchart LR
    Probs["per-window proba<br/>+ ground truth"] --> Det["detection.evaluate"]
    Probs --> StreamIn["forecasting.evaluate_stream"]
    Policy["AlarmPolicy<br/>SPH · SOP · cadence · refractory · FP denom"] --> StreamIn
    Det --> RepDet["MetricsReport<br/>AUROC · AUPRC · Brier · MCC"]
    StreamIn --> RepFc["MetricsReport<br/>sensitivity · FP/hr · IoC · TIW"]
    RepDet -.->|"bridge analytic bounds"| RepFc
    RepFc --> Plots["plots: sensitivity vs FP/hr,<br/>IoC vs surrogate, cadence ablation"]

The split mirrors how the seizure-evaluation literature itself is organised — sample-based vs alarm-based vs the bridge — so a paper-faithful re-implementation lives in exactly one place. MetricsReport is the single object that travels between regimes; AlarmPolicy is the single object that pins every reproducibility decision an alarm-based metric requires.

6 Interfaces

scitex_seizure_metrics.forecasting: alarm-based metrics with explicit AlarmPolicy (primary)

from scitex_seizure_metrics import AlarmPolicy, forecasting

policy = AlarmPolicy(
    sph_seconds=300, sop_seconds=600, cadence_seconds=60,
    refractory_seconds=600, alarm_threshold=0.5,
    fp_denominator="interictal",
)
rep = forecasting.evaluate_stream(
    proba, times, seizures, policy,
    total_recording_time=24 * 3600, n_surrogate=1000,
)
print(rep.sensitivity, rep.fp_per_hour, rep.ioc, rep.time_in_warning_frac)

# Operating curve across thresholds
df = forecasting.sweep_thresholds(proba, times, seizures, policy)

# Cadence ablation
from dataclasses import replace  # AlarmPolicy is a dataclass, so replace() copies it
policies = [replace(policy, cadence_seconds=c) for c in [30, 60, 120, 300]]
df = forecasting.sweep_policies(proba, times, seizures, policies)

scitex_seizure_metrics.detection: sample-based metrics (AUROC, AUPRC, Brier, MCC, ...)

from scitex_seizure_metrics import detection
rep = detection.evaluate(y_true, y_proba, threshold=0.5, fs=1)
print(rep.roc_auc, rep.pr_auc, rep.brier, rep.mcc, rep.balanced_accuracy)

scitex_seizure_metrics.bridge: sample↔alarm analytic bounds for cross-paper comparison

from scitex_seizure_metrics import bridge

bnd = bridge.sample_to_alarm(
    sample_sensitivity=0.79, sample_specificity=0.85,
    sop_seconds=600, cadence_seconds=60, refractory_seconds=600,
)
print(bnd.alarm_sensitivity_upper, bnd.fp_per_hour_upper)
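
The formulas behind bridge.sample_to_alarm are not stated in this README, so the following is only a sketch of one way such bounds can be derived, not the library's method. It assumes that each of the n = SOP / cadence windows before a seizure is flagged with marginal probability equal to the sample sensitivity, that a seizure counts as predicted if any of them is flagged, and that every other window is flagged with probability 1 − specificity. The class `BoundsSketch` and the function `sample_to_alarm_bounds` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundsSketch:
    """Hypothetical result container; field names mirror the README example."""
    alarm_sensitivity_lower: float
    alarm_sensitivity_upper: float
    fp_per_hour_upper: float


def sample_to_alarm_bounds(sample_sensitivity, sample_specificity,
                           sop_seconds, cadence_seconds, refractory_seconds):
    n_windows = max(1, int(sop_seconds // cadence_seconds))
    # P(at least one of n windows fires) is at least the single-window
    # probability and at most n times it (union bound), whatever the
    # dependence between windows.
    lower = sample_sensitivity
    upper = min(1.0, n_windows * sample_sensitivity)
    # Every alarm needs at least one flagged window, so the expected number of
    # flagged windows per hour bounds the false-alarm rate from above ...
    fp_windows_per_hour = (1.0 - sample_specificity) * 3600.0 / cadence_seconds
    # ... and the refractory period caps the long-run alarm rate.
    max_alarms_per_hour = 3600.0 / refractory_seconds
    return BoundsSketch(lower, upper, min(fp_windows_per_hour, max_alarms_per_hour))


bnd_sketch = sample_to_alarm_bounds(
    sample_sensitivity=0.79, sample_specificity=0.85,
    sop_seconds=600, cadence_seconds=60, refractory_seconds=600,
)
print(bnd_sketch)
```

With these inputs the refractory cap (6 alarms per hour) is tighter than the window-count bound (about 9 flagged windows per hour), which shows how strongly the alarm policy shapes what a sample-level specificity implies.
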

scitex_seizure_metrics.papers: paper-replica shims (Karoly 2017, Maturana 2020, Kuhlmann 2018, Andrade 2024)

from scitex_seizure_metrics.papers import andrade2024
out = andrade2024.metrics(
    y_true=labels, y_proba=preds,
    times_seconds=times, seizure_times=onsets,
)
print(out["sample_auroc"], out["alarm_sensitivity"], out["beats_chance_alarm"])
# Reproduces the side-by-side sample-vs-alarm panel from the paper.

Available shims: karoly2017, maturana2020, kuhlmann2018, andrade2024. Each metrics(...) returns a dict in the paper's preferred metric set.

scitex_seizure_metrics.calibration: Brier decomposition + reliability diagram

from scitex_seizure_metrics import calibration, plots
cal = calibration.calibration_report(y_true, y_proba, n_bins=10)
print(cal.brier, cal.reliability, cal.resolution, cal.uncertainty,
      cal.expected_calibration_error)
plots.reliability_diagram(cal)
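
The quantities printed above are related by the Murphy decomposition, Brier = reliability − resolution + uncertainty. The sketch below verifies that identity on made-up data using only the standard library. It groups samples by distinct forecast value, in which case the identity is exact; with fixed-width bins such as n_bins=10 it holds only approximately. The function name is illustrative and is not the library's API.

```python
def murphy_decomposition(y_true, y_proba):
    """Brier decomposition with one group per distinct forecast value."""
    n = len(y_true)
    base_rate = sum(y_true) / n
    groups = {}
    for y, p in zip(y_true, y_proba):
        groups.setdefault(p, []).append(y)

    reliability = resolution = ece = 0.0
    for p, ys in groups.items():
        weight = len(ys) / n
        observed = sum(ys) / len(ys)          # observed event frequency in the group
        reliability += weight * (p - observed) ** 2
        resolution += weight * (observed - base_rate) ** 2
        ece += weight * abs(p - observed)     # size-weighted calibration gap

    return {
        "brier": sum((p - y) ** 2 for y, p in zip(y_true, y_proba)) / n,
        "reliability": reliability,
        "resolution": resolution,
        "uncertainty": base_rate * (1.0 - base_rate),
        "ece": ece,
    }


# Toy data: three forecast levels, 10 windows, 4 of them positive.
y_proba = [0.1] * 5 + [0.5] * 3 + [0.9] * 2
y_true = [0, 0, 0, 0, 1, 0, 1, 0, 1, 1]

d = murphy_decomposition(y_true, y_proba)
recomposed = d["reliability"] - d["resolution"] + d["uncertainty"]
print(f"brier={d['brier']:.4f}  recomposed={recomposed:.4f}  ece={d['ece']:.4f}")
```

Lower reliability means better calibration and higher resolution means forecasts that separate outcomes better, which is why resolution enters the identity with a minus sign.
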

scitex_seizure_metrics.plots: relationships between metrics

from scitex_seizure_metrics import plots
plots.sensitivity_vs_fp_per_hour(sweep_df)        # operating curve
plots.ioc_vs_surrogate(sweep_df)                  # model vs chance
plots.cadence_ablation(policy_sweep_df)           # FP/hr vs cadence
plots.sample_vs_alarm_scatter(per_patient_df)     # the Andrade 2024 figure
plots.metric_correlation_heatmap(per_patient_df)  # redundancy diagnostic

Part of SciTeX

scitex-seizure-metrics is part of SciTeX. Installing through the umbrella package with pip install scitex[seizure-metrics] exposes it as scitex.seizure_metrics, the seizure-evaluation surface re-exported from this peer package. This is equivalent to scitex-ml[seizure] / scitex_ml.metrics.seizure, and is intended for users who want only this slice without the rest of scitex-ml.

Four Freedoms for Research

  1. The freedom to run your research anywhere — your machine, your terms.
  2. The freedom to study how every step works — from raw data to final manuscript.
  3. The freedom to redistribute your workflows, not just your papers.
  4. The freedom to modify any module and share improvements with the community.

AGPL-3.0 — because we believe research infrastructure deserves the same freedoms as the software it runs on.

