Standardised evaluation metrics for epileptic seizure detection and forecasting.
SciTeX Seizure Metrics (scitex-seizure-metrics)
Unified evaluation library for seizure detection and forecasting — sample-based, alarm-based, and the bridge between them.
Full Documentation · uv pip install scitex-seizure-metrics[all]
Problem and Solution
| # | Problem | Solution |
|---|---|---|
| 1 | Cross-paper comparison is broken — Cook 2013 reports time-in-warning, Karoly 2017 reports AUROC + Brier, Maturana 2020 reports AUROC + IoC, Kuhlmann 2018 reports AUROC, Proix 2021 reports IoC + AUC of sensitivity vs proportion-time-in-warning. No two of these can be plotted on the same axis without re-running their methods. | One MetricsReport object carries both regimes through one API; bridge.sample_to_alarm gives analytic bounds when only one side is reported. |
| 2 | Sample- vs alarm-based collapse is documented but untooled — Andrade 2024 showed that 50/56 patients beat chance under sample-based eval but only 6/46 under alarm-based. The community accepts the warning but has no packaged tool to apply both regimes routinely. | detection.evaluate + forecasting.evaluate_stream through one library; same input, both regimes side-by-side. |
| 3 | FP/hr lacks a denominator convention — some papers normalise by total recording time, some by interictal-only time, refractory rules vary or are unstated. | Explicit AlarmPolicy required by every alarm-aware function — no silent defaults; every reported number is reproducible. |
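To make the denominator problem in row 3 concrete, here is a minimal sketch with illustrative numbers only. The helper `fp_per_hour` and its `excluded_hours` argument are hypothetical names for this example, not the library's API.

```python
def fp_per_hour(n_false_alarms, total_hours, excluded_hours, fp_denominator):
    """False alarms per hour under an explicitly named denominator convention."""
    if fp_denominator == "total":
        hours = total_hours
    elif fp_denominator == "interictal":
        # Assumption: peri-ictal time is simply subtracted from the recording.
        hours = total_hours - excluded_hours
    else:
        raise ValueError(f"unknown fp_denominator: {fp_denominator!r}")
    if hours <= 0:
        raise ValueError("denominator must be positive")
    return n_false_alarms / hours


# Made-up case: 24 h recording, 4 seizures, 1 h excluded around each, 12 false alarms.
rate_total = fp_per_hour(12, total_hours=24.0, excluded_hours=4.0, fp_denominator="total")
rate_interictal = fp_per_hour(12, total_hours=24.0, excluded_hours=4.0, fp_denominator="interictal")
print(rate_total, rate_interictal)
```

The same 12 false alarms give a 20% higher rate under the interictal convention, which is why the denominator has to be stated alongside the number.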
Comparison with existing tools
| Tool | Language | Sample-based | Event-based | Forecasting (SPH/SOP) | IoC vs surrogate | Cross-paper convertor | Status |
|---|---|---|---|---|---|---|---|
| timescoring (SzCORE engine, Dan et al. 2024) | Python | ✅ | ✅ | ❌ | ❌ | ❌ | maintained |
| szcore-evaluation (BIDS wrapper) | Python | ✅ | ✅ | ❌ | ❌ | ❌ | maintained |
| EPILAB (Direito et al. 2011) | MATLAB | ✅ | ◐ | ✅ | ✅ | ❌ | last release 2018 |
| PySeizure (2025) | Python | ✅ | ❌ | ❌ | ❌ | ❌ | early, focused on detection |
| SeizyML (2024) | Python | ✅ | ✅ | ❌ | ❌ | ❌ | detection scope |
| Andrade et al. 2024 (paper) | — | ✅ | ✅ | ✅ | ✅ | ❌ | research code, not a package |
| scitex-seizure-metrics | Python | ✅ | ✅ | ✅ | ✅ | ✅ | this repo |
Supported Metrics
Quick definitions for the metrics and policy knobs that recur throughout the README, the docstrings, and the cited papers.
Sample-based metrics
| Term | Meaning |
|---|---|
| AUROC | Area Under the Receiver Operating Characteristic curve. Probability the model ranks a random positive window above a random negative window. Threshold-free; insensitive to class prevalence. |
| AUPRC | Area Under the Precision–Recall curve. Threshold-free; sensitive to class prevalence — the value to read on heavily-imbalanced seizure data when AUROC looks deceptively high. |
| Brier | Mean squared error between predicted probability and the 0/1 label. Lower is better. Decomposes as reliability − resolution + uncertainty (scitex_seizure_metrics.calibration). |
| MCC | Matthews Correlation Coefficient. A single balanced summary statistic robust to class imbalance; ranges from −1 (anti-correlation) through 0 (chance) to +1 (perfect). |
| Balanced accuracy | (Sensitivity + Specificity) / 2. The accuracy you would get if the prevalence were 50/50. |
| Sensitivity (recall) | Fraction of true seizures detected. Reported at a chosen threshold. |
| Precision (PPV) | Fraction of detections that were true seizures. Drops fast under low prevalence. |
| ECE | Expected Calibration Error. Average gap between predicted probability and observed frequency across bins. |
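The definitions in the table above can be checked by hand on a toy example. The following is a dependency-free sketch of those textbook formulas, not the library's implementation. The function names are illustrative, and it assumes both classes are present in `y_true`.

```python
import math


def auroc(y_true, y_score):
    """Chance that a random positive window outscores a random negative one (ties count half)."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def brier(y_true, y_score):
    """Mean squared error between predicted probability and the 0/1 label."""
    return sum((s - t) ** 2 for t, s in zip(y_true, y_score)) / len(y_true)


def thresholded_metrics(y_true, y_score, threshold=0.5):
    """Sensitivity, precision, balanced accuracy and MCC at a single threshold."""
    pairs = list(zip(y_true, y_score))
    tp = sum(t == 1 and s >= threshold for t, s in pairs)
    fp = sum(t == 0 and s >= threshold for t, s in pairs)
    fn = sum(t == 1 and s < threshold for t, s in pairs)
    tn = sum(t == 0 and s < threshold for t, s in pairs)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "sensitivity": sensitivity,
        "precision": tp / (tp + fp) if tp + fp else float("nan"),
        "balanced_accuracy": (sensitivity + specificity) / 2,
        "mcc": (tp * tn - fp * fn) / denom if denom else 0.0,
    }


# Toy data: 8 windows, 2 of them labelled positive.
y_true = [0, 0, 0, 0, 0, 0, 1, 1]
y_score = [0.1, 0.2, 0.3, 0.6, 0.2, 0.1, 0.7, 0.4]
print(auroc(y_true, y_score), brier(y_true, y_score))
print(thresholded_metrics(y_true, y_score))
```

On this toy data AUROC is 11/12 while sensitivity and precision at threshold 0.5 are both 0.5, a small illustration of how a threshold-free score can look better than the operating point.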
Alarm-based metrics
| Term | Meaning |
|---|---|
| Alarm | A single binary "warning is on" event derived from a thresholded probability stream + the AlarmPolicy. |
| FP/hr (false-positive rate per hour) | Number of alarms not followed by a seizure within (SPH, SPH + SOP], normalised by the chosen denominator (fp_denominator='total' or 'interictal'). |
| IoC | Improvement over Chance. The signed gap between the model's alarm-based sensitivity and the same statistic recomputed under a chance-baseline alarm generator (scitex_seizure_metrics.surrogates, default Poisson). Significance is read from a surrogate distribution. |
| Time-in-warning (TIW, "proportion time in warning") | Fraction of recording time spent inside an active warning window (between alarm onset and refractory end). The natural denominator that pairs with sensitivity in the Proix 2021 operating curve. |
| Sensitivity vs proportion-time-in-warning | Operating curve introduced by Proix 2021. Plotted instead of sensitivity vs FP/hr when alarm refractory periods make per-hour counts misleading. Same x-axis units as Cook 2013's "time-in-warning" reporting. |
| Beats chance (alarm) | Boolean — is the model's IoC above the surrogate distribution at the configured significance level? Andrade 2024's headline: 50/56 patients beat chance under sample-based eval but only 6/46 under alarm-based. |
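As a rough illustration of the alarm-based definitions above, the sketch below scores a list of alarm times against seizure onsets. `score_alarms` is a hypothetical helper, not the library's function. It assumes the alarms are already separated by at least the refractory period, and it uses the total-recording-time denominator for FP/hr.

```python
def score_alarms(alarm_times, seizure_times, sph, sop, refractory, total_seconds):
    """Sensitivity, FP/hr (total-time denominator) and time-in-warning fraction."""

    def hits(alarm, seizure):
        # An alarm is correct if a seizure starts within (alarm + SPH, alarm + SPH + SOP].
        return alarm + sph < seizure <= alarm + sph + sop

    predicted = sum(any(hits(a, s) for a in alarm_times) for s in seizure_times)
    false_alarms = sum(not any(hits(a, s) for s in seizure_times) for a in alarm_times)
    # Warning is "on" from alarm onset to refractory end, clipped at the recording end.
    warning_seconds = sum(min(a + refractory, total_seconds) - a for a in alarm_times)
    return {
        "sensitivity": predicted / len(seizure_times),
        "fp_per_hour": false_alarms / (total_seconds / 3600),
        "time_in_warning_frac": warning_seconds / total_seconds,
    }


# Made-up 6 h recording with two seizures and three alarms (all times in seconds).
result = score_alarms(
    alarm_times=[1000, 5000, 12000],
    seizure_times=[1700, 15000],
    sph=300, sop=600, refractory=600,
    total_seconds=6 * 3600,
)
print(result)
```

Only the first alarm is followed by a seizure inside its window, so one of two seizures is predicted and two alarms count as false positives.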
The AlarmPolicy config knobs (SPH · SOP · cadence · refractory · alarm-threshold · FP-denominator) are documented inline on the dataclass and shown in the forecasting example below; they pin alarm derivation, not metric definitions.
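To show how such knobs could turn a probability stream into alarms, here is a minimal sketch. `SketchAlarmPolicy` and `derive_alarms` are hypothetical stand-ins written for this example. The field names mirror the README, but the real `AlarmPolicy` may derive alarms differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SketchAlarmPolicy:
    """Hypothetical stand-in: every field is required, so there are no silent defaults."""
    sph_seconds: float
    sop_seconds: float
    cadence_seconds: float
    refractory_seconds: float
    alarm_threshold: float
    fp_denominator: str


def derive_alarms(proba, times, policy):
    """Threshold the stream at the decision cadence and suppress alarms during refractory."""
    alarms = []
    next_decision = float("-inf")  # earliest time of the next allowed decision
    blocked_until = float("-inf")  # end of the current refractory period
    for t, p in zip(times, proba):
        if t < next_decision:
            continue  # between decision points: ignore this sample
        next_decision = t + policy.cadence_seconds
        if t >= blocked_until and p >= policy.alarm_threshold:
            alarms.append(t)
            blocked_until = t + policy.refractory_seconds
    return alarms


policy = SketchAlarmPolicy(
    sph_seconds=300, sop_seconds=600, cadence_seconds=60,
    refractory_seconds=180, alarm_threshold=0.5, fp_denominator="interictal",
)
times = [0, 30, 60, 90, 120, 150, 180, 210, 240, 270, 300]
proba = [0.2, 0.9, 0.6, 0.7, 0.8, 0.1, 0.9, 0.9, 0.3, 0.9, 0.7]
alarms = derive_alarms(proba, times, policy)
print(alarms)  # [60, 300]
```

The high probabilities at 30 s and 90 s fall between decision points, and those at 120 s and 180 s fall inside the refractory period, so only two alarms are raised. Changing the cadence or refractory period changes the alarm list without touching the model output, which is the sense in which the policy pins alarm derivation.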
Installation
pip install scitex-seizure-metrics
Demo
from scitex_seizure_metrics import detection, forecasting, AlarmPolicy
# Per-window detection metrics (sensitivity, false-positives/hour, ...)
m = detection.evaluate(y_true=labels, y_pred=preds, fs=256)
print(m["sensitivity"], m["fp_per_hour"])
# Forecasting metrics (Improvement-over-chance, AUROC, alarm count)
f = forecasting.evaluate(
    seizure_times=onsets, alarm_times=alarms, policy=AlarmPolicy.STANDARD
)
print(f["ioc"], f["auroc"])
graph LR
Labels["per-window y_true / y_pred"] --> Det["detection.evaluate"]
Onsets["seizure_times + alarm_times"] --> Fore["forecasting.evaluate"]
Det --> Out["sensitivity / FP-per-hour / latency"]
Fore --> Out2["IoC / AUROC / alarm count"]
Quick Start
from scitex_seizure_metrics import detection, forecasting, AlarmPolicy
# Detection — per-window classification
rep = detection.evaluate(y_true, y_proba, threshold=0.5, fs=1)
print(rep.roc_auc, rep.pr_auc, rep.brier, rep.mcc)
# Forecasting — continuous stream with explicit alarm policy
policy = AlarmPolicy(
    sph_seconds=300, sop_seconds=600, cadence_seconds=60,
    refractory_seconds=600, alarm_threshold=0.5,
    fp_denominator="interictal",  # Mormann tradition
)
rep = forecasting.evaluate_stream(
    proba, times, seizures, policy,
    total_recording_time=24 * 3600,
)
print(rep.sensitivity, rep.fp_per_hour, rep.ioc, rep.time_in_warning_frac)
See examples/quick_start_detection.py and examples/quick_start_forecasting.py.
Architecture
flowchart LR
Probs["per-window proba<br/>+ ground truth"] --> Det["detection.evaluate"]
Probs --> StreamIn["forecasting.evaluate_stream"]
Policy["AlarmPolicy<br/>SPH · SOP · cadence · refractory · FP denom"] --> StreamIn
Det --> RepDet["MetricsReport<br/>AUROC · AUPRC · Brier · MCC"]
StreamIn --> RepFc["MetricsReport<br/>sensitivity · FP/hr · IoC · TIW"]
RepDet -.->|"bridge analytic bounds"| RepFc
RepFc --> Plots["plots: sensitivity vs FP/hr,<br/>IoC vs surrogate, cadence ablation"]
The split mirrors how the seizure-evaluation literature itself is
organised — sample-based vs alarm-based vs the bridge — so a
paper-faithful re-implementation lives in exactly one place.
MetricsReport is the single object that travels between regimes;
AlarmPolicy is the single object that pins every reproducibility
decision an alarm-based metric requires.
6 Interfaces
scitex_seizure_metrics.forecasting — alarm-based metrics with explicit AlarmPolicy (primary)
from scitex_seizure_metrics import AlarmPolicy, forecasting
policy = AlarmPolicy(
    sph_seconds=300, sop_seconds=600, cadence_seconds=60,
    refractory_seconds=600, alarm_threshold=0.5,
    fp_denominator="interictal",
)
rep = forecasting.evaluate_stream(
    proba, times, seizures, policy,
    total_recording_time=24 * 3600, n_surrogate=1000,
)
print(rep.sensitivity, rep.fp_per_hour, rep.ioc, rep.time_in_warning_frac)
# Operating curve across thresholds
df = forecasting.sweep_thresholds(proba, times, seizures, policy)
# Cadence ablation: copy the policy above, changing only the cadence
from dataclasses import replace
policies = [replace(policy, cadence_seconds=c) for c in [30, 60, 120, 300]]
df = forecasting.sweep_policies(proba, times, seizures, policies)
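The IoC and surrogate test behind `n_surrogate` can be pictured with the sketch below. It is an assumption-laden illustration, not the library's code. The function names are hypothetical, the chance baseline is a rate-matched Poisson alarm generator, and the p-value is the usual add-one Monte Carlo estimate.

```python
import random


def alarm_sensitivity(alarm_times, seizure_times, sph, sop):
    """Fraction of seizures that start within (alarm + SPH, alarm + SPH + SOP] of some alarm."""
    def hits(a, s):
        return a + sph < s <= a + sph + sop
    return sum(any(hits(a, s) for a in alarm_times) for s in seizure_times) / len(seizure_times)


def poisson_alarms(rate_per_second, total_seconds, rng):
    """Random alarm times from a homogeneous Poisson process with the given rate."""
    if rate_per_second <= 0:
        return []
    t, out = 0.0, []
    while True:
        t += rng.expovariate(rate_per_second)  # exponential inter-alarm interval
        if t >= total_seconds:
            return out
        out.append(t)


def ioc_vs_poisson(alarm_times, seizure_times, sph, sop, total_seconds,
                   n_surrogate=1000, seed=0):
    """Model sensitivity minus mean chance sensitivity, with a surrogate p-value."""
    rng = random.Random(seed)
    model = alarm_sensitivity(alarm_times, seizure_times, sph, sop)
    rate = len(alarm_times) / total_seconds  # match the model's alarm rate
    chance = [
        alarm_sensitivity(poisson_alarms(rate, total_seconds, rng), seizure_times, sph, sop)
        for _ in range(n_surrogate)
    ]
    mean_chance = sum(chance) / n_surrogate
    p_value = (1 + sum(c >= model for c in chance)) / (1 + n_surrogate)
    return {"sensitivity": model, "chance_sensitivity": mean_chance,
            "ioc": model - mean_chance, "p_value": p_value}


# Made-up 24 h recording: 4 seizures, 5 alarms, 3 of which precede a seizure.
result = ioc_vs_poisson(
    alarm_times=[9300, 29400, 49500, 60000, 80000],
    seizure_times=[10000, 30000, 50000, 70000],
    sph=300, sop=600, total_seconds=24 * 3600,
)
print(result)
```

Matching the surrogate's alarm rate to the model's keeps the comparison fair: a model cannot beat chance simply by raising more alarms.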
scitex_seizure_metrics.detection — sample-based metrics (AUROC, AUPRC, Brier, MCC, ...)
from scitex_seizure_metrics import detection
rep = detection.evaluate(y_true, y_proba, threshold=0.5, fs=1)
print(rep.roc_auc, rep.pr_auc, rep.brier, rep.mcc, rep.balanced_accuracy)
scitex_seizure_metrics.bridge — sample↔alarm analytic bounds for cross-paper comparison
from scitex_seizure_metrics import bridge
bnd = bridge.sample_to_alarm(
    sample_sensitivity=0.79, sample_specificity=0.85,
    sop_seconds=600, cadence_seconds=60, refractory_seconds=600,
)
print(bnd.alarm_sensitivity_upper, bnd.fp_per_hour_upper)
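For intuition about why such a conversion is possible at all, here is a back-of-envelope sketch. It is not the formula used by `bridge.sample_to_alarm`, which this README does not spell out. `rough_sample_to_alarm` is a hypothetical helper, and the sensitivity figure assumes independent windows, so it is an approximation rather than a bound.

```python
def rough_sample_to_alarm(sample_sensitivity, sample_specificity,
                          sop_seconds, cadence_seconds, refractory_seconds):
    """Back-of-envelope link between per-window rates and alarm-level quantities."""
    windows_per_hour = 3600 / cadence_seconds
    # Every false alarm needs a false-positive window, so this rate is a ceiling...
    raw_fp_per_hour = (1 - sample_specificity) * windows_per_hour
    # ...and the refractory period caps how many alarms fit into one hour.
    refractory_cap = 3600 / refractory_seconds
    # Independence approximation: at least one of the windows inside the SOP fires.
    n_chances = max(1, int(sop_seconds // cadence_seconds))
    return {
        "fp_per_hour_ceiling": min(raw_fp_per_hour, refractory_cap),
        "alarm_sensitivity_if_independent": 1 - (1 - sample_sensitivity) ** n_chances,
    }


rough = rough_sample_to_alarm(
    sample_sensitivity=0.79, sample_specificity=0.85,
    sop_seconds=600, cadence_seconds=60, refractory_seconds=600,
)
print(rough)
```

With these inputs the refractory cap (6 alarms per hour) is tighter than the specificity-derived rate (about 9 per hour), which shows how policy settings, not only model quality, shape alarm-level numbers.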
scitex_seizure_metrics.papers — paper-replica shims (Karoly 2017, Maturana 2020, Kuhlmann 2018, Andrade 2024)
from scitex_seizure_metrics.papers import andrade2024
out = andrade2024.metrics(
    y_true=labels, y_proba=preds,
    times_seconds=times, seizure_times=onsets,
)
print(out["sample_auroc"], out["alarm_sensitivity"], out["beats_chance_alarm"])
# Reproduces the side-by-side sample-vs-alarm panel from the paper.
Available shims: karoly2017, maturana2020, kuhlmann2018, andrade2024. Each metrics(...) returns a dict in the paper's preferred metric set.
scitex_seizure_metrics.calibration — Brier decomposition + reliability diagram
from scitex_seizure_metrics import calibration, plots
cal = calibration.calibration_report(y_true, y_proba, n_bins=10)
print(cal.brier, cal.reliability, cal.resolution, cal.uncertainty,
      cal.expected_calibration_error)
plots.reliability_diagram(cal)
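The quantities printed above follow the standard binned (Murphy) decomposition, Brier = reliability − resolution + uncertainty. The sketch below is a plain-Python illustration with a hypothetical function name and equal-width bins. The library's binning may differ, and the identity is exact only when all predictions within a bin are equal, as in this toy data.

```python
def brier_decomposition(y_true, y_proba, n_bins=10):
    """Binned Brier decomposition plus expected calibration error (ECE)."""
    n = len(y_true)
    base_rate = sum(y_true) / n
    bins = {}
    for t, p in zip(y_true, y_proba):
        k = min(int(p * n_bins), n_bins - 1)  # equal-width bin index
        bins.setdefault(k, []).append((t, p))
    reliability = resolution = ece = 0.0
    for members in bins.values():
        n_k = len(members)
        observed = sum(t for t, _ in members) / n_k  # event frequency in the bin
        mean_p = sum(p for _, p in members) / n_k    # mean forecast in the bin
        reliability += n_k * (mean_p - observed) ** 2 / n
        resolution += n_k * (observed - base_rate) ** 2 / n
        ece += n_k * abs(mean_p - observed) / n
    return {
        "brier": sum((p - t) ** 2 for t, p in zip(y_true, y_proba)) / n,
        "reliability": reliability,
        "resolution": resolution,
        "uncertainty": base_rate * (1 - base_rate),
        "expected_calibration_error": ece,
    }


# Toy forecasts that take only two distinct values, so the decomposition is exact.
y_true = [0, 0, 0, 0, 1, 1, 1, 1, 0, 1]
y_proba = [0.1, 0.1, 0.1, 0.1, 0.1, 0.8, 0.8, 0.8, 0.8, 0.8]
cal = brier_decomposition(y_true, y_proba)
print(cal)
```

Here the 0.8 bin is perfectly calibrated (4 of 5 events) while the 0.1 bin under-forecasts (1 of 5 events), so the whole reliability term comes from the low-probability bin.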
scitex_seizure_metrics.plots — relationships between metrics
from scitex_seizure_metrics import plots
plots.sensitivity_vs_fp_per_hour(sweep_df) # operating curve
plots.ioc_vs_surrogate(sweep_df) # model vs chance
plots.cadence_ablation(policy_sweep_df) # FP/hr vs cadence
plots.sample_vs_alarm_scatter(per_patient_df) # the Andrade 2024 figure
plots.metric_correlation_heatmap(per_patient_df) # redundancy diagnostic
References
- Andrade I, Teixeira C, Pinto M (2024). On the performance of seizure prediction machine learning methods across different databases: the sample and alarm-based perspectives. Frontiers in Neuroscience. doi:10.3389/fnins.2024.1417748.
- Cook MJ et al. (2013). Lancet Neurology. doi:10.1016/S1474-4422(13)70075-9.
- Dan J et al. (2024). SzCORE. Epilepsia. doi:10.1111/epi.18113.
- Direito B et al. (2011). EPILAB. J Neurosci Methods. doi:10.1016/j.jneumeth.2011.06.022.
- Karoly PJ et al. (2017). Brain. doi:10.1093/brain/awx173.
- Kuhlmann L et al. (2018). Brain. doi:10.1093/brain/awy210.
- Maturana MI et al. (2020). Nature Communications. doi:10.1038/s41467-020-15908-3.
- Mormann F et al. (2007). Seizure prediction: the long and winding road. Brain. doi:10.1093/brain/awl241.
- Schulze-Bonhage A et al. (2020). Performance Metrics for Online Seizure Prediction. PMC7340210.
Part of SciTeX
scitex-seizure-metrics is part of SciTeX. Installing through the umbrella package with pip install scitex[seizure-metrics] exposes it as scitex.seizure_metrics, the seizure-evaluation surface re-exported from this package. This is equivalent to scitex-ml[seizure] / scitex_ml.metrics.seizure, and suits users who want only this slice without the rest of scitex-ml.
Four Freedoms for Research
- The freedom to run your research anywhere — your machine, your terms.
- The freedom to study how every step works — from raw data to final manuscript.
- The freedom to redistribute your workflows, not just your papers.
- The freedom to modify any module and share improvements with the community.
AGPL-3.0 — because we believe research infrastructure deserves the same freedoms as the software it runs on.