Seizure EEG Detector
Prepare detector-ready EEG features and run seizure detection cross-validation on extracted CHB-MIT and EU Epilepsy records.
This package is the companion detector pipeline for
seizure-eeg-extractor.
Use the extractor first to convert raw dataset files into eeg.npy and
info.pkl record folders. This package then computes detector feature arrays,
creates seizure/interictal arrays, and trains simple baseline classifiers.
Two temporal feature encodings are implemented. Choose one explicitly with
--feature-method when preparing features:
- Energy-decay-memory (--feature-method edm): O'Leary, G., Groppe, D. M., Valiante, T. A., Verma, N., and Genov, R. (2018). "NURIP: Neural Interface Processor for Brain-State Classification and Programmable-Waveform Neurostimulation." IEEE Journal of Solid-State Circuits, 53(11), 3150-3162. https://doi.org/10.1109/JSSC.2018.2869579
- Windowed spectral features (--feature-method windowed): Shoeb, A. H. (2009). "Application of Machine Learning to Epileptic Seizure Onset Detection and Treatment." PhD thesis, Massachusetts Institute of Technology.
The raw EEG datasets are not included. Download and use CHB-MIT and EU Epilepsy/EPILEPSIAE data according to their own access, citation, privacy, and data-use terms.
Installation
Use Python 3.10 or newer.
python3 -m venv .venv
source .venv/bin/activate
python -m pip install --upgrade pip
python -m pip install seizure-eeg-detector
For local development:
python -m pip install -e ".[dev]"
Expected Input
seizure-eeg-detector expects the NumPy record layout produced by
seizure-eeg-extractor,
which is also available on
PyPI:
extracted_data/
    <patient_id>/
        record_0/
            eeg.npy
            info.pkl
        record_1/
            eeg.npy
            info.pkl
Each info.pkl must include fs, channel_names, num_seizures, and
seizure_times. Seizure intervals use record-local sample indices:
onset_index and offset_index.
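As a rough illustration of the metadata contract above, the sketch below checks an info.pkl-style dictionary for the four required keys and converts one seizure interval from sample indices to seconds. The helper names `missing_info_keys` and `interval_seconds` are hypothetical, and so is the assumption that each entry of `seizure_times` is a dictionary holding `onset_index` and `offset_index`; the extractor's actual container type may differ.

```python
REQUIRED_KEYS = ("fs", "channel_names", "num_seizures", "seizure_times")


def missing_info_keys(info):
    """Return the required metadata keys that are absent from `info`."""
    return [key for key in REQUIRED_KEYS if key not in info]


def interval_seconds(interval, fs):
    """Convert record-local sample indices to (onset, offset) in seconds."""
    return interval["onset_index"] / fs, interval["offset_index"] / fs


# Hypothetical metadata for a 256 Hz record with one labeled seizure.
# The per-interval dictionary layout is an assumption, not the documented format.
example_info = {
    "fs": 256,
    "channel_names": ["FP1-F7", "F7-T7"],
    "num_seizures": 1,
    "seizure_times": [{"onset_index": 2560, "offset_index": 7680}],
}
```

A check like this can be run on a loaded info.pkl before feature preparation to catch incomplete records early.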
Command Line Usage
Prepare detector feature arrays for selected CHB-MIT patients:
eeg-detect prepare chbmit /path/to/extracted_data \
--patients chb01 chb02 \
--feature-method edm \
--downsample-factor 256
Prepare windowed spectral features:
eeg-detect prepare chbmit /path/to/extracted_data \
--patients chb01 chb02 \
--feature-method windowed \
--window-seconds 2 \
--window-count 3 \
--downsample-factor 256
Prepare selected EU patients using manually chosen channels:
eeg-detect prepare eu /path/to/extracted_data \
--patients pat_FR_548 pat_FR_1096 pat_FR_1125 \
--channels HL1 HL2 HL3 HL4 HL5 HL6 HL7 HL8 \
--channel-mode within-patient \
--feature-method edm \
--downsample-factor 1024
--downsample-factor keeps every Nth prepared feature sample when writing
seizure_<n>.npy and interictal.npy. With EDM features, the raw EEG is
filtered and the EDM state is updated at the original sampling rate before this
output downsampling. For example, with 256 Hz CHB-MIT data,
--downsample-factor 256 creates one detector feature row per second
(effective_fs = 1 Hz).
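The arithmetic behind that example can be sketched with plain NumPy slicing. This is only an illustration of keeping every Nth row; `downsample_rows` is a hypothetical helper, and starting from row 0 is an assumption about where the package begins counting.

```python
import numpy as np


def downsample_rows(features, factor):
    """Keep every `factor`-th feature row (plain decimation, no averaging)."""
    return features[::factor]


fs = 256                               # CHB-MIT sampling rate in Hz
features = np.zeros((fs * 10, 4))      # 10 seconds of placeholder feature rows
kept = downsample_rows(features, 256)  # 10 rows remain
effective_fs = fs / 256                # 1.0 Hz, one feature row per second
```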
During preparation, the CLI reports the selected channels, whether it had to
fall back to the largest compatible record subset, whether
--channel-mode across-patients removed patient-available channels, and any
records skipped because they do not contain the selected channel set.
Run cross-validation after preparation:
eeg-detect cross-validate chbmit /path/to/extracted_data /path/to/results \
--patients chb01 \
--model lgbm \
--num-trees 1024 \
--threshold 0.5 \
--balance-training undersample
Apply additional thresholds to saved raw prediction scores without retraining:
eeg-detect threshold-sweep /path/to/results --thresholds 0.1 0.3 0.5
Summarize patient-level and overall results:
eeg-detect summarize /path/to/results
This writes:
- summary/summary.json
- summary/patient_summary.csv
- summary/overall_summary.csv
The summary includes the decision threshold, detected seizures, missed seizures,
sensitivity, latency statistics for detected seizures, total false positives,
and FPR/hour. Overall rows use pooled_* for metrics computed after combining
all patients, and mean_patient_* for unweighted means of patient-level
metrics.
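The difference between the two kinds of overall row is easiest to see with numbers. The sketch below uses made-up per-patient counts and a hypothetical `sensitivity` helper; it is not the package's summarizer, only an illustration of pooling versus unweighted averaging.

```python
def sensitivity(detected, missed):
    """Detected seizures divided by all seizures (NaN when there are none)."""
    total = detected + missed
    return detected / total if total else float("nan")


# Hypothetical per-patient counts: (detected seizures, missed seizures).
patients = {"chb01": (3, 1), "chb02": (1, 1)}

# Pooled: combine the counts from all patients first, then compute the metric.
pooled = sensitivity(
    sum(d for d, _ in patients.values()),
    sum(m for _, m in patients.values()),
)

# Mean-patient: compute the metric per patient, then take an unweighted mean.
mean_patient = sum(sensitivity(d, m) for d, m in patients.values()) / len(patients)
```

Here the pooled value is 4/6 while the unweighted patient mean is (0.75 + 0.5) / 2, so patients with many seizures weigh more heavily in the pooled figure.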
Supported models are lgbm, svm, and adaboost. The CLI defaults to
LightGBM on CPU for portability; pass --lgbm-device gpu only when your local
LightGBM build supports GPU training.
By default, cross-validation trains on every prepared seizure and interictal
feature row. Pass --balance-training undersample to randomly downsample the
majority class inside each training fold so the model sees equal numbers of
seizure and interictal rows. This does not change the held-out test records or
the reported metrics. Balanced result directories include the balance mode in
the path, for example lgbm/trees_1024/balance_undersample/threshold_0.5.
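Random undersampling of a training fold could look roughly like the following. This is a minimal sketch assuming binary labels (1 for seizure, 0 for interictal); the function name `undersample` and the `seed` parameter are illustrative and not the package's API.

```python
import numpy as np


def undersample(features, labels, seed=0):
    """Randomly drop majority-class rows until both classes have equal counts."""
    rng = np.random.default_rng(seed)
    pos = np.flatnonzero(labels == 1)
    neg = np.flatnonzero(labels == 0)
    n = min(len(pos), len(neg))
    keep = np.concatenate([
        rng.choice(pos, size=n, replace=False),
        rng.choice(neg, size=n, replace=False),
    ])
    keep.sort()  # preserve the original row order
    return features[keep], labels[keep]


# Hypothetical training fold: 5 seizure rows and 95 interictal rows.
labels = np.array([1] * 5 + [0] * 95)
features = np.arange(100, dtype=float).reshape(-1, 1)
balanced_x, balanced_y = undersample(features, labels)
```

Only the training rows pass through such a step; the held-out arrays are scored as they were prepared.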
Python Usage
from seizure_eeg_detector import (
    CHBPatient,
    CVTraining,
    DataPrepper,
    FeatureExtractor,
    ModelType,
    TrainingBalanceMode,
)

input_path = "/path/to/extracted_data"
patient = CHBPatient("chb01", input_path)

# Frequency bands and EDM decay settings for the feature extractor.
feature_extractor = FeatureExtractor(
    bands=[
        (0.5, 3.5),
        (3.5, 6.5),
        (6.5, 9.5),
        (9.5, 12.5),
        (12.5, 15.5),
        (15.5, 18.5),
        (18.5, 21.5),
        (21.5, 24.5),
    ],
    alphas=[7, 9, 10, 11, 12, 16],
    feature_method="edm",
)

# Prepare detector feature arrays for the patient's records.
dataprepper = DataPrepper(feature_extractor, downsample_factor=256)
channels = dataprepper.select_channels(patient.record_paths)
dataprepper.prep_data(patient.record_paths, channels)

# Run cross-validation with undersampling applied inside each training fold.
trainer = CVTraining(
    "/path/to/results",
    threshold=0.5,
    balance_training=TrainingBalanceMode.UNDERSAMPLE,
)
trainer.run_cv(patient, ModelType.LGBM, params={"objective": "binary"}, num_trees=1024)

# Summarize patient-level and overall results.
from seizure_eeg_detector import summarize_results, write_summary_outputs

summary = summarize_results("/path/to/results")
write_summary_outputs(summary, "/path/to/results/summary")

# Apply additional thresholds to saved raw prediction scores without retraining.
from seizure_eeg_detector import sweep_thresholds

sweep_thresholds("/path/to/results", [0.1, 0.3, 0.5])
Processing Method
Feature preparation always starts by computing spectral features:
- Spectral energy: selected EEG channels are filtered into configurable frequency bands with FIR bandpass filters, then absolute amplitudes are used as per-band features.
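To make the spectral-energy step concrete, here is a NumPy-only sketch of bandpass filtering followed by taking absolute amplitudes. The windowed-sinc design, the tap count, and the names `bandpass_fir` and `spectral_energy` are assumptions chosen for illustration; the package's actual filter design is not described here and may differ.

```python
import numpy as np


def bandpass_fir(low_hz, high_hz, fs, num_taps=257):
    """Windowed-sinc bandpass: difference of two lowpass filters, Hamming-windowed."""
    n = np.arange(num_taps) - (num_taps - 1) / 2

    def lowpass(cutoff_hz):
        fc = 2 * cutoff_hz / fs          # cutoff as a fraction of Nyquist
        return fc * np.sinc(fc * n)

    return (lowpass(high_hz) - lowpass(low_hz)) * np.hamming(num_taps)


def spectral_energy(eeg, bands, fs):
    """Return |bandpassed signal| per band for a 1-D channel: shape (samples, bands)."""
    columns = [
        np.abs(np.convolve(eeg, bandpass_fir(lo, hi, fs), mode="same"))
        for lo, hi in bands
    ]
    return np.stack(columns, axis=1)


fs = 256
t = np.arange(fs * 8) / fs
eeg = np.sin(2 * np.pi * 5.0 * t)        # synthetic 5 Hz tone, not real EEG
bands = [(0.5, 3.5), (3.5, 6.5), (6.5, 9.5)]
energy = spectral_energy(eeg, bands, fs)
```

With this synthetic input, most of the energy lands in the 3.5 to 6.5 Hz column, which is the band containing the 5 Hz tone.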
The temporal feature encoding must be selected with --feature-method:
- edm: each spectral-energy feature is expanded with a set of exponential decay traces controlled by alphas, following the EDM feature approach described by O'Leary et al. (2018).
- windowed: spectral-energy features are averaged over --window-seconds epochs, then the most recent --window-count epoch vectors are concatenated. This follows the temporal feature-vector design described by Shoeb (2009), where L = 2 seconds and W = 3 are the usual settings.
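The two temporal encodings can be sketched as follows. Both functions are illustrative only. The EDM update assumes each alpha sets a per-sample decay of 2**-alpha, which is one plausible reading of power-of-two decay settings; the windowed version assumes non-overlapping epochs. Neither `edm_traces` nor `windowed_features` is part of the package's API, and the package's exact update rules may differ.

```python
import numpy as np


def edm_traces(energy, alphas):
    """Expand each column of `energy` into one decaying trace per alpha.

    Assumed update: y[n] = y[n-1] + (x[n] - y[n-1]) * 2**-alpha.
    """
    energy = np.asarray(energy, dtype=float)
    decays = 2.0 ** -np.asarray(alphas, dtype=float)     # shape (A,)
    state = np.zeros((energy.shape[1], len(decays)))     # shape (F, A)
    out = np.empty((energy.shape[0], state.size))
    for n, row in enumerate(energy):
        state += (row[:, None] - state) * decays
        out[n] = state.ravel()
    return out


def windowed_features(energy, fs, window_seconds=2, window_count=3):
    """Average over non-overlapping epochs, then stack the latest W epoch vectors."""
    energy = np.asarray(energy, dtype=float)
    epoch_len = int(window_seconds * fs)
    n_epochs = energy.shape[0] // epoch_len
    trimmed = energy[: n_epochs * epoch_len]
    epochs = trimmed.reshape(n_epochs, epoch_len, -1).mean(axis=1)   # (E, F)
    rows = [
        epochs[i - window_count + 1 : i + 1].ravel()
        for i in range(window_count - 1, n_epochs)
    ]
    return np.array(rows)


energy = np.ones((256 * 10, 2))              # 10 s of two constant spectral features
edm = edm_traces(energy, alphas=[7, 9])      # shape (2560, 4)
win = windowed_features(energy, fs=256)      # shape (3, 6)
```

In the EDM sketch a smaller alpha forgets faster and tracks the input more closely, while a larger alpha keeps a longer memory; the windowed sketch multiplies the feature width by the window count instead.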
Prepared files are saved into each record directory:
- seizure_<n>.npy for each labeled seizure interval.
- interictal.npy for records with no labeled seizures.
- Optional se.npy files when --save-se is passed, and optional edm.npy files for EDM runs when --save-edm is passed.
Cross-validation leaves out one seizure record or one interictal record at a
time, trains on the remaining prepared arrays, writes metrics, and saves raw
and thresholded predictions under the results directory. The default decision
threshold is 0.5, and result directories include it in the run path, for
example lgbm/trees_1024/threshold_0.5. If training balancing is enabled, only
the training fold is resampled; held-out seizure and interictal arrays are
evaluated unchanged. The summarizer treats
latency: nan as a missed seizure when computing sensitivity. Detected latency
statistics exclude missed seizures. The fpr metric reported by this package is
false positives per hour, counted from positive prediction samples rather than
event-collapsed false alarms.
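The two metric conventions described above can be illustrated with a small sketch. The helper names and the example numbers are hypothetical, and computing the rate over interictal predictions only is an assumption about which samples the package counts.

```python
import math


def sensitivity_from_latencies(latencies):
    """Fraction of seizures detected; a NaN latency marks a missed seizure."""
    detected = [lat for lat in latencies if not math.isnan(lat)]
    return len(detected) / len(latencies), detected


def false_positives_per_hour(interictal_predictions, effective_fs):
    """Count positive samples (not collapsed events) per hour of data."""
    hours = len(interictal_predictions) / effective_fs / 3600
    return sum(interictal_predictions) / hours


# Hypothetical results: three seizures (one missed) and two hours at 1 Hz.
sens, detected = sensitivity_from_latencies([4.0, float("nan"), 9.0])
fpr = false_positives_per_hour([0] * 7194 + [1] * 6, effective_fs=1.0)
```

Because positives are counted per sample, a single false alarm that stays above threshold for six consecutive one-second rows contributes six counts, not one.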
Threshold sweeps reuse the saved preds_*.txt raw score files and write
sibling threshold_<value> result directories with recomputed metrics and
binary predictions. Cross-validation saves the effective prediction sampling
rate in run_config.json, and prediction files must contain complete raw score
arrays because omitted scores cannot be recovered during threshold sweeps.
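Re-thresholding saved scores needs no model at all, which is why the raw arrays must be complete. The sketch below shows the idea on a made-up score array; `apply_thresholds` is a hypothetical helper, and treating a score equal to the threshold as positive (`>=`) is an assumption.

```python
import numpy as np


def apply_thresholds(scores, thresholds):
    """Map each threshold to a 0/1 prediction array (score >= threshold)."""
    scores = np.asarray(scores, dtype=float)
    return {t: (scores >= t).astype(int) for t in thresholds}


raw_scores = [0.05, 0.2, 0.45, 0.8]          # hypothetical saved raw scores
sweep = apply_thresholds(raw_scores, [0.1, 0.3, 0.5])
```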
Citation
If you use this package in academic work, please cite the detector software:
@software{koerner2026seizure_eeg_detector,
author = {Koerner, Jamie},
title = {seizure-eeg-detector},
year = {2026},
url = {https://github.com/jamiekoe/seizure-eeg-detector}
}
The repository also includes CITATION.cff for citation managers and GitHub's
citation UI.
License
This project is distributed under the MIT License.