OpenECG

Clinically-grounded discrete tokenization and per-frame wave segmentation for electrocardiograms.

OpenECG ships:

A 13-symbol RLE token format (openecg.codec, openecg.vocab) that compresses 12-lead ECGs into a clinically interpretable sequence.
A pretrained Conv+Transformer per-frame wave classifier with a parallel boundary-regression head (openecg.stage2), trained on LUDB + QTDB + ISP. Average P / QRS / T boundary F1 is within about 0.03 of published results on LUDB and ISP and about 0.11 below them on QTDB (see Performance).
Loaders and converters for LUDB, QTDB, and ISP datasets so you can reproduce every number in this README.

Install

pip install openecg

PyTorch is a runtime dependency. On CUDA boxes, install the matching wheel first (pip install torch --index-url https://download.pytorch.org/whl/cu124).

Quickstart

from openecg import codec, vocab

# Tokenise a hand-built event stream of (sym_id, length_ms) tuples.
events = [
    (vocab.ID_ISO, 200), (vocab.ID_P, 80),  (vocab.ID_ISO, 80),
    (vocab.ID_Q,   20),  (vocab.ID_R, 40),  (vocab.ID_S, 40),
    (vocab.ID_ISO, 120), (vocab.ID_T, 200), (vocab.ID_ISO, 220),
]
packed = codec.encode(events)              # uint16 array (RLE pack)
print(codec.render_compact(events))        # one char per event
print(codec.render_timed(events, 20))      # char count proportional to ms
print(codec.decode(packed) == events)      # round-trip

For wave segmentation on a real ECG signal (10 s, 250 Hz, single lead → per-frame P/QRS/T/other labels), load a checkpoint with load_model and pass the signal to openecg.stage2.infer.predict_frames. For end-to-end usage, see scripts/sota_comparison.py.
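The link between per-frame labels and the (symbol, length_ms) event stream used in the Quickstart can be sketched in plain Python. This is a minimal illustration of run-length encoding, not the openecg.codec implementation: the label strings, the helper names frames_to_events / events_to_frames, and the 20 ms frame duration are all assumptions made for the example.

```python
from itertools import groupby

def frames_to_events(labels, frame_ms):
    """Collapse runs of identical per-frame labels into (label, duration_ms) events."""
    return [(label, sum(1 for _ in run) * frame_ms) for label, run in groupby(labels)]

def events_to_frames(events, frame_ms):
    """Expand (label, duration_ms) events back into one label per frame."""
    frames = []
    for label, length_ms in events:
        frames.extend([label] * (length_ms // frame_ms))
    return frames

# Hypothetical per-frame output for one beat, assuming 20 ms per frame.
labels = ["iso"] * 10 + ["P"] * 4 + ["iso"] * 4 + ["QRS"] * 5 + ["iso"] * 6 + ["T"] * 10
events = frames_to_events(labels, frame_ms=20)
print(events)
print(events_to_frames(events, frame_ms=20) == labels)  # lossless when durations are frame multiples
```

The round trip is exact only when every duration is a whole multiple of the frame length, which is one reason to keep the event lengths in milliseconds rather than in frames.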

Performance

Headline numbers come from the current best checkpoint, stage2_v15_canonical.pt. It is a Conv+Transformer per-frame classifier with a parallel boundary-regression head and an auxiliary QRS head tapped after the lower 4 transformer layers. The auxiliary head's softmaxed logits are concatenated with the lower features and projected back into the upper transformer's input (Phase 2 of the QRS-first hierarchy). The model is trained jointly on LUDB + QTDB + ISP plus a synthetic AV-block mix that includes Mobitz I / II, complete block, and paced ventricular escape scenarios. The table reports average F1 across the six P / QRS / T on/off boundaries, using the Martinez per-boundary tolerances (P 50 ms, QRS 40 ms, T_on 50 ms, T_off 100 ms); the BUT PDB AVB row reports peak F1 instead:

Dataset (eval split)	OpenECG v15	OpenECG v13_aux	OpenECG v12_reg (legacy)	Reference SOTA
LUDB val	0.947	0.953	0.947	DENS-ECG / Moskalenko 2020 ≈ 0.97
ISP test	0.967	0.964	0.966	SemiSegECG 2025 (semi-supervised) ≈ 0.97
QTDB pu0	0.859	0.856	0.847	Martinez 2004 wavelet ≈ 0.97 (T-annotated subset only)
BUT PDB AVB peak F1	0.714	0.680	0.709	— (only public AVB dataset with P labels)
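For readers unfamiliar with tolerance-based boundary scoring, the sketch below shows one common way to compute it: predicted and reference boundary times are matched one-to-one when they fall within the tolerance, and F1 is derived from the resulting counts. It is an assumption-laden illustration, not the scoring code in scripts/sota_comparison.py. The greedy matching strategy, the function names, and the example timestamps are hypothetical; the tolerance table is one reading of the values quoted above (P and QRS tolerances applied to both onset and offset).

```python
from statistics import median

# Tolerances in ms, read from the text above (assumed to apply to on and off alike for P and QRS).
TOLERANCE_MS = {"P_on": 50, "P_off": 50, "QRS_on": 40, "QRS_off": 40, "T_on": 50, "T_off": 100}

def match_boundaries(ref_ms, pred_ms, tol_ms):
    """Greedily match sorted boundary times one-to-one within tol_ms."""
    ref, pred = sorted(ref_ms), sorted(pred_ms)
    i = j = 0
    errors = []  # absolute timing error of each matched pair
    while i < len(ref) and j < len(pred):
        diff = pred[j] - ref[i]
        if abs(diff) <= tol_ms:
            errors.append(abs(diff))
            i += 1
            j += 1
        elif diff < 0:
            j += 1  # prediction is too early for this reference: false positive
        else:
            i += 1  # reference has no prediction close enough: false negative
    tp = len(errors)
    return tp, len(pred) - tp, len(ref) - tp, errors

def f1_score(tp, fp, fn):
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

# Hypothetical QRS onsets in ms.
ref = [1000, 1800, 2600, 3400]
pred = [1012, 1790, 2700, 3395, 3900]
tp, fp, fn, errors = match_boundaries(ref, pred, TOLERANCE_MS["QRS_on"])
print(tp, fp, fn, round(f1_score(tp, fp, fn), 3), median(errors))
```

Averaging the six per-boundary F1 values gives a single score per dataset, and the median of the matched errors corresponds to the kind of timing-error figure reported alongside it.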

v15 is the first model to improve BUT PDB AVB peak F1 over the v12_reg baseline (+0.005, 0.714 vs 0.709) while also posting the best ISP and QTDB scores among OpenECG checkpoints. On LUDB val it matches v12_reg (0.947) and trails v13_aux (0.953). The concat path lets the upper layers see the explicit QRS estimate as an input feature, which implements the clinical "find P/T relative to QRS" workflow as an architectural prior. Median boundary timing error is ≤20 ms on every wave on every dataset, meeting the clinical spec target.

Full design notes are in docs/superpowers/specs/2026-05-06-v12-postmortem.md. The QTDB +0.020 lift over the original v12_reg came from fix(qtdb): density-based window selection, which corrected a label-window bug that had silenced ~12 % of q1c records during training (see scripts/verify_qtdb_fix.py). Run scripts/sota_comparison.py to reproduce per-boundary breakdowns.
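The QRS-first concat path can be pictured with a small PyTorch sketch. Everything below is an illustrative assumption rather than the openecg.stage2 model: the class name, layer sizes, convolutional front end, number of upper layers, and output widths are hypothetical. Only the overall flow follows the description above: an auxiliary QRS head after the lower 4 layers, softmax, concatenation with the lower features, and a projection back into the upper transformer.

```python
import torch
from torch import nn

class QRSFirstSketch(nn.Module):
    """Toy Conv+Transformer with an auxiliary QRS head feeding the upper layers."""

    def __init__(self, d_model=32, n_heads=4, n_lower=4, n_upper=2, n_classes=4):
        super().__init__()

        def stack(num_layers):
            layer = nn.TransformerEncoderLayer(
                d_model, n_heads, dim_feedforward=2 * d_model, batch_first=True
            )
            return nn.TransformerEncoder(layer, num_layers=num_layers)

        # Hypothetical front end: 250 Hz samples -> one frame per 5 samples.
        self.conv = nn.Conv1d(1, d_model, kernel_size=7, stride=5, padding=3)
        self.lower = stack(n_lower)
        self.qrs_head = nn.Linear(d_model, 2)        # auxiliary QRS / not-QRS logits
        self.fuse = nn.Linear(d_model + 2, d_model)  # project concat back to d_model
        self.upper = stack(n_upper)
        self.cls_head = nn.Linear(d_model, n_classes)  # P / QRS / T / other per frame
        self.boundary_head = nn.Linear(d_model, 6)     # assumed: one output per boundary type

    def forward(self, x):
        # x: (batch, samples) single-lead signal
        h = self.conv(x.unsqueeze(1)).transpose(1, 2)  # (batch, frames, d_model)
        h = self.lower(h)
        qrs_logits = self.qrs_head(h)
        qrs_prob = qrs_logits.softmax(dim=-1)
        h = self.fuse(torch.cat([h, qrs_prob], dim=-1))  # upper layers see the QRS estimate
        h = self.upper(h)
        return self.cls_head(h), self.boundary_head(h), qrs_logits

model = QRSFirstSketch().eval()
with torch.no_grad():
    frame_logits, boundary_out, qrs_logits = model(torch.randn(2, 2500))  # 10 s at 250 Hz
print(frame_logits.shape, boundary_out.shape, qrs_logits.shape)
```

Returning the auxiliary logits alongside the main outputs lets a training loop add a separate QRS loss term, which is the usual way such a head is supervised.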

Single-lead robustness across the 12 LUDB leads is documented in scripts/per_lead_v4.py. Leads III and aVL are the physiologically expected weak spots (small P / T amplitude due to axis orthogonality); both are uncommon as sole monitoring leads in clinical practice.

Reproduce

$env:UV_LINK_MODE = "copy"     # Windows + OneDrive workaround (PowerShell syntax); set before syncing
$env:OPENECG_LUDB_ZIP = "<path-to-LUDB-zip>"
uv sync

uv run pytest                              # unit + stage2 (LUDB integration if env set)

# Train the current best (v15 concat+paced) — needs CUDA, ~1 h on RTX 4090
uv run python scripts/retrain_v15_concat_paced.py  # → data/checkpoints/stage2_v15_concat_paced.pt

# Phase 1 ablation (aux QRS head only, no concat)
uv run python scripts/retrain_v13_aux_qrs.py   # → data/checkpoints/stage2_v13_aux.pt

# Original boundary-only baseline (kept for backward compatibility)
uv run python scripts/train_v12_reg.py     # → data/checkpoints/stage2_v12_reg.pt

# Reproduce the headline table
uv run python scripts/sota_comparison.py   # → out/sota_comparison_*.json
