ECG generation and modeling experiments
ECGEN
ECG Generation Framework — PyTorch Lightning implementations of generative models for 12-lead electrocardiogram (ECG) signal synthesis and latent representation learning.
Overview
ECGEN provides two deep generative models for ECG signals, both trained primarily on the MIMIC-IV-ECG dataset (12-lead, 5000 samples/signal):
| Model | Type | Leads | Purpose |
|---|---|---|---|
| VAE | Variational Autoencoder | 12 | Unsupervised latent representation & reconstruction |
| Pulse2Pulse | WaveGAN (WGAN-GP) | 8 | Conditional ECG signal generation |
Installation
# From source
git clone https://github.com/vlbthambawita/ECGEN.git
cd ECGEN
pip install -e .
# With HuggingFace Hub support (for checkpoint uploads)
pip install 'ecgen[hf]'
Requirements: Python ≥ 3.8, PyTorch, PyTorch Lightning
Models
VAE — Variational Autoencoder
1D convolutional VAE with residual blocks for learning compact ECG representations. The encoder maps a 12-lead signal to a Gaussian latent space; the decoder reconstructs the signal from sampled latents.
Architecture: ResidualBlock1D → Encoder1D → latent (μ, σ) → Decoder1D
Loss: reconstruction_loss + kl_weight × KL_divergence
from ecgen.models import VAELightning, VAEConfig
config = VAEConfig(
    in_channels=12,
    base_channels=64,
    latent_channels=8,
    channel_multipliers=(1, 2, 4, 4),
    num_res_blocks=2,
    lr=1e-4,
    kl_weight=1e-4,
)
model = VAELightning(config)
# Generate new ECG signals
samples = model.sample(n_samples=16, seq_length=5000) # (16, 12, 5000)
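The loss stated above (reconstruction_loss + kl_weight × KL_divergence) can be sketched with NumPy as follows. This is an illustration only: it assumes a mean-squared-error reconstruction term and a diagonal Gaussian posterior described by a mean and a log-variance, and the reduction used inside VAELightning may differ. `vae_loss` and the latent shape are hypothetical and not part of ecgen.

```python
import numpy as np


def vae_loss(x, x_hat, mu, logvar, kl_weight=1e-4):
    """Illustrative VAE objective: MSE reconstruction plus a weighted KL term.

    Assumption: the KL term is taken against a standard normal prior and
    averaged over all latent elements; ecgen's own reduction may differ.
    """
    recon = np.mean((x - x_hat) ** 2)
    # Closed-form KL(N(mu, sigma^2) || N(0, 1)) per latent element.
    kl = -0.5 * np.mean(1.0 + logvar - mu ** 2 - np.exp(logvar))
    return recon + kl_weight * kl, recon, kl


rng = np.random.default_rng(0)
x = rng.standard_normal((2, 12, 5000))  # batch of two 12-lead signals
x_hat = x + 0.1                          # pretend reconstruction, off by 0.1 everywhere
mu = np.zeros((2, 8, 625))               # hypothetical latent shape
logvar = np.zeros_like(mu)               # sigma = 1, so the posterior equals the prior

total, recon, kl = vae_loss(x, x_hat, mu, logvar)
```

With a small kl_weight such as 1e-4 the reconstruction term dominates, which favours faithful reconstructions over a tightly regularised latent space.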
Pulse2Pulse — WaveGAN
Wasserstein GAN with gradient penalty for conditional ECG generation. Uses the first 8 leads of MIMIC-IV-ECG signals.
from ecgen.models import Pulse2PulseGAN, Pulse2PulseConfig
config = Pulse2PulseConfig(
    model_size=50,
    num_channels=8,
    seq_length=5000,
    lr=1e-4,
    lmbda=10.0,   # gradient penalty weight
    n_critic=5,   # discriminator steps per generator step
)
model = Pulse2PulseGAN(config)
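To make the roles of n_critic and lmbda concrete, here is a small framework-free sketch. It assumes the usual WGAN-GP scheme, in which the critic (discriminator) is updated on every batch, the generator on every n_critic-th batch, and the gradient penalty is lmbda × (gradient norm − 1)². `wgan_gp_schedule` and `critic_loss` are hypothetical helpers; the training loop inside Pulse2PulseGAN may be organised differently.

```python
def wgan_gp_schedule(n_batches, n_critic=5):
    """Return the sequence of updates for a WGAN-GP style loop.

    Assumption: the critic is updated on every batch and the generator
    on every n_critic-th batch.
    """
    updates = []
    for step in range(1, n_batches + 1):
        updates.append("critic")
        if step % n_critic == 0:
            updates.append("generator")
    return updates


def critic_loss(d_real, d_fake, grad_norm, lmbda=10.0):
    """Scalar WGAN-GP critic loss for a single sample.

    d_real and d_fake are critic scores; grad_norm is the L2 norm of the
    critic's gradient at a point interpolated between real and fake data.
    """
    return d_fake - d_real + lmbda * (grad_norm - 1.0) ** 2


schedule = wgan_gp_schedule(n_batches=10, n_critic=5)
penalised = critic_loss(d_real=1.0, d_fake=-1.0, grad_norm=1.5)
```

In this example ten batches produce ten critic updates and two generator updates, and a gradient norm of 1.5 adds a penalty of 10 × 0.25 = 2.5 to the critic loss.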
Generating ECG Signals
Generate synthetic 8-lead ECG signals from any trained Pulse2Pulse checkpoint — local or directly from HuggingFace Hub:
from ecgen import generate
# From HuggingFace Hub: downloaded on first use, then reused from the local cache
paths = generate(
    model_path="hf://vlbthambawita/ECGEN/pulse2pulse/ptbxl/pulse2pulse_exp_ptbxl_full_epoch:900.pt",
    n_samples=10,
    output_dir="outputs/generated",
)
# → outputs/generated/sample_01.csv … sample_10.csv
# With interactive D3 HTML plots alongside each CSV (HTML is the default plot_format)
generate(..., ecgplot=True)
# → sample_01.csv + sample_01.html …
# Static SVG plots
generate(..., ecgplot=True, plot_format="svg")
# PDF plots (requires: pip install 'ecgen[plot]')
generate(..., ecgplot=True, plot_format="pdf")
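The hf://owner/repo/path form used for model_path can be read as a repository id followed by a file path inside that repository. The sketch below shows one way to split such a string; `split_hf_spec` is a hypothetical helper, and how ecgen parses the value internally is an assumption.

```python
def split_hf_spec(model_path):
    """Split 'hf://owner/repo/path/in/repo' into (repo_id, filename).

    Returns None for anything that does not start with 'hf://', which the
    caller can then treat as a local file path.
    """
    prefix = "hf://"
    if not model_path.startswith(prefix):
        return None
    parts = model_path[len(prefix):].split("/", 2)
    if len(parts) < 3 or not all(parts):
        raise ValueError(f"expected hf://owner/repo/path, got {model_path!r}")
    owner, repo, filename = parts
    return f"{owner}/{repo}", filename


spec = "hf://vlbthambawita/ECGEN/pulse2pulse/ptbxl/pulse2pulse_exp_ptbxl_full_epoch:900.pt"
repo_id, filename = split_hf_spec(spec)
local = split_hf_spec("runs/pulse2pulse/best.pt")  # not an hf:// path
```

A pair like this could be handed to `huggingface_hub.hf_hub_download(repo_id=..., filename=...)`, which stores downloads in a local cache; whether generate does exactly this is not confirmed here.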
| Parameter | Default | Description |
|---|---|---|
| model_path | - | Local .pt path or hf://owner/repo/path |
| n_samples | - | Number of ECGs to generate |
| output_dir | - | Output directory (created if absent) |
| format | "csv" | Signal output format |
| header | True | Lead-name header row in CSV (I,II,V1…V6) |
| ecgplot | False | Also save an ECG plot per sample |
| plot_format | "html" | "html" · "svg" · "pdf" |
| model_size | 50 | Generator model_size (must match training) |
| batch_size | 32 | Samples per GPU forward pass |
| device | auto | "cuda", "cpu", or None |
| denorm | 1.0 | Multiply raw output by this factor. Use 6000.0 for models trained on ECGDataSimple (÷6000 normalisation). Default 1.0 is correct for the PTB-XL checkpoint (physical mV, no normalisation). |
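The interaction of n_samples, batch_size and denorm can be illustrated with plain arithmetic. The sketch below assumes that generation runs in chunks of at most batch_size and that denorm is a simple multiplicative factor applied to the raw output, as the parameter descriptions suggest. Both helpers are hypothetical and not part of ecgen.

```python
def batch_sizes(n_samples, batch_size=32):
    """Sizes of the forward passes needed to produce n_samples signals."""
    full, rest = divmod(n_samples, batch_size)
    return [batch_size] * full + ([rest] if rest else [])


def denormalise(values, denorm=1.0):
    """Scale raw generator outputs back to the units of the training data."""
    return [v * denorm for v in values]


plan = batch_sizes(n_samples=70, batch_size=32)
restored = denormalise([0.0, 0.5, -0.25], denorm=6000.0)
```

Here 70 samples need passes of 32, 32 and 6 signals, and a raw value of 0.5 from a model trained with ÷6000 normalisation maps back to 3000.0.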
ECG Plots
plot_ecg renders any ECG array on a standard ECG graph-paper layout (pink/red grid, labelled leads, 25 mm/s paper speed, 10 mm/mV scale):
from ecgen import plot_ecg
import numpy as np
ecg = np.load("my_ecg.npy") # shape (8, 5000) or (5000, 8)
# Interactive HTML — pan/zoom with mouse/touch, save-SVG and Print buttons
plot_ecg(ecg, output_path="ecg.html", title="My ECG")
# Static SVG — embed in documents, no JS required
plot_ecg(ecg, output_path="ecg.svg", format="svg")
# PDF (requires: pip install 'ecgen[plot]')
plot_ecg(ecg, output_path="ecg.pdf", format="pdf")
# Return content string instead of writing a file
html_str = plot_ecg(ecg)
svg_str = plot_ecg(ecg, format="svg")
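Because plot_ecg accepts both (n_leads, n_samples) and (n_samples, n_leads) arrays, it can help to normalise the orientation yourself before doing any other processing. The following sketch assumes the lead axis is the one whose length equals the expected number of leads; `to_leads_first` is a hypothetical helper, and the detection rule that plot_ecg applies may differ.

```python
import numpy as np


def to_leads_first(ecg, n_leads=8):
    """Return a 2-D array shaped (n_leads, n_samples).

    Assumption: the axis whose length equals n_leads is the lead axis.
    """
    ecg = np.asarray(ecg)
    if ecg.ndim != 2:
        raise ValueError(f"expected a 2-D array, got shape {ecg.shape}")
    if ecg.shape[0] == n_leads:
        return ecg
    if ecg.shape[1] == n_leads:
        return ecg.T
    raise ValueError(f"no axis of length {n_leads} in shape {ecg.shape}")


a = to_leads_first(np.zeros((8, 5000)))   # already leads-first
b = to_leads_first(np.zeros((5000, 8)))   # transposed to leads-first
```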
The interactive HTML uses D3 v7 and is fully self-contained (single file). Features:
- Scroll / pinch to zoom (x-axis); drag to pan
- ECG grid scales with zoom so the mm² boxes always represent the same time/amplitude
- ⤓ SVG button exports the current view as a clean SVG file
- 🖨 Print / PDF button opens the browser print dialog (set destination to "Save as PDF")
| Parameter | Default | Description |
|---|---|---|
| ecg | - | (n_leads, n_samples) or (n_samples, n_leads) |
| sample_rate | 500 | Hz |
| lead_names | auto | Defaults to ["I","II","V1"…"V6"] |
| title | "ECG" | Shown in toolbar and file header |
| output_path | None | Write to file; None returns the content string |
| format | "html" | "html" · "svg" · "pdf" |
| paper_speed | 25.0 | mm/s |
| amplitude_scale | 10.0 | mm/mV |
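The sample_rate, paper_speed and amplitude_scale settings together determine how a signal maps onto millimetre paper. The arithmetic is sketched below with the default values; `paper_geometry` is a hypothetical helper written for illustration and is not part of ecgen.

```python
def paper_geometry(n_samples, sample_rate=500, paper_speed=25.0, amplitude_scale=10.0):
    """Convert signal length and scale settings into millimetres on paper."""
    duration_s = n_samples / sample_rate
    return {
        "duration_s": duration_s,
        "width_mm": duration_s * paper_speed,        # horizontal extent of one lead
        "mm_per_sample": paper_speed / sample_rate,  # spacing between adjacent samples
        "mm_per_mv": amplitude_scale,                # vertical deflection for 1 mV
        "small_box_s": 1.0 / paper_speed,            # 1 mm of paper, in seconds
        "small_box_mv": 1.0 / amplitude_scale,       # 1 mm of paper, in millivolts
    }


geom = paper_geometry(n_samples=5000)
```

With the defaults, a 5000-sample signal lasts 10 s and spans 250 mm, and each 1 mm box covers 0.04 s horizontally and 0.1 mV vertically.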
Training
VAE
# Config-driven (recommended)
python scripts/train_vae_mimic.py --config configs/experiments/vae_mimic.yaml
# Quick test with a small subset
python scripts/train_vae_mimic.py \
--data-dir /path/to/mimic-iv-ecg \
--max-samples 1000 --max-epochs 10 --batch-size 16
# Resume from checkpoint
python scripts/train_vae_mimic.py \
--config configs/experiments/vae_mimic.yaml \
--resume runs/vae_mimic/seed_42/checkpoints/last.ckpt
Pulse2Pulse
python -m ecgen.training.train --config configs/experiments/pulse2pulse_mimic.yaml
Run outputs are saved to runs/{experiment_name}/seed_{seed}/ containing:
- checkpoints/: best and last model weights
- samples/: generated ECG batches
- tb/: TensorBoard logs
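The run directory layout described above makes checkpoint paths predictable. As a small sketch, the path passed to --resume can be built from the experiment name and seed; `run_dir` and `last_checkpoint` are hypothetical helpers, and the file name last.ckpt is taken from the resume example earlier in this section.

```python
from pathlib import Path


def run_dir(experiment_name, seed, root="runs"):
    """Build runs/{experiment_name}/seed_{seed} as a Path."""
    return Path(root) / experiment_name / f"seed_{seed}"


def last_checkpoint(experiment_name, seed, root="runs"):
    """Path where the most recent checkpoint is expected to live."""
    return run_dir(experiment_name, seed, root) / "checkpoints" / "last.ckpt"


ckpt = last_checkpoint("vae_mimic", 42)
```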
Config format
All experiments use a YAML config with target + params for dynamic instantiation:
experiment:
  name: vae_mimic
  seed: 42
model:
  target: ecgen.models.vae.VAELightning
  params:
    config:
      in_channels: 12
      latent_channels: 8
      kl_weight: 0.0001
data:
  target: ecgen.data.mimic_dataset.MIMICIVECGDataset
  params:
    mimic_path: /path/to/mimic-iv-ecg
    batch_size: 32
    max_samples: null  # null = full dataset
trainer:
  max_epochs: 100
  accelerator: gpu
  devices: [0]
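The target + params convention is a common way to build objects from config files: target is a dotted import path and params holds the keyword arguments. A minimal sketch of the idea is shown below. `instantiate` is a hypothetical helper, not necessarily how ecgen resolves its configs, and a standard-library class stands in for the model so the example runs without ecgen installed.

```python
import importlib


def instantiate(spec):
    """Import spec['target'] and call it with spec['params'] as keyword arguments."""
    module_name, _, attr = spec["target"].rpartition(".")
    target = getattr(importlib.import_module(module_name), attr)
    # A missing or null params entry is treated as "no arguments".
    return target(**(spec.get("params") or {}))


# Stand-in target from the standard library; a real config would name an
# ecgen class such as ecgen.models.vae.VAELightning.
half = instantiate({
    "target": "fractions.Fraction",
    "params": {"numerator": 1, "denominator": 2},
})
```

Keeping class paths in the config means a new model or dataset can be swapped in by editing YAML rather than the training script.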
Uploading Checkpoints to HuggingFace
pip install 'ecgen[hf]'
huggingface-cli login
From Python:
from ecgen.f import upload_hf, upload_hf_single
# Upload using a YAML config (supports subcategories like vae/ptbxl/)
upload_hf("configs/upload_checkpoints.yaml")
# Upload a single checkpoint
upload_hf_single(
    local_path="runs/vae/ptbxl/best.ckpt",
    repo_path="vae/ptbxl/best.ckpt",
    repo_id="your_username/ECGEN",
)
From CLI:
python scripts/upload_checkpoints_to_hf.py --dry-run # preview
python scripts/upload_checkpoints_to_hf.py # upload
Edit configs/upload_checkpoints.yaml to define which checkpoints map to which paths in the HF repo.
Package Structure
src/ecgen/
├── models/ — VAE and Pulse2Pulse model definitions
├── data/ — MIMIC-IV-ECG dataset loaders and data modules
├── training/ — Training loop, losses, metrics, callbacks
├── f/ — Functional utilities (upload_hf, upload_hf_single)
└── utils/ — Seeding, I/O, logging helpers
Releasing to PyPI
The package is published as ecgen automatically on tagged commits:
# Bump __version__ in src/ecgen/__init__.py, commit, then:
git tag v0.2.0
git push origin v0.2.0
GitHub Actions builds and publishes to PyPI via OIDC trusted publishing (no API token required).
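Since the version bump and the tag are separate manual steps, they can drift apart. The check below is one possible pre-push sanity test; it assumes tags follow the vMAJOR.MINOR.PATCH pattern shown above. `tag_matches_version` is a hypothetical helper and is not part of the repository's release workflow.

```python
import re


def tag_matches_version(tag, version):
    """True when a git tag such as 'v0.2.0' names the given package version."""
    match = re.fullmatch(r"v(\d+\.\d+\.\d+)", tag)
    return bool(match) and match.group(1) == version


ok = tag_matches_version("v0.2.0", "0.2.0")
stale = tag_matches_version("v0.2.0", "0.1.0")
```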
License
See LICENSE.