ECG generation and modeling experiments

ECGEN

ECG Generation Framework — PyTorch Lightning implementations of generative models for 12-lead electrocardiogram (ECG) signal synthesis and latent representation learning.

Overview

ECGEN provides two deep generative models for ECG signals, both trained primarily on the MIMIC-IV-ECG dataset (12-lead, 5000 samples per signal, i.e. 10 s at 500 Hz):

| Model | Type | Leads | Purpose |
| --- | --- | --- | --- |
| VAE | Variational Autoencoder | 12 | Unsupervised latent representation & reconstruction |
| Pulse2Pulse | WaveGAN (WGAN-GP) | 8 | Conditional ECG signal generation |

Installation

# From source
git clone https://github.com/vlbthambawita/ECGEN.git
cd ECGEN
pip install -e .

# With HuggingFace Hub support (for checkpoint uploads)
pip install 'ecgen[hf]'

Requirements: Python ≥ 3.8, PyTorch, PyTorch Lightning


Models

VAE — Variational Autoencoder

1D convolutional VAE with residual blocks for learning compact ECG representations. The encoder maps a 12-lead signal to a Gaussian latent space; the decoder reconstructs the signal from sampled latents.

Architecture: Encoder1D → latent (μ, σ) → Decoder1D, with ResidualBlock1D as the residual building block

Loss: reconstruction_loss + kl_weight × KL_divergence
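
As a rough illustration of the objective above, the following sketch computes both terms for a toy example in plain Python. It assumes a mean-squared-error reconstruction term and a standard-normal prior with a diagonal Gaussian posterior parameterised by `mu` and `log_var`; the README does not state which reconstruction loss ECGEN uses, and all function names here are hypothetical.

```python
import math
import random


def reparameterize(mu, log_var, rng):
    """Draw z = mu + sigma * eps with eps ~ N(0, 1), element-wise."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mu, log_var)]


def kl_divergence(mu, log_var):
    """KL( N(mu, sigma^2) || N(0, 1) ), summed over latent dimensions."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv) for m, lv in zip(mu, log_var))


def mse(x, x_hat):
    """Mean squared error (an assumed choice of reconstruction loss)."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


def vae_loss(x, x_hat, mu, log_var, kl_weight=1e-4):
    """reconstruction_loss + kl_weight * KL_divergence."""
    return mse(x, x_hat) + kl_weight * kl_divergence(mu, log_var)


rng = random.Random(0)
mu, log_var = [0.0, 0.0], [0.0, 0.0]            # posterior equal to the prior
z = reparameterize(mu, log_var, rng)            # one latent sample
x, x_hat = [0.1, -0.2, 0.3], [0.1, -0.2, 0.3]   # perfect reconstruction
loss = vae_loss(x, x_hat, mu, log_var)          # both terms vanish here
```

A small `kl_weight` such as 1e-4 lets the reconstruction term dominate, which favours faithful reconstructions over a latent distribution that closely matches the prior.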

from ecgen.models import VAELightning, VAEConfig

config = VAEConfig(
    in_channels=12,
    base_channels=64,
    latent_channels=8,
    channel_multipliers=(1, 2, 4, 4),
    num_res_blocks=2,
    lr=1e-4,
    kl_weight=1e-4,
)
model = VAELightning(config)

# Generate new ECG signals
samples = model.sample(n_samples=16, seq_length=5000)  # (16, 12, 5000)

Pulse2Pulse — WaveGAN

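Wasserstein GAN with gradient penalty for conditional ECG generation. Uses 8 of the 12 leads of MIMIC-IV-ECG signals: I, II, and V1–V6, matching the CSV header and default plot lead names listed later in this README. The remaining limb leads (III, aVR, aVL, aVF) are linear combinations of leads I and II.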

from ecgen.models import Pulse2PulseGAN, Pulse2PulseConfig

config = Pulse2PulseConfig(
    model_size=50,
    num_channels=8,
    seq_length=5000,
    lr=1e-4,
    lmbda=10.0,   # gradient penalty weight
    n_critic=5,   # discriminator steps per generator step
)
model = Pulse2PulseGAN(config)
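
For reference, the standard WGAN-GP critic objective shows where `lmbda` and `n_critic` enter. This is the textbook formulation; whether ECGEN's training loop follows it term for term is an assumption.

```latex
L_D = \mathbb{E}_{\tilde{x} \sim P_g}\left[ D(\tilde{x}) \right]
    - \mathbb{E}_{x \sim P_r}\left[ D(x) \right]
    + \lambda \, \mathbb{E}_{\hat{x} \sim P_{\hat{x}}}\left[ \left( \lVert \nabla_{\hat{x}} D(\hat{x}) \rVert_2 - 1 \right)^2 \right],
\qquad
\hat{x} = \epsilon x + (1 - \epsilon)\tilde{x}, \quad \epsilon \sim U[0, 1]

L_G = - \mathbb{E}_{\tilde{x} \sim P_g}\left[ D(\tilde{x}) \right]
```

Here λ corresponds to `lmbda=10.0`, and with `n_critic=5` the critic loss is minimised for five steps before each generator step.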

Generating ECG Signals

Generate synthetic 8-lead ECG signals from any trained Pulse2Pulse checkpoint — local or directly from HuggingFace Hub:

from ecgen import generate

# Shared arguments. A HuggingFace Hub checkpoint is downloaded once, then reused from the local cache.
common = dict(
    model_path="hf://vlbthambawita/ECGEN/pulse2pulse/ptbxl/pulse2pulse_exp_ptbxl_full_epoch:900.pt",
    n_samples=10,
    output_dir="outputs/generated",
)

# CSV output only
paths = generate(**common)
# → outputs/generated/sample_01.csv … sample_10.csv

# With interactive D3 HTML plots alongside each CSV (plot_format defaults to "html")
generate(**common, ecgplot=True)
# → sample_01.csv + sample_01.html …

# Static SVG plots
generate(**common, ecgplot=True, plot_format="svg")

# PDF plots (requires: pip install 'ecgen[plot]')
generate(**common, ecgplot=True, plot_format="pdf")

| Parameter | Default | Description |
| --- | --- | --- |
| model_path | | Local .pt path or hf://owner/repo/path |
| n_samples | | Number of ECGs to generate |
| output_dir | | Output directory (created if absent) |
| format | "csv" | Signal output format |
| header | True | Lead-name header row in CSV (I,II,V1…V6) |
| ecgplot | False | Also save an ECG plot per sample |
| plot_format | "html" | "html" · "svg" · "pdf" |
| model_size | 50 | Generator model_size (must match training) |
| batch_size | 32 | Samples per GPU forward pass |
| device | auto | "cuda", "cpu", or None |
| denorm | 1.0 | Multiply raw output by this factor. Use 6000.0 for models trained on ECGDataSimple (÷6000 normalisation). Default 1.0 is correct for the PTB-XL checkpoint (physical mV, no normalisation). |
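
To make the `hf://owner/repo/path` form of `model_path` concrete, here is a minimal sketch of how such a URI could be split into a repository id and a file path inside the repository. The helper name `split_hf_uri` is hypothetical and is not part of ECGEN; how the library parses the URI internally is not documented here.

```python
def split_hf_uri(uri):
    """Split 'hf://owner/repo/path/in/repo' into (repo_id, path_in_repo)."""
    prefix = "hf://"
    if not uri.startswith(prefix):
        raise ValueError(f"not an hf:// URI: {uri!r}")
    # At most two splits: owner, repo, then everything else as the file path.
    parts = uri[len(prefix):].split("/", 2)
    if len(parts) < 3 or not all(parts):
        raise ValueError(f"expected hf://owner/repo/path, got {uri!r}")
    owner, repo, path_in_repo = parts
    return f"{owner}/{repo}", path_in_repo


repo_id, filename = split_hf_uri(
    "hf://vlbthambawita/ECGEN/pulse2pulse/ptbxl/pulse2pulse_exp_ptbxl_full_epoch:900.pt"
)
```

A pair like this is the shape of input that `huggingface_hub.hf_hub_download(repo_id=..., filename=...)` expects, which is one plausible way to obtain a locally cached copy of the checkpoint.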

ECG Plots

plot_ecg renders any ECG array in the style of standard ECG graph paper, with a pink/red grid, labelled leads, 25 mm/s paper speed, and 10 mm/mV amplitude scale:

from ecgen import plot_ecg
import numpy as np

ecg = np.load("my_ecg.npy")  # shape (8, 5000) or (5000, 8)

# Interactive HTML — pan/zoom with mouse/touch, save-SVG and Print buttons
plot_ecg(ecg, output_path="ecg.html", title="My ECG")

# Static SVG — embed in documents, no JS required
plot_ecg(ecg, output_path="ecg.svg", format="svg")

# PDF (requires: pip install 'ecgen[plot]')
plot_ecg(ecg, output_path="ecg.pdf", format="pdf")

# Return content string instead of writing a file
html_str = plot_ecg(ecg)
svg_str  = plot_ecg(ecg, format="svg")

The interactive HTML uses D3 v7 and is fully self-contained (single file). Features:

  • Scroll / pinch to zoom (x-axis); drag to pan
  • ECG grid scales with zoom, so each 1 mm box always represents the same time and amplitude
  • ⤓ SVG button exports the current view as a clean SVG file
  • 🖨 Print / PDF button opens the browser print dialog (set destination to "Save as PDF")

| Parameter | Default | Description |
| --- | --- | --- |
| ecg | | Array of shape (n_leads, n_samples) or (n_samples, n_leads) |
| sample_rate | 500 | Sampling rate in Hz |
| lead_names | auto | Defaults to ["I","II","V1"…"V6"] |
| title | "ECG" | Shown in toolbar and file header |
| output_path | None | Write to file; None returns the content string |
| format | "html" | "html" · "svg" · "pdf" |
| paper_speed | 25.0 | Paper speed in mm/s |
| amplitude_scale | 10.0 | Amplitude scale in mm/mV |
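
The default values of `sample_rate`, `paper_speed`, and `amplitude_scale` follow the usual ECG paper conventions, and the arithmetic behind them can be sketched as below. The function `paper_geometry` is a hypothetical helper for illustration only; it is an assumption that `plot_ecg` applies exactly these relations internally.

```python
def paper_geometry(n_samples, sample_rate=500.0, paper_speed=25.0, amplitude_scale=10.0):
    """Relate signal length and plot settings to physical paper dimensions."""
    duration_s = n_samples / sample_rate      # recording length in seconds
    width_mm = duration_s * paper_speed       # horizontal extent on paper
    return {
        "duration_s": duration_s,
        "width_mm": width_mm,
        "seconds_per_mm": 1.0 / paper_speed,  # time covered by one 1 mm box
        "mv_per_mm": 1.0 / amplitude_scale,   # amplitude covered by one 1 mm box
    }


geom = paper_geometry(5000)
# A 5000-sample signal at 500 Hz lasts 10 s and spans 250 mm at 25 mm/s.
```

With the defaults, each 1 mm box corresponds to 0.04 s horizontally and 0.1 mV vertically, so a 1 mV deflection is drawn 10 mm tall.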

Training

VAE

# Config-driven (recommended)
python scripts/train_vae_mimic.py --config configs/experiments/vae_mimic.yaml

# Quick test with a small subset
python scripts/train_vae_mimic.py \
  --data-dir /path/to/mimic-iv-ecg \
  --max-samples 1000 --max-epochs 10 --batch-size 16

# Resume from checkpoint
python scripts/train_vae_mimic.py \
  --config configs/experiments/vae_mimic.yaml \
  --resume runs/vae_mimic/seed_42/checkpoints/last.ckpt

Pulse2Pulse

python -m ecgen.training.train --config configs/experiments/pulse2pulse_mimic.yaml

Run outputs are saved to runs/{experiment_name}/seed_{seed}/, which contains:

  • checkpoints/ — best and last model weights
  • samples/ — generated ECG batches
  • tb/ — TensorBoard logs
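
The run layout above can be navigated programmatically. The sketch below builds the run directory path and looks for `checkpoints/last.ckpt`, the file name used in the resume example earlier; the helper names `run_dir` and `latest_checkpoint` are hypothetical and not part of ECGEN.

```python
import tempfile
from pathlib import Path


def run_dir(experiment_name, seed, root="runs"):
    """Build runs/{experiment_name}/seed_{seed} following the layout above."""
    return Path(root) / experiment_name / f"seed_{seed}"


def latest_checkpoint(base):
    """Return checkpoints/last.ckpt if it exists, else None."""
    ckpt = base / "checkpoints" / "last.ckpt"
    return ckpt if ckpt.is_file() else None


# Demonstrate on a throwaway directory so no real run is touched.
with tempfile.TemporaryDirectory() as tmp:
    base = run_dir("vae_mimic", 42, root=tmp)
    for sub in ("checkpoints", "samples", "tb"):
        (base / sub).mkdir(parents=True, exist_ok=True)
    before = latest_checkpoint(base)               # nothing written yet
    (base / "checkpoints" / "last.ckpt").touch()   # stand-in for a real checkpoint
    after = latest_checkpoint(base)
    relative = after.relative_to(tmp).as_posix()
```

The `tb/` directory can be inspected with `tensorboard --logdir runs/vae_mimic/seed_42/tb`, substituting the experiment name and seed of the run in question.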

Config format

All experiments use a YAML config with target + params for dynamic instantiation:

experiment:
  name: vae_mimic
  seed: 42

model:
  target: ecgen.models.vae.VAELightning
  params:
    config:
      in_channels: 12
      latent_channels: 8
      kl_weight: 0.0001

data:
  target: ecgen.data.mimic_dataset.MIMICIVECGDataset
  params:
    mimic_path: /path/to/mimic-iv-ecg
    batch_size: 32
    max_samples: null   # null = full dataset

trainer:
  max_epochs: 100
  accelerator: gpu
  devices: [0]
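
The target + params pattern is commonly implemented with `importlib`. The following is a minimal sketch of that idea, not ECGEN's actual loader: the function name `instantiate` is an assumption, and the demo uses a standard-library class so that it runs without ECGEN installed. How the nested `config:` mapping is turned into a `VAEConfig` object is not shown in this README and is left out here.

```python
import importlib


def instantiate(spec):
    """Import the class named by spec['target'] and call it with spec['params']."""
    module_name, _, attr = spec["target"].rpartition(".")
    if not module_name:
        raise ValueError(f"target must be a dotted path, got {spec['target']!r}")
    cls = getattr(importlib.import_module(module_name), attr)
    # An empty 'params:' key in YAML loads as None, so fall back to no arguments.
    return cls(**(spec.get("params") or {}))


# Stand-in for a parsed YAML section such as the 'model:' block above.
spec = {"target": "fractions.Fraction", "params": {"numerator": 1, "denominator": 2}}
obj = instantiate(spec)
```

Because the class is resolved from a string at run time, swapping a model or dataset only requires editing the YAML file, not the training script.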

Uploading Checkpoints to HuggingFace

pip install 'ecgen[hf]'
huggingface-cli login

From Python:

from ecgen.f import upload_hf, upload_hf_single

# Upload using a YAML config (supports subcategories like vae/ptbxl/)
upload_hf("configs/upload_checkpoints.yaml")

# Upload a single checkpoint
upload_hf_single(
    local_path="runs/vae/ptbxl/best.ckpt",
    repo_path="vae/ptbxl/best.ckpt",
    repo_id="your_username/ECGEN",
)

From CLI:

python scripts/upload_checkpoints_to_hf.py --dry-run   # preview
python scripts/upload_checkpoints_to_hf.py             # upload

Edit configs/upload_checkpoints.yaml to define which checkpoints map to which paths in the HF repo.


Package Structure

src/ecgen/
├── models/        — VAE and Pulse2Pulse model definitions
├── data/          — MIMIC-IV-ECG dataset loaders and data modules
├── training/      — Training loop, losses, metrics, callbacks
├── f/             — Functional utilities (upload_hf, upload_hf_single)
└── utils/         — Seeding, I/O, logging helpers

Releasing to PyPI

The package is published as ecgen automatically on tagged commits:

# Bump __version__ in src/ecgen/__init__.py, commit, then:
git tag v0.2.0
git push origin v0.2.0

GitHub Actions builds and publishes to PyPI via OIDC trusted publishing (no API token required).


License

See LICENSE.
