Package curating cohesive training & inference pipelines for ECG analysis.

ecg-transform

Installation

pip install ecg-transform

Example

Here is an example of defining an input schema and transforms:

from ecg_transform.inp import ECGInputSchema
from ecg_transform.t.common import LinearResample, ReorderLeads
from ecg_transform.t.scale import MinMaxNormalize
from ecg_transform.t.cut import Pad, SegmentNonoverlapping

LEAD_ORDER = ['I', 'II', 'III', 'aVR', 'aVL', 'aVF', 'V1', 'V2', 'V3', 'V4', 'V5', 'V6']
SAMPLE_RATE = 500
N_SAMPLES = SAMPLE_RATE*10

SCHEMA = ECGInputSchema(
    sample_rate=SAMPLE_RATE,
    expected_lead_order=LEAD_ORDER,
    required_num_samples=N_SAMPLES,
)

TRANSFORMS = [
    ReorderLeads(
        expected_order=LEAD_ORDER,
        missing_lead_strategy='raise',
    ),
    LinearResample(desired_sample_rate=SAMPLE_RATE),
    MinMaxNormalize(),
    SegmentNonoverlapping(segment_length=N_SAMPLES),
    Pad(pad_to_num_samples=N_SAMPLES, value=0)
]
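
To make the shapes in this pipeline concrete, here is a minimal NumPy sketch of fixed-length segmentation followed by zero-padding. It is not the library's implementation: `segment_and_pad` is a hypothetical helper, the `(n_leads, n_samples)` layout is an assumption, and zero-padding the trailing partial window is only one plausible way to combine the two steps.

```python
import numpy as np

def segment_and_pad(signal, segment_length, pad_value=0.0):
    """Split a (n_leads, n_samples) array into fixed-length windows.

    Hypothetical helper: the last, partial window is padded with pad_value.
    """
    n_leads, n_samples = signal.shape
    n_segments = max(1, -(-n_samples // segment_length))  # ceiling division
    padded = np.full(
        (n_leads, n_segments * segment_length), pad_value, dtype=signal.dtype
    )
    padded[:, :n_samples] = signal
    # (n_leads, n_segments, segment_length) -> (n_segments, n_leads, segment_length)
    return padded.reshape(n_leads, n_segments, segment_length).transpose(1, 0, 2)

rng = np.random.default_rng(0)
recording = rng.standard_normal((12, 12_500)).astype(np.float32)  # 25 s at 500 Hz
segments = segment_and_pad(recording, segment_length=5_000)
print(segments.shape)  # (3, 12, 5000)
```

Under these assumptions a 25-second recording yields three 10-second segments, the last of which is half signal and half padding.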

Unit conversion & fixed-length resampling

For models with amplitude-sensitive front ends (e.g. a frozen BatchNorm calibrated in mV), use ConvertUnit to convert a signal to physical mV based on the unit declared in ECGMetadata.unit, and FourierResampleToLength to pin it to a fixed sample grid via scipy.signal.resample (FFT):

from ecg_transform.t.unit import ConvertUnit, FourierResampleToLength

TRANSFORMS = [
    ConvertUnit('mV'),                              # 'adc'/'counts' -> mV; 'mV' is a no-op
    ReorderLeads(LEAD_ORDER, 'raise'),
    FourierResampleToLength(1000, sample_rate=100), # length-normalize to a fixed grid
]

ConvertUnit raises if the declared unit is unknown (including None). Callers must therefore declare the source unit, which makes "ADC counts silently treated as mV" (or vice versa) impossible.

The ADC gain belongs to the data, not to the transform, so there is no device-specific default. When converting ADC counts → mV, supply the gain (µV/count) in one of two ways:

from ecg_transform.sample import ECGMetadata
from ecg_transform.t.unit import ConvertUnit

# (a) per-record (real, sample-specific) gain on the metadata:
meta = ECGMetadata(..., unit='adc', adc_microvolts_per_count=4.88)
transforms = [ConvertUnit('mV')]                       # reads it off each record

# (b) one corpus-wide constant (mock / no per-record metadata):
transforms = [ConvertUnit('mV', adc_microvolts_per_count=4.88)]  # applies to all

A constant passed to ConvertUnit takes precedence over per-record metadata; ADC input with neither raises an error.
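
The arithmetic and precedence rule described above can be sketched in plain NumPy. This is an illustration only, not the code inside ConvertUnit: `counts_to_millivolts` and its parameter names are hypothetical, and the exception type is an assumption.

```python
import numpy as np

def counts_to_millivolts(counts, record_gain_uv=None, constant_gain_uv=None):
    """Convert ADC counts to mV given a gain in microvolts per count.

    Hypothetical helper mirroring the documented rule: a constant wins over
    per-record metadata, and having neither is an error.
    """
    gain_uv = constant_gain_uv if constant_gain_uv is not None else record_gain_uv
    if gain_uv is None:
        raise ValueError("ADC input needs a gain (µV/count) from metadata or a constant")
    # counts * (µV / count) gives µV; divide by 1000 to get mV
    return np.asarray(counts, dtype=np.float64) * gain_uv / 1000.0

counts = np.array([0, 205, -410])
mv = counts_to_millivolts(counts, record_gain_uv=4.88)
print(mv)

# The constant overrides the per-record value.
overridden = counts_to_millivolts([1000], record_gain_uv=4.88, constant_gain_uv=2.0)
print(overridden)
```

With a gain of 4.88 µV/count, 205 counts correspond to about 1 mV, which is a quick sanity check that a signal really is in counts rather than already in mV.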

FourierResampleToLength preserves the input dtype and is a no-op when the signal is already the target length. Prefer it over LinearResample when the output must match an FFT-based pipeline: linear interpolation applies no anti-aliasing filter and diverges at sharp QRS peaks.
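
The anti-aliasing difference can be seen on a synthetic signal using scipy.signal.resample and np.interp directly. This is a sketch under stated assumptions, not the library's code: `resample_to_length` is a hypothetical wrapper, and the test signal is a 5 Hz tone plus a 70 Hz tone rather than a real ECG. Going from 500 Hz to 100 Hz, the 70 Hz component lies above the new Nyquist frequency (50 Hz).

```python
import numpy as np
from scipy.signal import resample

def resample_to_length(signal, num_samples):
    """FFT-resample along the last axis; return the input as-is if already that length."""
    if signal.shape[-1] == num_samples:
        return signal
    return resample(signal, num_samples, axis=-1)

def clean_tone(t):
    """The 5 Hz component that should survive downsampling."""
    return np.sin(2 * np.pi * 5 * t)

fs_in, fs_out, seconds = 500, 100, 10
t_in = np.arange(fs_in * seconds) / fs_in
t_out = np.arange(fs_out * seconds) / fs_out

signal = clean_tone(t_in) + 0.5 * np.sin(2 * np.pi * 70 * t_in)

fourier = resample_to_length(signal, t_out.size)  # drops content above 50 Hz
linear = np.interp(t_out, t_in, signal)           # no filtering: 70 Hz aliases

fourier_err = np.max(np.abs(fourier - clean_tone(t_out)))
linear_err = np.max(np.abs(linear - clean_tone(t_out)))
print(f"Fourier max error: {fourier_err:.2e}, linear max error: {linear_err:.2e}")
```

In this setup the FFT-based result matches the 5 Hz tone to within numerical precision, while the linearly interpolated result carries the aliased 70 Hz component. Note that FFT resampling treats the signal as periodic, so real recordings may show edge effects that this idealized tone does not.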

Here is an example of how ecg-transform could be used with PyTorch (which is not a required dependency, to keep the dependency footprint small):

from typing import List
from itertools import chain

from scipy.io import loadmat

import numpy as np

import torch
from torch.utils.data import Dataset
from torch.utils.data.dataloader import DataLoader

from ecg_transform.inp import ECGInput, ECGInputSchema
from ecg_transform.t.base import ECGTransform
from ecg_transform.sample import ECGMetadata, ECGSample

class ECGDataset(Dataset):
    def __init__(
        self,
        schema,
        transforms,
        file_paths,
    ):
        self.schema = schema
        self.transforms = transforms
        self.file_paths = file_paths

    def __len__(self):
        return len(self.file_paths)

    def __getitem__(self, idx):
        mat = loadmat(self.file_paths[idx])
        metadata = ECGMetadata(
            sample_rate=int(mat['org_sample_rate'][0, 0]),
            num_samples=mat['feats'].shape[1],
            lead_names=['I', 'II', 'III', 'aVR', 'aVL', 'aVF', 'V1', 'V2', 'V3', 'V4', 'V5', 'V6'],
            unit=None,
            input_start=0,
            input_end=mat['feats'].shape[1],
        )
        inp = ECGInput(mat['feats'], metadata)
        sample = ECGSample(
            inp,
            self.schema,
            self.transforms,
        )

        return torch.from_numpy(sample.out).float(), self.file_paths[idx]

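def collate_fn(inps):
    # Each item is (segments, file_path); repeat the path once per segment so
    # every row of the batch can be traced back to its source file.
    sample_ids = list(
        chain.from_iterable([[inp[1]]*inp[0].shape[0] for inp in inps])
    )
    # torch.cat also works on PyTorch releases that predate torch.concatenate.
    return torch.cat([inp[0] for inp in inps]), sample_ids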

def file_paths_to_loader(
    file_paths: List[str],
    schema: ECGInputSchema,
    transforms: List[ECGTransform],
    batch_size = 64,
    num_workers = 7,
):
    dataset = ECGDataset(
        schema,
        transforms,
        file_paths,
    )

    return DataLoader(
        dataset,
        batch_size=batch_size,
        num_workers=num_workers,
        pin_memory=True,
        sampler=None,
        shuffle=False,
        collate_fn=collate_fn,
        drop_last=False,
    )
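
The bookkeeping done by the collate function can be illustrated without PyTorch. The sketch below uses NumPy arrays in place of tensors and assumes each file yields an array shaped `(n_segments, n_leads, n_samples)`; the file names and the per-segment scores are made up for illustration.

```python
from itertools import chain

import numpy as np

# Hypothetical per-file outputs: (segments, file_path) pairs as a dataset might return.
items = [
    (np.zeros((3, 12, 5000), dtype=np.float32), "a.mat"),  # 3 segments
    (np.ones((1, 12, 5000), dtype=np.float32), "b.mat"),   # 1 segment
]

# Same logic as the collate function: stack segments, repeat each path per segment.
batch = np.concatenate([x for x, _ in items])
sample_ids = list(chain.from_iterable([path] * x.shape[0] for x, path in items))
print(batch.shape, sample_ids)

# Stand-in for per-segment model outputs, aggregated back to one score per file.
segment_scores = np.arange(batch.shape[0], dtype=np.float64)
per_file = {}
for score, path in zip(segment_scores, sample_ids):
    per_file.setdefault(path, []).append(score)
file_scores = {path: float(np.mean(vals)) for path, vals in per_file.items()}
print(file_scores)
```

Because the loader's batch size counts files rather than segments, a collated batch can contain more rows than the batch size whenever a file produces several segments.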
