A framework for creating deep learning pipelines for sleep data
SleePyPhases
SleePyPhases is an open-source Python workflow framework that provides unified, FAIR-compliant access to multiple sleep data repositories through a configuration-driven harmonization approach.
Overview
Sleep research relies on polysomnography (PSG) data from diverse public repositories and vendor systems, yet the lack of standardized access methods and semantic harmonization creates substantial barriers to data reuse and reproducibility. SleePyPhases addresses these challenges by:
- Standardizing channel naming across different datasets and vendors
- Harmonizing annotation semantics for sleep stages, arousals, respiratory events, and leg movements
- Unifying data formats to enable seamless multi-dataset studies
- Providing configuration-driven preprocessing with efficient storage mechanisms
- Ensuring reproducibility through configuration-based provenance tracking
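To illustrate the channel-name standardization listed above, here is a minimal sketch of an alias lookup. It is not the SleePyPhases implementation: the alias table, the canonical names, and the functions `build_lookup` and `harmonize_channels` are hypothetical and exist only for this example.

```python
# Hypothetical alias table: canonical name -> source names seen in different datasets.
CHANNEL_ALIASES = {
    "EEG": ["EEG", "EEG C4-A1", "C4-M1", "EEG Fpz-Cz"],
    "EOG(L)": ["EOG(L)", "E1-M2", "EOG LOC-A2"],
    "EMG": ["EMG", "Chin1-Chin2", "EMG Chin"],
}


def build_lookup(aliases):
    """Invert the alias table into a case-insensitive source -> canonical map."""
    lookup = {}
    for canonical, names in aliases.items():
        for name in names:
            lookup[name.strip().lower()] = canonical
    return lookup


def harmonize_channels(source_names, aliases=CHANNEL_ALIASES):
    """Return (mapping, unknown): canonical names for known channels, plus leftovers."""
    lookup = build_lookup(aliases)
    mapping, unknown = {}, []
    for name in source_names:
        canonical = lookup.get(name.strip().lower())
        if canonical is None:
            unknown.append(name)
        else:
            mapping[name] = canonical
    return mapping, unknown


if __name__ == "__main__":
    mapping, unknown = harmonize_channels(["C4-M1", "e1-m2", "SpO2"])
    print(mapping)
    print(unknown)
```

Returning the unknown channels separately, rather than dropping them silently, makes it easier to spot recordings whose montage is not yet covered by the alias table.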
Features
- Unified Data Access: Load data from 10+ public repositories and 2 commercial vendors through a single interface
- Configuration-Driven: Define preprocessing, data manipulation, and training pipelines through YAML configuration
- Automatic Synchronization: Generated artifacts stay synchronized with configuration changes
- Modular Architecture: Extend functionality through plugins for datasets, preprocessing, and ML frameworks
- ML Pipeline Integration: Built-in support for PyTorch and TensorFlow training workflows
- Comprehensive Evaluation: Segment-wise and event-wise evaluation with clinical metrics
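One common way to keep generated artifacts synchronized with configuration changes is to derive each artifact's name from a hash of the configuration values it depends on. The sketch below illustrates that general idea only; whether SleePyPhases works this way is an assumption, and `config_hash`, `artifact_id`, and the choice of relevant keys are hypothetical.

```python
import hashlib
import json


def config_hash(config, relevant_keys):
    """Hash only the config entries an artifact depends on (stable key order)."""
    subset = {key: config[key] for key in sorted(relevant_keys) if key in config}
    payload = json.dumps(subset, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:12]


def artifact_id(name, config, relevant_keys):
    """Build an artifact name that changes whenever a relevant config value changes."""
    return f"{name}-{config_hash(config, relevant_keys)}"


if __name__ == "__main__":
    config = {
        "useLoader": "shhs",
        "preprocessing": {"targetFrequency": 64},
        "trainingParameter": {"batchSize": 32},
    }
    deps = ["useLoader", "preprocessing"]
    before = artifact_id("preprocessed", config, deps)

    # A training-only change leaves the preprocessing artifact untouched.
    config["trainingParameter"]["batchSize"] = 64
    print(artifact_id("preprocessed", config, deps) == before)

    # A preprocessing change yields a new artifact name, forcing regeneration.
    config["preprocessing"]["targetFrequency"] = 128
    print(artifact_id("preprocessed", config, deps) == before)
```

Because the hash covers only the listed keys, unrelated edits do not invalidate stored data, while any relevant edit produces a new name and therefore a fresh artifact.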
Supported Datasets & Formats
Public Repositories
- Sleep Heart Health Study (SHHS)
- Multi-Ethnic Study of Atherosclerosis (MESA)
- MrOS Sleep Study
- PhysioNet 2018 Challenge
- SleepEDF Database Expanded
- Cleveland Family Study (CFS) - WIP
- PhysioNet 2023 Challenge - WIP
- Human Sleep Project (HSP) - WIP
- Dreem Open Dataset - WIP
- CAP Sleep Database - WIP
- ISRUC-Sleep - WIP
Vendor Formats
- Philips Alice®
- Somnomedics Domino®
- Nox Medical® - WIP
- Profusion Sleep Software® - WIP
- Sonata® - WIP
Requirements
SleePyPhases requires Python 3.8+ or Docker.
Quick Start
1. Clone Example Project
- Clone the example project:
git clone https://gitlab.com/sleep-is-all-you-need/pyphases/spp-boilderplate.git SPP-MyProject
- Move into the project directory:
cd SPP-MyProject
The example project can be customized using:
- src/SignalPreprocessing.py: signal preprocessing whose output is stored on the filesystem
- src/DataManipulation.py: data manipulation applied before the data is passed to the ML model
- src/models/SimpleCRNN/SimpleCRNN.py: a very basic CNN/LSTM PyTorch model
- config/config.yml: workflow configuration
The example project also provides:
- a basic pyPhases structure
- Dockerfile: defines the Docker image
- docker-compose.yml: builds and runs the Docker container
The example project can be extended by:
- Init-Phase: injects custom data manipulation and preprocessing code into the project
- project.yaml: basic project configuration, used to add additional phases
Setup (docker compose)
- update the volumes in docker-compose.yml
- remove the NVIDIA GPU deploy section in docker-compose.yml if no NVIDIA GPU is available
- the data, logs and eval folders will be created and require write access
- phases can be executed using:
docker compose run phases run Training
Setup (Python)
- install the requirements:
pip install -r requirements.txt
- the data, logs and eval folders will be created and require write access
- phases can be executed using:
  - phases run Training (if installed in the environment)
  - python -m phases run Training (if Python is installed)
Change Configuration
Changes can be made in configs/config.yml, or by creating a new config file and loading it with the -c parameter: phases run -c myconfig1.yml,myconfig2.yml Training.
useLoader: shhs
shhs-path: /path/to/shhs/dataset
preprocessing:
  targetFrequency: 64
  labelFrequency: 64
  stepsPerType:
    eeg: [filter, resample, standardize]
  targetChannels:
    - [EEG]
labelChannels:
  - SleepStagesAASM
dataversion:
  version: my-experiment
  seed: 2025
  folds: 5
  split:
    test: "0:100"
    trainval: "100:500"
The shhs dataset needs to be downloaded to the shhs-path location.
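The stepsPerType entry lists named steps that run in order for each channel type. The sketch below shows one way such a list could be dispatched. The registry `STEPS`, the function names, and the step implementations are hypothetical: the steps are deliberately naive stand-ins (a moving average as a crude low-pass filter, integer decimation as resampling), and a real pipeline would use proper filter design and most likely NumPy/SciPy.

```python
from statistics import fmean, pstdev


def moving_average(signal, fs, prep, width=3):
    """Crude low-pass stand-in: average over a sliding window of `width` samples."""
    half = width // 2
    out = []
    for i in range(len(signal)):
        window = signal[max(0, i - half): i + half + 1]
        out.append(fmean(window))
    return out, fs


def decimate(signal, fs, prep):
    """Naive resampling: keep every n-th sample (integer ratios only)."""
    target = prep["targetFrequency"]
    if fs % target != 0:
        raise ValueError("this sketch only supports integer downsampling ratios")
    return signal[:: fs // target], target


def standardize(signal, fs, prep):
    """Scale to zero mean and unit (population) standard deviation."""
    mean, std = fmean(signal), pstdev(signal)
    if std == 0:
        return [0.0 for _ in signal], fs
    return [(x - mean) / std for x in signal], fs


# Hypothetical registry mapping config step names to functions.
STEPS = {"filter": moving_average, "resample": decimate, "standardize": standardize}


def preprocess(signal, fs, channel_type, prep):
    """Run the steps configured for `channel_type` in the listed order.

    `prep` stands for the `preprocessing` section of the configuration.
    """
    for step_name in prep["stepsPerType"].get(channel_type, []):
        signal, fs = STEPS[step_name](signal, fs, prep)
    return signal, fs


if __name__ == "__main__":
    prep = {
        "targetFrequency": 64,
        "stepsPerType": {"eeg": ["filter", "resample", "standardize"]},
    }
    toy_signal = [float(i % 8) for i in range(256)]  # toy data "sampled" at 128 Hz
    out, fs = preprocess(toy_signal, 128, "eeg", prep)
    print(fs, len(out))
```

Each step returns both the signal and its sampling rate, so a resampling step can change the rate without later steps needing to know where the change happened.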
Train and Evaluate a Model
- run the training:
phases run Training
- evaluate a trained model:
phases run EvalReport
Architecture
SleePyPhases is built on three main components:
- pyPhases: Core framework for configuration-driven project management
- SleepHarmonizer: PSG data harmonization plugin with standardized interfaces
- pyPhasesML: Machine learning operations including preprocessing and training
┌─────────────────────────────────────────────────────────┐
│                      SleePyPhases                       │
├─────────────────┬─────────────────┬─────────────────────┤
│    pyPhases     │ SleepHarmonizer │     pyPhasesML      │
│     (Core)      │  (Data Access)  │    (ML Pipeline)    │
├─────────────────┼─────────────────┼─────────────────────┤
│ - Configuration │ - Record Loader │ - Preprocessing     │
│ - Phases        │ - Channel Map   │ - Data Manipulation │
│ - Data Storage  │ - Annotations   │ - Model Training    │
│ - Plugins       │ - Metadata      │ - Evaluation        │
└─────────────────┴─────────────────┴─────────────────────┘
Configuration Examples
Data Filtering
dataversion:
  version: shhs1-ahi15
  filterQuery: recordId.str.startswith("shhs1-") and ahi > 15
  seed: 2025
  folds: 4
  split:
    test: "0:1000"
    trainval: "1000:2056"
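The filterQuery value reads like a pandas DataFrame.query expression, and the split values look like Python slice notation. Both readings are assumptions, not confirmed by the source. Under those assumptions, the sketch below expresses the same selection in plain Python on made-up records; `parse_slice` and `select_records` are hypothetical helpers, and the role of seed and folds is not modelled.

```python
def parse_slice(spec):
    """Turn a string such as "0:1000" into a slice object."""
    start, stop = spec.split(":")
    return slice(int(start) if start else None, int(stop) if stop else None)


def select_records(records, split):
    """Apply the example filter, then cut the result into named splits."""
    kept = [
        r for r in records
        if r["recordId"].startswith("shhs1-") and r["ahi"] > 15
    ]
    return {name: kept[parse_slice(spec)] for name, spec in split.items()}


if __name__ == "__main__":
    # Made-up metadata: alternating shhs1-/shhs2- IDs with synthetic AHI values.
    records = [
        {"recordId": f"shhs{1 + i % 2}-{200000 + i}", "ahi": float(i % 40)}
        for i in range(200)
    ]
    parts = select_records(records, {"test": "0:10", "trainval": "10:40"})
    print({name: len(rows) for name, rows in parts.items()})
```

Filtering before slicing means the split indices refer to positions in the filtered record list, not in the full dataset.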
Training/Model
modelName: MyModel # needs to be stored in src/models/MyModel.py
trainingParameter:
  learningRate: 0.00025
  learningRateDecay: 0.001
  batchSize: 32
  optimizer: adams
  shuffle: True
  shuffleSeed: 2025
  # test to run longer
  stopAfterNotImproving: 25
  maxEpochs: 1000
  validationMetrics: # for each label channel
    - [f1, kappa]
    - [f1, kappa]
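Judging by their names, stopAfterNotImproving and maxEpochs bound the training loop: training ends once the validation metric has not improved for the given number of epochs, or when the epoch limit is reached. That reading is an assumption. The sketch below shows this early-stopping logic in isolation; `train_with_early_stopping` and the `run_epoch` callable are hypothetical stand-ins for a real training step.

```python
def train_with_early_stopping(run_epoch, max_epochs, stop_after_not_improving):
    """Call run_epoch(epoch) -> validation score (higher is better) until the
    score has not improved for `stop_after_not_improving` epochs in a row or
    `max_epochs` is reached. Returns (best_score, best_epoch, epochs_run)."""
    best_score, best_epoch = float("-inf"), -1
    epochs_run = 0
    for epoch in range(max_epochs):
        score = run_epoch(epoch)
        epochs_run += 1
        if score > best_score:
            best_score, best_epoch = score, epoch
        elif epoch - best_epoch >= stop_after_not_improving:
            break
    return best_score, best_epoch, epochs_run


if __name__ == "__main__":
    # Made-up validation scores standing in for a real training run.
    scores = [0.50, 0.60, 0.65, 0.64, 0.63, 0.62, 0.70]
    result = train_with_early_stopping(lambda e: scores[e], len(scores), 3)
    print(result)
```

With a patience of 3, this toy run stops before it reaches the late improvement in the last epoch, which is the trade-off a larger stopAfterNotImproving value is meant to reduce.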
Evaluation
labelChannels:
  - SleepStagesAASM
  - SleepArousals
  - SleepApnea
  - SleepLegMovements
eval:
  batchSize: 1
  metrics:
    - [f1, kappa] # sleep stages
    - [auprc, f1] # arousal
    - [f1, kappa] # respiratory events
    - [auprc, f1] # leg movements
  clinicalMetrics:
    - tst # Total Sleep Time
    - waso # Wake After Sleep Onset
    - ahi # Apnea-Hypopnea Index
    - arousalIndex
    - indexPLMS
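As a rough illustration of the clinical metrics named above, the sketch below derives TST, WASO, and an events-per-hour index such as AHI from a hypnogram scored in 30-second epochs. It uses common textbook definitions, which may differ in detail from the framework's (for example, whether wake after the final sleep epoch counts toward WASO). The stage labels and function names are hypothetical.

```python
EPOCH_SECONDS = 30  # AASM scoring epoch length
SLEEP_STAGES = {"N1", "N2", "N3", "R"}


def total_sleep_time_min(hypnogram):
    """TST: minutes spent in any sleep stage."""
    asleep = sum(1 for stage in hypnogram if stage in SLEEP_STAGES)
    return asleep * EPOCH_SECONDS / 60


def waso_min(hypnogram):
    """WASO: minutes of wake between sleep onset and the final sleep epoch."""
    sleep_idx = [i for i, stage in enumerate(hypnogram) if stage in SLEEP_STAGES]
    if not sleep_idx:
        return 0.0
    onset, last = sleep_idx[0], sleep_idx[-1]
    wake = sum(1 for stage in hypnogram[onset:last + 1] if stage == "W")
    return wake * EPOCH_SECONDS / 60


def events_per_hour_of_sleep(n_events, hypnogram):
    """Index such as AHI or an arousal index: event count per hour of sleep."""
    hours = total_sleep_time_min(hypnogram) / 60
    return n_events / hours if hours > 0 else float("nan")


if __name__ == "__main__":
    # Made-up night: wake, light sleep, an awakening, deep sleep, REM, wake.
    hypnogram = (["W"] * 10 + ["N1"] * 20 + ["N2"] * 200 + ["W"] * 12
                 + ["N3"] * 100 + ["R"] * 40 + ["W"] * 8)
    print(total_sleep_time_min(hypnogram))
    print(waso_min(hypnogram))
    print(events_per_hour_of_sleep(45, hypnogram))
```

The same per-hour helper covers AHI, arousal index, and a PLMS index, since all three divide an event count by the hours of sleep.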
Custom PSG Loader Configuration
This example shows a custom file structure in which each recording, made with the Alice 6® PSG software, is stored in its own folder /recordings/{recordId} containing three files: {recording}.edf, {recording}.txt and {recording}.rml.
loader:
my-alice:
dataBase: DSDS
dataIsFinal: False # more recordings will be stored in the future
dataset:
loaderName: RecordLoaderAlice
dataHandler:
type: folders
listFilter: acq
canReadRemote: True
basePath: .
extensions: [.edf, .rml, .txt]
force: False
idPattern: .*/(.*]).edf
signal-path: "{recordId}/{recordId}.edf"
annotation-path: "{recordId}/{recordId}.rml"
metadata-path: "{recordId}/{recordId}.txt"
# the channels that should be extracted from the edf files
sourceChannels:
- name: EEG F3-A2
type: eeg
- name: EEG F4-A1
type: eeg
- name: EEG C3-A2
# ...
useLoader: my-alice
alice-path: /recordings
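The signal-path, annotation-path and metadata-path entries look like templates with a {recordId} placeholder, and idPattern looks like a regular expression that extracts the record ID from a file path. Both readings are assumptions. The sketch below shows how such templates and a pattern could be applied; `record_id_from_path`, `resolve_paths`, and in particular `ID_PATTERN` are hypothetical, and the pattern is a stand-in rather than the idPattern value from the configuration.

```python
import re
from pathlib import PurePosixPath

# Hypothetical stand-in pattern: capture the file stem of any .edf path.
ID_PATTERN = re.compile(r".*/(.*)\.edf$")

# Templates copied from the configuration example above.
TEMPLATES = {
    "signal": "{recordId}/{recordId}.edf",
    "annotation": "{recordId}/{recordId}.rml",
    "metadata": "{recordId}/{recordId}.txt",
}


def record_id_from_path(path):
    """Return the captured record ID, or None if the path does not match."""
    match = ID_PATTERN.match(path)
    return match.group(1) if match else None


def resolve_paths(base_path, record_id, templates=TEMPLATES):
    """Fill each template with the record ID and join it onto the base path."""
    base = PurePosixPath(base_path)
    return {
        kind: str(base / template.format(recordId=record_id))
        for kind, template in templates.items()
    }


if __name__ == "__main__":
    record_id = record_id_from_path("/recordings/acq_001/acq_001.edf")
    print(record_id)
    print(resolve_paths("/recordings", record_id))
```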
Validation & Reproducibility
SleePyPhases has been validated through reproduction of five published sleep analysis studies:
| Study | Model | Datasets | Original | SPP | Difference |
|---|---|---|---|---|---|
| Pourbabaee et al. | DRCNN | PhysioNet | 0.528 | 0.548 | +3.6% |
| Phan et al. | Transformer | SHHS | 0.828 | 0.828 | 0.0% |
| Kotzen et al. | CNN | MESA | 0.74 | 0.733 | -1.0% |
| Zahid et al. | CNN | MrOS | 0.704 | 0.679 | -3.7% |
| Lee et al. | Transformer | SleepEDF | 0.682 | 0.662 | -3.0% |
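The Difference column is consistent with the change relative to the SPP score, that is (SPP - Original) / SPP. This formula is inferred from the numbers in the table and is not stated by the source. The following sketch recomputes the column under that assumption; `relative_difference_percent` is a hypothetical helper.

```python
# (study, original score, SPP score) copied from the table above.
RESULTS = [
    ("Pourbabaee et al.", 0.528, 0.548),
    ("Phan et al.", 0.828, 0.828),
    ("Kotzen et al.", 0.74, 0.733),
    ("Zahid et al.", 0.704, 0.679),
    ("Lee et al.", 0.682, 0.662),
]


def relative_difference_percent(original, reproduced):
    """Difference in percent, relative to the reproduced (SPP) score."""
    return round((reproduced - original) / reproduced * 100, 1)


if __name__ == "__main__":
    for study, original, spp in RESULTS:
        print(f"{study:<20} {relative_difference_percent(original, spp):+.1f}%")
```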
Included Projects
- pyPhases Core - Core framework
- pyPhases Plugins - Record loaders and machine learning plugins
- Sleep Harmonizer - PSG harmonization plugin
- Reproduction Studies - All reproduction experiments
FAIR Principles
SleePyPhases adheres to FAIR principles:
- Findable: Public repositories on GitLab and Python Package Index
- Accessible: Open-source MIT licensing
- Interoperable: Supports multiple signal formats and vendor formats
- Reusable: Modular plugin architecture with versioned releases
License
This project is licensed under the MIT License - see the LICENSE file for details.
Acknowledgements
This research was funded by the Federal Ministry of Research, Technology and Space under the funding code 01ZZ2324F.
Computing resources were provided by the NHR Center of TU Dresden, jointly supported by the Federal Ministry of Education and Research and participating state governments.