A modular, configuration-driven, extensible M/EEG preprocessing pipeline using MNE-Python

These details have been verified by PyPI

Project links

Homepage

GitHub Statistics

Maintainers

lbelloli

These details have not been verified by PyPI

Project description

MEEGFlow: MEEG Preprocessing Pipeline

A modular, configuration-driven MEEG preprocessing pipeline using MNE-BIDS. The pipeline uses auxiliary functions for each preprocessing step, allowing you to choose which steps to run, their order, and their parameters through a simple YAML configuration.

Features

Flexible File Discovery: Support for both BIDS-formatted datasets and custom glob patterns
MNE-BIDS Integration: Seamlessly reads MEEG data in BIDS format
Modular Design: Each preprocessing step is a separate function
Configuration-Driven: Choose steps, their order, and parameters via YAML
Custom Steps Support: Extend the pipeline with your own preprocessing functions
Progress Tracking: Rich progress bars show real-time progress for recordings and preprocessing steps
Comprehensive Logging: MNE logger integration with optional log file output
Multiple Output Formats:
- Clean preprocessed epochs in .fif format
- Clean preprocessed raw data in .fif format
- Interactive HTML reports using MNE Report
- JSON reports for easy downstream processing
Batch Processing: Process multiple subjects sequentially
Command-line Interface: Easy to use from the terminal

Installation

Option 1: Docker (Recommended)

Using Docker is the easiest way to get started, as it includes all dependencies and system libraries.

Build the Docker image:

git clone https://github.com/Laouen/meegflow.git
cd meegflow
docker build -t meegflow .

Run the container:

docker run --rm -v /path/to/bids/data:/data meegflow \
    --bids-root /data \
    --subjects 01 02 \
    --tasks rest \
    --config /app/configs/config_example.yaml

Option 2: Local Installation

Clone this repository:

git clone https://github.com/Laouen/meegflow.git
cd meegflow

Install dependencies:

pip install -r requirements.txt

(Optional) Install the package to use the meegflow command:

pip install -e .

Usage

Using Docker

To use the Docker image, mount your BIDS dataset directory to /data in the container. The outputs will be written to the derivatives/meegflow subdirectory within your BIDS root.

Basic usage:

docker run --rm \
    -v /path/to/bids:/data \
    meegflow \
    --bids-root /data \
    --tasks rest

With custom configuration:

docker run --rm \
    -v /path/to/bids:/data \
    -v /path/to/custom/config.yaml:/config.yaml \
    meegflow \
    --bids-root /data \
    --subjects 01 02 03 \
    --tasks rest \
    --config /config.yaml

With log file output:

docker run --rm \
    -v /path/to/bids:/data \
    -v /path/to/logs:/logs \
    meegflow \
    --bids-root /data \
    --tasks rest \
    --log-file /logs/pipeline.log

Processing specific sessions:

docker run --rm \
    -v /path/to/bids:/data \
    meegflow \
    --bids-root /data \
    --subjects 01 02 \
    --sessions 01 02 \
    --tasks rest

Using Local Installation

Process Multiple Subjects

Run the preprocessing pipeline on multiple subjects:

python src/cli.py \
    --bids-root /path/to/bids/dataset \
    --subjects 01 02 03 \
    --tasks rest \
    --config configs/config_example.yaml

If you installed the package with pip install -e ., you can use the meegflow command:

meegflow \
    --bids-root /path/to/bids/dataset \
    --subjects 01 02 03 \
    --tasks rest \
    --config configs/config_example.yaml

Process all subjects with a specific task:

python src/cli.py \
    --bids-root /path/to/bids/dataset \
    --tasks rest

Process specific subjects with multiple tasks:

python src/cli.py \
    --bids-root /path/to/bids/dataset \
    --subjects 01 02 \
    --tasks rest task1 task2

Python API Usage

You can also use the pipeline directly in Python:

import sys

import yaml

sys.path.insert(0, 'src')  # make the in-repo modules importable
from meegflow import MEEGFlowPipeline
from readers import BIDSReader

# Load configuration
with open('configs/config_example.yaml', 'r') as f:
    config = yaml.safe_load(f)

# Create a BIDS reader
reader = BIDSReader('/path/to/bids/dataset')

# Initialize pipeline
pipeline = MEEGFlowPipeline(
    reader=reader,
    output_root='/path/to/derivatives',
    config=config
)

# Run preprocessing on multiple subjects
results = pipeline.run_pipeline(
    subjects=['01', '02', '03'],
    tasks='rest'
)

# Access results for each subject
for subject, result in results.items():
    print(f"Subject {subject}: {result}")

File Discovery with Readers

The pipeline supports two types of file readers for discovering data files:

BIDS Reader (Default)

The BIDS reader uses MNE-BIDS to automatically discover files in BIDS-formatted datasets:

# BIDS reader is the default (--reader bids can be omitted)
python src/cli.py \
    --bids-root /path/to/bids/dataset \
    --subjects 01 02 \
    --tasks rest \
    --config configs/config_example.yaml

Glob Reader

The glob reader allows you to work with custom directory structures using glob patterns with variable extraction:

python src/cli.py \
    --reader glob \
    --data-root /path/to/data \
    --glob-pattern "sub-{subject}/ses-{session}/eeg/sub-{subject}_task-{task}_eeg.vhdr" \
    --subjects 01 02 \
    --tasks rest \
    --config configs/config_example.yaml

Pattern syntax: Use {variable_name} placeholders which:

Convert to * wildcards for file matching
Extract matched values as metadata
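
As a rough illustration of the two roles a placeholder plays, the sketch below turns a pattern into a glob expression and into a regular expression that pulls the values back out. It is not GlobReader's implementation: the function names are hypothetical, and treating a repeated placeholder (such as {subject}) as "must match the same text each time" is an assumption.

```python
import re

# Matches {name} placeholders whose name is a valid identifier.
PLACEHOLDER = re.compile(r"\{([A-Za-z_]\w*)\}")

PATTERN = "sub-{subject}/ses-{session}/eeg/sub-{subject}_task-{task}_eeg.vhdr"


def pattern_to_glob(pattern):
    """Replace every {name} placeholder with a * wildcard."""
    return PLACEHOLDER.sub("*", pattern)


def pattern_to_regex(pattern):
    """Build a regex with one named group per distinct placeholder."""
    seen = set()
    parts = []
    pos = 0
    for match in PLACEHOLDER.finditer(pattern):
        # Literal text between placeholders is matched verbatim.
        parts.append(re.escape(pattern[pos:match.start()]))
        name = match.group(1)
        if name in seen:
            # Assumption: a repeated placeholder must match the same value.
            parts.append(f"(?P={name})")
        else:
            parts.append(f"(?P<{name}>[^/]+)")
            seen.add(name)
        pos = match.end()
    parts.append(re.escape(pattern[pos:]))
    return re.compile("".join(parts) + r"\Z")


def extract_metadata(pattern, relative_path):
    """Return the placeholder values for a path, or None if it does not match."""
    match = pattern_to_regex(pattern).match(relative_path)
    return match.groupdict() if match else None


print(pattern_to_glob(PATTERN))
print(extract_metadata(PATTERN, "sub-01/ses-02/eeg/sub-01_task-rest_eeg.vhdr"))
```

Restricting each value to `[^/]+` keeps a placeholder from spanning directory levels, which is one reasonable choice among several.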

Python API:

from meegflow import MEEGFlowPipeline
from readers import GlobReader

# Create a glob reader with your custom pattern
reader = GlobReader(
    data_root='/path/to/data',
    pattern='sub-{subject}/ses-{session}/eeg/sub-{subject}_task-{task}_eeg.vhdr'
)

# Initialize pipeline with the glob reader
# (config is the dict loaded from your YAML file, as in Python API Usage)
pipeline = MEEGFlowPipeline(
    reader=reader,
    config=config
)

# Run pipeline
results = pipeline.run_pipeline(subjects=['01', '02'], tasks='rest')

For detailed information on readers, pattern examples, and troubleshooting, see READERS.md.

Output Structure

The pipeline creates outputs in a BIDS-derivatives structure:

derivatives/meegflow/
├── epochs/              # When saving epochs with save_clean_instance
│   └── sub-01/
│       └── eeg/
│           └── sub-01_task-rest_proc-clean_desc-cleaned_epo.fif
├── raw/                 # When saving raw data with save_clean_instance
│   └── sub-01/
│       └── eeg/
│           └── sub-01_task-rest_proc-clean_desc-cleaned_epo.fif
└── reports/
    └── sub-01/
        └── eeg/
            ├── sub-01_task-rest_proc-clean_desc-cleaned_report.json
            └── sub-01_task-rest_proc-clean_desc-cleaned_report.html

Output Details

epochs/ or raw/: Contains MNE data objects saved in .fif format (if save_clean_instance step is included)
- Epochs can be loaded with mne.read_epochs()
- Raw data can be loaded with mne.io.read_raw_fif()
- Includes all preprocessing (filtering, artifact removal, baseline correction)
reports/: Contains preprocessing reports
- JSON report: Preprocessing parameters, quality metrics, steps performed (generated by generate_json_report step)
- HTML report: Interactive visualization (generated by generate_html_report step)
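
For downstream scripts it can help to compute the expected file locations rather than hard-code them. The helper below is a hypothetical convenience, not part of the package: it only mirrors the epochs and reports entries of the tree shown above, and it assumes no session level in the path.

```python
from pathlib import Path


def expected_outputs(output_root, subject, task, datatype="eeg"):
    """Return the epochs and report paths following the layout shown above."""
    root = Path(output_root)
    stem = f"sub-{subject}_task-{task}_proc-clean_desc-cleaned"
    sub_dir = Path(f"sub-{subject}") / datatype
    return {
        "epochs": root / "epochs" / sub_dir / f"{stem}_epo.fif",
        "json_report": root / "reports" / sub_dir / f"{stem}_report.json",
        "html_report": root / "reports" / sub_dir / f"{stem}_report.html",
    }


paths = expected_outputs("/path/to/bids/derivatives/meegflow", "01", "rest")
for kind, path in paths.items():
    print(kind, path)
```

The epochs path can then be passed to mne.read_epochs(), as noted in the list above.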

Configuration

The pipeline is configuration-driven. You define a list of preprocessing steps, their order, and parameters in a YAML file.
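
To make "configuration-driven" concrete, here is a minimal sketch of how a list of steps could be dispatched to functions. It is an illustration only, not the package's code: run_steps, the registry dict, and the two toy steps are hypothetical, and passing the step parameters without the name key is an assumption. The (data, step_config) signature follows the custom step convention described later in this document.

```python
def scale(data, step_config):
    """Toy step: multiply every value by a factor."""
    factor = step_config.get("factor", 1)
    data["values"] = [v * factor for v in data["values"]]
    data["preprocessing_steps"].append({"step": "scale", "factor": factor})
    return data


def offset(data, step_config):
    """Toy step: add a constant to every value."""
    amount = step_config.get("amount", 0)
    data["values"] = [v + amount for v in data["values"]]
    data["preprocessing_steps"].append({"step": "offset", "amount": amount})
    return data


def run_steps(config, registry, data):
    """Run the steps listed under 'pipeline' in order, looking each up by name."""
    for step in config.get("pipeline", []):
        params = dict(step)          # copy so the loaded config is not mutated
        name = params.pop("name")
        if name not in registry:
            raise ValueError(f"Unknown step {name!r}")
        data = registry[name](data, params)
    return data


# Equivalent to what yaml.safe_load would return for a two-step config.
config = {"pipeline": [{"name": "scale", "factor": 2},
                       {"name": "offset", "amount": 1}]}
registry = {"scale": scale, "offset": offset}
result = run_steps(config, registry, {"values": [1, 2, 3], "preprocessing_steps": []})
print(result["values"])
```

Because the order comes from the YAML list, swapping two entries in the file changes the processing order without touching any code.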

Available Steps

Data Organization:

strip_recording: Crop recordings to remove data outside the first and last events
concatenate_recordings: Concatenate multiple raw recordings into a single continuous recording
copy_instance: Create a copy of a data instance for comparison or backup purposes

Setup:

set_montage: Set channel montage for EEG data
drop_unused_channels: Explicitly drop specified channels by name

Filtering:

bandpass_filter: Apply bandpass filtering
notch_filter: Apply notch filtering

Preprocessing:

resample: Resample data to different sampling frequency
reference: Apply re-referencing
ica: ICA-based artifact removal

Bad Channel Detection:

find_flat_channels: Find flat/disconnected channels based on variance
find_bads_channels_threshold: Find bad channels using threshold-based rejection
find_bads_channels_variance: Find bad channels using variance-based detection
find_bads_channels_high_frequency: Find bad channels using high-frequency variance

Bad Channel Handling:

interpolate_bad_channels: Interpolate bad channels
drop_bad_channels: Drop bad channels without interpolation

Epoching:

find_events: Find events in the data
epoch: Create epochs around events
chunk_in_epoch: Create fixed-length epochs from continuous data
find_bads_epochs_threshold: Find and remove bad epochs using threshold-based rejection

Output:

save_clean_instance: Save raw or epochs data to .fif file
generate_json_report: Generate JSON report
generate_html_report: Generate HTML report

Example Configuration

See configs/config_example.yaml for a full pipeline with epochs:

pipeline:
  - name: bandpass_filter
    l_freq: 0.5
    h_freq: 40.0
  - name: reference
    ref_channels: average
    instance: 'raw'
  - name: find_events
    shortest_event: 1
  - name: epoch
    tmin: -0.2
    tmax: 0.8
    baseline: [null, 0]
    event_id: null
    reject:
      eeg: 1.5e-04
  - name: save_clean_instance
    instance: epochs
  - name: generate_json_report
  - name: generate_html_report

See configs/config_raw_only.yaml for a simpler pipeline without epoching:

pipeline:
  - name: bandpass_filter
    l_freq: 1.0
    h_freq: 30.0
  - name: reference
    ref_channels: average
  - name: ica
    n_components: 15
    method: fastica
    find_eog: true
    apply: true
  - name: generate_json_report

See configs/config_with_adaptive_reject.yaml for a pipeline with adaptive autoreject steps. This config includes additional preprocessing steps like montage setting, notch filtering, and resampling:

pipeline:
  - name: concatenate_recordings
  
  - name: set_montage
    montage: standard_1020
  
  - name: bandpass_filter
    l_freq: 0.1
    h_freq: 40.0
  
  - name: notch_filter
    freqs: [50.0, 100.0]
  
  - name: resample
    instance: raw
    sfreq: 250.0
    npad: auto
  
  - name: find_events
    get_events_from: annotations
    shortest_event: 1
    event_id:
      stim/12hz: 10001
      stim/15hz: 10002
  
  - name: epoch
    tmin: -0.2
    tmax: 1.2
    baseline: [null, 0.0]
    reject: null
  
  - name: reference
    instance: 'epochs'
    ref_channels: average
  
  - name: reference
    instance: 'raw'
    ref_channels: average
  
  - name: generate_html_report

Note: This config file also includes commented-out examples of bad channel detection steps (find_bads_channels_threshold, find_bads_channels_variance, find_bads_channels_high_frequency) that can be uncommented and customized as needed.

See configs/config_minimal.yaml for a comprehensive pipeline including strip_recording, copy_instance, and ICA:

pipeline:
  - name: strip_recording
    instance: all_raw
    get_events_from: annotations
    shortest_event: 5
    start_padding: 1
    end_padding: 1
  
  - name: concatenate_recordings
  
  - name: set_montage
    montage: GSN-HydroCel-256
  
  - name: copy_instance
    from_instance: raw
    to_instance: raw_before_cleaning
  
  - name: find_flat_channels
    threshold: 1.0e-12
  
  - name: bandpass_filter
    l_freq: 0.1
    h_freq: 40.0
  
  - name: chunk_in_epoch
    duration: 1
  
  - name: ica
    n_components: 20
    method: fastica
    find_eog: true
    apply: true
  
  - name: save_clean_instance
    instance: epochs
    overwrite: true
  
  - name: generate_html_report
    compare_instances:
      - title: 'Before vs After Cleaning'
        instance_a:
          name: 'raw'
          label: 'After Cleaning'
        instance_b:
          name: 'raw_before_cleaning'
          label: 'Before Cleaning'

Additional example configurations available in configs/:

config_with_drop_bad_channels.yaml - Example using drop_bad_channels instead of interpolation
config_with_excluded_channels.yaml - Example using excluded_channels parameter to preserve reference channels
config_with_custom_steps.yaml - Example showing how to integrate custom preprocessing steps

Command-Line Arguments

Required Arguments

--bids-root: Path to BIDS root directory (required by the default BIDS reader; the glob reader example above passes --data-root instead)

Optional Filter Arguments

These arguments use the same matching logic as mne-bids find_matching_paths. If not specified, all matching files will be processed.

--subjects: Subject ID(s) to process, space-separated (e.g., --subjects 01 02 03)
--sessions: Session ID(s) to process, space-separated
--tasks: Task name(s) to process, space-separated (e.g., --tasks rest task1)
--acquisitions: Acquisition parameter(s) to process
--runs: Run number(s) to process
--extension: File extension to process (default: .vhdr)

Other Arguments

--output-root: Custom output path (optional, defaults to bids-root/derivatives/meegflow)
--config: Path to YAML configuration file (optional)
--log-file: Path to log file (optional, defaults to console output)
--log-level: Logging level - DEBUG, INFO, WARNING, or ERROR (optional, default: INFO)
--reader: File reader to use, bids (default) or glob
--data-root: Root directory of the data when using the glob reader
--glob-pattern: Glob pattern with {variable} placeholders, used with --reader glob

Custom Preprocessing Steps

The pipeline supports custom preprocessing steps, allowing you to extend the pipeline with your own processing functions without modifying the core code.

Creating Custom Steps

Create a Python file with your custom step functions:

# my_custom_steps.py
def my_custom_filter(data, step_config):
    """Apply custom filtering to raw data."""
    if 'raw' not in data:
        raise ValueError("my_custom_filter requires 'raw' in data")
    
    # Get parameters from step_config
    cutoff_freq = step_config.get('cutoff_freq', 30.0)
    
    # Apply custom processing
    data['raw'].filter(h_freq=cutoff_freq, l_freq=None)
    
    # Record the step for reporting
    data['preprocessing_steps'].append({
        'step': 'my_custom_filter',
        'cutoff_freq': cutoff_freq
    })
    
    return data

Place the file in a dedicated folder, for example: /path/to/my_custom_steps/
Update your config file to specify the custom steps folder:

custom_steps_folder: /path/to/my_custom_steps

pipeline:
  - name: my_custom_filter
    cutoff_freq: 30.0
  - name: bandpass_filter  # Built-in steps still work
    l_freq: 0.5
    h_freq: 40.0

Run the pipeline as usual - custom steps are automatically loaded and available.

Custom Step Requirements

Custom step functions must follow these rules:

Signature: Accept exactly 2 parameters: data (Dict) and step_config (Dict)
Return: Return the updated data dictionary
Validation: Check that required data instances exist (e.g., 'raw', 'epochs')
Recording: Append a summary to data['preprocessing_steps'] for reporting
Naming: Function names become step names; avoid starting with underscore

See configs/example_custom_steps.py for complete examples.
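
Before dropping a new function into the custom steps folder, the rules above can be checked with a few lines of Python. The checker below is a hypothetical helper, not something the package provides. The static checks cover the naming and signature rules; the optional dynamic check needs a data dictionary that the step can actually run on.

```python
import inspect


def check_custom_step(func, data=None, step_config=None):
    """Return a list of rule violations (an empty list means none were found)."""
    problems = []
    if func.__name__.startswith("_"):
        problems.append("name starts with an underscore")
    n_params = len(inspect.signature(func).parameters)
    if n_params != 2:
        problems.append(f"takes {n_params} parameters instead of 2")
        return problems
    if data is not None:
        # Dynamic check: call the step once and inspect what comes back.
        before = len(data.get("preprocessing_steps", []))
        result = func(data, dict(step_config or {}))
        if not isinstance(result, dict):
            problems.append("does not return the data dictionary")
        elif len(result.get("preprocessing_steps", [])) <= before:
            problems.append("does not append to data['preprocessing_steps']")
    return problems


def demo_step(data, step_config):
    data["preprocessing_steps"].append({"step": "demo_step"})
    return data


def forgetful_step(data, step_config):
    return None  # forgets to return the data dictionary


print(check_custom_step(demo_step, data={"preprocessing_steps": []}))
print(check_custom_step(forgetful_step, data={"preprocessing_steps": []}))
```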

Using Custom Steps with Docker

Mount your custom steps folder when running the container:

docker run -v /host/bids:/data \
           -v /host/custom_steps:/custom_steps \
           -v /host/config:/config \
           meegflow \
           --bids-root /data \
           --subjects 01 02 \
           --tasks rest \
           --config /config/my_config.yaml

In your config file, use the container path:

custom_steps_folder: /custom_steps
pipeline:
  - name: my_custom_filter
    cutoff_freq: 30.0

Advanced Features

Override built-in steps: Custom steps with the same name as built-in steps will override them
Multiple files: Place multiple .py files in the custom steps folder - all will be loaded
Error handling: If a custom step file has errors, other files will still be loaded
Private functions: Functions starting with _ are ignored and not loaded as steps
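
The four behaviours above can be pictured with a small loader built on the standard library. This is a sketch of the idea under stated assumptions, not the package's loader: load_custom_steps is a hypothetical name, and skipping functions that a file merely imports from elsewhere is an extra choice made here.

```python
import importlib.util
import inspect
import tempfile
from pathlib import Path


def load_custom_steps(folder, builtin_steps=None):
    """Collect public functions from every .py file in folder.

    Returns (steps, errors): steps maps step names to functions, starting
    from the built-in steps so that custom ones override them; errors maps
    file names to the exception raised while importing them.
    """
    steps = dict(builtin_steps or {})
    errors = {}
    for path in sorted(Path(folder).glob("*.py")):
        spec = importlib.util.spec_from_file_location(path.stem, path)
        module = importlib.util.module_from_spec(spec)
        try:
            spec.loader.exec_module(module)
        except Exception as exc:  # a broken file must not block the others
            errors[path.name] = exc
            continue
        for name, func in inspect.getmembers(module, inspect.isfunction):
            if name.startswith("_") or func.__module__ != module.__name__:
                continue  # private helper, or imported from another module
            steps[name] = func
    return steps, errors


# Demonstration with one valid file and one file containing a syntax error.
with tempfile.TemporaryDirectory() as folder:
    Path(folder, "good_steps.py").write_text(
        "def bandpass_filter(data, step_config):\n"
        "    return data\n"
        "def _helper():\n"
        "    pass\n"
    )
    Path(folder, "broken_steps.py").write_text("def oops(:\n")
    steps, errors = load_custom_steps(
        folder, {"bandpass_filter": None, "resample": None}
    )

print(sorted(steps), list(errors))
```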

Preprocessing Steps Details

Each step can be customized through the configuration:

Excluding Channels from Analysis

Many preprocessing steps support an excluded_channels parameter that keeps specific channels (e.g., a reference channel such as 'Cz') out of that step's processing. This is useful when you want to preserve a reference channel or skip channels that should not be analyzed in certain steps.

Steps that support excluded_channels:

bandpass_filter - Exclude channels from filtering
notch_filter - Exclude channels from notch filtering
ica - Exclude channels from ICA decomposition
find_flat_channels - Exclude channels from flat channel detection
find_bads_channels_threshold - Exclude channels from bad channel detection
find_bads_channels_variance - Exclude channels from variance-based detection
find_bads_channels_high_frequency - Exclude channels from high-frequency analysis
find_bads_epochs_threshold - Exclude channels from epoch rejection criteria
interpolate_bad_channels - Exclude channels from interpolation even if marked as bad
drop_bad_channels - Exclude channels from dropping even if marked as bad

Steps where exclusion doesn't apply:

reference - Reference computation uses selected channels; use ref_channels parameter instead
resample - Resamples all data uniformly
set_montage - Sets electrode positions for all channels
drop_unused_channels - Use this for explicit channel removal

Example usage:

- name: bandpass_filter
  l_freq: 0.5
  h_freq: 45.0
  excluded_channels: ['Cz']  # Exclude Cz from filtering

- name: find_bads_channels_threshold
  reject:
    eeg: 1.0e-4
  excluded_channels: ['Cz', 'FCz']  # Don't mark these as bad

- name: drop_bad_channels
  instance: epochs
  excluded_channels: ['Cz']  # Don't drop Cz even if marked as bad

See configs/config_with_excluded_channels.yaml for a complete example.
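
Conceptually, excluded_channels narrows the set of channels a step works on. A minimal sketch of that selection, assuming plain channel-name lists (select_channels is a hypothetical helper, and raising on unknown names is a choice made here, not documented behaviour):

```python
def select_channels(ch_names, excluded_channels=None):
    """Return the channel names a step would work on, in their original order."""
    excluded = set(excluded_channels or [])
    unknown = excluded - set(ch_names)
    if unknown:
        # Surfacing typos early is an assumption, not documented behaviour.
        raise ValueError(f"Unknown channels in excluded_channels: {sorted(unknown)}")
    return [ch for ch in ch_names if ch not in excluded]


print(select_channels(["Fz", "Cz", "Pz", "FCz"], excluded_channels=["Cz", "FCz"]))
```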

Data Organization Steps

strip_recording

Crop recordings to remove data outside the first and last events. This is useful for removing stretches at the beginning and end of a recording that contain no task-relevant data.

instance: Which data instance to crop - 'all_raw' or 'raw' (default: 'raw')
get_events_from: How to extract events - 'stim' or 'annotations' (default: 'annotations')
shortest_event: Minimum number of samples for an event (default: 1)
event_id: Event IDs to use for finding start/end points. Can be a dict mapping event names to IDs or 'auto' (default: 'auto')
start_padding: Time in seconds to keep before the first event (default: 1)
end_padding: Time in seconds to keep after the last event (default: 1)

Example:

- name: strip_recording
  instance: all_raw
  get_events_from: annotations
  shortest_event: 1
  event_id:
    Stimulus/CatNewRepeated/CR: 91
    Stimulus/CatOld/Hit: 101
  start_padding: 1.0
  end_padding: 1.0
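
The arithmetic behind the padding parameters is simple. A sketch, assuming event onsets in seconds and clamping to the recording bounds (crop_window is a hypothetical helper; the step itself may handle edge cases differently):

```python
def crop_window(event_times, recording_duration, start_padding=1.0, end_padding=1.0):
    """Return (tmin, tmax) in seconds around the first and last events."""
    if not event_times:
        raise ValueError("No events found; cannot determine the crop window")
    tmin = max(0.0, min(event_times) - start_padding)
    tmax = min(recording_duration, max(event_times) + end_padding)
    return tmin, tmax


# Events at 12.5 s, 40 s and 310.25 s in a 600 s recording.
print(crop_window([12.5, 40.0, 310.25], recording_duration=600.0))
```

In MNE-Python, a window like this would typically be applied with raw.crop(tmin=tmin, tmax=tmax).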

concatenate_recordings

Concatenate multiple raw recordings into a single continuous recording. This is useful when data is split across multiple files but needs to be processed as a single session.

No parameters required
Requires 'all_raw' to be present in data
Creates a single 'raw' instance from all recordings in 'all_raw'

Example:

- name: concatenate_recordings

copy_instance

Create a copy of a data instance. This is useful for comparing data at different stages of preprocessing (e.g., before/after cleaning or ICA).

from_instance: Name of the instance to copy from (default: 'raw')
to_instance: Name of the new instance to create (default: 'raw_cleaned')

Example:

- name: copy_instance
  from_instance: raw
  to_instance: raw_before_ica

Preprocessing Steps

1. set_montage

Set channel montage for EEG data. Useful when data lacks electrode position information.

montage: Name of standard montage to use (default: 'standard_1020')
- Examples: 'standard_1020', 'standard_1005', 'biosemi64', etc.
- See MNE documentation for available montages

2. drop_unused_channels

Explicitly drop specified channels from the data by name. Unlike drop_bad_channels, this step drops channels regardless of whether they're marked as bad.

channels_to_drop: List of channel names to drop
instance: Which data instance to drop channels from - 'raw' or 'epochs' (default: 'raw')

3. bandpass_filter

Apply bandpass filtering.

l_freq: High-pass filter frequency (Hz)
h_freq: Low-pass filter frequency (Hz)
l_freq_order: Filter order for high-pass (default: 6)
h_freq_order: Filter order for low-pass (default: 8)
picks: Optional channel indices to filter
excluded_channels: List of channel names to exclude from filtering (optional)
n_jobs: Number of parallel jobs (default: 1)

4. notch_filter

Apply notch filtering to remove line noise.

freqs: Frequencies to notch filter (e.g., [50.0, 100.0])
notch_widths: Width of notch filters (optional)
method: Filtering method (default: 'fft')
picks: Optional channel indices to filter
excluded_channels: List of channel names to exclude from filtering (optional)
n_jobs: Number of parallel jobs (default: 1)

5. resample

Resample the data to a different sampling frequency.

instance: Which data instance to resample - 'raw' or 'epochs' (default: 'raw')
sfreq: Target sampling frequency in Hz (default: 250)
npad: Padding to use for resampling (default: 'auto')
resample_events: Whether to also resample events (default: false)
n_jobs: Number of parallel jobs (default: 1)

6. reference

Apply re-referencing.

ref_channels: Reference channels ('average' or channel names)
instance: Which data instance to reference - 'raw' or 'epochs' (default: 'epochs')

7. find_flat_channels

Find flat/disconnected channels based on variance threshold. Channels with variance below the threshold are marked as bad.

picks: Channel indices to check (optional, default: EEG channels)
excluded_channels: List of channel names to exclude from flat channel detection (optional)
threshold: Variance threshold below which channels are considered flat (default: 1e-12)
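
A variance test for flat channels can be sketched in a few lines. The example assumes signals in volts stored as plain lists, and uses the population variance with a strict comparison; which variance estimator and comparison the step uses is not stated here, and find_flat is a hypothetical name.

```python
from statistics import pvariance


def find_flat(channels, threshold=1e-12, excluded_channels=()):
    """Return names of channels whose variance falls below the threshold."""
    return [
        name
        for name, samples in channels.items()
        if name not in excluded_channels and pvariance(samples) < threshold
    ]


channels = {
    "Fz": [1e-5, -2e-5, 3e-5],   # ordinary EEG-scale fluctuations
    "Cz": [0.0, 0.0, 0.0],       # disconnected electrode: constant signal
}
print(find_flat(channels))
```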

8. interpolate_bad_channels

Interpolate bad channels using spherical spline interpolation.

instance: Which data instance to interpolate - 'raw' or 'epochs' (default: 'epochs')
excluded_channels: List of channel names to exclude from interpolation (optional)

9. drop_bad_channels

Drop bad channels without interpolation. This step removes channels marked as bad from the data instead of interpolating them.

instance: Which data instance to drop channels from - 'raw' or 'epochs' (default: 'epochs')
excluded_channels: List of channel names to exclude from dropping even if marked as bad (optional)

10. ica

ICA-based artifact removal.

n_components: Number of ICA components (default: 20)
method: ICA method ('fastica', 'infomax', 'picard', default: 'fastica')
random_state: Random state for reproducibility (default: 97)
picks: Channel types to include in ICA (optional, default: EEG channels)
excluded_channels: List of channel names to exclude from ICA decomposition (optional)
ica_fit_l_freq: High-pass frequency for filtering data before ICA fit (default: 1.0 Hz)
ica_fit_h_freq: Low-pass frequency for filtering data before ICA fit (optional, default: None)
find_eog: Automatically find EOG artifacts (true/false, default: false)
- eog_channels: List of channel names to use for EOG detection (optional, auto-detects if not provided)
- eog_threshold: Correlation threshold for EOG component detection (default: 'auto')
- eog_measure: Measure for EOG detection ('correlation' or 'ctps', default: 'correlation')
- eog_l_freq: High-pass frequency for EOG correlation (default: 1.0 Hz)
- eog_h_freq: Low-pass frequency for EOG correlation (default: 10.0 Hz)
find_ecg: Automatically find ECG artifacts (true/false, default: false)
- ecg_channels: List of channel names to use for ECG detection (optional)
- ecg_threshold: Correlation threshold for ECG component detection (default: 'auto')
- ecg_measure: Measure for ECG detection ('correlation' or 'ctps', default: 'correlation')
- ecg_l_freq: High-pass frequency for ECG correlation (default: 1.0 Hz)
- ecg_h_freq: Low-pass frequency for ECG correlation (default: 10.0 Hz)
selected_indices: Manually specify component indices to exclude (optional, list of integers)
apply: Apply ICA to remove artifacts (true/false, default: true)

11. find_events

Find events in the data.

get_events_from: How to extract events - 'stim' or 'annotations' (default: 'annotations')
shortest_event: Minimum event duration in samples (default: 1)
event_id: Event IDs to extract. Can be 'auto' for all events or a dict mapping event names to IDs (default: 'auto')

12. epoch

Create epochs around events.

tmin: Start time before event (seconds, default: -0.2)
tmax: End time after event (seconds, default: 0.5)
baseline: Baseline correction window (tuple or null, default: (null, 0.0))
event_id: Event IDs to include (dict or null for all)
reject: Rejection criteria (dict with channel type keys, optional)

13. chunk_in_epoch

Create fixed-length epochs from continuous raw data. This is an alternative to event-based epoching that splits the data into equal-duration segments.

duration: Duration of each epoch in seconds (default: 1.0)

Example:

- name: chunk_in_epoch
  duration: 1.0  # Create 1-second epochs
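
To see how many epochs a given duration yields, the sample boundaries can be computed directly. This is an illustrative sketch with a hypothetical helper name. It assumes non-overlapping chunks and that a trailing partial chunk is discarded. MNE-Python offers mne.make_fixed_length_epochs for this task, though whether this step relies on it is not stated here.

```python
def chunk_bounds(n_samples, sfreq, duration=1.0):
    """Return (start, stop) sample indices of consecutive fixed-length chunks."""
    step = int(round(duration * sfreq))
    if step <= 0:
        raise ValueError("duration * sfreq must be at least one sample")
    # Stop early enough that every chunk is complete.
    return [(start, start + step) for start in range(0, n_samples - step + 1, step)]


# 4.4 s of data at 250 Hz gives four complete 1-second chunks.
print(chunk_bounds(n_samples=1100, sfreq=250.0, duration=1.0))
```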

14. find_bads_channels_threshold

Find bad channels using threshold-based rejection. Marks channels as bad if they exceed rejection thresholds in too many epochs.

picks: Channel indices to check (optional, default: EEG channels)
excluded_channels: List of channel names to exclude from bad channel detection (optional)
reject: Rejection thresholds by channel type (e.g., {"eeg": 150e-6})
n_epochs_bad_ch: Fraction or number of epochs a channel must be bad in to be marked as bad (default: 0.5)
apply_on: List of instances to mark bad channels on (default: ['epochs'])
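
The "fraction or number" wording of n_epochs_bad_ch admits more than one reading. The sketch below assumes that values strictly between 0 and 1 are fractions of the total epoch count and that anything else is an absolute count. Both helpers are hypothetical, and the input is a plain mapping from channel names to per-epoch peak-to-peak amplitudes.

```python
import math


def resolve_count(value, total):
    """Turn a fraction (0 < value < 1) or an absolute number into a count."""
    if 0 < value < 1:
        return math.ceil(value * total)
    return int(value)


def find_bad_channels(ptp_by_channel, reject, n_epochs_bad_ch=0.5):
    """Flag channels whose peak-to-peak amplitude exceeds reject in enough epochs."""
    bad = []
    for name, ptps in ptp_by_channel.items():
        n_over = sum(p > reject for p in ptps)
        if n_over >= resolve_count(n_epochs_bad_ch, len(ptps)):
            bad.append(name)
    return bad


ptp = {
    "Fz": [50e-6, 60e-6, 200e-6, 40e-6],     # over threshold in 1 of 4 epochs
    "Cz": [200e-6, 180e-6, 90e-6, 300e-6],   # over threshold in 3 of 4 epochs
}
print(find_bad_channels(ptp, reject=150e-6, n_epochs_bad_ch=0.5))
```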

15. find_bads_channels_variance

Find bad channels using variance-based detection. Identifies channels with abnormally high or low variance.

instance: Which data instance to use - 'raw' or 'epochs' (default: 'epochs')
picks: Channel indices to check (optional, default: EEG channels)
excluded_channels: List of channel names to exclude from variance analysis (optional)
zscore_thresh: Z-score threshold for outlier detection (default: 4)
max_iter: Maximum iterations for iterative outlier removal (default: 2)
apply_on: List of instances to mark bad channels on (default: [instance])
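
The interplay of zscore_thresh and max_iter is easiest to see on a toy example. The function below is a generic iterative z-score outlier search written for illustration. It is not taken from the package, its name is hypothetical, and the real step may compute its statistics differently. The demonstration uses a threshold of 2 because twelve values cannot produce a z-score above 4.

```python
from statistics import mean, pstdev


def iterative_zscore_outliers(values, zscore_thresh=4.0, max_iter=2):
    """Flag outliers, recomputing mean and spread after each removal pass."""
    bad = set()
    for _ in range(max_iter):
        kept = {name: v for name, v in values.items() if name not in bad}
        if len(kept) < 2:
            break
        vals = list(kept.values())
        center, spread = mean(vals), pstdev(vals)
        if spread == 0:
            break
        flagged = {name for name, v in kept.items()
                   if abs(v - center) / spread > zscore_thresh}
        if not flagged:
            break
        bad |= flagged
    return sorted(bad)


# Ten ordinary channels, one mildly noisy channel, one extremely noisy channel.
variances = {f"ch{i:02d}": 1.0 for i in range(10)}
variances["mild"] = 3.0
variances["loud"] = 100.0

one_pass = iterative_zscore_outliers(variances, zscore_thresh=2.0, max_iter=1)
two_pass = iterative_zscore_outliers(variances, zscore_thresh=2.0, max_iter=2)
print(one_pass, two_pass)
```

In the first pass the extreme channel inflates the standard deviation and hides the mildly noisy one; the second pass, run without the extreme channel, catches it.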

16. find_bads_channels_high_frequency

Find bad channels using high-frequency variance. Detects channels with excessive high-frequency noise.

instance: Which data instance to use - 'raw' or 'epochs' (default: 'epochs')
picks: Channel indices to check (optional, default: EEG channels)
excluded_channels: List of channel names to exclude from high-frequency analysis (optional)
zscore_thresh: Z-score threshold for outlier detection (default: 4)
max_iter: Maximum iterations for iterative outlier removal (default: 2)
apply_on: List of instances to mark bad channels on (default: [instance])

17. find_bads_epochs_threshold

Find and remove bad epochs using threshold-based rejection. Drops epochs that have too many bad channels.

picks: Channel indices to check (optional, default: EEG channels)
excluded_channels: List of channel names to exclude from epoch rejection criteria (optional)
reject: Rejection thresholds by channel type (e.g., {"eeg": 150e-6})
n_channels_bad_epoch: Fraction or number of channels that must be bad for an epoch to be rejected (default: 0.1)

18. save_clean_instance

Save clean raw or epochs data to .fif file in BIDS-derivatives format.

instance: Which data instance to save - 'raw' or 'epochs' (default: 'epochs')
overwrite: Whether to overwrite existing files (default: true)

19. generate_json_report

Generate JSON report with preprocessing information. No parameters needed.

20. generate_html_report

Generate HTML report with interactive visualizations.

picks: Channel types to include in plots (optional, default: EEG channels)
excluded_channels: List of channel names to exclude from plots (optional)
compare_instances: List of instance comparisons to plot (optional, see config_minimal.yaml for example)
plot_raw_kwargs: Additional keyword arguments for raw data plots (optional, dict)
plot_ica_kwargs: Additional keyword arguments for ICA plots (optional, dict)
plot_events_kwargs: Additional keyword arguments for event plots (optional, dict)
plot_epochs_kwargs: Additional keyword arguments for epoch plots (optional, dict)
plot_evokeds_kwargs: Additional keyword arguments for evoked response plots (optional, dict)

Batch Processing

The pipeline processes multiple subjects and files sequentially. Typical invocations:

# Process specific subjects with a specific task
python src/cli.py \
    --bids-root /path/to/bids/dataset \
    --subjects 01 02 03 04 05 \
    --tasks rest \
    --config configs/config_example.yaml

# Process all subjects in the dataset
python src/cli.py \
    --bids-root /path/to/bids/dataset \
    --config configs/config_example.yaml

# Process specific sessions for specific subjects
python src/cli.py \
    --bids-root /path/to/bids/dataset \
    --subjects 01 02 \
    --sessions 01 02 \
    --tasks rest

For HPC/cluster environments, you can create your own SLURM or other batch submission scripts that call the pipeline with subject lists.
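
One way to prepare such a submission is to generate one command line per group of subjects and hand each to the scheduler. The sketch below only builds the command strings from flags documented above. The helper name and the per_job grouping are hypothetical, and the scheduler-specific wrapping (for example an sbatch script) is left out.

```python
import shlex


def build_commands(bids_root, subjects, tasks, config=None, per_job=1):
    """Return one meegflow command line per group of per_job subjects."""
    commands = []
    for i in range(0, len(subjects), per_job):
        args = ["meegflow", "--bids-root", bids_root,
                "--subjects", *subjects[i:i + per_job],
                "--tasks", *tasks]
        if config:
            args += ["--config", config]
        # shlex.join quotes arguments that contain spaces or shell metacharacters.
        commands.append(shlex.join(args))
    return commands


for command in build_commands("/path/to/bids/dataset", ["01", "02", "03"],
                              ["rest"], "configs/config_example.yaml", per_job=2):
    print(command)
```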

Progress Tracking and Logging

The pipeline includes comprehensive progress tracking and logging features:

Progress Bars

When running the pipeline, you'll see two levels of progress bars:

Overall progress: Shows progress across all recordings being processed
Step progress: Shows progress through preprocessing steps for each recording

The progress bars use the rich library and display:

Spinner animation
Progress bar with percentage
Time remaining estimate
Current step being executed

Logging

The pipeline uses MNE's logger for all output messages. Output can go to the console or to a file, and the verbosity is adjustable.

Console Output (default):

python src/cli.py \
    --bids-root /path/to/bids/dataset \
    --subjects 01 02

Log to File:

python src/cli.py \
    --bids-root /path/to/bids/dataset \
    --subjects 01 02 \
    --log-file /path/to/logs/pipeline.log

Adjust Logging Level:

python src/cli.py \
    --bids-root /path/to/bids/dataset \
    --subjects 01 02 \
    --log-level DEBUG

Available log levels: DEBUG, INFO (default), WARNING, ERROR

The pipeline also saves a summary of results to derivatives/meegflow/pipeline_results.json for easy programmatic access.
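
Reading that summary back takes only the standard library. The structure of the JSON is not documented in this README, so the sketch below just locates and parses the file; both function names are hypothetical.

```python
import json
from pathlib import Path


def results_path(bids_root, output_root=None):
    """Locate pipeline_results.json under the default or a custom output root."""
    root = Path(output_root) if output_root else Path(bids_root) / "derivatives" / "meegflow"
    return root / "pipeline_results.json"


def load_results(bids_root, output_root=None):
    """Return the parsed summary, or None if the pipeline has not written it yet."""
    path = results_path(bids_root, output_root)
    if not path.exists():
        return None
    with path.open() as f:
        return json.load(f)


summary = load_results("/path/to/bids/dataset")
print("no summary found" if summary is None else type(summary).__name__)
```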

Docker Notes

Volume Mounting

When using Docker, you need to mount your local directories to paths inside the container using the -v flag:

BIDS dataset: Mount your BIDS root directory to /data or any path you specify with --bids-root
Configuration files: Mount custom config files if not using the built-in configs in /app/configs/
Output directory: The pipeline writes outputs to <bids-root>/derivatives/meegflow/ by default
Log files: If using --log-file, mount a directory for log output

File Permissions

The Docker container runs as root by default. Files created by the container will be owned by root. To avoid permission issues:

Run with your user ID:

docker run --rm --user $(id -u):$(id -g) \
    -v /path/to/bids:/data \
    meegflow \
    --bids-root /data \
    --tasks rest

Or fix permissions after processing:

sudo chown -R $USER:$USER /path/to/bids/derivatives

Using Built-in Configurations

The Docker image includes several pre-configured pipeline examples in /app/configs/:

/app/configs/config_example.yaml - Standard pipeline with epochs
/app/configs/config_raw_only.yaml - Raw data processing without epoching
/app/configs/config_with_adaptive_reject.yaml - Advanced pipeline with concatenation and event-based epochs
/app/configs/config_minimal.yaml - Comprehensive pipeline with strip_recording, ICA, and instance comparison
/app/configs/config_with_drop_bad_channels.yaml - Pipeline using drop_bad_channels instead of interpolation
/app/configs/config_with_excluded_channels.yaml - Pipeline demonstrating excluded_channels parameter
/app/configs/config_with_custom_steps.yaml - Example template for using custom preprocessing steps

Example using a built-in config:

docker run --rm \
    -v /path/to/bids:/data \
    meegflow \
    --bids-root /data \
    --tasks rest \
    --config /app/configs/config_with_adaptive_reject.yaml

Building from Source

If you want to customize the Docker image or use a development version:

git clone https://github.com/Laouen/meegflow.git
cd meegflow
docker build -t meegflow:custom .

Building in CI/CD environments with self-signed certificates:

If you're building in a CI/CD environment with self-signed SSL certificates, use the PIP_TRUSTED_HOST build argument:

docker build --build-arg PIP_TRUSTED_HOST=1 -t meegflow:custom .

Note: This disables SSL verification for PyPI and should only be used in trusted CI/CD environments, not for production builds.

Requirements

Python >= 3.8
mne >= 1.5.0
mne-bids >= 0.14
numpy >= 1.24.0
scipy >= 1.11.0
rich >= 13.0.0
matplotlib >= 3.7.0 (recommended)
pandas >= 2.0.0 (recommended)
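
For a local installation, the minimum versions above can be checked against the current environment. This is a rough sketch using only the standard library: the helper names are hypothetical, and the version comparison is deliberately simplistic (it reads only the leading numeric parts, so a pre-release such as `1.5.0rc1` is treated as `1.5.0`).

```python
from importlib import metadata

# Minimum versions copied from the list above (required packages only).
MINIMUM_VERSIONS = {
    "mne": "1.5.0",
    "mne-bids": "0.14",
    "numpy": "1.24.0",
    "scipy": "1.11.0",
    "rich": "13.0.0",
}


def version_tuple(version):
    """Convert '1.24.0' to (1, 24, 0), stopping at the first non-numeric character."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Compare two version strings after padding them to the same length with zeros."""
    a, b = version_tuple(installed), version_tuple(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_requirements(minimums=MINIMUM_VERSIONS):
    """Return {package: (installed_version_or_None, meets_minimum)}."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = (None, False)
        else:
            report[name] = (installed, meets_minimum(installed, minimum))
    return report


for package, (installed, ok) in check_requirements().items():
    status = "ok" if ok else "missing or too old"
    print(f"{package}: {installed} ({status})")
```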

License

The pipeline is ready to use across multiple projects and includes scripts for SLURM execution.

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

Support

For issues or questions, please open an issue on the GitHub repository.

This version: 0.1.1 (May 5, 2026)

