A SLURM-friendly M/EEG derivative extraction package leveraging BIDS-like data organization and DAG processing.

Maintainers

yjmantilla

NeuroDAGs

An Extensible and Declarative DAG Framework for Reproducible Neuroscience Workflows

M/EEG studies generate many interdependent intermediate derivatives. Recomputing full pipelines is wasteful; reusing valid intermediates is non-trivial. Large-scale studies require reproducible, extensible, and efficient workflows. NeuroDAGs addresses this with a declarative, graph-based framework for scalable and reusable derivative computation.

Docs | Comparison with Snakemake/Pydra | Poster BRaIN Symposium 2026 Montreal

Core Idea

Pipelines are defined as a directed acyclic graph (DAG) of computation nodes that output reusable derivatives, executed for each input file.
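A minimal sketch of the ordering idea, not the NeuroDAGs internals: it assumes each derivative simply lists the derivatives it needs, and uses the standard-library graphlib to produce a valid execution order. The DEPENDENCIES mapping and the execution_order helper are hypothetical names introduced for illustration.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map: derivative -> derivatives it needs first.
DEPENDENCIES = {
    "PowerSpectrum": {"CleanedEEG"},
    "CleanedEEG": {"SourceFile"},
    "SourceFile": set(),
}


def execution_order(dependencies):
    """Return the derivatives in an order that respects every dependency."""
    return list(TopologicalSorter(dependencies).static_order())


order = execution_order(DEPENDENCIES)
print(order)
```

A cycle in the mapping would raise graphlib.CycleError, which is one reason the graph has to be acyclic.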

Design Principles

Reproducible, transparent workflows defined declaratively in YAML — version-controllable and LLM-friendly.
Uniform node abstraction — preprocessing, features, and any custom nodes are treated identically.
Directory-agnostic — outputs mirror inputs' organization. Derivatives are labeled with a @DerivativeName suffix.
xarray-centered outputs: derivatives are stored as xarray objects written to NetCDF, which keeps them language-agnostic, metadata-rich, and dimension-aware.
Graph-based reuse — if a derivative is already computed and overwrite=False, it is skipped automatically.
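The reuse rule in the last principle can be sketched as a simple existence check. This is an illustration under assumptions, not the package's code: the exact output file naming (source stem, then @DerivativeName, then the extension, placed next to the source) and both helper functions are hypothetical.

```python
from pathlib import Path
import tempfile


def derivative_path(source: Path, derivative: str, ext: str) -> Path:
    """Hypothetical naming: <source stem>@<DerivativeName><ext> beside the source."""
    return source.with_name(f"{source.stem}@{derivative}{ext}")


def should_compute(output: Path, overwrite: bool = False) -> bool:
    """Compute when forced, or when no output exists yet."""
    return overwrite or not output.exists()


with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "sub-01_eeg.vhdr"
    out = derivative_path(source, "CleanedEEG", ".fif")
    first = should_compute(out)                    # nothing on disk yet
    out.touch()                                    # pretend the derivative was written
    second = should_compute(out)                   # already there, so it is skipped
    forced = should_compute(out, overwrite=True)   # overwrite forces a recompute
```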

Features

Agnostic to data organization / directory hierarchy
SLURM / HPC friendly with file-level parallelism via joblib
Graph-based caching: skip already-computed derivatives
Extensible node system — add nodes without forking the package
YAML-based declarative configuration
Unified CLI: neurodags run, dry-run, dataframe, dag, view, validate, tui
Built-in Terminal User Interface (TUI) for pipeline management and execution
Built-in nodes for preprocessing, spectral analysis, entropy, complexity, and data transformations
Dataframe assembly (wide or long format) from derivative artifacts
Dry-run mode — inspect planned computations without executing
Built-in Dash-Plotly explorer for .fif and .nc files

Installation

pip install neurodags
# Or with TUI support
pip install neurodags[tui]

With uv (recommended):

uv add neurodags
# Or with TUI support
uv add neurodags[tui]

Quickstart

See the quickstart example — full synthetic pipeline, no real data required.

CLI

NeuroDAGs installs a unified neurodags command:

neurodags validate pipeline.yml
neurodags run pipeline.yml                          # all derivatives in DerivativeList
neurodags run pipeline.yml --derivative CleanedEEG  # or a specific one
neurodags dry-run pipeline.yml --output dry_run.csv
neurodags dataframe pipeline.yml --format wide --output features.csv
neurodags dag pipeline.yml --html pipeline_dag.html
neurodags view path/to/file.nc

If you install the optional TUI extra, you also get:

neurodags tui pipeline.yml --datasets datasets.yml

Development

git clone https://github.com/yjmantilla/neurodags
cd neurodags
uv sync --all-extras --all-groups # creates .venv and installs all deps incl. dev/test/docs
uv run pre-commit install

Key commands (all via uv run):

uv run ruff check src/              # lint  (fix: uv run ruff check src/ --fix)
uv run black --check .              # format check  (fix: uv run black .)
uv run pytest -q                    # run tests
uv run pytest -s -q --no-cov --pdb  # debug a failing test

uv run sphinx-build -b html docs docs/_build/html -W --keep-going  # build docs
rm -rf docs/_build                                                   # clean docs

No uv? Install it with pip install uv or curl -Ls https://astral.sh/uv/install.sh | sh. All commands above also work with plain python/pip: activate .venv instead of prefixing commands with uv run, and use pip install -e .[dev,test,docs] instead of uv sync.

Project Structure

my_project/
├── datasets.yml      # Dataset sources and paths
├── pipeline.yml      # Derivative definitions and execution list
└── custom_nodes.py   # Optional custom node definitions

Quick Example

datasets.yml

my_dataset:
  name: MyDataset
  file_pattern:
    local: data/**/*.vhdr
    hpc: /cluster/BIDS/**/*.vhdr
  derivatives_path:
    local: outputs/
    hpc: /cluster/scratch/out
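The per-mount-point entries above pair with the mount_point key set in pipeline.yml. A rough sketch of how one mount point could be selected, using a plain dict in place of the parsed YAML; resolve_for_mount is a hypothetical helper, not a NeuroDAGs function.

```python
from glob import glob

# Plain-dict mirror of the datasets.yml entry above.
DATASET = {
    "name": "MyDataset",
    "file_pattern": {"local": "data/**/*.vhdr", "hpc": "/cluster/BIDS/**/*.vhdr"},
    "derivatives_path": {"local": "outputs/", "hpc": "/cluster/scratch/out"},
}


def resolve_for_mount(dataset: dict, mount_point: str) -> dict:
    """Pick the input pattern and output root that belong to one mount point."""
    try:
        pattern = dataset["file_pattern"][mount_point]
        out_root = dataset["derivatives_path"][mount_point]
    except KeyError as exc:
        raise KeyError(f"mount point {mount_point!r} is not defined") from exc
    return {"file_pattern": pattern, "derivatives_path": out_root}


local = resolve_for_mount(DATASET, "local")
# recursive=True lets ** match nested directories.
input_files = sorted(glob(local["file_pattern"], recursive=True))
```

Keeping both locations in one file means the same pipeline can move between a workstation and a cluster by changing a single key.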

pipeline.yml

datasets: datasets.yml
mount_point: local
new_definitions: custom_nodes.py  # optional

DerivativeDefinitions:
  CleanedEEG:
    nodes:
      - id: 0
        derivative: SourceFile
      - id: 1
        node: basic_preprocessing
        args:
          mne_object: id.0
          resample: 256
          filter_args:
            l_freq: 0.5
            h_freq: 110

  PowerSpectrum:
    for_dataframe: True
    nodes:
      - id: 0
        derivative: CleanedEEG.fif
      - id: 1
        node: mne_spectrum_array
        args:
          meeg: id.0
          method: multitaper

DerivativeList:
  - CleanedEEG
  - PowerSpectrum
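In the definitions above, an argument such as id.0 refers to the output of an earlier node in the same derivative. A toy sketch of that resolution step, assuming nodes run in listed order; run_nodes and toy_resample are hypothetical stand-ins, and a real run would load the named derivative from disk rather than pass a string through.

```python
def run_nodes(nodes, registry, source):
    """Run nodes in listed order, replacing 'id.N' args with earlier outputs."""
    outputs = {}
    for spec in nodes:
        if "derivative" in spec:
            # Stand-in: a real run would load the named derivative from disk.
            outputs[spec["id"]] = source
            continue
        kwargs = {}
        for name, value in spec.get("args", {}).items():
            if isinstance(value, str) and value.startswith("id."):
                value = outputs[int(value[3:])]
            kwargs[name] = value
        outputs[spec["id"]] = registry[spec["node"]](**kwargs)
    return outputs[nodes[-1]["id"]]


# Hypothetical node: a toy stand-in, not the real basic_preprocessing.
def toy_resample(mne_object, resample):
    return f"{mne_object} resampled to {resample} Hz"


nodes = [
    {"id": 0, "derivative": "SourceFile"},
    {"id": 1, "node": "toy_resample", "args": {"mne_object": "id.0", "resample": 256}},
]
result = run_nodes(nodes, {"toy_resample": toy_resample}, source="raw.vhdr")
```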

Python

from neurodags.loaders import load_configuration
from neurodags.orchestrators import run_pipeline

config = load_configuration("pipeline.yml")

# Run all derivatives in "DerivativeList", auto-sorted by dependency order
run_pipeline(config)

# Or run specific ones (also sorted by dependency order)
run_pipeline(config, derivatives=["CleanedEEG"])

CLI

neurodags validate pipeline.yml

# Run all derivatives in DerivativeList (dependency-sorted)
neurodags run pipeline.yml

# Or run specific ones
neurodags run pipeline.yml --derivative CleanedEEG

Custom Nodes

Add nodes without modifying or forking the package:

# custom_nodes.py
from neurodags.nodes import register_node
from neurodags.definitions import Artifact, NodeResult

@register_node
def my_node(data) -> NodeResult:
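    # compute() is a placeholder for your own computation; it must return
    # an object with a to_netcdf method (for example an xarray object).
    result = compute(data)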
    return NodeResult(
        artifacts={
            ".nc": Artifact(
                item=result,
                writer=lambda path: result.to_netcdf(path),
            ),
        },
    )

Key rules:

A node is a function decorated with @register_node.
It returns a NodeResult.
A NodeResult contains artifacts — a dict mapping file extension to Artifact(item, writer).
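To make the three rules concrete, here is a self-contained sketch of the decorator-registry pattern. It reuses the names register_node, Artifact, and NodeResult only for readability: these are simplified stand-ins written for this example, and the package's real classes may carry more fields or behave differently. NODE_REGISTRY and word_count are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, Callable

NODE_REGISTRY: dict[str, Callable] = {}  # hypothetical global registry


def register_node(func: Callable) -> Callable:
    """Store the function under its own name and return it unchanged."""
    NODE_REGISTRY[func.__name__] = func
    return func


@dataclass
class Artifact:
    item: Any
    writer: Callable[[str], None]


@dataclass
class NodeResult:
    artifacts: dict[str, Artifact] = field(default_factory=dict)


@register_node
def word_count(text: str) -> NodeResult:
    count = len(text.split())

    def write(path: str) -> None:
        with open(path, "w", encoding="utf-8") as handle:
            handle.write(str(count))

    return NodeResult(artifacts={".txt": Artifact(item=count, writer=write)})


result = NODE_REGISTRY["word_count"]("alpha beta gamma")
```

Because the writer is a callable that takes a path, the caller decides where and whether the artifact is written; the node itself never touches the filesystem.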

Dataframe Assembly

from neurodags.orchestrators import build_derivative_dataframe

df = build_derivative_dataframe("pipeline.yml", output_format="wide")

Derivatives marked for_dataframe: True are collected automatically. Supports "wide" (one row per file) and "long" (one row per value) formats.

CLI equivalent:

neurodags dataframe pipeline.yml --format wide --output derivative_dataframe.csv
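The difference between the two formats can be shown without any dependencies. The sketch below uses made-up feature values and plain lists of dicts instead of a real dataframe; to_wide and to_long are hypothetical helpers, not part of NeuroDAGs.

```python
# Hypothetical per-file feature values; real values come from derivative artifacts.
features = {
    "sub-01.vhdr": {"alpha_power": 1.5, "beta_power": 0.7},
    "sub-02.vhdr": {"alpha_power": 1.1, "beta_power": 0.9},
}


def to_wide(data):
    """One row per file, one column per feature."""
    return [{"file": name, **values} for name, values in data.items()]


def to_long(data):
    """One row per (file, feature) value."""
    return [
        {"file": name, "feature": key, "value": value}
        for name, values in data.items()
        for key, value in values.items()
    ]


wide_rows = to_wide(features)
long_rows = to_long(features)
```

Wide rows suit per-subject statistics or machine learning inputs, while long rows are easier to group and plot by feature.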

Parallel Execution

# pipeline.yml
n_jobs: 4           # -1 = all cores, 1 or null = serial
joblib_backend: loky
joblib_prefer: processes

Or via Python:

run_pipeline(config, derivatives=["MyDerivative"], n_jobs=4)

Or via CLI:

neurodags run pipeline.yml --derivative MyDerivative --n-jobs 4
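A sketch of file-level parallelism with the n_jobs values documented above (-1 for all cores, 1 or null for serial). NeuroDAGs uses joblib; to stay dependency-free this example swaps in the standard-library ThreadPoolExecutor as a stand-in, so it illustrates the semantics rather than the actual backend. Both helpers are hypothetical, and rejecting other negative values is a simplification.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def resolve_workers(n_jobs):
    """Map the documented n_jobs values to a worker count."""
    if n_jobs is None or n_jobs == 1:
        return 1                      # serial
    if n_jobs == -1:
        return os.cpu_count() or 1    # all cores
    if n_jobs < 1:
        raise ValueError(f"unsupported n_jobs value: {n_jobs}")
    return n_jobs


def process_files(files, func, n_jobs=None):
    """Apply func to each file independently, preserving input order."""
    workers = resolve_workers(n_jobs)
    if workers == 1:
        return [func(f) for f in files]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(func, files))


results = process_files(["a.vhdr", "b.vhdr", "c.vhdr"], str.upper, n_jobs=4)
```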

Visualization

neurodags view path/to/file.fif
neurodags view path/to/file.nc

# Alternative module entry point
python -m neurodags.visualization path/to/file.fif
python -m neurodags.visualization path/to/file.nc

Built-in Dash-Plotly explorer with dimension-aware UI — dropdown per axis, plot types: Line, Scatter, Bar, Heatmap.

Inspection (Dry Run)

# All derivatives in DerivativeList
run_pipeline(config, dry_run=True)

# Or a specific one
run_pipeline(config, derivatives=["MyDerivative"], dry_run=True)

With dry_run=True, run_pipeline returns a dataframe describing the execution plan without running any nodes.

In a regular run, when a node fails, a .error marker file containing the error message is written. Failed files are retried on the next run, and the .error marker is removed automatically if the retry succeeds.

CLI equivalent:

# All derivatives in DerivativeList
neurodags dry-run pipeline.yml --output dry_run_results.csv

# Or a specific one
neurodags dry-run pipeline.yml --derivative MyDerivative --output dry_run_results.csv
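The failure-marker behavior described in this section can be sketched as follows. This is an assumption-laden illustration rather than the package's implementation: the marker file name (output name plus .error) and the run_with_marker helper are hypothetical.

```python
from pathlib import Path
import tempfile


def run_with_marker(output: Path, compute) -> bool:
    """Run compute(); record failures in a sibling .error file, clear it on success."""
    marker = output.with_name(output.name + ".error")  # hypothetical marker name
    try:
        output.write_text(compute())
    except Exception as exc:
        marker.write_text(f"{type(exc).__name__}: {exc}")
        return False
    marker.unlink(missing_ok=True)  # a successful retry removes the stale marker
    return True


def failing():
    raise ValueError("bad channel layout")


with tempfile.TemporaryDirectory() as tmp:
    out = Path(tmp) / "sub-01@CleanedEEG.txt"
    marker = out.with_name(out.name + ".error")
    first_ok = run_with_marker(out, failing)            # fails, marker is written
    marker_after_failure = marker.exists()
    retry_ok = run_with_marker(out, lambda: "cleaned")  # retry succeeds
    marker_after_retry = marker.exists()
```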

Derivative Flags

Flag	Default	Description
`save`	`True`	Persist artifacts to disk. `False` = compute but don't write.
`overwrite`	`False`	Force recompute even if output exists.
`for_dataframe`	`False`	Include this derivative in `build_derivative_dataframe`.

Custom Node Definitions

Point new_definitions to one or more Python files:

new_definitions:
  - custom_nodes/my_nodes.py
  - /abs/path/to/other_nodes.py

Relative paths are resolved from the pipeline YAML location.
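A minimal sketch of that resolution rule, plus one way a Python file can be imported by path so that its decorators run. Both helpers are hypothetical and only use the standard library; how NeuroDAGs actually loads definition files is not shown in this README.

```python
import importlib.util
import tempfile
from pathlib import Path


def resolve_definition_paths(pipeline_file, entries):
    """Resolve relative entries against the pipeline file's directory."""
    base = Path(pipeline_file).resolve().parent
    if isinstance(entries, str):  # a single file is also accepted
        entries = [entries]
    resolved = []
    for entry in entries:
        path = Path(entry)
        resolved.append(path if path.is_absolute() else base / path)
    return resolved


def load_module(path):
    """Import a Python file by path so that its top-level code runs."""
    spec = importlib.util.spec_from_file_location(Path(path).stem, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


paths = resolve_definition_paths(
    "/projects/study/pipeline.yml",
    ["custom_nodes/my_nodes.py", "/abs/path/to/other_nodes.py"],
)

with tempfile.TemporaryDirectory() as tmp:
    node_file = Path(tmp) / "my_nodes.py"
    node_file.write_text("ANSWER = 42\n")
    module = load_module(node_file)
```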

Documentation

https://yjmantilla.github.io/neurodags/

HDF5 / NetCDF Note

If you encounter RuntimeError: NetCDF: HDF error, reinstall h5py so that it is built from source instead of using a prebuilt wheel:

uv run pip install --no-binary=h5py h5py
# or without uv:
pip install --no-binary=h5py h5py

Contributing

See CONTRIBUTING.md.

License

MIT. See LICENSE.

Release history

0.2.1: May 14, 2026
0.2.0: May 14, 2026 (this version)
0.1.1: May 12, 2026

Download files

Source Distribution

neurodags-0.2.0.tar.gz (83.0 kB)

Uploaded May 14, 2026 Source

Built Distribution

neurodags-0.2.0-py3-none-any.whl (96.4 kB)

Uploaded May 14, 2026 Python 3

File details

Details for the file neurodags-0.2.0.tar.gz.

File metadata

Download URL: neurodags-0.2.0.tar.gz
Upload date: May 14, 2026
Size: 83.0 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for neurodags-0.2.0.tar.gz
Algorithm	Hash digest
SHA256	`dd126e776195594860ace431fb0613d409c6dfa78343386a71d777755bb8ec0b`
MD5	`68497081f491384381ecff652a0d6f6d`
BLAKE2b-256	`9fb4853a594da39f7146634222392e6673741ed2e5c59aca24c89ac9721ddb49`

Provenance

The following attestation bundles were made for neurodags-0.2.0.tar.gz:

Publisher: publish.yml on yjmantilla/neurodags

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: neurodags-0.2.0.tar.gz
- Subject digest: dd126e776195594860ace431fb0613d409c6dfa78343386a71d777755bb8ec0b
- Sigstore transparency entry: 1535760951
- Sigstore integration time: May 14, 2026
Source repository:
- Permalink: yjmantilla/neurodags@4d24d873419520a255021d464a04c5ab7a1d6a54
- Branch / Tag: refs/tags/v0.2.0
- Owner: https://github.com/yjmantilla
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@4d24d873419520a255021d464a04c5ab7a1d6a54
- Trigger Event: push

File details

Details for the file neurodags-0.2.0-py3-none-any.whl.

File metadata

Download URL: neurodags-0.2.0-py3-none-any.whl
Upload date: May 14, 2026
Size: 96.4 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for neurodags-0.2.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`6ea35432605b27cba3a9377acc8959f0be5b7506778eadb51ab9e297068bb1da`
MD5	`244a1d8bff519091d918de3f2590c097`
BLAKE2b-256	`e5bb9a3b168ee63755d7ff6851c8646516bc7ff1aff3f8f0af40db5605bb2e6b`

Provenance

The following attestation bundles were made for neurodags-0.2.0-py3-none-any.whl:

Publisher: publish.yml on yjmantilla/neurodags

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: neurodags-0.2.0-py3-none-any.whl
- Subject digest: 6ea35432605b27cba3a9377acc8959f0be5b7506778eadb51ab9e297068bb1da
- Sigstore transparency entry: 1535761221
- Sigstore integration time: May 14, 2026
Source repository:
- Permalink: yjmantilla/neurodags@4d24d873419520a255021d464a04c5ab7a1d6a54
- Branch / Tag: refs/tags/v0.2.0
- Owner: https://github.com/yjmantilla
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@4d24d873419520a255021d464a04c5ab7a1d6a54
- Trigger Event: push
