bidsreader
Data loader and file reader for the OpenBIDS format.
A Python library for reading and working with neuroimaging data stored in the BIDS (Brain Imaging Data Structure) format. Provides a structured, object-oriented interface for loading EEG and iEEG data, events, electrodes, and channel metadata, with built-in support for MNE-Python and PTSA.
Features
- Load BIDS-compliant EEG/iEEG datasets with minimal boilerplate
- Automatic detection of device type (EEG vs iEEG) and coordinate space
- Load events, electrodes, channels, raw data, and epochs through a unified reader API
- Filter trials by type across events DataFrames, MNE Raw, and MNE Epochs
- Convert between MNE and PTSA data formats
- Detect and convert EEG signal units (V, mV, uV, nV, etc.)
- Custom exception hierarchy for clear, actionable error messages
Installation
Prerequisites
- Python 3.10+
- Access to a BIDS-formatted dataset
Install from source
```bash
git clone <repository-url>
cd bidsreader
pip install -e .
```
Note: The project currently has no `pyproject.toml` or `setup.py`. To use it without one, add the project root to your Python path, or install in development mode after creating a minimal `pyproject.toml` (see Development Setup).
Dependencies
Required:
| Package | Purpose |
|---|---|
| mne | EEG data structures and I/O |
| mne-bids | BIDS path resolution and reading |
| pandas | Tabular data (events, channels) |
| numpy | Numeric operations |
Optional:
| Package | Purpose |
|---|---|
| ptsa | PTSA TimeSeries conversion (convert_unit, mne_*_to_ptsa) |
| pytest | Running the test suite |
Install all dependencies:
```bash
pip install mne mne-bids pandas numpy

# Optional
pip install ptsa pytest
```
Quick Start
A more detailed tutorial is available in tutorials/.
Basic usage with CMLBIDSReader
```python
from bidsreader import CMLBIDSReader

# Initialize a reader (defaults to /data/LTP_BIDS for CML data)
reader = CMLBIDSReader(subject="R1001P", task="FR1", session=0)

# Load behavioral events
events = reader.load_events("beh")

# Load electrode locations
electrodes = reader.load_electrodes()

# Load channel metadata (intracranial requires acquisition type)
channels = reader.load_channels("monopolar")

# Load combined channel + electrode data
combined = reader.load_combined_channels("bipolar")

# Load raw EEG data (returns MNE Raw object)
raw = reader.load_raw(acquisition="monopolar")

# Load epochs around events
epochs = reader.load_epochs(tmin=-0.5, tmax=1.5, acquisition="monopolar")
```
Using a custom BIDS root
```python
reader = CMLBIDSReader(
    root="/path/to/your/bids/dataset",
    subject="sub01",
    task="rest",
    session="01",
    device="eeg",
)
```
Querying dataset metadata
```python
reader = CMLBIDSReader(root="/data/LTP_BIDS", subject="R1001P", task="FR1")

# List all subjects in the dataset
subjects = reader.get_dataset_subjects()

# List all tasks in the dataset
tasks = reader.get_dataset_tasks()

# List sessions for this subject
sessions = reader.get_subject_sessions()

# List tasks for this subject
subject_tasks = reader.get_subject_tasks()

# Get the highest session number across all subjects
max_session = reader.get_dataset_max_sessions(outlier_thresh=100)
```
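Conceptually, queries like `get_dataset_subjects()` amount to scanning the BIDS root for entity-prefixed directories. The following is a standalone sketch of that idea, not the library's actual implementation (which resolves paths via mne-bids):

```python
from pathlib import Path


def list_subjects(root) -> list[str]:
    """List subject labels by scanning sub-* directories in a BIDS root.

    Simplified sketch: strips the BIDS 'sub-' entity prefix and sorts.
    """
    return sorted(
        p.name.removeprefix("sub-")
        for p in Path(root).glob("sub-*")
        if p.is_dir()
    )
```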
Changing reader fields after creation
```python
reader = CMLBIDSReader(subject="R1001P", task="FR1", session=0)

# Switch to a different session
reader.set_fields(session=1)

# Switch subject and task
reader.set_fields(subject="R1002P", task="catFR1")
```
Filtering events by trial type
```python
from bidsreader import filter_events_df_by_trial_types, filter_by_trial_types

# Filter a DataFrame
events = reader.load_events("beh")
word_events, indices = filter_events_df_by_trial_types(events, ["WORD"])

# Filter across multiple data objects at once (with consistency checks)
filtered_df, filtered_raw_events, filtered_epochs, event_id, idx = filter_by_trial_types(
    ["WORD", "STIM"],
    events_df=events,
    epochs=epochs,
)
```
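For the DataFrame case, filtering boils down to a membership mask on the BIDS-standard `trial_type` column. A minimal pandas-only sketch (the column name is BIDS-standard, but the function name and return convention here are illustrative, not the library's exact API):

```python
import pandas as pd

# Hypothetical events table in BIDS events.tsv layout
events = pd.DataFrame({
    "onset": [0.0, 1.5, 3.0, 4.5],
    "trial_type": ["WORD", "REST", "WORD", "STIM"],
})


def filter_df_by_trial_types(df, trial_types):
    """Return rows whose trial_type is in trial_types, plus the kept row positions."""
    mask = df["trial_type"].isin(trial_types)
    return df[mask].reset_index(drop=True), list(df.index[mask])


kept_events, kept_idx = filter_df_by_trial_types(events, ["WORD", "STIM"])
```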
Unit detection and conversion
```python
from bidsreader import detect_unit, get_scale_factor, convert_unit

# Detect the unit of an MNE object
unit = detect_unit(raw)  # e.g., "V"

# Get conversion factor
factor = get_scale_factor("V", "uV")  # 1_000_000.0

# Convert data to a target unit (returns a copy by default)
raw_uv = convert_unit(raw, "uV")
```
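Scale factors between SI voltage units are just ratios of powers of ten, so conceptually the lookup reduces to the following (a hypothetical re-implementation for illustration, not bidsreader's code):

```python
# Decimal exponent of each supported unit relative to volts.
UNIT_EXPONENTS = {"V": 0, "mV": -3, "uV": -6, "nV": -9}


def scale_factor(from_unit: str, to_unit: str) -> float:
    """Multiplicative factor that converts values in from_unit to to_unit."""
    return 10.0 ** (UNIT_EXPONENTS[from_unit] - UNIT_EXPONENTS[to_unit])
```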
Converting MNE data to PTSA TimeSeries
```python
from bidsreader import mne_epochs_to_ptsa, mne_raw_to_ptsa

# Convert epochs (requires events DataFrame with 'sample' column)
ts = mne_epochs_to_ptsa(epochs, events)

# Convert raw data (optionally select channels and time window)
ts = mne_raw_to_ptsa(raw, picks=["E1", "E2"], tmin=0.0, tmax=10.0)
```
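Independent of PTSA itself, the epoching step amounts to slicing fixed windows out of the continuous array around each event's sample index. A numpy-only sketch of that operation (function name and window convention are illustrative, not the library's):

```python
import numpy as np


def epoch_array(data, sample_idx, sfreq, tmin, tmax):
    """Cut (n_channels, n_times) data into (n_events, n_channels, n_window)
    windows spanning [tmin, tmax) seconds around each event sample index."""
    start = int(round(tmin * sfreq))
    stop = int(round(tmax * sfreq))
    return np.stack([data[:, s + start:s + stop] for s in sample_idx])


# 2 channels, 100 samples at 10 Hz; events at samples 20 and 50
data = np.arange(200, dtype=float).reshape(2, 100)
epochs_arr = epoch_array(data, sample_idx=[20, 50], sfreq=10.0, tmin=-0.5, tmax=1.0)
```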
Architecture
Class hierarchy
```
BaseReader           # Abstract base — BIDS path construction, metadata queries, field validation
└── CMLBIDSReader    # Concrete reader for the CML (Computational Memory Lab) dataset
```
Module overview
| Module | Purpose |
|---|---|
| `basereader.py` | `BaseReader` class — shared BIDS logic and metadata queries |
| `cmlbidsreader.py` | `CMLBIDSReader` — CML-specific loading and auto-detection |
| `filtering.py` | Trial-type filtering for DataFrames, MNE Raw, and Epochs |
| `convert.py` | MNE to PTSA TimeSeries conversion |
| `units.py` | Unit detection, scaling, and conversion |
| `helpers.py` | Utility functions (validation, BIDS prefix handling, bipolar electrode merging) |
| `exc.py` | Custom exception hierarchy |
| `_errorwrap.py` | `@public_api` decorator for consistent exception wrapping |
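The exception-wrapping pattern behind `@public_api` can be sketched as a decorator that converts foreign exceptions into the library's hierarchy. The classes and wording below are illustrative stand-ins, not the actual `_errorwrap.py` code:

```python
import functools


class BIDSReaderError(Exception):
    """Stand-in for the library's base exception (illustrative)."""


class ExternalLibraryError(BIDSReaderError):
    """Stand-in for wrapped third-party failures (illustrative)."""


def public_api(func):
    """Re-raise any non-library exception as ExternalLibraryError."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except BIDSReaderError:
            raise  # already part of the hierarchy; pass through unchanged
        except Exception as exc:
            raise ExternalLibraryError(f"{func.__name__} failed: {exc}") from exc
    return wrapper


@public_api
def parse_int(text):
    return int(text)
```

Callers then only ever see `BIDSReaderError` subclasses, regardless of which underlying library failed.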
Exception hierarchy
All exceptions inherit from BIDSReaderError, so you can catch everything with a single handler:
```
BIDSReaderError
├── InvalidOptionError          # Invalid argument value
├── MissingRequiredFieldError   # Required reader field not set
├── FileNotFoundBIDSError       # Expected BIDS file not found
├── AmbiguousMatchError         # Multiple files matched when one expected
├── DataParseError              # TSV/JSON parsing failure
├── DependencyError             # Optional dependency issue
└── ExternalLibraryError        # Unexpected error from MNE/pandas/etc.
```
```python
from bidsreader.exc import BIDSReaderError, FileNotFoundBIDSError

try:
    events = reader.load_events()
except FileNotFoundBIDSError:
    print("Events file not found for this subject/session")
except BIDSReaderError as e:
    print(f"Something went wrong: {e}")
```
Creating a New Reader
To support a different BIDS dataset, subclass BaseReader and implement your dataset-specific logic. Here is a step-by-step guide.
Step 1: Create your reader class
Create a new file (e.g., bidsreader/myreader.py):
```python
import pandas as pd
import mne
from pathlib import Path
from typing import Optional, Union

from .basereader import BaseReader
from ._errorwrap import public_api
from .helpers import validate_option
from .exc import FileNotFoundBIDSError


class MyDatasetReader(BaseReader):
    """Reader for the My Dataset BIDS archive."""

    # Valid options for constrained fields
    VALID_DEVICES = ("eeg", "meg")

    def __init__(
        self,
        root: Optional[Union[str, Path]] = "/data/my_dataset",
        subject: Optional[str] = None,
        task: Optional[str] = None,
        session: Optional[str | int] = None,
        space: Optional[str] = None,
        acquisition: Optional[str] = None,
        device: Optional[str] = None,
    ):
        # Validate device before passing to base
        device = validate_option("device", device, self.VALID_DEVICES)
        super().__init__(
            root=root,
            subject=subject,
            task=task,
            session=session,
            space=space,
            acquisition=acquisition,
            device=device,
        )

    # --- Override auto-detection hooks ---

    def _determine_device(self) -> Optional[str]:
        """Infer device type from subject ID or dataset structure.

        Return None if it cannot be determined.
        """
        if self.subject is None:
            return None
        # Example: subjects starting with "MEG" use MEG
        if self.subject.startswith("MEG"):
            return "meg"
        return "eeg"

    def _determine_space(self) -> Optional[str]:
        """Infer coordinate space from files on disk.

        Return None or raise FileNotFoundBIDSError / AmbiguousMatchError
        if it cannot be determined.
        """
        # Implement dataset-specific logic here
        return "MNI152NLin2009aSym"

    # --- Add your loading methods ---

    @public_api
    def load_events(self) -> pd.DataFrame:
        """Load behavioral events for the current subject/session/task."""
        self._require(("subject", "task", "session", "device"), context="load_events")
        bp = self._bp(datatype="beh", suffix="beh", extension=".tsv")
        matches = bp.match()
        if not matches:
            raise FileNotFoundBIDSError(f"No events file found for {bp}")
        return pd.read_csv(matches[0].fpath, sep="\t")

    @public_api
    def load_raw(self) -> mne.io.BaseRaw:
        """Load raw continuous data."""
        from mne_bids import read_raw_bids

        self._require(("subject", "task", "session", "device"), context="load_raw")
        bp = self._bp(datatype=self.device)
        return read_raw_bids(bp)
```
Step 2: Key patterns to follow
- Validate constrained fields in `__init__` using `validate_option()` before calling `super().__init__()`.
- Override `_determine_device()` and `_determine_space()` to enable automatic detection. These are called lazily the first time `reader.device` or `reader.space` is accessed. Return `None` if detection fails — the base class will emit a warning.
- Use `self._require(fields, context=...)` at the start of each loading method to ensure the necessary fields are set before attempting file I/O.
- Use `self._bp(**kwargs)` to construct `BIDSPath` objects for file matching. This handles BIDS-standard path construction using the reader's current field values.
- Decorate all public methods with `@public_api` so that external exceptions (`FileNotFoundError`, `JSONDecodeError`, etc.) are automatically mapped to the `BIDSReaderError` hierarchy.
- Use `self._add_bids_prefix(field, value)` when you need to manually construct BIDS-prefixed path segments (e.g., `"sub-001"`, `"ses-0"`).
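The entity-prefixing behavior in the last bullet is simple enough to sketch standalone. This is a hypothetical re-implementation of the helper's likely contract (including idempotency for already-prefixed values, which is an assumption here):

```python
def add_bids_prefix(field: str, value) -> str:
    """Prefix a value with its BIDS entity key, e.g. ('sub', '001') -> 'sub-001'.

    Assumed to be idempotent: values already carrying the prefix are
    returned unchanged.
    """
    text = str(value)
    prefix = f"{field}-"
    return text if text.startswith(prefix) else prefix + text
```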
Step 3: Export your reader
Add your reader to the package's `__init__.py`:
```python
from .myreader import MyDatasetReader
```
Step 4: Write tests
Follow the patterns in tests/conftest.py for fixtures and tests/test_cmlbidsreader.py for test structure. Key patterns:
- Use `tmp_path` fixtures to create temporary BIDS directory structures
- Use skip decorators for integration tests that require real data on disk
- Test both the happy path and error cases (missing fields, invalid options, missing files)
```python
import pytest

from bidsreader import MyDatasetReader
from bidsreader.exc import MissingRequiredFieldError


@pytest.fixture
def my_reader(tmp_path):
    return MyDatasetReader(root=tmp_path, subject="EEG001", task="rest", session=1)


def test_device_detection(my_reader):
    assert my_reader.device == "eeg"


def test_missing_field_raises(tmp_path):
    reader = MyDatasetReader(root=tmp_path, subject="EEG001", task="rest")
    reader.session = None
    with pytest.raises(MissingRequiredFieldError):
        reader.load_events()
```
Development Setup
Running tests
```bash
# Run all tests
python -m pytest tests/

# Run a specific test file
python -m pytest tests/test_basereader.py -v

# Run with output
python -m pytest tests/ -v -s
```
Integration tests that depend on real data at /data/LTP_BIDS/ are skipped automatically when that data is not available.
Creating a pyproject.toml (recommended)
If you want proper `pip install -e .` support, create a `pyproject.toml`:
```toml
[build-system]
requires = ["setuptools>=64"]
build-backend = "setuptools.build_meta"

[project]
name = "bidsreader"
version = "0.1.0"
description = "Data loader and file reader for the OpenBIDS format"
requires-python = ">=3.10"
dependencies = [
    "mne",
    "mne-bids",
    "pandas",
    "numpy",
]

[project.optional-dependencies]
ptsa = ["ptsa"]
dev = ["pytest"]
```
Then install with:
```bash
pip install -e ".[dev]"
```
API Reference
BaseReader
| Method | Description |
|---|---|
| `set_fields(**kwargs)` | Set multiple reader fields at once (chainable) |
| `get_dataset_subjects()` | List all subjects in the dataset |
| `get_dataset_tasks()` | List all tasks in the dataset |
| `get_subject_sessions()` | List sessions for the current subject |
| `get_subject_tasks()` | List tasks for the current subject |
| `get_dataset_max_sessions(outlier_thresh=None)` | Get the highest session number across all subjects |
CMLBIDSReader
Inherits all BaseReader methods, plus:
| Method | Description |
|---|---|
| `is_intracranial()` | Returns `True` if device is `"ieeg"` |
| `load_events(event_type="beh")` | Load events TSV (`"beh"` or device-type events) |
| `load_electrodes()` | Load electrode coordinates TSV |
| `load_channels(acquisition=None)` | Load channel metadata TSV (iEEG requires `"monopolar"` or `"bipolar"`) |
| `load_combined_channels(acquisition=None)` | Merge channel + electrode data into one DataFrame |
| `load_coordsystem_desc()` | Load coordinate system JSON metadata |
| `load_raw(acquisition=None)` | Load raw continuous data (returns `mne.io.BaseRaw`) |
| `load_epochs(tmin, tmax, events=None, baseline=None, acquisition=None, event_repeated="merge", channels=None, preload=False)` | Create `mne.Epochs` from raw data and events |
Standalone Functions
| Function | Module | Description |
|---|---|---|
| `filter_events_df_by_trial_types(events_df, trial_types)` | `filtering` | Filter events DataFrame by trial type |
| `filter_raw_events_by_trial_types(raw, trial_types)` | `filtering` | Filter MNE Raw annotations by trial type |
| `filter_epochs_by_trial_types(epochs, trial_types)` | `filtering` | Filter MNE Epochs by trial type |
| `filter_by_trial_types(trial_types, *, events_df, raw, epochs)` | `filtering` | Filter multiple data objects with consistency checks |
| `detect_unit(data, current_unit=None)` | `units` | Detect or validate EEG data unit |
| `get_scale_factor(from_unit, to_unit)` | `units` | Get multiplicative conversion factor between units |
| `convert_unit(data, target, *, current_unit=None, copy=True)` | `units` | Convert EEG data to a target unit |
| `mne_epochs_to_ptsa(epochs, events)` | `convert` | Convert MNE Epochs to PTSA TimeSeries |
| `mne_raw_to_ptsa(raw, picks=None, tmin=None, tmax=None)` | `convert` | Convert MNE Raw to PTSA TimeSeries |
License
TBD