
Dataset fetcher for neuroscience research (OpenNeuro, BIDS, etc.)


SciTeX Dataset

Unified access to neuroscience datasets for AI-powered research


SciTeX Dataset provides a unified interface to discover and fetch metadata from major neuroscience data repositories.

Part of SciTeX.

Data Sources

| Repository | Description | Data Types |
|------------|-------------|------------|
| OpenNeuro | Open platform for sharing neuroimaging data | MRI, EEG, MEG, iEEG, PET |
| DANDI | BRAIN Initiative data archive | Electrophysiology, Ophys |
| PhysioNet | Physiological signal databases | ECG, EEG, clinical data |

Quick Start

pip install scitex-dataset

Python API

from scitex_dataset import fetch_all_datasets, format_dataset

# Fetch datasets from OpenNeuro
datasets = fetch_all_datasets(max_datasets=10)

# Format for analysis
for ds in datasets:
    formatted = format_dataset(ds)
    print(f"{formatted['id']}: {formatted['name']} ({formatted['n_subjects']} subjects)")
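
Once records are formatted, plain Python is enough to filter and rank them. The sketch below assumes each formatted record is a dictionary with the `id`, `name`, and `n_subjects` keys used above. The sample records and the `filter_by_subjects` helper are hypothetical stand-ins for real `format_dataset` output, not part of the package.

```python
# Illustrative stand-ins for format_dataset() output; these are NOT real
# OpenNeuro entries, only records with the same assumed keys.
sample_records = [
    {"id": "ds-a", "name": "Example resting-state EEG", "n_subjects": 24},
    {"id": "ds-b", "name": "Example task fMRI", "n_subjects": 8},
    {"id": "ds-c", "name": "Example iEEG recordings", "n_subjects": None},
]


def filter_by_subjects(records, min_subjects):
    """Keep records with at least `min_subjects` subjects, largest first.

    Hypothetical helper: records whose subject count is missing or not an
    integer are skipped rather than treated as zero.
    """
    kept = [
        rec
        for rec in records
        if isinstance(rec.get("n_subjects"), int)
        and rec["n_subjects"] >= min_subjects
    ]
    return sorted(kept, key=lambda rec: rec["n_subjects"], reverse=True)


large = filter_by_subjects(sample_records, min_subjects=10)
for rec in large:
    print(f"{rec['id']}: {rec['name']} ({rec['n_subjects']} subjects)")
```

Skipping records with a missing count is a design choice for this sketch; a real pipeline might prefer to log them instead.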

CLI

# Fetch OpenNeuro datasets
scitex-dataset openneuro -n 100 -o datasets.json -v

# Search across repositories
scitex-dataset search "epilepsy EEG" --source openneuro

# Database operations
scitex-dataset db init
scitex-dataset db sync openneuro
scitex-dataset db query "modality:eeg"
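
The file written by `-o datasets.json` can be post-processed with the standard library. The following is a minimal sketch that assumes the file holds a JSON array of objects and that each object may carry a `modality` field. Neither assumption is confirmed by the text above, and `count_by_key` is a hypothetical helper. A temporary file with made-up records stands in for real CLI output.

```python
import json
import tempfile
from collections import Counter
from pathlib import Path


def count_by_key(path, key):
    """Tally the values of `key` across a JSON array of records.

    Assumes the file is a JSON list of dictionaries; records that lack
    the key are counted under "unknown".
    """
    records = json.loads(Path(path).read_text(encoding="utf-8"))
    return Counter(str(rec.get(key, "unknown")) for rec in records)


# Made-up records standing in for the CLI output (assumed layout).
demo_records = [
    {"id": "ds-a", "modality": "eeg"},
    {"id": "ds-b", "modality": "mri"},
    {"id": "ds-c", "modality": "eeg"},
    {"id": "ds-d"},
]

with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = Path(tmp_dir) / "datasets.json"
    demo_path.write_text(json.dumps(demo_records), encoding="utf-8")
    counts = count_by_key(demo_path, "modality")

print(dict(counts))
```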

MCP Server

SciTeX Dataset includes an MCP (Model Context Protocol) server that lets AI agents such as Claude discover and query neuroscience datasets.

# Add to Claude Code MCP config
scitex-dataset mcp install

# Or run directly
scitex-dataset mcp start

Available MCP Tools:

| Tool | Description |
|------|-------------|
| dataset_openneuro_fetch | Fetch datasets from OpenNeuro |
| dataset_openneuro_search | Search OpenNeuro by query |
| dataset_dandi_fetch | Fetch datasets from DANDI Archive |
| dataset_dandi_search | Search DANDI by query |
| dataset_physionet_fetch | Fetch datasets from PhysioNet |
| dataset_physionet_search | Search PhysioNet by query |
| dataset_search | Unified search across all repositories |
| dataset_stats | Get repository statistics |
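
For orientation, an MCP client invokes a tool by sending a JSON-RPC 2.0 request with the `tools/call` method. The sketch below only builds such a message for `dataset_search`; it does not contact the server. The `query` argument name is an assumption, since the tool's actual input schema is not documented here, and `build_tool_call` is a hypothetical helper. A client would normally read the real schema from the server's `tools/list` response.

```python
import json
from itertools import count

# Monotonic request ids, as JSON-RPC expects a distinct id per request.
_request_ids = count(1)


def build_tool_call(tool_name, arguments):
    """Build a JSON-RPC 2.0 `tools/call` request for an MCP server.

    Hypothetical helper: it only assembles the message as a dictionary.
    """
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# "query" is an assumed argument name, not a documented parameter.
request = build_tool_call("dataset_search", {"query": "epilepsy EEG"})
print(json.dumps(request, indent=2))
```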

With SciTeX Session

import scitex as stx
from scitex_dataset import fetch_all_datasets, format_dataset

@stx.session
def main(logger=stx.INJECTED):
    datasets = fetch_all_datasets(max_datasets=100, logger=logger)
    formatted = [format_dataset(ds) for ds in datasets]
    stx.io.save(formatted, "openneuro_datasets.json")
    return 0

if __name__ == "__main__":
    main()

Why SciTeX Dataset?

  • Unified Interface: One API for OpenNeuro, DANDI, PhysioNet, and more
  • AI-Ready: MCP server enables LLMs to discover relevant datasets
  • Metadata Focus: Fast metadata queries without downloading full datasets
  • SciTeX Integration: Works seamlessly with @stx.session for reproducible research

SciTeX
AGPL-3.0 · ywatanabe@scitex.ai
