NeuroDataHub CLI

A command-line interface for downloading neuroimaging datasets with ease! Access over 30 popular neuroimaging datasets from various sources including INDI, OpenNeuro, ReproBrainChart, and more.

🌐 Homepage: https://blackpearl006.github.io/NeuroDataHub/
📦 Repository: https://github.com/blackpearl006/neurodatahub-cli

Features

  • 🗂️ 30+ Datasets: Access popular neuroimaging datasets from multiple sources
  • 🔍 Smart Search: Find datasets by name, description, or category
  • 🚀 Multiple Backends: Supports AWS CLI, aria2c, DataLad, and more
  • 🔐 Authentication Support: Handles various authentication workflows
  • 📊 Rich UI: Beautiful tables, progress bars, and interactive prompts
  • Resume Support: Interrupted downloads can be resumed
  • 🛡️ Dependency Checking: Validates required tools with helpful installation guidance
  • 🎯 Filtering: Filter datasets by category, authentication requirements, size, etc.

Quick Start

Installation

# Via pip
pip install neurodatahub-cli

# Via conda
conda install -c conda-forge neurodatahub-cli

Basic Usage

# List all available datasets
neurodatahub --list

# Search for specific datasets
neurodatahub search "brain development"

# Get detailed info about a dataset
neurodatahub info HBN

# Download a dataset
neurodatahub pull HBN ./data/HBN

# Check system dependencies
neurodatahub check

Dataset Categories

INDI Datasets (No Authentication Required)

Access datasets from the International Neuroimaging Data-sharing Initiative:

  • HBN - Healthy Brain Network (2TB)
  • CORR - Consortium for Reliability and Reproducibility (500GB)
  • ADHD200 - ADHD diagnosis dataset (200GB)
  • NKI - Nathan Kline Institute Rockland Sample (800GB)
  • And many more...

OpenNeuro Datasets (No Authentication Required)

Open platform datasets:

  • AOMIC variants - Amsterdam Open MRI Collection
  • Pixar - fMRI responses to movie clips
  • MPI - Max Planck Institute datasets
  • Dense sampling - High-resolution single subjects

Independent Datasets

  • IXI - Imperial College London brain MRI (12GB)
  • OASIS-1/2 - Cross-sectional and longitudinal aging studies
  • HCP - Human Connectome Project (requires authentication)
  • CamCAN - Cambridge Centre for Ageing and Neuroscience (requires authentication)

ReproBrainChart (RBC) Datasets

Git/DataLad-based datasets:

  • PNC - Philadelphia Neurodevelopmental Cohort
  • BHRC - Brazilian High Risk Cohort
  • CCNP - Chinese Color Nest Project

IDA-LONI Datasets (Interactive Authentication)

Datasets requiring complex authentication workflows:

  • ADNI - Alzheimer's Disease Neuroimaging Initiative
  • PPMI - Parkinson's Progression Markers Initiative
  • AIBL - Australian Imaging, Biomarker & Lifestyle Study of Ageing
  • MCSA - Mayo Clinic Study of Aging

Installation & Dependencies

System Dependencies

Different datasets require different tools. The CLI will guide you through installation:

# AWS CLI (for INDI and OpenNeuro datasets)
pip install awscli
# or
conda install -c conda-forge awscli

# aria2c (for fast parallel downloads)
brew install aria2          # macOS
apt-get install aria2       # Ubuntu/Debian
conda install -c conda-forge aria2

# DataLad (for RBC datasets)
pip install datalad

# Firefox (for interactive authentication)
# Download from https://www.mozilla.org/firefox/

Check Dependencies

neurodatahub check

This will show you which tools are installed and provide installation guidance for missing dependencies.
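
As a rough stand-in for what such a check involves, the sketch below looks up each backend tool named above on PATH with `command -v`. The function name `check_tools` is hypothetical and not part of NeuroDataHub CLI, and the executable names (`aws`, `aria2c`, `datalad`, `firefox`) are the usual ones but may differ on your system. `neurodatahub check` remains the authoritative report.

```shell
# check_tools TOOL...: print whether each tool is on PATH.
# Hypothetical helper; it approximates, and does not replace, `neurodatahub check`.
check_tools() {
  missing=0
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      printf '%-10s found\n' "$tool"
    else
      printf '%-10s MISSING\n' "$tool"
      missing=$((missing + 1))
    fi
  done
  # Succeed only when every tool was found.
  [ "$missing" -eq 0 ]
}

# Executable names are assumptions based on the tools listed above.
check_tools aws aria2c datalad firefox || echo "Some tools are missing; see the install commands above."
```

Not every dataset needs every tool, so a missing entry only matters for the categories that depend on it.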

Command Reference

Core Commands

List Datasets

# List all datasets
neurodatahub --list
neurodatahub list

# Filter by category
neurodatahub --list --category indi
neurodatahub list --category openneuro

# Show only datasets requiring authentication
neurodatahub --list --auth-only
neurodatahub list --auth-required

# Show only datasets NOT requiring authentication
neurodatahub --list --no-auth-only
neurodatahub list --no-auth

# Show detailed information
neurodatahub --list --detailed

Download Datasets

# Basic download
neurodatahub --pull HBN --path ./data/HBN
neurodatahub pull HBN ./data/HBN

# Dry run (see what would be downloaded)
neurodatahub pull HBN ./data/HBN --dry-run

# Skip confirmation prompts
neurodatahub pull HBN ./data/HBN --force

Information Commands

# Dataset information
neurodatahub info HBN

# Search datasets
neurodatahub search "alzheimer"
neurodatahub search "resting state"

# Show categories
neurodatahub categories

# Show datasets in specific category
neurodatahub categories --category ida

# Show statistics
neurodatahub stats

# Check system dependencies
neurodatahub check

# Show version
neurodatahub version

Authentication Workflows

No Authentication Required

Most INDI and OpenNeuro datasets can be downloaded immediately:

neurodatahub pull HBN ./data/HBN

AWS Credentials Required (HCP)

For datasets like HCP that require AWS credentials:

# The CLI will guide you through AWS setup
neurodatahub pull HCP_1200 ./data/HCP

# Or set up manually:
aws configure
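
To confirm that the configured credentials are actually accepted before starting a large pull, something like the following may help. `aws_ready` is a hypothetical helper, not part of NeuroDataHub CLI. It relies on `aws sts get-caller-identity`, a standard AWS CLI call that needs no special permissions and fails when credentials are missing or invalid. It does not check whether those credentials grant access to the HCP data itself.

```shell
# aws_ready: succeed if the AWS CLI is installed and the active
# credentials are accepted by AWS (requires network access).
# Hypothetical helper; NeuroDataHub CLI may perform its own checks.
aws_ready() {
  command -v aws >/dev/null 2>&1 || return 1
  aws sts get-caller-identity >/dev/null 2>&1
}

# Usage:
#   aws_ready && neurodatahub pull HCP_1200 ./data/HCP
```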

IDA-LONI Interactive Workflow

For datasets with multi-step access requirements, such as ADNI and PPMI, the CLI provides an interactive checklist:

neurodatahub pull ADNI ./data/ADNI

This will walk you through:

  1. ✅ IDA-LONI account registration
  2. ✅ Data Use Agreement (DUA) approval
  3. ✅ Image collection creation
  4. ✅ Advanced Downloader link generation
  5. ✅ IP address verification
  6. 📥 Automated download execution

Examples

Download Multiple Datasets

# Download several INDI datasets
for dataset in HBN CORR ADHD200; do
  neurodatahub pull "$dataset" "./data/$dataset"
done
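
With terabyte-scale pulls, one failure should probably not stop the whole batch. The variant below is a sketch under that assumption: `pull_many` is a hypothetical wrapper (not a NeuroDataHub command) that keeps going after a failed pull and reports the failed IDs at the end. It assumes `neurodatahub pull` exits non-zero on failure, which this README does not state.

```shell
# pull_many DEST_ROOT ID...: pull each dataset into DEST_ROOT/ID,
# continue past failures, and list the failed IDs on stderr.
# Assumes `neurodatahub pull` returns a non-zero status on failure.
pull_many() {
  dest_root=$1
  shift
  failed=""
  for id in "$@"; do
    if ! neurodatahub pull "$id" "$dest_root/$id"; then
      failed="$failed $id"
    fi
  done
  if [ -n "$failed" ]; then
    echo "failed:$failed" >&2
    return 1
  fi
}

# Usage:
#   pull_many ./data HBN CORR ADHD200
```

For unattended runs, the `--force` flag described under Download Datasets could be added to the `pull` call to skip confirmation prompts.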

Search and Filter Workflow

# Find brain development datasets
neurodatahub search "development"

# Show the first five no-auth datasets whose listed size is in MB or GB
neurodatahub list --no-auth | grep -E "(MB|GB)" | head -5

# List all OpenNeuro datasets
neurodatahub list --category openneuro

Check Before Downloading

# Preview what will be downloaded
neurodatahub pull HBN ./data/HBN --dry-run

# Check system readiness
neurodatahub check

# Get dataset details
neurodatahub info HBN
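
Given the sizes listed above (HBN is about 2TB), it can also be worth checking free disk space first. The sketch below uses the POSIX `df -Pk` output format. `has_space` is a hypothetical helper, not a NeuroDataHub command, and the directory passed to it must already exist.

```shell
# has_space DIR REQUIRED_GB: succeed if the filesystem holding DIR
# has at least REQUIRED_GB gigabytes available.
# Hypothetical helper; DIR must exist or df will fail.
has_space() {
  # Column 4 of `df -Pk` is the available space in 1024-byte blocks.
  avail_kb=$(df -Pk "$1" | awk 'NR == 2 { print $4 }')
  [ "$avail_kb" -ge $(( $2 * 1024 * 1024 )) ]
}

# Usage (2048 GB is roughly the 2TB listed for HBN):
#   mkdir -p ./data && has_space ./data 2048 && neurodatahub pull HBN ./data/HBN
```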

Troubleshooting

Common Issues

Download Failed - Missing Dependencies

# Check what's missing
neurodatahub check

# Install missing tools as suggested
pip install awscli
brew install aria2

Authentication Issues

# For AWS datasets, check credentials
aws configure list

# For IDA-LONI datasets, verify:
# - Account registration
# - DUA approval
# - Same IP for link generation and download

Network/Resume Issues

# Most downloads can be resumed by re-running the command
neurodatahub pull HBN ./data/HBN

# For aria2c downloads, use native resume:
aria2c --continue=true [URL]
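
Since re-running the command resumes most downloads, the re-run can be automated with a small retry loop. This is a sketch, not a built-in feature: `retry` is a hypothetical helper, and it assumes the wrapped command exits non-zero when a download is interrupted.

```shell
# retry MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Re-run COMMAND until it succeeds or MAX_ATTEMPTS is reached,
# sleeping DELAY_SECONDS between attempts.
retry() {
  max=$1
  delay=$2
  shift 2
  attempt=1
  until "$@"; do
    if [ "$attempt" -ge "$max" ]; then
      echo "giving up after $attempt attempts" >&2
      return 1
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
}

# Usage: up to 5 attempts, 60 seconds apart
#   retry 5 60 neurodatahub pull HBN ./data/HBN
```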

Getting Help

  1. Built-in Help

    neurodatahub --help
    neurodatahub pull --help
    
  2. Check Dependencies

    neurodatahub check
    
  3. Dataset Information

    neurodatahub info DATASET_ID
    
  4. GitHub Issues: https://github.com/blackpearl006/neurodatahub-cli/issues

Contributing

We welcome contributions! Please see our Contributing Guidelines for details.

Development Setup

git clone https://github.com/blackpearl006/neurodatahub-cli.git
cd neurodatahub-cli
pip install -e ".[dev]"

Running Tests

pytest tests/

License

MIT License - see LICENSE file for details.

Citation

If you use NeuroDataHub CLI in your research, please cite:

@software{neurodatahub_cli,
  title={NeuroDataHub},
  author={Ninad Aithal},
  year={2025},
  url={https://github.com/blackpearl006/neurodatahub-cli},
  version={0.1.0}
}

Acknowledgments

  • Thanks to all dataset providers for making their data openly available
  • INDI consortium for pioneering open neuroimaging data sharing
  • OpenNeuro platform for standardized dataset hosting
  • ReproBrainChart project for reproducible brain charting
  • IDA-LONI for comprehensive neuroimaging data archives

Made with ❤️ for the neuroimaging community
