# auto-bidsify

Automated BIDS standardization tool powered by LLM-first architecture.
## Features
- General compatibility: Handles diverse dataset structures (flat, hierarchical, multi-site)
- Multi-modal support: MRI, fNIRS, and mixed modality datasets
- Intelligent metadata extraction: Automatic participant demographics from DICOM headers, documents, and filenames
- Format conversion: DICOM→NIfTI, CSV→SNIRF, and more
- Evidence-based reasoning: Confidence scoring and provenance tracking for all decisions
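The evidence-based reasoning above can be pictured with a minimal sketch. The `Evidence` record and `overall_confidence` helper here are illustrative, not the tool's actual API; combining scores with `min` is one plausible choice (a decision is only as strong as its weakest supporting evidence):

```python
from dataclasses import dataclass

@dataclass
class Evidence:
    """One piece of evidence backing a standardization decision (hypothetical shape)."""
    claim: str         # e.g. "file X is a T1w anatomical scan"
    source: str        # provenance: DICOM header, filename, document, ...
    confidence: float  # 0.0 - 1.0

def overall_confidence(items: list[Evidence]) -> float:
    """Combine scores conservatively: take the weakest link."""
    return min(e.confidence for e in items) if items else 0.0

decision = [
    Evidence("sub-001 scan is T1w", "DICOM SeriesDescription", 0.95),
    Evidence("participant age is 34", "demographics.csv", 0.80),
]
print(round(overall_confidence(decision), 2))  # -> 0.8
```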
## Supported Formats

Input formats:

- MRI: DICOM, NIfTI (.nii, .nii.gz)
- fNIRS: SNIRF, Homer3 (.nirs), CSV/TSV tables
- Documents: PDF, DOCX, TXT, Markdown, ...

Output: BIDS-compliant dataset (v1.10.0)
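Input formats can typically be recognized by extension before any content inspection. This is a sketch of such a dispatch, assuming a hypothetical `FORMAT_MAP`; the real tool's detection logic may differ:

```python
from pathlib import Path

# Illustrative extension -> (modality, format) mapping (not the tool's actual table)
FORMAT_MAP = {
    ".dcm": ("mri", "DICOM"),
    ".nii": ("mri", "NIfTI"),
    ".nii.gz": ("mri", "NIfTI"),
    ".snirf": ("nirs", "SNIRF"),
    ".nirs": ("nirs", "Homer3"),
}

def detect_format(path: str) -> tuple[str, str]:
    name = Path(path).name.lower()
    # Try longer extensions first so ".nii.gz" wins over ".nii"
    for ext, info in sorted(FORMAT_MAP.items(), key=lambda kv: -len(kv[0])):
        if name.endswith(ext):
            return info
    return ("unknown", "unknown")

print(detect_format("sub-01_T1w.nii.gz"))  # -> ('mri', 'NIfTI')
```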
## Quick Start

### Installation

```bash
# Clone the repository
git clone https://github.com/yourusername/auto-bidsify.git
cd auto-bidsify

# Set up the environment
conda create -n bidsify python=3.10
conda activate bidsify
pip install -r requirements.txt

# Set your OpenAI API key
export OPENAI_API_KEY="your-key-here"
```
### Basic Usage

```bash
# Full pipeline (one command)
python cli.py full \
  --input /path/to/your/data \
  --output outputs/my_dataset \
  --model gpt-4o \
  --modality mri

# Step-by-step execution
python cli.py ingest --input data.zip --output outputs/run
python cli.py evidence --output outputs/run --modality mri
python cli.py trio --output outputs/run --model gpt-4o
python cli.py plan --output outputs/run --model gpt-4o
python cli.py execute --output outputs/run
python cli.py validate --output outputs/run
```
### Command Options

```text
--input PATH        Input data (archive or directory)
--output PATH       Output directory
--model MODEL       LLM model (default: gpt-4o)
--modality TYPE     Data modality: mri|nirs|mixed
--nsubjects N       Number of subjects (optional)
--describe "TEXT"   Dataset description (recommended)
```
## Pipeline Stages

| Stage | Command | Input | Output | Purpose |
|---|---|---|---|---|
| 1 | ingest | Raw data | ingest_info.json | Extract/reference data |
| 2 | evidence | All files | evidence_bundle.json | Analyze structure, detect subjects |
| 3 | classify | Mixed data | classification_plan.json | Separate MRI/fNIRS (optional) |
| 4 | trio | Evidence | BIDS trio files | Generate metadata files |
| 5 | plan | Evidence + trio | BIDSPlan.yaml | Create conversion strategy |
| 6 | execute | Plan | bids_compatible/ | Execute conversions |
| 7 | validate | BIDS dataset | Validation report | Check compliance |
## Output Structure

```text
outputs/my_dataset/
  bids_compatible/            # Final BIDS dataset
    dataset_description.json
    README.md
    participants.tsv
    sub-001/
      anat/
        sub-001_T1w.nii.gz
      func/
        sub-001_task-rest_bold.nii.gz
  _staging/                   # Intermediate files
    evidence_bundle.json
    BIDSPlan.yaml
    conversion_log.json
```
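A quick sanity check on the final dataset is verifying the top-level metadata files shown above are present. This `missing_trio` helper is a minimal sketch, not part of the tool:

```python
from pathlib import Path

# Top-level files expected in bids_compatible/ (per the layout above)
REQUIRED = ["dataset_description.json", "README.md", "participants.tsv"]

def missing_trio(bids_root: str) -> list[str]:
    """Return the required top-level files absent from bids_root."""
    root = Path(bids_root)
    return [f for f in REQUIRED if not (root / f).exists()]

# On an empty directory, all three are reported missing.
import tempfile
with tempfile.TemporaryDirectory() as d:
    print(missing_trio(d))
```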
## Examples

### Example 1: Single-site MRI study

```bash
python cli.py full \
  --input brain_scans/ \
  --output outputs/study1 \
  --nsubjects 50 \
  --model gpt-4o \
  --modality mri
```

### Example 2: Multi-site dataset with description

```bash
python cli.py full \
  --input camcan_data/ \
  --output outputs/camcan \
  --model gpt-4o \
  --modality mri \
  --describe "Cambridge Centre for Ageing and Neuroscience: 650 participants, ages 18-88, multi-site MRI study"
```

### Example 3: fNIRS dataset from CSV

```bash
python cli.py full \
  --input fnirs_study/ \
  --output outputs/fnirs \
  --model gpt-4o \
  --modality nirs \
  --describe "Prefrontal cortex activation during cognitive tasks, 30 subjects"
```
## Architecture

LLM-First Design:

- Python: deterministic operations (file I/O, format conversion, validation)
- LLM: semantic understanding (file classification, metadata extraction, pattern recognition)
- Hybrid: combines the reliability of deterministic code with the flexibility of semantic reasoning
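The split above can be sketched in a few lines. Here `llm_classify` is a stub standing in for the real LLM call, and both function names are hypothetical; only the division of labor mirrors the design described:

```python
from pathlib import Path

def deterministic_metadata(path: str) -> dict:
    """Python side: facts computable without any model."""
    p = Path(path)
    return {"name": p.name, "suffixes": p.suffixes}

def llm_classify(metadata: dict) -> dict:
    """LLM side (stubbed for illustration): semantic judgment plus a confidence score."""
    label = "anat/T1w" if "t1w" in metadata["name"].lower() else "unknown"
    return {"label": label, "confidence": 0.9 if label != "unknown" else 0.3}

record = deterministic_metadata("sub-001_T1w.nii.gz")
record.update(llm_classify(record))
print(record["label"])  # -> anat/T1w
```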
## Requirements

- Python 3.10+
- OpenAI API key
- Optional: `dcm2niix` for DICOM conversion
- Optional: `bids-validator` for validation
## Current Status

Version: 1.0 (LLM-First Architecture with Evidence-Based Reasoning)

Tested datasets:

- Visible Human Project (flat structure, CT scans)
- CamCAN (hierarchical, multi-site, 1288 subjects)
- [Your dataset here - help us test!]

Known limitations:

- The classification stage (Stage 3) and MAT/spreadsheet conversion are experimental
- Some edge cases in participant metadata extraction
## Contributing

We need YOUR datasets to improve robustness! Please test and report:

- Success cases
- Failure cases
- Edge cases