A unifying framework to benchmark NeuroAI models.

These details have not been verified by PyPI

Project links

Project description

NeuralBench: Unified benchmark for NeuroAI models

NeuralBench: open, reproducible benchmarking of NeuroAI models on EEG, MEG, and fMRI -- 36 EEG tasks, 94 EEG datasets (9.5k+ subjects, 13.6k+ hours of recording), and 14 models

neuralbench is a unified framework to benchmark NeuroAI models. It is designed for evaluating pretrained or randomly initialized models on a diverse suite of downstream tasks for brain modeling -- not for pretraining itself. It supports multiple neuroimaging devices -- EEG, MEG, and fMRI -- with more tasks and devices to come.

Examples:

neuralbench eeg audiovisual_stimulus -m eegnet   # EEG audiovisual stimulus classification with EEGNet

See neuralbench in the documentation.

Installation

Install from PyPI:

pip install neuralbench

Or install from source (e.g. for development):

cd neuralbench-repo
pip install -e .

Quick start

As an example, let's run the audiovisual stimulus classification task with the default model from EEG. This task uses the MNE sample dataset, which is small (~1.5 GB) and can be downloaded quickly. We use it both as a sanity-check task and as a probe of model behaviour in very-low-data regimes (a single subject, 288 trials, 4-class classification).

When you first run neuralbench, you will be prompted to configure paths for data (DATA_DIR), cache (CACHE_DIR) and results (SAVE_DIR). The configuration will be saved to ~/.neuralbench/config.json by default. Set NEURALBENCH_CONFIG=/path/to/config.json to override this location (useful on shared machines or in CI; see Custom config location). If using Weights & Biases, see this section for setup instructions.

Evaluate a model on a downstream task in three commands:

neuralbench eeg audiovisual_stimulus --download   # 1. Download the data (here, the Mne2013SampleEeg study)
neuralbench eeg audiovisual_stimulus --prepare    # 2. Prepare cache (preprocessed data and targets)
neuralbench eeg audiovisual_stimulus              # 3. Run the full grid

Steps 2 and 3 dispatch to SLURM when it is auto-detected on your machine; step 3 additionally requires SLURM_PARTITION to be set in your neuralbench config. Pass --debug to either step to force local execution. Step 2 is mostly useful for larger datasets that benefit from parallel preprocessing with SLURM, and is not strictly necessary for audiovisual_stimulus. Add --debug to any command for a fast local sanity-check run with a subsampled dataset and a limited number of epochs:

neuralbench eeg audiovisual_stimulus --debug      # Local validation run

By default, experiments use the EEGNet architecture[^1]; use -m <model> to swap models.

[^1]: Lawhern, Vernon J., et al. "EEGNet: a compact convolutional neural network for EEG-based brain–computer interfaces." Journal of neural engineering 15.5 (2018): 056013.

Results can be visualized on Weights & Biases, or aggregated locally using --plot-cached (see the Visualizing Results tutorial).

Mean normalized rank (lower is better) of 17 models on the NeuralBench-EEG-Core v1.0 suite, with foundation models, task-specific architectures, and baselines color-coded

_{Example output: model rankings on the NeuralBench-EEG-Core v1.0 suite (one dataset per task, lower rank is better).}

[!TIP] See the full quickstart tutorial for a walkthrough of the CLI, config system, and model selection.

The same workflow applies to MEG and fMRI tasks -- just swap the device and task name:

neuralbench meg typing --debug          # MEG keystroke classification in debug mode
neuralbench fmri image --debug          # fMRI image retrieval in debug mode

Running the full EEG benchmark

To run all 36 EEG tasks end-to-end:

neuralbench eeg all --download          # 1. Download all datasets (~3.3 TB)
neuralbench eeg all --prepare           # 2. Build preprocessing cache (~35 GB)
neuralbench eeg all                     # 3. Run all 36 tasks

Use -m all_classic, -m all_fm, or -m all_classic all_fm to evaluate across all 8 task-specific EEG models, all 6 EEG foundation models, or all 14 EEG models respectively.

See the full EEG benchmark guide for prerequisites, resource requirements (~3.3 TB disk, 1 GPU with 32 GB VRAM per job), dataset variant options, and computational considerations.

[!IMPORTANT] A handful of datasets cannot be fetched automatically and require a one-time manual step (creating an account, accepting a license agreement, or submitting an application form). The affected tasks are pathology, artifact, and clinical_event (TUH EEG Corpus); image and meg/image (THINGS-images); emotion (FACED on Synapse); video (SEED-DV); fmri/image (NSD); motor_imagery and mental_arithmetic (Shin2017OpenA and Shin2017OpenB); eeg/typing / meg/typing (Levy2025Brain, not yet publicly released); and speech (Brennan2019 on Deep Blue Data: the legacy urlretrieve-based downloader is currently blocked by Cloudflare's bot challenge, so the v1 files must be fetched manually via Globus until upstream restores anonymous HTTP access). See the Datasets requiring manual download section of the benchmark guide for the exact steps for each one.

Weights & Biases setup

If using Weights & Biases for experiment tracking, first set the relevant environment variable:

export WANDB_API_KEY=your_wandb_api_key

Then configure the WandbLoggerConfig (from neuraltrain.utils) accordingly, e.g., by setting the host, name, group and entity fields to those of your project.

Leave WANDB_HOST="" blank in your neuralbench config to disable W&B logging entirely; results are still written to SAVE_DIR and remain accessible via --plot-cached.

Benchmark suites

NeuralBench packages its tasks and datasets into named, versioned evaluation suites. Any reported result should cite a concrete suite, e.g. NeuralBench-EEG-Core v1.0. The unqualified name NeuralBench refers to the framework only.

Naming scheme: NeuralBench-<Modality>-<Variant> v<Major>.<Minor>

Modality — EEG, MEG, fMRI (one version history per modality).
Variant — Core (one dataset per task; broad coverage across paradigms) or Full (every dataset registered for each task; reveals within-task variability). Full is always a strict superset of Core at the same version.
Version — semver-style tag for the frozen task/dataset/split specification. Minor bumps are additive and back-comparable; major bumps are breaking.

Current release:

NeuralBench-EEG-Core v1.0 — one dataset × all EEG tasks.
NeuralBench-EEG-Full v1.0 — all datasets × all EEG tasks.

Available Tasks

The following tasks are available in neuralbench, organized by device:

EEG (36 tasks): age, artifact, audiovisual_stimulus, clinical_event, cvep, dementia_diagnosis, depression_diagnosis, emotion, ern, image, lrp, mental_arithmetic, mental_imagery, mental_workload, mismatch_negativity, motor_execution, motor_imagery, n170, n2pc, n400, p3, parkinsons_diagnosis, pathology, psychopathology, reaction_time, schizophrenia_diagnosis, seizure, sentence, sex, sleep_arousal, sleep_stage, speech, ssvep, typing, video, word

MEG (2 tasks): image, typing

fMRI (1 task): image

[!NOTE] More MEG and fMRI tasks are under development. Contributions are welcome!

Adding a new task

See the Adding a New Task tutorial.

Adding a new model

See the Adding a New Model tutorial.

(Advanced) Modifying the training loop

The training loop is implemented in neuralbench/pl_module.py as a PyTorch Lightning LightningModule (BrainModule). Override or extend this class to customize training, validation, or test steps. See the neuralbench API reference for details.

Contributing

See the CONTRIBUTING file for how to help out.

Citing

@misc{banville2026neuralbench,
  title        = {NeuralBench: A Unifying Framework to Benchmark NeuroAI Models},
  author       = {Banville, Hubert and d'Ascoli, St{\'e}phane and Dahan, Simon and Rapin, J{\'e}r{\'e}my and Careil, Marl{\`e}ne and Benchetrit, Yohann and L{\'e}vy, Jarod and Panchavati, Saarang and Ratouchniak, Antoine and Zhang, Mingfang and Cascardi, Elisa and Begany, Katelyn and Brooks, Teon and King, Jean-R{\'e}mi},
  year         = {2026},
  howpublished = {Brain \& AI team, Meta FAIR},
  url          = {https://ai.meta.com/research/publications/neuralbench-a-unifying-framework-to-benchmark-neuroai-models/},
}

Third-Party Content

Third party content pulled from other locations are subject to their own licenses and you may have other legal obligations or restrictions that govern your use of that content.

License

neuralbench is MIT licensed, as found in the LICENSE file. Also check-out Meta Open Source Terms of Use and Privacy Policy.

Project details

These details have not been verified by PyPI

Project links

Release history Release notifications | RSS feed

This version

0.2.0

May 6, 2026

0.0.1

Feb 13, 2026

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

neuralbench-0.2.0.tar.gz (195.1 kB view details)

Uploaded May 6, 2026 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

neuralbench-0.2.0-py3-none-any.whl (289.5 kB view details)

Uploaded May 6, 2026 Python 3

File details

Details for the file neuralbench-0.2.0.tar.gz.

File metadata

Download URL: neuralbench-0.2.0.tar.gz
Upload date: May 6, 2026
Size: 195.1 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.12.8

File hashes

Hashes for neuralbench-0.2.0.tar.gz
Algorithm	Hash digest
SHA256	`8a2bb64537810d552ed4356868aed7932a9ceb55065d1ba474cc61a895b3fe90`
MD5	`073b7a992d8a5ed316471eebd3b6c053`
BLAKE2b-256	`e16b16f75867972dbf13081e864f49d6589bf86ef38032a179abba437f5bd3a7`

See more details on using hashes here.

File details

Details for the file neuralbench-0.2.0-py3-none-any.whl.

File metadata

Download URL: neuralbench-0.2.0-py3-none-any.whl
Upload date: May 6, 2026
Size: 289.5 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.12.8

File hashes

Hashes for neuralbench-0.2.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`a40e5d3e7d9afc6d483571c1407ea3dc1ad38a1852552181ebc67dd5dd9c0883`
MD5	`2ac2c282e632baf9ae386898367b1f5d`
BLAKE2b-256	`0754d6795b3eb436ef5478685de30895d4a0405befa2541e053e142b724d06a7`

See more details on using hashes here.

neuralbench 0.2.0

Navigation

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Project description

NeuralBench: Unified benchmark for NeuroAI models

Installation

Quick start

Running the full EEG benchmark

Weights & Biases setup

Benchmark suites

Available Tasks

Adding a new task

Adding a new model

(Advanced) Modifying the training loop

Contributing

Citing

Third-Party Content

License

Project details

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

File details

File metadata

File hashes