Creates a neuroimaging cohort by aggregating data across datasets.

Cohort creator

TL;DR

Creates a neuroimaging cohort by aggregating data across datasets.

Command line tool to:

  • install a set of BIDS datalad datasets,
  • get the data for a set of participants,
  • copy the data to a new directory structure to create a "cohort".
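As a rough illustration of the three steps above, the sketch below builds equivalent `datalad` and copy commands for one dataset without running them. It is not how the cohort creator is implemented: the helper name `plan_cohort_steps`, the example URL, and the directory layout are all assumptions made for illustration.

```python
from pathlib import Path


def plan_cohort_steps(dataset_url, dataset_name, subjects, work_dir, cohort_dir):
    """Return the commands for install, get and copy as lists of strings.

    Hypothetical helper: the commands are only built, nothing is executed.
    """
    dataset_path = Path(work_dir) / dataset_name
    commands = []
    # 1. install the datalad dataset (file content is not downloaded yet)
    commands.append(["datalad", "install", "-s", dataset_url, str(dataset_path)])
    for sub in subjects:
        source = dataset_path / sub
        # 2. get the file content for one participant
        commands.append(["datalad", "get", "-d", str(dataset_path), str(source)])
        # 3. copy the participant folder into the cohort, following symlinks
        #    (assumes the parent folder of the target already exists)
        target = Path(cohort_dir) / dataset_name / sub
        commands.append(["cp", "-RL", str(source), str(target)])
    return commands


# the URL below is a placeholder, not a real dataset
for command in plan_cohort_steps(
    "https://example.org/ds000001.git", "ds000001", ["sub-01"], "work", "cohort"
):
    print(" ".join(command))
```

Building the commands as lists makes it easy to review them before passing each one to `subprocess.run`.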

It takes 2 files as input that should list:

  • datasets to be included in the cohort
  • subjects in each dataset to be included in the cohort

Both of those files can be generated by the neurobagel query tool.

For examples of input TSV files, see this page.

It outputs the cohort following the recommendations from the BIDS extension proposal 35.

Requirements

Operating system

It is recommended to use this package on Linux or macOS.

If you are on Windows, try using WSL (Windows Subsystem for Linux) to run this package: Windows does not handle symbolic links well, and this package relies on symlinks. If you decide to go ahead anyway, make sure you have a LOT of disk space available.

More information here
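Since the package relies on symlinks, it can be useful to check whether your current environment lets you create them at all. The following is a minimal sketch using only the Python standard library; the function name `symlinks_supported` is made up for this example and is not part of the cohort creator.

```python
import os
import tempfile
from pathlib import Path


def symlinks_supported():
    """Return True if a symbolic link can be created in a temporary folder."""
    with tempfile.TemporaryDirectory() as tmp:
        target = Path(tmp) / "target.txt"
        target.write_text("content")
        link = Path(tmp) / "link.txt"
        try:
            os.symlink(target, link)
        except (OSError, NotImplementedError):
            # typical on Windows without developer mode or admin rights
            return False
        return link.is_symlink()


print("symlinks supported:", symlinks_supported())
```

This only tests the temporary folder's file system; the result may differ on the drive where you plan to store the cohort.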

Python dependencies

Make sure you have the following installed:

  • datalad and its dependencies:

    • if you have anaconda / conda, it should be 'just' a matter of running

      conda install -c conda-forge datalad
      
    • But check the installation instructions for more details.

Other dependencies are listed in the pyproject.toml file.
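To verify that datalad and the tools it depends on are reachable before creating a cohort, a quick check along these lines may help. This is only a sketch: the function name `missing_tools` is hypothetical, and the default list assumes that datalad needs `git` and `git-annex` on the PATH.

```python
import shutil


def missing_tools(tools=("datalad", "git", "git-annex")):
    """Return the names of the executables that cannot be found on the PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


missing = missing_tools()
if missing:
    print("Not found on PATH:", ", ".join(missing))
else:
    print("datalad, git and git-annex were all found.")
```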

Installation

pip install cohort_creator

Installation from source

git clone https://github.com/neurodatascience/cohort_creator.git
cd cohort_creator
pip install .

Limitations

Cohorts can only be created by aggregating data from open BIDS datasets curated with datalad.

Dataset types

It is only possible to get data from the following dataset types:

  • raw
  • mriqc
  • fmriprep

It is not yet possible to get freesurfer data via the cohort creator, though the data is available in the sourcedata folder of the fmriprep datasets.
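If you script around the cohort creator, you may want to reject unsupported dataset types early. The sketch below shows one way to do that; the constant and function names are assumptions for illustration and do not come from the package.

```python
# dataset types listed as supported in this README
SUPPORTED_DATASET_TYPES = ("raw", "mriqc", "fmriprep")


def check_dataset_types(requested):
    """Return the requested types in lower case, or raise if one is unsupported."""
    normalized = [dataset_type.lower() for dataset_type in requested]
    unsupported = [t for t in normalized if t not in SUPPORTED_DATASET_TYPES]
    if unsupported:
        raise ValueError(
            f"Unsupported dataset type(s): {unsupported}. "
            f"Choose from {list(SUPPORTED_DATASET_TYPES)}."
        )
    return normalized


print(check_dataset_types(["raw", "fmriprep"]))
```

Requesting `freesurfer` with this helper raises a `ValueError`, in line with the limitation described above.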

Blind spots

Some metadata files (JSON, TSV) may not be accessed correctly if they are neither in the root of the dataset nor in the same folder as the data file.
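BIDS allows metadata files to sit at intermediate levels of the folder hierarchy (for example in a subject folder), which is the case this blind spot concerns. A small sketch like the one below could help you spot such files in a dataset; the function name `metadata_in_expected_place` and the example paths are hypothetical.

```python
from pathlib import Path


def metadata_in_expected_place(dataset_root, data_file, metadata_file):
    """Return True if the metadata file is in the dataset root
    or in the same folder as the data file."""
    metadata_folder = Path(metadata_file).parent
    return metadata_folder in (Path(dataset_root), Path(data_file).parent)


data = "ds/sub-01/func/sub-01_task-rest_bold.nii.gz"
for metadata in (
    "ds/task-rest_bold.json",  # dataset root
    "ds/sub-01/func/sub-01_task-rest_bold.json",  # next to the data file
    "ds/sub-01/sub-01_task-rest_bold.json",  # intermediate level
):
    print(metadata, metadata_in_expected_place("ds", data, metadata))
```

Only the last path is reported as being outside the two expected locations.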

Contributors ✨

Thanks go to these wonderful people (emoji key):

  • Michelle Wang: 🐛 🤔 📓
  • Remi Gau: ⚠️ 🚧 📖 🐛 💻

This project follows the all-contributors specification. Contributions of any kind welcome!
