A simple and fast CLI for downloading, processing, and loading the THINGS-EEG2 dataset.
Introduction
This package provides tools for downloading and preprocessing the raw THINGS-EEG2 data, and for generating image embeddings using various vision models.
[!WARNING] This repository builds upon the original data processing by Gifford et al. (2022). Please check out their original code and the corresponding paper.
We are in no way associated with the authors. Nonetheless, we hope that this makes things easier (pun intended) to use.
Installation
CLI-only
If you only need the CLI functionality, you can run it using one line of code:
Using the PyPI package (with uv)
uvx --from things_eeg2_dataset things-eeg2
Using the conda package (with pixi)
pixi exec --with things_eeg2_dataset things-eeg2
From GitHub
git clone git@github.com:ZEISS/things_eeg2_dataset.git
cd things_eeg2_dataset
uv sync
uv pip install --editable .
source .venv/bin/activate
things-eeg2 --help
things-eeg2 --install-completion
# Then restart your shell
# Example for zsh:
source ~/.zshrc
From PyPI
# Using uv
uv init
uv add things_eeg2_dataset
source .venv/bin/activate
things-eeg2 --help
things-eeg2 --install-completion
# Then restart your shell
# Example for zsh:
source ~/.zshrc
Using the conda package
# Using pixi
pixi init
pixi add things_eeg2_dataset
pixi shell
things-eeg2 --help
things-eeg2 --install-completion
# Then restart your shell
# Example for zsh:
source ~/.zshrc
Usage
Data Structure
To understand the directory layout that the CLI creates, refer to paths.py. It is the single source of truth for the data structure used throughout the project.
Embedding Generation (embedding_processing/)
The package supports multiple state-of-the-art vision models for generating image embeddings:
| Model | Embedder Class | Description |
|---|---|---|
| open-clip-vit-h-14 | OpenClipViTH14Embedder | OpenCLIP ViT-H/14 (SDXL image encoder) |
| openai-clip-vit-l-14 | OpenAIClipVitL14Embedder | OpenAI CLIP ViT-L/14 |
| dinov2 | DinoV2Embedder | DINOv2 with registers (self-supervised) |
| ip-adapter | IPAdapterEmbedder | IP-Adapter Plus projections |
Each embedder generates:
- Pooled embeddings: Single vector per image (e.g., (1024,) for ViT-H-14)
- Full sequence embeddings: All tokens (e.g., (257, 1280) for ViT-H-14)
- Text embeddings: Corresponding text features from image captions
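The 257 in the full-sequence shape follows from standard ViT patch arithmetic. The sketch below reproduces it, assuming a 224x224 input image and a single class token, which is the usual ViT-H/14 configuration; the package itself may use a different resolution. The helper `vit_token_count` is hypothetical and not part of this package.

```python
def vit_token_count(image_size: int, patch_size: int, class_tokens: int = 1) -> int:
    """Return the number of tokens a ViT produces for a square image.

    Hypothetical helper for illustration only; not part of things_eeg2_dataset.
    """
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    patches_per_side = image_size // patch_size
    # One token per patch, plus the class token(s) prepended to the sequence.
    return patches_per_side * patches_per_side + class_tokens


# ViT-H/14 with an assumed 224x224 input: 16 * 16 patches + 1 class token.
tokens = vit_token_count(image_size=224, patch_size=14)
full_shape = (tokens, 1280)  # 1280 is the token width quoted above
pooled_shape = (1024,)       # pooled size quoted above

print(full_shape)    # (257, 1280)
print(pooled_shape)  # (1024,)
```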
Output Files:
embeddings/
├── ViT-H-14_features_training.safetensors # Pooled embeddings
├── ViT-H-14_features_training_full.safetensors # Full token sequences
├── ViT-H-14_features_test.safetensors
└── ViT-H-14_features_test_full.safetensors
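The file names above follow a regular pattern, so the two files for a given model and partition can be located programmatically. This is a minimal sketch that only mirrors the listing shown here; `embedding_paths` is a hypothetical helper, not a function of this package, and the authoritative layout lives in paths.py.

```python
from pathlib import Path


def embedding_paths(embeddings_dir, model="ViT-H-14", partition="training"):
    """Return (pooled_path, full_path) following the naming pattern listed above.

    Hypothetical helper for illustration; the package's own paths.py is authoritative.
    """
    if partition not in {"training", "test"}:
        raise ValueError("partition must be 'training' or 'test'")
    root = Path(embeddings_dir)
    stem = f"{model}_features_{partition}"
    # Pooled embeddings use the bare stem; full token sequences add "_full".
    return root / f"{stem}.safetensors", root / f"{stem}_full.safetensors"


pooled, full = embedding_paths("embeddings", partition="test")
print(pooled.name)  # ViT-H-14_features_test.safetensors
print(full.name)    # ViT-H-14_features_test_full.safetensors
```

The resulting paths can then be opened with any safetensors reader.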
Using the dataloader
from things_eeg2_dataset.dataloader import ThingsEEGDataset

dataset = ThingsEEGDataset(
    image_model="ViT-H-14",
    data_path="/path/to/processed_data",
    img_directory_training="/path/to/images/train",
    img_directory_test="/path/to/images/test",
    embeddings_dir="/path/to/embeddings",
    train=True,
    time_window=(0.0, 1.0),
)
See things_eeg2_dataloader/README.md for detailed usage.
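To build intuition for the `time_window` argument, the sketch below shows one way a window could map onto sample indices of an epoch. It assumes the window is given in seconds relative to stimulus onset. The sampling rate and epoch start used here are illustrative values, not settings confirmed by this package, and `window_to_slice` is a hypothetical helper.

```python
def window_to_slice(time_window, sfreq, epoch_start):
    """Convert a (start, stop) window in seconds into a slice over epoch samples.

    Hypothetical helper; sfreq (Hz) and epoch_start (s) are assumptions that
    must match how the epochs were actually preprocessed.
    """
    start, stop = time_window
    if stop <= start:
        raise ValueError("time_window must satisfy start < stop")
    first = round((start - epoch_start) * sfreq)
    last = round((stop - epoch_start) * sfreq)
    if first < 0:
        raise ValueError("time_window starts before the epoch begins")
    return slice(first, last)


# Illustrative values only: epochs starting 0.2 s before onset, sampled at 250 Hz.
samples = window_to_slice((0.0, 1.0), sfreq=250.0, epoch_start=-0.2)
print(samples)  # slice(50, 300, None)
```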
References & Citation
We are happy users of the THINGS-EEG2 dataset, but not associated with the original authors. If you use this code, please cite the THINGS-EEG2 paper:
Gifford, A. T., Lahner, B., Saba-Sadiya, S., Vilas, M. G., Lascelles, A., Oliva, A., ... & Cichy, R. M. (2022). The THINGS-EEG2 dataset. Scientific Data.