A simple and fast CLI for downloading, processing, and loading the THINGS-EEG2 dataset.
THINGS-EEG2 Raw Data Processing
This package provides tools for downloading, preprocessing raw THINGS-EEG2 EEG data, and generating image embeddings from various vision models.
[!WARNING] This repository builds upon the original data processing by Gifford et al. (2022). Please check out their original code and the corresponding paper.
We are in no way associated with the authors. Nonetheless, we hope that this makes things easier (pun intended) to use.
Installation
git clone git@github.com:ZEISS/things_eeg2_dataset.git
cd things_eeg2_dataset
uv sync
uv pip install --editable .
source .venv/bin/activate
things-eeg2 --help
things-eeg2 --install-completion
# Then restart your terminal.
# Example for zsh:
source ~/.zshrc
Usage
uv run things-eeg2 process \
--project-dir /path/to/project_dir \
--subjects 1 2 3 4 5 6 7 8 9 10

uv run things-eeg2 info \
--project-dir /path/to/project_dir \
--subject <EXAMPLE_SUBJECT> \
--session <EXAMPLE_SESSION> \
--data-index <EXAMPLE_INDEX>
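When processing is driven from a script rather than typed by hand, it can help to assemble the command programmatically. The following is a minimal sketch, not part of the package: `build_process_command` is a hypothetical helper, and it assumes the hyphenated `--project-dir` spelling, so confirm the option names with `things-eeg2 process --help` before relying on it.

```python
import shlex


def build_process_command(project_dir: str, subjects: list[int]) -> list[str]:
    """Build the argv list for `things-eeg2 process` (hypothetical helper)."""
    if not subjects:
        raise ValueError("at least one subject is required")
    return [
        "uv", "run", "things-eeg2", "process",
        # Assumption: the option is spelled with a hyphen.
        "--project-dir", project_dir,
        # Subjects are passed as space-separated values after one flag.
        "--subjects", *[str(s) for s in subjects],
    ]


cmd = build_process_command("/path/to/project_dir", list(range(1, 11)))
# Print the command as a copy-pasteable shell line.
print(shlex.join(cmd))
```

To actually execute the command, the list could be passed to `subprocess.run(cmd, check=True)`.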
Data Structure
To understand the directory layout that the CLI creates (and that preprocessing and loading rely on), see paths.py. All code treats the structure defined there as its ground truth.
Embedding Generation (embedding_processing/)
The package supports multiple state-of-the-art vision models for generating image embeddings:
| Model | Embedder Class | Description |
|---|---|---|
| open-clip-vit-h-14 | OpenClipViTH14Embedder | OpenCLIP ViT-H/14 (SDXL image encoder) |
| openai-clip-vit-l-14 | OpenAIClipVitL14Embedder | OpenAI CLIP ViT-L/14 |
| dinov2 | DinoV2Embedder | DINOv2 with registers (self-supervised) |
| ip-adapter | IPAdapterEmbedder | IP-Adapter Plus projections |
Each embedder generates:
- Pooled embeddings: Single vector per image (e.g., (1024,) for ViT-H-14)
- Full sequence embeddings: All tokens (e.g., (257, 1280) for ViT-H-14)
- Text embeddings: Corresponding text features from image captions
Output Files:
embeddings/
├── ViT-H-14_features_training.pt # Pooled embeddings
├── ViT-H-14_features_training_full.pt # Full token sequences
├── ViT-H-14_features_test.pt
└── ViT-H-14_features_test_full.pt
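The naming pattern in the listing above can be captured in a small helper that resolves the pooled and full-sequence files for a given model and split. This is a sketch only: `embedding_files` is a hypothetical function, not part of the package, and it assumes every model follows the same `<model>_features_<split>[_full].pt` pattern shown for ViT-H-14.

```python
from pathlib import Path


def embedding_files(embeddings_dir: str, model: str, split: str) -> dict[str, Path]:
    """Return the pooled and full-sequence file paths for one model and split."""
    if split not in ("training", "test"):
        raise ValueError(f"unknown split: {split!r}")
    base = Path(embeddings_dir)
    stem = f"{model}_features_{split}"
    return {
        "pooled": base / f"{stem}.pt",     # one vector per image
        "full": base / f"{stem}_full.pt",  # all tokens per image
    }


files = embedding_files("embeddings", "ViT-H-14", "training")
for kind, path in files.items():
    # Report whether each expected file is present on disk.
    print(kind, path.name, "exists" if path.exists() else "missing")
```

The `.pt` extension suggests the files are PyTorch-serialized, in which case `torch.load(path)` would read them; that is an assumption rather than something stated here.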
References
- Original THINGS-EEG2 paper and code
- Implementation based on: https://www.sciencedirect.com/science/article/pii/S1053811922008758
Using the dataloader
from things_eeg2_dataset.dataloader import ThingsEEGDataset
dataset = ThingsEEGDataset(
image_model="ViT-H-14",
data_path="/path/to/processed_data",
img_directory_training="/path/to/images/train",
img_directory_test="/path/to/images/test",
embeddings_dir="/path/to/embeddings",
train=True,
time_window=(0.0, 1.0),
)
See things_eeg2_dataloader/README.md for detailed usage.
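The `time_window` argument presumably selects a slice of each epoch in seconds. As an illustration of what such a selection involves, the sketch below converts a window into sample indices. Everything here is an assumption: `window_to_samples` is a hypothetical helper, and the sampling rate (`sfreq=250`) and epoch start (`t_min=-0.2`) are placeholder values, not values confirmed by the package. Check the dataloader README for the actual semantics.

```python
def window_to_samples(
    time_window: tuple[float, float], sfreq: float, t_min: float
) -> tuple[int, int]:
    """Map a (start, stop) window in seconds to sample indices in an epoch.

    `t_min` is the time of the epoch's first sample relative to stimulus onset.
    """
    start_s, stop_s = time_window
    if stop_s <= start_s:
        raise ValueError("time_window must have stop > start")
    start = round((start_s - t_min) * sfreq)
    stop = round((stop_s - t_min) * sfreq)
    if start < 0:
        raise ValueError("time_window starts before the epoch begins")
    return start, stop


# Placeholder values: 250 Hz sampling, epochs starting 0.2 s before onset.
start, stop = window_to_samples((0.0, 1.0), sfreq=250, t_min=-0.2)
print(start, stop)  # sample range to slice along the time axis
```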
Citation
We are happy users of the THINGS-EEG2 dataset, but not associated with the original authors. If you use this code, please cite the THINGS-EEG2 paper:
Gifford, A. T., Dwivedi, K., Roig, G., & Cichy, R. M. (2022). A large and rich EEG dataset for modeling human visual object recognition. NeuroImage, 264, 119754.
License
This project follows the original THINGS-EEG2 license terms.