RamanBench
A large-scale benchmark for machine learning on Raman spectroscopy data.
74 datasets · 163 prediction targets · 28 baseline models · 4 application domains
RamanBench provides a reproducible evaluation protocol and a curated collection of public Raman spectroscopy datasets spanning Material Science, Biological, Medical, and Chemical applications. Researchers can rank new models against 28 pre-evaluated baselines — from classical PLS to tabular foundation models and Raman-specific deep learning architectures — without re-running all experiments.
Ecosystem
raman-data   ──▶  raman-bench     ──▶  Live Leaderboard
(datasets)        (this package)       (HuggingFace Space)
PyPI / GitHub     PyPI / GitHub
| Resource | Link |
|---|---|
| raman-data (dataset loader) | GitHub · PyPI |
| raman-bench (this package) | GitHub · PyPI |
| Live Leaderboard | huggingface.co/spaces/ml-lab-htw/RamanBench |
| Paper | arXiv TBD |
Quick Start
Installation
# Core package (leaderboard + dataset loading, no heavy dependencies)
pip install raman-bench
For running the full benchmark (AutoGluon + deep learning models), RamanBench requires a patched AutoGluon fork. The official AutoGluon release caps tabular foundation models (TabPFN v2, TabICL, TabDPT, MITRA, …) at 500 features and silently skips them on larger datasets; Raman spectra typically have 500–4000 wavenumber points. The fork removes this cap. Install it first:
git clone https://github.com/ml-lab-htw/RamanBench.git
cd RamanBench
pip install -r requirements-autogluon-fork.txt
pip install "raman-bench[deep]"
Explore the precomputed leaderboard
from raman_bench import Leaderboard
# Load v0.1 results: 28 models × 74 datasets
lb = Leaderboard.from_precomputed()
print(lb.rank()) # ranked DataFrame
lb.plot() # horizontal bar chart
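Since rank() returns a pandas DataFrame (per the comment above), ordinary pandas operations apply to it. A sketch of post-processing the ranking; the "model" and "elo" column names here are illustrative assumptions, not the documented raman-bench schema:

```python
import pandas as pd

# Stand-in for lb.rank(); the "model" / "elo" columns are hypothetical,
# chosen only to illustrate pandas post-processing of a ranking table.
ranking = pd.DataFrame({
    "model": ["TabPFN v2", "AUTOGLUON", "RF", "PLS"],
    "elo": [1110, 1085, 1000, 930],
})

# Select the two highest-rated models.
top2 = ranking.sort_values("elo", ascending=False).head(2)
print(top2["model"].tolist())  # ['TabPFN v2', 'AUTOGLUON']
```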
Evaluate a new model
from raman_bench import Leaderboard
from sklearn.cross_decomposition import PLSRegression
lb = Leaderboard.from_precomputed()
# Evaluates your model on all 74 datasets (3 seeds) and adds it to the ranking
results = lb.evaluate_and_add(
    model_name="My-PLS-10",
    model=PLSRegression(n_components=10),
)
print(lb.rank())
lb.plot()
Run the full benchmark pipeline
# 1. Clone, install the AutoGluon fork, then install in development mode
git clone https://github.com/ml-lab-htw/RamanBench.git
cd RamanBench
pip install -r requirements-autogluon-fork.txt
pip install -e ".[deep]"
# 2. Pre-cache all dataset splits (optional, speeds up the run)
python scripts/prepare_datasets.py --config configs/benchmark_v0.1.json
# 3. Run predictions → metrics
raman-bench run --config configs/benchmark_v0.1.json
# 4. Run a single step
raman-bench run --config configs/benchmark_v0.1.json --step predictions
raman-bench run --config configs/benchmark_v0.1.json --step metrics
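The schema of configs/benchmark_v0.1.json is defined by the repository; as a rough, hypothetical sketch of what such a config plausibly contains (field names are guesses — check the shipped file), building on the model/dataset list files under configs/ and the 3-seed protocol mentioned above:

```json
{
  "models": "configs/models/all.json",
  "datasets": "configs/datasets/regression_all.json",
  "seeds": [0, 1, 2],
  "output_dir": "data/precomputed"
}
```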
Notebooks
| Notebook | Description |
|---|---|
| 01_quick_start.ipynb | Load a dataset, explore the precomputed leaderboard, plot rankings |
| 02_benchmark_new_model.ipynb | Evaluate your own model and add it to the leaderboard |
| 03_explore_results.ipynb | Deep dive into per-dataset and per-domain results |
| 04_contribute_dataset.ipynb | Step-by-step guide to contributing a new dataset |
Benchmark Composition
Datasets
74 public Raman spectroscopy datasets from four application domains:
| Domain | Datasets | Task | Sources |
|---|---|---|---|
| Chemical | 37 | Regression | Zenodo, HuggingFace |
| Medical | 11 | Classification | Kaggle, Zenodo |
| Biological | 8 | Regression | HuggingFace, Zenodo |
| Material Science | 4 | Classification | RRUFF, Zenodo |
All datasets are accessible via pip install raman-data:
from raman_data import raman_data
dataset = raman_data("amino_acids_glycine")
X = dataset.spectra # (n_samples, n_wavenumbers)
y = dataset.targets # regression targets or class labels
w = dataset.raman_shifts # wavenumber axis in cm⁻¹
Dataset catalog: raman-data on GitHub
Models (v0.1 — 28 baselines)
Classical ML / Spectroscopy
- PLS (partial least squares)
- KNN, LR, RF, XT, GBM (LightGBM), XGB (XGBoost), CatBoost
Tabular Deep Learning
- NN_TORCH, FastAI, RealMLP
Tabular Foundation Models
- TabPFN v2, TabPFN v2.5, MITRA, TabM, TabDPT, TabICL
Time-Series / Spectral Classifiers
- ROCKET, ARSENAL
Raman-Specific Neural Networks
- DeepCNN (Liu et al., 2017)
- RamanNet (Ibtehaz et al., 2023)
- SANet (Deng et al., 2021)
- RamanFormer (Koyun et al., 2024)
- RamanTransformer (Liu et al., 2023)
- ReZeroNet, FC-ResNeXt, CoAtNet (Lange et al., 2025)
Ensemble
- AUTOGLUON (AutoGluon weighted ensemble)
Ranking Protocol
Models are evaluated under four complementary metrics:
| Metric | Description |
|---|---|
| Elo | Pairwise win-rate Elo calibrated to RF = 1000 (200-round bootstrap) |
| Score | Normalised per-dataset score: best model = 1, median model = 0 |
| Avg Rank | Average rank across all datasets and targets |
| Improvability | % gap to the best model, averaged across datasets |
See the live leaderboard for interactive filtering by model category, task type, and dataset domain.
Repository Structure
RamanBench/
├── src/raman_bench/ # Python package (install via pip)
│ ├── benchmark.py # Dataset loading and caching
│ ├── model.py # AutoGluon wrapper
│ ├── evaluation.py # Metric computation (Step 2)
│ ├── predictions.py # Prediction generation (Step 1)
│ ├── leaderboard.py # Leaderboard + model evaluation
│ ├── config.py # JSON config loader
│ ├── preprocessing/ # Raman preprocessing pipeline
│ ├── metrics/ # Classification + regression metrics
│ └── models/custom/ # 9 Raman-specific architectures
├── configs/ # Benchmark configuration files
│ ├── benchmark_v0.1.json
│ ├── models/ # Model lists (all, raman, traditional, foundation)
│ └── datasets/ # Dataset lists (regression_all, classification_all)
├── data/precomputed/ # Bundled v0.1 results (CSVs + dataset_stats.json)
├── notebooks/ # Example Jupyter notebooks
├── scripts/ # CLI scripts (run_benchmark.py, prepare_datasets.py)
├── tests/ # pytest test suite
└── docs/ # Sphinx documentation
Contributing
We welcome contributions of new models and datasets!
Adding a New Model
See CONTRIBUTING.md.
Quick summary:
- Implement your model as an AutoGluon AbstractModel subclass (or use the BaseCustomModel shared training loop).
- Register it in configs/models/.
- Add tests in tests/models/.
Adding a New Dataset
See CONTRIBUTING.md and NEW_DATASETS.md for detailed instructions and examples.
Quick summary:
- Upload your dataset to HuggingFace Datasets or Zenodo under CC BY 4.0.
- Add a loader to the raman-data package (open a PR there).
- Open an issue here linking to the raman-data PR.
The live leaderboard also has a "How to Contribute" section with step-by-step instructions.
Citation
If you use RamanBench in your research, please cite:
@inproceedings{koddenbrock2026ramanbench,
title = {RamanBench: A Large-Scale Benchmark for Machine Learning on Raman Spectroscopy Data},
author = {Koddenbrock, Mario and Lange, Christoph and others},
booktitle = {Advances in Neural Information Processing Systems (NeurIPS)},
year = {2026},
url = {https://arxiv.org/abs/TBD}
}
License
MIT — see LICENSE.
Dataset licenses vary; see the dataset catalog or raman-data for per-dataset license information. Most datasets are released under CC BY 4.0.