
RamanBench

PyPI Python 3.11–3.13 CI License: MIT arXiv Leaderboard

A large-scale benchmark for machine learning on Raman spectroscopy data.

74 datasets · 163 prediction targets · 28 baseline models · 4 application domains

RamanBench provides a reproducible evaluation protocol and a curated collection of public Raman spectroscopy datasets spanning Material Science, Biological, Medical, and Chemical applications. Researchers can rank new models against 28 pre-evaluated baselines — from classical PLS to tabular foundation models and Raman-specific deep learning architectures — without re-running all experiments.


Ecosystem

raman-data   ──▶  raman-bench  ──▶  Live Leaderboard
(datasets)        (this package)     HuggingFace Space
PyPI / GitHub     PyPI / GitHub
| Resource | Link |
|---|---|
| raman-data (dataset loader) | GitHub · PyPI |
| raman-bench (this package) | GitHub · PyPI |
| Live Leaderboard | huggingface.co/spaces/HTW-KI-Werkstatt/RamanBench |
| Paper | arXiv:2605.02003 |

Installation

Option 1 — Datasets + leaderboard (recommended starting point)

pip install raman-bench

This gives you:

  • All 74 datasets with standardised train/test splits via raman-data
  • Precomputed results for 28 baseline models (bundled CSVs, no internet needed)
  • Leaderboard API — rank, plot, and compare against baselines
  • Evaluation API — lb.evaluate_and_add(model) works with any sklearn-compatible model

You can use any ML library you already have installed — scikit-learn, LightGBM, XGBoost, PyTorch, JAX, or anything else — against a large-scale, curated data foundation without installing a single additional dependency.

Option 2 — With all built-in models

Adds all Raman-specific architectures and standalone tabular foundation models, all with a standard fit(X, y) / predict(X) interface:

pip install "raman-bench[models]"

This installs torch, tabpfn, pytabkit, tabdpt, sktime, and ramanspy on top of the core package. No AutoGluon required.

Option 3 — Full benchmark reproducibility (AutoGluon fork)

The paper's benchmark runs all models through AutoGluon's automated preprocessing and HPO pipeline, via a patched AutoGluon fork. The fork addresses two limitations of standard AutoGluon 1.5:

  1. Feature cap — AutoGluon caps tabular foundation models (TabPFN v2, TabICL, TabDPT, MITRA) at 500 features; Raman spectra typically have 500–4000 wavenumber points. The fork removes this cap.
  2. TabICL v2 regression — AutoGluon 1.5 ships TabICL v1, which supports classification only. The fork upgrades to TabICL v2, adding regression support. This limitation is expected to be resolved in AutoGluon 1.6.

To install the fork:

git clone https://github.com/ml-lab-htw/RamanBench.git
cd RamanBench
pip install -r requirements-autogluon-fork.txt
pip install -e ".[models]"

The fork is only needed to reproduce the exact paper benchmark. Options 1 and 2 work with a standard pip install and give full access to all datasets, splits, and built-in models.


Quick Start

Load a dataset (Option 1 — core install only)

from raman_data import raman_data

ds = raman_data("amino_acids_glycine")
print(ds.spectra.shape)      # (n_samples, n_wavenumbers)
print(ds.targets.shape)      # (n_samples,)
print(ds.raman_shifts[:5])   # wavenumber axis in cm⁻¹

All 74 datasets are available this way. Each comes with a fixed train/test split so results are directly comparable to the precomputed baselines.

Evaluate your model against 28 baselines (Option 1)

Any scikit-learn–compatible estimator works:

from raman_bench import Leaderboard
from sklearn.cross_decomposition import PLSRegression

lb = Leaderboard.from_precomputed()   # loads bundled v0.1 results

# Evaluates on all 74 datasets (3 seeds) and inserts into the ranking
results = lb.evaluate_and_add(
    model_name="My-PLS-10",
    model=PLSRegression(n_components=10),
)
print(lb.rank())
lb.plot()

Bring any library — LightGBM, XGBoost, a PyTorch model, a JAX model — and it will be scored on the same protocol as the 28 precomputed baselines.

Explore the precomputed leaderboard (Option 1)

from raman_bench import Leaderboard

lb = Leaderboard.from_precomputed()
print(lb.rank())          # ranked DataFrame
lb.plot()                 # horizontal bar chart

Use a built-in Raman model directly

All built-in models expose a standard sklearn fit / predict API:

import numpy as np
from raman_bench.models.custom import DeepCNNModel, TabPFNModel, RocketModel

X = np.random.randn(200, 512).astype("float32")  # 200 spectra, 512 wavenumbers
y = np.random.randn(200)                          # regression targets

# Raman-specific deep learning model
model = DeepCNNModel(n_epochs=50)
model.fit(X, y)
predictions = model.predict(X)

# Tabular foundation model (no feature-count limit)
tfm = TabPFNModel()
tfm.fit(X, y)
predictions = tfm.predict(X)

Run the full benchmark pipeline (fork required)

# Pre-cache all dataset splits (optional, speeds up the run)
python scripts/prepare_datasets.py --config configs/benchmark_v0.1.json

# Run predictions → metrics
raman-bench run --config configs/benchmark_v0.1.json

# Run individual steps
raman-bench run --config configs/benchmark_v0.1.json --step predictions
raman-bench run --config configs/benchmark_v0.1.json --step metrics

Notebooks

| Notebook | Description |
|---|---|
| 01_quick_start.ipynb | Load a dataset, explore the precomputed leaderboard, plot rankings |
| 02_benchmark_new_model.ipynb | Evaluate your own model and add it to the leaderboard |
| 03_explore_results.ipynb | Deep dive into per-dataset and per-domain results |
| 04_contribute_dataset.ipynb | Step-by-step guide to contributing a new dataset |

Models

Paper baselines (28 models)

All results in the paper were produced through the AutoGluon pipeline (Option 3 install).

| Category | Models |
|---|---|
| Classical spectroscopy | PLS, KNN, LR |
| Tree ensembles | GBM (LightGBM), XGB, CatBoost, RF, XT |
| Tabular deep learning | NN_TORCH, FastAI, RealMLP |
| Tabular foundation models | TabPFN v2, TabPFN v2.5, TabM, TabDPT, TabICL, MITRA |
| Time-series classifiers | ROCKET, Arsenal |
| Raman-specific DL | DeepCNN, RamanNet, SANet, RamanFormer, RamanTransformer, ReZeroNet, FC-ResNeXt, CoAtNet |
| AutoGluon ensemble | AUTOGLUON |

Standalone sklearn wrappers (raman-bench[models])

raman-bench[models] provides sklearn-compatible (fit / predict) wrappers for many of the same algorithm families, usable directly without AutoGluon or the fork. These are not the exact pipeline configurations from the paper (no AutoGluon preprocessing or HPO), but they use the same underlying algorithms and are well-suited for building and evaluating new models.

| Class | Algorithm | Requires |
|---|---|---|
| PLSModel | Partial Least Squares | — |
| DeepCNNModel | Raman-specific CNN | torch |
| RamanNetModel | Raman-specific CNN | torch |
| SANetModel | Spectral attention net | torch |
| RamanFormerModel | Raman transformer | torch |
| RamanTransformerModel | Raman transformer | torch |
| ReZeroNetModel | ReZero CNN | torch |
| FCResNeXtModel | FC-ResNeXt | torch |
| CoAtNetModel | Conv + attention | torch |
| RocketModel | ROCKET classifier | sktime |
| ArsenalModel | Arsenal classifier | sktime |
| TabPFNModel | TabPFN v2 | tabpfn |
| RealMLPModel | RealMLP-TD | pytabkit |
| TabMModel | TabM-D | pytabkit |
| TabDPTModel | TabDPT | tabdpt |

All classes support classification and regression and auto-detect the task from y. All package dependencies are included in raman-bench[models].
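The auto-detection rule can be approximated by inspecting the dtype of y: float targets imply regression, integer or string labels imply classification. A rough sketch of that heuristic (infer_task is a hypothetical helper, not the package's actual code):

```python
import numpy as np

def infer_task(y):
    """Guess the task from the target array:
    float dtype -> regression; integer or string labels -> classification."""
    y = np.asarray(y)
    if np.issubdtype(y.dtype, np.floating):
        return "regression"
    return "classification"

print(infer_task(np.array([0.1, 2.5, 3.0])))       # regression
print(infer_task(np.array([0, 1, 1, 2])))          # classification
print(infer_task(np.array(["healthy", "tumor"])))  # classification
```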


Benchmark Composition

Datasets

74 public Raman spectroscopy datasets from four application domains:

| Domain | Datasets | Task | Sources |
|---|---|---|---|
| Chemical | 37 | Regression | Zenodo, HuggingFace |
| Medical | 11 | Classification | Kaggle, Zenodo |
| Biological | 8 | Regression | HuggingFace, Zenodo |
| Material Science | 4 | Classification | RRUFF, Zenodo |

All datasets are accessible via pip install raman-data:

from raman_data import raman_data

dataset = raman_data("amino_acids_glycine")
X = dataset.spectra          # (n_samples, n_wavenumbers)
y = dataset.targets          # regression targets or class labels
w = dataset.raman_shifts     # wavenumber axis in cm⁻¹

Dataset catalog: raman-data on GitHub


Ranking Protocol

Models are evaluated under four complementary metrics:

| Metric | Description |
|---|---|
| Elo | Pairwise win-rate Elo calibrated to RF = 1000 (200-round bootstrap) |
| Score | Normalised per-dataset score: best model = 1, median model = 0 |
| Avg Rank | Average rank across all datasets and targets |
| Improvability | % gap to the best model, averaged across datasets |
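The Score metric can be illustrated with a small numeric sketch. Assuming lower-is-better per-dataset errors, an affine rescaling that sends the best model to 1 and the median model to 0 looks like this (the benchmark's actual implementation may differ in details such as tie handling):

```python
import numpy as np

def normalised_scores(errors):
    """Map per-dataset errors (lower is better) so the best model
    scores 1 and the median model scores 0."""
    errors = np.asarray(errors, dtype=float)
    best, median = errors.min(), np.median(errors)
    if median == best:  # degenerate case: best and median coincide
        return np.where(errors == best, 1.0, -np.inf)
    return (median - errors) / (median - best)

errs = np.array([0.10, 0.20, 0.30, 0.40, 0.50])  # five models on one dataset
print(normalised_scores(errs))  # best=0.10 -> 1.0, median=0.30 -> 0.0
```

Scores above 1 are impossible by construction, while models worse than the median land below 0, which keeps a single very bad model from distorting the scale for everyone else.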

See the live leaderboard for interactive filtering by model category, task type, and dataset domain.


Repository Structure

RamanBench/
├── src/raman_bench/
│   ├── leaderboard.py          # Leaderboard + model evaluation API
│   ├── benchmark.py            # Dataset loading and cross-validation
│   ├── predictions.py          # Prediction generation (benchmark step 1)
│   ├── evaluation.py           # Metric computation (benchmark step 2)
│   ├── model.py                # AutoGluon pipeline wrapper (fork required)
│   ├── config.py               # JSON config loader
│   ├── models/custom/          # All built-in Raman models (sklearn API)
│   │   ├── base.py             #   BaseRamanEstimator (shared training loop)
│   │   ├── deepcnn.py          #   DeepCNNModel
│   │   ├── ramannet.py         #   RamanNetModel
│   │   ├── sanet.py            #   SANetModel
│   │   ├── ramanformer.py      #   RamanFormerModel
│   │   ├── ramantransformer.py #   RamanTransformerModel
│   │   ├── rezeronet.py        #   ReZeroNetModel
│   │   ├── fcresnext.py        #   FCResNeXtModel
│   │   ├── coatnet.py          #   CoAtNetModel
│   │   ├── pls.py              #   PLSModel
│   │   ├── sktime_models.py    #   RocketModel, ArsenalModel
│   │   └── tabular_foundation.py # TabPFNModel, RealMLPModel, TabMModel, TabDPTModel
│   └── preprocessing/
│       ├── mixin.py            #   RamanPreprocessingMixin (AutoGluon HPO)
│       └── wrapped_models.py   #   Prep_* classes + SklearnAutoGluonBridge
├── configs/                    # Benchmark configuration files
├── data/precomputed/           # Bundled v0.1 results
├── notebooks/                  # Example Jupyter notebooks
├── scripts/                    # CLI scripts
└── tests/                      # pytest test suite

Architecture: two paths, one set of model classes

Custom models are implemented once as plain scikit-learn BaseEstimator subclasses. The same classes are used in both usage modes:

  Custom model (e.g. DeepCNNModel)
  BaseEstimator — no AutoGluon dependency
  fit(X, y) / predict(X)
        │
        ├─── Standalone path (pip install "raman-bench[models]")
        │      CUSTOM_MODELS["DEEPCNN"] → DeepCNNModel().fit(X, y)
        │
        └─── AutoGluon pipeline path (fork required)
               SklearnAutoGluonBridge._fit() → DeepCNNModel(**params).fit(X_np, y_np)
               Prep_DEEPCNN(_RamanDLBase, _DeepCNNBridge)

SklearnAutoGluonBridge lives in preprocessing/wrapped_models.py, the only file that imports AutoGluon. All model source files are AutoGluon-free.


Contributing

We welcome contributions of new models and datasets!

Adding a New Model

The simplest way to add a model is to implement it as a scikit-learn–compatible estimator and submit a pull request. No AutoGluon knowledge is required.

  1. Create src/raman_bench/models/custom/my_model.py:
import numpy as np
from sklearn.base import BaseEstimator

class MyModel(BaseEstimator):

    def __init__(self, n_components=10, lr=1e-3):
        self.n_components = n_components
        self.lr = lr

    def fit(self, X, y):
        # X: np.ndarray (n_samples, n_features)
        # y: np.ndarray — float → regression, int/str → classification
        ...
        return self

    def predict(self, X):
        ...  # return np.ndarray (n_samples,)

    def predict_proba(self, X):
        ...  # classification only, return (n_samples, n_classes)

For PyTorch-based models, inherit from BaseRamanEstimator in models/custom/base.py which provides a complete training loop with early stopping, cosine LR schedule, mixed-class augmentation, and batched inference.

  2. Register in src/raman_bench/models/custom/__init__.py:
from raman_bench.models.custom.my_model import MyModel

CUSTOM_MODELS["MYMODEL"] = MyModel
  3. Add tests in tests/models/test_my_model.py following the patterns in tests/models/test_sanet.py.

  4. Open a pull request — CI will run the full test suite automatically.

See CONTRIBUTING.md for the full guide, including how to optionally wire your model into the AutoGluon benchmark pipeline for full reproducibility.
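As a concrete instance of the skeleton in step 1, here is a tiny nearest-centroid classifier with the full fit / predict / predict_proba surface (purely illustrative; it is not a model shipped with raman-bench):

```python
import numpy as np
from sklearn.base import BaseEstimator, ClassifierMixin

class NearestCentroidModel(BaseEstimator, ClassifierMixin):
    """Toy classifier: predict the class whose mean spectrum is closest."""

    def fit(self, X, y):
        X, y = np.asarray(X, dtype=float), np.asarray(y)
        self.classes_ = np.unique(y)
        self.centroids_ = np.stack([X[y == c].mean(axis=0) for c in self.classes_])
        return self

    def _distances(self, X):
        X = np.asarray(X, dtype=float)
        # (n_samples, n_classes) Euclidean distances to each class centroid
        return np.linalg.norm(X[:, None, :] - self.centroids_[None, :, :], axis=2)

    def predict(self, X):
        return self.classes_[self._distances(X).argmin(axis=1)]

    def predict_proba(self, X):
        # softmax over negative distances as a crude probability surrogate
        logits = -self._distances(X)
        e = np.exp(logits - logits.max(axis=1, keepdims=True))
        return e / e.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (20, 8)), rng.normal(5, 1, (20, 8))])
y = np.array([0] * 20 + [1] * 20)
clf = NearestCentroidModel().fit(X, y)
print((clf.predict(X) == y).mean())  # well-separated clusters -> 1.0
print(clf.predict_proba(X).shape)    # (40, 2)
```

Note that all learned state uses the trailing-underscore convention (classes_, centroids_) and __init__ is omitted because the model has no hyperparameters, both of which keep the class compatible with sklearn's clone semantics.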

Adding a New Dataset

See CONTRIBUTING.md and NEW_DATASETS.md for detailed instructions and examples.

Quick summary:

  1. Upload your dataset to HuggingFace Datasets or Zenodo under CC BY 4.0.
  2. Add a loader to the raman-data package (open a PR there).
  3. Open an issue here linking to the raman-data PR.

The live leaderboard also has a "How to Contribute" section with step-by-step instructions.


Citation

If you use RamanBench in your research, please cite:

@misc{koddenbrock2026ramanbench,
  title         = {RamanBench: A Large-Scale Benchmark for Machine Learning on Raman Spectroscopy},
  author        = {Koddenbrock, Mario and Lange, Christoph and Legner, Robin and Jaeger, Martin
                   and K{\"o}gler, Martin and Cruz Bournazou, Mariano N. and Neubauer, Peter
                   and Bie{\ss}mann, Felix and Rodner, Erik},
  year          = {2026},
  eprint        = {2605.02003},
  archivePrefix = {arXiv},
  primaryClass  = {cs.LG},
  url           = {https://arxiv.org/abs/2605.02003}
}

License

MIT — see LICENSE.

Dataset licenses vary; see the dataset catalog or raman-data for per-dataset license information. Most datasets are released under CC BY 4.0.
