SM-RS benchmark: data loaders and canonical task evaluators for the single- and multi-objective recommendations dataset.

SM-RS

The single- and multi-objective recommendations benchmark — self-declared user propensities (relevance · diversity · novelty · exploration) linked to contextual impressions, item selections, and perceived quality.

SM-RS is, to our knowledge, the only public recommender-systems dataset linking users' self-declared propensities toward beyond-accuracy objectives with contextual impressions, item selections, and explicit perceived-quality judgments. This repository is the benchmark code: data loaders and the canonical evaluator for each task, so everyone reports comparable numbers. The data lives on Hugging Face; the leaderboard lives in the dataset card.

Dataset & leaderboard: 🤗 pdokoupil/SM-RS

Cite BOTH the SM-RS 2.0 (TORS'26) and SM-RS (SIGIR'24) papers; see Citing.

Install

```
pip install sm-rs                 # core (numpy / pandas / scikit-learn)
pip install "sm-rs[lstm]"         # + TensorFlow, for the optional LSTM baseline
```

Quick start (Task 1: propensity estimation)

```
import numpy as np
from smrs.tasks import task1_propensity as t1

# reproducible 80/20 split (seed 2024), reading a local SM-RS copy for now
X_train, X_test, y_train, y_test = t1.split(data_dir="path/to/sm-rs")

# ... train your estimator, produce an (N, 4) array of [rel, div, nov, exp] ...
predictions = np.full((len(y_test), 4), 0.25)        # placeholder

print(t1.evaluate(predictions, y_test))   # {'MAE': ..., 'MSE': ..., 'KLDiv': ...}
```
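
For orientation, the three Task 1 metrics can be sketched in a few lines of NumPy. This is an illustrative approximation, not the package's implementation: the function name `propensity_metrics`, the clipping constant, and the KL direction (true ‖ predicted) are all assumptions. Use `t1.evaluate` for any number you report.

```python
import numpy as np

def propensity_metrics(pred, true, eps=1e-12):
    """Illustrative MAE / MSE / KL between two (N, 4) propensity arrays."""
    pred = np.asarray(pred, dtype=float)
    true = np.asarray(true, dtype=float)
    mae = np.abs(pred - true).mean()
    mse = ((pred - true) ** 2).mean()
    # Clip and renormalise each row so both sides are valid distributions.
    p = np.clip(true, eps, None)
    p = p / p.sum(axis=1, keepdims=True)
    q = np.clip(pred, eps, None)
    q = q / q.sum(axis=1, keepdims=True)
    # Assumed direction: KL(true || pred), averaged over users.
    kl = np.sum(p * np.log(p / q), axis=1).mean()
    return {"MAE": float(mae), "MSE": float(mse), "KLDiv": float(kl)}

# Two made-up users scored against the uniform placeholder prediction.
y_true = np.array([[0.4, 0.3, 0.2, 0.1],
                   [0.25, 0.25, 0.25, 0.25]])
uniform = np.full_like(y_true, 0.25)
scores = propensity_metrics(uniform, y_true)
print(scores)
```

The uniform prediction is a convenient sanity baseline: any trained estimator should score lower on all three metrics.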

The six tasks

All six draw on the same dataset; they differ in what you predict and how it's scored. The canonical evaluator for each lives in smrs.tasks.*.

| # | Task | You produce | Metric(s) | Evaluator |
|---|---|---|---|---|
| 1 | Propensity estimation | a 4-vector propensity per user | MAE · MSE · KL | ✅ `task1_propensity` |
| 2 | Results proportionality | a top-k list matching target propensities | MAE · KL · wSUM · Pearson ρ | ✅ `task2_proportionality`¹ |
| 3 | Selections-aware reranking | a reranked impression list | nDCG@10 · Precision@5 | ✅ `task3_reranking` |
| 4 | Diversity-metric definition | per-list diversity values | MAE · MSE · KL | ✅ `task4_diversity` |
| 5 | Perceived quality (5.1 rel / 5.2 div / 5.3 nov / 5.4 ser) | per-objective perception | MAE · MSE · Kendall τ | ✅ `task5_perceived` |
| 6 | Satisfaction (6.1 / 6.2) | overall satisfaction | MAE · MSE · Kendall τ | ✅ `task6_satisfaction` |

¹ Task 2's metric layer is implemented. Turning a top-k list into achieved objective proportions needs the derived matrices (see Data); that step will ship together with the source→rating-matrix builder.
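
A minimal sketch of the two Task 3 metrics, assuming binary relevance where the items a user selected are the relevant ones. The helper names and this relevance definition are assumptions for illustration; the canonical definitions are whatever `task3_reranking` implements.

```python
import numpy as np

def precision_at_k(ranked_items, selected, k=5):
    """Share of the top-k positions occupied by selected items (divides by k)."""
    return sum(item in selected for item in ranked_items[:k]) / k

def ndcg_at_k(ranked_items, selected, k=10):
    """Binary-gain nDCG: DCG of the ranking over DCG of an ideal ranking."""
    gains = np.array([1.0 if item in selected else 0.0
                      for item in ranked_items[:k]])
    # Position i (1-based) is discounted by log2(i + 1).
    discounts = 1.0 / np.log2(np.arange(2, gains.size + 2))
    dcg = float(np.sum(gains * discounts))
    ideal_hits = min(len(selected), k)
    idcg = float(np.sum(1.0 / np.log2(np.arange(2, ideal_hits + 2))))
    return dcg / idcg if idcg > 0 else 0.0

# Made-up impression list; the user selected items "a" and "c".
ranked = ["a", "b", "c", "d", "e", "f"]
selected = {"a", "c"}
print(precision_at_k(ranked, selected))   # 0.4
print(ndcg_at_k(ranked, selected))
```

Moving "c" up to second position would make the ranking ideal and push nDCG to 1.0, which is the effect a selections-aware reranker is trying to achieve.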

Data

Two layers:

Core tables (the collected study data, CC-BY, hosted on Hugging Face): behaviors, propensities, objective_perceptions, criteria_values, comparative_diversity, users, movies, books. Items are referenced by ID.

The core tables are downloaded from the Hub automatically and cached, so no manual download is needed:
```
from smrs import data
df = data.load("propensities")                 # downloads from HF, cached
# offline / local copy (e.g. OSF download):
df = data.load("propensities", data_dir="path/to/sm-rs")   # or set $SMRS_DATA_DIR
```
Users of the 🤗 datasets library can equivalently call `load_dataset("pdokoupil/SM-RS", "behaviors")`.

Derived matrices (recomputed locally, not downloaded). The list-scoring tasks (2, 3) need per-item and per-pair artifacts: relevance from an item-item model, intra-list diversity from a distance matrix, and novelty from item popularity. Rather than ship multi-GB blobs, or redistribute the third-party catalogs they come from, the benchmark recomputes them from a rating matrix that you build from your own download of the public source datasets. For movies these are MovieLens 25M (the "Latest" snapshot at collection time) plus MovieLens Tag Genome 2021; for books, goodbooks-10k:
```
from smrs import derived
art = derived.build_artifacts(rating_matrix_movies)   # {item_item, distance_matrix, mean_popularities}
```
`derived` provides the deterministic pieces: `popularity`, `cosine_distance` (1 − cosine over item rating-vectors), and `ease_item_item` (EASE^R closed form, used as the relevance model). This keeps the benchmark lightweight and license-clean. The bit-exact original artifacts are archived on OSF (v2) for strict reproduction.
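
The three deterministic pieces can be sketched in NumPy as follows. The function names mirror those listed above, but the signatures, the dense-matrix representation, and the value `lam=100.0` are assumptions for illustration (the pinned EASE λ is not stated here). The package's own `derived` module remains the reference.

```python
import numpy as np

def popularity(X):
    """Fraction of users with positive feedback on each item."""
    return X.mean(axis=0)

def cosine_distance(X, eps=1e-12):
    """1 - cosine similarity between the item columns of X."""
    norms = np.linalg.norm(X, axis=0)
    sim = (X.T @ X) / np.maximum(np.outer(norms, norms), eps)
    return 1.0 - sim

def ease_item_item(X, lam=100.0):
    """EASE^R closed form: B_ij = -P_ij / P_jj with a zero diagonal,
    where P = (X^T X + lam * I)^-1. lam is a placeholder value."""
    G = X.T @ X + lam * np.eye(X.shape[1])
    P = np.linalg.inv(G)
    B = P / (-np.diag(P))      # broadcasting divides column j by -P_jj
    np.fill_diagonal(B, 0.0)   # an item must not recommend itself
    return B

# Toy binary matrix: 4 users x 3 items (made-up data).
X = np.array([[1, 1, 0],
              [1, 0, 1],
              [0, 1, 1],
              [1, 1, 0]], dtype=float)
pop = popularity(X)
D = cosine_distance(X)
B = ease_item_item(X)
print(pop, D.shape, B.shape)
```

Scoring a user with EASE is then a single product, `X[u] @ B`, which is why the item-item matrix can serve as the relevance model for list scoring.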

Get the source datasets with the bundled fetcher, which downloads them from the official hosts (GroupLens, the goodbooks repo) under their licenses. Alternatively, place the files in the destination directory manually:
```
smrs-fetch --list                      # show sources + licenses, no download
smrs-fetch --dest ./sm-rs-sources      # download MovieLens 25M, Tag Genome 2021, goodbooks-10k
```

Why recompute? MovieLens may not be redistributed, and a 5 GB download hurts adoption. You obtain MovieLens/goodbooks under their own licenses; we ship only the study data, the id-maps (movies.json: movieId→imdbId; books.json: book_index→goodreads_id), and the recompute code. The canonical scorer is pinned (sources above; positive feedback = rating ≥ 3; EASE λ) so results stay comparable. For strict reproduction of the paper's numbers, use the OSF artifacts.
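
As an illustration of the pinned positive-feedback rule (rating ≥ 3), the sketch below turns (user, item, rating) triples into a binary user × item matrix. `build_rating_matrix` is a hypothetical helper, not part of the package, and it uses a dense array for readability; a dataset the size of MovieLens 25M would need a sparse representation.

```python
import numpy as np

def build_rating_matrix(user_ids, item_ids, ratings, threshold=3.0):
    """Binary user x item matrix: 1.0 where rating >= threshold, else 0.0."""
    # np.unique maps raw ids to contiguous row / column indices.
    users, u_idx = np.unique(np.asarray(user_ids), return_inverse=True)
    items, i_idx = np.unique(np.asarray(item_ids), return_inverse=True)
    X = np.zeros((users.size, items.size), dtype=np.float32)
    positive = np.asarray(ratings, dtype=float) >= threshold
    X[u_idx[positive], i_idx[positive]] = 1.0
    return X, users, items

# Made-up triples: only ratings of 3.0 or more count as positive feedback.
X, users, items = build_rating_matrix(
    user_ids=[10, 10, 20, 30],
    item_ids=[5, 7, 5, 7],
    ratings=[4.0, 2.5, 3.0, 1.0],
)
print(X)
print(users, items)
```

Returning the sorted id arrays alongside the matrix keeps the row and column order explicit, which matters when joining back to the shipped id-maps.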

Reproduction check

examples/reproduce_perceived.py reproduces the paper's Linear Regression baseline for Tasks 5 & 6 (the one baseline needing no derived/source data), scored with this package's evaluators. MAE matches the paper exactly in every row; MSE and Kendall τ agree to within 0.006, except for the 5.2 MSE (0.080 vs 0.061):

| subtask | MAE (ours / paper) | MSE (ours / paper) | Kendall τ (ours / paper) |
|---|---|---|---|
| 5.1 relevance | 0.235 / 0.235 | 0.086 / 0.085 | 0.076 / 0.080 |
| 5.2 diversity | 0.222 / 0.222 | 0.080 / 0.061 | 0.197 / 0.196 |
| 5.3 novelty | 0.259 / 0.259 | 0.104 / 0.104 | 0.143 / 0.143 |
| 5.4 serendipity | 0.270 / 0.270 | 0.104 / 0.103 | 0.036 / 0.039 |
| 6 satisfaction | 0.255 / 0.255 | 0.102 / 0.102 | 0.039 / 0.045 |

```
SMRS_DATA_DIR=/path/to/sm-rs python examples/reproduce_perceived.py
```
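
Kendall τ, reported for Tasks 5 and 6, measures how well predictions preserve the ordering of the true values. The pure-Python sketch below computes the tau-b variant, which corrects for ties. Which variant the package's evaluators use is not stated here, so treat that choice and the helper name `kendall_tau_b` as assumptions.

```python
import math

def kendall_tau_b(x, y):
    """Kendall tau-b by direct O(n^2) pair counting."""
    concordant = discordant = ties_x = ties_y = 0
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            dx = x[i] - x[j]
            dy = y[i] - y[j]
            if dx == 0 and dy == 0:
                continue            # tied in both: counted nowhere
            elif dx == 0:
                ties_x += 1         # tied only in x
            elif dy == 0:
                ties_y += 1         # tied only in y
            elif dx * dy > 0:
                concordant += 1
            else:
                discordant += 1
    untied = concordant + discordant
    denom = math.sqrt((untied + ties_x) * (untied + ties_y))
    return (concordant - discordant) / denom if denom else float("nan")

# Made-up ratings: one swapped pair out of six.
true_scores = [1, 2, 3, 4]
pred_scores = [1, 3, 2, 4]
tau = kendall_tau_b(true_scores, pred_scores)
print(tau)
```

Because τ depends only on ordering, a model can have a low MAE and still a low τ, which is consistent with the small τ values in the table above.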

Submitting to the leaderboard

Self-service (no submission server): run the canonical evaluate() for a task, then open a PR adding your row to the leaderboard in the dataset card, with a link to reproduce. The shipped baselines are the rows to beat.

Citing

Please cite both papers (GitHub's "Cite this repository" reads CITATION.cff):

```
@article{dokoupil2026smrs2,
  author  = {Dokoupil, Patrik and Peska, Ladislav},
  title   = {SM-RS 2.0: User-perceived Qualities of Single- and Multi-Objective Recommender Systems},
  journal = {ACM Transactions on Recommender Systems},
  volume  = {4}, number = {3}, year = {2026},
  doi     = {10.1145/3754459}
}
@inproceedings{dokoupil2024smrs,
  author    = {Dokoupil, Patrik and Peska, Ladislav and Boratto, Ludovico},
  title     = {SM-RS: Single- and Multi-Objective Recommendations with Contextual Impressions and Beyond-Accuracy Propensity Scores},
  booktitle = {Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval},
  series    = {SIGIR '24}, pages = {988--995}, year = {2024},
  doi       = {10.1145/3626772.3657863}
}
```

License

Code: MIT (see LICENSE). Data: CC-BY-4.0 (see the dataset card).

Version 0.1.0, released Jun 22, 2026.


