rank‑validation
Lightweight, vectorised ranking‑metric toolkit
📊 One‑liner ranking evaluation for search, recommendation & IR.
rank‑validation turns a DataFrame of truth ✨ vs. prediction 🔮 into two ready‑to‑export reports, one per‑query and one overall, complete with industry‑standard metrics at any cut‑off k.
✨ Key features
- Simple API – get_metrics_report(...) returns pandas DataFrames you already know how to use.
- Out‑of‑the‑box metrics – nDCG, Recall, Kendall’s τ‑b, Kendall’s τ‑ap, RBO (extendable).
- Arbitrary cut‑offs – evaluate at @1, @5, @20… whatever matters.
- Automatic score alignment – helper utilities map prediction lists onto truth scores for graded relevance (see the sketch after this list).
- Vectorised NumPy & pandas core – scales to millions of queries on a laptop.
- Pure Python ≥ 3.8 – zero native extensions.
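Conceptually, the score alignment works like the following sketch (an illustration of the idea only; `align_scores` is a hypothetical name, not the package's internal helper):

```python
# Map each predicted item onto its ground-truth relevance grade;
# items absent from the truth list fall back to a grade of 0.
def align_scores(truth_items, truth_scores, pred_items):
    grades = dict(zip(truth_items, truth_scores))
    return [grades.get(item, 0) for item in pred_items]

# e.g. align_scores(["A", "B", "C", "D"], [3, 2, 1, 0], ["B", "A", "E", "C"])
# returns [2, 3, 0, 1]
```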
🚀 Installation
pip install rank-validation
The wheel is lightweight (< 30 KB) and pulls in only numpy, pandas, scipy & rbo.
⚡ Quick start
```python
import pandas as pd

from rank_validation.validation_generator import get_metrics_report

df = pd.DataFrame({
    "query": ["q1", "q2"],
    "truth_items": [["A", "B", "C", "D"], ["X", "Y", "Z"]],
    "truth_scores": [[3, 2, 1, 0], [2, 1, 0]],
    "pred_items": [["B", "A", "E", "C"], ["Y", "X", "Z"]],
})

metrics = ["ndcg", "recall", "kendall_tau", "tau_ap", "rbo"]
cutoffs = [3, 5]

query_report, overall_report = get_metrics_report(
    df,
    truth_item_col="truth_items",
    truth_score_col="truth_scores",
    pred_item_col="pred_items",
    metric_list=metrics,
    cutoff_list=cutoffs,
)

print(query_report.head())  # per-query breakdown
print(overall_report)       # summary stats (mean, std, …)
```
Typical query_report:

```text
  query  ndcg@3  recall@3  kendall_tau@3  tau_ap@3  rbo@3  ndcg@5  recall@5  kendall_tau@5  tau_ap@5  rbo@5
0    q1    0.91      0.67           0.33      0.40   0.79    0.90      1.00           0.33      0.46   0.79
1    q2    1.00      0.67           0.67      0.80   1.00    1.00      1.00           0.67      0.80   1.00
```

Typical overall_report:

```text
      ndcg@3  recall@3  kendall_tau@3  tau_ap@3  rbo@3  ndcg@5  recall@5  kendall_tau@5  tau_ap@5  rbo@5
mean    0.96      0.67           0.50      0.60   0.90    0.95      1.00           0.50      0.63   0.90
std     0.06      0.00           0.24      0.28   0.15    0.05      0.00           0.24      0.24   0.15
```
🧮 Supported metrics & formulas
| Metric | What it measures | Reference |
|---|---|---|
| nDCG@k | Graded relevance with log‑discounted gain, normalised by ideal ranking | Järvelin & Kekäläinen (2002) |
| Recall@k | Proportion of ground‑truth items retrieved in top k | – |
| Kendall’s τ‑b@k | Rank correlation, tie‑adjusted | Kendall (1938) |
| Kendall’s τ‑ap@k | Top‑weighted rank correlation | Yilmaz et al. (2008) |
| RBO@k | Top‑weighted similarity between two indefinite rankings | Webber et al. (2010) |
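For intuition, nDCG@k discounts the gain of the item at rank i by log2(i + 1) and normalises by the DCG of the ideal ordering. A minimal NumPy sketch (for illustration only; the package's vectorised implementation, and its exact gain function, may differ):

```python
import numpy as np

def ndcg_at_k(pred_scores, truth_scores, k):
    # pred_scores: relevance grades of the predicted items, in predicted order
    # truth_scores: all ground-truth grades (used to build the ideal ranking)
    discounts = 1.0 / np.log2(np.arange(2, k + 2))  # 1/log2(i+1) for ranks i = 1..k
    gains = np.asarray(pred_scores[:k], dtype=float)
    dcg = np.sum(gains * discounts[: len(gains)])
    ideal = np.sort(np.asarray(truth_scores, dtype=float))[::-1][:k]
    idcg = np.sum(ideal * discounts[: len(ideal)])
    return dcg / idcg if idcg > 0 else 0.0
```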
Heads‑up: RBO requires the two lists to contain unique items and to be of equalised length. If you hit RankingSimilarity errors, drop duplicates beforehand or omit RBO for that experiment.
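For example, a quick way to de‑duplicate the predicted lists before building the report (a sketch, assuming the quick‑start column names; truth lists are usually unique by construction):

```python
# Drop duplicate items within each predicted list, keeping the first occurrence.
df["pred_items"] = df["pred_items"].apply(lambda xs: list(dict.fromkeys(xs)))
```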
🛠️ API reference
```python
def get_metrics_report(
    df: pd.DataFrame,
    truth_item_col: str,
    truth_score_col: str,
    pred_item_col: str,
    metric_list: list[str],
    cutoff_list: list[int],
) -> tuple[pd.DataFrame, pd.DataFrame]
```
| Parameter | Description |
|---|---|
| df | DataFrame with at least the three list‑columns below. |
| truth_item_col | Column holding ground‑truth item IDs. |
| truth_score_col | Column with relevance grades (same order & length). |
| pred_item_col | Column holding system‑predicted ranked lists. |
| metric_list | Any subset of METRIC_REGISTRY keys, e.g. ndcg, tau_ap. |
| cutoff_list | Integers e.g. [1, 3, 10]. Each yields metric@k columns. |
Returns (query_report, overall_report), where:
- query_report – the original df plus one metric@k column per metric/cut‑off pair.
- overall_report – query_report.describe().
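Because overall_report is just query_report.describe(), the usual pandas idioms apply, for example:

```python
means = overall_report.loc["mean"]  # Series of mean metric values
print(means.filter(like="@5"))      # restrict to the @5 cut-off columns
```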
⚙️ Performance tips
- The core logic is vectorised NumPy/pandas, so millions of queries run out of the box on a single machine.
- Chunk the evaluation if truth lists are extremely long (> 1 K items) to limit memory; see the sketch below.
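One simple chunking pattern (a sketch, reusing the quick‑start column names and the metrics/cutoffs variables):

```python
import numpy as np
import pandas as pd

# Evaluate in chunks of ~10k queries and concatenate the per-query reports.
chunks = np.array_split(df, max(1, len(df) // 10_000))
parts = [
    get_metrics_report(
        chunk,
        truth_item_col="truth_items",
        truth_score_col="truth_scores",
        pred_item_col="pred_items",
        metric_list=metrics,
        cutoff_list=cutoffs,
    )[0]
    for chunk in chunks
]
query_report = pd.concat(parts, ignore_index=True)
overall_report = query_report.describe()
```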
🤝 Contributing
Found a bug? Need MAP or MRR? PRs are welcome! Please open an issue first so we can discuss the approach.
- Fork ➡️ branch ➡️ commit (with tests!)
- Run pre‑commit run -a.
- Open a pull request describing the change.
🛣️ Roadmap
- Mean Average Precision (MAP)
- Mean Reciprocal Rank (MRR)
- Optional GPU acceleration via cuDF / RAPIDS
📝 License
MIT © 2025 Akash Dubey
🔗 Links & citation
- Docs / examples: https://github.com/akashkdubey/ranking_validation
- PyPI: https://pypi.org/project/rank-validation/
```bibtex
@software{Dubey_2025_rank_validation,
  author = {Dubey, Akash},
  title  = {rank‑validation: A lightweight toolkit for ranking evaluation},
  year   = {2025},
  url    = {https://github.com/akashkdubey/ranking_validation}
}
```
Built with ❤️, Pandas & SciPy.
Download files
File details
Details for the file rank_validation-1.1.8.tar.gz.
File metadata
- Download URL: rank_validation-1.1.8.tar.gz
- Upload date:
- Size: 16.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.4
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `982ed781bdc13c55848fcb2f9b3fc3eefedca1820168c444fb54716c2a91644c` |
| MD5 | `1e112c8f4a2766f6eddd4c1cbd0d6084` |
| BLAKE2b-256 | `f052e492bd8de0acf3a6c87bcba7519c34290b0fa4c8995d272b91e077a34d1e` |
File details
Details for the file rank_validation-1.1.8-py3-none-any.whl.
File metadata
- Download URL: rank_validation-1.1.8-py3-none-any.whl
- Upload date:
- Size: 15.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.4
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `edf1c24135d9b0b8f97c01f955f6a889ffabc313c77b80ec2c9cf2473c4c2a8c` |
| MD5 | `b3c7daf98c39f148d5a554e56f207c1a` |
| BLAKE2b-256 | `c81913c5e25db45ed575265452f9733b14ce72f025a6a28e91bbe431adf85e4e` |