
Lightweight, vectorised ranking-metric toolkit

Project description


rank‑validation

📊 One‑liner ranking evaluation for search, recommendation & IR.

rank‑validation turns a dataframe of truth ✨ vs prediction 🔮 into two ready‑to‑export reports—per‑query and overall—complete with industry‑standard metrics at any cut‑off k.


✨ Key features

  • Simple API – get_metrics_report(...) returns pandas DataFrames you already know how to use.
  • Out‑of‑the‑box metrics – nDCG, Recall, Kendall’s τ‑b, Kendall’s τ‑ap, RBO (extendable).
  • Arbitrary cut‑offs – evaluate at @1, @5, @20… whatever matters.
  • Automatic score alignment – helper utilities map prediction lists onto truth scores for graded relevance (sketched below).
  • Vectorised NumPy & Pandas core – scales to millions of queries on a laptop.
  • Pure Python ≥ 3.8 – zero native extensions.
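
Conceptually, score alignment looks up each predicted item's graded relevance in the truth list, treating unseen items as zero. A minimal sketch of the idea (align_scores is a hypothetical name, not the library's actual helper):

def align_scores(truth_items, truth_scores, pred_items):
    # Map item -> grade, then read grades off in prediction order;
    # items missing from the truth list count as irrelevant (0).
    grade = dict(zip(truth_items, truth_scores))
    return [grade.get(item, 0) for item in pred_items]

align_scores(["A", "B", "C", "D"], [3, 2, 1, 0], ["B", "A", "E", "C"])
# -> [2, 3, 0, 1]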

🚀 Installation

pip install rank-validation

The wheel is lightweight (< 30 KB) and pulls in only numpy, pandas, scipy & rbo.


⚡ Quick start

import pandas as pd
from rank_validation.validation_generator import get_metrics_report

df = pd.DataFrame({
    "query": ["q1", "q2"],
    "truth_items":  [["A","B","C","D"], ["X","Y","Z"]],
    "truth_scores": [[3,2,1,0],          [2,1,0]],
    "pred_items":   [["B","A","E","C"], ["Y","X","Z"]],
})

metrics  = ["ndcg", "recall", "kendall_tau", "tau_ap", "rbo"]
cutoffs  = [3, 5]

query_report, overall_report = get_metrics_report(
    df,
    truth_item_col="truth_items",
    truth_score_col="truth_scores",
    pred_item_col="pred_items",
    metric_list=metrics,
    cutoff_list=cutoffs,
)

print(query_report.head())  # per‑query breakdown
print(overall_report)       # summary stats (mean, std, …)

Typical query_report:

  query  ndcg@3  recall@3  kendall_tau@3  tau_ap@3  rbo@3  ndcg@5  recall@5  kendall_tau@5  tau_ap@5  rbo@5
0    q1    0.91      0.67           0.33      0.40   0.79    0.90      1.00           0.33      0.46   0.79
1    q2    1.00      0.67           0.67      0.80   1.00    1.00      1.00           0.67      0.80   1.00

Typical overall_report:

       ndcg@3  recall@3  kendall_tau@3  tau_ap@3  rbo@3  ndcg@5  recall@5  kendall_tau@5  tau_ap@5  rbo@5
mean     0.96     0.67           0.50      0.60   0.90    0.95     1.00           0.50      0.63   0.90
std      0.06     0.00           0.24      0.28   0.15    0.05     0.00           0.24      0.24   0.15
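
Both reports are plain pandas DataFrames, so exporting them is a one‑liner each:

query_report.to_csv("query_report.csv", index=False)
overall_report.to_csv("overall_report.csv")  # keep the mean/std row labels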

🧮 Supported metrics & formulas

| Metric | What it measures | Reference |
| --- | --- | --- |
| nDCG@k | Graded relevance with log‑discounted gain, normalised by the ideal ranking | Järvelin & Kekäläinen (2002) |
| Recall@k | Proportion of ground‑truth items retrieved in the top k | – |
| Kendall’s τ‑b@k | Rank correlation, tie‑adjusted | Kendall (1938) |
| Kendall’s τ‑ap@k | Top‑weighted rank correlation | Yilmaz et al. (2008) |
| RBO@k | Top‑weighted similarity between two indefinite rankings | Webber et al. (2010) |
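
For reference, the textbook definitions of the first two metrics, with graded relevance $rel_i$ at rank $i$. Note that nDCG has two common gain formulations ($2^{rel_i}-1$ and plain $rel_i$); check the source for the one used internally:

$$\mathrm{DCG@k} = \sum_{i=1}^{k} \frac{2^{rel_i} - 1}{\log_2(i + 1)}, \qquad \mathrm{nDCG@k} = \frac{\mathrm{DCG@k}}{\mathrm{IDCG@k}}$$

$$\mathrm{Recall@k} = \frac{|\,\text{truth items} \cap \text{top-}k\ \text{predictions}\,|}{|\,\text{truth items}\,|}$$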

Heads‑up: RBO requires the two lists to have unique items and equalised lengths. If you hit RankingSimilarity errors, drop duplicates beforehand or omit RBO for that experiment.
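
A quick, order‑preserving way to drop duplicates before computing RBO (plain Python, not a library helper):

# dict.fromkeys keeps the first occurrence and preserves insertion order
def dedupe(ranked_list):
    return list(dict.fromkeys(ranked_list))

dedupe(["A", "B", "A", "C", "B"])  # -> ["A", "B", "C"]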


🛠️ API reference

def get_metrics_report(
    df: pd.DataFrame,
    truth_item_col: str,
    truth_score_col: str,
    pred_item_col: str,
    metric_list: list[str],
    cutoff_list: list[int],
) -> tuple[pd.DataFrame, pd.DataFrame]
| Parameter | Description |
| --- | --- |
| df | DataFrame with at least the three list‑columns below. |
| truth_item_col | Column holding ground‑truth item IDs. |
| truth_score_col | Column with relevance grades (same order & length). |
| pred_item_col | Column holding system‑predicted ranked lists. |
| metric_list | Any subset of METRIC_REGISTRY keys, e.g. ndcg, tau_ap. |
| cutoff_list | Integers, e.g. [1, 3, 10]. Each yields metric@k columns. |

Returns (query_report, overall_report) where:

  • query_report – original df plus metric columns.
  • overall_report – query_report.describe().
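
The registry's exact location and callable contract aren't documented here, so treat this registration sketch as an assumption (the import path and the (truth_scores, pred_scores) signature are guesses; verify against the source):

# ASSUMPTION: import path and metric signature are unverified guesses.
from rank_validation.validation_generator import METRIC_REGISTRY

def hit_rate(truth_scores, pred_scores):
    # Toy metric: 1.0 if any predicted item carries positive relevance.
    return float(any(s > 0 for s in pred_scores))

METRIC_REGISTRY["hit_rate"] = hit_rate  # then pass "hit_rate" in metric_list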

⚙️ Performance tips

  • Core logic is vectorised NumPy & pandas, so millions of rows evaluate out‑of‑the‑box on a single machine.
  • Chunk the evaluation if truth lists are extremely long (more than ~1,000 items) to limit peak memory; a sketch follows.
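
A minimal chunking sketch, assuming the quick‑start setup above (df, metrics, cutoffs); the chunk size is illustrative:

import pandas as pd

chunk_size = 50_000  # illustrative; tune to your memory budget
per_query = [
    get_metrics_report(
        df.iloc[start:start + chunk_size],
        truth_item_col="truth_items",
        truth_score_col="truth_scores",
        pred_item_col="pred_items",
        metric_list=metrics,
        cutoff_list=cutoffs,
    )[0]  # keep the per-query report; recompute overall stats at the end
    for start in range(0, len(df), chunk_size)
]
query_report = pd.concat(per_query, ignore_index=True)
overall_report = query_report.describe()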

🤝 Contributing

Found a bug? Need MAP or MRR? PRs are welcome! Please open an issue first so we can discuss the approach.

  1. Fork ➡️ branch ➡️ commit (with tests!)
  2. pre-commit run -a
  3. Open a pull request describing the change.

🛣️ Roadmap

  • Mean Average Precision (MAP)
  • Mean Reciprocal Rank (MRR)
  • Optional GPU acceleration via cuDF / RAPIDS

📝 License

MIT © 2025 Akash Dubey


🔗 Links & citation

@software{Dubey_2025_rank_validation,
  author = {Dubey, Akash},
  title  = {rank‑validation: A lightweight toolkit for ranking evaluation},
  year   = {2025},
  url    = {https://github.com/akashkdubey/ranking_validation}
}

Built with ❤️, Pandas & SciPy.



Download files

Download the file for your platform.

Source Distribution

rank_validation-1.1.8.tar.gz (16.0 kB)


Built Distribution


rank_validation-1.1.8-py3-none-any.whl (15.2 kB)


File details

Details for the file rank_validation-1.1.8.tar.gz.

File metadata

  • Download URL: rank_validation-1.1.8.tar.gz
  • Size: 16.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.12.4

File hashes

Hashes for rank_validation-1.1.8.tar.gz

| Algorithm | Hash digest |
| --- | --- |
| SHA256 | 982ed781bdc13c55848fcb2f9b3fc3eefedca1820168c444fb54716c2a91644c |
| MD5 | 1e112c8f4a2766f6eddd4c1cbd0d6084 |
| BLAKE2b-256 | f052e492bd8de0acf3a6c87bcba7519c34290b0fa4c8995d272b91e077a34d1e |
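
To check a downloaded archive against the digests above, pip's built‑in hasher (SHA256 by default) works:

pip download rank-validation==1.1.8 --no-deps --no-binary :all:
pip hash rank_validation-1.1.8.tar.gz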


File details

Details for the file rank_validation-1.1.8-py3-none-any.whl.


File hashes

Hashes for rank_validation-1.1.8-py3-none-any.whl

| Algorithm | Hash digest |
| --- | --- |
| SHA256 | edf1c24135d9b0b8f97c01f955f6a889ffabc313c77b80ec2c9cf2473c4c2a8c |
| MD5 | b3c7daf98c39f148d5a554e56f207c1a |
| BLAKE2b-256 | c81913c5e25db45ed575265452f9733b14ce72f025a6a28e91bbe431adf85e4e |

