Forge better rankings from candidate documents with LLM reranking.
ranksmith
Forge better rankings from candidate documents.
ranksmith is a small Python package for LLM-based reranking. The current
release focuses on zero-shot reranking of candidate documents, powered by
Azure OpenAI.
Highlights:
- Built-in listwise RankGPT, pairwise PRP, tournament-style TourRank-r, and uncertainty-aware AcuRank strategies
- Public strategy contracts for custom reranking methods
- ModelClient / ModelProvider boundary for vendor-independent LLM calls
- Strict JSON parsing and fast-fail error behavior
- Sync and async Azure OpenAI rerankers
- Reproducible benchmark summaries with committed evidence artifacts
Install
pip install ranksmith
Quick Start
from ranksmith import AzureOpenAIReranker, Document

reranker = AzureOpenAIReranker(
    api_key="...",
    azure_endpoint="https://example.openai.azure.com",
    azure_deployment="gpt-4o-mini",
)
results = reranker.rerank(
    query="What is listwise reranking?",
    documents=[
        Document(id="a", text="Listwise reranking compares candidates together."),
        Document(id="b", text="Vector search retrieves candidate documents."),
    ],
    top_k=2,
)
for result in results:
    print(result.rank, result.original_index, result.document.id)
rank is 1-based for display. original_index is 0-based so it maps back to
the input list.
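As a minimal sketch of how the two indexes relate, the snippet below maps reranked results back onto the caller's own input list. `SketchResult` and `reorder_inputs` are hypothetical stand-ins written for this illustration, not part of ranksmith; only the `rank` and `original_index` semantics come from the description above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SketchResult:
    """Hypothetical stand-in that mirrors the rank / original_index fields."""

    rank: int            # 1-based position in the reranked output
    original_index: int  # 0-based position in the input list


def reorder_inputs(inputs: list[str], results: list[SketchResult]) -> list[str]:
    """Return the input items in reranked order by following original_index."""
    ordered = sorted(results, key=lambda r: r.rank)
    return [inputs[r.original_index] for r in ordered]


inputs = ["doc a", "doc b", "doc c"]
results = [
    SketchResult(rank=1, original_index=2),
    SketchResult(rank=2, original_index=0),
]
reordered = reorder_inputs(inputs, results)
print(reordered)  # ['doc c', 'doc a']
```

Because `original_index` is 0-based, it can index the input list directly; subtract 1 from `rank` only when the rank itself is needed as a list position.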
Supported Strategies & Algorithms
ranksmith separates the evaluation methodology (Strategy) from its execution
logic (Algorithm).
Recommended Use Cases
| Method | Strategy | Use when | Cost / risk |
|---|---|---|---|
| rankgpt_sliding_window | ListwiseStrategy | You need the default, lowest-friction LLM reranker for production or evaluation. | Low call count, but each prompt asks for a full ordered list and can be sensitive to output format. With window_size >= N, this becomes one-shot listwise reranking. |
| prp_sliding_k | PairwiseStrategy | You need pairwise preference comparisons or want to reproduce PRP-style behavior. | Many LLM calls; default passes=10 is expensive. |
| setwise_heapsort | SetwiseStrategy | You want top-k-oriented setwise selection with fewer calls than pairwise PRP in practical long-context settings. | Quality depends on set_size; larger sets reduce calls but can make the selection prompt harder. |
| tourrank_r, rounds=2 | TourRankStrategy | You want stronger quality than listwise on a moderate call budget. | More calls than RankGPT, much fewer than TourRank-10. |
| tourrank_r, rounds=10 | TourRankStrategy | You are doing quality-focused offline reranking, paper-style evaluation, or final reranking where latency is acceptable. | Highest call cost among built-in methods in normal use. |
| acurank | AcuRankStrategy | You want adaptive listwise reranking that spends calls on uncertain candidates near the top-k boundary. | Uses TrueSkill state and may issue more calls than basic listwise reranking unless capped. |
| Custom strategy | RerankStrategy / AsyncRerankStrategy | You need deterministic business logic, a proprietary ranking process, or a new research method. | You own the ranking contract and validation behavior. |
Applying a Strategy
Configure a strategy and pass it to AzureOpenAIReranker.
from ranksmith import AzureOpenAIReranker, ListwiseStrategy

strategy = ListwiseStrategy(
    algorithm="rankgpt_sliding_window",
    window_size=20,
    stride=10,
    max_document_chars=4000,
)
reranker = AzureOpenAIReranker(
    api_key="...",
    azure_endpoint="https://example.openai.azure.com",
    azure_deployment="gpt-4o-mini",
    strategy=strategy,
)
results = reranker.rerank("query", documents)
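To make the roles of window_size and stride concrete, here is a rough sketch of how a RankGPT-style sliding window is commonly laid out: windows start at the tail of the candidate list and move toward the head, overlapping by window_size minus stride. The helper `sliding_windows` is hypothetical and written for this illustration; ranksmith's actual windowing may differ in its details.

```python
def sliding_windows(n_docs: int, window_size: int, stride: int) -> list[tuple[int, int]]:
    """Return half-open (start, end) windows, ordered from the tail to the head."""
    if window_size <= 0 or stride <= 0:
        raise ValueError("window_size and stride must be positive")
    if stride > window_size:
        # A stride larger than the window would skip candidates entirely.
        raise ValueError("stride must not exceed window_size")
    if n_docs <= 0:
        return []
    windows = []
    end = n_docs
    while True:
        start = max(0, end - window_size)
        windows.append((start, end))
        if start == 0:
            break
        end -= stride
    return windows


print(sliding_windows(100, window_size=20, stride=10)[:3])  # [(80, 100), (70, 90), (60, 80)]
print(len(sliding_windows(100, window_size=20, stride=10)))  # 9
print(sliding_windows(15, window_size=20, stride=10))        # [(0, 15)]
```

Under this layout, each window corresponds to one listwise call, so the window count doubles as a call estimate. When window_size is at least the number of candidates, a single window covers everything, which is the one-shot case.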
Pairwise PRP uses the same reranker facade with a different strategy:
from ranksmith import AzureOpenAIReranker, PairwiseStrategy

reranker = AzureOpenAIReranker(
    api_key="...",
    azure_endpoint="https://example.openai.azure.com",
    azure_deployment="gpt-4o-mini",
    strategy=PairwiseStrategy(passes=3),
)
TourRank-r uses the same injection point:
from ranksmith import AzureOpenAIReranker, TourRankStrategy

reranker = AzureOpenAIReranker(
    api_key="...",
    azure_endpoint="https://example.openai.azure.com",
    azure_deployment="gpt-4o-mini",
    strategy=TourRankStrategy(rounds=2, group_parallelism=1),
)
For quality-focused runs, explicitly switch to TourRank-10:
reranker = AzureOpenAIReranker(
    api_key="...",
    azure_endpoint="https://example.openai.azure.com",
    azure_deployment="gpt-4o-mini",
    strategy=TourRankStrategy(rounds=10),
)
AcuRank uses listwise reranker calls as evidence for TrueSkill-based relevance estimates:
from ranksmith import AcuRankStrategy, AzureOpenAIReranker

reranker = AzureOpenAIReranker(
    api_key="...",
    azure_endpoint="https://example.openai.azure.com",
    azure_deployment="gpt-4o-mini",
    strategy=AcuRankStrategy(
        target_rank=10,
        window_size=20,
        max_adaptive_reranker_calls=20,  # Optional adaptive-phase budget cap.
        batch_parallelism=2,  # Optional; keep 1 if your provider is not thread-safe.
    ),
)
If every Document has numeric metadata["score"], AcuRank uses it as the
first-stage prior. If no document has a score, it falls back to the standard
TrueSkill prior. Partial score metadata and boolean score values fail fast.
For small candidate sets, target_rank is clipped to the number of documents.
max_adaptive_reranker_calls limits only the adaptive refinement phase; the
optional initial pass is counted separately in result metadata.
batch_parallelism parallelizes independent batches within the same AcuRank
iteration, while posterior updates are still applied in deterministic batch
order.
Note: If strategy is not provided, it defaults to ListwiseStrategy(algorithm="rankgpt_sliding_window"). Pairwise PRP, Setwise, TourRank-r, and AcuRank can use more LLM calls than basic listwise reranking, so check call estimates before live benchmarks.
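For a back-of-the-envelope call estimate for pairwise PRP, one common reading of sliding-window PRP is that each pass compares every adjacent pair once and queries the model in both presentation orders to reduce position bias. The sketch below encodes that assumption; `estimate_prp_sliding_calls` is a hypothetical helper, and the real call count in ranksmith may differ (for example with top-k early stopping).

```python
def estimate_prp_sliding_calls(n_docs: int, passes: int, both_orders: bool = True) -> int:
    """Estimate LLM calls for sliding pairwise reranking.

    Assumes (n_docs - 1) adjacent comparisons per pass and, when both_orders
    is true, two calls per comparison (A-vs-B and B-vs-A).
    """
    if n_docs < 0 or passes < 0:
        raise ValueError("n_docs and passes must be non-negative")
    comparisons_per_pass = max(0, n_docs - 1)
    calls_per_comparison = 2 if both_orders else 1
    return passes * comparisons_per_pass * calls_per_comparison


for passes in (1, 3, 10):
    print(passes, estimate_prp_sliding_calls(20, passes))
# 1 38
# 3 114
# 10 380
```

With 20 candidates and one pass, this estimate gives 38 calls, which is consistent with the nominal figure reported for prp_sliding_p1 in the Benchmarking section.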
Custom Strategies
Custom reranking methods should be implemented as new strategy classes instead
of adding new string values to ListwiseStrategy.algorithm. A strategy receives
the normalized Document objects, a model client, and optional top_k, then
returns RerankResult objects.
from collections.abc import Sequence

from ranksmith import (
    AzureOpenAIReranker,
    Document,
    RerankResult,
)


class LengthStrategy:
    def rerank(
        self,
        *,
        query: str,
        documents: Sequence[Document],
        model_client: object,
        top_k: int | None = None,
    ) -> list[RerankResult]:
        del query, model_client
        ordered_indexes = sorted(
            range(len(documents)),
            key=lambda index: len(documents[index].text),
            reverse=True,
        )
        results = [
            RerankResult(
                document=documents[original_index],
                rank=rank,
                original_index=original_index,
                metadata={"strategy": "length"},
            )
            for rank, original_index in enumerate(ordered_indexes, start=1)
        ]
        return results if top_k is None else results[:top_k]


reranker = AzureOpenAIReranker(
    api_key="...",
    azure_endpoint="https://example.openai.azure.com",
    azure_deployment="gpt-4o-mini",
    strategy=LengthStrategy(),
)
Model-backed and async strategies use the same public contract. See the custom strategy extension guide and the custom strategy example for full details.
Model Provider Architecture
ModelClient owns ranksmith's domain prompts and rank / compare / select
contracts. ModelProvider only executes vendor-specific JSON completion
requests.
| Layer | Responsibility | Public methods |
|---|---|---|
| Strategy | Build the final reranking order. | rerank(...) |
| ModelClient | Build ranksmith prompts, enforce the ranking domain contract, and emit usage. | rank(...), compare(...), select(...) |
| ModelProvider | Call a vendor SDK and return JSON completion text. | complete(...) |
from ranksmith import AzureAOAIProvider, ModelClient

provider = AzureAOAIProvider(
    api_key="...",
    azure_endpoint="https://example.openai.azure.com",
    azure_deployment="gpt-4o-mini",
    api_version="2024-08-01-preview",
)
model_client = ModelClient(provider=provider)
The same ModelClient can power all built-in strategies:
from ranksmith import AzureOpenAIReranker, PairwiseStrategy

reranker = AzureOpenAIReranker(
    model_client=model_client,
    strategy=PairwiseStrategy(passes=3),
)
OpenAIProvider, AnthropicProvider, and GeminiProvider are reserved public
stubs for future SDK-backed implementations. Calling them fails fast with
RerankProviderError.
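The split between a provider that only returns JSON text and a client that owns the prompt and the ranking contract can be illustrated with a self-contained sketch. Every name below (`SketchProvider`, `SketchClient`, `SketchParseError`, `FakeProvider`) is hypothetical, and the `complete(prompt)` signature is an assumption; the real ModelClient and ModelProvider signatures are not shown in this README.

```python
import json
from typing import Protocol


class SketchProvider(Protocol):
    """Hypothetical provider boundary: takes a prompt, returns JSON text."""

    def complete(self, prompt: str) -> str: ...


class SketchParseError(ValueError):
    """Raised when provider output is not a valid ranking."""


class SketchClient:
    """Hypothetical client: builds the prompt and validates the ranking."""

    def __init__(self, provider: SketchProvider) -> None:
        self._provider = provider

    def rank(self, query: str, passages: list[str]) -> list[int]:
        numbered = "\n".join(f"[{i}] {text}" for i, text in enumerate(passages))
        prompt = (
            f"Query: {query}\n{numbered}\n"
            'Reply with JSON only: {"ranking": [indexes, most relevant first]}'
        )
        raw = self._provider.complete(prompt)
        try:
            ranking = json.loads(raw)["ranking"]
        except (json.JSONDecodeError, KeyError, TypeError) as exc:
            raise SketchParseError(f"invalid JSON ranking: {raw!r}") from exc
        # Fail fast unless the output is an exact permutation of the input indexes.
        if (
            not isinstance(ranking, list)
            or any(type(item) is not int for item in ranking)
            or sorted(ranking) != list(range(len(passages)))
        ):
            raise SketchParseError(f"ranking is not a permutation: {ranking!r}")
        return ranking


class FakeProvider:
    """Offline provider that returns a canned reply; useful for tests."""

    def __init__(self, reply: str) -> None:
        self.reply = reply

    def complete(self, prompt: str) -> str:
        return self.reply


client = SketchClient(FakeProvider('{"ranking": [1, 0]}'))
print(client.rank("what is reranking?", ["vector search", "listwise reranking"]))  # [1, 0]
```

Keeping validation in the client means a new vendor only has to implement the narrow text-in, text-out method, while duplicate or missing indexes are rejected in one place instead of being repaired.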
Async Support
ranksmith provides first-class asynchronous support for high-throughput
environments like FastAPI.
from ranksmith import AsyncAzureOpenAIReranker

reranker = AsyncAzureOpenAIReranker(
    api_key="...",
    azure_endpoint="https://example.openai.azure.com",
    azure_deployment="gpt-4o-mini",
)
# await is only valid inside an async function (for example, a FastAPI handler).
results = await reranker.rerank("query", documents)
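In a high-throughput service, it is usually worth bounding how many reranking requests are in flight at once. The sketch below shows one way to do that with `asyncio.Semaphore` and `asyncio.gather`; `rerank_many` and `fake_rerank` are hypothetical helpers, and `fake_rerank` stands in for an awaitable reranker call so the example runs offline.

```python
import asyncio
from collections.abc import Awaitable, Callable


async def rerank_many(
    queries: list[str],
    rerank_one: Callable[[str], Awaitable[list[str]]],
    max_concurrency: int = 4,
) -> list[list[str]]:
    """Run rerank_one for every query with at most max_concurrency in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def guarded(query: str) -> list[str]:
        async with semaphore:
            return await rerank_one(query)

    # gather returns results in the same order as the input queries.
    return await asyncio.gather(*(guarded(query) for query in queries))


async def fake_rerank(query: str) -> list[str]:
    """Stand-in for an awaitable reranker call; performs no network access."""
    await asyncio.sleep(0)
    return [f"{query}-top1", f"{query}-top2"]


batched = asyncio.run(rerank_many(["q1", "q2", "q3"], fake_rerank, max_concurrency=2))
print(batched[0])  # ['q1-top1', 'q1-top2']
```

With a real async reranker, `rerank_one` could be a small wrapper around the awaitable rerank call shown above; the concurrency limit should be chosen to match the rate limits of the deployment.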
Examples
Runnable examples live in the examples/ directory.
- rankgpt_sync.py: synchronous RankGPT integration
- rankgpt_async.py: async RankGPT integration
- pairwise_prp.py: pairwise PRP strategy
- setwise_heapsort.py: Setwise Heapsort with a fake provider
- tourrank.py: TourRank-r with a fake provider
- acurank.py: AcuRank with first-stage score priors
- custom_strategy.py: custom strategy contracts
Benchmarking
The benchmark below measures reranking only. Pyserini BM25 provides the fixed
first-stage candidates; ranksmith reranks those candidates without performing
retrieval. The run uses AskUbuntuDupQuestions test data: 361 queries, BM25
top-20 candidates per query, and @5 evaluation. Methods that support top-k
early stopping may emit only the evaluated top-5. Azure OpenAI deployment
gpt-5.4-nano was used for live LLM calls.
Invalid LLM outputs were not repaired or silently corrected. They were retried, and any remaining invalid rows are reported as invalid.
The table separates nominal algorithm call estimates from row-level retry attempts. Row attempts are useful for retry accounting, but they are not exact provider-call telemetry for multi-call methods that can fail partway through an algorithm run. The committed evidence artifacts are:
- benchmark-results/live/askubuntu-bm25-top20-default-live.v3.merged.json
- benchmark-results/pyserini/askubuntu-bm25-top20.trec
| Method | NDCG@5 | MRR@5 | Recall@5 | Valid rows | Invalid rate | Nominal LLM calls/query | LLM row attempts/query incl. retries |
|---|---|---|---|---|---|---|---|
| original_bm25 | 0.3520 | 0.5062 | 0.2862 | 361/361 | 0.000 | 0 | N/A |
| single_call_listwise@20 | 0.4082 | 0.5541 | 0.3345 | 359/361 | 0.006 | 1 | 1.04 |
| rankgpt_sw_w5 | 0.3973 | 0.5283 | 0.3366 | 361/361 | 0.000 | 9 | 1.01 |
| acurank_k5_b1 | 0.4053 | 0.5491 | 0.3377 | 356/361 | 0.014 | 2 | 1.12 |
| tourrank_r2 | 0.4236 | 0.5725 | 0.3601 | 361/361 | 0.000 | 8 | 1.03 |
| setwise_hs_s10 | 0.3653 | 0.5059 | 0.3005 | 361/361 | 0.000 | 12 | 1.00 |
| prp_sliding_p1 | 0.4065 | 0.5818 | 0.3277 | 361/361 | 0.000 | 38 | 1.00 |
tourrank_r2 had the best NDCG@5 and Recall@5, while prp_sliding_p1 had the
best MRR@5. single_call_listwise@20 is the one-shot listwise baseline.
rankgpt_sw_w5 is the true sliding-window listwise baseline for this top-20
setup. acurank_k5_b1 aligns AcuRank's uncertainty boundary with the @5
evaluation cutoff. setwise_hs_s10 is a practical Setwise Heapsort setting
that extracts only the evaluated top-5 from 20 candidates.
After retries, 2 single_call_listwise@20 rows and 5 acurank_k5_b1 rows
remained invalid. They are included in the invalid-rate accounting instead of
being repaired.
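For readers who want to sanity-check the reported columns, here is a minimal sketch of the three @k metrics using their common binary-relevance definitions, plus the invalid-rate arithmetic. The function names are hypothetical, and the benchmark's own evaluation code may differ in details such as graded relevance or how the recall denominator is defined.

```python
import math
from collections.abc import Sequence


def ndcg_at_k(ranked_ids: Sequence[str], relevant: set[str], k: int) -> float:
    """Binary-gain NDCG@k: DCG of the ranking divided by the ideal DCG."""
    dcg = sum(
        1.0 / math.log2(position + 1)
        for position, doc_id in enumerate(ranked_ids[:k], start=1)
        if doc_id in relevant
    )
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(position + 1) for position in range(1, ideal_hits + 1))
    return dcg / idcg if idcg > 0 else 0.0


def mrr_at_k(ranked_ids: Sequence[str], relevant: set[str], k: int) -> float:
    """Reciprocal rank of the first relevant document within the top k."""
    for position, doc_id in enumerate(ranked_ids[:k], start=1):
        if doc_id in relevant:
            return 1.0 / position
    return 0.0


def recall_at_k(ranked_ids: Sequence[str], relevant: set[str], k: int) -> float:
    """Fraction of all relevant documents that appear within the top k."""
    if not relevant:
        return 0.0
    hits = sum(1 for doc_id in ranked_ids[:k] if doc_id in relevant)
    return hits / len(relevant)


ranked = ["d3", "d1", "d7", "d2", "d9"]
relevant = {"d1", "d2", "d4"}
print(round(mrr_at_k(ranked, relevant, 5), 3))     # 0.5
print(round(recall_at_k(ranked, relevant, 5), 3))  # 0.667
print(round(ndcg_at_k(ranked, relevant, 5), 3))    # 0.498

# Invalid rate is invalid rows divided by total rows, rounded to three decimals.
print(round(2 / 361, 3), round(5 / 361, 3))  # 0.006 0.014
```

Per-query values like these are typically averaged over all queries to produce the table's NDCG@5, MRR@5, and Recall@5 columns.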
Result Model
result.document # Document
result.rank # 1-based rank
result.original_index # 0-based input index
result.metadata # strategy-specific metadata
Error Handling
ranksmith fails fast. It does not silently truncate long documents, repair
invalid rankings, or return unvalidated LLM output.
from ranksmith import (
    DocumentTooLongError,
    RerankParseError,
    RerankProviderError,
    RerankStrategyError,
)

try:
    results = reranker.rerank("query", documents)
except DocumentTooLongError:
    ...
except RerankParseError:
    ...
except RerankProviderError:
    ...
except RerankStrategyError:
    ...
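The Benchmarking section notes that invalid outputs were retried rather than repaired. A caller who wants the same behavior could wrap the rerank call in a bounded retry that re-raises the last failure. The sketch below is one way to do it; `call_with_retries` and `SketchParseError` are hypothetical, with `SketchParseError` standing in for a retryable error such as RerankParseError. Whether ranksmith retries internally is not stated here.

```python
from collections.abc import Callable
from typing import TypeVar

T = TypeVar("T")


class SketchParseError(Exception):
    """Stand-in for a retryable parse failure."""


def call_with_retries(
    operation: Callable[[], T],
    retryable: tuple[type[Exception], ...],
    max_attempts: int = 3,
) -> T:
    """Call operation until it succeeds or max_attempts is exhausted."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except retryable:
            if attempt == max_attempts:
                raise  # Surface the final failure instead of repairing the output.
    raise AssertionError("unreachable")


attempts = {"count": 0}


def flaky_rerank() -> list[str]:
    """Fails twice with a parse error, then returns a valid order."""
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise SketchParseError("invalid ranking JSON")
    return ["b", "a"]


outcome = call_with_retries(flaky_rerank, retryable=(SketchParseError,), max_attempts=3)
print(outcome, attempts["count"])  # ['b', 'a'] 3
```

Only errors listed in `retryable` are retried; anything else, such as a document that is too long, propagates immediately because retrying would not change the outcome.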