Lightweight hybrid reranker with baked-in model artifact.
small-hybrid-reranker
small-hybrid-reranker is a lightweight reranker package with a baked-in trained model.
It reranks a list of passages for a query using a hybrid feature stack:
- static embeddings (cnmoro/static-nomic-384-pten)
- lexical overlap and token interaction sketches
- BM25 and dense retrieval priors
- listwise LightGBM ranker
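To make the feature stack more concrete, here is a minimal sketch of two of the listed signal types: a lexical-overlap feature and a BM25 score per passage. It is an illustration only and not the package's implementation. The whitespace tokenizer, the Jaccard measure, the BM25 variant, and the names `tokenize`, `jaccard_overlap`, `bm25_scores`, and `feature_rows` are all assumptions. Static embeddings and the LightGBM ranker are left out.

```python
import math
from collections import Counter


def tokenize(text: str) -> list[str]:
    # Assumption: naive lowercase whitespace tokenizer (the package's tokenizer is not documented).
    return text.lower().split()


def jaccard_overlap(query: str, passage: str) -> float:
    # Lexical overlap as the Jaccard ratio of the two token sets.
    q, p = set(tokenize(query)), set(tokenize(passage))
    union = q | p
    return len(q & p) / len(union) if union else 0.0


def bm25_scores(query: str, passages: list[str], k1: float = 1.5, b: float = 0.75) -> list[float]:
    # A common BM25 variant, scored only over the provided candidate list.
    docs = [tokenize(p) for p in passages]
    n = len(docs)
    avgdl = sum(len(d) for d in docs) / n if n else 0.0
    df = Counter(term for d in docs for term in set(d))
    query_terms = set(tokenize(query))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(d) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def feature_rows(query: str, passages: list[str]) -> list[list[float]]:
    # One row per passage: [lexical overlap, BM25 score].
    bm25 = bm25_scores(query, passages)
    return [[jaccard_overlap(query, p), s] for p, s in zip(passages, bm25)]
```

Rows like these could be fed to a listwise ranker as per-candidate features; which features the shipped model actually uses is not specified beyond the list above.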
The model artifact is included in the package, so there is no separate checkpoint download.
Model In This Release
- Version 0.2.0 packages an updated model trained on all available SciFact splits in this repository (train + test) for maximum fit.
- Training setup used strict BM25 top-100 candidates with LightGBM LambdaRank over hybrid features.
- In-sample all-sets metrics from the training run (the test split was part of training, so these are not held-out estimates):
  - ndcg@10: 0.89999
  - recall@10: 0.89830
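For readers unfamiliar with the two metrics, the sketch below shows one common way to compute NDCG@k and recall@k from a ranked list of relevance labels. It is not the evaluation code behind the numbers above. The function names are hypothetical, the gain is linear, and both metrics are computed relative to the relevant items present in the ranked list, which is an assumption.

```python
import math


def dcg_at_k(relevances: list[float], k: int) -> float:
    # Discounted cumulative gain with linear gain and a log2 position discount.
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(ranked_relevances: list[float], k: int = 10) -> float:
    # Normalize by the DCG of the ideal (descending) ordering of the same labels.
    ideal_dcg = dcg_at_k(sorted(ranked_relevances, reverse=True), k)
    return dcg_at_k(ranked_relevances, k) / ideal_dcg if ideal_dcg > 0 else 0.0


def recall_at_k(ranked_relevances: list[float], k: int = 10) -> float:
    # Fraction of relevant items (label > 0) that appear in the top k.
    total = sum(1 for r in ranked_relevances if r > 0)
    hits = sum(1 for r in ranked_relevances[:k] if r > 0)
    return hits / total if total else 0.0
```

Here `ranked_relevances` holds the ground-truth label of each passage in the order the ranker returned them.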
Inference remains lightweight and CPU-friendly: the API is still a single HybridReranker().rerank(query, passages) call.
Install
pip install small-hybrid-reranker
Quickstart
from small_hybrid_reranker import HybridReranker
reranker = HybridReranker()
query = "What is the speed of light?"
passages = [
    "The speed of light in a vacuum is about 299,792 km/s.",
    "Earth orbits the Sun in about 365 days.",
    "Newton described laws of motion.",
]
ranked = reranker.rerank(query, passages)
print(ranked[0])
# {'passage': 'The speed of light in a vacuum is about 299,792 km/s.', 'score': 100.0}
API
HybridReranker(model_path: str | None = None)
- model_path=None: uses the baked-in model inside the package.
- model_path="...joblib": loads your own compatible artifact.
rerank(query: str, passages: list[str], top_k: int | None = None) -> list[dict]
Returns:
[
    {"passage": "...", "score": 82.31},
    {"passage": "...", "score": 40.87},
]
Scores are floats in [0, 100] and sorted descending.
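Because the result is a plain list of dicts sorted by score, post-processing needs no special API. The sketch below filters results by a minimum score. The helper `keep_above` and the threshold of 50.0 are hypothetical, and the sample data reuses the placeholder values from the return example above.

```python
def keep_above(ranked: list[dict], min_score: float) -> list[dict]:
    """Keep only results whose score is at least min_score, preserving order."""
    return [item for item in ranked if item["score"] >= min_score]


# Placeholder results in the documented return shape.
ranked = [
    {"passage": "...", "score": 82.31},
    {"passage": "...", "score": 40.87},
]

# Hypothetical threshold; a sensible cut-off depends on your data.
confident = keep_above(ranked, min_score=50.0)
```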
Notes
- This package is optimized for reranking a provided candidate list.
- It is not a full retrieval system by itself.
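In practice this means pairing the reranker with a first-stage retriever. The sketch below shows one possible two-stage layout. The `first_stage` function is a deliberately crude token-overlap retriever invented for illustration, and `Reranker`, `search`, and `n_candidates` are hypothetical names. The only thing assumed about the reranker is the `rerank(query, passages, top_k)` signature documented above.

```python
from typing import Optional, Protocol


class Reranker(Protocol):
    # Structural type matching the documented rerank() signature.
    def rerank(self, query: str, passages: list[str], top_k: Optional[int] = None) -> list[dict]: ...


def first_stage(query: str, corpus: list[str], n_candidates: int = 100) -> list[str]:
    """Hypothetical cheap retriever: order documents by shared lowercase tokens."""
    q = set(query.lower().split())
    scored = [(len(q & set(doc.lower().split())), doc) for doc in corpus]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    # Keep the top candidates that share at least one token with the query.
    return [doc for overlap, doc in scored[:n_candidates] if overlap > 0]


def search(
    query: str,
    corpus: list[str],
    reranker: Reranker,
    n_candidates: int = 100,
    top_k: Optional[int] = None,
) -> list[dict]:
    """Retrieve a candidate list, then let the reranker order it."""
    candidates = first_stage(query, corpus, n_candidates)
    if not candidates:
        return []
    return reranker.rerank(query, candidates, top_k=top_k)
```

With the package installed, passing `HybridReranker()` as `reranker` should slot the real model into the second stage. A production setup would likely replace `first_stage` with BM25 or a vector index.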
File details
Details for the file small_hybrid_reranker-0.2.0.tar.gz.
File metadata
- Download URL: small_hybrid_reranker-0.2.0.tar.gz
- Upload date:
- Size: 13.7 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.15
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | d0f289a351c8ce4cd0b37733c0a52ccc52666c75e28d1e1ece61550f626579f2 |
| MD5 | ac9e1fe2370039ad15242010e5b44299 |
| BLAKE2b-256 | 14f3e1129653fa583b761dfdb7e319916af809af6f422ff89660c7ee25c2b4a6 |
File details
Details for the file small_hybrid_reranker-0.2.0-py3-none-any.whl.
File metadata
- Download URL: small_hybrid_reranker-0.2.0-py3-none-any.whl
- Upload date:
- Size: 13.8 MB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.15
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 431d8c9b5ef33f4884fc1684b5bec9b8bff3d8c28c0025ef7d16d1404ab28d85 |
| MD5 | 9aa132af43c709ddb83c1a032dbaee29 |
| BLAKE2b-256 | 1fba6b5d6fd832f9555cd09b9cc018b7f718f322a4fd9fc07ecaac10c3ca27e7 |
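If you download a file manually, you can check it against the published SHA256 digest before installing. The sketch below uses only the standard library. The helper names `sha256_of` and `matches_published` are hypothetical, and the local file path is whatever you saved the download as.

```python
import hashlib


def sha256_of(path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path, expected_hex: str) -> bool:
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For example, `matches_published("small_hybrid_reranker-0.2.0-py3-none-any.whl", "<SHA256 from the table>")` should return `True` for an intact download of the wheel.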