BM25 reranking plugin for SearXNG using zero-dependency sparse search
SearXNG BM25 Reranker
An external SearXNG plugin that reranks search results using BM25 text relevance scoring to improve search quality.
Features
- BM25F Multi-Field Weighting — Title weight 2.0, content weight 1.0, prioritizing title-matching results
- RRF Fusion Ranking — Combines BM25 scores with original engine rankings via Reciprocal Rank Fusion, rather than replacing them
- CJK Tokenization — Built-in zero-dependency CJK tokenizer (unigram + bigram), no jieba required
- Zero External Dependencies — Core BM25 engine from zerodep/sparse_search, pure standard library
- Plug and Play — Standard SearXNG external plugin, deployable via volume mount or pip install
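As a rough illustration of what unigram + bigram CJK tokenization means, the sketch below splits CJK runs into single characters plus overlapping character pairs, and keeps ASCII words whole. It is not the plugin's actual `_tokenizer.py`: the `tokenize` name, the Unicode ranges, and the ASCII-only word rule are assumptions made for this example.

```python
import re

# Assumed ranges: CJK Unified Ideographs, Hiragana/Katakana, Hangul syllables.
# Non-ASCII Latin letters are ignored here to keep the sketch short.
TOKEN_RE = re.compile(
    r"(?P<cjk>[\u4e00-\u9fff\u3040-\u30ff\uac00-\ud7af]+)|(?P<word>[a-z0-9]+)"
)


def tokenize(text):
    """Return lowercase ASCII words plus CJK unigrams and bigrams."""
    tokens = []
    for match in TOKEN_RE.finditer(text.lower()):
        run = match.group()
        if match.lastgroup == "word":
            tokens.append(run)
        else:
            # Unigrams: every single character of the CJK run.
            tokens.extend(run)
            # Bigrams: every overlapping pair of adjacent characters.
            tokens.extend(run[i:i + 2] for i in range(len(run) - 1))
    return tokens


print(tokenize("SearXNG 搜索引擎"))
# ['searxng', '搜', '索', '引', '擎', '搜索', '索引', '引擎']
```

Bigrams let a query such as 引擎 match a document without a dictionary-based word segmenter, which is why no jieba-style dependency is needed.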
How It Works
Search request → Engines return results → [post_search hook]
↓
Build temporary BM25F index (title + content)
↓
BM25 retrieval with original query
↓
RRF fusion (engine ranking + BM25 ranking, k=60)
↓
Rewrite positions to influence scoring → Reranked results
The plugin hooks into the post_search phase, after all engine results are collected but before final scores are calculated. By rewriting each result's positions list, it influences SearXNG's built-in calculate_score() formula (weight / position), so results are reranked without modifying SearXNG's own code.
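The fusion and position-rewriting steps can be sketched in a few lines. This is a minimal model under stated assumptions, not the plugin's code: results are plain dicts with hypothetical `id` and `positions` keys, `rewrite_positions` shows only one possible rewriting strategy, and `simplified_score` is a stand-in for the weight / position formula rather than SearXNG's real `calculate_score()`.

```python
def rrf_fuse(engine_order, bm25_order, k=60):
    """Fuse two rankings (lists of result ids, best first) with RRF."""
    fused = {}
    for ranking in (engine_order, bm25_order):
        for rank, doc_id in enumerate(ranking, start=1):
            fused[doc_id] = fused.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; sorted() is stable, so ties keep engine order.
    return sorted(fused, key=fused.get, reverse=True)


def rewrite_positions(results, fused_order):
    """Overwrite each position with the fused rank (assumed strategy)."""
    rank_of = {doc_id: rank for rank, doc_id in enumerate(fused_order, start=1)}
    for result in results:
        new_rank = rank_of[result["id"]]
        # Keep the list length so multi-engine hits keep their extra entries.
        result["positions"] = [new_rank] * len(result["positions"])


def simplified_score(result, weight=1.0):
    """Simplified stand-in for the weight / position scoring formula."""
    return sum(weight / position for position in result["positions"])


results = [
    {"id": "a", "positions": [1]},
    {"id": "b", "positions": [2]},
    {"id": "c", "positions": [3]},
]
engine_order = ["a", "b", "c"]  # order returned by the engines
bm25_order = ["c", "a", "b"]    # order from BM25 retrieval

fused_order = rrf_fuse(engine_order, bm25_order)
rewrite_positions(results, fused_order)
reranked = sorted(results, key=simplified_score, reverse=True)
print([r["id"] for r in reranked])  # ['a', 'c', 'b']
```

In this example, result `c` moves ahead of `b` because BM25 ranks it first, while `a` stays on top because both rankings place it high. That is the sense in which fusion combines the two signals instead of replacing the engine ranking.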
Installation
Option 1: Volume Mount (Recommended for Quick Deployment)
- Clone and copy the plugin code to your server:
    git clone https://github.com/Oaklight/searxng-bm25-reranker.git
    cp -r searxng-bm25-reranker/src/searxng_bm25_reranker /path/to/plugins/
- Update compose.yaml:
    services:
      searxng:
        volumes:
          - /path/to/plugins:/usr/local/searxng/plugins:ro
        environment:
          - PYTHONPATH=/usr/local/searxng/plugins
- Register the plugin in settings.yml:
    plugins:
      searxng_bm25_reranker.SXNGPlugin:
        active: true
- Restart the container:
    docker compose restart searxng
Option 2: pip Install (For Custom Images)
FROM searxng/searxng:latest
RUN pip install --no-cache-dir searxng-bm25-reranker
You still need to register the plugin in settings.yml.
Configuration
The plugin works out of the box with no additional configuration. Default parameters:
| Parameter | Value | Description |
|---|---|---|
| BM25 variant | bm25 | Standard Okapi BM25 |
| k1 | 1.5 | Term frequency saturation |
| b | 0.75 | Document length normalization |
| title weight | 2.0 | BM25F weight for title field |
| content weight | 1.0 | BM25F weight for content field |
| RRF k | 60 | RRF fusion constant |
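To show how these parameters interact, here is a simplified per-term score using the defaults above. It is a sketch of one common BM25F simplification (field weights applied to term frequencies before saturation), not the vendored `sparse_search.py` implementation. The function names, the smoothed IDF variant, and the single combined document length are assumptions.

```python
import math

K1, B = 1.5, 0.75
FIELD_WEIGHTS = {"title": 2.0, "content": 1.0}


def idf(n_docs, doc_freq):
    """Smoothed inverse document frequency; always positive (assumed variant)."""
    return math.log(1.0 + (n_docs - doc_freq + 0.5) / (doc_freq + 0.5))


def term_score(tf_by_field, doc_len, avg_doc_len, n_docs, doc_freq):
    """Score one query term against one document."""
    # Field weighting: a title occurrence counts twice as much as a content one.
    tf = sum(FIELD_WEIGHTS[field] * count for field, count in tf_by_field.items())
    # b controls how strongly longer-than-average documents are penalized.
    length_norm = 1.0 - B + B * doc_len / avg_doc_len
    # k1 controls saturation: the fraction approaches k1 + 1 as tf grows.
    return idf(n_docs, doc_freq) * tf * (K1 + 1.0) / (tf + K1 * length_norm)


title_hit = term_score({"title": 1}, doc_len=10, avg_doc_len=10, n_docs=100, doc_freq=5)
content_hit = term_score({"content": 1}, doc_len=10, avg_doc_len=10, n_docs=100, doc_freq=5)
print(title_hit > content_hit)  # True
```

With k1 = 1.5, repeating a term many times yields diminishing returns, so a result cannot dominate the ranking just by repeating the query term in its snippet.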
Project Structure
src/searxng_bm25_reranker/
├── __init__.py           # SXNGPlugin class, post_search reranking logic
├── _tokenizer.py         # CJK-aware tokenizer (unigram + bigram)
└── _vendor/
    └── sparse_search.py  # BM25 engine from zerodep (vendored)
Acknowledgements
- SearXNG — Privacy-respecting metasearch engine
- zerodep/sparse_search — Zero-dependency BM25 full-text search engine
License