BM25 reranking plugin for SearXNG using zero-dependency sparse search

SearXNG BM25 Reranker

An external SearXNG plugin that reranks search results using BM25 text relevance scoring to improve search quality.

Features

  • BM25F Multi-Field Weighting — Title weight 2.0, content weight 1.0, prioritizing title-matching results
  • RRF Fusion Ranking — Combines BM25 scores with original engine rankings via Reciprocal Rank Fusion, rather than replacing them
  • CJK Tokenization — Built-in zero-dependency CJK tokenizer (unigram + bigram), no jieba required
  • Zero External Dependencies — Core BM25 engine from zerodep/sparse_search, pure standard library
  • Plug and Play — Standard SearXNG external plugin, deployable via volume mount or pip install
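The unigram + bigram idea from the CJK Tokenization bullet can be illustrated with a minimal sketch. This is not the plugin's actual _tokenizer.py: the function name tokenize, the restriction to the CJK Unified Ideographs block (U+4E00 to U+9FFF), and the lowercase handling of Latin text are all assumptions made for illustration.

```python
import re

# Assumption: only the CJK Unified Ideographs block is treated as CJK here.
_TOKEN_RE = re.compile("[\u4e00-\u9fff]+|[a-z0-9]+")


def tokenize(text: str) -> list[str]:
    """Hypothetical tokenizer: whole words for Latin runs, unigrams + bigrams for CJK runs."""
    tokens: list[str] = []
    for run in _TOKEN_RE.findall(text.lower()):
        if "\u4e00" <= run[0] <= "\u9fff":
            # Every single character becomes a token (unigrams) ...
            tokens.extend(run)
            # ... followed by every pair of adjacent characters (bigrams).
            tokens.extend(run[i:i + 2] for i in range(len(run) - 1))
        else:
            tokens.append(run)
    return tokens


print(tokenize("SearXNG 搜索引擎"))
```

Emitting bigrams alongside unigrams lets multi-character words such as 搜索 match as a unit without a dictionary-based segmenter, at the cost of a larger token set per document.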

How It Works

Search request → Engines return results → [post_search hook]
                                                ↓
                                   Build temporary BM25F index (title + content)
                                                ↓
                                   BM25 retrieval with original query
                                                ↓
                                   RRF fusion (engine ranking + BM25 ranking, k=60)
                                                ↓
                                   Rewrite positions to influence scoring → Reranked results

The plugin hooks into the post_search phase, after all engine results are collected but before final scores are calculated. By rewriting each result's positions list, it changes the input to SearXNG's built-in calculate_score() formula (weight / position), so results are reranked without modifying SearXNG itself.
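The fusion step can be sketched as follows. This is a minimal illustration of Reciprocal Rank Fusion with k=60, not the plugin's own code: the helper names rrf_fuse and fused_positions, the use of string ids for results, and the choice to write a single 1-based position per result are assumptions.

```python
def rrf_fuse(engine_order: list[str], bm25_order: list[str], k: int = 60) -> list[str]:
    """Fuse two rankings with Reciprocal Rank Fusion: score = sum of 1 / (k + rank)."""
    scores: dict[str, float] = {}
    for ranking in (engine_order, bm25_order):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # sorted() is stable, so ties keep the original engine order.
    return sorted(engine_order, key=lambda doc_id: -scores[doc_id])


def fused_positions(fused: list[str]) -> dict[str, list[int]]:
    """Hypothetical: give each result a one-element positions list from the fused order."""
    return {doc_id: [pos] for pos, doc_id in enumerate(fused, start=1)}


engine_order = ["a", "b", "c"]   # order returned by the search engines
bm25_order = ["c", "a", "b"]     # order produced by BM25 for the same results
fused = rrf_fuse(engine_order, bm25_order)
print(fused)                     # ['a', 'c', 'b']
print(fused_positions(fused))    # {'a': [1], 'c': [2], 'b': [3]}
```

Because a score of the form weight / position grows as the position shrinks, results that end up near the top of the fused order receive the largest scores. A result missing from the BM25 ranking keeps only its engine-rank contribution rather than being dropped.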

Installation

Option 1: Volume Mount (Recommended for Quick Deployment)

  1. Clone and copy the plugin code to your server:
git clone https://github.com/Oaklight/searxng-bm25-reranker.git
cp -r searxng-bm25-reranker/src/searxng_bm25_reranker /path/to/plugins/
  2. Update compose.yaml:
services:
  searxng:
    volumes:
      - /path/to/plugins:/usr/local/searxng/plugins:ro
    environment:
      - PYTHONPATH=/usr/local/searxng/plugins
  3. Register the plugin in settings.yml:
plugins:
  searxng_bm25_reranker.SXNGPlugin:
    active: true
  4. Restart the container:
docker compose restart searxng

Option 2: pip Install (For Custom Images)

FROM searxng/searxng:latest
RUN pip install --no-cache-dir searxng-bm25-reranker

With this option you still need to register the plugin in settings.yml, using the same plugins entry shown in Option 1.

Configuration

The plugin works out of the box with no additional configuration. Default parameters:

Parameter       Value  Description
BM25 variant    bm25   Standard Okapi BM25
k1              1.5    Term frequency saturation
b               0.75   Document length normalization
title weight    2.0    BM25F weight for title field
content weight  1.0    BM25F weight for content field
RRF k           60     RRF fusion constant
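To show what k1, b, and the field weights control, here is a small sketch of a single-term Okapi BM25 score using the defaults above. It is an illustration under assumptions, not the vendored sparse_search.py: the function names are hypothetical, the IDF formula is one common variant, and combining fields by a weighted sum of term frequencies is only one way to realize BM25F-style weighting.

```python
import math

K1, B = 1.5, 0.75              # defaults from the table above
TITLE_W, CONTENT_W = 2.0, 1.0  # field weights from the table above


def weighted_tf(tf_title: int, tf_content: int) -> float:
    """Assumed field combination: a title hit counts twice as much as a content hit."""
    return TITLE_W * tf_title + CONTENT_W * tf_content


def bm25_term_score(tf: float, doc_len: float, avg_len: float,
                    n_docs: int, doc_freq: int) -> float:
    """Single-term Okapi BM25 score (IDF variant chosen for illustration)."""
    idf = math.log(1.0 + (n_docs - doc_freq + 0.5) / (doc_freq + 0.5))
    # b scales how strongly longer-than-average documents are penalized.
    norm = K1 * (1.0 - B + B * doc_len / avg_len)
    # k1 caps the benefit of repeated terms: the ratio approaches K1 + 1 as tf grows.
    return idf * tf * (K1 + 1.0) / (tf + norm)


title_hit = bm25_term_score(weighted_tf(1, 0), doc_len=10, avg_len=10, n_docs=5, doc_freq=1)
content_hit = bm25_term_score(weighted_tf(0, 1), doc_len=10, avg_len=10, n_docs=5, doc_freq=1)
print(title_hit > content_hit)  # True
```

With these defaults a single title match outscores a single content match, but by less than a factor of two, because the k1 saturation term flattens the effect of a higher weighted frequency.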

Project Structure

src/searxng_bm25_reranker/
├── __init__.py          # SXNGPlugin class, post_search reranking logic
├── _tokenizer.py        # CJK-aware tokenizer (unigram + bigram)
└── _vendor/
    └── sparse_search.py # BM25 engine from zerodep (vendored)

Acknowledgements

License

AGPL-3.0
