A simple and fast search service for hosting state-of-the-art retrieval models.

These details have not been verified by PyPI

Project description

RoutIR: Fast Server for Hosting Retrieval Models for Retrieval-Augmented Generation

RoutIR is a Python package that provides a simple and efficient wrapper around arbitrary retrieval models, including first stage retrieval, reranking, query expansion, and result fusion, and provides efficient asynchronous query batching and serving.

Get Started

You can install routir in your environment through pip or uv.

pip install routir

RoutIR comes with a number of extras so that you install only the dependencies for the models you would like to serve. These extras include dense, gpu, plaidx, and sparse. You can install any combination of them, for example:

pip install "routir[dense,gpu]"

To start the service, pass the config file to the routir CLI command. You can optionally specify the port with the --port flag (default 8000).

routir config.json --port 5000

You can also use uvx to let uv create a virtual environment on the fly for you:

uvx --with transformers --with torch routir config.json

Use --with to specify any additional packages that you need for serving the model. Please refer to the uv documentation for more information.

Configuration

The configuration file has four major blocks: services, collections, server_imports, and file_imports.

services and collections are lists of objects, each configuring one engine or one collection being served.
server_imports is a list of external RoutIR endpoints that you would like to mirror in this endpoint. This allows end users to construct retrieval pipelines using services hosted on other machines, which is particularly helpful in a distributed compute cluster.
file_imports is a list of custom Python scripts implementing custom engines that RoutIR should load at initialization. See the Extension Examples section for more.

For example, if two other RoutIR instances are running on compute01:5000 and compute02:5000, which host plaidx-neuclir and RankLlama, you can import them as follows. Users of this endpoint will then be able to use both plaidx-neuclir and RankLlama. The following is an example config.

{
    "server_imports": [
        "http://compute01:5000",
        "http://compute02:5000"
    ],
    "file_imports": [
        "./examples/rank1_extension.py"
    ],
    "services": [
        {
            "name": "qwen3-neuclir",
            "engine": "Qwen3", 
            "cache": 1024, 
            "cache_ttl": 1024000, 
            "batch_size": 32, 
            "max_wait_time": 0.05, 
            "config": {
                "index_path": "hfds:routir/neuclir-qwen3-8b-faiss-PQ2048x4fs",
                "api_key": "YOUR_API_KEY_HERE OR AT OPENAI_API_KEY ENVIRONMENT VARIABLE",
                "embedding_base_url": "https://api.fireworks.ai/inference/v1/",
                "embedding_model_name": "accounts/fireworks/models/qwen3-embedding-8b",
                "k_scale": 5
            }
        },
        {
            "name": "rank1",
            "engine": "Rank1Engine",
            "config": {}
        }
    ],
    "collections": [
        {
            "name": "neuclir",
            "doc_path": "./neuclir-doc.jsonl"
        }
    ]
}

If you want to use Redis for caching, add cache_redis_url and cache_redis_kwargs to the service object. If your Redis instance is password-protected (which it should be), add a password field to cache_redis_kwargs.
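As a rough illustration of the caching fields, the sketch below builds a service entry with Redis caching and serializes the config from Python. Only cache_redis_url, cache_redis_kwargs, and password come from the text above; the helper name with_redis_cache, the redis://localhost:6379 URL, and the CHANGE_ME placeholder are assumptions for illustration.

```python
import json


def with_redis_cache(service: dict, url: str, **redis_kwargs) -> dict:
    """Return a copy of a service entry with the Redis caching fields added."""
    updated = dict(service)
    updated["cache_redis_url"] = url
    updated["cache_redis_kwargs"] = dict(redis_kwargs)
    return updated


# Service entry taken from the example config; the Redis values are placeholders.
service = {"name": "rank1", "engine": "Rank1Engine", "config": {}}

config = {
    "file_imports": ["./examples/rank1_extension.py"],
    "services": [
        with_redis_cache(service, "redis://localhost:6379", password="CHANGE_ME")
    ],
    "collections": [{"name": "neuclir", "doc_path": "./neuclir-doc.jsonl"}],
}

# json.dumps never emits trailing commas, so the result is valid JSON.
config_text = json.dumps(config, indent=4)
print(config_text)
```

Redirecting the printed text to a file gives a config that can be passed to the routir command. Generating the file this way also avoids hand-editing mistakes such as trailing commas.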

HTTP API

List available services: GET /avail. For a service started with the example config above, the output would look like:

{
    "content": ["neuclir"],
    "score": ["Rank1", "RankLlama"],
    "search": ["qwen3-neuclir", "plaidx-neuclir"],
    "fuse": ["RRF", "ScoreFusion"], 
    "decompose_query": []
}
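The /avail response can be used to check which endpoints accept a given name before sending requests. The following is a minimal Python sketch using only the standard library; the helper names fetch_available and endpoints_for are hypothetical, and the sample dictionary is copied from the output above.

```python
import json
from urllib.request import urlopen


def fetch_available(base_url: str) -> dict:
    """GET {base_url}/avail and decode the JSON body."""
    with urlopen(f"{base_url}/avail") as resp:
        return json.load(resp)


def endpoints_for(avail: dict, name: str) -> list:
    """Return the endpoint categories under which a name is listed."""
    return sorted(kind for kind, names in avail.items() if name in names)


# Sample response copied from the example output; no server is needed here.
example = {
    "content": ["neuclir"],
    "score": ["Rank1", "RankLlama"],
    "search": ["qwen3-neuclir", "plaidx-neuclir"],
    "fuse": ["RRF", "ScoreFusion"],
    "decompose_query": [],
}

print(endpoints_for(example, "qwen3-neuclir"))  # ['search']
print(endpoints_for(example, "RRF"))            # ['fuse']
```

Against a running instance, fetch_available("http://localhost:5000") would replace the hard-coded sample.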

Search an index: POST /search. The following is an example request using cURL.

curl -X POST http://localhost:5000/search \
-H "Content-Type: application/json" \
-d '{"service": "qwen3-neuclir", "query": "my test queries", "limit": 15}'

Output:

{
  "cached": true,
  "processed": true,
  "query": "my test queries",
  "scores": {
    "05a83946-dca2-4518-9bc3-3d394394d5e3": 0.3807981014251709,
    "36faf9fc-3751-4047-bb1c-2bd90fa6f4d4": 0.3675723671913147,
    "3a9ba832-f689-4204-8627-96abd73be65f": 0.42572247982025146,
    "6a5b81f3-9154-4959-9e88-79edfcecb43f": 0.3666379451751709,
    "6b086402-a00c-4fd8-8772-fade1f4b3198": 0.3996303975582123,
    "76ec4dd1-fb6e-4a1e-b3e2-4b6214886e52": 0.3723523020744324,
    "8c6e9e63-ea22-406e-a841-2dc645a3d2e2": 0.4014992415904999,
    "90f2e4af-8a92-4869-9c73-013fead4876d": 0.3644096851348877,
    "9dc749e8-f7a7-4c76-9883-03c7bc620d92": 0.37544310092926025,
    "aa3542e0-0c62-4518-9a0a-07eaa5b1eb00": 0.3768806755542755,
    "aeba1a4c-e02e-4d37-898c-68732c05b7d9": 0.3764134645462036,
    "b564d3aa-983d-42a4-b5ba-e6d43e79c094": 0.3760540783405304,
    "e46324a8-e9fb-442f-806d-1ed8f0efb2b0": 0.37497588992118835,
    "f91c5cf9-020b-4019-a483-41aee141808c": 0.3672129511833191,
    "fd6f8822-ddf4-4264-a449-5ecc7884c8ec": 0.36940526962280273
  },
  "service": "qwen3-neuclir",
  "timestamp": 1761023408.7890506
}
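In the output above, scores maps document IDs to scores and is not ordered by score. A minimal Python client sketch that sends the same request and ranks the results might look like this. It uses only the standard library; post_json and top_k are hypothetical helper names, and treating a higher score as more relevant is an assumption.

```python
import json
from urllib.request import Request, urlopen


def post_json(url: str, payload: dict) -> dict:
    """POST a JSON payload and decode the JSON response."""
    req = Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urlopen(req) as resp:
        return json.load(resp)


def top_k(scores: dict, k: int) -> list:
    """Return the k (doc_id, score) pairs with the highest scores."""
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)[:k]


# Three entries copied from the example output, so no server is needed here.
sample_scores = {
    "05a83946-dca2-4518-9bc3-3d394394d5e3": 0.3807981014251709,
    "36faf9fc-3751-4047-bb1c-2bd90fa6f4d4": 0.3675723671913147,
    "3a9ba832-f689-4204-8627-96abd73be65f": 0.42572247982025146,
}

for doc_id, score in top_k(sample_scores, 2):
    print(f"{doc_id}\t{score:.4f}")
```

With a server running, the request from the cURL example would be post_json("http://localhost:5000/search", {"service": "qwen3-neuclir", "query": "my test queries", "limit": 15}), and the scores field of the result can be passed to top_k.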

Score/Rerank a list of texts given a query: POST /score. This allows you to score/rerank arbitrary pieces of text, such as document content, passages in a document for context compression, or generated responses for ranking answer relevancy. The following is an example request:

curl -X POST http://localhost:5000/score \
-H "Content-Type: application/json" \
-d '{
    "service": "rank1", 
    "query": "what is routir", 
    "passages": [
        "routir is a python package", 
        "sushi is the best food in the world"
    ]
}'

Output:

{
  "cached": false,
  "processed": true,
  "query": "what is routir",
  "scores": [
    0.9999997617631468,
    7.889264466868659e-06
  ],
  "service": "rank1",
  "timestamp": 1761026442.1780925
}
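In the example output, the scores appear in the same order as the submitted passages. Assuming that holds in general, the following sketch pairs each passage with its score and sorts them; rerank is a hypothetical helper, not part of RoutIR.

```python
def rerank(passages: list, scores: list) -> list:
    """Pair passages with scores (assumed to be index-aligned) and sort by score."""
    if len(passages) != len(scores):
        raise ValueError("expected one score per passage")
    return sorted(zip(passages, scores), key=lambda pair: pair[1], reverse=True)


# Passages and scores copied from the example request and output.
passages = [
    "routir is a python package",
    "sushi is the best food in the world",
]
scores = [0.9999997617631468, 7.889264466868659e-06]

for passage, score in rerank(passages, scores):
    print(f"{score:.6f}\t{passage}")
```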

Search with a dynamic pipeline: POST /pipeline. This allows end users to construct an arbitrary search pipeline from the available engines on the fly. For example:

curl -X POST http://localhost:5000/pipeline \
-H "Content-Type: application/json" \
-d '{
    "pipeline": "{qwen3-neuclir, plaidx-neuclir}RRF%50 >> rank1", 
    "query": "which team is the world series champion in 2020?",
    "collection": "neuclir"
}'

Output:

{
  "cached": false,
  "collection": "neuclir",
  "pipeline": "{qwen3-neuclir, plaidx-neuclir}RRF%50 >> rank1",
  "processed": true,
  "query": "which team is the world series champion in 2020?",
  "scores": {
    "027b3f6f-3dc6-4e69-86ae-2a98f8c4a881": 0.999999712631481,
    "066e645a-a495-4622-bcc8-7a804f598bcf": 5.4222202626709005e-06,
    "0ced1751-181a-4abb-8d64-37c362ede67c": 0.9999986290429566,
    "1c0d1e33-ea2c-48f3-9422-6f81259095eb": 0.9999996940976272,
    "27b429cc-b2a0-43cf-8b2b-883796486780": 4.539786865487149e-05,
    "2d11d0a3-78de-4201-ad26-64a6ac4b148f": 1.8925157266468097e-05,
    "302d1c1a-d620-4971-a44c-c1faead39494": 1.6701429809483402e-05,
    "39d2608d-e0a5-4b52-bb0e-b04968e21a15": 0.033085980653064666,
    "3c3c49f3-24b1-4dad-ac57-35b0565ab9b8": 2.1444943303118133e-05,
    "6e65cae3-443d-4cdd-9efc-8dfb3e1fe0b1": 6.962258739847376e-06,
    "7e4a4d57-9e73-4fb6-8ea7-584d0549c508": 0.0052201256185966365,
    "7ecdc77d-ea8c-4d48-9235-c21df9086831": 0.9999999397642365,
    "8660ca1b-ef5a-4c3a-a3e9-692e1e686f07": 1.9947301971022554e-06,
    "88b3eff5-738a-4bcd-b31a-15d9f1b9e198": 3.288748281343353e-06,
    "940bb6ff-f88a-40cd-88c1-2ae719d1dc74": 0.99999980249468,
    "a3edc861-7cf5-4152-a32b-90961bd12b80": 1.8925155010490798e-05,
    "c26f5a26-e732-4deb-80d6-b3ad6b249927": 0.9999986290426297,
    "da57d712-c6a8-4fa3-8e68-7f21ea7d3167": 0.9999996072138465,
    "ed231d01-05d9-4ed6-98d6-97b4e3e64aae": 0.00317268301626477,
    "f3954f32-62e6-4cb3-9ef2-78fe3dcb8f7a": 1.9947304348917116e-06
  },
  "service": "rank1",
  "timestamp": 1761026586.5823486
}
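The pipeline string above appears to fuse two first-stage results with RRF before reranking. For readers unfamiliar with it, reciprocal rank fusion is commonly defined as summing 1 / (k + rank) over the input rankings. The sketch below illustrates that general idea only; it is not RoutIR's implementation, and the constant k = 60, the limit parameter (mirroring a reading of %50 as a cutoff, which is an assumption), and the document IDs are all illustrative.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60, limit=None):
    """Fuse ranked lists of doc IDs by summing 1 / (k + rank) per document."""
    fused = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            fused[doc_id] += 1.0 / (k + rank)
    ordered = sorted(fused.items(), key=lambda item: item[1], reverse=True)
    return ordered if limit is None else ordered[:limit]


# Two made-up rankings standing in for the output of two search services.
first_ranking = ["d1", "d2", "d3"]
second_ranking = ["d2", "d4", "d1"]

for doc_id, score in reciprocal_rank_fusion([first_ranking, second_ranking], limit=3):
    print(f"{doc_id}\t{score:.5f}")
```

Documents that rank well in both lists (d2 and d1 here) rise to the top, which is why this kind of fusion is often placed before a more expensive reranker.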

Extension Examples

We provide several examples for integrating other IR toolkits with RoutIR. Please refer to each example for details.

Warning: The Python script implementing the custom Engine needs to be imported through file_imports in the config. When using uvx, remember to pass the required packages with --with.

PyTerrier

python ./examples/pyterrier_extension.py # to build the index
uvx --with python-terrier routir ./examples/pyterrier_example_config.json --port 8000 # serve it at port 8000

Pyserini

uvx --with pyserini routir ./examples/pyserini_example_config.json --port 8000 # serve it at port 8000

Rank1

uvx --with mteb==1.39.0 --with vllm routir ./examples/rank1_example_config.json

The specific mteb version is crucial for this example.

Other Helper Scripts

Here is an example command to generate .npy files containing Qwen3 document embeddings from a .jsonl file with id, title, and text fields:

python -m routir.utils.qwen3_encode /path/to/docs.jsonl /output/path \
--id-field id --fields title text --docs-per-file 10000 \
--batch-size 8 --model-name Qwen/Qwen3-Embedding-8B
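The input file is expected to hold one JSON object per line. A small sketch for producing such a file with the id, title, and text fields used in the command above follows; the two records and the to_jsonl helper are made up for illustration.

```python
import json


def to_jsonl(records) -> str:
    """Serialize records as JSON Lines: one JSON object per line."""
    return "".join(json.dumps(r, ensure_ascii=False) + "\n" for r in records)


# Made-up documents with the id, title, and text fields named above.
docs = [
    {"id": "doc-1", "title": "RoutIR", "text": "RoutIR is a Python package."},
    {"id": "doc-2", "title": "Sushi", "text": "Sushi is the best food in the world."},
]

jsonl_text = to_jsonl(docs)
print(jsonl_text, end="")

# To save it for the encoder, write the text to a file, for example:
# with open("docs.jsonl", "w", encoding="utf-8") as f:
#     f.write(jsonl_text)
```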

For the FAISS index structure that RoutIR uses, refer to routir.utils.faiss_indexing. Here is an example command to generate a FAISS index from a directory of .npy files, each with features and ids fields (as generated by the above script):

python -m routir.utils.faiss_indexing \
./encoded_vectors/ ./faiss_index.PQ2048x4fs.IP/ \
--index_string "PQ2048x4fs" --use_gpu --sampling_rate 0.25

Contribution

We welcome any feedback, feature requests, and pull requests. Please raise issues on GitHub. Feel free to reach out to us by email, on the ACM SIGIR Slack, or through GitHub issues.

Attribution

TBA

Project details

Release history

Version	Release date	Status
0.0.2a0	Mar 4, 2026	pre-release
0.0.1	Dec 17, 2025	this version
0.0.1b10	Oct 21, 2025	pre-release
0.0.1b9	Oct 20, 2025	pre-release
0.0.1b8	Oct 15, 2025	pre-release
0.0.1b7	Oct 15, 2025	pre-release
0.0.1b6	Oct 10, 2025	pre-release
0.0.1b5	Oct 10, 2025	pre-release
0.0.1b4	Oct 10, 2025	pre-release
0.0.1b3	Oct 8, 2025	pre-release
0.0.1b2	Oct 8, 2025	pre-release
0.0.1b1	Oct 8, 2025	pre-release
0.0.1b0	Oct 8, 2025	pre-release
0.0.1a0	Oct 8, 2025	pre-release

Download files

Source Distribution

routir-0.0.1.tar.gz (52.9 kB)

Uploaded Dec 17, 2025 Source

Built Distribution

routir-0.0.1-py3-none-any.whl (61.3 kB)

Uploaded Dec 17, 2025 Python 3

File details

Details for the file routir-0.0.1.tar.gz.

File metadata

Download URL: routir-0.0.1.tar.gz
Upload date: Dec 17, 2025
Size: 52.9 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.13

File hashes

Hashes for routir-0.0.1.tar.gz
Algorithm	Hash digest
SHA256	`74afc871546afbe1546ff9d2797648eb42ba0963dcce34c2c8dbc763fa76f30f`
MD5	`944da50c1fd9f5125f358cdc0f899a6d`
BLAKE2b-256	`7afd7cc6f5dd8bfdb8f8d473b51dd553b0c69c9acdb0ed12fc6392040d559eae`

File details

Details for the file routir-0.0.1-py3-none-any.whl.

File metadata

Download URL: routir-0.0.1-py3-none-any.whl
Upload date: Dec 17, 2025
Size: 61.3 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.13

File hashes

Hashes for routir-0.0.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`e8b406b51089155d713b2ae4dc70a4981f4eb86558087d37cc2485b98797ea85`
MD5	`1001bdd003ed1a682517d9d0d8db32cc`
BLAKE2b-256	`ae9718822c5d4f9156f05e42a5af0936fdefd045721207e4e55fbf7281756aac`
