SPLADE-Index ⚡
SPLADE-Index is an ultra-fast search index for SPLADE sparse retrieval models, implemented in pure Python and built on top of the BM25s library. You can use splade-index to
- ✅ Index and query up to millions of documents using any SPLADE sparse embedding (SparseEncoder) model supported by sentence-transformers, such as naver/splade-v3.
- 📀 Save your index locally and load it back from the saved files.
- 🤗 Upload your index to the Hugging Face Hub and let anyone download and use it.
- 🪶 Use memory mapping to load large indices with minimal RAM usage and no noticeable change in search latency (loading a 1-million-document index with mmap uses just 2 GB of RAM).
- ⚡ Make use of NVIDIA GPUs and PyTorch for 10x faster search compared to splade-index's CPU-based numba backend when your index contains 1 million or more documents.
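Memory mapping keeps RAM usage low because the operating system pages index data in from disk only as it is accessed, rather than reading the whole file up front. The mechanism can be illustrated with NumPy's `memmap` (a generic sketch of the technique, not splade-index's internal implementation):

```python
import os
import tempfile

import numpy as np

# Write a large-ish float32 array to disk, standing in for a saved index file.
path = os.path.join(tempfile.mkdtemp(), "scores.bin")
np.arange(1_000_000, dtype=np.float32).tofile(path)

# Open it memory-mapped: no bytes are read until a slice is accessed,
# and the OS can evict unused pages under memory pressure.
mm = np.memmap(path, dtype=np.float32, mode="r")
print(mm[:3])  # only the pages backing these elements are faulted in
```

The same idea is what makes `mmap=True` cheap: search touches only the postings it needs, so latency stays essentially unchanged while resident memory stays small.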
SPLADE
SPLADE is a neural retrieval model that learns sparse query/document expansions. Sparse representations have several advantages over dense approaches: efficient use of inverted indexes, explicit lexical matching, and interpretability. They also appear to generalize better on out-of-domain data (BEIR benchmark).
For more information about SPLADE models, please refer to the following resources.
- SPLADE: Sparse Lexical and Expansion Model for First Stage Ranking
- List of Pretrained Sparse Encoder (Sparse Embeddings) Models
- Training and Finetuning Sparse Embedding Models with Sentence Transformers v5.
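At query time, a SPLADE score is simply a dot product between the query's sparse expansion and a document's sparse expansion, which is exactly the operation an inverted index accelerates. A minimal sketch with toy token weights (real SPLADE weights come from the model and span a ~30k-token vocabulary):

```python
# Toy sparse expansions: token -> weight (illustrative values, not model output).
query = {"primate": 1.2, "indonesia": 0.9, "jungle": 0.4}
docs = {
    "doc_komodo": {"lizard": 1.1, "indonesia": 1.0},
    "doc_orangutan": {"ape": 1.3, "primate": 0.8, "indonesia": 0.7},
}

def splade_score(q, d):
    # Dot product over the tokens shared by the two sparse expansions.
    return sum(w * d[t] for t, w in q.items() if t in d)

ranked = sorted(docs, key=lambda name: splade_score(query, docs[name]), reverse=True)
print(ranked[0])  # the orangutan document matches both "primate" and "indonesia"
```

Because only shared tokens contribute, an inverted index lets the retriever visit just the documents whose expansions overlap with the query's.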
Installation
You can install splade-index with pip:
pip install splade-index
In order to use the 2x faster numba backend, install splade-index with the core dependencies:
pip install splade-index[core]
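If you are unsure whether the numba dependency is present in your environment (and hence whether the faster backend can be used), a quick check is possible with the standard library; `has_numba` below is a hypothetical helper, not part of splade-index:

```python
import importlib.util

def has_numba():
    # True when the numba JIT compiler is importable in this environment.
    return importlib.util.find_spec("numba") is not None

print("numba backend available:", has_numba())
```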
Quickstart
Here is a simple example of how to use splade-index:
from sentence_transformers import SparseEncoder
from splade_index import SPLADE
# Download a SPLADE model from the 🤗 Hub
model = SparseEncoder("rasyosef/splade-tiny")
# Create your corpus here
corpus = [
"Bonobos are intelligent primates native to the Democratic Republic of the Congo.",
"Komodo dragons are giant carnivorous lizards native to Indonesia.",
"Gelada baboons are grass-eating primates native to the highlands of Ethiopia.",
"Orangutans are highly intelligent great apes native to the rainforests of Indonesia and Malaysia.",
]
# Create the SPLADE retriever and index the corpus
retriever = SPLADE()
retriever.index(model=model, documents=corpus)
# Query the corpus
queries = ["Do any large primates come from the jungles of Indonesia?"]
# Get top-k results as a tuple of (doc ids, documents, scores). All three are arrays of shape (n_queries, k).
results = retriever.retrieve(queries, k=2)
doc_ids, result_docs, scores = results.doc_ids, results.documents, results.scores
for i in range(doc_ids.shape[1]):
    doc_id, doc, score = doc_ids[0, i], result_docs[0, i], scores[0, i]
    print(f"Rank {i+1} (score: {score:.2f}) (doc_id: {doc_id}): {doc}")
# You can save the index to a directory
retriever.save("animal_index_splade")
# ...and load it back when you need it
reloaded_retriever = SPLADE.load("animal_index_splade", model=model)
Hugging Face Integration
splade-index can naturally work with Hugging Face's huggingface_hub, allowing you to load and save your index to the model hub.
First, make sure you have a valid access token for the Hugging Face model hub. This is needed to save models to the hub, or to load private models. Once you have created it, you can add it to your environment variables:
export HF_TOKEN="hf_..."
Now, let's install the huggingface_hub library:
pip install huggingface_hub
Let's see how to use SPLADE.save_to_hub to save a SPLADE index to the Hugging Face model hub:
import os
from sentence_transformers import SparseEncoder
from splade_index import SPLADE
# Download a SPLADE model from the 🤗 Hub
model = SparseEncoder("rasyosef/splade-tiny")
# Create your corpus here
corpus = [
"Bonobos are intelligent primates native to the Democratic Republic of the Congo.",
"Komodo dragons are giant carnivorous lizards native to Indonesia.",
"Gelada baboons are grass-eating primates native to the highlands of Ethiopia.",
"Orangutans are highly intelligent great apes native to the rainforests of Indonesia and Malaysia.",
]
# Create the SPLADE retriever and index the corpus
retriever = SPLADE()
retriever.index(model=model, documents=corpus)
# Set your username and token
user = "your-username"
token = os.environ["HF_TOKEN"]
repo_id = f"{user}/splade-index-animals"
# Save the index on your huggingface account
retriever.save_to_hub(repo_id, token=token)
# You can also save it publicly with private=False
Then, you can use the following code to load a SPLADE index from the Hugging Face model hub:
import os
from sentence_transformers import SparseEncoder
from splade_index import SPLADE
# Download a SPLADE model from the 🤗 Hub
model = SparseEncoder("rasyosef/splade-tiny")
# Set your huggingface username and token
user = "your-username"
token = os.environ["HF_TOKEN"]
repo_id = f"{user}/splade-index-animals"
# Load a SPLADE index from the Hugging Face model hub
retriever = SPLADE.load_from_hub(repo_id, model=model, token=token)
# Query the corpus
queries = ["Do any large primates come from the jungles of Indonesia?"]
# Get top-k results as a tuple of (doc ids, documents, scores). All three are arrays of shape (n_queries, k).
results = retriever.retrieve(queries, k=2)
doc_ids, result_docs, scores = results.doc_ids, results.documents, results.scores
for i in range(doc_ids.shape[1]):
    doc_id, doc, score = doc_ids[0, i], result_docs[0, i], scores[0, i]
    print(f"Rank {i+1} (score: {score:.2f}) (doc_id: {doc_id}): {doc}")
10x faster search with SPLADE_GPU
For large indices with 1 million or more documents, you can use SPLADE_GPU for 10x higher search throughput (queries/second) relative to splade-index's already fast CPU-based numba backend. To use SPLADE_GPU, you need an NVIDIA GPU and a PyTorch installation with CUDA support.
from sentence_transformers import SparseEncoder
from splade_index.pytorch import SPLADE_GPU
# Download a SPLADE model from the 🤗 Hub
model = SparseEncoder("rasyosef/splade-mini", device="cuda")
# Load a SPLADE index from the Hugging Face model hub
repo_id = "rasyosef/msmarco_dev_1M_splade_index"
retriever = SPLADE_GPU.load_from_hub(
repo_id,
model=model,
mmap=True, # memory mapping enabled for low RAM usage
device="cuda"
)
# Query the corpus
queries = ["what is a corporation?", "do owls eat in the day", "average pharmacy tech salary"]
# Get top-k results as a tuple of (doc ids, documents, scores). All three are arrays of shape (n_queries, k).
results = retriever.retrieve(queries, k=5)
doc_ids, result_docs, scores = results.doc_ids, results.documents, results.scores
for i in range(doc_ids.shape[1]):
    doc_id, doc, score = doc_ids[0, i], result_docs[0, i], scores[0, i]
    print(f"Rank {i+1} (score: {score:.2f}) (doc_id: {doc_id}): {doc}")
Performance
splade-index with the numba backend gives 45% faster query times on average than the pyseismic-lsr library ("an Efficient Inverted Index for Approximate Retrieval"), even though splade-index performs exact retrieval with no approximations involved.
The query latency values shown include query encoding time using the naver/splade-v3-distilbert SPLADE sparse encoder model.
| Library | Latency per query (in milliseconds) |
|---|---|
| splade-index (numba backend) | 1.77 ms |
| splade-index (numpy backend) | 2.44 ms |
| splade-index (pytorch backend) | 2.61 ms |
| pyseismic-lsr | 3.24 ms |
The tests were conducted using 100,231 documents and 5,000 queries from the sentence-transformers/natural-questions dataset, and an NVIDIA Tesla T4 16GB GPU on Google Colab.
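Measurements of this kind boil down to timing a search function over a batch of queries and averaging. A generic sketch of such a harness follows; the retriever call is replaced by a placeholder (`dummy_search`) so the snippet is self-contained, and a few warmup calls absorb JIT compilation or GPU kernel startup before timing begins:

```python
import time

def measure_latency(search_fn, queries, warmup=3):
    # Warm up JIT compilation / GPU kernels so they don't skew the timing.
    for q in queries[:warmup]:
        search_fn(q)
    start = time.perf_counter()
    for q in queries:
        search_fn(q)
    elapsed = time.perf_counter() - start
    return 1000.0 * elapsed / len(queries)  # mean milliseconds per query

# Placeholder standing in for something like retriever.retrieve([q], k=10).
dummy_search = lambda q: sorted(range(10))
ms = measure_latency(dummy_search, ["query"] * 100)
print(f"{ms:.3f} ms/query")
```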
Examples
- splade_index_usage_example.ipynb to index and query 1,000 documents on a CPU.
- indexing_and_querying_100k_docs_with_gpu.ipynb to index and query 100,000 documents on a GPU.
SPLADE Models
You can use SPLADE-Index with any SPLADE model from the Hugging Face Hub, such as the ones below.
| Model | Size (# Params) | MSMARCO MRR@10 | BEIR-13 avg nDCG@10 |
|---|---|---|---|
| naver/splade-v3 | 110M | 40.2 | 51.7 |
| naver/splade-v3-distilbert | 67.0M | 38.7 | 50.0 |
| rasyosef/splade-small | 28.8M | 35.4 | 46.6 |
| rasyosef/splade-mini | 11.2M | 34.1 | 44.5 |
| rasyosef/splade-tiny | 4.4M | 30.9 | 40.6 |
Acknowledgement
splade-index was built on top of the bm25s library and makes use of its excellent inverted index implementation, originally used by bm25s for its many variants of the BM25 ranking algorithm.