Qdrant integration with BEIR, simplifying quality checks on standard datasets
beir-qdrant
BEIR is a heterogeneous benchmark containing diverse information retrieval (IR) tasks. This project integrates BEIR with Qdrant, a vector search engine.
Installation
pip install beir-qdrant
Quick Example
The following example demonstrates how to use BEIR with Qdrant dense search. It uses the SciFact dataset and the all-MiniLM-L6-v2 model from Sentence Transformers to generate the dense embeddings.
from beir import util
from beir.datasets.data_loader import GenericDataLoader
from beir.retrieval.evaluation import EvaluateRetrieval
from qdrant_client import QdrantClient
from beir_qdrant.retrieval.models.fastembed import DenseFastEmbedModelAdapter
from beir_qdrant.retrieval.search.dense import DenseQdrantSearch
# Download and load the dataset
dataset = "scifact"
url = "https://public.ukp.informatik.tu-darmstadt.de/thakur/BEIR/datasets/{}.zip".format(dataset)
data_path = util.download_and_unzip(url, "datasets")
corpus, queries, qrels = GenericDataLoader(data_folder=data_path).load(split="test")
# Connect to Qdrant running on localhost
qdrant_client = QdrantClient("http://localhost:6333")
# Create the retriever and evaluate it on the test set using
# one of the sentence-transformers models available in FastEmbed
model = DenseQdrantSearch(
    qdrant_client,
    model=DenseFastEmbedModelAdapter(
        model_name="sentence-transformers/all-MiniLM-L6-v2"
    ),
    collection_name="scifact-all-MiniLM-L6-v2",
    initialize=True,
)
retriever = EvaluateRetrieval(model)
results = retriever.retrieve(corpus, queries)
ndcg, _map, recall, precision = retriever.evaluate(qrels, results, retriever.k_values)
The example above demonstrates how to use the dense embeddings, but changing the search mode is as simple as changing the model implementation.
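The qrels and results objects passed to retriever.evaluate in the Quick Example follow BEIR's nested-dictionary layout. The sketch below uses made-up IDs and scores to show that layout, along with a simplified recall@k computed over it. The recall_at_k helper is hypothetical and purely illustrative; BEIR computes its own metrics internally, and its numbers may differ in detail.

```python
from typing import Dict

# Toy data in the nested-dictionary layout used by BEIR (all IDs are made up).
# qrels: query ID -> {document ID -> relevance label}
qrels: Dict[str, Dict[str, int]] = {
    "q1": {"d1": 1, "d3": 1},
    "q2": {"d2": 1},
}
# results: query ID -> {document ID -> retrieval score}
results: Dict[str, Dict[str, float]] = {
    "q1": {"d1": 0.92, "d2": 0.80, "d3": 0.41},
    "q2": {"d3": 0.77, "d2": 0.65, "d1": 0.10},
}


def recall_at_k(
    qrels: Dict[str, Dict[str, int]],
    results: Dict[str, Dict[str, float]],
    k: int,
) -> float:
    """Mean fraction of relevant documents found in the top-k results per query.

    Hypothetical helper for illustration, not part of BEIR or beir-qdrant.
    """
    per_query = []
    for query_id, relevant in qrels.items():
        relevant_ids = {doc_id for doc_id, label in relevant.items() if label > 0}
        if not relevant_ids:
            continue  # skip queries without any relevant document
        scored = results.get(query_id, {})
        # Rank documents by descending score and keep the first k.
        top_k = sorted(scored, key=scored.get, reverse=True)[:k]
        per_query.append(len(relevant_ids & set(top_k)) / len(relevant_ids))
    return sum(per_query) / len(per_query) if per_query else 0.0


for k in (1, 2, 3):
    print(f"Recall@{k}: {recall_at_k(qrels, results, k):.2f}")
```

Because every search mode returns results in this same layout, the evaluation step stays unchanged when the retriever is swapped.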
Supported Modes
Qdrant supports different search modes, including:
- Dense search: beir_qdrant.retrieval.search.dense.DenseQdrantSearch
- Sparse search: beir_qdrant.retrieval.search.sparse.SparseQdrantSearch
- Multi vector search: beir_qdrant.retrieval.search.multi_vector.MultiVectorQdrantSearch
- Hybrid search with RRF: beir_qdrant.retrieval.search.hybrid.RRFHybridQdrantSearch
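RRF stands for Reciprocal Rank Fusion, a way of merging several ranked lists (for example, one from dense search and one from sparse search) using only the rank positions. The following is a minimal sketch of the general technique in plain Python, not the code used by RRFHybridQdrantSearch or by Qdrant itself. The function name, the document IDs, and the constant k=60 (a commonly used default) are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, List


def reciprocal_rank_fusion(rankings: List[List[str]], k: int = 60) -> Dict[str, float]:
    """Fuse ranked lists of document IDs: each list adds 1 / (k + rank) per document.

    Hypothetical helper for illustration; k=60 is an assumed, commonly used constant.
    """
    fused: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            fused[doc_id] += 1.0 / (k + rank)
    # Return documents ordered by descending fused score.
    return dict(sorted(fused.items(), key=lambda item: item[1], reverse=True))


# Made-up rankings, best match first.
dense_ranking = ["d1", "d2", "d3"]
sparse_ranking = ["d3", "d1", "d4"]

fused = reciprocal_rank_fusion([dense_ranking, sparse_ranking])
for doc_id, score in fused.items():
    print(doc_id, round(score, 5))
```

Documents that rank well in both lists (d1 and d3 here) rise above those that appear in only one, without any need to make the raw dense and sparse scores comparable.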