Qdrant integration with BEIR, simplifying quality checks on standard datasets
beir-qdrant
BEIR is a heterogeneous benchmark containing diverse IR tasks. This project integrates BEIR with Qdrant, a vector search engine.
Installation
pip install beir-qdrant
Quick Example
The following example demonstrates how to use BEIR with Qdrant dense search. It uses the SciFact dataset and the all-MiniLM-L6-v2 model from Sentence Transformers to generate the embeddings.
from beir import util
from beir.datasets.data_loader import GenericDataLoader
from beir.retrieval.evaluation import EvaluateRetrieval
from qdrant_client import QdrantClient
from beir_qdrant.retrieval.search.dense import DenseQdrantSearch
# Download and load the dataset
dataset = "scifact"
url = "https://public.ukp.informatik.tu-darmstadt.de/thakur/BEIR/datasets/{}.zip".format(dataset)
data_path = util.download_and_unzip(url, "datasets")
corpus, queries, qrels = GenericDataLoader(data_folder=data_path).load(split="test")
# Connect to Qdrant running on localhost
qdrant_client = QdrantClient("http://localhost:6333")
# Create the retriever and evaluate it on the test set
model = DenseQdrantSearch(
    qdrant_client,
    collection_name="scifact-all-MiniLM-L6-v2",
    dense_model_name="sentence-transformers/all-MiniLM-L6-v2",
    initialize=True,
)
retriever = EvaluateRetrieval(model)
results = retriever.retrieve(corpus, queries)
ndcg, _map, recall, precision = retriever.evaluate(qrels, results, retriever.k_values)
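BEIR's `evaluate` call returns four dictionaries keyed by metric name and cutoff (for example `NDCG@10` or `Recall@100`). The sketch below is one possible way to print them as a single aligned table. The `format_metrics` helper is hypothetical and is not part of beir-qdrant or BEIR.

```python
from typing import Dict


def format_metrics(*metric_groups: Dict[str, float]) -> str:
    """Render metric dictionaries as aligned "name  value" lines.

    Hypothetical helper (not part of beir-qdrant or BEIR). It assumes each
    argument maps a metric name such as "NDCG@10" to a float score.
    """
    # Merge all groups into one mapping, preserving insertion order.
    merged: Dict[str, float] = {}
    for group in metric_groups:
        merged.update(group)
    if not merged:
        return ""
    # Pad every name to the longest one so the values line up.
    width = max(len(name) for name in merged)
    return "\n".join(
        f"{name:<{width}}  {value:.4f}" for name, value in merged.items()
    )
```

With the variables from the example above, this could be called as `print(format_metrics(ndcg, _map, recall, precision))`.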
Supported Modes
Qdrant supports different search modes, including:
- Dense search: beir_qdrant.retrieval.search.dense.DenseQdrantSearch
- Sparse search: beir_qdrant.retrieval.search.sparse.SparseQdrantSearch
- BM42 search: beir_qdrant.retrieval.search.sparse.BM42Search