
Qdrant integration with BEIR, simplifying quality checks on standard datasets


beir-qdrant

BEIR is a heterogeneous benchmark containing diverse information retrieval (IR) tasks. This project integrates BEIR with Qdrant, a vector search engine.

Installation

pip install beir-qdrant

Quick Example

The following example demonstrates how to use BEIR with Qdrant dense search. It uses the SciFact dataset and the all-MiniLM-L6-v2 model from Sentence Transformers, loaded through FastEmbed, to generate the dense embeddings.

from beir import util
from beir.datasets.data_loader import GenericDataLoader
from beir.retrieval.evaluation import EvaluateRetrieval
from qdrant_client import QdrantClient

from beir_qdrant.retrieval.models.fastembed import DenseFastEmbedModelAdapter
from beir_qdrant.retrieval.search.dense import DenseQdrantSearch

# Download and load the dataset
dataset = "scifact"
url = "https://public.ukp.informatik.tu-darmstadt.de/thakur/BEIR/datasets/{}.zip".format(dataset)
data_path = util.download_and_unzip(url, "datasets")
corpus, queries, qrels = GenericDataLoader(data_folder=data_path).load(split="test")

# Connect to Qdrant running on localhost
qdrant_client = QdrantClient("http://localhost:6333")

# Create the retriever and evaluate it on the test set using
# one of the sentence-transformers models available in FastEmbed
model = DenseQdrantSearch(
    qdrant_client,
    model=DenseFastEmbedModelAdapter(
        model_name="sentence-transformers/all-MiniLM-L6-v2"
    ),
    collection_name="scifact-all-MiniLM-L6-v2",
    initialize=True,
)
retriever = EvaluateRetrieval(model)
results = retriever.retrieve(corpus, queries)

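# Compute NDCG, MAP, recall and precision at each cutoff in retriever.k_values
ndcg, _map, recall, precision = retriever.evaluate(qrels, results, retriever.k_values)
print(ndcg, _map, recall, precision, sep="\n")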

The example above uses dense embeddings, but switching to another search mode only requires changing the model implementation.
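
Whichever search mode is used, evaluation works on the same two structures: qrels maps each query id to a dictionary of document ids and relevance grades, and results maps each query id to a dictionary of document ids and scores. The sketch below is a simplified, pure-Python recall@k over those structures, meant only to illustrate what the metrics measure. BEIR computes its metrics with pytrec_eval, so this is not its implementation and edge cases may be handled differently. The function name recall_at_k and the toy data are made up for illustration.

```python
def recall_at_k(qrels, results, k):
    """Mean fraction of relevant documents found in the top-k results per query."""
    per_query = []
    for query_id, judged in qrels.items():
        # Documents with a positive relevance grade count as relevant
        relevant_ids = {doc_id for doc_id, grade in judged.items() if grade > 0}
        if not relevant_ids:
            continue  # skip queries without any relevant document
        scored = results.get(query_id, {})
        # Rank the retrieved documents by score, highest first, and cut at k
        top_k = sorted(scored, key=scored.get, reverse=True)[:k]
        hits = len(relevant_ids.intersection(top_k))
        per_query.append(hits / len(relevant_ids))
    return sum(per_query) / len(per_query) if per_query else 0.0


# Hypothetical toy data in the same shape as BEIR's qrels and results
toy_qrels = {"q1": {"d1": 1, "d2": 1}, "q2": {"d3": 1}}
toy_results = {
    "q1": {"d1": 0.9, "d4": 0.8, "d2": 0.1},
    "q2": {"d5": 0.7, "d3": 0.6},
}

for cutoff in (1, 2, 3):
    print(f"Recall@{cutoff}: {recall_at_k(toy_qrels, toy_results, cutoff):.2f}")
```

The dictionaries returned by retriever.evaluate are keyed in a similar way, for example "Recall@10", with one entry per cutoff in k_values.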

Supported Modes

Qdrant supports several search modes, each with a corresponding class in this package:

  • Dense search: beir_qdrant.retrieval.search.dense.DenseQdrantSearch
  • Sparse search: beir_qdrant.retrieval.search.sparse.SparseQdrantSearch
  • Multi vector search: beir_qdrant.retrieval.search.multi_vector.MultiVectorQdrantSearch
  • Hybrid search with RRF: beir_qdrant.retrieval.search.hybrid.RRFHybridQdrantSearch
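
To give an idea of what the RRF in the hybrid mode refers to, here is a minimal sketch of Reciprocal Rank Fusion in plain Python. It merges several ranked lists by giving each document the sum of 1 / (k + rank) over the lists it appears in. This is only an illustration of the general technique, not the code used by RRFHybridQdrantSearch or by Qdrant itself. The function name, the constant k=60 (a common choice in the literature) and the sample rankings are assumptions.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one scored ranking.

    Each document receives the sum of 1 / (k + rank) over the lists it
    appears in, with ranks starting at 1.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document id for determinism
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical rankings, e.g. one from a dense and one from a sparse search
dense_ranking = ["doc_a", "doc_b", "doc_c"]
sparse_ranking = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([dense_ranking, sparse_ranking])
for doc_id, score in fused:
    print(f"{doc_id}: {score:.5f}")
```

Because only ranks are used, the fusion does not depend on the dense and sparse scores being on comparable scales.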
