Qdrant integration with BEIR, simplifying quality checks on standard datasets

beir-qdrant

BEIR is a heterogeneous benchmark covering diverse information retrieval (IR) tasks. This project integrates BEIR with Qdrant, a vector search engine, so that retrieval quality can be measured on standard datasets.

Installation

pip install beir-qdrant

Quick Example

The following example demonstrates how to use BEIR with Qdrant dense search. It uses the SciFact dataset and the all-MiniLM-L6-v2 model from Sentence Transformers to generate the dense embeddings.

from beir import util
from beir.datasets.data_loader import GenericDataLoader
from beir.retrieval.evaluation import EvaluateRetrieval
from qdrant_client import QdrantClient

from beir_qdrant.retrieval.models.fastembed import DenseFastEmbedModelAdapter
from beir_qdrant.retrieval.search.dense import DenseQdrantSearch

# Download and load the dataset
dataset = "scifact"
url = "https://public.ukp.informatik.tu-darmstadt.de/thakur/BEIR/datasets/{}.zip".format(dataset)
data_path = util.download_and_unzip(url, "datasets")
corpus, queries, qrels = GenericDataLoader(data_folder=data_path).load(split="test")

# Connect to Qdrant running on localhost
qdrant_client = QdrantClient("http://localhost:6333")

# Create the retriever and evaluate it on the test set using
# one of the sentence-transformers models available in FastEmbed
model = DenseQdrantSearch(
    qdrant_client,
    model=DenseFastEmbedModelAdapter(
        model_name="sentence-transformers/all-MiniLM-L6-v2"
    ),
    collection_name="scifact-all-MiniLM-L6-v2",
    initialize=True,
)
retriever = EvaluateRetrieval(model)
results = retriever.retrieve(corpus, queries)

# Compute NDCG, MAP, recall, and precision at each cutoff in retriever.k_values
ndcg, _map, recall, precision = retriever.evaluate(qrels, results, retriever.k_values)

The example above uses dense embeddings, but switching to another search mode only requires swapping the search class and its model adapter; the dataset loading and evaluation code stay the same.
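
Whichever search mode is used, the evaluation step consumes the same two mappings: qrels (query ID to relevance labels per document ID) and results (query ID to retrieval scores per document ID). The sketch below shows those shapes on made-up toy data and computes a simplified recall@k by hand. The helper names top_k and recall_at_k are hypothetical, and this is an illustration of the metric, not how BEIR computes it internally.

```python
from typing import Dict, List

# Toy data in the shapes used by the quick example (all IDs and scores are made up):
# qrels maps query ID -> {document ID: relevance label},
# results maps query ID -> {document ID: retrieval score}.
qrels: Dict[str, Dict[str, int]] = {
    "q1": {"d1": 1, "d3": 1},
    "q2": {"d2": 1},
}
results: Dict[str, Dict[str, float]] = {
    "q1": {"d1": 0.9, "d2": 0.8, "d3": 0.1},
    "q2": {"d1": 0.7, "d2": 0.6, "d3": 0.5},
}


def top_k(scores: Dict[str, float], k: int) -> List[str]:
    """Return the k document IDs with the highest scores."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [doc_id for doc_id, _ in ranked[:k]]


def recall_at_k(
    qrels: Dict[str, Dict[str, int]],
    results: Dict[str, Dict[str, float]],
    k: int,
) -> float:
    """Mean fraction of relevant documents found in the top k, over all queries."""
    per_query = []
    for query_id, relevant in qrels.items():
        relevant_ids = {doc_id for doc_id, label in relevant.items() if label > 0}
        if not relevant_ids:
            continue  # a query without relevant documents cannot contribute
        retrieved = set(top_k(results.get(query_id, {}), k))
        per_query.append(len(relevant_ids & retrieved) / len(relevant_ids))
    return sum(per_query) / len(per_query) if per_query else 0.0


for cutoff in (1, 2, 3):
    print(f"Recall@{cutoff}: {recall_at_k(qrels, results, cutoff):.2f}")
```

On real runs, the dictionaries returned by retriever.evaluate can be inspected directly; the hand-written metric above is only meant to make the input format concrete.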

Supported Modes

The following Qdrant search modes are supported, each through its own class:

  • Dense search: beir_qdrant.retrieval.search.dense.DenseQdrantSearch
  • Sparse search: beir_qdrant.retrieval.search.sparse.SparseQdrantSearch
  • Multi vector search: beir_qdrant.retrieval.search.multi_vector.MultiVectorQdrantSearch
  • Hybrid search with RRF: beir_qdrant.retrieval.search.hybrid.RRFHybridQdrantSearch
  • Muvera postprocessing for multi vector search: beir_qdrant.retrieval.models.fastembed.MuveraPostprocessorAdapter
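
To make the hybrid mode more concrete, here is a minimal sketch of Reciprocal Rank Fusion (RRF), the technique named in the list above. It is a generic, pure-Python illustration and not the code that Qdrant or beir-qdrant runs. The function name reciprocal_rank_fusion, the constant k = 60, and the toy rankings are assumptions made for this example.

```python
from collections import defaultdict
from typing import Dict, List


def reciprocal_rank_fusion(rankings: List[List[str]], k: int = 60) -> Dict[str, float]:
    """Fuse several ranked lists of document IDs into one score per document.

    Each document receives 1 / (k + rank) from every list it appears in,
    with ranks starting at 1. The default k = 60 is a commonly used value
    and an assumption of this sketch.
    """
    fused: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            fused[doc_id] += 1.0 / (k + rank)
    return dict(fused)


# Toy rankings standing in for a dense and a sparse retriever (made-up IDs)
dense_ranking = ["d1", "d2", "d3"]
sparse_ranking = ["d2", "d4", "d1"]

fused = reciprocal_rank_fusion([dense_ranking, sparse_ranking])

# Documents ordered from the highest fused score to the lowest
ordered = sorted(fused, key=fused.get, reverse=True)
print(ordered)
```

Because RRF relies only on ranks, it can combine dense and sparse results without normalizing their scores, which live on different scales.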
