Qdrant integration with BEIR, simplifying quality checks on standard datasets

beir-qdrant

BEIR is a heterogeneous benchmark covering diverse information retrieval (IR) tasks. This project integrates BEIR with Qdrant, a vector search engine.

Installation

pip install beir-qdrant

Quick Example

The following example shows how to use BEIR with Qdrant dense search. It uses the SciFact dataset and the all-MiniLM-L6-v2 model from Sentence Transformers, loaded through FastEmbed, to generate the dense embeddings.

from beir import util
from beir.datasets.data_loader import GenericDataLoader
from beir.retrieval.evaluation import EvaluateRetrieval
from qdrant_client import QdrantClient

from beir_qdrant.retrieval.models.fastembed import DenseFastEmbedModelAdapter
from beir_qdrant.retrieval.search.dense import DenseQdrantSearch

# Download and load the dataset
dataset = "scifact"
url = "https://public.ukp.informatik.tu-darmstadt.de/thakur/BEIR/datasets/{}.zip".format(dataset)
data_path = util.download_and_unzip(url, "datasets")
corpus, queries, qrels = GenericDataLoader(data_folder=data_path).load(split="test")

# Connect to Qdrant running on localhost
qdrant_client = QdrantClient("http://localhost:6333")

# Create the retriever and evaluate it on the test set using
# one of the sentence-transformers models available in FastEmbed
model = DenseQdrantSearch(
    qdrant_client,
    model=DenseFastEmbedModelAdapter(
        model_name="sentence-transformers/all-MiniLM-L6-v2"
    ),
    collection_name="scifact-all-MiniLM-L6-v2",
    initialize=True,
)
retriever = EvaluateRetrieval(model)
results = retriever.retrieve(corpus, queries)

ndcg, _map, recall, precision = retriever.evaluate(qrels, results, retriever.k_values)

The example above uses dense embeddings, but changing the search mode only requires swapping in a different search implementation.
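
The four values returned by retriever.evaluate in the quick example are dictionaries keyed by metric name and cutoff (for example "NDCG@10"). As a minimal sketch of how they might be inspected, the helper below merges such dictionaries into an aligned text report. The name format_metrics and the sample values are hypothetical and exist only for illustration; they are not part of beir-qdrant and not measured results.

```python
from typing import Dict


def format_metrics(*metric_groups: Dict[str, float]) -> str:
    """Render BEIR-style metric dictionaries as aligned 'name  value' lines."""
    merged: Dict[str, float] = {}
    for group in metric_groups:
        merged.update(group)
    # Pad every metric name to the width of the longest one.
    width = max((len(name) for name in merged), default=0)
    return "\n".join(
        f"{name:<{width}}  {value:.4f}" for name, value in merged.items()
    )


# Placeholder values for illustration only, NOT measured results.
example_ndcg = {"NDCG@1": 0.5, "NDCG@10": 0.25}
example_recall = {"Recall@1": 0.125, "Recall@10": 1.0}
report = format_metrics(example_ndcg, example_recall)
print(report)
```

In the quick example, the same helper could be called as format_metrics(ndcg, _map, recall, precision).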

Supported Modes

Qdrant supports different search modes, including:

  • Dense search: beir_qdrant.retrieval.search.dense.DenseQdrantSearch
  • Sparse search: beir_qdrant.retrieval.search.sparse.SparseQdrantSearch
  • Multi vector search: beir_qdrant.retrieval.search.multi_vector.MultiVectorQdrantSearch
  • Hybrid search with RRF: beir_qdrant.retrieval.search.hybrid.RRFHybridQdrantSearch
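
To make the hybrid mode more concrete, here is a minimal sketch of textbook reciprocal rank fusion (RRF), which combines several ranked lists by summing 1 / (k + rank) for each document. It is an illustration under stated assumptions, not the code used by RRFHybridQdrantSearch or by Qdrant itself: the function name, the constant k = 60, the 1-based ranks, and the document ids are all assumptions made for this example.

```python
from collections import defaultdict
from typing import Dict, List


def reciprocal_rank_fusion(rankings: List[List[str]], k: int = 60) -> Dict[str, float]:
    """Fuse ranked lists of document ids into a single score per id.

    Each list contributes 1 / (k + rank) for every document it contains,
    where rank starts at 1 for the top hit.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return dict(scores)


# Hypothetical result lists from a dense and a sparse retriever.
dense_ranking = ["doc-a", "doc-b", "doc-c"]
sparse_ranking = ["doc-b", "doc-d", "doc-a"]

fused = reciprocal_rank_fusion([dense_ranking, sparse_ranking])
fused_order = sorted(fused, key=fused.get, reverse=True)
print(fused_order)
```

Because RRF looks only at ranks, it needs no score normalisation between the dense and sparse retrievers, which is why it is a common default for hybrid search.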

Download files

Source Distribution

beir_qdrant-0.5.2.tar.gz (11.3 kB)

Uploaded Source

Built Distribution

beir_qdrant-0.5.2-py3-none-any.whl (18.4 kB)

Uploaded Python 3

File details

Details for the file beir_qdrant-0.5.2.tar.gz.

File metadata

  • Download URL: beir_qdrant-0.5.2.tar.gz
  • Upload date:
  • Size: 11.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.9.19

File hashes

Hashes for beir_qdrant-0.5.2.tar.gz
Algorithm Hash digest
SHA256 e4d46a73b9e59a6bce77342f62e851fa1d6376f057957e52a1916cde5b79573b
MD5 b194b91527c03560864139cb0d8f0302
BLAKE2b-256 de5a9a060a270c0c361d75439e3933b708c2e08a60a7d5dfb316759c8c62ee94

File details

Details for the file beir_qdrant-0.5.2-py3-none-any.whl.

File metadata

  • Download URL: beir_qdrant-0.5.2-py3-none-any.whl
  • Upload date:
  • Size: 18.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.9.19

File hashes

Hashes for beir_qdrant-0.5.2-py3-none-any.whl
Algorithm Hash digest
SHA256 f8f4bc908997a166839c06b989ee0065b3472efb5dc7d74d5c06b39bc706b853
MD5 8a7ecfa6d147e568aa51093d0704536a
BLAKE2b-256 1815e113057f89ef5bf2239cd5279dfa57c481707477d3e23e24059c0f9dc9c5
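
The published digests can be used to check a downloaded file before installing it. The following is a minimal sketch using only the Python standard library; the helper names sha256_of_file and matches_published_hash are hypothetical, and the file path you pass in is assumed to be wherever you saved the download.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # iter() with a sentinel stops once read() returns empty bytes.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path: Path, expected_hex: str) -> bool:
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, assuming the source distribution was saved to the current directory, matches_published_hash(Path("beir_qdrant-0.5.2.tar.gz"), "e4d46a73b9e59a6bce77342f62e851fa1d6376f057957e52a1916cde5b79573b") should return True for an intact download.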
