
A Heterogeneous Benchmark for Information Retrieval



What is it?

BEIR is a heterogeneous benchmark covering diverse sentence- and passage-level IR tasks. It also provides a common, easy-to-use framework for evaluating your NLP models on them.

The package takes care of downloading, hosting, and preprocessing the datasets, and provides each one to you as a single, easy-to-understand zip folder. We currently provide 15 diverse IR datasets used in both academia and industry, with more to come. The package also provides an easy framework to evaluate your models against competitive baselines, including Sentence-Transformers (SBERT), Dense Passage Retrieval (DPR), Universal Sentence Encoder (USE-QA) and Elasticsearch.

Worried about your dataset or model not present in the benchmark?

Worry not! You can easily add your dataset to the benchmark by following this data format (here). You are also free to evaluate your own model: it only needs to return a dictionary with mappings (here), and you can then evaluate it using our easy plugin code.

Want us to add a new dataset or a new model? Feel free to post an issue here or make a pull request!

Installation

Install via pip:

pip install beir

If you want to build from source, use:

$ git clone https://github.com/benchmarkir/beir.git
$ cd beir
$ pip install -e .

Tested with Python versions 3.6 and 3.7.

Getting Started

Click here to view 15+ Datasets available in BEIR.

Try it out live with our Google Colab Example.

1. Data Downloading and Loading

First download and unzip a dataset. Load the dataset with our data loader.

from beir import util
from beir.datasets.data_loader import GenericDataLoader

url = "https://public.ukp.informatik.tu-darmstadt.de/thakur/BEIR/datasets/trec-covid.zip"
out_dir = "datasets"
data_path = util.download_and_unzip(url, out_dir)

#### Provide the data_path where trec-covid has been downloaded and unzipped
corpus, queries, qrels = GenericDataLoader(data_path).load(split="test")

2. Model Loading

Now you can use Sentence-Transformers, DPR or USE-QA as your dense retriever model.

from beir.retrieval import models
from beir.retrieval.search.dense import DenseRetrievalExactSearch as DRES

model = DRES(models.SentenceBERT("distilroberta-base-msmarco-v2"))

# model = DRES(models.DPR(
#     'facebook/dpr-question_encoder-single-nq-base',
#     'facebook/dpr-ctx_encoder-single-nq-base'))

# model = DRES(models.UseQA("https://tfhub.dev/google/universal-sentence-encoder-qa/3"))

Or if you wish to use lexical retrieval, we provide support with Elasticsearch.

from beir.retrieval.search.lexical import BM25Search as BM25

#### Provide parameters for elastic-search
hostname = "your-es-hostname-here" # localhost for default
index_name = "your-index-name-here"
model = BM25(index_name=index_name, hostname=hostname)
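To make the dense option above more concrete, here is a minimal sketch of what an exact (brute-force) dense search does: it scores every document embedding against every query embedding and keeps the best matches. This is an illustration only, not the package's implementation. The function names `cosine` and `exact_search`, the choice of cosine similarity, and the toy 2-dimensional embeddings are all assumptions made for the example.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def exact_search(query_vecs, doc_vecs, top_k=10):
    """Score every document against every query and keep the top_k per query.

    Returns {query_id: {doc_id: score}}, the same nesting as qrels.
    """
    results = {}
    for query_id, query_vec in query_vecs.items():
        scored = [(doc_id, cosine(query_vec, doc_vec))
                  for doc_id, doc_vec in doc_vecs.items()]
        # Highest similarity first.
        scored.sort(key=lambda pair: pair[1], reverse=True)
        results[query_id] = dict(scored[:top_k])
    return results

# Toy embeddings, made up for illustration (hypothetical IDs and vectors).
toy_docs = {"d1": [1.0, 0.0], "d2": [0.0, 1.0], "d3": [1.0, 1.0]}
toy_queries = {"q1": [1.0, 0.1]}
toy_results = exact_search(toy_queries, toy_docs, top_k=2)
print(toy_results)
```

In practice the embeddings would come from the encoder you selected, and the nested dictionary is the shape the evaluation step expects.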

3. Retriever Search and Evaluation

The format of results is identical to that of qrels, so you can evaluate your IR performance using qrels and results. We report the NDCG@10 score for all datasets; for details on why, see our upcoming paper.

from beir.retrieval.evaluation import EvaluateRetrieval

retriever = EvaluateRetrieval(model)
results = retriever.retrieve(corpus, queries)

#### Evaluate your retrieval using NDCG@k, MAP@K ...
ndcg, _map, recall, precision = retriever.evaluate(qrels, results, retriever.k_values)
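As a rough illustration of what an NDCG@k score measures, the sketch below computes it directly from a qrels dictionary and a results dictionary of the shapes described above. It is not the package's evaluation code. The helper names `dcg` and `ndcg_at_k` are hypothetical, the linear-gain variant of DCG is an assumption, and the retrieval scores in `toy_results` are made up.

```python
import math

def dcg(gains):
    """Discounted cumulative gain of a ranked list of relevance grades."""
    return sum(g / math.log2(rank + 2) for rank, g in enumerate(gains))

def ndcg_at_k(qrels, results, k=10):
    """Mean NDCG@k over all queries that appear in qrels."""
    scores = []
    for query_id, judged in qrels.items():
        # Rank the retrieved documents by descending score and cut at k.
        ranked = sorted(results.get(query_id, {}).items(),
                        key=lambda kv: kv[1], reverse=True)[:k]
        gains = [judged.get(doc_id, 0) for doc_id, _ in ranked]
        # The ideal ranking lists the judged documents by descending grade.
        ideal_dcg = dcg(sorted(judged.values(), reverse=True)[:k])
        scores.append(dcg(gains) / ideal_dcg if ideal_dcg > 0 else 0.0)
    return sum(scores) / len(scores) if scores else 0.0

# Toy inputs: gold grades per query, and made-up retrieval scores.
toy_qrels = {"1": {"005b2j4b": 2, "00fmeepz": 1}}
toy_results = {"1": {"00fmeepz": 0.9, "005b2j4b": 0.7, "ug7v899j": 0.4}}
toy_ndcg = ndcg_at_k(toy_qrels, toy_results, k=10)
print(round(toy_ndcg, 4))
```

Here the less relevant document is ranked above the more relevant one, so the score falls below 1.0; swapping the two scores would give a perfect ranking.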

Examples

For all examples, see below:

All in One

Retrieval

Generation

Datasets

Available datasets include:

Data Formats

from beir.datasets.data_loader import GenericDataLoader

data_path = "datasets/trec-covid/"
corpus, queries, qrels = GenericDataLoader(data_path).load(split="test")

# Corpus
for doc_id, doc_metadata in corpus.items():
    print(doc_id, doc_metadata)
# ug7v899j  {"title": "Clinical features of culture-proven Mycoplasma...", "text": "This retrospective chart review describes the epidemiology..."}
# 02tnwd4m  {"title": "Nitric oxide: a pro-inflammatory mediator in lung disease?", "text": "Inflammatory diseases of the respiratory tract are commonly associated..."}
# ...

# Queries
for query_id, query_text in queries.items():
    print(query_id, query_text)
# 1     what is the origin of COVID-19?
# 2     how does the coronavirus respond to changes in the weather?
# ...

# Query Relevance Judgements (Qrels)
for query_id, metadata in qrels.items():
    for doc_id, gold_score in metadata.items():
        print(query_id, doc_id, gold_score)
# 1     005b2j4b    2
# 1     00fmeepz    1
# ...
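If you are preparing your own dataset, it can help to check that the three dictionaries agree with each other before running an evaluation. The sketch below assumes only the in-memory shapes printed above; `check_dataset` is a hypothetical helper, not part of the package, and the toy texts are shortened placeholders.

```python
def check_dataset(corpus, queries, qrels):
    """Return a list of problems; an empty list means the shapes are consistent."""
    problems = []
    for doc_id, doc in corpus.items():
        # Each corpus entry is expected to carry a "text" string.
        if not isinstance(doc.get("text"), str):
            problems.append(f"corpus[{doc_id!r}] has no 'text' string")
    for query_id, judged in qrels.items():
        if query_id not in queries:
            problems.append(f"qrels query {query_id!r} is missing from queries")
        for doc_id, gold_score in judged.items():
            if doc_id not in corpus:
                problems.append(f"qrels doc {doc_id!r} is missing from corpus")
            if not isinstance(gold_score, int):
                problems.append(f"gold score for ({query_id!r}, {doc_id!r}) is not an int")
    return problems

# Toy data in the shapes shown above (placeholder titles and texts).
toy_corpus = {
    "005b2j4b": {"title": "Doc A", "text": "First placeholder document."},
    "00fmeepz": {"title": "Doc B", "text": "Second placeholder document."},
}
toy_queries = {"1": "what is the origin of COVID-19?"}
toy_qrels = {"1": {"005b2j4b": 2, "00fmeepz": 1}}

print(check_dataset(toy_corpus, toy_queries, toy_qrels))
```

An empty list means every judged query and document ID resolves to an entry in the other two dictionaries.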

Benchmarking

The Table shows the NDCG@10 scores.

Domain Dataset BM25 SBERT USE-QA DPR
Bio-Medical TREC-COVID 0.616 0.461
Bio-Medical BioASQ
Bio-Medical NFCorpus 0.294 0.233
Question Answering NQ 0.481 0.530
Question Answering HotpotQA 0.601 0.419
News NewsQA 0.457 0.263
Twitter Signal-1M 0.477 0.272
Finance FiQA-2018 0.223
Argument ArguAna 0.441 0.415
Argument Touche-2020 0.605
Duplicate Question CQaDupstack 0.069 0.061
Duplicate Question Quora
Entity DBPedia-v2 0.285 0.261
Scientific SCIDOCS
Claim Verification FEVER 0.649 0.601
Claim Verification Climate-FEVER 0.179 0.192

Citing & Authors

The main contributors of this repository are:

Contact person: Nandan Thakur, nandant@gmail.com

https://www.ukp.tu-darmstadt.de/

Don't hesitate to send us an e-mail or report an issue, if something is broken (and it shouldn't be) or if you have further questions.

This repository contains experimental software and is published for the sole purpose of giving additional background details on the respective publication.
