
A Heterogeneous Benchmark for Information Retrieval



What is it?

BEIR is a heterogeneous benchmark for diverse sentence- and passage-level IR tasks. It also provides a common, easy framework for evaluating your NLP models on them.

The package takes care of downloading, hosting, and preprocessing the datasets, and provides each one to you as a single, easy-to-understand zip folder. We take care of transforming the datasets and provide 15 diverse datasets used for IR in both academia and industry, with more to come. The package also provides an easy framework to evaluate your models against competitive baselines, including Sentence-Transformers (SBERT), Dense Passage Retrieval (DPR), Universal Sentence Encoder (USE-QA), and Elasticsearch.

Worried about your dataset or model not present in the benchmark?

Worry not! You can easily add your dataset to the benchmark by following the data format (here). You are also free to evaluate your own model: it only needs to return a dictionary with mappings (here), and you can then evaluate your IR model using our easy plug-in code.
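
For illustration, here is a minimal sketch of such a model. The class and method names below are hypothetical (not part of the BEIR API); only the shape of the returned dictionary, {query_id: {doc_id: score}}, follows the benchmark's format:

from typing import Dict

class MyCustomModel:
    # Hypothetical custom retriever: what matters to the benchmark is
    # returning a nested dictionary {query_id: {doc_id: score}}.
    def search(self, corpus: Dict[str, Dict[str, str]],
               queries: Dict[str, str], top_k: int = 10) -> Dict[str, Dict[str, float]]:
        results = {}
        for query_id, query_text in queries.items():
            # Replace this with real scoring; every document gets a
            # dummy score of 0.0 here just to show the structure.
            results[query_id] = {doc_id: 0.0 for doc_id in list(corpus)[:top_k]}
        return results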

Want us to add a new dataset or a new model? Feel free to post an issue here or make a pull request!

Installation

Install via pip:

pip install beir

If you want to build from source, use:

$ git clone https://github.com/benchmarkir/beir.git
$ pip install -e .

Tested with Python versions 3.6 and 3.7.

Getting Started

Try it out live with our Google Colab Example.

First, download and unzip a dataset. Click here to view all available datasets.

from beir import util

# Public download link for the preprocessed TREC-COVID dataset
url = "https://public.ukp.informatik.tu-darmstadt.de/thakur/BEIR/datasets/trec-covid.zip"
out_path = "datasets/trec-covid.zip"
out_dir = "datasets"

util.download_url(url, out_path)  # download the zip archive
util.unzip(out_path, out_dir)     # extract it into datasets/trec-covid/
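
As a quick sanity check, you can list what was extracted; this sketch assumes the archive unpacks into datasets/trec-covid/ (BEIR datasets typically ship a corpus file, a queries file, and relevance judgements):

import os

# List the files extracted from the zip archive.
for name in sorted(os.listdir("datasets/trec-covid")):
    print(name)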

Then load the dataset using our Generic Data Loader. (Wonderful, right?)

from beir.datasets.data_loader import GenericDataLoader

data_path = "datasets/trec-covid/"
corpus, queries, qrels = GenericDataLoader(data_path).load(split="test")
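
As an optional sanity check (plain Python, since all three objects are dictionaries keyed by ID):

# Print how much data was loaded for the chosen split.
print(f"{len(corpus)} documents, {len(queries)} queries, "
      f"{len(qrels)} queries with relevance judgements")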

Now you can use Sentence-Transformers, DPR, or USE-QA as your dense retriever model. The format of results is identical to that of qrels.

from beir.retrieval.evaluation import EvaluateRetrieval

retriever = EvaluateRetrieval(model="sbert", model_name="distilroberta-base-msmarco-v2") 
# retriever = EvaluateRetrieval(model="dpr")
# retriever = EvaluateRetrieval(model="use-qa")

results = retriever.retrieve(corpus, queries, qrels)
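
Since results mirrors the qrels structure ({query_id: {doc_id: score}}), you can peek at the top-scoring documents for any query. A minimal sketch using only standard Python:

# Show the three highest-scoring documents for the first query.
first_query_id = next(iter(results))
top_hits = sorted(results[first_query_id].items(),
                  key=lambda item: item[1], reverse=True)[:3]
print(first_query_id, top_hits)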

Finally, after retrieving, you can evaluate your IR performance using qrels and results. We report the NDCG@10 score for all datasets; for more details on why, see our upcoming paper.
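
For reference, one common formulation of NDCG@k (gain functions vary slightly across evaluation tools) normalizes the discounted cumulative gain of the top-k ranking by that of the ideal ranking:

\mathrm{NDCG@k} = \frac{\mathrm{DCG@k}}{\mathrm{IDCG@k}}, \qquad \mathrm{DCG@k} = \sum_{i=1}^{k} \frac{2^{rel_i} - 1}{\log_2(i + 1)}

where rel_i is the graded relevance (the qrels gold score) of the document at rank i.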

ndcg, _map, recall, precision = retriever.evaluate(qrels, results, retriever.k_values)

for key, value in ndcg.items():
    print(key, value) 
# ndcg@1    0.3456
# ndcg@3    0.4567
# ...

Examples

For all examples, see below:

Retrieval

Datasets

Available datasets are listed in the Benchmarking table below.

Data Formats

from beir.datasets.data_loader import GenericDataLoader

data_path = "datasets/trec-covid/"
corpus, queries, qrels = GenericDataLoader(data_path).load(split="test")

# Corpus
for doc_id, doc_metadata in corpus.items():
    print(doc_id, doc_metadata)
# ug7v899j  {"title": "Clinical features of culture-proven Mycoplasma...", "text": "This retrospective chart review describes the epidemiology..."}
# 02tnwd4m  {"title": "Nitric oxide: a pro-inflammatory mediator in lung disease?", "text": "Inflammatory diseases of the respiratory tract are commonly associated..."}
# ...

# Queries
for query_id, query_text in queries.items():
    print(query_id, query_text)
# 1     what is the origin of COVID-19?
# 2     how does the coronavirus respond to changes in the weather?
# ...

# Query Relevance Judgements (Qrels)
for query_id, metadata in qrels.items():
    for doc_id, gold_score in metadata.items():
        print(query_id, doc_id, gold_score)
# 1     005b2j4b    2
# 1     00fmeepz    1
# ...
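
Because all three objects are plain dictionaries, you can also mock up a tiny custom dataset in memory in the same shape before packaging it in the data format above (the IDs and text below are made up):

# A tiny hand-built dataset in the same in-memory shape.
corpus = {
    "doc1": {"title": "Coronavirus origins", "text": "Evidence on where the virus first emerged..."},
    "doc2": {"title": "Weather and transmission", "text": "Seasonal effects on viral spread..."},
}
queries = {"q1": "what is the origin of COVID-19?"}
qrels = {"q1": {"doc1": 2, "doc2": 0}}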

Benchmarking

Domain               Dataset         BM25   SBERT   USE-QA   DPR
Bio-Medical          TREC-COVID
                     BioASQ
                     NFCorpus
Question Answering   NQ
                     HotpotQA
News                 NewsQA
Twitter              Signal-1M
Finance              FiQA-2018
Argument             ArguAna
                     Touche-2020
Duplicate Question   CQaDupstack
                     Quora
Entity               DBPedia-v2
Scientific           SCIDOCS
Claim Verification   FEVER
                     Climate-FEVER

Citing & Authors

The main contributors to this repository are:

Contact person: Nandan Thakur, nandant@gmail.com

https://www.ukp.tu-darmstadt.de/

Don't hesitate to send us an e-mail or report an issue if something is broken (and it shouldn't be) or if you have further questions.

This repository contains experimental software and is published for the sole purpose of giving additional background details on the respective publication.
