autoqrels
autoqrels is a tool for automatically inferring query relevance assessments (qrels).
Currently, it supports the one-shot labeling approach (1SL) presented in MacAvaney and Soldaini, One-Shot Labeling for Automatic Relevance Estimation, SIGIR 2023.
This package adheres to the ir-measures API, which means it can be directly used by various tools, such as PyTerrier.
Getting started
You can install autoqrels using pip:
pip install autoqrels
You can also work with the repository locally:
git clone https://github.com/seanmacavaney/autoqrels.git
cd autoqrels
python setup.py develop
API
The primary interface in autoqrels is autoqrels.Labeler. A Labeler exposes a method, infer_qrels(run, qrels), which returns a new set of qrels that covers the provided run:
- run is a Pandas DataFrame with the columns query_id (str), doc_id (str), and score (float)
- qrels is a Pandas DataFrame with the columns query_id (str), doc_id (str), and relevance (int)
- The return value is a Pandas DataFrame with the columns query_id (str), doc_id (str), and relevance (float)
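The input and output shapes described above can be illustrated with plain Pandas. This is a minimal sketch, not autoqrels code: infer_qrels_stub is a hypothetical stand-in that only mimics the column contract of infer_qrels (it keeps known labels and fills unjudged pairs with a constant), and the query, document IDs, and scores are invented for illustration.

```python
import pandas as pd

# A toy run: ranked results for one query, as a retrieval system would produce.
run = pd.DataFrame({
    "query_id": ["q1", "q1", "q1"],
    "doc_id": ["d1", "d2", "d3"],
    "score": [12.5, 9.1, 3.4],
})

# Known judgments: integer relevance labels for a subset of the run.
qrels = pd.DataFrame({
    "query_id": ["q1"],
    "doc_id": ["d2"],
    "relevance": [1],
})


def infer_qrels_stub(run: pd.DataFrame, qrels: pd.DataFrame, default: float = 0.0) -> pd.DataFrame:
    """Hypothetical stand-in for Labeler.infer_qrels (NOT the real implementation).

    Returns one row per (query_id, doc_id) pair in the run, with a float
    relevance: the known label where one exists, `default` otherwise.
    """
    merged = run[["query_id", "doc_id"]].merge(
        qrels, on=["query_id", "doc_id"], how="left"
    )
    merged["relevance"] = merged["relevance"].fillna(default).astype(float)
    return merged


inferred = infer_qrels_stub(run, qrels)
print(inferred)
```

A real Labeler would replace the constant default with a model-based relevance estimate; the point here is only that the returned frame covers every pair in the run.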
Labelers also expose several measure definitions compatible with ir_measures: labeler.SDCG@k, labeler.RBP(p=persistence), and labeler.P@k.
These measures can be used to calculate the corresponding effectiveness, with the addition of the labeler's inferred qrels. See the ir-measures documentation for more details.
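To make two of these measures concrete, the sketch below computes P@k and RBP from a ranked list of relevance values using their textbook definitions. It is not the autoqrels or ir_measures implementation; the function names are hypothetical, and treating a float relevance in [0, 1] directly as gain is an assumption made here for illustration.

```python
def precision_at_k(relevances, k):
    """P@k: average gain over the top-k ranked documents.

    `relevances` is ordered by rank (best first). Ranks beyond the end of
    the list count as non-relevant, since we always divide by k.
    """
    return sum(relevances[:k]) / k


def rbp(relevances, p=0.8):
    """Rank-biased precision with persistence p.

    The document at 1-based rank i is weighted by p**(i - 1), and the sum
    is scaled by (1 - p) so that a fully relevant infinite ranking scores 1.
    """
    return (1 - p) * sum(rel * p**rank for rank, rel in enumerate(relevances))


# Relevance values in ranked order, e.g. taken from inferred qrels.
ranked_relevance = [1.0, 0.0, 1.0]
print(precision_at_k(ranked_relevance, k=2))
print(rbp(ranked_relevance, p=0.5))
```

The p argument plays the role of persistence in labeler.RBP(p=persistence): higher values give more weight to documents further down the ranking.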
We'll now explore the available Labeler implementations.
autoqrels.oneshot: 1SL (One-shot Labeling)
Reproduction: See repro instructions in repro/oneshot.
One-shot labelers work over a single known relevant document per query. An error is raised if multiple relevant documents are provided.
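The single-relevant-document constraint can be checked before handing qrels to a one-shot labeler. The following is a rough sketch of such a pre-check, assuming qrels are available as (query_id, doc_id, relevance) tuples and that "relevant" means relevance greater than zero; check_one_shot is a hypothetical helper, not part of autoqrels, and the library's own error type and message may differ.

```python
from collections import Counter


def check_one_shot(qrels):
    """Raise ValueError if any query has more than one relevant document.

    `qrels` is an iterable of (query_id, doc_id, relevance) tuples.
    Returns a Counter mapping each query_id to its number of relevant docs.
    """
    counts = Counter(qid for qid, _doc_id, rel in qrels if rel > 0)
    offending = sorted(qid for qid, n in counts.items() if n > 1)
    if offending:
        raise ValueError(f"multiple relevant documents for queries: {offending}")
    return counts


# One relevant document per query passes the check.
ok = [("q1", "d1", 1), ("q2", "d5", 1), ("q2", "d6", 0)]
print(check_one_shot(ok))

# Two relevant documents for the same query are rejected.
try:
    check_one_shot([("q1", "d1", 1), ("q1", "d2", 1)])
except ValueError as err:
    print(err)
```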
Example:
import autoqrels
import ir_datasets
dataset = ir_datasets.load('msmarco-passage/trec-dl-2019')
duot5 = autoqrels.oneshot.DuoT5(dataset=dataset, cache_path='data/duot5.cache.json.gz')
# measures:
duot5.SDCG@10
duot5.P@10
duot5.RBP
Citation
If you use this work, please cite:
@inproceedings{autoqrels,
author = {MacAvaney, Sean and Soldaini, Luca},
title = {One-Shot Labeling for Automatic Relevance Estimation},
booktitle = {Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval},
year = {2023},
url = {https://arxiv.org/abs/2302.11266}
}