autoqrels
autoqrels is a tool for automatically inferring query relevance assessments (qrels).
Currently, it supports the one-shot labeling approach (1SL) presented in MacAvaney and Soldaini, One-Shot Labeling for Automatic Relevance Estimation, SIGIR 2023.
This package adheres to the ir-measures API, which means it can be used directly by various tools, such as PyTerrier.
Getting started
You can install autoqrels using pip:
pip install autoqrels
You can also work with the repository locally:
git clone https://github.com/seanmacavaney/autoqrels.git
cd autoqrels
python setup.py develop
API
The primary interface in autoqrels is autoqrels.Labeler. A Labeler exposes a method, infer_qrels(run, qrels), which returns a new set of qrels that covers the provided run:
- run is a Pandas DataFrame with the columns query_id (str), doc_id (str), and score (float)
- qrels is a Pandas DataFrame with the columns query_id (str), doc_id (str), and relevance (int)
- The return value is a Pandas DataFrame with the columns query_id (str), doc_id (str), and relevance (float)
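As a rough illustration of these schemas, the sketch below builds a small run and qrels frame with pandas and lists the (query_id, doc_id) pairs in the run that have no judgment yet. These are the pairs a labeler has to fill in for the returned qrels to cover the run. The data and the helper name unjudged_pairs are invented for illustration and are not part of autoqrels.

```python
import pandas as pd

# Toy inputs following the column layout described above (values are invented).
run = pd.DataFrame({
    "query_id": ["q1", "q1", "q1", "q2"],
    "doc_id": ["d1", "d2", "d3", "d1"],
    "score": [12.5, 11.0, 9.3, 7.7],
})
qrels = pd.DataFrame({
    "query_id": ["q1", "q2"],
    "doc_id": ["d2", "d9"],
    "relevance": [1, 1],
})


def unjudged_pairs(run: pd.DataFrame, qrels: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: rows of `run` with no matching row in `qrels`."""
    merged = run.merge(
        qrels[["query_id", "doc_id"]],
        on=["query_id", "doc_id"],
        how="left",
        indicator=True,  # adds a "_merge" column: "both" or "left_only"
    )
    missing = merged[merged["_merge"] == "left_only"]
    return missing[["query_id", "doc_id"]].reset_index(drop=True)


holes = unjudged_pairs(run, qrels)
print(holes)
```

With a real labeler, the covering qrels would instead come from labeler.infer_qrels(run, qrels), as described above.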
Labelers also expose several measure definitions compatible with ir_measures: labeler.SDCG@k, labeler.RBP(p=persistence), and labeler.P@k.
These measures can be used to calculate the corresponding effectiveness, with the addition of the labeler's inferred qrels. See the ir-measures documentation for more details.
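To give a feel for what two of these measures compute, here is a minimal pure-Python sketch of P@k and RBP over a ranked list of relevance values. It uses the standard textbook definitions and assumes relevance values in [0, 1]. It is not the code that autoqrels or ir-measures runs, the function names are illustrative, and the default persistence below is an arbitrary choice. SDCG is left out.

```python
from typing import Sequence


def precision_at_k(relevance: Sequence[float], k: int) -> float:
    """Mean relevance of the top-k results; missing ranks count as 0."""
    return sum(relevance[:k]) / k


def rbp(relevance: Sequence[float], p: float = 0.8) -> float:
    """Rank-biased precision: (1 - p) * sum_i rel_i * p**(i - 1), ranks from 1."""
    # enumerate() starts at 0, so p**rank equals p**(i - 1) for 1-based rank i.
    return (1 - p) * sum(rel * p**rank for rank, rel in enumerate(relevance))


# Relevance of the documents at ranks 1..4 (invented values).
ranked = [1.0, 0.0, 1.0, 0.0]
print(precision_at_k(ranked, k=2))
print(rbp(ranked, p=0.5))
```

Because the inferred qrels carry float relevance, sums like these behave as expected values rather than counts of relevant documents.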
We'll now explore the available Labeler implementations.
autoqrels.oneshot: 1SL (One-shot Labeling)
Reproduction: See repro instructions in repro/oneshot.
One-shot labelers work over a single known relevant document per query. An error is raised if multiple relevant documents are provided.
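One way to check this constraint before calling a one-shot labeler is sketched below. It assumes that "relevant" means relevance > 0 and uses a hypothetical helper, check_one_shot_qrels. The actual check and error type inside autoqrels may differ.

```python
import pandas as pd


def check_one_shot_qrels(qrels: pd.DataFrame) -> None:
    """Hypothetical pre-check: raise if any query has more than one relevant doc."""
    relevant = qrels[qrels["relevance"] > 0]  # assumption: > 0 means relevant
    counts = relevant.groupby("query_id")["doc_id"].nunique()
    too_many = counts[counts > 1]
    if not too_many.empty:
        raise ValueError(
            f"queries with multiple relevant documents: {sorted(too_many.index)}"
        )


# One relevant document per query: passes silently.
ok = pd.DataFrame({"query_id": ["q1", "q2"], "doc_id": ["d2", "d9"], "relevance": [1, 1]})
check_one_shot_qrels(ok)

# Two relevant documents for q1: rejected.
bad = pd.DataFrame({"query_id": ["q1", "q1"], "doc_id": ["d2", "d3"], "relevance": [1, 1]})
try:
    check_one_shot_qrels(bad)
except ValueError as err:
    print(err)
```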
Example:
import autoqrels
import ir_datasets
dataset = ir_datasets.load('msmarco-passage/trec-dl-2019')
duot5 = autoqrels.oneshot.DuoT5(dataset=dataset, cache_path='data/duot5.cache.json.gz')
# measures:
duot5.SDCG@10
duot5.P@10
duot5.RBP
Citation
If you use this work, please cite:
@inproceedings{autoqrels,
author = {MacAvaney, Sean and Soldaini, Luca},
title = {One-Shot Labeling for Automatic Relevance Estimation},
booktitle = {Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval},
year = {2023},
url = {https://arxiv.org/abs/2302.11266}
}