Intuitive interface to many IR axioms.
↕️ ir_axioms
Intuitive axiomatic retrieval experimentation.
ir_axioms is a Python framework for experimenting with axioms in information retrieval in a declarative way.
It includes reference implementations of many commonly used retrieval axioms and is well integrated with the PyTerrier framework and the Pyserini toolkit.
Re-rank your search results today with ir_axioms and understand your retrieval systems better by analyzing axiomatic preferences!
Presentation video on YouTube | Poster |
---|---|
Usage
The ir_axioms framework is easy to use. Below, we've prepared some notebooks showcasing the main features.
If you have questions or need assistance, please contact us.
Example Notebooks
We include several example notebooks to demonstrate re-ranking and preference evaluation in PyTerrier using ir_axioms.
You can find all examples in the examples/ directory.
- Re-ranking top-20 results using KwikSort
- Re-ranking top-20 results using KwikSort learned from ORACLE preferences
- Re-ranking top-20 results using LambdaMART with axiomatic preference features
- Post-Hoc analysis of rankings and relevance judgments
- Computing axiom preferences for top-20 results of TREC 2022 Deep Learning (passage) runs in parallel
- SIGIR 2022 showcase for step-by-step explanations with our presentation video
Backends
You can experiment with ir_axioms in PyTerrier and Pyserini.
However, we recommend PyTerrier because not all features are implemented for the Pyserini backend.
PyTerrier (Terrier index)
To use ir_axioms with a Terrier index, please use our PyTerrier transformers (modules):
Transformer Class | Type | Description |
---|---|---|
AggregatedPreferences | 𝑅 → 𝑅𝑓 | Aggregate axiom preferences for each document. |
EstimatorKwikSortReranker | 𝑅 → 𝑅′ | Train estimator for ORACLE, use it to re-rank with KwikSort. |
KwikSortReranker | 𝑅 → 𝑅′ | Re-rank using axiom preferences aggregated by KwikSort. |
PreferenceMatrix | 𝑅 → (𝑅×𝑅)𝑓 | Compute an axiom’s preference matrix. |
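To make the idea behind the preference matrix and KwikSort re-ranking more concrete, here is a minimal, library-independent sketch. It does not use the ir_axioms or PyTerrier APIs. The names `preference_matrix`, `kwiksort`, and `prefer_shorter` are hypothetical, and the toy "prefer the shorter text" rule merely stands in for a real axiom. The sketch follows the general KwikSort idea (partition around a random pivot according to pairwise preferences) and may differ in detail from what the transformers above do.

```python
import random
from typing import Callable, List, Sequence

# Hypothetical stand-in for an axiom: returns 1 if a should rank above b,
# -1 if b should rank above a, and 0 for no preference.
Preference = Callable[[str, str], int]


def preference_matrix(docs: Sequence[str], prefer: Preference) -> List[List[int]]:
    """Pairwise preferences: entry [i][j] compares docs[i] with docs[j]."""
    return [[prefer(a, b) for b in docs] for a in docs]


def kwiksort(indices: List[int], matrix: List[List[int]], rng: random.Random) -> List[int]:
    """Order indices by recursively partitioning around a random pivot."""
    if len(indices) <= 1:
        return list(indices)
    pivot = rng.choice(indices)
    before, after = [], []
    for i in indices:
        if i == pivot:
            continue
        # Place i before the pivot only if it is strictly preferred;
        # ties and losses keep the document behind the pivot.
        (before if matrix[i][pivot] > 0 else after).append(i)
    return kwiksort(before, matrix, rng) + [pivot] + kwiksort(after, matrix, rng)


def prefer_shorter(a: str, b: str) -> int:
    """Toy preference (an assumption for illustration): prefer the shorter text."""
    return (len(a) < len(b)) - (len(a) > len(b))


docs = ["a fairly long document text", "short", "medium length"]
matrix = preference_matrix(docs, prefer_shorter)
order = kwiksort(list(range(len(docs))), matrix, random.Random(0))
reranked = [docs[i] for i in order]
```

Because the toy preference is transitive and has no ties here, the result does not depend on the pivot choice. With real axioms, preferences can be cyclic, and the pivot then influences the final ordering.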
You can also directly instantiate an index context object from a Terrier index if you want to build custom axiomatic modules:
from ir_axioms.backend.pyterrier import TerrierIndexContext
context = TerrierIndexContext("/path/to/index/dir")
axiom.preference(context, query, doc1, doc2)
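The call shape `axiom.preference(context, query, doc1, doc2)` can be illustrated with a self-contained toy example. Everything below is hypothetical: `ToyDocument`, `ToyContext`, and `ToyTermFrequencyAxiom` are not classes from ir_axioms, and the sign convention (positive prefers the first document, negative the second, zero is neutral) is an assumption made for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyDocument:
    """Hypothetical document type; not a class from ir_axioms."""
    id: str
    text: str


class ToyContext:
    """Hypothetical stand-in for an index context: maps ids to token lists."""

    def __init__(self, docs):
        self._terms = {doc.id: doc.text.lower().split() for doc in docs}

    def terms(self, doc):
        return self._terms[doc.id]


class ToyTermFrequencyAxiom:
    """Prefers the document in which query terms occur more often."""

    def preference(self, context, query, doc1, doc2):
        query_terms = set(query.lower().split())
        tf1 = sum(term in query_terms for term in context.terms(doc1))
        tf2 = sum(term in query_terms for term in context.terms(doc2))
        # Assumed convention: 1 prefers doc1, -1 prefers doc2, 0 is neutral.
        return (tf1 > tf2) - (tf1 < tf2)


d1 = ToyDocument("d1", "axioms for retrieval and more retrieval axioms")
d2 = ToyDocument("d2", "cooking recipes and retrieval")
context = ToyContext([d1, d2])
axiom = ToyTermFrequencyAxiom()
result = axiom.preference(context, "retrieval axioms", d1, d2)
```

Here `d1` contains four query term occurrences and `d2` only one, so the toy axiom prefers `d1`.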
Pyserini (Anserini index)
We don't have modules for Pyserini to re-rank or analyze results out of the box. However, you can still compute axiom preferences to integrate retrieval axioms into your search pipeline:
from ir_axioms.backend.pyserini import AnseriniIndexContext
context = AnseriniIndexContext("/path/to/index/dir")
axiom.preference(context, query, doc1, doc2)
TIRA
Here's an example of how ir_axioms can be used to get axiomatic preferences for a run in TIRA:
tira-run \
--input-directory ${PWD}/data/tira/input-of-re-ranker \
--input-run ${PWD}/data/tira/output-of-indexer \
--output-directory ${PWD}/data/tira/output \
--image webis/ir_axioms:0.2.13 \
--command '/venv/bin/python -m ir_axioms --offline --terrier-version 5.7 --terrier-helper-version 0.0.7 preferences --run-file $inputDataset/run.jsonl --run-format jsonl --index-dir $inputRun/index --output-dir $outputDir AND ANTI-REG ASPECT-REG DIV LB1 LNC1 LEN-AND LEN-DIV LEN-M-AND LEN-M-TDC LNC1 M-AND M-TDC PROX1 PROX2 PROX3 PROX4 PROX5 REG STMC1 STMC2 TF-LNC TFC1 TFC3'
Citation
If you use this package or its components in your research, please cite the following paper describing the ir_axioms framework and its use-cases:
Alexander Bondarenko, Maik Fröbe, Jan Heinrich Reimer, Benno Stein, Michael Völske, and Matthias Hagen. Axiomatic Retrieval Experimentation with ir_axioms. In 45th International ACM Conference on Research and Development in Information Retrieval (SIGIR 2022), July 2022. ACM.
You can use the following BibTeX entry for citation:
@InProceedings{bondarenko:2022d,
author = {Alexander Bondarenko and
Maik Fr{\"o}be and
{Jan Heinrich} Reimer and
Benno Stein and
Michael V{\"o}lske and
Matthias Hagen},
booktitle = {45th International ACM Conference on Research and Development
in Information Retrieval (SIGIR 2022)},
month = jul,
publisher = {ACM},
site = {Madrid, Spain},
title = {{Axiomatic Retrieval Experimentation with ir_axioms}},
year = 2022
}
Development
To build ir_axioms and contribute to its development, you need to install the build, setuptools, and wheel packages:
pip install build setuptools wheel
(On most systems, these packages are already pre-installed.)
Installation
Install dependencies for developing the ir_axioms package:
pip install -e .
If you want to develop the Pyserini backend, install dependencies like this:
pip install -e .[pyserini]
If you want to develop the PyTerrier backend, install dependencies like this:
pip install -e .[pyterrier]
Testing
Install test dependencies:
pip install -e .[test]
Verify your changes against our test suite:
flake8 ir_axioms tests
pylint -E ir_axioms tests.unit --ignore-paths=^ir_axioms.backend
pytest ir_axioms/ tests/unit/ --ignore=ir_axioms/backend/
Please also add tests for the axioms or integrations you've added.
Testing backend integrations
Install test dependencies (replace <BACKEND> with either pyserini or pyterrier):
pip install -e .[<BACKEND>]
Verify your changes against our test suite:
pylint -E ir_axioms.backend.<BACKEND> tests.integration.<BACKEND>
pytest tests/integration/<BACKEND>/
Build wheel
A wheel for this package can be built by running:
python -m build
Support
If you hit any problems using ir_axioms or reproducing our experiments, please write us an email or file an issue:
- jan.reimer@student.uni-halle.de
- maik.froebe@informatik.uni-halle.de
- alexander.bondarenko@informatik.uni-halle.de
We're happy to help!
License
This repository is released under the MIT license. If you use ir_axioms in your research, we'd be glad if you'd cite us.
Abstract
Axiomatic approaches to information retrieval have played a key role in determining basic constraints that characterize good retrieval models. Beyond their importance in retrieval theory, axioms have been operationalized to improve an initial ranking, to “guide” retrieval, or to explain some model’s rankings. However, recent open-source retrieval frameworks like PyTerrier and Pyserini, which made it easy to experiment with sparse and dense retrieval models, have not included any retrieval axiom support so far. To fill this gap, we propose ir_axioms, an open-source Python framework that integrates retrieval axioms with common retrieval frameworks. We include reference implementations for 25 retrieval axioms, as well as components for preference aggregation, re-ranking, and evaluation. New axioms can easily be defined by implementing an abstract data type or by intuitively combining existing axioms with Python operators or regression. Integration with PyTerrier and ir_datasets makes standard retrieval models, corpora, topics, and relevance judgments, including those used at TREC, immediately accessible for axiomatic experimentation. Our experiments on the TREC Deep Learning tracks showcase some potential research questions that ir_axioms can help to address.
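The idea of combining axioms with Python operators can be sketched with operator overloading. The following is a self-contained illustration, not the ir_axioms implementation: the `ToyAxiom` class, the two toy axioms, and the meaning assigned to `+`, `*`, and `|` here are assumptions chosen for this example, and the operators supported by the framework may behave differently.

```python
def sign(x):
    """Return 1, -1, or 0 depending on the sign of x."""
    return (x > 0) - (x < 0)


class ToyAxiom:
    """Hypothetical axiom wrapper; the real ir_axioms classes may differ."""

    def __init__(self, fn):
        self.fn = fn

    def preference(self, query, doc1, doc2):
        return self.fn(query, doc1, doc2)

    def __add__(self, other):
        # Sum of both axioms' preferences.
        return ToyAxiom(
            lambda q, a, b: self.preference(q, a, b) + other.preference(q, a, b)
        )

    def __mul__(self, weight):
        # Scale this axiom's preference by a constant weight.
        return ToyAxiom(lambda q, a, b: weight * self.preference(q, a, b))

    def __or__(self, other):
        # Fall back to the other axiom when this one is neutral.
        def combined(q, a, b):
            p = self.preference(q, a, b)
            return p if p != 0 else other.preference(q, a, b)

        return ToyAxiom(combined)


# Toy axioms (assumptions for illustration only).
has_query = ToyAxiom(lambda q, a, b: (q in a) - (q in b))
shorter = ToyAxiom(lambda q, a, b: sign(len(b) - len(a)))

cascade = has_query | shorter
weighted = has_query * 2 + shorter

first = cascade.preference("axiom", "axiom text", "other")
second = cascade.preference("axiom", "axiom text here", "axiom")
third = weighted.preference("axiom", "axiom text", "other")
```

In the first call only the first document contains the query, so the cascade prefers it. In the second call both documents contain the query, so the cascade falls back to the length-based toy axiom and prefers the shorter second document.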