Tool for visually diffing two TREC run files.

DiffIR

DiffIR is a tool for visually 'diffing' two sets of rankings. Given a pair of TREC runs containing rankings for multiple queries, DiffIR identifies contrasting queries whose results are "substantially" different between the two systems and generates a side-by-side visual comparison illustrating how the key rankings differ.

DiffIR supports multiple query contrast measures for ranking comparison, including unsupervised ranking correlations like TauAP and supervised comparison based on existing judgments. DiffIR additionally accepts term importance weights in order to highlight the terms most relevant to a model's relevance prediction.
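
To make the idea of an unsupervised contrast measure concrete, here is a minimal sketch of the tau-AP rank correlation, which penalizes disagreements near the top of a ranking more heavily than disagreements near the bottom. It is an illustration only and not DiffIR's implementation: the function name `tau_ap` is hypothetical, and it assumes both rankings contain exactly the same document ids, which real runs often do not.

```python
def tau_ap(ranking, reference):
    """Tau-AP correlation of `ranking` against `reference`.

    Both arguments are lists of document ids, best first. This sketch
    assumes they are permutations of the same ids (a simplification).
    """
    if (len(ranking) != len(reference)
            or len(set(ranking)) != len(ranking)
            or set(ranking) != set(reference)):
        raise ValueError("rankings must be permutations of the same ids")
    n = len(ranking)
    if n < 2:
        return 1.0
    ref_pos = {doc: i for i, doc in enumerate(reference)}
    total = 0.0
    for i in range(1, n):
        doc = ranking[i]
        # Count documents placed above `doc` in `ranking` that the
        # reference also places above it.
        concordant = sum(1 for above in ranking[:i]
                         if ref_pos[above] < ref_pos[doc])
        total += concordant / i
    return 2.0 * total / (n - 1) - 1.0
```

A query whose two rankings score low under such a measure is a candidate "contrasting" query. Identical rankings give 1.0 and fully reversed rankings give -1.0.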

Installation

Python 3 is required. Install via PyPI:

pip install diffir

Usage

Download two run files to test with:

wget -c https://github.com/capreolus-ir/diffir/raw/master/trec-dl-2020/p_bm25
wget -c https://github.com/capreolus-ir/diffir/raw/master/trec-dl-2020/p_bm25rm3
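
If you want to inspect a run file yourself, the sketch below reads the standard six-column TREC run format (query id, the literal Q0, document id, rank, score, run tag). It assumes the downloaded files follow that layout; `parse_trec_run` is a hypothetical helper and not part of DiffIR.

```python
from collections import defaultdict


def parse_trec_run(lines):
    """Return {query_id: [(doc_id, score), ...]}, best score first.

    Assumes whitespace-separated columns:
    query_id Q0 doc_id rank score run_tag
    """
    run = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        qid, _q0, doc_id, _rank, score, _tag = line.split()
        run[qid].append((doc_id, float(score)))
    for qid in run:
        # Re-sort by score rather than trusting the rank column.
        run[qid].sort(key=lambda pair: pair[1], reverse=True)
    return dict(run)


sample = [
    "q1 Q0 d7 1 12.5 bm25",
    "q1 Q0 d3 2 11.0 bm25",
    "q2 Q0 d9 1 8.25 bm25",
]
parsed = parse_trec_run(sample)
```

To read a file from disk, pass the open file handle, for example `parse_trec_run(open("p_bm25"))`.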

Compare the two files and output a comparison page to bm25_bm25rm3.html:

diffir p_bm25 p_bm25rm3 -w --dataset msmarco-passage/trec-dl-2020 \
       --measure qrel --metric nDCG@5 --topk 3 > bm25_bm25rm3.html

Now open bm25_bm25rm3.html in your web browser. You should see DiffIR's web interface.

Command line arguments

Usage: diffir <run files> <options> where <run files> are one or two positional arguments naming the run files to visualize, and <options> are:

  • -w to output HTML or -c for the command line interface
  • --dataset <id>: a dataset id from ir_datasets
  • --measure <measure>: the query contrast measure to use. Valid measures: qrel, tauap, pearsonrank, weightedtau, spearmanr, kldiv (using scores)
  • --metric <metric>: the relevance metric to use with the qrel measure. Accepts ir_measures notation
  • --topk <k>: the number of queries to compare (as identified by the query contrast measure)
  • --weights_1 <file>, --weights_2 <file>: term importance files to use for snippet selection
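
To illustrate how --measure qrel, --metric and --topk fit together, the sketch below scores each query under both runs with nDCG@k and keeps the queries with the largest absolute difference. It is an assumption-laden illustration rather than DiffIR's code: the helper names are hypothetical, the nDCG here uses linear gain, and DiffIR itself delegates metrics to ir_measures, whose results may differ in detail.

```python
import math


def ndcg_at_k(ranked_doc_ids, relevance, k):
    """nDCG@k with linear gain; `relevance` maps doc id -> judgment."""
    def dcg(gains):
        return sum(g / math.log2(i + 2) for i, g in enumerate(gains))

    gains = [relevance.get(doc, 0) for doc in ranked_doc_ids[:k]]
    ideal = sorted(relevance.values(), reverse=True)[:k]
    ideal_dcg = dcg(ideal)
    return dcg(gains) / ideal_dcg if ideal_dcg > 0 else 0.0


def most_contrasting_queries(run_a, run_b, qrels, k=5, topk=3):
    """Return the `topk` query ids with the largest |nDCG@k(a) - nDCG@k(b)|.

    `run_a` and `run_b` map query id -> ranked list of doc ids;
    `qrels` maps query id -> {doc id: judgment}.
    """
    deltas = {}
    for qid in run_a.keys() & run_b.keys():
        rel = qrels.get(qid, {})
        deltas[qid] = abs(ndcg_at_k(run_a[qid], rel, k)
                          - ndcg_at_k(run_b[qid], rel, k))
    # Largest difference first; ties broken by query id for determinism.
    return sorted(deltas, key=lambda q: (-deltas[q], q))[:topk]
```

With this framing, --topk 3 corresponds to keeping the three queries at the head of that sorted list.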

Batch mode

Use diffir-batch to generate comparison pages for every pair of run files in a directory.

Usage: diffir-batch <input directory> -o <output directory> <options> where the <options> are those shown above.
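
As a rough picture of what "every pair of run files" means, the sketch below enumerates unordered pairs of files in a directory. Treating every regular file as a run and ignoring pair order are assumptions made for illustration; `run_file_pairs` is a hypothetical helper and diffir-batch may select files differently.

```python
from itertools import combinations
from pathlib import Path


def run_file_pairs(input_dir):
    """Return every unordered pair of regular files in `input_dir`."""
    # Sort by path so the pairs come out in a stable order.
    files = sorted(p for p in Path(input_dir).iterdir() if p.is_file())
    return list(combinations(files, 2))
```

A directory with n run files yields n * (n - 1) / 2 pairs, so the number of comparison pages grows quadratically with the number of runs.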

Download files

Source Distribution

diffir-0.2.0.tar.gz (26.9 kB)

Uploaded Source

Built Distribution

diffir-0.2.0-py3-none-any.whl (34.6 kB)

Uploaded Python 3
