Python implementation of the R package `ircor`

pyircor

pyircor is the Python implementation of the R package ircor. ircor implements several correlation coefficients commonly used in Information Retrieval, such as the Kendall and AP correlation coefficients, with and without ties. This implementation uses numba for acceleration.

For details, see Julián Urbano and Mónica Marrero, “The Treatment of Ties in AP Correlation”, ACM ICTIR, 2017.

  • Free software: MIT license

Installation

You can install the stable release from PyPI using pip:

pip install pyircor

Usage

tau and tauap implement the Kendall tau and Yilmaz tauAP correlation coefficients, where no ties are allowed between items:

from pyircor.tau import tau
from pyircor.tauap import tauap
import numpy as np

x = np.array([0.06, 0.2, 0.27, 0.37, 0.57, 0.63, 0.66, 0.9, 0.91, 0.94])
y = np.array([0.37, 0.06, 0.2, 0.27, 0.57, 0.66, 0.63, 0.91, 0.9, 0.94])
tau(x, y)
# 0.7777777777777778
tauap(x, y)
# 0.7491181657848325
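
For intuition, the tie-free Kendall tau can be reproduced by counting concordant and discordant pairs directly. The pure-Python sketch below is only an illustration and not pyircor's numba implementation; the name kendall_tau_sketch is hypothetical.

```python
from itertools import combinations


def kendall_tau_sketch(x, y):
    """Kendall tau for two score lists without ties (illustrative only)."""
    concordant = discordant = 0
    for i, j in combinations(range(len(x)), 2):
        # A pair is concordant when both lists order items i and j the same way.
        if (x[i] - x[j]) * (y[i] - y[j]) > 0:
            concordant += 1
        else:
            discordant += 1
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs


x = [0.06, 0.2, 0.27, 0.37, 0.57, 0.63, 0.66, 0.9, 0.91, 0.94]
y = [0.37, 0.06, 0.2, 0.27, 0.57, 0.66, 0.63, 0.91, 0.9, 0.94]
print(kendall_tau_sketch(x, y))  # 40 concordant, 5 discordant: 35/45
```

With 45 pairs, of which 5 are discordant, the sketch gives 35/45, which agrees with the value returned by tau above.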

In tauap it is important to use the correct sorting order. By default, items are sorted in decreasing order, which is appropriate, for instance, when the scores represent system effectiveness. When they should be sorted in increasing order, set decreasing to False:

from pyircor.tauap import tauap

# these two calls are equivalent
tauap(x, y)
# 0.7491181657848325
tauap(-x, -y, decreasing=False)
# 0.7491181657848325
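
To see why the sorting order matters, here is a minimal pure-Python sketch of the tie-free AP correlation, assuming the usual definition: walk the items from best to worst according to y and, at each rank, measure the fraction of items above that x also places above. The function name tauap_sketch is hypothetical, and this is not the code pyircor runs.

```python
def tauap_sketch(x, y, decreasing=True):
    """Tie-free AP correlation of scores y against reference scores x."""
    if not decreasing:
        # Increasing order is equivalent to decreasing order on negated scores.
        x = [-v for v in x]
        y = [-v for v in y]
    # Indices of the items from best to worst according to y.
    order = sorted(range(len(y)), key=lambda i: y[i], reverse=True)
    total = 0.0
    for rank in range(1, len(order)):
        item = order[rank]
        # Items ranked above this one by y that x also ranks above it.
        correct = sum(1 for above in order[:rank] if x[above] > x[item])
        total += correct / rank
    return 2.0 * total / (len(order) - 1) - 1.0


x = [0.06, 0.2, 0.27, 0.37, 0.57, 0.63, 0.66, 0.9, 0.91, 0.94]
y = [0.37, 0.06, 0.2, 0.27, 0.57, 0.66, 0.63, 0.91, 0.9, 0.94]
print(tauap_sketch(x, y))
print(tauap_sketch([-v for v in x], [-v for v in y], decreasing=False))
```

Both calls print the same value, about 0.7491. Because errors near the top of the ranking are divided by a smaller rank, they weigh more than errors near the bottom, so sorting in the wrong direction changes which disagreements are penalized most.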

tau_a and tauap_a are the versions to use when x represents a true ranking without ties and y represents a ranking estimated by an observer who is allowed to produce ties. They can be used as a measure of the accuracy of the observer with respect to the true ranking:

from pyircor.tau import tau_a
from pyircor.tauap_a import tauap_a

y = np.round(y * 5) / 5
tau_a(x, y)
# 0.7111111111111111
tauap_a(x, y)
# 0.6074514991181656
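
As an illustration of how ties in y affect the accuracy variant, the sketch below assumes that a tied pair counts as neither concordant nor discordant while the denominator stays at the total number of pairs. The name tau_a_sketch is hypothetical, and the rounded y values are written out literally so the block runs without NumPy.

```python
from itertools import combinations


def tau_a_sketch(x, y):
    """Kendall tau-a: x has no ties, y may contain ties (illustrative only)."""
    concordant = discordant = 0
    for i, j in combinations(range(len(x)), 2):
        product = (x[i] - x[j]) * (y[i] - y[j])
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
        # product == 0: the pair is tied in y and contributes nothing.
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs


x = [0.06, 0.2, 0.27, 0.37, 0.57, 0.63, 0.66, 0.9, 0.91, 0.94]
# y after rounding to the nearest multiple of 0.2, as in the example above.
y_tied = [0.4, 0.0, 0.2, 0.2, 0.6, 0.6, 0.6, 1.0, 0.8, 1.0]
print(tau_a_sketch(x, y_tied))  # 36 concordant, 4 discordant, 5 tied: 32/45
```

The five tied pairs lower the score relative to the tie-free case because the observer gets no credit for pairs it did not order.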

tau_b and tauap_b are versions to use under the assumption that both x and y represent rankings estimated by two observers who may produce ties. They can be used as a measure of agreement between the observers:

x = np.round(x * 5) / 5
tau_b(x, y)
# 0.75
tauap_b(x, y)
# 0.626984126984127
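
For the agreement variant, the sketch below uses the standard Kendall tau-b formula, which normalizes by the pairs that are untied in each ranking; it reproduces the tau_b value shown above. The name tau_b_sketch is hypothetical, the rounded inputs are written out literally, and this is not the numba code pyircor runs.

```python
from itertools import combinations
from math import sqrt


def tau_b_sketch(x, y):
    """Kendall tau-b: both x and y may contain ties (illustrative only)."""
    score = ties_x = ties_y = 0
    for i, j in combinations(range(len(x)), 2):
        dx = x[i] - x[j]
        dy = y[i] - y[j]
        if dx == 0:
            ties_x += 1
        if dy == 0:
            ties_y += 1
        product = dx * dy
        # +1 for a concordant pair, -1 for a discordant pair, 0 if tied.
        score += (product > 0) - (product < 0)
    n_pairs = len(x) * (len(x) - 1) // 2
    return score / sqrt((n_pairs - ties_x) * (n_pairs - ties_y))


# x and y after rounding to the nearest multiple of 0.2, as in the example above.
x_tied = [0.0, 0.2, 0.2, 0.4, 0.6, 0.6, 0.6, 0.8, 1.0, 1.0]
y_tied = [0.4, 0.0, 0.2, 0.2, 0.6, 0.6, 0.6, 1.0, 0.8, 1.0]
print(tau_b_sketch(x_tied, y_tied))  # 30 / sqrt(40 * 40)
```

Each ranking has 5 tied pairs out of 45, so the denominator is 40, and the concordant-minus-discordant score of 30 gives 0.75.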

Credits

Many parts of this package, including the code itself, docstrings, and comments, are adopted directly from the original R package with its authors’ agreement. Please cite the original work if you use this package in a publication.

This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.

Reference

@inproceedings{urbano2017ties,
  author = {Urbano, Juli{\'{a}}n and Marrero, M{\'{o}}nica},
  booktitle = {ACM SIGIR International Conference on the Theory of Information Retrieval},
  pages = {321--324},
  title = {{The Treatment of Ties in AP Correlation}},
  year = {2017}
}

History

0.1.0 (2019-12-08)

  • First release on PyPI.
