Quick Inter Coder Agreement in Python

Project description

Quica (Quick Inter Coder Agreement in Python) is a tool to run inter-coder agreement pipelines in an easy and effective way. Multiple measures are run and the results are collected in a single table that can be easily exported to LaTeX. Quica supports both two-coder and multi-coder settings.

Installation

pip install -U quica

Get Quick Agreement

If you already have a pandas DataFrame, you can run Quica with a few lines of code! Let’s assume you have two coders; we will create a pandas DataFrame just to show how to use the library. As of now, we support only integer values and weighting is not yet included.

from quica.quica import Quica
import pandas as pd

# Annotations from two coders over the same six items (integer labels only, for now)
coder_1 = [0, 1, 0, 1, 0, 1]
coder_3 = [0, 1, 0, 1, 0, 0]

dataframe = pd.DataFrame({"coder1": coder_1,
                          "coder3": coder_3})

quica = Quica(dataframe=dataframe)
print(quica.get_results())

This is the expected output:

Out[1]:
             score
names
krippendorff  0.685714
fleiss        0.666667
scotts        0.657143
raw           0.833333
mace          0.426531
cohen         0.666667
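
Quica is not limited to two coders. As a minimal sketch, assuming each DataFrame column is treated as one coder exactly as in the two-coder case, a third annotation column can simply be added before constructing Quica (the exact scores will differ from the table above):

from quica.quica import Quica
import pandas as pd

# Three coders over the same six items; the measures are computed over all columns.
coder_1 = [0, 1, 0, 1, 0, 1]
coder_2 = [0, 1, 0, 1, 0, 1]
coder_3 = [0, 1, 0, 1, 0, 0]

dataframe = pd.DataFrame({"coder1": coder_1,
                          "coder2": coder_2,
                          "coder3": coder_3})

quica = Quica(dataframe=dataframe)
print(quica.get_results())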

It was pretty easy to get all the scores, right? What if we do not have a pandas DataFrame? What if we want to directly get the LaTeX table to put into the paper? Worry not, my friend: it’s easier done than said!

from quica.measures.irr import *
from quica.dataset.dataset import IRRDataset
from quica.quica import Quica

coder_1 = [0, 1, 0, 1, 0, 1]
coder_3 = [0, 1, 0, 1, 0, 0]

disagreeing_coders = [coder_1, coder_3]
disagreeing_dataset = IRRDataset(disagreeing_coders)

quica = Quica(disagreeing_dataset)

print(quica.get_results())
print(quica.get_latex())

You should get the following output; note that the LaTeX table requires the booktabs package:

Out[1]:
             score
names
krippendorff  0.685714
fleiss        0.666667
scotts        0.657143
raw           0.833333
mace          0.426531
cohen         0.666667

Out[2]:

\begin{tabular}{lr}
\toprule
{} &     score \\
names        &           \\
\midrule
krippendorff &  0.685714 \\
fleiss       &  0.666667 \\
scotts       &  0.657143 \\
raw          &  0.833333 \\
mace         &  0.426531 \\
cohen        &  0.666667 \\
\bottomrule
\end{tabular}
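
Assuming get_latex() returns the table as a plain string (as the print call above suggests), a minimal sketch for saving it to disk, reusing the quica object from the snippet above (the filename is just an example):

# Save the LaTeX table so it can be \input into the paper.
with open("agreement_table.tex", "w") as outfile:
    outfile.write(quica.get_latex())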

Features

from quica.measures.irr import *
from quica.dataset.dataset import IRRDataset
from quica.quica import Quica

coder_1 = [0, 1, 0, 1, 0, 1]
coder_2 = [0, 1, 0, 1, 0, 1]
coder_3 = [0, 1, 0, 1, 0, 0]

agreeing_coders = [coder_1, coder_2]
agreeing_dataset = IRRDataset(agreeing_coders)

disagreeing_coders = [coder_1, coder_3]
disagreeing_dataset = IRRDataset(disagreeing_coders)

kri = Krippendorff()
cohen = CohensK()

# Perfect agreement yields the maximum score of 1; any disagreement lowers it below 1.
assert kri.compute_irr(agreeing_dataset) == 1
assert cohen.compute_irr(agreeing_dataset) == 1
assert kri.compute_irr(disagreeing_dataset) < 1
assert cohen.compute_irr(disagreeing_dataset) < 1
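
The same compute_irr calls can also be used to read off individual scores rather than just asserting on them; the values below match the results table shown earlier:

# Per-measure scores for the disagreeing pair of coders
print(kri.compute_irr(disagreeing_dataset))    # ~0.685714 (Krippendorff's alpha)
print(cohen.compute_irr(disagreeing_dataset))  # ~0.666667 (Cohen's kappa)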

Supported Algorithms

  • MACE (Multi-Annotator Competence Estimation)
    • Hovy, D., Berg-Kirkpatrick, T., Vaswani, A., & Hovy, E. (2013, June). Learning whom to trust with MACE. In Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (pp. 1120-1130).

    • We define the inter-coder agreement as the average competence of the annotators (see the sketch after this list).

  • Krippendorff’s Alpha

  • Cohen's Kappa

  • Fleiss' Kappa

  • Scott's Pi

  • Raw Agreement: Standard Accuracy
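
As a rough illustration of the MACE aggregation mentioned above, the reported agreement is the mean of the per-annotator competence estimates; the competence values below are made up for illustration and are not produced by the quica API:

# Hypothetical competence estimates for two annotators (illustrative values only)
competences = [0.40, 0.45]
mace_agreement = sum(competences) / len(competences)
print(mace_agreement)  # 0.425; for comparison, the example table above reports ~0.4265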

Credits

This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template. Thanks to Pietro Lesci and Dirk Hovy for their implementation of MACE.

History

0.1.0 (2020-11-08)

  • New API to get the output

  • Fixed test cases

  • Extended documentation on the README file

0.1.0 (2020-11-05)

  • First release on PyPI.

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

quica-0.2.5.tar.gz (21.0 kB)

Built Distribution

quica-0.2.5-py2.py3-none-any.whl (10.8 kB)

File details

Details for the file quica-0.2.5.tar.gz.

File metadata

  • Download URL: quica-0.2.5.tar.gz
  • Upload date:
  • Size: 21.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.6.1 requests/2.24.0 setuptools/50.3.2 requests-toolbelt/0.9.1 tqdm/4.51.0 CPython/3.8.6

File hashes

Hashes for quica-0.2.5.tar.gz:
  • SHA256: dc3880b224229fe8398db02c1b16b827f4bca057f7b66d8017eed125ad2670cb
  • MD5: 44e96ee48a54805759d691019e276fbb
  • BLAKE2b-256: f51b08836de3ac44a3665076e163820d4d1db3e16baed93196e4dc62d1760dd5

See more details on using hashes here.
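
To check a downloaded archive against the published SHA256 digest above, a minimal sketch using Python's hashlib (the local file path is just an example):

import hashlib

# Compare the local file's digest against the published SHA256 value.
expected = "dc3880b224229fe8398db02c1b16b827f4bca057f7b66d8017eed125ad2670cb"
with open("quica-0.2.5.tar.gz", "rb") as f:
    digest = hashlib.sha256(f.read()).hexdigest()
assert digest == expected, "SHA256 mismatch: the download may be corrupted"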

File details

Details for the file quica-0.2.5-py2.py3-none-any.whl.

File metadata

  • Download URL: quica-0.2.5-py2.py3-none-any.whl
  • Upload date:
  • Size: 10.8 kB
  • Tags: Python 2, Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.6.1 requests/2.24.0 setuptools/50.3.2 requests-toolbelt/0.9.1 tqdm/4.51.0 CPython/3.8.6

File hashes

Hashes for quica-0.2.5-py2.py3-none-any.whl:
  • SHA256: 636300d9ae2dd98a2c7e9dc2e54f4dda547bed872346181c35edac93c60a0d9f
  • MD5: 5271b182d2e60922099d5f92b675936c
  • BLAKE2b-256: b3af0b74cab32427e6b85b11716971246ba9fe5235300bcbd30e6394d8fb0668

See more details on using hashes here.
