brat utilities

bratutils

License: MIT

A collection of utilities for manipulating data and calculating inter-annotator agreement in brat annotation files.

Installation

Install the package from PyPI with pip:

$ pip install bratutils

Agreement Definition

Agreement in multi-token annotations is commonly evaluated using the F-score, due to various problems with computing the traditional Krippendorff's alpha and Cohen's kappa. Hripcsak and Rothschild demonstrate the validity of the metric for very large populations, i.e. for unrestricted text annotations.

This library roughly follows the definitions of precision and recall calculation from the MUC-7 test scoring. The basic definitions along with some additional restrictions are laid out below:

  • CORRECT - when annotation tags and indices match completely
  • INCORRECT - when annotation tags do not match, but the indices coincide
  • PARTIAL - when the annotation tags are the same and the two annotations share the same end index but have different start indices
  • MISSING - annotations existing only in the gold standard annotation set
  • SPURIOUS - annotations existing only in the candidate annotation set

Note: the gold standard is considered the collection or document from which the comparison is invoked, while the supplied parallel annotation is considered the candidate set.

Disclaimer: the current definition of the PARTIAL category accommodates working with syntactic chunks. A different arrangement (e.g. picking the largest contained tag as the partial match instead of the rightmost) might be more suitable for other tasks, for example some types of semantic annotation.
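The category definitions above can be sketched in a few lines of Python. This is only an illustration of the rules as stated, not the code bratutils runs: the Span class and the classify function are hypothetical names introduced here, and the sketch assumes the two annotations have already been paired up (or that one side is absent).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Span:
    """Hypothetical stand-in for one annotation: a tag plus character offsets."""
    tag: str
    start: int
    end: int


def classify(gold: Optional[Span], candidate: Optional[Span]) -> Optional[str]:
    """Return the category for one gold/candidate pairing, or None if unrelated."""
    if gold is None and candidate is None:
        raise ValueError("at least one annotation is required")
    if candidate is None:
        return "MISSING"    # only the gold standard has this annotation
    if gold is None:
        return "SPURIOUS"   # only the candidate set has this annotation
    if (gold.start, gold.end) == (candidate.start, candidate.end):
        # Indices coincide, so the tags decide between CORRECT and INCORRECT.
        return "CORRECT" if gold.tag == candidate.tag else "INCORRECT"
    if gold.tag == candidate.tag and gold.end == candidate.end:
        return "PARTIAL"    # same tag, same end index, different start index
    return None             # the pair does not match under any definition


gold = Span("NP", 10, 25)
print(classify(gold, Span("NP", 10, 25)))  # CORRECT
print(classify(gold, Span("VP", 10, 25)))  # INCORRECT
print(classify(gold, Span("NP", 14, 25)))  # PARTIAL
print(classify(gold, None))                # MISSING
print(classify(None, Span("NP", 40, 48)))  # SPURIOUS
```

Because PARTIAL is anchored on the end index, a candidate that shares the start index but ends elsewhere is not treated as a partial match in this sketch, which mirrors the rightmost-match behaviour described in the disclaimer.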

Examples

Simple example:

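from bratutils import agreement as a

# Load two parallel annotations of the same text
doc = a.Document('res/samples/A/data-sample-1.ann')
doc2 = a.Document('res/samples/B/data-sample-1.ann')

# Mark the first document as the gold standard
doc.make_gold()
# Compare the second document against it and collect the statistics
statistics = doc2.compare_to_gold(doc)

print(statistics)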

Output:

-------------------MUC-Table--------------------
------------------------------------------------
pos:135
act:134
cor:115
par:5
inc:4
mis:11
spu:10
------------------------------------------------
pre:0.858208955224
rec:0.851851851852
fsc:0.855018587361
------------------------------------------------
und:0.0814814814815
ovg:0.0746268656716
sub:0.0725806451613
------------------------------------------------
bor:119
ibo:15
------------------------------------------------
------------------------------------------------
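The ratios in the table can be reproduced from the five category counts. The sketch below is a reconstruction inferred from the sample output, not taken from the library source: the function name muc_scores is hypothetical, the abbreviations are read as the usual MUC terms (possible, actual, undergeneration, overgeneration, substitution), and the formulas are simply the ones that match the numbers shown. Note that partial matches receive no credit in precision and recall here. The bor and ibo rows are left out because their definitions cannot be recovered from the output alone.

```python
def muc_scores(cor, par, inc, mis, spu):
    """Derive MUC-style scores from category counts (assumed formulas)."""
    pos = cor + par + inc + mis   # possible: everything in the gold standard
    act = cor + par + inc + spu   # actual: everything in the candidate set
    pre = cor / act               # precision
    rec = cor / pos               # recall
    return {
        "pos": pos,
        "act": act,
        "pre": pre,
        "rec": rec,
        "fsc": 2 * pre * rec / (pre + rec),        # harmonic mean of pre and rec
        "und": mis / pos,                          # undergeneration
        "ovg": spu / act,                          # overgeneration
        "sub": (inc + par) / (cor + inc + par),    # substitution
    }


# Counts copied from the sample output above.
scores = muc_scores(cor=115, par=5, inc=4, mis=11, spu=10)
for name, value in scores.items():
    print(f"{name}:{value}")
```

The sketch assumes non-empty annotation sets; a real implementation would need to guard against zero denominators.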
