Alternative scorer for the CoNLL-2011/2012 shared tasks on coreference resolution.

Scorch¹

This is an alternative implementation of the coreference scorer for the CoNLL-2011/2012 shared tasks on coreference resolution.

It aims to be more straightforward than the reference implementation, while maintaining as much compatibility with it as possible.

The implementations of the various scores are as close as possible to the formulas used by Pradhan et al. (2014), with the edge cases for BLANC taken from Recasens and Hovy (2011).


1. Scorer for coreference chains.

Use

Download from master with

git clone https://github.com/LoicGrobol/scorch.git

Install with

python3 -m pip install .

Then just use scorch, e.g.

scorch gold.json sys.json out.txt

Alternatively, running scorch.py directly without installing should work, as long as all the dependencies are installed:

python3 scorch.py -h

Formats

Single document

The input files should be JSON files with a "type" key at top level:

  • If "type" is "graph", then the top level should have
    • A "mentions" key containing a list of all mention identifiers
    • A "links" key containing a list of pairs of coreferring mention identifiers
  • If "type" is "clusters", then the top level should have a "clusters" key containing a mapping from cluster ids to cluster contents (as lists of mention identifiers).
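
As an illustration of the two layouts, the sketch below builds a small gold document in the "clusters" layout and a system document in the "graph" layout using only the standard library. The mention identifiers, the cluster ids, and the helpers `save` and `links_to_clusters` are made up for this example and are not part of scorch. The helper only shows how the two layouts relate (clusters as connected components of the link graph); it is an assumption about the intended reading of "graph" files, not a description of how scorch processes them internally.

```python
import json
from collections import defaultdict

# Hypothetical gold annotation in the "clusters" layout:
# a mapping from cluster ids to lists of mention identifiers.
gold = {
    "type": "clusters",
    "clusters": {
        "c1": ["m1", "m2", "m3"],
        "c2": ["m4"],
    },
}

# Hypothetical system output in the "graph" layout:
# all mention identifiers, plus pairs of coreferring mentions.
sys_out = {
    "type": "graph",
    "mentions": ["m1", "m2", "m3", "m4"],
    "links": [["m1", "m2"], ["m3", "m4"]],
}


def save(doc, path):
    """Write one document to a JSON file."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(doc, f, indent=2)


def links_to_clusters(mentions, links):
    """Group mentions into the connected components of the link graph."""
    parent = {m: m for m in mentions}

    def find(m):
        # Follow parent pointers to the representative, halving paths on the way.
        while parent[m] != m:
            parent[m] = parent[parent[m]]
            m = parent[m]
        return m

    for a, b in links:
        parent[find(a)] = find(b)

    groups = defaultdict(list)
    for m in mentions:
        groups[find(m)].append(m)
    # Sort for a deterministic result.
    return sorted(sorted(g) for g in groups.values())
```

Calling `save(gold, "gold.json")` and `save(sys_out, "sys.json")` would produce a pair of files shaped like the ones the `scorch gold.json sys.json out.txt` command above expects.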

Of course the system and gold files should use the same set of mention identifiers…

Multiple documents

If the inputs are directories, files with the same base name (excluding extension) as those present in the gold directory are expected to be present in the sys directory, with exactly one sys file for each gold file. In that case, the output scores will be the micro-average of the individual files' scores, i.e. their arithmetic means weighted by the relative numbers of

  • Gold mentions for Recall
  • System mentions for Precision
  • The sum of the previous two for F₁
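
The weighting described above can be sketched as follows. This only illustrates the arithmetic and is not scorch's actual code: the `DocScore` container, the function names, and the choice to return 0.0 when all weights are zero are assumptions made for the example.

```python
from typing import NamedTuple, Sequence, Tuple


class DocScore(NamedTuple):
    """Scores and mention counts for one document (hypothetical container)."""
    recall: float
    precision: float
    f1: float
    n_gold: int  # number of gold mentions in the document
    n_sys: int   # number of system mentions in the document


def weighted_mean(values: Sequence[float], weights: Sequence[float]) -> float:
    """Arithmetic mean of `values` weighted by `weights`."""
    total = sum(weights)
    if total == 0:
        # Assumption for this sketch: no mentions at all gives a score of 0.
        return 0.0
    return sum(v * w for v, w in zip(values, weights)) / total


def micro_average(docs: Sequence[DocScore]) -> Tuple[float, float, float]:
    """Return (recall, precision, F1) averaged over documents."""
    # Recall is weighted by gold mentions, precision by system mentions,
    # and F1 by the sum of both counts.
    recall = weighted_mean([d.recall for d in docs], [d.n_gold for d in docs])
    precision = weighted_mean([d.precision for d in docs], [d.n_sys for d in docs])
    f1 = weighted_mean([d.f1 for d in docs], [d.n_gold + d.n_sys for d in docs])
    return recall, precision, f1
```

With this weighting, a document with many mentions influences the global score more than a short one, whereas an unweighted mean would count every file equally.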

This is different from the reference interpretation where

  • MUC weighting ignores mentions in singleton entities
    • This should not make any difference for the CoNLL-2012 dataset, since singleton entities are not annotated.
    • For datasets with singletons, the shortcomings of MUC are well known, so this score shouldn't matter much
  • BLANC is calculated by micro-averaging coreference and non-coreference separately, using the number of links as weights instead of the number of mentions.

The CoNLL average score is the arithmetic mean of the global MUC, B³ and CEAFₑ F₁ scores.
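
As a one-line illustration of that definition (the function name and argument names are hypothetical, not scorch's API):

```python
def conll_average(muc_f1: float, b_cubed_f1: float, ceaf_e_f1: float) -> float:
    """Arithmetic mean of the global MUC, B-cubed and CEAF-e F1 scores."""
    return (muc_f1 + b_cubed_f1 + ceaf_e_f1) / 3
```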

Sources

License

Unless otherwise specified (see below), the following licence (the so-called “MIT License”) applies to all the files in this repository. See also LICENSE.md.

Copyright 2018 Loïc Grobol <loic.grobol@gmail.com>

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and
associated documentation files (the "Software"), to deal in the Software without restriction,
including without limitation the rights to use, copy, modify, merge, publish, distribute,
sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or
substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT
NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM,
DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT
OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

License exceptions
