
Coreference Resolution wrapper

Coreference Resolution is the task of finding all expressions that refer to the same entity in a text. It is an important step for many higher-level NLP tasks that involve natural language understanding, such as document summarization, question answering, and information extraction.

This is a simple library that wraps two Coreference Resolution models from the StanfordNLP package: the statistical model and the neural model. It uses the SpaCy package to load the neural model (a.k.a. NeuralCoref) and the stanfordnlp package to load the statistical model (a.k.a. CoreNLPCoref).

Requirements

pip3 install spacy
pip3 install stanfordnlp
pip3 install wrapperCoreference

StanfordNLP also requires a manual download of the CoreNLP core modules; see the StanfordNLP documentation for more details.

wget http://nlp.stanford.edu/software/stanford-corenlp-full-2018-10-05.zip
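The archive fetched above is a zip file, while the usage example further down passes an unpacked directory to setCoreNLP. A minimal sketch of the unpacking step using only the Python standard library follows. The helper name unpack_corenlp and the /tmp download location are assumptions made for illustration; they are not part of wrapperCoreference.

```python
import zipfile
from pathlib import Path


def unpack_corenlp(zip_path, target_dir):
    """Extract a zip archive into target_dir and return the unpacked top-level folder.

    Hypothetical helper for illustration; not part of wrapperCoreference.
    """
    target_dir = Path(target_dir)
    with zipfile.ZipFile(zip_path) as archive:
        # Take the first path component of the first entry as the top-level folder.
        top_level = archive.namelist()[0].split("/")[0]
        archive.extractall(target_dir)
    return target_dir / top_level


if __name__ == "__main__":
    # Assumed download location; adjust to wherever wget saved the file.
    archive_path = Path("/tmp/stanford-corenlp-full-2018-10-05.zip")
    if archive_path.exists():
        print(unpack_corenlp(archive_path, "/tmp"))
```

The returned path is the kind of directory that the statistical model example passes to setCoreNLP.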

Methods

Example usage of the neural model

from wrapperCoreference import WrapperCoreference
wc = WrapperCoreference()
print(wc.NeuralCoref(u'My sister has a dog. She loves him.'))
#output: [{'start': 21, 'end': 24, 'text': 'She', 'resolved': 'My sister'}, {'start': 31, 'end': 34, 'text': 'him', 'resolved': 'a dog'}]
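Each entry in the output carries character offsets into the input text, so the result can be used to rewrite the text with every mention replaced by its antecedent. The following is a minimal sketch that works on a hard-coded copy of the output shown above; resolve_mentions is a hypothetical helper, not a function of wrapperCoreference.

```python
def resolve_mentions(text, mentions):
    """Return text with each mention replaced by its 'resolved' antecedent."""
    resolved = text
    # Replace from right to left so that earlier offsets stay valid.
    for m in sorted(mentions, key=lambda m: m["start"], reverse=True):
        resolved = resolved[:m["start"]] + m["resolved"] + resolved[m["end"]:]
    return resolved


text = "My sister has a dog. She loves him."
# Copy of the NeuralCoref output shown above.
mentions = [
    {"start": 21, "end": 24, "text": "She", "resolved": "My sister"},
    {"start": 31, "end": 34, "text": "him", "resolved": "a dog"},
]
print(resolve_mentions(text, mentions))
# My sister has a dog. My sister loves a dog.
```

Working from the last mention backwards avoids recomputing offsets after each replacement changes the string length.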

Example usage of the statistical model

from wrapperCoreference import WrapperCoreference
wc = WrapperCoreference()
wc.setCoreNLP('/tmp/stanford-corenlp-full-2018-10-05')
print(wc.CoreNLPCoref(u'My sister has a dog. She loves him.'))
#output: [{'start': 31, 'end': 34, 'text': 'him', 'resolved': 'a dog', 'fullInformation': [{'start': 14, 'end': 19, 'text': 'a dog'}]}, {'start': 21, 'end': 24, 'text': 'She', 'resolved': 'My sister', 'fullInformation': [{'start': 0, 'end': 9, 'text': 'My sister'}]}]
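In the sample outputs, the statistical model lists its mentions out of document order and adds a 'fullInformation' key that the neural output does not have. If both outputs should be handled by the same downstream code, they can be brought to a common shape. This is a minimal sketch on hard-coded copies of the two outputs; normalize_mentions is a hypothetical helper, not part of wrapperCoreference.

```python
def normalize_mentions(mentions):
    """Sort mentions by start offset and keep only the keys both models share."""
    shared_keys = ("start", "end", "text", "resolved")
    ordered = sorted(mentions, key=lambda m: m["start"])
    return [{key: m[key] for key in shared_keys} for m in ordered]


# Copy of the CoreNLPCoref output shown above.
corenlp_output = [
    {"start": 31, "end": 34, "text": "him", "resolved": "a dog",
     "fullInformation": [{"start": 14, "end": 19, "text": "a dog"}]},
    {"start": 21, "end": 24, "text": "She", "resolved": "My sister",
     "fullInformation": [{"start": 0, "end": 9, "text": "My sister"}]},
]
# Copy of the NeuralCoref output for the same sentence.
neural_output = [
    {"start": 21, "end": 24, "text": "She", "resolved": "My sister"},
    {"start": 31, "end": 34, "text": "him", "resolved": "a dog"},
]

print(normalize_mentions(corenlp_output) == neural_output)
# True
```

Dropping 'fullInformation' loses the antecedent offsets, so keep the original list around if those are needed later.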

Combining the output with Entity Linking

You can use the nifwrapper library in order to merge the coreference outputs with Entity Linking annotations.

from wrapperCoreference import WrapperCoreference
from nifwrapper import NIFWrapper, NIFDocument, NIFSentence, NIFAnnotation

#---- Obtaining coreferences
wc = WrapperCoreference()
corefResults = wc.NeuralCoref(u'My sister has a dog. She loves him.')
#corefResults = [{'start': 21, 'end': 24, 'text': 'She', 'resolved': 'My sister'}, {'start': 31, 'end': 34, 'text': 'him', 'resolved': 'a dog'}]


#---- Obtaining Entity Linking results
# inline NIF corpus creation
wrp = NIFWrapper()
doc = NIFDocument("https://example.org/doc1")
#--
# "My sister has a dog." is 20 characters long, so the exclusive end offset is 20
sent = NIFSentence("https://example.org/doc1#char=0,20")
sent.addAttribute("nif:beginIndex","0","xsd:nonNegativeInteger")
sent.addAttribute("nif:endIndex","20","xsd:nonNegativeInteger")
sent.addAttribute("nif:isString","My sister has a dog.","xsd:string")
sent.addAttribute("nif:broaderContext",["https://example.org/doc1"],"URI LIST")


#-- 
a1 = NIFAnnotation("https://example.org/doc1#char=14,19", "14", "19", ["https://en.wikipedia.org/wiki/ExambleDogUri"], ["dbo:FamilyRelations"])
a1.addAttribute("nif:anchorOf","a dog","xsd:string")
sent.pushAnnotation(a1)
doc.pushSentence(sent)

#--
sent2 = NIFSentence("https://example.org/doc1#char=21,35")
sent2.addAttribute("nif:isString","She loves him.","xsd:string")
sent2.addAttribute("nif:broaderContext",["https://example.org/doc1"],"URI LIST")
sent2.addAttribute("nif:beginIndex","21","xsd:nonNegativeInteger")
sent2.addAttribute("nif:endIndex","35","xsd:nonNegativeInteger")
doc.pushSentence(sent2)
#--
wrp.pushDocument(doc)

#---- Combining EL annotations with coreferences 
wrp.extendsDocWithCoref(corefResults, doc.uri)

print(wrp.toString())
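The merge relies on character offsets, so the begin and end indices written into the NIF sentences and annotations have to match the text that was given to the coreference model. One way to avoid hand-counting is to compute them from the text. The sketch below assumes the "#char=begin,end" URI pattern and the exclusive end offsets seen in the example above; char_uri and span_of are hypothetical helpers, not part of nifwrapper.

```python
def char_uri(doc_uri, begin, end):
    """Build a fragment URI following the '#char=begin,end' pattern used above."""
    return f"{doc_uri}#char={begin},{end}"


def span_of(text, fragment, search_from=0):
    """Return (begin, end) of fragment in text, with an exclusive end offset."""
    begin = text.index(fragment, search_from)
    return begin, begin + len(fragment)


text = "My sister has a dog. She loves him."
doc_uri = "https://example.org/doc1"

for fragment in ("a dog", "She loves him."):
    begin, end = span_of(text, fragment)
    # text[begin:end] reproduces the fragment, which is a cheap consistency check.
    print(char_uri(doc_uri, begin, end), repr(text[begin:end]))
# https://example.org/doc1#char=14,19 'a dog'
# https://example.org/doc1#char=21,35 'She loves him.'
```

The computed values can then be passed as strings to NIFSentence, NIFAnnotation, and addAttribute in the same way as the literals in the example.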


Files for wrapperCoreference, version 0.0.4
Filename (size), file type, Python version
wrapperCoreference-0.0.4-py3-none-any.whl (4.7 kB), Wheel, py3
wrapperCoreference-0.0.4.tar.gz (7.1 kB), Source, None
