SpaCy DBpedia Spotlight wrapper
Spacy DBpedia Spotlight
This package acts as an Entity Recogniser and Linker using DBpedia Spotlight, annotating spaCy's Spans and adding them to the entity annotations.
It can create a new Language object or be added to an existing one.
It uses:
- DBPEDIA_ENT as ent_type
- the entity URI as ent_kb_id
Usage
Use on a new blank language
import spacy_dbpedia_spotlight
# here a new blank model will be created
nlp = spacy_dbpedia_spotlight.load('en')
doc = nlp('The president of USA is calling Boris Johnson to decide what to do about coronavirus')
print("Entities", [(ent.text, ent.label_, ent.kb_id_) for ent in doc.ents])
Or use on top of an existing nlp object (added as last pipeline stage)
import spacy
import spacy_dbpedia_spotlight
# use your model
nlp = spacy.load('en_core_web_lg')
# pass nlp as parameter
spacy_dbpedia_spotlight.load('en', nlp)
doc = nlp('The president of USA is calling Boris Johnson to decide what to do about coronavirus')
print("Entities", [(ent.text, ent.label_, ent.kb_id_) for ent in doc.ents])
Or, if you just want the pipeline stage so you can place it wherever you like
import spacy
# any nlp you want
nlp = spacy.blank('en')
# create the pipe component, the dict argument is optional
entity_annotator = nlp.create_pipe('annotate_dbpedia_spotlight', {'language_code':'it'})
# add it to your pipeline, using options such as `first`
nlp.add_pipe(entity_annotator, first=True)
Output example:
Entities [('USA', 'DBPEDIA_ENT', 'http://dbpedia.org/resource/United_States'), ('Boris Johnson', 'DBPEDIA_ENT', 'http://dbpedia.org/resource/Boris_Johnson'), ('coronavirus', 'DBPEDIA_ENT', 'http://dbpedia.org/resource/Coronavirus')]
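Since each entity carries its DBpedia URI as kb_id, a common next step is to turn those URIs into readable resource names. The following is a minimal sketch that works on the tuples from the output example above, so it runs without the package or a network connection. The helper name resource_name is hypothetical and not part of spacy_dbpedia_spotlight.

```python
from urllib.parse import urlparse, unquote

# Entity tuples copied from the output example: (text, label, kb_id)
entities = [
    ('USA', 'DBPEDIA_ENT', 'http://dbpedia.org/resource/United_States'),
    ('Boris Johnson', 'DBPEDIA_ENT', 'http://dbpedia.org/resource/Boris_Johnson'),
    ('coronavirus', 'DBPEDIA_ENT', 'http://dbpedia.org/resource/Coronavirus'),
]


def resource_name(uri):
    """Hypothetical helper: last path segment of a DBpedia URI, underscores as spaces."""
    last_segment = urlparse(uri).path.rsplit('/', 1)[-1]
    return unquote(last_segment).replace('_', ' ')


# Map each mention in the text to the name of the DBpedia resource it links to
mentions = {
    text: resource_name(kb_id)
    for text, label, kb_id in entities
    if label == 'DBPEDIA_ENT'
}
print(mentions)
```

With a real doc, the same comprehension could be fed from (ent.text, ent.label_, ent.kb_id_) for each ent in doc.ents.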