Named Entity Recognition (NER) and Relation Extraction (RE) library using Regular Expressions
Extr
Install
pip install extr
Example
text = 'Ted is a Pitcher.'
1. Entity Extraction
Find named entities in text.
import re
from extr import RegEx, RegExLabel
from extr.entities import EntityExtractor
entity_extractor = EntityExtractor([
    RegExLabel('PERSON', [
        RegEx([r'ted'], re.IGNORECASE)
    ]),
    RegExLabel('POSITION', [
        RegEx([r'pitcher'], re.IGNORECASE)
    ]),
])
entities = entity_extractor.get_entities(text)
## entities == [
## <Entity label="POSITION" text="Pitcher" span=(9, 16)>,
## <Entity label="PERSON" text="Ted" span=(0, 3)>
## ]
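To make the mechanics concrete, here is a minimal sketch of regex-based entity matching that uses only the standard `re` module. It illustrates the idea and is not extr's actual implementation: `PATTERNS`, `find_entities`, and the `(label, text, span)` tuple layout are hypothetical names chosen for this example, and the results are sorted by start offset, which may differ from the order extr returns.

```python
import re

# Hypothetical label -> compiled patterns table (not part of extr).
PATTERNS = {
    'PERSON': [re.compile(r'ted', re.IGNORECASE)],
    'POSITION': [re.compile(r'pitcher', re.IGNORECASE)],
}

def find_entities(text):
    """Return (label, matched_text, (start, end)) tuples sorted by start offset."""
    found = []
    for label, patterns in PATTERNS.items():
        for pattern in patterns:
            for match in pattern.finditer(text):
                found.append((label, match.group(), match.span()))
    return sorted(found, key=lambda item: item[2][0])

result = find_entities('Ted is a Pitcher.')
```

Each tuple carries the same three pieces of information shown in the `Entity` output above: the label, the matched text, and the character span.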
2. Visualize Entities in HTML
Annotate text to display in HTML.
from extr.entities import HtmlEntityAnnotator
html = HtmlEntityAnnotator().annotate(text, entities)
<!-- customize colors by label -->
<style>
    span.entity {
        border: 1px solid black;
        border-radius: 5px;
        padding: 5px;
        margin: 3px;
        color: gray;
        cursor: pointer;
    }
    span.label {
        font-weight: bold;
        padding: 3px;
        color: black;
    }
    .lb-PERSON {
        background-color: orange;
    }
    .lb-POSITION {
        background-color: yellow;
    }
</style>
<div>
    {{ -- insert html here -- }}
</div>
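The stylesheet above targets `span.entity`, `span.label`, and `lb-<LABEL>` classes. The sketch below shows one way such markup could be produced from entity spans using only the standard library. The exact HTML emitted by `HtmlEntityAnnotator` is not shown in this document, so the nesting here is an assumption, and `annotate_html` is a hypothetical helper that expects non-overlapping spans.

```python
import html

def annotate_html(text, entities):
    """Wrap each (label, (start, end)) entity in a styled span.

    Assumes spans do not overlap. The markup shape is a guess based on the
    class names in the stylesheet, not extr's confirmed output.
    """
    parts = []
    cursor = 0
    for label, (start, end) in sorted(entities, key=lambda e: e[1][0]):
        # Escape the plain text between the previous entity and this one.
        parts.append(html.escape(text[cursor:start]))
        parts.append(
            f'<span class="entity lb-{label}">'
            f'<span class="label">{label}</span>'
            f'{html.escape(text[start:end])}</span>'
        )
        cursor = end
    # Escape whatever follows the last entity.
    parts.append(html.escape(text[cursor:]))
    return ''.join(parts)

markup = annotate_html(
    'Ted is a Pitcher.',
    [('POSITION', (9, 16)), ('PERSON', (0, 3))],
)
```

Escaping the text outside the entity spans keeps characters such as `<` or `&` in the source text from being interpreted as markup.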
3. Relation Extraction
Annotate text and extract relationships between entities.
from extr.entities import EntityAnnotator
from extr.relations import RelationExtractor, \
    RegExRelationLabelBuilder
## define relationship between PERSON and POSITION
relationship = RegExRelationLabelBuilder('is_a') \
    .add_e1_to_e2(
        'PERSON', ## e1
        [
            ## regex for the text expected between e1 and e2
            r'\s+is\s+a\s+',
        ],
        'POSITION' ## e2
    ) \
    .build()
relations_to_extract = [relationship]
## `entities` comes from 'Entity Extraction' above
annotation_results = EntityAnnotator().annotate(text, entities)
relations = RelationExtractor(relations_to_extract).extract(annotation_results)
## relations == [
## <Relation e1="Ted" r="is_a" e2="Pitcher">
## ]
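The annotate-then-match flow above can be sketched with the standard library alone: replace each entity with a placeholder, then search for a placeholder, the connecting pattern, and a second placeholder. This is an illustration under assumptions, not extr's implementation. The `<LABEL:index>` placeholder format and the `extract_is_a` helper are hypothetical, and the sketch assumes non-overlapping spans and text that does not already contain such placeholders.

```python
import re

def extract_is_a(text, entities):
    """Return (e1_text, 'is_a', e2_text) triples for PERSON 'is a' POSITION.

    entities: list of (label, (start, end)) tuples with non-overlapping spans.
    """
    ordered = sorted(entities, key=lambda e: e[1][0])

    # Step 1: swap each entity for a placeholder, working right to left so
    # the offsets of entities further left stay valid.
    annotated = text
    for index in range(len(ordered) - 1, -1, -1):
        label, (start, end) = ordered[index]
        annotated = annotated[:start] + f'<{label}:{index}>' + annotated[end:]

    # Step 2: match e1 placeholder, connecting text, e2 placeholder.
    pattern = re.compile(r'<PERSON:(\d+)>\s+is\s+a\s+<POSITION:(\d+)>')
    relations = []
    for match in pattern.finditer(annotated):
        _, (s1, t1) = ordered[int(match.group(1))]
        _, (s2, t2) = ordered[int(match.group(2))]
        relations.append((text[s1:t1], 'is_a', text[s2:t2]))
    return relations

found = extract_is_a(
    'Ted is a Pitcher.',
    [('POSITION', (9, 16)), ('PERSON', (0, 3))],
)
```

Matching on placeholders rather than raw text means the relation pattern only has to describe what sits between the two entities, which is the role the `r'\s+is\s+a\s+'` pattern plays in the example above.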