Named Entity Recognition (NER) and Relation Extraction (RE) library using Regular Expressions
Extr
Install
pip install extr
Example
text = 'Ted is a Pitcher.'
1. Entity Extraction
Find named entities in text.
import re

from extr import RegEx, RegExLabel, EntityExtractor

entity_extractor = EntityExtractor([
    RegExLabel('PERSON', [
        RegEx([r'ted'], re.IGNORECASE)
    ]),
    RegExLabel('POSITION', [
        RegEx([r'pitcher'], re.IGNORECASE)
    ]),
])

entities = entity_extractor.get_entities(text)
## entities == [
##     <Entity label="POSITION" text="Pitcher" span=(9, 16)>,
##     <Entity label="PERSON" text="Ted" span=(0, 3)>
## ]
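To make the mechanics concrete, here is a minimal sketch of label-to-pattern matching that uses only the standard library. It does not call extr and is not the library's implementation; `find_entities` and the `(label, text, span)` tuple shape are hypothetical names chosen for illustration.

```python
import re


def find_entities(text, patterns):
    """Return (label, matched_text, (start, end)) for every pattern hit.

    `patterns` maps a label to a list of regular expressions.
    This helper is illustrative only and is not part of extr.
    """
    found = []
    for label, expressions in patterns.items():
        for expression in expressions:
            for match in re.finditer(expression, text, re.IGNORECASE):
                found.append((label, match.group(), match.span()))
    # Sort by start offset so results follow the order of the text.
    return sorted(found, key=lambda item: item[2][0])


sample = 'Ted is a Pitcher.'
sample_entities = find_entities(sample, {
    'PERSON': [r'ted'],
    'POSITION': [r'pitcher'],
})
print(sample_entities)
# [('PERSON', 'Ted', (0, 3)), ('POSITION', 'Pitcher', (9, 16))]
```

A bare pattern such as r'ted' also matches inside longer words like 'wanted'. Wrapping it in word boundaries (r'\bted\b') is one way to avoid that.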
2. Relation Extraction
Annotate entities in the text, then extract relationships between them.
from extr import EntityAnnotator
from extr import RegExRelationLabelBuilder, RelationExtractor

## define relationship between PERSON and POSITION
relationship = RegExRelationLabelBuilder('is_a') \
    .add_e1_to_e2(
        'PERSON', ## e1
        [
            ## define how the relationship appears in the text between e1 and e2
            r'\s+is\s+a\s+',
        ],
        'POSITION' ## e2
    ) \
    .build()

relations_to_extract = [relationship]

## `entities` comes from 'Entity Extraction' above
annotation_results = EntityAnnotator().annotate(text, entities)
relations = RelationExtractor(relations_to_extract).extract(annotation_results)
## relations == [
##     <Relation e1="Ted" r="is_a" e2="Pitcher">
## ]
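The annotate-then-match idea can be sketched with the standard library alone. This is a guess at the general technique, not extr's actual implementation: the `<LABEL:index>` marker format and the helpers `annotate` and `find_relations` are assumptions made up for this example.

```python
import re


def annotate(text, entities):
    """Replace each entity span with a <LABEL:index> marker.

    Assumes `entities` is sorted by start offset and has no overlaps.
    The marker format is hypothetical, not the one extr uses.
    """
    pieces, cursor = [], 0
    for index, (label, _, (start, end)) in enumerate(entities):
        pieces.append(text[cursor:start])
        pieces.append(f'<{label}:{index}>')
        cursor = end
    pieces.append(text[cursor:])
    return ''.join(pieces)


def find_relations(annotated, entities, name, e1_label, between, e2_label):
    """Return (e1_text, name, e2_text) wherever `between` joins an e1 and an e2 marker."""
    pattern = re.compile(
        rf'<{e1_label}:(\d+)>(?:{between})<{e2_label}:(\d+)>'
    )
    return [
        (entities[int(m.group(1))][1], name, entities[int(m.group(2))][1])
        for m in pattern.finditer(annotated)
    ]


sample = 'Ted is a Pitcher.'
sample_entities = [('PERSON', 'Ted', (0, 3)), ('POSITION', 'Pitcher', (9, 16))]

annotated = annotate(sample, sample_entities)
print(annotated)
# <PERSON:0> is a <POSITION:1>.

sample_relations = find_relations(
    annotated, sample_entities, 'is_a', 'PERSON', r'\s+is\s+a\s+', 'POSITION'
)
print(sample_relations)
# [('Ted', 'is_a', 'Pitcher')]
```

Matching against markers rather than raw text lets the relation pattern describe only the words between the two entities, independent of how each entity was spelled.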