extractacy - pattern extraction and named entity linking for spaCy
A spaCy pipeline object for extracting values that correspond to a named entity (e.g., birth dates, account numbers, or laboratory results).
Installation and usage
Install the library.
pip install extractacy
Import the library and spaCy.
import spacy
from spacy.pipeline import EntityRuler
from extractacy.extract import ValueExtractor
Load a spaCy language model and set up an EntityRuler for the example.
nlp = spacy.load("en_core_web_sm")
# Set up entity ruler
ruler = EntityRuler(nlp)
patterns = [
    {"label": "TEMP_READING", "pattern": [{"LOWER": "temperature"}]},
    {"label": "TEMP_READING", "pattern": [{"LOWER": "temp"}]},
    {
        "label": "DISCHARGE_DATE",
        "pattern": [{"LOWER": "discharge"}, {"LOWER": "date"}],
    },
]
ruler.add_patterns(patterns)
nlp.add_pipe(ruler, last=True)
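To make the `LOWER` patterns above concrete, here is a minimal pure-Python sketch of case-insensitive sequence matching. It is not spaCy's matcher: `find_lower_matches` is a hypothetical helper that handles only the `LOWER` key, and the token list is hand-written rather than produced by a tokenizer.

```python
def find_lower_matches(tokens, pattern):
    """Return (start, end) token spans where every pattern item matches.

    Only the LOWER key is handled here; spaCy supports many more attributes.
    """
    wanted = [item["LOWER"] for item in pattern]
    spans = []
    for start in range(len(tokens) - len(wanted) + 1):
        # Compare the lowercased window against the pattern, item by item.
        window = [tok.lower() for tok in tokens[start:start + len(wanted)]]
        if window == wanted:
            spans.append((start, start + len(wanted)))
    return spans


# Hand-written tokens (an assumption about how the text would be split).
tokens = ["Discharge", "Date", ":", "11/15/2008", "."]
pattern = [{"LOWER": "discharge"}, {"LOWER": "date"}]
print(find_lower_matches(tokens, pattern))  # [(0, 2)]
```

Because the comparison is on lowercased text, "Discharge Date", "discharge date", and "DISCHARGE DATE" would all produce the same span.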
Define which entities you would like to link patterns to. Each entity needs 3 things:
- patterns to search for (list). This relies on spaCy token matching syntax.
- n_tokens to search around a named entity (int or sent)
- direction (right, left, both)
# Define ent_patterns for value extraction
ent_patterns = {
    "DISCHARGE_DATE": {
        "patterns": [[{"SHAPE": "dd/dd/dddd"}], [{"SHAPE": "dd/d/dddd"}]],
        "n": 2,
        "direction": "right",
    },
    "TEMP_READING": {
        "patterns": [
            [
                {"LIKE_NUM": True},
                {"LOWER": {"IN": ["f", "c", "farenheit", "celcius", "centigrade", "degrees"]}},
            ]
        ],
        "n": "sent",
        "direction": "both",
    },
}
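The `SHAPE` patterns above match on a token's character shape rather than its text. The following is a rough pure-Python approximation of how such a shape string can be derived (digits become `d`, lowercase letters `x`, uppercase letters `X`, and runs of the same shape character are capped at four). `word_shape` here is an illustrative helper written for this example, not a call into spaCy.

```python
def word_shape(text):
    """Approximate a spaCy-style shape string for one token."""
    shape = []
    last = ""
    run = 0
    for char in text:
        if char.isalpha():
            shape_char = "X" if char.isupper() else "x"
        elif char.isdigit():
            shape_char = "d"
        else:
            # Punctuation and other characters are kept as-is.
            shape_char = char
        if shape_char == last:
            run += 1
        else:
            run = 0
            last = shape_char
        # Keep at most four consecutive identical shape characters.
        if run < 4:
            shape.append(shape_char)
    return "".join(shape)


for token in ["11/15/2008", "11/5/2008", "1/5/2008"]:
    print(token, word_shape(token))
# 11/15/2008 dd/dd/dddd
# 11/5/2008 dd/d/dddd
# 1/5/2008 d/d/dddd
```

Under this approximation, a date such as 1/5/2008 has the shape `d/d/dddd`, which neither of the two example patterns covers, so additional shape patterns may be needed for other date formats.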
Add the ValueExtractor to the spaCy processing pipeline.
valext = ValueExtractor(nlp, ent_patterns)
nlp.add_pipe(valext, last=True)

doc = nlp("Discharge Date: 11/15/2008. Patient had temp reading of 102.6 degrees.")
for e in doc.ents:
    if e._.value_extract:
        print(e.text, e.label_, e._.value_extract)
## Discharge Date DISCHARGE_DATE 11/15/2008
## temp reading TEMP_READING 102.6 degrees
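The `n` and `direction` settings decide which tokens around an entity are searched for a value. The sketch below illustrates one plausible reading of those settings in plain Python; it is not extractacy's implementation. `search_window` is a hypothetical helper, the token list is hand-written, and clamping an integer window to the document bounds is an assumption.

```python
def search_window(ent_start, ent_end, n, direction, sent_bounds, doc_len):
    """Return (start, end) token ranges to search around an entity.

    All ranges use an exclusive end index. n is an int or the string "sent".
    """
    if direction not in ("left", "right", "both"):
        raise ValueError(f"unknown direction: {direction!r}")
    if n == "sent":
        # Search up to the edges of the sentence containing the entity.
        sent_start, sent_end = sent_bounds
        left = (sent_start, ent_start)
        right = (ent_end, sent_end)
    else:
        # Search n tokens on each side, clamped to the document (assumption).
        left = (max(0, ent_start - n), ent_start)
        right = (ent_end, min(doc_len, ent_end + n))
    if direction == "left":
        return [left]
    if direction == "right":
        return [right]
    return [left, right]


# Hand-written tokens for the example sentence (tokenization is assumed).
tokens = ["Discharge", "Date", ":", "11/15/2008", ".",
          "Patient", "had", "temp", "reading", "of", "102.6", "degrees", "."]

# DISCHARGE_DATE entity at tokens 0-2, n=2, direction="right".
for start, end in search_window(0, 2, 2, "right", (0, 5), len(tokens)):
    print(tokens[start:end])  # [':', '11/15/2008']

# TEMP_READING entity at token 7, n="sent", direction="both".
for start, end in search_window(7, 8, "sent", "both", (5, 13), len(tokens)):
    print(tokens[start:end])
# ['Patient', 'had']
# ['reading', 'of', '102.6', 'degrees', '.']
```

In this reading, the date falls inside the two-token window to the right of "Discharge Date", and the temperature value falls inside the sentence-wide window around "temp".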
Contributing
Authors
- Jeno Pizarro
License