sentency
A small spaCy pipeline component for matching within document sentences using regular expressions.
Free software: MIT license
Documentation: https://sentency.readthedocs.io.
Features
spaCy component for sentence-by-sentence pattern matching
Find matches with complex patterns using the power of regular expressions
Easily convert simple keywords into valid regular expressions
Specify matching patterns as well as patterns to ignore
Annotate matches for NER (Named Entity Recognition) tasks
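To make the match/ignore idea concrete, here is a minimal sketch using only Python's standard `re` module. It is not sentenCy's implementation: `keywords_to_regex` and `matching_sentences` are hypothetical helpers, and the "every keyword must appear somewhere in the sentence, in any order" rule is an assumption chosen for illustration.

```python
import re


def keywords_to_regex(keywords: str) -> str:
    """Hypothetical stand-in for a keyword-to-regex helper.

    Assumption: every whitespace-separated keyword must appear in the
    sentence as a whole word, in any order.
    """
    lookaheads = "".join(
        rf"(?=.*\b{re.escape(word)}\b)" for word in keywords.split()
    )
    return lookaheads + ".*"


def matching_sentences(sentences, match_regex, ignore_regex=None):
    """Keep sentences that satisfy match_regex and do not satisfy ignore_regex."""
    flags = re.IGNORECASE | re.DOTALL
    match_re = re.compile(match_regex, flags)
    ignore_re = re.compile(ignore_regex, flags) if ignore_regex else None
    kept = []
    for sentence in sentences:
        # The ignore pattern takes precedence over the match pattern.
        if ignore_re is not None and ignore_re.search(sentence):
            continue
        if match_re.search(sentence):
            kept.append(sentence)
    return kept


sentences = [
    "Screening for abdominal aortic aneurysm.",
    "There is evidence of a fusiform abdominal aortic aneurysm measuring 3.4 cm.",
]
kept = matching_sentences(
    sentences,
    keywords_to_regex("abdominal aortic aneurysm"),
    keywords_to_regex("screening aneurysm"),
)
print(kept)
```

Under these assumptions only the second sentence is kept, because the first one also satisfies the ignore pattern.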
Installation
pip install sentency
Usage
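The following minimal example showcases the main features of sentenCy: converting keywords into regular expressions, specifying a pattern to ignore, and annotating matches as entities.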
import spacy
from spacy import displacy
from sentency.regex import regexize_keywords
from sentency.sentency import Sentex
text = """
Screening for abdominal aortic aneurysm.
Impression: There is evidence of a fusiform
abdominal aortic aneurysm measuring 3.4 cm.
"""
aaa_keywords = "abdominal aortic aneurysm"
ignore_keywords = "screening aneurysm"
keyword_regex = regexize_keywords(aaa_keywords)
ignore_regex = regexize_keywords(ignore_keywords)
nlp = spacy.load("en_core_web_sm")
nlp.add_pipe(
    "sentex",
    config={
        "sentence_regex": keyword_regex,
        "ignore_regex": ignore_regex,
        "annotate_ents": True,
        "label": "AAA",
    },
)
doc = nlp(text)
displacy.render(doc, style="ent", options={"ents": ["AAA"]})
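As a rough illustration of what labelling a match for NER amounts to, the sketch below turns regex matches into labelled character spans using only the standard library. It is not how sentenCy annotates entities internally: `label_spans` is a hypothetical helper, and whether the component labels the matched phrase or the whole sentence is not stated here.

```python
import re


def label_spans(text, pattern, label):
    """Return (start, end, label) character spans for each regex match in text."""
    return [
        (match.start(), match.end(), label)
        for match in re.finditer(pattern, text, re.IGNORECASE)
    ]


text = "There is evidence of a fusiform abdominal aortic aneurysm measuring 3.4 cm."
spans = label_spans(text, r"abdominal\s+aortic\s+aneurysm", "AAA")

for start, end, label in spans:
    # Each span points back into the original text by character offset.
    print(text[start:end], label)
```

In a spaCy pipeline, character offsets like these could be converted to entity spans with `Doc.char_span`, provided they align with token boundaries.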
Credits
This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.
History
0.1.0 (2022-03-08)
First release on PyPI.