sentency

A small spaCy pipeline component for matching within document sentences using regular expressions.

Features

  • spaCy component for sentence-by-sentence pattern matching

  • Find matches with complex patterns using the power of regular expressions

  • Easily convert simple keywords into valid regular expressions

  • Specify matching patterns as well as patterns to ignore

  • Annotate matches for NER (Named Entity Recognition) tasks
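To illustrate what converting keywords into a regular expression can look like, here is a minimal sketch using only Python's standard `re` module. The helper name `keywords_to_regex` and its all-words-in-any-order behavior are assumptions made for illustration; this is not sentency's `regexize_keywords`, whose exact output may differ.

```python
import re


def keywords_to_regex(keywords: str) -> str:
    """Hypothetical helper (not part of sentency).

    Builds a pattern that matches a sentence only if every
    whitespace-separated keyword occurs in it, in any order.
    """
    words = keywords.split()
    # One lookahead per keyword; re.escape guards against regex metacharacters.
    lookaheads = "".join(rf"(?=.*\b{re.escape(word)}\b)" for word in words)
    # (?i) ignores case, (?s) lets "." span line breaks inside a sentence.
    return rf"(?is)^{lookaheads}.*$"


pattern = keywords_to_regex("screening aneurysm")
has_both = re.search(pattern, "Screening for abdominal aortic aneurysm.") is not None
has_one = re.search(pattern, "There is evidence of an aneurysm.") is not None
```

Under these assumptions, a sentence matches only when it contains both "screening" and "aneurysm", which is one way a keyword string could serve as an ignore pattern.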

Installation

pip install sentency

Usage

The following minimal example showcases the features of sentenCy.

import spacy
from spacy import displacy

from sentency.regex import regexize_keywords
from sentency.sentency import Sentex

# Sample report text containing two sentences
text = """
Screening for abdominal aortic aneurysm.
Impression: There is evidence of a fusiform
abdominal aortic aneurysm measuring 3.4 cm.
"""

# Keywords to match, and keywords marking sentences to ignore
aaa_keywords = "abdominal aortic aneurysm"
ignore_keywords = "screening aneurysm"

# Convert the keyword strings into regular expressions
keyword_regex = regexize_keywords(aaa_keywords)
ignore_regex = regexize_keywords(ignore_keywords)

# Add the sentex component to a spaCy pipeline
nlp = spacy.load("en_core_web_sm")
nlp.add_pipe(
    "sentex",
    config={
        "sentence_regex": keyword_regex,
        "ignore_regex": ignore_regex,
        "annotate_ents": True,
        "label": "AAA",
    },
)

doc = nlp(text)

# Render only the entities labeled "AAA"
displacy.render(doc, style="ent", options={"ents": ["AAA"]})
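The idea of matching sentence by sentence while skipping ignored sentences can be sketched without spaCy. The function `matching_sentences`, the hand-split sentence list, and the two hand-written patterns below are all assumptions for illustration; they do not reproduce the internals of the `sentex` component.

```python
import re


def matching_sentences(sentences, sentence_regex, ignore_regex=None):
    """Hypothetical sketch: keep sentences that match sentence_regex
    unless they also match ignore_regex."""
    flags = re.IGNORECASE | re.DOTALL
    keep = re.compile(sentence_regex, flags)
    skip = re.compile(ignore_regex, flags) if ignore_regex else None

    hits = []
    for sentence in sentences:
        # An ignore match takes priority over a keyword match.
        if skip is not None and skip.search(sentence):
            continue
        if keep.search(sentence):
            hits.append(sentence)
    return hits


# Sentences split by hand here; in a spaCy pipeline they would come from doc.sents.
sentences = [
    "Screening for abdominal aortic aneurysm.",
    "Impression: There is evidence of a fusiform\nabdominal aortic aneurysm measuring 3.4 cm.",
]

hits = matching_sentences(
    sentences,
    sentence_regex=r"abdominal\s+aortic\s+aneurysm",
    ignore_regex=r"\bscreening\b",
)
```

In this sketch only the "Impression" sentence is kept: the first sentence contains the keywords but is dropped because it also matches the ignore pattern.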

Credits

This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.

History

0.1.0 (2022-03-08)

  • First release on PyPI.
