A spaCy pipeline object for negation.

negspacy: negation for spaCy

spaCy pipeline object for negating concepts in text. Based on the NegEx algorithm.

NegEx - A Simple Algorithm for Identifying Negated Findings and Diseases in Discharge Summaries. Chapman, Bridewell, Hanbury, Cooper, Buchanan. https://doi.org/10.1006/jbin.2001.1029

Installation and usage

Install the library.

pip install negspacy

Import library and spaCy.

import spacy
from negspacy.negation import Negex

Load a spaCy language model and add the negspacy pipeline object. Filtering on entity types is optional.

nlp = spacy.load("en_core_web_sm")
negex = Negex(nlp, ent_types=["PERSON","ORG"])
nlp.add_pipe(negex, last=True)

View negations.

doc = nlp("She does not like Steve Jobs but likes Apple products.")

for e in doc.ents:
    print(e.text, e._.negex)

# Steve Jobs True
# Apple False
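
Once entities carry the negex flag, a common next step is to separate negated entities from affirmed ones. The sketch below is a hypothetical helper (split_by_negation is not part of negspacy) that works on any objects exposing .text and ._.negex. It uses stand-in objects in place of real spaCy spans so it runs without a model installed.

```python
from types import SimpleNamespace


def split_by_negation(ents):
    """Return (negated, affirmed) lists of entity texts.

    Hypothetical helper: accepts any iterable of objects that expose
    `.text` and `._.negex`, as the entities in the example above do.
    """
    negated, affirmed = [], []
    for ent in ents:
        (negated if ent._.negex else affirmed).append(ent.text)
    return negated, affirmed


# Stand-in objects that mimic the two entities from the example above.
fake_ents = [
    SimpleNamespace(text="Steve Jobs", _=SimpleNamespace(negex=True)),
    SimpleNamespace(text="Apple", _=SimpleNamespace(negex=False)),
]

negated, affirmed = split_by_negation(fake_ents)
print(negated)   # ['Steve Jobs']
print(affirmed)  # ['Apple']
```

With a real pipeline, passing doc.ents instead of fake_ents should behave the same way.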

Consider pairing with scispacy to find UMLS concepts in text and process negations.

NegEx Patterns

  • psuedo_negations - phrases that are false triggers, ambiguous negations, or double negatives
  • preceding_negations - negation phrases that precede an entity
  • following_negations - negation phrases that follow an entity
  • termination - phrases that cut a sentence into parts, for purposes of negation detection (e.g., "but")
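
To make the roles of the four pattern types concrete, here is a simplified, pure-Python sketch of NegEx-style scoping over a list of tokens. It is an illustration only, not negspacy's implementation: the function names (find_phrases, is_negated) and the tiny phrase lists are assumptions chosen for the example.

```python
def find_phrases(tokens, phrases):
    """Return (start, end) token spans where any of the phrases occurs."""
    spans = []
    for phrase in phrases:
        words = phrase.split()
        n = len(words)
        for i in range(len(tokens) - n + 1):
            if tokens[i:i + n] == words:
                spans.append((i, i + n))
    return spans


def overlaps(a, b):
    """True if two half-open token spans share at least one token."""
    return a[0] < b[1] and b[0] < a[1]


def is_negated(tokens, entity, preceding, following, pseudo, termination):
    """Decide whether the entity span (start, end) falls in a negation scope."""
    pseudo_spans = find_phrases(tokens, pseudo)

    def real(spans):
        # A trigger covered by a pseudo-negation is a false trigger: drop it.
        return [s for s in spans
                if not any(overlaps(s, p) for p in pseudo_spans)]

    pre = real(find_phrases(tokens, preceding))
    post = real(find_phrases(tokens, following))
    term = find_phrases(tokens, termination)

    start, end = entity
    # The scope is bounded by the nearest termination phrase on each side.
    left = max([t[1] for t in term if t[1] <= start], default=0)
    right = min([t[0] for t in term if t[0] >= end], default=len(tokens))

    if any(left <= s[0] and s[1] <= start for s in pre):
        return True  # a preceding trigger sits before the entity, in scope
    if any(end <= s[0] and s[1] <= right for s in post):
        return True  # a following trigger sits after the entity, in scope
    return False


# Illustrative phrase lists (assumptions, not the library's termsets).
rules = dict(preceding=["not"], following=["unlikely"],
             pseudo=["not only"], termination=["but"])

tokens = "she does not like steve jobs but likes apple products".split()
print(is_negated(tokens, (4, 6), **rules))  # True: "steve jobs" follows "not"
print(is_negated(tokens, (8, 9), **rules))  # False: "but" ends the scope
```

In this sketch, "but" stops the preceding trigger "not" from reaching "apple", and a phrase such as "not only" would suppress the "not" it contains.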

Termsets

Designate the termset to use; en_clinical is used by default.

negex = Negex(nlp, language = "en_clinical")

  • en = phrases for general English language text
  • en_clinical (default) = adds phrases specific to the clinical domain to the general English phrases
  • en_clinical_sensitive = adds additional phrases to help rule out historical and possibly irrelevant entities
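
The three termsets build on one another. The following sketch shows that layering with plain dictionaries. Everything in it is hypothetical: the phrase lists are invented for illustration and are not copied from negspacy's termsets, and extend_termset is not a library function.

```python
# Hypothetical phrase lists, for illustration only.
EN = {
    "preceding_negations": ["not", "no", "without"],
    "termination": ["but", "however"],
}
CLINICAL_EXTRA = {
    "preceding_negations": ["denies", "negative for"],
    "following_negations": ["was ruled out"],
}
SENSITIVE_EXTRA = {
    "preceding_negations": ["history of"],
}


def extend_termset(base, extra):
    """Return a new termset with the phrases in `extra` appended to `base`."""
    merged = {key: list(phrases) for key, phrases in base.items()}
    for key, phrases in extra.items():
        bucket = merged.setdefault(key, [])
        for phrase in phrases:
            if phrase not in bucket:
                bucket.append(phrase)
    return merged


TERMSETS = {"en": EN}
TERMSETS["en_clinical"] = extend_termset(EN, CLINICAL_EXTRA)
TERMSETS["en_clinical_sensitive"] = extend_termset(
    TERMSETS["en_clinical"], SENSITIVE_EXTRA
)

for name, termset in TERMSETS.items():
    print(name, termset["preceding_negations"])
```

Each later termset keeps every phrase of the one before it and adds its own, which is why the more specialized termsets flag more entities as negated.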

Additional Functionality

Use your own patterns or view patterns in use

Use your own patterns

nlp = spacy.load("en_core_web_sm")
negex = Negex(nlp, termination=["but", "however", "nevertheless", "except"])

View patterns in use

patterns_dict = negex.get_patterns()

Negations in noun chunks

Depending on the Named Entity Recognition model you are using, negations may be "chunked together" with nouns. For example, when using scispacy:

nlp = spacy.load("en_core_sci_sm")
doc = nlp("There is no headache.")
for e in doc.ents:
    print(e.text)

# no headache

This would cause the Negex algorithm to miss the preceding negation. To account for this, you can add a chunk_prefix:

nlp = spacy.load("en_core_sci_sm")
negex = Negex(nlp, language = "en_clinical", chunk_prefix = ["no"])
nlp.add_pipe(negex)
doc = nlp("There is no headache.")
for e in doc.ents:
    print(e.text, e._.negex)

# no headache True
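
The idea behind chunk_prefix can be sketched without spaCy: an entity whose text begins with one of the prefix phrases is treated as negated. The function below is a hypothetical stand-in for that check, assuming word-level comparison; it is not negspacy's internal code.

```python
def negated_by_chunk_prefix(entity_text, chunk_prefix):
    """Return True if the entity text begins with one of the prefix phrases.

    Comparison is done on whole words, so the prefix "no" does not
    match an entity such as "nose bleed".
    """
    words = entity_text.lower().split()
    for prefix in chunk_prefix:
        prefix_words = prefix.lower().split()
        if prefix_words and words[:len(prefix_words)] == prefix_words:
            return True
    return False


print(negated_by_chunk_prefix("no headache", ["no"]))  # True
print(negated_by_chunk_prefix("nose bleed", ["no"]))   # False
print(negated_by_chunk_prefix("headache", ["no"]))     # False
```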

Contributing

contributing

Authors

  • Jeno Pizarro

License

license

API Documentation

Docs

