
Fast lightweight NLP library for reliable concept extraction and negation detection

NLPLite

Lightweight NLP library for fast concept extraction and enhanced negation and uncertainty detection.

Features

  • Fast Aho-Corasick term extraction (C-accelerated via pyahocorasick)
  • NegEx engine for negation and uncertainty detection
  • CLI for quick runs

Install

pip install nlplite

Quick Start

Basic Example

from nlplite import flash_extractor

# Term to code dictionary
dictionary = {"headache": "C0018681", "fever": "C0015967"}

extractor = flash_extractor(dictionary)
text = "The patient denies headache but reports fever."

print(extractor.run(text, negation_check=True))

Output:

[('headache:N', 'C0018681:N', 18, 25), ('fever:Y', 'C0015967:Y', 40, 44)]
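
The status suffix in each tuple usually has to be separated from the term and code before the results can be filtered. The following is a minimal sketch of one way to do that in plain Python, using the sample output above as input. The `Mention` type and `parse_extraction` helper are hypothetical names introduced here, not part of NLPLite, and the sketch assumes every term and code carries a `:Y`, `:N`, or `:U` suffix.

```python
from typing import NamedTuple


class Mention(NamedTuple):
    """Hypothetical container for one parsed extraction (not an NLPLite type)."""
    term: str
    code: str
    status: str  # "Y", "N", or "U"
    start: int
    end: int


def parse_extraction(item):
    """Split the trailing status suffix off the term and the code.

    Assumes both strings end in ":Y", ":N", or ":U".
    rpartition splits on the last colon, so codes that contain
    colons themselves are left intact.
    """
    term_tagged, code_tagged, start, end = item
    term, _, status = term_tagged.rpartition(":")
    code, _, _ = code_tagged.rpartition(":")
    return Mention(term, code, status, start, end)


# Sample output copied from the example above
results = [("headache:N", "C0018681:N", 18, 25), ("fever:Y", "C0015967:Y", 40, 44)]

mentions = [parse_extraction(r) for r in results]
affirmed = [m.term for m in mentions if m.status == "Y"]
negated = [m.term for m in mentions if m.status == "N"]

print(affirmed)  # ['fever']
print(negated)   # ['headache']
```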

Structured JSONL-Style Record

rec = extractor.extract_note("1234", text, negation_check=True)
print(rec)

Output (shown here formatted as JSON):

{
  "note_id": "1234",
  "text": "The patient denies headache but reports fever.",
  "extractions": [
    ["headache:N", "C0018681:N", 18, 25],
    ["fever:Y", "C0015967:Y", 40, 44]
  ]
}
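
Records of this shape can be written one per line to build a JSONL file. The sketch below uses only the standard library and a record copied from the output above. It assumes the value returned by `extract_note` is a JSON-serializable dict, which the output suggests but this page does not state. `write_jsonl` is a hypothetical helper, and an in-memory buffer stands in for a real file.

```python
import io
import json

# Record copied from the example output above
rec = {
    "note_id": "1234",
    "text": "The patient denies headache but reports fever.",
    "extractions": [
        ["headache:N", "C0018681:N", 18, 25],
        ["fever:Y", "C0015967:Y", 40, 44],
    ],
}


def write_jsonl(records, fh):
    """Write each record as a single JSON line (hypothetical helper)."""
    for record in records:
        fh.write(json.dumps(record, ensure_ascii=False) + "\n")


# An in-memory buffer stands in for open("notes.jsonl", "w", encoding="utf-8")
buffer = io.StringIO()
write_jsonl([rec], buffer)

# Read the lines back to confirm the round trip is lossless
lines = buffer.getvalue().splitlines()
restored = json.loads(lines[0])
print(restored["note_id"], len(restored["extractions"]))
```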

Dictionary Formats

A) Two-Column File (term, code)

Use this format when you want codes in the output (CUI, SNOMED, ICD, etc.).

Example CSV:

term,code
fever,C0015967
headache,C0018681

Example TSV:

term	code
fever	C0015967
headache	C0018681

Loader:

# CSV
extractor = flash_extractor("terms.csv", sep=",", header=True)

# TSV
extractor = flash_extractor("terms.tsv", sep="\t", header=True)
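
If the term-to-code mapping already lives in Python, a two-column file in either layout can be generated with the standard `csv` module. This is a sketch only: `write_dictionary` is a hypothetical helper, the files go to a temporary directory, and how NLPLite handles quoted or escaped fields is not documented here, so the example sticks to plain terms.

```python
import csv
import tempfile
from pathlib import Path


def write_dictionary(path, mapping, sep=","):
    """Write a term/code mapping as a two-column file with a header row."""
    with open(path, "w", newline="", encoding="utf-8") as fh:
        writer = csv.writer(fh, delimiter=sep, lineterminator="\n")
        writer.writerow(["term", "code"])
        for term, code in sorted(mapping.items()):
            writer.writerow([term, code])


mapping = {"headache": "C0018681", "fever": "C0015967"}

out_dir = Path(tempfile.mkdtemp())
csv_path = out_dir / "terms.csv"
tsv_path = out_dir / "terms.tsv"

write_dictionary(csv_path, mapping, sep=",")
write_dictionary(tsv_path, mapping, sep="\t")

csv_text = csv_path.read_text(encoding="utf-8")
tsv_text = tsv_path.read_text(encoding="utf-8")
print(csv_text)
```

The resulting paths can then be passed to `flash_extractor` with the matching `sep` and `header=True`, as in the loader calls above.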

B) Term List (No Codes)

If you omit sep, the file is treated as a term list (one term per line).

Example:

fever
headache
shortness of breath

Loader:

extractor = flash_extractor("terms.txt")  # sep omitted → term list mode

Notes

  • Status suffix (appended to both term and code): :N = negated, :Y = affirmed, :U = uncertain
  • Uncertainty: detected via NegEx-style phrase matching
  • Matching is case-insensitive by default
  • run() returns a list of tuples: (term, code, start_pos, end_pos)
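
To give a rough idea of what NegEx-style phrase matching means, here is a deliberately simplified sketch. It is not NLPLite's implementation: the trigger lists, the window size, and the `classify` function are all assumptions made for illustration, and real NegEx rule sets also cover triggers that follow the term and pseudo-negation phrases.

```python
import re

# Illustrative trigger lists only; real NegEx rule sets are far larger
NEGATION_TRIGGERS = ("denies", "no", "without")
UNCERTAINTY_TRIGGERS = ("possible", "suspected")
SCOPE_TERMINATORS = ("but", "however")
WINDOW = 5  # number of preceding tokens to inspect (assumed value)


def classify(text, start):
    """Return 'N', 'U', or 'Y' for a term starting at character offset start."""
    preceding = re.findall(r"[a-z']+", text[:start].lower())
    # Walk backwards from the term; a terminator ends the trigger's scope
    for token in reversed(preceding[-WINDOW:]):
        if token in SCOPE_TERMINATORS:
            break
        if token in NEGATION_TRIGGERS:
            return "N"
        if token in UNCERTAINTY_TRIGGERS:
            return "U"
    return "Y"


text = "The patient denies headache but reports fever."
statuses = {term: classify(text, text.index(term)) for term in ("headache", "fever")}
print(statuses)  # {'headache': 'N', 'fever': 'Y'}
```

In this sketch the word "but" closes the scope of "denies", which is why "fever" comes out as affirmed even though a negation trigger appears earlier in the sentence.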
