Fast lightweight NLP library for reliable concept extraction and negation detection
NLPLite
Lightweight NLP library for fast concept extraction and enhanced negation and uncertainty detection.
Features
- Fast Aho-Corasick term extraction (C-accelerated via pyahocorasick)
- NegEx engine for negation & uncertainty detection
- CLI for quick runs
Install
pip install nlplite
Quick Start
Basic Example
from nlplite import flash_extractor
# Term to code dictionary
dictionary = {"headache": "C0018681", "fever": "C0015967"}
extractor = flash_extractor(dictionary)
text = "The patient denies headache but reports fever."
print(extractor.run(text, negation_check=True))
Output:
[('headache:N', 'C0018681:N', 18, 25), ('fever:Y', 'C0015967:Y', 40, 44)]
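Each term and code in the result carries a status suffix. The following is a minimal sketch, assuming the tuple layout shown in the output above, of splitting that suffix into its own field so results can be filtered by status. `Extraction` and `split_status` are hypothetical helpers written for this example; they are not part of nlplite.

```python
from typing import NamedTuple


class Extraction(NamedTuple):
    """Hypothetical container for one parsed result (not an nlplite type)."""
    term: str
    code: str
    status: str  # "N" = negated, "Y" = affirmed, "U" = uncertain
    start: int
    end: int


def split_status(raw):
    """Split the ':N' / ':Y' / ':U' suffix off a (term, code, start, end) tuple."""
    term_tagged, code_tagged, start, end = raw
    # rpartition splits on the LAST colon, so terms containing ':' still work
    term, _, status = term_tagged.rpartition(":")
    code, _, _ = code_tagged.rpartition(":")
    return Extraction(term, code, status, start, end)


# Result copied from the output above
results = [("headache:N", "C0018681:N", 18, 25), ("fever:Y", "C0015967:Y", 40, 44)]

parsed = [split_status(r) for r in results]
affirmed = [p.term for p in parsed if p.status == "Y"]
print(affirmed)
```

Keeping the status in a separate field avoids repeated string slicing when filtering large result sets.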
Structured JSONL-Style Record
rec = extractor.extract_note("1234", text, negation_check=True)
print(rec)
Output:
{
"note_id": "1234",
"text": "The patient denies headache but reports fever.",
"extractions": [
["headache:N", "C0018681:N", 18, 25],
["fever:Y", "C0015967:Y", 40, 44]
]
}
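Records of this shape can be stored one per line as JSONL. The sketch below uses only the standard library and assumes each record is a JSON-serializable dict like the one shown above; `to_jsonl` is a hypothetical helper, not an nlplite function.

```python
import json

# A record shaped like the extract_note output above
records = [
    {
        "note_id": "1234",
        "text": "The patient denies headache but reports fever.",
        "extractions": [
            ["headache:N", "C0018681:N", 18, 25],
            ["fever:Y", "C0015967:Y", 40, 44],
        ],
    },
]


def to_jsonl(records):
    """Serialize each record as one compact JSON object per line."""
    return "".join(json.dumps(rec, ensure_ascii=False) + "\n" for rec in records)


jsonl_text = to_jsonl(records)

# Reading it back: one json.loads call per line
loaded = [json.loads(line) for line in jsonl_text.splitlines()]
```

Note that JSON has no tuple type, so tuples inside a record come back as lists after a round trip.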
Dictionary Formats
A) Two-Column File (term, code)
Use this format when you want codes in the output (CUI, SNOMED, ICD, etc.).
Example CSV:
term,code
fever,C0015967
headache,C0018681
Example TSV:
term	code
fever	C0015967
headache	C0018681
Loader:
# CSV
extractor = flash_extractor("terms.csv", sep=",", header=True)
# TSV
extractor = flash_extractor("terms.tsv", sep="\t", header=True)
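To make the relationship between a two-column file and the term-to-code dictionary from the Quick Start concrete, here is a rough sketch using only the standard library. `read_term_file` is a hypothetical helper; it illustrates the file layout and does not claim to reproduce how nlplite parses files internally.

```python
import csv
import os
import tempfile

rows = [("fever", "C0015967"), ("headache", "C0018681")]

# Write a small two-column CSV with a header row
path = os.path.join(tempfile.mkdtemp(), "terms.csv")
with open(path, "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow(["term", "code"])
    writer.writerows(rows)


def read_term_file(path, sep=",", header=True):
    """Read a (term, code) file into a dict, skipping blank or short rows."""
    with open(path, newline="", encoding="utf-8") as f:
        reader = csv.reader(f, delimiter=sep)
        if header:
            next(reader, None)  # discard the header row
        return {row[0].strip(): row[1].strip() for row in reader if len(row) >= 2}


dictionary = read_term_file(path)
print(dictionary)
```

The same helper handles the TSV layout by passing `sep="\t"`.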
B) Term List (No Codes)
If you omit sep, the file is treated as a term list (one term per line).
Example:
fever
headache
shortness of breath
Loader:
extractor = flash_extractor("terms.txt") # sep omitted → term list mode
Notes
- Term suffix: :N = negated, :Y = affirmed, :U = uncertain
- Uncertainty: handled via NegEx-style phrase matching
- Case-insensitive by default
- Returns tuples: (term, code, start_pos, end_pos)
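For readers unfamiliar with NegEx-style matching, the toy sketch below shows the general idea: look at the words before a term, stop at a scope-breaking word such as "but", and check for trigger words. The trigger lists, the `classify` function, and the five-word window are all assumptions made for illustration. They are not nlplite's actual rules, and real NegEx implementations also handle multi-word and post-term triggers.

```python
import re

# Illustrative trigger lists only (assumed, not taken from nlplite)
NEGATION_TRIGGERS = ("denies", "no", "without")
UNCERTAINTY_TRIGGERS = ("possible", "suspected")
SCOPE_BREAKERS = ("but", "however")


def classify(text, start, window=5):
    """Return 'N', 'U', or 'Y' for the term starting at character offset `start`."""
    preceding = re.findall(r"[a-z']+", text[:start].lower())
    # Drop everything up to and including the most recent scope breaker
    for i in range(len(preceding) - 1, -1, -1):
        if preceding[i] in SCOPE_BREAKERS:
            preceding = preceding[i + 1:]
            break
    scope = preceding[-window:]  # only the nearest few words can affect the term
    if any(word in NEGATION_TRIGGERS for word in scope):
        return "N"
    if any(word in UNCERTAINTY_TRIGGERS for word in scope):
        return "U"
    return "Y"


text = "The patient denies headache but reports fever."
statuses = {term: classify(text, text.lower().index(term)) for term in ("headache", "fever")}
print(statuses)
```

Here "denies" negates "headache", while "but" ends the negation scope before "fever" is reached.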
File details
Details for the file nlplite-0.1.2.tar.gz.
File metadata
- Download URL: nlplite-0.1.2.tar.gz
- Upload date:
- Size: 15.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.5
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 38ff9943a767f96246a210b95271168096929079867f17f1d7c5c42ef0d3b6a0 |
| MD5 | e602a250d671bbf365e3a9ed816a16bb |
| BLAKE2b-256 | 695fcb39ebf7599cdc57571f75a1aec695d90c36874d2299cfb1c5e9651bf9a8 |
File details
Details for the file nlplite-0.1.2-py3-none-any.whl.
File metadata
- Download URL: nlplite-0.1.2-py3-none-any.whl
- Upload date:
- Size: 14.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.5
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 83d7e1961d7c2a2ce3dd8e1d4e533ffc512fe4d361db800d48590a58b5437958 |
| MD5 | 41251f0728aca994f2f99373256bfd1d |
| BLAKE2b-256 | 62f15f344cffbd23d98ee128bd0dff2349a058c8311f906ea2c386132629ac3c |