Fast lightweight NLP library for concept and segment extraction with negation/uncertainty detection.

Project description

NLPLite

Fast, lightweight NLP for concept extraction with sentence/paragraph segments and negation/uncertainty detection.

Highlights

Fast string matching: Aho–Corasick, with a pure‑Python fallback when the C extension is not available.
Whole‑word, case‑insensitive matching: overlapping candidates resolve to the longest match.
Negation & uncertainty: each term hit carries an assertion flag, :Y (affirmed), :N (negated), or :U (uncertain).
Segment text: return the sentence or paragraph containing each hit (or ±N chars around the term hit).
Code mapping: map terms to codes (ICD, SNOMED, CUIs, etc).
Simple CLI: one command to search, extract, convert codes, or get assertion status.

Install

pip install nlplite

Quick Start

1) Search, locate and extract terms or phrases within a large text file 🕵️

from nlplite import search_terms

text = "Patient has heart failure. He denies chest pain but reports headache."
hits = search_terms(text, ["heart failure", "headache"], window_size="sentence")

print(hits)
# [
#   ('heart failure', 12, 24, 'Patient has heart failure.'),
#   ('headache', 53, 60, 'He denies chest pain but reports headache.')
# ]

Return shape: (term, start_position, end_position, [context])
window_size may be an int (±N chars), "sentence", "paragraph", or None.

Offsets: Set include_offsets=False to omit the start/end positions from the results.
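To make the window_size options concrete, here is a minimal sketch of how a context window around a hit could be computed. It is not nlplite's implementation: context_window and _span_containing are hypothetical helpers, the sentence splitter is deliberately naive (it would break on abbreviations such as "Dr."), and offsets use Python's half-open slicing, which may differ from the convention nlplite reports.

```python
import re

def _span_containing(text, pos, separator):
    """Return (left, right) bounds of the segment around pos, split on separator."""
    left, right = 0, len(text)
    for m in re.finditer(separator, text):
        if m.end() <= pos:
            left = m.end()        # last separator before the hit
        elif m.start() >= pos:
            right = m.start()     # first separator after the hit
            break
    return left, right

def context_window(text, start, end, window_size=None):
    """Hypothetical helper: context around text[start:end] (end exclusive)."""
    if window_size is None:
        return None
    if isinstance(window_size, int):
        # ±N characters, clipped to the text boundaries
        return text[max(0, start - window_size):min(len(text), end + window_size)]
    if window_size == "sentence":
        left, right = _span_containing(text, start, r"(?<=[.!?])\s+")
    elif window_size == "paragraph":
        left, right = _span_containing(text, start, r"\n\s*\n")
    else:
        raise ValueError(f"unsupported window_size: {window_size!r}")
    return text[left:right].strip()

text = "Patient has heart failure. He denies chest pain but reports headache."
print(context_window(text, 12, 25, "sentence"))
# Patient has heart failure.
print(context_window(text, 60, 68, "sentence"))
# He denies chest pain but reports headache.
```

An integer window is the cheapest option because it needs no segmentation; the sentence and paragraph modes trade a little work for context that reads naturally.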

2) Translate your text to codes (Clinical use case: Term-CUI, Term-ICD code)

from nlplite import convert_text_to_codes

dictionary = [("diabetes", "E11"), ("hypertension", "I10"), ("stroke", "I63")]
text = "No stroke. Has hypertension and diabetes."

# All occurrences with locations
rows = convert_text_to_codes(text, dictionary, negation_check=True, unique=False)
print(rows)
# [('I63:N', 3, 8), ('I10:Y', 13, 24), ('E11:Y', 29, 37)]

Notes:

When negation_check=True, the code fields carry a flag :Y/:N/:U.
If your dictionary is a two-column file with a header (term,code), pass sep="," (or "tab") and leave header=True (default).
Turn off start/end locations in the results by passing include_offsets=False.
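If you would rather load the two-column file yourself and pass an in-memory list, the following sketch uses only the standard csv module. load_term_code_pairs is a hypothetical helper, not an nlplite function; its sep and header parameters merely mirror the options described above.

```python
import csv
import io

def load_term_code_pairs(source, sep=",", header=True):
    """Hypothetical helper: read (term, code) pairs from an open text file."""
    delimiter = "\t" if sep == "tab" else sep
    reader = csv.reader(source, delimiter=delimiter)
    if header:
        next(reader, None)  # skip the "term,code" header row
    pairs = []
    for row in reader:
        # Ignore blank or incomplete rows instead of failing on them
        if len(row) < 2 or not row[0].strip():
            continue
        pairs.append((row[0].strip(), row[1].strip()))
    return pairs

# An in-memory file stands in for open("terms.csv", newline="")
sample = io.StringIO("term,code\ndiabetes,E11\nhypertension,I10\nstroke,I63\n")
pairs = load_term_code_pairs(sample)
print(pairs)
# [('diabetes', 'E11'), ('hypertension', 'I10'), ('stroke', 'I63')]
```

The resulting list has the same shape as the dictionary in the example above, so it can be passed wherever an in-memory dictionary is accepted.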

3) Extract sentences, paragraphs or string surrounding terms of interest 📚

from nlplite import extract_terms_with_window

# Dictionary can be a path to CSV/TSV or an in‑memory dict/list.
dictionary = [("heart failure", "I50.9"), ("chest pain", "R07.9"), ("headache", "R51")]

text = "Patient has heart failure. He denies chest pain but reports headache."
rows = extract_terms_with_window(
    text=text,
    dictionary=dictionary,      # or "terms.csv"
    window_size="sentence",     # 'sentence' | 'paragraph' | int | None
    include_code=None,          # auto-include codes if present
    include_offsets=True,
    negation_check=True         # adds :Y / :N / :U flags
)

print(rows)
# [
#   ('heart failure:Y', 'I50.9:Y', 12, 24, 'Patient has heart failure.'),
#   ('chest pain:N',    'R07.9:N', 33, 42, 'He denies chest pain but reports headache.'),
#   ('headache:Y',      'R51:Y',   53, 60, 'He denies chest pain but reports headache.')
# ]
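Rows like the ones above usually need a little post-processing, for example keeping only affirmed findings. The sketch below works on the literal tuples from the example output and does not call nlplite; split_flag is a hypothetical helper that assumes the flag is always the text after the last colon.

```python
def split_flag(value):
    """Split 'R07.9:N' into ('R07.9', 'N'); unflagged values return (value, None)."""
    head, sep, flag = value.rpartition(":")
    if sep and flag in {"Y", "N", "U"}:
        return head, flag
    return value, None

# Rows copied from the example output above
rows = [
    ("heart failure:Y", "I50.9:Y", 12, 24, "Patient has heart failure."),
    ("chest pain:N", "R07.9:N", 33, 42, "He denies chest pain but reports headache."),
    ("headache:Y", "R51:Y", 53, 60, "He denies chest pain but reports headache."),
]

affirmed = []
for term, code, start, end, context in rows:
    term_text, status = split_flag(term)
    code_text, _ = split_flag(code)
    if status == "Y":  # keep affirmed mentions, drop negated and uncertain ones
        affirmed.append((term_text, code_text))

print(affirmed)
# [('heart failure', 'I50.9'), ('headache', 'R51')]
```

Splitting on the last colon keeps terms or codes that themselves contain a colon intact.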

CLI Quickstart

After installing, use the nlplite command.

Search (inline text) 🔎

nlplite --search \
  --terms "heart","heart failure" \
  --text "Patient has heart failure. He denies chest pain." \
  --window sentence \
  --format json
#  [["heart failure",12,24,"Patient has heart failure."]]

Extract with dictionary file + negation 🧠

# terms.csv (with header):
# term,code
# heart failure,I50.9
# chest pain,R07.9
# headache,R51

nlplite --extract --dict terms.csv --sep "," \
  --text "note.txt" \
  --window paragraph \
  --negation \
  --format text
# Example line:
# Term: chest pain (negated), Code: R07.9, Location: 123-132, Context: "..."

Convert to unique codes only 🔄

nlplite --convert --dict terms.csv --sep "," \
  --text "note.txt" \
  --unique --format json
# → ["I50.9:Y","R07.9:N","R51:Y"]

Tips:

Use --neg-window N to restrict how far a negation/uncertainty cue can reach.
Use --format json|csv|text to control the output shape.
Use --no-header if your dictionary file has no header row.
--convert does not support --window (by design).
Use --no-offsets to omit start/end positions from the results.
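To see why the reach of a cue matters, consider this toy illustration of a character-based negation window. It is not nlplite's algorithm: the cue lists, the assertion_flag helper, and the rule of stopping at the previous sentence end are all assumptions made for the example.

```python
import re

# Hypothetical cue lists; a real system would use a much larger lexicon
NEGATION_CUES = {"no", "not", "denies", "without"}
UNCERTAINTY_CUES = {"possible", "probable", "suspected", "may"}

def assertion_flag(text, start, neg_window=30):
    """Toy rule: scan up to neg_window characters before the term for cue words."""
    left = text[max(0, start - neg_window):start]
    # Cues from a previous sentence should not reach the term
    left = re.split(r"[.!?]", left)[-1].lower()
    words = set(re.findall(r"[a-z]+", left))
    if words & NEGATION_CUES:
        return "N"
    if words & UNCERTAINTY_CUES:
        return "U"
    return "Y"

text = "Patient has heart failure. He denies chest pain but reports headache."
print(assertion_flag(text, 37, neg_window=30))  # chest pain -> N
print(assertion_flag(text, 60, neg_window=30))  # headache   -> N (cue reaches too far)
print(assertion_flag(text, 60, neg_window=10))  # headache   -> Y
```

With a wide window, "denies" wrongly reaches "headache"; narrowing the window fixes the flag in this toy, and the same trade-off is a reason to tune the window on real notes.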

Notes

Matching is case‑insensitive and respects word boundaries; overlapping hits resolve to the longest match.
Performance: matching uses a C‑accelerated automaton when pyahocorasick is installed; a pure‑Python fallback keeps the package portable.
Segmentation (window_size) can be an integer (±N characters), "sentence", "paragraph", or None.
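The matching rules in the first note can be illustrated with the standard re module. This sketch uses a regular-expression alternation rather than an Aho–Corasick automaton, so it shows the semantics only (whole words, case-insensitive, longest match wins), not the library's implementation or its speed. find_terms is a hypothetical helper, and offsets are half-open as in Python slicing.

```python
import re

def find_terms(text, terms):
    """Hypothetical helper: whole-word, case-insensitive, longest-match search."""
    # Longest terms first, so the alternation prefers "heart failure" over "heart"
    ordered = sorted(set(terms), key=len, reverse=True)
    pattern = re.compile(
        r"(?<!\w)(?:" + "|".join(re.escape(t) for t in ordered) + r")(?!\w)",
        flags=re.IGNORECASE,
    )
    return [(m.group().lower(), m.start(), m.end()) for m in pattern.finditer(text)]

hits = find_terms(
    "Heart failure noted; heartburn denied. Heart is enlarged.",
    ["heart", "heart failure"],
)
print(hits)
# [('heart failure', 0, 13), ('heart', 39, 44)]
```

"heartburn" produces no hit because the match would end inside a word, and the first "Heart" is absorbed into the longer "heart failure".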

Project details

Release history

0.2.0 (this version): Oct 11, 2025
0.1.2: Oct 5, 2025

Download files

Source Distribution

nlplite-0.2.0.tar.gz (18.3 kB)

Uploaded Oct 11, 2025 Source

Built Distribution

nlplite-0.2.0-py3-none-any.whl (17.9 kB)

Uploaded Oct 11, 2025 Python 3

File details

Details for the file nlplite-0.2.0.tar.gz.

File metadata

Download URL: nlplite-0.2.0.tar.gz
Upload date: Oct 11, 2025
Size: 18.3 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.5

File hashes

Hashes for nlplite-0.2.0.tar.gz
Algorithm	Hash digest
SHA256	`9d0d29acc74b09e0cfa9d9509ac3d7ccd7fe56f3ec9bd5555842009d1ad80a27`
MD5	`132d46d6fa4f0da3ba023dd5f2681ac3`
BLAKE2b-256	`adec23debc34897c85bd6003ae261227d71240bfc104027c11f9dc552a4d22c1`

File details

Details for the file nlplite-0.2.0-py3-none-any.whl.

File metadata

Download URL: nlplite-0.2.0-py3-none-any.whl
Upload date: Oct 11, 2025
Size: 17.9 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.5

File hashes

Hashes for nlplite-0.2.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`e504aae1314376912fc81e0cbb4052e3d3c0fc42d7a7e81b33f5c42a2b42d03a`
MD5	`22fc5e0cde97be7ac9eecff8e8e89643`
BLAKE2b-256	`c50c7404a9c9ae6447370dd85df2f5873f9d161c741e5f511f83b5e695f1503d`
