An NLP tool for parsing, analyzing, and visualizing medical records
clinisift
clinisift is a multitool for processing clinical medical records.
The main goal is to provide easy, off-the-shelf access to common NLP processes when working with medical records:
- Sentence Tokenization and Section Identification from unstructured clinical textual data
- Named Entity Recognition of medication-related data and clinical entities from records
- Intuitive visualization of extracted information
Some motivating use-cases, each of which can be accomplished in only a few lines of code:
- Extract clinical problems and procedures mentioned in a record's CLINICAL HISTORY section.
- When exploring a new dataset, visualize records with clinical and medication entities parsed and highlighted on-the-fly.
- Check if both a particular medication and particular surgical procedure are mentioned in a patient's PAST MEDICAL HISTORY.
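As a rough illustration of the last use-case, the sketch below checks whether a medication and a procedure are both mentioned in a given section. It runs on a hand-written list of entity dicts rather than on real clinisift output: the keys `text`, `label`, and `section`, the label names `DRUG` and `PROCEDURE`, and the helper `mentioned_in_section` are all assumptions made for this example. The actual output schema is described in the wiki.

```python
def mentioned_in_section(entities, section, label, term):
    """Return True if an entity with this label in this section contains the term."""
    term = term.lower()
    return any(
        ent["section"] == section
        and ent["label"] == label
        and term in ent["text"].lower()
        for ent in entities
    )


# Hypothetical parsed entities; a real list would come from Doc.parse().
entities = [
    {"text": "Warfarin 5 mg", "label": "DRUG", "section": "PAST MEDICAL HISTORY"},
    {"text": "appendectomy", "label": "PROCEDURE", "section": "PAST MEDICAL HISTORY"},
    {"text": "aspirin", "label": "DRUG", "section": "MEDICATIONS"},
]

both_mentioned = (
    mentioned_in_section(entities, "PAST MEDICAL HISTORY", "DRUG", "warfarin")
    and mentioned_in_section(entities, "PAST MEDICAL HISTORY", "PROCEDURE", "appendectomy")
)
print(both_mentioned)
```

Matching on a lowercase substring keeps the check tolerant of dosage text and capitalization, at the cost of occasional false positives on short terms.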
Quick Features
- Parse - Extract clinical and medical entities through Transformers-based Named Entity Recognition, alongside other components such as medical record section identification. Any NER model that can be loaded as a HuggingFace pipeline is also supported.
- Analyze - Built-in methods to quickly filter through parsed data with as little code overhead as possible.
- Visualize - spaCy-based visualizer that integrates with Transformers NER to visualize medical record parses on-the-fly, programmatically or via command line.
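To make the idea of section identification concrete, here is a minimal sketch that splits a record into sections with a regular expression. It is not clinisift's implementation: the header pattern (a line of capital letters ending in a colon), the helper `split_sections`, and the sample record are assumptions chosen for illustration only.

```python
import re

# Assumed header style: a line of capital letters, spaces, or slashes ending in a colon.
SECTION_HEADER = re.compile(r"^([A-Z][A-Z /]+):\s*$", re.MULTILINE)


def split_sections(text, header_expr=SECTION_HEADER):
    """Map each section header to the text that follows it."""
    sections = {}
    matches = list(header_expr.finditer(text))
    for i, match in enumerate(matches):
        start = match.end()
        # A section runs until the next header, or to the end of the record.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        sections[match.group(1)] = text[start:end].strip()
    return sections


record = """CLINICAL HISTORY:
Chest pain for two days.

PAST MEDICAL HISTORY:
Hypertension. Appendectomy in 2010.
"""

sections = split_sections(record)
print(sections["PAST MEDICAL HISTORY"])
```

Real records vary widely in how headers are written, so a fixed pattern like this one would likely need adjusting per dataset.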
Get Started
Installation
Install via pip:
pip install clinisift
Or, from source:
git clone git@github.com:clinisift/clinisift.git
cd clinisift && pip install -e .
Quickstart
For a comprehensive overview of clinisift's capabilities, see the "Components" page on the wiki.
Components
clinisift is made up of Parser and Doc components. See the "Components" page on the wiki for an explanation of all the parameters.
class Parser(
models=None,
include_ents=[],
exclude_ents=[],
iob_resolve=True,
sent_tokenizer="clinitokenizer",
sent_per_line=False,
extract_section_headers=False,
section_header_expr=None,
device=None,
)
class Doc(
filepath_or_str,
parser,
is_file=True
)
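The `iob_resolve` parameter is not explained on this page. Assuming it refers to merging IOB-tagged tokens (`B-` begins an entity, `I-` continues it, `O` is outside any entity) into whole entity spans, the general technique can be sketched as follows. The function `resolve_iob` and the sample tags are hypothetical and are not taken from clinisift's source.

```python
def resolve_iob(tagged_tokens):
    """Merge (token, tag) pairs tagged B-X / I-X / O into (text, label) spans."""
    spans = []      # finished (text, label) spans
    current = None  # [tokens, label] for the span being built, if any

    for token, tag in tagged_tokens:
        prefix, _, label = tag.partition("-")
        if prefix == "I" and current is not None and current[1] == label:
            current[0].append(token)  # continue the open span
            continue
        if current is not None:       # any other tag closes the open span
            spans.append((" ".join(current[0]), current[1]))
            current = None
        if prefix in ("B", "I"):      # B- (or a stray I-) opens a new span
            current = [[token], label]

    if current is not None:           # flush a span that runs to the end
        spans.append((" ".join(current[0]), current[1]))
    return spans


tagged = [
    ("Patient", "O"), ("takes", "O"),
    ("metoprolol", "B-DRUG"), ("succinate", "I-DRUG"),
    ("after", "O"),
    ("bypass", "B-PROCEDURE"), ("surgery", "I-PROCEDURE"),
]
print(resolve_iob(tagged))
```

Treating a stray `I-` tag as the start of a new span is one common way to tolerate imperfect model output; stricter variants drop such tokens instead.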
Examples
Below are some examples for common use-cases.
Extract all clinical entities and medications from a *.txt file
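from clinisift.cliniparse import Parser
from clinisift.doc import Doc

parser = Parser()  # default config: medication NER and clinical NER
text_file_path = "record.txt"  # placeholder: path to your own .txt record
doc = Doc(text_file_path, parser)
res = doc.parse()
# res is a dict of the form:
# { "sentences": [...],
#   "entities": [...] }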
Visualize entities extracted on-the-fly from a directory of .txt files
To launch a visualizer using the default Parser() config, run from the command line:
python -m clinisift.visualizer /my/data/dir
A Flask server will be launched.
The visualizer module can be integrated with any `Parser` to customize the NER pipelines used, the entities visualized, and so forth. More information is available in the wiki.