An NLP tool for parsing, analyzing, and visualizing medical records

Project description

clinisift

clinisift is a multitool for processing clinical medical records.

The main goal is to provide easy, off-the-shelf access to common NLP processes when working with medical records:

  • Sentence Tokenization and Section Identification from unstructured clinical textual data
  • Named Entity Recognition of medication-related data and clinical entities from records
  • Intuitive visualization of extracted information

A few motivating use-cases, each achievable in only a few lines of code:

  • Extract clinical problems and procedures mentioned in a record's CLINICAL HISTORY section.
  • When exploring a new dataset, visualize records with clinical and medication entities parsed and highlighted on-the-fly.
  • Check if both a particular medication and particular surgical procedure are mentioned in a patient's PAST MEDICAL HISTORY.
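The last use-case above can be sketched in plain Python. This is a minimal illustration, not clinisift's API: the entity records, their keys (`text`, `label`, `section`), the label names, and both helper functions are hypothetical and only assume that each parsed entity can be tied to a label and a section.

```python
# Hypothetical entity records; clinisift's real output schema may differ.
entities = [
    {"text": "metformin", "label": "medication", "section": "PAST MEDICAL HISTORY"},
    {"text": "appendectomy", "label": "procedure", "section": "PAST MEDICAL HISTORY"},
    {"text": "aspirin", "label": "medication", "section": "MEDICATIONS"},
]


def mentioned_in_section(entities, section, label, text):
    """Return True if an entity with this label and text occurs in the section."""
    return any(
        e["section"] == section
        and e["label"] == label
        and e["text"].lower() == text.lower()  # case-insensitive match
        for e in entities
    )


def both_mentioned(entities, section, medication, procedure):
    """Return True only if the medication AND the procedure occur in the section."""
    return mentioned_in_section(
        entities, section, "medication", medication
    ) and mentioned_in_section(entities, section, "procedure", procedure)


print(both_mentioned(entities, "PAST MEDICAL HISTORY", "Metformin", "appendectomy"))
print(both_mentioned(entities, "PAST MEDICAL HISTORY", "aspirin", "appendectomy"))
```

The first call prints `True`; the second prints `False`, because aspirin appears only under MEDICATIONS in this sample data.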

Quick Features

  • Parse - Extract clinical and medical entities through Transformers-based Named Entity Recognition, along with other components such as medical record section identification. Any NER model that can be loaded as a HuggingFace pipeline is also supported.
  • Analyze - Built-in methods to quickly filter through parsed data with as little code overhead as possible.
  • Visualize - spaCy-based visualizer that integrates with Transformers NER to visualize medical record parses on-the-fly, programmatically or via command line.
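Token-level NER models commonly emit IOB tags (`B-` begins an entity, `I-` continues it, `O` is outside any entity), and these usually need to be merged into whole entity spans before they are useful. The sketch below shows the general technique in plain Python. It is not clinisift's implementation, and the function name `merge_iob` and the labels `DRUG`, `STRENGTH`, and `PROCEDURE` are hypothetical.

```python
def merge_iob(tokens, tags):
    """Merge parallel lists of tokens and IOB tags into (text, label) spans."""
    spans = []
    current_tokens, current_label = [], None

    def flush():
        # Close the span being built, if there is one.
        if current_tokens:
            spans.append((" ".join(current_tokens), current_label))

    for token, tag in zip(tokens, tags):
        prefix, _, label = tag.partition("-")
        if prefix == "B" or (prefix == "I" and label != current_label):
            # A B- tag, or an I- tag that does not continue the open span,
            # starts a new entity.
            flush()
            current_tokens, current_label = [token], label
        elif prefix == "I":
            current_tokens.append(token)
        else:
            # "O" ends any open span.
            flush()
            current_tokens, current_label = [], None
    flush()
    return spans


tokens = ["Patient", "takes", "metformin", "500", "mg", "after", "knee", "replacement"]
tags = ["O", "O", "B-DRUG", "B-STRENGTH", "I-STRENGTH", "O", "B-PROCEDURE", "I-PROCEDURE"]
print(merge_iob(tokens, tags))
```

For this input the result is `[("metformin", "DRUG"), ("500 mg", "STRENGTH"), ("knee replacement", "PROCEDURE")]`. Treating a stray `I-` tag as the start of a new span is a design choice that tolerates imperfect model output instead of dropping tokens.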

Get Started

Installation

Install via pip:

pip install clinisift

Or, from source:

git clone git@github.com:clinisift/clinisift.git
cd clinisift && pip install -e .

Quickstart

For a comprehensive overview of clinisift's capabilities, see the "Components" page on the wiki.

Components

clinisift is made up of Parser and Doc components. See the "Components" page on the wiki for an explanation of all the parameters.

class Parser(
    models=None,
    include_ents=[],
    exclude_ents=[],
    iob_resolve=True,
    sent_tokenizer="clinitokenizer",
    sent_per_line=False,
    extract_section_headers=False,
    section_header_expr=None,
    device=None,
) 

class Doc(
    filepath_or_str,
    parser,
    is_file=True
)
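The `extract_section_headers` and `section_header_expr` parameters suggest that section headers can be located with a pattern. The sketch below illustrates the general idea using Python's `re` module. It makes two assumptions the source does not confirm: that headers are all-caps phrases followed by a colon at the start of a line, and that a regular expression of this kind is what `section_header_expr` would accept. The names `SECTION_HEADER` and `split_sections` are hypothetical.

```python
import re

# Assumed convention: a header is an all-caps phrase ending in a colon
# at the start of a line, e.g. "PAST MEDICAL HISTORY:".
SECTION_HEADER = re.compile(r"^([A-Z][A-Z /]+):\s*", re.MULTILINE)


def split_sections(text):
    """Return a dict mapping each section header to the text that follows it."""
    sections = {}
    matches = list(SECTION_HEADER.finditer(text))
    for i, match in enumerate(matches):
        # A section runs until the next header, or to the end of the record.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        sections[match.group(1)] = text[match.end():end].strip()
    return sections


record = (
    "CLINICAL HISTORY: Chest pain for two days.\n"
    "PAST MEDICAL HISTORY: Hypertension. Appendectomy in 2010.\n"
    "MEDICATIONS: Lisinopril 10 mg daily.\n"
)

sections = split_sections(record)
print(sections["PAST MEDICAL HISTORY"])
```

With the sample record, this prints `Hypertension. Appendectomy in 2010.`. Real records vary widely in header style, so the pattern would need adjusting per dataset.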

Examples

Below are some examples for common use-cases.

Extract all clinical entities and medications from a *.txt file

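from clinisift.cliniparse import Parser
from clinisift.doc import Doc

# Placeholder path to a plain-text medical record; substitute your own file
text_file_path = "record.txt"

parser = Parser()  # default config: medication NER and clinical NER
doc = Doc(text_file_path, parser)  # is_file defaults to True

res = doc.parse()
# res has the form:
# { "sentences": [...],
#   "entities": [...] }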

Visualize entities extracted on-the-fly from a directory of .txt files

To launch a visualizer from the command line using the default Parser() config:

python -m clinisift.visualizer /my/data/dir

A Flask server will be launched.

The visualizer module can be integrated with any `Parser`, which allows customizing the NER pipelines used, the entities visualized, and so forth. More information is available in the wiki.

Download files

Source Distribution

clinisift-0.0.3.tar.gz (9.7 kB view details)

Uploaded Source

Built Distribution

clinisift-0.0.3-py3-none-any.whl (9.3 kB view details)

Uploaded Python 3

File details

Details for the file clinisift-0.0.3.tar.gz.

File metadata

  • Download URL: clinisift-0.0.3.tar.gz
  • Upload date:
  • Size: 9.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.0 CPython/3.9.13

File hashes

Hashes for clinisift-0.0.3.tar.gz
Algorithm      Hash digest
SHA256         9dd8411202ce06db635cfa31a5b870e7d8d7edbefbab07e80f1bc57d08032dbf
MD5            37a388bcef7f362243a35d9bbae69cc3
BLAKE2b-256    fb7b30a01883b316bd3953d9698043af1772884d7aab9eaeffb825fd595b33cd

File details

Details for the file clinisift-0.0.3-py3-none-any.whl.

File metadata

  • Download URL: clinisift-0.0.3-py3-none-any.whl
  • Upload date:
  • Size: 9.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.0 CPython/3.9.13

File hashes

Hashes for clinisift-0.0.3-py3-none-any.whl
Algorithm      Hash digest
SHA256         26325cdb2f3c35e949ac33bfaf1827b4148db94a77382ba397409778f439cba0
MD5            6b497a64018b74786364bf77ecd3cb89
BLAKE2b-256    ad9a7e5534bfddcb03e7fd5f6ec7773eb4c44744d1afc14f66f10349be1e892d
