A natural language medical domain parsing library.

Medical natural language parsing and utility library

PyPI Python 3.9 Python 3.10 Build Status

A natural language medical domain parsing library. This library:

  • Provides an interface to the UTS (UMLS Terminology Services) RESTful service with data caching (NIH login needed).
  • Wraps the MedCAT library, parsing medical and clinical text into first-class Python objects that reflect the structure of the natural language, complete with UMLS entity linking (CUIs) and other domain-specific features.
  • Combines non-medical features (such as POS and NER tags) and medical features (such as CUIs) in one API, available as the resulting data structure and/or as a Pandas data frame.
  • Provides cui2vec as a word embedding model, either for fast indexing and access or for direct use as features in a Zensols Deep NLP embedding layer model.
  • Provides access to cTAKES as a dictionary-like Stash abstraction.
  • Includes a command line program to access all of these features without having to write any code.
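
The caching and dictionary-like access mentioned in the list above can be illustrated with a minimal sketch. It is not the library's Stash API: the class `CachedLookup`, the function `fake_fetch`, and the CUI `C0000000` are hypothetical placeholders, and the remote call is simulated so the example runs without an NIH login.

```python
from typing import Callable, Dict, List


class CachedLookup:
    """Hypothetical dictionary-like, read-through cache (not the zensols API)."""

    def __init__(self, fetch: Callable[[str], dict]):
        self._fetch = fetch                  # called only on a cache miss
        self._cache: Dict[str, dict] = {}

    def __getitem__(self, key: str) -> dict:
        if key not in self._cache:
            self._cache[key] = self._fetch(key)
        return self._cache[key]

    def __contains__(self, key: str) -> bool:
        return key in self._cache

    def __len__(self) -> int:
        return len(self._cache)


calls: List[str] = []


def fake_fetch(cui: str) -> dict:
    """Stand-in for a remote terminology request; records each call."""
    calls.append(cui)
    return {"cui": cui, "name": "placeholder concept"}


lookup = CachedLookup(fake_fetch)
first = lookup["C0000000"]     # cache miss: fake_fetch is called
second = lookup["C0000000"]    # cache hit: served from memory
print(len(calls))              # number of simulated remote calls
```

The point of the sketch is only that repeated lookups of the same key reach the slow backend once; the library's own abstraction may differ in naming and persistence.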

Documentation

See the full documentation. The API reference is also available.

Obtaining

The easiest way to install the command line program is via the pip installer:

pip3 install zensols.mednlp

Binaries are also available on PyPI.

If the cui2vec functionality is used, the Zensols Deep NLP library is also needed, which is installed with pip install zensols.deepnlp.
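
To confirm that the required and optional packages are present, a short check such as the following may help. It is a sketch that assumes the importable module names match the package names (`zensols.mednlp` and `zensols.deepnlp`); the helper `is_installed` is illustrative and not part of either library.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if the module can be found.

    Parent packages are imported during the search; the module itself is not.
    """
    try:
        return importlib.util.find_spec(module_name) is not None
    except ImportError:
        # Raised when a parent package (e.g. "zensols") is itself missing.
        return False


for name in ("zensols.mednlp", "zensols.deepnlp"):
    status = "installed" if is_installed(name) else "missing"
    print(f"{name}: {status}")
```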

Attribution

This API utilizes the following frameworks:

  • MedCAT: used to extract information from Electronic Health Records (EHRs) and link it to biomedical ontologies like SNOMED-CT and UMLS.
  • cTAKES: a natural language processing system for extraction of information from electronic medical record clinical free-text.
  • cui2vec: a new set of (like word) embeddings for medical concepts learned using an extremely large collection of multimodal medical data.
  • Zensols Deep NLP library: a deep learning utility library for natural language processing that aids in feature engineering and embedding layers.
  • ctakes-parser: parses cTAKES output into a Pandas data frame.

Citation

If you use this project in your research, please use the following BibTeX entry:

@article{Landes_DiEugenio_Caragea_2021,
  title={DeepZensols: Deep Natural Language Processing Framework},
  url={http://arxiv.org/abs/2109.03383},
  note={arXiv: 2109.03383},
  journal={arXiv:2109.03383 [cs]},
  author={Landes, Paul and Di Eugenio, Barbara and Caragea, Cornelia},
  year={2021},
  month={Sep}
}

Community

Please star the project and let me know how and where you use this API. Contributions in the form of pull requests, feedback, and any other input are welcome.

Changelog

An extensive changelog is available here.

License

MIT License

Copyright (c) 2021 - 2022 Paul Landes

Download files

Source Distributions

No source distribution files available for this release.

Built Distributions

zensols.mednlp-1.3.0-py3.10.egg (57.3 kB view details)

Uploaded Source

zensols.mednlp-1.3.0-py3-none-any.whl (29.7 kB view details)

Uploaded Python 3

File details

Details for the file zensols.mednlp-1.3.0-py3.10.egg.

File metadata

  • Download URL: zensols.mednlp-1.3.0-py3.10.egg
  • Upload date:
  • Size: 57.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.1 CPython/3.10.2

File hashes

Hashes for zensols.mednlp-1.3.0-py3.10.egg
Algorithm    Hash digest
SHA256       4f3ccbdf73c4642371dbd0941c4adf6ef85f15cc39d3dc9c910d263e4427c3a3
MD5          818a5124160a1b02376d3e426d7e68d8
BLAKE2b-256  67edb82312f4f2ed6c532820140f1563f92cdfba05dd4a4435e081b9b08beb0e

File details

Details for the file zensols.mednlp-1.3.0-py3-none-any.whl.

File hashes

Hashes for zensols.mednlp-1.3.0-py3-none-any.whl
Algorithm    Hash digest
SHA256       8c085bc0cd8f9f87be0226f4961c50b08f8674911bed54d11e3568d3bbdd548a
MD5          3e90f88d905c12fc1cbff838beb46f4c
BLAKE2b-256  db33e754d6ceb54ab7a58627e7afed7ff77568952c4b47a872d3ccb2e6aed01a
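
The SHA256 digests listed in this section can be checked against locally downloaded files. The following is a sketch using only the Python standard library; the helper names `sha256_of` and `matches` are illustrative, and the script assumes the files sit in the current directory under their published names.

```python
import hashlib
from pathlib import Path

# SHA256 digests copied from the "File hashes" tables in this section.
EXPECTED_SHA256 = {
    "zensols.mednlp-1.3.0-py3.10.egg":
        "4f3ccbdf73c4642371dbd0941c4adf6ef85f15cc39d3dc9c910d263e4427c3a3",
    "zensols.mednlp-1.3.0-py3-none-any.whl":
        "8c085bc0cd8f9f87be0226f4961c50b08f8674911bed54d11e3568d3bbdd548a",
}


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so large files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path: Path, expected_hex: str) -> bool:
    """Compare the file's digest with the expected one, ignoring letter case."""
    return sha256_of(path) == expected_hex.lower()


for file_name, expected in EXPECTED_SHA256.items():
    path = Path(file_name)
    if path.exists():
        print(file_name, "OK" if matches(path, expected) else "MISMATCH")
    else:
        print(file_name, "not found in the current directory")
```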
