Concept annotation tool for Electronic Health Records

Project description

Medical Concept Annotation Tool

MedCAT can be used to extract information from Electronic Health Records (EHRs) and link it to biomedical ontologies like SNOMED-CT and UMLS. Paper on arXiv.
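As a rough illustration of the extract-and-link workflow, the sketch below loads a model pack and flattens the detected concepts into tuples. It assumes the distribution is importable as `medcat`, that `CAT.load_model_pack` and `CAT.get_entities` behave as in upstream MedCAT v1.x, and that the result is a dictionary with an `entities` mapping whose values carry `pretty_name`, `cui`, `start` and `end`. The helper names `summarize_entities` and `annotate` and the model pack path are hypothetical.

```python
from typing import Any, Dict, List, Tuple

Row = Tuple[str, str, int, int]


def summarize_entities(result: Dict[str, Any]) -> List[Row]:
    """Flatten a MedCAT-style result into (name, cui, start, end) rows.

    Assumes the shape {"entities": {id: {"pretty_name", "cui", "start", "end", ...}}}.
    Rows are sorted by their start offset in the text.
    """
    rows = [
        (ent["pretty_name"], ent["cui"], ent["start"], ent["end"])
        for ent in result.get("entities", {}).values()
    ]
    return sorted(rows, key=lambda row: row[2])


def annotate(model_pack_path: str, text: str) -> List[Row]:
    """Run a model pack over `text` (hypothetical helper, needs MedCAT installed)."""
    # Imported lazily so summarize_entities stays usable without MedCAT.
    from medcat.cat import CAT

    cat = CAT.load_model_pack(model_pack_path)
    return summarize_entities(cat.get_entities(text))


if __name__ == "__main__":
    # Hand-written sample in the assumed result shape; the CUIs are placeholders,
    # not real ontology identifiers.
    sample = {
        "entities": {
            0: {"pretty_name": "Hypertension", "cui": "CUI-B", "start": 25, "end": 37},
            1: {"pretty_name": "Diabetes", "cui": "CUI-A", "start": 12, "end": 20},
        },
        "tokens": [],
    }
    for row in summarize_entities(sample):
        print(row)
```

With a downloaded model pack, `annotate("path/to/model_pack.zip", "Patient has diabetes and hypertension.")` would return the same kind of rows, with CUIs from whichever ontology the pack was built on.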

Official Docs here

Discussion Forum discourse

News

  • New Downloader [15. March 2022]: You can now download the latest SNOMED-CT and UMLS model packs via UMLS user authentication.
  • New Feature and Tutorial [7. December 2021]: Exploring Electronic Health Records with MedCAT and Neo4j
  • New Minor Release [20. October 2021]: Introducing model packs, new faster multiprocessing for large datasets (100M+ documents) and improved MetaCAT.
  • New Release [1. August 2021]: Upgraded MedCAT to use spaCy v3; new scispaCy models have to be downloaded. All old CDBs (compatible with MedCAT v1) will work without any changes.
  • New Feature and Tutorial [8. July 2021]: Integrating 🤗 Transformers with MedCAT for biomedical NER+L
  • General [1. April 2021]: MedCAT has been upgraded to v1. Unfortunately, this introduces breaking changes with older models (MedCAT v0.4), as well as potential problems with any code that used the MedCAT package. MedCAT v0.4 is available on the legacy branch and will be supported with bug fixes until 1. July 2021; after that it will remain available but will no longer be updated.
  • Paper: What’s in a Summary? Laying the Groundwork for Advances in Hospital-Course Summarization
  • (more...)

Demo

A demo application is available at MedCAT. This was trained on MIMIC-III and all of SNOMED-CT.

Tutorials

A guide on how to use MedCAT is available at MedCAT Tutorials. Read more about MedCAT on Towards Data Science.

Available Models

Available models here

Acknowledgements

Entity extraction was trained on MedMentions. In total it has ~35K entities from UMLS.

The vocabulary was compiled from Wiktionary. In total it has ~800K unique words.

Powered By

A big thank you goes to spaCy and Hugging Face - who made life a million times easier.

Citation

@ARTICLE{Kraljevic2021-ln,
  title="Multi-domain clinical natural language processing with {MedCAT}: The Medical Concept Annotation Toolkit",
  author="Kraljevic, Zeljko and Searle, Thomas and Shek, Anthony and Roguski, Lukasz and Noor, Kawsar and Bean, Daniel and Mascio, Aurelie and Zhu, Leilei and Folarin, Amos A and Roberts, Angus and Bendayan, Rebecca and Richardson, Mark P and Stewart, Robert and Shah, Anoop D and Wong, Wai Keong and Ibrahim, Zina and Teo, James T and Dobson, Richard J B",
  journal="Artif. Intell. Med.",
  volume=117,
  pages="102083",
  month=jul,
  year=2021,
  issn="0933-3657",
  doi="10.1016/j.artmed.2021.102083"
}

Download files

Source Distributions

No source distribution files available for this release. See tutorial on generating distribution archives.

Built Distributions

zensols.medcat-1.3.0-py3.10.egg (307.4 kB view details)

Uploaded Source

zensols.medcat-1.3.0-py3-none-any.whl (134.0 kB view details)

Uploaded Python 3

File details

Details for the file zensols.medcat-1.3.0-py3.10.egg.

File metadata

  • Download URL: zensols.medcat-1.3.0-py3.10.egg
  • Upload date:
  • Size: 307.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.1 CPython/3.10.2

File hashes

Hashes for zensols.medcat-1.3.0-py3.10.egg
Algorithm Hash digest
SHA256 b5b08d3991693fca4ca809ae52b45300ec44269a58313d0caea725b99d09e382
MD5 465ac2af5e9f93fbfc69d9366a136884
BLAKE2b-256 23c92c2d16e0a64b6541cdbadcb9ddb89b7d3c181adbbd41d3f8b35c81fd3b17

See more details on using hashes here.

File details

Details for the file zensols.medcat-1.3.0-py3-none-any.whl.

File metadata

File hashes

Hashes for zensols.medcat-1.3.0-py3-none-any.whl
Algorithm Hash digest
SHA256 e68d8187590cc94256ff01b9c36124a2157730e83f736585e47cd7641a6cea36
MD5 5d22e4c52795b3e6415ad4a0f5e7659e
BLAKE2b-256 1e35eec323066deca3ad9a2be3d335ac7a0b77785ae3d77eee312344aae91453

See more details on using hashes here.
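The digests listed above can be checked against a downloaded file before installing it. The following is a minimal sketch using only the Python standard library. The function names `file_digest` and `matches` are illustrative, and treating "BLAKE2b-256" as BLAKE2b with a 32-byte digest is an assumption about how the listed value was produced.

```python
import hashlib
import hmac
import os


def file_digest(path: str, algorithm: str = "sha256", chunk_size: int = 65536) -> str:
    """Return the hex digest of a file, reading it in chunks."""
    if algorithm == "blake2b-256":
        # Assumption: the BLAKE2b-256 value is BLAKE2b with a 32-byte digest.
        hasher = hashlib.blake2b(digest_size=32)
    else:
        hasher = hashlib.new(algorithm)  # e.g. "sha256" or "md5"
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def matches(path: str, expected_hex: str, algorithm: str = "sha256") -> bool:
    """Compare a file's digest with an expected hex string, ignoring case."""
    actual = file_digest(path, algorithm)
    return hmac.compare_digest(actual, expected_hex.strip().lower())


if __name__ == "__main__":
    wheel = "zensols.medcat-1.3.0-py3-none-any.whl"
    # SHA256 value copied from the listing for this wheel.
    expected = "e68d8187590cc94256ff01b9c36124a2157730e83f736585e47cd7641a6cea36"
    if os.path.exists(wheel):
        print("SHA256 match:", matches(wheel, expected))
    else:
        print("Download", wheel, "first, then re-run this check.")
```

SHA256 is the digest to rely on here; MD5 is listed as well but is not collision-resistant, so a match on MD5 alone is weak evidence of integrity.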
