Concept annotation tool for Electronic Health Records

Project description

Medical Concept Annotation Tool

MedCAT can be used to extract information from Electronic Health Records (EHRs) and link it to biomedical ontologies like SNOMED-CT and UMLS. Paper on arXiv.

Official Docs here

Discussion Forum discourse

News

  • New Downloader [15. March 2022]: You can now download the latest SNOMED-CT and UMLS model packs via UMLS user authentication.
  • New Feature and Tutorial [7. December 2021]: Exploring Electronic Health Records with MedCAT and Neo4j
  • New Minor Release [20. October 2021]: Introducing model packs, new faster multiprocessing for large datasets (100M+ documents) and improved MetaCAT.
  • New Release [1. August 2021]: Upgraded MedCAT to use spaCy v3. New scispaCy models have to be downloaded; all old CDBs (compatible with MedCAT v1) will work without any changes.
  • New Feature and Tutorial [8. July 2021]: Integrating 🤗 Transformers with MedCAT for biomedical NER+L
  • General [1. April 2021]: MedCAT is upgraded to v1. Unfortunately, this introduces breaking changes with older models (MedCAT v0.4), as well as potential problems with all code that used the MedCAT package. MedCAT v0.4 is available on the legacy branch and will still be supported until 1. July 2021 (with respect to potential bug fixes); after that it will still be available but no longer updated.
  • Paper: What’s in a Summary? Laying the Groundwork for Advances in Hospital-Course Summarization
  • (more...)

Demo

A demo application is available at MedCAT. The demo model was trained on MIMIC-III and all of SNOMED-CT.

Tutorials

A guide on how to use MedCAT is available at MedCAT Tutorials. Read more about MedCAT on Towards Data Science.

Available Models

Available models here

Acknowledgements

Entity extraction was trained on MedMentions, which in total has ~35K entities from UMLS.

The vocabulary was compiled from Wiktionary and contains ~800K unique words in total.

Powered By

A big thank you goes to spaCy and Hugging Face, who made life a million times easier.

Citation

@ARTICLE{Kraljevic2021-ln,
  title="Multi-domain clinical natural language processing with {MedCAT}: The Medical Concept Annotation Toolkit",
  author="Kraljevic, Zeljko and Searle, Thomas and Shek, Anthony and Roguski, Lukasz and Noor, Kawsar and Bean, Daniel and Mascio, Aurelie and Zhu, Leilei and Folarin, Amos A and Roberts, Angus and Bendayan, Rebecca and Richardson, Mark P and Stewart, Robert and Shah, Anoop D and Wong, Wai Keong and Ibrahim, Zina and Teo, James T and Dobson, Richard J B",
  journal="Artif. Intell. Med.",
  volume=117,
  pages="102083",
  month=jul,
  year=2021,
  issn="0933-3657",
  doi="10.1016/j.artmed.2021.102083"
}

This version

1.3.1

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

medcat-1.3.1.tar.gz (10.7 MB)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

medcat-1.3.1-py3-none-any.whl (133.9 kB)

Uploaded Python 3

File details

Details for the file medcat-1.3.1.tar.gz.

File metadata

  • Download URL: medcat-1.3.1.tar.gz
  • Upload date:
  • Size: 10.7 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.9.15

File hashes

Hashes for medcat-1.3.1.tar.gz
Algorithm     Hash digest
SHA256        5738f97cfc5676ff26e1a75132dc8caecdc0c013b1af61d26bff8a7ccb20a306
MD5           107869365b82556571b46e9151d178c8
BLAKE2b-256   3d8ee758c87d531f0541f42f5b2491a17176eb5f79ff11b835859e6330c1df3d

See more details on using hashes here.

File details

Details for the file medcat-1.3.1-py3-none-any.whl.

File metadata

  • Download URL: medcat-1.3.1-py3-none-any.whl
  • Upload date:
  • Size: 133.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.9.15

File hashes

Hashes for medcat-1.3.1-py3-none-any.whl
Algorithm     Hash digest
SHA256        e9567528ba5f94f52f399c37264bd1392362653008d39f1b1f4f1d91f9f5cb89
MD5           f456b583ec4697f33f3ca2afde6ff733
BLAKE2b-256   b18a73af886d90e475603c10d15594f16ec0fc4b5aaea2c3f1e03a8b56a023a5

See more details on using hashes here.
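
The SHA256 digests listed for the two files can be checked against a local download before installing. The following is a minimal sketch using only Python's standard library. The helper names `sha256_of` and `matches_published_hash` are assumptions made for illustration and are not part of MedCAT or of any PyPI tooling; the file is assumed to have been saved under its original name.

```python
import hashlib
from pathlib import Path

# SHA256 digests copied from the "File hashes" sections of this page.
EXPECTED_SHA256 = {
    "medcat-1.3.1.tar.gz":
        "5738f97cfc5676ff26e1a75132dc8caecdc0c013b1af61d26bff8a7ccb20a306",
    "medcat-1.3.1-py3-none-any.whl":
        "e9567528ba5f94f52f399c37264bd1392362653008d39f1b1f4f1d91f9f5cb89",
}


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read fixed-size chunks so large archives are not loaded into memory at once.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path):
    """Hypothetical helper: True only if the file name is known and its digest matches."""
    expected = EXPECTED_SHA256.get(Path(path).name)
    return expected is not None and sha256_of(path) == expected
```

For example, `matches_published_hash("medcat-1.3.1.tar.gz")` returns `True` only when the file in the current directory has exactly the digest listed above; a renamed or modified file yields `False`.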
