Concept annotation tool for Electronic Health Records

These details have not been verified by PyPI

Project links

Homepage

License
- OSI Approved :: MIT License
Operating System
- OS Independent
Programming Language
- Python :: 3

Project description

Medical Concept Annotation Tool

A simple tool for concept annotation from UMLS or any other source.

This is still experimental

How to use

There are a few ways to run CAT

PIP Installation

pip install --upgrade medcat

Please install the language models before running anything

python -m spacy download en_core_web_sm

pip install https://s3-us-west-2.amazonaws.com/ai2-s2-scispacy/releases/v0.2.0/en_core_sci_md-0.2.0.tar.gz
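To confirm that both language models are importable before running anything, a check along the following lines may help. It is a minimal sketch that assumes the models install as Python packages named en_core_web_sm and en_core_sci_md (the names used in the commands above); the helper name missing_models is made up for this illustration and is not part of medcat.

```python
import importlib.util

# Package names as they appear in the install commands above.
REQUIRED_MODELS = ["en_core_web_sm", "en_core_sci_md"]


def missing_models(names):
    """Return the subset of `names` that cannot be found as importable packages."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_models(REQUIRED_MODELS)
    if missing:
        print("Missing language models:", ", ".join(missing))
    else:
        print("All language models are installed.")
```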

Building a new Concept Database (.csv) or using an existing one

First, download the vocabulary using the Vocabulary Download link in the Models section.

Now, in Python 3:

from medcat.cat import CAT
from medcat.utils.vocab import Vocab
from medcat.prepare_cdb import PrepareCDB
from medcat.cdb import CDB 

vocab = Vocab()

# Load the vocab model you just downloaded
vocab.load_dict('<path to the vocab file>')

# If you have an existing CDB
cdb = CDB()
cdb.load_dict('<path to the cdb file>') 

# If you need a special CDB you can build one from one or more .csv files
preparator = PrepareCDB(vocab=vocab)
# General form: csv_paths = ['<path to your csv_file>', '<another one>']
# e.g.
csv_paths = ['./examples/simple_cdb.csv']
cdb = preparator.prepare_csvs(csv_paths)

# Save the new CDB for later
cdb.save_dict("<path to a file where it will be saved>")

# To annotate documents we do
doc = "My simple document with kidney failure"
cat = CAT(cdb=cdb, vocab=vocab)
cat.train = False
doc_spacy = cat(doc)
# Entities are in
doc_spacy._.ents
# Or to get a json
doc_json = cat.get_json(doc)

# To have a look at the results:
from spacy import displacy
# Note that this will not show all entities, but only the longest ones
displacy.serve(doc_spacy, style='ent')

# To run cat on a large number of documents
data = [("<doc_id>", "<text>"), ("<doc_id>", "<text>")]  # one (doc_id, text) tuple per document
docs = cat.multi_processing(data)
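
The list of (doc_id, text) tuples used above has to be built from your own documents. The following is a minimal sketch, assuming one UTF-8 text file per document in a single directory; the helper load_documents and the choice of the file name as doc_id are assumptions made for this illustration, not part of medcat.

```python
from pathlib import Path


def load_documents(directory, pattern="*.txt"):
    """Return a list of (doc_id, text) tuples, one per matching file.

    The file name without its extension is used as the doc_id.
    """
    data = []
    for path in sorted(Path(directory).glob(pattern)):
        data.append((path.stem, path.read_text(encoding="utf-8")))
    return data
```

The result can then be passed on as in the example above, e.g. data = load_documents("<directory with text files>") followed by cat.multi_processing(data).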

Training and Fine-tuning

To fine-tune or train everything from the ground up (excluding word-vectors), you can use the following:

# Loading a CDB or creating a new one is as above.

# To run the training
with open("<some file with a lot of medical text>", 'r') as f:
    # Set fine_tune=True to fine-tune; the old training will then be preserved
    cat.run_training(f, fine_tune=False)
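
The training call above reads from a single file. If the medical text is spread over several files, one option is to concatenate them first and then open the merged file as shown. This is a minimal sketch using only the standard library; the helper merge_text_files is hypothetical and not part of medcat.

```python
def merge_text_files(paths, out_path):
    """Concatenate text files into out_path, one after another.

    Returns the number of lines written.
    """
    n_lines = 0
    with open(out_path, "w", encoding="utf-8") as out:
        for path in paths:
            with open(path, "r", encoding="utf-8") as src:
                for line in src:
                    # Make sure every line ends with a newline so that
                    # the last line of one file does not run into the next file
                    out.write(line if line.endswith("\n") else line + "\n")
                    n_lines += 1
    return n_lines
```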

If building from source, the requirements are

Python >= 3.5 (tested with 3.7, but most likely works with any Python 3 version)

All the rest can be installed using pip from the requirements.txt file, by running:

pip install -r requirements.txt
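
Before installing the requirements, the interpreter version can be checked against the minimum stated above. A minimal sketch; the function name meets_minimum is made up for this illustration.

```python
import sys

MIN_PYTHON = (3, 5)  # minimum version stated in the requirements above


def meets_minimum(version_info, minimum=MIN_PYTHON):
    """Return True if version_info (a tuple like sys.version_info) is >= minimum."""
    return tuple(version_info[:2]) >= tuple(minimum)


if __name__ == "__main__":
    if not meets_minimum(sys.version_info):
        raise SystemExit("Python %d.%d or newer is required" % MIN_PYTHON)
```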

Results

Dataset	SoftF1	Description
MedMentions	0.83	The whole MedMentions dataset without any modifications or supervised training
MedMentions	0.828	MedMentions only for concepts that require disambiguation, i.e. names that map to more than one CUI
MedMentions	0.93	MedMentions filtered by TUI to only concepts that are a disease
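
The table does not define SoftF1, so the relaxed matching it implies is not reproduced here. As a reference point only, the sketch below computes the standard F1 score (the harmonic mean of precision and recall) from raw counts; the function name is an assumption for this illustration, and it is not the evaluation code behind the numbers above.

```python
def f1_score(true_positives, false_positives, false_negatives):
    """Return (precision, recall, F1) computed from raw counts."""
    predicted = true_positives + false_positives
    actual = true_positives + false_negatives
    # Guard against division by zero when nothing was predicted or annotated
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / actual if actual else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)
```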

Models

A basic trained model is made public for the vocabulary. It is trained for the 35K entities available in MedMentions. It is quite limited so the performance might not be the best.

Vocabulary Download - Built from MedMentions

(Note: This was compiled from MedMentions and does not contain any data from NLM, as that data is not publicly available.)

Acknowledgement

Entity extraction was trained on MedMentions. In total it has ~35K entities from UMLS.

The dictionary was compiled from Wiktionary. In total it has ~800K unique words. For now it is NOT made publicly available.


Release history

2.6.0

Feb 16, 2026

2.5.3

Jan 13, 2026

2.5.2

Jan 13, 2026

2.4.0

Nov 28, 2025

2.3.0

Nov 11, 2025

2.2.0

Oct 20, 2025

2.2.0rc1 pre-release

Oct 17, 2025

2.1.0 yanked

Sep 1, 2025

Reason this release was yanked:

MetaCAT implementation broken

2.0.0 yanked

Aug 18, 2025

Reason this release was yanked:

MetaCAT implementation broken

2.0.0b4 pre-release yanked

Jul 18, 2025

Reason this release was yanked:

MetaCAT implementation broken

2.0.0b3 pre-release yanked

Jul 11, 2025

Reason this release was yanked:

MetaCAT implementation broken

1.16.8

Jan 20, 2026

1.16.7

Oct 1, 2025

1.16.5 yanked

Aug 12, 2025

Reason this release was yanked:

MetaCAT implementation broken

1.16.0 yanked

May 16, 2025

Reason this release was yanked:

MetaCAT implementation broken

1.15.2 yanked

Mar 28, 2025

Reason this release was yanked:

MetaCAT implementation broken

1.15.1 yanked

Mar 24, 2025

Reason this release was yanked:

MetaCAT implementation broken

1.15.0 yanked

Feb 12, 2025

Reason this release was yanked:

MetaCAT implementation broken

1.15.0b0 pre-release yanked

Dec 11, 2024

Reason this release was yanked:

MetaCAT implementation broken

1.14.2 yanked

Mar 24, 2025

Reason this release was yanked:

MetaCAT implementation broken

1.14.1 yanked

Feb 12, 2025

Reason this release was yanked:

MetaCAT implementation broken

1.14.0 yanked

Nov 19, 2024

Reason this release was yanked:

MetaCAT implementation broken

1.13.4 yanked

Mar 24, 2025

Reason this release was yanked:

MetaCAT implementation broken

1.13.3 yanked

Feb 12, 2025

Reason this release was yanked:

MetaCAT implementation broken

1.13.2 yanked

Nov 15, 2024

Reason this release was yanked:

MetaCAT implementation broken

1.13.1 yanked

Oct 14, 2024

Reason this release was yanked:

MetaCAT implementation broken

1.13.0 yanked

Aug 28, 2024

Reason this release was yanked:

MetaCAT implementation broken

1.12.2 yanked

Oct 14, 2024

Reason this release was yanked:

MetaCAT implementation broken

1.12.1 yanked

Aug 12, 2024

Reason this release was yanked:

MetaCAT implementation broken

1.12.0 yanked

Jun 20, 2024

Reason this release was yanked:

MetaCAT implementation broken

1.11.1

Aug 13, 2024

1.11.0

May 3, 2024

1.10.3

Aug 13, 2024

1.10.2

Feb 28, 2024

1.10.1

Feb 13, 2024

1.10.0

Jan 8, 2024

1.9.3

Oct 10, 2023

1.9.2

Oct 9, 2023

1.9.1

Sep 21, 2023

1.9.0

Aug 2, 2023

1.8.2

Oct 13, 2023

1.8.1

Sep 22, 2023

1.8.0

Jul 6, 2023

1.7.4

Oct 13, 2023

1.7.3

Sep 22, 2023

1.7.1

Jul 6, 2023

1.7.0

Feb 20, 2023

1.6.1

Oct 13, 2023

1.6.0

Jan 9, 2023

1.5.3

Oct 13, 2023

1.5.0

Nov 24, 2022

1.4.1

Dec 1, 2022

1.4.0

Oct 5, 2022

1.3.1

Dec 1, 2022

1.3.0

Jul 5, 2022

1.2.9

Dec 1, 2022

1.2.8

Mar 30, 2022

1.2.7

Feb 11, 2022

1.2.6

Dec 7, 2021

1.2.5

Nov 4, 2021

1.2.4

Nov 4, 2021

1.2.3

Oct 25, 2021

1.2.0

Oct 20, 2021

1.1.3

Aug 27, 2021

1.1.2

Aug 27, 2021

1.1.1

Aug 18, 2021

1.1.0

Aug 1, 2021

1.0.40

Jul 27, 2021

1.0.39

Jul 26, 2021

1.0.38

Jul 25, 2021

1.0.37

Jul 25, 2021

1.0.36

Jul 6, 2021

1.0.35

Jun 25, 2021

1.0.34

Jun 25, 2021

1.0.33

Jun 10, 2021

1.0.32

Jun 10, 2021

1.0.31

Jun 10, 2021

1.0.30

May 30, 2021

1.0.29

May 25, 2021

1.0.28

May 10, 2021

1.0.27

May 10, 2021

1.0.26

Apr 30, 2021

1.0.25

Apr 29, 2021

1.0.24

Apr 29, 2021

1.0.23

Apr 28, 2021

1.0.22

Apr 26, 2021

1.0.21

Apr 17, 2021

1.0.20

Apr 17, 2021

1.0.19

Apr 17, 2021

1.0.18

Apr 17, 2021

1.0.17

Apr 17, 2021

1.0.16

Apr 17, 2021

1.0.15

Apr 17, 2021

1.0.14

Apr 17, 2021

1.0.13

Apr 17, 2021

1.0.12

Apr 17, 2021

1.0.11

Apr 16, 2021

1.0.10

Apr 16, 2021

1.0.9

Apr 16, 2021

1.0.8

Apr 9, 2021

1.0.7

Apr 9, 2021

1.0.6

Apr 8, 2021

1.0.5

Apr 8, 2021

1.0.4

Apr 8, 2021

1.0.3

Apr 8, 2021

1.0.2

Apr 7, 2021

1.0.1

Apr 7, 2021

1.0.0.dev47 pre-release

Mar 26, 2021

1.0.0.dev46 pre-release

Mar 24, 2021

1.0.0.dev45 pre-release

Mar 24, 2021

1.0.0.dev44 pre-release

Feb 19, 2021

1.0.0.dev43 pre-release

Feb 19, 2021

1.0.0.dev42 pre-release

Feb 18, 2021

1.0.0.dev41 pre-release

Feb 18, 2021

1.0.0.dev40 pre-release

Feb 18, 2021

1.0.0.dev39 pre-release

Feb 18, 2021

1.0.0.dev38 pre-release

Feb 18, 2021

1.0.0.dev37 pre-release

Feb 18, 2021

1.0.0.dev36 pre-release

Feb 18, 2021

1.0.0.dev35 pre-release

Feb 18, 2021

1.0.0.dev34 pre-release

Feb 16, 2021

1.0.0.dev33 pre-release

Feb 16, 2021

1.0.0.dev32 pre-release

Feb 16, 2021

1.0.0.dev31 pre-release

Feb 2, 2021

1.0.0.dev30 pre-release

Jan 30, 2021

1.0.0.dev29 pre-release

Jan 29, 2021

1.0.0.dev28 pre-release

Jan 29, 2021

1.0.0.dev27 pre-release

Jan 23, 2021

1.0.0.dev26 pre-release

Jan 23, 2021

1.0.0.dev25 pre-release

Jan 15, 2021

1.0.0.dev24 pre-release

Jan 15, 2021

1.0.0.dev23 pre-release

Jan 14, 2021

1.0.0.dev22 pre-release

Jan 14, 2021

1.0.0.dev21 pre-release

Jan 12, 2021

1.0.0.dev20 pre-release

Jan 11, 2021

1.0.0.dev19 pre-release

Jan 11, 2021

1.0.0.dev18 pre-release

Jan 11, 2021

1.0.0.dev17 pre-release

Jan 11, 2021

1.0.0.dev16 pre-release

Jan 9, 2021

1.0.0.dev15 pre-release

Jan 8, 2021

1.0.0.dev14 pre-release

Jan 8, 2021

1.0.0.dev13 pre-release

Jan 5, 2021

1.0.0.dev12 pre-release

Jan 4, 2021

1.0.0.dev11 pre-release

Dec 19, 2020

1.0.0.dev10 pre-release

Dec 15, 2020

1.0.0.dev9 pre-release

Dec 11, 2020

1.0.0.dev8 pre-release

Dec 5, 2020

1.0.0.dev7 pre-release

Dec 5, 2020

1.0.0.dev6 pre-release

Dec 5, 2020

1.0.0.dev5 pre-release

Dec 4, 2020

1.0.0.dev4 pre-release

Dec 4, 2020

1.0.0.dev3 pre-release

Dec 4, 2020

1.0.0.dev2 pre-release

Dec 4, 2020

1.0.0.dev1 pre-release

Dec 4, 2020

1.0.0.dev0 pre-release

Dec 2, 2020

0.4.0.6

Jan 5, 2021

0.4.0.5

Jan 5, 2021

0.4.0.4

Jan 5, 2021

0.4.0.3

Nov 26, 2020

0.4.0.2

Oct 6, 2020

0.4.0.1

Sep 27, 2020

0.4.0.0

Sep 27, 2020

0.3.9.9.9

Sep 27, 2020

0.3.9.9.8

Sep 27, 2020

0.3.9.9.7

Sep 27, 2020

0.3.9.9.6

Sep 27, 2020

0.3.9.9.5

Sep 25, 2020

0.3.9.9.4

Sep 25, 2020

0.3.9.9.3

Sep 25, 2020

0.3.9.9.2

Sep 24, 2020

0.3.9.9.1

Sep 22, 2020

0.3.9.9

Sep 22, 2020

0.3.9.8

Sep 22, 2020

0.3.9.7

Sep 22, 2020

0.3.9.6

Sep 22, 2020

0.3.9.5

Sep 20, 2020

0.3.9.4

Sep 20, 2020

0.3.9.3

Sep 20, 2020

0.3.9.2

Sep 20, 2020

0.3.9.1

Sep 20, 2020

0.3.9.0

Sep 20, 2020

0.3.8.9

Sep 20, 2020

0.3.8.8

Sep 19, 2020

0.3.8.7

Sep 19, 2020

0.3.8.6

Sep 19, 2020

0.3.8.5

Sep 16, 2020

0.3.8.4

Sep 9, 2020

0.3.8.3

Sep 9, 2020

0.3.8.2

Sep 9, 2020

0.3.8.1

Sep 9, 2020

0.3.8.0

Aug 28, 2020

0.3.7.9

Aug 5, 2020

0.3.7.8

Aug 5, 2020

0.3.7.7

Aug 5, 2020

0.3.7.6

Aug 5, 2020

0.3.7.5

Aug 5, 2020

0.3.7.4

Aug 5, 2020

0.3.7.3

Aug 5, 2020

0.3.7.2

Aug 5, 2020

0.3.7.1

Aug 5, 2020

0.3.7.0

Aug 4, 2020

0.3.6.9

Aug 4, 2020

0.3.6.8

Aug 4, 2020

0.3.6.7

Aug 4, 2020

0.3.6.6

Aug 4, 2020

0.3.6.5

Aug 2, 2020

0.3.6.4

Jul 16, 2020

0.3.6.3

Jul 16, 2020

0.3.6.2

Jul 16, 2020

0.3.6.1

Jul 15, 2020

0.3.6.0

Jul 15, 2020

0.3.5.9

Jul 13, 2020

0.3.5.8

Jul 13, 2020

0.3.5.7

Jul 13, 2020

0.3.5.6

Jul 13, 2020

0.3.5.5

Jul 13, 2020

0.3.5.4

Jul 13, 2020

0.3.5.3

Jul 11, 2020

0.3.5.2

Jul 11, 2020

0.3.5.1

Jul 11, 2020

0.3.5.0

Jul 11, 2020

0.3.4.9

Jul 10, 2020

0.3.4.8

Jul 10, 2020

0.3.4.7

Jul 10, 2020

0.3.4.6

Jul 7, 2020

0.3.4.5

Jul 7, 2020

0.3.4.4

Jun 25, 2020

0.3.4.3

Jun 25, 2020

0.3.4.2

May 14, 2020

0.3.4.1

May 14, 2020

0.3.4.0

May 10, 2020

0.3.3.9

May 8, 2020

0.3.3.8

May 1, 2020

0.3.3.7

Apr 30, 2020

0.3.3.6

Apr 30, 2020

0.3.3.5

Apr 29, 2020

0.3.3.4

Apr 29, 2020

0.3.3.3

Apr 28, 2020

0.3.3.2

Apr 28, 2020

0.3.3.1

Apr 22, 2020

0.3.3.0

Apr 21, 2020

0.3.2.9

Apr 21, 2020

0.3.2.8

Apr 21, 2020

0.3.2.7

Apr 18, 2020

0.3.2.6

Apr 14, 2020

0.3.2.5

Apr 4, 2020

0.3.2.4

Apr 1, 2020

0.3.2.3

Apr 1, 2020

0.3.2.2

Apr 1, 2020

0.3.2.1

Mar 30, 2020

0.3.2.0

Mar 29, 2020

0.3.1.9

Mar 29, 2020

0.3.1.8

Mar 29, 2020

0.3.1.7

Mar 29, 2020

0.3.1.6

Mar 29, 2020

0.3.1.5

Mar 29, 2020

0.3.1.4

Mar 28, 2020

0.3.1.1

Mar 28, 2020

0.3.1.0

Mar 28, 2020

0.3.0.9

Mar 28, 2020

0.3.0.8

Mar 28, 2020

0.3.0.7

Mar 28, 2020

0.3.0.6

Mar 28, 2020

0.3.0.5

Mar 28, 2020

0.3.0.4

Mar 28, 2020

0.3.0.3

Mar 28, 2020

0.3.0.2

Mar 28, 2020

0.3.0.1

Mar 28, 2020

0.3.0.0

Mar 28, 2020

0.2.9.9

Mar 23, 2020

0.2.9.8

Mar 22, 2020

0.2.9.7

Mar 21, 2020

0.2.9.6

Mar 19, 2020

0.2.9.5

Mar 10, 2020

0.2.9.4

Mar 7, 2020

0.2.9.3

Feb 20, 2020

0.2.9.2

Feb 14, 2020

0.2.9.1

Jan 30, 2020

0.2.9.0

Jan 27, 2020

0.2.8.9

Jan 27, 2020

0.2.8.8

Jan 27, 2020

0.2.8.7

Jan 24, 2020

0.2.8.6

Jan 24, 2020

0.2.8.5

Jan 24, 2020

0.2.8.4

Jan 22, 2020

0.2.8.3

Jan 22, 2020

0.2.8.2

Jan 22, 2020

0.2.8.1

Jan 22, 2020

0.2.8.0

Jan 7, 2020

0.2.7.9

Dec 18, 2019

0.2.7.8

Nov 27, 2019

0.2.7.7

Nov 26, 2019

0.2.7.6

Nov 25, 2019

0.2.7.4

Nov 25, 2019

0.2.7.3

Nov 25, 2019

0.2.7.2

Nov 25, 2019

0.2.7.1

Nov 24, 2019

0.2.7.0

Nov 24, 2019

0.2.6.9

Nov 24, 2019

0.2.6.8

Nov 24, 2019

0.2.6.7

Nov 22, 2019

0.2.6.5

Nov 22, 2019

0.2.6.4

Nov 21, 2019

0.2.6.3

Nov 21, 2019

0.2.6.2

Nov 18, 2019

0.2.6.1

Nov 15, 2019

0.2.6.0

Nov 14, 2019

0.2.5.9

Oct 11, 2019

0.2.5.8

Oct 3, 2019

0.2.5.7

Oct 1, 2019

0.2.5.6

Sep 30, 2019

0.2.5.5

Sep 30, 2019

0.2.5.4

Sep 28, 2019

0.2.5.3

Sep 27, 2019

0.2.5.2

Sep 27, 2019

0.2.5.1

Sep 27, 2019

0.2.5.0

Sep 27, 2019

0.2.4.9

Sep 27, 2019

0.2.4.8

Sep 24, 2019

0.2.4.7

Sep 24, 2019

0.2.4.6

Sep 24, 2019

0.2.4.5

Sep 24, 2019

0.2.4.4

Sep 22, 2019

0.2.4.3

Sep 22, 2019

0.2.4.2

Sep 22, 2019

0.2.4.1

Sep 17, 2019

0.2.4.0

Sep 17, 2019

0.2.3.7

Sep 5, 2019

0.2.3.6

Sep 5, 2019

0.2.3.5

Sep 5, 2019

0.2.3.4

Sep 5, 2019

0.2.3.3

Aug 29, 2019

0.2.3.2

Aug 29, 2019

0.2.3.1

Aug 29, 2019

0.2.3

Aug 27, 2019

0.2.2

Aug 27, 2019

0.2.1

Aug 25, 2019

0.2.0.7

Jul 31, 2019

0.2.0.6

Jul 14, 2019

0.2.0.5

Jul 13, 2019

0.2.0.4

Jul 4, 2019

0.2.0.3

Jul 4, 2019

0.2.0.2

Jul 4, 2019

0.2.0.1

Jul 4, 2019

This version

0.2.0.0

Jul 3, 2019

0.1.9.9

Jul 1, 2019

0.1.9.7

Jun 7, 2019

0.1.9.6

Jun 5, 2019

0.1.9.5

Jun 4, 2019

0.1.9.4

Jun 4, 2019

0.1.9.3

May 23, 2019

0.1.9.2

May 13, 2019

0.1.9

May 10, 2019

0.1.8

May 8, 2019

0.1.7.1

May 8, 2019

0.1.7

May 8, 2019

0.1.6

May 8, 2019

0.1.5

May 8, 2019

0.1.4

May 8, 2019

Download files

Source Distribution

medcat-0.2.0.0.tar.gz (26.5 kB view details)

Uploaded Jul 3, 2019 Source

Built Distribution

medcat-0.2.0.0-py3-none-any.whl (33.1 kB view details)

Uploaded Jul 3, 2019 Python 3

File details

Details for the file medcat-0.2.0.0.tar.gz.

File metadata

Download URL: medcat-0.2.0.0.tar.gz
Upload date: Jul 3, 2019
Size: 26.5 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/1.13.0 pkginfo/1.5.0.1 requests/2.20.0 setuptools/41.0.1 requests-toolbelt/0.9.1 tqdm/4.28.1 CPython/3.7.0

File hashes

Hashes for medcat-0.2.0.0.tar.gz
Algorithm	Hash digest
SHA256	`33b23ce2b6b11e59f3125fbef9a54d8af3ab73419eca587408b05c9bc2ea42b4`
MD5	`8ac9fe81c72fd84c24170db8fe71ff46`
BLAKE2b-256	`f9dd793c3773605a5d6febfcd75ee54d7bdb613af8c01baf65142380375fa94d`

See more details on using hashes here.

File details

Details for the file medcat-0.2.0.0-py3-none-any.whl.

File metadata

Download URL: medcat-0.2.0.0-py3-none-any.whl
Upload date: Jul 3, 2019
Size: 33.1 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/1.13.0 pkginfo/1.5.0.1 requests/2.20.0 setuptools/41.0.1 requests-toolbelt/0.9.1 tqdm/4.28.1 CPython/3.7.0

File hashes

Hashes for medcat-0.2.0.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`de7b80dac82bccfd2dd7c3ab25a77c2a908167b79cb42f5b1a30d5fad495a414`
MD5	`00837146fc69edd2dfd51da40b6b0c77`
BLAKE2b-256	`4a366773663f6db74c30c8aeb41d0d23f9381ba30e1a57a065df57bdbf759d60`

See more details on using hashes here.
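
A downloaded file can be checked against the SHA256 digests listed above. The following is a minimal sketch using the standard hashlib module; the helper names sha256_of_file and matches_expected are assumptions for this illustration.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected_hex):
    """Return True if the file's SHA256 digest equals expected_hex (case-insensitive)."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, matches_expected("medcat-0.2.0.0.tar.gz", "33b23ce2b6b11e59f3125fbef9a54d8af3ab73419eca587408b05c9bc2ea42b4") should return True for an intact copy of the source distribution listed above.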

medcat 0.2.0.0
