MedCAT-gliner
This package provides a GLiNER-based NER step for the MedCAT core library.
Usage
First, install from PyPI, e.g.:
pip install medcat-gliner
Subsequently, if you have an existing model, you should be able to just change the NER component:
from medcat.cat import CAT
from medcat_gliner import GLiNERConfig
# load the existing model pack
cat = CAT.load_model_pack("path/to/existing/model")
# switch the NER component to the GLiNER implementation
cat.config.components.ner.comp_name = "gliner_ner"
cat.config.components.ner.custom_cnf = GLiNERConfig()
# recreate the pipe so the new NER component takes effect
cat._recreate_pipe()
# use as needed
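Once the pipe has been recreated, the model can be used as usual. The sketch below is one way to inspect the detected spans. It assumes that `cat.get_entities(text)` returns a dictionary with an `"entities"` mapping whose values carry `start`, `end`, `source_value` and `cui` keys; the helper `summarise_entities` and the sample data are hypothetical and are not part of this package.

```python
from typing import Any


def summarise_entities(result: dict[str, Any]) -> list[tuple[int, int, str, str]]:
    """Return (start, end, source_value, cui) tuples sorted by start offset.

    Assumes `result` has the shape described in the lead-in above.
    """
    rows = []
    for ent in result.get("entities", {}).values():
        rows.append((ent["start"], ent["end"], ent["source_value"], ent["cui"]))
    return sorted(rows)


# With a loaded model this would be (not run here):
#   result = cat.get_entities("Patient has diabetes and hypertension.")
# Illustrative stand-in with the assumed shape, NOT real model output:
example_result = {
    "entities": {
        1: {"start": 28, "end": 40, "source_value": "hypertension", "cui": "38341003"},
        0: {"start": 12, "end": 20, "source_value": "diabetes", "cui": "73211009"},
    },
    "tokens": [],
}

for start, end, value, cui in summarise_entities(example_result):
    print(f"{start}-{end}: {value} -> {cui}")
```

If the output structure differs in your MedCAT version, adjust the key names accordingly.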
NER recall comparison (linkable SNOMED entities)
The following results compare the existing NER implementation (vocab-based NER with spell checking) with the GLiNER implementation when each is used as the NER component within MedCAT. Evaluation was performed on the 2023 SNOMED CT Linking Challenge dataset.
Important caveat: this is not a measure of general NER quality. Recall is computed only with respect to the annotated, linkable SNOMED CT entities present in the linking dataset. Mentions outside the annotation scope are treated as false positives by construction, so precision is not meaningful here.
| Implementation | True Positives | False Negatives | Recall | Runtime |
|---|---|---|---|---|
| Vocab-based NER | 10,545 | 3,917 | 0.729 | ~5m 50s |
| GLiNER implementation | 7,971 | 6,491 | 0.551 | ~34m |
As we can see, for this dataset, GLiNER is roughly six times slower (~34m vs ~5m 50s) and achieves lower recall (0.551 vs 0.729) than the standard vocab-based implementation. This is likely because the vocab-based NER step has been configured and tuned to work best within the MedCAT pipeline. With additional tuning, the GLiNER implementation could plausibly perform as well as or better than the vocab-based NER does.
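For reference, the recall figures in the table follow from recall = TP / (TP + FN). The short sketch below recomputes them from the reported counts and estimates the runtime ratio. The `NerResult` class is a hypothetical helper for illustration, and the runtimes are the approximate values from the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NerResult:
    name: str
    true_positives: int
    false_negatives: int
    runtime_seconds: int  # approximate, taken from the table

    @property
    def recall(self) -> float:
        # recall = TP / (TP + FN)
        return self.true_positives / (self.true_positives + self.false_negatives)


vocab = NerResult("Vocab-based NER", 10_545, 3_917, 5 * 60 + 50)
gliner = NerResult("GLiNER implementation", 7_971, 6_491, 34 * 60)

for result in (vocab, gliner):
    print(f"{result.name}: recall={result.recall:.3f}")

# Ratio of the approximate runtimes (GLiNER relative to vocab-based)
slowdown = gliner.runtime_seconds / vocab.runtime_seconds
print(f"GLiNER runtime is about {slowdown:.1f}x the vocab-based runtime")
```

Both rows sum to the same number of annotated entities (TP + FN), which is what makes the two recall values directly comparable.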