coreason-tagger
A High-Throughput Biomedical Semantic Extraction Engine
Coreason-Tagger is a high-throughput, latency-aware NLP engine designed to normalize unstructured clinical and biomedical text into structured Knowledge Graph nodes. It utilizes a Strategy Pattern Architecture to dynamically switch extraction engines at runtime based on the complexity of the request.
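The runtime switching described above can be pictured as a registry that maps a strategy value to an extractor callable. The following is only a minimal sketch of the general Strategy Pattern, not the package's actual implementation: `Strategy`, `Span`, `LEXICON`, `make_extractor`, `REGISTRY`, and `tag` are all hypothetical names, and the "engines" are toy keyword matchers standing in for real models.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Dict, List


class Strategy(Enum):
    # Hypothetical stand-in for the package's ExtractionStrategy enum.
    SPEED = "speed"
    PRECISION = "precision"
    REASONING = "reasoning"


@dataclass(frozen=True)
class Span:
    text: str
    label: str
    engine: str  # which placeholder engine produced the span


# An extractor takes the text and the requested labels and returns spans.
Extractor = Callable[[str, List[str]], List[Span]]

# Toy lexicon used by every placeholder engine (illustrative data only).
LEXICON = {"migraine": "Condition", "nausea": "Symptom"}


def make_extractor(engine: str) -> Extractor:
    """Build a placeholder extractor that looks words up in LEXICON."""

    def extract(text: str, labels: List[str]) -> List[Span]:
        spans = []
        for word in text.lower().replace(".", " ").split():
            label = LEXICON.get(word)
            if label in labels:
                spans.append(Span(word, label, engine))
        return spans

    return extract


# The registry is the heart of the pattern: one entry per strategy.
REGISTRY: Dict[Strategy, Extractor] = {
    Strategy.SPEED: make_extractor("gliner-placeholder"),
    Strategy.PRECISION: make_extractor("nuner-placeholder"),
    Strategy.REASONING: make_extractor("ensemble-placeholder"),
}


def tag(text: str, labels: List[str], strategy: Strategy = Strategy.SPEED) -> List[Span]:
    """Dispatch to whichever extractor is registered for the strategy."""
    return REGISTRY[strategy](text, labels)


if __name__ == "__main__":
    sentence = "Patient complains of severe migraine and nausea."
    for span in tag(sentence, ["Symptom", "Condition"], Strategy.PRECISION):
        print(span)
```

Because callers only see `tag(...)` and a strategy value, a new engine can be added by registering one more entry without touching the calling code.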
Features
- Dynamic Extraction Strategies:
  - Speed (GLiNER): Single-pass inference for high-volume ETL and real-time UI highlighting.
  - Precision (NuNER Zero): Token-classification for fixed schemas where boundary precision is critical.
  - Reasoning (Ensemble): LLM-verified candidates for complex, ambiguous text using GLiNER recall + LLM verification.
- Contextualization: Lightweight assertion detection (Negation, Speculation, History, Family) to prevent false positives using rules or DistilBERT.
- Normalization: Maps ambiguous text spans to canonical IDs (e.g., SNOMED, RxNorm) using Vector Retrieval (Bi-Encoder) and Semantic Re-ranking.
- Resilience: Built-in Circuit Breakers and fallback mechanisms (L1/L2 Caching, Offline Mode) for robust operation.
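To illustrate how rule-based assertion detection can prevent false positives, here is a rough sketch in the spirit of trigger-phrase approaches. It is an assumption-laden toy, not the package's `RegexBasedAssertionDetector`: the `Assertion` enum, the `TRIGGERS` list, and `detect_assertion` are hypothetical, and the trigger phrases are a tiny illustrative sample.

```python
import re
from enum import Enum


class Assertion(Enum):
    # Hypothetical status values mirroring the categories listed above.
    PRESENT = "present"
    NEGATED = "negated"
    SPECULATIVE = "speculative"
    HISTORY = "history"
    FAMILY = "family"


# Illustrative trigger phrases, checked in order (first match wins).
# FAMILY comes before HISTORY so "family history of" resolves to FAMILY.
TRIGGERS = [
    (Assertion.FAMILY, re.compile(r"\b(mother|father|sister|brother|family history)\b", re.I)),
    (Assertion.NEGATED, re.compile(r"\b(no|denies|without|negative for)\b", re.I)),
    (Assertion.SPECULATIVE, re.compile(r"\b(possible|probable|suspected|rule out)\b", re.I)),
    (Assertion.HISTORY, re.compile(r"\b(history of|previous|prior)\b", re.I)),
]


def detect_assertion(text: str, start: int) -> Assertion:
    """Classify the entity beginning at `start` using the words that
    precede it within the same sentence."""
    # Keep only the part of the current sentence that comes before the entity.
    left_context = re.split(r"[.;!?]", text[:start])[-1]
    for assertion, pattern in TRIGGERS:
        if pattern.search(left_context):
            return assertion
    return Assertion.PRESENT


if __name__ == "__main__":
    note = "Patient denies chest pain. Reports severe migraine."
    for entity in ("chest pain", "migraine"):
        print(entity, "->", detect_assertion(note, note.index(entity)).value)
```

With a check like this, "chest pain" in "Patient denies chest pain" is marked as negated rather than reported as a present finding. A real detector would also need to handle triggers that follow the entity and scope-terminating words, which this sketch ignores.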
Installation
pip install -r requirements.txt
Usage
Here is a quick example of how to initialize and use the CoreasonTagger:
import asyncio

from coreason_tagger.tagger import CoreasonTagger
from coreason_tagger.ner import ExtractorFactory
from coreason_tagger.assertion_detector import RegexBasedAssertionDetector
from coreason_tagger.codex_real import RealCoreasonCodex
from coreason_tagger.linker import VectorLinker
from coreason_tagger.schema import ExtractionStrategy


async def main():
    # Initialize components
    ner = ExtractorFactory()
    assertion = RegexBasedAssertionDetector()
    codex_client = RealCoreasonCodex(api_url="http://localhost:8000")
    linker = VectorLinker(codex_client=codex_client)

    # Initialize Tagger
    tagger = CoreasonTagger(ner=ner, assertion=assertion, linker=linker)

    # Tag text
    text = "Patient complains of severe migraine and nausea."
    results = await tagger.tag(
        text,
        labels=["Symptom", "Condition"],
        strategy=ExtractionStrategy.SPEED_GLINER,
    )

    for entity in results:
        print(f"{entity.text}: {entity.label} (Assertion: {entity.assertion})")


if __name__ == "__main__":
    asyncio.run(main())
For more details, please refer to the Product Requirements.