coreason-tagger

A High-Throughput Biomedical Semantic Extraction Engine


Coreason-Tagger is a high-throughput, latency-aware NLP engine that normalizes unstructured clinical and biomedical text into structured Knowledge Graph nodes. It uses the Strategy pattern to switch extraction engines at runtime based on the complexity of each request.
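
To make the runtime switching concrete, here is a minimal sketch of Strategy-pattern dispatch. It is not the library's implementation: the `Strategy` enum, the two toy extractors, and the tiny vocabulary are all hypothetical stand-ins for real models.

```python
from enum import Enum
from typing import Callable, Dict, List, Tuple

Span = Tuple[str, str]  # (matched term, label)

# Toy vocabulary, for illustration only; a real engine would use a model.
VOCAB: Dict[str, str] = {"migraine": "Condition", "nausea": "Symptom"}


class Strategy(Enum):
    """Hypothetical strategy names; the package's own enum may differ."""
    SPEED = "speed"
    PRECISION = "precision"


def substring_extractor(text: str, labels: List[str]) -> List[Span]:
    # Stand-in for a fast engine: loose, case-insensitive substring match.
    lowered = text.lower()
    return [(t, l) for t, l in VOCAB.items() if t in lowered and l in labels]


def token_extractor(text: str, labels: List[str]) -> List[Span]:
    # Stand-in for a boundary-sensitive engine: whole tokens only.
    tokens = {tok.strip(".,;:").lower() for tok in text.split()}
    return [(t, l) for t, l in VOCAB.items() if t in tokens and l in labels]


# The registry is the only place that knows which engine backs a strategy.
EXTRACTORS: Dict[Strategy, Callable[[str, List[str]], List[Span]]] = {
    Strategy.SPEED: substring_extractor,
    Strategy.PRECISION: token_extractor,
}


def extract(text: str, labels: List[str], strategy: Strategy) -> List[Span]:
    """Dispatch to the engine registered for the requested strategy."""
    return EXTRACTORS[strategy](text, labels)
```

Callers pick a strategy per request, so a new engine can be added by registering one more entry without touching call sites.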

Features

  • Dynamic Extraction Strategies:
    • Speed (GLiNER): Single-pass inference for high-volume ETL and real-time UI highlighting.
    • Precision (NuNER Zero): Token-classification for fixed schemas where boundary precision is critical.
    • Reasoning (Ensemble): For complex, ambiguous text. GLiNER proposes candidates for high recall, and an LLM verifies them.
  • Contextualization: Lightweight assertion detection (Negation, Speculation, History, Family), using rules or DistilBERT, to prevent false positives.
  • Normalization: Maps ambiguous text spans to canonical IDs (e.g., SNOMED, RxNorm) using Vector Retrieval (Bi-Encoder) and Semantic Re-ranking.
  • Resilience: Built-in Circuit Breakers and fallback mechanisms (L1/L2 Caching, Offline Mode) for robust operation.
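
The resilience behavior in the last bullet can be pictured with a small sketch: a circuit breaker that stops calling a failing remote service and serves cached answers instead. This is an illustration under assumptions, not the package's code; `CircuitBreaker`, `lookup_with_fallback`, and their parameters are hypothetical names.

```python
import time
from typing import Callable, Dict, Optional


class CircuitBreaker:
    """Opens after `max_failures` consecutive errors, closes after a cool-down."""

    def __init__(self, max_failures: int = 3, reset_after: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_failures = max_failures
        self.reset_after = reset_after
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None

    def is_open(self) -> bool:
        if self._opened_at is None:
            return False
        if self._clock() - self._opened_at >= self.reset_after:
            # Cool-down elapsed. Simplification: close fully instead of
            # modelling a separate half-open state.
            self._opened_at = None
            self._failures = 0
            return False
        return True

    def record_success(self) -> None:
        self._failures = 0

    def record_failure(self) -> None:
        self._failures += 1
        if self._failures >= self.max_failures:
            self._opened_at = self._clock()


def lookup_with_fallback(term: str, remote: Callable[[str], str],
                         cache: Dict[str, str],
                         breaker: CircuitBreaker) -> Optional[str]:
    """Try the remote service unless the circuit is open; else use the cache."""
    if not breaker.is_open():
        try:
            result = remote(term)
        except Exception:
            breaker.record_failure()
        else:
            breaker.record_success()
            cache[term] = result  # refresh the cache on every success
            return result
    # Circuit open or call failed: serve the cached value, or None if absent.
    return cache.get(term)
```

While the circuit is open, the remote service is not called at all, which keeps latency predictable during an outage.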

Installation

pip install -r requirements.txt

Usage

The following example initializes a CoreasonTagger and tags a sentence:

import asyncio
from coreason_tagger.tagger import CoreasonTagger
from coreason_tagger.ner import ExtractorFactory
from coreason_tagger.assertion_detector import RegexBasedAssertionDetector
from coreason_tagger.codex_real import RealCoreasonCodex
from coreason_tagger.linker import VectorLinker
from coreason_tagger.schema import ExtractionStrategy

async def main():
    # Initialize components
    ner = ExtractorFactory()
    assertion = RegexBasedAssertionDetector()
    codex_client = RealCoreasonCodex(api_url="http://localhost:8000")
    linker = VectorLinker(codex_client=codex_client)

    # Initialize Tagger
    tagger = CoreasonTagger(ner=ner, assertion=assertion, linker=linker)

    # Tag text
    text = "Patient complains of severe migraine and nausea."
    results = await tagger.tag(
        text,
        labels=["Symptom", "Condition"],
        strategy=ExtractionStrategy.SPEED_GLINER
    )

    for entity in results:
        print(f"{entity.text}: {entity.label} (Assertion: {entity.assertion})")

if __name__ == "__main__":
    asyncio.run(main())
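
The `assertion` value printed above is what separates "migraine" from "denies migraine". As a rough illustration of how a rule-based detector could assign it, the sketch below checks trigger phrases in the text just before an entity. The `Assertion` enum, the trigger lists, and `detect_assertion` are assumptions for illustration and may not match what `RegexBasedAssertionDetector` actually does.

```python
import re
from enum import Enum
from typing import List, Pattern, Tuple


class Assertion(Enum):
    """Hypothetical labels mirroring the categories listed under Features."""
    PRESENT = "PRESENT"
    NEGATED = "NEGATED"
    SPECULATIVE = "SPECULATIVE"
    HISTORY = "HISTORY"
    FAMILY = "FAMILY"


# Ordered rules: the first pattern that matches wins, so "family history of"
# resolves to FAMILY rather than HISTORY. The trigger lists are illustrative.
RULES: List[Tuple[Assertion, Pattern[str]]] = [
    (Assertion.FAMILY,
     re.compile(r"\b(mother|father|sister|brother|family history)\b", re.I)),
    (Assertion.NEGATED,
     re.compile(r"\b(no|denies|denied|without|negative for)\b", re.I)),
    (Assertion.SPECULATIVE,
     re.compile(r"\b(possible|probable|suspected|rule out|may have)\b", re.I)),
    (Assertion.HISTORY,
     re.compile(r"\b(history of|previous|prior)\b", re.I)),
]


def detect_assertion(text: str, start: int, window: int = 40) -> Assertion:
    """Classify the entity beginning at `start` from the text preceding it."""
    left = text[max(0, start - window):start]
    # Keep only the current sentence so triggers do not leak across sentences.
    left = re.split(r"[.;]", left)[-1]
    for assertion, pattern in RULES:
        if pattern.search(left):
            return assertion
    return Assertion.PRESENT
```

Looking only at left-hand context within the same sentence is a deliberate simplification; triggers that follow the entity ("migraine was ruled out") would need a second pass.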

For more details, please refer to the Product Requirements.
