coreason-tagger

A High-Throughput Biomedical Semantic Extraction Engine

Coreason-Tagger is a high-throughput, latency-aware NLP engine that normalizes unstructured clinical and biomedical text into structured Knowledge Graph nodes. It uses a Strategy Pattern architecture to switch extraction engines at runtime based on the complexity of each request.
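
As a rough illustration of the Strategy Pattern idea described above, the sketch below dispatches a request to one of two interchangeable extractors chosen at call time. Everything in it is hypothetical: the `Strategy` enum, the `fast_extract` and `precise_extract` stand-ins, and the `ENGINES` registry are invented for this example and are not the package's internals.

```python
from enum import Enum
from typing import Callable, Dict, List


class Strategy(Enum):
    # Hypothetical names for this sketch only.
    SPEED = "speed"
    PRECISION = "precision"


def fast_extract(text: str, labels: List[str]) -> List[str]:
    """Stand-in for a single-pass extractor: returns every capitalized token."""
    return [tok.strip(".,") for tok in text.split() if tok[:1].isupper()]


def precise_extract(text: str, labels: List[str]) -> List[str]:
    """Stand-in for a stricter extractor: keeps only tokens that match a label."""
    wanted = {label.lower() for label in labels}
    return [tok.strip(".,") for tok in text.split() if tok.strip(".,").lower() in wanted]


# Registry mapping each strategy to its engine; looked up once per request.
ENGINES: Dict[Strategy, Callable[[str, List[str]], List[str]]] = {
    Strategy.SPEED: fast_extract,
    Strategy.PRECISION: precise_extract,
}


def extract(text: str, labels: List[str], strategy: Strategy) -> List[str]:
    """Select the engine at call time, so callers can switch per request."""
    return ENGINES[strategy](text, labels)
```

Because the caller only passes a strategy value, a new engine can be added by registering one more entry without touching the call sites.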

Features

  • Dynamic Extraction Strategies:
    • Speed (GLiNER): Single-pass inference for high-volume ETL and real-time UI highlighting.
    • Precision (NuNER Zero): Token-classification for fixed schemas where boundary precision is critical.
    • Reasoning (Ensemble): LLM-verified candidates for complex, ambiguous text using GLiNER recall + LLM verification.
  • Contextualization: Lightweight assertion detection (Negation, Speculation, History, Family), driven by rules or DistilBERT, to prevent false positives.
  • Normalization: Maps ambiguous text spans to canonical IDs (e.g., SNOMED, RxNorm) using Vector Retrieval (Bi-Encoder) and Semantic Re-ranking.
  • Resilience: Built-in Circuit Breakers and fallback mechanisms (L1/L2 Caching, Offline Mode) for robust operation.
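
To make the rule-based side of the Contextualization feature more concrete, here is a minimal sketch of trigger-phrase assertion detection. It assumes a simple heuristic: look at a short window of text before the entity span and return the first matching category. The trigger lists, the label strings (`NEGATED`, `FAMILY`, `HISTORY`, `SPECULATIVE`, `PRESENT`), and the `detect_assertion` function are all illustrative assumptions, not the package's actual rules or API.

```python
import re
from typing import List, Tuple

# Illustrative trigger phrases only; a real rule set would be far larger.
# FAMILY is listed before HISTORY so that "family history of" resolves to FAMILY.
RULES: List[Tuple[str, re.Pattern]] = [
    ("NEGATED", re.compile(r"\b(no|denies|without|negative for)\b", re.IGNORECASE)),
    ("FAMILY", re.compile(r"\b(mother|father|sister|brother|family history)\b", re.IGNORECASE)),
    ("HISTORY", re.compile(r"\b(history of|previous|prior)\b", re.IGNORECASE)),
    ("SPECULATIVE", re.compile(r"\b(possible|suspected|rule out|may have)\b", re.IGNORECASE)),
]


def detect_assertion(sentence: str, span_start: int, window: int = 40) -> str:
    """Return the first rule label whose trigger appears shortly before the entity span."""
    context = sentence[max(0, span_start - window):span_start]
    for label, pattern in RULES:
        if pattern.search(context):
            return label
    return "PRESENT"


sentence = "Patient denies chest pain."
print(detect_assertion(sentence, sentence.index("chest pain")))
```

A fixed look-behind window is the simplest possible scoping rule; it will misfire on long sentences where the trigger belongs to a different entity, which is one reason a learned model may be preferred for harder text.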

Installation

pip install -r requirements.txt

Usage

Here is a quick example of how to initialize and use the CoreasonTagger:

import asyncio
from coreason_tagger.tagger import CoreasonTagger
from coreason_tagger.ner import ExtractorFactory
from coreason_tagger.assertion_detector import RegexBasedAssertionDetector
from coreason_tagger.codex_real import RealCoreasonCodex
from coreason_tagger.linker import VectorLinker
from coreason_tagger.schema import ExtractionStrategy

async def main():
    # Initialize components
    ner = ExtractorFactory()
    assertion = RegexBasedAssertionDetector()
    codex_client = RealCoreasonCodex(api_url="http://localhost:8000")
    linker = VectorLinker(codex_client=codex_client)

    # Initialize Tagger
    tagger = CoreasonTagger(ner=ner, assertion=assertion, linker=linker)

    # Tag text
    text = "Patient complains of severe migraine and nausea."
    results = await tagger.tag(
        text,
        labels=["Symptom", "Condition"],
        strategy=ExtractionStrategy.SPEED_GLINER
    )

    for entity in results:
        print(f"{entity.text}: {entity.label} (Assertion: {entity.assertion})")

if __name__ == "__main__":
    asyncio.run(main())
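
Since `tag` is awaited in the example above, several texts could in principle be processed concurrently. The sketch below shows one common way to do that with `asyncio.gather` and a semaphore that caps the number of calls in flight. It uses a hypothetical `StubTagger` so that it runs on its own; whether the real `CoreasonTagger` is safe to call concurrently is an assumption that should be checked against the project's documentation.

```python
import asyncio
from typing import List


class StubTagger:
    """Hypothetical stand-in with an async tag() method; not the real CoreasonTagger."""

    async def tag(self, text: str, labels: List[str]) -> List[str]:
        await asyncio.sleep(0)  # simulate awaiting model or network I/O
        return [word for word in text.lower().strip(".").split() if word in labels]


async def tag_many(
    tagger: StubTagger, texts: List[str], labels: List[str], limit: int = 4
) -> List[List[str]]:
    """Tag texts concurrently, allowing at most `limit` calls in flight."""
    semaphore = asyncio.Semaphore(limit)

    async def tag_one(text: str) -> List[str]:
        async with semaphore:
            return await tagger.tag(text, labels)

    # gather() returns results in the same order as the input texts.
    return await asyncio.gather(*(tag_one(t) for t in texts))


if __name__ == "__main__":
    notes = ["Severe migraine.", "Nausea and migraine."]
    print(asyncio.run(tag_many(StubTagger(), notes, ["migraine", "nausea"])))
```

The `limit` parameter is the knob to tune: a low value protects a downstream model server or API, while a higher one trades memory and load for throughput.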

For more details, please refer to the Product Requirements.



