coreason-codex

The Terminology Server for Bio-Pharma AI

coreason-codex acts as the "Universal Translator" for the platform. It bridges the "Semantic Precision Gap" in Bio-Pharma AI by enforcing the use of Standardized Vocabularies (OMOP CDM).

Executive Summary

Large Language Models are fluent, but they can be imprecise. coreason-codex ensures that when an Agent reads the text "Heart Attack", it records the structured value ConceptID: 312327, which enables precise retrieval, graph grounding, and regulatory reporting.

It provides tools for Agents to look up, validate, and translate medical concepts, using a "Frozen Lake" pattern for GxP compliance.

For detailed requirements, see the Product Requirements Document.

Getting Started

Prerequisites

  • Python 3.12+
  • Poetry

Installation

  1. Clone the repository:
    git clone https://github.com/CoReason-AI/coreason-codex.git
    cd coreason-codex
    
  2. Install dependencies:
    poetry install
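
Before running the steps above, it can help to confirm that the prerequisites are in place. The following is a minimal sketch, not part of coreason-codex: the helper names `meets_minimum` and `check_prerequisites` are hypothetical, and it assumes Poetry is installed as a `poetry` executable on your PATH.

```python
import shutil
import sys

MIN_PYTHON = (3, 12)  # minimum version taken from the Prerequisites list


def meets_minimum(version, minimum=MIN_PYTHON):
    """Return True if `version` (e.g. sys.version_info) is at least `minimum`."""
    return tuple(version[:2]) >= tuple(minimum)


def check_prerequisites():
    """Return a list of human-readable problems; an empty list means all good."""
    problems = []
    if not meets_minimum(sys.version_info):
        found = ".".join(str(part) for part in sys.version_info[:3])
        problems.append(f"Python {MIN_PYTHON[0]}.{MIN_PYTHON[1]}+ required, found {found}")
    # Assumption: Poetry is exposed as an executable named "poetry".
    if shutil.which("poetry") is None:
        problems.append("Poetry not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print(problem)
```

An empty result from `check_prerequisites()` suggests the environment is ready for `poetry install`.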
    

Usage

Here is a quick example of how to use coreason-codex to normalize text to a standard concept.

from pathlib import Path
from coreason_codex.loader import CodexLoader
from coreason_codex.normalizer import CodexNormalizer
from coreason_codex.embedders import SapBertEmbedder

# 1. Initialize Loader with path to your Codex Pack
# Ensure you have a valid Codex Pack at this location
pack_path = Path("./codex_pack_v1")
loader = CodexLoader(pack_path)
duckdb_conn, lancedb_conn = loader.load_codex()

# 2. Initialize Embedder and Normalizer
embedder = SapBertEmbedder()  # Uses cambridgeltl/SapBERT-from-PubMedBERT-fulltext
normalizer = CodexNormalizer(embedder, duckdb_conn, lancedb_conn)

# 3. Normalize Text
matches = normalizer.normalize("Heart Attack")

for match in matches:
    print(f"Concept: {match.match_concept.concept_name} (ID: {match.match_concept.concept_id})")
    print(f"Score: {match.similarity_score}")
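
Callers will often want a single concept rather than the whole list of matches. The sketch below shows one way to pick the top match above a confidence threshold. It does not import coreason-codex: `Concept` and `Match` are hypothetical stand-ins that only mirror the attributes used in the example above (`match_concept`, `concept_id`, `concept_name`, `similarity_score`), `best_match` and its `min_score` default are illustrative, and it assumes a higher `similarity_score` means a closer match.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Concept:
    """Hypothetical stand-in for the concept object returned by the normalizer."""
    concept_id: int
    concept_name: str


@dataclass(frozen=True)
class Match:
    """Hypothetical stand-in exposing the attributes used in the usage example."""
    match_concept: Concept
    similarity_score: float


def best_match(matches: Iterable[Match], min_score: float = 0.8) -> Optional[Match]:
    """Return the highest-scoring match at or above `min_score`, or None."""
    candidates = [m for m in matches if m.similarity_score >= min_score]
    return max(candidates, key=lambda m: m.similarity_score, default=None)


if __name__ == "__main__":
    # Illustrative data only: the names and scores are made up for this sketch.
    sample = [
        Match(Concept(312327, "heart attack concept (illustrative)"), 0.93),
        Match(Concept(1, "unrelated concept (illustrative)"), 0.41),
    ]
    top = best_match(sample)
    if top is not None:
        print(top.match_concept.concept_id, top.similarity_score)
```

Returning `None` when nothing clears the threshold lets an Agent decline to record a concept instead of storing a weak guess. The right threshold depends on the embedder and the vocabulary, so treat 0.8 as a placeholder.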

Documentation

Detailed documentation is available in the docs/ directory:

  • Setup & Deployment: Step-by-step guide to installing, building, and running the service.
  • Architecture: Overview of the system design, including the Frozen Lake pattern and Zero-Copy architecture.
  • Usage Guide: Detailed instructions on using the Loader, Normalizer, Hierarchy, and CrossWalker components.
  • Vignettes: Walkthroughs of key user stories (Semantic Tagging, Lateral Logic, Audit Replay).

Development

  • Run the linter:
    poetry run pre-commit run --all-files
    
  • Run the tests:
    poetry run pytest
    

Download files

Source Distribution

coreason_codex-0.4.1.tar.gz (18.6 kB view details)

Uploaded Source

Built Distribution

coreason_codex-0.4.1-py3-none-any.whl (26.8 kB view details)

Uploaded Python 3

File details

Details for the file coreason_codex-0.4.1.tar.gz.

File metadata

  • Download URL: coreason_codex-0.4.1.tar.gz
  • Upload date:
  • Size: 18.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for coreason_codex-0.4.1.tar.gz
Algorithm    Hash digest
SHA256       e41e26fec5e1cda2c873e86d8506ce833cc16252e4709614a3ca09aaf57f4fd9
MD5          3d1170625833fcce11e31b22931a4e5f
BLAKE2b-256  703f9463c3813b5bc2a01715197f4f9b77a2fc6aee3e05aff9ae5a9db0540680
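
A downloaded file can be checked against the SHA256 digest listed above using only the Python standard library. This is a minimal sketch: the helper names `sha256_of_file` and `matches_published_hash` are hypothetical, and the local file path is whatever location you saved the download to.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        # Read fixed-size chunks until read() returns empty bytes at end of file.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Compare the file's digest with a published hex digest, ignoring case."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, `matches_published_hash("coreason_codex-0.4.1.tar.gz", "e41e26fec5e1cda2c873e86d8506ce833cc16252e4709614a3ca09aaf57f4fd9")` should return `True` for an intact copy of the source distribution, assuming the file sits in the current directory.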

Provenance

The following attestation bundles were made for coreason_codex-0.4.1.tar.gz:

Publisher: publish.yml on CoReason-AI/coreason-codex

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file coreason_codex-0.4.1-py3-none-any.whl.

File metadata

  • Download URL: coreason_codex-0.4.1-py3-none-any.whl
  • Upload date:
  • Size: 26.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for coreason_codex-0.4.1-py3-none-any.whl
Algorithm    Hash digest
SHA256       5760f38d5b584fa4437428b7539662028bf98042cbdebc68a4f2af6e001bfc50
MD5          a48ef357c0ccd91df9d841c578bc1b21
BLAKE2b-256  e7da371622f1e4f0b40f2dfde10752440bd0ba01497642b0d598eb49b9807acc

Provenance

The following attestation bundles were made for coreason_codex-0.4.1-py3-none-any.whl:

Publisher: publish.yml on CoReason-AI/coreason-codex
