coreason-prism

The Scientific Eye / Multi-Modal Encoder

coreason-prism is the specialized processing engine for scientific data types (Chemistry and Vision) within the CoReason AI ecosystem. It acts as the "Scientific Eye" and "Multi-Modal Encoder", transforming fragile string representations and static images into robust, mathematical graphs and vectors.

Core Philosophy:

"A Molecule is a Graph, not a String. A Chart is Data, not an Image."


Features

Derived from the Product Requirements Document:

  • Cheminformatic Grounding (The Chemist):

    • Treats molecules as mathematical graphs, not strings.
    • Normalizes and sanitizes chemical structures (SMILES/InChI) using datamol.
    • Converts SMILES to SELFIES, a representation in which every string decodes to a valid molecule, so generative output is always chemically valid.
    • Computes fingerprints (Morgan/ECFP) for structural similarity search.
    • Calculates key properties: Molecular Weight, LogP, TPSA, Lipinski Violations.
  • Visual De-Plotting (The Analyst):

    • Extracts raw data from scientific figures (e.g., Kaplan-Meier curves) using DePlot.
    • Digitizes charts into linearized tables that can be loaded as DataFrames.
    • Enables meta-analysis of data locked in PDF images.
  • Bio-Image Segmentation (The Biologist):

    • Segments and classifies medical images (e.g., Histology) using MedSAM.
    • Detects ROIs and computes metrics like cell counts and tumor area.
  • Multi-Modal Embedding (The Embedder):

    • Generates joint embeddings for text, molecules, and images using BioCLIP.
    • Enables multi-modal retrieval (e.g., searching for histology slides via text description).
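The similarity-search, Lipinski, and retrieval bullets above all come down to small, well-known calculations. The sketch below illustrates them in plain Python and is not coreason-prism code: `tanimoto`, `lipinski_violations`, and `cosine_similarity` are hypothetical helper names, fingerprints are assumed to be given as collections of on-bit indices, and embeddings as plain lists of floats.

```python
import math
from typing import Iterable, Sequence


def tanimoto(bits_a: Iterable[int], bits_b: Iterable[int]) -> float:
    """Tanimoto (Jaccard) similarity between two sets of on-bit indices."""
    a, b = set(bits_a), set(bits_b)
    union = a | b
    if not union:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / len(union)


def lipinski_violations(mol_weight: float, logp: float,
                        h_donors: int, h_acceptors: int) -> int:
    """Count violated Rule-of-Five criteria (each True counts as 1)."""
    return sum([
        mol_weight > 500,   # molecular weight above 500 Da
        logp > 5,           # LogP above 5
        h_donors > 5,       # more than 5 hydrogen-bond donors
        h_acceptors > 10,   # more than 10 hydrogen-bond acceptors
    ])


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two equal-length embedding vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


# Illustrative values only, not output from the library.
print(tanimoto({1, 5, 9, 42}, {1, 5, 9, 77}))
print(lipinski_violations(mol_weight=180.16, logp=1.2, h_donors=1, h_acceptors=4))
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))
```

Tanimoto is the usual choice for binary fingerprints such as Morgan/ECFP, while cosine similarity is the common default for dense embeddings; which metrics the library uses internally is not stated here.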

Installation

pip install coreason-prism

Or install from source:

git clone https://github.com/CoReason-AI/coreason_prism.git
cd coreason_prism
pip install .
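To confirm that either installation route worked, the installed distribution can be queried from Python. This is a minimal sketch using only the standard library (`importlib.metadata`, Python 3.8+); `installed_version` is a hypothetical helper name, and the distribution name is assumed to match the `pip install` command above.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Prints the version string if the package is installed, otherwise None.
print(installed_version("coreason-prism"))
```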

Usage

Here is a concise snippet showing how to initialize and use the library:

from pathlib import Path
from coreason_prism.interface import Prism, PrismMode

# Initialize Prism (The Facade)
prism = Prism(light_mode=False)  # Set light_mode=True to skip heavy models (DePlot/BioCLIP)

# 1. Process a Molecule (SMILES -> Graph/SELFIES + Properties)
molecule_result = prism.process_molecule("CC(=O)Oc1ccccc1C(=O)O")  # Aspirin
if molecule_result.status == "VALID":
    print(f"Canonical SMILES: {molecule_result.canonical_smiles}")
    print(f"SELFIES: {molecule_result.selfies_string}")
    print(f"LogP: {molecule_result.logp}")
    print(f"Fingerprint (first 10 bits): {molecule_result.fingerprint_vector[:10]}")

# 2. Process a Chart Image (Extract Data)
# Ensure you have an image file at the specified path
chart_path = Path("tests/data/kaplan_meier.png")
if chart_path.exists():
    chart_result = prism.process_image(
        image_path=chart_path,
        source_document_id="doc_123",
        mode=PrismMode.CHART
    )
    print(f"Figure Type: {chart_result.figure_type}")
    print(f"Extracted Data: {chart_result.data_series}")
    if chart_result.metadata:
        print(f"Median Survival: {chart_result.metadata.get('median_survival')}")

# 3. Process a Bio-Image (Segmentation)
bio_path = Path("tests/data/histology_slide.jpg")
if bio_path.exists():
    bio_result = prism.process_image(
        image_path=bio_path,
        source_document_id="doc_456",
        mode=PrismMode.BIO
    )
    # Guard against missing metadata, as in the chart example above
    if bio_result.metadata:
        print(f"Cell Count: {bio_result.metadata.get('cell_count')}")
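The chart example reads a `median_survival` value from the result metadata. As a rough illustration of what that number means, the sketch below derives it from digitized Kaplan-Meier points. It assumes the extracted series can be reduced to `(time, survival_probability)` pairs, which the snippet above does not confirm, and `median_survival` here is a hypothetical standalone helper rather than the library's implementation.

```python
from typing import Optional, Sequence, Tuple


def median_survival(points: Sequence[Tuple[float, float]]) -> Optional[float]:
    """Return the earliest time at which survival drops to 0.5 or below.

    Returns None when the curve never reaches 0.5 (median not reached).
    """
    for time, survival in sorted(points):  # sort by time first
        if survival <= 0.5:
            return time
    return None


# Made-up points for illustration, not data extracted from a real figure.
curve = [(0, 1.0), (6, 0.8), (12, 0.55), (18, 0.45), (24, 0.3)]
print(median_survival(curve))
```

A curve that never falls to 0.5 has no defined median, which is why the helper returns `None` rather than the last observed time.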
