coreason-prism

The Scientific Eye / Multi-Modal Encoder

coreason-prism is the specialized processing engine for scientific data types (Chemistry and Vision) within the CoReason AI ecosystem. It acts as the "Scientific Eye" and "Multi-Modal Encoder", transforming fragile string representations and static images into robust, mathematical graphs and vectors.

Core Philosophy:

"A Molecule is a Graph, not a String. A Chart is Data, not an Image."


Features

Derived from the Product Requirements Document:

  • Cheminformatic Grounding (The Chemist):

    • Treats molecules as mathematical graphs, not strings.
    • Normalizes and sanitizes chemical structures (SMILES/InChI) using datamol.
    • Converts SMILES to SELFIES, where every string decodes to a valid molecule, so generative output is always syntactically valid.
    • Computes fingerprints (Morgan/ECFP) for structural similarity search.
    • Calculates key properties: Molecular Weight, LogP, TPSA, Lipinski Violations.
  • Visual De-Plotting (The Analyst):

    • Extracts raw data from scientific figures (e.g., Kaplan-Meier curves) using DePlot.
    • Digitizes charts into linearized tables that can be loaded as DataFrames.
    • Enables meta-analysis of data locked in PDF images.
  • Bio-Image Segmentation (The Biologist):

    • Segments and classifies medical images (e.g., Histology) using MedSAM.
    • Detects ROIs and computes metrics like cell counts and tumor area.
  • Multi-Modal Embedding (The Embedder):

    • Generates joint embeddings for text, molecules, and images using BioCLIP.
    • Enables multi-modal retrieval (e.g., searching for histology slides via text description).
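To make the "structural similarity search" feature concrete, here is a minimal sketch of ranking molecules by Tanimoto similarity over binary fingerprints. It is not taken from coreason-prism: the helper names `tanimoto` and `rank_by_similarity` and the toy 8-bit vectors are assumptions for illustration, and the library may compare fingerprints differently.

```python
from typing import Dict, List, Sequence, Tuple


def tanimoto(fp_a: Sequence[int], fp_b: Sequence[int]) -> float:
    """Tanimoto (Jaccard) similarity between two equal-length 0/1 bit vectors."""
    if len(fp_a) != len(fp_b):
        raise ValueError("fingerprints must have the same length")
    both = sum(1 for a, b in zip(fp_a, fp_b) if a and b)    # bits set in both
    either = sum(1 for a, b in zip(fp_a, fp_b) if a or b)   # bits set in at least one
    # Two all-zero fingerprints share no features; this sketch reports 0.0 for them.
    return both / either if either else 0.0


def rank_by_similarity(
    query: Sequence[int], library: Dict[str, Sequence[int]]
) -> List[Tuple[str, float]]:
    """Return (name, score) pairs sorted from most to least similar to the query."""
    scored = [(name, tanimoto(query, fp)) for name, fp in library.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy 8-bit fingerprints, made up for illustration.
# Real Morgan/ECFP vectors are much longer.
query_fp = [1, 1, 0, 0, 1, 0, 1, 0]
library = {
    "mol_a": [1, 1, 0, 0, 1, 0, 0, 0],
    "mol_b": [0, 0, 1, 1, 0, 1, 0, 1],
    "mol_c": [1, 0, 0, 0, 1, 0, 1, 1],
}
ranking = rank_by_similarity(query_fp, library)

if __name__ == "__main__":
    for name, score in ranking:
        print(f"{name}: {score:.2f}")
```

Tanimoto similarity is the usual choice for bit-vector fingerprints because it ignores positions that are unset in both molecules, which dominate sparse fingerprints.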

Installation

pip install coreason-prism

Or install from source:

git clone https://github.com/CoReason-AI/coreason_prism.git
cd coreason_prism
pip install .

Usage

Here is a concise snippet showing how to initialize and use the library:

from pathlib import Path
from coreason_prism.interface import Prism, PrismMode

# Initialize Prism (The Facade)
prism = Prism(light_mode=False)  # Set light_mode=True to skip heavy models (DePlot/BioCLIP)

# 1. Process a Molecule (SMILES -> Graph/SELFIES + Properties)
molecule_result = prism.process_molecule("CC(=O)Oc1ccccc1C(=O)O")  # Aspirin
if molecule_result.status == "VALID":
    print(f"Canonical SMILES: {molecule_result.canonical_smiles}")
    print(f"SELFIES: {molecule_result.selfies_string}")
    print(f"LogP: {molecule_result.logp}")
    print(f"Fingerprint (first 10 bits): {molecule_result.fingerprint_vector[:10]}")

# 2. Process a Chart Image (Extract Data)
# Ensure you have an image file at the specified path
chart_path = Path("tests/data/kaplan_meier.png")
if chart_path.exists():
    chart_result = prism.process_image(
        image_path=chart_path,
        source_document_id="doc_123",
        mode=PrismMode.CHART
    )
    print(f"Figure Type: {chart_result.figure_type}")
    print(f"Extracted Data: {chart_result.data_series}")
    if chart_result.metadata:
        print(f"Median Survival: {chart_result.metadata.get('median_survival')}")

# 3. Process a Bio-Image (Segmentation)
bio_path = Path("tests/data/histology_slide.jpg")
if bio_path.exists():
    bio_result = prism.process_image(
        image_path=bio_path,
        source_document_id="doc_456",
        mode=PrismMode.BIO
    )
    if bio_result.metadata:  # Guard against missing metadata, as in the chart example
        print(f"Cell Count: {bio_result.metadata.get('cell_count')}")
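The chart example reads a `median_survival` value from the result metadata. As a rough illustration of what that quantity means, the sketch below derives it from digitized Kaplan-Meier points. The layout of `data_series` is not documented here, so the `(time, survival_probability)` pair format and the helper name `median_survival` are assumptions, not the library's API.

```python
from typing import Optional, Sequence, Tuple


def median_survival(points: Sequence[Tuple[float, float]]) -> Optional[float]:
    """Earliest time at which the survival probability is 0.5 or lower.

    Returns None when the curve never reaches 0.5, in which case the
    median is undefined for the observed follow-up period.
    """
    for time, survival in sorted(points, key=lambda point: point[0]):
        if survival <= 0.5:
            return time
    return None


# Hypothetical digitized curve: (months, fraction surviving).
curve = [(0, 1.0), (6, 0.9), (12, 0.72), (18, 0.5), (24, 0.41)]
short_follow_up = [(0, 1.0), (6, 0.9), (12, 0.72)]

if __name__ == "__main__":
    print(median_survival(curve))
    print(median_survival(short_follow_up))
```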

Download files

Source Distribution

coreason_prism-0.2.0.tar.gz (19.7 kB)

Uploaded Source

Built Distribution

coreason_prism-0.2.0-py3-none-any.whl (27.5 kB)

Uploaded Python 3

File details

Details for the file coreason_prism-0.2.0.tar.gz.

File metadata

  • Download URL: coreason_prism-0.2.0.tar.gz
  • Upload date:
  • Size: 19.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for coreason_prism-0.2.0.tar.gz
Algorithm Hash digest
SHA256 dd528abe23e086794dc4f4b322c9d220f62b4a1fe2bddd79a314ba989a90e437
MD5 ce831650a79b9e127ddd286d367fb36f
BLAKE2b-256 9885a13ce433e28bfe85c6ec08db48af1aaf8dd79740fe89f1c709d58cab30d9
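A downloaded archive can be checked against the SHA256 digest listed above. The following is a minimal sketch using only the Python standard library. The helper name `sha256_of` and the local file path are assumptions for illustration.

```python
import hashlib
from pathlib import Path

# SHA256 digest listed above for coreason_prism-0.2.0.tar.gz.
EXPECTED_SHA256 = "dd528abe23e086794dc4f4b322c9d220f62b4a1fe2bddd79a314ba989a90e437"


def sha256_of(path, chunk_size: int = 65536) -> str:
    """Hex SHA256 digest of a file, read in chunks to keep memory use flat."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    # Assumed location of the downloaded source distribution.
    archive = Path("coreason_prism-0.2.0.tar.gz")
    if archive.exists():
        ok = sha256_of(archive) == EXPECTED_SHA256
        print("digest matches" if ok else "digest MISMATCH")
```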

Provenance

The following attestation bundles were made for coreason_prism-0.2.0.tar.gz:

Publisher: publish.yml on CoReason-AI/coreason-prism

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file coreason_prism-0.2.0-py3-none-any.whl.

File metadata

  • Download URL: coreason_prism-0.2.0-py3-none-any.whl
  • Upload date:
  • Size: 27.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for coreason_prism-0.2.0-py3-none-any.whl
Algorithm Hash digest
SHA256 51ad66dd5acf64b41d85868e5c6ae0edc7f7649a258ab30c16db3696fff879f1
MD5 47ae653d3f65fe47a24c95fe3d1bfb02
BLAKE2b-256 8512687e5255daf700fdffeaafaba9d94edf2c2d8f368d3d74df13eab3ce2e21

Provenance

The following attestation bundles were made for coreason_prism-0.2.0-py3-none-any.whl:

Publisher: publish.yml on CoReason-AI/coreason-prism

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
