COMO: Closed-loop Optical Molecule recOgnition with Minimum Risk Training — Optical Chemical Structure Recognition

COMO

COMO (Closed-loop Optical Molecule recOgnition) is a deep learning framework that recognizes chemical structure diagrams from images and predicts SMILES strings with atom-level coordinates and bond matrices. It uses Minimum Risk Training (MRT) to directly optimize molecular-level, non-differentiable objectives.

Installation

pip install como-ocsr

Quick Start

import como

# Load a model checkpoint (on GPU 0)
model = como.load_model("path/to/checkpoint.pth", device="cuda:0")

# Predict SMILES from a single image
smiles = como.predict(model, "molecule.png")
print(smiles)  # e.g. CC(=O)O

# Batch prediction on a specific GPU
smiles_list = como.predict_batch(model, ["mol1.png", "mol2.png"], device="cuda:1")

# Evaluate on a benchmark (single GPU by default)
metrics = como.evaluate(
    model,
    benchmark_dir="benchmark/USPTO/",
    csv_path="benchmark/USPTO.csv",
)
print(f"Exact Match: {metrics['postprocess/exact_match_acc']:.2%}")

# Multi-GPU, multi-benchmark evaluation
benchmarks = [
    {"name": "USPTO", "benchmark_dir": "benchmark/USPTO/",
     "csv_path": "benchmark/USPTO.csv"},
    {"name": "CLEF",  "benchmark_dir": "benchmark/CLEF/",
     "csv_path": "benchmark/CLEF_corrected.csv"},
]
results = como.evaluate_benchmarks(model, benchmarks, gpus="0,1,2,3")
for name, m in results.items():
    print(f"{name}: {m['postprocess/exact_match_acc']:.2%}")

API Reference

GPU Selection

load_model, predict, and predict_batch accept a device parameter for single-GPU usage:

model = como.load_model("checkpoint.pth", device="cuda:0")
como.predict(model, "img.png", device="cuda:1")
como.predict_batch(model, [...], device="cuda:2")

The evaluation functions, which run multi-GPU internally via mp.spawn, take a gpus parameter instead:

Function	GPU control
`load_model`	`device="cuda:0"`
`predict`	`device="cuda:0"`
`predict_batch`	`device="cuda:0"`
`evaluate`	`gpus="0"` (default), `gpus="0,1,2"`, `gpus=None` (all)
`evaluate_benchmarks`	`gpus="0"` (default), `gpus="0,1,2"`, `gpus=None` (all)
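
The gpus convention in the table can be illustrated with a small parser. This is only a sketch: parse_gpus and its available argument are hypothetical names that are not part of the COMO API, and how COMO itself interprets the string is an assumption here.

```python
from typing import List, Optional


def parse_gpus(gpus: Optional[str], available: int) -> List[int]:
    """Turn a spec such as "0,1,2" into a list of GPU indices.

    Hypothetical helper: it mirrors the documented convention that a
    comma-separated string selects GPUs and None selects all of them.
    `available` is the number of visible GPUs (an assumption of this sketch).
    """
    if gpus is None:
        return list(range(available))
    ids = [int(part) for part in gpus.split(",") if part.strip()]
    for gpu_id in ids:
        if not 0 <= gpu_id < available:
            raise ValueError(f"GPU {gpu_id} not available (found {available})")
    return ids


print(parse_gpus("0", available=4))      # [0]
print(parse_gpus("0,1,2", available=4))  # [0, 1, 2]
print(parse_gpus(None, available=4))     # [0, 1, 2, 3]
```

Validating the IDs up front gives a readable error before any worker process is spawned.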

`como.load_model(checkpoint_path, device="cuda", pretrained=True, **kwargs)`

Load a COMO model from a .pth checkpoint. Returns a ComoModel instance in evaluation mode.

Parameter	Type	Default	Description
`checkpoint_path`	`str`	required	Path to `.pth` checkpoint
`device`	`str`	`"cuda"`	`"cuda"`, `"cuda:0"`, or `"cpu"`
`pretrained`	`bool`	`True`	Use ImageNet-pretrained backbone weights

Returns: ComoModel

`como.predict(model, image, *, beam_size=1, max_len=500, smiles_mode="postprocess", device=None)`

Predict the SMILES string for a single molecular image.

Parameter	Type	Default	Description
`model`	`ComoModel`	required	A loaded model
`image`	`str` / `np.ndarray` / `PIL.Image` / `torch.Tensor`	required	Input image (file path, array, PIL, or preprocessed tensor)
`beam_size`	`int`	`1`	Beam width (1 = greedy decoding; larger values such as 3 enable beam search)
`max_len`	`int`	`500`	Maximum number of tokens to generate
`smiles_mode`	`str` or `None`	`"postprocess"`	`"postprocess"` (best quality), `"graph"`, `"decoder"`, or `None` (raw result dict)
`device`	`str` or `None`	`None`	Optional device override (e.g. `"cuda:1"`)

Returns:

str — predicted SMILES string (if smiles_mode is not None)
dict — full result dict with keys tokens, symbols, coords, bond_mat, decode_smiles, success (if smiles_mode=None)

`como.predict_batch(model, images, *, beam_size=1, max_len=500, smiles_mode="postprocess", device=None)`

Batch prediction on a single GPU.

Parameter	Type	Default	Description
`model`	`ComoModel`	required	A loaded model
`images`	`list`	required	List of file paths, NumPy arrays, PIL Images, or tensors
`beam_size`	`int`	`1`	Beam width (1 = greedy, recommended for batch)
`max_len`	`int`	`500`	Maximum tokens per image
`smiles_mode`	`str` or `None`	`"postprocess"`	SMILES reconstruction mode
`device`	`str` or `None`	`None`	Optional device override

Returns:

list[str] — predicted SMILES for each image (if smiles_mode is not None)
list[dict] — raw result dicts (if smiles_mode=None)
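
For long image lists it can help to feed predict_batch in fixed-size chunks so that memory use stays bounded. The sketch below assumes nothing about how predict_batch batches internally; chunked is a hypothetical helper, and the COMO call is left as a comment so the snippet runs without a model.

```python
from typing import Iterator, List, Sequence


def chunked(items: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `items` with at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


paths = [f"mol{i}.png" for i in range(5)]
for batch in chunked(paths, size=2):
    # With COMO installed, each chunk could be passed on, e.g.:
    #   smiles = como.predict_batch(model, batch)
    print(batch)
```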

`como.evaluate(model, benchmark_dir, csv_path, *, beam_size=1, postproc_workers=32, tautomer_standardize=True, gpus="0")`

Evaluate on a single benchmark dataset. Returns a flat dict of metrics.

Parameter	Type	Default	Description
`model`	`ComoModel`	required	A loaded model
`benchmark_dir`	`str`	required	Directory containing `.png` images
`csv_path`	`str`	required	CSV with columns `image_id`, `SMILES`
`beam_size`	`int`	`1`	Beam width for decoding
`postproc_workers`	`int`	`32`	Parallel workers for SMILES post-processing
`tautomer_standardize`	`bool`	`True`	Include tautomer-normalized exact match
`gpus`	`str` or `None`	`"0"`	GPU IDs (`"0,1"`) or `None` for all
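
Before a long evaluation run, it may be worth checking that the ground-truth CSV has the two documented columns. The following is a minimal sketch using only the standard library; read_ground_truth and the sample rows are illustrative and not part of COMO.

```python
import csv
import io
from typing import Dict, List

REQUIRED_COLUMNS = ("image_id", "SMILES")


def read_ground_truth(handle) -> List[Dict[str, str]]:
    """Read benchmark rows and fail early if a required column is missing."""
    reader = csv.DictReader(handle)
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"CSV is missing columns: {missing}")
    return [{c: row[c] for c in REQUIRED_COLUMNS} for row in reader]


# In-memory stand-in for a file such as benchmark/USPTO.csv (made-up rows).
sample = io.StringIO("image_id,SMILES\nmol1,CC(=O)O\nmol2,c1ccccc1\n")
rows = read_ground_truth(sample)
print(len(rows), rows[0])
```

For a real file, pass an open handle instead, for example `open(csv_path, newline="")`.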

Returns: dict with the following keys:

Key	Type	Description
`decoder/exact_match_acc`	`float`	Exact match accuracy (decoder mode)
`decoder/avg_tanimoto`	`float`	Average Tanimoto similarity (decoder)
`decoder/tautomer_match_acc`	`float`	Tautomer-normalized exact match (decoder, if `tautomer_standardize=True`)
`decoder/failed_predictions`	`int`	Number of failed predictions (decoder)
`decoder/valid`	`int`	Number of chemically valid predictions (decoder)
`decoder/total`	`int`	Total benchmark samples
`graph/exact_match_acc`	`float`	Exact match accuracy (graph mode)
`graph/avg_tanimoto`	`float`	Average Tanimoto similarity (graph)
`graph/tautomer_match_acc`	`float`	Tautomer-normalized exact match (graph, if `tautomer_standardize=True`)
`graph/failed_predictions`	`int`	Number of failed predictions (graph)
`graph/valid`	`int`	Number of chemically valid predictions (graph)
`graph/total`	`int`	Total benchmark samples
`postprocess/exact_match_acc`	`float`	Exact match accuracy (postprocess mode, primary metric)
`postprocess/avg_tanimoto`	`float`	Average Tanimoto similarity (postprocess)
`postprocess/tautomer_match_acc`	`float`	Tautomer-normalized exact match (postprocess, if `tautomer_standardize=True`)
`postprocess/failed_predictions`	`int`	Number of failed predictions (postprocess)
`postprocess/valid`	`int`	Number of chemically valid predictions (postprocess)
`postprocess/records_df`	`DataFrame`	Per-image results with columns `image_id`, `gt_smiles`, `pred_smiles`, `exact`, `tautomer`, `tanimoto`
`postprocess/total`	`int`	Total benchmark samples
`total`	`int`	Total benchmark samples
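
Because the returned dict is flat, with the mode encoded as a key prefix, grouping it by mode can make side-by-side comparison easier. The helper below is a hypothetical convenience that is not part of COMO, and the numbers are placeholders rather than measured results.

```python
from typing import Any, Dict


def split_by_mode(metrics: Dict[str, Any]) -> Dict[str, Dict[str, Any]]:
    """Group "mode/metric" keys by mode; keys without "/" go under "overall"."""
    grouped: Dict[str, Dict[str, Any]] = {}
    for key, value in metrics.items():
        mode, sep, name = key.partition("/")
        if not sep:
            mode, name = "overall", key
        grouped.setdefault(mode, {})[name] = value
    return grouped


# Placeholder values for illustration only, not measured results.
metrics = {
    "decoder/exact_match_acc": 0.5,
    "graph/exact_match_acc": 0.5,
    "postprocess/exact_match_acc": 0.5,
    "total": 10,
}
grouped = split_by_mode(metrics)
print(sorted(grouped))
print(grouped["postprocess"]["exact_match_acc"])
```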

`como.evaluate_benchmarks(model, benchmarks, *, beam_size=1, postproc_workers=32, tautomer_standardize=True, gpus="0")`

Evaluate on multiple benchmarks in one call. Returns a nested dict keyed by benchmark name.

Parameter	Type	Default	Description
`model`	`ComoModel`	required	A loaded model
`benchmarks`	`list[dict]`	required	Each dict has keys `"name"`, `"benchmark_dir"`, `"csv_path"`
`beam_size`	`int`	`1`	Beam width for decoding
`postproc_workers`	`int`	`32`	Parallel workers for SMILES post-processing
`tautomer_standardize`	`bool`	`True`	Include tautomer-normalized exact match
`gpus`	`str` or `None`	`"0"`	GPU IDs (`"0,1"`) or `None` for all

Returns: dict[str, dict], a mapping from benchmark name to a metrics dict with the same structure as the one returned by evaluate. Example:

{
  "USPTO": {
    "postprocess/exact_match_acc": 0.934,
    "postprocess/avg_tanimoto": 0.987,
    ...
  },
  "CLEF": {
    "postprocess/exact_match_acc": 0.948,
    ...
  },
}

Example:

benchmarks = [
    {"name": "USPTO", "benchmark_dir": "data/benchmark/real/USPTO",
     "csv_path": "data/benchmark/real/USPTO.csv"},
    {"name": "CLEF",  "benchmark_dir": "data/benchmark/real/CLEF",
     "csv_path": "data/benchmark/real/CLEF_corrected.csv"},
]
results = como.evaluate_benchmarks(model, benchmarks, gpus="0,1")
for name, metrics in results.items():
    acc = metrics["postprocess/exact_match_acc"]
    tan = metrics["postprocess/avg_tanimoto"]
    print(f"{name}: Exact={acc:.2%}, Tanimoto={tan:.4f}")
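
The nested results dict can also be reduced to a single number per metric. The sketch below computes an unweighted (macro) average across benchmarks. macro_average is a hypothetical helper, and the values are the illustrative ones from the example dict above.

```python
from statistics import mean
from typing import Dict


def macro_average(results: Dict[str, Dict[str, float]], key: str) -> float:
    """Unweighted mean of one metric over all benchmarks that report it."""
    values = [m[key] for m in results.values() if key in m]
    if not values:
        raise KeyError(f"no benchmark reports {key!r}")
    return mean(values)


# Same illustrative numbers as the example dict above.
results = {
    "USPTO": {"postprocess/exact_match_acc": 0.934},
    "CLEF": {"postprocess/exact_match_acc": 0.948},
}
print(f"{macro_average(results, 'postprocess/exact_match_acc'):.3f}")
```

A macro average treats every benchmark equally regardless of size. Weighting each benchmark by its total count would be a reasonable alternative when the datasets differ a lot in size.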

`como.canonicalize_smiles(smiles, *, ignore_chiral=False, ignore_cistrans=False, replace_rgroup=True)`

Canonicalize a SMILES string using RDKit.

Parameter	Type	Default	Description
`smiles`	`str`	required	Input SMILES string
`ignore_chiral`	`bool`	`False`	Strip tetrahedral chirality before canonicalization
`ignore_cistrans`	`bool`	`False`	Strip cis–trans markers (`/` and `\`) before canonicalization
`replace_rgroup`	`bool`	`True`	If `True`, replace R-group tokens (`R`, `R1`, `X`, `Ar`, …) with wildcard `*`

Returns: tuple[str, bool] — (canonical_smiles, ok) where ok is True if the SMILES is chemically valid and canonicalization succeeded.
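
The (canonical_smiles, ok) return shape lends itself to an exact-match check in which invalid strings never count as a match. The sketch below takes the canonicalizer as an argument so that it runs without RDKit or COMO. toy_canonicalize is a stand-in with no chemical knowledge, and whether evaluate compares strings in exactly this way is an assumption.

```python
from typing import Callable, Tuple

Canonicalizer = Callable[[str], Tuple[str, bool]]


def exact_match(pred: str, truth: str, canonicalize: Canonicalizer) -> bool:
    """Compare two SMILES after canonicalization; invalid input never matches."""
    pred_canon, pred_ok = canonicalize(pred)
    truth_canon, truth_ok = canonicalize(truth)
    return pred_ok and truth_ok and pred_canon == truth_canon


def toy_canonicalize(smiles: str) -> Tuple[str, bool]:
    """Stand-in with the documented (smiles, ok) shape. NOT chemistry-aware:
    it only strips whitespace so that the sketch has no dependencies."""
    cleaned = smiles.strip()
    return cleaned, bool(cleaned)


print(exact_match("CC(=O)O ", "CC(=O)O", toy_canonicalize))  # True
print(exact_match("", "CC(=O)O", toy_canonicalize))          # False
# With COMO installed, como.canonicalize_smiles could be passed instead.
```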

`como.canonicalize_tautomer(smiles)`

Canonicalize a SMILES string via RDKit's TautomerEnumerator, normalizing different tautomeric forms (e.g., keto/enol, lactam/lactim) to the same canonical representation.

Parameter	Type	Default	Description
`smiles`	`str`	required	Input SMILES string

Returns: tuple[str, bool] — (tautomer_canonical_smiles, ok) where ok is False if the input SMILES is invalid or tautomer enumeration fails.

`como._result_to_smiles(result, mode="postprocess")`

Low-level: convert a raw prediction result dict (from predict with smiles_mode=None) to a canonical SMILES string.

Parameter	Type	Default	Description
`result`	`dict`	required	Raw prediction dict with keys `smiles`, `symbols`, `coords`, `bond_mat`, `success`
`mode`	`str`	`"postprocess"`	SMILES reconstruction mode

mode options:

Mode	Source	Chirality	Description
`"decoder"`	Decoder token sequence	✗	Raw decoder SMILES, no graph info used. Fastest but lowest quality.
`"graph"`	Predicted atoms + bonds	✓	Reconstructs SMILES entirely from predicted atom symbols, coordinates, and bond matrix. Chirality restored via `_verify_chirality`.
`"postprocess"`	Decoder + atoms + bonds	✓	Starts from decoder SMILES, replaces R-groups/abbreviations, restores chirality from predicted coordinates and bond matrix, then expands functional groups back. Best quality.

Returns: str or None — canonical SMILES string, or None if conversion fails.
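
Since conversion can return None, one possible pattern is to try the modes from highest to lowest quality and keep the first result. This is a sketch of that idea and not documented COMO behavior. first_successful and fake_convert are hypothetical, and the converter is passed in so that the snippet runs without the package.

```python
from typing import Callable, Optional, Sequence

Converter = Callable[[dict, str], Optional[str]]


def first_successful(
    result: dict,
    convert: Converter,
    modes: Sequence[str] = ("postprocess", "graph", "decoder"),
) -> Optional[str]:
    """Try each mode in order and return the first non-None SMILES."""
    for mode in modes:
        smiles = convert(result, mode)
        if smiles is not None:
            return smiles
    return None


def fake_convert(result: dict, mode: str) -> Optional[str]:
    """Stub with the documented call shape; pretends only "decoder" succeeds."""
    return result.get("smiles") if mode == "decoder" else None


print(first_successful({"smiles": "CC(=O)O"}, fake_convert))
# With COMO installed, como._result_to_smiles could be passed as `convert`.
```

Note that _result_to_smiles is a private function, so its behavior may change between releases.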

Model Weights

Pre-trained model weights are available on Hugging Face:

Checkpoint	Reward Mode	Description
`COMO_joint/tanimoto/final.pth`	Tanimoto	Joint MLE+MRT (Tanimoto reward)
`COMO_joint/edit_distance/final.pth`	Edit Distance	Joint MLE+MRT (Edit Distance reward)
`COMO_joint/visual/final.pth`	Visual	Joint MLE+MRT (Visual reward)

Download from: https://huggingface.co/Keylab/COMO
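
One way to fetch a checkpoint programmatically is the huggingface_hub client. The sketch below assumes that the checkpoint names in the table are file paths inside the Keylab/COMO repository, which the table does not state explicitly. download_checkpoint and CHECKPOINTS are illustrative names.

```python
CHECKPOINTS = {
    "tanimoto": "COMO_joint/tanimoto/final.pth",
    "edit_distance": "COMO_joint/edit_distance/final.pth",
    "visual": "COMO_joint/visual/final.pth",
}


def download_checkpoint(reward_mode: str, repo_id: str = "Keylab/COMO") -> str:
    """Download one checkpoint and return its local path.

    Assumes the table's checkpoint names are file paths inside the repo.
    """
    if reward_mode not in CHECKPOINTS:
        raise KeyError(f"unknown reward mode: {reward_mode!r}")
    # Imported lazily so this module loads without huggingface_hub installed.
    from huggingface_hub import hf_hub_download
    return hf_hub_download(repo_id=repo_id, filename=CHECKPOINTS[reward_mode])


# Usage (needs network access and `pip install huggingface_hub`):
#   path = download_checkpoint("tanimoto")
#   model = como.load_model(path, device="cuda:0")
```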

Benchmark Datasets

Benchmark datasets (images + CSV ground truth) are available on Hugging Face Datasets:

Dataset	Images	Type
USPTO	~6K	Real patent images
USPTO-10K	~10K	Real patent images
CLEF	~5K	Real patent images
JPO	~3K	Real patent images
UOB	~4K	Real academic images
staker	~1K	Real images
acs	~2K	Real publication images
WildMol-10K	~10K	Real wild images
indigo	~8K	Synthetic (Indigo-rendered)
chemdraw	~8K	Synthetic (ChemDraw style)

Download from: https://huggingface.co/Keylab/COMO (see benchmarks/ folder)

Citation

If you use COMO in your research, please cite:

@article{lyu2026closed,
  title={COMO: Closed-Loop Optical Molecule Recognition with Minimum Risk Training},
  author={Lyu, Zhuoqi and Ke, Qing},
  journal={arXiv preprint arXiv:2604.23546},
  year={2026}
}

License

Code (como/ package): MIT License
Model Weights (.pth files): CC BY-NC 4.0 (non-commercial use only)
Benchmark Datasets: collected from existing public OCSR benchmarks; please refer to their original sources for license and attribution:

Dataset	Source
USPTO, CLEF, JPO, UOB, Staker	Rajan et al., 2020; Xiong et al., 2023
Indigo, ChemDraw, ACS, Staker	Qian et al., 2023
USPTO-10K	Morin et al., 2023
WildMol-10K	Fang et al., 2025

See LICENSE for full terms.

Release history

Version	Date	Yanked	Reason
1.2.1	May 19, 2026	No	-
1.2.0	May 19, 2026	No	-
1.1.1	May 18, 2026	Yes	Before refactoring; contains a large monolith
1.1.0 (this version)	May 18, 2026	Yes	Contains training code; superseded by v1.1.1
1.0.0	May 14, 2026	Yes	Contains training code; superseded by v1.1.1

Download files

Source Distribution

como_ocsr-1.1.0.tar.gz (5.4 MB)

Uploaded May 18, 2026 Source

Built Distribution

como_ocsr-1.1.0-py3-none-any.whl (5.4 MB)

Uploaded May 18, 2026 Python 3

File details

Details for the file como_ocsr-1.1.0.tar.gz.

File metadata

Download URL: como_ocsr-1.1.0.tar.gz
Upload date: May 18, 2026
Size: 5.4 MB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.0

File hashes

Hashes for como_ocsr-1.1.0.tar.gz
Algorithm	Hash digest
SHA256	`574eb68c137870faf45aad877592580ba978bae6e2a62dd4e75805200358629b`
MD5	`469888a00309bb4a5d20c2fa34c2c5ac`
BLAKE2b-256	`390a870c236fe98999142a942ca9dacdb48ed7fbabd137f157059bed1184e1d7`

File details

Details for the file como_ocsr-1.1.0-py3-none-any.whl.

File metadata

Download URL: como_ocsr-1.1.0-py3-none-any.whl
Upload date: May 18, 2026
Size: 5.4 MB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.0

File hashes

Hashes for como_ocsr-1.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`292333b90d59be12cca065020099f6c971029c9c703f904bb16cc8ddf6125b34`
MD5	`116ee5f8d9bec41dd16454f47bd33102`
BLAKE2b-256	`4308f44a2d82e4fc4042ad23589b98a8f3d2a9d8602eea33ddfeaad4bc521447`
