COMO: Closed-loop Optical Molecule recOgnition with Minimum Risk Training
COMO: Optical Chemical Structure Recognition
COMO (Closed-loop Optical Molecule recOgnition) converts images of chemical structure diagrams into machine-readable SMILES strings, atom-level coordinates, and bond matrices.
Unlike image-to-text OCSR models (e.g., MolScribe, SwinOCSR, Image2Mol), COMO predicts an explicit molecular graph (atoms with 2D coordinates, plus bonds) and then reconstructs the SMILES string through cheminformatics post-processing, with the aim of producing valid, chemically accurate structures.
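To make "explicit molecular graph" concrete, here is a minimal sketch of how atoms, 2D coordinates, and a bond matrix could be held together. The `MolGraph` class, its field names, and the coordinates are hypothetical and chosen for illustration; COMO's actual result layout may differ.

```python
from dataclasses import dataclass


@dataclass
class MolGraph:
    """Hypothetical container; not COMO's actual result type."""

    symbols: list[str]                 # one element symbol per atom
    coords: list[tuple[float, float]]  # 2D position of each atom
    bonds: list[list[int]]             # symmetric matrix: bond order, 0 = no bond

    def bond_list(self) -> list[tuple[int, int, int]]:
        """Return (i, j, order) for each bond, reading the upper triangle only."""
        n = len(self.symbols)
        return [
            (i, j, self.bonds[i][j])
            for i in range(n)
            for j in range(i + 1, n)
            if self.bonds[i][j] > 0
        ]


# Acetic acid, CC(=O)O, heavy atoms only. Coordinates are made up.
acetic_acid = MolGraph(
    symbols=["C", "C", "O", "O"],
    coords=[(0.0, 0.0), (1.0, 0.5), (1.0, 1.5), (2.0, 0.0)],
    bonds=[
        [0, 1, 0, 0],
        [1, 0, 2, 1],
        [0, 2, 0, 0],
        [0, 1, 0, 0],
    ],
)

print(acetic_acid.bond_list())  # [(0, 1, 1), (1, 2, 2), (1, 3, 1)]
```

A graph in this form can be handed to a cheminformatics toolkit to rebuild a SMILES string, which is the step the post-processing mode performs.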
🚀 Quick Start
```python
import como

# 1. Load model
model = como.load_model("COMO_joint.pth", device="cuda")

# 2. Predict a single molecule
smiles = como.predict(model, "molecule.png")  # → "CC(=O)O"
result = como.predict(model, "molecule.png", smiles_mode=None)
# result contains: tokens, atom symbols, coordinates, bond matrix, etc.

# 3. Batch prediction
smiles_list = como.predict_batch(model, ["mol1.png", "mol2.png"])

# 4. Benchmark evaluation
metrics = como.evaluate(model, "benchmark/USPTO/", "benchmark/USPTO.csv")
print(metrics["exact_match_acc"], metrics["avg_tanimoto"])
```
📦 Installation
```bash
pip install como-ocsr
```
Requirements: Python 3.10+, PyTorch ≥ 2.0, RDKit.
🧠 Model Checkpoints
| Checkpoint | Description |
|---|---|
| `COMO_joint.pth` | Full model: MLE + MRT joint training (recommended) |
| `COMO_stage1_synthetic.pth` | Stage 1 only: MLE on synthetic data |
Download from Hugging Face.
📖 API Reference
como.load_model(checkpoint_path, device="cuda", pretrained=True, **kwargs)
Load a COMO model from a checkpoint.
- checkpoint_path (`str`): Path to `.pth` checkpoint file.
- device (`str`): `"cuda"` or `"cpu"`.
- pretrained (`bool`): Use ImageNet-pretrained backbone weights (default: `True`).
- Returns: `ComoModel` in evaluation mode.
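Because `device` accepts either `"cuda"` or `"cpu"`, a script that must run on machines without a GPU can pick the value at runtime. The helper below is a sketch, not part of COMO; `pick_device` is a hypothetical name, and the only library call it relies on is PyTorch's `torch.cuda.is_available()`.

```python
def pick_device(preferred: str = "cuda") -> str:
    """Return `preferred` if it is usable, otherwise fall back to "cpu"."""
    if preferred == "cpu":
        return "cpu"
    try:
        import torch  # PyTorch is listed under Requirements
    except ImportError:
        return "cpu"
    return preferred if torch.cuda.is_available() else "cpu"


device = pick_device()
# Assumed usage with the documented loader:
# model = como.load_model("COMO_joint.pth", device=device)
```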
como.predict(model, image, *, beam_size=1, max_len=500, smiles_mode="postprocess", device=None)
Predict SMILES for a single image.
- image: File path (`str`), NumPy array (H×W×3 or H×W), PIL `Image`, or preprocessed `torch.Tensor`.
- beam_size (`int`): 1 = greedy, 3 = beam search.
- smiles_mode (`str`):
  - `"postprocess"`: cheminformatics-based SMILES reconstruction (recommended, best accuracy)
  - `"graph"`: graph-traversal SMILES
  - `"decoder"`: raw decoder output
  - `None`: returns full result dict (tokens, atoms, bonds, coordinates)
- Returns: SMILES string (`str`) or full result dict.
como.predict_batch(model, images, *, beam_size=1, max_len=500, smiles_mode="postprocess", device=None)
Predict SMILES for multiple images (single GPU).
- images: List of file paths, NumPy arrays, PIL Images, or Tensors.
- Returns: List of SMILES strings or result dicts.
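The documentation above does not say whether `predict_batch` splits long lists internally. If memory is a concern for very large inputs, one option is to feed the list in fixed-size chunks. This is a sketch under that assumption; `chunked`, `predict_in_chunks`, and the default `chunk_size` are hypothetical and not part of COMO.

```python
from collections.abc import Callable, Iterator, Sequence
from typing import Any


def chunked(items: Sequence[Any], size: int) -> Iterator[Sequence[Any]]:
    """Yield consecutive slices of `items` with at most `size` elements each."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def predict_in_chunks(
    predict_batch_fn: Callable[[Sequence[Any]], list],
    images: Sequence[Any],
    chunk_size: int = 64,
) -> list:
    """Call `predict_batch_fn` once per chunk and concatenate the results in order."""
    results: list = []
    for chunk in chunked(images, chunk_size):
        results.extend(predict_batch_fn(chunk))
    return results
```

With a loaded model, the callable could be `lambda imgs: como.predict_batch(model, imgs)`, so the wrapper stays independent of the model object.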
como.evaluate(model, benchmark_dir, csv_path, *, beam_size=1, postproc_workers=32, tautomer_standardize=True, gpus="0")
Evaluate on a benchmark dataset.
- benchmark_dir: Directory of `.png` images.
- csv_path: CSV with columns `image_id` and `SMILES`.
- gpus: Comma-separated GPU IDs (e.g. `"0,1,2,3"`), or `None` for all GPUs.
- Returns: Dict with `exact_match_acc`, `avg_tanimoto`, `tautomer_match_acc`, etc.
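For readers unfamiliar with the two headline metrics, the sketch below shows what they measure. It is not COMO's implementation: it assumes SMILES strings are already canonicalized so that string equality means the same molecule, and it represents fingerprints as plain sets of on-bit indices. `tanimoto` and `summarize` are hypothetical names.

```python
def tanimoto(a: set[int], b: set[int]) -> float:
    """Tanimoto similarity of two fingerprints given as sets of on-bit indices."""
    if not a and not b:
        return 1.0  # convention chosen here for two empty fingerprints
    return len(a & b) / len(a | b)


def summarize(
    pred_smiles: list[str],
    true_smiles: list[str],
    pred_fps: list[set[int]],
    true_fps: list[set[int]],
) -> dict[str, float]:
    """Fraction of exact string matches and mean Tanimoto similarity."""
    n = len(true_smiles)
    if n == 0 or not (len(pred_smiles) == len(pred_fps) == len(true_fps) == n):
        raise ValueError("inputs must be non-empty and of equal length")
    exact = sum(p == t for p, t in zip(pred_smiles, true_smiles))
    sims = [tanimoto(p, t) for p, t in zip(pred_fps, true_fps)]
    return {"exact_match_acc": exact / n, "avg_tanimoto": sum(sims) / n}


# Toy data: the first prediction is correct, the second shares 1 of 3 bits.
toy = summarize(
    ["CC(=O)O", "CCO"],
    ["CC(=O)O", "CCN"],
    [{1, 2, 3}, {1, 2}],
    [{1, 2, 3}, {1, 4}],
)
print(toy["exact_match_acc"])  # 0.5
```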
como.evaluate_benchmarks(model, benchmarks, *, ...)
Evaluate on multiple benchmarks at once.
- benchmarks: List of `{"name": ..., "benchmark_dir": ..., "csv_path": ...}` dicts.
- Returns: `dict[name] → metrics_dict`.
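The `benchmarks` list can be built by hand, or generated when benchmarks follow the layout used in the Quick Start (an image directory and a CSV file sharing one name). The helper below is a sketch assuming that layout; `build_benchmarks` and the `my_benchmark` name are hypothetical.

```python
from pathlib import PurePosixPath


def build_benchmarks(root: str, names: list[str]) -> list[dict[str, str]]:
    """Build one spec dict per name, assuming <root>/<name>/ and <root>/<name>.csv."""
    base = PurePosixPath(root)
    return [
        {
            "name": name,
            "benchmark_dir": str(base / name),
            "csv_path": str(base / f"{name}.csv"),
        }
        for name in names
    ]


benchmarks = build_benchmarks("benchmark", ["USPTO", "my_benchmark"])
print(benchmarks[0])
# Assumed usage with the documented function:
# all_metrics = como.evaluate_benchmarks(model, benchmarks)
```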
🧪 Supported Input Formats
- PNG / JPEG / TIFF images
- Hand-drawn or computer-generated chemical structure diagrams
- Arbitrary aspect ratios and sizes (auto-resized internally)
📄 License
- Code: MIT License (see LICENSE)
- Model Weights: CC BY-NC 4.0
📚 Citation
If you use COMO in your research, please cite:
```bibtex
@article{lyu2025como,
  title={Closed-loop Optical Molecule recOgnition with Minimum Risk Training},
  author={Lyu, Zhuoqi and others},
  journal={arXiv},
  year={2025}
}
```