Reaction atom-mapping from transformers
Extraction of organic chemistry grammar from unsupervised learning of chemical reactions
RXNMapper enables robust atom mapping of valid reaction SMILES. The atom-mapping information was learned by an ALBERT model trained in an unsupervised fashion on a large dataset of chemical reactions.
- Extraction of organic chemistry grammar from unsupervised learning of chemical reactions: peer-reviewed Science Advances publication (open access).
- Demo: give RXNMapper a try!
- Unsupervised attention-guided atom-mapping preprint: presented at the ML Interpretability for Scientific Discovery ICML workshop, 2020.
Installation
Create virtual environment (optional)
python3 -m venv .venv
source .venv/bin/activate
Install from pip
pip install "rxnmapper[rdkit]"
You can leave out [rdkit] if RDKit is already available in your Python environment.
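If you are unsure whether RDKit is already importable in the active environment, a quick check such as the following sketch can help decide which install command to use. The helper name `rdkit_available` is illustrative and not part of rxnmapper; it relies only on the standard library.

```python
import importlib.util


def rdkit_available() -> bool:
    """Return True if the `rdkit` package can be found in this environment."""
    # find_spec looks the package up without importing it.
    return importlib.util.find_spec("rdkit") is not None


# Suggest the install command with or without the [rdkit] extra.
extra = "" if rdkit_available() else "[rdkit]"
print(f'pip install "rxnmapper{extra}"')
```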
From source
git clone https://github.com/rxn4chemistry/rxnmapper.git
cd rxnmapper
pip install -e ".[rdkit]"
Usage
Basic usage
from rxnmapper import RXNMapper
rxn_mapper = RXNMapper()
rxns = [
    'CC(C)S.CN(C)C=O.Fc1cccnc1F.O=C([O-])[O-].[K+].[K+]>>CC(C)Sc1ncccc1F',
    'C1COCCO1.CC(C)(C)OC(=O)CONC(=O)NCc1cccc2ccccc12.Cl>>O=C(O)CONC(=O)NCc1cccc2ccccc12',
]
results = rxn_mapper.get_attention_guided_atom_maps(rxns)
The results contain the mapped reactions and confidence scores:
[{'mapped_rxn': 'CN(C)C=O.F[c:5]1[n:6][cH:7][cH:8][cH:9][c:10]1[F:11].O=C([O-])[O-].[CH3:1][CH:2]([CH3:3])[SH:4].[K+].[K+]>>[CH3:1][CH:2]([CH3:3])[S:4][c:5]1[n:6][cH:7][cH:8][cH:9][c:10]1[F:11]',
'confidence': 0.9565619900376546},
{'mapped_rxn': 'C1COCCO1.CC(C)(C)[O:3][C:2](=[O:1])[CH2:4][O:5][NH:6][C:7](=[O:8])[NH:9][CH2:10][c:11]1[cH:12][cH:13][cH:14][c:15]2[cH:16][cH:17][cH:18][cH:19][c:20]12.Cl>>[O:1]=[C:2]([OH:3])[CH2:4][O:5][NH:6][C:7](=[O:8])[NH:9][CH2:10][c:11]1[cH:12][cH:13][cH:14][c:15]2[cH:16][cH:17][cH:18][cH:19][c:20]12',
'confidence': 0.9704424331552834}]
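The atom-map numbers in `mapped_rxn` are ordinary SMILES annotations (for example `[CH3:1]`), so they can be read back without RXNMapper. The sketch below collects the map numbers on each side of the first mapped reaction and checks that every mapped product atom has a counterpart among the precursors. The helper `atom_map_numbers` is a hypothetical name, and the code assumes the reaction SMILES contains exactly one `>>` separator.

```python
import re

# Matches the ":<number>]" suffix of a mapped atom such as "[CH3:1]".
_MAP_NUMBER = re.compile(r":(\d+)\]")


def atom_map_numbers(mapped_rxn):
    """Return (precursor_maps, product_maps) as sets of integers."""
    # Assumption: the reaction has the form "precursors>>products".
    precursors, products = mapped_rxn.split(">>")
    precursor_maps = {int(n) for n in _MAP_NUMBER.findall(precursors)}
    product_maps = {int(n) for n in _MAP_NUMBER.findall(products)}
    return precursor_maps, product_maps


# First mapped reaction from the output above.
mapped = (
    "CN(C)C=O.F[c:5]1[n:6][cH:7][cH:8][cH:9][c:10]1[F:11].O=C([O-])[O-]."
    "[CH3:1][CH:2]([CH3:3])[SH:4].[K+].[K+]>>"
    "[CH3:1][CH:2]([CH3:3])[S:4][c:5]1[n:6][cH:7][cH:8][cH:9][c:10]1[F:11]"
)

precursor_maps, product_maps = atom_map_numbers(mapped)

# Product atoms whose map number does not appear on the precursor side.
unmatched = product_maps - precursor_maps
print(sorted(product_maps), unmatched)
```

An empty `unmatched` set means each mapped product atom can be traced back to a precursor atom; unmapped precursors such as the solvent and base carry no map numbers and are ignored.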
To account for batching and error handling automatically, you can use BatchedMapper instead:
from rxnmapper import BatchedMapper
rxn_mapper = BatchedMapper(batch_size=32)
rxns = ['CC[O-]~[Na+].BrCC>>CCOCC', 'invalid>>reaction']
# The following calls work with input of arbitrary size. They do not raise
# exceptions: for the second (invalid) reaction, map_reactions returns ">>"
# and map_reactions_with_info returns an empty dictionary.
results = list(rxn_mapper.map_reactions(rxns)) # results as strings directly
results = list(rxn_mapper.map_reactions_with_info(rxns)) # results as dictionaries (as above)
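Since failures are signalled by a sentinel value rather than an exception, callers usually want to separate them from the successful mappings. The following is a minimal sketch of that step; `split_failures` is a hypothetical helper, the mapped string is a placeholder rather than real RXNMapper output, and the code assumes results come back in the same order as the inputs.

```python
def split_failures(rxns, results):
    """Pair each input reaction with its result and separate the failures.

    A result counts as failed when it is ">>" (string output) or an
    empty dictionary (output with info), as described above.
    """
    succeeded, failed = [], []
    # Assumption: results are returned in input order, one per reaction.
    for rxn, result in zip(rxns, results):
        if result == ">>" or result == {}:
            failed.append(rxn)
        else:
            succeeded.append((rxn, result))
    return succeeded, failed


rxns = ['CC[O-]~[Na+].BrCC>>CCOCC', 'invalid>>reaction']

# Placeholder results standing in for list(rxn_mapper.map_reactions(rxns)).
results = ['<mapped reaction SMILES>', '>>']

succeeded, failed = split_failures(rxns, results)
print(f"{len(succeeded)} mapped, {len(failed)} failed: {failed}")
```

The same helper works for the dictionary output, because an empty dictionary is treated as a failure as well.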
Testing
You can run the test suite with:
pip install -e ".[dev,rdkit]"
pytest tests
Examples
To learn more, see the examples.
Data
Data can be found at: https://ibm.box.com/v/RXNMapperData
Citation
@article{schwaller2021extraction,
title={Extraction of organic chemistry grammar from unsupervised learning of chemical reactions},
author={Schwaller, Philippe and Hoover, Benjamin and Reymond, Jean-Louis and Strobelt, Hendrik and Laino, Teodoro},
journal={Science Advances},
volume={7},
number={15},
pages={eabe4166},
year={2021},
publisher={American Association for the Advancement of Science}
}