Reaction atom-mapping from transformers
Extraction of organic chemistry grammar from unsupervised learning of chemical reactions
RXNMapper enables robust atom mapping on valid reaction SMILES. The atom-mapping information was learned by an ALBERT model trained in an unsupervised fashion on a large dataset of chemical reactions.
- Extraction of organic chemistry grammar from unsupervised learning of chemical reactions: peer-reviewed Science Advances publication (open access).
- Demo: give RXNMapper a try!
- Unsupervised attention-guided atom-mapping preprint: presented at the ML Interpretability for Scientific Discovery ICML workshop, 2020.
Installation
From pip
conda create -n rxnmapper python=3.6 -y
conda activate rxnmapper
pip install rxnmapper
From GitHub
You can install the package and set up the environment directly from GitHub using:
git clone https://github.com/rxn4chemistry/rxnmapper.git
cd rxnmapper
conda create -n rxnmapper python=3.6 -y
conda activate rxnmapper
pip install -e .
RDKit
In both installation settings above, the RDKit dependency is not installed automatically unless you include the extra when installing: pip install "rxnmapper[rdkit]".
It can also be installed via Conda or PyPI:
# Install RDKit from Conda
conda install -c conda-forge rdkit
# Install RDKit from PyPI
pip install rdkit
# for Python<3.7
# pip install rdkit-pypi
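Because RDKit is an optional extra, it can be useful to check that it is importable before running the mapper. The following is a minimal sketch using only the standard library; the helper name `is_installed` is hypothetical and not part of rxnmapper.

```python
from importlib.util import find_spec


def is_installed(module_name):
    """Return True if a top-level module can be located without importing it."""
    return find_spec(module_name) is not None


# "rdkit" is the import name of the RDKit package.
if not is_installed("rdkit"):
    print("RDKit not found; install it with Conda or pip as shown above.")
```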
Usage
Basic usage
from rxnmapper import RXNMapper
rxn_mapper = RXNMapper()
rxns = ['CC(C)S.CN(C)C=O.Fc1cccnc1F.O=C([O-])[O-].[K+].[K+]>>CC(C)Sc1ncccc1F', 'C1COCCO1.CC(C)(C)OC(=O)CONC(=O)NCc1cccc2ccccc12.Cl>>O=C(O)CONC(=O)NCc1cccc2ccccc12']
results = rxn_mapper.get_attention_guided_atom_maps(rxns)
The results contain the mapped reactions and confidence scores:
[{'mapped_rxn': 'CN(C)C=O.F[c:5]1[n:6][cH:7][cH:8][cH:9][c:10]1[F:11].O=C([O-])[O-].[CH3:1][CH:2]([CH3:3])[SH:4].[K+].[K+]>>[CH3:1][CH:2]([CH3:3])[S:4][c:5]1[n:6][cH:7][cH:8][cH:9][c:10]1[F:11]',
'confidence': 0.9565619900376546},
{'mapped_rxn': 'C1COCCO1.CC(C)(C)[O:3][C:2](=[O:1])[CH2:4][O:5][NH:6][C:7](=[O:8])[NH:9][CH2:10][c:11]1[cH:12][cH:13][cH:14][c:15]2[cH:16][cH:17][cH:18][cH:19][c:20]12.Cl>>[O:1]=[C:2]([OH:3])[CH2:4][O:5][NH:6][C:7](=[O:8])[NH:9][CH2:10][c:11]1[cH:12][cH:13][cH:14][c:15]2[cH:16][cH:17][cH:18][cH:19][c:20]12',
'confidence': 0.9704424331552834}]
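Since each result is a plain dictionary, it can be post-processed without further dependencies. The sketch below filters results by confidence and runs a simple consistency check on the atom-map numbers. The helper names, the 0.9 threshold, and the toy data are assumptions made for illustration; they are not part of rxnmapper and not real model output.

```python
import re

# Matches the map number of a mapped atom such as "[CH3:1]" or "[c:10]".
MAP_NUMBER = re.compile(r":(\d+)\]")


def map_numbers(smiles):
    """Return the set of atom-map numbers found in a (reaction) SMILES string."""
    return {int(number) for number in MAP_NUMBER.findall(smiles)}


def product_atoms_are_mapped(mapped_rxn):
    """Check that every map number on the product side also occurs in the precursors."""
    parts = mapped_rxn.split(">")  # "A>>B" becomes ["A", "", "B"]
    precursors, product = parts[0], parts[-1]
    return map_numbers(product) <= map_numbers(precursors)


def keep_confident(results, threshold=0.9):
    """Keep only results whose confidence is at least `threshold` (0.9 is arbitrary)."""
    return [r for r in results if r["confidence"] >= threshold]


# Hand-written toy data in the same shape as the output above.
results = [
    {
        "mapped_rxn": "[CH3:1][CH2:2][O-:3].[Na+].Br[CH2:4][CH3:5]"
        ">>[CH3:1][CH2:2][O:3][CH2:4][CH3:5]",
        "confidence": 0.95,
    },
    # Deliberately inconsistent: map number 6 has no counterpart in the precursors.
    {"mapped_rxn": "[CH3:1][Br:2]>>[CH3:1][OH:6]", "confidence": 0.40},
]

confident = keep_confident(results)
checks = [product_atoms_are_mapped(r["mapped_rxn"]) for r in results]
```

The same helpers can be applied directly to the list returned by `get_attention_guided_atom_maps`.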
To account for batching and error handling automatically, you can use BatchedMapper instead:
from rxnmapper import BatchedMapper
rxn_mapper = BatchedMapper(batch_size=32)
rxns = ['CC[O-]~[Na+].BrCC>>CCOCC', 'invalid>>reaction']
# The following calls work with input of arbitrary size. Also, they do not raise
# any exceptions but will return ">>" or an empty dictionary for the second reaction.
results = list(rxn_mapper.map_reactions(rxns)) # results as strings directly
results = list(rxn_mapper.map_reactions_with_info(rxns)) # results as dictionaries (as above)
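Because failures are returned silently, it helps to pair each input with its result and collect the reactions that could not be mapped. The sketch below assumes, as described in the comment above, that a failed reaction is represented by an empty dictionary; the function name `split_failures` and the placeholder values are hypothetical.

```python
def split_failures(rxns, infos):
    """Pair each input reaction with its result and separate out the failures.

    `infos` is assumed to hold one dictionary per input reaction, in input
    order, with an empty dictionary standing for a failed mapping.
    """
    if len(rxns) != len(infos):
        raise ValueError("Expected exactly one result per input reaction.")

    mapped, failed = [], []
    for rxn, info in zip(rxns, infos):
        if info:
            mapped.append((rxn, info["mapped_rxn"], info["confidence"]))
        else:
            failed.append(rxn)
    return mapped, failed


rxns = ["CC[O-]~[Na+].BrCC>>CCOCC", "invalid>>reaction"]
# Placeholder results in the documented shape, not real model output.
infos = [{"mapped_rxn": "<mapped reaction SMILES>", "confidence": 0.9}, {}]

mapped, failed = split_failures(rxns, infos)
```

In practice, `infos` would be `list(rxn_mapper.map_reactions_with_info(rxns))`.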
Testing
You can also run the examples above as part of the test suite. In your Conda environment, from the root of the repository:
pip install -e .[dev]
pytest tests
Examples
To learn more, see the examples.
Data
Data can be found at: https://ibm.box.com/v/RXNMapperData
Citation
@article{schwaller2021extraction,
title={Extraction of organic chemistry grammar from unsupervised learning of chemical reactions},
author={Schwaller, Philippe and Hoover, Benjamin and Reymond, Jean-Louis and Strobelt, Hendrik and Laino, Teodoro},
journal={Science Advances},
volume={7},
number={15},
pages={eabe4166},
year={2021},
publisher={American Association for the Advancement of Science}
}