Skip to main content

Reaction atom-mapping from transformers

Project description

Extraction of organic chemistry grammar from unsupervised learning of chemical reactions

Enable robust atom mapping on valid reaction SMILES. The atom-mapping information was learned by an ALBERT model trained in an unsupervised fashion on a large dataset of chemical reactions.

Installation

From pip

conda create -n rxnmapper python=3.6 -y
conda activate rxnmapper
pip install rxnmapper

From github

You can install the package and setup the environment directly from github using:

git clone https://github.com/rxn4chemistry/rxnmapper.git 
cd rxnmapper
conda create -n rxnmapper python=3.6 -y
conda activate rxnmapper
pip install -e .

RDkit

In both installation settings above, the RDKit dependency is not installed automatically, unless you include the extra when installing: pip install "rxmapper[rdkit]". It can also be installed via Conda or Pypi:

# Install RDKit from Conda
conda install -c conda-forge rdkit

# Install RDKit from Pypi
pip install rdkit
# for Python<3.7
# pip install rdkit-pypi

Usage

Basic usage

from rxnmapper import RXNMapper
rxn_mapper = RXNMapper()
rxns = ['CC(C)S.CN(C)C=O.Fc1cccnc1F.O=C([O-])[O-].[K+].[K+]>>CC(C)Sc1ncccc1F', 'C1COCCO1.CC(C)(C)OC(=O)CONC(=O)NCc1cccc2ccccc12.Cl>>O=C(O)CONC(=O)NCc1cccc2ccccc12']
results = rxn_mapper.get_attention_guided_atom_maps(rxns)

The results contain the mapped reactions and confidence scores:

[{'mapped_rxn': 'CN(C)C=O.F[c:5]1[n:6][cH:7][cH:8][cH:9][c:10]1[F:11].O=C([O-])[O-].[CH3:1][CH:2]([CH3:3])[SH:4].[K+].[K+]>>[CH3:1][CH:2]([CH3:3])[S:4][c:5]1[n:6][cH:7][cH:8][cH:9][c:10]1[F:11]',
  'confidence': 0.9565619900376546},
 {'mapped_rxn': 'C1COCCO1.CC(C)(C)[O:3][C:2](=[O:1])[CH2:4][O:5][NH:6][C:7](=[O:8])[NH:9][CH2:10][c:11]1[cH:12][cH:13][cH:14][c:15]2[cH:16][cH:17][cH:18][cH:19][c:20]12.Cl>>[O:1]=[C:2]([OH:3])[CH2:4][O:5][NH:6][C:7](=[O:8])[NH:9][CH2:10][c:11]1[cH:12][cH:13][cH:14][c:15]2[cH:16][cH:17][cH:18][cH:19][c:20]12',
  'confidence': 0.9704424331552834}]

To account for batching and error handling automatically, you can use BatchedMapper instead:

from rxnmapper import BatchedMapper
rxn_mapper = BatchedMapper(batch_size=32)
rxns = ['CC[O-]~[Na+].BrCC>>CCOCC', 'invalid>>reaction']

# The following calls work with input of arbitrary size. Also, they do not raise 
# any exceptions but will return ">>" or an empty dictionary for the second reaction.
results = list(rxn_mapper.map_reactions(rxns))  # results as strings directly
results = list(rxn_mapper.map_reactions_with_info(rxns))  # results as dictionaries (as above)

Testing

You can run the examples above with the test suite as well:

  1. In your Conda environment: pip install -e .[dev]
  2. pytest tests from the root

Examples

To learn more see the examples.

Data

Data can be found at: https://ibm.box.com/v/RXNMapperData

Citation

@article{schwaller2021extraction,
  title={Extraction of organic chemistry grammar from unsupervised learning of chemical reactions},
  author={Schwaller, Philippe and Hoover, Benjamin and Reymond, Jean-Louis and Strobelt, Hendrik and Laino, Teodoro},
  journal={Science Advances},
  volume={7},
  number={15},
  pages={eabe4166},
  year={2021},
  publisher={American Association for the Advancement of Science}
}

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

rxnmapper-0.3.1.tar.gz (3.0 MB view details)

Uploaded Source

Built Distribution

rxnmapper-0.3.1-py3-none-any.whl (3.0 MB view details)

Uploaded Python 3

File details

Details for the file rxnmapper-0.3.1.tar.gz.

File metadata

  • Download URL: rxnmapper-0.3.1.tar.gz
  • Upload date:
  • Size: 3.0 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.0 CPython/3.12.5

File hashes

Hashes for rxnmapper-0.3.1.tar.gz
Algorithm Hash digest
SHA256 c702aa0da06052004e32a445819b706b4808f0d08dbf35c5eb7f213c4ecbe596
MD5 bff3dec6cd73f1c50ecd216c745b0917
BLAKE2b-256 b8d99f641bde722e4191835debd9992ac85eed9c9870762286dbdb46df07846d

See more details on using hashes here.

File details

Details for the file rxnmapper-0.3.1-py3-none-any.whl.

File metadata

  • Download URL: rxnmapper-0.3.1-py3-none-any.whl
  • Upload date:
  • Size: 3.0 MB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.0 CPython/3.12.5

File hashes

Hashes for rxnmapper-0.3.1-py3-none-any.whl
Algorithm Hash digest
SHA256 99cb820caa6538edd023efd2e9250444ac9098ba244fd4307388f61d7754fd81
MD5 a5660f5cb1bac5100d0f001bd5839a43
BLAKE2b-256 63cc8a350df1863c0c597ed041ccfd1f18793a848baff25056f09f0749140f7d

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page