Reaction atom-mapping from transformers

Extraction of organic chemistry grammar from unsupervised learning of chemical reactions

RXNMapper enables robust atom mapping on valid reaction SMILES. The atom-mapping information was learned by an ALBERT model trained in an unsupervised fashion on a large dataset of chemical reactions.

Installation

From pip

conda create -n rxnmapper python=3.6 -y
conda activate rxnmapper
pip install rxnmapper

From GitHub

You can install the package and set up the environment directly from GitHub using:

git clone https://github.com/rxn4chemistry/rxnmapper.git 
cd rxnmapper
conda create -n rxnmapper python=3.6 -y
conda activate rxnmapper
pip install -e .

RDKit

Neither installation method above installs the RDKit dependency automatically, unless you include the extra when installing: pip install "rxnmapper[rdkit]". RDKit can also be installed via Conda or PyPI:

# Install RDKit from Conda
conda install -c conda-forge rdkit

# Install RDKit from PyPI
pip install rdkit
# for Python<3.7
# pip install rdkit-pypi
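
As a quick sanity check after installation, the following sketch reports whether RDKit is importable in the current environment and which PyPI package name the comments above suggest for the running Python version. The helper names rdkit_available and suggested_rdkit_package are illustrative only and are not part of rxnmapper.

```python
import importlib.util
import sys


def rdkit_available() -> bool:
    """Return True if the top-level 'rdkit' package can be found."""
    return importlib.util.find_spec("rdkit") is not None


def suggested_rdkit_package(version_info=sys.version_info) -> str:
    """Pick the PyPI package name following the install notes above.

    'rdkit' for Python 3.7 and later, 'rdkit-pypi' for older interpreters.
    """
    return "rdkit" if tuple(version_info[:2]) >= (3, 7) else "rdkit-pypi"


if __name__ == "__main__":
    if rdkit_available():
        print("RDKit is installed.")
    else:
        print("RDKit not found; try: pip install " + suggested_rdkit_package())
```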

Usage

Basic usage

from rxnmapper import RXNMapper
rxn_mapper = RXNMapper()
rxns = ['CC(C)S.CN(C)C=O.Fc1cccnc1F.O=C([O-])[O-].[K+].[K+]>>CC(C)Sc1ncccc1F', 'C1COCCO1.CC(C)(C)OC(=O)CONC(=O)NCc1cccc2ccccc12.Cl>>O=C(O)CONC(=O)NCc1cccc2ccccc12']
results = rxn_mapper.get_attention_guided_atom_maps(rxns)

The results contain the mapped reactions and confidence scores:

[{'mapped_rxn': 'CN(C)C=O.F[c:5]1[n:6][cH:7][cH:8][cH:9][c:10]1[F:11].O=C([O-])[O-].[CH3:1][CH:2]([CH3:3])[SH:4].[K+].[K+]>>[CH3:1][CH:2]([CH3:3])[S:4][c:5]1[n:6][cH:7][cH:8][cH:9][c:10]1[F:11]',
  'confidence': 0.9565619900376546},
 {'mapped_rxn': 'C1COCCO1.CC(C)(C)[O:3][C:2](=[O:1])[CH2:4][O:5][NH:6][C:7](=[O:8])[NH:9][CH2:10][c:11]1[cH:12][cH:13][cH:14][c:15]2[cH:16][cH:17][cH:18][cH:19][c:20]12.Cl>>[O:1]=[C:2]([OH:3])[CH2:4][O:5][NH:6][C:7](=[O:8])[NH:9][CH2:10][c:11]1[cH:12][cH:13][cH:14][c:15]2[cH:16][cH:17][cH:18][cH:19][c:20]12',
  'confidence': 0.9704424331552834}]
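
The atom-map numbers in mapped_rxn can be inspected with plain string handling. The sketch below uses the first result shown above verbatim, collects the map numbers on each side of the reaction, and checks that every mapped product atom also appears among the reactants. The regular expression and the helper names (atom_map_numbers, summarize) are assumptions made for illustration and are not part of the rxnmapper API; for anything beyond a quick check, a cheminformatics toolkit such as RDKit is the more robust choice.

```python
import re
from typing import Dict, Set

# Matches the ":N]" suffix of a mapped atom such as "[CH3:1]".
# Illustrative pattern, not taken from rxnmapper itself.
ATOM_MAP = re.compile(r":(\d+)\]")


def atom_map_numbers(smiles: str) -> Set[int]:
    """Return the set of atom-map numbers found in a SMILES string."""
    return {int(n) for n in ATOM_MAP.findall(smiles)}


def summarize(result: Dict) -> Dict:
    """Summarize one result dictionary with 'mapped_rxn' and 'confidence'."""
    reactants, products = result["mapped_rxn"].split(">>")
    reactant_maps = atom_map_numbers(reactants)
    product_maps = atom_map_numbers(products)
    return {
        "confidence": result["confidence"],
        "n_mapped_atoms": len(product_maps),
        # Every mapped product atom should originate from a reactant atom.
        "consistent": product_maps <= reactant_maps,
    }


# First result from the output above, copied verbatim.
result = {
    "mapped_rxn": "CN(C)C=O.F[c:5]1[n:6][cH:7][cH:8][cH:9][c:10]1[F:11]."
                  "O=C([O-])[O-].[CH3:1][CH:2]([CH3:3])[SH:4].[K+].[K+]>>"
                  "[CH3:1][CH:2]([CH3:3])[S:4][c:5]1[n:6][cH:7][cH:8][cH:9][c:10]1[F:11]",
    "confidence": 0.9565619900376546,
}

summary = summarize(result)
print(summary)
```

Unmapped species such as the solvent and the base carry no map numbers, so they are ignored by the check.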

To account for batching and error handling automatically, you can use BatchedMapper instead:

from rxnmapper import BatchedMapper
rxn_mapper = BatchedMapper(batch_size=32)
rxns = ['CC[O-]~[Na+].BrCC>>CCOCC', 'invalid>>reaction']

# The following calls work with input of arbitrary size. Also, they do not raise 
# any exceptions but will return ">>" or an empty dictionary for the second reaction.
results = list(rxn_mapper.map_reactions(rxns))  # results as strings directly
results = list(rxn_mapper.map_reactions_with_info(rxns))  # results as dictionaries (as above)
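
Because failures are signalled by a sentinel value rather than an exception, it can be convenient to separate them from successful mappings after the fact. The following is a minimal sketch under that assumption: it treats ">>" or an empty result as a failure, as described in the comments above. The helper split_failures and the placeholder output string are hypothetical; in real use, the second argument would be the list returned by map_reactions or map_reactions_with_info.

```python
from typing import Any, List, Sequence, Tuple

FAILED_STRING = ">>"  # sentinel described above for reactions that could not be mapped


def split_failures(
    rxns: Sequence[str], outputs: Sequence[Any]
) -> Tuple[List[Tuple[str, Any]], List[str]]:
    """Pair each input reaction with its output and separate the failures.

    An output counts as failed if it is empty (e.g. an empty dictionary)
    or equal to the ">>" sentinel string.
    """
    ok: List[Tuple[str, Any]] = []
    failed: List[str] = []
    for rxn, out in zip(rxns, outputs):
        if not out or out == FAILED_STRING:
            failed.append(rxn)
        else:
            ok.append((rxn, out))
    return ok, failed


rxns = ["CC[O-]~[Na+].BrCC>>CCOCC", "invalid>>reaction"]

# Placeholder standing in for a real mapped reaction string; the actual
# value would come from list(rxn_mapper.map_reactions(rxns)).
PLACEHOLDER = "<mapped reaction SMILES>"
outputs = [PLACEHOLDER, ">>"]

ok, failed = split_failures(rxns, outputs)
print(len(ok), "mapped;", "failed inputs:", failed)
```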

Testing

You can run the examples above with the test suite as well:

  1. In your Conda environment, install the development dependencies: pip install -e ".[dev]"
  2. Run pytest tests from the repository root.

Examples

To learn more, see the examples.

Data

Data can be found at: https://ibm.box.com/v/RXNMapperData

Citation

@article{schwaller2021extraction,
  title={Extraction of organic chemistry grammar from unsupervised learning of chemical reactions},
  author={Schwaller, Philippe and Hoover, Benjamin and Reymond, Jean-Louis and Strobelt, Hendrik and Laino, Teodoro},
  journal={Science Advances},
  volume={7},
  number={15},
  pages={eabe4166},
  year={2021},
  publisher={American Association for the Advancement of Science}
}
