Reaction atom-mapping from transformers

Extraction of organic chemistry grammar from unsupervised learning of chemical reactions

RXNMapper enables robust atom mapping of valid reaction SMILES. The atom-mapping information was learned by an ALBERT model trained without supervision on a large dataset of chemical reactions.

Installation

Create virtual environment (optional)

python3 -m venv .venv
source .venv/bin/activate

Install from pip

pip install "rxnmapper[rdkit]"

You can leave out [rdkit] if RDKit is already available in your Python environment.
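
If you are unsure whether RDKit is already importable, a quick check like the following can tell you which install command applies. This is only a minimal sketch: the helper name `pip_requirement` is hypothetical and not part of rxnmapper, and it relies solely on the standard-library `importlib.util.find_spec`.

```python
import importlib.util


def pip_requirement() -> str:
    """Return the pip requirement string suited to the current environment.

    Hypothetical helper for illustration; not part of rxnmapper.
    """
    # find_spec returns None when the top-level package cannot be found.
    has_rdkit = importlib.util.find_spec("rdkit") is not None
    return "rxnmapper" if has_rdkit else "rxnmapper[rdkit]"


print(f'pip install "{pip_requirement()}"')
```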

From source

git clone https://github.com/rxn4chemistry/rxnmapper.git
cd rxnmapper
pip install -e ".[rdkit]"

Usage

Basic usage

from rxnmapper import RXNMapper
rxn_mapper = RXNMapper()
rxns = ['CC(C)S.CN(C)C=O.Fc1cccnc1F.O=C([O-])[O-].[K+].[K+]>>CC(C)Sc1ncccc1F', 'C1COCCO1.CC(C)(C)OC(=O)CONC(=O)NCc1cccc2ccccc12.Cl>>O=C(O)CONC(=O)NCc1cccc2ccccc12']
results = rxn_mapper.get_attention_guided_atom_maps(rxns)

The results contain the mapped reactions and confidence scores:

[{'mapped_rxn': 'CN(C)C=O.F[c:5]1[n:6][cH:7][cH:8][cH:9][c:10]1[F:11].O=C([O-])[O-].[CH3:1][CH:2]([CH3:3])[SH:4].[K+].[K+]>>[CH3:1][CH:2]([CH3:3])[S:4][c:5]1[n:6][cH:7][cH:8][cH:9][c:10]1[F:11]',
  'confidence': 0.9565619900376546},
 {'mapped_rxn': 'C1COCCO1.CC(C)(C)[O:3][C:2](=[O:1])[CH2:4][O:5][NH:6][C:7](=[O:8])[NH:9][CH2:10][c:11]1[cH:12][cH:13][cH:14][c:15]2[cH:16][cH:17][cH:18][cH:19][c:20]12.Cl>>[O:1]=[C:2]([OH:3])[CH2:4][O:5][NH:6][C:7](=[O:8])[NH:9][CH2:10][c:11]1[cH:12][cH:13][cH:14][c:15]2[cH:16][cH:17][cH:18][cH:19][c:20]12',
  'confidence': 0.9704424331552834}]
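
Each result is a plain dictionary, so downstream filtering needs no special API. The sketch below uses the first result above as sample data. It keeps only mappings above a confidence cutoff and checks that every atom-map number in the product also appears in the precursors. The helper names and the 0.9 threshold are illustrative assumptions, not part of rxnmapper.

```python
import re

# First entry of the output shown above, used here as sample data.
results = [
    {
        "mapped_rxn": (
            "CN(C)C=O.F[c:5]1[n:6][cH:7][cH:8][cH:9][c:10]1[F:11]."
            "O=C([O-])[O-].[CH3:1][CH:2]([CH3:3])[SH:4].[K+].[K+]"
            ">>[CH3:1][CH:2]([CH3:3])[S:4][c:5]1[n:6][cH:7][cH:8][cH:9][c:10]1[F:11]"
        ),
        "confidence": 0.9565619900376546,
    }
]

# Matches the ":N]" suffix that marks a mapped atom, e.g. "[CH3:1]".
MAP_NUMBER = re.compile(r":(\d+)\]")


def map_numbers(smiles: str) -> set[int]:
    """Collect the atom-map numbers that occur in a SMILES string."""
    return {int(n) for n in MAP_NUMBER.findall(smiles)}


def product_maps_in_precursors(mapped_rxn: str) -> bool:
    """Check that every map number in the product also occurs in the precursors."""
    parts = mapped_rxn.split(">")  # "A>>B" -> ["A", "", "B"]
    precursors, products = parts[0], parts[-1]
    product_maps = map_numbers(products)
    return bool(product_maps) and product_maps <= map_numbers(precursors)


def keep_confident(results, threshold=0.9):
    """Keep results whose confidence reaches the (arbitrary) threshold."""
    return [r for r in results if r["confidence"] >= threshold]


for r in keep_confident(results):
    print(round(r["confidence"], 3), product_maps_in_precursors(r["mapped_rxn"]))
```

The cutoff is a judgment call; a suitable value depends on how many uncertain mappings your application can tolerate.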

To account for batching and error handling automatically, you can use BatchedMapper instead:

from rxnmapper import BatchedMapper
rxn_mapper = BatchedMapper(batch_size=32)
rxns = ['CC[O-]~[Na+].BrCC>>CCOCC', 'invalid>>reaction']

# The following calls accept input of arbitrary size. They do not raise
# exceptions; for the invalid second reaction they return ">>" or an empty dictionary.
results = list(rxn_mapper.map_reactions(rxns))  # results as strings directly
results = list(rxn_mapper.map_reactions_with_info(rxns))  # results as dictionaries (as above)
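
Because failures come back as ">>" rather than as exceptions, it can help to pair each input with its output and set the failures aside. The sketch below assumes that outputs are returned one per input and in input order. `split_failures` is a hypothetical helper, and the `mapped` list merely stands in for `list(rxn_mapper.map_reactions(rxns))`, reusing a mapped string from the basic-usage output as sample data.

```python
FAILED = ">>"  # placeholder described above for reactions that cannot be mapped


def split_failures(rxns, mapped):
    """Pair inputs with outputs and separate the reactions that failed.

    Hypothetical helper; assumes one output per input, in input order.
    """
    if len(rxns) != len(mapped):
        raise ValueError("expected one output per input reaction")
    succeeded, failed = {}, []
    for rxn, out in zip(rxns, mapped):
        if out == FAILED:
            failed.append(rxn)
        else:
            succeeded[rxn] = out
    return succeeded, failed


# Sample data: one reaction from the basic-usage example and one invalid input.
rxns = [
    "CC(C)S.CN(C)C=O.Fc1cccnc1F.O=C([O-])[O-].[K+].[K+]>>CC(C)Sc1ncccc1F",
    "invalid>>reaction",
]
mapped = [
    "CN(C)C=O.F[c:5]1[n:6][cH:7][cH:8][cH:9][c:10]1[F:11]."
    "O=C([O-])[O-].[CH3:1][CH:2]([CH3:3])[SH:4].[K+].[K+]"
    ">>[CH3:1][CH:2]([CH3:3])[S:4][c:5]1[n:6][cH:7][cH:8][cH:9][c:10]1[F:11]",
    ">>",
]

succeeded, failed = split_failures(rxns, mapped)
print(f"{len(succeeded)} mapped, {len(failed)} failed: {failed}")
```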

Testing

You can run the test suite with:

pip install -e ".[dev,rdkit]"
pytest tests

Examples

To learn more, see the examples.

Data

Data can be found at: https://ibm.box.com/v/RXNMapperData

Citation

@article{schwaller2021extraction,
  title={Extraction of organic chemistry grammar from unsupervised learning of chemical reactions},
  author={Schwaller, Philippe and Hoover, Benjamin and Reymond, Jean-Louis and Strobelt, Hendrik and Laino, Teodoro},
  journal={Science Advances},
  volume={7},
  number={15},
  pages={eabe4166},
  year={2021},
  publisher={American Association for the Advancement of Science}
}
