ReactEA: Combining Evolutionary Algorithms with Reaction Rules Towards Focused Molecular Design
Description
ReactEA is a reaction-based, single- and multi-objective evolutionary approach to focused molecular design. It is a modular and problem-agnostic method that uses enzymatic reaction rules to manipulate molecules. The generated molecules are optimized for user-specified objective functions using the suite of evolutionary algorithms (EAs) implemented in the jMetalPy framework.
Requirements
- rdkit-pypi==2022.03.1
- numpy==1.21.5
- pandas==1.3.5
- cytoolz==0.11.2
- jmetalpy
- PyYAML==6.0
- matplotlib==3.5.1
- chembl_structure_pipeline
- joblib==1.1.0
- networkx==2.6.3
- click==8.1.3
Installation
Pip
Install ReactEA via pip:
pip install git+https://github.com/BioSystemsUM/ReactEA.git
From GitHub
Alternatively, install dependencies and ReactEA manually.
- Clone the repository:
git clone https://github.com/BioSystemsUM/ReactEA.git
- Install ReactEA and its dependencies:
python setup.py install
Getting Started
Using ReactEA:
- Define the evaluation functions (case study) to use in the optimization (see evaluation_functions.ipynb and case_studies.ipynb for more details).
Example:
from rdkit.Chem.QED import qed
from reactea import evaluation_functions_wrapper
# EVALUATION FUNCTIONS
# evaluation function returning the number of rings in a molecule
def number_of_rings(mol):
    ri = mol.GetRingInfo()
    n_rings = len(ri.AtomRings())
    return n_rings

n_rigs_feval = evaluation_functions_wrapper(number_of_rings,
                                            maximize=False,
                                            worst_fitness=100,
                                            name='n_rings')

# evaluation function returning the drug-likeness score (QED) of a molecule
def qed_score(mol):
    return qed(mol)

qed_feval = evaluation_functions_wrapper(qed_score,
                                         maximize=True,
                                         worst_fitness=0.0,
                                         name='qed')
# CASE STUDY
from reactea import case_study_wrapper
# SINGLE OBJECTIVE CASE STUDY
# case study to optimize a single objective `f1` (minimize the number of rings in a molecule)
minimize_rings = case_study_wrapper(n_rigs_feval,
                                    multi_objective=False,
                                    name='minimize_rings')

# SINGLE-OBJECTIVE CASE STUDY WITH MULTIPLE EVALUATION FUNCTIONS
# case study to optimize a single objective built from multiple evaluation functions `f1` and `f2`
# (minimize the number of rings in a molecule and maximize QED)
# the number of weights must equal the number of evaluation functions, and the weights must sum to 1
minimize_rings_maximize_qed = case_study_wrapper([n_rigs_feval, qed_feval],
                                                 multi_objective=False,
                                                 name='minimize_rings_maximize_qed',
                                                 weights=[0.3, 0.7])

# MULTI-OBJECTIVE CASE STUDY
# case study to optimize multiple objectives simultaneously
minimize_rings_maximize_qed_mo = case_study_wrapper([n_rigs_feval, qed_feval],
                                                    multi_objective=True,
                                                    name='minimize_rings_maximize_qed_mo')
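The weights rule for single-objective case studies with several evaluation functions can be illustrated with a small, standalone sketch. It is not ReactEA code: `check_weights` and `weighted_score` are hypothetical helpers, and the sketch assumes each evaluation function has already been rescaled so that higher values are better. How ReactEA itself combines minimized and maximized functions is not described here.

```python
from math import isclose
from typing import Sequence


def check_weights(n_functions: int, weights: Sequence[float]) -> None:
    """Raise ValueError unless there is one weight per function and they sum to 1."""
    if len(weights) != n_functions:
        raise ValueError(
            f"expected {n_functions} weights, got {len(weights)}"
        )
    if not isclose(sum(weights), 1.0, abs_tol=1e-9):
        raise ValueError(f"weights must sum to 1, got {sum(weights)}")


def weighted_score(scores: Sequence[float], weights: Sequence[float]) -> float:
    """Collapse several per-function scores into one value with a weighted sum.

    Assumption: every score is already oriented so that higher is better.
    """
    check_weights(len(scores), weights)
    return sum(w * s for w, s in zip(weights, scores))


# Illustrative scores for one molecule (made-up values, not ReactEA output).
example_scores = [0.5, 0.8]
example_weights = [0.3, 0.7]
print(weighted_score(example_scores, example_weights))
```

Checking the weights before starting a long optimization run is cheap and reports a mismatched list immediately.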
- Provide the configuration file (see configuration files in config_files for more details).
Example:
# CONFIGURATION FILE
# Name of the experiment (results will be saved in a folder with this name (inside output folder))
exp_name: "NSGAIII_EXAMPLE_CONFIG"
# Path to the file containing the seed compounds (with column named smiles)
init_pop_path: ".../.../path_to_seed_compounds.tsv"
# size of the initial population to sample from the seed compounds (if not provided, all seed compounds will be used)
init_pop_size: 100
# whether to standardize the seed compounds (if not provided, the seed compounds will not be standardized)
standardize: True
# Maximum number of reaction rules to try in each generation (maximum is 22949)
max_rules_by_iter: 22949
# **Mutant selected will be randomly chosen from the compounds with similarity between `best_similarity` and `best_similarity - tolerance`
tolerance: 0.1
# Number of generations to run the algorithm
generations: 100
# EA to use (NSGAIII, NSGAII, SPEA2, IBEA, GA, LS, ES and SA)
algorithm: "NSGAIII"
# Path to output folder
output_path: ".../output_dir_path/"
Note: ** When generating reaction products (offspring) from the molecules of the previous generation (parents),
multiple products are generated for each parent, including cofactors such as water, carbon dioxide and
acetyl-CoA. These cofactors are of no interest to the optimization process. To eliminate them and keep only
relevant compounds, the mutant is selected from a pool of products that are similar to the parent compound.
This pool contains the products whose similarity to the parent lies between best_similarity (the similarity
of the most similar product) and best_similarity - tolerance.
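The selection rule in the note can be sketched in a few lines of plain Python. This is an illustration only, not ReactEA's implementation: `similarity_pool` and `select_mutant` are hypothetical names, the similarity values are made up, and the similarity measure itself (for example a fingerprint-based score) is assumed to have been computed beforehand.

```python
import random
from typing import Dict, List, Optional


def similarity_pool(similarities: Dict[str, float], tolerance: float) -> List[str]:
    """Return the products whose similarity to the parent lies in
    [best_similarity - tolerance, best_similarity]."""
    if not similarities:
        raise ValueError("no reaction products to choose from")
    best_similarity = max(similarities.values())
    lower_bound = best_similarity - tolerance
    # Sorting makes the pool order deterministic for a given input.
    return sorted(s for s, sim in similarities.items() if sim >= lower_bound)


def select_mutant(
    similarities: Dict[str, float],
    tolerance: float,
    rng: Optional[random.Random] = None,
) -> str:
    """Pick one product at random from the similarity pool."""
    rng = rng or random.Random()
    return rng.choice(similarity_pool(similarities, tolerance))


# Made-up similarities between one parent and its reaction products.
products = {
    "O": 0.05,         # water (cofactor)
    "O=C=O": 0.08,     # carbon dioxide (cofactor)
    "CCO": 0.82,
    "CC(=O)O": 0.78,
    "CCN": 0.60,
}
print(similarity_pool(products, tolerance=0.1))
print(select_mutant(products, tolerance=0.1, rng=random.Random(0)))
```

With a tolerance of 0.1, the cofactors and the 0.60 product fall outside the pool, so only the two products closest to the parent remain candidates. A larger tolerance widens the pool and increases diversity at the cost of admitting less similar products.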
- Run ReactEA:
from reactea import run_reactea
case_study_rings = minimize_rings_maximize_qed_mo()
# provide path to configuration file and case study
run_reactea(configs_path='config.yaml',
            case_study=case_study_rings)
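Before starting a run, the configuration values can be sanity-checked against the limits stated in the example configuration file. The sketch below is a hypothetical helper, not part of ReactEA. It works on a plain dictionary using the keys from the example, takes the algorithm names and the 22949 rule limit from the configuration comments, and assumes (this is not stated in the documentation) that `tolerance` should lie between 0 and 1.

```python
from typing import Any, Dict, List

# Values taken from the comments in the example configuration file.
SUPPORTED_ALGORITHMS = {"NSGAIII", "NSGAII", "SPEA2", "IBEA", "GA", "LS", "ES", "SA"}
MAX_REACTION_RULES = 22949


def check_config(config: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable problems; an empty list means no issue found."""
    problems = []

    algorithm = config.get("algorithm")
    if algorithm not in SUPPORTED_ALGORITHMS:
        problems.append(f"unknown algorithm: {algorithm!r}")

    rules = config.get("max_rules_by_iter")
    if rules is not None and not 1 <= rules <= MAX_REACTION_RULES:
        problems.append(f"max_rules_by_iter must be in 1..{MAX_REACTION_RULES}")

    # Assumption: similarity is a score in [0, 1], so tolerance should be too.
    tolerance = config.get("tolerance")
    if tolerance is not None and not 0.0 <= tolerance <= 1.0:
        problems.append("tolerance is expected to lie between 0 and 1")

    generations = config.get("generations")
    if generations is not None and generations < 1:
        problems.append("generations must be at least 1")

    return problems


# Dictionary mirroring the example configuration (paths omitted).
example_config = {
    "exp_name": "NSGAIII_EXAMPLE_CONFIG",
    "init_pop_size": 100,
    "standardize": True,
    "max_rules_by_iter": 22949,
    "tolerance": 0.1,
    "generations": 100,
    "algorithm": "NSGAIII",
}
print(check_config(example_config))
```

In practice the dictionary would come from parsing the YAML file. Returning a list of problems rather than raising on the first one reports all issues in a single pass.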
Citing ReactEA
Accepted at GECCO 2023.
Publication:
João Correia, Vítor Pereira, and Miguel Rocha. 2023. Combining Evolutionary Algorithms with Reaction Rules Towards Focused Molecular Design. In Proceedings of the Genetic and Evolutionary Computation Conference (GECCO '23). Association for Computing Machinery, New York, NY, USA, 900–909. https://doi.org/10.1145/3583131.3590413
Bibtex entry:
@inproceedings{Correia2023,
doi = {10.1145/3583131.3590413},
url = {https://doi.org/10.1145/3583131.3590413},
year = {2023},
month = jul,
publisher = {{ACM}},
author = {Jo{\~{a}}o Correia and V{\'{\i}}tor Pereira and Miguel Rocha},
title = {Combining Evolutionary Algorithms with Reaction Rules Towards Focused Molecular Design},
booktitle = {Proceedings of the Genetic and Evolutionary Computation Conference}
}
Licensing
ReactEA is released under the GPL-3.0 license.