ReactEA: Combining Evolutionary Algorithms with Reaction Rules Towards Focused Molecular Design

Description

ReactEA is a reaction-based single and multi-objective evolutionary approach towards focused molecular design. ReactEA is a modular and problem-agnostic method that uses enzymatic reaction rules to manipulate molecules. The generated molecules are optimized for user-specified objective functions using a vast suite of EAs implemented in the jMetalPy framework.

Requirements

  • rdkit-pypi==2022.03.1
  • numpy==1.21.5
  • pandas==1.3.5
  • cytoolz==0.11.2
  • jmetalpy
  • PyYAML==6.0
  • matplotlib==3.5.1
  • chembl_structure_pipeline
  • joblib==1.1.0
  • networkx==2.6.3
  • click==8.1.3

Installation

Pip

Install ReactEA via pip:

pip install git+https://github.com/BioSystemsUM/ReactEA.git

From GitHub

Alternatively, install dependencies and ReactEA manually.

  1. Clone the repository:
git clone https://github.com/BioSystemsUM/ReactEA.git
  2. Install ReactEA and its dependencies (run from inside the cloned ReactEA directory):
python setup.py install

Getting Started

Using ReactEA:

Example:

from rdkit.Chem.QED import qed
from reactea import evaluation_functions_wrapper

# EVALUATION FUNCTIONS

# evaluation function returning the number of rings in a molecule
def number_of_rings(mol):
    ri = mol.GetRingInfo()
    n_rings = len(ri.AtomRings())
    return n_rings

n_rigs_feval = evaluation_functions_wrapper(number_of_rings, 
                                            maximize=False, 
                                            worst_fitness=100, 
                                            name='n_rings')

# evaluation function returning the drug-likeness score (QED) of a molecule
def qed_score(mol):
    return qed(mol)

qed_feval = evaluation_functions_wrapper(qed_score, 
                                         maximize=True, 
                                         worst_fitness=0.0, 
                                         name='qed')

# CASE STUDY

from reactea import case_study_wrapper

# SINGLE OBJECTIVE CASE STUDY
# case study to optimize a single objective `f1` (minimize number of rings in a molecule)
minimize_rings = case_study_wrapper(n_rigs_feval, 
                                    multi_objective=False, 
                                    name='minimize_rings')

# SINGLE-OBJECTIVE CASE STUDY WITH MULTIPLE EVALUATION FUNCTIONS
# case study to optimize a single objective but with multiple evaluation functions `f1` and `f2` (minimize number of rings in a molecule and maximize qed)
# the number of evaluation functions must be the same as the number of values in weights and the sum of the weights must be 1
minimize_rings_maximize_qed = case_study_wrapper([n_rigs_feval, qed_feval], 
                                                 multi_objective=False, 
                                                 name='minimize_rings_maximize_qed', 
                                                 weights=[0.3, 0.7])

# MULTI-OBJECTIVE CASE STUDY
# case study to optimize multiple objectives simultaneously
minimize_rings_maximize_qed_mo = case_study_wrapper([n_rigs_feval, qed_feval], 
                                                    multi_objective=True, 
                                                    name='minimize_rings_maximize_qed_mo')
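
The weighting rule for the single-objective case with several evaluation functions can be illustrated with a small standalone sketch. This is not ReactEA's internal code: `aggregate_scores` is a hypothetical helper, and it assumes the individual scores are already on a comparable scale and oriented so that higher is better.

```python
import math
from typing import Sequence


def aggregate_scores(scores: Sequence[float], weights: Sequence[float]) -> float:
    """Combine several evaluation scores into one fitness value (weighted sum)."""
    # one weight per evaluation function
    if len(scores) != len(weights):
        raise ValueError("need exactly one weight per evaluation function")
    # the weights must sum to 1 (checked with a small numerical tolerance)
    if not math.isclose(sum(weights), 1.0, abs_tol=1e-9):
        raise ValueError("weights must sum to 1")
    return sum(w * s for w, s in zip(weights, scores))


# two hypothetical, already-normalized scores combined with weights [0.3, 0.7]
fitness = aggregate_scores([0.5, 0.8], weights=[0.3, 0.7])
```

With these illustrative numbers the second score dominates the result, which mirrors the effect of giving `qed_feval` the larger weight in the case study above.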
  • Provide the configuration file (see configuration files in config_files for more details).

Example:

# CONFIGURATION FILE

# Name of the experiment (results are saved in a folder with this name inside the output folder)
exp_name: "NSGAIII_EXAMPLE_CONFIG"

# Path to the file containing the seed compounds (must have a column named smiles)
init_pop_path: ".../.../path_to_seed_compounds.tsv"
# size of the initial population to sample from the seed compounds (if not provided, all seed compounds will be used)
init_pop_size: 100
# whether to standardize the seed compounds (if not provided, the seed compounds will not be standardized)
standardize: True

# Maximum number of reaction rules to try in each generation (maximum is 22949)
max_rules_by_iter: 22949
# ** The selected mutant is chosen at random from the compounds whose similarity lies between `best_similarity - tolerance` and `best_similarity`
tolerance: 0.1

# Number of generations to run the algorithm
generations: 100
# EA to use (NSGAIII, NSGAII, SPEA2, IBEA, GA, LS, ES and SA)
algorithm: "NSGAIII"

# Path to output folder
output_path: ".../output_dir_path/"

Note: ** When reaction products (offspring) are generated from the molecules of the previous generation (parents), each parent yields multiple products, including cofactors such as water, carbon dioxide and acetyl-CoA. These cofactors are of no interest to the optimization. To discard them and keep only relevant compounds, the mutant is selected from a pool of products that are similar to the parent compound. The pool contains the products whose similarity to the parent lies between best_similarity - tolerance and best_similarity, where best_similarity is the similarity of the most similar product.
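
The selection window described in the note can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not ReactEA's implementation: `pick_mutant` is a hypothetical helper, the similarity values are made up, and the similarities to the parent are assumed to be precomputed (for example with a fingerprint-based measure).

```python
import random
from typing import Dict, Optional


def pick_mutant(similarities: Dict[str, float],
                tolerance: float,
                rng: Optional[random.Random] = None) -> str:
    """Pick one product at random from the similarity window."""
    if not similarities:
        raise ValueError("no candidate products")
    rng = rng or random.Random()
    # similarity of the product most similar to the parent
    best_similarity = max(similarities.values())
    # keep products with similarity in [best_similarity - tolerance, best_similarity]
    threshold = best_similarity - tolerance
    pool = sorted(smi for smi, sim in similarities.items() if sim >= threshold)
    return rng.choice(pool)


# hypothetical similarities of reaction products to their parent compound
products = {
    "O": 0.02,                 # water: far from the parent, filtered out
    "O=C=O": 0.03,             # carbon dioxide: filtered out
    "CC(=O)Oc1ccccc1": 0.74,   # inside the window
    "CC(O)c1ccccc1": 0.81,     # most similar product (best_similarity)
}
chosen = pick_mutant(products, tolerance=0.1, rng=random.Random(0))
```

With a tolerance of 0.1 the pool holds the two aromatic products, so the cofactors can never be selected. A tolerance of 0 would always return the single most similar product.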

  • Run ReactEA:
from reactea import run_reactea

case_study_rings = minimize_rings_maximize_qed_mo()
# provide path to configuration file and case study
run_reactea(configs_path='config.yaml',
            case_study=case_study_rings)
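
Before launching a run, it can help to sanity-check the configuration values against the limits stated in the example configuration file. The sketch below is a hypothetical pre-flight check, not part of ReactEA: `check_config` is an invented helper, the algorithm names and the rule limit are copied from the comments in the example configuration, and the bounds on `tolerance` (0 to 1) and `generations` (at least 1) are assumptions.

```python
from typing import Any, Dict, List

# values taken from the comments in the example configuration file
SUPPORTED_ALGORITHMS = {"NSGAIII", "NSGAII", "SPEA2", "IBEA", "GA", "LS", "ES", "SA"}
MAX_RULES = 22949


def check_config(config: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable problems (empty if none were found)."""
    problems = []
    algorithm = config.get("algorithm")
    if algorithm is not None and algorithm not in SUPPORTED_ALGORITHMS:
        problems.append(f"unknown algorithm: {algorithm!r}")
    max_rules = config.get("max_rules_by_iter")
    if max_rules is not None and not 1 <= max_rules <= MAX_RULES:
        problems.append(f"max_rules_by_iter must be between 1 and {MAX_RULES}")
    # assumption: similarities, and therefore the tolerance, lie in [0, 1]
    tolerance = config.get("tolerance")
    if tolerance is not None and not 0.0 <= tolerance <= 1.0:
        problems.append("tolerance must be between 0 and 1")
    # assumption: at least one generation is required
    generations = config.get("generations")
    if generations is not None and generations < 1:
        problems.append("generations must be at least 1")
    return problems


# values mirroring the example configuration file
example = {
    "exp_name": "NSGAIII_EXAMPLE_CONFIG",
    "algorithm": "NSGAIII",
    "max_rules_by_iter": 22949,
    "tolerance": 0.1,
    "generations": 100,
}
problems = check_config(example)
```

In practice the dictionary could come from PyYAML's `yaml.safe_load` applied to the configuration file, with any reported problems fixed before calling `run_reactea`.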

Citing ReactEA

Accepted at GECCO 2023.

Publication:

João Correia, Vítor Pereira, and Miguel Rocha. 2023. Combining Evolutionary Algorithms with Reaction Rules Towards Focused Molecular Design. In Proceedings of the Genetic and Evolutionary Computation Conference (GECCO '23). Association for Computing Machinery, New York, NY, USA, 900–909. https://doi.org/10.1145/3583131.3590413

BibTeX entry:

@inproceedings{Correia2023,
  doi = {10.1145/3583131.3590413},
  url = {https://doi.org/10.1145/3583131.3590413},
  year = {2023},
  month = jul,
  publisher = {{ACM}},
  author = {Jo{\~{a}}o Correia and V{\'{\i}}tor Pereira and Miguel Rocha},
  title = {Combining Evolutionary Algorithms with Reaction Rules Towards Focused Molecular Design},
  booktitle = {Proceedings of the Genetic and Evolutionary Computation Conference}
}

Licensing

ReactEA is released under the GPL-3.0 license.
