Python package for preparing small molecule for docking

These details have not been verified by PyPI

Project links

Homepage

Project description

Meeko: preparation of small molecules for AutoDock

Meeko reads an RDKit molecule object and writes a PDBQT string (or file) for AutoDock-Vina and AutoDock-GPU. It converts the docking output to RDKit molecules and SD files, without loss of bond orders.

Meeko is developed by the Forli lab at the Center for Computational Structural Biology (CCSB) at Scripps Research.

Usage notes

Meeko does not calculate 3D coordinates or assign protonation states. Input molecules must have explicit hydrogens.

Sampling of macrocycle conformers (paper) is enabled by default.

SDF format strongly preferred over Mol2. See this discussion, and this one, also this, and this. and RDKit issues 1755 and 917. So, what could go wrong? For example, reading Mol2 files from ZINC led to incorrect net charge of some molecules.

v0.6.0-alpha.3

This release aims to distribute an enhanced mk_prepare_receptor.py. Some features are still being developed, hence the -alpha suffix in the version. Reactive docking is not working in v0.6.0-alpha, but should be restored soon. Documentation is also work in progress.

API changes in v0.5

Class MoleculePreparation no longer has method write_pdbqt_string(). Instead, MoleculePreparation.prepare() returns a list of MoleculeSetup objects that must be passed, individually, to PDBQTWriterLegacy.write_string().

from meeko import MoleculePreparation
from meeko import PDBQTWriterLegacy

preparator = MoleculePreparation()
mol_setups = preparator.prepare(rdkit_molecule_3D_with_Hs)
for setup in mol_setups:
    pdbqt_string, is_ok, error_msg = PDBQTWriterLegacy.write_string(setup)
    if is_ok:
        print(pdbqt_string, end="")

Argument keep_nonpolar_hydrogens is replaced by merge_these_atom_types, both in the Python interface and for script mk_prepare_ligand.py. The default is merge_these_atom_types=("H",), which merges hydrogens typed "H", keeping the current default behavior. To keep all hydrogens, set merge_these_atom_types to an empty list when initializing MoleculePreparation, or pass no atom types to --merge_these_atom_types from the command line:

mk_prepare_ligand.py -i molecule.sdf --merge_these_atom_types

Dependencies

Python (>=3.5)
Numpy
Scipy
RDKit
ProDy (optionally, for covalent docking)

Conda or Miniconda can install the dependencies:

conda install -c conda-forge numpy scipy rdkit
pip install prody # optional. pip recommended at http://prody.csb.pitt.edu/downloads/

Installation (from PyPI)

$ pip install meeko

If using conda, pip installs the package in the active environment.

Installation (from source)

You'll get the develop branch, which may be ahead of the latest release.

$ git clone https://github.com/forlilab/Meeko
$ cd Meeko
$ pip install .

Optionally include --editable. Changes in the original package location take effect immediately without the need to run pip install . again.

$ pip install --editable .

Examples using the command line scripts

1. make PDBQT files

AutoDock-GPU and Vina read molecules in the PDBQT format. These can be prepared by Meeko from SD files, or from Mol2 files, but SDF is strongly preferred.

mk_prepare_ligand.py -i molecule.sdf -o molecule.pdbqt
mk_prepare_ligand.py -i multi_mol.sdf --multimol_outdir folder_for_pdbqt_files

2. convert docking results to SDF

AutoDock-GPU and Vina write docking results in the PDBQT format. The DLG output from AutoDock-GPU contains docked poses in PDBQT blocks. Meeko generates RDKit molecules from PDBQT files (or strings) using the SMILES string in the REMARK lines. The REMARK lines also have the mapping of atom indices between SMILES and PDBQT. SD files with docked coordinates are written from RDKit molecules.

mk_export.py molecule.pdbqt -o molecule.sdf
mk_export.py vina_results.pdbqt -o vina_results.sdf
mk_export.py autodock-gpu_results.dlg -o autodock-gpu_results.sdf

Making RDKit molecules from SMILES is safer than guessing bond orders from the coordinates, specially because the PDBQT lacks hydrogens bonded to carbon. As an example, consider the following conversion, in which OpenBabel adds an extra double bond, not because it has a bad algorithm, but because this is a nearly impossible task.

$ obabel -:"C1C=CCO1" -o pdbqt --gen3d | obabel -i pdbqt -o smi
[C]1=[C][C]=[C]O1

Python tutorial

1. making PDBQT strings for Vina or for AutoDock-GPU

from meeko import MoleculePreparation
from meeko import PDBQTWriterLegacy
from rdkit import Chem

input_molecule_file = "example/BACE_macrocycle/BACE_4.sdf"

# there is one molecule in this SD file, this loop iterates just once
for mol in Chem.SDMolSupplier(input_molecule_file, removeHs=False):
    preparator = MoleculePreparation()
    mol_setups = preparator.prepare(mol)
    for setup in mol_setups:
        setup.show() # optional
        pdbqt_string = PDBQTWriterLegacy.write_string(setup)

At this point, pdbqt_string can be written to a file for docking with AutoDock-GPU or Vina, or passed directly to Vina within Python using set_ligand_from_string(pdbqt_string). For context, see the docs on using Vina from Python.

2. RDKit molecule from docking results

from meeko import PDBQTMolecule
from meeko import RDKitMolCreate

fn = "autodock-gpu_results.dlg"
pdbqt_mol = PDBQTMolecule.from_file(fn, is_dlg=True, skip_typing=True)
rdkitmol_list = RDKitMolCreate.from_pdbqt_mol(pdbqt_mol)

The length of rdkitmol_list is one if there are no sidechains and only one ligand was docked. If multiple ligands and/or sidechains are docked simultaneously, each will be an individual RDKit molecule in rdkitmol_list. Sidechains are truncated at the C-alpha. Note that docking multiple ligands simultaneously is only available in Vina, and it differs from docking multiple ligands one after the other. Each failed creation of an RDKit molecule for a ligand or sidechain results in a None in rdkitmol_list.

For Vina's output PDBQT files, omit is_dlg=True.

pdbqt_mol = PDBQTMolecule.from_file("vina_results.pdbqt", skip_typing=True)
rdkitmol_list = RDKitMolCreate.from_pdbqt_mol(pdbqt_mol)

When using Vina from Python, the output string can be passed directly. See the docs for context on the v object.

vina_output_string = v.poses()
pdbqt_mol = PDBQTMolecule(vina_output_string, is_dlg=True, skip_typing=True)
rdkitmol_list = RDKitMolCreate.from_pdbqt_mol(pdbqt_mol)

3. Initializing MoleculePreparation from a dictionary (or JSON)

This is useful for saving and loading configuration files with json.

import json
from meeko import MoleculePreparation

mk_config = {"hydrate": True} # any arguments of MoleculePreparation.__init__
print(json.dumps(mk_config), file=open('mk_config.json', 'w'))
with open('mk_config.json') as f:
    mk_config = json.load(f)
preparator = MoleculePreparation.from_config(mk_config)

The command line tool mk_prepare_ligand.py can read the json files using option -c or --config.

Possibly useful configurations of MoleculePreparation

Here we create an instance of MoleculePreparation that attaches pseudo waters to the ligand (see paper on hydrated docking), keeps macrocycles rigid (generally a bad idea), and keeps conjugated bonds and amide bonds rigid. By default, most amides are kept rigid, except for tertiary amides with different substituents on the nitrogen.

preparator = MoleculePreparation(
    hydrate=True,
    rigid_macrocycles=True,
    rigidify_bonds_smarts = ["C=CC=C", "[CX3](=O)[NX3]"],
    rigidify_bonds_indices = [(1, 2), (0, 2)],
)

The same can be done with the command line script. Note that indices of the atoms in the SMARTS are 0-based for the Python API but 1-based for the command line script:

mk_prepare_ligand.py\
    --hydrate\
    --rigid_macrocycles\
    --rigidify_bonds_smarts "C=CC=C"\
    --rigidify_bonds_indices 2 3\
    --rigidify_bonds_smarts "[CX3](=O)[NX3]"\
    --rigidify_bonds_indices 1 3

Docking covalent ligands as flexible sidechains

The input ligand must be the product of the reaction and contain the atoms of the flexible sidechain up to (and including) the C-alpha. For example, if the ligand is an acrylamide (smiles: C=CC(=O)N) reacting with a cysteine (sidechain smiles: CCS), then the input ligand for Meeko must correspond to smiles CCSCCC(=O)N.

Meeko will align the ligand atoms that match the C-alpha and C-beta of the protein sidechain. Options --tether_smarts and --tether_smarts_indices define these atoms. For a cysteine, --tether_smarts "SCC" and --tether_smarts_indices 3 2 would work, although it is safer to define a more spefic SMARTS to avoid matching the ligand more than once. The first index (3 in this example) defines the C-alpha, and the second index defines the C-beta.

For the example in this repository, which is based on PDB entry 7aeh, the following options prepare the covalently bound ligand for tethered docking:

cd example/covalent_docking

mk_prepare_ligand.py\
    -i ligand_including_cys_sidechain.sdf\
    --receptor protein.pdb\
    --rec_residue ":CYS:6"\
    --tether_smarts "NC(=O)C(O)(C)SCC"\
    --tether_smarts_indices 9 8\
    -o prepared.pdbqt

Prody is required for preparing ligands as flexible sidechains. Often, pip install prody suffices to install Prody in the currently active virtual environment (e.g., conda). For more detailed installation instructions visit http://prody.csb.pitt.edu/downloads/.

Reactive Docking

1. Prepare protein with waterkit

Follow wk_prepare_receptor.py instructions and run with --pdb. The goal of this step is to perform essential fixes to the protein (such as missing atoms), to add hydrogens, and to follow the Amber naming scheme for atoms and residues, e.g., HIE or HID instead of HIS.

2. Prepare protein pdbqt

Here, wk.pdb was written by waterkit. Here, wk.pdb was written by waterkit. The example below will center a gridbox of specified size on the given reactive residue.

   $ mk_prepare_receptor.py\
    --pdb wk.pdb\
    -o receptor.pdbqt\
    --flexres " :ARG:348"\
    --reactive_flexres " :SER:308"
    --reactive_flexres " :SER:308"\
    --box_center_on_reactive_res\
    --box_size 40 40 40  # x y z (angstroms)

A manual box center can be specified with --box_center. Reactive parameters can also be modified:

  --r_eq_12 R_EQ_12     r_eq for reactive atoms (1-2 interaction)
  --eps_12 EPS_12       epsilon for reactive atoms (1-2 interaction)
  --r_eq_13_scaling R_EQ_13_SCALING
                        r_eq scaling for 1-3 interaction across reactive atoms
  --r_eq_14_scaling R_EQ_14_SCALING
                        r_eq scaling for 1-4 interaction across reactive atoms

Receptor preparation can't handle heteroatoms for the moment. Also nucleic acids, ions, and post-translational modifications (e.g. phosphorilation) are not supported. Only the 20 standard amino acids can be parsed, and it is required to have Amber atom names and hydrogens. No atoms can be missing.

3. Run autogrid

Make affinity maps for the _rigid.pdbqt part of the receptor. Make affinity maps for the _rigid.pdbqt part of the receptor. mk_prepare_receptor.py will prepare the GPF for you.

4. Write ligand PDBQT

mk_prepare_ligand.py -i sufex1.sdf --reactive_smarts "S(=O)(=O)F" --reactive_smarts_idx 1 -o sufex1.pdbqt\

5. Configure AD-GPU for reactive docking

For reactive docking there are two options that need to be passed to AutoDock-GPU: For reactive docking there are an additional option that needs to be passed to AutoDock-GPU: console --import_dpf

The --derivtype option, if needed, was written by mk_prepare_receptor.py to a file suffixed with .derivtype.

The filename to be passed to --import_dpf was written by mk_prepare_receptor.py and it is suffixed with reactive_config.

ADGPU -I *.reactive_config -L sufex1.pdbqt -N sufex1_docked_ -F *_flex.pdbqt -C 1

Project details

These details have not been verified by PyPI

Project links

Homepage

Release history Release notifications | RSS feed

0.7.1

Oct 23, 2025

0.6.1

Nov 18, 2024

0.6.0

Nov 11, 2024

This version

0.6.0a3 pre-release

Feb 2, 2024

0.6.0a2 pre-release

Nov 30, 2023

0.6.0a1 pre-release

Nov 30, 2023

0.6.0a0 pre-release

Nov 30, 2023

0.5.1

May 22, 2024

0.5.0

Jul 28, 2023

0.4.0

Nov 14, 2022

0.3.3

May 11, 2022

0.3.2

Mar 1, 2022

0.3.1

Feb 23, 2022

0.3.0

Jan 15, 2022

0.2

Oct 20, 2021

0.1.dev3 pre-release

Feb 17, 2021

0.1.dev2 pre-release

Feb 17, 2021

0.1.dev1 pre-release

Feb 17, 2021

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

meeko-0.6.0a3.tar.gz (154.2 kB view details)

Uploaded Feb 2, 2024 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

meeko-0.6.0a3-py3-none-any.whl (192.4 kB view details)

Uploaded Feb 2, 2024 Python 3

File details

Details for the file meeko-0.6.0a3.tar.gz.

File metadata

Download URL: meeko-0.6.0a3.tar.gz
Upload date: Feb 2, 2024
Size: 154.2 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/4.0.2 CPython/3.12.1

File hashes

Hashes for meeko-0.6.0a3.tar.gz
Algorithm	Hash digest
SHA256	`4c27b576d24cc27db744bcdb0d743dce135bbd6fdef22764ca07e4a6d7d3fc04`
MD5	`850499ffed17e557d6000be4fa78cb4a`
BLAKE2b-256	`a5088abf3a5732e1c263114060366de04a76fde2c3ca6374e2fea80369e88194`

See more details on using hashes here.

File details

Details for the file meeko-0.6.0a3-py3-none-any.whl.

File metadata

Download URL: meeko-0.6.0a3-py3-none-any.whl
Upload date: Feb 2, 2024
Size: 192.4 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/4.0.2 CPython/3.12.1

File hashes

Hashes for meeko-0.6.0a3-py3-none-any.whl
Algorithm	Hash digest
SHA256	`2a89a37822f065026a1394fcba1c72e2e08474f070136b1033c02c33b647631d`
MD5	`30fc1437518d99cf6a0bca69abc0fa0f`
BLAKE2b-256	`62c9674cb7336acafc89f9fbb77fd80311d57743d96cda81fdd62f6f99e35c31`

See more details on using hashes here.

meeko 0.6.0a3

Navigation

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Project description

Meeko: preparation of small molecules for AutoDock

Usage notes

v0.6.0-alpha.3

API changes in v0.5

Dependencies

Installation (from PyPI)

Installation (from source)

Examples using the command line scripts

1. make PDBQT files

2. convert docking results to SDF

Python tutorial

1. making PDBQT strings for Vina or for AutoDock-GPU

2. RDKit molecule from docking results

3. Initializing MoleculePreparation from a dictionary (or JSON)

Possibly useful configurations of MoleculePreparation

Docking covalent ligands as flexible sidechains

Reactive Docking

1. Prepare protein with waterkit

2. Prepare protein pdbqt

3. Run autogrid

4. Write ligand PDBQT

5. Configure AD-GPU for reactive docking

Project details

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

File details

File metadata

File hashes