Development Status
- 5 - Production/Stable
Intended Audience
- Science/Research
License
- OSI Approved :: GNU General Public License v3 (GPLv3)
Natural Language
- English
Operating System
- OS Independent
Programming Language
Topic
- Scientific/Engineering :: Bio-Informatics

Project description

EnzymeStructuralFiltering

A structural filtering pipeline that uses docking and active-site heuristics to prioritize ML-predicted enzyme variants for experimental validation. The tool processes superimposed ligand poses and filters them by geometric criteria such as distances and angles and, optionally, by esterase-specific filters or proximity to a nucleophile.
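
To make the idea of a geometric criterion concrete, here is a minimal sketch of a distance-and-angle check on three atom positions. It is not the pipeline's own code: the function names, the coordinates, and the 4.0 Å / 80° thresholds are all illustrative assumptions.

```python
import math

def distance(a, b):
    """Euclidean distance between two 3D points (same units as the input, e.g. angstroms)."""
    return math.dist(a, b)

def angle_deg(a, b, c):
    """Angle in degrees at vertex b, formed by the points a-b-c."""
    ba = [a[i] - b[i] for i in range(3)]
    bc = [c[i] - b[i] for i in range(3)]
    dot = sum(x * y for x, y in zip(ba, bc))
    norm = math.hypot(*ba) * math.hypot(*bc)
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    cos_theta = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cos_theta))

def passes_filter(nucleophile, target, neighbor, max_dist=4.0, min_angle=80.0):
    """Hypothetical filter: keep a pose if the nucleophile is close enough to the
    target atom and approaches it at a wide enough angle. Thresholds are made up."""
    close_enough = distance(nucleophile, target) <= max_dist
    wide_enough = angle_deg(nucleophile, target, neighbor) >= min_angle
    return close_enough and wide_enough

# Made-up coordinates for illustration only
ser_og = (0.0, 0.0, 0.0)       # e.g. a serine side-chain oxygen
carbonyl_c = (3.0, 0.0, 0.0)   # e.g. the ester carbonyl carbon of the ligand
carbonyl_o = (3.0, 1.2, 0.0)   # e.g. the carbonyl oxygen

print(distance(ser_og, carbonyl_c))
print(angle_deg(ser_og, carbonyl_c, carbonyl_o))
print(passes_filter(ser_og, carbonyl_c, carbonyl_o))
```

A pose with the nucleophile 10 Å from the target atom would be rejected by the same check, whatever the angle.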

🚀 Features

Analysis of enzyme-ligand docking using multiple docking tools (ML- and physics-based).
Optional esterase or nucleophile-focused analysis.
User-friendly pipeline that needs only a .pkl file and ligand SMILES strings as input.
Different parts of the pipeline can be run independently of each other.

📦 Installation

Option 1: Install via pip

pip install enzyme-filtering-pipeline

Option 2: Clone the repository

git clone https://github.com/HelenSchmid/EnzymeStructuralFiltering.git
cd EnzymeStructuralFiltering
pip install .

🌱 Environment Setup

Using conda

conda env create -f environment.yml
conda activate filterpipeline

🔧 Usage Example

import pandas as pd
from filtering_pipeline.pipeline import Pipeline

# Load the input table and keep the first five rows for a quick test run
df = pd.read_pickle("DEHP-MEHP.pkl").head(5)

pipeline = Pipeline(
    df=df,
    ligand_name="DEHP",  # ligand label; the SMILES below is DEHP
    ligand_smiles="CCCCC(CC)COC(=O)C1=CC=CC=C1C(=O)OCC(CC)CCCC",  # SMILES string of the ligand
    smarts_pattern='[$([CX3](=O)[OX2H0][#6])]',  # SMARTS pattern of the ligand's chemical moiety of interest
    max_matches=1000,
    esterase=1,
    find_closest_nuc=1,
    num_threads=1,
    squidly_dir='filtering_pipeline/squidly_final_models/',
    base_output_dir="pipeline_output",
)

pipeline.run()

Running the pipeline on multiple ligands in one batch

You can run the filtering pipeline for multiple ligands by using a simple Bash script that passes ligand names and their SMILES strings to a Python runner script.

#!/bin/bash

# Define ligands and their SMILES representations
declare -A LIGANDS
LIGANDS["tri_2_chloroethylPi"]="C(CCl)OP(=O)(OCCCl)OCCCl"
LIGANDS["DEHP"]="CCCCC(CC)COC(=O)C1=CC=CC=C1C(=O)OCC(CC)CCCC"
LIGANDS["TPP"]="C1=CC=C(C=C1)OP(=O)(OC2=CC=CC=C2)OC3=CC=CC=C3"

# Create logs directory
mkdir -p logs

# Loop over each ligand and run the pipeline
for name in "${!LIGANDS[@]}"
do
  echo "Running for $name..."

  python benchmark_filtering_on_exp_tested_variants_run.py "$name" "${LIGANDS[$name]}" \
    2> "logs/${name}.err" \
    1> "logs/${name}.out"

  echo "Finished $name. Logs saved to logs/${name}.out and logs/${name}.err"
done

Each run invokes benchmark_filtering_on_exp_tested_variants_run.py, which looks like:

import argparse
import pandas as pd
from filtering_pipeline.pipeline import Pipeline

# SMARTS patterns to define substructures per ligand
SMARTS_MAP = {
    "TPP": "[P](=O)(O)(O)",
    "DEHP": "[C](=O)[O][C]",
    "Monuron": "Cl",
}

def parse_args():
    parser = argparse.ArgumentParser()
    parser.add_argument("ligand_name", type=str, help="Ligand name (e.g. TPP)")
    parser.add_argument("ligand_smiles", type=str, help="SMILES string of the ligand")
    return parser.parse_args()

def main():
    args = parse_args()

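    # .get() returns None when the ligand name has no entry in SMARTS_MAP
    # (e.g. "tri_2_chloroethylPi" from the Bash script above)
    smarts_pattern = SMARTS_MAP.get(args.ligand_name)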

    pipeline = Pipeline(
        df=pd.read_pickle("examples/DEHP-MEHP.pkl").head(2),
        ligand_name=args.ligand_name,
        ligand_smiles=args.ligand_smiles,
        smarts_pattern=smarts_pattern,
        max_matches=5000,
        find_closest_nuc=1,
        num_threads=1,
        squidly_dir="filtering_pipeline/squidly_final_models/",
        base_output_dir=f"pipeline_output_{args.ligand_name}",
    )

    pipeline.run()

if __name__ == "__main__":
    main()
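
The Bash script passes the ligand name "tri_2_chloroethylPi", which has no entry in SMARTS_MAP, so the runner hands `smarts_pattern=None` to the pipeline for that ligand. How `Pipeline` treats a missing pattern is not documented here. If you would rather stop early with a clear message, a lookup helper along these lines is one option. `resolve_smarts` is a hypothetical name and not part of the package.

```python
# SMARTS patterns copied from the runner script above
SMARTS_MAP = {
    "TPP": "[P](=O)(O)(O)",
    "DEHP": "[C](=O)[O][C]",
    "Monuron": "Cl",
}

def resolve_smarts(ligand_name, smarts_map=SMARTS_MAP):
    """Return the SMARTS pattern for ligand_name, or raise a descriptive error.

    Hypothetical helper: assumes a missing pattern should be treated as a
    configuration mistake rather than passed on as None.
    """
    try:
        return smarts_map[ligand_name]
    except KeyError:
        known = ", ".join(sorted(smarts_map))
        raise ValueError(
            f"No SMARTS pattern for ligand {ligand_name!r}; known ligands: {known}"
        ) from None

print(resolve_smarts("DEHP"))

try:
    resolve_smarts("tri_2_chloroethylPi")
except ValueError as err:
    print(err)
```

In the runner, `SMARTS_MAP.get(args.ligand_name)` could then be swapped for `resolve_smarts(args.ligand_name)`, so the error ends up in that ligand's `.err` log.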

Release history

- 0.0.41: Aug 3, 2025
- 0.0.39: Aug 2, 2025
- 0.0.38: Aug 2, 2025
- 0.0.37: Aug 2, 2025
- 0.0.36: Aug 2, 2025
- 0.0.35: Aug 2, 2025
- 0.0.34: Jul 31, 2025
- 0.0.33: Jul 31, 2025
- 0.0.32: Jul 30, 2025
- 0.0.31: Jul 30, 2025
- 0.0.5 (this version): Aug 4, 2025
- 0.0.4: Aug 3, 2025
- 0.0.3: Jul 28, 2025

Download files

Source Distribution

enzyme_filtering_pipeline-0.0.5.tar.gz (38.1 kB)

Uploaded Aug 4, 2025 Source

Built Distribution

enzyme_filtering_pipeline-0.0.5-py3-none-any.whl (52.9 kB)

Uploaded Aug 4, 2025 Python 3

File details

Details for the file enzyme_filtering_pipeline-0.0.5.tar.gz.

File metadata

Download URL: enzyme_filtering_pipeline-0.0.5.tar.gz
Upload date: Aug 4, 2025
Size: 38.1 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.11.8

File hashes

Hashes for enzyme_filtering_pipeline-0.0.5.tar.gz
Algorithm	Hash digest
SHA256	`df6f0ee2bfe535d06448a3ac7abb545b61e7d84394751e7d4fb5dbe0102b74a7`
MD5	`5a0ce1b8306bbb18a976d0bebfb14828`
BLAKE2b-256	`d15add969d8388db830ae5af7167c9179153ed70089bcd30004f6ec9fbcadc50`

File details

Details for the file enzyme_filtering_pipeline-0.0.5-py3-none-any.whl.

File metadata

Download URL: enzyme_filtering_pipeline-0.0.5-py3-none-any.whl
Upload date: Aug 4, 2025
Size: 52.9 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.11.8

File hashes

Hashes for enzyme_filtering_pipeline-0.0.5-py3-none-any.whl
Algorithm	Hash digest
SHA256	`f72b88086225e8fdbdda85e6ac71cedbdb43db57c09e01963e19dbd3a461414f`
MD5	`176c1fa59db7c6b79a3d0f2e80558c52`
BLAKE2b-256	`8a919db30f50387654d7e1ca32b060ac0db5ed1980877afd214c40be3073562a`
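
The digests listed above can be used to check a downloaded file before installing it. Below is a minimal sketch using only Python's standard `hashlib`. The helper names `sha256_of` and `matches_expected` are illustrative and not part of any tool.

```python
import hashlib

def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA256 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def matches_expected(path, expected_hex):
    """Compare a file's SHA256 digest with an expected hex string (case-insensitive)."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For example, once the source distribution is in the working directory, `matches_expected("enzyme_filtering_pipeline-0.0.5.tar.gz", "df6f0ee2bfe535d06448a3ac7abb545b61e7d84394751e7d4fb5dbe0102b74a7")` should return True if the file is intact.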
