A package for extracting ligands and binding pockets from PDB files
Pocket Extraction
Pocket Extraction is a Python package for extracting ligands and binding pockets from PDB files. It builds on Biopython and RDKit to process molecular structures flexibly and efficiently.
Features ✨
- Extract Binding Pockets: Identify pockets around ligands using coordinates, ligand files, or custom radii.
- Extract Ligands: Retrieve ligands by name, write multiple ligands to separate files, or extract all HETATM residues (excluding solvents and ions).
- Multi-Format Support:
• Input: PDB, SDF, MOL2 (ligand files).
• Output: PDB (default), SDF, MOL2.
- Advanced Filtering: Select by model ID, chain ID, or ligand names.
- Batch Processing: Extract individual pockets for multiple ligands in one command.
Installation
pip install pocket_extraction
Quick Start 🚀
1. Extract Ligand and Its Pocket (CLI)
extract_ligand_and_pocket input.pdb \
    -l ligand.pdb \
    -p pocket.pdb \
    --ligand_names ATP \
    --radius 10.0
2. Extract All Ligands with Individual Pockets (CLI)
extract_ligand_and_pocket input.pdb \
    -l ligands/ \
    -p pockets/ \
    --multi_ligand
3. Python API Example
from pocket_extraction import extract_ligand, extract_pocket
# Extract ligand "HEM" from Chain B
extract_ligand("input.pdb", "heme.pdb",
               ligand_names=["HEM"], chain_id="B")
# Extract pocket around a manually defined center
extract_pocket("input.pdb", "pocket.pdb",
               ligand_center=[15.3, 24.7, 32.1],
               radius=8.5)
Usage Guide
🔍 Extracting Binding Pockets
Method 1: Ligand File (SDF/MOL2/PDB)
Calculate pocket from ligand structure
CLI:
extract_pocket input.pdb --ligand_file ligand.sdf -o pocket.pdb --radius 12.5
Python:
from pocket_extraction import extract_pocket, get_ligand_coords
ligand_coords = get_ligand_coords("ligand.mol2")
extract_pocket("input.pdb", "pocket.pdb",
               ligand_coords=ligand_coords,
               radius=12.5)
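To make the `radius` parameter concrete, here is a minimal standard-library sketch of distance-based pocket selection. It assumes a residue belongs to the pocket when any of its atoms lies within `radius` Å of any ligand atom; the package's exact criterion is not stated here, so `select_pocket_residues` and its input layout are hypothetical illustrations, not the package's implementation.

```python
import math


def select_pocket_residues(protein_atoms, ligand_coords, radius=10.0):
    """Return residue IDs that have at least one atom within `radius` of a ligand atom.

    protein_atoms: iterable of (residue_id, (x, y, z)) pairs (hypothetical layout).
    ligand_coords: iterable of (x, y, z) tuples.
    """
    ligand_coords = list(ligand_coords)
    selected = []
    for residue_id, coord in protein_atoms:
        if residue_id in selected:
            continue  # residue was already accepted via another atom
        if any(math.dist(coord, lig) <= radius for lig in ligand_coords):
            selected.append(residue_id)
    return selected


# Toy data: two atoms of residue A10, one atom each of A11 and A42.
protein_atoms = [
    (("A", 10), (0.0, 0.0, 0.0)),
    (("A", 10), (1.5, 0.0, 0.0)),
    (("A", 11), (5.0, 0.0, 0.0)),
    (("A", 42), (30.0, 0.0, 0.0)),
]
ligand_coords = [(2.0, 0.0, 0.0), (3.0, 0.0, 0.0)]

pocket = select_pocket_residues(protein_atoms, ligand_coords, radius=4.0)
print(pocket)
```

Under this assumption, a larger `radius` can only add residues to the selection, which is a useful sanity check when comparing pockets extracted with different radii.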
Method 2: Manual Coordinates
Specify exact pocket center
CLI:
extract_pocket input.pdb --ligand_center 10.0 20.0 30.0 -o pocket.pdb
Python:
extract_pocket("input.pdb", "pocket.pdb",
               ligand_center=[10.0, 20.0, 30.0],
               radius=10.0)  # Default radius
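If you have a list of ligand atom coordinates rather than a ready-made center, one common choice is the arithmetic mean of the atoms. The sketch below shows that calculation; `centroid` is a hypothetical helper, not part of the package, and whether the package itself uses the geometric center internally is not stated here.

```python
def centroid(coords):
    """Arithmetic mean of a non-empty collection of (x, y, z) coordinates."""
    coords = list(coords)
    if not coords:
        raise ValueError("at least one coordinate is required")
    n = len(coords)
    return [sum(c[i] for c in coords) / n for i in range(3)]


center = centroid([(10.0, 20.0, 30.0), (12.0, 22.0, 28.0), (8.0, 18.0, 32.0)])
print(center)  # [10.0, 20.0, 30.0]
```

The resulting list has the same shape as the `ligand_center` argument used in the example above.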
⚗️ Extracting Ligands
Case 1: Specific Ligand by Name
CLI:
extract_ligand input.pdb -o nad.pdb --ligand_names NAD
Python:
extract_ligand("input.pdb", "nad.pdb",
               ligand_names=["NAD"],
               model_id=0,  # First model
               chain_id="A")
Case 2: Multiple Ligands Separately
CLI:
extract_ligand input.pdb -o ligands/ --ligand_names ATP NAD --multi_ligand
Outputs: ligands/ligand_1.pdb, ligands/ligand_2.pdb
Python:
extract_ligand("input.pdb", "ligands/",
               ligand_names=["ATP", "NAD"],
               multi_ligand=True)
Case 3: All Non-Solvent HETATM Residues
CLI:
extract_ligand input.pdb -o all_ligands.pdb
Python:
extract_ligand("input.pdb", "all_ligands.pdb")
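To preview which residues this mode is likely to pick up, the following standard-library sketch lists the HETATM residues in PDB text while skipping common solvents and ions. The `EXCLUDED` set is an assumption made for illustration; the package's actual exclusion list is not documented here. `hetatm_line` and `list_ligand_residues` are likewise hypothetical helpers.

```python
# Hypothetical exclusion list; the package's real list may differ.
EXCLUDED = {"HOH", "WAT", "NA", "CL", "MG", "ZN", "CA", "K"}


def hetatm_line(serial, name, resname, chain, resseq, x, y, z):
    """Format a minimal HETATM record using the fixed PDB column layout."""
    return (f"HETATM{serial:>5} {name:<4} {resname:>3} {chain}{resseq:>4}    "
            f"{x:8.3f}{y:8.3f}{z:8.3f}")


def list_ligand_residues(pdb_text, excluded=EXCLUDED):
    """Return unique (resname, chain, resseq) tuples for non-excluded HETATM records."""
    seen = []
    for line in pdb_text.splitlines():
        if not line.startswith("HETATM"):
            continue
        resname = line[17:20].strip()  # columns 18-20: residue name
        chain = line[21]               # column 22: chain ID
        resseq = int(line[22:26])      # columns 23-26: residue number
        key = (resname, chain, resseq)
        if resname not in excluded and key not in seen:
            seen.append(key)
    return seen


pdb_text = "\n".join([
    hetatm_line(1, "PG", "ATP", "A", 501, 10.0, 20.0, 30.0),
    hetatm_line(2, "PB", "ATP", "A", 501, 11.0, 20.0, 30.0),
    hetatm_line(3, "O", "HOH", "A", 601, 5.0, 5.0, 5.0),
    hetatm_line(4, "NA", "NA", "A", 701, 7.0, 7.0, 7.0),
    hetatm_line(5, "PA", "NAD", "B", 502, 40.0, 41.0, 42.0),
])

ligands = list_ligand_residues(pdb_text)
print(ligands)
```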
Extracting Ligands and Pockets Together
Extract ligands and their binding pockets in a single step. Three modes are available, depending on your use case:
Case 1: Merged Multi-Residue Ligands with Unified Pocket
Combine fragmented ligand residues into a single structure and extract their shared binding pocket.
Use Cases:
• Large/complex ligands (e.g., ATP, NADH) split across multiple residues in PDB files
• Metal-cofactor systems where ligands consist of multiple coordinated residues
• Cryo-EM/X-ray structures with discontinuous ligand density assignments
CLI:
extract_ligand_and_pocket input.pdb \
    -l ligand.pdb \
    -p pocket.pdb \
    --ligand_names HIS ARG \
    --model_id 0 \
    --chain_id E \
    --radius 12.0
Python:
from pocket_extraction import extract_ligand_and_pocket
extract_ligand_and_pocket(
    pdb_path="input.pdb",
    ligand_output="ligand.pdb",
    pocket_output="pocket.pdb",
    ligand_names=["HIS", "ARG"],  # same residues as the CLI example
    model_id=0,
    chain_id="E",
    radius=12.0
)
Case 2: Individual Pockets for Each Ligand
Extract ligands and pockets into separate files.
Use Case: Compare binding environments of distinct ligands.
CLI:
extract_ligand_and_pocket input.pdb \
    -l ligands/ \
    -p pockets/ \
    --ligand_names ATP NAD \
    --multi_ligand \
    --radius 10.0
Python:
extract_ligand_and_pocket(
    pdb_path="input.pdb",
    ligand_output="ligands/",
    pocket_output="pockets/",
    ligand_names=["ATP", "NAD"],
    multi_ligand=True,
    radius=10.0
)
Case 3: Extract All Ligands & Pockets
Automatically process all non-solvent HETATM residues.
Use Case: High-throughput screening of unknown ligands.
CLI:
extract_ligand_and_pocket input.pdb \
    -l auto_ligands/ \
    -p auto_pockets/ \
    --multi_ligand \
    --radius 10.0
Python:
extract_ligand_and_pocket(
    pdb_path="input.pdb",
    ligand_output="auto_ligands/",
    pocket_output="auto_pockets/",
    multi_ligand=True,
    radius=10.0
)
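For high-throughput use across many structures, one option is to generate one command per PDB file. The sketch below only builds the argument lists, using the CLI flags shown above. The per-structure directory layout and the `build_command` helper are assumptions for illustration, not something the package prescribes.

```python
from pathlib import Path


def build_command(pdb_path, out_root, radius=10.0):
    """Build an `extract_ligand_and_pocket` argument list for one PDB file."""
    pdb_path = Path(pdb_path)
    # Hypothetical layout: one output folder per structure, named after the file stem.
    out_dir = Path(out_root) / pdb_path.stem
    return [
        "extract_ligand_and_pocket", pdb_path.as_posix(),
        "-l", (out_dir / "ligands").as_posix() + "/",
        "-p", (out_dir / "pockets").as_posix() + "/",
        "--multi_ligand",
        "--radius", str(radius),
    ]


cmd = build_command("structures/1abc.pdb", "results")
print(" ".join(cmd))
```

Once the package is installed, each list can be passed to `subprocess.run(cmd, check=True)`; creating the output directories beforehand may be necessary, since it is not stated here whether the tool creates them.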
License
MIT License. See LICENSE for details.
Author
Hanker Wu
📧 GitHub: HankerWu
💬 For bug reports or feature requests, please open a GitHub issue.