
A tool for mapping surface pockets in PDB structures


PocketMapper
============

PocketMapper is a command-line tool that fetches protein structures, computes PISA-derived
pockets, extracts atomic coordinates from mmCIF files, performs local or Foldseek alignments,
compares pockets across structures, and writes the results to disk. It is intended for
comparative analysis of binding pockets between query and target protein chains.

Features
- Download and cache mmCIF files
- Preprocess and split mmCIF files using gemmi
- Retrieve PISA interface/pocket information and store pocket data
- Extract CA coordinates from divided structures
- Perform local alignments or Foldseek-based alignments
- Compare pockets using alignment and substitution scoring (BLOSUM62)
- Save tabular results and auxiliary JSON files to a results directory

Requirements
- Python 3.8+
- pandas
- gemmi
- pisa (project-specific downloader wrapper used by this package)
- foldseek (optional, required only when using Foldseek alignment)
- Project modules: lib (helper functions) and local_aligner.py (LocalAligner class)
- fire (provides the command-line interface)

Installation
1. Clone the repository (or copy the project into your workspace).
2. Create a virtualenv and install dependencies:
    python -m venv .venv
    source .venv/bin/activate
    pip install -r requirements.txt

3. Ensure external tools are available:
    - foldseek (only if using Foldseek alignment): must be installed and available on PATH
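
Before a Foldseek run it can help to confirm that the binary is actually visible on PATH. The following is a minimal sketch using only the Python standard library; the function name find_foldseek is hypothetical and not part of PocketMapper.

```python
import shutil


def find_foldseek(binary_name="foldseek"):
    """Return the full path of the executable, or None if it is not on PATH."""
    return shutil.which(binary_name)


if __name__ == "__main__":
    path = find_foldseek()
    if path is None:
        print("foldseek not found on PATH; Foldseek alignments will not be available")
    else:
        print(f"foldseek found at {path}")
```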

Quick start / Usage
- Basic local alignment run for a single pair:
  pocketmapper search --query 1ABC_A_B --target 2XYZ_C_D --results_dir ./out

- Batch mode using files with one PDB_CHAIN_CHAIN per line:
  pocketmapper search --query queries.txt --target targets.txt --settings config.json

- Use bundled Foldseek DB:
  pocketmapper search --query 1ABC_A_B --target ted --foldseek True --results_dir ./out_fs
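
The three input forms used above (a single PDB_CHAIN_CHAIN, a file with one identifier per line, or 'ted') can be told apart roughly as follows. This is an illustrative sketch, not PocketMapper's own code: the function names and the identifier pattern (a 4-character PDB ID followed by two chain IDs) are assumptions based on the examples in this README.

```python
import os
import re

# Assumed shape: 4-character PDB ID, then two chain IDs, joined by underscores.
ID_PATTERN = re.compile(r"^[0-9][A-Za-z0-9]{3}_[A-Za-z0-9]+_[A-Za-z0-9]+$")


def classify_input(value):
    """Classify a --query/--target value as 'foldseek_db', 'file' or 'single'."""
    if value == "ted":
        return "foldseek_db"
    if os.path.isfile(value):
        return "file"
    if ID_PATTERN.match(value):
        return "single"
    raise ValueError(f"Not a PDB_CHAIN_CHAIN identifier or readable file: {value!r}")


def parse_identifier(identifier):
    """Split 'PDB_CHAIN_CHAIN' into (pdb_id, first_chain, second_chain)."""
    pdb_id, chain_1, chain_2 = identifier.split("_")
    return pdb_id, chain_1, chain_2


def read_identifiers(path):
    """Read one identifier per line, skipping blank lines."""
    with open(path) as handle:
        return [line.strip() for line in handle if line.strip()]
```

For example, classify_input("1ABC_A_B") returns "single" and parse_identifier("2XYZ_C_D") returns ("2XYZ", "C", "D").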

Options
- --query: Query identifier or path. Accepts a single PDB_CHAIN_CHAIN (e.g. 1ABC_A_B) or a file with one per line.
- --target: Target identifier or path. Accepts single PDB_CHAIN_CHAIN, file, or 'ted' for bundled Foldseek DB.
- --settings: Path to JSON settings file. CLI args override settings file.
- --cache_dir: Directory for caching downloaded or intermediate files.
- --results_dir: Directory to write results and temporary divided structures.
- --verbose / --debug: Increase log verbosity.
- --foldseek: If true, run Foldseek alignments (requires foldseek binary and appropriate DB).
- --pisa_pockets: Whether to retrieve PISA pockets (default: true).
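
The precedence rule "CLI args override settings file" can be pictured as a three-layer merge: defaults, then the settings file, then any CLI value that was actually supplied. The sketch below is an assumption about how such a merge might look, not PocketMapper's implementation; resolve_settings is a hypothetical name, and the foldseek default of False is assumed (only the pisa_pockets default is stated above).

```python
import json

# pisa_pockets defaults to true per the option list; the foldseek default is an assumption.
DEFAULTS = {"pisa_pockets": True, "foldseek": False}


def resolve_settings(cli_args, settings_path=None):
    """Merge defaults, the settings file and CLI arguments, in increasing priority."""
    settings = dict(DEFAULTS)
    if settings_path is not None:
        with open(settings_path) as handle:
            settings.update(json.load(handle))
    # Only CLI values that were actually given (not None) override the file.
    settings.update({key: value for key, value in cli_args.items() if value is not None})
    return settings
```

With a settings file containing {"results_dir": "./from_file"}, calling resolve_settings({"results_dir": "./out"}, path) yields results_dir "./out", while keys absent from the CLI keep their file or default values.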

Configuration (settings JSON)
The settings JSON may include keys such as:
- cache_dir, structure_dir, pocket_dir, pisa_dir, divided_struct_dir
- results_dir, query_dir, target_dir, alignment_path, pocket_comparison_path
- foldseek (bool), pisa_pockets (bool), structure (bool)
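
As an illustration, a small settings file using a few of the keys listed above could be generated like this. The key names come from the list; the values and the helper name write_settings are placeholders chosen for the example.

```python
import json

# Key names are taken from the list above; the values are placeholders.
example_settings = {
    "cache_dir": "./cache",
    "results_dir": "./out",
    "foldseek": False,
    "pisa_pockets": True,
}


def write_settings(path, settings):
    """Write a settings dictionary to disk as indented JSON."""
    with open(path, "w") as handle:
        json.dump(settings, handle, indent=2)
        handle.write("\n")
```

Calling write_settings("config.json", example_settings) produces a file that can be passed to --settings.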

Outputs
- alignment.tsv: Alignment report (Foldseek or local aligner)
- pocket_comparison.tsv: Final pocket comparison table
- PISA pocket data and intermediate JSON snapshots under pisa_dir
- unknown_ids.json (if unknown Foldseek aliases are encountered)
- Divided mmCIF files and temporary directories under results_dir
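
The .tsv outputs can be inspected with any tab-separated reader. The sketch below uses the standard csv module and assumes the files start with a header row; the column names are not documented here, so the code reads them from the file instead of hard-coding any. The function name load_tsv is hypothetical.

```python
import csv


def load_tsv(path):
    """Return (column_names, rows); each row is a dict keyed by column name."""
    with open(path, newline="") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        rows = list(reader)
        # fieldnames is None for an empty file, so fall back to an empty list.
        columns = list(reader.fieldnames or [])
    return columns, rows
```

For example, columns, rows = load_tsv("./out/pocket_comparison.tsv") gives the header and the table rows of a finished run. Since pandas is already a requirement, pandas.read_csv(path, sep="\t") is an equivalent alternative.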

Design / Workflow
1. Parse CLI args and settings
2. Determine the type of each query/target input (single PDB_CHAIN_CHAIN, file, or Foldseek DB)
3. Fetch mmCIF structures to cache (structure_dir)
4. Preprocess and divide structures (gemmi) into per-domain files (divided_struct_dir)
5. Retrieve PISA interface data and compute pockets (pisa)
6. Extract CA coordinates from divided mmCIFs
7. Perform alignment (local or Foldseek)
8. Compare pockets using alignments and BLOSUM62 scoring
9. Save results and clean up temporary directories
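
To make step 8 more concrete, here is one way pocket residues could be carried through a pairwise alignment and scored with a substitution matrix. It is a sketch under stated assumptions, not PocketMapper's scoring: the function names are hypothetical, pocket positions are assumed to be 0-based indices into the ungapped query sequence, and only a small excerpt of BLOSUM62 is included, so pairs outside the excerpt raise KeyError.

```python
# Small excerpt of BLOSUM62 (the matrix is symmetric); a real run needs all 20x20 entries.
BLOSUM62_EXCERPT = {
    ("A", "A"): 4, ("L", "L"): 4, ("I", "I"): 4, ("V", "V"): 4,
    ("I", "L"): 2, ("I", "V"): 3, ("L", "V"): 1,
    ("D", "D"): 6, ("E", "E"): 5, ("D", "E"): 2,
    ("K", "K"): 5, ("R", "R"): 5, ("K", "R"): 2,
    ("G", "G"): 6,
}


def substitution_score(a, b, matrix=BLOSUM62_EXCERPT):
    """Look up a symmetric substitution score; raises KeyError for unknown pairs."""
    if (a, b) in matrix:
        return matrix[(a, b)]
    return matrix[(b, a)]


def map_pocket(aligned_query, aligned_target, pocket_positions):
    """Map query pocket positions to (target_position, query_res, target_res)."""
    if len(aligned_query) != len(aligned_target):
        raise ValueError("aligned sequences must have equal length")
    pocket = set(pocket_positions)
    mapping = {}
    q_pos = t_pos = 0
    for q_res, t_res in zip(aligned_query, aligned_target):
        # A pocket residue is mapped only when it faces a residue, not a gap.
        if q_res != "-" and t_res != "-" and q_pos in pocket:
            mapping[q_pos] = (t_pos, q_res, t_res)
        if q_res != "-":
            q_pos += 1
        if t_res != "-":
            t_pos += 1
    return mapping


def pocket_score(aligned_query, aligned_target, pocket_positions):
    """Sum substitution scores over pocket residues that align to a target residue."""
    mapping = map_pocket(aligned_query, aligned_target, pocket_positions)
    return sum(substitution_score(q, t) for _, q, t in mapping.values())
```

For the aligned pair "ALK-DG" / "AIRE-G" with pocket positions [1, 2, 3, 4], the residues L, K and G map to I, R and G (the D faces a gap and is skipped), giving a score of 2 + 2 + 6 = 10.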

Extending or Debugging
- Increase verbosity with --verbose or --debug to get more details in logs.
- The library 'lib' contains helper functions for fetching mmCIFs, preprocessing and comparing pockets.
- Local alignment logic is in local_aligner.py (LocalAligner).
- By default, logs are written to test.log in the current working directory.
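
The algorithm and scoring used by LocalAligner are not described in this README. As orientation for readers of local_aligner.py, the following is a textbook Smith-Waterman local alignment with a simple match/mismatch/linear-gap scheme; the function name and the scoring parameters are assumptions for illustration only.

```python
def smith_waterman(seq_a, seq_b, match=2, mismatch=-1, gap=-2):
    """Return (best_score, aligned_a, aligned_b) for the best local alignment."""
    rows, cols = len(seq_a) + 1, len(seq_b) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    def pair(i, j):
        # Score for aligning seq_a[i-1] with seq_b[j-1].
        return match if seq_a[i - 1] == seq_b[j - 1] else mismatch

    # Fill the matrix; scores are floored at 0 so an alignment can start anywhere.
    for i in range(1, rows):
        for j in range(1, cols):
            score[i][j] = max(
                0,
                score[i - 1][j - 1] + pair(i, j),
                score[i - 1][j] + gap,
                score[i][j - 1] + gap,
            )
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Trace back from the best cell until a zero-score cell is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and score[i][j] > 0:
        if score[i][j] == score[i - 1][j - 1] + pair(i, j):
            out_a.append(seq_a[i - 1])
            out_b.append(seq_b[j - 1])
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            out_a.append(seq_a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(seq_b[j - 1])
            j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))
```

For example, smith_waterman("XXACDEFYY", "ZZACDEFWW") returns (10, "ACDEF", "ACDEF"): only the shared core is aligned, and the unrelated flanks are left out.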

License
- Add your preferred license information here.

Contributing
- Report issues or open pull requests against the repository.
- Add tests for new functionality and keep changes small and focused.

Contact / Authors
- See project repository for maintainer and contributor information.
