Convert a protein 3D structure BioUnit file ('.pdb1') to a standard 3D structure file ('.pdb').

Project description

pdb1topdb

License: MIT

A Python package to convert a protein 3D structure PDB BioUnit file (such as .pdb1 or .pdb12) to a standard 3D structure PDB file (.pdb).

Installation

Install the pdb1topdb package with pip:

pip install pdb1topdb

CLI usage

Use as a command-line interface (CLI):

pdb1topdb --help
pdb1topdb ./test_data/5ceg.pdb1 -o ./test_data/5ceg_biounit1.pdb
pdb1topdb ./test_data/5ceg.pdb1 # you can omit output path (will be derived)

Or convert all BioUnit files in a directory with a single command:

pdb1topdb-folder --help
pdb1topdb-folder ./test_data/ -o ./test_data/

Python usage

Use directly in Python:

# Import
from pdb1topdb import pdb1topdb, read_chains_mapping

# Convert .pdb1 -> .pdb
pdb_str, metadata = pdb1topdb(
    pdb1_path="./test_data/5ceg.pdb1",
    pdb_path="./test_data/5ceg_biounit1.pdb",
    remove_initial_biounit_file=False, # optional
    metadata_path=None, # optional
    verbose=True, # optional
)

# Read chains mapping
mapping = read_chains_mapping(pdb_path="./test_data/5ceg_biounit1.pdb")

Description

Why? For structures solved by X-ray crystallography, the Protein Data Bank distributes the asymmetric unit (the smallest part of the crystal from which the whole crystal can be generated by symmetry operations) as a .pdb or .cif file, and the biological assembly (the biologically relevant complex of chains) as a BioUnit file such as .pdb1 or .pdb12. BioUnit files can be more biologically relevant, but many structural bioinformatics tools expect a standard .pdb file, so conversion is useful.

How? pdb1topdb maps every chain from every model in the BioUnit to a unique chain ID in the output PDB. For example, if the BioUnit contains:

  • Model 1: chains A, B
  • Model 2: chains A, B

then pdb1topdb uses the following mapping:

  • (Model 1, A) → A, (Model 1, B) → B
  • (Model 2, A) → C, (Model 2, B) → D

It merges coordinates into a single model and updates chain-related metadata lines such as SEQRES, HELIX, and LINK. The chain mapping is injected in the header as REMARK 0.
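To make the merging step concrete, here is a minimal sketch of how coordinate records from several models could be flattened into one model once a chain mapping is known. It is not the package's actual implementation: the function name merge_models is hypothetical, it only rewrites the chain ID column (column 22) of ATOM, HETATM, ANISOU and TER records, and it does not renumber atom serials or update SEQRES, HELIX or LINK lines.

```python
def merge_models(lines, mapping):
    """Flatten multi-model PDB lines into a single model (illustrative sketch).

    lines   -- iterable of PDB lines without trailing newlines
    mapping -- dict {(model_number, old_chain_id): new_chain_id}
    """
    coordinate_records = ("ATOM", "HETATM", "ANISOU", "TER")
    merged = []
    model = 1  # assumption: a file without MODEL records is treated as model 1
    for line in lines:
        record = line[:6].strip()
        if record == "MODEL":
            # Remember which model we are in, then drop the MODEL line.
            model = int(line[6:].split()[0])
            continue
        if record == "ENDMDL":
            # Drop the model terminator so all atoms end up in one model.
            continue
        if record in coordinate_records and len(line) > 21:
            old_chain = line[21]  # chain ID is column 22 (index 21)
            new_chain = mapping.get((model, old_chain), old_chain)
            line = line[:21] + new_chain + line[22:]
        merged.append(line)
    return merged
```

Given a mapping such as {(1, "A"): "A", (2, "A"): "B"}, the atoms of chain A in model 2 would be written out as chain B in the merged output.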

Technical Notes

PDB chain IDs are a single character, so output is limited by the available alphabet.

The default alphabet is: ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789

Mapping rules:

  • preserve a chain ID if it is in the alphabet and not already used
  • otherwise assign the next unused character from the alphabet

This means the default conversion can handle up to 62 unique chain IDs. You can override the default alphabet with the chains_alphabet argument, but nonstandard chain IDs may cause compatibility issues with other software.
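The mapping rules above can be sketched in a few lines of Python. This is an illustration written from the rules as stated, not the package's own code: the function name assign_chain_ids and its input format (a list of (model, chain) pairs in file order) are assumptions made for the example.

```python
DEFAULT_ALPHABET = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789"
)


def assign_chain_ids(model_chains, alphabet=DEFAULT_ALPHABET):
    """Map each (model, chain) pair to a unique single-character chain ID.

    model_chains -- list of (model_number, chain_id) pairs in file order
    alphabet     -- characters allowed as output chain IDs
    """
    mapping = {}
    used = set()
    for key in model_chains:
        _, chain = key
        if chain in alphabet and chain not in used:
            # Rule 1: preserve the chain ID when it is still free.
            new_id = chain
        else:
            # Rule 2: otherwise take the next unused character.
            new_id = next((c for c in alphabet if c not in used), None)
            if new_id is None:
                raise ValueError(
                    f"more than {len(alphabet)} chains: alphabet exhausted"
                )
        mapping[key] = new_id
        used.add(new_id)
    return mapping


example = assign_chain_ids([(1, "A"), (1, "B"), (2, "A"), (2, "B")])
print(example)
# {(1, 'A'): 'A', (1, 'B'): 'B', (2, 'A'): 'C', (2, 'B'): 'D'}
```

With the default alphabet the sketch raises an error on the 63rd chain, which mirrors the 62-chain limit described above.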

Download files

Source Distribution

pdb1topdb-1.0.0.tar.gz (15.3 kB)

Uploaded Source

Built Distribution

pdb1topdb-1.0.0-py3-none-any.whl (20.2 kB)

Uploaded Python 3

File details

Details for the file pdb1topdb-1.0.0.tar.gz.

File metadata

  • Download URL: pdb1topdb-1.0.0.tar.gz
  • Upload date:
  • Size: 15.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.1

File hashes

Hashes for pdb1topdb-1.0.0.tar.gz
Algorithm Hash digest
SHA256 8e81f2329b37b51887c1a86ecbbe920347369457333f0045b7ea8c4957777e95
MD5 6555dc6276f9315a791ad7bd0f18705f
BLAKE2b-256 78643c04b9612171b586a66a875be6f05e2a3d2c3e71ea4843edd12799fb83e3

File details

Details for the file pdb1topdb-1.0.0-py3-none-any.whl.

File metadata

  • Download URL: pdb1topdb-1.0.0-py3-none-any.whl
  • Upload date:
  • Size: 20.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.1

File hashes

Hashes for pdb1topdb-1.0.0-py3-none-any.whl
Algorithm Hash digest
SHA256 d482ff598a77dd4a5ee0689a30f7312a9ac8793157c90594e4b95feea176826a
MD5 b56341d8d62735478ee740556e122dfe
BLAKE2b-256 21a098e8f7e99c18ebb2ec36177cbfa43af1fea7cfa02d1931ed6ed0ae65410c
