Convert a protein 3D structure BioUnit file ('.pdb1') to a standard 3D structure file ('.pdb').
pdb1topdb
A Python package that converts a protein 3D structure PDB BioUnit file (such as .pdb1 or .pdb12) to a standard 3D structure PDB file (.pdb).
Installation
Install the pdb1topdb package with pip:
pip install pdb1topdb
CLI usage
Use as a command-line interface (CLI):
pdb1topdb --help
pdb1topdb ./test_data/5ceg.pdb1 -o ./test_data/5ceg_biounit1.pdb
pdb1topdb ./test_data/5ceg.pdb1 # you can omit output path (will be derived)
Or convert all BioUnit files in a directory with a single command:
pdb1topdb-folder --help
pdb1topdb-folder ./test_data/ -o ./test_data/
Python usage
Use from Python:
# Import
from pdb1topdb import pdb1topdb, read_chains_mapping
# Convert .pdb1 -> .pdb
pdb_str, metadata = pdb1topdb(
    pdb1_path="./test_data/5ceg.pdb1",
    pdb_path="./test_data/5ceg_biounit1.pdb",
    remove_initial_biounit_file=False,  # optional
    metadata_path=None,  # optional
    verbose=True,  # optional
)
# Read chains mapping
mapping = read_chains_mapping(pdb_path="./test_data/5ceg_biounit1.pdb")
Description
Why?
Structures in the Protein Data Bank that were solved by X-ray crystallography are distributed as .pdb or .cif files describing the asymmetric unit (the smallest portion of the crystal from which the rest can be generated by symmetry operations). The biological assembly (the biologically relevant complex of chains) is provided separately as a BioUnit file such as .pdb1 or .pdb12. BioUnit files are often the more biologically relevant representation, but many structural bioinformatics tools expect a standard .pdb file, so conversion is useful.
How?
pdb1topdb maps every chain from every model in the BioUnit to a unique chain ID in the output PDB. For example, if the BioUnit contains:
- Model 1: chains A, B
- Model 2: chains A, B

then pdb1topdb uses the following mapping:
- (Model 1, A) → A, (Model 1, B) → B
- (Model 2, A) → C, (Model 2, B) → D
It merges coordinates into a single model and updates chain-related metadata lines such as SEQRES, HELIX, and LINK. The chain mapping is injected in the header as REMARK 0.
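The merge step can be illustrated with a minimal sketch. It is not the package's implementation: the function name `merge_models` and its `mapping` argument are hypothetical, and the sketch only relabels the chain ID column (column 22) of coordinate records. It leaves SEQRES, HELIX, LINK, and atom serial numbers untouched, all of which a complete conversion would also need to handle.

```python
def merge_models(pdb1_text, mapping):
    """Merge all models of a BioUnit into one model and relabel chains.

    Hypothetical helper for illustration only.
    mapping: dict from (model_number, old_chain_id) to a new one-character chain ID.
    """
    out = []
    model = 1  # a file without MODEL records is treated as a single model
    for line in pdb1_text.splitlines():
        record = line[:6].strip()
        if record == "MODEL":
            # Remember which model the following atoms belong to, drop the record.
            model = int(line[6:].split()[0])
            continue
        if record == "ENDMDL":
            continue
        if record in ("ATOM", "HETATM", "ANISOU", "TER") and len(line) > 21:
            # The chain ID sits in column 22 of the PDB format (index 21).
            old_chain = line[21]
            new_chain = mapping.get((model, old_chain), old_chain)
            line = line[:21] + new_chain + line[22:]
        out.append(line)
    return "\n".join(out) + "\n"
```

Called with the mapping from the example above, atoms of chain A in Model 2 would come out labelled as chain C, and the MODEL and ENDMDL records would be gone.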
Technical Notes
PDB chain IDs are a single character, so output is limited by the available alphabet.
The default alphabet is:
ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789
Mapping rules:
- preserve a chain ID if it is in the alphabet and not already used
- otherwise assign the next unused character from the alphabet
This means the default conversion can handle up to 62 unique chain IDs. You can override the default alphabet with the chains_alphabet argument; however, nonstandard chain IDs may cause compatibility issues with other software.
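The mapping rules above can be sketched as follows. This is an illustration under assumptions, not the package's code: `assign_chain_ids` and `DEFAULT_ALPHABET` are hypothetical names, and raising `ValueError` when the alphabet runs out is a choice made for this sketch; how pdb1topdb itself reacts in that case is not documented here.

```python
from string import ascii_lowercase, ascii_uppercase, digits

# Same characters and order as the default alphabet listed above (62 in total).
DEFAULT_ALPHABET = ascii_uppercase + ascii_lowercase + digits


def assign_chain_ids(model_chains, alphabet=DEFAULT_ALPHABET):
    """Map (model, chain) pairs to unique one-character chain IDs.

    Hypothetical helper for illustration only.
    model_chains: iterable of (model_number, chain_id) pairs in file order.
    """
    mapping = {}
    used = set()
    for key in model_chains:
        _, chain = key
        if len(chain) == 1 and chain in alphabet and chain not in used:
            # Rule 1: keep the original ID when it is valid and still free.
            new_id = chain
        else:
            # Rule 2: otherwise take the next unused character of the alphabet.
            new_id = next((c for c in alphabet if c not in used), None)
            if new_id is None:
                raise ValueError("more chains than characters in the alphabet")
        mapping[key] = new_id
        used.add(new_id)
    return mapping


example = assign_chain_ids([(1, "A"), (1, "B"), (2, "A"), (2, "B")])
# example == {(1, "A"): "A", (1, "B"): "B", (2, "A"): "C", (2, "B"): "D"}
```

The result reproduces the mapping from the two-model example in the How? section. Because every output ID is one character taken from the alphabet, a 63rd chain cannot be assigned with the default settings.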