Store, manage and query a local copy of the PDB (Protein Data Bank) resources.
Project description
localpdb
- Store a local copy of the PDB data and related protein resources,
- Access and query the data conveniently through pandas DataFrame structures,
- Update with weekly releases of new data with full history versioning,
- Customize to your needs or add other data sources with a simple plugin system.
Quick start
- Install localpdb with pip and run the setup script to download PDB data and protein structures in the PDB format:
pip3 install localpdb
localpdb_setup.py -db_path /path/to/your/localpdb --fetch_pdb
- A simple pipeline that selects a representative set of protein structures for further analysis:
  - solved with X-ray crystallography,
  - with resolution of 2.5 angstroms or better,
  - deposited in 2010 or later,
  - with redundancy removed at the sequence level.
from localpdb import PDB
import gzip
lpdb = PDB(db_path='/path/to/your/localpdb')
# Select protein structures solved with X-ray diffraction (resolution of 2.5 A or better)
lpdb.entries = lpdb.entries.query('type == "prot"')
lpdb.entries = lpdb.entries.query('method == "diffraction"')
lpdb.entries = lpdb.entries.query('resolution <= 2.5')
lpdb.entries = lpdb.entries.query('deposition_date.dt.year >= 2010')
# Remove redundancy (select only representative structure from each sequence cluster)
lpdb.load_clustering_data(cutoff=90)
lpdb.chains = lpdb.chains[lpdb.chains['clust-90'].notnull()]
representative = lpdb.chains.groupby(by='clust-90')['resolution'].idxmin()
lpdb.chains = lpdb.chains.loc[representative]
for pdb_fn in lpdb.chains.pdb_fn:
    # your analysis here
    pass
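The selection logic of the pipeline above can be tried without a downloaded database. The following is a minimal sketch on toy pandas DataFrames, not localpdb itself: the tables `entries` and `chains`, their IDs and their values are made up for illustration, and the column names (`type`, `method`, `resolution`, `deposition_date`, `clust-90`) are assumed to mirror the ones used in the quick-start snippet. Boolean masks are used in place of `query` strings; the effect is intended to be the same.

```python
import pandas as pd

# Toy stand-in for lpdb.entries (hypothetical IDs and values)
entries = pd.DataFrame(
    {
        "type": ["prot", "prot", "nuc", "prot"],
        "method": ["diffraction", "NMR", "diffraction", "diffraction"],
        "resolution": [1.8, None, 2.0, 3.1],
        "deposition_date": pd.to_datetime(
            ["2012-05-01", "2015-03-10", "2011-07-20", "2009-01-15"]
        ),
    },
    index=["1abc", "2def", "3ghi", "4jkl"],
)

# Same four filters as in the pipeline, combined into one boolean mask
mask = (
    (entries["type"] == "prot")
    & (entries["method"] == "diffraction")
    & (entries["resolution"] <= 2.5)  # missing resolution compares as False
    & (entries["deposition_date"].dt.year >= 2010)
)
filtered = entries[mask]

# Toy stand-in for lpdb.chains after clustering data has been loaded
chains = pd.DataFrame(
    {
        "resolution": [1.8, 1.8, 2.2, 1.5, 2.4],
        "clust-90": [10, 11, 10, 11, None],
    },
    index=["1abc_A", "1abc_B", "5mno_A", "6pqr_A", "7stu_A"],
)

# Drop chains without a cluster, then keep the best-resolved chain per cluster
clustered = chains[chains["clust-90"].notnull()]
representative = clustered.groupby(by="clust-90")["resolution"].idxmin()
selected = clustered.loc[representative]

print(list(filtered.index))
print(sorted(selected.index))
```

`idxmin` returns the index label of the lowest resolution value in each cluster, so `selected` holds exactly one chain per sequence cluster.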
Additional resources
(In development)
Plugins
(In development)
Download files
Source Distribution
localpdb-0.1.0.tar.gz (15.1 kB)
Built Distribution
localpdb-0.1.0-py3-none-any.whl (20.0 kB)