A Python interface to the ProteomeXchange Repository
ppx: A Python interface to ProteomeXchange
Overview
ppx provides a simple means to access the ProteomeXchange repository from Python. Using ProteomeXchange identifiers, you can retrieve the metadata associated with a project and download project files from PRIDE, MassIVE, or other partner repositories.
For full documentation and examples, visit: https://ppx.readthedocs.io
Installation
ppx is pip-installable. It requires Python 3.6+ and depends only on packages in the Python Standard Library.
pip3 install ppx
Examples
First create a PXDataset object using a valid ProteomeXchange identifier:
>>> from ppx import PXDataset
>>> dat = PXDataset("PXD000001")
We can then extract various data about the ProteomeXchange project from the PXDataset:
>>> dat.references
# ['Gatto L, Christoforou A. Using R and Bioconductor for proteomics data
# analysis. Biochim Biophys Acta. 2014 1844(1 pt a):42-51']
>>> dat.url
# 'ftp://ftp.pride.ebi.ac.uk/pride/data/archive/2012/03/PXD000001'
>>> dat.taxonomies
# ['Erwinia carotovora']
>>> dat.list_files()
# ['F063721.dat', 'F063721.dat-mztab.txt',
# 'PRIDE_Exp_Complete_Ac_22134.xml.gz', 'PRIDE_Exp_mzData_Ac_22134.xml.gz',
# 'PXD000001_mztab.txt', 'README.txt',
# 'TMT_Erwinia_1uLSike_Top10HCD_isol2_45stepped_60min_01-20141210.mzML',
# 'TMT_Erwinia_1uLSike_Top10HCD_isol2_45stepped_60min_01-20141210.mzXML',
# 'TMT_Erwinia_1uLSike_Top10HCD_isol2_45stepped_60min_01.mzXML',
# 'TMT_Erwinia_1uLSike_Top10HCD_isol2_45stepped_60min_01.raw',
# 'erwinia_carotovora.fasta']
Lastly, we can download files that we're interested in:
# Download "README.txt" to the "test" directory
>>> dat.download(files="README.txt", dest_dir="test")
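Projects often contain many files, so it can help to narrow the list before downloading. The following is a minimal sketch, not part of ppx itself: `select_files` is a hypothetical helper, and the file names are copied from the `dat.list_files()` output above so that the snippet runs without network access.

```python
from fnmatch import fnmatchcase


def select_files(filenames, pattern):
    """Return the names in `filenames` that match a shell-style `pattern`.

    Hypothetical helper for illustration; it is not provided by ppx.
    Matching is case-sensitive on every platform.
    """
    return [name for name in filenames if fnmatchcase(name, pattern)]


# A subset of the names returned by dat.list_files() for PXD000001.
files = [
    "F063721.dat",
    "README.txt",
    "TMT_Erwinia_1uLSike_Top10HCD_isol2_45stepped_60min_01-20141210.mzML",
    "TMT_Erwinia_1uLSike_Top10HCD_isol2_45stepped_60min_01-20141210.mzXML",
    "TMT_Erwinia_1uLSike_Top10HCD_isol2_45stepped_60min_01.raw",
    "erwinia_carotovora.fasta",
]

# Keep only the mzML files; the ".mzXML" name does not match this pattern.
mzml_files = select_files(files, "*.mzML")
print(mzml_files)
```

Each selected name could then be passed to `dat.download(files=name, dest_dir="test")` one at a time, mirroring the call shown above.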
If you are an R user...
ppx was inspired by the rpx R package by Laurent Gatto. Check it out on Bioconductor and GitHub.