Project Description

MS Plotting utilities

These scripts provide some simple utilities for mining and plotting MS data. The data is expected to be available in mzML or the openMS featureXML format.

The script files can be run as standalone applications or imported as library modules.


Command Line Usage: [options]

-h, --help show this help message and exit
-m MZML, --mzml=MZML
 Input mzML file
-f FEATURE, --feature=FEATURE
 comma separated list of featureXML files
-o OUTFILE, --output=OUTFILE
 output filename (default='plot.pdf')
-s, --showX11 Show X11 interactive plot.
-x XROT, --xrot=XROT
 X-axis rotation for plot (default: 30)
-y YROT, --yrot=YROT
 Y-axis rotation for plot (default: -45)
-l CLIP, --limit=CLIP
 Clip threshold for plot
-c COLOURS, --colours=COLOURS
 list of colours with which to plot features. Colours will be recycled when needed (default='r,g,m,c,y,k')
-r RTMIN, --rtmin=RTMIN
 minimum retention time
-R RTMAX, --rtmax=RTMAX
 maximum retention time
-t MZMIN, --mzmin=MZMIN
 minimum m/z value
-T MZMAX, --mzmax=MZMAX
 maximum m/z value
-w RTWINDOW, --rtwindow=RTWINDOW
 Window around plot area in which to identify feature peaks
-p, --plotms2 Plot MS2 events
-d MS2WIN, --ms2win=MS2WIN
 MS2 ion capture window size
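
The -c/--colours option above says colours are recycled when there are more feature files than colours. A minimal sketch of that behaviour, assuming a simple round-robin assignment (the function name and file names are hypothetical, and the script's actual implementation may differ):

```python
from itertools import cycle

def assign_colours(feature_files, colours):
    """Pair each feature file with a colour, reusing colours when they run out."""
    # cycle() repeats the colour list endlessly; zip() stops at the last file.
    return list(zip(feature_files, cycle(colours)))

# Hypothetical file names, with fewer colours than files.
pairs = assign_colours(["a.featureXML", "b.featureXML", "c.featureXML"], ["r", "g"])
```

With two colours and three files, the third file reuses the first colour.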

Module usage: mzml is the only non-named argument. All arguments as per the command line version except that lists should be provided as arrays rather than comma-delimited text.

>>> import MSPlot.msplot3d
>>> MSPlot.plot3d(mzml, featfiles=[], outfile='plot.pdf', show=False, xrot=30, yrot=-45, featcols=['r','g','b','m','c','k'], thresh=100, minrt=None,maxrt=None,minmz=None, maxmz=None, ms2win=2.0, rtwindow=20.0, plotms2=True)
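
Since the module expects lists where the command line takes comma-delimited text, a small helper can convert one into the other. This is only a sketch: split_list and the file names are hypothetical and not part of MSPlot.

```python
def split_list(text):
    """Split comma-delimited text into a list, trimming whitespace and dropping blanks."""
    if not text:
        return []
    return [item.strip() for item in text.split(",") if item.strip()]

# Keyword arguments named as in the plot3d call above; values are illustrative.
kwargs = {
    "featfiles": split_list("run1.featureXML,run2.featureXML"),
    "featcols": split_list("r,g,m,c,y,k"),
    "xrot": 30,
    "yrot": -45,
}
# These could then be passed on, e.g. MSPlot.plot3d("run.mzML", **kwargs)
```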


Module Usage:

>>> import Unimod.unimod
>>> import MSPlot.ms1pep
# peptide is the protein amino acid sequence
>>> frags=MSPlot.ms1pep.digestprotein(peptide, enzyme=1, overlap=True, unfavoured=False)
    # overlap - include 1 missed cleavage
    # unfavoured - include unfavourable sites (see the EMBOSS documentation for more details)
    # enzyme - numerical enzyme selection according to the EMBOSS documentation
    Enzymes and Reagents
         1 : Trypsin
         2 : Lys-C
         3 : Arg-C
         4 : Asp-N
         5 : V8-bicarb
         6 : V8-phosph
         7 : Chymotrypsin
         8 : CNBr
>>> frags
[{'start': '1', 'end': '15', 'sequence': 'ACDEFGHIKLMNPQR'}, {'start': '16', 'end': '23', 'sequence': 'STVWYKK'}, {'start': '24', 'end': '32', 'sequence': 'ACDPRFGHI'}, {'start': '1', 'end': '23', 'sequence': 'ACDEFGHIKLMNPQRSTVWYKK'}]
# list of digested fragment peptides.
>>> camc=Unimod.unimod.database.get_label('Carbamidomethyl')['delta_mono_mass']
# get delta mass for fixed cysteine modification.
>>> MSPlot.ms1pep.listmz(frags[0]['sequence'], charges=[2,3,4], modifications=[], fixedmods={'C': camc })
[908.43562931579208, 605.95969455552802, 454.72172717539604]
# returns a list of mz values
>>> mzlist=MSPlot.ms1pep.listmz(frags[1]['sequence'], charges=[2,3,4], modifications=['2 Phospho (T)',], fixedmods={'C': camc })
>>> mzlist
[536.7538243454826, 358.17182457532175, 268.8808246902413]
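
The values returned for charges 2, 3 and 4 are numerically consistent with the usual relation m/z = (M + z*c)/z, where M is the neutral mass and c is the mass of one charge carrier. The sketch below checks this using only the numbers printed above. The function name is hypothetical, and whether listmz computes its values exactly this way is an assumption.

```python
def mz_from_mass(neutral_mass, charge, carrier_mass):
    """m/z of an ion carrying `charge` charge carriers of mass `carrier_mass`."""
    return (neutral_mass + charge * carrier_mass) / charge

# Values copied from the mzlist output above (charges 2, 3 and 4).
mz2, mz3, mz4 = 536.7538243454826, 358.17182457532175, 268.8808246902413

# Two charge states determine both unknowns:
#   2*mz2 = M + 2c  and  3*mz3 = M + 3c  =>  c = 3*mz3 - 2*mz2
carrier_mass = 3 * mz3 - 2 * mz2
neutral_mass = 2 * mz2 - 2 * carrier_mass

# The third charge state is then predicted rather than fitted.
predicted_mz4 = mz_from_mass(neutral_mass, 4, carrier_mass)
```

The carrier mass implied by the listed values is about 1.00783, and the predicted value for charge 4 agrees with the listed one.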

# listmz does not permute, so for variable modifications, call it with each permutation.
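
One way to build the modification lists for each call is to enumerate every combination of candidate sites. The sketch below assumes the modification strings follow the pattern '<position> <name> (<residue>)' with 1-based positions, as inferred from '2 Phospho (T)' above. modification_sets is a hypothetical helper, not part of MSPlot.

```python
from itertools import combinations

def modification_sets(sequence, name, residues, max_sites=None):
    """Yield every combination of `name` on the allowed residues as lists of strings."""
    # Candidate sites as (1-based position, residue) pairs.
    sites = [(i + 1, aa) for i, aa in enumerate(sequence) if aa in residues]
    limit = len(sites) if max_sites is None else min(max_sites, len(sites))
    for n in range(limit + 1):
        for combo in combinations(sites, n):
            yield ["%d %s (%s)" % (pos, name, aa) for pos, aa in combo]

# All phosphorylation states of the second fragment on S, T and Y.
sets = list(modification_sets("STVWYKK", "Phospho", "STY"))
```

Each list in sets could then be passed as the modifications argument of a separate listmz call.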

>>> xics=MSPlot.ms1pep.getXIC("LCMSrun.mzML", mzlist, tolerance=0.01, minrt=0, maxrt=None)
>>> xics.keys()
['rt', '358.171824575', '536.753824345', '268.88082469']
# extracts the XIC for each mz ratio. The keys are a truncated string version of the full precision calculated mass.
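
To illustrate the idea behind an XIC and the truncated keys, here is a minimal sketch in plain Python. It assumes the tolerance is an absolute m/z window and that intensities inside the window are summed. Neither is stated above, and extract_xic is a hypothetical stand-in, not the getXIC implementation. The keys are formatted to 12 significant digits, which reproduces the key strings shown above.

```python
def extract_xic(spectra, mzlist, tolerance=0.01):
    """Sum intensity within +/- tolerance of each target m/z, per spectrum.

    `spectra` is an iterable of (rt, peaks), where peaks is a list of
    (mz, intensity) pairs.
    """
    keys = ["%.12g" % mz for mz in mzlist]  # 12 significant digits
    xics = {"rt": []}
    for key in keys:
        xics[key] = []
    for rt, peaks in spectra:
        xics["rt"].append(rt)
        for key, target in zip(keys, mzlist):
            total = sum(i for mz, i in peaks if abs(mz - target) <= tolerance)
            xics[key].append(total)
    return xics

# Two made-up spectra standing in for an mzML run.
spectra = [
    (10.0, [(358.170, 50.0), (358.175, 25.0), (400.000, 10.0)]),
    (11.0, [(358.300, 99.0), (536.754, 5.0)]),
]
xics = extract_xic(spectra, [536.7538243454826, 358.17182457532175, 268.8808246902413])
```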

>>> MSPlot.ms1pep.plotXIC(xics, colours=['r','g','b','m','c','k'], outfile="plot.pdf", title="XIC plot")
# plots the XICs. Any number can be plotted, but they are separated by 7% of max(XIC).
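
The 7% separation can be pictured as a vertical offset applied to each successive trace. The sketch below assumes that reading: trace n is raised by n times 7% of the largest intensity across all traces. stack_traces is a hypothetical helper, and plotXIC itself may space the traces differently.

```python
def stack_traces(xics, spacing=0.07):
    """Return a copy of the traces with trace n raised by n * spacing * max intensity."""
    names = [k for k in xics if k != "rt"]
    # Largest intensity over all traces; 0.0 if there is nothing to plot.
    peak = max((max(xics[k]) for k in names if xics[k]), default=0.0)
    step = spacing * peak
    stacked = {"rt": list(xics["rt"])}
    for n, name in enumerate(names):
        stacked[name] = [value + n * step for value in xics[name]]
    return stacked

# Made-up traces: the second one is lifted by 7% of the overall maximum.
demo = {"rt": [0.0, 1.0], "a": [0.0, 100.0], "b": [50.0, 0.0]}
stacked = stack_traces(demo)
```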


Dependencies:

pyMS - OpenMS data parsing and analysis by Niek de Klein
pymzml - mzML parsing library
pyteomics - package for calculating mz values and more
numpy - large scale data handling
matplotlib - plotting library
EMBOSS (for the digest application)
Unimod - module for handling a Unimod database.


Dr David Martin, University of Dundee

