
# msstitch – MS proteomics post-processing utilities

Shotgun proteomics has a number of bioinformatic tools available for identification and quantification of peptides, and the subsequent protein inference.

These scripts were written to scratch an itch felt some years ago when combining existing tools. They are small command-line programs that each do one small thing, such as adding values to a PSM table, manipulating percolator results or grouping proteins. Together they can combine the output formats of several tools into complete PSM, peptide and protein tables.

We currently support the tools we run ourselves, but these could easily be extended to include more tool output formats.

## Tools

  • __msslookup__ - Creates SQLite databases from spectra, search and quantification data
  • __msspercolator__ - Splits, merges and filters percolator XML results, and runs qvality
  • __msspsmtable__ - Filters, splits, merges and performs protein grouping on PSM tables from MSGF+. Also adds columns with extra data (quant, percolator, genes, etc.)
  • __msspeptable__ - Creates and manipulates peptide tables (merging, quant data additions, etc.)
  • __mssprottable__ - Does the same for protein tables, including determining protein FDR

### msslookup

Generates SQLite database files of various MS data. It can, for example, store statistical or quant data of multiple experiment sample sets, after which these can be merged. It also does protein grouping and sequence filtering, thanks to the power of the DB engine.

Example: Store a multi-set tab-separated PSM table:

msslookup psms -i psmtable.txt --dbfile spectralookup.sql --spectracol 2 --fasta ENSEMBL80.fa --map ENS80_biomart.txt
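To illustrate the idea of a lookup database, here is a minimal sketch that loads a small multi-set PSM table into SQLite and lets the database engine do a grouping query. It does not reproduce the schema msslookup creates: the table layout, the column names and the `store_psms` helper are assumptions made for this example.

```python
import csv
import io
import sqlite3

# Hypothetical two-set PSM table; column names are illustrative only.
PSM_TSV = (
    "Biological set\tSpectrum\tPeptide\tProtein\n"
    "set1\tscan=1\tPEPTIDEK\tENSP0001\n"
    "set1\tscan=2\tLVNELTEFAK\tENSP0002\n"
    "set2\tscan=1\tPEPTIDEK\tENSP0001\n"
)


def store_psms(conn, tsv_text):
    """Load tab-separated PSM rows into a single SQLite table."""
    conn.execute(
        "CREATE TABLE psms(bioset TEXT, spectrum TEXT, peptide TEXT, protein TEXT)"
    )
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    rows = [
        (r["Biological set"], r["Spectrum"], r["Peptide"], r["Protein"])
        for r in reader
    ]
    conn.executemany("INSERT INTO psms VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)


conn = sqlite3.connect(":memory:")
n_stored = store_psms(conn, PSM_TSV)

# Let the database engine do the grouping: PSM count per protein across sets.
psms_per_protein = dict(
    conn.execute("SELECT protein, COUNT(*) FROM psms GROUP BY protein")
)
```

Keeping all sample sets in one indexed store is what makes later merging and grouping cheap; the real lookup holds considerably more (spectra, quant, gene mappings) than this toy table.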

### msspercolator

Performs various operations on percolator output XML, e.g. splitting into target and decoy, merging, filtering peptides, running qvality and reassigning qvality output statistics to existing percolator output.

Example: filter unique peptides on best score of a merged percolator file

msspercolator filteruni -i percolator.xml -o filteredpercolator.xml

### msspsmtable

Use this for modifications of tab-separated PSM tables generated by MSGF+ (supported) or other tools.

Example: add MS2 quant data to PSM table from SQLite lookup (resulting from msslookup)

msspsmtable quanttsv -i psmtable.txt --dbfile db.sqlite --isobaric -o quantpsms.txt

Example 2: Split PSM table into multiple tables on column “Biological set”

msspsmtable splittsv -i psmtable.txt --bioset
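Conceptually, splitting on a column means grouping rows by the value in that column and writing each group out with the original header. A minimal sketch of that idea follows; the `split_by_column` helper and the sample table are hypothetical and not part of msstitch.

```python
import csv
import io
from collections import defaultdict


def split_by_column(tsv_text, column="Biological set"):
    """Return {column value: TSV text}, repeating the header in each part."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    groups = defaultdict(list)
    for row in reader:
        groups[row[column]].append(row)

    parts = {}
    for value, rows in groups.items():
        out = io.StringIO()
        writer = csv.DictWriter(
            out, fieldnames=reader.fieldnames, delimiter="\t", lineterminator="\n"
        )
        writer.writeheader()
        writer.writerows(rows)
        parts[value] = out.getvalue()
    return parts


# Illustrative input with two biological sets.
table = (
    "Biological set\tPeptide\n"
    "set1\tPEPTIDEK\n"
    "set2\tLVNELTEFAK\n"
    "set1\tAAAK\n"
)
parts = split_by_column(table)
```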

### msspeptable

Creates and modifies peptide tables

Example: create a peptide table by filtering best peptides from PSM table and removing isobaric quant data. Retains MS1 quant data by taking the highest MS1 quant for a given peptide sequence.

msspeptable psm2pep -i psmtable.txt --spectracol 2 --scorecolpattern svm --ms1quantcolpattern area --isobquantcolpattern tmt10plex -o peptidetable.txt
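The reduction described above can be sketched as follows: keep one entry per peptide sequence with its best score and its highest MS1 quant value, and leave the isobaric columns behind. The dictionary keys, the `psms_to_peptides` helper and the assumption that a higher score is better are all choices made for this example, not taken from the msstitch code.

```python
def psms_to_peptides(psms):
    """Collapse PSM dicts to one entry per peptide sequence.

    Keeps the best (highest) score and the highest MS1 quant value seen
    for each sequence; isobaric quant columns are not carried over.
    """
    peptides = {}
    for psm in psms:
        seq = psm["peptide"]
        best = peptides.get(seq)
        if best is None:
            peptides[seq] = {"peptide": seq, "score": psm["score"], "ms1": psm["ms1"]}
            continue
        best["score"] = max(best["score"], psm["score"])
        # MS1 quant may be missing (None) for some PSMs.
        candidates = [v for v in (best["ms1"], psm["ms1"]) if v is not None]
        best["ms1"] = max(candidates) if candidates else None
    return peptides


# Illustrative PSMs; "tmt10plex_126" stands in for an isobaric quant column.
psms = [
    {"peptide": "PEPTIDEK", "score": 1.2, "ms1": 100.0, "tmt10plex_126": 5.0},
    {"peptide": "PEPTIDEK", "score": 2.5, "ms1": None, "tmt10plex_126": 7.0},
    {"peptide": "AAAK", "score": 0.3, "ms1": 50.0, "tmt10plex_126": 1.0},
]
peptide_table = psms_to_peptides(psms)
```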

Example: Create column in peptide table with linear modeled q-values

msspeptable modelqvals -i peptides.txt --qcolpattern "^q-value" --scorecolpattern svm -o peptide_linearmodels.txt
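One plausible reading of "linear modeled q-values" is a straight-line fit of log10(q-value) against score, which can then assign a non-zero q-value to peptides whose reported q-value is 0. The sketch below works under that assumption; the `min_q` cut-off, the function names and the pure-Python least-squares fit are illustrative and may differ from what msspeptable actually does.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def model_qvalues(scores, qvalues, min_q=1e-4):
    """Return q-values predicted from a line fitted to log10(q) vs. score.

    Only peptides with a q-value above min_q are used for the fit,
    since a q-value of 0 has no logarithm.
    """
    train = [(s, math.log10(q)) for s, q in zip(scores, qvalues) if q > min_q]
    slope, intercept = fit_line([s for s, _ in train], [lq for _, lq in train])
    return [10 ** (slope * s + intercept) for s in scores]


# Illustrative data: the best-scoring peptide has a reported q-value of 0.
scores = [0.0, 1.0, 2.0, 3.0]
qvalues = [0.1, 0.01, 0.001, 0.0]
modeled = model_qvalues(scores, qvalues)
```

Here the last peptide receives an extrapolated q-value instead of 0, which keeps it distinguishable from other top-scoring peptides.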

### mssprottable

Creates and modifies protein tables, and also runs qvality on these for FDR calculation

Example: Add best-scoring peptide to protein table (Q-score by Savitski et al. 2014)

mssprottable bestpeptide -i proteins.txt --peptable peptides.txt --scorecolpattern svm --logscore -o proteins_bestpep.txt

Example: Add protein picked FDR to protein table using Q-scores

mssprottable fdr -i proteins.txt --decoyfn decoyproteins.txt --targetfasta ENSEMBL80.fa --decoyfasta decoy_ENSEMBL80.fa --picktype fasta -o proteins_fdr.txt
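The general picked target-decoy idea can be sketched as follows: for every target/decoy protein pair only the higher-scoring member is kept, and FDR is then estimated from the decoys that survive. This is a simplified illustration, not the mssprottable implementation. The `picked_fdr` function, the explicit `pairs` mapping (which the real tool would presumably derive from the two FASTA files), the tie-breaking in favour of the target and the assumption that higher scores are better are all choices made for this example.

```python
def picked_fdr(target_scores, decoy_scores, pairs):
    """Picked target-decoy FDR sketch.

    target_scores / decoy_scores: {protein id: score}, higher is better.
    pairs: {target id: decoy id}.
    Returns {target id: q-value} for targets that won their pair.
    """
    picked = []  # (score, is_decoy, protein id)
    for target, decoy in pairs.items():
        t = target_scores.get(target)
        d = decoy_scores.get(decoy)
        if t is None and d is None:
            continue
        # Keep only the better-scoring member of each pair (ties go to target).
        if d is None or (t is not None and t >= d):
            picked.append((t, False, target))
        else:
            picked.append((d, True, decoy))
    picked.sort(key=lambda p: p[0], reverse=True)

    # FDR at each score threshold: decoys so far / targets so far.
    fdrs = []
    n_targets = n_decoys = 0
    for _, is_decoy, _ in picked:
        if is_decoy:
            n_decoys += 1
        else:
            n_targets += 1
        fdrs.append(n_decoys / max(n_targets, 1))

    # q-value: lowest FDR at this or any more permissive threshold.
    qvals = {}
    running_min = 1.0
    for (_, is_decoy, prot), fdr in zip(reversed(picked), reversed(fdrs)):
        running_min = min(running_min, fdr)
        if not is_decoy:
            qvals[prot] = running_min
    return qvals


# Illustrative scores: target B loses to its decoy and is dropped.
targets = {"A": 10.0, "B": 5.0, "C": 2.0}
decoys = {"dA": 1.0, "dB": 6.0, "dC": 1.0}
qvals = picked_fdr(targets, decoys, {"A": "dA", "B": "dB", "C": "dC"})
```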

Files for msstitch, version 2.16: msstitch-2.16.tar.gz (7.2 MB, source distribution)
