Annotate peptide spectrum matches (PSMs) from Sage with ambiguity information

SagePeptideAmbiguityAnnotator

A tool for annotating peptide ambiguity in Sage search engine results based on fragment ion coverage.

Description

The SagePeptideAmbiguityAnnotator processes peptide spectrum matches (PSMs) from Sage search engine output and annotates each peptide with ambiguity information based on its fragment ion coverage. The annotation shows which parts of a peptide sequence are strongly supported by fragment ions and which parts are less certain. For open searches, it can also place the observed mass shift as an internal modification, or as a labile modification if complete fragment ion coverage is observed.

Examples (simple)

sequence    PEPTIDE
b-ions      0110000
y-ions      0000110
annot-seq   (?PE)PTI(?DE)

sequence    PEPTIDE
b-ions      0111000
y-ions      0000000
annot-seq   (?PE)PT(?IDE)
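
The two annotations above can be reproduced with a short sketch. It assumes, based only on these examples, that a 1 at position i in the b-ions row marks a cleavage after residue i, and a 1 at position i in the y-ions row marks a cleavage before residue i. Residues that no cleavage separates are grouped as (?...). The function name annotate_simple is hypothetical, and this is not the package's own implementation.

```python
def annotate_simple(sequence, b_ions, y_ions):
    """Group residues that no fragment ion separates into (?...) blocks.

    Assumption inferred from the examples: b_ions[i] == 1 marks a cleavage
    after residue i, and y_ions[i] == 1 marks a cleavage before residue i.
    """
    n = len(sequence)
    boundaries = {0, n}  # the peptide termini always bound a segment
    boundaries.update(i + 1 for i, hit in enumerate(b_ions) if hit)
    boundaries.update(i for i, hit in enumerate(y_ions) if hit)

    cuts = sorted(boundaries)
    parts = []
    for start, end in zip(cuts, cuts[1:]):
        segment = sequence[start:end]
        # A single residue is fully resolved; longer runs stay ambiguous.
        parts.append(segment if len(segment) == 1 else f"(?{segment})")
    return "".join(parts)


print(annotate_simple("PEPTIDE", [0, 1, 1, 0, 0, 0, 0], [0, 0, 0, 0, 1, 1, 0]))
print(annotate_simple("PEPTIDE", [0, 1, 1, 1, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0]))
```

Treating both ion series as a single set of cleavage boundaries keeps the grouping logic independent of which series produced the evidence.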

Examples (open search)

sequence    PEPTIDE
b-ions      0111000
y-ions      0000001
mass-shift  100
annot-seq   (?PE)PT(?ID)[100]E

# Observed mass shift is localized to: ID

sequence    PEPTIDE
b-ions      0111000
y-ions      0001111
mass-shift  100
annot-seq   {100}(?PE)PTIDE

# Since the forward and reverse ion series overlap, the mass shift cannot be localized and is added as a labile modification.
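
The localization rule in the two open-search examples can be sketched as follows. The sketch assumes, from the examples alone, that the b-ions row marks unshifted prefixes ending at each flagged residue and the y-ions row marks unshifted suffixes starting at each flagged residue. The mass shift must then sit between the longest matched prefix and the longest matched suffix. The function name localize_mass_shift is hypothetical, and the package may handle edge cases differently.

```python
def localize_mass_shift(sequence, forward_coverage, reverse_coverage):
    """Return the residues that could carry the mass shift, or None.

    Assumption: forward_coverage[i] == 1 means the prefix ending at residue i
    matched without the shift; reverse_coverage[i] == 1 means the suffix
    starting at residue i matched without the shift.
    """
    fwd_hits = [i for i, hit in enumerate(forward_coverage) if hit]
    rev_hits = [i for i, hit in enumerate(reverse_coverage) if hit]

    # First residue not covered by an unshifted prefix.
    start = max(fwd_hits) + 1 if fwd_hits else 0
    # First residue covered by an unshifted suffix (exclusive end of the gap).
    end = min(rev_hits) if rev_hits else len(sequence)

    if start >= end:
        # The series meet or overlap: no residue is left to carry the shift,
        # so it can only be reported as a labile modification.
        return None
    return sequence[start:end]


print(localize_mass_shift("PEPTIDE", [0, 1, 1, 1, 0, 0, 0], [0, 0, 0, 0, 0, 0, 1]))
print(localize_mass_shift("PEPTIDE", [0, 1, 1, 1, 0, 0, 0], [0, 0, 0, 1, 1, 1, 1]))
```

The first call returns the residues ID, matching the first example; the second returns None, matching the labile case.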

Installation

From PyPI

pip install sage-peptide-ambiguity-annotator

From Source

git clone https://github.com/pgarrett-scripps/SagePeptideAmbiguityAnnotator.git
cd SagePeptideAmbiguityAnnotator
pip install -e .

Usage

Command Line Interface

# Normal Search (Without Mass Shifts)
sage-annotate --results results.sage.parquet \
              --fragments matched_fragments.sage.parquet \
              --output annotated_results.sage.parquet

# Open Search (With Mass Shifts)
sage-annotate --results results.sage.parquet \
              --fragments matched_fragments.sage.parquet \
              --output annotated_results.sage.parquet \
              --mass_error_type ppm \
              --mass_error_value 50.0 \
              --mass_shift

Streamlit Web Application

streamlit run streamlit_app.py

Then open your browser at http://localhost:8501

Streamlit Community Cloud

Try it online: https://sage-peptide-ambiguity-annotator.streamlit.app/

Python API

from sage_peptide_ambiguity_annotator.main import (
    read_input_files, 
    process_psm_data, 
    save_output
)

# Read input files
results_df, fragments_df = read_input_files(
    "results.sage.parquet", 
    "matched_fragments.sage.parquet"
)

# Process the data
output_df = process_psm_data(
    results_df, 
    fragments_df,
    mass_error_type="ppm",
    mass_error_value=50.0,
    use_mass_shift=True
)

# Save the output
save_output(output_df, "annotated_results.sage.parquet")

Different Files / Single Peptides

import peptacular as pt

ambiguity_sequence = pt.annotate_ambiguity(
    sequence='PEPTIDE', 
    forward_coverage=[0,1,0,1,0,0,0], 
    reverse_coverage=[0,0,0,1,1,1,0], 
    mass_shift=100.0
)

Input File Requirements

Sage Results File

The Sage results file must have the following columns:

  • psm_id: Unique identifier for each PSM
  • peptide: The peptide sequence with modifications
  • stripped_peptide: The peptide sequence without modifications
  • expmass: Experimental mass
  • calcmass: Calculated mass
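
A quick pre-flight check for these columns might look like the sketch below. The helper name missing_columns is hypothetical and not part of the package; it takes any iterable of column names, so it can be fed the columns of a pandas DataFrame.

```python
REQUIRED_COLUMNS = ("psm_id", "peptide", "stripped_peptide", "expmass", "calcmass")


def missing_columns(columns):
    """Return the required column names absent from `columns`, in listed order."""
    present = set(columns)
    return [name for name in REQUIRED_COLUMNS if name not in present]


# With pandas this would be called as missing_columns(results_df.columns).
print(missing_columns(["psm_id", "peptide", "expmass"]))
```

An empty list means the results file has every column listed above.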

Output

The output file contains all columns from the input results file plus:

  • ambiguity_sequence: Annotated peptide sequence with ambiguity information
  • mass_shift: The observed mass shift between the experimental and calculated precursor masses (open search only)
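
For orientation, the arithmetic behind a precursor mass shift and a ppm error is sketched below using the expmass and calcmass columns. The sign convention (expmass minus calcmass) is an assumption, and how the tool applies --mass_error_type and --mass_error_value internally is not described here; both function names are hypothetical.

```python
def mass_shift(expmass, calcmass):
    """Precursor mass difference in Da (assumed convention: expmass - calcmass)."""
    return expmass - calcmass


def ppm_error(expmass, calcmass):
    """Relative mass error in parts per million, relative to the calculated mass."""
    return (expmass - calcmass) / calcmass * 1e6


# Illustrative values only, not taken from real Sage output.
print(mass_shift(900.0, 800.0))
print(ppm_error(1000.001, 1000.0))
```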

License

This project is licensed under the MIT License - see the LICENSE file for details.

Dependencies

  • pandas
  • fastparquet
  • peptacular
  • streamlit (for web app)
