Annotate peptide spectrum matches (PSMs) from Sage with ambiguity information
SagePeptideAmbiguityAnnotator
A tool for annotating peptide ambiguity in Sage search engine results based on fragment ion coverage.
Description
The SagePeptideAmbiguityAnnotator processes peptide spectrum matches (PSMs) from Sage search engine output and annotates each peptide with ambiguity information based on fragment ion coverage. It helps identify which parts of a peptide sequence are strongly supported by fragment ions and which parts are less certain. For open searches, it can also place the observed mass shift as an internal modification, or as a labile modification if complete fragment ion coverage is observed.
Examples (simple)
sequence PEPTIDE
b-ions 0110000
y-ions 0000110
annot-seq (?PE)PTI(?DE)
sequence PEPTIDE
b-ions 0111000
y-ions 0000000
annot-seq (?PE)PT(?IDE)
Examples (open search)
sequence PEPTIDE
b-ions 0111000
y-ions 0000001
mass-shift 100
annot-seq (?PE)PT(?ID)[100]E
# Observed mass shift is localized to: ID
sequence PEPTIDE
b-ions 0111000
y-ions 0001111
mass-shift 100
annot-seq {100}(?PE)PTIDE
# Because the forward and reverse ion series overlap, the mass shift cannot be localized and is added as a labile modification.
Installation
From PyPI
pip install sage-peptide-ambiguity-annotator
From Source
git clone https://github.com/pgarrett-scripps/SagePeptideAmbiguityAnnotator.git
cd SagePeptideAmbiguityAnnotator
pip install -e .
Usage
Command Line Interface
# Normal Search (Without Mass Shifts)
sage-annotate --results results.sage.parquet \
--fragments matched_fragments.sage.parquet \
--output annotated_results.sage.parquet
# Open Search (With Mass Shifts)
sage-annotate --results results.sage.parquet \
--fragments matched_fragments.sage.parquet \
--output annotated_results.sage.parquet \
--mass_error_type ppm \
--mass_error_value 50.0 \
--mass_shift
Streamlit Web Application
streamlit run streamlit_app.py
Then open your browser at http://localhost:8501
Streamlit Community Cloud
Try the hosted app: https://sage-peptide-ambiguity-annotator.streamlit.app/
Python API
from sage_peptide_ambiguity_annotator.main import (
read_input_files,
process_psm_data,
save_output
)
# Read input files
results_df, fragments_df = read_input_files(
"results.sage.parquet",
"matched_fragments.sage.parquet"
)
# Process the data
output_df = process_psm_data(
results_df,
fragments_df,
mass_error_type="ppm",
mass_error_value=50.0,
use_mass_shift=True
)
# Save the output
save_output(output_df, "annotated_results.sage.parquet")
Different Files / Single Peptides
import peptacular as pt
ambiguity_sequence = pt.annotate_ambiguity(
sequence='PEPTIDE',
forward_coverage=[0,1,0,1,0,0,0],
reverse_coverage=[0,0,0,1,1,1,0],
mass_shift=100.0
)
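The examples earlier in this README write fragment coverage as strings such as 0110000, while the call above takes lists of integers. The following is a minimal sketch of a conversion helper; `coverage_from_string` is a hypothetical name used for illustration and is not part of this package or of peptacular.

```python
def coverage_from_string(bits: str) -> list:
    """Convert a coverage string such as '0110000' into a list of 0/1 integers."""
    # Reject anything other than 0 and 1 so typos do not pass through silently.
    if set(bits) - {"0", "1"}:
        raise ValueError(f"coverage must contain only 0 and 1, got {bits!r}")
    return [int(ch) for ch in bits]


# Coverage strings taken from the first open search example in this README.
forward = coverage_from_string("0111000")
reverse = coverage_from_string("0000001")

# These lists have one entry per residue of 'PEPTIDE' and could then be passed
# as forward_coverage / reverse_coverage to pt.annotate_ambiguity.
print(forward, reverse)
```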
Input File Requirements
Sage Results File
The Sage results file must have the following columns:
- psm_id: Unique identifier for each PSM
- peptide: The peptide sequence with modifications
- stripped_peptide: The peptide sequence without modifications
- expmass: Experimental mass
- calcmass: Calculated mass
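A quick pre-flight check for the columns listed above can catch a mismatched results file before annotation. This is only a sketch: `REQUIRED_COLUMNS` and `missing_columns` are hypothetical names for illustration, not part of the package. The function accepts any iterable of column names, so with pandas it could be called as `missing_columns(results_df.columns)`.

```python
# Column names copied from the requirements list in this README.
REQUIRED_COLUMNS = ("psm_id", "peptide", "stripped_peptide", "expmass", "calcmass")


def missing_columns(columns):
    """Return the required column names that are absent, in README order."""
    present = set(columns)
    return [name for name in REQUIRED_COLUMNS if name not in present]


# Example with a results file that lacks two of the required columns.
example_columns = ["psm_id", "peptide", "expmass", "rt"]
missing = missing_columns(example_columns)
if missing:
    print("Results file is missing columns:", ", ".join(missing))
```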
Output
The output file contains all columns from the input results file plus:
- ambiguity_sequence: Annotated peptide sequence with ambiguity information
- mass_shift: The observed mass shift between the experimental and calculated precursor masses (only applicable with open search)
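To make the mass_shift column and the ppm setting concrete, here is a small arithmetic sketch. It assumes the shift is computed as expmass minus calcmass (the sign convention is an assumption, not confirmed by this README) and shows how a ppm value translates into an absolute mass window. How the tool applies the tolerance internally is not documented here, and both function names are hypothetical.

```python
def mass_shift(expmass: float, calcmass: float) -> float:
    """Difference between experimental and calculated mass (assumed sign convention)."""
    return expmass - calcmass


def ppm_window(mass: float, ppm: float) -> float:
    """Absolute mass window, in the same units as `mass`, for a ppm tolerance."""
    return abs(mass) * ppm / 1e6


# Illustrative numbers only, not taken from real data.
shift = mass_shift(expmass=1100.0, calcmass=1000.0)
window = ppm_window(mass=1000.0, ppm=50.0)
print(f"mass shift: {shift}, 50 ppm window at 1000: {window}")
```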
License
This project is licensed under the MIT License - see the LICENSE file for details.
Dependencies
- pandas
- fastparquet
- peptacular
- streamlit (for web app)