Annotate peptide spectrum matches (PSMs) from Sage with ambiguity information
SagePeptideAmbiguityAnnotator
A tool for annotating peptide ambiguity in Sage search engine results based on fragment ion coverage.
Description
The SagePeptideAmbiguityAnnotator processes peptide spectrum matches (PSMs) from Sage search engine output and annotates peptides with ambiguity information based on fragment ion coverage. It helps identify which parts of a peptide sequence have strong evidence from fragment ions and which parts are less certain. For open searches, it can also place the observed mass shift as an internal modification, or as a labile modification if complete fragment ion coverage is observed.
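To make the idea of fragment ion coverage concrete, here is a minimal sketch. It is not the package's actual implementation. It assumes that a b-ion of ordinal i supports the backbone bond after residue i, that a y-ion of ordinal j supports the bond before the last j residues, and that any stretch of residues without an observed bond inside it is ambiguous. The function names and the parenthesis notation are hypothetical and chosen only for illustration; the real ambiguity_sequence format may differ.

```python
def observed_cleavage_sites(length, b_ordinals, y_ordinals):
    """Return backbone positions (1..length-1) that have fragment evidence.

    Position k is the bond between residue k and residue k+1 (1-based).
    Assumption: a b-ion of ordinal i supports position i, and a y-ion of
    ordinal j supports position length - j.
    """
    sites = {i for i in b_ordinals if 1 <= i < length}
    sites |= {length - j for j in y_ordinals if 1 <= j < length}
    return sites


def annotate_ambiguity(sequence, b_ordinals, y_ordinals):
    """Group residues that lie between consecutive observed cleavage sites.

    Stretches longer than one residue are wrapped in parentheses to mark
    that no fragment ion supports the order of the residues inside them.
    """
    n = len(sequence)
    sites = observed_cleavage_sites(n, b_ordinals, y_ordinals)
    boundaries = [0] + sorted(sites) + [n]
    parts = []
    for start, end in zip(boundaries, boundaries[1:]):
        segment = sequence[start:end]
        parts.append(segment if len(segment) == 1 else f"({segment})")
    return "".join(parts)


# b1, b2, y1 and y2 were matched; the middle of the peptide has no evidence.
print(annotate_ambiguity("PEPTIDE", {1, 2}, {1, 2}))
```

With complete coverage (for example b1 through b6) every residue stands alone and the sequence is returned unchanged; with no matched fragments the whole sequence is wrapped as one ambiguous stretch.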
Installation
From PyPI
pip install sage-peptide-ambiguity-annotator
From Source
git clone https://github.com/pgarrett-scripps/SagePeptideAmbiguityAnnotator.git
cd SagePeptideAmbiguityAnnotator
pip install -e .
Usage
Command Line Interface
sage-annotate --results results.sage.parquet \
    --fragments matched_fragments.sage.parquet \
    --output annotated_results.sage.parquet \
    --mass_error_type ppm \
    --mass_error_value 50.0 \
    --mass_shift
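The --mass_error_type and --mass_error_value options above describe a mass tolerance. The sketch below shows what a 50 ppm tolerance means numerically. It is an illustration only: the helper name is hypothetical, the "da" option is an assumption (only "ppm" appears in this README), and how the tool applies the tolerance internally is not documented here.

```python
def within_tolerance(observed, theoretical, error_type="ppm", error_value=50.0):
    """Check whether an observed mass lies within tolerance of a theoretical one.

    Assumption: "ppm" scales the allowed error with the theoretical mass,
    while "da" (hypothetical option) is an absolute window in daltons.
    """
    if error_type == "ppm":
        allowed = abs(theoretical) * error_value / 1e6
    elif error_type == "da":
        allowed = error_value
    else:
        raise ValueError(f"unsupported error type: {error_type!r}")
    return abs(observed - theoretical) <= allowed


# At mass 500, 50 ppm corresponds to an allowed error of 0.025.
print(within_tolerance(500.02, 500.0))  # inside the window
print(within_tolerance(500.03, 500.0))  # outside the window
```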
Streamlit Web Application
streamlit run streamlit_app.py
Then open your browser at http://localhost:8501
Python API
from sage_peptide_ambiguity_annotator.main import (
    read_input_files,
    process_psm_data,
    save_output
)

# Read input files
results_df, fragments_df = read_input_files(
    "results.sage.parquet",
    "matched_fragments.sage.parquet"
)

# Process the data
output_df = process_psm_data(
    results_df,
    fragments_df,
    mass_error_type="ppm",
    mass_error_value=50.0,
    use_mass_shift=True
)

# Save the output
save_output(output_df, "annotated_results.sage.parquet")
Input File Requirements
Sage Results File
The Sage results file must have the following columns:
- psm_id: Unique identifier for each PSM
- peptide: The peptide sequence with modifications
- stripped_peptide: The peptide sequence without modifications
- expmass: Experimental mass
- calcmass: Calculated mass
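Before running the annotator, it can be useful to confirm that a results file has the required columns. The following is a small sketch, not part of the package: the constant and function names are hypothetical, and the column list is taken from the requirements above.

```python
# Column names required in the Sage results file, as listed in this README.
REQUIRED_RESULT_COLUMNS = (
    "psm_id",
    "peptide",
    "stripped_peptide",
    "expmass",
    "calcmass",
)


def missing_result_columns(columns):
    """Return the required column names that are absent from `columns`.

    `columns` can be any iterable of names, for example the `columns`
    attribute of a pandas DataFrame.
    """
    present = set(columns)
    return [name for name in REQUIRED_RESULT_COLUMNS if name not in present]


# Example with a header that lacks two of the required columns.
print(missing_result_columns(["psm_id", "peptide", "expmass"]))
```

With pandas, the same check could be applied to a loaded table as missing_result_columns(results_df.columns).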
Output
The output file contains all columns from the input results file plus:
- ambiguity_sequence: Annotated peptide sequence with ambiguity information
- mass_shift: The observed mass shift between the experimental and calculated precursor masses (only applicable to open searches)
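As a rough illustration of the mass_shift column, the sketch below derives a shift from the expmass and calcmass input columns. It assumes the sign convention is experimental minus calculated mass, which this README does not state explicitly. The function name and the example values are hypothetical.

```python
def add_mass_shift(rows):
    """Return copies of `rows` with an added `mass_shift` entry.

    Assumption: mass_shift = expmass - calcmass, so a positive value means
    the measured precursor is heavier than the unmodified peptide.
    """
    return [
        {**row, "mass_shift": row["expmass"] - row["calcmass"]}
        for row in rows
    ]


# Made-up PSM rows, used only to show the arithmetic.
psms = [
    {"psm_id": 1, "expmass": 1079.98, "calcmass": 1000.0},
    {"psm_id": 2, "expmass": 1500.0, "calcmass": 1500.0},
]
for row in add_mass_shift(psms):
    print(row["psm_id"], round(row["mass_shift"], 4))
```

On a pandas DataFrame the equivalent expression would be results_df["expmass"] - results_df["calcmass"].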
License
This project is licensed under the MIT License - see the LICENSE file for details.
Dependencies
- pandas
- fastparquet
- peptacular
- streamlit (for web app)
File details
Details for the file sage_peptide_ambiguity_annotator-1.0.0.tar.gz.
File metadata
- Download URL: sage_peptide_ambiguity_annotator-1.0.0.tar.gz
- Upload date:
- Size: 7.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.9.22
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 9a87c26a7741d9c2280ece9e4580957398755342cf543e6de0dc5abe616c08b9 |
| MD5 | 5d2c58311e3c541f3a40287920c89d50 |
| BLAKE2b-256 | a2f115fd528ad6a98e06f9fed9dc85cd4e5df964c386aa7523bf704e2ddbf000 |
File details
Details for the file sage_peptide_ambiguity_annotator-1.0.0-py3-none-any.whl.
File metadata
- Download URL: sage_peptide_ambiguity_annotator-1.0.0-py3-none-any.whl
- Upload date:
- Size: 8.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.9.22
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 10188d0131283994eda046ef71ea9b8f767081166992cb1d579a87ce80b33bde |
| MD5 | a26519c78dad2931579334d4c8c30a2a |
| BLAKE2b-256 | c2db9d15b7ca6ceaca2b7cb57e9a6d6698dfcb68da88d31d04c3a08659d270ed |