Simple tool to compare sage quant results!
QuantCompare
This script calculates quantitative ratios between groups of TMT reporter ion intensities.
Install
From Pip
pip install quantcompare
How to Run
The main script is located at quantcompare.main:run; the package also installs a 'compare' entry point that calls it.
compare results.sage.parquet output --pairs "1,2;1,3;1,4" --groups "1:0,1,2;2:3,4,5;3:6,7,8;4:9,10"
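The --groups and --pairs strings in the command above can be parsed along the following lines. This is a minimal sketch, not the package's actual parser: `parse_groups` and `parse_pairs` are hypothetical helper names, and group names are kept as strings here.

```python
def parse_groups(spec: str) -> dict[str, list[int]]:
    """Parse 'name:idx,idx;name:idx,...' into {name: [idx, ...]}."""
    groups: dict[str, list[int]] = {}
    for part in spec.split(";"):
        if not part:  # tolerate a trailing semicolon
            continue
        name, _, indices = part.partition(":")
        if name in groups:
            raise ValueError(f"duplicate group name: {name!r}")
        groups[name] = [int(i) for i in indices.split(",")]
    return groups


def parse_pairs(spec: str, groups: dict[str, list[int]]) -> list[tuple[str, str]]:
    """Parse 'a,b;a,c' into [(a, b), (a, c)], checking names against groups."""
    pairs: list[tuple[str, str]] = []
    for part in spec.split(";"):
        if not part:
            continue
        names = part.split(",")
        if len(names) != 2:
            raise ValueError(f"a pair needs exactly two groups: {part!r}")
        for name in names:
            if name not in groups:
                raise ValueError(f"unknown group in pair: {name!r}")
        pairs.append((names[0], names[1]))
    return pairs


groups = parse_groups("1:0,1,2;2:3,4,5;3:6,7,8;4:9,10")
pairs = parse_pairs("1,2;1,3;1,4", groups)
print(groups)
print(pairs)
```

On the command line, quote both arguments so the shell does not treat the semicolons as command separators.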
Input File (results.sage.parquet)
Required Columns
Columns | Description |
---|---|
reporter_ion_intensity | Intensities of TMT channels |
filename | The PSMs file (can be anything) |
peptide | The modified peptide sequence |
charge | The peptide's charge state |
proteins | The protein names/ids (separated by ';') |
scannr | The peptide's scan number |
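Before running the tool, it can help to confirm that the input has every required column. The check below is a small sketch that works on any list of column names; `missing_columns` is a hypothetical helper and is not part of quantcompare.

```python
REQUIRED_COLUMNS = (
    "reporter_ion_intensity",
    "filename",
    "peptide",
    "charge",
    "proteins",
    "scannr",
)


def missing_columns(columns):
    """Return the required columns that are absent, in table order."""
    present = set(columns)
    return [name for name in REQUIRED_COLUMNS if name not in present]


# Hypothetical column list with one required column left out.
example = ["filename", "peptide", "charge", "proteins", "scannr", "extra_column"]
missing = missing_columns(example)
print(missing)
```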
Arguments
Positional Arguments | Description |
---|---|
input_file | Input parquet file. Must contain the following columns: "reporter_ion_intensity", "filename", "peptide", "charge", "proteins", and "scannr". |
output_folder | Folder to write output files to. This folder will be created if it does not exist. File names can be specified with the --psm_file, --peptide_file, and --protein_file arguments. |
Additional Arguments | Description |
---|---|
-h, --help | show this help message and exit |
--psm_file PSM_FILE | File name for the PSM ratios file, written inside output_folder. |
--peptide_file PEPTIDE_FILE | File name for the peptide ratios file, written inside output_folder. |
--protein_file PROTEIN_FILE | File name for the protein ratios file, written inside output_folder. |
--pairs PAIRS | Pairs of groups to compare. Pairs must be in the following format: "Group1,Group2;...;Group1,Group3". Each pair must contain exactly 2 values (separated by a comma), and these values must match group names given in the groups argument. Multiple pairs must be separated by a semicolon. |
--groups GROUPS | The group labels mapped to the indices for their reporter ion channels. Groups must be specified in the following format: "GroupName1:Index1,...,Index3;GroupName2:Index1,...,Index3;". Group names must be unique and can be of any type. Indexes must be separated by a comma, and multiple groups must be separated by a semicolon. |
--filter {unique,all} | Filter type for peptides. Unique will only keep peptides which map to a single protein. All will keep all peptides. |
--groupby_filename | Group PSMs, peptides, and proteins by filename. This adds a filename column to the output files. |
--output_type {csv,parquet} | Output file type. |
--inf_replacement INF_REPLACEMENT | Infinite values cause many problems with the statistics. This value will be used to replace infinite values at the log2 ratio level. Default is 100. (-inf will be replaced with -100 and inf will be replaced with 100) |
--no_psms | Don't output a PSM ratio file. |
--no_peptides | Don't output a Peptide ratio file. |
--no_proteins | Don't output a Protein ratio file. |
Basic Steps
- Read Input File
- Generate QuantGroups (a dataclass that stores group_name, group_indecies, and the PSM info)
- Group by PSMs (peptide, charge, Optional[filename])
- Group by Peptides (peptide, Optional[filename])
- Filter QuantGroups based on the --filter option
- Group by Proteins (proteins, Optional[filename])
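The grouping keys and the --filter option listed above can be illustrated with plain dictionaries. This is a sketch under stated assumptions: the records, `group_records`, and `keep_unique` are hypothetical stand-ins, and the package itself works with QuantGroup dataclasses rather than dicts.

```python
from itertools import groupby

# Hypothetical PSM records; only the fields used for grouping are shown.
psms = [
    {"peptide": "PEPTIDEK", "charge": 2, "proteins": "P1", "filename": "a.raw"},
    {"peptide": "PEPTIDEK", "charge": 3, "proteins": "P1", "filename": "a.raw"},
    {"peptide": "SHAREDR", "charge": 2, "proteins": "P1;P2", "filename": "a.raw"},
    {"peptide": "PEPTIDEK", "charge": 2, "proteins": "P1", "filename": "b.raw"},
]


def group_records(records, keys, groupby_filename=False):
    """Sort records by the key fields, then collect each run into a list."""
    if groupby_filename:
        keys = (*keys, "filename")

    def key_func(record):
        return tuple(record[k] for k in keys)

    ordered = sorted(records, key=key_func)
    return {key: list(run) for key, run in groupby(ordered, key=key_func)}


def keep_unique(records):
    """Keep records whose peptide maps to a single protein (--filter unique)."""
    return [r for r in records if len(r["proteins"].split(";")) == 1]


psm_groups = group_records(psms, ("peptide", "charge"))
peptide_groups = group_records(psms, ("peptide",))
protein_groups = group_records(keep_unique(psms), ("proteins",))
print(len(psm_groups), len(peptide_groups), len(protein_groups))
```

Passing `groupby_filename=True` appends the filename to every key, so the same peptide measured in two files lands in two separate groups.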
Grouping Steps
- Sort QuantGroups based on grouping criteria.
- For each group of QuantGroups, create a dict mapping groupname to a list of QuantGroups.
- Loop over the pairs of groups (specified by pairs argument)
- Create a GroupRatio object for each pair (dataclass which stores the list of QuantGroups for each groupname of the pair)
- Calculate the q-value for all GroupRatios. The GroupRatio dataclass has properties for pvalue, ratio, etc., but not for qvalue, since that requires knowledge of all GroupRatios.
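The last step needs every p-value at once because a q-value depends on how a p-value ranks among all the others. The README does not say which correction the package applies; the sketch below assumes the Benjamini-Hochberg procedure purely for illustration, and `benjamini_hochberg` is a hypothetical helper name.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    n = len(pvalues)
    # Indices of the p-values, ordered from smallest to largest.
    order = sorted(range(n), key=lambda i: pvalues[i])
    qvalues = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down so each q-value is monotone.
    for rank in range(n, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, pvalues[index] * n / rank)
        qvalues[index] = running_min
    return qvalues


# Hypothetical p-values, one per GroupRatio.
pvalues = [0.01, 0.04, 0.03, 0.20]
qvalues = benjamini_hochberg(pvalues)
print(qvalues)
```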
Ratio Calculation (Pseudo Code)
This method was chosen because one-off groups (groups that contain only one data point) cannot produce a p-value: the degrees of freedom would be 0.
The values below are illustrative only and should not be taken as accurate.
grp1 = [[100, 110, 105], [250, 245, 255]]
grp2 = [[200, 210, 205], [500, 490, 0]]
1) Calculate the mean intensity of the reference group (the first group name in the pair) for each row.
grp1_mean = mean(grp1)
grp1_mean = [105, 250]
2) Divide the reference mean intensities by the other group's reporter ion intensities.
ratios = grp1_mean / grp2
ratios = [[105/200, 105/210, 105/205], [250/500, 250/490, 250/0]]
ratios = [[0.525, 0.5, ~0.51], [0.5, ~0.51, inf]]
3) Flatten the array and remove NaNs (caused by 0/0).
ratios = flatten(ratios)
ratios = [0.525, 0.5, ~0.51, 0.5, ~0.51, inf]
ratios = remove_nan(ratios)
ratios = [0.525, 0.5, ~0.51, 0.5, ~0.51, inf]
4) Calculate the log2 ratios and replace -inf and +inf with the inf_replacement value (default is 100).
log2_ratios = log2(ratios)
log2_ratios = [~-0.93, -1, ~-0.97, -1, ~-0.97, inf]
log2_ratios = replace_inf(log2_ratios)
log2_ratios = [~-0.93, -1, ~-0.97, -1, ~-0.97, 100]
5) Calculate the mean and std.
log2_ratio = mean(log2_ratios)
log2_ratio = ~16
log2_ratio_std = std(log2_ratios)
log2_ratio_std = ~40
6) Calculate the p-value for a one-sample t-test comparing the sample mean to 0.
log2_ratio_pvalue = pvalue(log2_ratio, log2_ratio_std, len(grp1))
log2_ratio_pvalue = ~0.50
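The steps above can be sketched in runnable Python using only the standard library. This is an illustration, not the package's implementation: `log2_ratio_stats` is a hypothetical name, intensities are assumed to be non-negative, the sample standard deviation is assumed, and the number of ratios is used as the sample size (the pseudocode passes len(grp1), and which of the two the package uses is not confirmed here). The sketch stops at the t statistic because the p-value needs the Student's t distribution, which the standard library does not provide; `scipy.stats.ttest_1samp` is one option for that step.

```python
import math
import statistics


def log2_ratio_stats(ref_rows, other_rows, inf_replacement=100.0):
    """Return (log2_ratios, mean, std, t_statistic) for one pair of groups."""
    log2_ratios = []
    for ref, other in zip(ref_rows, other_rows):
        ref_mean = statistics.fmean(ref)  # step 1: reference mean per row
        for intensity in other:  # steps 2-4, one ratio at a time
            if intensity == 0 and ref_mean == 0:
                continue  # 0/0 is NaN, so it is dropped
            if intensity == 0:
                log2_ratios.append(inf_replacement)  # x/0 is +inf
            elif ref_mean == 0:
                log2_ratios.append(-inf_replacement)  # log2(0) is -inf
            else:
                log2_ratios.append(math.log2(ref_mean / intensity))
    # Step 5: mean and sample standard deviation (needs at least 2 ratios).
    mean = statistics.fmean(log2_ratios)
    std = statistics.stdev(log2_ratios)
    # Step 6 (partial): one-sample t statistic against a mean of 0.
    t_stat = mean / (std / math.sqrt(len(log2_ratios)))
    return log2_ratios, mean, std, t_stat


grp1 = [[100, 110, 105], [250, 245, 255]]
grp2 = [[200, 210, 205], [500, 490, 0]]
log2_ratios, mean, std, t_stat = log2_ratio_stats(grp1, grp2)
print([round(x, 2) for x in log2_ratios])
print(round(mean, 2), round(std, 2), round(t_stat, 2))
```

A single replaced infinity dominates both the mean and the standard deviation in this example, which is worth keeping in mind when choosing --inf_replacement.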