A package to estimate the false discovery rate (FDR) in mass-spectrometry search results using a decoy-free approach
Decoy-Free FDR Estimation for Mass-Spectrometry
Install
pip install decoyfree-msfdr
Usage
Regular MS/MS search FDR estimation
decoyfree-msfdr [-h] [-f INPUT_FILE] [-d INPUT_DIR] [-t INPUT_TYPE] [-m EVAL_MODEL]
[--score-field SCORE_FIELD] [--threads THREADS] [-c CONSTRAINTS] [-s MODEL_SAMPLES]
[-r RANDOM_SIZE] [--tolerance TOLERANCE] [--show_plotting] [--out_dir OUT_DIR]
Cross-linked peptide MS/MS search FDR estimation
decoyfree-xlmsfdr [-h] [-f INPUT_FILE] [-d INPUT_DIR] [-t INPUT_TYPE] [-m EVAL_MODEL]
[--score-field SCORE_FIELD] [--threads THREADS] [-c CONSTRAINTS] [-s MODEL_SAMPLES]
[-r RANDOM_SIZE] [--tolerance TOLERANCE] [--show_plotting] [--out_dir OUT_DIR]
options:
  -h, --help            show this help message and exit
  -f INPUT_FILE, --input-file INPUT_FILE
  -d INPUT_DIR, --input-dir INPUT_DIR
  -t INPUT_TYPE, --input-type INPUT_TYPE
                        format of input files, supported formats are csv,tsv,idXML
  -m EVAL_MODEL, --eval-model EVAL_MODEL
                        existing model to be evaluated
  --score-field SCORE_FIELD
                        the field name holding PSM scores
  --threads THREADS     number of threads
  -c CONSTRAINTS, --constraints CONSTRAINTS
                        Choices of constraints to be used
  -s MODEL_SAMPLES, --model_samples MODEL_SAMPLES
                        Number of samples/top scores to be used in modeling
  -r RANDOM_SIZE, --random_size RANDOM_SIZE
                        Number of random starts per skewness setting
  --tolerance TOLERANCE
                        Threshold of the change of the point-wise log-likelihood for the EM algorithm to determine the convergence
  --show_plotting       Show plotting while fitting the model
  --out_dir OUT_DIR     The place to save results
Options
Option | Argument | Default | Description |
---|---|---|---|
-f, --input-file | Path | N/A | Path to the search result file |
-d, --input-dir | Path | N/A | Path to the directory holding search result files |
-t, --input-type | csv, tsv, idXML | idXML | Search result format |
-m, --eval-model | Path | N/A | Path to an existing model to be evaluated |
-c, --constraints | no_constraint, unweighted_pdf_mode | unweighted_pdf_mode | Choices of constraints to be used |
-s, --model_samples | 1, 2 | 2 | Number of samples/top scores to be used in modeling |
--threads | Integer | 1 | Number of threads |
-r, --random_size | Integer | 2 | Number of random starts per skewness setting |
--tolerance | Float | 1e-8 | Convergence threshold for the EM algorithm, applied to the change in the point-wise log-likelihood |
--show_plotting | Bool | False | Show plotting while fitting the model |
--out_dir | Path | ./results | The place to save results |
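To illustrate what a convergence tolerance like `--tolerance` does, here is a minimal sketch of a generic EM loop. It is not the package's model: a plain two-component Gaussian mixture stands in for it, the function name `fit_two_gaussians` is hypothetical, and "point-wise log-likelihood" is assumed here to mean the mean log-likelihood per data point.

```python
import math
import random


def normal_pdf(x, mu, sigma):
    """Density of a normal distribution at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def fit_two_gaussians(scores, tolerance=1e-8, max_iter=1000):
    """Fit a two-component Gaussian mixture with EM (illustrative stand-in).

    Stops when the mean log-likelihood per point changes by less than
    `tolerance` between iterations (an assumed reading of the option).
    Returns (means, sigmas, weights, iterations_used).
    """
    lo, hi = min(scores), max(scores)
    spread = (hi - lo) / 4.0 or 1.0
    # Crude start: one component towards each end of the score range.
    mu = [lo + 0.25 * (hi - lo), lo + 0.75 * (hi - lo)]
    sigma = [spread, spread]
    weight = [0.5, 0.5]
    prev_ll = -math.inf
    for iteration in range(1, max_iter + 1):
        # E-step: responsibilities and the mean log-likelihood.
        resp = []
        ll = 0.0
        for x in scores:
            p = [weight[k] * normal_pdf(x, mu[k], sigma[k]) for k in range(2)]
            total = max(p[0] + p[1], 1e-300)  # guard against underflow
            ll += math.log(total)
            resp.append((p[0] / total, p[1] / total))
        ll /= len(scores)
        if abs(ll - prev_ll) < tolerance:
            return mu, sigma, weight, iteration
        prev_ll = ll
        # M-step: re-estimate each component from its responsibilities.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            if nk <= 0.0:
                continue  # component received no mass; leave it unchanged
            mu[k] = sum(r[k] * x for r, x in zip(resp, scores)) / nk
            var = sum(r[k] * (x - mu[k]) ** 2 for r, x in zip(resp, scores)) / nk
            sigma[k] = math.sqrt(max(var, 1e-12))
            weight[k] = nk / len(scores)
    return mu, sigma, weight, max_iter


# Synthetic data: two well-separated clusters (made up for the demo).
random.seed(0)
scores = [random.gauss(0.0, 1.0) for _ in range(300)]
scores += [random.gauss(6.0, 1.0) for _ in range(300)]
for tol in (1e-2, 1e-8):
    mu, sigma, weight, n_iter = fit_two_gaussians(scores, tolerance=tol)
    print(f"tolerance={tol:g}: stopped after {n_iter} iterations")
```

A looser tolerance stops the loop earlier at the cost of a less precisely converged fit; a tighter one runs longer.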
Examples
Regular MS/MS search with MSGF+ engine
Suppose the MS/MS search result from MSGF+ is saved in .tsv format as data/sample.tsv. Run the FDR estimation with the following command:
decoyfree-msfdr -f data/sample.tsv -t tsv --out_dir results/sample --threads 10
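If you drive the tool from a Python script, the same command can be assembled as an argument list. This is a minimal sketch: the helper name `build_msfdr_command` is hypothetical, and only flags shown in the usage above are used. The command is printed rather than executed.

```python
import shlex


def build_msfdr_command(input_file, input_type="tsv",
                        out_dir="results/sample", threads=10):
    """Assemble the decoyfree-msfdr command line as an argv list."""
    return [
        "decoyfree-msfdr",
        "-f", str(input_file),
        "-t", input_type,
        "--out_dir", str(out_dir),
        "--threads", str(threads),
    ]


argv = build_msfdr_command("data/sample.tsv")
print(shlex.join(argv))
# To actually run it (requires the package to be installed):
#   import subprocess; subprocess.run(argv, check=True)
```

Passing a list to `subprocess.run` avoids shell quoting problems when paths contain spaces.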
Multiple search results
If you have multiple search results from MSGF+ saved in the 'data/sample/' directory and want to use them all together to build a single model, run:
decoyfree-msfdr -d data/sample/ -t tsv --out_dir results/sample --threads 10
Note: this searches the directory for all files with the .tsv extension. If you specify another format, it searches for files with the corresponding extension.
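To preview which files a directory run would pick up, you can list them by extension yourself. The sketch below assumes a non-recursive, case-sensitive match on the file extension; the tool's actual matching rules may differ, and `find_search_results` is a hypothetical helper, not part of the package.

```python
import tempfile
from pathlib import Path


def find_search_results(input_dir, input_type="tsv"):
    """Return files directly inside input_dir whose extension matches input_type."""
    suffix = "." + input_type
    return sorted(
        p for p in Path(input_dir).iterdir()
        if p.is_file() and p.suffix == suffix
    )


# Demo in a throwaway directory with made-up file names.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("run1.tsv", "run2.tsv", "notes.txt", "run3.idXML"):
        (Path(tmp) / name).touch()
    print([p.name for p in find_search_results(tmp, "tsv")])
    print([p.name for p in find_search_results(tmp, "idXML")])
```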
MS/MS search results with other engines
If you are using another search engine, please specify the following options:
Option | Default | Description |
---|---|---|
--score_field | EValue | The score field used to model the data |
--log_scale | True | Whether to model on the log scale of the data |
--neg_score | True | Whether to negate the score. The model assumes that a higher score is better. On the log scale, negation is applied after taking the log |
--spec_ref_fields | "#SpecFile,SpecID" | Comma separated fields to identify a spectrum uniquely |
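The sketch below shows one plausible reading of how these options combine for an E-value-like score: take the log, negate it so that a higher value is better, and group rows by the fields that identify a spectrum. The natural logarithm, the function names, the top-score selection, and the sample rows are all assumptions made for illustration; the package's internals may differ.

```python
import csv
import io
import math
from collections import defaultdict


def to_model_score(raw, log_scale=True, neg_score=True):
    """Apply the log (assumed natural log) first, then the negation."""
    value = math.log(raw) if log_scale else raw
    return -value if neg_score else value


def top_scores_per_spectrum(tsv_text, score_field="EValue",
                            spec_ref_fields="#SpecFile,SpecID", n_top=2):
    """Group rows by spectrum and keep the n_top best transformed scores."""
    key_fields = spec_ref_fields.split(",")
    grouped = defaultdict(list)
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    for row in reader:
        key = tuple(row[field] for field in key_fields)
        grouped[key].append(to_model_score(float(row[score_field])))
    return {key: sorted(vals, reverse=True)[:n_top]
            for key, vals in grouped.items()}


# Made-up rows using the default field names from the table above.
sample = (
    "#SpecFile\tSpecID\tPeptide\tEValue\n"
    "run1.mzML\tscan=1\tPEPTIDEK\t1e-10\n"
    "run1.mzML\tscan=1\tPEPTLDEK\t1e-3\n"
    "run1.mzML\tscan=1\tAEPTIDEK\t0.5\n"
    "run1.mzML\tscan=2\tSAMPLER\t1e-6\n"
)
for spectrum, scores in top_scores_per_spectrum(sample).items():
    print(spectrum, scores)
```

With both options enabled, a smaller E-value maps to a larger model score, which matches the "higher score means better" convention described in the table.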
XL-MS/MS search with OpenPepXLLF engine
Suppose the XL-MS/MS search result from OpenPepXLLF is saved in .idXML format as data/sample.idXML. Run the FDR estimation with the following command:
decoyfree-xlmsfdr -f data/sample.idXML -t idXML --out_dir results/sample --threads 10
Note: Currently, idXML is the only supported format. If you need another format, please let us know by opening an issue, and we will add support for it.
XL-MS/MS search results with other engines
If you are using another search engine, please specify the following options:
Option | Default | Description |
---|---|---|
--score_field | "OpenPepXL:score" | The score field used to model the data |
--log_scale | False | Whether to model on the log scale of the data |
--neg_score | False | Whether to negate the score. On the log scale, negation is applied after taking the log |