Test-time training for deep MS/MS spectrum prediction improves peptide identification
Get Started
pip install pept3
To get to know pept3, follow the next section to run it on the demo data.
Set up locally
Clone this repo with:
git clone https://github.com/gusye1234/pept3.git
# fetch the pre-trained model weights:
git lfs install
git lfs pull
# install pept3 to python environment
pip install -e .
# run pept3 like an installed command
pept3 ./examples/demo_data/demo_input.tab --spmodel=prosit --similarity=SA --output_tab=./examples/demo_data/demo_out.tab --need_tensor --output_tensor=./examples/demo_data/tensor.hdf5
The last command performs a simple test-time training over Prosit (--spmodel=prosit) with Spectral Angle (--similarity=SA).
The program takes ./examples/demo_data/demo_input.tab as the input file. The tuned features are then written to ./examples/demo_data/demo_out.tab, which is ready for downstream tasks, for example as the input of Percolator:
cd examples
bash ./percolator_demo.sh # rescoring over the tuned features set
# the result will be saved in ./examples/percolator_result
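For readers unfamiliar with the Spectral Angle selected by --similarity=SA, the following is a minimal sketch of the normalized spectral angle as it is commonly defined for Prosit-style spectrum comparison. It is an illustration under that assumption, not PepT3's own code; the function name spectral_angle is hypothetical, and PepT3's implementation may handle masking or missing ions differently.

```python
import math


def spectral_angle(predicted, observed):
    """Normalized spectral angle between two intensity vectors.

    Returns 1.0 for vectors pointing in the same direction and 0.0 for
    orthogonal ones. Hypothetical helper, not part of the pept3 package.
    """
    if len(predicted) != len(observed):
        raise ValueError("intensity vectors must have the same length")
    norm_p = math.sqrt(sum(x * x for x in predicted))
    norm_o = math.sqrt(sum(x * x for x in observed))
    if norm_p == 0.0 or norm_o == 0.0:
        return 0.0  # assumption: treat an empty spectrum as no similarity
    cosine = sum(p * o for p, o in zip(predicted, observed)) / (norm_p * norm_o)
    cosine = max(-1.0, min(1.0, cosine))  # guard against rounding error
    return 1.0 - 2.0 * math.acos(cosine) / math.pi


# Intensities taken from the peak_inten example, compared with a scaled copy
print(spectral_angle([829.0, 4154.0, 168.0], [82.9, 415.4, 16.8]))
```

Because both vectors are L2-normalized before the angle is taken, the score depends only on the relative intensity pattern, not on the absolute scale.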
A Python script for the above demo commands is also available:
cd examples
python pept3_demo.py
You should get identical results. The script pept3_demo.py demonstrates how PepT3 works inside Python.
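If you only need to drive the command-line tool from Python, one option is to assemble the same command shown earlier and hand it to subprocess. The sketch below uses only the flags that appear in this README; the helper name build_pept3_command is hypothetical, and this does not reproduce what pept3_demo.py does internally.

```python
import shlex


def build_pept3_command(input_tab, output_tab, spmodel="prosit",
                        similarity="SA", output_tensor=None):
    """Build the argv list for the pept3 CLI (hypothetical helper)."""
    cmd = [
        "pept3",
        input_tab,
        f"--spmodel={spmodel}",
        f"--similarity={similarity}",
        f"--output_tab={output_tab}",
    ]
    # Only request the tuned tensor when an output path is given
    if output_tensor is not None:
        cmd += ["--need_tensor", f"--output_tensor={output_tensor}"]
    return cmd


cmd = build_pept3_command(
    "./examples/demo_data/demo_input.tab",
    "./examples/demo_data/demo_out.tab",
    output_tensor="./examples/demo_data/tensor.hdf5",
)
print(shlex.join(cmd))
# To actually run it (requires pept3 to be installed):
#   import subprocess; subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting issues with paths that contain spaces.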
Input Format
PepT3 expects a tab-delimited file as the input, just like Percolator. Each row should contain the features associated with a single PSM:
SpecId <tab> Label <tab> ScanNr <tab> peak_ions <tab> peak_inten <tab> ... Charge <tab> <tab> Peptide <tab>
For PepT3, the input tab file should include at least these fields:
- SpecId (any type): Unique id for each PSM.
- ScanNr (any type): Same meaning as in Percolator.
- Label ({1, -1}): 1 for target PSMs, -1 for decoys.
- peak_ions: ;-delimited matched ions for the PSM. Only b/y types are considered currently. For example, b10;b2;b3;
- peak_inten: Intensities corresponding to the matched ions, also ;-delimited. For example, 829;4154;168;
- Charge (int): Precursor charge.
- collision_energy_aligned_normed (float, [0,1]): Maximum-normalized NCE.
- Peptides (str)
For an input example, have a look at ./examples/demo_data/demo_input.tab.
Please note: PepT3 will automatically merge any feature that is not on the above list into the output tab file.
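Before running PepT3 on your own data, it can help to check that a tab file carries the required columns and that the ion and intensity lists line up. The following is a minimal sketch of such a check using only the standard library. The helper names and the sample row values (other than the ion and intensity examples above) are made up for illustration, and the check is not part of the pept3 package.

```python
import csv
import io

# Required columns as listed above. The peptide column is not checked here;
# confirm its exact header against the demo input file.
REQUIRED = ["SpecId", "ScanNr", "Label", "peak_ions", "peak_inten",
            "Charge", "collision_energy_aligned_normed"]


def parse_peaks(row):
    """Pair each matched ion with its intensity (trailing ';' is tolerated)."""
    ions = [t for t in row["peak_ions"].split(";") if t]
    intens = [float(t) for t in row["peak_inten"].split(";") if t]
    if len(ions) != len(intens):
        raise ValueError(f"{row['SpecId']}: ion/intensity count mismatch")
    return list(zip(ions, intens))


def check_rows(handle):
    """Validate a tab-delimited PSM table and return (SpecId, Label, peaks)."""
    reader = csv.DictReader(handle, delimiter="\t")
    missing = [c for c in REQUIRED if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing required columns: {missing}")
    rows = []
    for row in reader:
        if row["Label"] not in {"1", "-1"}:
            raise ValueError(f"{row['SpecId']}: Label must be 1 or -1")
        nce = float(row["collision_energy_aligned_normed"])
        if not 0.0 <= nce <= 1.0:
            raise ValueError(f"{row['SpecId']}: NCE must lie in [0, 1]")
        rows.append((row["SpecId"], int(row["Label"]), parse_peaks(row)))
    return rows


# Made-up sample row; only the ion and intensity strings come from the README
sample = (
    "SpecId\tLabel\tScanNr\tpeak_ions\tpeak_inten\tCharge\t"
    "collision_energy_aligned_normed\n"
    "psm_1\t1\t100\tb10;b2;b3;\t829;4154;168;\t2\t0.3\n"
)
rows = check_rows(io.StringIO(sample))
print(rows)
```

To check a real file, pass an open file handle instead of the in-memory sample, for example `open("./examples/demo_data/demo_input.tab", newline="")`.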
Output Format
PepT3 outputs a tab-delimited file in which each row contains the enlarged feature set associated with a single PSM. The output tab file can be used directly as the input of Percolator. The meaning of each feature is described in ./FEATURES.txt.
For those who want to inspect the tuned spectrum predictions, use the --need_tensor option and set --output_tensor. The predictions will be stored in hdf5 format, with columns SpecId and tuned-tensor.
For an output example, have a look at ./examples/demo_data/demo_out.tab and ./examples/demo_data/tensor.hdf5.