
A proteomics search engine for LC-MS1 spectra.


ms1searchpy - a DirectMS1 proteomics search engine for LC-MS1 spectra

A .tsv (or mzML) file and a .fasta file are required for basic operation of the script. The .tsv file is a tab-separated text file with peptide features generated from an mzML file by Dinosaur (J. Teleman et al., "Dinosaur: A Refined Open-Source Peptide MS Feature Detector", JPR 2016) or Biosaur (https://github.com/abdrakhimov1/Biosaur). It can also be produced by any other peak-picking software, provided it contains the columns 'massCalib', 'rtApex', 'charge' and 'nIsotopes'. For convenience, mzML files can be passed directly, in which case the script runs a bundled version of Dinosaur (an installed Java is required).
To make use of retention time, the user can install the ELUDE prediction algorithm and pass -elude path_to_elude_binary in the parameters. For the most efficient use of retention time, the user can install the DeepLC prediction algorithm and pass -deeplc path_to_deeplc_binary in the parameters.
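
When the feature file comes from software other than Dinosaur or Biosaur, it can help to check the required columns before starting a search. The following is a minimal sketch using only the standard library; the helper name missing_feature_columns is hypothetical and is not part of ms1searchpy.

```python
import csv

# Column names required by ms1searchpy, as listed in the description above.
REQUIRED_COLUMNS = ("massCalib", "rtApex", "charge", "nIsotopes")


def missing_feature_columns(path):
    """Return the required columns that are absent from a tab-separated feature file."""
    with open(path, newline="") as handle:
        # Only the header row is needed; an empty file yields an empty header.
        header = next(csv.reader(handle, delimiter="\t"), [])
    return [name for name in REQUIRED_COLUMNS if name not in header]
```

An empty list means all four columns are present; any returned names are the columns that still have to be added or renamed.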

The algorithm can be run with the following command:

ms1searchpy path_to_MZML -d path_to_fasta

OR

ms1searchpy path_to_peptideFeatures -d path_to_fasta

The script writes the following output files: all identified proteins (filename_proteins_full.csv), filtered proteins (filename_proteins.csv), all matched peptide match fingerprints (filename_PFMs.csv), the matched peptide match fingerprints with features prepared for machine learning (filename_PFMs_ML.csv), and a log file with the estimated mass and RT accuracies (filename_log.txt).
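
The output names above follow a common pattern, so a run can be checked for completeness with a short script. This is a sketch under assumptions: the helpers expected_outputs and missing_outputs are hypothetical, and the caller must supply the base path (the "filename" part), since how it is derived from the input file is not specified here.

```python
from pathlib import Path

# Suffixes taken from the list of output files above.
OUTPUT_SUFFIXES = {
    "proteins_full": "_proteins_full.csv",
    "proteins": "_proteins.csv",
    "pfms": "_PFMs.csv",
    "pfms_ml": "_PFMs_ML.csv",
    "log": "_log.txt",
}


def expected_outputs(base):
    """Map each output kind to the path built from the given base path."""
    return {kind: Path(str(base) + suffix) for kind, suffix in OUTPUT_SUFFIXES.items()}


def missing_outputs(base):
    """Return the output kinds whose files do not exist on disk."""
    return [kind for kind, path in expected_outputs(base).items() if not path.exists()]
```

For example, missing_outputs("results/sample1") returns an empty list once all five files for that base path have been written.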

Citing ms1searchpy

Ivanov et al. DirectMS1: MS/MS-free identification of 1000 proteins of cellular proteomes in 5 minutes. https://doi.org/10.1021/acs.analchem.9b05095

Installation

Using pip:

pip install ms1searchpy

Example of a full installation and usage:

pip3 install ms1searchpy
pip3 install deeplc
pip3 install biosaur

Convert raw files to mzML:

msconvert.dock path_to_file.raw -o path_to_output_folder --mzML --filter "peakPicking true 1-" --filter "MS2Deisotope" --filter "zeroSamples removeExtra" --filter "threshold absolute 1 most-intense"

Extract features from mzML:

biosaur path_to_mzml

OR, for FAIMS data:

biosaur path_to_mzml --faims

OR, for negative ion data:

biosaur path_to_mzml --negative_mode
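
When many mzML files are processed, the choice between these Biosaur invocations can be scripted. The sketch below only assembles the argument list from the flags shown above; the helper name biosaur_command is hypothetical, and whether --faims and --negative_mode may be combined is an assumption that should be checked against the Biosaur documentation.

```python
def biosaur_command(mzml_path, faims=False, negative_mode=False):
    """Build the biosaur argument list for one mzML file."""
    command = ["biosaur", str(mzml_path)]
    if faims:
        command.append("--faims")  # FAIMS data
    if negative_mode:
        command.append("--negative_mode")  # negative ion data
    return command
```

The resulting list can be passed to subprocess.run(command, check=True) to run feature detection for that file.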

Prepare a decoy database; the decoys must be shuffled. Python code example:

from pyteomics import fasta

# Write the target entries plus their shuffled decoys to a new FASTA file.
fasta.write_decoy_db(source='/home/test/sprot_human.fasta', output=open('/home/test/sprot_human_shuffled.fasta', 'w'), mode='shuffle').close()
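
To illustrate what a shuffled decoy is, the following simplified sketch permutes the residues of each target sequence and marks the copy with a prefix. It is not the pyteomics implementation; the function names are hypothetical, and the fixed seed is an assumption made only so that the result is reproducible.

```python
import random


def shuffled_decoy(sequence, rng):
    """Return a random permutation of the residues in a protein sequence."""
    residues = list(sequence)
    rng.shuffle(residues)
    return "".join(residues)


def add_shuffled_decoys(entries, prefix="DECOY_", seed=0):
    """Yield each (description, sequence) target followed by its shuffled decoy."""
    rng = random.Random(seed)  # fixed seed: the same input gives the same decoys
    for description, sequence in entries:
        yield description, sequence
        yield prefix + description, shuffled_decoy(sequence, rng)
```

Each decoy keeps the length and amino acid composition of its target, so the decoy database has the same size as the target database.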

Alternatively, use the -ad 1 option of ms1searchpy to create the decoy database automatically.

Run the DirectMS1 search:

ms1searchpy path_to.features.tsv -d path_to_shuffled.fasta -sc 2 -i 2 -nproc 9 -mc 0 -cmin 1 -ptol 8 -fdr 1 -deeplc ~/virtualenv_deeplc/bin/deeplc -ts 2 -ml 1
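
For batch processing, the same command can be assembled in Python. This is a minimal sketch: the option values are copied from the example command above and treated as opaque flag/value pairs, and the helper name ms1searchpy_command is hypothetical.

```python
import os

# Flag/value pairs copied from the example command above; adjust per experiment.
EXAMPLE_OPTIONS = {
    "-sc": 2, "-i": 2, "-nproc": 9, "-mc": 0, "-cmin": 1,
    "-ptol": 8, "-fdr": 1, "-ts": 2, "-ml": 1,
}


def ms1searchpy_command(features_path, fasta_path, options=None, deeplc_path=None):
    """Build the ms1searchpy argument list for one feature file."""
    command = ["ms1searchpy", str(features_path), "-d", str(fasta_path)]
    for flag, value in (options or {}).items():
        command += [flag, str(value)]
    if deeplc_path is not None:
        # "~" is not expanded without a shell, so expand it here.
        command += ["-deeplc", os.path.expanduser(str(deeplc_path))]
    return command
```

The list can be passed to subprocess.run(command, check=True); because no shell is involved, the path to the DeepLC binary is expanded explicitly.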

Enjoy!

Dependencies

  • pyteomics
  • numpy
  • scipy
  • sklearn
  • lightgbm
  • pandas
  • biosaur
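
Whether these dependencies are installed can be checked with the standard library. The sketch below looks up distribution names rather than import names; it assumes that sklearn in the list above refers to the scikit-learn distribution, and the helper name installed_versions is hypothetical.

```python
from importlib import metadata

# Distribution names as installed by pip; "scikit-learn" provides the sklearn module.
DISTRIBUTIONS = ["pyteomics", "numpy", "scipy", "scikit-learn", "lightgbm", "pandas", "biosaur"]


def installed_versions(names=DISTRIBUTIONS):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions
```

Entries with a value of None are the packages that still need to be installed.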


