Python Package for running custom protein inference algorithms on tab-formatted tandem MS/MS search results.

Py Protein Inference

PyProteinInference is a Python package for running various protein inference algorithms on tandem mass spectrometry search results and generating protein-to-peptide mappings with protein-level false discovery rates.
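
To make "protein-level false discovery rate" concrete, here is a minimal sketch of a generic target-decoy q-value calculation. It is not the package's implementation: the function name, the substring test against the decoy symbol, and the `decoys / targets` estimate are assumptions chosen for illustration.

```python
def target_decoy_q_values(scored_proteins, decoy_symbol="decoy_"):
    """Return (identifier, score, q_value) tuples, best score first.

    scored_proteins: iterable of (identifier, score); a higher score is better.
    Assumption: an identifier is a decoy if it contains decoy_symbol.
    """
    ranked = sorted(scored_proteins, key=lambda item: item[1], reverse=True)

    # Running FDR estimate at each rank: decoys seen so far / targets seen so far.
    targets = decoys = 0
    fdrs = []
    for identifier, _score in ranked:
        if decoy_symbol in identifier:
            decoys += 1
        else:
            targets += 1
        fdrs.append(decoys / max(targets, 1))

    # q-value: the smallest FDR at this rank or at any lower-scoring rank,
    # which makes q-values non-decreasing down the list.
    q_values = []
    running_min = float("inf")
    for fdr in reversed(fdrs):
        running_min = min(running_min, fdr)
        q_values.append(running_min)
    q_values.reverse()

    return [(ident, score, q) for (ident, score), q in zip(ranked, q_values)]
```

Filtering the returned list at a q-value threshold (for example 0.01) would then keep only the proteins accepted at that estimated error rate.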

Key Features

Protein Inference and Scoring:
- Maps peptides to proteins.
- Generates protein scores from provided PSMs.
- Calculates set-based protein-level false discovery rates for MS data filtering.
Supported Input Formats:
- Search Result File Types: idXML, mzIdentML, or pepXML.
- PSM files from Percolator.
- Custom tab-delimited files.
Output:
- User-friendly CSV file containing Proteins, Peptides, q-values, and Protein Scores.
Supported Inference Procedures:
- Parsimony - Returns the minimal set of proteins that explains the input peptides.
- Exclusion - Removes all non-distinguishing peptides on the protein level.
- Inclusion - Returns all possible proteins.
- Peptide Centric - Returns protein groups based on peptide assignments.
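
The differences between the inference procedures listed above can be sketched on a toy peptide-to-protein map. This is an illustration only, not the package's code: the function names are hypothetical, and parsimony is approximated here with a greedy set cover rather than an exact minimal-set solver.

```python
from collections import defaultdict


def invert(peptide_to_proteins):
    """Turn {peptide: {proteins}} into {protein: {peptides}}."""
    protein_to_peptides = defaultdict(set)
    for peptide, proteins in peptide_to_proteins.items():
        for protein in proteins:
            protein_to_peptides[protein].add(peptide)
    return dict(protein_to_peptides)


def inclusion(peptide_to_proteins):
    """Report every possible protein with every peptide that maps to it."""
    return invert(peptide_to_proteins)


def exclusion(peptide_to_proteins):
    """Drop peptides shared between proteins; keep only distinguishing ones."""
    unique = {
        peptide: proteins
        for peptide, proteins in peptide_to_proteins.items()
        if len(proteins) == 1
    }
    return invert(unique)


def greedy_parsimony(peptide_to_proteins):
    """Greedy set cover: a small (not guaranteed minimal) explaining set."""
    candidates = invert(peptide_to_proteins)
    # Only peptides with at least one protein can ever be covered.
    uncovered = {pep for pep, prots in peptide_to_proteins.items() if prots}
    chosen = {}
    while uncovered:
        # Pick the protein covering the most uncovered peptides;
        # break ties alphabetically so the result is deterministic.
        best = min(candidates, key=lambda p: (-len(candidates[p] & uncovered), p))
        chosen[best] = candidates[best]
        uncovered -= candidates[best]
    return chosen
```

For the map `{"PEPA": {"P1"}, "PEPB": {"P1", "P2"}, "PEPC": {"P2", "P3"}, "PEPD": {"P3"}}`, inclusion reports all three proteins, exclusion keeps only P1 and P3 with their unique peptides, and the greedy parsimony sketch selects P1 and P3 because together they explain all four peptides.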

Requirements

Python 3.9 or greater.
Python Packages: numpy, pyteomics, pulp, PyYAML, matplotlib, pyopenms, lxml, tqdm, pywebview, nicegui. pip installs these automatically along with the package.

Quick Start Guide

Install the package using pip

pip install pyproteininference

Running the command line tool

To run the CLI tool, either call protein_inference_cli.py directly:

protein_inference_cli.py --help

Alternatively, run the script through your Python interpreter.

First, locate the script that pip installed:

which protein_inference_cli.py
/path/to/venv/bin/protein_inference_cli.py

Then pass that path to your Python interpreter:

python /path/to/venv/bin/protein_inference_cli.py --help

Optionally, download the protein_inference_cli.py file from the GitHub repo here: https://github.com/thinkle12/pyproteininference/blob/master/scripts/protein_inference_cli.py

Then run the downloaded script with your Python interpreter as shown above.
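
The locate-then-run steps above can also be done from Python. The sketch below uses only the standard library; `cli_command` is a hypothetical helper name, and it assumes the script is on the PATH of the environment where the package was installed.

```python
import shutil
import sys


def cli_command(script_name="protein_inference_cli.py", extra_args=("--help",)):
    """Build a command that runs an installed script with the current interpreter."""
    # shutil.which plays the role of the shell's `which` lookup.
    script_path = shutil.which(script_name)
    if script_path is None:
        raise FileNotFoundError(
            f"{script_name} was not found on PATH; "
            "is the package installed in this environment?"
        )
    # sys.executable is the interpreter running this code, so the script
    # runs inside the same virtual environment.
    return [sys.executable, script_path, *extra_args]
```

Passing the returned list to `subprocess.run(..., check=True)` would then print the CLI help text.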

Running the graphical user interface

To run the GUI tool, either call protein_inference_gui.py directly:

protein_inference_gui.py

Or, as with the CLI, run the script through your Python interpreter.

First, locate the script that pip installed:

which protein_inference_gui.py
/path/to/venv/bin/protein_inference_gui.py

Then pass that path to your Python interpreter:

python /path/to/venv/bin/protein_inference_gui.py

Again, you can optionally download the protein_inference_gui.py file from the GitHub repo here: https://github.com/thinkle12/pyproteininference/blob/master/scripts/protein_inference_gui.py

Then run the downloaded script with your Python interpreter as shown above.

Executables

You can also download a standalone executable version of the GUI for both Windows and macOS from the releases page on GitHub: https://github.com/thinkle12/pyproteininference/releases

When launching the GUI from an executable, please wait for the user interface to appear. This usually takes a minute or so.

More Options for calling the CLI

Run the standard command line from an idXML file

protein_inference_cli.py \
-f /path/to/target/file.idXML \
-db /path/to/database/file.fasta \
-y /path/to/params.yaml

Run the standard command line from an mzIdentML file

protein_inference_cli.py \
-f /path/to/target/file.mzid \
-db /path/to/database/file.fasta \
-y /path/to/params.yaml

Run the standard command line from a pepXML file

protein_inference_cli.py \
-f /path/to/target/file.pep.xml \
-db /path/to/database/file.fasta \
-y /path/to/params.yaml

Run the standard command line tool on tab-delimited results taken directly from Percolator. If no parameter file is specified, peptide-centric inference is used by default:

protein_inference_cli.py \
-t /path/to/target/file.txt \
-d /path/to/decoy/file.txt \
-db /path/to/database/file.fasta

Specifying Parameters. The two most common parameters to change are the inference type and the decoy symbol (used to distinguish decoy proteins from target proteins). To change them, create a file called params.yaml as follows:

parameters:
  inference:
    inference_type: parsimony
  identifiers:
    decoy_symbol: "decoy_"

The inference type can be one of: parsimony, peptide_centric, inclusion, exclusion, or first_protein. All parameters are optional, so you only need to define the ones you want to alter. Parameters that are not defined are set to default values. See the package documentation for the default parameters.
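
If you generate parameter files from a script, a small helper can reject an unsupported inference type before the file is written. This is a minimal sketch: `render_params` is a hypothetical helper, it covers only the two parameters shown above, and it builds the YAML text by hand rather than using PyYAML.

```python
# Inference types as listed in this README.
INFERENCE_TYPES = (
    "parsimony",
    "peptide_centric",
    "inclusion",
    "exclusion",
    "first_protein",
)


def render_params(inference_type, decoy_symbol):
    """Return params.yaml text for the two most commonly changed parameters."""
    if inference_type not in INFERENCE_TYPES:
        raise ValueError(
            f"unknown inference_type {inference_type!r}; "
            f"expected one of {', '.join(INFERENCE_TYPES)}"
        )
    # Assumes decoy_symbol contains no double quotes or backslashes.
    return (
        "parameters:\n"
        "  inference:\n"
        f"    inference_type: {inference_type}\n"
        "  identifiers:\n"
        f'    decoy_symbol: "{decoy_symbol}"\n'
    )
```

Writing the result with `pathlib.Path("params.yaml").write_text(render_params("parsimony", "decoy_"))` would produce the same file as the example above.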

Run the standard command line tool again, this time specifying the parameters as above:

protein_inference_cli.py \
-t /path/to/target/file.txt \
-d /path/to/decoy/file.txt \
-db /path/to/database/file.fasta \
-y /path/to/params.yaml
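
When the CLI is driven from a pipeline, the invocations above can be assembled programmatically. The sketch below uses only the flags shown in this README (-f, -t, -d, -db, -y). `build_cli_args` is a hypothetical helper, and treating -f as an alternative to the -t/-d pair is an assumption based on the examples, not a documented rule.

```python
def build_cli_args(database, params=None, combined=None, target=None, decoy=None):
    """Return the argument list for one protein_inference_cli.py run.

    Use either `combined` (-f) or both `target` (-t) and `decoy` (-d).
    """
    has_pair = target is not None or decoy is not None
    if combined is not None and has_pair:
        raise ValueError("use either combined (-f) or target/decoy (-t/-d), not both")
    if combined is None and (target is None or decoy is None):
        raise ValueError("provide combined (-f), or both target (-t) and decoy (-d)")

    args = ["protein_inference_cli.py"]
    if combined is not None:
        args += ["-f", combined]
    else:
        args += ["-t", target, "-d", decoy]
    args += ["-db", database]
    # Without -y the tool falls back to its defaults (peptide-centric inference).
    if params is not None:
        args += ["-y", params]
    return args
```

The returned list can be passed to `subprocess.run`, which avoids shell quoting problems with paths that contain spaces.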

Running with Docker
- Either pull the image from Docker Hub:
  - docker pull hinklet/pyproteininference:latest
- Or build the image yourself after cloning the repository:
  - git clone REPOSITORY_URL
  - cd pyproteininference
  - docker build -t pyproteininference:latest .
- Run the tool, making sure to volume mount the directory that contains your input data and parameters. In the example below, the local directory is /path/to/local/directory and it is mounted in the container at /data:
```
docker run -v /path/to/local/directory/:/data \
-it hinklet/pyproteininference:latest \
python /usr/local/bin/protein_inference_cli.py \
-f /data/input_file.txt \
-db /data/database.fasta \
-y /data/parameters.yaml \
-o /data/
```

Building the Bundled Application Package using PyInstaller

Note: This is only necessary if you want to build the application package yourself. The package is already available on PyPI and can be installed using pip, and bundled executables can be downloaded from the releases page on GitHub (https://github.com/thinkle12/pyproteininference/releases).

After cloning the source code repository, create a new Python virtual environment under the project directory:

python -m venv venv

Activate the virtual environment:

source venv/bin/activate

Install the required packages:

pip install -r requirements.txt pyinstaller==6.11.1

Run the PyInstaller command to build the executable:

pyinstaller pyProteinInference.spec

The executable will be located in the dist directory.

Documentation

For more information please see the full package documentation (https://thinkle12.github.io/pyproteininference/).

Release history

1.1.1 (this version): Jan 19, 2025
1.1.0: Jan 17, 2025
1.0.1: Aug 8, 2024
1.0.0: Jun 21, 2022

Download files

Source Distribution

pyproteininference-1.1.1.tar.gz (660.4 kB), uploaded Jan 19, 2025, Source

Built Distribution

pyproteininference-1.1.1-py3-none-any.whl (79.0 kB), uploaded Jan 19, 2025, Python 3

File details

Details for the file pyproteininference-1.1.1.tar.gz.

File metadata

Download URL: pyproteininference-1.1.1.tar.gz
Upload date: Jan 19, 2025
Size: 660.4 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.0.1 CPython/3.9.21

File hashes

Hashes for pyproteininference-1.1.1.tar.gz
Algorithm	Hash digest
SHA256	`498dc46a23300599c7fc86d3f307567c3eee9ad4f0d69f70035ea7b227f0ec41`
MD5	`0c26c3effd5e8d03c6123795315a3769`
BLAKE2b-256	`11db65f81f00dbb7ffa2f70919379c187efdec74220fa343290925110a06fc24`

File details

Details for the file pyproteininference-1.1.1-py3-none-any.whl.

File metadata

Download URL: pyproteininference-1.1.1-py3-none-any.whl
Upload date: Jan 19, 2025
Size: 79.0 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.0.1 CPython/3.9.21

File hashes

Hashes for pyproteininference-1.1.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`605db8b50a1f489682c872d57799b21b07957a741c936556b7bfe5d1b09bfde8`
MD5	`0f00dd6d0e7b4c1e4bf80115c0aedfa4`
BLAKE2b-256	`ab21c8c1503d15d9c54415f749edc20e67b63e5476f366afd978711e0b71d0d5`
