
Python utility for TMT-based proteomics

Reason this release was yanked:

use 25.11.1 instead

Project description

TMTCrunch is an open-source Python utility for tandem mass tag proteomics.

Overview

TMTCrunch is designed primarily to analyze products of alternative splicing in TMT (tandem mass tag) proteomics and phospho-proteomics data. TMTCrunch performs:

  • per-channel normalization;
  • normalization across channels using inherent or virtual global internal standard (GIS) channels as a reference;
  • optional grouping of PSMs according to user-defined rules;
  • global or per-group FDR filtering;
  • calculation of abundance at any level: unmodified peptide, peptide with modifications, protein, gene.
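To make the two normalization steps concrete, here is a minimal sketch in plain Python. It is an illustration under assumptions, not TMTCrunch's actual implementation: the function names, the median-based scaling, and the toy intensities are all hypothetical, and the real tool may, for example, work on a log scale.

```python
from statistics import median


def normalize_channels(rows):
    """Scale each channel (column) so that all channels share the same median.

    Assumption: median scaling stands in for whatever TMTCrunch really does.
    """
    n_channels = len(rows[0])
    medians = [median(row[i] for row in rows) for i in range(n_channels)]
    target = sum(medians) / n_channels
    return [
        [value * target / medians[i] for i, value in enumerate(row)]
        for row in rows
    ]


def ratio_to_reference(rows, reference_idx):
    """Express every channel relative to the mean of the reference channels.

    `reference_idx` may point at real GIS channels or, for a simulated
    ("virtual") GIS, at a selection of specimen channels.
    """
    result = []
    for row in rows:
        ref = sum(row[i] for i in reference_idx) / len(reference_idx)
        result.append([value / ref for value in row])
    return result


# Toy intensities: 3 PSMs x 3 channels; channel 2 plays the role of a GIS channel.
psm_intensities = [
    [100.0, 200.0, 50.0],
    [110.0, 220.0, 55.0],
    [90.0, 180.0, 45.0],
]
normalized = normalize_channels(psm_intensities)
relative = ratio_to_reference(normalized, reference_idx=[2])
```

In this toy data the channels differ only by a constant loading factor, so after both steps every value in `relative` is close to 1.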

TMTCrunch can be used with the Sage search engine or with IdentiPy/Scavager.

Figure: TMTCrunch workflow

Installation

Installing from PyPI

The latest released version can be installed from the Python Package Index:

pip install tmtcrunch

Installing from source

The cutting-edge version can be installed directly from the source repository:

pip install git+https://codeberg.org/makc/tmtcrunch.git

Alternatively, clone the repo and install the package in development mode:

git clone https://codeberg.org/makc/tmtcrunch.git
pip install --editable tmtcrunch

Dependencies

TMTCrunch relies on the following Python packages:

It also uses statistics functions from the astropy package if available.

Command line options

usage: tmtcrunch [-h] [--cfg CFG] [--fasta FASTA] [--input-format {auto,scavager,sage}]
                 [--output-dir OUTPUT_DIR] [--output-prefix OUTPUT_PREFIX] [--phospho]
                 [--verbose {0,1,2}] [--show-config] [--version]
                 [fractions ...]

positional arguments:
  fractions             Scavager *_PSMs_full.tsv files or directories with Sage search results.

options:
  -h, --help            show this help message and exit
  --cfg CFG             Path to configuration file. Can be specified multiple times.
  --fasta FASTA         Path to protein fasta file for mapping protein to gene symbol.
  --input-format {auto,scavager,sage}
                        Format of input data. Supported: 'auto', 'scavager', 'sage'. Default is
                        'auto'
  --output-dir OUTPUT_DIR, --odir OUTPUT_DIR
                        Existing output directory. Default is current directory.
  --output-prefix OUTPUT_PREFIX, --oprefix OUTPUT_PREFIX
                        Prefix for output files. Default is 'tmtcrunch_'.
  --phospho             Enable common modifications for phospho-proteomics.
  --verbose {0,1,2}     Logging verbosity. Default is 1.
  --show-config         Show configuration and exit.
  --version             Output version information and exit.
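For scripted runs, the command line can be assembled in Python. The sketch below uses only options shown in the help text above; the helper name `build_tmtcrunch_command` and all file names are hypothetical.

```python
import shlex


def build_tmtcrunch_command(fractions, cfg_files=(), fasta=None, output_dir=None,
                            output_prefix=None, phospho=False, input_format=None):
    """Assemble an argv list using only options listed in the help text."""
    argv = ["tmtcrunch"]
    for cfg in cfg_files:  # --cfg can be specified multiple times
        argv += ["--cfg", cfg]
    if fasta:
        argv += ["--fasta", fasta]
    if input_format:
        argv += ["--input-format", input_format]
    if output_dir:
        argv += ["--output-dir", output_dir]
    if output_prefix:
        argv += ["--output-prefix", output_prefix]
    if phospho:
        argv.append("--phospho")
    argv += list(fractions)  # positional arguments go last
    return argv


# Hypothetical input files and configuration file.
argv = build_tmtcrunch_command(
    fractions=["fraction1_PSMs_full.tsv", "fraction2_PSMs_full.tsv"],
    cfg_files=["experiment.toml"],
    output_dir="results",
    phospho=True,
)
print(shlex.join(argv))
```

The resulting list can be passed to `subprocess.run(argv, check=True)` once TMTCrunch is installed. Note that the output directory must already exist.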

Configuration files

TMTCrunch stores its configuration in TOML format.

Default TMTCrunch configuration:

# Specimen columns.
specimen_columns = []
# Global internal standard (GIS) columns (for multi-batch experiments).
gis_columns = []
# Simulate GIS via selected specimen columns.
# Intended for single-batch experiments only!
simulate_gis = []

# Prefix of decoy proteins.
decoy_prefix = 'DECOY_'

# Path to protein fasta file for mapping protein to gene symbol.
fasta_file = ''

# List of column names from input files to save in the output.
keep_columns = []

# If true, perform PSM groupwise analysis.
groupwise = true

# Global false discovery rate. Can be overwritten per PSM group.
global_fdr = 0.01

# If true, respect peptide modifications and terminate analysis at peptide level.
with_modifications = false

# No modifications by default. Run TMTCrunch with --phospho argument
# to enable common modifications for phospho-proteomics.
[modification.universal]
[modification.selective]

# Keys below are only applicable if groupwise analysis is requested.
# Prefixes of target proteins. If not set, `target_prefixes` will be deduced
# from the prefixes of PSM groups.
# target_prefixes = ['alt_', 'canon_']

# Each PSM group is named after its subkey and defined by three keys:
# `descr` - group description
# `prefixes` - prefixes of target proteins
# `fdr` - groupwise false discovery rate. If not set, global FDR will be used.

# Isoform PSMs: protein group of each PSM consists of target proteins
# with 'alt_' prefix only and any decoy proteins.
[psm_group.isoform]
descr = 'Isoform PSMs'
prefixes = [['alt_']]
fdr = 0.05

# Canonical PSMs: protein group of each PSM consists of target proteins
# with 'canon_' prefix only and any decoy proteins.
[psm_group.canon]
descr = 'Canonical PSMs'
prefixes = [['canon_']]
fdr = 0.01

# Shared PSMs: protein group of each PSM consists both of
# 'canon_' and 'alt_' target proteins and any decoy proteins.
[psm_group.shared]
descr = 'Shared PSMs'
prefixes = [['canon_', 'alt_']]
fdr = 0.01
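The group definitions above can be read as a matching rule on the set of target prefixes found in each PSM's protein group. The following sketch shows one possible reading of that rule; it is an assumption, not TMTCrunch's code. In particular, it treats each inner list of `prefixes` as a set that must match exactly, it leaves decoy-only PSMs unassigned, and the protein names are made up.

```python
DECOY_PREFIX = "DECOY_"
GLOBAL_FDR = 0.01
TARGET_PREFIXES = ["alt_", "canon_"]

# Mirrors the default groups; `fdr` is omitted for "shared" here only to
# demonstrate the fallback to the global FDR.
PSM_GROUPS = {
    "isoform": {"prefixes": [["alt_"]], "fdr": 0.05},
    "canon": {"prefixes": [["canon_"]], "fdr": 0.01},
    "shared": {"prefixes": [["canon_", "alt_"]]},
}


def assign_group(proteins):
    """Return the name of the PSM group matching the protein list, or None."""
    # Decoy proteins are ignored when deciding which target prefixes occur.
    targets = [p for p in proteins if not p.startswith(DECOY_PREFIX)]
    present = {
        prefix
        for prefix in TARGET_PREFIXES
        if any(p.startswith(prefix) for p in targets)
    }
    for name, group in PSM_GROUPS.items():
        if any(present == set(prefix_set) for prefix_set in group["prefixes"]):
            return name
    return None  # e.g. decoy-only PSMs, which this sketch does not handle


def group_fdr(name):
    """Groupwise FDR threshold, falling back to the global FDR if unset."""
    return PSM_GROUPS[name].get("fdr", GLOBAL_FDR)


# Hypothetical protein groups for three PSMs.
examples = {
    "psm_a": ["alt_P1", "DECOY_canon_P2"],
    "psm_b": ["canon_P1"],
    "psm_c": ["canon_P1", "alt_P1"],
}
assigned = {psm: assign_group(proteins) for psm, proteins in examples.items()}
```

Here `psm_a` lands in the `isoform` group because its only target protein carries the `alt_` prefix, while `psm_c` is `shared` because both prefixes occur.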

Additional configuration for phospho-proteomics (use --phospho argument to enable):

with_modifications = true

# Modifications can be either universal or selective. PSMs for modified
# peptides with any universal modifications and the same pattern of selective
# modifications are treated together; PSMs for peptides with different patterns
# of selective modifications are treated separately.

[modification.universal.0]
name = "TMTpro"
# TMTpro 16plex
mass_delta = 304.207146
modX = "t"
# n-term, K
site = "^K"
variable = false

[modification.universal.1]
name = "TMTplex"
# TMT 6plex, 10plex, 11plex
mass_delta = 229.162932
modX = "t"
site = "^K"
variable = false

[modification.universal.2]
name = "Carboxyamidomethylation"
mass_delta = 57.021464
modX = "cam"
site = "C"
variable = false

[modification.universal.3]
name = "Oxidation"
mass_delta = 15.994915
modX = "ox"
site = "M"
variable = true

[modification.universal.4]
name = "Deamidation"
mass_delta = 0.984016
modX = "d"
site = "NQ"
variable = true

[modification.selective.1]
name = "Phosphorylation"
mass_delta = 79.966331
modX = "p"
site = "STY"

License

TMTCrunch is distributed under the three-clause BSD License.

Related software

  • AA_stat - utility for amino acid residue modification analysis.
  • Pyteomics - Python framework for proteomics data analysis.
  • IdentiPy - search engine for bottom-up proteomics.
  • Sage - proteomics search engine & quantification tool.
  • Scavager - proteomics post-search validation tool.
