xlms-tools

Description

xlms-tools is a set of command line tools to apply crosslinking mass spectrometry (XL-MS) data to protein structure models.

Setting up xlms-tools

xlms-tools can be installed directly from PyPI, as follows:

$ pip install xlms-tools

Alternatively, download or clone this repository and run the following from the project directory:

$ pip install dist/xlms_tools-1.0.7-py3-none-any.whl

Using xlms-tools

Currently, xlms-tools can be run in two modes. The first scores how well a protein structure model agrees with XL-MS data, which is specified as a list of crosslinks and monolinks derived from an XL-MS experiment. The second computes the depths of individual residues in protein structures.

To score how well a protein structure agrees with XL-MS data

  1. First, format crosslinks and monolinks into a text file with the following format:
98|A|147|A 5.1	#
72|A|161|A 4.3	# crosslinks: <residue # of a>|<chain of a>|<residue # of b>|<chain of b> <occupancy>
72|A|180|A 2.7	#
35|A 1.9	#
137|A 5.3	# monolinks: <residue # of a>|<chain of a> <occupancy>
97|A 2.6	#

where each line corresponds to either a crosslink or a monolink. Optionally, a numerical value can be appended at the end of each line to give the occupancy of that monolink or crosslink. For a detailed discussion of occupancy, please refer to the paper listed under Citations.
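For reference, the file format above can be read with a few lines of Python. This is a minimal sketch, not part of xlms-tools: the names `Link` and `parse_links` are hypothetical, and it assumes that anything after a `#` is an annotation to ignore and that a missing occupancy defaults to 1.0 (mirroring the default `linkweight` in the API section).

```python
from typing import List, NamedTuple


class Link(NamedTuple):
    """One parsed line (hypothetical helper type, not part of xlms-tools)."""
    sites: tuple      # (res, chain) for a monolink, (res_a, chain_a, res_b, chain_b) for a crosslink
    occupancy: float  # optional trailing value; assumed to default to 1.0


def parse_links(text: str, default_occupancy: float = 1.0) -> List[Link]:
    """Parse crosslink/monolink lines such as '98|A|147|A 5.1' or '35|A 1.9'."""
    links = []
    for raw in text.splitlines():
        # Assumption: everything after '#' is an annotation and can be dropped.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        parts = fields[0].split("|")
        if len(parts) not in (2, 4):
            raise ValueError(f"expected 2 or 4 '|'-separated fields, got: {raw!r}")
        # Residue numbers sit at even positions, chain IDs at odd positions.
        sites = tuple(int(p) if i % 2 == 0 else p for i, p in enumerate(parts))
        occupancy = float(fields[1]) if len(fields) > 1 else default_occupancy
        links.append(Link(sites, occupancy))
    return links


example = "98|A|147|A 5.1\t\t#\n35|A 1.9\t#\n97|A\n"
for link in parse_links(example):
    kind = "crosslink" if len(link.sites) == 4 else "monolink"
    print(kind, link.sites, link.occupancy)
```

The tuples in `sites` follow the same layout as the `link` argument described in the API section, so they could be passed on to the scoring functions.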

  2. Score protein structure model/s: To compute the crosslink probability (XLP) and monolink probability (MP) scores of one or more protein structures, execute the following in the command line:
$ xlms-tools -m score -l [list of crosslinks and/or monolinks] [PDB file/s] --name [name of run]

Example files can be found in the tests/ directory. You can navigate to the tests/ directory and run the scoring, as follows:

$ xlms-tools -m score -l xlms_data_qtv.txt model_1.pdb --name withoccupancy_m1 #score model_1.pdb
$ xlms-tools -m score -l xlms_data_qtv.txt model_*pdb --name withoccupancy_all #score all models
  3. The outputs include tab-separated (.tsv) files containing the scores of all the models, as well as the individual crosslink and monolink scores for each model; these can be viewed in a spreadsheet editor. Under default run conditions, a ChimeraX command (.cxc) file is also produced, which can be executed by double-clicking it (given a working installation of ChimeraX). In the ChimeraX visualization, crosslinks and monolinks are color-coded: blue means that the spanning distance of the crosslink, or the depth of the monolinked residue, is well within the cutoff; yellow means it is within the cutoff but approaching it; red indicates a maximum distance violation (for crosslinks) or a maximum depth violation (for monolinks). The distance cutoffs are currently defined only for BS3/DSS, but will be expanded in future releases.

link type | blue | yellow | red
any monolink | depth ≤ 6.25 Å | - | depth > 6.25 Å
BS3 or DSS crosslink | CA-CA distance ≤ 21 Å | 21 Å < CA-CA distance ≤ 33 Å | CA-CA distance > 33 Å
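The thresholds in the table above can be written out as a small classifier. This is an illustrative sketch only: the function and constant names are hypothetical, and it reproduces the color table rather than the way xlms-tools itself assigns colors or scores.

```python
# Thresholds taken from the color table above (all values in Å).
MONOLINK_MAX_DEPTH = 6.25    # monolinks: blue up to this depth, red beyond it
CROSSLINK_BLUE_MAX = 21.0    # BS3/DSS crosslinks: blue up to this CA-CA distance
CROSSLINK_YELLOW_MAX = 33.0  # yellow up to this CA-CA distance, red beyond it


def monolink_color(depth: float) -> str:
    """Color for a monolinked residue, given its residue depth in Å."""
    return "blue" if depth <= MONOLINK_MAX_DEPTH else "red"


def crosslink_color(ca_ca_distance: float) -> str:
    """Color for a BS3/DSS crosslink, given its CA-CA distance in Å."""
    if ca_ca_distance <= CROSSLINK_BLUE_MAX:
        return "blue"
    if ca_ca_distance <= CROSSLINK_YELLOW_MAX:
        return "yellow"
    return "red"


print(monolink_color(4.3))    # depth within the cutoff
print(crosslink_color(27.5))  # within the cutoff, but approaching it
print(crosslink_color(40.0))  # maximum distance violation
```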

To compute residue depths in a protein structure

  1. Run the following command:
$ xlms-tools -m depth [PDB file/s]

Example files can be found in the tests/ directory. You can navigate to the tests/ directory and run the depth computations, as follows:

$ xlms-tools -m depth model_1.pdb #compute residue depths for model_1.pdb
$ xlms-tools -m depth model_*pdb #compute residue depths for all models
  2. Each line of the output file (.depth file) corresponds to one residue:
# format: <chain>:<residue number>	<amino acid>	<residue depth in Å>
A:26	LYS	4.317777777777779
A:27	LEU	4.608300983124843
A:28	VAL	5.739574753218949
A:29	VAL	8.684474490011493
A:30	ALA	8.926983412282244
A:31	THR	8.0268463237516
A:32	ASP	6.0348264453008715
A:33	THR	4.487608873498731
A:34	ALA	4.371281572999748
A:35	PHE	5.2527378512712275
A:36	VAL	5.226420625104608
A:37	PRO	6.844806528409208
...
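A .depth file in the format shown above can be loaded back into Python for further analysis. The sketch below is an assumption-laden illustration, not part of xlms-tools: `parse_depth_text` is a hypothetical helper, and it assumes whitespace-separated columns and that lines starting with `#` are comments.

```python
from typing import Dict, Tuple


def parse_depth_text(text: str) -> Tuple[Dict[str, float], Dict[str, str]]:
    """Parse .depth content into {'A:26': depth} and {'A:26': 'LYS'} dictionaries."""
    depths: Dict[str, float] = {}
    residues: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Assumption: blank lines and '#' comment lines carry no data.
        if not line or line.startswith("#"):
            continue
        key, resname, depth = line.split()
        depths[key] = float(depth)
        residues[key] = resname
    return depths, residues


sample = (
    "A:26\tLYS\t4.317777777777779\t\n"
    "A:27\tLEU\t4.608300983124843\n"
    "A:29\tVAL\t8.684474490011493\n"
)
depths, residues = parse_depth_text(sample)

# List residues deeper than the 6.25 Å monolink cutoff from the color table.
for key, depth in depths.items():
    if depth > 6.25:
        print(key, residues[key], round(depth, 2))
```

To read an actual output file, the text could be obtained with `open(path).read()` and passed to the same function.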

API

xlms_tools.depth.computedepth(biopdbstruct)

  • Computes the depth of each residue in a Biopython PDB structure

Parameters:

  • biopdbstruct: an instance of the Bio.PDB structure class

Returns:

  • depths: dictionary of residue depths, with keys in the format chainid:residue# and the residue depth as value
  • residues: dictionary of residue names, with keys in the format chainid:residue# and the residue name as value

Example usage:

from Bio.PDB import PDBParser
from xlms_tools.depth import computedepth

parser = PDBParser()
biopdbstruct = parser.get_structure('test', 'tests/bigmodel.pdb')
depths, residues = computedepth(biopdbstruct)

xlms_tools.score.linkmetrics(biopdbstruct, link, depths, linkweight=1.0, linker='BS3/DSS')

  • Computes the score and CA-CA distance/residue depth in a Biopython PDB structure for a given crosslink or monolink.

Parameters:

  • biopdbstruct: an instance of the Bio.PDB structure class
  • link: a tuple defining either a monolink (residue#, chainid), or a crosslink (residue#_1, chainid_1, residue#_2, chainid_2)
  • depths: dictionary of residue depths (from xlms_tools.depth.computedepth(biopdbstruct))
  • linkweight: for quantitative XL-MS data, a measure of relative abundance or occupancy. If unspecified, default weight is 1.0
  • linker: crosslinking reagent used (if unspecified, default is BS3/DSS)

Returns:

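  • score: monolink probability (MP) score for a monolink, crosslink probability (XLP) score for a crosslink
  • monolinkdepth: depth of the monolinked residue (returned alongside the score when the link is a monolink)
  • crosslinkdistance: CA-CA distance of the crosslink (returned alongside the score when the link is a crosslink)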

Example usage:

from Bio.PDB import PDBParser
from xlms_tools.depth import computedepth
from xlms_tools.score import linkmetrics

parser = PDBParser()
biopdbstruct = parser.get_structure('test', 'tests/bigmodel.pdb')
depths, residues = computedepth(biopdbstruct)

# compute monolink score
mpscore, mpdepth = linkmetrics(biopdbstruct, (1, 'A'), depths)

# compute crosslink score
xlpscore, distance = linkmetrics(biopdbstruct, (1049, 'B', 1409, 'B'), depths)

Citations

When using xlms-tools, please cite: Manalastas-Cantos, K., Adoni, K. R., Pfeifer, M., Märtens, B., Grünewald, K., Thalassinos, K., & Topf, M. (2024). Modeling flexible protein structure with AlphaFold2 and cross-linking mass spectrometry. Molecular & Cellular Proteomics. https://doi.org/10.1016/j.mcpro.2024.100724
