xlms-tools

Description

xlms-tools is a set of command line tools to apply crosslinking mass spectrometry (XL-MS) data to protein structure models.

Setting up xlms-tools

xlms-tools can be installed directly from PyPI, as follows:

$ pip install xlms-tools

Alternatively, download or clone this repository and run the following from the project directory:

$ pip install dist/xlms_tools-1.0.6-py3-none-any.whl

Using xlms-tools

Currently, xlms-tools can be run in two modes. The first scores how well a protein structure model agrees with XL-MS data, which is specified as a list of crosslinks and monolinks derived from an XL-MS experiment. The second computes the depths of individual residues in protein structures.

To score how well a protein structure agrees with XL-MS data

  1. First, list the crosslinks and monolinks in a text file with the following format:
98|A|147|A 5.1		#
72|A|161|A 4.3		# crosslinks: <residue # of a>|<chain of a>|<res#, b>|<chain, b> <occupancy>
72|A|180|A 2.7		#
35|A 1.9	#		
137|A 5.3	# monolinks: <residue # of a>|<chain of a> <occupancy>
97|A 2.6	#

where each line corresponds to either a crosslink or a monolink. Optionally, a numerical value can be appended to each line to give the occupancy of that monolink or crosslink. For a detailed discussion of occupancy, please refer to the paper listed under Citations.
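The line format above can be read with a few lines of Python. This is a minimal sketch and not the parser that xlms-tools itself uses: the function name `parse_link_line` is hypothetical, text after `#` is assumed to be an annotation, and a missing occupancy is assumed to default to 1.0 (mirroring the default `linkweight` in the API section).

```python
def parse_link_line(line):
    """Parse one line of a link file into (link, occupancy).

    Returns None for blank lines and lines holding only an annotation.
    Assumption: everything after '#' is an annotation, and a missing
    occupancy defaults to 1.0.
    """
    content = line.split("#", 1)[0].strip()  # drop the trailing annotation
    if not content:
        return None
    fields = content.split()
    spec = fields[0].split("|")
    occupancy = float(fields[1]) if len(fields) > 1 else 1.0
    if len(spec) == 2:
        # monolink: <residue #>|<chain>
        link = (int(spec[0]), spec[1])
    elif len(spec) == 4:
        # crosslink: <residue # of a>|<chain of a>|<residue # of b>|<chain of b>
        link = (int(spec[0]), spec[1], int(spec[2]), spec[3])
    else:
        raise ValueError(f"unrecognised link specification: {fields[0]!r}")
    return link, occupancy


example = ["98|A|147|A 5.1\t\t#", "35|A 1.9\t#", "97|A"]
for text in example:
    print(parse_link_line(text))
```

The tuples produced here have the same shape as the `link` argument described in the API section, so they could be passed on to `linkmetrics` together with the occupancy as `linkweight`.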

  2. Score the protein structure model(s): to compute the crosslink probability (XLP) and monolink probability (MP) scores of one or more protein structures, run the following on the command line:
$ xlms-tools -m score -l [list of crosslinks and/or monolinks] [PDB file/s] --name [name of run]

Example files can be found in the tests/ directory. You can navigate to the tests/ directory and run the scoring, as follows:

$ xlms-tools -m score -l xlms_data_qtv.txt model_1.pdb --name withoccupancy_m1 #score model_1.pdb
$ xlms-tools -m score -l xlms_data_qtv.txt model_*pdb --name withoccupancy_all #score all models

  3. The outputs include (a) a tab-separated (.tsv) file containing the scores, which can be viewed in a spreadsheet editor, and (b) a ChimeraX command (.cxc) file, which can be executed by double-clicking it (given a working installation of ChimeraX). In the ChimeraX visualization, crosslinks and monolinks are color-coded. Blue means that the spanning distance of the crosslink, or the residue depth of the monolinked residue, is well within the cutoff. Yellow means that the value is within the cutoff but approaching it. Red marks a maximum distance violation (for crosslinks) or a maximum depth violation (for monolinks). The distance cutoffs are currently only defined for BS3/DSS, but will be expanded in future releases.

link type              blue                     yellow                          red
any monolink           depth ≤ 6.25 Å           -                               depth > 6.25 Å
BS3 or DSS crosslink   CA-CA distance ≤ 21 Å    21 Å < CA-CA distance ≤ 33 Å    CA-CA distance > 33 Å
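The color thresholds in the table can be written out as two small functions. This is only an illustrative sketch of the table, not code taken from xlms-tools; the function and constant names are hypothetical.

```python
# Thresholds copied from the table above (in Å); names are illustrative only.
MONOLINK_MAX_DEPTH = 6.25
CROSSLINK_BLUE_MAX = 21.0    # BS3/DSS
CROSSLINK_YELLOW_MAX = 33.0  # BS3/DSS


def monolink_color(depth):
    """Blue if the residue depth is within the cutoff, red otherwise."""
    return "blue" if depth <= MONOLINK_MAX_DEPTH else "red"


def crosslink_color(ca_ca_distance):
    """Blue, yellow or red for a BS3/DSS crosslink CA-CA distance."""
    if ca_ca_distance <= CROSSLINK_BLUE_MAX:
        return "blue"
    if ca_ca_distance <= CROSSLINK_YELLOW_MAX:
        return "yellow"
    return "red"


print(monolink_color(4.3), crosslink_color(25.0), crosslink_color(40.0))
```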

To compute residue depths in a protein structure

  1. Run the following command:
$ xlms-tools -m depth [PDB file/s]

Example files can be found in the tests/ directory. You can navigate to the tests/ directory and run the depth computations, as follows:

$ xlms-tools -m depth model_1.pdb #compute residue depths for model_1.pdb
$ xlms-tools -m depth model_*pdb #compute residue depths for all models

  2. Each line of the output file (.depth file) corresponds to one residue:
# format: <chain>:<residue number>	<amino acid>	<residue depth in Å>
A:26	LYS	4.317777777777779	
A:27	LEU	4.608300983124843
A:28	VAL	5.739574753218949
A:29	VAL	8.684474490011493
A:30	ALA	8.926983412282244
A:31	THR	8.0268463237516
A:32	ASP	6.0348264453008715
A:33	THR	4.487608873498731
A:34	ALA	4.371281572999748
A:35	PHE	5.2527378512712275
A:36	VAL	5.226420625104608
A:37	PRO	6.844806528409208
...
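A .depth file in the layout shown above can be loaded back into Python for further analysis. The following is a minimal sketch that assumes whitespace-separated columns and treats lines starting with `#` as comments; the function name `parse_depth_lines` is hypothetical and not part of xlms-tools.

```python
def parse_depth_lines(lines):
    """Read .depth lines into two dicts keyed by '<chain>:<residue number>'."""
    depths = {}
    residues = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the format comment
        key, resname, depth = line.split()
        depths[key] = float(depth)
        residues[key] = resname
    return depths, residues


sample = [
    "# format: <chain>:<residue number>\t<amino acid>\t<residue depth in Å>",
    "A:26\tLYS\t4.317777777777779\t",
    "A:29\tVAL\t8.684474490011493",
]
depths, residues = parse_depth_lines(sample)
# Residues deeper than the 6.25 Å monolink cutoff used in the scoring mode
buried = [key for key, depth in depths.items() if depth > 6.25]
print(buried)
```

With real output, `lines` would be an open file handle, for example `open("model_1.depth")`, where the file name is an assumption about how the output is named.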

API

xlms_tools.depth.computedepth(biopdbstruct)

  • Computes the depth of each residue in a Biopython PDB structure

Parameters:

  • biopdbstruct: an instance of the Bio.PDB structure class

Returns:

  • depths: dictionary of residue depths, with chainid:residue# as keys and residue depths as values
  • residues: dictionary of residue names, with chainid:residue# as keys and residue names as values

Example usage:

from Bio.PDB import PDBParser
from xlms_tools.depth import computedepth

parser = PDBParser()
biopdbstruct = parser.get_structure('test', 'tests/bigmodel.pdb')
depths, residues = computedepth(biopdbstruct)

xlms_tools.score.linkmetrics(biopdbstruct, link, depths, linkweight=1.0, linker='BS3/DSS')

  • Computes the score and CA-CA distance/residue depth in a Biopython PDB structure for a given crosslink or monolink.

Parameters:

  • biopdbstruct: an instance of the Bio.PDB structure class
  • link: a tuple defining either a monolink (residue#, chainid), or a crosslink (residue#_1, chainid_1, residue#_2, chainid_2)
  • depths: dictionary of residue depths (from xlms_tools.depth.computedepth(biopdbstruct))
  • linkweight: for quantitative XL-MS data, a measure of relative abundance or occupancy. If unspecified, default weight is 1.0
  • linker: crosslinking reagent used (if unspecified, default is BS3/DSS)

Returns:

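  • score: monolink probability (MP) score for a monolink, crosslink probability (XLP) score for a crosslink
  • monolinkdepth (when link is a monolink): depth of the monolinked residue
  • crosslinkdistance (when link is a crosslink): CA-CA distance of the crosslink

As in the example usage below, the function returns two values: the score, followed by either the depth or the distance, depending on the link type.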

Example usage:

from Bio.PDB import PDBParser
from xlms_tools.depth import computedepth
from xlms_tools.score import linkmetrics

parser = PDBParser()
biopdbstruct = parser.get_structure('test', 'tests/bigmodel.pdb')
depths, residues = computedepth(biopdbstruct)

# compute monolink score
mpscore, mpdepth = linkmetrics(biopdbstruct, (1, 'A'), depths)

# compute crosslink score
xlpscore, distance = linkmetrics(biopdbstruct, (1049, 'B', 1409, 'B'), depths)
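To make the CA-CA distance concrete, the sketch below computes it directly from PDB ATOM records, without xlms-tools or Biopython. It assumes the distance is the straight-line (Euclidean) distance between the two alpha-carbon atoms and that the input follows the fixed-column PDB format; the function names are hypothetical, and this is not necessarily how `linkmetrics` obtains the value internally.

```python
import math


def ca_coordinates(pdb_lines):
    """Map (residue number, chain id) to the (x, y, z) of its CA atom."""
    coords = {}
    for line in pdb_lines:
        # Only ATOM records: calcium ions are also named CA but are HETATM
        if not line.startswith("ATOM"):
            continue
        if line[12:16].strip() != "CA":
            continue
        key = (int(line[22:26]), line[21])
        xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
        coords.setdefault(key, xyz)  # keep the first alternate location
    return coords


def ca_ca_distance(coords, crosslink):
    """Euclidean CA-CA distance for (residue#_1, chain_1, residue#_2, chain_2)."""
    res1, chain1, res2, chain2 = crosslink
    return math.dist(coords[(res1, chain1)], coords[(res2, chain2)])
```

For a file on disk, `ca_coordinates(open("tests/model_1.pdb"))` would build the lookup (the path is taken from the examples above and may differ), after which `ca_ca_distance(coords, (72, 'A', 161, 'A'))` gives a value that can be compared with the 21 Å and 33 Å cutoffs.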

Citations

When using xlms-tools, please cite: Manalastas-Cantos, K., Adoni, K. R., Pfeifer, M., Märtens, B., Grünewald, K., Thalassinos, K., & Topf, M. (2024). Modeling flexible protein structure with AlphaFold2 and cross-linking mass spectrometry. Molecular & Cellular Proteomics. https://doi.org/10.1016/j.mcpro.2024.100724
