xlms-tools

Description

xlms-tools is a set of command line tools to apply crosslinking mass spectrometry (XL-MS) data to protein structure models.

Setting up xlms-tools

xlms-tools can be installed directly from PyPI, as follows:

$ pip install xlms-tools

Or alternatively, by downloading or cloning this repository, and running the following from the project directory:

$ pip install dist/xlms_tools-1.0.4-py3-none-any.whl

Using xlms-tools

Currently, xlms-tools can be run in two modes. The first scores how well a protein structure model agrees with XL-MS data, which is specified as a list of crosslinks and monolinks derived from an XL-MS experiment. The second computes the depths of individual residues in protein structures.

To score how well a protein structure agrees with XL-MS data

  1. First, format crosslinks and monolinks into a text file with the following format:
98|A|147|A 5.1	#
72|A|161|A 4.3	# crosslinks: <residue # of a>|<chain of a>|<residue # of b>|<chain of b> <occupancy>
72|A|180|A 2.7	#
35|A 1.9	#
137|A 5.3	# monolinks: <residue # of a>|<chain of a> <occupancy>
97|A 2.6	#

where each line corresponds to either a crosslink or a monolink. Optionally, a numerical value can be appended at the end of each line, corresponding to the occupancy of each individual monolink or crosslink. For a detailed discussion on occupancy, please refer to the paper.
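
To make the file format concrete, here is a minimal Python sketch that reads such lines into tuples shaped like the `link` argument described in the API section below. It is not the parser used by xlms-tools: the function names `parse_link_line` and `parse_links` are hypothetical, and falling back to a weight of 1.0 when no occupancy is given is an assumption borrowed from the default `linkweight` of `linkmetrics`.

```python
def parse_link_line(line, default_weight=1.0):
    """Parse one line into (link_tuple, weight); return None for blank or comment-only lines.

    Hypothetical helper for illustration, not part of xlms-tools.
    """
    content = line.split("#", 1)[0].strip()  # drop any trailing comment
    if not content:
        return None
    fields = content.split()
    spec = fields[0].split("|")
    # the occupancy is optional; default_weight is an assumed fallback
    weight = float(fields[1]) if len(fields) > 1 else default_weight
    if len(spec) == 2:  # monolink: residue number, chain
        link = (int(spec[0]), spec[1])
    elif len(spec) == 4:  # crosslink: residue a, chain a, residue b, chain b
        link = (int(spec[0]), spec[1], int(spec[2]), spec[3])
    else:
        raise ValueError(f"unrecognized link specification: {content!r}")
    return link, weight


def parse_links(text):
    """Parse a whole link list, skipping blank and comment-only lines."""
    parsed = (parse_link_line(line) for line in text.splitlines())
    return [item for item in parsed if item is not None]


example = """98|A|147|A 5.1   #
72|A|161|A 4.3   # crosslink with occupancy
35|A 1.9         # monolink with occupancy
97|A
"""
links = parse_links(example)
for link, weight in links:
    print(link, weight)
```

Crosslinks come out as 4-tuples and monolinks as 2-tuples, so the length of the tuple is enough to tell the two apart.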

  2. Score protein structure model/s: To compute the crosslink probability (XLP) and monolink probability (MP) scores of one or more protein structures, execute the following in the command line:
$ xlms-tools -m score -l [list of crosslinks and/or monolinks] [PDB file/s] --name [name of run]

Example files can be found in the tests/ directory. You can navigate to the tests/ directory and run the scoring, as follows:

$ xlms-tools -m score -l xlms_data_qtv.txt model_1.pdb --name withoccupancy_m1 #score model_1.pdb
$ xlms-tools -m score -l xlms_data_qtv.txt model_*pdb --name withoccupancy_all #score all models
  3. The outputs include (a) a tab-separated (.tsv) file containing the scores, which can be viewed in a spreadsheet editor, and (b) a ChimeraX command (.cxc) file, which can be executed by double-clicking it (given a working installation of ChimeraX). In the ChimeraX visualization, crosslinks and monolinks are color-coded. Blue means that the spanning distance of the crosslink, or the residue depth of the monolinked residue, is well within the cutoff. Yellow (crosslinks only) means that the distance is within the cutoff but approaching it. Red marks a maximum distance violation (for crosslinks) or a maximum depth violation (for monolinks). The distance cutoffs are currently only defined for BS3/DSS, but will be expanded in future releases.

link type               blue                     yellow                          red
any monolink            depth ≤ 6.25 Å           -                               depth > 6.25 Å
BS3 or DSS crosslink    CA-CA distance ≤ 21 Å    21 Å < CA-CA distance ≤ 33 Å    CA-CA distance > 33 Å
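
The color thresholds in the table can be written out as a short Python sketch. The cutoff values are taken from the table. The function and constant names are hypothetical, and this is only an illustration of the rule, not the code xlms-tools uses to write the .cxc file.

```python
# Cutoffs from the table above, in Å (constant names are illustrative)
MONOLINK_MAX_DEPTH = 6.25
CROSSLINK_BLUE_MAX = 21.0    # BS3/DSS, CA-CA distance
CROSSLINK_YELLOW_MAX = 33.0  # BS3/DSS, CA-CA distance


def monolink_color(depth):
    """Blue if the residue depth is within the cutoff, red otherwise."""
    return "blue" if depth <= MONOLINK_MAX_DEPTH else "red"


def crosslink_color(ca_ca_distance):
    """Blue, yellow, or red according to the BS3/DSS CA-CA distance cutoffs."""
    if ca_ca_distance <= CROSSLINK_BLUE_MAX:
        return "blue"
    if ca_ca_distance <= CROSSLINK_YELLOW_MAX:
        return "yellow"
    return "red"


print(monolink_color(4.3))    # depth below 6.25 Å
print(crosslink_color(27.5))  # between 21 Å and 33 Å
print(crosslink_color(40.0))  # beyond 33 Å
```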

To compute residue depths in a protein structure

  1. Run the following command:
$ xlms-tools -m depth [PDB file/s]

Example files can be found in the tests/ directory. You can navigate to the tests/ directory and run the depth computations, as follows:

$ xlms-tools -m depth model_1.pdb #compute residue depths for model_1.pdb
$ xlms-tools -m depth model_*pdb #compute residue depths for all models
  2. Each line of the output file (.depth file) corresponds to one residue:
# format: <chain>:<residue number>	<amino acid>	<residue depth in Å>
A:26	LYS	4.317777777777779
A:27	LEU	4.608300983124843
A:28	VAL	5.739574753218949
A:29	VAL	8.684474490011493
A:30	ALA	8.926983412282244
A:31	THR	8.0268463237516
A:32	ASP	6.0348264453008715
A:33	THR	4.487608873498731
A:34	ALA	4.371281572999748
A:35	PHE	5.2527378512712275
A:36	VAL	5.226420625104608
A:37	PRO	6.844806528409208
...
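
For downstream analysis, a .depth file can be read back with a few lines of Python. The sketch below assumes whitespace-separated columns as in the listing above and plain integer residue numbers (no insertion codes); `parse_depth_line` is a hypothetical helper, not part of xlms-tools.

```python
def parse_depth_line(line):
    """Parse '<chain>:<residue number> <amino acid> <depth>' into a tuple.

    Returns None for blank lines, comment lines, and the '...' placeholder.
    Assumes integer residue numbers without insertion codes.
    """
    line = line.strip()
    if not line or line.startswith("#") or line == "...":
        return None
    fields = line.split()
    chain, resnum = fields[0].split(":")
    return chain, int(resnum), fields[1], float(fields[2])


sample = (
    "# format: <chain>:<residue number>\t<amino acid>\t<residue depth in Å>\n"
    "A:26\tLYS\t4.317777777777779\n"
    "A:27\tLEU\t4.608300983124843\n"
    "A:29\tVAL\t8.684474490011493\n"
)
records = [r for r in map(parse_depth_line, sample.splitlines()) if r is not None]

# Example query: residues deeper than the 6.25 Å monolink cutoff
deep = [(chain, resnum) for chain, resnum, _, depth in records if depth > 6.25]
print(deep)
```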

API

xlms_tools.depth.computedepth(biopdbstruct)

  • Computes the depth of each residue in a Biopython PDB structure

Parameters:

  • biopdbstruct: an instance of the Bio.PDB structure class

Returns:

  • depths: dictionary of residue depths, with keys of the form chainid:residue# and the residue depth as the value
  • residues: dictionary of residue names, with keys of the form chainid:residue# and the residue name as the value

Example usage:

from Bio.PDB import PDBParser
from xlms_tools.depth import computedepth

parser = PDBParser()
biopdbstruct = parser.get_structure('test', 'tests/bigmodel.pdb')
depths, residues = computedepth(biopdbstruct)

xlms_tools.score.linkmetrics(biopdbstruct, link, depths, linkweight=1.0, linker='BS3/DSS')

  • Computes the score and CA-CA distance/residue depth in a Biopython PDB structure for a given crosslink or monolink.

Parameters:

  • biopdbstruct: an instance of the Bio.PDB structure class
  • link: a tuple defining either a monolink (residue#, chainid), or a crosslink (residue#_1, chainid_1, residue#_2, chainid_2)
  • depths: dictionary of residue depths (from xlms_tools.depth.computedepth(biopdbstruct))
  • linkweight: for quantitative XL-MS data, a measure of relative abundance or occupancy. If unspecified, default weight is 1.0
  • linker: crosslinking reagent used (if unspecified, default is BS3/DSS)

Returns:

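  • score: monolink probability (MP) score for a monolink, crosslink probability (XLP) score for a crosslink
  • monolinkdepth: depth of the monolinked residue (second return value when the link is a monolink, as in the example below)
  • crosslinkdistance: CA-CA distance of the crosslink (second return value when the link is a crosslink, as in the example below)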

Example usage:

from Bio.PDB import PDBParser
from xlms_tools.depth import computedepth
from xlms_tools.score import linkmetrics

parser = PDBParser()
biopdbstruct = parser.get_structure('test', 'tests/bigmodel.pdb')
depths, residues = computedepth(biopdbstruct)

# compute monolink score
mpscore, mpdepth = linkmetrics(biopdbstruct, (1, 'A'), depths)

# compute crosslink score
xlpscore, distance = linkmetrics(biopdbstruct, (1049, 'B', 1409, 'B'), depths)
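
When several links need to be evaluated, the same call can be wrapped in a loop. The sketch below is one possible arrangement, not part of xlms-tools: `collect_link_metrics` is a hypothetical helper that takes the scoring function as an argument, and it assumes the function returns two values as in the example above. A stand-in scoring function is used so that the snippet runs without a PDB file; in practice you would pass `xlms_tools.score.linkmetrics` together with a real structure and depth dictionary.

```python
def collect_link_metrics(biopdbstruct, weighted_links, depths, metric_fn):
    """Call metric_fn once per (link, weight) pair and gather the results.

    metric_fn is expected to follow the linkmetrics signature shown above
    and to return (score, depth_or_distance).
    """
    rows = []
    for link, weight in weighted_links:
        score, measure = metric_fn(biopdbstruct, link, depths, linkweight=weight)
        kind = "monolink" if len(link) == 2 else "crosslink"
        rows.append({"link": link, "type": kind, "score": score, "measure": measure})
    return rows


def fake_metric(biopdbstruct, link, depths, linkweight=1.0, linker="BS3/DSS"):
    """Stand-in for linkmetrics with made-up return values, for illustration only."""
    return linkweight, float(len(link))


weighted_links = [((1, "A"), 1.9), ((1049, "B", 1409, "B"), 5.1)]
rows = collect_link_metrics(None, weighted_links, {}, fake_metric)
for row in rows:
    print(row)
```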

Citations

When using xlms-tools, please cite: Manalastas-Cantos, K., Adoni, K. R., Pfeifer, M., Märtens, B., Grünewald, K., Thalassinos, K., & Topf, M. (2024). Modeling flexible protein structure with AlphaFold2 and cross-linking mass spectrometry. Molecular & Cellular Proteomics. https://doi.org/10.1016/j.mcpro.2024.100724
