imspy - Python package for working with timsTOF raw data

Welcome to this short introduction to imspy. The package is designed to work with raw data files generated by Bruker timsTOF mass spectrometers. It provides a high-level API for accessing raw data and a chemistry module for working with peptide sequences. It also includes algorithms for ion mobility and retention time prediction, as well as machine learning algorithms for data analysis. Want to see how to build full data processing pipelines with imspy? Check out the imspy_dda and timsim command line tools.

Raw data access

Establish a connection to a timsTOF raw file and access data

import numpy as np
from imspy.timstof import TimsDataset

# you can use in-memory mode for faster access, but it requires more memory
tdf = TimsDataset("path/to/rawfolder.d", in_memory=False)

# show global meta data table
print(tdf.global_meta_data)

# show frame meta data
print(tdf.meta_data)

# get the first frame (bruker frame indices start at 1)
frame = tdf.get_tims_frame(1)

# you can also use indexing
frame = tdf[1]

# print data as pandas dataframe
frame.df()

# get all spectra in a tims frame (sorted by scan = ion mobility)
spectra = frame.to_tims_spectra()

# get a slice of multiple frames
frames = tdf.get_tims_slice(np.array([1, 2, 3]))

# or, by using slicing
frames = tdf[1:4]

DDA data

from imspy.timstof import TimsDatasetDDA
# read a DDA dataset
tdf = TimsDatasetDDA("path/to/rawfolder.d", in_memory=False)

# get raw data of precursors together with their fragment ions
dda_fragments = tdf.get_pasef_fragments()

# the timsTOF re-fragments precursors below a certain intensity threshold,
# you can aggregate the data for increased sensitivity like so:
dda_fragments_grouped = dda_fragments.groupby('precursor_id').agg({
    'frame_id': 'first',
    'time': 'first',
    'precursor_id': 'first',
    # this will sum up the raw data of all fragments with the same precursor_id
    'raw_data': 'sum',
    'scan_begin': 'first',
    'scan_end': 'first',
    'isolation_mz': 'first',
    'isolation_width': 'first',
    'collision_energy': 'first',
    'largest_peak_mz': 'first',
    'average_mz': 'first',
    'monoisotopic_mz': 'first',
    'charge': 'first',
    'average_scan': 'first',
    'intensity': 'first',
    'parent_id': 'first',
})

# for convenience, you can calculate the inverse mobility 
# of the precursor ion by finding the maximum intensity along the scan dimension
mobility = dda_fragments_grouped.apply(
    lambda r: r.raw_data.get_inverse_mobility_along_scan_marginal(), axis=1
)

# add the inverse mobility to the grouped data as a new column
dda_fragments_grouped['mobility'] = mobility
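
The aggregation above combines the raw data of repeated fragmentation events by summing the objects in the raw_data column, which only works if those objects support the + operator. The following is a minimal sketch of that pattern on toy data. ToyFrame is a hypothetical stand-in and not part of imspy, the column values are made up, and the sum is spelled out with functools.reduce to make the pairwise addition explicit.

```python
from dataclasses import dataclass
from functools import reduce
from operator import add

import pandas as pd


@dataclass(frozen=True)
class ToyFrame:
    """Hypothetical stand-in for a raw-data object; it only knows how to add."""
    total_intensity: float

    def __add__(self, other: "ToyFrame") -> "ToyFrame":
        return ToyFrame(self.total_intensity + other.total_intensity)


# precursor 1 was fragmented twice, precursor 2 only once (made-up values)
fragments = pd.DataFrame({
    "precursor_id": [1, 1, 2],
    "frame_id": [10, 12, 11],
    "collision_energy": [27.0, 27.0, 31.5],
    "raw_data": [ToyFrame(100.0), ToyFrame(250.0), ToyFrame(80.0)],
})

grouped = fragments.groupby("precursor_id").agg({
    # metadata: keep the value of the first fragmentation event
    "frame_id": "first",
    "collision_energy": "first",
    # raw data: add up all objects that belong to the same precursor
    "raw_data": lambda column: reduce(add, column),
})

print(grouped)
```

Each precursor ends up with one row whose raw_data entry is the sum of all its fragmentation events, while the metadata columns keep the first value seen.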

DIA data

from imspy.timstof import TimsDatasetDIA
# read a DIA dataset
tdf = TimsDatasetDIA("path/to/rawfolder.d", in_memory=False)

The chemistry module

Basic usage

from imspy.chemistry.elements import ELEMENTAL_MONO_ISOTOPIC_MASSES, ELEMENTAL_ISOTOPIC_ABUNDANCES
from imspy.chemistry.sum_formula import SumFormula

# create a sum formula object that represents the molecule stachyose trihydrate
stachyose_trihydrate = SumFormula("C24H48O24")

# get the monoisotopic mass of the molecule
mono_mass = stachyose_trihydrate.monoisotopic_mass

# get the isotope distribution of the molecule, will be returned as an MzSpectrum object
mz_spec = stachyose_trihydrate.generate_isotope_distribution(charge=1)
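
To make the monoisotopic mass calculation concrete, here is a small sketch in plain Python that parses a sum formula and adds up the masses of the lightest isotopes. It is an illustration only and not how SumFormula is implemented; the function names and the reduced element table are assumptions made for this example.

```python
import re

# Monoisotopic masses in Da of a few common elements (illustrative subset).
MONO_MASS = {
    "H": 1.00782503207,
    "C": 12.0,
    "N": 14.0030740048,
    "O": 15.99491461956,
    "S": 31.97207100,
}


def parse_sum_formula(formula: str) -> dict[str, int]:
    """Split a flat formula such as 'C24H48O24' into element counts."""
    counts: dict[str, int] = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        # an element without a number counts once, e.g. the O in 'H2O'
        counts[element] = counts.get(element, 0) + (int(number) if number else 1)
    return counts


def monoisotopic_mass(formula: str) -> float:
    """Sum the monoisotopic masses of all atoms in the formula."""
    return sum(MONO_MASS[el] * n for el, n in parse_sum_formula(formula).items())


mass = monoisotopic_mass("C24H48O24")
print(f"{mass:.4f}")  # 720.2536
```

The sketch handles flat formulas only; brackets, isotope labels and charges are out of scope.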

This functionality is easily combined with the UNIMOD annotation database, which is included in the sagepy package.

from sagepy.core.unimod import modification_atomic_composition
from imspy.chemistry.sum_formula import SumFormula

# carbamidomethylation is a common modification that is annotated in the UNIMOD database
mods = modification_atomic_composition()

# get the atomic composition of carbamidomethylation
carbamidomethylation = mods["[UNIMOD:4]"]

# create a sum formula object that represents the molecule with carbamidomethylation
formula = SumFormula(''.join([key + str(value) for key, value in carbamidomethylation.items()]))
mono_mass = formula.monoisotopic_mass
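
The join above turns an element-to-count mapping into a formula string. The sketch below reproduces that step without imspy or sagepy and computes the resulting mass shift directly. The carbamidomethyl composition (H3 C2 N O) is typed in by hand, and the helper names are hypothetical. The guard against non-positive counts is an assumption about what a plain formula string can express, not a statement about how SumFormula behaves.

```python
# Monoisotopic masses in Da for the elements that occur in carbamidomethyl.
MONO_MASS = {"H": 1.00782503207, "C": 12.0, "N": 14.0030740048, "O": 15.99491461956}

# UNIMOD:4 (carbamidomethyl) adds H3 C2 N O; entered by hand for illustration.
carbamidomethyl = {"H": 3, "C": 2, "N": 1, "O": 1}


def composition_to_formula(composition: dict[str, int]) -> str:
    """Join element symbols and counts into a flat formula string."""
    if any(count <= 0 for count in composition.values()):
        # modifications that remove atoms cannot be written this way
        raise ValueError("a plain sum formula cannot express removed atoms")
    return "".join(f"{element}{count}" for element, count in composition.items())


def composition_mass(composition: dict[str, int]) -> float:
    """Monoisotopic mass shift of the composition in Da."""
    return sum(MONO_MASS[element] * count for element, count in composition.items())


formula_string = composition_to_formula(carbamidomethyl)
mass_shift = composition_mass(carbamidomethyl)
print(formula_string, f"{mass_shift:.4f}")  # H3C2N1O1 57.0215
```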

Working with peptide sequences

from imspy.data.peptide import PeptideSequence

# create a peptide sequence object, might contain modifications
sequence = PeptideSequence("PEPTIDEC[UNIMOD:4]PEPTIDE")

# get the monoisotopic mass of the peptide
mono_mass = sequence.mono_isotopic_mass

# get the product ion series of the peptide sequence, e.g. b- and y-ions
b_ions, y_ions = sequence.calculate_product_ion_series(
    charge=2,
    fragment_type='b',
)

# generate an annotated isotopic distribution of the peptide's product ions; it holds detailed
# information about every single peak in the spectrum, such as b- and y-ion annotations, charge, and isotope number
annotated_spectrum = sequence.calculate_mono_isotopic_product_ion_spectrum_annotated(
    charge=2,
    fragment_type='b'
)
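
For readers unfamiliar with product ion series, the following sketch shows how b- and y-ion m/z values follow from residue masses for an unmodified peptide. It is independent of imspy: the function name is hypothetical, the residue table covers only the amino acids in PEPTIDE, and modifications are not handled.

```python
PROTON = 1.007276466   # mass of a proton in Da
WATER = 18.0105646837  # monoisotopic mass of H2O in Da

# Monoisotopic residue masses in Da for the amino acids used below.
RESIDUE_MASS = {
    "P": 97.05276,
    "E": 129.04259,
    "T": 101.04768,
    "I": 113.08406,
    "D": 115.02694,
}


def product_ion_mz(sequence: str, charge: int = 1) -> tuple[list[float], list[float]]:
    """Return m/z values of the b- and y-ion series of an unmodified peptide."""
    masses = [RESIDUE_MASS[aa] for aa in sequence]
    b_ions, y_ions = [], []
    for i in range(1, len(masses)):
        b_neutral = sum(masses[:i])            # N-terminal fragment
        y_neutral = sum(masses[-i:]) + WATER   # C-terminal fragment keeps H2O
        b_ions.append((b_neutral + charge * PROTON) / charge)
        y_ions.append((y_neutral + charge * PROTON) / charge)
    return b_ions, y_ions


b_ions, y_ions = product_ion_mz("PEPTIDE", charge=1)
print(f"b1 = {b_ions[0]:.4f}, y1 = {y_ions[0]:.4f}")  # b1 = 98.0600, y1 = 148.0604
```

A peptide with n residues yields n - 1 ions per series, one for each cleavage site along the backbone.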

Algorithms and machine learning

Ion mobility and retention time prediction

from imspy.algorithm import (DeepPeptideIonMobilityApex, DeepChromatographyApex, 
                             load_deep_ccs_predictor, load_deep_retention_time_predictor)
from imspy.algorithm.utility import load_tokenizer_from_resources
from imspy.chemistry.mobility import one_over_k0_to_ccs

# some example peptide sequences
sequences = ["PEPTIDE", "PEPTIDEC[UNIMOD:4]PEPTIDE"]
mz_values = [784.58, 1423.72]
charges = [1, 2]

# the retention time predictor model
rt_predictor = DeepChromatographyApex(load_deep_retention_time_predictor(),
                                      load_tokenizer_from_resources("tokenizer-ptm"), verbose=True)

# predict retention times for peptide sequences
predicted_rt = rt_predictor.simulate_separation_times(sequences=sequences)

# the ion mobility predictor model
im_predictor = DeepPeptideIonMobilityApex(load_deep_ccs_predictor(),
                                          load_tokenizer_from_resources("tokenizer-ptm"))

# predict ion mobilities for peptide sequences and translate them to collision cross sections
predicted_inverse_mobility = im_predictor.simulate_ion_mobilities(sequences=sequences, charges=charges, mz=mz_values)
ccs = [one_over_k0_to_ccs(inv_im, mz, charge) for inv_im, mz, charge in zip(predicted_inverse_mobility, mz_values, charges)]
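
The conversion from inverse reduced mobility to collision cross section is commonly done with the Mason-Schamp relation. The sketch below shows one way to write it, assuming nitrogen as drift gas, a temperature of 31.85 °C, and the frequently used summary constant 18509.8632163405; these defaults and the function name are assumptions for illustration, and the implementation behind one_over_k0_to_ccs may differ.

```python
import math


def mason_schamp_ccs(one_over_k0: float, mz: float, charge: int,
                     mass_gas: float = 28.013, temp_celsius: float = 31.85) -> float:
    """Convert 1/K0 (Vs/cm^2) to a collision cross section (A^2).

    Assumed defaults: nitrogen drift gas and 31.85 degrees Celsius.
    """
    summary_constant = 18509.8632163405  # lumps the physical constants together
    ion_mass = mz * charge
    reduced_mass = (ion_mass * mass_gas) / (ion_mass + mass_gas)
    temperature = temp_celsius + 273.15  # Kelvin
    return summary_constant * charge * one_over_k0 / math.sqrt(reduced_mass * temperature)


ccs_example = mason_schamp_ccs(one_over_k0=1.0, mz=500.0, charge=2)
print(f"{ccs_example:.1f}")
```

For a fixed m/z and charge the relation is linear in 1/K0, so doubling the inverse mobility doubles the cross section.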

Intensity prediction

We provide a wrapper for the Prosit intensity prediction model, timsTOF version, which can be used to predict the intensity of fragment ions. If you use this model, please give credit to the original authors.

from imspy.algorithm import Prosit2023TimsTofWrapper

# some example peptide sequences
sequences = ["PEPTIDE", "PEPTIDEC[UNIMOD:4]PEPTIDE"]
mz_values = [784.58, 1423.72]
charges = [1, 2]

# collision energies need to be calibrated, check out the Prosit documentation for more information or read the calibrate_collision_energies function
collision_energies = [20.5, 30.2]

# the Prosit model
prosit_model = Prosit2023TimsTofWrapper()

# predict expected ion intensities for peptide sequences
predicted_intensity = prosit_model.predict_intensities(
    sequences=sequences,
    charges=charges,
    collision_energies=collision_energies,
    # will return the flat 174-dimensional feature vector per sequence created by Prosit
    flatten=True
)
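
The number 174 is the product of 29 fragment positions, 2 ion types and 3 fragment charge states. The sketch below maps a flat index back to these three coordinates. The ordering (position first, then y before b, then charge 1 to 3) is an assumption based on the layout commonly described for Prosit outputs; it has not been checked against the wrapper, so verify it before relying on it.

```python
MAX_FRAGMENT_POSITIONS = 29  # a peptide of up to 30 residues has 29 cleavage sites
ION_TYPES = ("y", "b")       # assumed order within each position
CHARGES = (1, 2, 3)          # assumed order within each ion type


def describe_index(flat_index: int) -> tuple[int, str, int]:
    """Map a flat index (0..173) to (fragment number, ion type, charge)."""
    if not 0 <= flat_index < MAX_FRAGMENT_POSITIONS * len(ION_TYPES) * len(CHARGES):
        raise IndexError("flat index out of range")
    position, rest = divmod(flat_index, len(ION_TYPES) * len(CHARGES))
    ion, charge = divmod(rest, len(CHARGES))
    return position + 1, ION_TYPES[ion], CHARGES[charge]


for index in (0, 3, 173):
    print(index, describe_index(index))
```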

Locality sensitive hashing

Locality sensitive hashing (LSH) is a technique for finding similar data points in high-dimensional spaces. We provide the option to cluster spectra based on their similarity using LSH.

from imspy.timstof import TimsDatasetDDA
from imspy.algorithm.hashing import TimsHasher

# read a DDA dataset
tdf = TimsDatasetDDA("path/to/raw/folder.d", in_memory=False)

# read a frame
frame = tdf.get_tims_frame(1)

# create windows from frame
scans, indices, W = frame.to_dense_windows(
    window_length=5,
    resolution=1,
    overlapping=True,
)

# create a TimsHasher object
hasher = TimsHasher(trials=256, len_trial=22, seed=42, num_dalton=5, resolution=1)

# calculate `trials` keys for each window, each key having `len_trial` bits
K = hasher.calculate_keys(W)
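
To illustrate what "trials keys of len_trial bits" can mean, here is a sketch of one common LSH family, sign random projection, written with NumPy. Whether TimsHasher uses this exact scheme is an assumption; the function name and the toy data are hypothetical.

```python
import numpy as np


def random_projection_keys(windows: np.ndarray, trials: int, len_trial: int,
                           seed: int = 42) -> np.ndarray:
    """Hash each row of `windows` into `trials` integer keys of `len_trial` bits."""
    rng = np.random.default_rng(seed)
    n_windows, dim = windows.shape
    # one random hyperplane per bit
    planes = rng.standard_normal((trials * len_trial, dim))
    # a bit is set when the window lies on the positive side of its hyperplane
    bits = (windows @ planes.T) > 0
    bits = bits.reshape(n_windows, trials, len_trial)
    # pack the bits of every trial into a single integer key
    weights = 1 << np.arange(len_trial, dtype=np.int64)
    return (bits * weights).sum(axis=2)


rng = np.random.default_rng(0)
toy_windows = rng.random((3, 50))
toy_windows[1] = toy_windows[0]  # exact duplicate of the first window

keys = random_projection_keys(toy_windows, trials=8, len_trial=16)
print(keys.shape)  # (3, 8)
```

Windows that collide in at least one trial become candidate pairs for clustering. More bits per key make collisions rarer, while more trials give similar windows more chances to collide.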

Mixture models

Pipeline: DDA data analysis (imspy_dda)

After you have successfully installed the package, you can use the imspy_dda command line tool to analyze DDA data. The following command prints the list of options and arguments you can use to analyze your data:

imspy_dda --help

Pipeline: Synthetic raw data generation (timsim)

After you have successfully installed the package, you can use the timsim command line tool to generate synthetic raw data. The following command prints the list of options and arguments you can use to generate synthetic raw data:

timsim --help


imspy 0.3.10
