Analysis, conversion and visualization of diaPASEF data.

Project description

Authors: Max Frank, Hannes Roest
Date: 2018-04-26

diapysef is a convenience package for working with DIA-PASEF data. It converts Bruker raw files into a format that OpenMS can read, so that OpenSwath can be used to analyze the data and TOPPView to visualize it. diapysef also has some basic visualization capabilities of its own: it can display the window settings of a DIA-PASEF run in the context of a precursor map.

Installation

We have not uploaded this package to PyPI, since it contains some small example data and small amounts of Bruker code. You can install the package from the provided wheel. Make sure you have Python and pip installed. Then, in your terminal or command prompt, run:

## Optional: if conversion with compression is required, install the newest pyopenms nightly build first
## Otherwise, from the folder containing the .whl file, run
pip install diapysef-0.1-py2.py3-none-any.whl

On Windows, make sure that you add the Scripts/ folder of your Python installation to your PATH so that you can call the command line tools from anywhere.
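One quick way to confirm that the command line tools are reachable is to ask Python where it would find them. The following is a minimal sketch using only the standard library; the helper name find_tool is illustrative and not part of diapysef. On Windows the lookup also depends on the PATHEXT setting, so a miss there is not conclusive.

```python
import shutil


def find_tool(name):
    """Return the full path of an executable found on PATH, or None."""
    return shutil.which(name)


# Script names as used elsewhere in this README.
for tool in ("convertTDFtoMzML.py", "get_dia_windows.py"):
    location = find_tool(tool)
    print(tool, "->", location if location else "not found on PATH")
```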

Converting raw files

Assuming you have added the Python scripts folder to your PATH, you can simply run:

convertTDFtoMzML.py

If you see output like this, the Bruker SDK was not found:

Bruker sdk not found. Some functionalities that need access to raw data will not be available. To activate that functionality place libtimsdata.so (Linux) or timsdata.dll in the src folder.

This functionality can only be carried out if the bruker sdk is present. Please install it first. The sdk can be installed by installing proteowizard(version >=3, http://proteowizard.sourceforge.net), or by placing the a library file in your path (For windows this will be timsdata.dll and for Linux libtimsdata.so).

You will have to install a Bruker SDK that can handle TDF3.0. You can either place the SDK library file in your working directory (the safest option) or somewhere in your PATH. Another option is to install the latest version of ProteoWizard, which supports access to the Bruker SDK.
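The check below is a rough sketch of how one might confirm that the library file is present in the working directory or on PATH before running the converter. It only tests that the file exists; it does not verify that the SDK supports TDF3.0, and it is an assumption that diapysef searches the same locations in the same order. The function names are illustrative.

```python
import os
import sys
from pathlib import Path


def sdk_library_name(platform=sys.platform):
    """File name of the SDK library named in the message above."""
    return "timsdata.dll" if platform.startswith("win") else "libtimsdata.so"


def locate_sdk(search_dirs=None, platform=sys.platform):
    """Return the path of the first SDK library file found, or None."""
    if search_dirs is None:
        # Working directory first, then every entry on PATH.
        search_dirs = [os.getcwd()] + os.environ.get("PATH", "").split(os.pathsep)
    name = sdk_library_name(platform)
    for directory in search_dirs:
        if not directory:
            continue  # skip empty PATH entries
        candidate = Path(directory) / name
        if candidate.is_file():
            return candidate
    return None


print(locate_sdk() or "SDK library not found in the working directory or on PATH")
```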

Now you can run the tool without arguments to get the usage info:

convertTDFtoMzML.py
Found Bruker sdk. Access to the raw data is possible.

usage: convertTDFtoMzML.py [-h] -a ANALYSIS_DIR -o OUTPUT_FNAME
                           [-m MERGE_SCANS] [-r FRAME_LIMIT FRAME_LIMIT]
convertTDFtoMzML.py: error: the following arguments are required: -a/--analysis_dir, -o/--output_name
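For scripted conversions, the argument list can be assembled in Python from the usage text above. This is a sketch only: the flags come from the usage message, but what values -m and -r expect is not described here, so both are passed through unchanged. The helper name and the file names are placeholders.

```python
import subprocess


def build_convert_command(analysis_dir, output_fname, merge_scans=None, frame_limit=None):
    """Assemble the argument list for convertTDFtoMzML.py."""
    cmd = ["convertTDFtoMzML.py", "-a", str(analysis_dir), "-o", str(output_fname)]
    if merge_scans is not None:
        cmd += ["-m", str(merge_scans)]
    if frame_limit is not None:
        first, second = frame_limit  # the usage text shows that -r takes two values
        cmd += ["-r", str(first), str(second)]
    return cmd


cmd = build_convert_command("run.d", "run.mzML")  # placeholder file names
print(" ".join(cmd))

# To actually run the conversion (needs the SDK and the script on PATH):
# subprocess.run(cmd, check=True)
```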

Data access and convenience functions

The rest of the tools are available as scripts but can also be used in a more modular fashion from within Python directly. diapysef can access raw files from both PASEF and DIA-PASEF runs and can read some MaxQuant txt files. Since these functions do not actually need access to the raw data, they can also be run without the SDK.

Obtaining a window layout file

This can be done with a command line tool:

get_dia_windows.py 20180320_AnBr_SA_diaPASEF_200ng_HeLa_Rost_Method_4_a_01_A1_01_2143.d/ windows.csv

Or in python:

import diapysef as dp

# Open connection to a DIA-PASEF run
dia = dp.TimsData("/media/max/D6E01AF3E01ADA17/code/dia-pasef/bruker/20180320_AnBr_SA_diaPASEF_200ng_HeLa_Rost_Method_4_a_01_A1_01_2143.d/")
# Obtain the window layout from the first frames
win = dia.get_windows()
# Save as csv
win.to_csv("window_layout.csv")
print("File Written")
File Written
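The columns of the window layout file are not listed in this README, so it can help to inspect the file before using it further. The sketch below uses only the standard library csv module; summarize_csv is an illustrative helper, not a diapysef function.

```python
import csv
import os


def summarize_csv(path):
    """Return (column names, number of data rows) for a CSV file."""
    with open(path, newline="") as handle:
        reader = csv.reader(handle)
        header = next(reader, [])
        n_rows = sum(1 for _ in reader)
    return header, n_rows


# Only inspect the file if the step above has already written it.
if os.path.exists("window_layout.csv"):
    header, n_rows = summarize_csv("window_layout.csv")
    print("columns:", header)
    print("rows:", n_rows)
```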

Annotating ion mobilities

This converts scan numbers, which correspond to different ion mobilities depending on the run, to 1/K0, which is a more standardized measure.

This is needed, for example, to generate a library for OpenSwath targeted extraction. We can annotate ion mobilities with 1/K0 values in a MaxQuant output using the calibration information in the raw file.

annotate_mq_ionmobility.py 20180309_HeLa_MQ_combined/ 20180309_TIMS1_Metab_AnBr_SA_200ng_HELA_Bremen13_14_A1_01_2129.d/ annotated1K0

Or in python:

import diapysef as dp

# Open connection to the PASEF data file
pas = dp.PasefData("/media/max/D6E01AF3E01ADA17/code/dia-pasef/bruker/20180309_TIMS1_Metab_AnBr_SA_200ng_HELA_Bremen13_14_A1_01_2129.d/")
# Open connection to the MaxQuant output from the same run
mq = dp.PasefMQData("/media/max/D6E01AF3E01ADA17/code/dia-pasef/bruker/20180309_HeLa_MQ_combined/")

## Annotate all peptides
# Read in the allPeptides table from the output and annotate with 1/K0 using the calibration obtained from pas
mq.get_all_peptides()
mq.annotate_ion_mobility(pas)
# Or more directly
mq.get_all_peptides(pas)
# Save the table
all_pep = mq.all_peptides
all_pep.to_csv("all_peptides_1K0.csv")

## Annotate evidence
# Read in the evidence table from the output and annotate with 1/K0 using the calibration obtained from pas
mq.get_evidence()
mq.annotate_ion_mobility(pas)
# Or more directly
mq.get_evidence(pas)
# Save the table
ev = mq.evidence
ev.to_csv("evidence_1K0.csv")
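To illustrate the idea of mapping run-specific scan numbers onto 1/K0, the sketch below interpolates linearly between a few calibration points. It is a conceptual illustration only: the calibration values are made up, the real conversion uses the calibration stored in the raw file, and that calibration need not be linear. The function name is hypothetical.

```python
from bisect import bisect_right


def interpolate(x, xs, ys):
    """Piecewise-linear interpolation of y at x; xs must be strictly increasing."""
    if not xs[0] <= x <= xs[-1]:
        raise ValueError("x lies outside the calibrated range")
    i = min(bisect_right(xs, x), len(xs) - 1)
    x0, x1 = xs[i - 1], xs[i]
    y0, y1 = ys[i - 1], ys[i]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


# Hypothetical calibration points (scan number -> 1/K0), for illustration only.
calibration_scans = [0, 500, 1000]
calibration_inv_k0 = [1.6, 1.1, 0.6]

print(f"{interpolate(250, calibration_scans, calibration_inv_k0):.3f}")  # prints 1.350
```

Because the same scan number can map to a different 1/K0 in another run, each run needs its own calibration points.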

Plotting window layouts

The above operations let you obtain a precursor map (either with all MS1 features or with the peptide evidence) and a window layout. It is informative to plot these together to get some insight into how well the windows cover the precursor space.

We provide the following plotting function as a command line script:

plot_dia_windows.py window_layout.csv all_peptides_1K0.csv

Or in python:

import diapysef as dp
import pandas as pd

dia = dp.TimsData("/media/max/D6E01AF3E01ADA17/code/dia-pasef/bruker/20180320_AnBr_SA_diaPASEF_200ng_HeLa_Rost_Method_4_a_01_A1_01_2143.d/")
win = dia.get_windows()
# diapysef stores a precursor layout from a PASEF run internally, so windows can be plotted quickly without
# specifying a precursor map
dp.plot_window_layout(windows = win)

# If the windows should be plotted against a certain precursor map (e.g. all_peptides obtained above) you can specify
# an additional dataframe
precursors = pd.read_csv("all_peptides_1K0.csv")

dp.plot_window_layout(windows = win, precursor_map = precursors)
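Alongside the plot, a single number can summarize how well the windows cover the precursor space: the fraction of precursors that fall inside at least one window. The sketch below works on plain tuples with made-up values. The tuple layout is an assumption, since the column names of the files written by diapysef are not stated here, so the real tables would have to be mapped onto it first.

```python
def fraction_covered(precursors, windows):
    """Share of precursors that fall inside at least one isolation window.

    precursors: iterable of (mz, inv_k0) tuples
    windows:    iterable of (mz_min, mz_max, im_min, im_max) tuples
    """
    precursors = list(precursors)
    windows = list(windows)
    if not precursors:
        return 0.0
    covered = sum(
        1
        for mz, im in precursors
        if any(
            mz_lo <= mz <= mz_hi and im_lo <= im <= im_hi
            for mz_lo, mz_hi, im_lo, im_hi in windows
        )
    )
    return covered / len(precursors)


# Made-up values for illustration only.
example_windows = [(400.0, 425.0, 0.70, 0.90), (425.0, 450.0, 0.75, 0.95)]
example_precursors = [(410.2, 0.80), (430.5, 0.70), (600.0, 1.10), (440.0, 0.90)]

print(fraction_covered(example_precursors, example_windows))  # prints 0.5
```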
