Extract MS2/MGF files from Bruker .D folders
tdfextractor
A Python package to extract MS/MS spectra from Bruker TimsTOF .D folders and convert them to standard formats (MS2 and MGF).
Installation
pip install tdfextractor
Usage
tdfextractor provides two command-line tools for extracting spectra:
MS2 Extraction
Extract MS2 format files (compatible with MS-GF+, Comet, etc.):
ms2-extractor /path/to/sample.d
# shorthand
ms2-ex
ms2-ex /path/to/sample.d --output custom_output.ms2 --min-spectra-intensity 100 --min-precursor-charge 2
ms2-ex /path/to/directory_with_multiple_d_folders --output /path/to/output_directory
MGF Extraction
Extract MGF format files:
mgf-extractor /path/to/sample.d
# shorthand
mgf-ex
mgf-ex /path/to/sample.d --casanovo # Optimized for Casanovo de novo sequencing
mgf-ex /path/to/directory_with_multiple_d_folders --output /path/to/output_directory
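For orientation, an MGF file is a sequence of `BEGIN IONS` ... `END IONS` records, each holding a few header lines followed by one m/z and intensity pair per line. The sketch below formats one such record in plain Python. The function name `format_mgf_record` and the choice of header fields are assumptions for illustration; the exact headers tdfextractor writes are not documented here. The precision defaults mirror the `--mz-precision` and `--intensity-precision` defaults listed in the arguments table.

```python
from typing import Iterable, Tuple


def format_mgf_record(
    title: str,
    precursor_mz: float,
    charge: int,
    rt_seconds: float,
    peaks: Iterable[Tuple[float, float]],
    mz_precision: int = 5,
    intensity_precision: int = 0,
) -> str:
    """Hypothetical helper: render one spectrum as an MGF record."""
    lines = [
        "BEGIN IONS",
        f"TITLE={title}",
        f"PEPMASS={precursor_mz:.{mz_precision}f}",
        f"CHARGE={charge}+",
        f"RTINSECONDS={rt_seconds:.2f}",
    ]
    # One "m/z intensity" pair per line, rounded to the requested precision.
    for mz, intensity in peaks:
        lines.append(f"{mz:.{mz_precision}f} {intensity:.{intensity_precision}f}")
    lines.append("END IONS")
    return "\n".join(lines) + "\n"


record = format_mgf_record("scan=1", 500.25, 2, 12.5, [(100.1, 1000.4)])
```

With the defaults, m/z values carry five decimal places and intensities are written as whole numbers.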
Output Options
Both extractors support flexible output options:
- No output specified: Files are created within each .D folder with auto-generated names
- Specific file path: Use -o filename.ms2 or -o filename.mgf for single .D folder processing
- Output directory: Use -o /path/to/output_dir for batch processing multiple .D folders
- Overwrite protection: Use --overwrite to replace existing output files
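The output rules above can be sketched as a small path-resolution function. This is an illustration only, not tdfextractor's actual code: the helper name `resolve_output_path` is hypothetical, and it assumes that the auto-generated name is the .D folder name without its `.d` suffix and that `-o` is treated as a file only when it ends in the expected extension.

```python
from pathlib import Path
from typing import Optional


def resolve_output_path(analysis_dir: str, output: Optional[str], ext: str) -> Path:
    """Hypothetical helper: choose the output file for a single .D folder."""
    d_folder = Path(analysis_dir)
    # Assumption: sample.d -> sample.ms2 / sample.mgf
    default_name = f"{d_folder.stem}.{ext}"
    if output is None:
        # No -o given: write inside the .D folder itself.
        return d_folder / default_name
    out = Path(output)
    if out.suffix.lower() == f".{ext}":
        # -o names a specific file.
        return out
    # Otherwise treat -o as an output directory.
    return out / default_name
```

For example, `resolve_output_path("/data/sample.d", None, "ms2")` yields `/data/sample.d/sample.ms2` under these assumptions.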
Batch Processing
When processing multiple .D folders, the extractors will:
- Automatically find all .D folders in the specified directory
- Create output files with names matching the .D folder names
- Skip existing files unless --overwrite is specified
- Create the output directory if it doesn't exist
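The batch behaviour listed above could look roughly like the following sketch. The function names `find_d_folders` and `plan_batch` are hypothetical, and the sketch assumes that only direct children of the input directory are searched and that the `.d` suffix is matched case-insensitively; the real tool may differ on both points.

```python
from pathlib import Path
from typing import List, Tuple


def find_d_folders(root: Path) -> List[Path]:
    """Return root itself if it is a .D folder, else its direct .D children."""
    if root.suffix.lower() == ".d":
        return [root]
    return sorted(
        p for p in root.iterdir() if p.is_dir() and p.suffix.lower() == ".d"
    )


def plan_batch(
    root: Path, out_dir: Path, ext: str, overwrite: bool = False
) -> List[Tuple[Path, Path]]:
    """Pair each .D folder with its output file, skipping existing outputs."""
    # Create the output directory if it doesn't exist.
    out_dir.mkdir(parents=True, exist_ok=True)
    jobs = []
    for d_folder in find_d_folders(root):
        # Output name matches the .D folder name.
        target = out_dir / f"{d_folder.stem}.{ext}"
        if target.exists() and not overwrite:
            continue  # skipped unless overwrite is requested
        jobs.append((d_folder, target))
    return jobs
```

Each returned pair is one extraction job: the .D folder to read and the file to write.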
Command Line Arguments
Both MS2 and MGF extractors share the same arguments, with only a few format-specific options:
| Argument | Type | Default | Description |
|---|---|---|---|
| analysis_dir | str | - | Path to the .D analysis directory or directory containing .D folders |
| -o, --output | str | <analysis_dir_name>.<ext> | Output file path or directory |
| --remove-precursor | flag | False | Remove precursor peaks from MS/MS spectra |
| --precursor-peak-width | float | 2.0 | Width around precursor m/z to remove (Da) |
| --batch-size | int | 100 | Batch size for processing spectra |
| --top-n-peaks | int | None | Keep only top N most intense peaks per spectrum |
| --min-spectra-intensity | float | None | Minimum intensity threshold for MS/MS peaks (absolute or 0.0-1.0 for percentage) |
| --max-spectra-intensity | float | None | Maximum intensity threshold for MS/MS peaks (absolute or 0.0-1.0 for percentage) |
| --min-spectra-mz | float | None | Minimum m/z filter for MS/MS peaks |
| --max-spectra-mz | float | None | Maximum m/z filter for MS/MS peaks |
| --min-precursor-intensity | int | None | Minimum precursor intensity filter |
| --max-precursor-intensity | int | None | Maximum precursor intensity filter |
| --min-precursor-charge | int | None | Minimum precursor charge state filter |
| --max-precursor-charge | int | None | Maximum precursor charge state filter |
| --min-precursor-mz | float | None | Minimum precursor m/z filter |
| --max-precursor-mz | float | None | Maximum precursor m/z filter |
| --min-precursor-rt | float | None | Minimum precursor retention time filter (seconds) |
| --max-precursor-rt | float | None | Maximum precursor retention time filter (seconds) |
| --min-precursor-ccs | float | None | Minimum precursor CCS filter |
| --max-precursor-ccs | float | None | Maximum precursor CCS filter |
| --min-precursor-neutral-mass | float | None | Minimum precursor neutral mass filter |
| --max-precursor-neutral-mass | float | None | Maximum precursor neutral mass filter |
| --mz-precision | int | 5 | Number of decimal places for m/z values |
| --intensity-precision | int | 0 | Number of decimal places for intensity values |
| --keep-empty-spectra | flag | False | Write empty spectra to output file |
| --overwrite | flag | False | Overwrite existing output files |
| -v, --verbose | flag | False | Enable verbose logging |
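To make the peak-level options more concrete, here is a minimal sketch of how --remove-precursor, --precursor-peak-width, --min-spectra-intensity and --top-n-peaks might combine on one spectrum. It is not the package's implementation. The function `filter_peaks` is hypothetical, and it assumes three things the table does not state: the precursor width is a total window centred on the precursor m/z, a threshold between 0.0 and 1.0 is a fraction of the most intense remaining peak, and the filters run in the order shown.

```python
from typing import List, Optional, Tuple

Peak = Tuple[float, float]  # (m/z, intensity)


def filter_peaks(
    peaks: List[Peak],
    precursor_mz: Optional[float] = None,
    precursor_peak_width: float = 2.0,
    min_intensity: Optional[float] = None,
    top_n: Optional[int] = None,
) -> List[Peak]:
    """Hypothetical peak filter mirroring a few of the CLI options."""
    kept = list(peaks)
    if precursor_mz is not None:
        # Assumption: the width is the full window, so half on each side.
        half = precursor_peak_width / 2.0
        kept = [(mz, i) for mz, i in kept if abs(mz - precursor_mz) > half]
    if min_intensity is not None and kept:
        threshold = min_intensity
        if 0.0 <= min_intensity <= 1.0:
            # Assumption: fractional thresholds are relative to the base peak.
            threshold = min_intensity * max(i for _, i in kept)
        kept = [(mz, i) for mz, i in kept if i >= threshold]
    if top_n is not None:
        # Keep the N most intense peaks.
        kept = sorted(kept, key=lambda p: p[1], reverse=True)[:top_n]
    return sorted(kept)  # restore ascending m/z order


peaks = [(100.0, 10.0), (200.0, 500.0), (500.2, 1000.0), (650.0, 50.0), (700.0, 4.0)]
filtered = filter_peaks(peaks, precursor_mz=500.0, min_intensity=0.01, top_n=2)
```

In this example the peak at 500.2 falls inside the 2.0 Da precursor window and is dropped, the peak at 700.0 falls below 1% of the remaining base peak, and the two most intense survivors are kept.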
Format-Specific Arguments
MS2 Extractor Only:
--ip2: Use IP2 preset settings (sets min charge to 1)
MGF Extractor Only:
--casanovo: Use Casanovo preset settings (enables precursor removal, top-150 peaks, min intensity 0.01, m/z range 50-2500, min charge 1)
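Read against the shared arguments table, the --casanovo description suggests a mapping like the one sketched below. This is an interpretation, not the package's source: it assumes "min intensity" refers to --min-spectra-intensity, the "m/z range" refers to the MS/MS peak m/z filters, and "min charge" refers to --min-precursor-charge. The names `CASANOVO_PRESET` and `preset_to_argv` are hypothetical.

```python
from typing import Dict, List, Union

# Assumed mapping of the --casanovo description onto the shared arguments.
CASANOVO_PRESET: Dict[str, Union[bool, int, float]] = {
    "--remove-precursor": True,
    "--top-n-peaks": 150,
    "--min-spectra-intensity": 0.01,
    "--min-spectra-mz": 50,
    "--max-spectra-mz": 2500,
    "--min-precursor-charge": 1,
}


def preset_to_argv(preset: Dict[str, Union[bool, int, float]]) -> List[str]:
    """Expand a preset into the equivalent explicit command-line arguments."""
    argv: List[str] = []
    for flag, value in preset.items():
        if value is True:
            argv.append(flag)  # boolean flags take no value
        elif value is not False and value is not None:
            argv.extend([flag, str(value)])
    return argv


explicit_args = preset_to_argv(CASANOVO_PRESET)
```

If the mapping holds, passing these arguments explicitly to mgf-ex should be equivalent to passing --casanovo, which is useful when only one of the preset values needs changing.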