Extract MS2, MGF, and mzML files from Bruker timsTOF .d folders

tdfextractor

A Python package that extracts MS/MS spectra from Bruker timsTOF .d folders and converts them to standard formats (MS2, MGF, and mzML).

Installation

pip install tdfextractor

Usage

tdfextractor provides three command-line tools for extracting spectra, each with a shorthand alias:

MS2 Extraction

Extract MS2 format files (compatible with MS-GF+, Comet, etc.):

ms2-extractor /path/to/sample.d

# shorthand
ms2-ex /path/to/sample.d
ms2-ex /path/to/sample.d --output custom_output.ms2 --min-spectra-intensity 100 --min-precursor-charge 2
ms2-ex /path/to/directory_with_multiple_d_folders --output /path/to/output_directory

MGF Extraction

Extract MGF format files:

mgf-extractor /path/to/sample.d

# shorthand
mgf-ex /path/to/sample.d
mgf-ex /path/to/sample.d --casanovo  # Optimized for Casanovo de novo sequencing
mgf-ex /path/to/directory_with_multiple_d_folders --output /path/to/output_directory

mzML Extraction

Extract mzML format files (includes both MS1 and MS2 PASEF spectra):

mzml-extractor /path/to/sample.d

# shorthand
mzml-ex /path/to/sample.d
mzml-ex /path/to/sample.d --no-ms1  # MS2 spectra only
mzml-ex /path/to/sample.d --mz-compression zstd --intensity-encoding 32
mzml-ex /path/to/directory_with_multiple_d_folders --output /path/to/output_directory
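
In a scripted pipeline the same commands can be launched from Python. The following is a minimal sketch, assuming the tools are on PATH after installation. `build_command` and `run_extractor` are hypothetical helpers, not part of tdfextractor, and only flags shown in this README are used.

```python
import shutil
import subprocess
from typing import Optional, Sequence


# Hypothetical helper (not part of tdfextractor): assemble the argument
# vector for one of the command-line tools documented above.
def build_command(tool: str, analysis_dir: str, output: Optional[str] = None,
                  extra: Sequence[str] = ()) -> list[str]:
    cmd = [tool, analysis_dir]
    if output is not None:
        cmd += ["--output", output]
    cmd += list(extra)
    return cmd


def run_extractor(cmd: Sequence[str]) -> int:
    """Run the command if the tool is installed and return its exit code."""
    if shutil.which(cmd[0]) is None:
        raise FileNotFoundError(f"{cmd[0]} not found on PATH; is tdfextractor installed?")
    return subprocess.run(list(cmd), check=False).returncode


cmd = build_command("mzml-ex", "/path/to/sample.d", extra=["--no-ms1"])
print(" ".join(cmd))
```

Passing the arguments as a list rather than a single shell string avoids quoting problems when paths contain spaces.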

Output Options

All three extractors support flexible output options:

- No output specified: Files are created within each .d folder with auto-generated names
- Specific file path: Use -o filename.ms2 or -o filename.mgf for single .d folder processing
- Output directory: Use -o /path/to/output_dir for batch processing multiple .d folders
- Overwrite protection: Existing output files are left untouched unless --overwrite is given
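
The three output modes can be pictured with a short sketch. This is an illustration only: `resolve_output` is a hypothetical function, and the assumptions that the auto-generated name is the folder name with its .d suffix replaced by the format extension, and that a path ending in that extension is treated as a file, may not match tdfextractor's actual behaviour.

```python
from pathlib import Path
from typing import Optional


# Hypothetical sketch of the three output modes; tdfextractor's real
# resolution logic and auto-generated names may differ.
def resolve_output(d_folder: Path, output: Optional[Path], ext: str) -> Path:
    default_name = f"{d_folder.stem}.{ext}"   # e.g. sample.d -> sample.ms2
    if output is None:
        return d_folder / default_name        # written inside the .d folder
    if output.suffix.lower() == f".{ext}":
        return output                         # explicit file path
    return output / default_name              # treated as an output directory


print(resolve_output(Path("/data/sample.d"), None, "ms2"))
print(resolve_output(Path("/data/sample.d"), Path("/out"), "mgf"))
```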

Batch Processing

When processing multiple .d folders, the extractors will:

- Automatically find all .d folders in the specified directory
- Create output files with names matching the .d folder names
- Skip existing files unless --overwrite is specified
- Create the output directory if it doesn't exist
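
The batch steps above can be sketched as follows. `find_d_folders` and `plan_batch` are hypothetical names; the sketch assumes that only the direct children of the given directory are searched and that the .d suffix is matched case-insensitively, neither of which is confirmed by this README.

```python
from pathlib import Path


# Hypothetical sketch of batch planning; not tdfextractor's actual code.
def find_d_folders(root: Path) -> list[Path]:
    """Return root itself if it is a .d folder, else the .d folders directly inside it."""
    if root.suffix.lower() == ".d":
        return [root]
    return sorted(p for p in root.iterdir() if p.is_dir() and p.suffix.lower() == ".d")


def plan_batch(root: Path, out_dir: Path, ext: str,
               overwrite: bool = False) -> list[tuple[Path, Path]]:
    """Pair each .d folder with its output file, skipping outputs that already exist."""
    out_dir.mkdir(parents=True, exist_ok=True)   # create the output directory if missing
    jobs = []
    for folder in find_d_folders(root):
        target = out_dir / f"{folder.stem}.{ext}"
        if target.exists() and not overwrite:
            continue                              # skipped unless overwrite is requested
        jobs.append((folder, target))
    return jobs
```

Planning the jobs before converting anything makes it easy to report which folders will be skipped before any long-running extraction starts.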

Command Line Arguments

The MS2 and MGF extractors share the same arguments, apart from a few format-specific options:

Argument	Type	Default	Description
`analysis_dir`	str	-	Path to the .d analysis directory or directory containing .d folders
`-o, --output`	str	`<analysis_dir_name>.<ext>`	Output file path or directory
`--remove-precursor`	flag	False	Remove precursor peaks from MS/MS spectra
`--precursor-peak-width`	float	2.0	Width around precursor m/z to remove (Da)
`--batch-size`	int	100	Batch size for processing spectra
`--top-n-peaks`	int	None	Keep only top N most intense peaks per spectrum
`--min-spectra-intensity`	float	None	Minimum intensity threshold for MS/MS peaks (absolute value, or a fraction between 0.0 and 1.0 for a relative threshold)
`--max-spectra-intensity`	float	None	Maximum intensity threshold for MS/MS peaks (absolute value, or a fraction between 0.0 and 1.0 for a relative threshold)
`--min-spectra-mz`	float	None	Minimum m/z filter for MS/MS peaks
`--max-spectra-mz`	float	None	Maximum m/z filter for MS/MS peaks
`--min-precursor-intensity`	float	None	Minimum precursor intensity filter
`--max-precursor-intensity`	float	None	Maximum precursor intensity filter
`--min-precursor-charge`	int	None	Minimum precursor charge state filter
`--max-precursor-charge`	int	None	Maximum precursor charge state filter
`--min-precursor-mz`	float	None	Minimum precursor m/z filter
`--max-precursor-mz`	float	None	Maximum precursor m/z filter
`--min-precursor-rt`	float	None	Minimum precursor retention time filter (seconds)
`--max-precursor-rt`	float	None	Maximum precursor retention time filter (seconds)
`--min-precursor-ccs`	float	None	Minimum precursor CCS filter
`--max-precursor-ccs`	float	None	Maximum precursor CCS filter
`--min-precursor-neutral-mass`	float	None	Minimum precursor neutral mass filter
`--max-precursor-neutral-mass`	float	None	Maximum precursor neutral mass filter
`--mz-precision`	int	5	Number of decimal places for m/z values
`--intensity-precision`	int	0	Number of decimal places for intensity values
`--keep-empty-spectra`	flag	False	Write empty spectra to output file
`--overwrite`	flag	False	Overwrite existing output files
`--workers`	int	1	Number of worker threads for processing multiple .d folders
`-v, --verbose`	flag	False	Enable verbose logging
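
To make the precursor filters concrete, here is a minimal sketch of how min/max bounds and the neutral mass filter could be evaluated. The function names are hypothetical, the bounds are assumed to be inclusive, and the neutral mass formula assumes positively charged, protonated precursors; tdfextractor's own implementation may differ.

```python
from typing import Optional

PROTON_MASS = 1.007276466  # Da


def neutral_mass(mz: float, charge: int) -> float:
    """Neutral mass of a protonated precursor: (m/z - proton mass) * charge."""
    return (mz - PROTON_MASS) * charge


def in_range(value: float, lo: Optional[float], hi: Optional[float]) -> bool:
    """A bound of None means 'no filter', mirroring the None defaults in the table."""
    return (lo is None or value >= lo) and (hi is None or value <= hi)


# Hypothetical filter covering three of the precursor criteria from the table.
def keep_precursor(mz: float, charge: int, rt_seconds: float, *,
                   min_charge=None, max_charge=None,
                   min_rt=None, max_rt=None,
                   min_neutral_mass=None, max_neutral_mass=None) -> bool:
    return (in_range(charge, min_charge, max_charge)
            and in_range(rt_seconds, min_rt, max_rt)
            and in_range(neutral_mass(mz, charge), min_neutral_mass, max_neutral_mass))


print(round(neutral_mass(500.0, 2), 4))
```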

Format-Specific Arguments

MS2 Extractor Only:

--ip2: Use IP2 preset settings (sets min charge to 2, top 500 peaks)

MGF Extractor Only:

--casanovo: Use Casanovo preset settings (enables precursor removal, top-150 peaks, min intensity 0.01, m/z range 50-2500, min charge 2)
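
The peak-level part of the Casanovo preset can be illustrated with a small filter. This is a sketch under stated assumptions, not the package's code: `filter_peaks` is a hypothetical function, the precursor peak width is taken as the full window centred on the precursor m/z, and an intensity threshold between 0.0 and 1.0 is taken as a fraction of the most intense remaining peak.

```python
# Illustrative peak filtering; parameter names mirror the CLI flags, but the
# logic is an assumption, not tdfextractor's implementation.
def filter_peaks(peaks, precursor_mz, *, remove_precursor=True,
                 precursor_peak_width=2.0, min_mz=50.0, max_mz=2500.0,
                 min_intensity=0.01, top_n=150):
    """peaks: list of (mz, intensity) tuples. Returns the kept peaks sorted by m/z."""
    kept = [(mz, it) for mz, it in peaks if min_mz <= mz <= max_mz]
    if remove_precursor:
        half = precursor_peak_width / 2.0   # assumption: width is the full window
        kept = [(mz, it) for mz, it in kept if abs(mz - precursor_mz) > half]
    if kept:
        base = max(it for _, it in kept)
        # assumption: values in [0, 1] are fractions of the most intense remaining peak
        threshold = min_intensity * base if 0.0 <= min_intensity <= 1.0 else min_intensity
        kept = [(mz, it) for mz, it in kept if it >= threshold]
    # keep only the top-N most intense peaks, then restore m/z order
    kept = sorted(kept, key=lambda p: p[1], reverse=True)[:top_n]
    return sorted(kept)


peaks = [(40.0, 500.0), (175.1, 1000.0), (300.2, 5.0),
         (650.0, 800.0), (650.4, 900.0), (1200.5, 250.0)]
print(filter_peaks(peaks, precursor_mz=650.3))
```

In this example the peak at m/z 40.0 falls outside the m/z range, the two peaks near 650.3 are removed as precursor signal, and the peak at 300.2 falls below 1% of the remaining base peak.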

mzML Extractor Only:

Argument	Type	Default	Description
`--no-ms1`	flag	False	Skip MS1 spectra; write only MS2 PASEF spectra
`--mz-compression`	str	`zlib`	Compression for m/z arrays (`none`, `zlib`, `zstd`, `numpress-linear`, `numpress-slof`, `numpress-pic`)
`--intensity-compression`	str	`zlib`	Compression for intensity arrays
`--mobility-compression`	str	`zlib`	Compression for per-peak ion mobility arrays (MS1)
`--mz-encoding`	int	`64`	Bit width for m/z values (`32` or `64`)
`--intensity-encoding`	int	`32`	Bit width for intensity values (`32` or `64`)
`--centroid-noise-filter`	str	`none`	Noise filter before centroiding (`none`, `mad`, `percentile`, `histogram`, `baseline`, `iterative_median`)
`--centroid-mz-tolerance`	float	`8.0`	m/z tolerance for centroiding
`--centroid-mz-tolerance-type`	str	`ppm`	Unit for m/z tolerance (`ppm` or `da`)
`--centroid-im-tolerance`	float	`0.05`	Ion mobility tolerance for centroiding
`--centroid-im-tolerance-type`	str	`relative`	Unit for ion mobility tolerance (`relative` or `absolute`)
`--centroid-min-peaks`	int	`5`	Minimum raw peaks required to form a centroided peak
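
The encoding and compression options correspond to how mzML stores binary arrays: values are packed as little-endian 32- or 64-bit floats, optionally compressed, and then base64-encoded. The sketch below shows the `zlib` and `none` cases with the standard library only; `encode_array` and `decode_array` are illustrative names, and zstd and numpress are not covered.

```python
import base64
import struct
import zlib


def encode_array(values, bits=64, compress=True) -> str:
    """Pack floats little-endian at 32 or 64 bits, optionally zlib-compress, then base64-encode."""
    if bits not in (32, 64):
        raise ValueError("bits must be 32 or 64")
    fmt = "<%d%s" % (len(values), "f" if bits == 32 else "d")
    raw = struct.pack(fmt, *values)
    if compress:
        raw = zlib.compress(raw)
    return base64.b64encode(raw).decode("ascii")


def decode_array(text, bits=64, compressed=True):
    """Reverse of encode_array: base64-decode, decompress, unpack."""
    raw = base64.b64decode(text)
    if compressed:
        raw = zlib.decompress(raw)
    count = len(raw) // (bits // 8)
    fmt = "<%d%s" % (count, "f" if bits == 32 else "d")
    return list(struct.unpack(fmt, raw))


mz = [175.11895, 650.33421]
roundtrip64 = decode_array(encode_array(mz, bits=64), bits=64)
roundtrip32 = decode_array(encode_array(mz, bits=32), bits=32)
print(roundtrip64 == mz, roundtrip32 == mz)
```

A 64-bit round trip returns the m/z values unchanged, while 32 bits introduces small rounding errors. That trade-off is one plausible reason for the different defaults of `--mz-encoding` and `--intensity-encoding`.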

Performance Options

The --workers argument allows parallel processing of multiple .d folders:

# Process multiple .d folders with 4 worker threads
mgf-ex /path/to/directory_with_multiple_d_folders --workers 4

Note: --workers only has an effect when more than one .d folder is processed. Each worker handles one complete .d folder independently, so a single folder is not split across workers.
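
The one-folder-per-worker model can be sketched with a thread pool. This is an illustration of the idea rather than tdfextractor's implementation; `process_folders` and `fake_convert` are hypothetical, and the stand-in conversion only computes an output name.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from pathlib import Path
from typing import Callable, Iterable


def process_folders(folders: Iterable[Path], convert: Callable[[Path], Path],
                    workers: int = 1) -> dict[Path, Path]:
    """Run convert(folder) for every folder, with at most `workers` running at a time."""
    results = {}
    with ThreadPoolExecutor(max_workers=max(1, workers)) as pool:
        futures = {pool.submit(convert, folder): folder for folder in folders}
        for future in as_completed(futures):
            results[futures[future]] = future.result()   # re-raises any conversion error
    return results


# Stand-in for a real conversion: it only derives the output file name.
def fake_convert(folder: Path) -> Path:
    return folder.with_suffix(".mgf")


done = process_folders([Path("a.d"), Path("b.d"), Path("c.d")], fake_convert, workers=4)
print(sorted(p.name for p in done.values()))
```

With three folders and four workers, one worker stays idle, which matches the note that extra workers bring no benefit beyond the number of folders.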

Maintainers

pgarrett

Release history

Version	Release date
0.4.0 (this version)	Apr 10, 2026
0.3.0	Jun 11, 2025
0.2.1	Jun 10, 2025
0.2.0	Jun 10, 2025
0.1.8	Jun 11, 2023
0.1.7	Apr 13, 2023
0.1.6	Apr 13, 2023
0.1.5	Apr 13, 2023
0.1.4	Apr 13, 2023
0.1.3	Nov 21, 2022
0.1.2	Sep 29, 2022
0.1.1	Sep 29, 2022
0.1.0	Sep 29, 2022

Download files

Source Distribution

tdfextractor-0.4.0.tar.gz (31.3 kB)

Uploaded Apr 10, 2026 Source

Built Distribution

tdfextractor-0.4.0-py3-none-any.whl (29.7 kB)

Uploaded Apr 10, 2026 Python 3

File details

Details for the file tdfextractor-0.4.0.tar.gz.

File metadata

Download URL: tdfextractor-0.4.0.tar.gz
Upload date: Apr 10, 2026
Size: 31.3 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for tdfextractor-0.4.0.tar.gz
Algorithm	Hash digest
SHA256	`25fdd3fc99c83960f4166514a333404e8f116a6edbb76e3f6cbfa81724e062c4`
MD5	`f02acd1242ddff9a6a287b8a9c078e8b`
BLAKE2b-256	`2894610de6b3647130de5b6501f489c2d80ca6d735bc81806b130bc36f3ad0b0`
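
A downloaded file can be checked against the SHA256 digest listed above with Python's standard library. This is a minimal sketch; the helper names are illustrative, and the local file name in the commented call assumes the archive was saved under its original name in the current directory.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash the file in chunks so large downloads are not read into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path: Path, expected_hex: str) -> bool:
    return sha256_of(path) == expected_hex.strip().lower()


# SHA256 digest of tdfextractor-0.4.0.tar.gz as listed in the table above.
EXPECTED = "25fdd3fc99c83960f4166514a333404e8f116a6edbb76e3f6cbfa81724e062c4"
# verify(Path("tdfextractor-0.4.0.tar.gz"), EXPECTED)
```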

Provenance

The following attestation bundles were made for tdfextractor-0.4.0.tar.gz:

Publisher: python-publish.yml on tacular-omics/tdfextractor

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: tdfextractor-0.4.0.tar.gz
- Subject digest: 25fdd3fc99c83960f4166514a333404e8f116a6edbb76e3f6cbfa81724e062c4
- Sigstore transparency entry: 1273000772
- Sigstore integration time: Apr 10, 2026
Source repository:
- Permalink: tacular-omics/tdfextractor@5d3c3d6dfea05fe54e09eaa984753d5bbc8d367a
- Branch / Tag: refs/tags/v0.4.0
- Owner: https://github.com/tacular-omics
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: python-publish.yml@5d3c3d6dfea05fe54e09eaa984753d5bbc8d367a
- Trigger Event: release

File details

Details for the file tdfextractor-0.4.0-py3-none-any.whl.

File metadata

Download URL: tdfextractor-0.4.0-py3-none-any.whl
Upload date: Apr 10, 2026
Size: 29.7 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for tdfextractor-0.4.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`02855fb6515b29b427087322b82c6d032990c8f746c879cf5c76d23e09f6ed93`
MD5	`0f9482db970a77fa53601fdf4d927d2b`
BLAKE2b-256	`017a5f1836c798c5e2ed9385b2ff624a5f22086e9ed4b9f8d10f4b803c1e3c8e`

Provenance

The following attestation bundles were made for tdfextractor-0.4.0-py3-none-any.whl:

Publisher: python-publish.yml on tacular-omics/tdfextractor

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: tdfextractor-0.4.0-py3-none-any.whl
- Subject digest: 02855fb6515b29b427087322b82c6d032990c8f746c879cf5c76d23e09f6ed93
- Sigstore transparency entry: 1273001008
- Sigstore integration time: Apr 10, 2026
Source repository:
- Permalink: tacular-omics/tdfextractor@5d3c3d6dfea05fe54e09eaa984753d5bbc8d367a
- Branch / Tag: refs/tags/v0.4.0
- Owner: https://github.com/tacular-omics
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: python-publish.yml@5d3c3d6dfea05fe54e09eaa984753d5bbc8d367a
- Trigger Event: release
