A lightweight Python library for parsing mzML mass spectrometry files.
It implements a type-safe, lazy-loading API with direct support for modern mzML structures (>= 1.1.0).
Installation
pip install mzmlpy
Optional extras:
pip install mzmlpy[numpress] # MS-Numpress decoding
pip install mzmlpy[zstd] # Zstandard compression
pip install mzmlpy[rapidgzip] # Parallel gzip decompression (recommended for .gz files)
Quick Start
from mzmlpy import Mzml

with Mzml("path/to/file.mzML") as reader:
    print(f"File: {reader.file_name} | Spectra: {len(reader.spectra)}")
    for spectrum in reader.spectra:
        mz = spectrum.mz
        intensity = spectrum.intensity
        print(f"  {spectrum.id} MS{spectrum.ms_level} - {len(mz)} peaks")
Both .mzML and .mzML.gz files are supported. Metadata is parsed eagerly; binary data is decoded on demand.
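To illustrate what "decoded on demand" can mean in practice, here is a minimal sketch using only the standard library. It relies on the mzML convention that binary arrays are stored as base64 text holding little-endian 32- or 64-bit floats, optionally zlib-compressed. The `LazyBinaryArray` class and its parameters are hypothetical names chosen for illustration; this is not mzmlpy's internal implementation.

```python
import base64
import struct
import zlib
from functools import cached_property


class LazyBinaryArray:
    """Hypothetical holder: keeps the raw base64 text and decodes it on first use."""

    def __init__(self, encoded: str, *, compressed: bool = False,
                 double_precision: bool = True):
        self._encoded = encoded
        self._compressed = compressed
        self._double_precision = double_precision

    @cached_property
    def values(self) -> tuple[float, ...]:
        # Runs only on first access; the result is then stored on the instance.
        raw = base64.b64decode(self._encoded)
        if self._compressed:
            raw = zlib.decompress(raw)
        width, code = (8, "d") if self._double_precision else (4, "f")
        count = len(raw) // width
        return struct.unpack(f"<{count}{code}", raw)


# Build a payload the way a writer would: little-endian doubles, zlib, base64.
mz_values = (100.5, 200.25, 300.125)
payload = base64.b64encode(
    zlib.compress(struct.pack("<3d", *mz_values))
).decode("ascii")

array = LazyBinaryArray(payload, compressed=True)  # nothing is decoded yet
decoded = array.values                             # decoding happens here, once
print(decoded)
```

Constructing the object is cheap because it only stores a string; the base64 and zlib work is deferred until `values` is read, and repeated reads return the cached tuple.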
Reading Gzipped Files
When opening .mzML.gz files, the gzip_mode parameter controls how the file is accessed:
| Mode | Description |
|---|---|
| "extract" (default) | Decompress to <tmpdir>/mzmlpy/ and cache across sessions. First open pays decompression cost; subsequent opens reuse the cache instantly. The OS clears tmp on reboot. |
| "indexed" | Seekable access to the compressed file using rapidgzip. No decompression to disk. Requires pip install mzmlpy[rapidgzip]. |
| "stream" | Stream sequentially. Lowest startup cost but no efficient random access. |
For most use cases, "extract" or "indexed" is recommended:
# Default: extracts to tmp, cached across sessions
with Mzml("data.mzML.gz") as reader:
    spec = reader.spectra[0]

# Indexed: no extraction, seekable access (requires rapidgzip)
with Mzml("data.mzML.gz", gzip_mode="indexed") as reader:
    spec = reader.spectra[0]
To reclaim disk space before the OS clears tmp on reboot:
from mzmlpy import clear_cache
clear_cache()
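The extract-and-cache idea behind "extract" mode can be sketched with the standard library alone. The function name `extract_cached`, the cache-key scheme (path, size, and modification time), and the directory layout below are all assumptions made for illustration; mzmlpy's actual cache may be organised differently.

```python
import gzip
import hashlib
import shutil
import tempfile
from pathlib import Path


def extract_cached(gz_path: Path, cache_dir: Path) -> Path:
    """Decompress gz_path into cache_dir once; later calls return the cached copy."""
    stat = gz_path.stat()
    # Key on path, size, and mtime so a modified source file gets a fresh extraction.
    key = hashlib.sha256(
        f"{gz_path.resolve()}|{stat.st_size}|{stat.st_mtime_ns}".encode()
    ).hexdigest()[:16]
    target = cache_dir / f"{key}-{gz_path.name.removesuffix('.gz')}"
    if target.exists():
        return target  # cache hit: no decompression needed
    cache_dir.mkdir(parents=True, exist_ok=True)
    partial = target.with_name(target.name + ".partial")
    with gzip.open(gz_path, "rb") as src, open(partial, "wb") as dst:
        shutil.copyfileobj(src, dst)
    # Rename into place so a half-written file is never mistaken for a cache hit.
    partial.replace(target)
    return target


# Demo in a throwaway directory with a tiny stand-in file.
workdir = Path(tempfile.mkdtemp())
source = workdir / "demo.mzML.gz"
with gzip.open(source, "wb") as fh:
    fh.write(b"<mzML>demo</mzML>")

cache = workdir / "cache"
first = extract_cached(source, cache)   # pays the decompression cost
second = extract_cached(source, cache)  # reuses the extracted file
print(first == second)
```

Once the extracted copy exists, reads go against an ordinary uncompressed file, which is why this mode can match plain .mzML speed after the first open.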
Performance
Benchmarked on a real-world DDA file (33,535 spectra, first-open cold start, with rapidgzip):
| Mode | Startup | Iterate (500 spectra) | Random access (5 reads) |
|---|---|---|---|
| plain .mzML | 0.042s | 0.087s | 0.001s |
| in_memory=True | 1.499s | 0.362s | 0.002s |
| gzip_mode="extract" | 0.957s | 0.083s | 0.001s |
| gzip_mode="indexed" ¹ | 6.850s | 0.135s | 0.074s |
| gzip_mode="stream" | 0.089s | 0.155s | 22.8s |
¹ "indexed" startup includes building the gzip seek index and the mzML offset index on first open. Both are cached alongside the file, so subsequent opens are fast.
"extract" pays a one-time decompression cost (~1s for a large file) and then matches plain .mzML speed. "stream" is sequential-only: random access requires re-scanning from the start.
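The cost of random access on a plain gzip stream can be seen with Python's standard `gzip` module, which has no seek index: seeking forward decompresses and discards everything before the target, and seeking backward rewinds to the start first. The sketch below uses made-up fixed-width records rather than real spectra, and it shows the general behaviour of unindexed gzip, not mzmlpy's own code path.

```python
import gzip
import io

RECORD_SIZE = 16  # each record below is exactly 16 bytes
records = [f"record-{i:08d}\n".encode("ascii") for i in range(1000)]

# Compress all records into one in-memory gzip stream.
buffer = io.BytesIO()
with gzip.GzipFile(fileobj=buffer, mode="wb") as gz:
    gz.write(b"".join(records))
compressed = buffer.getvalue()


def read_record(data: bytes, index: int) -> bytes:
    """Fetch one record; every byte before it is decompressed along the way."""
    with gzip.GzipFile(fileobj=io.BytesIO(data), mode="rb") as gz:
        gz.seek(index * RECORD_SIZE)
        return gz.read(RECORD_SIZE)


last = read_record(compressed, 999)  # decompresses nearly the whole stream
first = read_record(compressed, 0)   # cheap: the record sits at the start
print(last, first)
```

Reading a record near the end costs roughly as much as reading the whole file, so a handful of scattered reads on a large file adds up quickly. A seek index, as in "indexed" mode, or a one-time extraction avoids that repeated work.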
For full usage examples see the Getting Started guide and API Reference.
Development
just lint # ruff check
just format # ruff isort + format
just ty # ty type checker
just test # pytest
# or all at once:
just check