Inline spectrum URL encoder — embeds a complete mass spectrum in a compact, URL-safe token.
spectrl
Inline spectrum URL encoder for mass spectrometry.
Encodes a complete mass spectrum — peaks, metadata, precursor info — into a single compact, URL-safe token. The entire spectrum lives in the string. No backend required.
spectrl1.<header>.<mz_array>.<intensity_array>[.<extra_arrays>]
Why
A Universal Spectrum Identifier (USI) references a spectrum stored in a repository; spectrl embeds the spectrum itself. Use spectrl when you want to share a spectrum directly in a URL, QR code, notebook, or paper, without requiring the reader to have access to the original file.
The two are complementary: a spectrl token can carry a USI back-link, and spectra too large to embed fall back to a USI reference.
Install
pip install spectrl
Requires Python 3.12+.
Quick start
Encode from mzmlpy
from mzmlpy.run import Mzml
from spectrl import encode_spectrum, from_mzmlpy
with Mzml("data.mzML") as mzml:
    spec = mzml.spectra[0]
    token = encode_spectrum(from_mzmlpy(spec))
print(token)
# spectrl1.hQ...
Encode manually
import numpy as np
from spectrl import encode_spectrum
from spectrl.model import InlineSpectrum, SpectrlCvParam
spec = InlineSpectrum(
    default_array_length=3,
    mz=np.array([147.0, 175.1, 246.2]),
    intensity=np.array([1e5, 8e4, 3e4]),
    id="scan=42",
    params=[
        SpectrlCvParam(accession="MS:1000511", value=2),  # ms level
        SpectrlCvParam(accession="MS:1000130"),  # positive scan
        SpectrlCvParam(accession="MS:1000127"),  # centroid
    ],
)
token = encode_spectrum(spec)
Decode
from spectrl import decode_token
decoded = decode_token(token)
print(decoded.mz) # numpy array
print(decoded.intensity) # numpy array
print(decoded.id) # "scan=42"
URL bindings
from spectrl import to_fragment, to_query, to_data_uri, extract_token
# Embed in a URL fragment (recommended — never sent to server)
url = to_fragment(token, "https://viewer.example.com/spectrum")
# https://viewer.example.com/spectrum#spectrl1.hQ...
# Or as a query parameter
url = to_query(token, "https://viewer.example.com/spectrum")
# https://viewer.example.com/spectrum?d=spectrl1.hQ...
# Or as a data URI
uri = to_data_uri(token)
# data:application/vnd.spectrl;v=1,spectrl1.hQ...
# Extract token back from any of the above
token = extract_token(url)
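To illustrate how the fragment and query bindings differ at the URL level, here is a minimal standard-library sketch. It is not spectrl's implementation: the helper names `embed_in_fragment`, `embed_in_query`, and `find_token` are hypothetical, and the `d` query key is simply taken from the example output above.

```python
from typing import Optional
from urllib.parse import parse_qs, urlencode, urlsplit, urlunsplit

PREFIX = "spectrl1."  # magic + version prefix, as shown in the examples above


def embed_in_fragment(token: str, base_url: str) -> str:
    """Place the token after '#'. Browsers do not send the fragment to the server."""
    parts = urlsplit(base_url)
    return urlunsplit(parts._replace(fragment=token))


def embed_in_query(token: str, base_url: str, key: str = "d") -> str:
    """Place the token in the query string (replaces any existing query, for brevity)."""
    parts = urlsplit(base_url)
    return urlunsplit(parts._replace(query=urlencode({key: token})))


def find_token(url: str, key: str = "d") -> Optional[str]:
    """Return the first spectrl-looking token found in the fragment or query."""
    parts = urlsplit(url)
    if parts.fragment.startswith(PREFIX):
        return parts.fragment
    for value in parse_qs(parts.query).get(key, []):
        if value.startswith(PREFIX):
            return value
    return None
```

Because the token alphabet is base64url plus `.`, no percent-escaping is needed in either position.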
Trim large spectra
from spectrl import top_n
# Keep the 50 most intense peaks before encoding
trimmed = top_n(spec, 50)
token = encode_spectrum(trimmed)
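The idea behind top-N trimming can be sketched on plain lists as follows. This is an illustration of the technique, not the body of `top_n`; the function name `keep_top_n` is hypothetical, and keeping the survivors in their original order is an assumption made here so that an m/z-ascending input stays m/z-ascending.

```python
def keep_top_n(mz, intensity, n):
    """Keep the n most intense peaks, returned in their original order."""
    if len(mz) != len(intensity):
        raise ValueError("mz and intensity must have the same length")
    if n < 0:
        raise ValueError("n must be non-negative")
    # Rank peak indices by intensity, highest first, and keep the first n.
    ranked = sorted(range(len(mz)), key=lambda i: intensity[i], reverse=True)
    # Sorting the surviving indices restores the input order.
    kept = sorted(ranked[:n])
    return [mz[i] for i in kept], [intensity[i] for i in kept]
```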
Lossless encoding
# Default is lossy MS-Numpress (mean error ~0.003 mDa in m/z, ~0.007% in intensity)
# Use lossless=True for bit-exact IEEE-754 doubles
token = encode_spectrum(spec, lossless=True)
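As a rough picture of what a lossless array segment involves, the sketch below packs doubles, compresses them with zlib, and base64url-encodes the result without padding. It is a standalone illustration under stated assumptions: little-endian byte order (as mzML uses), the default zlib level, and the hypothetical names `pack_lossless` and `unpack_lossless`. The actual segment layout is defined by SPECIFICATION.md.

```python
import base64
import struct
import zlib


def pack_lossless(values):
    """IEEE-754 doubles -> zlib -> unpadded base64url text."""
    raw = struct.pack(f"<{len(values)}d", *values)  # assumed little-endian
    compressed = zlib.compress(raw)
    return base64.urlsafe_b64encode(compressed).rstrip(b"=").decode("ascii")


def unpack_lossless(segment):
    """Inverse of pack_lossless; restores the padding before decoding."""
    padded = segment + "=" * (-len(segment) % 4)
    raw = zlib.decompress(base64.urlsafe_b64decode(padded))
    return list(struct.unpack(f"<{len(raw) // 8}d", raw))
```

Every step is reversible, so the values that come back are bit-identical to the values that went in.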
Token format
spectrl1.<b64url(msgpack_header)>.<b64url(mz_array)>.<b64url(intensity_array)>[.<b64url(...)>]
- `spectrl1`: magic + format version, so a format change is a clean version bump.
- Segments are separated by `.`; each is base64url without padding (RFC 4648 §5).
- Header: a msgpack map with integer keys mirroring mzML structure (ms level, polarity, scan times, precursor isolation window, activation method, collision energy, ProForma interpretation, and a truncated SHA-256 content hash).
- Array segments: one per array type (m/z, intensity, charge, ion mobility). Each is encoded as MS-Numpress (lossy) or raw IEEE-754 (lossless) + zlib, matching mzML's own `binaryDataArray` pipeline.
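The outer layout described above can be taken apart with the standard library alone. The following is a minimal sketch, not the library's decoder: `split_token` is a hypothetical helper that stops at raw bytes, and parsing the header would additionally need a msgpack reader.

```python
import base64

MAGIC = "spectrl1"


def _b64url_decode(segment: str) -> bytes:
    # Tokens omit '=' padding, so restore it before decoding.
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def split_token(token: str):
    """Return (header_bytes, [array_segment_bytes, ...]) for a token string."""
    magic, *segments = token.split(".")
    if magic != MAGIC:
        raise ValueError(f"unexpected magic/version: {magic!r}")
    if len(segments) < 3:
        raise ValueError("expected header, m/z, and intensity segments")
    decoded = [_b64url_decode(s) for s in segments]
    return decoded[0], decoded[1:]
```

Splitting on `.` is safe because the base64url alphabet does not contain that character.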
Encoding precision
Measured over 479,455 peaks from a real LC-MS/MS dataset (BSA, Orbitrap):
| Array | Mean error | Max error |
|---|---|---|
| m/z (MS-Numpress linear) | 0.0025 mDa / 0.006 ppm | 0.005 mDa / 0.056 ppm |
| Intensity (MS-Numpress slof) | 0.007% relative | 0.029% relative |
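The table reports m/z error both in mDa and in ppm; the two are related through the m/z value itself. The sketch below shows that conversion, plus a simplified model of fixed-point rounding, which is the source of loss in MS-Numpress linear encoding. The scale factor `1e5` is a hypothetical value chosen for illustration and is not necessarily the factor spectrl uses.

```python
def error_ppm(error_mda: float, mz: float) -> float:
    """Convert an absolute m/z error in mDa to a relative error in ppm."""
    return (error_mda * 1e-3) / mz * 1e6


def quantize(mz: float, scale: float) -> float:
    """Round m/z to the nearest multiple of 1/scale (fixed-point storage)."""
    return round(mz * scale) / scale


SCALE = 1e5  # hypothetical scale factor, for illustration only

# Rounding to the nearest step can be off by at most half a step.
max_rounding_error_da = 0.5 / SCALE
```

For example, an error of 0.0025 mDa at m/z 400 works out to about 0.006 ppm, which is in line with the mean row of the table.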
Size vs mzML
Measured on the same BSA dataset (1,684 spectra):
| Format | MS1 avg (545 peaks) | MS2 avg (109 peaks) |
|---|---|---|
| Raw mzML XML | 12,876 B | 6,004 B |
| spectrl (lossy) | 4,241 B | 1,340 B |
| spectrl (lossless) | 10,302 B | 1,909 B |
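To put the table in relative terms, the short calculation below derives size ratios and bytes per peak from the figures above. The numbers are copied from the table; the helper names are only for illustration.

```python
# Average sizes in bytes, copied from the table above.
SIZES = {
    "Raw mzML XML": {"MS1": 12876, "MS2": 6004},
    "spectrl (lossy)": {"MS1": 4241, "MS2": 1340},
    "spectrl (lossless)": {"MS1": 10302, "MS2": 1909},
}
PEAKS = {"MS1": 545, "MS2": 109}  # average peak counts from the table header


def relative_to_mzml(fmt: str, level: str) -> float:
    """Size of `fmt` as a fraction of the raw mzML XML size."""
    return SIZES[fmt][level] / SIZES["Raw mzML XML"][level]


def bytes_per_peak(fmt: str, level: str) -> float:
    """Average number of bytes spent per peak."""
    return SIZES[fmt][level] / PEAKS[level]
```

On these figures a lossy token is roughly 33% (MS1) and 22% (MS2) of the mzML size, and a lossless token roughly 80% and 32%.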
CLI
# Encode from JSON
echo '{"mz":[147.0,175.1],"intensity":[1e5,8e4]}' | spectrl encode
# Decode a token
echo "spectrl1.hQ..." | spectrl decode
# Inspect the header as readable JSON
echo "spectrl1.hQ..." | spectrl inspect
Design
- mzML-faithful: metadata is carried as CV accession maps (MS: ontology), mirroring mzML `cvParam` semantics. No invented field names.
- CV binding: all accession constants come from mzmlpy's StrEnum enums; no hardcoded integers.
- Deterministic (within an implementation): the canonical form (m/z-ascending, fixed numpress scale factors) yields a stable token from a given implementation. A truncated SHA-256 content hash (key 9) is verified on decode as a transport-integrity check. Token bytes are not guaranteed identical across implementations, because DEFLATE and msgpack output is not canonical; see SPECIFICATION.md.
- ProForma: carries a ProForma 2.0 peptide interpretation string (key 8), the same mechanism used by USI.
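The determinism point above combines two ideas: sort peaks into a canonical order, then hash the canonical content so corruption in transit can be detected. The sketch below shows that pattern only. Which bytes spectrl actually hashes and how many hash bytes it keeps are defined in SPECIFICATION.md; the 8-byte truncation, the little-endian packing, and the function names here are all assumptions.

```python
import hashlib
import struct

HASH_BYTES = 8  # hypothetical truncation length


def canonicalize(mz, intensity):
    """Order peaks by ascending m/z (ties broken by intensity)."""
    order = sorted(range(len(mz)), key=lambda i: (mz[i], intensity[i]))
    return [mz[i] for i in order], [intensity[i] for i in order]


def content_hash(mz, intensity) -> bytes:
    """Truncated SHA-256 over the canonical peak list."""
    mz_c, int_c = canonicalize(mz, intensity)
    payload = struct.pack(f"<{len(mz_c)}d", *mz_c)
    payload += struct.pack(f"<{len(int_c)}d", *int_c)
    return hashlib.sha256(payload).digest()[:HASH_BYTES]


def verify(mz, intensity, expected: bytes) -> bool:
    """Recompute the hash on decode and compare with the stored value."""
    return content_hash(mz, intensity) == expected
```

A truncated hash of this kind catches accidental corruption; it is not a signature and does not authenticate the sender.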
Specification
The normative token format is specified in SPECIFICATION.md (draft, intended for submission to HUPO-PSI). This README is a tutorial; the specification is the contract. A machine-readable CV/codec/key registry lives in schema/registry.json.
Contributing
See CONTRIBUTING.md and the Code of Conduct. Changes to the on-the-wire token format are governed more strictly — see the Format changes section of the contributing guide.
License
Licensed under the Apache License 2.0. If you use spectrl in research, please cite it via CITATION.cff.
Related
- mzmlpy — the mzML parser this library bridges from
- PSI Universal Spectrum Identifier (USI) — references spectra in repositories; complementary to spectrl
- ProForma 2.0 — peptidoform notation carried in the token