Inline spectrum URL encoder — embeds a complete mass spectrum in a compact, URL-safe token.

spectrl

Inline spectrum URL encoder for mass spectrometry.

Encodes a complete mass spectrum — peaks, metadata, precursor info — into a single compact, URL-safe token. The entire spectrum lives in the string. No backend required.

spectrl1.<header>.<mz_array>.<intensity_array>[.<extra_arrays>]

Why

A USI (Universal Spectrum Identifier) references a spectrum stored in a repository; spectrl embeds the spectrum itself. Use spectrl when you want to share a spectrum directly in a URL, QR code, notebook, or paper, without requiring the reader to have access to the original file.

The two are complementary: a spectrl token can carry a USI back-link, and spectra too large to embed fall back to a USI reference.

Install

pip install spectrl

Requires Python 3.12+.

Quick start

Encode from mzmlpy

from mzmlpy.run import Mzml
from spectrl import encode_spectrum, from_mzmlpy

with Mzml("data.mzML") as mzml:
    spec = mzml.spectra[0]
    token = encode_spectrum(from_mzmlpy(spec))

print(token)
# spectrl1.hQ...

Encode manually

import numpy as np
from spectrl import encode_spectrum
from spectrl.model import InlineSpectrum, SpectrlCvParam

spec = InlineSpectrum(
    default_array_length=3,
    mz=np.array([147.0, 175.1, 246.2]),
    intensity=np.array([1e5, 8e4, 3e4]),
    id="scan=42",
    params=[
        SpectrlCvParam(accession="MS:1000511", value=2),   # ms level
        SpectrlCvParam(accession="MS:1000130"),             # positive scan
        SpectrlCvParam(accession="MS:1000127"),             # centroid
    ],
)

token = encode_spectrum(spec)

Decode

from spectrl import decode_token

decoded = decode_token(token)
print(decoded.mz)        # numpy array
print(decoded.intensity) # numpy array
print(decoded.id)        # "scan=42"

URL bindings

from spectrl import to_fragment, to_query, to_data_uri, extract_token

# Embed in a URL fragment (recommended — never sent to server)
url = to_fragment(token, "https://viewer.example.com/spectrum")
# https://viewer.example.com/spectrum#spectrl1.hQ...

# Or as a query parameter
url = to_query(token, "https://viewer.example.com/spectrum")
# https://viewer.example.com/spectrum?d=spectrl1.hQ...

# Or as a data URI
uri = to_data_uri(token)
# data:application/vnd.spectrl;v=1,spectrl1.hQ...

# Extract token back from any of the above
token = extract_token(url)
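
To make the three bindings concrete, here is a minimal standard-library sketch of how a token could be recovered from each URL shape shown above. The function name `extract_token_sketch` is hypothetical and this is not spectrl's own `extract_token`; it only assumes the fragment, `?d=` query, and data URI layouts from the examples.

```python
from urllib.parse import parse_qs, urlsplit

MAGIC = "spectrl1."  # version prefix shown in the token layout


def extract_token_sketch(url: str) -> str:
    """Hypothetical helper: pull a spectrl token out of a URL or data URI."""
    if url.startswith("data:"):
        # data:application/vnd.spectrl;v=1,<token>
        _, _, candidate = url.partition(",")
    else:
        parts = urlsplit(url)
        if parts.fragment.startswith(MAGIC):
            candidate = parts.fragment
        else:
            # Query form uses the "d" parameter, as in the example above.
            candidate = parse_qs(parts.query).get("d", [""])[0]
    if not candidate.startswith(MAGIC):
        raise ValueError("no spectrl token found")
    return candidate
```

Because the token alphabet is base64url plus ".", no percent-decoding is needed in any of the three forms.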

Trim large spectra

from spectrl import top_n

# Keep the 50 most intense peaks before encoding
trimmed = top_n(spec, 50)
token = encode_spectrum(trimmed)
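
For readers who want to see what top-N trimming amounts to, the following is a plain-Python sketch of the idea, not spectrl's `top_n` implementation. The helper name `top_n_sketch` and its list-based signature are assumptions made for illustration; it keeps the most intense peaks and then restores m/z-ascending order.

```python
def top_n_sketch(mz, intensity, n):
    """Hypothetical helper: keep the n most intense peaks, sorted by m/z."""
    if len(mz) != len(intensity):
        raise ValueError("mz and intensity must have the same length")
    # Indices of the n largest intensities (ties keep their original order).
    keep = sorted(range(len(mz)), key=lambda i: intensity[i], reverse=True)[:n]
    # Restore m/z-ascending order for the surviving peaks.
    keep.sort(key=lambda i: mz[i])
    return [mz[i] for i in keep], [intensity[i] for i in keep]


mz, inten = top_n_sketch([100.0, 147.0, 175.1, 246.2], [5e3, 1e5, 8e4, 3e4], 2)
print(mz, inten)  # [147.0, 175.1] [100000.0, 80000.0]
```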

Lossless encoding

# Default is lossy MS-Numpress (mean error roughly 0.003 mDa in m/z and 0.007% in intensity)
# Use lossless=True to store bit-exact IEEE-754 doubles instead
token = encode_spectrum(spec, lossless=True)
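
As an illustration of what "bit-exact" means here, the sketch below round-trips float64 values through zlib and unpadded base64url using only the standard library. It is not spectrl's encoder: the little-endian byte order and the helper names `pack_lossless` and `unpack_lossless` are assumptions, and the real array segment layout is defined by the specification.

```python
import base64
import struct
import zlib


def pack_lossless(values):
    """Float64 values -> zlib -> unpadded base64url (assumed segment layout)."""
    raw = struct.pack(f"<{len(values)}d", *values)  # little-endian is an assumption
    return base64.urlsafe_b64encode(zlib.compress(raw)).rstrip(b"=").decode("ascii")


def unpack_lossless(segment):
    """Inverse of pack_lossless."""
    padded = segment + "=" * (-len(segment) % 4)  # restore stripped padding
    raw = zlib.decompress(base64.urlsafe_b64decode(padded))
    return list(struct.unpack(f"<{len(raw) // 8}d", raw))


mz = [147.11280, 175.11895, 246.15607]
assert unpack_lossless(pack_lossless(mz)) == mz  # bit-exact round trip
```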

Token format

spectrl1.<b64url(msgpack_header)>.<b64url(mz_array)>.<b64url(intensity_array)>[.<b64url(...)>]
  • spectrl1: magic prefix plus format version, so an incompatible format change is signalled by a new prefix.
  • Segments are separated by "."; each is base64url without padding (RFC 4648 §5).
  • Header: msgpack map with integer keys mirroring mzML structure: ms level, polarity, scan times, precursor isolation window, activation method, collision energy, ProForma interpretation, and a truncated SHA-256 content hash.
  • Array segments: one per array type (m/z, intensity, charge, ion mobility). Each is encoded as MS-Numpress (lossy) or raw IEEE-754 doubles (lossless), plus zlib, matching mzML's own binaryDataArray pipeline.
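
The outer framing described in this list can be sketched with the standard library alone. The helper names `b64url_decode` and `split_token` are hypothetical; the sketch only splits on "." and undoes the unpadded base64url layer, leaving the msgpack header and the compressed arrays as raw bytes for a real decoder.

```python
import base64

MAGIC = "spectrl1"


def b64url_decode(segment: str) -> bytes:
    """Decode base64url text that has had its '=' padding stripped."""
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def split_token(token: str) -> dict:
    """Hypothetical helper: split a token into raw header and array bytes."""
    magic, *segments = token.split(".")
    if magic != MAGIC:
        raise ValueError(f"unsupported prefix: {magic!r}")
    if len(segments) < 3:
        raise ValueError("expected header, m/z, and intensity segments")
    header, mz, intensity, *extra = (b64url_decode(s) for s in segments)
    return {"header": header, "mz": mz, "intensity": intensity, "extra": extra}
```

Splitting on "." is safe because the base64url alphabet does not contain that character.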

Encoding precision

Measured over 479,455 peaks from a real LC-MS/MS dataset (BSA, Orbitrap):

| Array | Mean error | Max error |
|---|---|---|
| m/z (MS-Numpress linear) | 0.0025 mDa / 0.006 ppm | 0.005 mDa / 0.056 ppm |
| Intensity (MS-Numpress slof) | 0.007% relative | 0.029% relative |
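
The m/z row reports each error both in absolute terms (mDa) and in relative terms (ppm). The two are linked by the m/z of the peak, which is why a fixed absolute error weighs more at low m/z. A small sketch of the conversion, with a hypothetical helper name `mda_to_ppm`:

```python
def mda_to_ppm(error_mda: float, mz: float) -> float:
    """Convert an absolute m/z error in millidaltons to parts per million."""
    return (error_mda * 1e-3) / mz * 1e6


# The same 0.005 mDa error is a larger relative error at low m/z.
for mz in (100.0, 500.0, 1500.0):
    print(f"m/z {mz:7.1f}: {mda_to_ppm(0.005, mz):.4f} ppm")
```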

Size vs mzML

Measured on the same BSA dataset (1,684 spectra):

| Format | MS1 avg (545 peaks) | MS2 avg (109 peaks) |
|---|---|---|
| Raw mzML XML | 12,876 B | 6,004 B |
| spectrl (lossy) | 4,241 B | 1,340 B |
| spectrl (lossless) | 10,302 B | 1,909 B |
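
To put the table in relative terms, the snippet below recomputes each spectrl size as a fraction of the mzML size and as bytes per peak. The numbers are copied from the table; the `summarize` helper is only illustrative, and the per-peak figure includes the header overhead.

```python
# Average sizes in bytes, copied from the table above.
sizes = {
    "MS1": {"peaks": 545, "mzml": 12_876, "lossy": 4_241, "lossless": 10_302},
    "MS2": {"peaks": 109, "mzml": 6_004, "lossy": 1_340, "lossless": 1_909},
}


def summarize(row):
    """Return (fraction of mzML size, bytes per peak) for each encoding."""
    return {
        mode: (row[mode] / row["mzml"], row[mode] / row["peaks"])
        for mode in ("lossy", "lossless")
    }


for level, row in sizes.items():
    for mode, (ratio, per_peak) in summarize(row).items():
        print(f"{level} {mode:8s}: {ratio:.0%} of mzML, {per_peak:.1f} B/peak")
```

On these averages the lossy token is roughly a third of the mzML size for MS1 and under a quarter for MS2.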

CLI

# Encode from JSON
echo '{"mz":[147.0,175.1],"intensity":[1e5,8e4]}' | spectrl encode

# Decode a token
echo "spectrl1.hQ..." | spectrl decode

# Inspect the header as readable JSON
echo "spectrl1.hQ..." | spectrl inspect

Design

  • mzML-faithful: metadata is carried as CV accession maps (MS: ontology), mirroring mzML cvParam semantics. No invented field names.
  • CV binding: all accession constants come from mzmlpy's StrEnum classes; no hardcoded integers.
  • Deterministic (within an implementation): the canonical form (m/z-ascending peaks, fixed numpress scale factors) makes a given implementation produce the same token for the same spectrum. A truncated SHA-256 content hash (key 9) is verified on decode as a transport-integrity check. Token bytes are not guaranteed to be identical across implementations, because DEFLATE and msgpack output is not canonical; see SPECIFICATION.md.
  • ProForma: carries a ProForma 2.0 peptide interpretation string (key 8), the same mechanism used by USI.
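
The transport-integrity check mentioned in the list can be pictured with a short standard-library sketch. Everything specific here is an assumption: the 8-byte truncation length, the choice to hash the concatenated array bytes, and the helper names `content_hash` and `verify`. SPECIFICATION.md defines what is actually hashed and how many bytes are kept.

```python
import hashlib

HASH_BYTES = 8  # hypothetical truncation length; the specification defines the real one


def content_hash(*segments: bytes) -> bytes:
    """Truncated SHA-256 over the concatenated segments (assumed input)."""
    digest = hashlib.sha256()
    for segment in segments:
        digest.update(segment)
    return digest.digest()[:HASH_BYTES]


def verify(expected: bytes, *segments: bytes) -> bool:
    """Return True when the recomputed hash matches the stored one."""
    return expected == content_hash(*segments)


stored = content_hash(b"mz-bytes", b"intensity-bytes")
print(verify(stored, b"mz-bytes", b"intensity-bytes"))  # True
print(verify(stored, b"mz-bytes", b"intensity-bytez"))  # False
```

A truncated hash like this catches accidental corruption, such as a URL cut short in transit; it is not meant as protection against deliberate tampering.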

Specification

The normative token format is specified in SPECIFICATION.md (draft, intended for submission to HUPO-PSI). This README is a tutorial; the specification is the contract. A machine-readable CV/codec/key registry lives in schema/registry.json.

Contributing

See CONTRIBUTING.md and the Code of Conduct. Changes to the on-the-wire token format are governed more strictly — see the Format changes section of the contributing guide.

License

Licensed under the Apache License 2.0. If you use spectrl in research, please cite it via CITATION.cff.
