A tool for compressing MS/MS data

MS/MS Data Compression Package

Description

This Python package compresses tandem mass spectrometry (MS/MS) data efficiently. It is based on the MassComp algorithm, which is described in the following paper: https://doi.org/10.1186/s12859-019-2962-7

Version

0.2.0

Features

  • Delta and Hex Encoding: Efficiently encodes m/z values and intensities to optimize the compression.
  • Brotli Compression: Uses Brotli, a general-purpose compression algorithm. In the comparison table in this document, the default strategies achieve higher compression ratios than their gzip counterparts, at the cost of slower compression.
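
To illustrate why delta encoding helps, here is a minimal sketch of the general idea, not the package's actual implementation. The integer scaling of m/z values, the comma-separated hex layout, and the helper names are assumptions made for this example, and zlib from the standard library stands in for Brotli so the snippet runs without extra dependencies.

```python
import zlib


def delta_encode(values):
    """Return the first value followed by successive differences."""
    if not values:
        return []
    return [values[0]] + [b - a for a, b in zip(values, values[1:])]


def delta_decode(deltas):
    """Rebuild the original values with a running sum."""
    values, total = [], 0
    for d in deltas:
        total += d
        values.append(total)
    return values


# Hypothetical m/z values stored as integers (m/z * 10_000) so deltas are exact.
mz_scaled = [1000000, 1010034, 1020071, 1030105]

deltas = delta_encode(mz_scaled)  # [1000000, 10034, 10037, 10034]

# Small deltas need fewer hex digits than the absolute values.
hex_text = ",".join(format(d, "x") for d in deltas)  # "f4240,2732,2735,2732"

# zlib is only a stand-in here; the package itself uses Brotli.
packed = zlib.compress(hex_text.encode("ascii"))

# Round trip: decompress, parse hex, undo the delta step.
unpacked = zlib.decompress(packed).decode("ascii")
restored = delta_decode([int(h, 16) for h in unpacked.split(",")])
assert restored == mz_scaled
```

Because sorted m/z values tend to sit close together, the deltas are much smaller than the values themselves, which gives the general-purpose compressor shorter and more repetitive input.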

Installation

To install the MS/MS Data Compression package, run:

pip install msms-compression

Usage

The package includes the following main compressor classes:

  • SpectrumCompressorUrl: Utilizes URL-safe Base64 encoding.
  • SpectrumCompressor: Uses Base85 encoding.

Note: The m/z values must contain only positive values and must be sorted in ascending order before compression.
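
The difference between the two text encodings can be seen with the standard library alone. This sketch does not use the package; the payload is arbitrary bytes standing in for compressed output, and it only shows the general trade-off between Base85 and URL-safe Base64.

```python
import base64

# 60 arbitrary bytes standing in for compressed spectrum data.
payload = bytes(range(0, 240, 4))

b85 = base64.b85encode(payload).decode("ascii")
b64url = base64.urlsafe_b64encode(payload).decode("ascii")

# Base85 packs 4 bytes into 5 characters; Base64 packs 3 bytes into 4.
print(len(b85), len(b64url))

# URL-safe Base64 only uses letters, digits, '-', '_' and '=' padding,
# so it can be placed in a URL without percent-encoding the alphabet.
url_safe_chars = set(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_="
)
assert set(b64url) <= url_safe_chars
```

Base85 output is shorter, but its alphabet includes punctuation that may need percent-encoding inside a URL, which is the usual reason to prefer a URL-safe Base64 variant when the result is embedded in a link.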

Example:

from msms_compression import SpectrumCompressor

# Sample data
mz_values, intensity_values = [100.0, 101.0, 102.0], [10.0, 20.0, 30.0]

# Initialize the compressor
compressor = SpectrumCompressor()

# Compress data
compressed_data = compressor.compress(mz_values, intensity_values)
print("Compressed Data:", compressed_data)

# Decompress data
decompressed_mz, decompressed_intensity = compressor.decompress(compressed_data)
assert decompressed_mz == mz_values
assert decompressed_intensity == intensity_values
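
The sample data above is already positive and ascending. For real peak lists it may help to check this before compressing. The helpers below are hypothetical and not part of the package; they are a small sketch that assumes repeated m/z values are acceptable, which the package may or may not allow.

```python
def check_mz_values(mz_values):
    """Raise ValueError unless m/z values are positive and in ascending order."""
    if any(mz <= 0 for mz in mz_values):
        raise ValueError("m/z values must be positive")
    if any(b < a for a, b in zip(mz_values, mz_values[1:])):
        raise ValueError("m/z values must be sorted in ascending order")


def sort_peaks(mz_values, intensity_values):
    """Sort peaks by m/z, keeping each intensity paired with its m/z."""
    pairs = sorted(zip(mz_values, intensity_values))
    return [mz for mz, _ in pairs], [inten for _, inten in pairs]


# Unsorted input: sort first, then validate.
mz, intensity = sort_peaks([102.0, 100.0, 101.0], [30.0, 10.0, 20.0])
check_mz_values(mz)
```

The sorted lists can then be passed to `compress` in the same way as in the example above.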

Compression Strategy Comparison

| Strategy | Compression Ratio | Rank | URL Compression Ratio | Rank | Compression Time | Rank | Decompression Time | Rank |
|---|---|---|---|---|---|---|---|---|
| SpectrumCompressorLossy | 5.952 | 1 | 5.023 | 1 | 0.030 | 3 | 0.008 | 1 |
| SpectrumCompressor | 3.890 | 2 | 3.299 | 5 | 0.054 | 5 | 0.009 | 3 |
| SpectrumCompressorUrl | 3.646 | 3 | 4.528 | 2 | 0.057 | 6 | 0.008 | 2 |
| SpectrumCompressorGzip | 3.148 | 4 | 2.658 | 6 | 0.026 | 2 | 0.009 | 5 |
| SpectrumCompressorUrlGzip | 2.951 | 5 | 3.665 | 3 | 0.024 | 1 | 0.009 | 4 |
| SpectrumCompressorUrlLzstring | 2.800 | 6 | 3.418 | 4 | 0.031 | 4 | 0.109 | 6 |

The lossy compression strategy converts each intensity to a two-character hex string, which allows 256 distinct values. Intensity precision is lost, but this strategy offers the best compression ratio. All strategies, including the lossy one, compress m/z values losslessly using delta encoding.
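
The following sketch shows one way intensities could be reduced to two-character hex codes. It is an illustration only: scaling relative to the maximum intensity and the helper names are assumptions, and the package may quantize differently.

```python
def quantize_intensities(intensities):
    """Map each intensity to a two-character hex code (00..ff).

    Assumes at least one positive intensity; values are scaled to the maximum.
    """
    peak = max(intensities)
    return "".join(format(round(255 * x / peak), "02x") for x in intensities)


def dequantize_intensities(hex_text, peak):
    """Recover approximate intensities from the hex codes and the known maximum."""
    codes = [int(hex_text[i:i + 2], 16) for i in range(0, len(hex_text), 2)]
    return [peak * c / 255 for c in codes]


intensities = [10.0, 20.5, 30.0]
encoded = quantize_intensities(intensities)  # "55aeff"
approx = dequantize_intensities(encoded, max(intensities))

# Each value is recovered to within half a quantization step (peak / 255 / 2).
half_step = max(intensities) / 255 / 2
assert all(abs(a - b) <= half_step for a, b in zip(approx, intensities))
```

In this sketch the middle value comes back as roughly 20.47 instead of 20.5, which is the kind of precision loss the lossy strategy trades for a smaller output.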
