MS I/O readers with optional vendor bindings

Pymsio

Pymsio is a lightweight module for reading mass-spectrometry data files into a unified NumPy/Polars representation.
Its design and implementation are based on the AlphaRaw project: https://github.com/MannLabs/alpharaw/.

It currently supports:

  • Thermo RAW files (via pythonnet + Thermo Fisher CommonCore DLLs)
  • mzML files

Both formats are exposed through a common interface.
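Because both formats go through one entry point, it can help to reject unrelated files before handing a path to the reader. The following is a minimal sketch, not part of pymsio: `looks_supported` and `SUPPORTED_SUFFIXES` are hypothetical names, and the case-insensitive suffix match is an assumption about how files are typically named rather than a description of how ReaderFactory dispatches.

```python
from pathlib import Path

# Suffixes for the two formats listed above (assumption: matched case-insensitively).
SUPPORTED_SUFFIXES = {".raw", ".mzml"}


def looks_supported(path) -> bool:
    """Return True if the final file suffix is one of the two supported formats."""
    return Path(path).suffix.lower() in SUPPORTED_SUFFIXES


if __name__ == "__main__":
    for candidate in ["run01.raw", "run02.mzML", "run03.mzML.gz", "notes.txt"]:
        print(candidate, looks_supported(candidate))
```

Only the last suffix is inspected, so a compressed file such as `run03.mzML.gz` is reported as unsupported.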


Requirements

  • OS: Windows, Linux (macOS not tested)
  • Python: >= 3.9
  • Thermo RAW:
    • Requires Thermo Fisher CommonCore DLLs (ThermoFisher.CommonCore.Data.dll, ThermoFisher.CommonCore.RawFileReader.dll) obtained from the RawFileReader project (https://github.com/thermofisherlsms/RawFileReader).
    • Linux also needs Mono (use install_mono.sh).
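
The requirements above can be checked from Python before installing. This is a rough sketch using only the standard library; `environment_problems` is a hypothetical helper, and it assumes that a Mono installation exposes an executable named `mono` on PATH.

```python
import platform
import shutil
import sys


def environment_problems(system: str, py_version: tuple, has_mono: bool) -> list:
    """Compare the given environment against the requirements listed above."""
    problems = []
    if py_version < (3, 9):
        problems.append("Python >= 3.9 is required")
    if system not in ("Windows", "Linux"):
        problems.append(f"{system} is untested")
    if system == "Linux" and not has_mono:
        # Only relevant for Thermo RAW files; mzML does not need Mono.
        problems.append("Mono not found on PATH (needed for Thermo RAW on Linux)")
    return problems


if __name__ == "__main__":
    found = environment_problems(
        system=platform.system(),
        py_version=sys.version_info[:2],
        has_mono=shutil.which("mono") is not None,  # assumed executable name
    )
    print(found or "no problems detected")
```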

Installation

  1. Clone the repository

    git clone https://github.com/bertis-informatics/pymsio.git
    cd pymsio
    
  2. Provide the Thermo DLLs (only needed for Thermo RAW)

    • Linux only: ensure Mono is installed (required by pythonnet). Use the helper script:

      ./install_mono.sh
      
    1. Download (or git clone) RawFileReader: https://github.com/thermofisherlsms/RawFileReader
    2. Copy the two DLLs from RawFileReader/Libs/Net471/:
      • ThermoFisher.CommonCore.Data.dll
      • ThermoFisher.CommonCore.RawFileReader.dll
    3. Make the DLLs discoverable:
      • Option A — Bundle DLLs inside the package <path-to-pymsio>/pymsio/dlls/thermo_fisher/
        • Copy the DLLs into pymsio/dlls/thermo_fisher/ before running pip install -e . so they ship with the installation.
        • Example:
          mkdir -p pymsio/dlls/thermo_fisher
          cp /path/to/RawFileReader/Libs/Net471/*.dll /path/to/pymsio/pymsio/dlls/thermo_fisher/
          
      • Option B — Set up an environment variable PYMSIO_THERMO_DLL_DIR
        • Windows example:
          setx PYMSIO_THERMO_DLL_DIR "<path-to-your-dll-folder>"
          
        • Linux example:
          export PYMSIO_THERMO_DLL_DIR="<path-to-your-dll-folder>"
          
          (Add the export line to ~/.bashrc to keep it persistent.)
        • Copy the DLLs into the folder referenced by the variable.
  3. Install pymsio

    Option A — Conda environment

    conda create -n pymsio-env python=3.12 -y
    conda activate pymsio-env
    pip install .
    

    Option B — pip + venv (Python 3.12+)

    Linux/macOS

    # Install Python 3.12 (if you don't already have it)
    # sudo apt update
    # sudo apt install python3.12 python3.12-venv
    
    python3.12 -m venv .venv
    source .venv/bin/activate
    # python --version
    pip install .
    

    Windows PowerShell (with Python Launcher)

    # Install Python 3.12 (if you don't already have it)
    # py install 3.12
    
    py -3.12 -m venv .venv
    .\.venv\Scripts\Activate.ps1
    # python --version
    pip install .
    

pymsio is also available on PyPI, so you can install it directly into your virtual environment. For Thermo RAW support you still need to download the DLLs and make them discoverable, as described in step 2:

pip install pymsio
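
To confirm which version ended up in the active environment, the standard library's `importlib.metadata` can be queried. A small sketch; `installed_version` is a hypothetical helper name, and it returns None rather than raising when the distribution is absent.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version string, or None if the distribution is missing."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Prints a version string if pymsio is installed in this environment, else None.
    print(installed_version("pymsio"))
```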

Quick Start

Read a file (Thermo RAW or mzML) via ReaderFactory

from pathlib import Path
from pymsio.readers import ReaderFactory 

path = Path("path/to/your/file.raw")   # or .mzML

# 1) Get appropriate reader
reader = ReaderFactory.get_reader(path)

# 2) Read metadata (Polars DataFrame)
meta_df = reader.get_meta_df()
print(meta_df.head())

# 3) Read one frame (np.ndarray, shape (N, 2), [mz, intensity])
frame_num = int(meta_df.item(0, "frame_num"))
peaks = reader.get_frame(frame_num)
print(peaks.shape)

# 4) Load full dataset 
msdata = reader.load()
print(msdata.peak_arr.shape)
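
Since a frame is described above as an (N, 2) array of [mz, intensity], ordinary NumPy indexing is enough for simple per-spectrum summaries. The sketch below uses a small synthetic array as a stand-in for the result of `reader.get_frame(frame_num)`; the values are made up for illustration and no pymsio call is involved.

```python
import numpy as np

# Synthetic stand-in for one frame: columns are [mz, intensity].
peaks = np.array([
    [100.05, 1200.0],
    [250.10, 8300.0],
    [499.90, 450.0],
    [750.25, 3100.0],
])

mz, intensity = peaks[:, 0], peaks[:, 1]

# Total ion current: sum of all intensities in the frame.
tic = float(intensity.sum())

# m/z of the most intense peak (the base peak).
base_peak_mz = float(mz[intensity.argmax()])

# Keep only peaks inside an m/z window using a boolean mask.
in_window = peaks[(mz >= 200.0) & (mz <= 600.0)]

print(tic, base_peak_mz, in_window.shape)
```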

Read multiple frames

frame_nums = meta_df["frame_num"].to_list()  # or any list of frame numbers
peak_arr = reader.get_frames(frame_nums)
print(peak_arr.shape)
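
For large files it may be preferable to request frames in batches instead of passing every frame number at once. A minimal sketch of the batching side only: `chunked` is a hypothetical helper, the batch size of 4 is arbitrary, and the `reader.get_frames` call is left as a comment because it needs a real data file.

```python
from typing import Iterator, List, Sequence


def chunked(items: Sequence[int], size: int) -> Iterator[List[int]]:
    """Yield consecutive slices of `items`, each with at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


if __name__ == "__main__":
    frame_nums = list(range(1, 11))  # stand-in for meta_df["frame_num"].to_list()
    for batch in chunked(frame_nums, 4):
        # With a real reader: peak_arr = reader.get_frames(batch)
        print(batch)
```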

Notes

  • If reading a Thermo RAW file fails with missing-assembly errors, double-check that both DLLs are in the directory named by the PYMSIO_THERMO_DLL_DIR environment variable or in .../{cwd}/dlls/thermo_fisher/
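
The check in the note above can be scripted. This sketch only inspects the directory named by PYMSIO_THERMO_DLL_DIR; `missing_dlls` is a hypothetical helper and does not reproduce pymsio's own lookup logic.

```python
import os
from pathlib import Path
from typing import List

# The two assemblies named in the installation steps.
REQUIRED_DLLS = (
    "ThermoFisher.CommonCore.Data.dll",
    "ThermoFisher.CommonCore.RawFileReader.dll",
)


def missing_dlls(dll_dir: Path) -> List[str]:
    """Return the required DLL file names that are not present in dll_dir."""
    return [name for name in REQUIRED_DLLS if not (dll_dir / name).is_file()]


if __name__ == "__main__":
    configured = os.environ.get("PYMSIO_THERMO_DLL_DIR")
    if configured is None:
        print("PYMSIO_THERMO_DLL_DIR is not set")
    else:
        print(missing_dlls(Path(configured)) or "both DLLs found")
```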
