MS I/O readers with optional vendor bindings
Pymsio
Pymsio is a small utility library for reading mass-spectrometry data files into a unified NumPy/Polars representation.
The design of its Thermo RAW reader was inspired by AlphaRaw, but the reader is implemented independently for the pymsio codebase.
It currently supports:
- Thermo RAW files (via pythonnet + Thermo Fisher CommonCore DLLs)
- mzML files
Both formats are exposed through a common interface.
Requirements
- Python >= 3.12
- pythonnet is a required dependency (installed automatically with pymsio)
- For Thermo RAW, you must provide Thermo Fisher CommonCore DLLs (not redistributed)
- On Linux, Thermo RAW reading requires Mono. See the “Linux: install Mono” section below.
Recommended: download DLLs from RawFileReader GitHub (manual)
- Open the official RawFileReader repository
  - In your browser, go to: https://github.com/thermofisherlsms/RawFileReader
- Download the source as a ZIP file
  - Click the green “Code” button.
  - Click “Download ZIP”.
  - Save the ZIP file (e.g. RawFileReader-main.zip) to a location you know.
- Extract the ZIP file
  - Unzip RawFileReader-main.zip.
  - You should now have a folder like:
    RawFileReader-main/
      Libs/
        Net471/
        NetCore/
          Net8/
          Net5/
      ...
- Locate the CommonCore DLLs
  pymsio is currently tested with the Net471 libraries.
  - Open the folder: RawFileReader-main/Libs/Net471/
  - Inside that folder, find:
    - ThermoFisher.CommonCore.Data.dll
    - ThermoFisher.CommonCore.RawFileReader.dll
    (There may be additional DLLs in that folder; pymsio only needs the two above.)
  You will use these two files later in the installation steps, so keep them in an easy-to-find location (e.g. on your Desktop or in a temporary ThermoDLLs/ folder).
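The copy step can also be scripted. The following is a minimal sketch, not part of pymsio: the helper name copy_thermo_dlls and the folder paths in the usage comment are hypothetical, and only the two DLL file names come from the steps above.

```python
import shutil
from pathlib import Path
from typing import List

# The two assembly names listed in the steps above.
REQUIRED_DLLS = (
    "ThermoFisher.CommonCore.Data.dll",
    "ThermoFisher.CommonCore.RawFileReader.dll",
)


def copy_thermo_dlls(src_dir: Path, dest_dir: Path) -> List[Path]:
    """Copy the two required DLLs from src_dir into dest_dir.

    Hypothetical helper for illustration; pymsio does not ship it.
    Raises FileNotFoundError if either DLL is missing from src_dir.
    """
    missing = [name for name in REQUIRED_DLLS if not (src_dir / name).is_file()]
    if missing:
        raise FileNotFoundError(f"Missing in {src_dir}: {', '.join(missing)}")
    dest_dir.mkdir(parents=True, exist_ok=True)
    return [Path(shutil.copy2(src_dir / name, dest_dir / name)) for name in REQUIRED_DLLS]


# Example call (paths are placeholders, adjust to your machine):
# copy_thermo_dlls(Path("RawFileReader-main/Libs/Net471"), Path("ThermoDLLs"))
```

Checking for both files before copying anything means a wrong source folder fails early instead of leaving a half-populated destination.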
Linux: install Mono (required for Thermo RAW)
To read Thermo .raw files with pymsio on Linux, Mono is required (pythonnet uses the Mono runtime by default on Linux/macOS).
First, verify whether Mono is already installed:
mono --version
If Mono is not installed, install it using the official Mono Project instructions, or install it using the install_mono.sh script provided in the pymsio GitHub repository.
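The same check can be done from Python, for example at the top of a processing script. This is a small sketch using only the standard library; the function name mono_version_line is hypothetical and is not part of pymsio.

```python
import shutil
import subprocess
from typing import Optional


def mono_version_line(executable: str = "mono") -> Optional[str]:
    """Return the first line printed by `<executable> --version`.

    Returns None when the executable is not on PATH, cannot be run,
    or prints nothing. Hypothetical helper for illustration.
    """
    path = shutil.which(executable)
    if path is None:
        return None
    try:
        result = subprocess.run(
            [path, "--version"],
            capture_output=True,
            text=True,
            timeout=10,
            check=False,
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    lines = result.stdout.splitlines()
    return lines[0] if lines else None


if mono_version_line() is None:
    print("Mono was not found on PATH; Thermo RAW reading on Linux needs it.")
```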
Installation
Thermo RAW support setup
1) Obtain Thermo Fisher CommonCore DLLs
pymsio needs the following .NET assemblies:
- ThermoFisher.CommonCore.Data.dll
- ThermoFisher.CommonCore.RawFileReader.dll
These DLLs are owned by Thermo Fisher Scientific and subject to their license, so they are not bundled with pymsio.
2) Install the DLLs where pymsio can find them
pymsio looks for the two DLLs in the following locations, in this order:
- The directory named by the PYMSIO_THERMO_DLL_DIR environment variable
- <current working directory>/dlls/thermo_fisher/ (i.e., relative to the directory you run Python from)
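To check which folder would be picked up before running a script, the lookup order can be sketched as below. This illustrates the documented order and is not pymsio's actual code: find_thermo_dll_dir and has_required_dlls are hypothetical names. Falling through to the working-directory folder when the environment variable points at a folder without the DLLs is an assumption.

```python
import os
from pathlib import Path
from typing import Mapping, Optional

ENV_VAR = "PYMSIO_THERMO_DLL_DIR"
REQUIRED_DLLS = (
    "ThermoFisher.CommonCore.Data.dll",
    "ThermoFisher.CommonCore.RawFileReader.dll",
)


def has_required_dlls(folder: Path) -> bool:
    """True if both required DLL files exist directly inside folder."""
    return all((folder / name).is_file() for name in REQUIRED_DLLS)


def find_thermo_dll_dir(
    env: Optional[Mapping[str, str]] = None,
    cwd: Optional[Path] = None,
) -> Optional[Path]:
    """Return the first candidate folder that holds both DLLs, else None.

    Order: environment variable first, then <cwd>/dlls/thermo_fisher/.
    Assumption: an env var pointing at a folder without the DLLs is skipped.
    """
    env = os.environ if env is None else env
    cwd = Path.cwd() if cwd is None else cwd

    candidates = []
    env_value = env.get(ENV_VAR)
    if env_value:
        candidates.append(Path(env_value))
    candidates.append(cwd / "dlls" / "thermo_fisher")

    for candidate in candidates:
        if has_required_dlls(candidate):
            return candidate
    return None


print(find_thermo_dll_dir())  # prints the folder that was found, or None
```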
Option 1) Environment variable-based DLL folder (recommended)
Set an environment variable to the folder that contains the two DLL files.
Environment variable name
PYMSIO_THERMO_DLL_DIR
Windows example
- Create a folder (example): C:\Users\{username}\Documents\pymsio\thermo_fisher
- Copy these two files into it:
  - ThermoFisher.CommonCore.Data.dll
  - ThermoFisher.CommonCore.RawFileReader.dll
- Set the env var (PowerShell):
  setx PYMSIO_THERMO_DLL_DIR "C:\Users\{username}\Documents\pymsio\thermo_fisher"
- Open a new terminal (so the env var is applied) and run your script.
Linux example
- Create a folder (example): /home/{username}/dlls/thermo_fisher
- Copy the two DLLs into that folder.
- Set the env var (bash):
  export PYMSIO_THERMO_DLL_DIR="/home/{username}/dlls/thermo_fisher"
  (To persist it, add the export line to ~/.bashrc or your shell profile.)
Option 2) CWD-based DLL folder (quick / portable)
If you prefer a project-local setup (no env vars), place the DLLs under:
<your current working directory>/
  dlls/
    thermo_fisher/
      ThermoFisher.CommonCore.Data.dll
      ThermoFisher.CommonCore.RawFileReader.dll
For example, if you run Python from /projects/my_run/, then:
/projects/my_run/dlls/thermo_fisher/
Install pymsio
If conda (Anaconda or Miniconda) is not installed, first follow the Install Miniconda (if needed) section to install conda. Then, run the commands below.
conda create -n pymsio-env python=3.12 -y
conda activate pymsio-env
pip install -U pymsio
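After installing, a quick sanity check of the environment can look like the sketch below. It uses only the standard library (importlib.metadata); the helper names python_is_supported and installed_version are hypothetical. The minimum Python version is taken from the Requirements section.

```python
import sys
from importlib import metadata
from typing import Optional, Tuple

MIN_PYTHON = (3, 12)  # from the Requirements section


def python_is_supported(version: Tuple[int, ...] = tuple(sys.version_info[:2])) -> bool:
    """True if the given (major, minor, ...) version meets the minimum."""
    return tuple(version[:2]) >= MIN_PYTHON


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


print("Python OK:", python_is_supported())
print("pymsio version:", installed_version("pymsio"))
print("pythonnet version:", installed_version("pythonnet"))
```

If either version prints as None, the package is not installed in the active environment, which usually means the conda environment was not activated before running pip.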
Install Miniconda (if needed)
Windows
- Open the official Miniconda / Anaconda download page: https://www.anaconda.com/download
- In the Windows section, download the Miniconda (Windows 64-bit) installer (or Anaconda if you prefer the full distribution).
- Run the downloaded .exe file.
- Follow the installer steps. If you are unsure about the options, you can generally accept the defaults.
- After installation, open Anaconda Prompt.
- Verify that Conda is available by running in Anaconda Prompt:
  conda --version
  If this prints a version number, Conda is ready.
Linux
- Open the official Miniconda / Anaconda download page: https://www.anaconda.com/download
- Download the Miniconda (Linux x86_64) installer (file name similar to Miniconda3-latest-Linux-x86_64.sh).
- In a terminal, go to the folder where the installer was downloaded and run:
  bash Miniconda3-latest-Linux-x86_64.sh
- Follow the prompts:
  - press Enter to scroll,
  - type yes to accept the license,
  - choose an install location (the default ~/miniconda3 is usually fine),
  - when asked to initialize Conda, answering yes is recommended.
- Close the terminal and open a new one, then verify that Conda is available:
  conda --version
  If this prints a version number, Conda is ready.
Quick Start
Read a file (Thermo RAW or mzML) via ReaderFactory
from pathlib import Path
from pymsio.readers import ReaderFactory
path = Path("path/to/your/file.raw") # or .mzML
# 1) Get appropriate reader
reader = ReaderFactory.get_reader(path)
# 2) Read metadata (Polars DataFrame)
meta_df = reader.get_meta_df()
print(meta_df.head())
# 3) Read one frame (np.ndarray, shape (N, 2), [mz, intensity])
frame_num = int(meta_df.item(0, "frame_num"))
peaks = reader.get_frame(frame_num)
print(peaks.shape)
# 4) Load full dataset
msdata = reader.load()
print(msdata.peak_arr.shape)
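Since a frame is returned as an (N, 2) array of [mz, intensity] rows, ordinary NumPy indexing is enough for simple per-frame work. The sketch below uses a small synthetic array in place of the result of reader.get_frame so that it runs without a data file; filter_mz_range and base_peak are hypothetical helpers, not part of pymsio.

```python
import numpy as np


def filter_mz_range(peaks: np.ndarray, mz_min: float, mz_max: float) -> np.ndarray:
    """Keep rows whose m/z (column 0) lies in [mz_min, mz_max]."""
    mz = peaks[:, 0]
    mask = (mz >= mz_min) & (mz <= mz_max)
    return peaks[mask]


def base_peak(peaks: np.ndarray) -> tuple:
    """Return (mz, intensity) of the most intense row (column 1)."""
    if peaks.shape[0] == 0:
        raise ValueError("frame contains no peaks")
    idx = int(np.argmax(peaks[:, 1]))
    return float(peaks[idx, 0]), float(peaks[idx, 1])


# Synthetic stand-in for `peaks = reader.get_frame(frame_num)`.
peaks = np.array(
    [
        [100.05, 1200.0],
        [250.10, 8300.0],
        [400.20, 560.0],
        [612.35, 9100.0],
    ]
)

window = filter_mz_range(peaks, 200.0, 500.0)
print(window.shape)       # (2, 2)
print(base_peak(window))  # (250.1, 8300.0)
```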
Read multiple frames
frame_nums = meta_df["frame_num"].to_list()  # or any list of frame numbers
peak_arr = reader.get_frames(frame_nums)
print(peak_arr.shape)
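If requesting every frame in one call uses more memory than you want, the frame numbers can be split into batches and passed to get_frames one batch at a time. The sketch below shows only the batching part; chunked is a hypothetical helper, and whether batching helps for a given file is an assumption to verify on your own data.

```python
from typing import Iterator, List, Sequence


def chunked(items: Sequence[int], size: int) -> Iterator[List[int]]:
    """Yield consecutive slices of `items`, each with at most `size` entries."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


# Intended use with the reader from the Quick Start (not executed here):
# for batch in chunked(frame_nums, 500):
#     peak_arr = reader.get_frames(batch)
#     ...  # process this batch before loading the next one

print(list(chunked([1, 2, 3, 4, 5], 2)))  # [[1, 2], [3, 4], [5]]
```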
Notes
- If Thermo RAW reading fails with missing assemblies, double-check that the two DLLs are in one of:
  - the folder set in PYMSIO_THERMO_DLL_DIR (environment variable)
  - .../{cwd}/dlls/thermo_fisher/