Python package to perform compound identification in mass spectrometry via spectral library matching.

PyCompound is a Python-based tool for spectral library matching, designed to identify chemical compounds from mass spectrometry data. It is available in three forms: a Python package, a command-line interface (CLI), and a graphical user interface (GUI) built with Python/Shiny. PyCompound supports both high-resolution mass spectrometry (HRMS) data (e.g., LC-MS/MS) and nominal-resolution mass spectrometry (NRMS) data (e.g., GC-MS).

PyCompound provides a flexible and extensible framework for spectral library matching with the following key features:

  • Entropy-based similarity measures: Shannon, Tsallis, and the Rényi entropy similarity measure, which is introduced here for the first time.
  • Conventional similarity measures, including cosine and binary similarity measures.
  • Customizable preprocessing workflows in which users explicitly control the order of the spectral preprocessing steps.
  • Transformation parameter optimization using grid search and metaheuristic algorithms.
  • User-defined mixture (composite) similarity measures built by combining two or more similarity metrics.

For the full documentation, including toy examples, see the GitHub repository (https://github.com/hdlugas/pycompound).
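To make the similarity measures above more concrete, here is a minimal standard-library sketch of a cosine similarity, a Shannon entropy similarity in the form given by Li et al. (2021), and a weighted mixture of the two. It is not PyCompound's API: the function names, the `weight` parameter, and the assumption that both spectra are already binned onto the same m/z axis are illustrative only, and the intensity weighting that Li et al. apply to low-entropy spectra is omitted.

```python
import math


def _check_aligned(a, b):
    # Assumption: both spectra are intensity vectors over the same m/z bins.
    if len(a) != len(b):
        raise ValueError("spectra must be binned onto the same m/z axis")


def cosine_similarity(a, b):
    """Cosine similarity between two aligned, non-negative intensity vectors."""
    _check_aligned(a, b)
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def _normalize(v):
    """Scale intensities so that they sum to 1."""
    total = sum(v)
    if total <= 0:
        raise ValueError("spectrum must have positive total intensity")
    return [x / total for x in v]


def shannon_entropy(p):
    """Shannon entropy (natural log) of a normalized intensity vector."""
    return -sum(x * math.log(x) for x in p if x > 0)


def shannon_entropy_similarity(a, b):
    """1 - (2*S_AB - S_A - S_B) / ln(4), with S_AB the entropy of the 1:1 merged spectrum."""
    _check_aligned(a, b)
    p, q = _normalize(a), _normalize(b)
    merged = [(x + y) / 2 for x, y in zip(p, q)]
    raw = 1 - (2 * shannon_entropy(merged) - shannon_entropy(p) - shannon_entropy(q)) / math.log(4)
    return min(1.0, max(0.0, raw))  # guard against floating-point drift


def mixture_similarity(a, b, weight=0.5):
    """Hypothetical composite score: a convex combination of the two measures."""
    return weight * cosine_similarity(a, b) + (1 - weight) * shannon_entropy_similarity(a, b)


if __name__ == "__main__":
    query = [0.0, 10.0, 50.0, 100.0, 5.0]
    reference = [2.0, 12.0, 45.0, 100.0, 0.0]
    print(cosine_similarity(query, reference))
    print(shannon_entropy_similarity(query, reference))
    print(mixture_similarity(query, reference, weight=0.3))
```

Both measures return 1 for identical spectra and 0 for spectra with no shared peaks, which is why a convex combination of them stays in the range [0, 1].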

Installation

PyCompound depends on Matplotlib, NumPy, Pandas, SciPy, Pyteomics, and netCDF4. It was validated with python=3.12.4, matplotlib=3.8.4, numpy=1.26.4, pandas=2.2.2, scipy=1.13.1, pyteomics=4.7.2, netCDF4=1.6.5, lxml=5.1.0, joblib=1.5.2, and shiny=1.4.0, although it may work with other versions. We recommend installing into a conda environment (if you are unfamiliar with conda, see https://docs.conda.io/projects/conda/en/latest/user-guide/getting-started.html). On a system with conda installed, the steps below create the environment pycompound_env, activate it, and install the necessary dependencies.

1. Prerequisites by Operating System

Before installing, ensure your system is prepared for the specific requirements of your operating system.

Windows Users (Setup & Dependencies)

Windows users should use the Anaconda PowerShell Prompt to ensure all paths are configured correctly.

  1. Initial Setup: If you do not have a Python manager, download and install Miniconda (https://docs.anaconda.com/miniconda/).
  2. Open the Prompt: Click Start, search for "Anaconda PowerShell Prompt", and open it.
  3. Install Core Tools: Run the following to install the required data libraries and Git:

conda install -c conda-forge netcdf4 lxml git -y

Linux Users

To ensure Git is available within your environment, run:

conda install -c conda-forge git -y

Note: If you are on an older system and see the error "C++ Compiler does not support -std=c++17", run this command instead:

conda install -c conda-forge gxx_linux-64 gcc_linux-64 git -y

2. Environment Setup & Cloning the Repository

To run the provided examples or the Shiny app, you must clone the repository to access the sample data and visual assets.

# 1. Clone the repository
git clone https://github.com/hdlugas/pycompound.git
cd pycompound

# 2. Create and activate the environment
conda create -n pycompound_env -y python=3.12
conda activate pycompound_env

3. Install PyCompound

Option A: Install from PyPI (Stable)

pip install pycompound

Note: To install a specific version, pin it in the command. For example, to install version 0.1.19: pip install pycompound==0.1.19

Option B: Install from GitHub (Development)

pip install git+https://github.com/hdlugas/pycompound.git
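After either install option, it can help to compare the installed dependency versions with the validated versions listed under Installation. The following is a minimal sketch using only the standard library. The helper name `compare_versions` is hypothetical, and a differing version is not necessarily a problem, since other versions may also work.

```python
from importlib import metadata

# Versions PyCompound was validated with, as listed under Installation.
VALIDATED = {
    "matplotlib": "3.8.4",
    "numpy": "1.26.4",
    "pandas": "2.2.2",
    "scipy": "1.13.1",
    "pyteomics": "4.7.2",
    "netCDF4": "1.6.5",
    "lxml": "5.1.0",
    "joblib": "1.5.2",
    "shiny": "1.4.0",
}


def compare_versions(validated=VALIDATED):
    """Return {package: (installed_version_or_None, validated_version)}."""
    report = {}
    for name, expected in validated.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[name] = (installed, expected)
    return report


if __name__ == "__main__":
    for name, (installed, expected) in compare_versions().items():
        if installed is None:
            status = "missing"
        elif installed == expected:
            status = "matches validated version"
        else:
            status = "differs from validated version"
        print(f"{name}: installed={installed}, validated={expected} ({status})")
```

Run the script inside the activated pycompound_env environment so that it inspects the packages PyCompound will actually use.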

4. Running PyCompound

Run the Toy Examples

With the repository cloned and the environment active, you can now run the Python package examples. Navigate to the Toy Examples section on the GitHub repository and copy the code into a Python script or interpreter. Since you are in the pycompound root directory, the paths to tests/data/ will work automatically.

Launch the Shiny App

The Shiny app requires the www/ folder to display correctly. Since you have cloned the repository, you can launch it immediately:

shiny run --launch-browser app.py

Note: If the browser does not open automatically, navigate to the address shown in your terminal (usually http://127.0.0.1:8000).
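Both the toy examples and the Shiny app assume that the working directory is the root of the cloned repository, where tests/data/, www/, and app.py are expected. The check below is a minimal sketch of how to confirm that before running anything. The helper name `missing_repo_paths` is hypothetical, and the path list is taken only from the paths mentioned in this section.

```python
from pathlib import Path

# Paths this section says are needed: sample data, Shiny assets, and the app script.
EXPECTED_PATHS = ("tests/data", "www", "app.py")


def missing_repo_paths(root=".", expected=EXPECTED_PATHS):
    """Return the expected relative paths that do not exist under root."""
    root = Path(root)
    return [rel for rel in expected if not (root / rel).exists()]


if __name__ == "__main__":
    missing = missing_repo_paths()
    if missing:
        print("Not in the pycompound root directory? Missing:", ", ".join(missing))
    else:
        print("All expected paths found.")
```

If anything is reported missing, change into the cloned pycompound directory before running the examples or launching the app.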

Publicly available web version: https://connect.posit.cloud/fy7392

Toy examples for the Python package and the CLI are available in the Toy Examples section of the GitHub repository.

Toy examples and video tutorials for the PyCompound Shiny application are available on YouTube (https://www.youtube.com/@PyCompound).

Key References

Dlugas, H., Zhang, X., Bao, J., Li, J., Kato, I., Kim, S. (2026). PyCompound: a versatile Python package for flexible spectral-library matching in mass spectrometry-based compound identification. Submitted.

Dlugas, H., Zhang, X., Kim, S. (2025). Comparative analysis of continuous similarity measures for compound identification in mass spectrometry-based metabolomics. Chemometrics and Intelligent Laboratory Systems, 263, 105417. https://doi.org/10.1016/j.chemolab.2025.105417.

Kim, S., Kato, I., & Zhang, X. (2022). Comparative Analysis of Binary Similarity Measures for Compound Identification in Mass Spectrometry-Based Metabolomics. Metabolites, 12(8), 694. https://doi.org/10.3390/metabo12080694.

Li, Y., Kind, T., Folz, J., et al. (2021). Spectral entropy outperforms MS/MS dot product similarity for small-molecule compound identification. Nature Methods, 18, 1524–1531. https://doi.org/10.1038/s41592-021-01331-z.

Kim, S., Koo, I., Wei, X., & Zhang, X. (2012). A method of finding optimal weight factors for compound identification in gas chromatography-mass spectrometry. Bioinformatics, 28(8), 1158-1163. https://doi.org/10.1093/bioinformatics/bts083.

