Unsupervised Substructure Discovery using Topic Modelling with Automated Annotation.

MS2LDA is a tool for unsupervised substructure discovery in mass spectrometry data. It uses topic modelling to find recurring motifs and annotates the discovered motifs automatically. It builds on the method described in the original MS2LDA paper (2016) and adds an integrated workflow, improved usability, detailed visualizations, and a searchable motif database (MotifDB).

Mass spectrometry fragmentation patterns carry structural information that matters for analytical chemistry, natural product research, and food safety assessment. Interpreting these data remains challenging, however, and traditionally only a fraction of the available information is used. MS2LDA addresses this by identifying recurring substructures (motifs) across spectral datasets without relying on prior compound identification, which speeds up structure elucidation and analysis.
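To make the idea of "recurring substructures across spectra" concrete, the sketch below treats each spectrum as a document whose words are rounded fragment masses and neutral losses. It then lists the words shared by several spectra. This is only an illustration of the framing. It is not MS2LDA's API and it is not a topic model, since a topic model would additionally learn which words tend to co-occur. The function names, the word format, and the toy peak lists are all hypothetical.

```python
from collections import Counter


def spectrum_to_words(precursor_mz, peaks, decimals=2):
    """Turn one MS2 spectrum into a list of 'words' (hypothetical format).

    peaks is an iterable of (mz, intensity) pairs. Each peak yields a
    fragment word and, if it lies below the precursor, a neutral-loss word.
    """
    words = []
    for mz, _intensity in peaks:
        words.append(f"frag@{mz:.{decimals}f}")
        loss = precursor_mz - mz
        if loss > 0:
            words.append(f"loss@{loss:.{decimals}f}")
    return words


def shared_words(documents, min_spectra=2):
    """Return the words that occur in at least min_spectra documents."""
    counts = Counter(word for doc in documents for word in set(doc))
    return sorted(word for word, n in counts.items() if n >= min_spectra)


# Toy spectra, invented for illustration: (precursor m/z, [(m/z, intensity), ...])
toy_spectra = [
    (200.10, [(85.03, 100.0), (120.08, 40.0)]),
    (250.12, [(85.03, 80.0), (170.09, 20.0)]),
]

documents = [spectrum_to_words(precursor, peaks) for precursor, peaks in toy_spectra]

if __name__ == "__main__":
    print(shared_words(documents))
```

In this toy data the two spectra share a single fragment word, which is the kind of recurring signal a motif is meant to capture.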


MS2LDA Installation and Usage

You can install MS2LDA using pip, Conda, or Poetry, depending on your preferences and requirements.

Quick Install with pip

pip install ms2lda
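
After installing, one quick sanity check is to ask Python's packaging metadata which version of the distribution is present. The sketch below uses only the standard library. The helper name `installed_version` is hypothetical, and the distribution name `ms2lda` is taken from the pip command above.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("ms2lda")
    if found is None:
        print("ms2lda is not installed in this environment")
    else:
        print(f"ms2lda {found} is installed")
```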

Quick Start Demo

Get started with MS2LDA in minutes! See the Quick Start Guide for step-by-step instructions using Conda, Poetry, or virtualenv.

Installation Guides

More detailed installation options and the development setup are covered in the project's installation guides.


Command Line Tool Usage

MS2LDA provides powerful command-line tools for batch processing and analysis of mass spectrometry data.

For detailed instructions on using the command-line interface, see the Command Line Tool Guide.


MS2LDAViz Application

MS2LDA includes a web-based visualization application (MS2LDAViz) for exploring and analyzing results.

For instructions on starting and using the visualization application, see the MS2LDAViz Guide.


MS2LDA Documentation

📚 View the full documentation

Our comprehensive documentation includes:

  • Getting started guides
  • API reference
  • Tutorials and examples
  • Parameter settings and advanced usage

Citing MS2LDA

Please cite our work if you use MS2LDA in your research:

Torres Ortega, L.R., Dietrich, J., Wandy, J., Mol, H., & van der Hooft, J.J.J. (2025). Large-scale discovery and annotation of hidden substructure patterns in mass spectrometry profiles. bioRxiv. doi: https://doi.org/10.1101/2025.06.19.659491


Download files

Source Distribution

ms2lda-2.0.1.tar.gz (791.2 kB)

Built Distribution

ms2lda-2.0.1-py3-none-any.whl (818.4 kB)

File details

Details for the file ms2lda-2.0.1.tar.gz.

File metadata

  • Download URL: ms2lda-2.0.1.tar.gz
  • Upload date:
  • Size: 791.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/2.1.4 CPython/3.12.11 Darwin/24.5.0

File hashes

Hashes for ms2lda-2.0.1.tar.gz
Algorithm Hash digest
SHA256 eed2fdca2713c8e03a001dfc745e646716bf2fbc99834c683f1385aaa46d625a
MD5 7889707bd0008aa6575c4f4d6e46f0f7
BLAKE2b-256 85f3c99f03ed0037a6846c893232aeb8a11a413809e1e1f152ca949e0b3a968a
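
The digests above can be used to check that a downloaded file was not corrupted or altered. The sketch below computes a file's SHA256 with the standard library and compares it to the value listed for the source distribution. The helper names and the assumption that the archive sits in the current directory are illustrative.

```python
import hashlib
from pathlib import Path

# SHA256 digest listed above for ms2lda-2.0.1.tar.gz
EXPECTED_SHA256 = "eed2fdca2713c8e03a001dfc745e646716bf2fbc99834c683f1385aaa46d625a"


def sha256_of(path, chunk_size=65536):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected_hex):
    """Return True if the file's SHA256 equals the expected hex digest."""
    return sha256_of(path) == expected_hex.strip().lower()


if __name__ == "__main__":
    # Assumed local path: the archive downloaded into the current directory.
    archive = Path("ms2lda-2.0.1.tar.gz")
    if archive.exists():
        print("hash OK" if matches_expected(archive, EXPECTED_SHA256) else "hash MISMATCH")
    else:
        print(f"{archive} not found; download it first")
```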

File details

Details for the file ms2lda-2.0.1-py3-none-any.whl.

File metadata

  • Download URL: ms2lda-2.0.1-py3-none-any.whl
  • Upload date:
  • Size: 818.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/2.1.4 CPython/3.12.11 Darwin/24.5.0

File hashes

Hashes for ms2lda-2.0.1-py3-none-any.whl
Algorithm Hash digest
SHA256 34811b8c2c077abe69203f34c1e3b8a31f20ba7a50304a945bdb6209a9202dd5
MD5 103f18b9deed7aa444677dd0adfcc32b
BLAKE2b-256 1092f0b0c6809a7813b1687251e7ca56a8284cb5a1d891372826fcd71c060422
