
A package to cluster metabolomics data and plot dendrograms


BioDendro converts an MGF file and a component list into a non-redundant list. The component-analyte list is then converted into a data matrix, and the analytes are dynamically binned and clustered.
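
As a rough illustration of the binning and matrix step, the sketch below groups sorted m/z values into bins, starting a new bin whenever the gap to the previous value exceeds a threshold, and then builds a presence/absence matrix of samples against bins. The gap rule, the `max_gap` value, the function names and the toy data are all assumptions made for illustration; BioDendro's own binning may work differently.

```python
def dynamic_bins(mz_values, max_gap=0.002):
    """Group m/z values into bins.

    Assumed rule: a new bin starts when the gap between consecutive
    sorted values exceeds max_gap.
    """
    bins = []
    for mz in sorted(mz_values):
        if bins and mz - bins[-1][-1] <= max_gap:
            bins[-1].append(mz)
        else:
            bins.append([mz])
    return bins


def presence_matrix(sample_to_mz, max_gap=0.002):
    """Return (bin_labels, {sample: [0 or 1 per bin]})."""
    all_mz = [mz for mzs in sample_to_mz.values() for mz in mzs]
    bins = dynamic_bins(all_mz, max_gap)
    # Map every observed m/z value to the index of its bin.
    mz_to_bin = {mz: i for i, members in enumerate(bins) for mz in members}
    labels = [f"{min(b):.4f}-{max(b):.4f}" for b in bins]
    matrix = {}
    for sample, mzs in sample_to_mz.items():
        row = [0] * len(bins)
        for mz in mzs:
            row[mz_to_bin[mz]] = 1
        matrix[sample] = row
    return labels, matrix


# Hypothetical toy data: sample name -> observed m/z values.
toy = {
    "sample_a": [100.001, 150.500, 200.100],
    "sample_b": [100.002, 200.101],
    "sample_c": [150.501, 300.000],
}
labels, matrix = presence_matrix(toy)
for sample, row in matrix.items():
    print(sample, row)
```

Rows of a matrix like this are what a distance measure and a hierarchical clustering step would then operate on.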

Install on Linux or Mac from bash

Installing is easiest with pip. Assuming you have python3 installed, run the following:

python3 -m pip install --user biodendro

# or

git clone && cd BioDendro
python3 -m pip install --user biodendro

The --user flag tells pip to install to a user directory rather than a system directory. Generally this will be under ~/.local on Mac and Linux. If so, make sure that ~/.local/bin is on your $PATH.
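
If you are unsure whether the user script directory is on your PATH, a short check like the following can help. It is a minimal sketch for Linux and Mac that assumes scripts are installed into the `bin` folder under the directory reported by `site.getuserbase()`; the helper name `user_bin_on_path` is made up for this example.

```python
import os
import site


def user_bin_on_path():
    """Return (user_bin_dir, is_on_path) for the current interpreter."""
    # site.getuserbase() is the base directory used by `pip install --user`.
    user_bin = os.path.join(site.getuserbase(), "bin")
    path_dirs = os.environ.get("PATH", "").split(os.pathsep)
    target = os.path.normpath(user_bin)
    on_path = any(
        os.path.normpath(os.path.expanduser(d)) == target
        for d in path_dirs
        if d
    )
    return user_bin, on_path


user_bin, on_path = user_bin_on_path()
print(f"{user_bin} on PATH: {on_path}")
```

If the check reports `False`, adding that directory to `$PATH` in your shell profile should make the BioDendro script discoverable.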

For more advanced users, we recommend using a virtual environment or conda environment.

To install as root, you can omit --user, though this is generally discouraged.

sudo python3 -m pip install biodendro

To install the latest development version, use git to install directly from the repository.

python3 -m pip install --user git+

# or

git clone && cd BioDendro
python3 -m pip install --user .

Both the BioDendro script and the python package will now be available to use (assuming Python is configured correctly).

Quick Start Example - command line

The quickest way to run is using the command-line interface.

A list of options can be obtained with the --help flag.

BioDendro --help

To run the basic pipeline using the example MGF and components files, run:

BioDendro --results-dir my_results_dir MSMS.mgf component_list.txt
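
For orientation, MGF (Mascot Generic Format) is a plain-text format in which each spectrum sits between `BEGIN IONS` and `END IONS` lines, with `KEY=value` headers followed by peak lines giving m/z and intensity. The sketch below parses that general layout. It is not BioDendro's own reader; the function name `parse_mgf` and the sample text are invented for illustration.

```python
def parse_mgf(lines):
    """Parse MGF text lines into a list of spectra.

    Each spectrum is a dict with 'params' (header key/value strings)
    and 'peaks' (a list of (mz, intensity) float tuples).
    """
    spectra = []
    current = None
    for raw in lines:
        line = raw.strip()
        # Skip blank lines and '#' comment lines.
        if not line or line.startswith("#"):
            continue
        if line == "BEGIN IONS":
            current = {"params": {}, "peaks": []}
        elif line == "END IONS":
            if current is not None:
                spectra.append(current)
            current = None
        elif current is not None:
            if "=" in line:
                # Header line such as TITLE=... or PEPMASS=...
                key, value = line.split("=", 1)
                current["params"][key.strip().upper()] = value.strip()
            else:
                # Peak line: m/z followed by an optional intensity.
                fields = line.split()
                mz = float(fields[0])
                intensity = float(fields[1]) if len(fields) > 1 else 0.0
                current["peaks"].append((mz, intensity))
    return spectra


# Hypothetical single-spectrum example.
example = """\
BEGIN IONS
TITLE=spectrum_1
PEPMASS=301.1410
CHARGE=1+
85.0284 120.0
151.0031 980.5
END IONS
""".splitlines()

spectra = parse_mgf(example)
print(len(spectra), spectra[0]["params"]["TITLE"], spectra[0]["peaks"])
```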

Quick Start Example - Python library

The pipeline is also available as a Python function. The command above is equivalent to the following Python code.

import BioDendro

tree = BioDendro.pipeline("MSMS.mgf", "component_list.txt", results_dir="my_results_dir")

From there you could analyse the results stored in tree. The example Jupyter notebooks contain more detailed explanations of the different parameters.

quick-start-example.ipynb contains basic information about running the pipelines.

longer-example.ipynb contains more detailed information about how the pipeline works, and how you can modify parameters.


Command line API

The pipeline can also be run from a bash or bash-like terminal. This is useful if you're not planning on tweaking the parameters much and just want to run the darn thing.

These examples come from a notebook, where the IPython magic command %%bash is used to run the commands in bash. You don't need the %%bash bit if you're running straight in the terminal.

To get a list of all options available use the --help (or -h) flag.

BioDendro --help

The minimum options to run the pipeline are the MGF file and a components list.

Using the example data in the BioDendro repo we could run...


BioDendro MSMS.mgf component_list.txt

Because no --results-dir is given here, the results will be stored in a directory with the current date and time appended to its name.

You can change the parameters by supplying additional flags. However, this runs the whole pipeline again, so if you just need to adjust the cutoff or decide to use braycurtis instead of jaccard distances, you might be better off using the Python API.


BioDendro --scaling --cluster-method braycurtis --cutoff 0.5 MSMS.mgf component_list.txt

This is equivalent to running the following in Python:

tree = BioDendro.pipeline("MSMS.mgf", "component_list.txt", clustering_method="braycurtis", scaling=True, cutoff=0.5)
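
To make the difference between the two distance options concrete, here is a small sketch of the standard definitions. Jaccard distance compares presence/absence only, while Bray-Curtis dissimilarity also uses the magnitudes. These are textbook formulas written in plain Python for illustration, not BioDendro's code, and the example vectors are invented.

```python
def jaccard_distance(u, v):
    """Jaccard distance on presence/absence: 1 - |A and B| / |A or B|."""
    both = sum(1 for a, b in zip(u, v) if a and b)
    either = sum(1 for a, b in zip(u, v) if a or b)
    return 0.0 if either == 0 else 1.0 - both / either


def braycurtis_distance(u, v):
    """Bray-Curtis dissimilarity: sum|u - v| / sum|u + v|."""
    num = sum(abs(a - b) for a, b in zip(u, v))
    den = sum(abs(a + b) for a, b in zip(u, v))
    return 0.0 if den == 0 else num / den


# Hypothetical intensity vectors for two analytes over four bins.
x = [10.0, 0.0, 5.0, 0.0]
y = [1.0, 0.0, 5.0, 4.0]

print("jaccard:", jaccard_distance(x, y))
print("braycurtis:", braycurtis_distance(x, y))
```

The two vectors share two of their three occupied bins, so the Jaccard distance ignores the large intensity difference in the first bin, whereas the Bray-Curtis value reflects it.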
