Clustering heatmap tool for kraken-style reports

groupBug

Clustering heatmap tool for kraken-style reports. Takes kraken-style reports in text file format from either Kraken or Centrifuge (for Centrifuge, use centrifuge-kreport.pl). Produces a seaborn clustermap of the top species (default), using z-scores for the heatmap and Euclidean centroid clustering for the dendrograms.
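
As a rough illustration of the input format and the z-score step, the sketch below reads a kraken-style report (tab-separated columns: percentage of reads, clade read count, directly assigned read count, rank code, NCBI taxid, indented name) and scales each species across samples. The function names, the use of the clade read count, and the per-species scaling with the sample standard deviation are assumptions made for illustration, not groupBug's actual implementation.

```python
import statistics


def parse_kreport(handle, rank="S"):
    """Return {name: clade read count} for rows whose rank code matches `rank`."""
    counts = {}
    for line in handle:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 6:
            continue  # skip blank or malformed lines
        _percent, clade_reads, _direct_reads, rank_code, _taxid, name = fields[:6]
        if rank_code.strip() == rank:
            # Names are indented to show taxonomic depth, so strip the padding.
            counts[name.strip()] = int(clade_reads)
    return counts


def z_scores(values):
    """Scale numbers to zero mean and unit sample standard deviation."""
    if len(values) < 2:
        return [0.0 for _ in values]  # a z-score needs at least two samples
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    if sd == 0:
        return [0.0 for _ in values]  # constant row: no variation to scale
    return [(v - mean) / sd for v in values]


def species_z_table(samples):
    """samples: {sample: {species: count}} -> {species: [z-score per sample]}.

    Samples are ordered alphabetically; a species missing from a sample
    is counted as 0 (an assumption of this sketch).
    """
    names = sorted(samples)
    species = sorted({sp for counts in samples.values() for sp in counts})
    return {sp: z_scores([samples[n].get(sp, 0) for n in names]) for sp in species}
```

seaborn's clustermap function exposes z_score, method and metric arguments that cover the scaling and clustering choices described above; how groupBug sets them is not stated here.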

This work was inspired by the excellent hclust script available for metaphlan analysis, see https://bitbucket.org/biobakery/biobakery/wiki/metaphlan2#rst-header-create-a-heatmap-with-hclust2. It was also inspired by Pavian, see https://github.com/fbreitwieser/pavian.

This work was funded by NIHR Biomedical Research Centre at Oxford University Hospitals NHS Foundation Trust and the University of Oxford.

Installation

From GitLab

git clone https://gitlab.com/ModernisingMedicalMicrobiology/groupBug
cd groupBug
sudo python3 setup.py install

From pip3

sudo pip3 install groupBug

From pip3 if you don't have admin rights

pip3 install --user groupBug

Requirements

Written for Python 3 only. Currently depends on the following packages:

pandas ete3 matplotlib seaborn six coverage nose

An X display server is needed if the -sv/--saveName parameter is not used.

Usage

Command line options are as follows.

usage: groupBug.py [-h] -k KRAKEN_REPORTS [KRAKEN_REPORTS ...] [-d DOMAIN]
                   [-t TAXIDS [TAXIDS ...]] [-sv SAVENAME] [-n TOPNUM]
                   [-suf SUF]

cluster heatmap and information from kraken reports

optional arguments:
  -h, --help            show this help message and exit
  -k KRAKEN_REPORTS [KRAKEN_REPORTS ...], --kraken_reports KRAKEN_REPORTS [KRAKEN_REPORTS ...]
                        list of kraken style report files
  -d DOMAIN, --domain DOMAIN
                        Domain of life to display, bacteria, viruses etc
  -t TAXIDS [TAXIDS ...], --taxids TAXIDS [TAXIDS ...]
                        list of taxids to specifically count
  -sv SAVENAME, --saveName SAVENAME
                        file name to save plot as
  -n TOPNUM, --topNum TOPNUM
                        Number of discrete species to display
  -suf SUF, --suf SUF   suffix to delete from sample name
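
The option list above can be mirrored with a small argparse parser, which is handy for checking how a given command line would be interpreted. This is a reconstruction from the help text only: the function name build_parser and the integer type for -n/--topNum are assumptions, and the real script's defaults are not reproduced.

```python
import argparse


def build_parser():
    """Rebuild the documented command-line interface (a sketch, not the original code)."""
    parser = argparse.ArgumentParser(
        prog="groupBug.py",
        description="cluster heatmap and information from kraken reports",
    )
    parser.add_argument("-k", "--kraken_reports", nargs="+", required=True,
                        help="list of kraken style report files")
    parser.add_argument("-d", "--domain",
                        help="Domain of life to display, bacteria, viruses etc")
    parser.add_argument("-t", "--taxids", nargs="+",
                        help="list of taxids to specifically count")
    parser.add_argument("-sv", "--saveName",
                        help="file name to save plot as")
    # Assumption: the species count is parsed as an integer.
    parser.add_argument("-n", "--topNum", type=int,
                        help="Number of discrete species to display")
    parser.add_argument("-suf", "--suf",
                        help="suffix to delete from sample name")
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args())
```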

For example, use this command to display the top bacterial species.

groupBug.py -k kreports/* 

This will produce a chart like this.

The file names are used as sample labels along the x axis. To remove suffixes, use the -suf option, like so.

groupBug.py -k reports/* -suf _kreport_score_150.txt

This will produce a chart like this.
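
The effect of -suf on sample labels can be pictured with the minimal sketch below. The helper sample_label and the file name sampleA are hypothetical, and removing only a trailing match is an assumption; groupBug may handle the suffix differently.

```python
import os


def sample_label(path, suffix=None):
    """Derive a sample label from a report path, dropping a trailing suffix if given."""
    name = os.path.basename(path)
    if suffix and name.endswith(suffix):
        name = name[: -len(suffix)]
    return name


if __name__ == "__main__":
    print(sample_label("reports/sampleA_kreport_score_150.txt",
                       "_kreport_score_150.txt"))
```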


