Clustering heatmap tool for kraken-style reports
groupBug
Clustering heatmap tool for kraken-style reports. Takes kraken-style reports as text files from either Kraken or Centrifuge (for Centrifuge, use centrifuge-kreport.pl). Produces a seaborn clustermap of the top species (the default), using z scores for the heatmap and Euclidean centroid clustering for the dendrograms.
This work was inspired by the excellent hclust script available for metaphlan analysis (see https://bitbucket.org/biobakery/biobakery/wiki/metaphlan2#rst-header-create-a-heatmap-with-hclust2) and by Pavian (see https://github.com/fbreitwieser/pavian).
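The z-score and clustering step described above could be sketched as follows. This is a minimal illustration, not groupBug's actual code: the function names, the choice to z-score each species (row) across samples, and the use of the sample standard deviation are all assumptions. The seaborn call uses the real clustermap parameters z_score, metric and method.

```python
from statistics import mean, stdev


def row_z_scores(counts):
    """Z scores for one species' read counts across samples.

    Assumption: the sample standard deviation is used; a constant
    row is mapped to zeros instead of dividing by zero.
    """
    if len(counts) < 2:
        return [0.0 for _ in counts]
    mu = mean(counts)
    sigma = stdev(counts)
    if sigma == 0:
        return [0.0 for _ in counts]
    return [(c - mu) / sigma for c in counts]


def plot_clustermap(table, save_name):
    """Hypothetical helper: draw and save a clustermap.

    table maps sample name -> {species name: read count}.
    Plotting libraries are imported here so that row_z_scores
    can be used without them installed.
    """
    import matplotlib
    matplotlib.use("Agg")  # file-only backend, no X display needed
    import pandas as pd
    import seaborn as sns

    # Outer keys become columns (samples), inner keys become rows (species).
    df = pd.DataFrame(table).fillna(0)
    # Drop species with identical counts everywhere: their z scores are undefined.
    df = df[df.nunique(axis=1) > 1]
    grid = sns.clustermap(df, z_score=0, metric="euclidean", method="centroid")
    grid.savefig(save_name)
```

Here z_score=0 asks seaborn to standardise each row, so the colours show how a species' abundance in one sample compares with the same species in the other samples.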
This work was funded by NIHR Biomedical Research Centre at Oxford University Hospitals NHS Foundation Trust and the University of Oxford.
Installation
From GitLab
git clone https://gitlab.com/ModernisingMedicalMicrobiology/groupBug
cd groupBug
sudo python3 setup.py install
From pip3
sudo pip3 install groupBug
From pip3 if you don't have admin rights
pip3 install --user groupBug
Requirements
Written for Python 3 only. Currently depends on the following packages:
pandas ete3 matplotlib seaborn six coverage nose
An X display server is needed if the -sv/--saveName parameter is not used.
Usage
Command line options are as follows.
usage: groupBug.py [-h] -k KRAKEN_REPORTS [KRAKEN_REPORTS ...] [-d DOMAIN]
                   [-t TAXIDS [TAXIDS ...]] [-sv SAVENAME] [-n TOPNUM]
                   [-suf SUF]

cluster heatmap and information from kraken reports

optional arguments:
  -h, --help            show this help message and exit
  -k KRAKEN_REPORTS [KRAKEN_REPORTS ...], --kraken_reports KRAKEN_REPORTS [KRAKEN_REPORTS ...]
                        list of kraken style report files
  -d DOMAIN, --domain DOMAIN
                        Domain of life to display, bacteria, viruses etc
  -t TAXIDS [TAXIDS ...], --taxids TAXIDS [TAXIDS ...]
                        list of taxids to specifically count
  -sv SAVENAME, --saveName SAVENAME
                        file name to save plot as
  -n TOPNUM, --topNum TOPNUM
                        Number of discrete species to display
  -suf SUF, --suf SUF   suffix to delete from sample name
For example, use this command to display the top bacterial species.
groupBug.py -k kreports/*
This will produce a clustermap of the top species across the supplied reports.
The file names are used as sample labels along the x axis. To remove suffixes, use the -suf option, like so.
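To give an idea of what "top species" means in terms of the input files, here is a rough sketch of picking the most abundant species from one kraken-style report. It assumes the standard six tab-separated columns (percentage, clade reads, direct reads, rank code, taxid, indented name). The function name and the default of 10 are hypothetical, and groupBug's own selection logic may differ.

```python
def top_species(report_lines, top_num=10):
    """Return the top_num species as (name, taxid, clade_reads) tuples.

    Assumption: species-level rows carry the rank code "S" and are
    ranked by the number of reads in the clade (column 2).
    """
    species = []
    for line in report_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 6:
            continue  # skip blank or malformed lines
        _pct, clade_reads, _direct, rank, taxid, name = fields[:6]
        if rank != "S":
            continue  # keep species-level rows only
        # Names are indented to show taxonomic depth, so strip them.
        species.append((name.strip(), int(taxid), int(clade_reads)))
    species.sort(key=lambda row: row[2], reverse=True)
    return species[:top_num]


# Small made-up report used only for illustration.
example_report = [
    " 50.00\t500\t0\tD\t2\t  Bacteria",
    " 20.00\t200\t200\tS\t1280\t        Staphylococcus aureus",
    " 30.00\t300\t300\tS\t562\t        Escherichia coli",
]
top = top_species(example_report, top_num=1)
```

Running this over every report and combining the results gives the species-by-sample table that the heatmap is drawn from.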
groupBug.py -k reports/* -suf _kreport_score_150.txt
This will produce the same kind of chart, with the suffix removed from the sample labels.
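The effect of -suf on a label can be pictured with the small sketch below. It is an illustration only: the function name is hypothetical, and it assumes the suffix is removed from the end of the file name, which may not be exactly how groupBug does it.

```python
import os


def sample_label(path, suffix=""):
    """Turn a report path into a sample label.

    Assumption: the directory part is dropped and the suffix is
    removed only when the file name ends with it.
    """
    name = os.path.basename(path)
    if suffix and name.endswith(suffix):
        name = name[: -len(suffix)]
    return name


label = sample_label("reports/sampleA_kreport_score_150.txt",
                     "_kreport_score_150.txt")
```

With the suffix given in the command above, a file called sampleA_kreport_score_150.txt would be labelled sampleA on the x axis.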