Hierarchical non-parametric Bayesian clustering of digital expression data
DGEclust is a program for clustering and differential expression analysis of digital expression data generated by next-generation sequencing assays such as RNA-seq and CAGE. It takes a table of count data as input and estimates the number and parameters of the clusters supported by the data. These estimates can then be used to identify differentially expressed genes and to cluster the original data matrix gene-wise and sample-wise. Internally, DGEclust models over-dispersed count data with a Hierarchical Dirichlet Process Mixture Model, which it fits with a blocked Gibbs sampler for efficient Bayesian learning.
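To make the modelling ideas above more concrete, here is a minimal sketch of two building blocks that such a model typically relies on. It is not DGEclust's code or API. It assumes that over-dispersed counts are described by a Negative Binomial distribution (a common choice, not stated on this page) and that the Dirichlet process is approximated by a truncated stick-breaking construction, which is what a blocked Gibbs sampler usually works with. The function names `nb_log_pmf` and `stick_breaking_weights` and all parameter names are hypothetical.

```python
import math
import random


def nb_log_pmf(k, mu, phi):
    """Log-probability of count k under a Negative Binomial distribution.

    Assumed parameterisation: mean mu > 0 and dispersion phi > 0,
    so that the variance is mu + phi * mu**2 (over-dispersed
    relative to a Poisson distribution with the same mean).
    """
    r = 1.0 / phi          # "size" parameter
    p = r / (r + mu)       # success probability
    return (math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
            + r * math.log(p) + k * math.log(1.0 - p))


def stick_breaking_weights(alpha, truncation, rng):
    """Draw mixture weights from a truncated stick-breaking prior.

    alpha is the concentration parameter and truncation is the maximum
    number of clusters; the last weight takes whatever stick remains,
    so the weights always sum to one.
    """
    weights = []
    remaining = 1.0
    for _ in range(truncation - 1):
        v = rng.betavariate(1.0, alpha)   # fraction of the remaining stick
        weights.append(remaining * v)
        remaining *= 1.0 - v
    weights.append(remaining)
    return weights


if __name__ == "__main__":
    rng = random.Random(0)
    weights = stick_breaking_weights(alpha=1.0, truncation=20, rng=rng)
    print("largest cluster weight:", max(weights))
    print("log P(count = 3 | mu = 5, phi = 0.5):", nb_log_pmf(3, 5.0, 0.5))
```

In a full sampler, each gene's counts would be scored against every candidate cluster with a likelihood such as `nb_log_pmf`, weighted by the cluster weights, and a cluster indicator would be drawn from the resulting probabilities. That step, and the hierarchical sharing of clusters across sample groups, is omitted here.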
This program is part of the software collection of the [Computational Genomics Group](http://bioinformatics.bris.ac.uk/) at the University of Bristol and is under continuous development. The statistical methodology behind the software is described in more detail in the following papers:
http://arxiv.org/abs/1301.4144 (Vavoulis & Gough, J Comput Sci Syst Biol 7:001-009, 2013)
http://arxiv.org/abs/1405.0723 (Vavoulis et al., submitted, 2014)
For more information, or to report bugs, email Dimitris.Vavoulis@bristol.ac.uk or Julian.Gough@bristol.ac.uk.
Enjoy!