Gibbs sampler for the Hierarchical Latent Dirichlet Allocation topic model. This is based on the hLDA implementation from Mallet, having a fixed depth on the nCRP tree.
Project description
Hierarchical Latent Dirichlet Allocation
Hierarchical Latent Dirichlet Allocation (hLDA) addresses the problem of learning topic hierarchies from data. The model relies on a non-parametric prior called the nested Chinese restaurant process, which allows for arbitrarily large branching factors and readily accommodates growing data collections. The hLDA model combines this prior with a likelihood that is based on a hierarchical variant of latent Dirichlet allocation.
Hierarchical Topic Models and the Nested Chinese Restaurant Process
The Nested Chinese Restaurant Process and Bayesian Nonparametric Inference of Topic Hierarchies
Implementation
- hlda/sampler.py is the Gibbs sampler for hLDA inference, based on the implementation from Mallet having a fixed depth on the nCRP tree.
Installation
- Simply use
pip install hlda
to install the package. - An example notebook that infers the hierarchical topics on the BBC Insight corpus can be found in notebooks/bbc_test.ipynb.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.