Dynamic topic models
Project description
horizont implements a number of topic models. Conventions from scikit-learn are followed.
The following models are implemented, or planned where marked "Coming soon", using Gibbs sampling.
- Latent Dirichlet allocation (Blei et al., 2003; Pritchard et al., 2000)
- (Coming soon) Logistic normal topic model
- (Coming soon) Dynamic topic model (Blei and Lafferty, 2006)
Getting started
horizont.LDA implements latent Dirichlet allocation (LDA) using Gibbs sampling. The interface follows conventions in scikit-learn.
>>> import numpy as np
>>> from horizont import LDA
>>> X = np.array([[1, 1], [2, 1], [3, 1], [4, 1], [5, 8], [6, 1]])
>>> model = LDA(n_topics=2, random_state=0, n_iter=100)
>>> doc_topic = model.fit_transform(X)  # estimate of document-topic distributions
>>> model.components_  # estimate of topic-word distributions
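For readers unfamiliar with Gibbs sampling for LDA, the following is a minimal NumPy sketch of a collapsed Gibbs sampler run on the same document-term matrix. It is an illustration only and not horizont's implementation: the function name lda_gibbs, the hyperparameters alpha and eta, and their default values are assumptions made for this example.

```python
import numpy as np


def lda_gibbs(X, n_topics=2, n_iter=100, alpha=0.1, eta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA (illustrative sketch, not horizont's code).

    X is a document-term count matrix; alpha and eta are assumed symmetric
    Dirichlet hyperparameters.
    """
    rng = np.random.RandomState(seed)
    n_docs, n_words = X.shape

    # Expand the count matrix into one (document, word) pair per token.
    doc_ids, word_ids = np.nonzero(X)
    counts = X[doc_ids, word_ids]
    docs = np.repeat(doc_ids, counts)
    words = np.repeat(word_ids, counts)

    # Assign every token a random initial topic and tally the counts.
    z = rng.randint(n_topics, size=len(docs))
    ndz = np.zeros((n_docs, n_topics))   # document-topic counts
    nzw = np.zeros((n_topics, n_words))  # topic-word counts
    nz = np.zeros(n_topics)              # tokens per topic
    for d, w, k in zip(docs, words, z):
        ndz[d, k] += 1
        nzw[k, w] += 1
        nz[k] += 1

    for _ in range(n_iter):
        for i in range(len(docs)):
            d, w, k = docs[i], words[i], z[i]
            # Remove the token's current assignment from the counts.
            ndz[d, k] -= 1
            nzw[k, w] -= 1
            nz[k] -= 1
            # Conditional distribution over topics for this token.
            p = (ndz[d] + alpha) * (nzw[:, w] + eta) / (nz + n_words * eta)
            k = rng.choice(n_topics, p=p / p.sum())
            # Record the newly sampled assignment.
            z[i] = k
            ndz[d, k] += 1
            nzw[k, w] += 1
            nz[k] += 1

    # Smoothed point estimates from the final sample.
    doc_topic = (ndz + alpha) / (ndz + alpha).sum(axis=1, keepdims=True)
    topic_word = (nzw + eta) / (nzw + eta).sum(axis=1, keepdims=True)
    return doc_topic, topic_word


X = np.array([[1, 1], [2, 1], [3, 1], [4, 1], [5, 8], [6, 1]])
doc_topic, topic_word = lda_gibbs(X)
print(doc_topic.shape, topic_word.shape)
```

Each row of doc_topic and topic_word is a probability distribution, playing the same role as the output of fit_transform and the components_ attribute in the example above. The numerical values from this sketch need not match those produced by horizont.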
Requirements
Python 2.7 or Python 3.3+ is required. The following packages are also required:
- numpy
- scipy
- scikit-learn
- futures (Python 2.7 only)
GSL is required for random number generation inside the Pólya-Gamma random variate generator. On Debian-based systems, GSL may be installed with the command sudo apt-get install libgsl0-dev. horizont looks for GSL headers and libraries in /usr/include and /usr/lib/, respectively.
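Before building, it can help to confirm that GSL is present in the locations named above. The following is a rough sketch, not part of horizont: the helper name find_gsl is hypothetical, and the check for gsl/gsl_rng.h and for libgsl files (including one level of subdirectories, as used by multi-arch Debian layouts) is an assumption about how GSL is typically installed.

```python
import glob
import os


def find_gsl(include_dir="/usr/include", lib_dir="/usr/lib"):
    """Report whether a GSL header and library can be found (hypothetical helper)."""
    # GSL installs its headers under a gsl/ subdirectory of the include path.
    header = os.path.join(include_dir, "gsl", "gsl_rng.h")
    # Look directly in lib_dir and one level down (e.g. an architecture subdirectory).
    libs = glob.glob(os.path.join(lib_dir, "libgsl.*"))
    libs += glob.glob(os.path.join(lib_dir, "*", "libgsl.*"))
    return {"header": os.path.isfile(header), "library": len(libs) > 0}


status = find_gsl()
print(status)
```

If either entry is False, GSL may be missing or installed under a different prefix than the one horizont searches.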
Cython is needed if compiling from source.
Important links
- Documentation: http://pythonhosted.org/horizont
- Source code: https://github.com/ariddell/horizont/
- Issue tracker: https://github.com/ariddell/horizont/issues
License
horizont is licensed under Version 3.0 of the GNU General Public License. See the LICENSE file for the text of the license or visit http://www.gnu.org/copyleft/gpl.html.