Python library that performs Latent Dirichlet Allocation using Gibbs sampling.
# topic-modelling-tools

Topic Modelling with Latent Dirichlet Allocation using Gibbs sampling. This version of the package uses the GNU Scientific Library for random number generation, which is faster than generating random numbers with numpy.
by Stephen Hansen (email@example.com), Associate Professor of Economics, University of Oxford
Python/Cython code for cleaning text and estimating LDA via collapsed Gibbs sampling as in Griffiths and Steyvers (2004).
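For readers unfamiliar with the method, the following is a minimal pure-Python sketch of a collapsed Gibbs sampler for LDA in the spirit of Griffiths and Steyvers (2004). It is not this package's implementation (which is written in Cython) and does not use its API: the function name `collapsed_gibbs_lda`, its parameters, and the default hyperparameters `alpha` and `beta` are all hypothetical choices made for illustration.

```python
import numpy as np


def collapsed_gibbs_lda(docs, n_topics, vocab_size,
                        alpha=0.1, beta=0.01, n_iter=50, seed=0):
    """Toy collapsed Gibbs sampler for LDA (illustrative only).

    docs: list of lists of integer token ids in [0, vocab_size).
    Returns (doc_topic_counts, topic_word_counts).
    """
    rng = np.random.default_rng(seed)
    n_dk = np.zeros((len(docs), n_topics), dtype=np.int64)   # tokens in doc d assigned to topic k
    n_kw = np.zeros((n_topics, vocab_size), dtype=np.int64)  # times word w is assigned to topic k
    n_k = np.zeros(n_topics, dtype=np.int64)                 # total tokens assigned to topic k

    # Random initial topic assignment for every token.
    z = [rng.integers(n_topics, size=len(doc)) for doc in docs]
    for d, doc in enumerate(docs):
        for i, w in enumerate(doc):
            k = z[d][i]
            n_dk[d, k] += 1
            n_kw[k, w] += 1
            n_k[k] += 1

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the current token from all counts.
                k = z[d][i]
                n_dk[d, k] -= 1
                n_kw[k, w] -= 1
                n_k[k] -= 1

                # Conditional probability of each topic given all other
                # assignments, up to a normalising constant.
                p = (n_dk[d] + alpha) * (n_kw[:, w] + beta) / (n_k + vocab_size * beta)
                k = rng.choice(n_topics, p=p / p.sum())

                # Record the new assignment and restore the counts.
                z[d][i] = k
                n_dk[d, k] += 1
                n_kw[k, w] += 1
                n_k[k] += 1

    return n_dk, n_kw
```

Calling `collapsed_gibbs_lda([[0, 1, 0, 1], [2, 3, 2, 3]], n_topics=2, vocab_size=4)` returns two count matrices from which document-topic and topic-word distributions can be estimated by adding the priors and normalising each row. The triple Python loop is slow on real corpora, which is the kind of cost that compiled code is meant to avoid.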
Tutorial scripts and notebooks making use of this library, along with some example data, can be found in: https://github.com/sekhansen/text-mining-tutorial
## Installation instructions
This version of the package requires the GNU Scientific Library (GSL). You can download GSL from ftp://ftp.gnu.org/gnu/gsl/. On Mac OS X with Homebrew, run `brew install gsl`. If you have conda, run `conda install gsl`.
(For a version that doesn’t require GSL but is somewhat slower, check out the “master” branch of this repository, or run `pip install topic-modelling-tools`.)
If you already have GSL, Python and pip installed, `pip install topic-modelling-tools_gsl` should work. The package depends on other Python libraries such as numpy and nltk, but pip should install these automatically.
The only other requirement is a C++ compiler, which is needed to build the Cython code. On Mac OS X you can install the Xcode command-line tools, while on Windows you can install the Visual Studio C++ compiler.
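Before running pip, it can help to check whether the build prerequisites are visible on your `PATH`. The snippet below is a rough sketch and not part of this package: the helper name `find_build_tools` is hypothetical, and the tool names are assumptions (`gsl-config` is the helper script that GSL installs on Unix-like systems, and the others are common C++ compiler commands). A missing entry does not necessarily mean the build will fail, particularly on Windows or inside a conda environment.

```python
import shutil


def find_build_tools(candidates=("gsl-config", "c++", "g++", "clang++", "cl")):
    """Map each candidate tool name to its location on PATH, or None if absent."""
    return {name: shutil.which(name) for name in candidates}


if __name__ == "__main__":
    for name, path in find_build_tools().items():
        print(f"{name}: {path or 'not found'}")
```

If `gsl-config` and at least one compiler are found, the pip install is more likely to build cleanly.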