Ignis: Iterative Topic Modelling Platform

Ignis is an extensible platform that provides a common interface for creating and visualising topic models.

By default, it supports creating LDA models using Tomotopy (https://bab2min.github.io/tomotopy/) and visualising them using pyLDAvis (https://github.com/bmabey/pyLDAvis), but support for other models and frameworks can be written in as necessary.

Development Dependencies

If you intend to extend or modify the platform, an Anaconda 3 environment may be used to easily manage the project dependencies. Note that the conda-forge (https://conda-forge.org/) channel may have to be enabled for some of these dependencies to be installed.

General:

conda install python=3.7 tqdm jupyter

Corpus-prep Demo:

conda install gensim nltk
python -m nltk.downloader stopwords

Tomotopy:

conda install py-cpuinfo numpy
pip install tomotopy

pyLDAvis:

conda install pandas pyldavis

For ease of use:

conda install jupyter_contrib_nbextensions black

Documentation:

conda install sphinx sphinx_rtd_theme
pip install m2r2
cd docs
sphinx-build -b html . _build

Indeterminacy

N.B.: Some of the dependencies (e.g., Tomotopy, Gensim) appear to rely on PYTHONHASHSEED being set, in addition to the actual random seed, in order to reproduce results consistently, although this behaviour is itself intermittent. It may have been fixed in the upstream release of Tomotopy v0.9.1, but to be safe, PYTHONHASHSEED should be set explicitly where necessary.

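If using a Conda environment, this can be done with the following command (the environment must be reactivated afterwards for the change to take effect):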

conda env config vars set PYTHONHASHSEED=<seed>

For direct invocation:

PYTHONHASHSEED=<seed> python script.py
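Because PYTHONHASHSEED is only read when the interpreter starts, a script cannot fix it after the fact. As a minimal sketch, a guard such as the one below could be called at the top of script.py to fail early when the variable is missing. The function name require_hash_seed is hypothetical and is not part of Ignis.

```python
import os


def require_hash_seed():
    """Return the fixed PYTHONHASHSEED as an int, or raise if it is not fixed.

    Hypothetical helper, not part of Ignis. It only inspects the environment
    variable; assigning os.environ["PYTHONHASHSEED"] inside a running
    interpreter does NOT change how that interpreter hashes strings.
    """
    value = os.environ.get("PYTHONHASHSEED")
    if value is None or value == "random":
        raise RuntimeError(
            "PYTHONHASHSEED is not fixed; start Python with "
            "PYTHONHASHSEED=<seed> for reproducible results."
        )
    return int(value)


# Intended use at the top of a script:
#     seed = require_hash_seed()
```

The returned integer could also be reused as the model's random seed so that both seeds stay in step, although that is a design choice rather than a requirement.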

For Jupyter notebooks in a non-Conda environment, edit the Jupyter kernel.json to add an appropriate env key.
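The kernel.json edit can also be scripted. The sketch below assumes the path to the kernel's kernel.json is already known (running `jupyter kernelspec list` shows the kernel directories) and adds or updates the env entry in place. The function name set_kernel_hash_seed is hypothetical.

```python
import json
from pathlib import Path


def set_kernel_hash_seed(kernel_json_path, seed):
    """Add PYTHONHASHSEED to the "env" mapping of a Jupyter kernel.json file.

    Hypothetical helper, not part of Ignis or Jupyter. Other keys in the
    kernel spec, and any existing "env" entries, are left untouched.
    """
    path = Path(kernel_json_path)
    spec = json.loads(path.read_text(encoding="utf-8"))

    # Kernel specs may carry an optional "env" dict of environment variables;
    # the values must be strings.
    env = spec.setdefault("env", {})
    env["PYTHONHASHSEED"] = str(seed)

    path.write_text(json.dumps(spec, indent=1) + "\n", encoding="utf-8")
    return spec
```

The kernel has to be restarted after the change, since the variable is only read when the interpreter starts.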

pyLDAvis

The older versions of Pandas (<0.24.0a) pinned by the default distribution of pyLDAvis generate the visualisation data much more quickly than newer versions of Pandas. Ignis comes with a built-in monkey-patched version of the pyLDAvis .prepare() function that works better with these newer versions, and uses it by default.
