
A library for topic modeling and visualization.


DARIAH Topics is an easy-to-use Python library for topic modeling and visualization. To get started, import the library; you can then train a model directly from raw text files.

It supports two implementations of latent Dirichlet allocation:

  • The lightweight, Cython-based package lda
  • The more robust, Java-based package MALLET

Installation

$ pip install dariah

Example

>>> import dariah
>>> dariah.topics(directory="british-fiction-corpus",
...               stopwords=100,
...               num_topics=10,
...               num_iterations=1000)
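
To make the parameters above more concrete, here is a minimal sketch of the kind of preprocessing that usually precedes LDA training: tokenizing raw text, dropping the most frequent words, and counting the rest into a document-term matrix. It assumes that an integer value such as stopwords=100 means "remove the 100 most frequent words", which is an interpretation and not something stated here. The helper names (tokenize, most_frequent_words, document_term_matrix) are hypothetical and are not part of the dariah API; the sketch uses only the standard library and is not the library's actual implementation.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def most_frequent_words(tokenized_docs, n):
    """Return the n most frequent words across all documents.

    Assumption: this mirrors what an integer stopword threshold might do.
    """
    counts = Counter(token for doc in tokenized_docs for token in doc)
    return {word for word, _ in counts.most_common(n)}


def document_term_matrix(tokenized_docs, stopwords):
    """Build a sorted vocabulary and one count row per document."""
    vocabulary = sorted({t for doc in tokenized_docs for t in doc} - stopwords)
    index = {word: i for i, word in enumerate(vocabulary)}
    matrix = []
    for doc in tokenized_docs:
        row = [0] * len(vocabulary)
        for token in doc:
            if token in index:  # stopwords are not in the index
                row[index[token]] += 1
        matrix.append(row)
    return vocabulary, matrix


# Toy corpus standing in for the text files in a directory.
corpus = [
    "The whale and the sea and the ship",
    "The garden and the house and the rose",
]
docs = [tokenize(text) for text in corpus]
stopwords = most_frequent_words(docs, 2)
vocabulary, matrix = document_term_matrix(docs, stopwords)
```

A matrix of this shape (documents as rows, vocabulary terms as columns) is the typical input to an LDA implementation, which then runs for the requested number of iterations to estimate the requested number of topics.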

Developing

Poetry automatically creates a virtual environment and handles building and publishing the project to PyPI. Install dependencies with:

$ poetry install

run tests:

$ poetry run pytest

format code:

$ poetry run black dariah

build the project:

$ poetry build

and publish it on PyPI:

$ poetry publish

About DARIAH-DE

DARIAH-DE supports research in the humanities and cultural sciences with digital methods and procedures. The research infrastructure of DARIAH-DE consists of four pillars: teaching, research, research data and technical components. As a partner in DARIAH-EU, DARIAH-DE helps to bundle and network state-of-the-art activities of the digital humanities. Scientists use DARIAH, for example, to make research data available across Europe. The exchange of knowledge and expertise is thus promoted across disciplines and the possibility of discovering new scientific discourses is encouraged.

This software library has been developed with support from the DARIAH-DE initiative, the German branch of DARIAH-EU, the European Digital Research Infrastructure for the Arts and Humanities consortium. Funding has been provided by the German Federal Ministry of Education and Research (BMBF) under the identifier 01UG1610J.



