Text Mining and Topic Modeling Toolkit

Project description

tmtoolkit is a set of tools for text mining and topic modeling with Python. It contains functions for text preprocessing such as lemmatization, stemming, and POS tagging, especially for English and German texts. Preprocessing runs in parallel, using all available processors on your machine. The topic modeling features include evaluation metrics for topic models, so that models with different parameters can be computed in parallel and compared (e.g. to find the best number of topics for a given set of documents). Topic models can be generated in parallel for different corpora and/or parameter sets using the LDA implementations from lda, scikit-learn, or gensim.

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

tmtoolkit-0.4.0.tar.gz (15.2 MB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

tmtoolkit-0.4.0-py2.py3-none-any.whl (15.3 MB view details)

Uploaded Python 2, Python 3

File details

Details for the file tmtoolkit-0.4.0.tar.gz.

File metadata

  • Download URL: tmtoolkit-0.4.0.tar.gz
  • Upload date:
  • Size: 15.2 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No

File hashes

Hashes for tmtoolkit-0.4.0.tar.gz
Algorithm Hash digest
SHA256 f65805163fbdcc46200f25c75047a593075b49864ccfa18d45940b0bc6d74afc
MD5 aa93a6cf00c8e63a45c066f22b598586
BLAKE2b-256 8e6f83b2eeba2a91681819c2c46c5d234958e86775c4a9f3782ac3072a382af3
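
The published digests can be used to check a downloaded file before installing it. The following is a minimal sketch using only the Python standard library; the helper names `sha256_of_file` and `matches_published_hash` are illustrative and not part of tmtoolkit, and the file path in the usage note assumes the archive was saved to the current directory.

```python
import hashlib
import hmac

# SHA256 digest listed above for tmtoolkit-0.4.0.tar.gz
EXPECTED_SHA256 = "f65805163fbdcc46200f25c75047a593075b49864ccfa18d45940b0bc6d74afc"


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read fixed-size chunks so large archives are not loaded into memory at once.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected=EXPECTED_SHA256):
    """Return True if the file's SHA256 digest equals the expected hex digest."""
    return hmac.compare_digest(sha256_of_file(path), expected.lower())
```

For example, `matches_published_hash("tmtoolkit-0.4.0.tar.gz")` should return `True` for an unmodified copy of the source distribution. To check the wheel instead, pass its SHA256 digest as `expected`.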

See more details on using hashes here.

File details

Details for the file tmtoolkit-0.4.0-py2.py3-none-any.whl.

File hashes

Hashes for tmtoolkit-0.4.0-py2.py3-none-any.whl
Algorithm Hash digest
SHA256 bba11d7018561769ed5fc5506f0ffb70961e108bd81ae10a079fc13e0350c2d4
MD5 75792244848f97ee303359590495cf73
BLAKE2b-256 db24b63724f041dea8933e502ed4e282599ecb296f4090d4d1a3a5ccd7cfb5f7
