Short-text tagger generates topic distributions for all texts in a corpus.

short_text_tagger

short_text_tagger generates topic distributions for all texts in a corpus.

  • Free software: MIT license

Installation

pip install short_text_tagger

Usage

If graph-tool is installed and you want to use its community detection to generate topics, import generate_topic_distributions_from_corpus from short_text_tagger. This function expects a pandas DataFrame with the columns id and text.
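A minimal sketch of preparing the input described above. The helper names build_corpus_frame, check_corpus_frame and tag_corpus are hypothetical and not part of the package, and the sketch assumes the DataFrame is the only required argument of generate_topic_distributions_from_corpus, which this README does not confirm.

```python
import pandas as pd

REQUIRED_COLUMNS = ("id", "text")


def build_corpus_frame(texts):
    """Build a DataFrame with the `id` and `text` columns the README asks for."""
    texts = list(texts)
    return pd.DataFrame({"id": list(range(len(texts))), "text": texts})


def check_corpus_frame(df):
    """Raise early if a required column is missing."""
    missing = [col for col in REQUIRED_COLUMNS if col not in df.columns]
    if missing:
        raise ValueError(f"corpus DataFrame is missing columns: {missing}")
    return df


def tag_corpus(df):
    """Run the graph-tool based entry point (requires graph-tool)."""
    # Imported lazily so the helpers above work without graph-tool installed.
    from short_text_tagger import generate_topic_distributions_from_corpus

    # Assumption: the DataFrame is the only required argument.
    return generate_topic_distributions_from_corpus(check_corpus_frame(df))


corpus = build_corpus_frame(
    ["cheap flights to rome", "rome hotel deals", "python pandas tutorial"]
)
```

Calling tag_corpus(corpus) would then hand the frame to the package. Validating the columns first keeps a malformed frame from failing deep inside the community detection step.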

If you don’t have graph-tool installed, or want to substitute another community detection algorithm, import cleaned_texts_df_from_data from short_text_tagger to preprocess the texts and add the required words column to the DataFrame described above. Then import assign_text_probabilities, which takes the DataFrame with the added words column plus a list of dictionaries (word-to-topic mappings) and returns the same DataFrame with topic probability columns appended. The key step is building the list of word-to-topic mappings. In this package, that functionality is provided by word_to_block_dict.
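To make the idea of a word-to-topic mapping concrete, here is a standalone sketch that does not call the package. It assumes a text's topic probability is simply the share of its mapped words that fall in each topic; the package's assign_text_probabilities may compute this differently. The function topic_probabilities, the topic_N column names, and the use of a single mapping instead of a list of mappings are illustrative assumptions.

```python
from collections import Counter


def topic_probabilities(words, word_to_topic):
    """Share of a text's mapped words per topic (illustrative rule only)."""
    topics = sorted(set(word_to_topic.values()))
    counts = Counter(word_to_topic[w] for w in words if w in word_to_topic)
    total = sum(counts.values())
    if total == 0:
        # No word of this text appears in the mapping.
        return {topic: 0.0 for topic in topics}
    return {topic: counts[topic] / total for topic in topics}


# Plain dicts stand in for DataFrame rows that already carry a `words` column.
rows = [
    {"id": 1, "words": ["rome", "flights", "hotel"]},
    {"id": 2, "words": ["python", "pandas", "rome"]},
]

# Hypothetical output of any community detection step: word -> topic id.
word_to_topic = {"rome": 0, "flights": 0, "hotel": 0, "python": 1, "pandas": 1}

for row in rows:
    for topic, prob in topic_probabilities(row["words"], word_to_topic).items():
        row[f"topic_{topic}"] = prob
```

Any algorithm that assigns each word to a community can produce a mapping of this shape, which is what makes the community detection step replaceable.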

Credits

This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.

History

0.1.0 (2020-10-09)

  • First release on PyPI.
