Short-Text Analyzer was created to help analyze open-ended survey responses, which usually contain fewer than three sentences. The analysis includes topic modeling, sentiment analysis, and visualization.

Project description

Short-text-analyzer

ShortTextAnalyzer was created to help analyze open-ended survey responses, which usually contain fewer than three sentences. The analysis includes topic modeling, sentiment analysis, and visualization. Topic modeling uses pre-trained language representations, namely BERT, combined with a clustering algorithm (KMeans or HDBSCAN).

Documentation Page: https://thisisphume.github.io/short-text-analyzer/

Install

pip install short-text-analyzer

Install all the required packages from the requirements.txt file.

pip install -r requirements.txt

from shorttextanalyzer.core import *

How to use

analyzer = shortTextAnalyzer(comments_series, 4)
output_result = analyzer.analyze_getResult()
Embedding Method for Visualization is  2AE  with MSE of 0.6560611658549391
Embedding Method for Clustering is  2AE  with MSE of 0.4782262679093038
Number of clusters via HDBSCAN is:  5.0
Number of clusters via KMeans is:   4

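Here we specify that we want 4 clusters (topics) from this data. As the output above shows, this count applies to KMeans, while HDBSCAN reports its own number of clusters.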
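To make the "fixed number of clusters" idea concrete, here is a minimal pure-Python k-means sketch. It is not the package's implementation: the toy 2-D vectors stand in for BERT-derived embeddings, the `kmeans` helper is hypothetical, and it uses a deterministic farthest-point initialization as a simplification.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=20):
    """Cluster points into k groups and return one label per point.

    Hypothetical helper for illustration only. Initialization is
    farthest-point (deterministic), not the random or k-means++
    initialization a real library would typically use.
    """
    # Start from the first point, then repeatedly add the point that is
    # farthest from all centroids chosen so far.
    centroids = [list(points[0])]
    while len(centroids) < k:
        farthest = max(
            points,
            key=lambda p: min(squared_distance(p, c) for c in centroids),
        )
        centroids.append(list(farthest))

    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        for i, p in enumerate(points):
            labels[i] = min(
                range(k), key=lambda c: squared_distance(p, centroids[c])
            )
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels


# Toy 2-D "embeddings" (assumed data): four well-separated groups of two.
embeddings = [
    [0.0, 0.1], [0.1, 0.0],
    [5.0, 5.1], [5.1, 5.0],
    [0.0, 5.0], [0.1, 5.1],
    [5.0, 0.0], [5.1, 0.1],
]
labels = kmeans(embeddings, k=4)
print(labels)
```

With k-means the caller chooses k up front, which is why the requested value of 4 shows up unchanged in the KMeans line of the output.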

Output: result

  • sentimentScore: polarity score in the range [-1, 1], where 1 indicates a positive statement and -1 a negative statement.
  • subjectiveScore: subjectivity score in the range [0, 1], where 1 indicates personal opinion, emotion, or judgment and 0 indicates factual information.
  • clusterByKMeans: cluster number assigned to each comment by KMeans.
  • clusterByHDBSCAN: cluster number assigned to each comment by HDBSCAN.
output_result.sample(2)
     comments                                           comment_lang  comments_clean                                     sentimentScore  subjectiveScore  clusterByKMeans  clusterByHDBSCAN
50   sondage parfait                                    fr            perfect poll                                       1.00            1.000000         2                1
875  it wasn't very clear what the purpose of the f...  en            it wasn't very clear what the purpose of the f...  0.19            0.415833         1                1
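One way to read these columns together is to summarize sentiment per cluster. The sketch below uses plain dicts as stand-ins for rows of the result; the rows are made-up illustrative values, and the `polarity_label` and `summarize` helpers and the 0.1 neutrality threshold are assumptions, not part of the package.

```python
from collections import defaultdict
from statistics import mean

# Illustrative rows only: they mirror the column names shown above,
# but the comments and scores are invented for this example.
rows = [
    {"comments_clean": "perfect poll", "sentimentScore": 1.0, "clusterByKMeans": 2},
    {"comments_clean": "the goal was unclear", "sentimentScore": 0.19, "clusterByKMeans": 1},
    {"comments_clean": "too long and confusing", "sentimentScore": -0.5, "clusterByKMeans": 1},
]


def polarity_label(score, threshold=0.1):
    """Map a polarity score in [-1, 1] to a coarse label.

    The threshold is an arbitrary assumption chosen for illustration.
    """
    if score > threshold:
        return "positive"
    if score < -threshold:
        return "negative"
    return "neutral"


def summarize(rows, cluster_key="clusterByKMeans"):
    """Return the comment count and mean sentiment for each cluster."""
    scores_by_cluster = defaultdict(list)
    for row in rows:
        scores_by_cluster[row[cluster_key]].append(row["sentimentScore"])
    return {
        cluster: {
            "n": len(scores),
            "mean_sentiment": mean(scores),
            "label": polarity_label(mean(scores)),
        }
        for cluster, scores in sorted(scores_by_cluster.items())
    }


summary = summarize(rows)
for cluster, stats in summary.items():
    print(cluster, stats)
```

Passing cluster_key="clusterByHDBSCAN" would produce the same summary for the HDBSCAN assignment, provided the rows carry that column.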

Visualization: how good are our clusters? HDBSCAN and KMeans

analyzer.plot_output()

[Two figures: cluster plots produced by analyzer.plot_output()]
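Alongside the plots, the question "how good are our clusters?" can also be approached numerically. The following is a minimal sketch of the silhouette coefficient, one common cluster-quality measure. It is not something the package is known to compute; the `silhouette_score` helper and the toy points are assumptions for illustration.

```python
import math


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points.

    Values near 1 suggest tight, well-separated clusters; values near 0
    or below suggest overlapping or misassigned clusters. Requires at
    least two distinct labels. Hypothetical helper, pure Python.
    """
    cluster_ids = set(labels)
    scores = []
    for i, p in enumerate(points):
        # a: mean distance to the other members of the same cluster.
        same = [
            math.dist(p, q)
            for j, q in enumerate(points)
            if labels[j] == labels[i] and j != i
        ]
        if not same:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        a = sum(same) / len(same)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(math.dist(p, q) for q, lab in zip(points, labels) if lab == other)
            / labels.count(other)
            for other in cluster_ids
            if other != labels[i]
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


# Toy 2-D points (assumed data): two tight pairs far apart.
points = [[0.0, 0.0], [0.0, 1.0], [10.0, 0.0], [10.0, 1.0]]
good = silhouette_score(points, [0, 0, 1, 1])  # labels match the geometry
bad = silhouette_score(points, [0, 1, 0, 1])   # labels cut across it
print(round(good, 3), round(bad, 3))
```

Computing such a score separately for the clusterByKMeans and clusterByHDBSCAN labels would give a rough way to compare the two assignments on the same embeddings.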

Download files

Source Distribution

shorttextanalyzer-0.1.1.tar.gz (14.6 kB)

Uploaded Source

Built Distribution

shorttextanalyzer-0.1.1-py3-none-any.whl (15.5 kB)

Uploaded Python 3
