Visual Text Explorer: a tool for exploring and analyzing text through word frequency and named entity recognition in Jupyter Notebooks.

Project description

TextExplorer

VisualTextAnalyzer helps users understand text data. It includes word frequency analysis and named entity recognition, which together let users explore the fundamental characteristics of a text dataset. The results are shown as bar charts integrated with the Jupyter Notebook environment. Word frequency analysis is a common task in text analytics: it identifies the most frequently occurring words in a given text. Common stopwords such as ‘to’, ‘in’, and ‘for’ are removed before the word frequency analysis. Named entity recognition is an information extraction method in which the entities present in the text are classified into predefined entity types such as ‘Person’, ‘Organization’, and ‘City’. With this method, users can see which types of entities appear in the given textual dataset.
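The word frequency step described above can be sketched in plain Python. This is a minimal illustration, not the package's implementation: the `STOPWORDS` set, the `word_frequencies` helper, the tokenization rule, and the sample sentences are all assumptions made for the example.

```python
import re
from collections import Counter

# Illustrative stopword list (an assumption; the package's own list may differ).
STOPWORDS = {"to", "in", "for", "the", "a", "an", "and", "of", "is", "was"}


def word_frequencies(texts, stopwords=STOPWORDS):
    """Count how often each non-stopword occurs across an iterable of strings."""
    counts = Counter()
    for text in texts:
        # Lowercase the text and treat runs of letters/apostrophes as tokens.
        tokens = re.findall(r"[a-z']+", text.lower())
        counts.update(t for t in tokens if t not in stopwords)
    return counts


# Hypothetical sample reviews.
sample = [
    "The food was great",
    "Great service and great food",
    "Not good for the price",
]
freqs = word_frequencies(sample)
print(freqs.most_common(2))  # [('great', 3), ('food', 2)]
```

The resulting counts are the kind of data a bar chart of top words would be drawn from.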

Visual Text Analyzer

Text Exploration

In Jupyter Notebook:

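import pandas as pd
import VisualTextAnalyzer

# Load a CSV that has a text column and a category column.
data = pd.read_csv('yelp_labelled_sample.csv')
# Show the word frequency and named entity charts in the notebook.
VisualTextAnalyzer.plot_text_summary(data, category_column='category', text_column='comments')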

Demo

In Jupyter Notebook:

import VisualTextAnalyzer
yelp_data = VisualTextAnalyzer.get_yelp_labelled_data()
VisualTextAnalyzer.plot_text_summary(yelp_data, category_column='category', text_column='comments')

Export Texts

You might want to export a subset of selected texts for further analysis. To do so, first export the selection through the UI, then run the following code:

obj_text = VisualTextAnalyzer.get_exported_texts()

The returned object has the following attributes:

  • texts: List of the exported texts.
  • category: The category that all exported texts belong to.
  • word: The word that all exported texts contain.
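To make the shape of the exported object concrete, here is a hedged sketch that mimics it with a stand-in class. `ExportedTexts` and `select_texts` are hypothetical names invented for this illustration; only the three attribute names (`texts`, `category`, `word`) come from the list above, and the real selection happens in the UI rather than in code like this.

```python
import re
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass
class ExportedTexts:
    """Hypothetical stand-in mirroring the three documented attributes."""
    texts: List[str]
    category: str
    word: str


def select_texts(rows: Iterable[Tuple[str, str]], category: str, word: str) -> ExportedTexts:
    """Keep texts from (category, text) pairs that match the category and contain the word."""
    target = word.lower()
    matches = [
        text
        for cat, text in rows
        if cat == category and target in re.findall(r"[a-z']+", text.lower())
    ]
    return ExportedTexts(texts=matches, category=category, word=word)


# Hypothetical labelled reviews.
rows = [
    ("positive", "Great food and great service"),
    ("negative", "The food was cold"),
    ("positive", "Friendly staff"),
]
export = select_texts(rows, category="positive", word="food")
print(export.texts)  # ['Great food and great service']
```

With the real package, `obj_text.texts` would be read the same way once a selection has been exported from the UI.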

Get Processed Data (Words and Entities)

You might want to get the processed data (word and entity frequencies) on its own, ready for analysis, before generating the visualization. To do so, use the following code:

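# Compute the word and entity frequencies from the DataFrame.
processed_data = VisualTextAnalyzer.get_words_entities(data, category_column='category', text_column='comments')
# Plot from the precomputed result instead of the raw DataFrame.
VisualTextAnalyzer.plot_text_summary(words_entities=processed_data)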

The function 'get_words_entities' returns an object that has the following attributes:

  • words: Word frequencies.
  • entities: Entity frequencies.
  • raw_texts: All texts, separated into two categories: positive and negative.

Download files

Source Distribution

visual-text-explorer-0.1.9.tar.gz (1.3 MB, source)

Built Distribution

visual_text_explorer-0.1.9-py3-none-any.whl (1.3 MB, Python 3)
