simplifies word counting

Project description

Word Frequency and Sentiment Analysis Package

Overview

This Python package analyzes word frequencies in text or PDF files, visualizes the results as bar plots, and performs sentiment analysis on the text. It can remove stop words, extend the stop-word list with custom entries, and plot the most frequent words.

The package also provides an interactive Gradio interface to upload PDFs, perform analysis, and view the results.
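To make the overview concrete, here is a minimal standard-library sketch of stop-word-filtered word counting. It is not the package's implementation: the `count_words` helper, the tokenizing regex, and the tiny `STOP_WORDS` set are assumptions chosen for illustration only.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list; the package's own list is not shown here.
STOP_WORDS = {"a", "an", "is", "the", "this", "to"}


def count_words(text, stop_words=STOP_WORDS):
    """Lowercase the text, split it into words, and count those not in stop_words."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(w for w in words if w not in stop_words)


freq = count_words("This is an example text to analyze.")
print(freq)
```

Passing an empty set as `stop_words` keeps every word, which roughly mirrors the distinction between the "removing" and "without removing" stop-word variants the package describes.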

Features

  • PDF Reading: Extract text from PDF files.
  • Word Frequency Analysis: Count word occurrences and filter results.
  • Sentiment Analysis: Analyze the sentiment (positive, negative, neutral) of the text.
  • Stop Words Removal: Remove common stop words and custom stop words from the text.
  • Visualization: Generate bar plots for the top N frequent words.
  • Gradio Interface: Upload PDFs and perform analysis using an intuitive web interface.

Installation

Clone the repository or install the package from PyPI:

    pip install isimplify

Install additional dependencies if needed:

    pip install gradio matplotlib PyPDF2

Usage

  1. Gradio Interface

Launch the Gradio app to interact with the package:

    from word_freq_app import showWindow

    # Launch the interface
    showWindow()

  2. Word Frequency Functions

Analyze a PDF File

Analyze word frequency and sentiment from a PDF:

    from word_freq_app import getWord_freq_file_removingStopWords

    word_freq = getWord_freq_file_removingStopWords("example.pdf")
    print(word_freq)

Analyze Text

Analyze word frequency from raw text:

    from word_freq_app import getWord_freq_text_removing_StopWords

    text = "This is an example text to analyze."
    word_freq = getWord_freq_text_removing_StopWords(text)
    print(word_freq)

Plot Word Frequency

Generate a plot for the top N words:

    from word_freq_app import plot_top_n_words_text

    text = "This is an example text to analyze."
    plot_path = plot_top_n_words_text(text, top_n=5)
    print(f"Plot saved at: {plot_path}")

API Methods

  1. Gradio Analysis

    analyse(file, top_n)

Inputs:

  • file: A PDF file to analyze.
  • top_n: Number of top frequent words to display.

Outputs:

  • Sentiment (text)
  • Word frequency (text)
  • Word frequency bar plot (image)

  2. Word Frequency Analysis

  • getWord_freq_file_removingStopWords(file): Analyze word frequency from a file while removing stop words.
  • getWord_freq_text_removing_StopWords(text): Analyze word frequency from text while removing stop words.
  • getWord_freq_file_without_Removing_StopWords(file): Analyze word frequency from a file without removing stop words.
  • getWord_freq_text_without_Removing_StopWords(text): Analyze word frequency from text without removing stop words.

  3. Custom Stop Words

  • add_custom_stop_words(wordsList): Add custom stop words to exclude from analysis.
  4. Plot Word Frequency

  • plot_top_n_words_text(text, top_n): Plot the top N frequent words from text.
  • plot_top_n_words_file(file, top_n): Plot the top N frequent words from a file.

Example

    from word_freq_app import plot_top_n_words_file

    # Plot top 10 words from a PDF file
    plot_path = plot_top_n_words_file("example.pdf", top_n=10)
    print(f"Word frequency plot saved at: {plot_path}")

Dependencies

  • gradio
  • matplotlib
  • PyPDF2
  • nltk
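The custom stop words and top-N plotting methods described above can be approximated without any of the listed dependencies. The sketch below is only an illustration under stated assumptions: `top_n_words`, `render_bars`, and `BASE_STOP_WORDS` are hypothetical names that do not come from the package, and a text bar chart stands in for the matplotlib plot.

```python
import re
from collections import Counter

# Hypothetical stand-ins; none of these names come from the package.
BASE_STOP_WORDS = {"a", "an", "and", "is", "of", "the", "to"}


def top_n_words(text, top_n, extra_stop_words=()):
    """Return the top_n (word, count) pairs after removing base and extra stop words."""
    stop_words = BASE_STOP_WORDS | {w.lower() for w in extra_stop_words}
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if w not in stop_words)
    # Sort by count (descending), then alphabetically, so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_n]


def render_bars(ranked):
    """Render (word, count) pairs as a text bar chart, one '#' per occurrence."""
    width = max((len(word) for word, _ in ranked), default=0)
    return "\n".join(f"{word.ljust(width)} {'#' * count}" for word, count in ranked)


sample = "The cat and the hat: the cat sat, and the cat ran to the hat."
ranked = top_n_words(sample, top_n=2, extra_stop_words=["sat"])
print(render_bars(ranked))
```

Breaking ties alphabetically is a design choice made here so that the same input always yields the same ranking; how the package orders equally frequent words is not documented above.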


Download files

Source Distribution

isimplify-0.1.8.tar.gz (41.6 kB view details)

Uploaded Source

Built Distribution

isimplify-0.1.8-py3-none-any.whl (41.7 kB view details)

Uploaded Python 3

File details

Details for the file isimplify-0.1.8.tar.gz.

File metadata

  • Download URL: isimplify-0.1.8.tar.gz
  • Upload date:
  • Size: 41.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.8.4 CPython/3.12.5 Darwin/23.5.0

File hashes

Hashes for isimplify-0.1.8.tar.gz
Algorithm Hash digest
SHA256 fbfa6584b054d907761905fcccf118597e1424840bd696848d6a1b87fda08c16
MD5 f938750a7745f18ed6de5b3b314fcf72
BLAKE2b-256 7284c5c13d97a265062ac49453e363a8f57f54d90ff8071c3199e52eb1761c39

File details

Details for the file isimplify-0.1.8-py3-none-any.whl.

File metadata

  • Download URL: isimplify-0.1.8-py3-none-any.whl
  • Upload date:
  • Size: 41.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.8.4 CPython/3.12.5 Darwin/23.5.0

File hashes

Hashes for isimplify-0.1.8-py3-none-any.whl
Algorithm Hash digest
SHA256 1c29fd9e00019ecbe28e28f7f14a35c108f4d31ceeb29656ad2e36086c2fcca5
MD5 e91747d745dfd1fb31d1d0ade5ea1677
BLAKE2b-256 e10759a10982a8778e3f142860c06e8d8cc08880528a042449a07bda72d906ea
