simplifies word counting
Project description
Word Frequency and Sentiment Analysis Package

Overview

This Python package analyzes word frequencies in text or PDF files, visualizes the results as bar plots, and performs sentiment analysis on the text. It can remove stop words, extend the stop word list with custom entries, and plot the most frequent words.

The package also provides an interactive Gradio interface for uploading PDFs, running the analysis, and viewing the results.
Features

- PDF Reading: Extract text from PDF files.
- Word Frequency Analysis: Count word occurrences and filter results.
- Sentiment Analysis: Analyze the sentiment (positive, negative, neutral) of the text.
- Stop Words Removal: Remove common stop words and custom stop words from the text.
- Visualization: Generate bar plots for the top N frequent words.
- Gradio Interface: Upload PDFs and perform analysis using an intuitive web interface.

Installation

Clone the repository or install the package from PyPI:

    pip install word-freq-analysis

Install additional dependencies if needed:

    pip install gradio matplotlib PyPDF2

Usage
1. Gradio Interface

Launch the Gradio app to interact with the package:

    from word_freq_app import showWindow

    # Launch the interface
    showWindow()

2. Word Frequency Functions

Analyze a PDF File

Analyze word frequency and sentiment from a PDF:

    from word_freq_app import getWord_freq_file_removingStopWords

    word_freq = getWord_freq_file_removingStopWords("example.pdf")
    print(word_freq)

Analyze Text

Analyze word frequency from raw text:

    from word_freq_app import getWord_freq_text_removing_StopWords

    text = "This is an example text to analyze."
    word_freq = getWord_freq_text_removing_StopWords(text)
    print(word_freq)

Plot Word Frequency

Generate a plot for the top N words:

    from word_freq_app import plot_top_n_words_text

    text = "This is an example text to analyze."
    plot_path = plot_top_n_words_text(text, top_n=5)
    print(f"Plot saved at: {plot_path}")

API Methods
1. Gradio Analysis

    analyse(file, top_n)

Inputs:
- file: A PDF file to analyze.
- top_n: Number of top frequent words to display.

Outputs:
- Sentiment (text)
- Word frequency (text)
- Word frequency bar plot (image)
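The inputs and outputs listed for `analyse(file, top_n)` map naturally onto a Gradio `Interface` with two input components and three output components. The following is only a sketch of that wiring, not the package's actual code: `analyse_stub` and `build_demo` are hypothetical names, the returned values are placeholders, and the choice of a slider for `top_n` is an assumption.

```python
def analyse_stub(file, top_n):
    """Hypothetical stand-in for analyse(file, top_n).

    Returns three values in the documented output order:
    sentiment (text), word frequency (text), bar plot (image).
    """
    sentiment = "neutral"                # placeholder, no real analysis
    frequencies = "example: 3\ntext: 2"  # placeholder frequency report
    plot_path = None                     # no image is produced in this stub
    return sentiment, frequencies, plot_path


def build_demo():
    """Build a Gradio interface around the stub (gradio imported lazily)."""
    import gradio as gr

    return gr.Interface(
        fn=analyse_stub,
        inputs=[
            gr.File(label="PDF file", file_types=[".pdf"]),
            gr.Slider(minimum=1, maximum=50, value=10, step=1,
                      label="Top N words"),
        ],
        outputs=[
            gr.Textbox(label="Sentiment"),
            gr.Textbox(label="Word frequency"),
            gr.Image(label="Word frequency bar plot"),
        ],
    )
```

Calling `build_demo().launch()` would start a local web server for this interface; the package's own `showWindow()` presumably plays that role.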
2. Word Frequency Analysis

- getWord_freq_file_removingStopWords(file): Analyze word frequency from a file while removing stop words.
- getWord_freq_text_removing_StopWords(text): Analyze word frequency from text while removing stop words.
- getWord_freq_file_without_Removing_StopWords(file): Analyze word frequency from a file without removing stop words.
- getWord_freq_text_without_Removing_StopWords(text): Analyze word frequency from text without removing stop words.
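The four functions above differ only in their input (file or text) and in whether stop words are dropped. A minimal sketch of the two text variants follows, assuming a simple regex tokenizer and a tiny hard-coded stop word set. `count_words` and `STOP_WORDS` are hypothetical names, and this is not the package's implementation; since the package depends on nltk, its tokenization and stop word list probably differ.

```python
import re
from collections import Counter

# Tiny illustrative stop word set (assumption); a real list is much larger.
STOP_WORDS = {"this", "is", "an", "to", "the", "a", "of", "and"}


def count_words(text, remove_stop_words=True):
    """Count lowercase word occurrences, optionally skipping stop words."""
    # Lowercase the text and treat runs of letters/apostrophes as tokens.
    tokens = re.findall(r"[a-z']+", text.lower())
    if remove_stop_words:
        tokens = [t for t in tokens if t not in STOP_WORDS]
    return Counter(tokens)


text = "This is an example text to analyze. This text is short."
print(count_words(text).most_common(3))
print(count_words(text, remove_stop_words=False).most_common(3))
```

Returning a `Counter` keeps the top N selection trivial, because `most_common(n)` already sorts by frequency.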
3. Custom Stop Words

- add_custom_stop_words(wordsList): Add custom stop words to exclude from analysis.
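Adding custom stop words usually amounts to extending the set that the counting step filters against. The sketch below illustrates that idea under the assumption of a module-level set. `BASE_STOP_WORDS`, `custom_stop_words`, and both helper functions are hypothetical and only mirror the documented `add_custom_stop_words(wordsList)` signature.

```python
from collections import Counter

BASE_STOP_WORDS = {"the", "a", "is", "of"}  # illustrative defaults (assumption)
custom_stop_words = set()                   # extended at runtime by the caller


def add_custom_stop_words_sketch(words_list):
    """Register extra words to exclude, normalized to lowercase."""
    custom_stop_words.update(word.lower() for word in words_list)


def count_without_stop_words(words):
    """Count words found in neither the base nor the custom stop list."""
    excluded = BASE_STOP_WORDS | custom_stop_words
    lowered = [word.lower() for word in words]
    return Counter(word for word in lowered if word not in excluded)


words = "The report of the Report committee is final".split()
print(count_without_stop_words(words))   # "report" is still counted here
add_custom_stop_words_sketch(["Report"])
print(count_without_stop_words(words))   # "report" is now excluded
```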
4. Plot Word Frequency

- plot_top_n_words_text(text, top_n): Plot the top N frequent words from text.
- plot_top_n_words_file(file, top_n): Plot the top N frequent words from a file.

Example

    from word_freq_app import plot_top_n_words_file

    # Plot top 10 words from a PDF file
    plot_path = plot_top_n_words_file("example.pdf", top_n=10)
    print(f"Word frequency plot saved at: {plot_path}")

Dependencies

- gradio
- matplotlib
- PyPDF2
- nltk
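The usage examples suggest that the plotting functions listed under API Methods save an image and return its path. A rough sketch of how such a helper could be built on matplotlib is shown here. `top_n_items`, `save_bar_plot`, and the default file name are hypothetical; the package's real figure layout and output location are not documented in this description.

```python
from collections import Counter


def top_n_items(counts, top_n):
    """Return the top_n (word, count) pairs, most frequent first."""
    return Counter(counts).most_common(top_n)


def save_bar_plot(counts, top_n=10, out_path="word_freq.png"):
    """Save a bar plot of the top_n words and return the image path."""
    import matplotlib
    matplotlib.use("Agg")  # render to a file, no display required
    import matplotlib.pyplot as plt

    items = top_n_items(counts, top_n)
    words = [word for word, _ in items]
    freqs = [count for _, count in items]

    fig, ax = plt.subplots(figsize=(8, 4))
    ax.bar(words, freqs)
    ax.set_xlabel("Word")
    ax.set_ylabel("Count")
    ax.set_title(f"Top {len(items)} words")
    fig.tight_layout()
    fig.savefig(out_path)
    plt.close(fig)  # free the figure once it is written
    return out_path
```

A call such as `save_bar_plot({"text": 4, "example": 2, "short": 1}, top_n=2)` would write the image and return its path.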
Download files
File details
Details for the file isimplify-0.1.8.tar.gz.
File metadata
- Download URL: isimplify-0.1.8.tar.gz
- Upload date:
- Size: 41.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/1.8.4 CPython/3.12.5 Darwin/23.5.0
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | fbfa6584b054d907761905fcccf118597e1424840bd696848d6a1b87fda08c16 |
| MD5 | f938750a7745f18ed6de5b3b314fcf72 |
| BLAKE2b-256 | 7284c5c13d97a265062ac49453e363a8f57f54d90ff8071c3199e52eb1761c39 |
File details
Details for the file isimplify-0.1.8-py3-none-any.whl.
File metadata
- Download URL: isimplify-0.1.8-py3-none-any.whl
- Upload date:
- Size: 41.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/1.8.4 CPython/3.12.5 Darwin/23.5.0
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 1c29fd9e00019ecbe28e28f7f14a35c108f4d31ceeb29656ad2e36086c2fcca5 |
| MD5 | e91747d745dfd1fb31d1d0ade5ea1677 |
| BLAKE2b-256 | e10759a10982a8778e3f142860c06e8d8cc08880528a042449a07bda72d906ea |