Animated version of classic word cloud for time-series text data
AnimatedWordCloud
A classic word cloud does not capture how text data varies over time. An animated word cloud addresses this by displaying text datasets collected over multiple periods in a single MP4 file. The core framework for animating word frequencies was developed by Michael Cane in the WordSwarm project. AnimatedWordCloud adapts that code to work efficiently on text datasets in Latin-alphabet languages.
It reads date and datetime values in two styles:
- US-style: MM/DD/YYYY, month before day (2013-12-31, Feb-09-2009, 2013-12-31 11:46:17, etc.)
- European-style: DD/MM/YYYY, day before month (2013-31-12, 09-Feb-2009, 2013-31-12 11:46:17, etc.)
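The difference between the two styles matters because the same string can mean two different days. The following is a minimal sketch of that distinction using only the standard library. The helper parse_date and the two format tables are hypothetical and only cover the example layouts listed above; they are not the parser AnimatedWordCloud uses internally.

```python
from datetime import datetime

# Hypothetical format tables covering the example layouts listed above.
US_FORMATS = ["%Y-%m-%d %H:%M:%S", "%Y-%m-%d", "%b-%d-%Y", "%m/%d/%Y"]
EUR_FORMATS = ["%Y-%d-%m %H:%M:%S", "%Y-%d-%m", "%d-%b-%Y", "%d/%m/%Y"]


def parse_date(value: str, date_format: str) -> datetime:
    """Parse value as a 'us' or 'eur' style date, trying each known layout."""
    if date_format not in ("us", "eur"):
        raise ValueError("date_format must be 'us' or 'eur'")
    formats = US_FORMATS if date_format == "us" else EUR_FORMATS
    for fmt in formats:
        try:
            return datetime.strptime(value, fmt)
        except ValueError:
            continue  # try the next layout
    raise ValueError(f"unrecognized {date_format} date: {value!r}")


# The same string resolves to different days depending on the style.
as_us = parse_date("03/04/2021", "us")    # month 3, day 4
as_eur = parse_date("03/04/2021", "eur")  # day 3, month 4
```

Choosing the wrong style either raises an error (for example, 2013-12-31 read as European-style has no valid month 31) or silently swaps day and month, so it is worth checking a few rows of the date column before generating frames.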
Installation
It requires Python 3.8; Box2D, beautifulsoup4, pygame, and PyQt6 for visualization; and Arabica and ftfy for text preprocessing.
To install using pip, use:
pip install AnimatedWordCloud
Usage
- Import the library:
from AnimatedWordCloud import animated_word_cloud
- Generate frames:
animated_word_cloud generates 90 PNG word cloud images per period. It scales word frequencies so that word clouds can be displayed for text datasets of different sizes. Frames are stored in a newly created .post_processing/frames folder in the working directory. It currently provides unigram frequencies (n-grams of order one, i.e. individual words). Bigram frequencies will be added later.
def animated_word_cloud(text: str,         # Text
                        time: str,         # Time
                        date_format: str,  # Date format: 'eur' - European, 'us' - American
                        ngram: int = '',   # N-gram order, 1 = unigram
                        freq: str = '',    # Aggregation period: 'Y'/'M'
                        stopwords: [],     # Languages for stop words
                        )
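To make the idea of per-period unigram frequencies and scaling concrete, here is a minimal sketch using only the standard library. The function names unigram_counts_by_year and scale_counts are hypothetical, and the scaling rule (dividing by the count of the most frequent word in each period) is an assumption chosen for illustration; the library's own scaling may differ.

```python
import re
from collections import Counter, defaultdict


def unigram_counts_by_year(texts, years):
    """Count lower-cased word occurrences separately for each year."""
    counts = defaultdict(Counter)
    for text, year in zip(texts, years):
        # [^\W\d_]+ matches runs of letters, including accented Latin letters.
        counts[year].update(re.findall(r"[^\W\d_]+", text.lower()))
    return dict(counts)


def scale_counts(counter):
    """Rescale counts so the most frequent word in a period has weight 1.0."""
    if not counter:
        return {}
    top = max(counter.values())
    return {word: n / top for word, n in counter.items()}


texts = ["Inflation rises", "inflation and growth", "growth slows"]
years = [2020, 2020, 2021]
counts = unigram_counts_by_year(texts, years)
weights_2020 = scale_counts(counts[2020])  # 'inflation' gets 1.0, the rest 0.5
```

Scaling within each period keeps word sizes comparable between a year with a few documents and a year with many, which is the practical reason for normalizing before drawing.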
To apply the method, use:
import pandas as pd
data = pd.read_csv("data.csv")
animated_word_cloud(text = data['text'],                         # Read text column
                    time = data['date'],                         # Read date column
                    date_format = 'us',                          # Specify date format
                    ngram = 1,                                   # Show individual word frequencies
                    freq = 'Y',                                  # Yearly frequency
                    stopwords = ['english', 'german', 'french']) # Clean from English, German and French stop words
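The call above assumes that data.csv has a text column and a date column. As a small sketch for trying the function out, the following writes such a file with the standard library. The helper write_sample_csv and the two rows are made up for illustration; a real dataset needs many more rows across several periods for the animation to be meaningful.

```python
import csv


def write_sample_csv(path):
    """Write a tiny two-column dataset with 'text' and 'date' headers."""
    rows = [
        {"text": "Inflation expectations remain anchored.", "date": "2013-12-31"},
        {"text": "Growth slowed in the second quarter.", "date": "2014-06-30"},
    ]
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=["text", "date"])
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)
```

Calling write_sample_csv("data.csv") produces a file that matches the column names used in the example, with dates in one of the US-style layouts listed earlier.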
- Create video from frames:
Download the ffmpeg folder and the frames2video.bat file from here and place them into the postprocessing folder. Then run frames2video.bat, which generates wordSwarmOut.mp4, the final output.
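frames2video.bat is a Windows batch file. On other platforms, or when ffmpeg is already installed and on the PATH, the frames can be encoded by calling ffmpeg directly. The sketch below builds such a command in Python. The frame filename pattern frame_%05d.png and the frame rate of 30 are assumptions, not values taken from the project, and build_ffmpeg_command and render_video are hypothetical helpers; check the actual file names in the frames folder and adjust the pattern before running.

```python
import subprocess
from pathlib import Path


def build_ffmpeg_command(frames_dir, output="wordSwarmOut.mp4", fps=30,
                         pattern="frame_%05d.png"):
    """Return an ffmpeg argument list that encodes numbered PNG frames to MP4."""
    return [
        "ffmpeg",
        "-y",                                   # overwrite an existing output file
        "-framerate", str(fps),                 # input frame rate
        "-i", str(Path(frames_dir) / pattern),  # numbered input frames (assumed naming)
        "-c:v", "libx264",                      # encode as H.264
        "-pix_fmt", "yuv420p",                  # pixel format most players accept
        str(output),
    ]


def render_video(frames_dir=".post_processing/frames"):
    """Run ffmpeg on the frames folder; requires ffmpeg on the PATH."""
    subprocess.run(build_ffmpeg_command(frames_dir), check=True)
```

Keeping the command construction separate from the call that runs it makes it easy to print and inspect the command first, which helps when the frame naming turns out to differ from the assumed pattern.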
Documentation, examples and tutorials
- Read the documentation: TBA
- For more examples of coding, read these tutorials: TBA
Here are examples of animated word clouds:
- Research trends in Economics (YouTube)
- European Central Bankers' speeches (YouTube)
Please visit here for any questions, issues, bugs, and suggestions.
File details
Details for the file AnimatedWordCloud-1.0.1.tar.gz.
File metadata
- Download URL: AnimatedWordCloud-1.0.1.tar.gz
- Upload date:
- Size: 35.1 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.10.9
File hashes
Algorithm | Hash digest
---|---
SHA256 | 73935a6444de0dd6111a318d07ffb9c5afe08cdce543e76a3f764604d989c63b
MD5 | b7a8115aef319263cb56823d5c87f474
BLAKE2b-256 | 904228b7eb10111048b4134008102f591c1bebafca039a7af786d61aefd961bf
File details
Details for the file AnimatedWordCloud-1.0.1-py3-none-any.whl.
File metadata
- Download URL: AnimatedWordCloud-1.0.1-py3-none-any.whl
- Upload date:
- Size: 35.3 MB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.10.9
File hashes
Algorithm | Hash digest
---|---
SHA256 | 12e4d65a61419624867476aa3e5d9c9c8491701fd3276ad3ebe21ceb946ae287
MD5 | de0391497f2a88d3ae3ad391b5d0c759
BLAKE2b-256 | de51d31eaa58bcfaccd47b7b7895cbc02d0385526713ecd531a20c0c057a9fee