Animated version of classic word cloud for time-series text data

License: MIT

AnimatedWordCloud

A classic word cloud does not account for variation over time in text data. Animated word cloud improves on this by displaying text datasets collected over multiple periods in a single MP4 file. The core framework for animating word frequencies was developed by Michael Cane in the WordsSwarm project. AnimatedWordCloud makes that code work efficiently on a variety of text datasets in Latin-alphabet languages.

It reads date and datetime formats in:

  • US-style: MM/DD/YYYY (2013-12-31, Feb-09-2009, 2013-12-31 11:46:17, etc.)
  • European-style: DD/MM/YYYY (2013-31-12, 09-Feb-2009, 2013-31-12 11:46:17, etc.)
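
The difference between the two conventions can be illustrated with the standard library. The sketch below is not the library's own parser: the FORMATS table and the parse_date helper are hypothetical names, and the patterns cover only the example strings listed above. Month abbreviations such as "Feb" are matched according to the current locale, which is assumed to be English here.

```python
from datetime import datetime

# Hypothetical pattern tables, for illustration only.
# They cover just the example strings from the list above.
FORMATS = {
    "us": ["%Y-%m-%d %H:%M:%S", "%Y-%m-%d", "%b-%d-%Y"],
    "eur": ["%Y-%d-%m %H:%M:%S", "%Y-%d-%m", "%d-%b-%Y"],
}


def parse_date(value: str, date_format: str) -> datetime:
    """Try each pattern of the chosen style and return the first match."""
    for pattern in FORMATS[date_format]:
        try:
            return datetime.strptime(value, pattern)
        except ValueError:
            continue  # this pattern did not fit, try the next one
    raise ValueError(f"Unrecognized {date_format!r} date: {value!r}")


# The same calendar day written in both styles parses to the same value.
print(parse_date("2013-12-31", "us"))
print(parse_date("2013-31-12", "eur"))
```

A string such as 2013-31-12 fails under the 'us' patterns because 31 is not a valid month, which is why the date_format argument has to match the data.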

Installation

It requires Python 3.8; Box2D, beautifulsoup4, pygame, and PyQt6 for visualization; and Arabica and ftfy for text preprocessing.

To install using pip, use:

pip install AnimatedWordCloud

Usage

  • Import the library:
from AnimatedWordCloud import animated_word_cloud
  • Generate frames:

animated_word_cloud generates 90 PNG word cloud images per period. It scales word frequencies so that word clouds display well on text datasets of different sizes. Frames are stored in a newly created .post_processing/frames folder in the working directory. It currently provides unigram frequencies (n-grams of order one, i.e. individual words). Bigram frequencies will be added later.

def animated_word_cloud(text: str,         # Text
                        time: str,         # Time
                        date_format: str,  # Date format: 'eur' - European, 'us' - American
                        ngram: int,        # N-gram order, 1 = unigram
                        freq: str,         # Aggregation period: 'Y' (yearly) or 'M' (monthly)
                        stopwords: list,   # Languages for stop words
)
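
To make the idea of scaled unigram frequencies per period concrete, here is a minimal sketch using only the standard library. It is not the package's implementation: the unigram_frequencies helper is a hypothetical name, and scaling each count by the most frequent word in its period is an assumed scheme chosen for illustration.

```python
import re
from collections import Counter, defaultdict


def unigram_frequencies(texts, periods, stopwords=frozenset()):
    """Count words per period, then scale counts to the 0-1 range per period.

    Assumption: scaling divides by the top count in each period. The
    package may scale differently.
    """
    counts = defaultdict(Counter)
    for text, period in zip(texts, periods):
        # Runs of letters only, so digits and punctuation are dropped.
        words = re.findall(r"[^\W\d_]+", text.lower())
        counts[period].update(w for w in words if w not in stopwords)

    scaled = {}
    for period, counter in counts.items():
        top = max(counter.values(), default=0)
        scaled[period] = {w: c / top for w, c in counter.items()} if top else {}
    return scaled


# Invented placeholder texts, grouped by year.
texts = ["Inflation and growth", "Growth, growth and rates", "Rates and inflation"]
periods = [2019, 2019, 2020]
freqs = unigram_frequencies(texts, periods, stopwords={"and"})
print(freqs[2019])
```

Because each period is scaled against its own most frequent word, a small dataset and a large one produce values in the same range, which keeps word sizes comparable across periods.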

To apply the method, use:

import pandas as pd
from AnimatedWordCloud import animated_word_cloud

data = pd.read_csv("data.csv")
animated_word_cloud(text=data['text'],                          # Read text column
                    time=data['date'],                          # Read date column
                    date_format='us',                           # Specify date format
                    ngram=1,                                    # Show individual word frequencies
                    freq='Y',                                   # Yearly frequency
                    stopwords=['english', 'german', 'french'])  # Clean from English, German and French stop words
  • Create video from frames:

Download the ffmpeg folder and the frames2video.bat file from here and place them in the postprocessing folder. Next, run frames2video.bat, which generates the wordSwarmOut.mp4 file, the desired output.
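
On systems where a .bat file cannot be run, the same step can be approximated by calling ffmpeg directly. The sketch below only builds the command. It makes several assumptions that are not confirmed by the project: the frame file name pattern (frame_%05d.png), the frame rate of 30, and the H.264 encoding options are all placeholders, and the contents of frames2video.bat may differ. Check the actual file names in the frames folder before running it.

```python
import subprocess
from pathlib import Path


def build_ffmpeg_command(frames_dir, pattern="frame_%05d.png", fps=30,
                         output="wordSwarmOut.mp4"):
    """Build an ffmpeg command that joins numbered PNG frames into an MP4.

    The pattern and fps defaults are assumptions, not values taken
    from the package.
    """
    return [
        "ffmpeg", "-y",                          # overwrite the output if it exists
        "-framerate", str(fps),                  # input frame rate
        "-i", str(Path(frames_dir) / pattern),   # numbered image sequence
        "-c:v", "libx264",                       # H.264 video codec
        "-pix_fmt", "yuv420p",                   # widely supported pixel format
        output,
    ]


def render(cmd):
    """Run the command; requires ffmpeg to be installed and on the PATH."""
    subprocess.run(cmd, check=True)


cmd = build_ffmpeg_command(".post_processing/frames")
print(" ".join(cmd))
```

Calling render(cmd) runs the conversion once the pattern matches the generated frame files.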

Documentation, examples and tutorials

  • Read the documentation: TBA

  • For more examples of coding, read these tutorials: TBA

Here are examples of animated word clouds:

Research trends in Economics (YouTube)

European Central Bankers' speeches (YouTube)


Please visit here for any questions, issues, bugs, and suggestions.

