A simple Python (2 or 3) script to generate a PNG word-cloud image from a bunch of text files. Based on word_cloud.

A simple Python script to generate a square wordcloud from one (or more) text file(s). It supports both Python 2 and 3 (2.7+ and 3.4+).

Based on the great word_cloud module by @amueller.

How to use it?

1. Requirements

Three modules are required: matplotlib for plotting, docopt for the command line interface, and word_cloud for the actual work (generating the cloud of words after reading the files).

The required Python (2 or 3) modules can be installed with pip, either directly (note that word_cloud is published on PyPI under the name wordcloud):

sudo pip install matplotlib docopt wordcloud

Or with the requirements.txt file:

sudo pip install -r requirements.txt

Note: if ansicolortags is available, it will be used to print nice colors in the help and during the generation of word clouds.
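
An optional dependency like this one can be handled with a guarded import. The following is a minimal sketch, not the script's actual code: it assumes that ansicolortags exposes a printc function, and the strip_tags fallback with its regular expression is a hypothetical helper introduced only for illustration.

```python
import re

try:
    # Optional dependency: only used if it is installed.
    # Assumption: ansicolortags provides a printc() function.
    from ansicolortags import printc
except ImportError:
    printc = None

# Hypothetical pattern for color tags such as <red> or <reset>.
TAG_PATTERN = re.compile(r"</?[a-zA-Z]+>")


def strip_tags(message):
    """Remove color tags so the text can be printed without ansicolortags."""
    return TAG_PATTERN.sub("", message)


def say(message):
    """Print with colors if ansicolortags is available, else as plain text."""
    if printc is not None:
        printc(message)
    else:
        print(strip_tags(message))
```

With this pattern the rest of the program only calls say() and never needs to know whether colors are available.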

2. Installation

Clone the repository, then copy the script somewhere in your PATH (e.g., ~/.local/bin/).

You can also just download the script itself:

$ wget
$ cp /path/to/a/directory/in/your/PATH/

Note: the script is also available from PyPI, so you can install it using pip:

$ sudo pip install generatewordcloud

3. Usage


$ --help

From one or two files

Generate a wordcloud from two txt files in the current directory, and save it to wordcloud_txt.png:

$ -o ./wordcloud_txt.png ./file1.txt ./file2.txt

Generate a wordcloud from the textfile hamlet.txt (~ 8000 lines), saving to hamlet.png:

$ -o ./hamlet.png ./hamlet.txt

(It should work on pretty big text files without any issue.)
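
For readers curious about what such a command does behind the scenes, here is a rough sketch of generating a cloud with the word_cloud library. It is not the script's actual code: the function names combine_texts and save_cloud are hypothetical, and only WordCloud(max_words=..., width=..., height=...), generate() and to_file() come from the library. Reading the files as UTF-8 with replacement of undecodable bytes is also an assumption.

```python
import io


def combine_texts(paths):
    """Read every file and join the contents into one string."""
    chunks = []
    for path in paths:
        # Assumption: UTF-8 input; undecodable bytes are replaced, not fatal.
        with io.open(path, encoding="utf-8", errors="replace") as handle:
            chunks.append(handle.read())
    return "\n".join(chunks)


def save_cloud(paths, outfile="wordcloud.png", max_words=150, width=400, height=300):
    """Build a word cloud from the given files and write it as an image."""
    # Imported here so the text handling above works without word_cloud installed.
    from wordcloud import WordCloud

    cloud = WordCloud(max_words=max_words, width=width, height=height)
    cloud.generate(combine_texts(paths))
    cloud.to_file(outfile)
    return outfile
```

Under these assumptions, save_cloud(["hamlet.txt"], outfile="hamlet.png") would roughly correspond to the hamlet example.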

Other examples

From a lot of Python scripts (~ 200)

From a lot of Bash scripts (~ 150)

From a lot of LaTeX files (~ 180)

Meta example

Generate a wordcloud from the Markdown (.md) and Python (.py) files of this very project, and save it to wordcloud_meta.png!

$ -o ./wordcloud_meta.png ./*.md ./*.py


  • [x] Support one or more input file(s), will cleanly skip any file it fails to find or fails to read,
  • [x] Custom output file, won’t be overwritten (except with -f flag),
  • [x] Nice command line interface (initially argparse powered, but I switched to docopt after realizing how awesome it is!),
  • [x] Has a command line option for every important parameter (max number of words, width, height, etc.),
  • [x] Input filenames containing spaces (e.g. this file.txt) used to be seen as several files; this was FIXED by the switch to docopt.
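
The first two features in the list above could be implemented along the following lines. This is a hedged sketch rather than the script's own code: read_readable_files and may_write are hypothetical names, and the interactive "ask before overwriting" behavior is simplified here to a plain yes/no decision.

```python
import io
import os


def read_readable_files(paths):
    """Return (texts, skipped): contents of readable files, and paths that failed."""
    texts, skipped = [], []
    for path in paths:
        try:
            with io.open(path, encoding="utf-8", errors="replace") as handle:
                texts.append(handle.read())
        except (IOError, OSError):
            # Missing or unreadable file: remember it and carry on.
            skipped.append(path)
    return texts, skipped


def may_write(outfile, force=False):
    """Allow writing unless the file already exists and force is not set."""
    return force or not os.path.exists(outfile)
```

Returning the skipped paths lets the caller report them to the user instead of silently ignoring them.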

Complete documentation (--help)

$ -h | --help
Usage: [-s | --show] [-f | --force] [-o OUTFILE | --outfile=OUTFILE]
                         [-t TITLE | --title=TITLE] [-m MAX | --max=MAX]
                         [-w WIDTH | --width=WIDTH] [-H HEIGHT | --height=HEIGHT]
                         INFILE... (-h | --help) (-v | --version)

  -h --help            Show this help message and exit.
  -v --version         Show program's version number and exit.
  -s --show            Show the image but do not save it [default False].
  -f --force           Force to write the image, even if present (default is to ask before overwriting an existing file) [default False].
  -o OUTFILE --outfile=OUTFILE
                       Filename for the generated image [default 'wordcloud.png'].
  -t TITLE --title=TITLE
                       Title for the image [default None].
  -m MAX --max MAX
                       Max number of words to display on the cloud word [default 150].
  -w WIDTH --width WIDTH
                       Width of the generate image [default 400].
  -H HEIGHT --height HEIGHT
                       Height of the generate image [default 300].
  INFILE               A text file to read.
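
Because the interface is docopt powered, the help text itself defines the parser. The sketch below shows the idea with a shortened, hypothetical usage string: the program name sketch.py and the helper to_settings are made up for illustration, and this is not the script's real docstring. docopt returns option values as strings, so the numeric options are converted explicitly.

```python
USAGE = """Usage:
  sketch.py [options] INFILE...

Options:
  -h --help                     Show this help message and exit.
  -s --show                     Show the image but do not save it.
  -f --force                    Overwrite the output file if it exists.
  -o OUTFILE --outfile=OUTFILE  Filename for the image [default: wordcloud.png].
  -m MAX --max=MAX              Max number of words [default: 150].
  -w WIDTH --width=WIDTH        Width of the image [default: 400].
  -H HEIGHT --height=HEIGHT     Height of the image [default: 300].
"""


def to_settings(args):
    """Convert the dictionary returned by docopt into typed settings."""
    return {
        "infiles": list(args["INFILE"]),
        "outfile": args["--outfile"],
        "show": bool(args["--show"]),
        "force": bool(args["--force"]),
        "max_words": int(args["--max"]),
        "width": int(args["--width"]),
        "height": int(args["--height"]),
    }


def parse(argv):
    """Parse a list of command line arguments with docopt."""
    from docopt import docopt  # imported lazily; requires the docopt module

    return to_settings(docopt(USAGE, argv=argv))
```

Since argv is passed as a list, a name such as "this file.txt" stays a single INFILE entry, spaces included.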


  • [x] Start it, from this example,
  • [x] Run it on some interesting examples, embed them here (as images),
  • [X] Check on weird encodings? (i.e., not UTF-8). It works fine!
  • [X] Test it against VERY large files (millions of lines)? It works fine, slowly but fine.
  • [X] Test it against LOTS of files (several thousand)? It works fine, slowly but fine.
  • [X] Publish it on PyPI: it is available at
  • [ ] Write a small article about it for my blog.

Known issues

  • [ ] Only tested on (X)Ubuntu (15.10), but it should work on other GNU/Linux distributions and on Mac OS X (and probably Windows), as long as both docopt and word_cloud are installed.

Unknown issues?

Use the issue tracker to notify me of a bug!


Why write this script?

There are already a lot of good word cloud generators online, but I wrote this one for three reasons:

  1. I wanted a way to visualize the major keywords of Bash and Python (my two favorite programming languages) and of Markdown/Strapdown, reStructuredText and LaTeX (my favorite document typesetting systems),
  2. The original project word_cloud seemed cool. And it is. Great job @amueller !
  3. Clouds of words are interesting! And Python is awesome!

License

This script is published under the terms of the GPLv3 License (see the file LICENSE.txt), © Lilian Besson, 2016.

