A package for creating wordcloud maps in Python.

WordCloud_Mapper

WordCloud_Mapper is a Python package that allows one to create wordclouds shaped like regions on a map.

Such visualisations are especially useful for communicating datasets made up of many observations, where each observation is attributed to a specific region and has a size of occurrence. The example below uses a dataset containing the names of the biggest companies (by estimated number of employees in 2019) in each state of Germany.


https://github.com/GabZech/wordcloud_mapper/raw/main/docs/figures/germany_nuts1.png

Installation

To install WordCloud_Mapper, run in your terminal:

pip install wordcloud_mapper

or

pip install wordcloud-mapper

Features and usage

  • Create a wordcloud map from data stored in a DataFrame object using wordcloud_map().

  • Easily resize a map by any desired scaling factor using resize_map().

  • Load dummy datasets to test out the package’s features using load_companies().

  • Calculate how unique a word is to a particular region compared with other regions, based on the Term Frequency-Inverse Document Frequency (TF-IDF) score of each word in each region, using calc_tfidf().
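
To give an intuition for what a TF-IDF score expresses, here is a minimal sketch in plain Python. It is not the implementation behind calc_tfidf(), whose exact formula and signature are not shown here. The function name tfidf_by_region, the input layout (a dict mapping region codes to lists of words) and the example data are all assumptions made for illustration, and the formula is the textbook one: term frequency within the region multiplied by the logarithm of (number of regions / number of regions containing the word).

```python
import math
from collections import Counter


def tfidf_by_region(words_by_region):
    """Return {region: {word: score}} using a textbook TF-IDF formula.

    Hypothetical helper for illustration only; not part of wordcloud_mapper.
    """
    n_regions = len(words_by_region)

    # Document frequency: in how many regions does each word appear at least once?
    doc_freq = Counter()
    for words in words_by_region.values():
        doc_freq.update(set(words))

    scores = {}
    for region, words in words_by_region.items():
        counts = Counter(words)
        total = len(words)
        scores[region] = {
            # Term frequency in this region times inverse document frequency.
            word: (count / total) * math.log(n_regions / doc_freq[word])
            for word, count in counts.items()
        }
    return scores


# Made-up example data: two regions, each with a short list of words.
example = {
    "DE1": ["cars", "cars", "software"],
    "DE2": ["cars", "beer"],
}
scores = tfidf_by_region(example)
```

In this sketch a word that occurs in every region (here "cars") scores zero, while a word found in only one region (here "beer" or "software") gets a positive score. That is the sense in which the score measures how unique a word is to a region.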

See the documentation for more information on how to use the package and its functions.

Notes on geographical nomenclature

The classification of regions used here follows the European Union’s Nomenclature of Territorial Units for Statistics (NUTS), a geocode standard for referencing the subdivisions of countries. The advantage of using this system is that the classification of regions across countries is standardised and hierarchically structured. For instance, Germany has the base code DE (NUTS 0), the state of Bavaria has the code DE2 (NUTS 1), its subregion of Oberbayern has the code DE21 (NUTS 2) and the city of Munich has the code DE212 (NUTS 3). Since each region is given a unique identifier which is directly linked to the regional level above it, it is fairly easy to identify and match any dataset to these regions.

However, this means that this package currently only works for creating wordcloud maps for EU countries. For an overview of the NUTS regions and levels, you can browse the available maps for each EU country or use this interactive map instead. If you have a dataset containing postcodes and want to convert these to NUTS regions, you can find the correspondence tables here.

In a future release, support for non-NUTS regional referencing systems will be implemented.
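
Because each NUTS code extends the code of the level above it by one character, the level of a code and its parent regions can be read from the code itself. The sketch below shows this with two helper functions, nuts_level and nuts_ancestors. Both are hypothetical names introduced here for illustration and are not part of WordCloud_Mapper. The sketch assumes plain codes made up of a two-letter country code followed by up to three further characters.

```python
def nuts_level(code):
    """Return the NUTS level implied by the code length (2 chars -> 0, ..., 5 chars -> 3)."""
    if not 2 <= len(code) <= 5:
        raise ValueError(f"Unexpected NUTS code length: {code!r}")
    return len(code) - 2


def nuts_ancestors(code):
    """Return every code from the country level down to the given code."""
    return [code[:n] for n in range(2, len(code) + 1)]


# Munich (NUTS 3) and the regions it belongs to, as in the example above.
level = nuts_level("DE212")        # 3
chain = nuts_ancestors("DE212")    # ['DE', 'DE2', 'DE21', 'DE212']
```

A dataset recorded at NUTS 3 can therefore be aggregated to a coarser level by truncating the codes, for instance keeping the first three characters to obtain NUTS 1 regions.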

Feedback and contributions

This package is under active development, so any feedback, recommendations, suggestions or contribution requests are more than welcome!

Please read the contribution instructions or email g.dev@posteo.net if you would like to provide any feedback.

History

0.1.0 (2022-07-27)

  • First release on PyPI.

0.2.0 (2022-09-11)

New functionality:

  • Add new function calc_tfidf() to calculate TF-IDF score of each word in each region in a dataframe.

  • Add wordcloud colour generating function based on rank of words.

  • Add colour_hue parameter to wordcloud_map() allowing users to choose one specific colour hue for all regions.

Parameters exposed to users:

  • Allow users to change the parameters when downloading NUTS shapefiles from Eurostat’s API in wordcloud_map().

  • Allow users to change the sharpness of the regional border lines by changing the DPI value used when creating the masks.

  • Allow users to use shapefiles from a local filepath instead of downloading from GISCO’s online database.

Others:

  • Change default coordinate system when downloading shapefiles.
