Value-proportional word cloud generator with true size relationships
TrueWordCloud
Value-Proportional Word Cloud Generator
A word cloud generator that maintains TRUE proportional relationships between values. Unlike traditional word clouds that arbitrarily resize words to fit a canvas, TrueWordCloud ensures font sizes are ALWAYS proportional to the input values.
Key Features
- ✅ True Proportionality - Font sizes strictly proportional to input values (no squeezing/normalization)
- 🎨 Three Layout Algorithms - Choose between 'greedy' (fast, deterministic), 'square' (compact, randomized), and 'distance_transform' (compact packing using distance transform)
- 🖼️ Mask Support - Use custom mask images to constrain word placement (black=allowed, white=forbidden)
- 🌈 Color Masks - Use colored masks to assign word colors from an image
- 🖋️ Mask Outline - Optionally overlay the mask outline on the generated word cloud
- 📐 Dynamic Canvas - Canvas size determined by content, not pre-fixed dimensions
- 🔢 Any Numeric Values - Works with frequencies, keyness scores, TF-IDF, probabilities, etc.
- 🎯 No Overlaps - Guaranteed non-overlapping word placement
- 🌈 Custom Colors - Flexible color function support
- 📊 Detailed Statistics - Use generate_with_stats() to get placement and layout stats
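The "Dynamic Canvas" feature can be pictured as taking the bounding box of all placed words and sizing the image to fit it. The sketch below is only an illustration of that idea in plain Python; the `fit_canvas` helper, the box format, and the `padding` parameter are hypothetical and are not part of the TrueWordCloud API.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels


def fit_canvas(boxes: List[Box], padding: int = 0) -> Tuple[Tuple[int, int], List[Box]]:
    """Return (canvas_size, shifted_boxes) so every box fits with `padding` px around it."""
    if not boxes:
        return (2 * padding, 2 * padding), []
    left = min(b[0] for b in boxes)
    top = min(b[1] for b in boxes)
    right = max(b[2] for b in boxes)
    bottom = max(b[3] for b in boxes)
    # Shift everything so the top-left corner of the content sits at (padding, padding).
    dx, dy = padding - left, padding - top
    shifted = [(l + dx, t + dy, r + dx, b + dy) for (l, t, r, b) in boxes]
    size = (right - left + 2 * padding, bottom - top + 2 * padding)
    return size, shifted


# Word boxes laid out around an origin at (0, 0); coordinates may be negative.
boxes = [(-120, -30, 120, 30), (-60, 30, 40, 55), (10, -70, 90, -30)]
canvas_size, shifted = fit_canvas(boxes, padding=10)
print(canvas_size)
print(shifted)
```

Because the canvas is derived from the boxes rather than the other way round, no word ever has to be shrunk to fit.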
Installation
pip install truewordcloud
Or install from source:
git clone https://github.com/laurenceanthony/truewordcloud.git
cd truewordcloud
pip install -e .
Quick Start
from truewordcloud import TrueWordCloud
# Simple usage
values = {'python': 100, 'data': 80, 'science': 75, 'visualization': 60}
twc = TrueWordCloud(values=values)
image = twc.generate()
image.save('wordcloud.png')
Layout Algorithms
Greedy Spiral (method='greedy')
Best for: Speed, reproducibility, circular aesthetics
- ⚡ Fast spiral placement from center outward
- 🔒 Deterministic (same input → same output)
- 🎯 Creates radial/circular patterns
- ✅ Ideal for scientific papers, reports, consistent branding
twc = TrueWordCloud(values=values, method='greedy')
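To give a feel for how a center-outward spiral placement can be deterministic and overlap-free, here is a minimal generic sketch. It is not TrueWordCloud's actual implementation: the Archimedean spiral, the `spiral_place` and `overlaps` helpers, and the `step` and `growth` parameters are all assumptions made for illustration.

```python
import math
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (left, top, right, bottom)


def overlaps(a: Box, b: Box, margin: float = 0.0) -> bool:
    """True if the two boxes are closer than `margin` to each other."""
    return not (a[2] + margin <= b[0] or b[2] + margin <= a[0]
                or a[3] + margin <= b[1] or b[3] + margin <= a[1])


def spiral_place(sizes: List[Tuple[float, float]], margin: float = 2.0,
                 step: float = 0.1, growth: float = 1.0) -> List[Box]:
    """Place (width, height) boxes, largest area first, along an Archimedean spiral.

    The returned boxes are in placement order (largest first), not input order.
    """
    placed: List[Box] = []
    for w, h in sorted(sizes, key=lambda s: s[0] * s[1], reverse=True):
        theta = 0.0
        while True:
            r = growth * theta  # radius grows with the angle, so the search always ends
            cx, cy = r * math.cos(theta), r * math.sin(theta)
            box = (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)
            if not any(overlaps(box, other, margin) for other in placed):
                placed.append(box)
                break
            theta += step
    return placed


sizes = [(200, 60), (120, 40), (120, 40), (80, 25)]
layout = spiral_place(sizes)
for box in layout:
    print(tuple(round(v, 1) for v in box))
```

There is no randomness anywhere in the loop, so the same sizes always yield the same layout, which is the property the greedy method advertises.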
Square Packing (method='square')
Best for: Compact layouts, gap filling, visual variety
- 📦 Center-outward square packing with intelligent gap filling
- 🎲 Randomized (varied layouts each run)
- 📐 Maintains roughly square aspect ratio (width ≈ height)
- ✅ Ideal for presentations, posters, artistic displays
twc = TrueWordCloud(values=values, method='square')
Distance Transform Packing (method='distance_transform')
Best for: Most compact, mask-constrained layouts
- 🧲 Uses distance transform to pack words tightly
- 🖼️ Works especially well with masks
- 🧩 Fills gaps more efficiently than other methods
- ✅ Ideal for artistic, shape-constrained, or dense word clouds
twc = TrueWordCloud(values=values, method='distance_transform')
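A distance transform assigns each free pixel its distance to the nearest obstacle, so the largest values mark the roomiest spots for the next word. The sketch below shows the idea on a tiny grid using a two-pass city-block transform in plain Python. It is an illustration only: the library's own metric and implementation are not described here, and `distance_map` and `roomiest_cell` are hypothetical helper names.

```python
from typing import List, Tuple


def distance_map(rows: List[str]) -> List[List[int]]:
    """City-block distance from each free cell ('.') to the nearest blocked
    cell ('#') or to the grid edge. Blocked cells get 0."""
    h, w = len(rows), len(rows[0])
    far = h + w  # larger than any possible distance in the grid
    dist = [[far if ch == '.' else 0 for ch in row] for row in rows]
    # Forward pass: propagate distances from the top and left.
    for y in range(h):
        for x in range(w):
            if dist[y][x]:
                up = dist[y - 1][x] if y > 0 else 0
                left = dist[y][x - 1] if x > 0 else 0
                dist[y][x] = min(dist[y][x], up + 1, left + 1)
    # Backward pass: propagate distances from the bottom and right.
    for y in reversed(range(h)):
        for x in reversed(range(w)):
            if dist[y][x]:
                down = dist[y + 1][x] if y < h - 1 else 0
                right = dist[y][x + 1] if x < w - 1 else 0
                dist[y][x] = min(dist[y][x], down + 1, right + 1)
    return dist


def roomiest_cell(dist: List[List[int]]) -> Tuple[int, int]:
    """(x, y) of the cell farthest from any obstacle; the first one wins on ties."""
    cells = [(x, y) for y in range(len(dist)) for x in range(len(dist[0]))]
    return max(cells, key=lambda c: dist[c[1]][c[0]])


# '#' marks pixels already covered by a placed word; '.' marks free pixels.
occupied = ["###...."] * 5
dist = distance_map(occupied)
for row in dist:
    print(row)
print(roomiest_cell(dist))
```

A packer built on this idea would place the next word near the roomiest cell, mark its pixels as blocked, and recompute the map.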
Mask Support
You can constrain word placement to a custom shape using a mask image (black=allowed, white=forbidden):
from PIL import Image
mask_img = Image.open('mask.png').convert('L')
twc = TrueWordCloud(values=values, method='greedy')
image = twc.generate(mask=mask_img)
image.save('masked_wordcloud.png')
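The black=allowed, white=forbidden convention can be sketched without any imaging library: treat the mask as rows of grayscale values (0 to 255, as in a Pillow 'L' image) and accept a word box only if every pixel under it is dark. The threshold of 128 and the helper names below are assumptions for illustration, not the library's documented behavior.

```python
from typing import List, Sequence


def allowed_mask(gray_rows: Sequence[Sequence[int]], threshold: int = 128) -> List[List[bool]]:
    """Dark pixels (below `threshold`) are allowed; light pixels are forbidden."""
    return [[pixel < threshold for pixel in row] for row in gray_rows]


def box_fits(allowed: List[List[bool]], left: int, top: int, width: int, height: int) -> bool:
    """True if the box lies inside the grid and covers only allowed pixels."""
    if left < 0 or top < 0 or top + height > len(allowed) or left + width > len(allowed[0]):
        return False
    return all(allowed[y][x]
               for y in range(top, top + height)
               for x in range(left, left + width))


# A tiny 6x5 grayscale mask: 0 is black (allowed), 255 is white (forbidden).
gray = [
    [255, 255, 255, 255, 255, 255],
    [255,   0,   0,   0,   0, 255],
    [255,   0,   0,   0,   0, 255],
    [255, 255,   0,   0, 255, 255],
    [255, 255, 255, 255, 255, 255],
]
allowed = allowed_mask(gray)
print(box_fits(allowed, 1, 1, 4, 2))  # True: the box covers only black pixels
print(box_fits(allowed, 1, 1, 4, 3))  # False: the third row includes white pixels
```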
Mask Outline
To overlay the mask outline on the word cloud:
image = twc.generate(mask=mask_img, mask_outline=True, mask_outline_color='#00AAFF', mask_outline_width=2)
image.save('masked_wordcloud_with_outline.png')
Color Masks
You can use a colored mask to assign word colors from an image:
color_mask_img = Image.open('color_mask.png')
twc = TrueWordCloud(values=values, method='greedy', use_mask_colors=True, mask_shape_mode='colors')
image = twc.generate(mask=color_mask_img)
image.save('color_masked_wordcloud.png')
Advanced Usage
Custom Colors
def color_func(word, freq, norm_freq):
    # norm_freq is between 0 and 1
    if norm_freq > 0.7:
        return (255, 0, 0)  # Red for high frequency
    elif norm_freq > 0.4:
        return (0, 0, 255)  # Blue for medium
    else:
        return (128, 128, 128)  # Gray for low
twc = TrueWordCloud(values=values, color_func=color_func)
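A color function can also blend smoothly instead of using fixed thresholds. The sketch below builds one with the same (word, freq, norm_freq) signature and RGB-tuple return value shown above; the `make_gradient` factory and the chosen colors are hypothetical, and the function is exercised directly here rather than through the library.

```python
from typing import Callable, Tuple

RGB = Tuple[int, int, int]


def make_gradient(low: RGB, high: RGB) -> Callable[[str, float, float], RGB]:
    """Build a color_func that blends from `low` to `high` as norm_freq goes from 0 to 1."""
    def color_func(word: str, freq: float, norm_freq: float) -> RGB:
        t = min(max(norm_freq, 0.0), 1.0)  # clamp defensively to [0, 1]
        r, g, b = (round(a + (c - a) * t) for a, c in zip(low, high))
        return (r, g, b)
    return color_func


# Gray for the lowest values, red for the highest.
gradient = make_gradient(low=(100, 100, 100), high=(200, 0, 0))
print(gradient('python', 100, 1.0))
print(gradient('data', 50, 0.5))
print(gradient('rare', 1, 0.0))
```

The resulting function should be usable wherever `color_func` is accepted, for example `TrueWordCloud(values=values, color_func=gradient)`.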
All Parameters
twc = TrueWordCloud(
    values={'word': 100, 'cloud': 50},  # Required: word -> value mapping
    method='greedy',                    # 'greedy', 'square', or 'distance_transform'
    base_font_size=100,                 # Font size for max value word
    font_path='/path/to/font.ttf',      # Custom font (auto-detected if None)
    min_font_size=10,                   # Minimum font size
    background_color=(255, 255, 255),   # RGB tuple
    margin=2,                           # Pixels between words
    color_func=None,                    # Custom color function
    use_mask_colors=False,              # Use colors from mask image
    mask_shape_mode='no-colors'         # 'no-colors' or 'colors'
)
# Generate with statistics
image, stats = twc.generate_with_stats(mask=mask_img)
print(stats) # {'num_words': 2, 'size_range': (50, 100), 'canvas_size': (800, 600), 'method': 'greedy', ...}
Comparison with Traditional Word Clouds
| Feature | TrueWordCloud | Traditional Word Clouds |
|---|---|---|
| Proportionality | ✅ Strict (font_size ∝ value) | ❌ Arbitrary resizing to fit |
| Canvas Size | Dynamic (fits content) | Fixed (pre-defined) |
| Reproducibility | ✅ Greedy method | Sometimes |
| Layout Options | 3 algorithms + mask | Usually 1 |
| Value Types | Any numeric | Usually just frequencies |
| Mask Support | ✅ Yes | Sometimes |
| Color Masks | ✅ Yes | Rare |
Why True Proportionality Matters
Traditional word clouds often lie about the data:
- A word with value 100 might be rendered at 80pt
- A word with value 50 might be rendered at 75pt
- Ratios like 2:1 become 1.07:1
TrueWordCloud guarantees:
- With base_font_size=100 and a maximum value of 100: value 100 → 100pt, value 50 → 50pt
- Ratios are preserved: 2:1 stays 2:1
- Visual size accurately represents data magnitude
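The arithmetic behind the two lists above can be checked in a few lines. The sketch assumes that proportional sizing means `base_font_size * value / max_value` (consistent with base_font_size being the font size of the maximum-value word) and uses min-max rescaling into a 75-80pt range as one example of how a fit-to-canvas generator might squeeze sizes. Both helper functions are hypothetical, and the library itself may additionally round sizes or apply min_font_size.

```python
from typing import Dict


def proportional_sizes(values: Dict[str, float], base_font_size: float = 100) -> Dict[str, float]:
    """Scale so the largest value gets base_font_size and all ratios are preserved."""
    top = max(values.values())
    return {word: base_font_size * v / top for word, v in values.items()}


def squeezed_sizes(values: Dict[str, float], min_pt: float = 75, max_pt: float = 80) -> Dict[str, float]:
    """Min-max rescaling into a fixed point range, which distorts ratios."""
    lo, hi = min(values.values()), max(values.values())
    span = (hi - lo) or 1  # avoid dividing by zero when all values are equal
    return {w: min_pt + (v - lo) / span * (max_pt - min_pt) for w, v in values.items()}


values = {'alpha': 100, 'beta': 50}
exact = proportional_sizes(values)
squeezed = squeezed_sizes(values)
print(exact)                                            # {'alpha': 100.0, 'beta': 50.0}
print(exact['alpha'] / exact['beta'])                   # 2.0
print(round(squeezed['alpha'] / squeezed['beta'], 2))   # 1.07
```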
Use Cases
- Linguistic Analysis - Word frequencies, keyness scores, TF-IDF
- Survey Results - Response counts, satisfaction scores
- Scientific Papers - Maintaining accurate proportional relationships
- Marketing - Brand mentions, sentiment scores
- Education - Concept importance, study time allocation
- Artistic/Shape Clouds - Custom shapes, logos, or images as masks
Requirements
- Python 3.7+
- Pillow (PIL)
- numpy
- scipy
License
MIT License - see LICENSE file for details
Contributing
Contributions welcome! Please open an issue or submit a pull request.
Citation
If you use TrueWordCloud in academic work, please cite:
@software{truewordcloud2026,
title={TrueWordCloud: Value-Proportional Word Cloud Generator},
author={Laurence Anthony},
year={2026},
url={https://github.com/laurenceanthony/truewordcloud}
}
Examples
Frequency Data
word_frequencies = {
    'the': 1000, 'Python': 500, 'data': 400, 'analysis': 300,
    'machine': 250, 'learning': 250, 'algorithm': 200
}
twc = TrueWordCloud(values=word_frequencies, method='greedy')
twc.generate().save('frequencies.png')
Keyness Scores
keyness_scores = {
    'significant': 12.5, 'analysis': 8.3, 'corpus': 6.7,
    'frequency': 5.2, 'text': 4.1
}
twc = TrueWordCloud(values=keyness_scores, method='square', base_font_size=50)
twc.generate().save('keyness.png')
With Custom Styling
from PIL import ImageColor
def rainbow_color(word, freq, norm_freq):
    # Rainbow gradient based on normalized frequency
    hue = int(norm_freq * 270)  # 0 (red) to 270 (violet)
    return ImageColor.getrgb(f'hsl({hue}, 100%, 50%)')
twc = TrueWordCloud(
    values=word_frequencies,
    method='square',
    color_func=rainbow_color,
    background_color=(0, 0, 0),  # Black background
    margin=5
)
twc.generate().save('rainbow.png')
With Mask and Mask Outline
from PIL import Image
mask_img = Image.open('mask_heart.png').convert('L')
twc = TrueWordCloud(values=word_frequencies, method='distance_transform')
image = twc.generate(mask=mask_img, mask_outline=True, mask_outline_color='#00AAFF', mask_outline_width=2)
image.save('heart_mask_wordcloud.png')
With Color Mask
color_mask_img = Image.open('mask_heart_color.png')
twc = TrueWordCloud(values=word_frequencies, method='greedy', use_mask_colors=True, mask_shape_mode='colors')
image = twc.generate(mask=color_mask_img)
image.save('color_mask_wordcloud.png')
FAQ
Q: Why are the layouts different sizes?
A: Canvas size is determined by content. More words or higher values = larger canvas. This maintains true proportions.
Q: Can I fix the canvas size?
A: Not directly, as that would require resizing words (breaking true proportionality). Instead, adjust base_font_size to control overall scale.
Q: Which method should I use?
A: Use greedy for speed and reproducibility. Use square for compact layouts and visual variety. Use distance_transform for the most compact, mask-constrained layouts.
Q: How do I make words fit in a specific area?
A: Reduce base_font_size until the generated canvas is the desired size.
Q: How do I use a mask or color mask?
A: See the Mask Support and Color Masks sections above for examples.
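The FAQ advice to reduce base_font_size until the canvas reaches the desired size can be automated with a simple loop. The sketch below is a hypothetical helper, not part of the library: `largest_fitting_size` takes any callable that reports the canvas size for a given base font size, and a stand-in function replaces real rendering so the example runs on its own.

```python
from typing import Callable, Tuple


def largest_fitting_size(canvas_size_for: Callable[[int], Tuple[int, int]],
                         target: Tuple[int, int],
                         start: int = 100, minimum: int = 10, step: int = 5) -> int:
    """Step base_font_size down from `start` until the canvas fits inside `target`."""
    size = start
    while size >= minimum:
        width, height = canvas_size_for(size)
        if width <= target[0] and height <= target[1]:
            return size
        size -= step
    raise ValueError(f"no base_font_size >= {minimum} fits inside {target}")


# Stand-in for a real render: pretend the canvas is always 8 x 6 times the base size.
def fake_canvas_size(base_font_size: int) -> Tuple[int, int]:
    return (8 * base_font_size, 6 * base_font_size)


print(largest_fitting_size(fake_canvas_size, target=(640, 480)))
```

With the library, the callable might be something like `lambda s: TrueWordCloud(values=values, base_font_size=s).generate().size`, assuming generate() returns a Pillow image as the save() calls in the examples suggest. With the randomized 'square' method the canvas size may vary between runs, so the result is only approximate there.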
Made with ❤️ for accurate data visualization