A very compact representation of an image placeholder (thumbhash, RGBA-safe fork)
thash
A modern Python port of the ThumbHash encoder by Evan Wallace. ThumbHash represents an image as ~20 bytes — small enough to inline in HTML, large enough to render a recognizable color/aspect placeholder before the real image loads.
This is an independently published fork of thumbhash by Justin Forlenza. Notable changes vs. upstream:
- Alpha-channel crash fixed (operator-precedence bug in rgba_to_thumb_hash; see upstream issue #1).
- NumPy-accelerated backend with cached cosine basis and float32 DCT (~100–140× faster than the reference implementation, byte-identical output).
- High-level encode() API that accepts paths, bytes, PIL images, NumPy arrays, and OpenCV BGR arrays: pick the input you already have, no boilerplate.
- Decoder + CLI for rendering a hash back to a placeholder image (thumb_hash_to_rgba, or thash photo.jpg -o preview.png).
- Configurable target_size so you can trade hash quality for encoding speed.
Installation
# Pure-Python fallback only (no deps)
pip install thash
# Recommended runtime (NumPy fast path + Pillow decoding)
pip install thash[all]
If you use uv:
uv add thash --extra all
Requires Python ≥ 3.10.
Quick start
The high-level API takes pretty much any image-shaped thing:
from thash import encode
# From a file path or URL-fetched bytes
hash_bytes = encode("photo.jpg")
hash_bytes = encode(open("photo.jpg", "rb").read())
# From a PIL image (already in memory, no re-decode)
from PIL import Image
hash_bytes = encode(Image.open("photo.jpg"))
# From a NumPy array (H,W,3) or (H,W,4) — assumed RGB/RGBA
import numpy as np
arr = np.asarray(Image.open("photo.jpg"))
hash_bytes = encode(arr)
# From an OpenCV BGR array
import cv2
bgr = cv2.imread("photo.jpg")
hash_bytes = encode(bgr, color_order="BGR")
# Grayscale / float arrays in [0, 1] also work — they're normalized for you
hash_bytes = encode(arr.astype(np.float32) / 255.0)
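Because the hash is only a couple dozen bytes, it is usually shipped to the browser as a base64 string inside the markup. The following is a minimal sketch using only the standard library. The helper name placeholder_img_tag and the data-thumbhash attribute are illustrative assumptions, not part of thash; the sample bytes are the hex hash used in the Command-line examples.

```python
import base64
import html


def placeholder_img_tag(src: str, hash_bytes: bytes) -> str:
    """Build an <img> tag carrying the hash in a (hypothetical) data attribute."""
    b64 = base64.b64encode(hash_bytes).decode("ascii")
    return f'<img src="{html.escape(src, quote=True)}" data-thumbhash="{b64}">'


# Sample hash bytes; in real use this would be the return value of encode().
sample = bytes.fromhex("d9d6092c94817688808875684876737017f888")
tag = placeholder_img_tag("photo.jpg", sample)
print(tag)
```

Client-side code would then read the attribute, decode the hash, and paint the placeholder until the real image has loaded.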
Decoding the hash back
from thash import (
thumb_hash_to_rgba,
thumb_hash_to_average_rgba,
thumb_hash_to_approximate_aspect_ratio,
)
# Render the hash to a small RGBA preview (flat bytes, length 4*w*h)
w, h, rgba = thumb_hash_to_rgba(hash_bytes, base_size=256)
from PIL import Image
Image.frombytes("RGBA", (w, h), rgba).save("preview.png")
# Want a numpy array instead?
import numpy as np
arr = np.frombuffer(rgba, dtype=np.uint8).reshape(h, w, 4)
# Cheaper queries that don't reconstruct pixels:
r, g, b, a = thumb_hash_to_average_rgba(hash_bytes) # values in [0, 1]
aspect = thumb_hash_to_approximate_aspect_ratio(hash_bytes) # w / h
base_size is the longer edge of the reconstructed image. ThumbHash only carries ~5×5 / 7×7 frequency coefficients, so the IDCT is run directly at the requested resolution rather than upsampled; values up to a few hundred pixels look smooth without any extra resampling. The aspect ratio comes from the encoded lx / ly (e.g. 7:4 for a landscape, 5:7 for a portrait). Ratios that lx / ly cannot represent exactly get quantized (1.6 becomes 1.75); this is a property of the spec, not an implementation choice.
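The quantization of 1.6 to 1.75 can be reproduced with a few lines of arithmetic. This sketch assumes the rounding rule of the reference encoder as I understand it: each edge gets a coefficient count proportional to its length, capped at 7 (or 5 when alpha is present). The function name quantized_dims is hypothetical and is not a thash API.

```python
import math


def quantized_dims(w: int, h: int, has_alpha: bool = False) -> tuple[int, int]:
    """Return (lx, ly), the assumed coefficient counts along x and y."""
    limit = 5 if has_alpha else 7  # fewer coefficients when alpha is stored
    longest = max(w, h)
    # floor(x + 0.5) rounds halves up; Python's round() rounds halves to even.
    lx = max(1, math.floor(limit * w / longest + 0.5))
    ly = max(1, math.floor(limit * h / longest + 0.5))
    return lx, ly


lx, ly = quantized_dims(1600, 1000)  # a 1.6:1 landscape
print(lx, ly, lx / ly)
```

A 500×700 portrait comes out as (5, 7) under the same rule, which matches the 5:7 example in the text.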
Command-line
Installing the package exposes a thash command (equivalent to python -m thash):
# --- Encoding: print a hash for each input ---
thash photo.jpg # base64 hash, one per line
thash --format hex photo.jpg
thash --format bytes photo.jpg
thash photo.jpg cover.png hero.webp # multi-file: "path<TAB>hash" per line
thash --target-size 64 photo.jpg # trade quality for encoding speed
# --- Rendering: save a placeholder preview PNG ---
thash photo.jpg -o preview.png # encode + decode + save
thash photo.jpg -o preview.png --size 128 # cap the longer edge
thash "2dYJLJSBdoiAiHVoSHZzcBf4iA==" -o p.png # base64 hash → PNG (no source image needed)
thash d9d6092c94817688808875684876737017f888 -o p.png # hex hash → PNG
thash a.jpg b.jpg "2dYJ...==" -o out/ # multi input → directory, auto-named
The CLI uses the high-level encode() / thumb_hash_to_rgba() APIs. It needs Pillow for decoding images / writing PNG previews; NumPy is optional (only accelerates the encode / decode). Install with pip install thash[pillow] for the CLI or [all] for the fast path too. Hash inputs are auto-detected: hex strings (even length, hex alphabet) are tried first, then base64 (standard and URL-safe).
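The detection order described above (hex first, then standard or URL-safe base64) could look roughly like the sketch below. It is an illustration written against the standard library only, not the CLI's actual code; parse_hash is a hypothetical name, and accepting base64 with stripped padding is an extra assumption.

```python
import base64
import binascii
import string

_HEX_CHARS = set(string.hexdigits)


def parse_hash(text: str) -> bytes:
    """Decode a hash string, trying hex first and then base64."""
    s = text.strip()
    if s and len(s) % 2 == 0 and all(c in _HEX_CHARS for c in s):
        return bytes.fromhex(s)
    # Map the URL-safe alphabet onto the standard one and restore padding.
    std = s.replace("-", "+").replace("_", "/")
    std += "=" * (-len(std) % 4)
    try:
        return base64.b64decode(std, validate=True)
    except binascii.Error as exc:
        raise ValueError(f"not a hex or base64 hash: {text!r}") from exc


from_hex = parse_hash("d9d6092c94817688808875684876737017f888")
from_b64 = parse_hash("2dYJLJSBdoiAiHVoSHZzcBf4iA==")
print(from_hex == from_b64)
```

One consequence of trying hex first: a base64 string that happens to use only hex characters and has even length is read as hex.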
Tuning speed vs. quality
target_size controls the longer dimension of the image after thumbnail (spec max is 100). Smaller = faster, lower fidelity:
| target_size | DCT time | Visual quality |
|---|---|---|
| 100 (default) | ~125 μs | Reference / spec-compatible |
| 64 | ~85 μs | Indistinguishable in practice |
| 50 | ~75 μs | Fine for any placeholder use |
| 32 | ~65 μs | Colors correct, details blurred |
| 16 | ~45 μs | Average color + rough orientation only |
encode("photo.jpg", target_size=50) # ~4× fewer pixels for the DCT (about 1.7× faster per the table above); hash is still spec-valid
encode("photo.jpg", target_size=50, resize=False) # error if image is already > 50px
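To see what target_size means for the pixel count, here is a rough sketch of the size computation. It assumes a plain aspect-preserving downscale that never upscales; Pillow's own thumbnail rounding may differ by a pixel, and thumbnail_size is a hypothetical helper rather than a thash function.

```python
def thumbnail_size(w: int, h: int, target_size: int = 100) -> tuple[int, int]:
    """Scale (w, h) so the longer edge is at most target_size, keeping aspect."""
    longest = max(w, h)
    if longest <= target_size:
        return w, h  # already small enough; this sketch never upscales
    scale = target_size / longest
    return max(1, round(w * scale)), max(1, round(h * scale))


full = thumbnail_size(1920, 1080)        # default target_size=100
half = thumbnail_size(1920, 1080, 50)
print(full, half)
```

Halving target_size quarters the number of pixels fed to the DCT, although the measured DCT time in the table shrinks by less than that because of fixed per-call overhead.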
Note: For very large input images the bottleneck is usually PIL decode + resize, not the DCT. target_size only matters once your input is already small (e.g. a tensor in an ML pipeline). For batch processing many photos from disk, parallelize with concurrent.futures.ProcessPoolExecutor before reaching for GPU.
Backends
The package picks the NumPy backend at import time if available, otherwise falls back to a pure-Python reference implementation. You can force one explicitly:
encode(img, backend="numpy") # default, BLAS-accelerated matmul
encode(img, backend="pure") # reference Python, no deps
Backend availability is reflected by module flags:
from thash import has_numpy, has_pil
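For readers curious how availability flags of this kind are commonly derived, one standard-library approach is sketched below. This is an assumption about the general technique, not a description of how thash sets has_numpy or has_pil; module_available and pick_backend are hypothetical names, and the "numpy" / "pure" strings mirror the backend values shown above.

```python
import importlib.util


def module_available(name: str) -> bool:
    """True if the top-level module `name` can be found, without importing it."""
    return importlib.util.find_spec(name) is not None


def pick_backend() -> str:
    """Return "numpy" when NumPy is installed, otherwise "pure"."""
    return "numpy" if module_available("numpy") else "pure"


print(pick_backend())
```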
Backend comparison (random RGBA inputs, byte-identical output)
case size alpha pure numpy
---------------------------------------------------------
tiny-square 10x10 False 300 μs 41 μs
small-square 32x32 False 2.7 ms 66 μs
medium-square 64x64 False 11.4 ms 86 μs
max-square 100x100 False 26.8 ms 124 μs
landscape 100x56 False 11.7 ms 98 μs
max-square+a 100x100 True 28.2 ms 168 μs
HD-720p 1280x720 False — 48 ms
FHD-1080p 1920x1080 False — 208 ms
UHD-4K 3840x2160 False — 516 ms
NumPy is ~100–140× faster than the reference impl on spec-sized inputs (geometric mean ~88×, median ~137×). Three optimizations stack here:
- Cosine basis cached by (n, k): np.cos cost amortizes across calls with shared dimensions (common after thumbnail).
- P and Q channels combined into a single batched 3×3 matmul.
- float32 DCT: bandwidth halved, BLAS sgemm faster than dgemm; verified byte-identical on 490 random inputs across all spec shapes.
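The first optimization, caching the cosine basis by (n, k), can be illustrated without NumPy. The sketch below is a simplified one-dimensional version in pure Python with functools.lru_cache; it shows the caching idea only and is not the package's backend code. The function names are hypothetical.

```python
import math
from functools import lru_cache


@lru_cache(maxsize=None)
def cosine_basis(n: int, k: int) -> tuple[tuple[float, ...], ...]:
    """basis[c][x] = cos(pi / n * c * (x + 0.5)) for c < k and x < n."""
    return tuple(
        tuple(math.cos(math.pi / n * c * (x + 0.5)) for x in range(n))
        for c in range(k)
    )


def dct_1d(samples: list[float], k: int) -> list[float]:
    """First k DCT coefficients of samples, normalized by the sample count."""
    n = len(samples)
    basis = cosine_basis(n, k)  # computed once per (n, k), then reused
    return [sum(b * s for b, s in zip(row, samples)) / n for row in basis]


coeffs = dct_1d([1.0, 1.0, 1.0, 1.0], 3)
print(coeffs)
```

A constant signal yields a DC term equal to its value and (up to floating-point error) zero for the higher coefficients. Since thumbnailed inputs tend to share a handful of shapes, repeated calls hit the cache instead of recomputing cosines.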
The pure-Python fallback is kept so the package works with zero deps. Run uv run python benchmarks/run.py to reproduce.
Low-level API
The original byte-list API still works for callers who want to manage RGBA themselves:
from thash import rgba_to_thumb_hash, image_to_thumb_hash
# Flat list: [R, G, B, A, R, G, B, A, ...], length = 4 * w * h
hash_bytes = rgba_to_thumb_hash(width, height, flat_rgba_ints)
# Open a file via Pillow, thumbnail to ≤100x100, encode
hash_bytes = image_to_thumb_hash("photo.jpg")
rgba_to_thumb_hash automatically picks the NumPy backend if available, falling back to pure Python otherwise.
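If your pixels live as (R, G, B, A) tuples, the flat list expected by rgba_to_thumb_hash can be built as in this sketch. flatten_rgba is a hypothetical helper, and the 0 to 255 integer range is an assumption based on the byte-list description above.

```python
from itertools import chain


def flatten_rgba(
    pixels: list[tuple[int, int, int, int]], width: int, height: int
) -> list[int]:
    """Flatten row-major (R, G, B, A) tuples into [R, G, B, A, R, G, B, A, ...]."""
    flat = list(chain.from_iterable(pixels))
    if len(flat) != 4 * width * height:
        raise ValueError(
            f"expected {4 * width * height} values, got {len(flat)}"
        )
    if any(not 0 <= v <= 255 for v in flat):
        raise ValueError("channel values must be integers in 0..255")
    return flat


# A 2x1 image: one opaque red pixel, one half-transparent green pixel.
flat = flatten_rgba([(255, 0, 0, 255), (0, 255, 0, 128)], width=2, height=1)
print(flat)
```

The resulting list is what you would pass as the third argument, alongside the same width and height.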
Development
git clone https://github.com/Jannchie/thumbhash-py.git
cd thumbhash-py
uv sync --all-extras --all-groups # full dev env (deps + dev tools + bench)
uv run pytest # tests
uv run ruff check thash benchmarks # lint
uv run python benchmarks/run.py # benchmark suite
Credits
- Original ThumbHash algorithm: Evan Wallace, evanw/thumbhash
- Original Python port: Justin Forlenza, justinforlenza/thumbhash-py