Multi-dimensional image similarity comparison. Just as a prism decomposes light, imageprism decomposes similarity into independent dimensions.

imageprism

Compare two images across several kinds of similarity in one call. Runs on CPU, no PyTorch, no GPU, no API keys.

"Similar" is ambiguous. Two images can be the same file re-saved, the same kind of scene, the same specific object, or the same person, and those are different questions with different answers. imageprism scores each one as its own dimension and hands back the numbers together, so you choose the dimensions your problem actually needs. Everything runs on CPU through ONNX Runtime, NumPy, and Pillow.

from imageprism import ImagePrism, Dimension

prism = ImagePrism(dimensions=[Dimension.HASH, Dimension.SEMANTIC])
result = prism.compare("a.jpg", "b.jpg")
result.scores  # {"hash": 0.12, "semantic": 0.82}

That is the core of the public surface: one class, one method. The embed and dedup methods covered below live on the same class.

Doing this by hand usually means installing imagehash, a CLIP wrapper, and a face library, then reconciling three preprocessing pipelines and three output formats. imageprism puts them behind one API.

Install

pip install imageprism

ImagePrism() with no arguments uses hashing only, which needs no downloads and runs immediately. Adding a model-backed dimension downloads its model once, caches it locally, and works offline after that.

Dimensions

Dimension	Answers	Technique	Model
`hash`	Pixel-level duplicate?	pHash + dHash + aHash	none (pure algorithm)
`semantic`	Same concept or category?	CLIP cosine similarity	CLIP ViT-B/32 quantized, ~89MB
`instance`	Same specific object?	DINOv2 cosine similarity	DINOv2-small, ~87MB
`style`	Similar visual style?	MobileNetV2 feature similarity	MobileNetV2, ~14MB
`face`	Same person?	Face detection + embedding	UltraFace ~1.2MB + ArcFace ~137MB (swappable, see below)

Dimensions can be passed as enum members or plain strings: dimensions=["hash", "semantic"] works.

Reading the scores

Each score is a float, but the scales differ per dimension, so 0.5 does not mean "50% similar". Rough calibration, from the benchmarks below and spot checks:

hash: fraction of matching hash bits. Above ~0.9 is a near-duplicate. Unrelated images land around 0.5, not 0.
semantic: CLIP cosine similarity, which lives in a compressed range. Unrelated images score around 0.5; above ~0.75 usually means the same concept.
instance: DINOv2 cosine similarity. The same object re-photographed scores high (0.7+); unrelated images fall near 0.
face: ArcFace cosine similarity. On LFW the optimal same-person threshold is about 0.32. The score is None when no face is detected in either image, which is different from 0.0 (faces found, but different people).
style: MobileNetV2 feature cosine. Treat as a rough signal; it is not benchmarked yet.

Thresholds always depend on your data, so validate on a sample before hard-coding one.
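
To see why unrelated images land near 0.5 on hash rather than near 0, here is a minimal sketch of a "fraction of matching bits" score, together with a helper that keeps a missing face score (None) separate from a low one. The names hash_similarity and same_person are illustrative and are not part of imageprism; the 0.32 default is simply the LFW figure quoted above and should be re-validated on your own data.

```python
import random
from typing import Optional


def hash_similarity(a: int, b: int, bits: int = 64) -> float:
    """Fraction of bit positions where two `bits`-wide hashes agree."""
    mask = (1 << bits) - 1
    differing = bin((a ^ b) & mask).count("1")  # Hamming distance
    return 1.0 - differing / bits


def same_person(score: Optional[float], threshold: float = 0.32) -> Optional[bool]:
    """Return None when there is no score, rather than calling it a mismatch."""
    if score is None:
        return None
    return score >= threshold


# Independent random hashes agree on about half of their bits by chance alone,
# which is why "unrelated" sits near 0.5 on this scale.
rng = random.Random(0)
pairs = [(rng.getrandbits(64), rng.getrandbits(64)) for _ in range(2000)]
mean_unrelated = sum(hash_similarity(a, b) for a, b in pairs) / len(pairs)
```

Cosine-based dimensions behave differently, so a threshold tuned on one dimension should not be reused on another.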

Profiles

A profile picks a set of dimensions and blends them into one weighted score, keeping the per-dimension breakdown alongside.

from imageprism import ImagePrism, Profile

prism = ImagePrism(profile=Profile.COPYRIGHT)
result = prism.compare("original.jpg", "suspect.jpg")
result.weighted_score  # 0.58
result.scores          # {"hash": 0.51, "instance": 0.34, "semantic": 0.82}

There are six profiles: ecommerce, copyright, dedup, visual_search, identity, and forgery. The last two use the face dimension, so read the licensing note below before relying on them.

Custom weights and per-dimension config

from imageprism import ImagePrism, Dimension, HashConfig

prism = ImagePrism(
    weights={Dimension.HASH: 0.6, Dimension.SEMANTIC: 0.4},
    config={Dimension.HASH: HashConfig(algorithms=("phash",), hash_size=16)},
)

Weights are normalized to sum to 1, so relative values are all that matter. A dimension that cannot score a pair (face with no face detected) contributes 0 to the weighted score.
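
The blending rule above can be sketched in a few lines. This is an illustration of the described behavior, not imageprism's actual code: the function name weighted_score is hypothetical, and it assumes that an unscored dimension keeps its weight in the normalization while adding 0 to the sum.

```python
from typing import Dict, Optional


def weighted_score(scores: Dict[str, Optional[float]], weights: Dict[str, float]) -> float:
    """Blend per-dimension scores using weights normalized to sum to 1."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    blended = 0.0
    for name, weight in weights.items():
        score = scores.get(name)
        if score is None:
            # Assumption: the dimension adds 0, but its weight still counts
            # toward the normalization.
            continue
        blended += (weight / total) * score
    return blended


scores = {"hash": 0.12, "semantic": 0.82}
a = weighted_score(scores, {"hash": 0.6, "semantic": 0.4})
b = weighted_score(scores, {"hash": 3, "semantic": 2})   # same ratio, same result
c = weighted_score({"hash": 0.8, "face": None}, {"hash": 1, "face": 1})
```

Under that assumption, a pair with no detectable face can never reach a high blended score on a face-heavy weighting, which is worth keeping in mind when picking a threshold.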

Embeddings and caching

You can pull embeddings out to store in your own index. Repeated comparisons reuse them: the cache is keyed on pixel content, so comparing one image against many others embeds it only once.

emb = prism.embed("a.jpg")          # {"hash": np.array([...]), "semantic": np.array([...])}
prism.compare("a.jpg", "b.jpg")     # a.jpg is embedded here
prism.compare("a.jpg", "c.jpg")     # a.jpg comes from the cache
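
As a rough picture of what "keyed on pixel content" means, the sketch below hashes the raw pixel bytes and uses the digest as the cache key, so two copies of the same image share one entry. ContentKeyedCache and toy_embed are hypothetical stand-ins, and SHA-256 is an assumed choice of key function; the library's real cache may work differently.

```python
import hashlib
from typing import Callable, Dict, List


class ContentKeyedCache:
    """Caches embeddings under a digest of the pixel bytes, not the file path."""

    def __init__(self, embed_fn: Callable[[bytes], List[float]]) -> None:
        self._embed_fn = embed_fn
        self._store: Dict[str, List[float]] = {}
        self.misses = 0  # number of times the embedding function actually ran

    def get(self, pixels: bytes) -> List[float]:
        key = hashlib.sha256(pixels).hexdigest()
        if key not in self._store:
            self.misses += 1
            self._store[key] = self._embed_fn(pixels)
        return self._store[key]


def toy_embed(pixels: bytes) -> List[float]:
    """Stand-in for a model call: a one-element 'embedding' (mean pixel value)."""
    return [sum(pixels) / len(pixels)]


cache = ContentKeyedCache(toy_embed)
image_a = bytes([10, 20, 30, 40])
image_a_copy = bytes([10, 20, 30, 40])    # same pixels, e.g. loaded from another path
image_b = bytes([200, 210, 220, 230])

first = cache.get(image_a)
second = cache.get(image_a_copy)          # served from the cache
third = cache.get(image_b)                # different content, new entry
```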

Batch dedup

dedup embeds each image once and groups near-duplicates, keeping one representative per group. A typical use is trimming a video down to its distinct frames before running something expensive on each one.

from imageprism import ImagePrism, Dimension

# frames pulled from a video, in order
frames = ["frame_0001.jpg", "frame_0002.jpg", "frame_0003.jpg"]

prism = ImagePrism(dimensions=[Dimension.HASH])
result = prism.dedup(frames, threshold=0.9)

result.unique                     # indices of the distinct frames
result.labels                     # for each frame, the representative it was grouped under
distinct = [frames[i] for i in result.unique]

Each image is embedded once, then compared against the representatives kept so far, so the model work stays linear in the number of images. There is no approximate index yet, so on a large set of mostly-distinct images the comparison step grows quadratically.
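
The greedy pass described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the library's implementation: greedy_dedup and bit_similarity are hypothetical names, an item joins the first representative that clears the threshold, and labels hold the index of that representative.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def greedy_dedup(
    items: Sequence[T],
    similarity: Callable[[T, T], float],
    threshold: float,
) -> Tuple[List[int], List[int]]:
    """Return (unique, labels): representative indices and, per item, its representative."""
    unique: List[int] = []
    labels: List[int] = []
    for i, item in enumerate(items):
        for rep in unique:
            # Compare only against representatives kept so far.
            if similarity(items[rep], item) >= threshold:
                labels.append(rep)
                break
        else:
            # No representative was close enough: this item starts a new group.
            unique.append(i)
            labels.append(i)
    return unique, labels


def bit_similarity(a: int, b: int, bits: int = 8) -> float:
    """Fraction of matching bits between two toy 8-bit hashes."""
    return 1.0 - bin((a ^ b) & ((1 << bits) - 1)).count("1") / bits


# Toy "frames" represented directly by 8-bit hashes.
frames = [0b11110000, 0b11110001, 0b00001111, 0b11110000]
unique, labels = greedy_dedup(frames, bit_similarity, threshold=0.85)
```

When every item is distinct, the inner loop visits all earlier representatives, which is where the quadratic cost comes from.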

The right threshold depends on the dimension: around 0.9 on hashing catches re-encodes and small edits, while a lower value on semantic groups by content. Configure a profile or weights instead of a single dimension to dedup on a blended score.

Face and model licensing

Face works out of the box, with one caveat. It detects the largest face with UltraFace (MIT) and embeds it with ArcFace by default. Those default ArcFace weights have no clear commercial license, because like most high-accuracy face models they trace back to research-only datasets. The first time you run the face dimension, imageprism prints a warning.

For commercial use, bring your own embedding model:

from imageprism import ImagePrism, Dimension, FaceConfig

prism = ImagePrism(
    dimensions=[Dimension.FACE],
    config={Dimension.FACE: FaceConfig(embed_repo="your-org/your-model", embed_file="model.onnx")},
)

The model needs to accept a 112x112 RGB face crop. Common choices are FaceX (Apache-2.0), InsightFace buffalo_l (MIT code, but the weights need a commercial license), or one you train yourself. imageprism bundles no face weights in the package itself (the defaults are downloaded on first use), so choosing a model you have the rights to use is up to you.

Benchmarks

The numbers below reproduce with the scripts in benchmarks/.

Hashing, on 200 LFW images under 15 transforms (JPEG, resize, crop, rotation, blur, noise, flip, brightness, contrast):

Config	AUC	Accuracy
default (pHash + dHash + aHash, mean)	0.919	0.885
aHash only	0.937	0.889
dHash only	0.900	0.870
pHash only	0.875	0.863

JPEG, resize, blur, noise, brightness, and contrast all sit near 1.0 AUC. The weak points are a 50% center crop (about 0.40) and a horizontal flip (about 0.59).

Semantic, retrieval on the CIFAR-100 test set (1000 images, 100 classes):

Metric	Score
Recall@1	0.44
Recall@5	0.70
Recall@10	0.80
Recall@20	0.88

CIFAR-100 images are 32x32 px and are upscaled to 224x224 before they reach CLIP, so treat these numbers as a floor rather than a ceiling.
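
For readers unfamiliar with the metric, here is one common way to compute Recall@k for retrieval: a query counts as a hit if any of its k nearest neighbors (excluding itself) shares its class. This is a sketch of that definition on toy 2-D vectors; the scripts in benchmarks/ may define or compute it differently, and the function names are illustrative.

```python
import math
from typing import List, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def recall_at_k(embeddings: List[Sequence[float]], labels: List[str], k: int) -> float:
    """Fraction of queries with at least one same-label item among their top-k neighbors."""
    hits = 0
    for q in range(len(embeddings)):
        # Rank every other item by cosine similarity to the query, best first.
        ranked = sorted(
            (i for i in range(len(embeddings)) if i != q),
            key=lambda i: cosine(embeddings[q], embeddings[i]),
            reverse=True,
        )
        if any(labels[i] == labels[q] for i in ranked[:k]):
            hits += 1
    return hits / len(embeddings)


# Toy data: the last "cat" sits closer to the dogs, so it misses at k=1.
embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9], [0.6, 0.8]]
labels = ["cat", "cat", "dog", "dog", "cat"]
r1 = recall_at_k(embeddings, labels, k=1)
r3 = recall_at_k(embeddings, labels, k=3)
```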

Face, LFW verification over 6000 pairs: 0.963 AUC, 0.909 accuracy, 0.726 TAR at FAR=1%. Well-aligned ArcFace reaches roughly 0.998 accuracy; the gap comes from the plain crop-and-resize alignment described below.
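
"TAR at FAR=1%" can be read as: pick the score threshold that lets through at most 1% of different-person pairs, then report the fraction of same-person pairs that still pass. The sketch below implements that reading on synthetic scores. It is an assumption about the metric's definition, not a copy of the benchmark script, and tar_at_far is a hypothetical name.

```python
from typing import List, Tuple


def tar_at_far(genuine: List[float], impostor: List[float], target_far: float) -> Tuple[float, float, float]:
    """Return (TAR, achieved FAR, threshold), accepting a pair when score > threshold."""
    ranked = sorted(impostor, reverse=True)
    allowed = int(target_far * len(ranked))  # false accepts we can tolerate
    # Setting the threshold at the (allowed + 1)-th highest impostor score
    # means at most `allowed` impostor pairs score strictly above it.
    threshold = ranked[allowed] if allowed < len(ranked) else float("-inf")
    tar = sum(s > threshold for s in genuine) / len(genuine)
    far = sum(s > threshold for s in impostor) / len(impostor)
    return tar, far, threshold


# Synthetic scores: 100 different-person pairs and 20 same-person pairs.
impostor_scores = [i / 200 for i in range(100)]       # 0.000 .. 0.495
genuine_scores = [i / 200 for i in range(90, 110)]    # 0.450 .. 0.545
tar, far, threshold = tar_at_far(genuine_scores, impostor_scores, target_far=0.01)
```

A low TAR at a strict FAR alongside a high AUC, as in the figures above, typically means the two score distributions overlap in the tail even though they separate well on average.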

Instance and style are not benchmarked yet.

Limitations

Dedup is greedy and brute-force. It embeds each image once, but the comparison step has no approximate index, so a large set of mostly-distinct images scales quadratically. There is no corpus-scale similarity search yet; a FAISS-backed index is the planned next step.
Hashing handles JPEG, resize, blur, noise, and brightness almost perfectly, but a 50% center crop drops it to about 0.40 AUC and a horizontal flip to about 0.59.
The style dimension uses MobileNetV2 features rather than gram matrices on intermediate layers, so it is a rough signal and is not benchmarked yet.
Profile weights are sensible defaults, not values tuned on data.
Face alignment is a plain crop and resize with no landmark step, which puts LFW accuracy near 91% against roughly 99.8% for well-aligned ArcFace. It works, but it is not state of the art.
A single ImagePrism instance is not thread-safe; the embedding cache is unsynchronized. Use one instance per thread.
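
One way to follow the "one instance per thread" advice is to build instances lazily through threading.local. The PerThread wrapper and the FakePrism stand-in below are hypothetical and exist only so the sketch runs without imageprism installed; in real code the factory would be something like lambda: ImagePrism(dimensions=["hash"]).

```python
import threading
from typing import Callable, Dict, Generic, TypeVar

T = TypeVar("T")


class PerThread(Generic[T]):
    """Lazily builds one object per thread using `factory`."""

    def __init__(self, factory: Callable[[], T]) -> None:
        self._factory = factory
        self._local = threading.local()

    def get(self) -> T:
        # Attributes on threading.local are private to the calling thread.
        if not hasattr(self._local, "value"):
            self._local.value = self._factory()
        return self._local.value


class FakePrism:
    """Stand-in for ImagePrism with its own unsynchronized cache."""

    def __init__(self) -> None:
        self.cache: Dict[str, list] = {}


prisms = PerThread(FakePrism)
seen: Dict[str, FakePrism] = {}
stable: Dict[str, bool] = {}


def worker(name: str) -> None:
    first = prisms.get()
    seen[name] = first
    stable[name] = prisms.get() is first  # same thread gets the same instance


threads = [threading.Thread(target=worker, args=(f"t{i}",)) for i in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

The trade-off is that each thread keeps its own embedding cache, so an image compared from several threads is embedded once per thread.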

When to use something else

If you need only one kind of similarity, reach for the specialized tool: imagehash for perceptual hashing, CLIP directly for semantic search, insightface for faces. imageprism is worth it when you need two or more of these behind one interface. It saves the integration work rather than trying to beat any of those libraries at their single job.

License

MIT, see LICENSE. Model weights download from their original sources under their own licenses.


This version

0.1.0

Jul 5, 2026
