A Python library for image similarity analysis using Image Encoders and Neural Networks

Maintainers

MechaCritter

pyvisim

Welcome to `pyvisim`!

pyvisim is a Python library for computing image similarity with Fisher Vector and VLAD encoders. Support for Siamese neural networks is planned.

Why pyvisim
Installation
Pretrained Models
Contributing
Get in Touch
TODO
License
References

For a technical deep-dive into the library internals, see the developer documentation.

Status

[!WARNING] This project is still in early development, so the API may change at any time. Changes are announced with a deprecation notice, but the removal may follow shortly afterwards. Feel free to use it in development environments, but I recommend against using it in production.

The first stable release will be tagged v1.0.0 and is expected around the end of August 2026.

Why `pyvisim`?

pyvisim is designed to provide a simple and efficient way to compare images.

Quick Start

With just a few lines of code, you can compute the similarity score between two images using the VLAD encoder:

Example: Compute Similarity Score Using VLAD

from pyvisim.encoders import VLADEncoder, PretrainedVLAD
from pyvisim.datasets import OxfordFlowerDataset

# Load images from the Oxford Flower Dataset. The images must be NumPy arrays.
dataset = OxfordFlowerDataset()
image1, *_ = dataset[0]
image2, *_ = dataset[1]

# Load a bundled pretrained VLAD encoder (RootSIFT features, k=256).
# The feature extractor and similarity metric come with it.
encoder = VLADEncoder.from_pretrained(PretrainedVLAD.OXFORD102_K256_ROOTSIFT)

# Compute the similarity score. By default, cosine similarity is used.
similarity_score = encoder.similarity_score(image1, image2)

print(f"Similarity Score: {similarity_score}")

By default, the encoder uses cosine similarity. To use a different metric, set similarity_func to the metric's name. Supported names are "cosine", "euclidean", "l1" and "manhattan":

encoder = VLADEncoder.from_pretrained(PretrainedVLAD.OXFORD102_K256_ROOTSIFT)
encoder.similarity_func = "euclidean"
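
To make the metric names concrete, here is a minimal pure-Python sketch of the underlying quantities. The function names are illustrative and are not part of pyvisim, and how pyvisim converts a distance into a similarity score is not covered here. Mathematically, "l1" and "manhattan" name the same distance.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen for this sketch: zero vectors match nothing
    return dot / (norm_a * norm_b)


def euclidean_distance(a, b):
    """Straight-line (L2) distance; smaller means more similar."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan_distance(a, b):
    """Sum of absolute differences (L1); smaller means more similar."""
    return sum(abs(x - y) for x, y in zip(a, b))


# Two vectors pointing in the same direction but with different lengths.
u = [1.0, 0.0, 2.0]
v = [2.0, 0.0, 4.0]

print("cosine similarity: ", cosine_similarity(u, v))
print("euclidean distance:", euclidean_distance(u, v))
print("manhattan distance:", manhattan_distance(u, v))
```

Cosine similarity ignores vector length, while the two distances do not, which is one reason the choice of metric can change a ranking.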

A fitted encoder can be saved to a .encoder file and restored later:

path = encoder.save_to_disk("vlad_oxford102")  # writes vlad_oxford102.encoder
encoder = VLADEncoder.load_from_disk(path)

You can also visit the introduction notebook for more examples.

I also provide notebooks for several use cases. Feel free to check them out, and let me know if you have any suggestions or questions!

Image Retrieval
- Retrieve the top-k most similar images from a dataset.
- Use encoding methods like VLAD or Fisher Vectors to quickly find the most relevant matches. See this Jupyter notebook for an example.
- Example use: Building a fast image search engine for photo management software.
Deep Learning Embeddings
- Generate VLAD or Fisher vectors from neural network embeddings, e.g., VGG16 or other models.
- Enhance your deep learning pipeline by leveraging traditional encoding methods on top of CNN features.
Image Clustering
- Cluster images based on their similarities to group them by category or content. An example and benchmarking can be found in this notebook.
- Useful for organizing unlabeled data or generating pseudo-labels for further training.
Pipeline for Combining Multiple Encoders
- Chain various encoders in a single pipeline. An example can be found in this notebook.
- Achieve more robust similarity metrics by blending different feature representations.
Siamese Network (Coming Soon!)
- Train a neural network to learn a similarity function directly from pairs/triples of images.
- Possible use cases include face recognition, signature verification, or any image-based identity matching.
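
For the Image Retrieval use case above, the ranking step can be sketched with NumPy alone. This is a rough illustration that assumes every image has already been encoded into a fixed-length vector. The function name `top_k_similar` and the random stand-in data are hypothetical and not part of pyvisim.

```python
import numpy as np


def top_k_similar(query, database, k=3):
    """Return indices and cosine similarities of the k rows closest to `query`."""
    query = np.asarray(query, dtype=np.float64)
    database = np.asarray(database, dtype=np.float64)
    eps = 1e-12  # guards against division by zero for all-zero vectors

    # L2-normalise so that a dot product equals the cosine similarity.
    q = query / (np.linalg.norm(query) + eps)
    db = database / (np.linalg.norm(database, axis=1, keepdims=True) + eps)

    scores = db @ q
    k = min(k, len(scores))
    order = np.argsort(-scores)[:k]  # indices sorted by descending similarity
    return order, scores[order]


# Stand-in data: 100 "encoded images" with 64 dimensions each.
rng = np.random.default_rng(0)
encodings = rng.normal(size=(100, 64))

# The query is a slightly perturbed copy of image 42.
query = encodings[42] + 0.01 * rng.normal(size=64)

indices, scores = top_k_similar(query, encodings, k=5)
print(indices, scores)
```

Sorting the full score vector is fine for small datasets. For large collections, a partial sort such as `np.argpartition` or an approximate nearest-neighbour index would be the usual next step.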

Installation

To use the library, you can simply install it via pip:

pip install pyvisim

or clone the repository and install it locally:

git clone https://github.com/MechaCritter/Python-Visual-Similarity.git
cd Python-Visual-Similarity
pip install .

Note that the notebooks are only available if you clone the repository.
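
To confirm which version ended up in your environment, a small standard-library check like the following can help. The helper name `installed_version` is only an illustration.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of `package`, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints the version string if pyvisim is installed, otherwise None.
print(installed_version("pyvisim"))
```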

All experiments in this project were run on the Oxford Flower Dataset [7], for which I have created a custom dataset class. To use this class, import it as follows:

from pyvisim.datasets import OxfordFlowerDataset

For more details on the dataset, please refer to the documentation.

Pretrained Models

pyvisim ships ready-to-use pretrained encoders trained on the Oxford-102 flower dataset. Each one is a bundled .encoder file that already includes the right feature extractor and the cosine similarity metric, so loading one gives you a working encoder in a single line:

from pyvisim.encoders import (
    VLADEncoder,
    FisherVectorEncoder,
    PretrainedVLAD,
    PretrainedFisher,
)

vlad = VLADEncoder.from_pretrained(PretrainedVLAD.OXFORD102_K256_ROOTSIFT)
fisher = FisherVectorEncoder.from_pretrained(PretrainedFisher.OXFORD102_K256_VGG16_PCA)

All clustering models were trained with k=256. The choice of k follows the paper [5], where the authors tested k=32, 64, 128, 256, 512, and so on. Since higher values would take too long to train, I chose k=256 as a balance between performance and computational cost. Variants ending in _PCA use PCA to halve the feature dimensions before clustering.
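
To illustrate what k controls, here is a minimal NumPy sketch of the textbook VLAD formulation described in [5]: each local descriptor is assigned to its nearest centroid, the residuals are summed per centroid, and the result is flattened and L2-normalised. It is a sketch of the general technique, not pyvisim's implementation, and `vlad_encode` is a hypothetical name. Additional steps that are common in practice, such as power normalisation, are omitted.

```python
import numpy as np


def vlad_encode(descriptors, centroids):
    """Aggregate (n, d) local descriptors into a (k * d,) VLAD vector."""
    descriptors = np.asarray(descriptors, dtype=np.float64)  # shape (n, d)
    centroids = np.asarray(centroids, dtype=np.float64)      # shape (k, d)
    k, d = centroids.shape

    # Squared distance from every descriptor to every centroid, shape (n, k).
    # This builds an (n, k, d) array, which is fine for a small illustration.
    diffs = descriptors[:, None, :] - centroids[None, :, :]
    nearest = (diffs ** 2).sum(axis=2).argmin(axis=1)

    # Sum the residuals (descriptor minus centroid) per assigned centroid.
    vlad = np.zeros((k, d))
    for i in range(k):
        assigned = descriptors[nearest == i]
        if len(assigned):
            vlad[i] = (assigned - centroids[i]).sum(axis=0)

    # Flatten and L2-normalise.
    vlad = vlad.ravel()
    norm = np.linalg.norm(vlad)
    return vlad / norm if norm > 0 else vlad


# Toy example: k=4 centroids and 50 descriptors with 8 dimensions each.
rng = np.random.default_rng(0)
centroids = rng.normal(size=(4, 8))
descriptors = rng.normal(size=(50, 8))

code = vlad_encode(descriptors, centroids)
print(code.shape)  # (32,)
```

In this formulation the output length is k times the descriptor dimension, so k=256 with 128-dimensional SIFT descriptors gives 256 × 128 = 32768 values. That growth is one reason larger k becomes costly.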

[!CAUTION] Deprecated: the old weights=KMeansWeights.X / weights=GMMWeights.X constructor argument is deprecated and will be removed in 1.0.0. Use from_pretrained() with the PretrainedVLAD/PretrainedFisher enums (or load_from_disk() with a .encoder file) instead. The enums above load the exact same trained models.

VLAD encoders (`PretrainedVLAD`)

Loaded with VLADEncoder.from_pretrained(...).

| Member | Feature Extractor | PCA Applied | Feature Dimensions |
| --- | --- | --- | --- |
| `OXFORD102_K256_VGG16_PCA` | Last Conv Layer (VGG16) | Yes | 257 |
| `OXFORD102_K256_VGG16` | Last Conv Layer (VGG16) | No | 514 |
| `OXFORD102_K256_ROOTSIFT_PCA` | RootSIFT features | Yes | 64 |
| `OXFORD102_K256_ROOTSIFT` | RootSIFT features | No | 128 |
| `OXFORD102_K256_SIFT_PCA` | SIFT features | Yes | 64 |
| `OXFORD102_K256_SIFT` | SIFT features | No | 128 |

Fisher Vector encoders (`PretrainedFisher`)

Loaded with FisherVectorEncoder.from_pretrained(...).

| Member | Feature Extractor | PCA Applied | Feature Dimensions |
| --- | --- | --- | --- |
| `OXFORD102_K256_VGG16_PCA` | Last Conv Layer (VGG16) | Yes | 257 |
| `OXFORD102_K256_VGG16` | Last Conv Layer (VGG16) | No | 514 |
| `OXFORD102_K256_ROOTSIFT_PCA` | RootSIFT features | Yes | 64 |
| `OXFORD102_K256_ROOTSIFT` | RootSIFT features | No | 128 |
| `OXFORD102_K256_SIFT_PCA` | SIFT features | Yes | 64 |
| `OXFORD102_K256_SIFT` | SIFT features | No | 128 |

Notes

Feature Extraction:
- Deep Features (VGG16): Feature maps from the last convolutional layer of VGG16. At each spatial location, the relative x and y coordinates are concatenated to the feature vector, resulting in 512 + 2 = 514 dimensions [6].
- SIFT: Scale-Invariant Feature Transform descriptors, which were the original features used for VLAD and Fisher Vector encoding [5].
- RootSIFT: A variant of SIFT with Hellinger kernel normalization [4].
Dimensionality Reduction:
- Models with _PCA in their names apply PCA to reduce the feature dimensions by half.
- The clustering models will learn from the transformed features after PCA is applied.
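
Two of the feature steps in these notes can be sketched in a few lines of NumPy. RootSIFT, as described in [4], L1-normalises each SIFT descriptor and takes the element-wise square root. The coordinate step appends each location's relative x and y position to its feature vector. Both function names are hypothetical, and scaling the coordinates to the range [0, 1] is an assumption made for this sketch, not a statement about how pyvisim does it.

```python
import numpy as np


def root_sift(descriptors, eps=1e-7):
    """Convert (n, d) non-negative SIFT descriptors to RootSIFT."""
    descriptors = np.asarray(descriptors, dtype=np.float64)
    # L1-normalise each row, then take the element-wise square root.
    l1 = np.abs(descriptors).sum(axis=1, keepdims=True) + eps
    return np.sqrt(descriptors / l1)


def append_relative_coordinates(feature_map):
    """Turn an (H, W, C) feature map into (H * W, C + 2) local features."""
    feature_map = np.asarray(feature_map, dtype=np.float64)
    h, w, c = feature_map.shape
    # Relative positions in [0, 1] along each axis (assumed scaling).
    ys, xs = np.meshgrid(
        np.linspace(0.0, 1.0, h), np.linspace(0.0, 1.0, w), indexing="ij"
    )
    coords = np.stack([xs, ys], axis=-1)  # shape (H, W, 2)
    stacked = np.concatenate([feature_map, coords], axis=-1)
    return stacked.reshape(h * w, c + 2)


rng = np.random.default_rng(0)

# Stand-in for five 128-dimensional SIFT descriptors (non-negative values).
sift_like = rng.random(size=(5, 128))
rs = root_sift(sift_like)
print(np.linalg.norm(rs, axis=1))  # each row has an L2 norm of about 1

# Stand-in for a convolutional feature map with 512 channels.
fmap = rng.random(size=(14, 14, 512))
local_features = append_relative_coordinates(fmap)
print(local_features.shape)  # (196, 514)
```

After the square root, each RootSIFT row has unit L2 norm, so Euclidean distance between RootSIFT vectors corresponds to the Hellinger kernel on the original descriptors.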

Contributing

We love contributions of all kinds—whether it’s suggesting new features, fixing bugs, or writing docs! Here’s how you can get involved:

- Fork this repository.
- Create a new branch for your changes.
- Open a pull request with a clear description of your idea or fix.

We welcome all feedback and hope to build a supportive community around pyvisim!

Get in Touch

If you have any questions or just want to say hi, feel free to:

- Open an issue on GitHub.
- Write me an email at vunhathuy234@gmail.com.
- Connect on LinkedIn to follow my work and share your thoughts.

TODO

The features below are planned for future releases:

- With v1.0.0, remove the deprecated weights constructor argument and the _CLUSTERING_TO_PCA_MAPPING internal variable, since they are no longer needed with the new from_pretrained() API.
- Implement the Siamese network.
- Add tensor sketch approximation and mutual information analysis for Fisher Vectors, following the paper by Weixia Zhang, Jia Yan, Wenxuan Shi, Tianpeng Feng, and Dexiang Deng [1].
- Add support for vision transformers to the DeepConvFeature class.

You are welcome to implement any of these features or suggest new ones!

License

This project is licensed under the terms of the MIT license.

References

[1] Weixia Zhang, Jia Yan, Wenxuan Shi, Tianpeng Feng, and Dexiang Deng, "Refining Deep Convolutional Features for Improving Fine-Grained Image Recognition," EURASIP Journal on Image and Video Processing, 2017.
[2] Relja Arandjelović and Andrew Zisserman, "All About VLAD," Department of Engineering Science, University of Oxford.
[3] E. Spyromitros-Xioufis, S. Papadopoulos, I. Kompatsiaris, G. Tsoumakas, and I. Vlahavas, "An Empirical Study on the Combination of SURF Features with VLAD Vectors for Image Search," Informatics and Telematics Institute, Center for Research and Technology Hellas, Thessaloniki, Greece; Department of Informatics, Aristotle University of Thessaloniki, Greece.
[4] Relja Arandjelović and Andrew Zisserman, "Three things everyone should know to improve object retrieval," Department of Engineering Science, University of Oxford.
[5] Hervé Jégou, Florent Perronnin, Matthijs Douze, Jorge Sánchez, Patrick Pérez, and Cordelia Schmid, "Aggregating Local Image Descriptors into Compact Codes," IEEE.
[6] Liangliang Wang and Deepu Rajan, "An Image Similarity Descriptor for Classification Tasks," J. Vis. Commun. Image R., vol. 71, pp. 102847, 2020.
[7] Oxford Flower Dataset.
