A Vision Transformer based image retrieval system

ViT Image Retrieval

A content-based image retrieval (CBIR) system written in Python that uses Vision Transformer (ViT) features and FAISS indexing. It provides both a graphical user interface (GUI) and a programmatic API for indexing images and searching for similar ones.

Features

  • Vision Transformer Features: Uses a ViT-B/16 model pre-trained on ImageNet for feature extraction
  • Fast Similarity Search: Uses FAISS IVF indexing for efficient similarity search
  • Cross-Platform Support: Works on Windows, macOS, and Linux
  • User-Friendly GUI:
    • Interactive interface for feature extraction and image search
    • Double-click or right-click to open images and containing folders
    • Progress tracking for batch operations
  • Multiple Image Format Support: Handles PNG, JPG, JPEG, and WebP formats
  • GPU Acceleration: Optional GPU support for faster processing when available
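
To make the idea behind similarity search concrete, here is a minimal sketch in plain Python. It ranks stored feature vectors by cosine similarity to a query vector using brute force. This is an illustration only and not the package's implementation, which relies on ViT features and a FAISS IVF index; the function names `cosine_similarity` and `top_k` and the toy 3-dimensional vectors are assumptions made for this example.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat zero vectors as dissimilar to everything
    return dot / (norm_a * norm_b)


def top_k(
    query: Sequence[float],
    features: Dict[str, Sequence[float]],
    k: int,
) -> List[Tuple[str, float]]:
    """Brute-force search: score every stored vector, keep the k best."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in features.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]


# Toy feature vectors (hypothetical); real ViT features are much longer.
toy_features = {
    "cat.jpg": [0.9, 0.1, 0.0],
    "dog.jpg": [0.7, 0.3, 0.1],
    "car.jpg": [0.0, 0.2, 0.9],
}
nearest = top_k([1.0, 0.0, 0.0], toy_features, k=2)
for name, score in nearest:
    print(f"{name}: {score:.3f}")
```

Brute force compares the query against every stored vector, which becomes slow for large collections. An inverted-file (IVF) index reduces that cost by searching only a subset of the vectors.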

Installation

Prerequisites

  • Python 3.8 or higher
  • pip package manager

Install from Source

  1. Clone the repository:
git clone https://github.com/bnsreenu/vit-image-retrieval.git
cd vit-image-retrieval
  2. Install the package:
pip install -e .

Install from PyPI

pip install vit-image-retrieval

Usage

GUI Application

Launch the application using either command:

vit
# or
vit-image-retrieval-gui

The GUI has two main tabs:

  1. Feature Extraction Tab:

    • Select a directory of images to index
    • Optionally provide an index name
    • Monitor progress through the progress bar
    • Save index for later use
  2. Image Retrieval Tab:

    • Load previously created index
    • Select query image
    • Set number of similar images to retrieve
    • View results with similarity scores
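
As a rough illustration of the first step in the Feature Extraction tab, the sketch below collects the image files in a directory that match the supported formats. It is not the package's own code: the names `SUPPORTED_EXTENSIONS`, `is_supported`, and `find_images` are hypothetical, and searching subdirectories recursively is an assumption.

```python
from pathlib import Path
from typing import List

# Extensions for the formats listed under Features (PNG, JPG, JPEG, WebP).
SUPPORTED_EXTENSIONS = {".png", ".jpg", ".jpeg", ".webp"}


def is_supported(path: Path) -> bool:
    """Case-insensitive check of the file extension."""
    return path.suffix.lower() in SUPPORTED_EXTENSIONS


def find_images(directory: str) -> List[Path]:
    """Return all supported image files under `directory`, sorted by path."""
    root = Path(directory)
    # rglob("*") walks subdirectories too; this is an assumption of the sketch.
    return sorted(p for p in root.rglob("*") if p.is_file() and is_supported(p))
```

Calling `find_images("path/to/image/directory")` returns a list of `Path` objects that could then be passed to a feature extractor one by one.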

Python API

from vit_image_retrieval import ImageRetrievalSystem

# Initialize the system
retrieval = ImageRetrievalSystem()

# Index a directory of images
retrieval.index_images("path/to/image/directory")

# Save the index for later use
retrieval.save("my_index.faiss", "my_metadata.json")

# Load existing index
retrieval = ImageRetrievalSystem(
    index_path="my_index.faiss",
    metadata_path="my_metadata.json"
)

# Search for similar images
results = retrieval.search("path/to/query/image.jpg", k=5)

# Process results
for path, similarity, metadata in results:
    print(f"Similar image: {path} (similarity: {similarity:.3f})")
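
A common follow-up is to keep only the matches above a similarity cutoff. The sketch below works on plain `(path, similarity, metadata)` tuples shaped like the results in the loop above, so it runs without the package installed. The helper `filter_results`, the sample data, and the assumption that a higher score means a closer match are all illustrative rather than part of the documented API.

```python
from typing import Any, Dict, List, Tuple

# Shape of one search result, mirroring the loop in the example above.
Result = Tuple[str, float, Dict[str, Any]]


def filter_results(results: List[Result], min_similarity: float) -> List[Result]:
    """Keep results whose similarity score is at least `min_similarity`."""
    return [result for result in results if result[1] >= min_similarity]


# Hypothetical results standing in for the output of a search call.
sample_results: List[Result] = [
    ("images/a.jpg", 0.93, {}),
    ("images/b.jpg", 0.71, {}),
    ("images/c.jpg", 0.42, {}),
]

kept = filter_results(sample_results, min_similarity=0.7)
for path, similarity, _metadata in kept:
    print(f"{path}: {similarity:.2f}")
```

A suitable cutoff depends on the image collection, so it is worth inspecting a few queries before fixing a value.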

Platform-Specific Notes

Windows

  • Supports direct file and folder opening through Windows Explorer
  • Uses native file selection dialogs

macOS

  • Uses native macOS commands for file operations
  • Finder integration for viewing files and folders
  • Requires no additional configuration

Linux

  • Automatically handles Qt platform plugin configurations
  • Uses system's default applications for file operations
  • Requires X11 or Wayland display server
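
The platform differences above can be sketched as a small helper that picks the command used to open a file or folder with the system's default handler. This is a sketch of one common approach, not the package's actual code: the function name `open_command` is hypothetical, and the choice of `explorer`, `open`, and `xdg-open` is an assumption about what each platform provides.

```python
import sys
from typing import List, Optional


def open_command(path: str, platform: Optional[str] = None) -> List[str]:
    """Build the command that opens `path` with the platform's default handler."""
    platform = platform or sys.platform
    if platform.startswith("win"):
        return ["explorer", path]   # Windows Explorer
    if platform == "darwin":
        return ["open", path]       # macOS
    return ["xdg-open", path]       # most Linux desktops


print(open_command("results/photo.jpg"))
```

The returned list can be passed to `subprocess.run(...)`. Returning the command instead of running it keeps the helper easy to test on any platform.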

Requirements

  • torch >= 2.0.0
  • torchvision >= 0.15.0
  • faiss-cpu >= 1.7.4 (or faiss-gpu for GPU support)
  • PyQt5 >= 5.15.0
  • Pillow >= 9.0.0
  • numpy >= 1.20.0
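
To see which of these dependencies are present in the current environment, a short script along the following lines may help. It only reports installed versions next to the stated minimums and does not compare them. The names `MINIMUM_VERSIONS`, `installed_version`, and `report` are hypothetical, and `faiss-cpu` is listed here even though `faiss-gpu` may be installed in its place.

```python
from importlib import metadata
from typing import Dict, Optional

# Minimum versions copied from the requirements list above.
MINIMUM_VERSIONS = {
    "torch": "2.0.0",
    "torchvision": "0.15.0",
    "faiss-cpu": "1.7.4",  # faiss-gpu is the alternative for GPU support
    "PyQt5": "5.15.0",
    "Pillow": "9.0.0",
    "numpy": "1.20.0",
}


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def report() -> Dict[str, Optional[str]]:
    """Map each required distribution to its installed version (or None)."""
    return {name: installed_version(name) for name in MINIMUM_VERSIONS}


for name, found in report().items():
    required = MINIMUM_VERSIONS[name]
    print(f"{name}: required >= {required}, installed: {found or 'not installed'}")
```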

Development

To contribute or modify:

  1. Fork the repository
  2. Create a new branch:
git checkout -b feature-name
  3. Make changes and test
  4. Submit a pull request

Common Issues and Solutions

  1. Linux Qt Plugin Error: Automatically handled by removing conflicting Qt plugin paths
  2. GPU Memory Issues: Use CPU version if encountering GPU memory problems
  3. File Permission Errors: Check user permissions for the working directory
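
For readers curious about what removing conflicting Qt plugin paths can look like, here is a minimal sketch. It is an assumption about the general technique, not a copy of the package's code: the helper name `clear_qt_plugin_paths` is hypothetical, and `QT_QPA_PLATFORM_PLUGIN_PATH` and `QT_PLUGIN_PATH` are chosen because they are the Qt environment variables that control plugin lookup.

```python
import os
from typing import List, MutableMapping

# Qt environment variables that can point at a conflicting plugin directory.
QT_PLUGIN_VARS = ("QT_QPA_PLATFORM_PLUGIN_PATH", "QT_PLUGIN_PATH")


def clear_qt_plugin_paths(env: MutableMapping[str, str]) -> List[str]:
    """Remove Qt plugin path variables from `env`; return the names removed."""
    removed = []
    for name in QT_PLUGIN_VARS:
        if name in env:
            del env[name]
            removed.append(name)
    return removed


# Demonstration on a copy, so the real environment is left untouched.
demo_env = dict(os.environ)
demo_env["QT_QPA_PLATFORM_PLUGIN_PATH"] = "/some/conflicting/plugins"
print(clear_qt_plugin_paths(demo_env))
```

In a real application the function would be called with `os.environ` before the Qt application object is created, so that Qt falls back to its default plugin location.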

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

  • Vision Transformer implementation from torchvision
  • FAISS library from Facebook Research
  • PyQt5 for the graphical interface

Citation

If you use this software in your research, please cite:

@software{vit_image_retrieval,
  title = {ViT Image Retrieval},
  author = {Dr. Sreenivas Bhattiprolu},
  year = {2024},
  url = {https://github.com/bnsreenu/vit-image-retrieval}
}

Support

For support, please:

  1. Check the issues page for existing solutions
  2. Create a new issue with:
    • Your operating system
    • Python version
    • Complete error message
    • Steps to reproduce the problem
