A lightweight Python library for generating image embeddings with semantic search

Imgemb

A lightweight Python library for generating and comparing image embeddings. It offers several embedding methods (average color, grid-based, edge-based, and CLIP-based) and provides tools for image similarity search, clustering, and comparison.

Features

  • Multiple embedding methods:
    • Average Color: Simple RGB color averaging
    • Grid-based: Divides image into grid cells and computes color features
    • Edge-based: Uses Sobel edge detection and histogram features
    • CLIP-based: Semantic embeddings for natural language search
  • Command-line interface (CLI) for easy usage
  • Normalization options for embeddings
  • Tools for finding similar images in a directory
  • Support for batch processing
  • Semantic Search:
    • Natural language queries for image search
    • Zero-shot image classification
    • Cross-modal understanding between text and images
    • GPU acceleration support

Installation

From PyPI (Recommended)

pip install imgemb

From Source

git clone https://github.com/aryanraj2713/image_embeddings.git
cd image_embeddings
pip install -e ".[dev]"  # Install with development dependencies

Quick Start

Using as a Python Library

from imgemb import ImageEmbedder

# Initialize embedder
embedder = ImageEmbedder(
    method='grid',      # 'average_color', 'grid', or 'edge'
    grid_size=(4, 4),   # For grid method
    normalize=True      # Whether to normalize embeddings
)

# Generate embedding for a single image
embedding = embedder.embed_image('path/to/image.jpg')

# Compare two images
similarity = embedder.compare_images('image1.jpg', 'image2.jpg')

# Find similar images in a directory
similar_images = embedder.find_similar_images(
    'query.jpg',
    'path/to/image/directory',
    top_k=5
)
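
The README does not say which similarity measure compare_images and find_similar_images use. As a minimal sketch, assuming cosine similarity over the embedding vectors, ranking a set of candidates against a query could look like the following. The helper names cosine_similarity and rank_by_similarity are hypothetical and are not part of imgemb.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between two 1-D vectors (1.0 means same direction)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    # Guard against zero vectors, for which the cosine is undefined.
    return float(a @ b / denom) if denom > 0 else 0.0

def rank_by_similarity(query, candidates, top_k=5):
    """Return the top_k (name, score) pairs, highest similarity first.

    `candidates` maps a name (for example a file path) to its embedding.
    This is an illustrative helper, not the library's implementation.
    """
    scored = [(name, cosine_similarity(query, emb)) for name, emb in candidates.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]
```

When embeddings are already L2-normalized (as normalize=True suggests), the cosine reduces to a plain dot product.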

Semantic Search

from imgemb import SemanticSearcher

# Initialize searcher
searcher = SemanticSearcher()

# Index a directory of images
searcher.index_directory("path/to/images")

# Search using natural language
results = searcher.search("a photo of a dog playing in the park", top_k=5)

# Print results
for path, score in results:
    print(f"{path}: {score:.3f}")
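
The zero-shot classification listed under Features can be pictured the same way: the image embedding is compared with the embedding of each candidate label text, and the similarities are turned into probabilities. The sketch below is only an illustration under stated assumptions. It takes precomputed vectors as input (in practice they would come from a CLIP-style image and text encoder), the function names are hypothetical, and the scale value of 100 is an assumed temperature rather than something the README specifies.

```python
import numpy as np

def l2_normalize(vectors):
    """Scale each vector (along the last axis) to unit length."""
    vectors = np.asarray(vectors, dtype=float)
    norms = np.linalg.norm(vectors, axis=-1, keepdims=True)
    return vectors / np.where(norms == 0, 1.0, norms)

def zero_shot_probabilities(image_embedding, label_embeddings, scale=100.0):
    """Softmax over scaled cosine similarities between one image and N labels.

    image_embedding: shape (D,); label_embeddings: shape (N, D).
    `scale` is an assumed temperature, not a documented imgemb parameter.
    """
    image = l2_normalize(image_embedding)
    labels = l2_normalize(label_embeddings)
    logits = scale * (labels @ image)
    logits = logits - logits.max()  # subtract the max for numerical stability
    weights = np.exp(logits)
    return weights / weights.sum()
```

The label with the highest probability is the predicted class. A larger scale makes the distribution sharper, and a smaller one makes it flatter.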

Using the Command Line Interface

  1. Compare two images:
     imgemb compare image1.jpg image2.jpg --method grid --grid-size 4 4
  2. Generate embeddings for images:
     imgemb generate path/to/images/ --output embeddings.json --method edge
  3. Find similar images:
     imgemb find-similar query.jpg image/directory/ -k 5 --method grid

Embedding Methods

Average Color

Computes the mean RGB values over the entire image, producing a three-value embedding. Simple and fast, but it discards all spatial information, so it suits only coarse color-based similarity.

embedder = ImageEmbedder(method='average_color')
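
To make the idea concrete, here is a minimal NumPy sketch of an average-color embedding. It assumes the image is already loaded as an H x W x 3 array. The function name is hypothetical, and the optional L2 normalization is an assumption about what normalize=True might do, not the library's confirmed behavior.

```python
import numpy as np

def average_color_embedding(image, normalize=True):
    """Mean of each color channel over all pixels of an H x W x 3 array."""
    pixels = np.asarray(image, dtype=float).reshape(-1, 3)
    embedding = pixels.mean(axis=0)  # one value per channel
    if normalize:
        # Assumed normalization: scale to unit L2 length (skip all-black images).
        norm = np.linalg.norm(embedding)
        if norm > 0:
            embedding = embedding / norm
    return embedding
```

The channel order of the result follows the input array. Images loaded with OpenCV are in BGR order by default.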

Grid-based

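Divides the image into a grid and computes mean RGB values for each cell. Because each cell contributes its own color, this captures the spatial color distribution that a single average misses. For example, a 4 x 4 grid has 16 cells, which at three values per cell gives 48 values.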

embedder = ImageEmbedder(method='grid', grid_size=(4, 4))
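
A rough sketch of the grid approach, again assuming an H x W x 3 NumPy array as input. The function name is hypothetical, and treating grid_size as (rows, columns) is an assumption, since the README does not state the order.

```python
import numpy as np

def grid_embedding(image, grid_size=(4, 4)):
    """Concatenate the mean color of each grid cell, row by row.

    Assumes the image has at least as many rows and columns as the grid.
    """
    image = np.asarray(image, dtype=float)
    rows, cols = grid_size  # assumed order: (rows, columns)
    height, width = image.shape[:2]
    # Cell boundaries; cells may differ by one pixel when sizes do not divide evenly.
    row_edges = np.linspace(0, height, rows + 1).astype(int)
    col_edges = np.linspace(0, width, cols + 1).astype(int)
    features = []
    for r in range(rows):
        for c in range(cols):
            cell = image[row_edges[r]:row_edges[r + 1], col_edges[c]:col_edges[c + 1]]
            features.append(cell.reshape(-1, 3).mean(axis=0))
    return np.concatenate(features)
```

The output length is rows x columns x 3, so finer grids preserve more layout at the cost of a longer vector.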

Edge-based

Applies Sobel edge detection and summarizes the result as histogram features. Good for capturing structural similarity, such as shapes and outlines, rather than color.

embedder = ImageEmbedder(method='edge')
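
The README does not describe exactly which histogram the edge method builds. As one possible reading, the sketch below applies the two 3 x 3 Sobel kernels to a grayscale image and bins the gradient magnitudes. It uses plain NumPy instead of OpenCV so that it runs without extra dependencies. The function names, the bin count, and the choice of a magnitude histogram are all assumptions.

```python
import numpy as np

SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
SOBEL_Y = SOBEL_X.T

def filter3x3(gray, kernel):
    """Slide a 3x3 kernel over a 2-D array (cross-correlation, no padding)."""
    height, width = gray.shape
    out = np.zeros((height - 2, width - 2))
    for i in range(3):
        for j in range(3):
            out += kernel[i, j] * gray[i:i + height - 2, j:j + width - 2]
    return out

def edge_histogram_embedding(gray, bins=8):
    """Normalized histogram of Sobel gradient magnitudes.

    Expects a 2-D grayscale array of at least 3 x 3 pixels.
    """
    gray = np.asarray(gray, dtype=float)
    gx = filter3x3(gray, SOBEL_X)  # responds to vertical edges
    gy = filter3x3(gray, SOBEL_Y)  # responds to horizontal edges
    magnitude = np.hypot(gx, gy)
    top = magnitude.max()
    # Use a non-empty range even for a completely flat image.
    hist, _ = np.histogram(magnitude, bins=bins, range=(0.0, top if top > 0 else 1.0))
    return hist / hist.sum()
```

Because the histogram is scaled by the strongest edge in each image, this particular variant compares the relative distribution of edge strengths, not their absolute values.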

Development

Setup Development Environment

# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install development dependencies
pip install -e ".[dev]"

Running Tests

pytest tests/ -v

Contributing

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

License

This project is licensed under the MIT License. See the LICENSE file for details.

Requirements

  • Python ≥ 3.8
  • OpenCV (opencv-python)
  • NumPy
  • Matplotlib
  • scikit-learn

Citation

If you use this library in your research, please cite:

@software{image_embeddings,
  title = {Image Embeddings: A Lightweight Library for Image Similarity},
  author = {Raj, Aryan},
  year = {2024},
  url = {https://github.com/aryanraj2713/image_embeddings}
}
