A lightweight Python library for generating image embeddings with semantic search
Imgemb
A lightweight Python library for generating and comparing image embeddings using various methods. This library provides tools for image similarity search, clustering, and comparison.
Features
- Multiple embedding methods:
  - Average Color: Simple RGB color averaging
  - Grid-based: Divides image into grid cells and computes color features
  - Edge-based: Uses Sobel edge detection and histogram features
  - CLIP-based: Semantic embeddings for natural language search
- Command-line interface (CLI) for easy usage
- Normalization options for embeddings
- Tools for finding similar images in a directory
- Support for batch processing
- Semantic Search:
  - Natural language queries for image search
  - Zero-shot image classification
  - Cross-modal understanding between text and images
- GPU acceleration support
Installation
From PyPI (Recommended)
pip install imgemb
From Source
git clone https://github.com/aryanraj2713/image_embeddings.git
cd image_embeddings
pip install -e ".[dev]" # Install with development dependencies
Quick Start
Using as a Python Library
from imgemb import ImageEmbedder

# Initialize embedder
embedder = ImageEmbedder(
    method='grid',      # 'average_color', 'grid', or 'edge'
    grid_size=(4, 4),   # For grid method
    normalize=True      # Whether to normalize embeddings
)

# Generate embedding for a single image
embedding = embedder.embed_image('path/to/image.jpg')

# Compare two images
similarity = embedder.compare_images('image1.jpg', 'image2.jpg')

# Find similar images in a directory
similar_images = embedder.find_similar_images(
    'query.jpg',
    'path/to/image/directory',
    top_k=5
)
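The text above does not say how the normalize option or the compare_images score is defined. As a rough illustration only, assuming embeddings are L2-normalized and compared by cosine similarity (an assumption, not confirmed by the library), the computation might look like the dependency-free sketch below. The helpers l2_normalize and cosine_similarity are hypothetical names and are not part of imgemb.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0:
        return list(vec)
    return [x / norm for x in vec]

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    a, b = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a, b))

# Toy average-color embeddings (assumed values, for illustration only).
reddish = [200.0, 30.0, 30.0]
darker_red = [100.0, 15.0, 15.0]
bluish = [30.0, 30.0, 200.0]

print(cosine_similarity(reddish, darker_red))  # same hue, different brightness
print(cosine_similarity(reddish, bluish))      # different hue
```

With normalization, the two red vectors score essentially 1.0 because only direction matters, not brightness; whether that is desirable depends on the use case.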
Semantic Search
from imgemb import SemanticSearcher
# Initialize searcher
searcher = SemanticSearcher()
# Index a directory of images
searcher.index_directory("path/to/images")
# Search using natural language
results = searcher.search("a photo of a dog playing in the park", top_k=5)
# Print results
for path, score in results:
    print(f"{path}: {score:.3f}")
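Conceptually, CLIP-style search places text and images in one shared vector space and ranks images by their similarity to the query vector. The sketch below shows only that ranking step, with toy 3-dimensional vectors standing in for real model outputs. The function rank_by_similarity and the sample index are hypothetical; how SemanticSearcher stores and scores its index internally is not documented here.

```python
import heapq
import math

def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def rank_by_similarity(query_vec, index, top_k=5):
    """Return the top_k (path, score) pairs, highest score first."""
    scored = ((path, cosine(query_vec, vec)) for path, vec in index.items())
    return heapq.nlargest(top_k, scored, key=lambda item: item[1])

# Toy vectors standing in for image embeddings (assumed values).
index = {
    "dog_park.jpg": [0.9, 0.1, 0.0],
    "cat_sofa.jpg": [0.1, 0.9, 0.0],
    "city_night.jpg": [0.0, 0.1, 0.9],
}
# Pretend this vector encodes the text "a dog playing in the park".
query_vec = [0.8, 0.2, 0.1]

for path, score in rank_by_similarity(query_vec, index, top_k=2):
    print(f"{path}: {score:.3f}")
```

heapq.nlargest avoids sorting the whole index when only a few results are needed, which matters more as the number of indexed images grows.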
Using the Command Line Interface
- Compare two images:
imgemb compare image1.jpg image2.jpg --method grid --grid-size 4 4
- Generate embeddings for images:
imgemb generate path/to/images/ --output embeddings.json --method edge
- Find similar images:
imgemb find-similar query.jpg image/directory/ -k 5 --method grid
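The commands above can also be driven from a script. The following is a minimal sketch that builds the argument list for the compare command using only the flags shown above. The helper build_compare_command is hypothetical, and passing --grid-size only when the method is grid is an assumption based on the documented example.

```python
import shlex
import shutil
import subprocess

def build_compare_command(image_a, image_b, method="grid", grid_size=(4, 4)):
    """Build the argv list for `imgemb compare`, mirroring the example above."""
    command = ["imgemb", "compare", str(image_a), str(image_b), "--method", method]
    if method == "grid":
        # --grid-size takes two integers, as in the documented example.
        command += ["--grid-size", str(grid_size[0]), str(grid_size[1])]
    return command

command = build_compare_command("image1.jpg", "image2.jpg")
print(shlex.join(command))

# Only run the tool when it is actually installed and on PATH.
if shutil.which("imgemb"):
    completed = subprocess.run(command, capture_output=True, text=True)
    print(completed.stdout)
```

Passing the arguments as a list rather than a shell string avoids quoting problems with file names that contain spaces.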
Embedding Methods
Average Color
Computes the mean RGB values over the entire image, yielding a three-value embedding. Simple and fast, and effective for basic color-based similarity, but it ignores where colors appear in the image.
embedder = ImageEmbedder(method='average_color')
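To make the idea concrete, here is a dependency-free sketch of mean-color computation on a tiny image represented as rows of (r, g, b) tuples. The function average_color is a hypothetical stand-in written for illustration; the library's own implementation may differ, for example in channel order or value scaling.

```python
def average_color(pixels):
    """Mean (R, G, B) over an image given as rows of (r, g, b) tuples."""
    total = [0, 0, 0]
    count = 0
    for row in pixels:
        for r, g, b in row:
            total[0] += r
            total[1] += g
            total[2] += b
            count += 1
    if count == 0:
        raise ValueError("image has no pixels")
    return tuple(channel / count for channel in total)

# 2x2 image: top row pure red, bottom row pure blue.
image = [
    [(255, 0, 0), (255, 0, 0)],
    [(0, 0, 255), (0, 0, 255)],
]
print(average_color(image))  # (127.5, 0.0, 127.5)
```

Note that an image with the red and blue rows swapped produces the same embedding, which is the limitation the grid method addresses.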
Grid-based
Divides the image into a grid and computes mean RGB values for each cell. Better than Average Color at capturing the spatial distribution of color.
embedder = ImageEmbedder(method='grid', grid_size=(4, 4))
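As an illustration of the grid idea, the sketch below splits a small image into cells and concatenates each cell's mean color. It assumes grid_size means (rows, columns) and that cells are visited row by row; both are assumptions, and grid_embedding is a hypothetical helper, not the library's code.

```python
def grid_embedding(pixels, grid_size=(4, 4)):
    """Concatenate the mean (R, G, B) of every grid cell, row by row."""
    rows, cols = grid_size
    height, width = len(pixels), len(pixels[0])
    if height < rows or width < cols:
        raise ValueError("image is smaller than the grid")
    embedding = []
    for gr in range(rows):
        # Integer boundaries so the cells tile the image without gaps.
        y0, y1 = gr * height // rows, (gr + 1) * height // rows
        for gc in range(cols):
            x0, x1 = gc * width // cols, (gc + 1) * width // cols
            cell = [pixels[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            for channel in range(3):
                embedding.append(sum(p[channel] for p in cell) / len(cell))
    return embedding

RED, BLUE = (255, 0, 0), (0, 0, 255)
# 4x4 image: top half red, bottom half blue.
image = [[RED] * 4, [RED] * 4, [BLUE] * 4, [BLUE] * 4]

embedding = grid_embedding(image, grid_size=(2, 2))
print(len(embedding))  # 2 * 2 cells * 3 channels = 12
print(embedding[:3], embedding[-3:])
```

Under these assumptions the embedding length is rows × columns × 3, so a 4×4 grid would give 48 values per image.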
Edge-based
Uses Sobel edge detection and histogram features. Good for capturing structural similarities.
embedder = ImageEmbedder(method='edge')
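The sketch below illustrates one way Sobel edge detection can be combined with a histogram. It applies the standard 3×3 Sobel kernels to a grayscale image stored as a list of rows, then bins the gradient magnitudes. The bin count, the scaling by the maximum magnitude, and the helper names are all assumptions made for illustration; the library itself presumably relies on OpenCV and may compute different features.

```python
import math

SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]

def sobel_magnitudes(gray):
    """Gradient magnitude at every interior pixel of a 2-D grayscale image."""
    height, width = len(gray), len(gray[0])
    mags = []
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = gy = 0
            for dy in range(3):
                for dx in range(3):
                    value = gray[y + dy - 1][x + dx - 1]
                    gx += SOBEL_X[dy][dx] * value
                    gy += SOBEL_Y[dy][dx] * value
            mags.append(math.hypot(gx, gy))
    return mags

def edge_histogram(gray, bins=8):
    """Histogram of Sobel magnitudes, as fractions that sum to 1."""
    mags = sobel_magnitudes(gray)
    if not mags:
        raise ValueError("image must be at least 3x3")
    top = max(mags) or 1.0  # avoid dividing by zero on a flat image
    hist = [0] * bins
    for m in mags:
        hist[min(int(m / top * bins), bins - 1)] += 1
    return [count / len(mags) for count in hist]

# 5x6 image: dark left half, bright right half, so one vertical edge.
gray = [[0, 0, 0, 255, 255, 255] for _ in range(5)]
print(edge_histogram(gray, bins=4))  # [0.5, 0.0, 0.0, 0.5]
```

Half of the interior pixels sit on the edge and land in the top bin, while the flat regions land in the bottom bin. Because the histogram discards pixel positions, two images with similar amounts of edge content can match even if their layouts differ.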
Development
Setup Development Environment
# Create virtual environment
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install development dependencies
pip install -e ".[dev]"
Running Tests
pytest tests/ -v
Contributing
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
License
This project is licensed under the MIT License - see the LICENSE file for details.
Requirements
- Python ≥ 3.8
- OpenCV (opencv-python)
- NumPy
- Matplotlib
- scikit-learn
Citation
If you use this library in your research, please cite:
@software{image_embeddings,
title = {Image Embeddings: A Lightweight Library for Image Similarity},
author = {Raj, Aryan},
year = {2024},
url = {https://github.com/aryanraj2713/image_embeddings}
}