CLIP-powered multimodal image search engine with web interface and CLI tools

Folder Vision - CLIP Image Search 🔍

A powerful multimodal image search engine built with OpenAI's CLIP (Contrastive Language-Image Pre-training) model. Search through your image collections using natural language descriptions or find similar images using other images as queries.

Features ✨

🔍 Text-to-Image Search: Find images using natural language descriptions
🖼️ Image-to-Image Search: Find similar images using another image as a query
🌐 Web Interface: Beautiful, intuitive web UI for easy searching
⚡ CLI Interface: Command-line tools for batch processing and automation
💾 Smart Caching: Automatic embedding caching for fast subsequent searches
🚀 FastAPI Backend: Modern, high-performance web API
📱 Responsive Design: Works on desktop, tablet, and mobile devices

Quick Start 🚀

Installation

# Clone or download the repository
cd folder-vision

# Install dependencies
pip install -e .

Web Interface

Start the web server:

fv serve --port 8000

Then open your browser to http://localhost:8000 and enjoy the visual interface!

Command Line Usage

Index your images:

fv index /path/to/your/images

Search with text:

fv search-text "a red car in the city"

Search with an image:

fv search-image /path/to/query_image.jpg

How It Works 🧠

Folder Vision uses OpenAI's CLIP model to embed both images and text in the same semantic space, so a text query and an image can be compared directly. A search works in four steps:

Image Indexing: Convert all your images into high-dimensional vector embeddings
Text Understanding: Convert your search queries into comparable vectors
Similarity Matching: Find the most similar images using cosine similarity
Fast Retrieval: Use cached embeddings for instant search results
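
The similarity-matching step above can be sketched in a few lines of plain Python. This is an illustration only, not the code in clip_search.py: the function names `cosine_similarity` and `top_k` are hypothetical, and the three-dimensional vectors are toy stand-ins for real CLIP embeddings.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)

def top_k(query, index, k=3):
    """Rank {path: embedding} entries by similarity to the query vector."""
    scored = [(cosine_similarity(query, emb), path) for path, emb in index.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [(path, score) for score, path in scored[:k]]

# Toy 3-dimensional "embeddings"; real CLIP ViT-B/32 vectors have 512 dimensions.
index = {
    "beach.jpg": [0.9, 0.1, 0.0],
    "city.jpg": [0.1, 0.9, 0.1],
    "forest.jpg": [0.0, 0.2, 0.9],
}
results = top_k([1.0, 0.0, 0.0], index, k=2)
```

Because both text and image queries end up as vectors in the same space, the same ranking function would serve both search modes.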

Web Interface Features 🌐

The web interface provides:

📁 Folder Indexing: Point to any folder and index all images automatically
🔤 Text Search: Type natural language descriptions to find matching images
🖼️ Visual Search: Upload an image to find similar ones in your collection
📊 Statistics: View indexing statistics and model information
🎨 Visual Results: See thumbnail previews with similarity scores
📱 Responsive Design: Works on desktop, tablet, and mobile devices

CLI Commands 💻

Serve Web Interface

# Start web server (default: http://0.0.0.0:8000)
fv serve

# Custom host and port
fv serve --host localhost --port 3000

# Development mode with auto-reload
fv serve --reload

Index Images

# Index all images in a folder
fv index /path/to/images

# Index without saving cache
fv index /path/to/images --no-cache

Search Commands

# Text search (natural language)
fv search-text "sunset over mountains"
fv search-text "a cat sleeping on a couch" --top-k 5

# Image search (visual similarity)
fv search-image /path/to/query.jpg
fv search-image query.png --top-k 20

# JSON output for scripting
fv search-text "dogs playing" --format json
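
For scripting, the JSON output can be consumed from Python. The sketch below only relies on the flags shown above; combining --top-k with --format json is an assumption, the helper names are hypothetical, and the structure of the returned JSON is not documented here, so the result is returned as parsed without further interpretation.

```python
import json
import subprocess

def build_search_command(query, top_k=5):
    """Build the argv list for a JSON-formatted text search."""
    return ["fv", "search-text", query, "--top-k", str(top_k), "--format", "json"]

def run_text_search(query, top_k=5):
    """Run the CLI and parse stdout as JSON (raises if fv exits non-zero)."""
    completed = subprocess.run(
        build_search_command(query, top_k),
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(completed.stdout)

# Usage (requires folder-vision to be installed and an index to exist):
#   results = run_text_search("dogs playing", top_k=10)
```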

Statistics

# View search engine statistics
fv stats

API Endpoints 🔌

The FastAPI backend provides these endpoints:

GET / - Web interface
POST /index - Index a folder
GET /search/text - Search by text query
POST /search/image - Search by image upload
GET /image/{path} - Serve image files
GET /stats - Get statistics
GET /health - Health check
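
A minimal client sketch for these endpoints, using only the standard library. The base URL matches the `fv serve --port 8000` example; the query parameter names `query` and `top_k` are assumptions (they are not documented above), as is the expectation that responses are JSON. Check the running server's API documentation for the actual names.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:8000"  # matches the `fv serve --port 8000` example

def endpoint_url(path, params=None, base_url=BASE_URL):
    """Join the base URL and path, appending URL-encoded query parameters."""
    url = base_url.rstrip("/") + path
    if params:
        url += "?" + urllib.parse.urlencode(params)
    return url

def get_json(path, params=None, timeout=10):
    """GET an endpoint and decode the body as JSON (assumes a JSON response)."""
    with urllib.request.urlopen(endpoint_url(path, params), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))

# Usage (requires a running server):
#   get_json("/health")
#   get_json("/search/text", {"query": "sunset over mountains", "top_k": 5})
#   (the parameter names "query" and "top_k" are hypothetical)
```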

Supported Image Formats 📸

JPEG (.jpg, .jpeg)
PNG (.png)
BMP (.bmp)
GIF (.gif)
TIFF (.tiff)
WebP (.webp)
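
To preview which files in a folder would qualify, the extension list above can be applied with pathlib. This is a sketch, not the indexer's own logic: whether Folder Vision recurses into subfolders or matches extensions case-insensitively is an assumption here, and `find_images` is a hypothetical helper.

```python
from pathlib import Path

# Extensions taken from the list above.
SUPPORTED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp", ".gif", ".tiff", ".webp"}

def find_images(folder):
    """Return sorted paths under `folder` whose suffix is in the supported set."""
    root = Path(folder)
    return sorted(
        p for p in root.rglob("*")  # assumption: search subfolders too
        if p.is_file() and p.suffix.lower() in SUPPORTED_EXTENSIONS
    )
```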

Performance & Optimization ⚡

GPU Support: Automatically uses GPU if available (CUDA)
Batch Processing: Efficient batch encoding of images
Smart Caching: Embeddings are cached to disk for instant reloading
Memory Management: Designed to handle large collections; see Troubleshooting if you run into memory errors
Concurrent Processing: Handles multiple search requests simultaneously
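
The idea behind embedding caching can be illustrated as follows. This is a sketch under stated assumptions, not Folder Vision's actual cache format: it stores embeddings in a JSON file keyed by image path and invalidates an entry when the file's modification time changes. The function names and the `embed_fn` callback are hypothetical.

```python
import json
import os

def load_cache(cache_path):
    """Load the cache dictionary, or return an empty one if no file exists yet."""
    if not os.path.exists(cache_path):
        return {}
    with open(cache_path, "r", encoding="utf-8") as f:
        return json.load(f)

def save_cache(cache, cache_path):
    """Write the cache dictionary to disk as JSON."""
    with open(cache_path, "w", encoding="utf-8") as f:
        json.dump(cache, f)

def get_embedding(image_path, cache, embed_fn):
    """Return a cached embedding unless the file changed since it was cached."""
    mtime = os.path.getmtime(image_path)
    entry = cache.get(image_path)
    if entry is not None and entry["mtime"] == mtime:
        return entry["embedding"]
    embedding = embed_fn(image_path)  # the expensive model call happens only here
    cache[image_path] = {"mtime": mtime, "embedding": embedding}
    return embedding
```

Keying on modification time means an edited image is re-encoded while untouched images are served from the cache.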

Example Use Cases 💡

Personal Photo Management

# Index your photo library
fv index ~/Pictures

# Find vacation photos
fv search-text "beach vacation sunset"

# Find similar photos to a favorite shot
fv search-image ~/Pictures/favorite_sunset.jpg

Digital Asset Management

# Index product images
fv index /company/product_photos

# Find specific product types
fv search-text "red athletic shoes"
fv search-text "office furniture desk"

Creative Workflows

# Index design assets
fv index /projects/design_assets

# Find inspiration
fv search-text "minimalist logo design"
fv search-text "modern interior architecture"

System Requirements 🖥️

Python: 3.9 or higher
Memory: 4GB RAM minimum, 8GB+ recommended for large collections
Storage: Additional space for embedding cache files
GPU: Optional but recommended for faster processing (CUDA-compatible)

Model Information 🤖

Base Model: OpenAI CLIP ViT-B/32
Embedding Dimension: 512
Input Resolution: 224x224 pixels
Vocabulary: 49,408 tokens
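
These figures allow a rough estimate of how much space the embeddings themselves take. The sketch assumes each value is stored as a 4-byte float32, which is not stated above, and ignores any per-file metadata or container overhead in the cache.

```python
EMBEDDING_DIM = 512    # from the model information above
BYTES_PER_FLOAT32 = 4  # assumption: embeddings are stored as float32

def embedding_bytes(num_images, dim=EMBEDDING_DIM, bytes_per_value=BYTES_PER_FLOAT32):
    """Raw size in bytes of `num_images` embeddings, excluding any file overhead."""
    return num_images * dim * bytes_per_value

print(embedding_bytes(100_000) / 1_000_000, "MB")  # 204.8 MB
```

Under that assumption each image costs about 2 KB, so even a library of 100,000 images needs roughly 205 MB for raw vectors.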

Advanced Configuration ⚙️

Environment Variables

# Set custom cache directory
export CLIP_CACHE_DIR=/path/to/cache

# Disable GPU usage
export CUDA_VISIBLE_DEVICES=""
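
From Python, these variables could be read as in the sketch below. The fallback directory `~/.cache/folder_vision` is hypothetical (the actual default is not documented here), and both helper names are illustrative rather than part of the package.

```python
import os
from pathlib import Path

def resolve_cache_dir(default="~/.cache/folder_vision"):
    """Return CLIP_CACHE_DIR if set and non-empty, else a hypothetical default."""
    raw = os.environ.get("CLIP_CACHE_DIR") or default
    return Path(raw).expanduser()

def gpu_disabled():
    """True when CUDA_VISIBLE_DEVICES is set to an empty string."""
    return os.environ.get("CUDA_VISIBLE_DEVICES") == ""
```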

Custom Model

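You can use a different CLIP model by changing the model name in the code. Embeddings produced by different models are not comparable, so re-index your images after switching: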

# In clip_search.py
search_engine = CLIPImageSearch(model_name="openai/clip-vit-large-patch14")

Troubleshooting 🔧

Common Issues

"No images indexed" error:

Make sure to run fv index <folder_path> first
Check that the folder contains supported image formats

Slow indexing:

Enable GPU acceleration if available
Process smaller batches of images
Use SSD storage for better I/O performance

Memory errors:

Reduce batch size in the code
Process images in smaller folders
Increase system RAM
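
Reducing the batch size bounds how many images are held in memory at once. The sketch below shows the general pattern; `batched`, `encode_all`, and the `encode_batch` callback are hypothetical names, and the default of 32 is an arbitrary illustration rather than the value used in clip_search.py.

```python
def batched(items, batch_size):
    """Yield consecutive slices of `items` with at most `batch_size` elements."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]

def encode_all(paths, encode_batch, batch_size=32):
    """Encode paths batch by batch so only one batch is in flight at a time."""
    embeddings = []
    for batch in batched(paths, batch_size):
        embeddings.extend(encode_batch(batch))
    return embeddings
```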

Web interface not loading:

Check if port is already in use
Try a different port: fv serve --port 8080
Check firewall settings
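
To check whether a port is already taken before starting the server, a small standard-library probe is enough. This is a sketch: `port_in_use` is a hypothetical helper, and it only detects listeners reachable on the given host.

```python
import socket

def port_in_use(port, host="127.0.0.1"):
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(1.0)
        # connect_ex returns 0 on success instead of raising an exception
        return sock.connect_ex((host, port)) == 0
```

If `port_in_use(8000)` returns True, pick another port with `fv serve --port 8080`.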

Development 👩‍💻

Project Structure

folder-vision/
├── folder_vision/
│   ├── __init__.py
│   ├── app.py           # FastAPI web application
│   ├── cli.py           # Command-line interface
│   └── clip_search.py   # CLIP search engine
├── requirements.txt     # Python dependencies
├── pyproject.toml       # Project configuration
└── README.md            # This file

Running Tests

# Install development dependencies
pip install -e ".[dev]"

# Run tests
pytest tests/

Contributing

Fork the repository
Create a feature branch
Make your changes
Add tests
Submit a pull request

License 📄

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments 🙏

OpenAI for the incredible CLIP model
Hugging Face for the transformers library
FastAPI for the excellent web framework
PyTorch for the deep learning foundation

Support 💬

If you encounter any issues or have questions:

Check the troubleshooting section above
Search existing issues on GitHub
Create a new issue with detailed information
Include system information and error messages

Made with ❤️ by the Folder Vision Team

Start exploring your images in a whole new way! 🚀

Release history

1.0.0 (this version): Aug 26, 2025
0.1.1: Aug 26, 2025
0.1.0: Aug 26, 2025

Download files

Source Distribution

folder_vision-1.0.0.tar.gz (69.9 kB), uploaded Aug 26, 2025, Source

Built Distribution

folder_vision-1.0.0-py3-none-any.whl (70.4 kB), uploaded Aug 26, 2025, Python 3

File details

Details for the file folder_vision-1.0.0.tar.gz.

File metadata

Download URL: folder_vision-1.0.0.tar.gz
Upload date: Aug 26, 2025
Size: 69.9 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.9.6

File hashes

Hashes for folder_vision-1.0.0.tar.gz
Algorithm	Hash digest
SHA256	`de91188956f3684fca31100e85e9f2eae51123cbf5aa5cff0a18126ed6496ea4`
MD5	`4dc9be2fa7757fdf1cf0b34a17c86bf9`
BLAKE2b-256	`a71c7557c4432956565d657eb96dec011e4dbcf6a547b2e218f10aa6d39728c8`
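
A downloaded file can be checked against the SHA256 digest above with hashlib. This is a sketch: the helper names are hypothetical, and the path passed to `verify` is wherever you saved the archive.

```python
import hashlib

# SHA256 digest listed above for folder_vision-1.0.0.tar.gz
EXPECTED_SDIST_SHA256 = "de91188956f3684fca31100e85e9f2eae51123cbf5aa5cff0a18126ed6496ea4"

def sha256_of_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so large archives are not read into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify(path, expected=EXPECTED_SDIST_SHA256):
    """Return True if the file's SHA256 digest matches the expected hex string."""
    return sha256_of_file(path) == expected.lower()

# Usage: verify("folder_vision-1.0.0.tar.gz")
```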

File details

Details for the file folder_vision-1.0.0-py3-none-any.whl.

File metadata

Download URL: folder_vision-1.0.0-py3-none-any.whl
Upload date: Aug 26, 2025
Size: 70.4 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.9.6

File hashes

Hashes for folder_vision-1.0.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`61a59102bfa3095042734d4a627939b1c4a2775246113f34b4056d0228d70e73`
MD5	`348727b3f8476ceb5c82954961d29acf`
BLAKE2b-256	`a32ff1fbdef6e20360944b433552bb63b4d1e616f9d6022b4a13889157ab29f3`
