MCP server for local image search using CLIP embeddings

Project description

Local Image Search

Local image search using MLX CLIP embeddings and Daft for batch processing.

Features

  • Generate CLIP embeddings for images using Apple's MLX framework
  • Batch process images efficiently with Daft
  • Search images using natural language queries

Requirements

  • macOS with Apple Silicon (M1/M2/M3/M4)
  • Python 3.11+

Setup

# Clone the repo
git clone https://github.com/Eventual-Inc/local-image-search.git
cd local-image-search

# Install dependencies
uv sync

# Download and convert CLIP model (~600MB, first time only)
cd clip && uv run python convert.py && cd ..

Usage

Embed images from a directory

uv run python embed.py ~/Pictures           # embed all images
uv run python embed.py ~/Pictures --dry-run # count and estimate time
uv run python embed.py . --no-recursive     # current dir only

Embeddings are cached in embeddings.lance/. Re-running skips unchanged files.
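
One way such skip-on-rerun behaviour could work is to key each file on its size and modification time and compare against what was recorded last time. The sketch below is an assumption about the approach, not the project's actual code: `file_fingerprint` and `files_to_embed` are hypothetical names, and a plain dict stands in for the Lance table.

```python
from pathlib import Path


def file_fingerprint(path: Path) -> tuple[int, int]:
    """Return (size, mtime_ns); this changes whenever the file is rewritten."""
    st = path.stat()
    return (st.st_size, st.st_mtime_ns)


def files_to_embed(paths: list[Path], cache: dict[str, tuple[int, int]]) -> list[Path]:
    """Keep only files that are new or whose fingerprint differs from the cache.

    `cache` maps a path string to the fingerprint recorded on the previous run
    (hypothetical stand-in for whatever embeddings.lance/ actually stores).
    """
    return [p for p in paths if cache.get(str(p)) != file_fingerprint(p)]
```

A content hash would be more robust than size plus mtime, at the cost of reading every file on each run.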

Supported formats

Format     Extensions     Tested
JPEG       .jpg, .jpeg    Created and embedded
PNG        .png           Created and embedded
GIF        .gif           Created and embedded
WebP       .webp          Created and embedded
BMP        .bmp           Created and embedded
TIFF       .tiff, .tif    Created and embedded
HEIC/HEIF  .heic, .heif   Real iPhone photo + converted PNG
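
The extensions listed here, together with the `--no-recursive` flag shown earlier, suggest a discovery step along the following lines. This is a hedged sketch: `core.py` is described as providing a `find_images` helper, but its real signature is not shown in this README, so the `find_image_files` function and its `recursive` parameter below are hypothetical.

```python
from pathlib import Path

# Extensions taken from the supported-formats table.
IMAGE_EXTENSIONS = {
    ".jpg", ".jpeg", ".png", ".gif", ".webp",
    ".bmp", ".tiff", ".tif", ".heic", ".heif",
}


def find_image_files(root: Path, recursive: bool = True) -> list[Path]:
    """List image files under root, matching extensions case-insensitively."""
    # rglob walks subdirectories; glob stays in the top-level directory.
    candidates = root.rglob("*") if recursive else root.glob("*")
    return sorted(
        p for p in candidates
        if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
    )
```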

Corrupted or unreadable images are stored as zero vectors, so they won't match any search.
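
One way to read this: under cosine-similarity ranking, a common choice for CLIP embeddings (whether this project scores results exactly this way is an assumption), a zero vector has no direction, so its score can be pinned to 0.0 and it is outranked by any image with positive similarity to the query. In the sketch below, `cosine` and `top_k` are hypothetical helpers written with the standard library only.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; a zero-norm vector scores 0.0 instead of dividing by zero."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


def top_k(query: list[float], index: dict[str, list[float]], k: int) -> list[tuple[str, float]]:
    """Return the k highest-scoring (path, score) pairs."""
    scored = [(path, cosine(query, vec)) for path, vec in index.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)[:k]
```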

Search

Start the server (loads model once):

uv run python server.py

Search via CLI:

uv run python search.py "sunset"           # list results
uv run python search.py "people" -n 10     # show 10 results

Or via API:

curl -X POST http://127.0.0.1:8000/search \
  -H "Content-Type: application/json" \
  -d '{"query": "yellow mouse", "limit": 5}'
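
The same request can be issued from Python with only the standard library. The endpoint, port, and JSON fields below mirror the `curl` example; the helper names `build_search_request` and `search` are hypothetical, and because the response schema is not documented here, the sketch returns the decoded JSON untouched.

```python
import json
import urllib.request

SEARCH_URL = "http://127.0.0.1:8000/search"  # same endpoint as the curl example


def build_search_request(query: str, limit: int = 5) -> urllib.request.Request:
    """Build a POST request carrying the same JSON body as the curl example."""
    body = json.dumps({"query": query, "limit": limit}).encode("utf-8")
    return urllib.request.Request(
        SEARCH_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def search(query: str, limit: int = 5):
    """Send the request; server.py must already be running locally."""
    with urllib.request.urlopen(build_search_request(query, limit)) as resp:
        return json.load(resp)
```

Calling `search("yellow mouse", limit=5)` should then be equivalent to the `curl` command, assuming the server is listening on the default address.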

Demo scripts

uv run python simple_image_search.py  # basic in-memory search (2 images)
uv run python daft_image_search.py    # batch processing demo

Project Structure

local-image-search/
├── clip/                    # MLX CLIP implementation (from ml-explore/mlx-examples)
│   ├── model.py             # CLIP model architecture
│   ├── clip.py              # Model loading and inference
│   ├── convert.py           # HuggingFace to MLX converter
│   ├── image_processor.py   # Image preprocessing
│   ├── tokenizer.py         # Text tokenization
│   ├── mlx_model/           # Converted model weights (generated)
│   └── LICENSE              # MIT License (Apple Inc.)
├── data/
│   └── pokemon/             # Pokemon artwork (1025 images)
├── embeddings.lance/        # Lance DB storage (generated)
├── core.py                  # Shared utilities (EmbedImages, find_images, etc.)
├── embed.py                 # CLI tool to sync embeddings from a directory
├── test_embed.py            # Tests for embed.py
├── simple_image_search.py   # Basic in-memory search demo
├── daft_image_search.py     # Daft-based batch processing demo
├── benchmark.py             # Benchmark script
├── plot_benchmark.py        # Generate benchmark plot
├── benchmark_results.csv    # Raw benchmark data (10 runs)
├── benchmark_plot.png       # Benchmark visualization
├── pyproject.toml           # Project dependencies
└── uv.lock                  # Dependency lockfile

Benchmarks

Embedding time for the Pokemon dataset (1025 images) on M4 Max, averaged over 10 runs.

[Figure: Benchmark Results]

Run benchmarks yourself:

uv run python benchmark.py      # Run one iteration, appends to CSV
uv run python benchmark.py 100  # Benchmark with specific number of images
uv run python plot_benchmark.py # Generate plot from CSV

Real-world performance (M4 Max, home directory)

Metric            Value
Images found      11,843
Scan time         ~26s
Embed time        ~39s
Total time        ~65s
Embed speed       260 img/s
Re-run (cached)   ~31s (scan only)

Current Progress and Next Steps

See CLAUDE.md

Data Attribution

Pokemon Artwork

  • Source: PokeAPI/sprites
  • License: Repository is CC0 1.0 Universal
  • Copyright: All Pokemon images are Copyright The Pokemon Company

CLIP Implementation

Download files

Source Distribution

local_image_search-0.1.0.tar.gz (19.5 kB)

Uploaded Source

Built Distribution

local_image_search-0.1.0-py3-none-any.whl (21.4 kB)

Uploaded Python 3

File details

Details for the file local_image_search-0.1.0.tar.gz.

File metadata

  • Download URL: local_image_search-0.1.0.tar.gz
  • Upload date:
  • Size: 19.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.9.24

File hashes

Hashes for local_image_search-0.1.0.tar.gz
Algorithm    Hash digest
SHA256       50b0211d1532bb24d84b21528f5636c81d337788bbe742247d0b62c23412f83c
MD5          19fe16d0591d5f0d74815b4ba29312c6
BLAKE2b-256  7f66dacf94c28bbcbc79ea5753ac54e6cfa4acc34548a7355690b8c1079fe03d

File details

Details for the file local_image_search-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: local_image_search-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 21.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.9.24

File hashes

Hashes for local_image_search-0.1.0-py3-none-any.whl
Algorithm    Hash digest
SHA256       9162e0a6c66f16b1f88f72fabb25c752eaa9408cfc9d56e3863fcdfc5b28d484
MD5          991c338d3b2e977284e4615bac79f3ac
BLAKE2b-256  a17e8cabdc96d7276037d36e00ad689ac7d3272f1d9b369ef2a5dee6d00a6371
