Local-first creative image archive search with SQLite, FAISS, FastAPI, and Typer.
Image Archive Search
Local-first creative image archive search for personal libraries.
This project indexes one or more folders of images on your machine, generates local thumbnails, CLIP zero-shot enrichment labels, and CLIP-style embeddings, stores metadata in SQLite, stores vectors in FAISS, and serves a localhost web UI for natural-language and image-to-image search.
MVP v1 Features
- Local-first only. No external APIs or cloud services.
- Index local image folders recursively.
- Store file path, hash, dimensions, timestamps, folder, thumbnail, embedding metadata, and structured enrichment fields.
- Incremental indexing that skips unchanged files and resumes cleanly after interruptions.
- Text search with embedding retrieval plus structured tag/style/object boosting.
- Image-to-image similarity search from an indexed asset or uploaded query image.
- Folder and date filtering.
- Content-type filtering.
- Similar-images view for any asset.
- Guided CLI workflow with run, plus power-user commands init, index, serve, status, and reindex.
- Installable CLI shape with the image-archive-search command, packaged frontend assets, per-user app data, and a reset command.
Supported File Types
- .jpg
- .jpeg
- .png
Project Structure
backend/ Python package, CLI, API, indexing pipeline, search services
frontend/ Minimal local web UI served by FastAPI
models/ Notes and placeholders for local model assets
scripts/ Helper scripts
tests/ Basic test suite
How It Works
- init creates a local app data directory, SQLite DB, FAISS index, and config file.
- run or index scans image files, skips unchanged assets, creates thumbnails, embeddings, and CLIP zero-shot enrichment fields, then persists everything locally.
- serve launches the FastAPI server and serves the UI at http://127.0.0.1:8000.
- The UI lets you search in plain English, filter by indexed folder, content type, or date, upload a query image, and inspect similar results.
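The scanning step described above can be sketched as follows. This is a minimal illustration rather than the project's actual code: the function name iter_image_files is hypothetical, and matching extensions case-insensitively is an assumption.

```python
from pathlib import Path
from typing import Iterator

# Extensions listed under "Supported File Types".
SUPPORTED_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def iter_image_files(root: Path) -> Iterator[Path]:
    """Yield supported image files under root, recursively, in sorted order.

    Hypothetical helper: the real indexer may walk folders differently.
    """
    for path in sorted(root.rglob("*")):
        # Assumption: extensions are compared case-insensitively.
        if path.is_file() and path.suffix.lower() in SUPPORTED_EXTENSIONS:
            yield path
```

Sorting the paths keeps the scan order stable between runs, which makes interrupted indexing easier to reason about.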
Install And Run
For users, the intended packaged command is:
uvx --from image-archive-search image-archive-search run
For a permanent install:
uv tool install image-archive-search
image-archive-search run
During local development from this repo:
python3 -m venv .venv
source .venv/bin/activate
pip install --upgrade pip
pip install -e ".[dev]"
image-archive-search run
The guided command initializes the app, opens the terminal folder picker, indexes selected folders, and can start the localhost UI.
Local Development Setup
1. Create a virtual environment
python3 -m venv .venv
source .venv/bin/activate
2. Install dependencies
pip install --upgrade pip
pip install -e ".[dev]"
Notes:
- The first time you run indexing, the embedding model weights will be downloaded locally and reused from cache afterward.
- Structured enrichment defaults to CLIP zero-shot labels and does not require Ollama.
- Ollama remains an optional backend for richer VLM enrichment if you set enrichment_backend: ollama.
- On some Apple Silicon or Linux setups, faiss-cpu may be easiest to install through conda if a wheel is not available for your environment.
- On some macOS setups, OpenMP libraries from FAISS and Torch can conflict. The CLI applies a compatibility workaround automatically, but if you still see an OpenMP startup error, run commands with KMP_DUPLICATE_LIB_OK=TRUE.
3. Initialize the archive
image-archive-search init
By default this creates per-user files outside the repo:
- macOS config/data: ~/Library/Application Support/image-archive-search/
- Linux data: ~/.local/share/image-archive-search/
- Linux config: ~/.config/image-archive-search/config.yaml
- Windows config/data: under %APPDATA% and %LOCALAPPDATA%
You can still force a repo-local config for development:
image-archive-search init --config-path config.yaml
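For orientation, the per-user locations listed above could be resolved with a helper like the one below. It is a sketch only: default_dirs is a hypothetical name, and the split of config under %APPDATA% and data under %LOCALAPPDATA% on Windows is an assumption, since the README does not say which variable holds which.

```python
import os
import sys
from pathlib import Path

APP_NAME = "image-archive-search"


def default_dirs(platform=sys.platform, home=None, env=None):
    """Return hypothetical default config and data directories for a platform."""
    home = Path.home() if home is None else Path(home)
    env = os.environ if env is None else env
    if platform == "darwin":
        # macOS keeps config and data in the same folder.
        base = home / "Library" / "Application Support" / APP_NAME
        return {"config_dir": base, "data_dir": base}
    if platform.startswith("win"):
        # Assumption: config under %APPDATA%, data under %LOCALAPPDATA%.
        config_base = Path(env.get("APPDATA", home / "AppData" / "Roaming"))
        data_base = Path(env.get("LOCALAPPDATA", home / "AppData" / "Local"))
        return {"config_dir": config_base / APP_NAME, "data_dir": data_base / APP_NAME}
    # Linux: separate config and data locations.
    return {
        "config_dir": home / ".config" / APP_NAME,
        "data_dir": home / ".local" / "share" / APP_NAME,
    }
```

Passing platform, home, and env explicitly makes the function easy to check for every operating system from a single machine.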
4. Guided flow
image-archive-search run
This guided command:
- initializes the local archive if needed
- opens a terminal folder navigator
- lets you multi-select folders to index
- runs the full indexing pipeline
- optionally starts the local server
5. Index a folder directly
image-archive-search index /path/to/library
You can run index again on the same folder. Unchanged files are skipped automatically.
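One way such skipping could work is to compare a stored content hash against the file on disk. The sketch below assumes a hypothetical assets table with path and sha256 columns; the project's real schema and change-detection rules may differ (for example, it may also consult timestamps).

```python
import hashlib
import sqlite3
from pathlib import Path


def file_sha256(path: Path) -> str:
    """Hash a file in 1 MiB chunks so large images are not loaded at once."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def open_db(db_path: str) -> sqlite3.Connection:
    """Open the database and create the hypothetical assets table if missing."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS assets (path TEXT PRIMARY KEY, sha256 TEXT NOT NULL)"
    )
    return conn


def needs_indexing(conn: sqlite3.Connection, path: Path) -> bool:
    """Return True for files that are new or whose content hash has changed."""
    row = conn.execute(
        "SELECT sha256 FROM assets WHERE path = ?", (str(path),)
    ).fetchone()
    return row is None or row[0] != file_sha256(path)


def mark_indexed(conn: sqlite3.Connection, path: Path) -> None:
    """Record the current hash; committing per file lets a later run resume."""
    conn.execute(
        "INSERT OR REPLACE INTO assets (path, sha256) VALUES (?, ?)",
        (str(path), file_sha256(path)),
    )
    conn.commit()
```

Committing after each file trades some throughput for resumability: an interrupted run loses at most the file that was in progress.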
6. Serve the local app
image-archive-search serve
Then open http://127.0.0.1:8000 in your browser.
CLI Commands
image-archive-search init
image-archive-search run
image-archive-search index /path/to/library
image-archive-search reindex
image-archive-search status
image-archive-search serve --host 127.0.0.1 --port 8000
image-archive-search reset
The legacy image-archive command remains available. New users should prefer image-archive-search.
Example Workflow
python3 -m venv .venv
source .venv/bin/activate
pip install --upgrade pip
pip install -e ".[dev]"
image-archive-search run
Publishing To PyPI
- Pick a final package name on PyPI. The current package name is image-archive-search.
- Build the wheel and source distribution:
python3 -m pip install --upgrade build twine
python3 -m build
- Check the package:
python3 -m twine check dist/*
- Publish to TestPyPI first:
python3 -m twine upload --repository testpypi dist/*
- Test install from TestPyPI in a clean environment.
- Publish to PyPI:
python3 -m twine upload dist/*
After publishing, users can run:
uvx --from image-archive-search image-archive-search run
Configuration
The default config is created by init in the per-user app config directory. A sample is also provided as config.example.yaml.
Key fields:
- indexed_paths
- thumbnail_dir
- sqlite_path
- faiss_index_path
- embedding_model_name
- enrichment_backend
- enrichment_model
- enrichment_mode
- enrichment_version
- ollama_host
- device
- batch_size
- num_workers
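As a small aid when editing a config by hand, the sketch below reports which of the key fields above are absent from an already-parsed config mapping. The helper name missing_fields is hypothetical, and whether every field is actually required by the tool is an assumption.

```python
# Field names taken from the "Key fields" list above.
KEY_FIELDS = (
    "indexed_paths",
    "thumbnail_dir",
    "sqlite_path",
    "faiss_index_path",
    "embedding_model_name",
    "enrichment_backend",
    "enrichment_model",
    "enrichment_mode",
    "enrichment_version",
    "ollama_host",
    "device",
    "batch_size",
    "num_workers",
)


def missing_fields(config: dict) -> list:
    """Return the documented key fields that are not present in config."""
    return [name for name in KEY_FIELDS if name not in config]
```

The function takes a plain dict, so it works with whatever YAML loader produced the mapping.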
Search Behavior
- Text search embeds the query with the local embedding model, retrieves nearest vectors from FAISS, and boosts results whose content type, tags, styles, objects, and short summaries match the query.
- Similar search uses the indexed asset embedding or a locally uploaded image.
- Exact self-matches are excluded from similar results by default.
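The behavior above can be illustrated with a small pure-Python ranking sketch. It is not the project's implementation: brute-force cosine similarity stands in for the FAISS lookup, and the rank function, the additive boost weight, and the tag-overlap rule are all assumptions made for illustration.

```python
import math
from typing import Dict, List, Optional, Sequence, Set, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank(
    query_vec: Sequence[float],
    query_terms: Set[str],
    assets: Dict[str, Tuple[Sequence[float], Set[str]]],
    boost: float = 0.1,
    exclude_id: Optional[str] = None,
) -> List[Tuple[str, float]]:
    """Score assets by vector similarity plus a boost per matching tag.

    assets maps an asset id to (embedding, structured tags). exclude_id drops
    the query asset itself, mirroring the self-match exclusion in similar search.
    """
    scored = []
    for asset_id, (vector, tags) in assets.items():
        if asset_id == exclude_id:
            continue
        # Hypothetical scoring rule: similarity plus a fixed bonus per shared term.
        score = cosine(query_vec, vector) + boost * len(query_terms & tags)
        scored.append((asset_id, score))
    return sorted(scored, key=lambda item: item[1], reverse=True)
```

With a large enough boost, an asset whose tags match the query can outrank one that is slightly closer in embedding space, which is the intended effect of structured boosting.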
Limitations
- MVP v1 supports images only. Video is intentionally out of scope.
- CLIP zero-shot enrichment is fast but less nuanced than a larger VLM for OCR-heavy document analysis and detailed object reasoning.
- Index updates currently focus on new, changed, stale, or partially processed records. Automatic deletion handling for files removed from disk is minimal in v1.
- The first indexing run can be slow because local models are loaded and warmed up.
- The UI is intentionally minimal and optimized for usability over design polish.
Future Roadmap
- Better reranking and search-time faceting
- Duplicate clustering
- Richer asset facets and saved collections
- Video, OCR, and extra metadata extractors
- Faster background indexing workers
- Model selection from the UI
Repo Tree
.
|-- backend/
| `-- image_archive/
|-- frontend/
|-- models/
|-- scripts/
|-- tests/
|-- config.example.yaml
|-- pyproject.toml
`-- README.md
Commands To Run Locally
image-archive-search init
image-archive-search run
image-archive-search serve
Or with the packaged command:
image-archive-search run
image-archive-search serve
image-archive-search reset
Known Limitations
- Removed files are not fully garbage-collected from search results in every case yet.
- Embeddings and CLIP zero-shot enrichment depend on local model downloads.
- Very large archives may benefit from future background jobs and sharded indexing.
File details
Details for the file image_archive_search-0.1.0.tar.gz.
File metadata
- Download URL: image_archive_search-0.1.0.tar.gz
- Upload date:
- Size: 44.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.9.6
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 430ef47e137afb31a7cc6fcc1e20e5842f04df1326a89a43da838469c38eb13b |
| MD5 | 595f7883f81a3ece3e0cdecca8720c52 |
| BLAKE2b-256 | 99cb999e6018b0b37084d262aa3865fa48a961acb25b060a051c63d8a22111c2 |
File details
Details for the file image_archive_search-0.1.0-py3-none-any.whl.
File metadata
- Download URL: image_archive_search-0.1.0-py3-none-any.whl
- Upload date:
- Size: 45.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.9.6
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | ab6d98843d98d59286ae9a0105f14d796812a9c544fc9e32580b4809169b765d |
| MD5 | 71834579e08389435ac0e5ac3397d936 |
| BLAKE2b-256 | e66b8a3525c58a654e00afcdfa1a67fbae286505d23107a9fa5ca0502230a5cc |