Local-first multimodal semantic memory for your machine — searchable text + screenshots, MCP-native, runs on CPU.

Lookback

Local-first, multimodal semantic memory for your machine.

Index your files, code, PDFs, browser history, and screenshots into a LanceDB store on disk. Query by meaning from the CLI or from any MCP-capable AI tool (Claude Code, Cursor, Continue, ChatGPT Desktop, Windsurf, Zed). Everything runs on-device — no cloud, no GPU.

Highlights

  • Multimodal. Real semantic search over text + screenshots in a single index. Cross-modal: search for "fluffy clouds in the sky" and you'll get back the screenshot, not just text mentioning clouds.
  • Local-first. Models (Nomic Embed v1.5 + MobileCLIP2-S2) run on CPU via ONNX Runtime. Your data and your queries never leave your laptop.
  • MCP-native. A single lookback serve process exposes the index as a tool to any MCP-capable AI assistant. See MCP_SETUP.md.
  • Dev-grade DX. Single pip install, sensible defaults, one config file, every subcommand documented.
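To illustrate the cross-modal claim above: when a text query and the screenshots are embedded into one joint space, ranking reduces to a vector comparison. The sketch below uses toy 3-d vectors in place of real embeddings, and the helper names (cosine_similarity, rank_images) are hypothetical, not part of Lookback's API.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank_images(query_vec, image_vecs):
    """Return (name, similarity) pairs, most similar first."""
    scored = [(name, cosine_similarity(query_vec, vec))
              for name, vec in image_vecs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Toy vectors standing in for real joint-space embeddings (assumption).
query = [0.9, 0.1, 0.0]  # pretend embedding of "fluffy clouds in the sky"
images = {
    "sky.png": [0.8, 0.2, 0.1],
    "dog.png": [0.1, 0.9, 0.3],
}

ranking = rank_images(query, images)
print(ranking[0][0])  # sky.png
```

The real index delegates this comparison to LanceDB's vector search; the loop here only shows the idea.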

Status

| Milestone | Scope | State |
| --- | --- | --- |
| M0 | Design + scaffold | |
| M1 | Lance schema + store | |
| M2 | Text embedder ABC + mock + Nomic adapter; chunking; markdown extractor; indexer | |
| M3 | Image embedder mock + screenshot extractor | |
| M4 | PDF + code extractors | |
| M5 | CLI: init / index / search / stats / models | |
| M6 | Model registry, system probe, recommendation, init model selection | |
| M7 | Real Nomic + MobileCLIP weights wired end-to-end, @needs_models smoke tests | |
| M8a | Cross-modal text→image search via MobileCLIP joint text tower; --modality flag | |
| M8b | File watcher (lookback watch); MCP server (lookback serve); hybrid FTS + vector (--hybrid); MCP setup docs | |

194 tests, all green (10 of them gated on real model weights; auto-skip when absent). Run uv sync && uv run pytest -q to verify.

Quick start

# Install
pip install lookback-ai    # PyPI distribution; imports as `lookback`
# OR for local development:
uv sync && uv tool install --editable .

# Bootstrap config with system-aware model recommendation
# (commands below use `uv run`; with a pip install, call `lookback` directly)
uv run lookback init
# Detected: Darwin · arm64 · Apple Silicon · 16.0 GB RAM · 8 CPU
# Recommended: text=nomic-v1.5  image=mobileclip-s2

# Download weights (~700 MB total — Nomic v1.5 + MobileCLIP2-S2 vision + text + tokenizer)
uv run lookback models download nomic-v1.5 mobileclip-s2

# First-time index pass over directories you care about
uv run lookback index ~/Documents
uv run lookback index ~/Pictures/Screenshots

# Search
uv run lookback search "transformer attention notes"
uv run lookback search "a diagram with red and blue arrows" --modality image
uv run lookback search "IVF_PQ tuning" --hybrid       # FTS + vector RRF fusion

# Keep the index up to date as files change
uv run lookback watch ~/Documents

# Expose to AI tools via MCP
uv run lookback serve                                  # stdio (IDE-friendly)
uv run lookback serve --transport http --port 7777     # HTTP for remote

See MCP_SETUP.md for Claude Code / Cursor / Continue / ChatGPT Desktop / Windsurf / Zed configuration snippets.
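As a rough illustration of what a stdio registration tends to look like, the sketch below builds the "mcpServers" JSON shape that several MCP clients read. The server name "lookback" and the helper mcp_server_entry are assumptions made for this example; the per-tool file locations and exact keys are in MCP_SETUP.md and may differ.

```python
import json

def mcp_server_entry(name="lookback", command="lookback", args=("serve",)):
    """Build a stdio MCP server entry in the common `mcpServers` layout."""
    return {"mcpServers": {name: {"command": command, "args": list(args)}}}

# Hypothetical entry: launches `lookback serve` over stdio.
config = mcp_server_entry()
print(json.dumps(config, indent=2))
```

The client starts the command itself and talks to it over stdin/stdout, which is why no port appears in a stdio entry.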

Commands at a glance

| Command | What it does |
| --- | --- |
| lookback init | Detect system, recommend models, write ~/.lookback/config.toml. Flags: --text-model, --image-model, --interactive. |
| lookback models list | Show every registered model with HF repo and disk-size estimate. |
| lookback models download <name> [<name> …] | Fetch weights into models_dir. |
| lookback index <path> | Walk a path, hash + skip-if-unchanged, embed new/changed files, write to Lance. |
| lookback search <query> | Semantic search. Flags: `--modality text…` |
| lookback stats | Row counts per table. |
| lookback watch <path> | Foreground watcher; re-indexes on FS events. |
| lookback serve | MCP server. `--transport stdio…` |

Storage layout

~/.lookback/
├── config.toml         # one TOML, hand-editable
├── models/
│   ├── nomic-v1.5/
│   │   ├── onnx/model.onnx
│   │   └── tokenizer.json
│   └── mobileclip-s2/
│       ├── onnx/s2/vision_model.onnx
│       ├── onnx/s2/text_model.onnx
│       └── tokenizer.json
└── data/
    ├── chunks_text.lance       (Nomic 768-d)
    ├── chunks_image.lance      (MobileCLIP 512-d)
    └── files.lance             (file-level state for incremental indexing)
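The file-level state in files.lance is what makes "hash + skip-if-unchanged" possible. A minimal sketch of that idea follows, assuming SHA-256 content hashes and a plain dict in place of the Lance table; both are illustrative choices, and file_digest and files_to_reindex are hypothetical names.

```python
import hashlib
import tempfile
from pathlib import Path

def file_digest(path, chunk_size=1 << 20):
    """Hash a file's bytes in chunks so large files are not loaded whole."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk_size), b""):
            h.update(block)
    return h.hexdigest()

def files_to_reindex(paths, state):
    """Return paths whose digest differs from `state`, updating `state`."""
    changed = []
    for path in paths:
        digest = file_digest(path)
        key = str(path)
        if state.get(key) != digest:
            changed.append(path)
            state[key] = digest
    return changed

# Demo: index once, re-run unchanged, then edit the file.
with tempfile.TemporaryDirectory() as tmp:
    note = Path(tmp) / "note.md"
    note.write_text("first draft")
    state = {}  # stand-in for the file-level state table
    first_pass = files_to_reindex([note], state)
    second_pass = files_to_reindex([note], state)
    note.write_text("second draft")
    third_pass = files_to_reindex([note], state)

print(len(first_pass), len(second_pass), len(third_pass))  # 1 0 1
```

Only files returned by the check would need to be re-chunked and re-embedded, which is the expensive step.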

What it indexes by default

Tier 1 (configured in roots, on by default):

  • Markdown / plaintext: .md, .markdown, .mdx, .txt, .log, .rst
  • PDFs — text-layer extraction via pypdf (OCR for image-only PDFs is M9)
  • Source code — 40+ languages (Python, TS/JS, Go, Rust, Java, Swift, C/C++, Ruby, …) with language tags as source_kind
  • Screenshots: .png, .jpg, .webp, .gif, .bmp. Visually searchable via MobileCLIP.

Skipped: hidden directories, .gitignore'd paths, node_modules/.venv/target/build/dist/etc., files larger than max_file_bytes (50 MiB default), symlinks (unless follow_symlinks = true).
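The skip rules above can be approximated with a small predicate. This is a sketch under stated assumptions: should_skip and EXCLUDED_DIRS are hypothetical names, the excluded set lists only the directories named above (the "etc." is not expanded), and .gitignore matching is left out entirely.

```python
import tempfile
from pathlib import Path

# Directory names taken from the list above; the real set may be longer.
EXCLUDED_DIRS = {"node_modules", ".venv", "target", "build", "dist"}
MAX_FILE_BYTES = 50 * 1024 * 1024  # 50 MiB default

def should_skip(path, root, max_file_bytes=MAX_FILE_BYTES, follow_symlinks=False):
    """True if `path` (under `root`) should be left out of the index."""
    path = Path(path)
    if path.is_symlink() and not follow_symlinks:
        return True
    # Inspect only the directories between root and the file itself.
    parent_dirs = path.relative_to(root).parts[:-1]
    if any(part.startswith(".") or part in EXCLUDED_DIRS for part in parent_dirs):
        return True
    return path.is_file() and path.stat().st_size > max_file_bytes

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "node_modules").mkdir()
    (root / ".cache").mkdir()
    keep = root / "notes.md"
    vendored = root / "node_modules" / "index.js"
    hidden = root / ".cache" / "blob.txt"
    for f in (keep, vendored, hidden):
        f.write_text("hello")
    results = {
        "notes.md": should_skip(keep, root),
        "node_modules/index.js": should_skip(vendored, root),
        ".cache/blob.txt": should_skip(hidden, root),
        "notes.md (4-byte cap)": should_skip(keep, root, max_file_bytes=4),
    }

print(results)
```

Checking the cheap conditions (symlink, directory name) before calling stat keeps the walk fast on large trees.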

Hero demos (with real weights)

$ lookback search "fluffy clouds in the sky" --modality image
Image hits
┃ score  kind        meta                    ┃
│ 0.779  screenshot  {"filename": "sky.png"} │
│ 0.926  screenshot  {"filename": "dog.png"} │

$ lookback search "transformer attention paper"
Text hits
│ 0.309  markdown  {"section": "Attention is all you need", ...} │
Image hits
│ 0.836  screenshot  {"filename": "diagram.png"} │

$ lookback search "IVF_PQ tuning" --hybrid
Text hits
│ 0.033  markdown  {"section": "IVF_PQ index tuning", ...}    # exact-keyword boost
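The --hybrid score is on a different scale from the vector-only scores because it comes from rank fusion rather than from a distance. The sketch below shows textbook Reciprocal Rank Fusion with the conventional constant k = 60; whether Lookback uses this constant is an assumption, and the file names are made up. Under that assumption, a document ranked first in both the FTS and vector lists scores 1/61 + 1/61 ≈ 0.033, which would be consistent with the value shown above.

```python
def rrf_fuse(rankings, k=60):
    """Fuse ranked lists: each doc scores sum(1 / (k + rank)), rank from 1."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Hypothetical result lists from the keyword and vector searches.
fts_hits = ["ivf_pq_tuning.md", "lance_notes.md"]
vector_hits = ["ivf_pq_tuning.md", "ann_overview.md", "lance_notes.md"]

fused = rrf_fuse([fts_hits, vector_hits])
top_doc, top_score = fused[0]
print(top_doc, round(top_score, 3))  # ivf_pq_tuning.md 0.033
```

Because RRF uses only ranks, it needs no calibration between keyword scores and vector distances.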

Design + architecture

See DESIGN.md for:

  • Lance schema (chunks_text + chunks_image + files) and the perf-guide-driven decisions
  • Embedding choices, dim selection, distance metrics
  • Per-extractor chunking strategies
  • Index types (IVF_PQ vector + bitmap/btree scalar + FTS inverted)
  • Session-by-session implementation log

License

MIT
