Extract searchable knowledge from any document. Expose it to LLMs via MCP.

punt-quarry

Local semantic search for AI agents and humans.

Quarry indexes documents in 20+ formats, embeds them with a local ONNX model (snowflake-arctic-embed-m-v1.5, 768-dim), stores vectors in LanceDB, and serves semantic search to Claude Code, Claude Desktop, and the CLI. Everything runs locally — no API keys, no cloud accounts. The embedding model (~500 MB) downloads once on first use.

Platforms: macOS, Linux

Quick Start

curl -fsSL https://raw.githubusercontent.com/punt-labs/quarry/abb5173/install.sh | sh

Restart Claude Code, then:

> /ingest report.pdf                    # index a document (runs in background)
> /quarry status                        # after a moment, confirm it's there
> /find "what does the report say about margins"   # search by meaning

Once installed, a plugin hook auto-indexes your current project directory on every session start — you don't need to /ingest your codebase manually.

Manual install (if you already have uv)

uv tool install punt-quarry
quarry install
quarry doctor

Verify before running

curl -fsSL https://raw.githubusercontent.com/punt-labs/quarry/abb5173/install.sh -o install.sh
shasum -a 256 install.sh
cat install.sh
sh install.sh

Claude Desktop

Download punt-quarry.mcpb and double-click to install. Alternatively, quarry install configures Claude Desktop automatically.

Note: Uploaded files in Claude Desktop live in a sandbox that quarry cannot access. Use remember for uploaded content, or provide local file paths to ingest.

Features

  • 20+ formats: PDFs (with OCR for scanned pages), source code (AST-aware splitting), spreadsheets, presentations, HTML, Markdown, LaTeX, DOCX, images
  • Semantic search: retrieval is by meaning, not keyword. A query about "margins" finds passages about profitability even if they never use that word
  • Daemon architecture: one quarry serve process loads the embedding model once (~200 MB RAM) and serves all Claude Code sessions via mcp-proxy over WebSocket
  • Passive knowledge capture: a SessionStart hook auto-indexes the working directory, a PostToolUse hook auto-ingests fetched URLs, and a PreCompact hook captures transcripts before context compaction
  • Named databases: isolated LanceDB directories with independent sync registries. Switch with use for work/personal separation
  • Research agent: the researcher subagent combines quarry local search with web research and auto-ingests valuable findings
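To make "retrieval is by meaning" concrete, here is a minimal sketch of similarity ranking over embedding vectors. It assumes cosine similarity as the metric, which this README does not confirm, and it uses hypothetical 3-dimensional toy vectors in place of the real 768-dimensional model output. The function names are illustrative and are not part of quarry's API.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank(query_vec, chunks):
    """Return (score, text) pairs sorted from most to least similar."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in chunks]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Hypothetical toy embeddings (assumption): a real model would produce these.
chunks = [
    ("Gross margins improved to 71% from 68% in Q2.", [0.9, 0.1, 0.2]),
    ("The office moved to a new building in March.", [0.1, 0.9, 0.3]),
]
query_vec = [0.8, 0.2, 0.1]  # stand-in for an embedded "profitability" query

results = rank(query_vec, chunks)
for score, text in results:
    print(f"{score:.4f}  {text}")
```

The query vector shares no words with either chunk; it ranks the margins passage first only because the two vectors point in a similar direction.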

What It Looks Like

Ingest a document

> /ingest report.pdf

▶ Ingesting report.pdf (background)

Check what's indexed

> /quarry

▶ Database: default
  Documents: 47
  Chunks: 1,203
  Size: 12.4 MB
  Model: snowflake-arctic-embed-m-v1.5 (768-dim)

Search by meaning

> /find "what were the Q3 revenue figures"

▶ [report.pdf p.12 | text/.pdf] (similarity: 0.4521)
  Third quarter revenue reached $142M, up 18% year-over-year,
  driven primarily by expansion in the enterprise segment.
  Gross margins improved to 71% from 68% in Q2.

Commands

Slash Commands (Claude Code)

| Command | What it does |
| --- | --- |
| /ingest <source> | Ingest a URL, directory, or file |
| /remember <name> | Ingest inline text under a document name |
| /find <query> | Semantic search. Questions get synthesized answers; keywords get raw results |
| /explain <topic> | Search and synthesize an explanation |
| /source <claim> | Find which document a claim comes from |
| /quarry [sub] | Manage: status, sync, collections, databases, registrations |

MCP Tools

| Tool | Purpose | Execution |
| --- | --- | --- |
| ingest | Index a file or URL | Background |
| remember | Index inline text | Background |
| register_directory | Register directory for sync | Background |
| sync_all_registrations | Re-index all registered directories | Background |
| find | Semantic search with filters | Sync |
| show | Document metadata or page text | Sync |
| list | Documents, collections, databases, registrations | Sync |
| status | Database statistics | Sync |
| delete | Remove document or collection | Background |
| deregister_directory | Remove registration | Background |
| use | Switch active database | Sync |

CLI

quarry ingest report.pdf                       # index a file
quarry ingest https://example.com              # index a webpage
echo "notes" | quarry remember --name notes.md # index inline text
quarry find "revenue trends"                   # semantic search
quarry list documents                          # list indexed documents
quarry register ~/Documents/notes              # register a directory for sync
quarry sync                                    # re-index registered dirs
quarry use work                                # switch database
quarry status                                  # database dashboard
quarry doctor                                  # health check
quarry serve                                   # start daemon on :8420

Setup

Quarry works with zero configuration. These environment variables are available for customization:

| Variable | Default | Description |
| --- | --- | --- |
| QUARRY_API_KEY | (none) | Bearer token for quarry serve |
| QUARRY_ROOT | ~/.punt-labs/quarry/data | Base directory for all databases |
| CHUNK_MAX_CHARS | 1800 | Max characters per chunk (~450 tokens) |
| CHUNK_OVERLAP_CHARS | 200 | Overlap between consecutive chunks |

For the full configuration reference, see Architecture section 7.
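As a rough illustration of how CHUNK_MAX_CHARS and CHUNK_OVERLAP_CHARS interact, the sketch below reads both variables with their documented defaults and applies a plain sliding window over characters. This is an assumption for illustration only: quarry's actual splitter is format-aware (for example, AST-aware for source code), and load_chunk_settings and chunk_text are hypothetical names.

```python
import os


def load_chunk_settings(env=os.environ):
    """Read chunk settings, falling back to the documented defaults."""
    max_chars = int(env.get("CHUNK_MAX_CHARS", "1800"))
    overlap = int(env.get("CHUNK_OVERLAP_CHARS", "200"))
    if not 0 <= overlap < max_chars:
        raise ValueError("overlap must be non-negative and smaller than max_chars")
    return max_chars, overlap


def chunk_text(text, max_chars, overlap):
    """Naive sliding window: each chunk repeats the last `overlap` chars of the previous one."""
    step = max_chars - overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + max_chars])
        if start + max_chars >= len(text):
            break  # the final chunk reached the end of the text
        start += step
    return chunks


max_chars, overlap = load_chunk_settings({})  # empty mapping selects the defaults
chunks = chunk_text("x" * 4000, max_chars, overlap)
print([len(c) for c in chunks])  # [1800, 1800, 800]
```

With the defaults, each new chunk starts 1,600 characters after the previous one, so a sentence that straddles a chunk boundary is likely to appear whole in at least one chunk.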

Passive Knowledge Capture

Beyond explicit /ingest and /find commands, quarry runs as a Claude Code plugin with hooks that capture knowledge automatically during your sessions:

| Hook | When it fires | What it does |
| --- | --- | --- |
| Session start | On every session start | Auto-registers your project directory and syncs it in the background. Your codebase is searchable without manual ingestion. |
| Web fetch | After any WebFetch tool call | URLs Claude fetches during research are auto-ingested into a web-captures collection. Reuses already-retrieved content when available, falls back to URL ingest otherwise. |
| Pre-compact | Before context compaction | Captures the conversation transcript into a session-notes collection. Discoveries that would be lost when the context window shrinks are preserved as searchable chunks. |

All hooks are fail-open — failures are ignored and never block Claude Code. Each hook is individually toggleable via .punt-labs/quarry/config.md YAML frontmatter. See AGENTS.md for the full integration model.
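The fail-open behavior can be sketched as a small decorator: any exception raised by a hook is caught so the caller never sees it. This is a generic illustration of the pattern, not quarry's hook code. The capture_transcript function and its path are hypothetical, and logging the failure (rather than dropping it silently) is an assumption added here.

```python
import functools
import logging

logger = logging.getLogger("hooks")


def fail_open(hook):
    """Wrap a hook so that any exception is logged and swallowed."""
    @functools.wraps(hook)
    def wrapper(*args, **kwargs):
        try:
            return hook(*args, **kwargs)
        except Exception:
            # Fail open: record the problem, but never propagate it to the caller.
            logger.exception("hook %s failed; continuing", hook.__name__)
            return None
    return wrapper


@fail_open
def capture_transcript(path):
    """Hypothetical hook body: read a transcript file from disk."""
    with open(path, encoding="utf-8") as handle:
        return handle.read()


# The file does not exist, yet the call returns None instead of raising.
result = capture_transcript("/nonexistent/transcript.txt")
print(result)
```

The trade-off of this design is that a broken hook degrades capture quietly, so a health check such as quarry doctor is the place to look when something seems to be missing.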

How It Works

Quarry runs as a daemon. Claude Code sessions connect through mcp-proxy:

                    stdio                      WebSocket
Claude Code <-----------------> mcp-proxy <---------------------> quarry serve
             MCP JSON-RPC       (~5 MB Go)                        (one daemon)

Without the proxy, every session spawns a separate Python process, each loading the embedding model into ~200 MB of RAM. With it, the daemon loads the model once, new sessions start without reloading it, and state is shared across all sessions.

quarry install downloads mcp-proxy (SHA256-verified, correct platform) and configures MCP clients.
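For readers who want to see what SHA256 verification of a download involves, here is a minimal sketch using only the Python standard library. It is not quarry's installer code: sha256_of_file and verify_download are hypothetical helper names, and the demo hashes a temporary file rather than a real mcp-proxy binary.

```python
import hashlib
import hmac
import os
import tempfile


def sha256_of_file(path, block_size=65536):
    """Stream a file through SHA-256 and return the hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()


def verify_download(path, expected_hex):
    """Return True only if the file's digest matches the expected value."""
    return hmac.compare_digest(sha256_of_file(path), expected_hex.lower())


# Demo with a throwaway file standing in for a downloaded binary.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"example payload")
    demo_path = tmp.name

expected = hashlib.sha256(b"example payload").hexdigest()
print(verify_download(demo_path, expected))   # True
print(verify_download(demo_path, "0" * 64))   # False
os.remove(demo_path)
```

This is the same check that shasum -a 256 performs on the command line; the expected digest has to come from a source you trust independently of the download itself.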

Documentation

Architecture | Z Specification | Design | Agents | Changelog

Development

uv sync                        # install dependencies
make check                     # run all quality gates (lint, type, test)
make test                      # test suite only
make format                    # auto-format code
make docs                      # build LaTeX documents

License

MIT
