Extract searchable knowledge from any document. Expose it to LLMs via MCP.
punt-quarry
Local semantic search for AI agents and humans.
Quarry indexes documents in 20+ formats, embeds them with a local ONNX model (snowflake-arctic-embed-m-v1.5, 768-dim), stores vectors in LanceDB, and serves semantic search to Claude Code, Claude Desktop, and the CLI. Everything runs locally — no API keys, no cloud accounts. The embedding model (~120 MB int8) downloads once on first use. CUDA GPUs are auto-detected for faster inference.
Platforms: macOS, Linux
Quick Start
curl -fsSL https://raw.githubusercontent.com/punt-labs/quarry/1961675/install.sh | sh
Restart Claude Code, then:
> /ingest report.pdf # index a document (runs in background)
> /quarry status # after a moment, confirm it's there
> /find "what does the report say about margins" # search by meaning
Once installed, a plugin hook auto-indexes your current project directory on every session start — you don't need to /ingest your codebase manually.
Manual install (if you already have uv)
uv tool install punt-quarry
quarry install
quarry doctor
Verify before running
curl -fsSL https://raw.githubusercontent.com/punt-labs/quarry/1961675/install.sh -o install.sh
shasum -a 256 install.sh
cat install.sh
sh install.sh
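The shasum step above only helps if you compare its output against a digest you obtained from a source you trust. A minimal Python sketch of the same check follows; the function names are illustrative and not part of quarry, and the expected digest is something you must supply yourself.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 65536) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for block in iter(lambda: handle.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def matches_expected(path: Path, expected_hex: str) -> bool:
    """Compare the file's digest with a digest obtained out of band."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, `matches_expected(Path("install.sh"), expected)` returns True only when the downloaded script hashes to the `expected` value you provide.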
Remote Server
Run quarry on a GPU server and connect from any Mac or Linux client over TLS.
Server (GPU host, serves remote clients):
export QUARRY_API_KEY=$(openssl rand -hex 32)
curl -fsSL https://raw.githubusercontent.com/punt-labs/quarry/1961675/install.sh | sh -s -- --network
With --network, the installer generates TLS certificates, binds the daemon to 0.0.0.0, registers a systemd service, and prints a CA fingerprint. NVIDIA GPUs are auto-detected for CUDA inference.
Client (connects to remote server):
curl -fsSL https://raw.githubusercontent.com/punt-labs/quarry/1961675/install.sh | sh
quarry login <server-hostname> --api-key <token>
No special flag is needed on the client: the default install runs a local daemon on localhost, and quarry login redirects queries to the remote server over wss:// with TOFU (trust on first use) certificate pinning.
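The idea behind TOFU pinning can be sketched in a few lines of Python. This is an illustration of the general technique, not quarry's implementation: the `PinStore` class, its in-memory storage, and the choice of a SHA-256 fingerprint over the raw certificate bytes are all assumptions made for the example.

```python
import hashlib


class PinStore:
    """In-memory trust-on-first-use store (hypothetical): host -> fingerprint."""

    def __init__(self) -> None:
        self._pins: dict[str, str] = {}

    def check(self, host: str, cert_der: bytes) -> str:
        """Pin the certificate on first contact, then require the same one."""
        fingerprint = hashlib.sha256(cert_der).hexdigest()
        pinned = self._pins.get(host)
        if pinned is None:
            # First connection: remember what we saw.
            self._pins[host] = fingerprint
            return "pinned"
        if pinned == fingerprint:
            return "trusted"
        # The certificate changed since first use: refuse to trust it.
        return "mismatch"
```

The first connection to a host records its fingerprint; later connections succeed only if the server presents the same certificate, which is why the fingerprint printed by the server install is worth comparing by hand on first login.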
Claude Desktop
Download punt-quarry.mcpb and double-click to install. Alternatively, quarry install configures Claude Desktop automatically.
Note: Uploaded files in Claude Desktop live in a sandbox that quarry cannot access. Use remember for uploaded content, or provide local file paths to ingest.
Features
- 20+ formats --- PDFs (with OCR for scanned pages), source code (AST-aware splitting), spreadsheets, presentations, HTML, Markdown, LaTeX, DOCX, images
- Semantic search --- retrieval is by meaning, not keyword. A query about "margins" finds passages about profitability even if they never use that word
- Daemon architecture --- one quarry serve process loads the embedding model once and serves all Claude Code sessions via mcp-proxy over WebSocket
- Passive knowledge capture --- quarry enable sets up three scoped collections per project: file sync, passive captures (web fetches + session transcripts), and per-agent memory. Captures are separated from the code index so research doesn't pollute code search
- Named databases --- isolated LanceDB directories with independent sync registries. Switch with use for work/personal separation
- Research agent --- researcher subagent combines quarry local search with web research, auto-ingests valuable findings
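To make "retrieval by meaning" concrete, here is a toy Python sketch that ranks passages by cosine similarity between vectors. It assumes cosine similarity as the metric, which this README does not state, and it uses hand-written 3-dimensional vectors in place of real 768-dimensional embeddings; the function names are illustrative.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank(query_vec: list[float], passages: dict[str, list[float]]) -> list[str]:
    """Return passage names ordered from most to least similar to the query."""
    return sorted(
        passages,
        key=lambda name: cosine_similarity(query_vec, passages[name]),
        reverse=True,
    )
```

A query vector close to the "profitability" passage ranks that passage first even though the two texts share no keywords, because only the vectors are compared.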
What It Looks Like
Ingest a document
> /ingest report.pdf
▶ Ingesting report.pdf (background)
Check what's indexed
> /quarry
▶ Database: default
Documents: 47
Chunks: 1,203
Size: 12.4 MB
Model: snowflake-arctic-embed-m-v1.5 (768-dim)
Search by meaning
> /find "what were the Q3 revenue figures"
▶ [report.pdf p.12 | text/.pdf] (similarity: 0.4521)
Third quarter revenue reached $142M, up 18% year-over-year,
driven primarily by expansion in the enterprise segment.
Gross margins improved to 71% from 68% in Q2.
Commands
Slash Commands (Claude Code)
| Command | What it does |
|---|---|
| /ingest <source> | Ingest a URL, directory, or file |
| /remember <name> | Ingest inline text under a document name |
| /find <query> | Semantic search. Questions get synthesized answers; keywords get raw results |
| /explain <topic> | Search and synthesize an explanation |
| /source <claim> | Find which document a claim comes from |
| /quarry [sub] | Manage: status, sync, collections, databases, registrations |
MCP Tools
| Tool | Purpose | Execution |
|---|---|---|
| ingest | Index a file or URL | Background |
| remember | Index inline text | Background |
| register_directory | Register directory for sync | Background |
| sync_all_registrations | Re-index all registered directories | Background |
| find | Semantic search with filters | Sync |
| show | Document metadata or page text | Sync |
| list | Documents, collections, databases, registrations | Sync |
| status | Database statistics | Sync |
| delete | Remove document or collection | Background |
| deregister_directory | Remove registration | Background |
| use | Switch active database | Sync |
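The Execution column distinguishes tools that return immediately from tools that block until a result is ready. The Python sketch below illustrates that distinction with a thread pool; it is an assumption about the general pattern, not a description of how quarry schedules work, and `run_background` and `run_sync` are hypothetical names.

```python
from concurrent.futures import Future, ThreadPoolExecutor

# A small shared pool standing in for whatever runs background jobs.
_executor = ThreadPoolExecutor(max_workers=2)


def run_background(task, *args) -> Future:
    """Queue the task and return at once; the caller may check the Future later."""
    return _executor.submit(task, *args)


def run_sync(task, *args):
    """Run the task in the calling thread and return its result directly."""
    return task(*args)
```

Under this model, an ingest-style call hands back control while indexing continues, which is why a status check shortly afterwards is the way to confirm that a document has landed.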
CLI
quarry ingest report.pdf # index a file
quarry ingest https://example.com # index a webpage
echo "notes" | quarry remember --name notes.md # index inline text
quarry find "revenue trends" # hybrid search (vector + FTS)
quarry list documents # list indexed documents
quarry register ~/Documents/notes # watch a directory
quarry sync # re-index registered dirs
quarry use work # switch database
quarry enable # set up project collections + captures
quarry disable # remove project registration + data
quarry status # database dashboard
quarry doctor # health check
quarry serve # start daemon on :8420
quarry install # set up daemon, TLS certs, mcp-proxy
# Remote connections
quarry login okinos.local --api-key <token> # TOFU login to remote server
quarry logout # disconnect, revert to local daemon
quarry remote list --ping # show remote config and health
# Agent memory tagging
quarry ingest notes.md --agent-handle claude --memory-type fact
quarry find "deployment steps" --agent-handle claude
echo "key insight" | quarry remember --name insight.md --agent-handle claude \
--memory-type observation --summary "Key insight from review"
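The quarry find comment above mentions hybrid search (vector + FTS). This README does not say how the two result lists are combined, so the Python sketch below shows reciprocal rank fusion, one common way to merge rankings, purely as an illustration. The function name and the constant k=60 are assumptions, not quarry's documented behavior.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge several ranked lists of document ids into one ranking.

    Each document scores 1 / (k + position) for every list it appears in,
    so items ranked well by both searches rise to the top.
    """
    scores: dict[str, float] = {}
    for ranking in rankings:
        for position, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + position)
    # Sort by descending score; break ties by id for a stable result.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))
```

With a vector ranking of a, b, c and a full-text ranking of b, d, a, the fused order starts with b and a, the two documents both searches agree on.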
Setup
Quarry works with zero configuration. These environment variables are available for customization:
| Variable | Default | Description |
|---|---|---|
| QUARRY_PROVIDER | (auto) | ONNX execution provider: cpu, cuda, or unset (auto-detect) |
| QUARRY_API_KEY | (none) | Bearer token for quarry serve |
| QUARRY_ROOT | ~/.punt-labs/quarry/data | Base directory for all databases |
| CHUNK_MAX_CHARS | 1800 | Max characters per chunk (~450 tokens) |
| CHUNK_OVERLAP_CHARS | 200 | Overlap between consecutive chunks |
For the full configuration reference, see Architecture section 7.
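To show what CHUNK_MAX_CHARS and CHUNK_OVERLAP_CHARS mean in practice, here is a naive sliding-window splitter in Python. It is a simplified sketch: quarry's real splitting is format-aware (for example, AST-aware for source code), and the `chunk_text` function is hypothetical.

```python
def chunk_text(text: str, max_chars: int = 1800, overlap_chars: int = 200) -> list[str]:
    """Split text into windows of at most max_chars that overlap by overlap_chars."""
    if max_chars <= 0 or not 0 <= overlap_chars < max_chars:
        raise ValueError("need max_chars > 0 and 0 <= overlap_chars < max_chars")
    step = max_chars - overlap_chars  # how far each window advances
    chunks: list[str] = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + max_chars])
        if start + max_chars >= len(text):
            break  # this window already reached the end of the text
        start += step
    return chunks
```

With the defaults, a 4,000-character document yields three chunks of 1,800, 1,800, and 800 characters, and the last 200 characters of each chunk repeat at the start of the next so that a sentence cut at a boundary still appears whole in one of them.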
Passive Knowledge Capture
Beyond explicit /ingest and /find commands, quarry runs as a Claude Code plugin with hooks that capture knowledge automatically during your sessions:
| Hook | When it fires | What it does |
|---|---|---|
| Session start | On every session start | Auto-registers your project directory and syncs it in the background. Your codebase is searchable without manual ingestion. |
| Web fetch | After any WebFetch tool call | URLs Claude fetches during research are auto-ingested into the project's <name>-captures collection (or global web-captures if no project is enabled). |
| Pre-compact | Before context compaction | Captures the conversation transcript into the project's <name>-captures collection (or global session-notes if no project is enabled). |
All hooks are fail-open — failures are ignored and never block Claude Code. Each hook is individually toggleable via .punt-labs/quarry/config.md YAML frontmatter. See AGENTS.md for the full integration model.
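Fail-open behavior can be sketched as a wrapper that swallows any exception a hook raises. The decorator below is an illustration of the pattern in Python, with hypothetical names; it is not taken from quarry's hook code.

```python
import functools
import logging

logger = logging.getLogger("hooks")


def fail_open(hook):
    """Wrap a hook so that any exception is logged and then ignored."""

    @functools.wraps(hook)
    def wrapper(*args, **kwargs):
        try:
            return hook(*args, **kwargs)
        except Exception:
            # Record the failure for debugging, but never propagate it.
            logger.debug("hook %s failed; continuing", hook.__name__, exc_info=True)
            return None

    return wrapper
```

The trade-off is that a broken hook fails silently, so a health check such as quarry doctor is the place to look when captures stop appearing.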
How It Works
Quarry runs as a daemon. Claude Code sessions connect through mcp-proxy:
                   stdio                       wss:// (TLS)
Claude Code <-----------------> mcp-proxy <---------------------> quarry serve
               MCP JSON-RPC    (~5 MB Go)     pinned CA cert      (one daemon)
Without the proxy, every session spawns a separate Python process, each loading the embedding model into ~200 MB of RAM. With it, startup is instant and state is shared across all sessions. All connections use TLS with a self-signed CA — even on localhost.
quarry install downloads mcp-proxy (SHA256-verified, correct platform) and configures MCP clients.
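For a sense of what travels over the stdio leg, the Python sketch below builds an MCP tools/call request as a single line of JSON-RPC 2.0. The envelope follows the MCP specification; the helper name and the argument key `query` for the find tool are assumptions, since this README does not list tool parameters.

```python
import json


def tools_call_request(request_id: int, tool: str, arguments: dict) -> str:
    """Serialize one MCP tools/call request as newline-terminated JSON."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    # The stdio transport frames messages with newlines, so emit compact
    # single-line JSON followed by exactly one newline.
    return json.dumps(message) + "\n"
```

A call such as `tools_call_request(1, "find", {"query": "revenue trends"})` produces the kind of line a client would write to the proxy's standard input; the proxy's job is to relay it to the daemon over the TLS WebSocket.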
Documentation
Architecture | Z Specification | Design | Agents | Changelog
Development
uv sync # install dependencies
make check # run all quality gates (lint, type, test)
make test # test suite only
make format # auto-format code
make docs # build LaTeX documents
License
MIT
File details
Details for the file punt_quarry-1.16.0.tar.gz.
File metadata
- Download URL: punt_quarry-1.16.0.tar.gz
- Size: 126.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 0b62ebee7fadbe61a5c2fc3cbfd4860616cb29ec5620a417b88dd49f8c7e1325 |
| MD5 | 1c0e197e0622342a84037321134609cb |
| BLAKE2b-256 | a8aff847eb13b7d2cde824a10dd06f7a09b2393dde8ad952bb7e8afe1c03a1f4 |
Provenance
The following attestation bundles were made for punt_quarry-1.16.0.tar.gz:
- Publisher: release.yml on punt-labs/quarry
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: punt_quarry-1.16.0.tar.gz
- Subject digest: 0b62ebee7fadbe61a5c2fc3cbfd4860616cb29ec5620a417b88dd49f8c7e1325
- Sigstore transparency entry: 1510161992
- Permalink: punt-labs/quarry@35a4b33c7014cf4f3dafb830c1d1bb52d1e26662
- Branch / Tag: refs/tags/v1.16.0
- Owner: https://github.com/punt-labs
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@35a4b33c7014cf4f3dafb830c1d1bb52d1e26662
- Trigger Event: push
File details
Details for the file punt_quarry-1.16.0-py3-none-any.whl.
File metadata
- Download URL: punt_quarry-1.16.0-py3-none-any.whl
- Size: 144.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | b817f9b3ff958bdf2d94f480e8e69ebb56f3bade910cc778a240ac83d0b28bfa |
| MD5 | 65651f3df2a4852fe4ea2427e0f25b22 |
| BLAKE2b-256 | 158f66ff84590bcc3611b35e53dfa37dace5fa1d446679246e6a34ef38d74c91 |
Provenance
The following attestation bundles were made for punt_quarry-1.16.0-py3-none-any.whl:
- Publisher: release.yml on punt-labs/quarry
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: punt_quarry-1.16.0-py3-none-any.whl
- Subject digest: b817f9b3ff958bdf2d94f480e8e69ebb56f3bade910cc778a240ac83d0b28bfa
- Sigstore transparency entry: 1510162031
- Permalink: punt-labs/quarry@35a4b33c7014cf4f3dafb830c1d1bb52d1e26662
- Branch / Tag: refs/tags/v1.16.0
- Owner: https://github.com/punt-labs
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@35a4b33c7014cf4f3dafb830c1d1bb52d1e26662
- Trigger Event: push