
Local document memory with instant semantic search. Drop any file. Ask anything. Get an answer in under a second.


vstash


Local document memory with hybrid retrieval that beats ColBERTv2 on 3/5 BEIR datasets. Single SQLite file. Zero cloud dependencies. 20.9 ms search latency at 50K chunks.

A 33M-parameter embedding model, fine-tuned with zero human labels using vstash's own hybrid-retrieval disagreement signal, surpasses ColBERTv2 (110M parameters) on SciFact, NFCorpus, and SciDocs. The model is published as Stffens/bge-small-rrf-v2.

pip install vstash
vstash add paper.pdf notes.md https://example.com/article
vstash search "what's the main argument?"

Retrieval Quality

| Dataset  | Docs | vstash (tuned) | ColBERTv2 | BM25  | vs ColBERTv2 |
|----------|------|----------------|-----------|-------|--------------|
| SciFact  | 5K   | 0.695          | 0.693     | 0.665 | +0.2%        |
| NFCorpus | 3.6K | 0.395          | 0.344     | 0.325 | +14.8%       |
| SciDocs  | 25K  | 0.188          | 0.154     | 0.158 | +21.8%       |
| FiQA     | 57K  | 0.328          | 0.356     | 0.236 | -7.8%        |
| ArguAna  | 8.7K | 0.424          | 0.463     | 0.315 | -8.4%        |

NDCG@10 on BEIR. Tuned model: Stffens/bge-small-rrf-v2 (33M params, 384d). Reproducible via python -m experiments.beir_benchmark.
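For readers unfamiliar with the metric, NDCG@10 rewards rankings that place highly relevant documents near the top. A minimal sketch of the computation (illustrative only, not the benchmark code in experiments.beir_benchmark):

```python
import math

def ndcg_at_10(ranked_rels, all_rels):
    """NDCG@10: DCG of the top-10 ranked relevances, normalized by the
    DCG of an ideal (descending) ordering of all judged relevances."""
    dcg = sum(rel / math.log2(i + 2) for i, rel in enumerate(ranked_rels[:10]))
    ideal = sorted(all_rels, reverse=True)[:10]
    idcg = sum(rel / math.log2(i + 2) for i, rel in enumerate(ideal))
    return dcg / idcg if idcg else 0.0

# A perfect ranking scores 1.0; demoting a relevant hit lowers the score.
```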


How It Works

Query --> Embed --+--> Vector ANN (sqlite-vec) --+
                  |                              +--> Adaptive RRF --> MMR Dedup --> Results
                  +--> FTS5 BM25 ----------------+
  1. Hybrid search: vector similarity + keyword matching, fused via Reciprocal Rank Fusion
  2. Adaptive RRF: IDF-based per-query weights. Rare terms boost keywords, common terms boost vectors. +21.4% on ArguAna
  3. MMR dedup: diverse sections from long documents surface instead of redundant chunks
  4. Self-tuned embedding: vstash retrain fine-tunes your embedding model using disagreements between vector and keyword search. Zero labels needed
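The fusion step can be pictured as plain Reciprocal Rank Fusion; the adaptive variant in step 2 would set the weights per query instead of the fixed w_vec/w_kw below. This is a generic sketch of the technique, not vstash's internals:

```python
def rrf_fuse(vector_ranking, keyword_ranking, k=60, w_vec=1.0, w_kw=1.0):
    """Fuse two ranked lists of doc ids: score(d) = sum_i w_i / (k + rank_i(d)).
    k=60 is the conventional RRF smoothing constant."""
    scores = {}
    for weight, ranking in ((w_vec, vector_ranking), (w_kw, keyword_ranking)):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + weight / (k + rank)
    # Docs that appear high in both lists dominate the fused ranking.
    return sorted(scores, key=scores.get, reverse=True)

fused = rrf_fuse(["a", "b", "c"], ["b", "d", "a"])  # -> ["b", "a", "d", "c"]
```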

Install

pip install vstash                    # SDK + search
pip install 'vstash[ingest]'          # + PDF, DOCX, PPTX parsing
pip install 'vstash[serve]'           # + web UI (vstash serve)
pip install 'vstash[all]'             # everything

Quick Start

# Search (free, no API key)
vstash add report.pdf ~/notes/ https://arxiv.org/abs/2310.06825
vstash search "what is the proposed method?"

# Ask (needs a local LLM -- auto-detects Ollama, LM Studio)
vstash ask "summarize the key findings"
vstash chat                           # interactive session

# Fine-tune on your own data
vstash retrain                        # generates training data from your corpus, trains locally
vstash reindex --model ~/.vstash/models/retrained

Python SDK

from vstash import Memory

mem = Memory(project="my_agent")
mem.add("docs/spec.pdf")
mem.remember("OAuth uses PKCE for public clients", title="auth-notes")

results = mem.search("deployment strategy", top_k=5)
for r in results:
    print(r.text, r.score, r.collection, r.tags, r.added_at)

answer = mem.ask("What are the system requirements?")

Commands

vstash add <file/dir/url>    Add documents to memory
vstash remember "<text>"     Ingest text directly
vstash search "<query>"      Semantic search (free, local)
vstash ask "<question>"      Answer from your documents (needs LLM)
vstash chat                  Interactive Q&A
vstash list                  Show all documents
vstash stats                 Memory statistics
vstash forget <file>         Remove a document
vstash retrain               Fine-tune embeddings on your data
vstash reindex               Re-embed with a new model
vstash watch <dir>           Auto-ingest on file changes
vstash serve                 Web UI on localhost
vstash check [--repair]      Integrity check and repair
vstash config                Show configuration
vstash profile <cmd>         Manage named profiles
vstash journal <cmd>         Cross-session agent memory

MCP Server

16 tools for Claude Desktop, Claude Code, Cursor, or any MCP client:

vstash-mcp                            # start MCP server

Or register it in your MCP client's configuration:

{
  "mcpServers": {
    "vstash": {
      "command": "vstash-mcp"
    }
  }
}

Self-Supervised Embedding Refinement

vstash can improve its own embedding model by exploiting disagreements between vector and keyword search:

vstash retrain                        # 1. Generate training pairs from your corpus
                                      # 2. Fine-tune with MNRL (needs sentence-transformers)
vstash reindex --model ~/.vstash/models/retrained  # 3. Apply the improved model

82% of queries produce disagreement between vector and FTS search. These disagreements are free training signal. The published model (Stffens/bge-small-rrf-v2) was trained this way: 76K triples, zero human labels, 30 min on a T4 GPU.

Results: +7.4% NDCG on SciFact, +19.5% on NFCorpus, +5.5% on SciDocs. The 33M-parameter model surpasses the 110M-parameter ColBERTv2 on 3/5 BEIR datasets.
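The pair-mining step can be sketched roughly as follows; mine_disagreements, vector_search, and keyword_search are hypothetical names for illustration, not vstash's actual API:

```python
def mine_disagreements(queries, vector_search, keyword_search, top_k=10):
    """Sketch of disagreement mining: when the top keyword hit is absent from
    the vector top-k, emit an (anchor, positive, hard negative) triple of the
    kind used for MultipleNegativesRankingLoss fine-tuning."""
    triples = []
    for query in queries:
        vec_ids = vector_search(query, top_k)   # ranked doc ids from ANN search
        kw_ids = keyword_search(query, top_k)   # ranked doc ids from BM25/FTS5
        if kw_ids and vec_ids and kw_ids[0] not in vec_ids:
            triples.append((query, kw_ids[0], vec_ids[0]))
    return triples
```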


Privacy

| Component                    | Data leaves machine?              |
|------------------------------|-----------------------------------|
| Embeddings (FastEmbed)       | Never                             |
| Search (sqlite-vec + FTS5)   | Never                             |
| Inference (Ollama/LM Studio) | Never                             |
| Inference (Cerebras/OpenAI)  | Yes (query + context sent to API) |

Search is always private. Use a local LLM for fully private answers.


Paper

vstash: Local-First Hybrid Retrieval with Adaptive Fusion for LLM Agents

Four contributions: adaptive RRF, self-supervised embedding refinement, a negative result on post-RRF scoring, and a production substrate. LaTeX version at paper/arxiv/vstash.tex.


Documentation

| Guide            | Description                      |
|------------------|----------------------------------|
| How It Works     | Search pipeline, chunking, RRF   |
| Configuration    | Full TOML reference              |
| Embedding Models | Model comparison, vstash retrain |
| MCP Server       | 16 tools for LLM agents          |
| Experiments      | BEIR benchmarks, ablations       |

Experiments

| Experiment          | Key result                     | Command                                      |
|---------------------|--------------------------------|----------------------------------------------|
| BEIR Benchmark      | Beats ColBERTv2 on 3/5 datasets | python -m experiments.beir_benchmark        |
| Embedding Fine-tune | +7.4% NDCG, zero labels        | python -m experiments.finetune_rrf           |
| Scale Benchmark     | 20.9 ms at 50K chunks          | python -m experiments.scale_benchmark        |
| Relevance Signal    | F1=0.996 cross-domain          | python -m experiments.relevance_signal_beir  |

What's New in v0.28

  • vstash retrain: fine-tune embeddings on your own data using hybrid retrieval disagreement
  • Stffens/bge-small-rrf-v2: published embedding model (+7.4% SciFact, +19.5% NFCorpus)
  • SearchResult.added_at/collection/tags/layer: full metadata on search hits
  • add_documents_batch(): bulk ingest in single transaction
  • Embedder provenance: embedding_model stamped on fresh stores
  • Search 32% faster: MMR cache, batch expand_context, norm precompute

See CHANGELOG for full version history.

Download files

Download the file for your platform.

Source Distribution

vstash-0.29.0.tar.gz (490.3 kB)

Uploaded Source

Built Distribution


vstash-0.29.0-py3-none-any.whl (142.2 kB)

Uploaded Python 3

File details

Details for the file vstash-0.29.0.tar.gz.

File metadata

  • Download URL: vstash-0.29.0.tar.gz
  • Upload date:
  • Size: 490.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for vstash-0.29.0.tar.gz

| Algorithm   | Hash digest                                                      |
|-------------|------------------------------------------------------------------|
| SHA256      | a332e47116f4a574585664764d6901a529ace58c9daccc7a1ad7dfc21c3bc289 |
| MD5         | 10f9125c3675f5fbdcd2bda035da0b9e                                 |
| BLAKE2b-256 | 9c43f9154306320e21793f7175806e5dd9f6ac043c065d16c52d95bebc2747f8 |

See more details on using hashes here.

Provenance

The following attestation bundles were made for vstash-0.29.0.tar.gz:

Publisher: publish.yml on stffns/vstash

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file vstash-0.29.0-py3-none-any.whl.

File metadata

  • Download URL: vstash-0.29.0-py3-none-any.whl
  • Upload date:
  • Size: 142.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for vstash-0.29.0-py3-none-any.whl

| Algorithm   | Hash digest                                                      |
|-------------|------------------------------------------------------------------|
| SHA256      | 512d6571593a73635d2fc2e32cfcc912b27d6155f78e4a850c26023c25e12358 |
| MD5         | 208b8d54a879bebcb5f514e01543d97c                                 |
| BLAKE2b-256 | 0aa58515429fe4dee6253e25e02fda4e61feae6cc9a0e3052f4a5aaf93c34cfc |

See more details on using hashes here.

Provenance

The following attestation bundles were made for vstash-0.29.0-py3-none-any.whl:

Publisher: publish.yml on stffns/vstash

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
