vstash
Local document memory with instant semantic search.
Drop any file. Ask anything. Get an answer in under a second.
vstash add paper.pdf notes.md https://example.com/article
vstash ask "what's the main argument about X?"
vstash chat
Why vstash?
Most RAG tools are slow, cloud-dependent, or require a running server. vstash is none of those things.
| Layer | Technology | Why |
|---|---|---|
| Embeddings | FastEmbed (ONNX Runtime) | ~700 chunks/s, fully in-process, no server |
| Vector store | sqlite-vec | Single .db file, cosine similarity, zero deps |
| Keyword search | FTS5 (SQLite) | Exact matches, porter stemming, built into SQLite |
| Hybrid ranking | Reciprocal Rank Fusion | Best of both: semantic + keyword, no training needed |
| Inference | Cerebras API / Ollama / OpenAI | ~2,000 tok/s via Cerebras, or 100% local via Ollama |
| Parsing | markitdown | PDF, DOCX, PPTX, XLSX, HTML, Markdown, URLs |
Philosophy: extreme speed at every layer.
Install
pip install vstash
Or from source:
git clone https://github.com/stffns/vstash
cd vstash
pip install -e .
Quick start
1. Configure your backend (copy the example):
cp vstash.toml.example vstash.toml
# Edit: set your Cerebras API key, or switch to ollama
Or just set the env var:
export CEREBRAS_API_KEY=your_key_here
2. Add documents:
vstash add report.pdf
vstash add ~/docs/notes.md
vstash add https://arxiv.org/abs/2310.06825
vstash add ./my-project/ # entire directory, recursive
3. Ask questions:
vstash ask "what is the proposed method?"
vstash ask "summarize the key findings"
vstash ask "what are the limitations?"
4. Chat interactively:
vstash chat
Commands
| Command | Description |
|---|---|
| vstash add <file/dir/url> | Add documents to memory (PDF, DOCX, PPTX, MD, TXT, code, URLs) |
| vstash ask "<question>" | Answer a question from your documents |
| vstash chat | Interactive Q&A session |
| vstash list | Show all documents in memory |
| vstash stats | Memory statistics (docs, chunks, DB size) |
| vstash forget <file> | Remove a document from memory |
| vstash config | Show current configuration |
| vstash-mcp | Start the MCP server (for Claude Desktop integration) |
Options for vstash ask:
| Option | Description |
|---|---|
| --top-k INT | Number of chunks to retrieve (default: from config) |
| --sources / --no-sources | Show source citations (default: show) |
| --stream / --no-stream | Stream the response token by token (default: stream) |
MCP Server — Claude Desktop Integration
vstash includes a built-in MCP server that gives Claude Desktop persistent document memory across sessions.
Setup
1. Install vstash:
pip install vstash
2. Add to Claude Desktop config (~/Library/Application Support/Claude/claude_desktop_config.json):
{
"mcpServers": {
"vstash": {
"command": "vstash-mcp"
}
}
}
💡 If using pyenv, use the full path: "command": "/path/to/.pyenv/versions/3.x.x/bin/vstash-mcp"
3. Restart Claude Desktop.
Available MCP Tools
| Tool | Description |
|---|---|
| vstash_add(path) | Ingest a file, directory, or URL into memory |
| vstash_ask(query, top_k) | Semantic search → LLM-generated answer with sources |
| vstash_search(query, top_k) | Raw retrieval without the LLM; returns chunks with scores |
| vstash_list() | List all ingested documents |
| vstash_stats() | Database statistics (doc count, chunks, size) |
| vstash_forget(source) | Remove a document from memory |
Example
Once configured, Claude can use vstash directly:
User: What did my research notes say about transformer attention?
Claude: [calls vstash_ask("transformer attention mechanisms")]
Based on your notes in paper.pdf...
⚠️ Make sure your ~/.vstash/vstash.toml includes the API key under [cerebras] (or your chosen backend), since MCP servers don't inherit shell environment variables.
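Outside Claude Desktop, any MCP-capable client can drive the same server. Below is a minimal sketch using the official mcp Python SDK, shown only to illustrate the tool interface; the SDK usage here is an assumption and is not part of vstash itself.

```python
# Illustrative MCP client call (assumes `pip install mcp`); the tool names
# match the table above, but this snippet is not shipped with vstash.
import asyncio

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client


async def main() -> None:
    params = StdioServerParameters(command="vstash-mcp")
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            result = await session.call_tool(
                "vstash_ask",
                {"query": "transformer attention mechanisms", "top_k": 5},
            )
            print(result.content)


asyncio.run(main())
```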
Configuration
vstash looks for vstash.toml in your current directory, then ~/.vstash/vstash.toml, then falls back to defaults.
[inference]
backend = "cerebras" # "cerebras" | "ollama" | "openai"
model = "llama3.1-8b"
[cerebras]
api_key = "" # or set CEREBRAS_API_KEY env var
[ollama]
host = "http://localhost:11434"
model = "llama3.2"
[embeddings]
model = "BAAI/bge-small-en-v1.5" # 384 dims, ~700 chunks/s
[chunking]
size = 1024 # tokens per chunk
overlap = 128 # token overlap between chunks
top_k = 5 # chunks retrieved per query
[storage]
db_path = "~/.vstash/memory.db"
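That lookup order is easy to picture in code. Here is a rough sketch using the standard-library tomllib; it is illustrative only, not vstash's actual loader.

```python
# Sketch of the config lookup order described above (illustrative only).
import tomllib
from pathlib import Path

DEFAULTS = {"inference": {"backend": "cerebras", "model": "llama3.1-8b"}}


def load_config() -> dict:
    candidates = [Path("vstash.toml"), Path.home() / ".vstash" / "vstash.toml"]
    for path in candidates:
        if path.is_file():
            with path.open("rb") as fh:
                return tomllib.load(fh)
    return DEFAULTS  # no file found: fall back to defaults


print(load_config().get("inference", {}).get("backend"))
```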
Embedding models
| Model | Dims | Speed | Quality |
|---|---|---|---|
| BAAI/bge-small-en-v1.5 | 384 | ~700 chunks/s | ★★★★☆ |
| BAAI/bge-base-en-v1.5 | 768 | ~300 chunks/s | ★★★★★ |
| nomic-ai/nomic-embed-text-v1.5 | 768 | ~300 chunks/s | ★★★★★ |
⚠️ Changing the embedding model requires re-ingesting all documents (dimensions must match).
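To get a feel for how a model choice plays out, here is roughly how FastEmbed is driven; this is an illustrative snippet, vstash wraps this step internally.

```python
# Illustrative: local, in-process embedding with FastEmbed (ONNX Runtime).
from fastembed import TextEmbedding

model = TextEmbedding(model_name="BAAI/bge-small-en-v1.5")  # swap the model here
vectors = list(model.embed(["first chunk of text", "second chunk of text"]))
print(len(vectors), len(vectors[0]))  # 2 vectors, 384 dims for bge-small
```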
How it works
Ingestion pipeline
file/URL
→ markitdown (parse to plain text)
→ tiktoken (count tokens)
→ chunk_text() (1024 tok / 128 overlap)
→ FastEmbed ONNX (embed each chunk, ~700 chunks/s)
→ sqlite-vec (store vectors)
→ FTS5 (index text for keyword search)
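The chunking step is the only part with tunable knobs (see [chunking] above). A minimal sketch of token-window chunking with overlap follows; the cl100k_base encoding name is an assumption, and this is not vstash's exact chunk_text().

```python
# Token-window chunking with overlap, mirroring the 1024/128 defaults.
# The cl100k_base encoding is an assumption, not necessarily what vstash uses.
import tiktoken


def chunk_text(text: str, size: int = 1024, overlap: int = 128) -> list[str]:
    enc = tiktoken.get_encoding("cl100k_base")
    tokens = enc.encode(text)
    chunks, step = [], size - overlap
    for start in range(0, max(len(tokens), 1), step):
        chunks.append(enc.decode(tokens[start:start + size]))
        if start + size >= len(tokens):  # last window reached the end of the text
            break
    return chunks
```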
Search pipeline
query
→ FastEmbed ONNX (embed query)
→ sqlite-vec (top-k×10 vector candidates by cosine similarity)
→ FTS5 (top-k×10 keyword candidates by BM25)
→ RRF (merge rankings: score = Σ 1/(60+rank))
→ top-k results (default: 5 chunks)
→ LLM (Cerebras, Ollama, or OpenAI)
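The two candidate retrievals are plain SQL against the same .db file. A rough sketch is below; the table and column names are assumptions for illustration, not vstash's actual schema.

```python
# Rough sketch of the two candidate retrievals (schema names are assumptions).
import sqlite3
import sqlite_vec

db = sqlite3.connect("memory.db")
db.enable_load_extension(True)
sqlite_vec.load(db)  # load the sqlite-vec extension
db.enable_load_extension(False)

top_k = 5
query_vec = sqlite_vec.serialize_float32([0.0] * 384)  # placeholder embedded query

# Vector candidates: nearest neighbours from the sqlite-vec virtual table.
vec_rows = db.execute(
    "SELECT rowid, distance FROM vec_chunks WHERE embedding MATCH ? "
    "ORDER BY distance LIMIT ?",
    (query_vec, top_k * 10),
).fetchall()

# Keyword candidates: BM25-ranked FTS5 match on the raw chunk text.
fts_rows = db.execute(
    "SELECT rowid, rank FROM chunks_fts WHERE chunks_fts MATCH ? "
    "ORDER BY rank LIMIT ?",
    ("cerebras API", top_k * 10),
).fetchall()
```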
Reciprocal Rank Fusion (k=60, vec_weight=0.6, fts_weight=0.4) ensures that:
- Semantic queries ("fast inference approach") find conceptually related chunks
- Exact keyword queries ("Cerebras API") are never missed due to embedding drift
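A compact sketch of the weighted fusion itself, matching the k=60 and 0.6/0.4 weights above (illustrative, not vstash's exact code):

```python
# Weighted Reciprocal Rank Fusion over the two ranked candidate lists.
def rrf_merge(
    vec_ranked: list[str],
    fts_ranked: list[str],
    k: int = 60,
    vec_weight: float = 0.6,
    fts_weight: float = 0.4,
) -> list[str]:
    scores: dict[str, float] = {}
    for rank, chunk_id in enumerate(vec_ranked, start=1):
        scores[chunk_id] = scores.get(chunk_id, 0.0) + vec_weight / (k + rank)
    for rank, chunk_id in enumerate(fts_ranked, start=1):
        scores[chunk_id] = scores.get(chunk_id, 0.0) + fts_weight / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)  # best first


print(rrf_merge(["c3", "c1", "c7"], ["c1", "c9"])[:5])
```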
Privacy
| Component | Data leaves machine? |
|---|---|
| Embeddings (FastEmbed) | Never — fully local ONNX |
| Vector store (sqlite-vec) | Never — local .db file |
| Inference (Cerebras/OpenAI) | Yes — query + retrieved chunks sent to API |
| Inference (Ollama) | Never — fully local |
If privacy is paramount, use backend = "ollama" in your config.
Supported file types
PDF, DOCX, PPTX, XLSX, Markdown, TXT, HTML, CSV, Python, JavaScript, TypeScript, Go, Rust, Java — and any URL.
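Everything above is normalized to plain text by markitdown before chunking; roughly like this (illustrative, not vstash's internal code):

```python
# Illustrative: markitdown converts any supported format (or URL) to text.
from markitdown import MarkItDown

md = MarkItDown()
result = md.convert("report.pdf")   # also handles .docx, .pptx, .xlsx, .html, URLs
print(result.text_content[:200])    # plain text that then gets chunked and embedded
```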
Roadmap
- Phase 1 ✅: Core pipeline — ingest, embed, search, answer
- Phase 2 ✅: MCP server — Claude Desktop integration with persistent memory
- Phase 3: Watch mode (auto-ingest on file change), vstash export, JSON output
- Phase 4: Multi-agent sync via cr-sqlite (CRDT peer-to-peer, no server required)
Easter Egg 🥚
In a 2018 Cornell paper "Local Homology of Word Embeddings", researchers used the variable $v_{stash}$ (p. 11) to refer to the "vector of the word stash" — making this the first documented use of the exact term in the context of AI/embeddings.
License
MIT
File details
Details for the file vstash-0.2.5.tar.gz.
File metadata
- Download URL: vstash-0.2.5.tar.gz
- Upload date:
- Size: 919.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.1
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 5455091ca23ff7f177bb9fc4c67f9c6eebaabab0355aabb33d9dd8bcd6d59596 |
| MD5 | 88c1e9fef911787ec247cd5ee50496df |
| BLAKE2b-256 | 3056961f3ee13f9f0ed53c4cf3ee32fd90af4085990ca13083df9b2354c09e28 |
File details
Details for the file vstash-0.2.5-py3-none-any.whl.
File metadata
- Download URL: vstash-0.2.5-py3-none-any.whl
- Upload date:
- Size: 37.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.1
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | dab9e7df9468638302b9bbd5faf14492bde32bfa2752adfd7436c9eb07c42660 |
| MD5 | ca87a7df641a9374c04323095c2d39cd |
| BLAKE2b-256 | 84e5e471633eb634556e57b46e89b61572613d0074b12cdbc9bcf5b0ee908f89 |