
GPU-accelerated RAG module for vaultspec vault search


vaultspec-rag



Semantic search for your vaultspec vault and project codebase

vaultspec-rag adds GPU-accelerated search to projects managed by vaultspec-core. It indexes your .vault/ documents -- research notes, architecture decisions, plans, execution logs -- alongside your source code. Query both with natural language so your AI tools find relevant context on their own.


Getting started

Prerequisites

  • Python 3.13 or later
  • uv
  • A CUDA GPU with at least 3 GB VRAM (mandatory -- no CPU fallback)
  • vaultspec-core

Install

uv add vaultspec-rag
uv run vaultspec-rag install

The first command pulls in vaultspec-core and all GPU dependencies. The second seeds vaultspec-rag's bundled rule/MCP files into the workspace and patches your pyproject.toml with the cu130 torch index so uv resolves the CUDA torch wheel on Linux and Windows (macOS is left on PyPI torch). You'll be prompted before the pyproject.toml edit; pass --yes to skip the prompt (required in non-TTY contexts) or --no-torch-config to opt out. Add --sync to run uv sync --reinstall-package torch automatically after the patch.

Flag precedence: --no-torch-config always wins; the patch is not applied regardless of --force or --yes. --force is a blanket opt-in and implies --yes for the torch-config prompt. On a non-TTY without --yes or --force, the patch is skipped with a warning and the command exits non-zero (code 2) so CI fails loudly. The interactive prompt defaults to no: pressing Enter without typing anything declines.
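The precedence rules above can be read as a small decision function. The following is a minimal sketch of that logic, not vaultspec-rag's actual implementation: the names PatchDecision and decide_torch_patch are hypothetical, the exit code 0 for every case other than the non-TTY skip is an assumption, and so is the set of answers ("y", "yes") treated as consent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PatchDecision:
    apply_patch: bool
    exit_code: int
    reason: str


def decide_torch_patch(
    *,
    no_torch_config: bool = False,
    force: bool = False,
    yes: bool = False,
    is_tty: bool = True,
    prompt_answer: str = "",
) -> PatchDecision:
    """Model the documented precedence for the pyproject.toml torch patch."""
    # --no-torch-config always wins, even against --force / --yes.
    if no_torch_config:
        return PatchDecision(False, 0, "opted out with --no-torch-config")
    # --force implies --yes for the torch-config prompt.
    if yes or force:
        return PatchDecision(True, 0, "confirmed by flag")
    # Non-interactive run without consent: skip the patch and exit with code 2.
    if not is_tty:
        return PatchDecision(False, 2, "non-TTY without --yes or --force")
    # The interactive prompt defaults to "no": empty input declines.
    # Assumption: only "y" / "yes" count as consent.
    accepted = prompt_answer.strip().lower() in {"y", "yes"}
    reason = "accepted at prompt" if accepted else "declined at prompt"
    return PatchDecision(accepted, 0, reason)
```

For example, decide_torch_patch(is_tty=False) models a CI run without flags: the patch is skipped and the exit code is 2.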

After install, run vaultspec-rag --version and then vaultspec-rag index as usual.

Manual cu130 configuration

If you'd rather configure the cu130 torch index by hand (air-gapped environments, custom resolvers), add the following to your pyproject.toml. The snippet is byte-for-byte identical to what vaultspec-rag install writes and to what the CPU-only error message displays, so all three surfaces stay in lockstep:

[[tool.uv.index]]
name = "pytorch-cu130"
url = "https://download.pytorch.org/whl/cu130"
explicit = true

[tool.uv.sources]
torch = [{ index = "pytorch-cu130", marker = "sys_platform == 'linux' or sys_platform == 'win32'" }]

# uv ignores [tool.uv.sources] for purely-transitive deps.
# Add torch as a direct dep too, e.g. in [project].dependencies
# or [dependency-groups].dev:  "torch>=2.4"

The trailing comment is significant: uv silently ignores [tool.uv.sources] entries for purely-transitive packages, so the source pin only takes effect once torch appears in your own dependency lists. Add it to either [project].dependencies or [dependency-groups].dev:

[dependency-groups]
dev = [
    "torch>=2.4",
]

Then run uv lock --refresh-package torch && uv sync. The lockfile entry for torch should show source = { registry = "https://download.pytorch.org/whl/cu130" } (not pypi.org/simple); if it still resolves from PyPI, the direct-dep step was missed. [tool.uv.sources] declarations in a dependency's own pyproject.toml do not propagate to consumers, which is why this step is necessary.

Troubleshooting: "PyTorch was installed without CUDA support"

If vaultspec-rag index reports the CPU-only wheel on a machine with a GPU, uv resolved torch from PyPI (which only ships CPU wheels on Linux/Windows). There are three failure modes that all surface the same error; check them in order:

  • Patch isn't applied. Run vaultspec-rag install (or paste the manual snippet above), then uv sync --reinstall-package torch.
  • Patch is applied but torch is not a direct dep. uv ignores [tool.uv.sources] for purely-transitive packages, so the cu130 pin is a no-op until you add torch>=2.4 to [project].dependencies or [dependency-groups].dev (see the Manual section above). After adding it, run uv lock --refresh-package torch && uv sync.
  • Patch is applied, torch is a direct dep, but resolution still picks the CPU wheel. Your uv.lock is stale. Run uv lock --refresh-package torch && uv sync to force a re-resolve. Inspect uv.lock afterwards: the torch entry should read source = { registry = "https://download.pytorch.org/whl/cu130" }.

The No CUDA GPU detected error is reserved for the genuinely GPU-less case (driver missing, headless VM without a device, etc.).

Verify

vaultspec-rag --version

Index and search

vaultspec-rag indexes two sources: vault (.vault/ documents) and code (project source files).

vaultspec-rag index                          # both
vaultspec-rag index --type vault             # vault only
vaultspec-rag index --type code              # code only

vaultspec-rag search "architecture decision"
vaultspec-rag search --type code "error handling"

Using the MCP server

The Model Context Protocol (MCP) server gives AI assistants direct access to vault and codebase search. It runs in two transport modes with different project-resolution rules.

stdio mode -- one process per project. The MCP client launches vaultspec-search-mcp as a subprocess, scoped to a single workspace via VAULTSPEC_RAG_ROOT. Use this for Claude Desktop, Claude Code, and similar single-project AI tools.

{
  "mcpServers": {
    "vaultspec-rag": {
      "command": "vaultspec-search-mcp",
      "env": {
        "VAULTSPEC_RAG_ROOT": "/path/to/your/project"
      }
    }
  }
}
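If you manage several workspaces, the stdio entry above can be generated rather than edited by hand. A minimal sketch follows; stdio_mcp_config is a hypothetical helper, and resolving the root to an absolute path is an assumption about what works most reliably, not a documented requirement.

```python
import json
from pathlib import Path


def stdio_mcp_config(project_root: str) -> str:
    """Render the stdio MCP client entry for one workspace as JSON."""
    # Assumption: an absolute path avoids surprises when the MCP client
    # launches the subprocess from a different working directory.
    root = Path(project_root).expanduser().resolve()
    config = {
        "mcpServers": {
            "vaultspec-rag": {
                "command": "vaultspec-search-mcp",
                "env": {"VAULTSPEC_RAG_ROOT": str(root)},
            }
        }
    }
    return json.dumps(config, indent=2)


print(stdio_mcp_config("."))
```

Where the resulting JSON belongs depends on the MCP client; consult that client's documentation for the location of its configuration file.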

HTTP mode -- one daemon, many projects. Start the background daemon with vaultspec-rag server service start, then connect any MCP client to http://127.0.0.1:8766/mcp. The daemon has no default project; every tool call must include project_root. Use this to share one GPU-loaded service across workspaces.
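The project_root requirement is easiest to see in the shape of a tool call. The sketch below builds the JSON-RPC 2.0 message that MCP uses for tools/call without sending it. The tool name "search" and the "query" argument are hypothetical placeholders (see the MCP integration reference for the real tool list), and tool_call is an illustrative helper; only project_root comes from this document.

```python
import json

DAEMON_URL = "http://127.0.0.1:8766/mcp"


def tool_call(tool: str, project_root: str, request_id: int = 1, **arguments: object) -> dict:
    """Build an MCP tools/call request that always carries project_root."""
    # The HTTP daemon has no default project, so refuse to build a call without one.
    if not project_root:
        raise ValueError("project_root is required in HTTP mode")
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": tool,
            "arguments": {"project_root": project_root, **arguments},
        },
    }


# "search" and "query" are assumed names used only for illustration.
payload = tool_call("search", "/path/to/your/project", query="architecture decision")
print(json.dumps(payload, indent=2))
```

In practice an MCP client library handles session initialization and transport details before any tools/call message reaches DAEMON_URL; the point here is only that each call names its own project.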

See the MCP integration reference for the full tool list, the contracts of both modes, and guidance on choosing between them.


Further reading

Guide                  What it covers
Usage modes            Ad-hoc vs. service operation
CLI commands           Command tree, flags, --port fast path
Configuration          Precedence, environment variables, .vaultragignore
Service management     Background daemon, health endpoint, model warmup
Python API             Facade functions for programmatic use
Architecture overview  Access layers, GPU lifecycle, multi-project support
Models                 Embedding stack and model cards

Getting help

Open an issue on GitHub.


Contributing and license

Contributions welcome -- bug reports, feature ideas, or pull requests. vaultspec-rag uses the MIT License.
