
GPU-accelerated RAG module for vaultspec vault search



vaultspec-rag



Semantic search for your vaultspec vault and project codebase

vaultspec-rag adds GPU-accelerated search to projects managed by vaultspec-core. It indexes your .vault/ documents -- research notes, architecture decisions, plans, execution logs -- alongside your source code. Query both with natural language so your AI tools find relevant context on their own.


Getting started

Prerequisites

  • Python 3.13 or later
  • uv
  • A CUDA GPU with at least 3 GB VRAM (mandatory -- no CPU fallback)
  • vaultspec-core

Install

uv add vaultspec-rag
uv run vaultspec-rag install

The first command pulls in vaultspec-core and all GPU dependencies. The second seeds vaultspec-rag's bundled rule/MCP files into the workspace and updates your pyproject.toml so uv can resolve the cu130 CUDA torch wheel on Linux and Windows (macOS is left on PyPI torch).

After confirmation, install writes the canonical cu130 [[tool.uv.index]] / [tool.uv.sources] block and adds torch>=2.4 to [project].dependencies when no recognized direct dependency already exists. Auto-managed entries are marked with [tool.vaultspec-rag] managed-torch-direct-dependency = true so uninstall can remove only the dependency it owns.

You'll be prompted before the pyproject.toml edit; pass --yes to skip the prompt (required in non-TTY contexts) or --no-torch-config to opt out. Add --sync to run uv sync --reinstall-package torch automatically once the patch and direct dependency are present.

Flag precedence:

  • --no-torch-config always wins: the patch is not applied regardless of --force / --yes.
  • --force is the user's blanket opt-in — it implies --yes for the torch-config prompt.
  • On a non-TTY without --yes or --force, the patch is skipped with a warning and the command exits non-zero (code 2) so CI fails loudly.
  • The interactive prompt defaults to no: hitting Enter without typing declines.
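The precedence rules above can be sketched as a small decision function. This is a hypothetical mirror of the documented behavior, not the CLI's actual internals:

```python
def torch_config_action(no_torch_config: bool, force: bool, yes: bool,
                        is_tty: bool, prompt_answer: str = "") -> tuple[str, int]:
    """Return (action, exit_code) per the documented flag precedence."""
    if no_torch_config:
        # --no-torch-config always wins, regardless of --force / --yes.
        return ("skip", 0)
    if force or yes:
        # --force implies --yes for the torch-config prompt.
        return ("patch", 0)
    if not is_tty:
        # Non-TTY without consent: skip with a warning, exit code 2.
        return ("skip", 2)
    # Interactive prompt defaults to "no": plain Enter declines.
    if prompt_answer.strip().lower() in ("y", "yes"):
        return ("patch", 0)
    return ("skip", 0)
```

For example, a CI job running without --yes lands in the non-TTY branch and fails with exit code 2.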

After install, run vaultspec-rag --version and then vaultspec-rag index as usual.

Manual cu130 configuration

If you'd rather configure the cu130 torch index by hand (air-gapped environments, custom resolvers, or --no-torch-config), add the following to your pyproject.toml. This is the canonical cu130 block vaultspec-rag install writes, with an educational comment showing the required direct dependency:

[[tool.uv.index]]
name = "pytorch-cu130"
url = "https://download.pytorch.org/whl/cu130"
explicit = true

[tool.uv.sources]
torch = [{ index = "pytorch-cu130", marker = "sys_platform == 'linux' or sys_platform == 'win32'" }]

# uv ignores [tool.uv.sources] for purely-transitive deps.
# Add torch as a direct dep too, e.g. in [project].dependencies
# or [dependency-groups].dev:  "torch>=2.4"

The trailing comment is significant for manual configuration: uv silently ignores [tool.uv.sources] entries for purely-transitive packages, so the source pin only takes effect once torch appears in your own dependency lists. Standard vaultspec-rag install --yes handles this by adding torch>=2.4 to [project].dependencies when it can. If you opted out or are editing TOML by hand, add it to either [project].dependencies or [dependency-groups].dev:

[dependency-groups]
dev = [
    "torch>=2.4",
]

Then run uv lock --refresh-package torch && uv sync. The lockfile entry for torch should show source = { registry = "https://download.pytorch.org/whl/cu130" } (not pypi.org/simple). If it still resolves from PyPI, confirm both the cu130 source block and a direct dependency are present before refreshing the lockfile again. [tool.uv.sources] declarations in a dependency's own pyproject.toml do not propagate to consumers, which is why the direct dependency is necessary.

Troubleshooting: "PyTorch was installed without CUDA support"

If vaultspec-rag index reports the CPU-only wheel on a machine with a GPU, uv resolved torch from PyPI (which only ships CPU wheels on Linux/Windows). The fix is the cu130 patch, a direct dependency, and a refreshed lock/sync. Check these failure modes in order:

  • Patch isn't applied. Run vaultspec-rag install --yes (or paste the manual snippet above), then uv sync --reinstall-package torch.
  • Patch is applied but torch is not a direct dep. This usually means the install was declined, run with --no-torch-config, blocked by an incompatible [project] / [project].dependencies shape, or the TOML was hand-edited. uv ignores [tool.uv.sources] for purely-transitive packages, so the cu130 pin is a no-op until torch>=2.4 appears in [project].dependencies or [dependency-groups].dev (see the Manual section above). After adding it, run uv lock --refresh-package torch && uv sync.
  • Patch is applied, torch is a direct dep, but resolution still picks the cpu wheel. Your uv.lock is stale. Run uv lock --refresh-package torch && uv sync to force a re-resolve. Inspect uv.lock afterwards: the torch entry should read source = { registry = "https://download.pytorch.org/whl/cu130" }.
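Wheels from download.pytorch.org conventionally carry a local version tag (e.g. 2.4.0+cu130 or 2.4.0+cpu), which gives a quick way to spot the wheel flavor from torch.__version__. A sketch assuming that tagging convention holds:

```python
def wheel_flavor(version: str) -> str:
    """Classify a torch version string by its local version tag."""
    _, _, local = version.partition("+")
    if local.startswith("cu"):
        return "cuda"
    if local == "cpu":
        return "cpu"
    return "unknown"  # untagged wheels (e.g. plain PyPI) carry no flavor hint
```

A "cpu" or "unknown" result on a Linux/Windows GPU machine points back at the three failure modes listed above.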

The "No CUDA GPU detected" error is reserved for the genuinely GPU-less case (driver missing, headless VM without a device, etc.).

Verify

vaultspec-rag --version

Index and search

vaultspec-rag indexes two sources: vault (.vault/ documents) and code (project source files). Code indexing excludes vaultspec internal directories such as .vault/ and .vaultspec/, so --type code only searches project source content.
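The exclusion rule can be sketched as a simple path filter. This is an illustration of the documented behavior (hypothetical helper, POSIX-style paths assumed), not the indexer's actual implementation:

```python
from pathlib import PurePosixPath

# vaultspec internal directories excluded from the "code" source, per the rule above.
EXCLUDED_DIRS = {".vault", ".vaultspec"}

def is_code_candidate(path: str) -> bool:
    """True if a file may be indexed as code (not under a vaultspec internal dir)."""
    return not EXCLUDED_DIRS.intersection(PurePosixPath(path).parts)
```

So .vault/plans/q3.md is only reachable through --type vault, while src/main.py belongs to --type code.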

vaultspec-rag index                          # both
vaultspec-rag index --type vault             # vault only
vaultspec-rag index --type code              # code only
vaultspec-rag index --rebuild                # drop selected collections, then re-index
vaultspec-rag clean all --yes                # wipe index data without re-indexing

vaultspec-rag search "architecture decision"
vaultspec-rag search --type code "error handling"

Search concurrency contract

The local backend is qdrant-local. Its runtime contract:

  • concurrent search accepted: true
  • same-project search strategy: serialized
  • cross-project search strategy: parallel
  • storage process model: exclusive

Concurrent search accepted means requests may overlap safely, while local Qdrant access for the same project is serialized inside the process. Do not open the same local Qdrant storage from multiple vaultspec-rag processes. A second opener reports lock contention and directs callers to route concurrent work through one resident service. CLI status, server service status, MCP search responses, index status responses, and health payloads expose the same backend contract.


Using the MCP server

The Model Context Protocol (MCP) server gives AI assistants direct access to vault and codebase search. It runs in two transport modes with different project-resolution rules.

stdio mode -- one process per project. The MCP client launches vaultspec-search-mcp as a subprocess, scoped to a single workspace via VAULTSPEC_RAG_ROOT. Use this for Claude Desktop, Claude Code, and similar single-project AI tools.

Local storage is process-exclusive, so avoid launching multiple stdio MCP processes against the same project root. For concurrent clients on one project, route requests through a single HTTP service.

{
  "mcpServers": {
    "vaultspec-rag": {
      "command": "vaultspec-search-mcp",
      "env": {
        "VAULTSPEC_RAG_ROOT": "/path/to/your/project"
      }
    }
  }
}

HTTP mode -- one daemon, many projects. Start vaultspec-rag server service start as a background daemon, then connect any MCP client to http://127.0.0.1:8766/mcp. The daemon has no default project; every tool call must include project_root. Use this to share one GPU-loaded service across workspaces.

Project slots are isolated by root and share one loaded model plus the GPU lock. Different roots can initialize and proceed concurrently; same-root local backend access still serializes around Qdrant.
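Because the daemon has no default project, a client-side guard like the following is a reasonable sanity check before dispatching HTTP-mode tool calls. A sketch of the documented rule; argument names other than project_root are hypothetical:

```python
def validate_tool_call(arguments: dict) -> None:
    """Reject HTTP-mode tool calls that omit the mandatory project_root."""
    if "project_root" not in arguments:
        raise ValueError("project_root is required in HTTP mode (the daemon has no default project)")
```

stdio mode needs no such field: the process is already scoped to one workspace via VAULTSPEC_RAG_ROOT.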

See the MCP integration reference for the full tool list, both modes' contracts, and choosing between them.


Further reading

  • Usage modes -- Ad-hoc vs. service operation
  • CLI commands -- Command tree, flags, --port fast path
  • Configuration -- Precedence, environment variables, .vaultragignore
  • Service management -- Background daemon, health endpoint, model warmup
  • Python API -- Facade functions for programmatic use
  • Architecture overview -- Access layers, GPU lifecycle, multi-project support
  • Models -- Embedding stack and model cards

Getting help

Open an issue on GitHub.


Contributing and license

Contributions welcome -- bug reports, feature ideas, or pull requests. vaultspec-rag uses the MIT License.
