Vectora - Advanced AI Assistant with RAG and MCP capabilities

Project description

Vectora

Vectora is an open-source AI assistant (Apache 2.0) built for developers — local-first, self-hosted, and designed to run as a powerful sub-agent inside any MCP-compatible orchestrator (Claude Code, Claude Desktop, Paperclip, VS Code extensions).

At its core, Vectora solves the knowledge gap problem: LLMs don't know your codebase, your docs, or the latest versions of your stack. Vectora bridges that gap with RAG (Retrieval-Augmented Generation) — ingest your docs once, and every AI interaction becomes contextually aware.

Why Vectora?

Supervisor + Specialized Agents: A router classifies every message and delegates to the right specialist — search agent for web/RAG, coder agent for files and terminal, direct agent for conversation and synthesis.
RAG-native subgraph: Queries routed to the RAG subgraph pass through a retrieve → score → rerank → inject pipeline before reaching the LLM, with a web-search fallback when retrieval scores are low.
14 tools across 4 categories: Web search, vector search, file system, memory — each agent sees only the tools it needs.
Cascading embeddings: Web search results are automatically queued for embedding into LanceDB (fire-and-forget), building your knowledge base as you chat.
Sub-agent architecture: Runs as an MCP server. Claude Code delegates complex tasks to Vectora; Vectora reasons, routes, and responds.
Persistent memory: Cross-session memory in SQLite. Vectora remembers your preferences, project context, and decisions.
Zero infra: SQLite + LanceDB. No Docker required for local use.
Multi-LLM: Google Gemini (free tier), Cohere (free tier), OpenAI, Anthropic, or Ollama (fully local).

Architecture

Supervisor + Workers

Every message enters through a single entry point and is routed by the Supervisor to the right specialized agent:

START
  └─► supervisor (classify intent)
        ├─► direct    ──► direct_tools (memory) ──► direct ──► END
        ├─► search    ──► search_tools ──► process_retrieval ──► search ──► END
        ├─► coder     ──► coder_tools (fs + memory) ──► coder ──► END
        └─► rag_subgraph ──────────────────────────────────────► direct ──► END

Agent	Responsibility	Tools
supervisor	Classifies intent via regex + LLM fallback, routes via `Command(goto=...)`	—
direct	General conversation, synthesis after RAG, memory management	`save_memory`, `get_memory`, `delete_memory`
search	Web research, real-time info, builds knowledge base via cascading embeddings	`web_search`, `fetch_url`, `vector_search`
coder	File operations, terminal commands, code generation	`file_read`, `file_edit`, `file_write`, `grep`, `list_dir`, `terminal`, plus the memory tools (`save_memory`, `get_memory`, `delete_memory`)
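
The "regex + LLM fallback" classification can be sketched as follows. This is a minimal illustration, not Vectora's actual router: the patterns, the `classify_intent` name, and the `llm_fallback` callable are all hypothetical. In the real graph the returned route name would presumably be handed to `Command(goto=...)`.

```python
import re
from typing import Callable, Optional

# Hypothetical patterns; Vectora's real routing rules are not shown in this README.
INTENT_PATTERNS: dict[str, re.Pattern[str]] = {
    "coder": re.compile(r"\b(read|edit|write|grep|run|file|terminal)\b", re.I),
    "search": re.compile(r"\b(search|latest|news|look up|url)\b", re.I),
    "rag_subgraph": re.compile(r"\b(docs?|documentation|knowledge base)\b", re.I),
}
VALID_ROUTES = {"direct", "search", "coder", "rag_subgraph"}


def classify_intent(
    message: str,
    llm_fallback: Optional[Callable[[str], str]] = None,
) -> str:
    """Return a route name: cheap regex match first, LLM guess second, 'direct' last."""
    for route, pattern in INTENT_PATTERNS.items():
        if pattern.search(message):
            return route
    if llm_fallback is not None:
        # The fallback stands in for an LLM call that answers with a route name.
        guess = llm_fallback(message).strip().lower()
        if guess in VALID_ROUTES:
            return guess
    return "direct"
```

Trying the regex pass first keeps common requests off the LLM entirely, and validating the fallback's answer against a fixed set means an unexpected reply degrades to plain conversation rather than an invalid route.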

RAG Subgraph

When the supervisor routes to rag_subgraph, a dedicated subgraph runs the full retrieval pipeline before synthesis:

rag_retrieve (vector_search)
  └─► rag_decide (score threshold)
        ├─► rag_inject     (score ≥ 0.7 — high confidence, inject directly)
        ├─► rag_rerank     (score 0.4–0.7 — rerank with Cohere before inject)
        └─► rag_websearch  (score < 0.4 — fall back to web + auto-embed results)

Results are injected as a SystemMessage into context before the direct agent synthesizes the final answer.
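
The threshold branching above can be sketched as a small pure function. The thresholds and node names come from the diagram; deciding on a single top score, and treating exactly 0.4 as "rerank", are assumptions of this sketch.

```python
HIGH_CONFIDENCE = 0.7  # at or above: inject retrieved context directly
LOW_CONFIDENCE = 0.4   # below: fall back to web search


def rag_decide(top_score: float) -> str:
    """Map the best retrieval score to the next node in the RAG subgraph."""
    if top_score >= HIGH_CONFIDENCE:
        return "rag_inject"
    if top_score >= LOW_CONFIDENCE:
        return "rag_rerank"
    return "rag_websearch"
```

For example, `rag_decide(0.55)` selects the rerank branch, while `rag_decide(0.2)` falls through to the web-search fallback.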

Cascading Embeddings

After any web_search or fetch_url call, process_retrieval automatically queues the results for embedding into LanceDB — fire-and-forget, no blocking. Your vector store grows passively as you use web search.
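
One way to picture the queue is a SQLite table that the retrieval step writes to and a background worker drains later. The sketch below is an assumption-heavy illustration: the table schema, the function names, and the `embed` callable are hypothetical, and the real queue lives in `embedding_queue.db` with a layout this README does not describe.

```python
import sqlite3
from typing import Callable


def open_queue(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a queue database with a hypothetical schema."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS embedding_queue ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "url TEXT NOT NULL, "
        "content TEXT NOT NULL, "
        "status TEXT NOT NULL DEFAULT 'pending')"
    )
    return conn


def enqueue(conn: sqlite3.Connection, url: str, content: str) -> int:
    """Record a web result for later embedding and return immediately."""
    cur = conn.execute(
        "INSERT INTO embedding_queue (url, content) VALUES (?, ?)", (url, content)
    )
    conn.commit()
    return cur.lastrowid


def drain(conn: sqlite3.Connection, embed: Callable[[str], object]) -> int:
    """Embed every pending row in insertion order; return how many were processed."""
    rows = conn.execute(
        "SELECT id, content FROM embedding_queue WHERE status = 'pending' ORDER BY id"
    ).fetchall()
    for row_id, content in rows:
        embed(content)  # stand-in for the embedding call plus the LanceDB write
        conn.execute(
            "UPDATE embedding_queue SET status = 'done' WHERE id = ?", (row_id,)
        )
    conn.commit()
    return len(rows)
```

The agent only ever calls `enqueue`, which costs one local insert, so the chat turn is never blocked on an embedding API round trip. `drain` would run in a separate worker.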

Prerequisites

Cohere — Required

Vectora uses Cohere for embeddings (embed-multilingual-v3.0) and reranking (rerank-multilingual-v3.0). Cohere offers a generous free tier and first-class LangChain integration.

Get your key: https://dashboard.cohere.com/api-keys

Tavily — Required

Vectora uses Tavily for real-time web search and URL content extraction. Tavily is optimized for AI agents and offers a generous free tier.

Get your key: https://app.tavily.com/

LLM Provider — Choose One

Provider	Free Tier	Get Key
Google Gemini ✅ Recommended	Yes	aistudio.google.com
Cohere	Yes	dashboard.cohere.com
Ollama (local)	No cost	ollama.ai
OpenAI	Paid	platform.openai.com
Anthropic	Paid	console.anthropic.com

Installation

Option 1: UV (Recommended)

# Install globally
uv tool install vectora-agent

# First-time setup (interactive wizard)
vectora setup

# Start chatting
vectora chat

Option 2: From Source

git clone https://github.com/brunosrz/vectora.git
cd vectora

# Install with all dependencies
uv sync

# Configure your keys
cp .env.example .env
# Edit .env with your LLM provider key (e.g. GOOGLE_API_KEY), COHERE_API_KEY, and TAVILY_API_KEY

# Run
uv run vectora chat

Option 3: Docker

# Copy and configure environment
cp .env.example .env
# Edit .env with your API keys

# Run the chat interface
docker compose run --rm vectora

# Or run as MCP server (multi-agent mode)
MCP_TRANSPORT=sse docker compose up -d

Running Modes

Chat Mode (Interactive TUI)

The primary interface — a terminal dashboard built with Rich.

vectora chat

Features: multi-turn conversation, session history, live tool feedback (colored panels), debug mode toggle, model switching.

MCP Server — Local (stdio)

Run Vectora as an MCP sub-agent for Claude Code or Claude Desktop.

vectora mcp-server

MCP Server — Remote (SSE, Multi-Agent)

Run Vectora as a shared hub for multiple Paperclip agents or orchestrators connecting simultaneously.

MCP_TRANSPORT=sse MCP_PORT=8000 vectora mcp-server

Each client passes its own thread_id — sessions are fully isolated.
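
Session isolation by thread_id can be pictured as a store keyed on that identifier. The class below is a hypothetical, in-memory stand-in. Vectora persists sessions through SQLite and a LangGraph checkpointer, whose internals are not described here.

```python
from collections import defaultdict


class SessionStore:
    """Hypothetical per-thread message store: one independent history per thread_id."""

    def __init__(self) -> None:
        self._histories: dict[str, list[str]] = defaultdict(list)

    def append(self, thread_id: str, message: str) -> None:
        self._histories[thread_id].append(message)

    def history(self, thread_id: str) -> list[str]:
        # Return a copy so callers cannot mutate another client's session.
        return list(self._histories.get(thread_id, []))
```

Two clients that send different thread_id values never see each other's messages, because every read and write is scoped to the key they supplied.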

Setup Wizard

An interactive wizard that sets up API keys, selects an LLM provider, and tests connectivity.

vectora setup

Connecting to Claude Code / Claude Desktop

Add Vectora to your .mcp.json (in your project root):

{
  "mcpServers": {
    "Vectora-MCP": {
      "command": "uv",
      "args": ["run", "--project", "/absolute/path/to/vectora", "vectora-mcp"]
    }
  }
}

For a globally installed Vectora:

{
  "mcpServers": {
    "Vectora-MCP": {
      "command": "vectora-mcp"
    }
  }
}

For Docker (SSE mode, multiple agents):

{
  "mcpServers": {
    "Vectora-MCP": {
      "url": "http://localhost:8000/sse"
    }
  }
}
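
If a project already has a .mcp.json with other servers, the Vectora entry can be merged in rather than pasted by hand. The helper below is a small sketch under that assumption; `add_mcp_server` is a hypothetical name and not part of Vectora.

```python
import json
from pathlib import Path


def add_mcp_server(config_path: Path, name: str, entry: dict) -> dict:
    """Add or replace one server entry while keeping any existing servers."""
    config = json.loads(config_path.read_text()) if config_path.exists() else {}
    config.setdefault("mcpServers", {})[name] = entry
    config_path.write_text(json.dumps(config, indent=2) + "\n")
    return config
```

For a globally installed Vectora, the call would be `add_mcp_server(Path(".mcp.json"), "Vectora-MCP", {"command": "vectora-mcp"})`.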

Chat Commands

Command	Description
`/help`	Show quick help
`/list`	Show all commands
`/tools`	List available tools
`/model`	List or switch models
`/debug`	Toggle debug mode (shows tool calls and routing decisions)
`/new`	Start a new session
`/sessions`	List all sessions
`/session <id>`	Switch to a specific session
`/quit`	Exit

Input shortcuts: Enter sends, Alt+Enter or Shift+Enter adds a line break.

Tools Reference

14 tools across 4 categories (Web, RAG, Files, Memory), plus the `call_mcp_tool` MCP bridge shared by all agents. Each tool is exposed only to the agents that need it:

Category	Tools	Agent
Web	`web_search`, `fetch_url`	search
RAG	`vector_search`, `embedding`, `ingest_docs`	search / RAG subgraph
Files	`file_read`, `file_edit`, `file_write`, `grep`, `list_dir`, `terminal`	coder
Memory	`save_memory`, `get_memory`, `delete_memory`	direct / coder
MCP	`call_mcp_tool`	all
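
The table can be read as a lookup from category to tools and permitted agents. The sketch below transcribes it literally; the data structure, the `tools_for` helper, and the reading of "all" as the three workers plus the RAG subgraph are assumptions made for illustration.

```python
ALL_AGENTS = {"direct", "search", "coder", "rag_subgraph"}  # assumed meaning of "all"

# category -> (tool names, agents allowed to see them), transcribed from the table
TOOL_CATEGORIES: dict[str, tuple[list[str], set[str]]] = {
    "web": (["web_search", "fetch_url"], {"search"}),
    "rag": (["vector_search", "embedding", "ingest_docs"], {"search", "rag_subgraph"}),
    "files": (
        ["file_read", "file_edit", "file_write", "grep", "list_dir", "terminal"],
        {"coder"},
    ),
    "memory": (["save_memory", "get_memory", "delete_memory"], {"direct", "coder"}),
    "mcp": (["call_mcp_tool"], ALL_AGENTS),
}


def tools_for(agent: str) -> list[str]:
    """Return the tools visible to one agent, in table order."""
    visible: list[str] = []
    for tools, agents in TOOL_CATEGORIES.values():
        if agent in agents:
            visible.extend(tools)
    return visible
```

Scoping tools this way keeps each agent's prompt small and prevents, for instance, the direct agent from ever being offered `terminal`.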

Data & Persistence

All data is stored locally in ~/.vectora/:

~/.vectora/
├── .env                    # Your API keys
├── chat_config.json        # Persistent chat settings
├── data/
│   ├── vectora.db          # Sessions, memories, checkpoints (SQLite)
│   ├── embedding_queue.db  # Async embedding queue (SQLite)
│   └── lancedb/            # Vector store for RAG
├── logs/
│   ├── vectora.jsonl       # Structured JSON logs (rotating, 10 MB)
│   └── mcp.log             # MCP server logs
└── exports/                # Session audit trails + debug dumps

Tech Stack

Layer	Technology
Language	Python 3.14+ managed by uv
Agent Framework	LangChain + LangGraph
Agent Pattern	Supervisor + Specialized Workers (direct / search / coder) + RAG Subgraph
Vector Store	LanceDB — file-based, zero-config
Embeddings	Cohere — `embed-multilingual-v3.0` + `rerank-multilingual-v3.0`
Persistence	SQLite via `aiosqlite` + LangGraph Checkpointer
Context Protocol	MCP via FastMCP
Terminal UI	Rich + prompt-toolkit
Observability	LangSmith (optional)

Configuration

All configuration goes in ~/.vectora/.env or a project-local .env:

# LLM Provider
LLM_PROVIDER=google-genai
GOOGLE_API_KEY=your_key_here

# Required: RAG embeddings + reranking
COHERE_API_KEY=your_key_here

# Required: Web search + URL extraction
TAVILY_API_KEY=your_key_here

# Optional: Tracing
LANGSMITH_TRACING=false
LANGSMITH_API_KEY=your_key_here
LANGSMITH_PROJECT=vectora

# Optional: Logging
LOG_LEVEL=INFO

# Feature flags
ENABLE_RAG=true
ENABLE_FILE_OPERATIONS=true
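
Before starting a session it can help to check that the required keys are actually filled in. The following is a minimal stdlib sketch, not Vectora's own loader: `parse_env` and `missing_keys` are hypothetical helpers, and only the `google-genai` provider mapping shown above is included.

```python
REQUIRED = ("COHERE_API_KEY", "TAVILY_API_KEY")
PLACEHOLDER = "your_key_here"
# Only the provider shown in this README; other provider names are not assumed.
PROVIDER_KEYS = {"google-genai": "GOOGLE_API_KEY"}


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(env: dict[str, str]) -> list[str]:
    """List required keys that are absent, empty, or still the placeholder."""
    needed = list(REQUIRED)
    provider_key = PROVIDER_KEYS.get(env.get("LLM_PROVIDER", ""))
    if provider_key:
        needed.append(provider_key)
    return [key for key in needed if env.get(key, "") in ("", PLACEHOLDER)]
```

Running `missing_keys(parse_env(open(path).read()))` on a freshly copied `.env.example` would report every key that still holds `your_key_here`.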

License

Apache 2.0. See LICENSE.

Project details

This version

0.1.0rc1 pre-release

May 21, 2026

Download files

Source Distribution

vectora_agent-0.1.0rc1.tar.gz (356.2 kB view details)

Uploaded May 21, 2026 Source

Built Distribution

vectora_agent-0.1.0rc1-py3-none-any.whl (161.9 kB view details)

Uploaded May 21, 2026 Python 3

File details

Details for the file vectora_agent-0.1.0rc1.tar.gz.

File metadata

Download URL: vectora_agent-0.1.0rc1.tar.gz
Upload date: May 21, 2026
Size: 356.2 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for vectora_agent-0.1.0rc1.tar.gz
Algorithm	Hash digest
SHA256	`b9290b8941c3d3643c5e5e75256af744584bd27f654dfaa556f5146be31d5800`
MD5	`252442f24b1ce0643da6b60d68401683`
BLAKE2b-256	`5367892dadeb7df97e3dd4616ce7d9425504dc81d660cb9b03971d6e97221d41`
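
A downloaded archive can be checked against the SHA256 digest listed above using the standard library. This is a generic sketch; `sha256_of` and `matches` are hypothetical helper names.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large archives need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path: Path, expected_hex: str) -> bool:
    """Compare the file's digest with a published one, ignoring case and whitespace."""
    return sha256_of(path) == expected_hex.strip().lower()
```

Passing the path of the downloaded file and the published SHA256 value to `matches` returns True only when the file is byte-for-byte identical to the one that was uploaded.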

Provenance

The following attestation bundles were made for vectora_agent-0.1.0rc1.tar.gz:

Publisher: runner.yml on brunosrz/vectora

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: vectora_agent-0.1.0rc1.tar.gz
- Subject digest: b9290b8941c3d3643c5e5e75256af744584bd27f654dfaa556f5146be31d5800
- Sigstore transparency entry: 1588994027
- Sigstore integration time: May 21, 2026
Source repository:
- Permalink: brunosrz/vectora@5f35a3b826da9edd55c43b820148ecd385cf4462
- Branch / Tag: refs/tags/v0.1.0rc1
- Owner: https://github.com/brunosrz
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: runner.yml@5f35a3b826da9edd55c43b820148ecd385cf4462
- Trigger Event: push

File details

Details for the file vectora_agent-0.1.0rc1-py3-none-any.whl.

File metadata

Download URL: vectora_agent-0.1.0rc1-py3-none-any.whl
Upload date: May 21, 2026
Size: 161.9 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for vectora_agent-0.1.0rc1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`ecbf92e6b07a54b40ba9d2a6848c894f81ae1c20d18f7c1a4c75739a31509dc2`
MD5	`6d0667ee7e11995f154ad858677e538a`
BLAKE2b-256	`a29a01288fc48c7f8652d29109d086916706bad62518abaf3a4021efa53983d2`

Provenance

The following attestation bundles were made for vectora_agent-0.1.0rc1-py3-none-any.whl:

Publisher: runner.yml on brunosrz/vectora

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: vectora_agent-0.1.0rc1-py3-none-any.whl
- Subject digest: ecbf92e6b07a54b40ba9d2a6848c894f81ae1c20d18f7c1a4c75739a31509dc2
- Sigstore transparency entry: 1588994058
- Sigstore integration time: May 21, 2026
Source repository:
- Permalink: brunosrz/vectora@5f35a3b826da9edd55c43b820148ecd385cf4462
- Branch / Tag: refs/tags/v0.1.0rc1
- Owner: https://github.com/brunosrz
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: runner.yml@5f35a3b826da9edd55c43b820148ecd385cf4462
- Trigger Event: push
