
PostgreSQL pgvector-based RAG memory system with MCP server


RAG Memory


A production-ready PostgreSQL + pgvector + Neo4j knowledge management system with dual storage for semantic search (RAG) and knowledge graphs.

Three Ways to Use RAG Memory

Interface     | Description                                       | Best For
Web Interface | React + FastAPI conversational UI                 | Interactive exploration, visual knowledge management
MCP Server    | 20 tools for AI agents via Model Context Protocol | Claude Desktop, Claude Code, Cursor, AI integrations
CLI Tool      | Command-line interface                            | Scripting, automation, bulk operations

⚡ Quick Start

Recommended: Interactive Setup with Claude Code

  1. Clone and open in Claude Code (or Cursor):

    git clone https://github.com/codingthefuturewithai/rag-memory.git
    cd rag-memory
    claude  # or cursor .
    
  2. Run the getting started guide:

    /getting-started
    

This interactive guide will walk you through:

  • Understanding RAG Memory concepts (semantic search, knowledge graphs)
  • Installing and configuring all components
  • Setting up your first collections
  • Testing search and graph queries

Alternative: Manual Setup

git clone https://github.com/codingthefuturewithai/rag-memory.git
cd rag-memory
uv sync                          # Install dependencies
source .venv/bin/activate        # Activate virtual environment (Linux/macOS)
# .venv\Scripts\activate         # Windows alternative
python scripts/setup.py

The setup script will:

  • ✅ Check you have Docker installed
  • ✅ Start PostgreSQL and Neo4j containers
  • ✅ Ask for your OpenAI API key
  • ✅ Initialize your local knowledge base
  • ✅ Install the rag CLI tool

Web Interface

RAG Memory includes a full React + FastAPI web application for conversational knowledge management.

📖 Full Web Interface Documentation

Quick Start (Web)

# From rag-memory root directory
python manage.py setup    # First time only: installs dependencies, runs migrations
python manage.py start    # Start all services

Access: http://localhost:5173

What It Provides

  • Conversational Interface - Chat-based interaction with a ReAct agent that dynamically chooses from 20 tools
  • 3-Column Layout - Collections sidebar | Chat interface | Document viewer
  • Web Search Integration - Discover and evaluate content before ingestion
  • Knowledge Graph Visualization - See entity relationships visually
  • Streaming Responses - Token-by-token SSE streaming
  • Conversation Persistence - PostgreSQL-backed conversation history

Architecture

frontend/          → React 19 + Mantine UI + Vite (port 5173)
backend/           → FastAPI + LangGraph + LangChain (port 8000)
manage.py          → Service orchestration script
                      ↓
               RAG Memory MCP Server (port 3001)
                      ↓
               PostgreSQL + Neo4j (knowledge storage)

Service Management

python manage.py status   # Check all services
python manage.py logs     # Tail all logs
python manage.py stop     # Stop all services
python manage.py restart  # Restart services

What Is This?

RAG Memory combines two powerful databases for knowledge management:

  • PostgreSQL + pgvector - Semantic search across document content (RAG layer)
  • Neo4j - Entity relationships and knowledge graphs (KG layer)

Both databases work together automatically - when you ingest a document, it's indexed in both systems simultaneously.

Three ways to use it:

  1. Web Interface - React + FastAPI conversational UI for interactive exploration
  2. MCP Server - Connect AI agents (Claude Desktop, Claude Code, Cursor) with 20 MCP tools
  3. CLI Tool - Direct command-line access for testing, automation, and bulk operations

Key capabilities:

  • Semantic search with vector embeddings (pgvector + HNSW indexing)
  • Knowledge graph queries for relationships and entities
  • Web crawling and documentation ingestion
  • Document chunking for large files
  • Collection management for organizing knowledge
  • Full document lifecycle (create, read, update, delete)
  • Cross-platform configuration system

📚 Complete Documentation: See .reference/ directory for comprehensive guides (setup, MCP tools, pricing, search optimization, knowledge graphs)

For Developers (Code Modifications)

If you want to modify the code:

# Clone repository (or your own fork of it)
git clone https://github.com/codingthefuturewithai/rag-memory.git
cd rag-memory

# Install dependencies
uv sync

# Copy environment template
cp .env.example .env
# Edit .env with your OPENAI_API_KEY

# Run tests or development commands
uv run pytest
uv run rag status

CLI Commands

Database & Status

rag status                    # Check database connection and stats

Instance Management (Multi-Instance Support)

RAG Memory supports running multiple isolated instances, each with its own PostgreSQL, Neo4j, and MCP server. Instances are automatically assigned unique ports.

# List all instances
rag instance list

# Start/create an instance
rag instance start primary                  # Start or create "primary" instance
rag instance start research                 # Create additional "research" instance

# Check instance status
rag instance status primary                 # Detailed health and port info

# View instance logs
rag instance logs primary                   # View logs for an instance
rag instance logs primary --service mcp     # View specific service logs

# Stop an instance (preserves data)
rag instance stop primary

# Delete an instance (removes containers and volumes)
rag instance delete research --force

Port Allocation:

  • Instance 1: PostgreSQL=54320, Neo4j=7687/7474, MCP=8000
  • Instance 2: PostgreSQL=54330, Neo4j=7688/7475, MCP=8001
  • Instance 3: PostgreSQL=54340, Neo4j=7689/7476, MCP=8002
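
The three rows above follow a regular pattern: the PostgreSQL port grows by 10 per instance, and the Neo4j and MCP ports grow by 1. The sketch below reproduces those rows. The helper name `ports_for_instance` is hypothetical, and extending the pattern beyond instance 3 is an assumption; `rag instance status` remains the authoritative source for the ports actually assigned.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InstancePorts:
    postgres: int
    neo4j_bolt: int   # 7687 is Neo4j's default Bolt port
    neo4j_http: int   # 7474 is Neo4j's default HTTP port
    mcp: int


def ports_for_instance(n: int) -> InstancePorts:
    """Return the ports implied by the listed rows for the n-th instance (1-based).

    Assumption: the pattern of the three documented rows continues unchanged.
    """
    if n < 1:
        raise ValueError("instance numbers start at 1")
    offset = n - 1
    return InstancePorts(
        postgres=54320 + 10 * offset,
        neo4j_bolt=7687 + offset,
        neo4j_http=7474 + offset,
        mcp=8000 + offset,
    )


for n in (1, 2, 3):
    print(n, ports_for_instance(n))
```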

Shortcuts: rag start, rag stop, rag status operate on the default (first) instance.

Collection Management

rag collection create <name> --description TEXT  # Description now required
rag collection list
rag collection info <name>    # View stats and crawl history
rag collection update <name> --description TEXT  # Update collection description
rag collection delete <name>

Document Ingestion

Text:

rag ingest text "content" --collection <name> [--metadata JSON]

Files:

rag ingest file <path> --collection <name>
rag ingest directory <path> --collection <name> --extensions .txt,.md [--recursive]

Web Pages:

# Analyze website structure first
rag analyze https://docs.example.com

# Crawl single page
rag ingest url https://docs.example.com --collection docs

# Crawl with link following
rag ingest url https://docs.example.com --collection docs --follow-links --max-depth 2

# Re-crawl to update content
rag recrawl https://docs.example.com --collection docs --follow-links --max-depth 2

Semantic Search (RAG Layer)

⚠️ IMPORTANT: Use Natural Language, Not Keywords

This system uses semantic similarity search, not keyword matching. Always use complete questions or sentences:

  • ✅ Good: "How do I configure authentication in the system?"
  • ❌ Bad: "authentication configuration"

# Basic search
rag search "How do I configure authentication?" --collection <name>

# Advanced options
rag search "What are the best practices for error handling?" --collection <name> --limit 10 --threshold 0.7 --verbose --show-source

# Search with metadata filter
rag search "How do I use decorators in Python?" --metadata '{"topic":"python"}'
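
To illustrate what a metadata filter such as '{"topic":"python"}' is meant to select, here is a small in-memory sketch. It assumes the filter behaves like a top-level containment check (every key/value pair in the filter must appear in the document's metadata); the function name `matches_filter` and the sample documents are hypothetical, and the actual query RAG Memory runs may differ.

```python
def matches_filter(metadata: dict, wanted: dict) -> bool:
    """True when every key/value pair in `wanted` is present in `metadata`."""
    return all(
        key in metadata and metadata[key] == value
        for key, value in wanted.items()
    )


# Hypothetical documents, for illustration only.
documents = [
    {"id": 1, "metadata": {"topic": "python", "level": "intro"}},
    {"id": 2, "metadata": {"topic": "docker"}},
    {"id": 3, "metadata": {}},
]

selected = [
    doc["id"]
    for doc in documents
    if matches_filter(doc["metadata"], {"topic": "python"})
]
print(selected)  # [1]
```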

Knowledge Graph Search

Query Entity Relationships:

# Find connections between concepts
rag graph query-relationships "How does PostgreSQL relate to semantic search?" --limit 5

# With threshold tuning
rag graph query-relationships "What connects Docker to Kubernetes?" --threshold 0.5

# Scoped to collection
rag graph query-relationships "How do transformers relate to attention mechanisms?" --collection ai-docs

# Verbose output (shows node IDs, timestamps)
rag graph query-relationships "How does Python relate to machine learning?" --verbose

Query Temporal Evolution:

# See how knowledge changed over time
rag graph query-temporal "How has my understanding of quantum computing evolved?" --limit 10

# Filter by time window
rag graph query-temporal "What decisions did I make in December?" \
  --valid-from "2025-12-01T00:00:00" \
  --valid-until "2025-12-31T23:59:59"

# With confidence threshold
rag graph query-temporal "How has my focus changed?" --threshold 0.5 --collection business-docs
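
The --valid-from and --valid-until values in the example above are ISO 8601 timestamps. A short sketch for generating a calendar-month window in that format follows; `month_window` is a hypothetical helper, not part of the rag CLI.

```python
import calendar
from datetime import datetime


def month_window(year: int, month: int) -> tuple[str, str]:
    """Return (first second, last second) of a month as ISO 8601 strings."""
    last_day = calendar.monthrange(year, month)[1]  # number of days in the month
    start = datetime(year, month, 1, 0, 0, 0)
    end = datetime(year, month, last_day, 23, 59, 59)
    return start.isoformat(), end.isoformat()


valid_from, valid_until = month_window(2025, 12)
print(f'--valid-from "{valid_from}" --valid-until "{valid_until}"')
```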

Document Management

# List documents
rag document list [--collection <name>]

# View document details
rag document view <ID> [--show-chunks] [--show-content]

# Update document (re-chunks and re-embeds)
rag document update <ID> --content "new content" [--title "title"] [--metadata JSON]

# Delete document
rag document delete <ID> [--confirm]

MCP Server for AI Agents

RAG Memory exposes 20 tools via Model Context Protocol (MCP) for AI agent integration.

Quick Setup

1. Run the setup script:

git clone https://github.com/codingthefuturewithai/rag-memory.git
cd rag-memory
python scripts/setup.py

After setup, RAG Memory's MCP server is automatically running in Docker on port 8000.

2. Connect to Claude Code:

claude mcp add rag-memory --type sse --url http://localhost:8000/sse

3. Connect to Claude Desktop (optional):

Edit ~/Library/Application Support/Claude/claude_desktop_config.json:

{
  "mcpServers": {
    "rag-memory": {
      "command": "rag-mcp-stdio",
      "args": []
    }
  }
}

Then restart Claude Desktop.

4. Test: Ask your agent "List RAG Memory collections"

Available MCP Tools (20 Total)

Core RAG (3 tools):

  • search_documents - Semantic search across knowledge base
  • list_collections - Discover available collections
  • ingest_text - Add text content with auto-chunking

Knowledge Graph (2 tools):

  • query_relationships - Search entity relationships using natural language
  • query_temporal - Query how knowledge evolved over time

Collection Management (6 tools):

  • create_collection - Create new collections (description required)
  • get_collection_info - Collection stats and crawl history
  • get_collection_metadata_schema - View metadata schema for a collection
  • update_collection_metadata - Update collection metadata schema (additive only)
  • delete_collection - Delete collection and all its documents (admin function)
  • manage_collection_link - Link or unlink documents to/from collections

Document Management (4 tools):

  • list_documents - Browse documents with pagination
  • get_document_by_id - Retrieve full source document
  • update_document - Edit existing documents (triggers re-chunking/re-embedding)
  • delete_document - Remove outdated documents

Advanced Ingestion (5 tools):

  • analyze_website - Sitemap analysis for planning crawls
  • ingest_url - Crawl web pages with duplicate prevention (crawl/recrawl modes)
  • ingest_file - Ingest from file system
  • ingest_directory - Batch ingest from directories
  • list_directory - Browse directory contents before ingestion

See .reference/MCP_GUIDE.md for complete tool reference and examples.

Configuration System

RAG Memory uses a three-tier priority system for configuration:

  1. Environment variables (highest priority) - Set in your shell
  2. Project .env file (current directory only) - For developers
  3. Global ~/.rag-memory-env (lowest priority) - For end users
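
The priority order above amounts to a first-match lookup across three sources. The sketch below models each source as a plain dictionary; `resolve_setting` is a hypothetical name and the values are placeholders, so treat it as an illustration of the ordering rather than the project's actual loader.

```python
from typing import Mapping, Optional


def resolve_setting(
    key: str,
    environ: Mapping[str, str],
    project_env: Mapping[str, str],
    global_env: Mapping[str, str],
) -> Optional[str]:
    """Return the value from the highest-priority source that defines `key`."""
    # Order matters: shell environment, then project .env, then ~/.rag-memory-env.
    for source in (environ, project_env, global_env):
        if key in source:
            return source[key]
    return None


# Placeholder values standing in for the three tiers.
environ = {"OPENAI_API_KEY": "shell"}
project_env = {"OPENAI_API_KEY": "project"}
global_env = {"OPENAI_API_KEY": "global", "DATABASE_URL": "global"}

print(resolve_setting("OPENAI_API_KEY", environ, project_env, global_env))  # shell
print(resolve_setting("DATABASE_URL", environ, project_env, global_env))    # global
```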

For CLI usage: First run triggers interactive setup wizard

For MCP server: Configuration comes from MCP client config (not files)

See docs/ENVIRONMENT_VARIABLES.md for complete details.

Key Features

Vector Search with pgvector

  • PostgreSQL 17 + pgvector extension
  • HNSW indexing for fast approximate nearest neighbor search
  • Vector normalization for accurate cosine similarity
  • Optimized for 95%+ recall
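
Why normalization matters for cosine similarity can be shown in a few lines of plain Python: once two vectors are scaled to unit length, their dot product equals their cosine similarity. This is a standalone sketch with made-up two-dimensional vectors, not the code RAG Memory runs against pgvector.

```python
import math


def dot(a: list[float], b: list[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def normalize(v: list[float]) -> list[float]:
    """Scale a vector to unit length."""
    norm = math.sqrt(dot(v, v))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]


def cosine_similarity(a: list[float], b: list[float]) -> float:
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


a, b = [3.0, 4.0], [4.0, 3.0]
# Both lines print (approximately) the same value.
print(cosine_similarity(a, b))
print(dot(normalize(a), normalize(b)))
```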

Document Chunking

  • Hierarchical text splitting (headers → paragraphs → sentences)
  • ~1000 chars per chunk with 200 char overlap
  • Preserves context across boundaries
  • Each chunk independently embedded and searchable
  • Source documents preserved for full context retrieval
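
The size and overlap figures above can be illustrated with a deliberately simplified sliding-window chunker. It ignores the hierarchical splitting (headers, paragraphs, sentences) that RAG Memory describes and only shows how a 200-character overlap carries context from one ~1000-character chunk into the next; `chunk_text` is a hypothetical name.

```python
def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> list[str]:
    """Split text into fixed-size chunks where each chunk repeats the
    last `overlap` characters of the previous one."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - overlap
    chunks: list[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the final chunk already reaches the end of the text
    return chunks


sample = "".join(str(i % 10) for i in range(2500))
chunks = chunk_text(sample)
print([len(c) for c in chunks])
```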

Web Crawling

  • Built on Crawl4AI for robust web scraping
  • Sitemap.xml parsing for comprehensive crawls
  • Follow internal links with configurable depth
  • Duplicate prevention (crawl mode vs recrawl mode)
  • Crawl metadata tracking (root URL, session ID, timestamp)

Collection Management

  • Organize documents by topic/domain
  • Many-to-many relationships (documents can belong to multiple collections)
  • Search can be scoped to specific collection
  • Collection statistics and crawl history
  • Required descriptions for better organization (enforced by database constraint)

Full Document Lifecycle

  • Create: Ingest from text, files, directories, URLs
  • Read: Search chunks, retrieve full documents
  • Update: Edit content with automatic re-chunking/re-embedding
  • Delete: Remove outdated documents and their chunks

Architecture

Database Schema

Source documents and chunks:

  • source_documents - Full original documents
  • document_chunks - Searchable chunks with embeddings (vector[1536])
  • collections - Named groupings (description required with NOT NULL constraint)
  • chunk_collections - Junction table (N:M relationship)

Indexes:

  • HNSW on document_chunks.embedding for fast vector search
  • GIN on metadata columns for efficient JSONB queries

Migrations:

  • Managed by Alembic (see docs/DATABASE_MIGRATION_GUIDE.md)
  • Version tracking in alembic_version table
  • Run migrations: uv run rag migrate

Project Structure

rag-memory/
├── .claude/                   # Claude Code integration
│   ├── commands/              # Slash commands (7 commands)
│   │   ├── getting-started.md
│   │   ├── setup-collections.md
│   │   ├── capture.md
│   │   ├── dev-onboarding.md
│   │   ├── cloud-setup.md
│   │   ├── reference-audit.md
│   │   └── report-bug.md
│   └── hooks/                 # Automation hooks
│       └── rag-approval.py    # Ingest approval hook
│
├── .reference/                # Documentation (single source of truth)
│   ├── README.md              # Documentation index
│   ├── INSTALLATION.md        # Setup guide
│   ├── CLI_GUIDE.md           # CLI command reference
│   ├── MCP_GUIDE.md           # MCP tools reference
│   ├── WEB_INTERFACE.md       # Web UI documentation
│   ├── KNOWLEDGE_GRAPH.md     # Graph features
│   ├── VECTOR_SEARCH.md       # Semantic search
│   └── ...                    # Additional guides
│
├── frontend/                  # React web application
│   ├── src/
│   │   ├── components/        # UI components (chat, sidebar, viewer)
│   │   ├── rag/               # RAG-specific components and views
│   │   └── App.tsx            # Main application
│   └── package.json           # Node.js dependencies
│
├── backend/                   # FastAPI server
│   ├── app/
│   │   ├── main.py            # FastAPI application
│   │   ├── agent.py           # LangGraph ReAct agent
│   │   ├── tools/             # Python tools (web search, etc.)
│   │   └── shared/            # Conversation persistence
│   └── alembic/               # Backend database migrations
│
├── mcp-server/                # MCP server and CLI
│   ├── src/
│   │   ├── cli.py             # Command-line interface entry point
│   │   ├── cli_commands/      # CLI command implementations
│   │   ├── mcp/               # MCP server + 20 tool implementations
│   │   ├── core/              # Database, embeddings, config
│   │   ├── ingestion/         # Document store, web crawler
│   │   ├── retrieval/         # Vector similarity search
│   │   └── unified/           # RAG + Graph orchestration
│   └── tests/                 # Test suite (unit + integration)
│
├── graphiti-patched/          # Knowledge graph library (custom fork)
│   ├── graphiti_core/         # Core graph functionality
│   └── examples/              # Usage examples
│
├── deploy/                    # Docker and deployment
│   ├── docker/                # Dockerfiles and compose configs
│   └── alembic/               # Database migrations
│
├── scripts/                   # Setup and utility scripts
│   ├── setup.py               # Interactive setup wizard
│   └── db_migrate.py          # Multi-instance migration tool
│
├── docs/                      # Additional documentation
│   ├── ARCHITECTURE.md
│   ├── ENVIRONMENT_VARIABLES.md
│   └── DATABASE_MIGRATION_GUIDE.md
│
├── manage.py                  # Web interface service orchestration
├── pyproject.toml             # Python project config (uv)
├── CLAUDE.md                  # Claude Code development instructions
└── docker-compose*.yml        # Docker service definitions

Documentation

Prerequisites

  • Docker & Docker Compose - For the PostgreSQL and Neo4j containers
  • uv - Fast Python package manager (curl -LsSf https://astral.sh/uv/install.sh | sh)
  • Python 3.12+ - Managed by uv
  • OpenAI API Key - For embedding generation (https://platform.openai.com/api-keys)

Technology Stack

Core:

  • Database: PostgreSQL 17 + pgvector extension
  • Graph: Neo4j + Graphiti
  • Language: Python 3.12
  • Package Manager: uv (Astral)
  • Embedding Model: OpenAI text-embedding-3-small (1536 dims)

Interfaces:

  • Web Frontend: React 19 + Mantine UI + Vite
  • Web Backend: FastAPI + LangGraph + LangChain
  • MCP Server: FastMCP (Anthropic)
  • CLI: Click + Rich

Other:

  • Web Crawling: Crawl4AI (Playwright-based)
  • Testing: pytest

Cost Analysis

OpenAI text-embedding-3-small: $0.02 per 1M tokens

Example usage:

  • 10,000 documents × 750 tokens avg = 7.5M tokens
  • One-time embedding cost: $0.15
  • Per-query cost: ~$0.00003 (negligible)

At these rates, embedding cost is negligible for most use cases.
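
The arithmetic behind the example can be reproduced directly. The price and document figures come from the example above; the function name `embedding_cost` is hypothetical.

```python
PRICE_PER_MILLION_TOKENS = 0.02  # USD, text-embedding-3-small, as quoted above


def embedding_cost(
    num_documents: int,
    avg_tokens_per_document: int,
    price_per_million: float = PRICE_PER_MILLION_TOKENS,
) -> float:
    """One-time embedding cost in USD for a document set."""
    total_tokens = num_documents * avg_tokens_per_document
    return total_tokens / 1_000_000 * price_per_million


cost = embedding_cost(10_000, 750)  # 7.5M tokens
print(f"${cost:.2f}")  # $0.15
```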

Development

Running Tests

uv run pytest                                         # All tests
uv run pytest mcp-server/tests/test_embeddings.py    # Specific file

Code Quality

uv run black mcp-server/src/ mcp-server/tests/       # Format
uv run ruff check mcp-server/src/ mcp-server/tests/  # Lint

Troubleshooting

Database connection errors

docker-compose ps                      # Check if running
docker-compose logs postgres           # View logs
docker-compose restart                 # Restart
docker-compose down -v && docker-compose up -d  # Reset

Configuration issues

# Check global config
cat ~/.rag-memory-env

# Re-run first-run wizard
rm ~/.rag-memory-env
rag status

# Check environment variables
env | grep -E '(DATABASE_URL|OPENAI_API_KEY)'

MCP server not showing in agent

  • Check JSON syntax in MCP config (no trailing commas!)
  • Verify that both DATABASE_URL and OPENAI_API_KEY are set in the env section
  • Check MCP logs: ~/Library/Logs/Claude/mcp*.log (macOS)
  • Restart AI agent completely (quit and reopen)
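
The JSON syntax check in the first item can be automated with Python's standard json module, which rejects trailing commas. The sketch below is a minimal example; `check_mcp_config` is a hypothetical helper, and it only verifies syntax and the presence of an "mcpServers" entry, not whether the server actually starts.

```python
import json


def check_mcp_config(text: str, server_name: str = "rag-memory") -> list[str]:
    """Return a list of problems found in an MCP client config (empty if none)."""
    try:
        config = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON at line {exc.lineno}, column {exc.colno}: {exc.msg}"]
    servers = config.get("mcpServers") if isinstance(config, dict) else None
    if not isinstance(servers, dict):
        return ["no 'mcpServers' object found"]
    if server_name not in servers:
        return [f"no '{server_name}' entry under 'mcpServers'"]
    return []


good = '{"mcpServers": {"rag-memory": {"command": "rag-mcp-stdio", "args": []}}}'
bad = '{"mcpServers": {"rag-memory": {"command": "rag-mcp-stdio", "args": [],}}}'

print(check_mcp_config(good))  # []
print(check_mcp_config(bad))   # one "invalid JSON ..." message
```

To check a real file, pass its contents, for example the text read from claude_desktop_config.json.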

See docs/ENVIRONMENT_VARIABLES.md troubleshooting section for more.

License

MIT License - See LICENSE file for details.

Support

For issues:

  • Check troubleshooting sections above
  • Review documentation in docs/ directory
  • Check database logs: docker-compose logs -f

Built with PostgreSQL + pgvector for production-grade semantic search.
