PostgreSQL pgvector-based RAG memory system with MCP server
RAG Memory
A production-ready PostgreSQL + pgvector + Neo4j knowledge management system with dual storage for semantic search (RAG) and knowledge graphs. Works as an MCP server for AI agents and a standalone CLI tool.
⚡ Quick Start (30 minutes)
Open a new terminal and run:
git clone https://github.com/yourusername/rag-memory.git
cd rag-memory
python scripts/setup.py
This setup script will:
- ✅ Check you have Docker installed
- ✅ Start PostgreSQL and Neo4j containers
- ✅ Ask for your OpenAI API key
- ✅ Initialize your local knowledge base
- ✅ Install the rag CLI tool
That's it! After setup completes, you'll have a working RAG Memory system ready to use.
What Is This?
RAG Memory combines two powerful databases for knowledge management:
- PostgreSQL + pgvector - Semantic search across document content (RAG layer)
- Neo4j - Entity relationships and knowledge graphs (KG layer)
Both databases work together automatically - when you ingest a document, it's indexed in both systems simultaneously.
Two ways to use it:
- MCP Server - Connect AI agents (Claude Desktop, Claude Code, Cursor) with 17 MCP tools
- CLI Tool - Direct command-line access for testing, automation, and bulk operations
Key capabilities:
- Semantic search with vector embeddings (pgvector + HNSW indexing)
- Knowledge graph queries for relationships and entities
- Web crawling and documentation ingestion
- Document chunking for large files
- Collection management for organizing knowledge
- Full document lifecycle (create, read, update, delete)
- Cross-platform configuration system
For Developers (Code Modifications)
If you want to modify the code:
# Clone repository
git clone https://github.com/yourusername/rag-memory.git
cd rag-memory
# Install dependencies
uv sync
# Copy environment template
cp .env.example .env
# Edit .env with your OPENAI_API_KEY
# Run tests or development commands
uv run pytest
uv run rag status
CLI Commands
Database & Status
rag init # Initialize database schema
rag status # Check database connection and stats
rag migrate # Run database migrations (Alembic)
Collection Management
rag collection create <name> --description TEXT # Description now required
rag collection list
rag collection info <name> # View stats and crawl history
rag collection update <name> --description TEXT # Update collection description
rag collection delete <name>
Document Ingestion
Text:
rag ingest text "content" --collection <name> [--metadata JSON]
Files:
rag ingest file <path> --collection <name>
rag ingest directory <path> --collection <name> --extensions .txt,.md [--recursive]
Web Pages:
# Analyze website structure first
rag analyze https://docs.example.com
# Crawl single page
rag ingest url https://docs.example.com --collection docs
# Crawl with link following
rag ingest url https://docs.example.com --collection docs --follow-links --max-depth 2
# Re-crawl to update content
rag recrawl https://docs.example.com --collection docs --follow-links --max-depth 2
Search
# Basic search
rag search "query" --collection <name>
# Advanced options
rag search "query" --collection <name> --limit 10 --threshold 0.7 --verbose --show-source
# Search with metadata filter
rag search "query" --metadata '{"topic":"python"}'
Document Management
# List documents
rag document list [--collection <name>]
# View document details
rag document view <ID> [--show-chunks] [--show-content]
# Update document (re-chunks and re-embeds)
rag document update <ID> --content "new content" [--title "title"] [--metadata JSON]
# Delete document
rag document delete <ID> [--confirm]
MCP Server for AI Agents
RAG Memory exposes 17 tools via Model Context Protocol (MCP) for AI agent integration.
Quick Setup
1. Run the setup script:
git clone https://github.com/yourusername/rag-memory.git
cd rag-memory
python scripts/setup.py
After setup, RAG Memory's MCP server is automatically running in Docker on port 8000.
2. Connect to Claude Code:
claude mcp add --transport sse rag-memory http://localhost:8000/sse
3. Connect to Claude Desktop (optional):
Edit ~/Library/Application Support/Claude/claude_desktop_config.json (macOS path):
{
  "mcpServers": {
    "rag-memory": {
      "command": "rag-mcp-stdio",
      "args": []
    }
  }
}
Then restart Claude Desktop.
4. Test: Ask your agent "List RAG Memory collections"
Available MCP Tools (17 Total)
Core RAG (3 tools):
- search_documents - Semantic search across knowledge base
- list_collections - Discover available collections
- ingest_text - Add text content with auto-chunking
Collection Management (2 tools):
- create_collection - Create new collections (description required)
- update_collection_description - Update existing collection descriptions
Document Management (4 tools):
- list_documents - Browse documents with pagination
- get_document_by_id - Retrieve full source document
- update_document - Edit existing documents (triggers re-chunking/re-embedding)
- delete_document - Remove outdated documents
Advanced Ingestion (5 tools):
- get_collection_info - Collection stats and crawl history
- analyze_website - Sitemap analysis for planning crawls
- ingest_url - Crawl web pages with duplicate prevention (crawl/recrawl modes)
- ingest_file - Ingest from file system
- ingest_directory - Batch ingest from directories
See docs/MCP_SERVER_GUIDE.md for complete tool reference and examples.
Configuration System
RAG Memory uses a three-tier priority system for configuration:
- Environment variables (highest priority) - Set in your shell
- Project .env file (current directory only) - For developers
- Global ~/.rag-memory-env (lowest priority) - For end users
For CLI usage: First run triggers interactive setup wizard
For MCP server: Configuration comes from MCP client config (not files)
See docs/ENVIRONMENT_VARIABLES.md for complete details.
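The three-tier priority above can be pictured with a small lookup sketch. This is not the project's config_loader.py; the function names, the simple KEY=VALUE parsing, and the injectable paths are assumptions made for illustration only.

```python
import os
from pathlib import Path


def parse_env_file(path: Path) -> dict[str, str]:
    """Parse simple KEY=VALUE lines; return {} if the file does not exist."""
    if not path.is_file():
        return {}
    values = {}
    for line in path.read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("'\"")
    return values


def resolve_config(keys, environ=None, project_env=None, global_env=None):
    """Resolve each key from the highest-priority layer that defines it.

    Hypothetical helper: the priority order mirrors the list above
    (shell environment, then project .env, then ~/.rag-memory-env).
    """
    environ = os.environ if environ is None else environ
    project_env = Path(".env") if project_env is None else project_env
    global_env = Path.home() / ".rag-memory-env" if global_env is None else global_env

    layers = [dict(environ), parse_env_file(project_env), parse_env_file(global_env)]
    resolved = {}
    for key in keys:
        for layer in layers:
            if key in layer:
                resolved[key] = layer[key]
                break  # first (highest-priority) hit wins
    return resolved
```

Calling resolve_config(["DATABASE_URL", "OPENAI_API_KEY"]) returns only the keys that some layer defines, so a missing key is easy to detect before connecting.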
Key Features
Vector Search with pgvector
- PostgreSQL 17 + pgvector extension
- HNSW indexing for fast approximate nearest neighbor search
- Vector normalization for accurate cosine similarity
- Optimized for 95%+ recall
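To see why normalization matters for cosine similarity, the sketch below shows that once two vectors are scaled to unit length, their plain dot product equals their cosine similarity. It uses short Python lists instead of real 1536-dimension embeddings, and the function names are illustrative rather than taken from the project's embeddings.py.

```python
import math


def normalize(vec):
    """Scale a vector to unit length (L2 norm of 1)."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a, b):
    """Cosine similarity computed directly from unnormalized vectors."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


a = [3.0, 4.0]
b = [4.0, 3.0]

# For unit vectors, the dot product and the cosine similarity coincide.
assert math.isclose(dot(normalize(a), normalize(b)), cosine_similarity(a, b))
```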
Document Chunking
- Hierarchical text splitting (headers → paragraphs → sentences)
- ~1000 chars per chunk with 200 char overlap
- Preserves context across boundaries
- Each chunk independently embedded and searchable
- Source documents preserved for full context retrieval
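The size and overlap figures above can be illustrated with a plain sliding window. This sketch deliberately ignores the hierarchical splitting (headers, paragraphs, sentences) and cuts on character positions only; chunk_text and its parameters are hypothetical and not the project's chunking.py.

```python
def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> list[str]:
    """Split text into fixed-size character windows that share `overlap` characters."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")

    step = chunk_size - overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the current window already reached the end of the text
    return chunks
```

With the defaults, a 2,500-character document yields three chunks, and the last 200 characters of each chunk reappear at the start of the next one, which is what keeps context across boundaries.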
Web Crawling
- Built on Crawl4AI for robust web scraping
- Sitemap.xml parsing for comprehensive crawls
- Follow internal links with configurable depth
- Duplicate prevention (crawl mode vs recrawl mode)
- Crawl metadata tracking (root URL, session ID, timestamp)
Collection Management
- Organize documents by topic/domain
- Many-to-many relationships (documents can belong to multiple collections)
- Search can be scoped to specific collection
- Collection statistics and crawl history
- Required descriptions for better organization (enforced by database constraint)
Full Document Lifecycle
- Create: Ingest from text, files, directories, URLs
- Read: Search chunks, retrieve full documents
- Update: Edit content with automatic re-chunking/re-embedding
- Delete: Remove outdated documents and their chunks
Architecture
Database Schema
Source documents and chunks:
- source_documents - Full original documents
- document_chunks - Searchable chunks with embeddings (vector(1536))
- collections - Named groupings (description required with NOT NULL constraint)
- chunk_collections - Junction table (N:M relationship)
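The junction-table idea can be tried out with a small in-memory sketch. It uses SQLite from the Python standard library rather than PostgreSQL, and the column names are assumptions; only the table names and the NOT NULL description come from the list above.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create a simplified, SQLite-only stand-in for the schema (columns are assumed)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE collections (
            id INTEGER PRIMARY KEY,
            name TEXT NOT NULL UNIQUE,
            description TEXT NOT NULL
        );
        CREATE TABLE document_chunks (
            id INTEGER PRIMARY KEY,
            content TEXT NOT NULL
        );
        -- Junction table: one row per (chunk, collection) membership.
        CREATE TABLE chunk_collections (
            chunk_id INTEGER NOT NULL REFERENCES document_chunks(id),
            collection_id INTEGER NOT NULL REFERENCES collections(id),
            PRIMARY KEY (chunk_id, collection_id)
        );
        """
    )
    return conn


def chunks_in_collection(conn: sqlite3.Connection, collection_name: str) -> list[str]:
    """Return chunk contents that belong to the named collection."""
    rows = conn.execute(
        """
        SELECT c.content
        FROM document_chunks AS c
        JOIN chunk_collections AS cc ON cc.chunk_id = c.id
        JOIN collections AS col ON col.id = cc.collection_id
        WHERE col.name = ?
        ORDER BY c.id
        """,
        (collection_name,),
    ).fetchall()
    return [row[0] for row in rows]
```

Because membership lives in its own table, the same chunk can appear in several collections without being duplicated, and scoping a search to one collection becomes a join on that table.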
Indexes:
- HNSW on document_chunks.embedding for fast vector search
- GIN on metadata columns for efficient JSONB queries
Migrations:
- Managed by Alembic (see docs/DATABASE_MIGRATION_GUIDE.md)
- Version tracking in alembic_version table
- Run migrations: uv run rag migrate
Python Application
src/
├── cli.py # Command-line interface
├── core/
│ ├── config_loader.py # Three-tier environment configuration
│ ├── first_run.py # Interactive setup wizard
│ ├── database.py # PostgreSQL connection management
│ ├── embeddings.py # OpenAI embeddings with normalization
│ ├── collections.py # Collection CRUD operations
│ └── chunking.py # Document text splitting
├── ingestion/
│ ├── document_store.py # High-level document management
│ ├── web_crawler.py # Web page crawling (Crawl4AI)
│ └── website_analyzer.py # Sitemap analysis
├── retrieval/
│ └── search.py # Semantic search with pgvector
└── mcp/
    ├── server.py # MCP server (FastMCP)
    └── tools.py # 14 MCP tool implementations
Documentation
- .reference/OVERVIEW.md - Quick overview for slash command
- .reference/MCP_QUICK_START.md - MCP setup guide
- docs/ENVIRONMENT_VARIABLES.md - Configuration system explained
- docs/MCP_SERVER_GUIDE.md - Complete MCP tool reference (14 tools)
- docs/DATABASE_MIGRATION_GUIDE.md - Database schema migration guide (Alembic)
- CLAUDE.md - Development guide and CLI reference
Prerequisites
- Docker & Docker Compose - For the PostgreSQL and Neo4j containers
- uv - Fast Python package manager (curl -LsSf https://astral.sh/uv/install.sh | sh)
- Python 3.12+ - Managed by uv
- OpenAI API Key - For embedding generation (https://platform.openai.com/api-keys)
Technology Stack
- Database: PostgreSQL 17 + pgvector extension
- Language: Python 3.12
- Package Manager: uv (Astral)
- Embedding Model: OpenAI text-embedding-3-small (1536 dims)
- Web Crawling: Crawl4AI (Playwright-based)
- MCP Server: FastMCP (Anthropic)
- CLI Framework: Click + Rich
- Testing: pytest
Cost Analysis
OpenAI text-embedding-3-small: $0.02 per 1M tokens
Example usage:
- 10,000 documents × 750 tokens avg = 7.5M tokens
- One-time embedding cost: $0.15
- Per-query cost: ~$0.00003 (negligible)
Extremely cost-effective for most use cases.
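The arithmetic behind the example can be reproduced in a few lines. The price constant is the figure quoted above and may change; the function name is illustrative.

```python
import math

# Price quoted above for text-embedding-3-small; check current pricing before relying on it.
PRICE_PER_MILLION_TOKENS_USD = 0.02


def embedding_cost_usd(num_documents: int, avg_tokens_per_document: float,
                       price_per_million: float = PRICE_PER_MILLION_TOKENS_USD) -> float:
    """Estimate the one-time cost of embedding a corpus."""
    total_tokens = num_documents * avg_tokens_per_document
    return total_tokens / 1_000_000 * price_per_million


# 10,000 documents x 750 tokens = 7.5M tokens, which comes to about $0.15.
assert math.isclose(embedding_cost_usd(10_000, 750), 0.15)
```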
Development
Running Tests
uv run pytest # All tests
uv run pytest tests/test_embeddings.py # Specific file
Code Quality
uv run black src/ tests/ # Format
uv run ruff check src/ tests/ # Lint
Troubleshooting
Database connection errors
docker-compose ps # Check if running
docker-compose logs postgres # View logs
docker-compose restart # Restart
docker-compose down -v && docker-compose up -d # Reset (the -v flag deletes the data volumes)
Configuration issues
# Check global config
cat ~/.rag-memory-env
# Re-run first-run wizard
rm ~/.rag-memory-env
rag status
# Check environment variables
env | grep -E '(DATABASE_URL|OPENAI_API_KEY)'
MCP server not showing in agent
- Check JSON syntax in MCP config (no trailing commas!)
- Verify both DATABASE_URL and OPENAI_API_KEY in env section
- Check MCP logs: ~/Library/Logs/Claude/mcp*.log (macOS)
- Restart AI agent completely (quit and reopen)
See docs/ENVIRONMENT_VARIABLES.md troubleshooting section for more.
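For the JSON syntax check, a short script can catch trailing commas and a missing command field before the agent is restarted. This is a minimal sketch, not a tool shipped with RAG Memory; check_mcp_config is a hypothetical helper and it only inspects the mcpServers structure shown earlier.

```python
import json


def check_mcp_config(text: str) -> list[str]:
    """Return a list of problems found; an empty list means the config looks fine."""
    try:
        config = json.loads(text)  # strict JSON: trailing commas are rejected
    except json.JSONDecodeError as exc:
        return [f"Invalid JSON at line {exc.lineno}, column {exc.colno}: {exc.msg}"]

    if not isinstance(config, dict):
        return ["Top-level value must be a JSON object"]

    servers = config.get("mcpServers")
    if not isinstance(servers, dict) or not servers:
        return ["Missing or empty 'mcpServers' object"]

    problems = []
    for name, entry in servers.items():
        if not isinstance(entry, dict) or "command" not in entry:
            problems.append(f"Server '{name}' has no 'command' field")
    return problems
```

To use it, read the config file into a string (for example with pathlib.Path(...).read_text()) and print whatever the function returns.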
License
MIT License - See LICENSE file for details.
Support
For help getting started:
- Run /getting-started slash command in Claude Code
- Read .reference/OVERVIEW.md
- Check docs/ENVIRONMENT_VARIABLES.md
For MCP server setup:
- Read .reference/MCP_QUICK_START.md
- See docs/MCP_SERVER_GUIDE.md
For issues:
- Check troubleshooting sections above
- Review documentation in docs/ directory
- Check database logs: docker-compose logs -f
Built with PostgreSQL + pgvector for production-grade semantic search.