PostgreSQL pgvector-based RAG memory system with MCP server

RAG Memory

A production-ready PostgreSQL + pgvector + Neo4j knowledge management system with dual storage for semantic search (RAG) and knowledge graphs. Works as an MCP server for AI agents and a standalone CLI tool.

⚡ Quick Start (30 minutes)

Open a new terminal and run:

git clone https://github.com/yourusername/rag-memory.git
cd rag-memory
python scripts/setup.py

This setup script will:

✅ Check you have Docker installed
✅ Start PostgreSQL and Neo4j containers
✅ Ask for your OpenAI API key
✅ Initialize your local knowledge base
✅ Install the rag CLI tool

That's it! After setup completes, you'll have a working RAG Memory system ready to use.

What Is This?

RAG Memory combines two powerful databases for knowledge management:

PostgreSQL + pgvector - Semantic search across document content (RAG layer)
Neo4j - Entity relationships and knowledge graphs (KG layer)

Both databases work together automatically - when you ingest a document, it's indexed in both systems simultaneously.

Two ways to use it:

MCP Server - Connect AI agents (Claude Desktop, Claude Code, Cursor) with 17 MCP tools
CLI Tool - Direct command-line access for testing, automation, and bulk operations

Key capabilities:

Semantic search with vector embeddings (pgvector + HNSW indexing)
Knowledge graph queries for relationships and entities
Web crawling and documentation ingestion
Document chunking for large files
Collection management for organizing knowledge
Full document lifecycle (create, read, update, delete)
Cross-platform configuration system

For Developers (Code Modifications)

If you want to modify the code:

# Clone repository
git clone https://github.com/yourusername/rag-memory.git
cd rag-memory

# Install dependencies
uv sync

# Copy environment template
cp .env.example .env
# Edit .env with your OPENAI_API_KEY

# Run tests or development commands
uv run pytest
uv run rag status

CLI Commands

Database & Status

rag init                      # Initialize database schema
rag status                    # Check database connection and stats
rag migrate                   # Run database migrations (Alembic)

Collection Management

rag collection create <name> --description TEXT  # Description now required
rag collection list
rag collection info <name>    # View stats and crawl history
rag collection update <name> --description TEXT  # Update collection description
rag collection delete <name>

Document Ingestion

Text:

rag ingest text "content" --collection <name> [--metadata JSON]

Files:

rag ingest file <path> --collection <name>
rag ingest directory <path> --collection <name> --extensions .txt,.md [--recursive]

Web Pages:

# Analyze website structure first
rag analyze https://docs.example.com

# Crawl single page
rag ingest url https://docs.example.com --collection docs

# Crawl with link following
rag ingest url https://docs.example.com --collection docs --follow-links --max-depth 2

# Re-crawl to update content
rag recrawl https://docs.example.com --collection docs --follow-links --max-depth 2

Search

# Basic search
rag search "query" --collection <name>

# Advanced options
rag search "query" --collection <name> --limit 10 --threshold 0.7 --verbose --show-source

# Search with metadata filter
rag search "query" --metadata '{"topic":"python"}'

Document Management

# List documents
rag document list [--collection <name>]

# View document details
rag document view <ID> [--show-chunks] [--show-content]

# Update document (re-chunks and re-embeds)
rag document update <ID> --content "new content" [--title "title"] [--metadata JSON]

# Delete document
rag document delete <ID> [--confirm]

MCP Server for AI Agents

RAG Memory exposes 17 tools via Model Context Protocol (MCP) for AI agent integration.

Quick Setup

1. Run the setup script:

git clone https://github.com/yourusername/rag-memory.git
cd rag-memory
python scripts/setup.py

After setup, RAG Memory's MCP server is automatically running in Docker on port 8000.

2. Connect to Claude Code:

claude mcp add --transport sse rag-memory http://localhost:8000/sse

3. Connect to Claude Desktop (optional):

Edit ~/Library/Application Support/Claude/claude_desktop_config.json (macOS path):

{
  "mcpServers": {
    "rag-memory": {
      "command": "rag-mcp-stdio",
      "args": []
    }
  }
}

Then restart Claude Desktop.

4. Test: Ask your agent "List RAG Memory collections"

Available MCP Tools (17 Total)

Core RAG (3 tools):

search_documents - Semantic search across knowledge base
list_collections - Discover available collections
ingest_text - Add text content with auto-chunking

Collection Management (2 tools):

create_collection - Create new collections (description required)
update_collection_description - Update existing collection descriptions

Document Management (4 tools):

list_documents - Browse documents with pagination
get_document_by_id - Retrieve full source document
update_document - Edit existing documents (triggers re-chunking/re-embedding)
delete_document - Remove outdated documents

Advanced Ingestion (5 tools):

get_collection_info - Collection stats and crawl history
analyze_website - Sitemap analysis for planning crawls
ingest_url - Crawl web pages with duplicate prevention (crawl/recrawl modes)
ingest_file - Ingest from file system
ingest_directory - Batch ingest from directories

The categories above list 14 of the tools. See docs/MCP_SERVER_GUIDE.md for the complete tool reference and examples.

Configuration System

RAG Memory uses a three-tier priority system for configuration:

Environment variables (highest priority) - Set in your shell
Project .env file (current directory only) - For developers
Global ~/.rag-memory-env (lowest priority) - For end users
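
The priority order above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the project's config_loader.py: the helper names parse_env_file and resolve_setting are hypothetical, and the file format is assumed to be simple KEY=VALUE lines.

```python
import os
from pathlib import Path


def parse_env_file(path):
    """Parse simple KEY=VALUE lines, skipping blank lines and comments."""
    values = {}
    if not path.is_file():
        return values
    for line in path.read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def resolve_setting(name, environ=None, project_dir=None, home_dir=None):
    """Return the value for `name` using the three-tier priority, or None.

    Hypothetical helper: the real loader may differ in names and details.
    """
    environ = os.environ if environ is None else environ
    project_dir = Path.cwd() if project_dir is None else Path(project_dir)
    home_dir = Path.home() if home_dir is None else Path(home_dir)

    # 1. Environment variables (highest priority)
    if name in environ:
        return environ[name]
    # 2. Project .env file, looked up in the current directory only
    project_values = parse_env_file(project_dir / ".env")
    if name in project_values:
        return project_values[name]
    # 3. Global ~/.rag-memory-env (lowest priority)
    return parse_env_file(home_dir / ".rag-memory-env").get(name)
```

Calling resolve_setting("OPENAI_API_KEY") with no other arguments checks the shell environment first, then ./.env, then ~/.rag-memory-env, and returns None if the key is not set anywhere.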

For CLI usage: The first run triggers an interactive setup wizard

For MCP server: Configuration comes from MCP client config (not files)

See docs/ENVIRONMENT_VARIABLES.md for complete details.

Key Features

Vector Search with pgvector

PostgreSQL 17 + pgvector extension
HNSW indexing for fast approximate nearest neighbor search
Vector normalization for accurate cosine similarity
Optimized for 95%+ recall
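
Why normalization matters can be shown with a small, dependency-free sketch: once vectors are scaled to unit length, their plain dot product equals their cosine similarity. The function names here are illustrative and are not taken from the project's embeddings.py.

```python
import math


def normalize(vector):
    """Scale a vector to unit length (L2 norm of 1)."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vector]


def cosine_similarity(a, b):
    """Cosine similarity computed directly from the definition."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


a, b = [3.0, 4.0], [4.0, 3.0]

# Dot product of the unit-length versions of a and b
dot_of_unit_vectors = sum(x * y for x, y in zip(normalize(a), normalize(b)))

# Both values agree (0.96 for this pair, up to floating-point error)
print(dot_of_unit_vectors, cosine_similarity(a, b))
```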

Document Chunking

Hierarchical text splitting (headers → paragraphs → sentences)
~1000 chars per chunk with 200 char overlap
Preserves context across boundaries
Each chunk independently embedded and searchable
Source documents preserved for full context retrieval
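
The size and overlap figures above can be illustrated with a simplified splitter. It is a sketch only, not the project's chunking.py: it assumes character-based windows and prefers to cut at a paragraph break, then a sentence end, then a space, which is a simplification of the hierarchical splitting described above. The name chunk_text is hypothetical.

```python
def chunk_text(text, chunk_size=1000, overlap=200):
    """Split text into chunks of at most chunk_size characters.

    Each chunk after the first starts `overlap` characters before the
    end of the previous one, so context carries across boundaries.
    """
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    chunks = []
    start = 0
    while start < len(text):
        end = min(start + chunk_size, len(text))
        if end < len(text):
            # Prefer a paragraph break, then a sentence end, then a space.
            # Searching past start + overlap guarantees forward progress.
            for separator in ("\n\n", ". ", " "):
                cut = text.rfind(separator, start + overlap + 1, end)
                if cut != -1:
                    end = cut + len(separator)
                    break
        chunks.append(text[start:end])
        if end == len(text):
            break
        # Step back so the next chunk repeats the tail of this one.
        start = end - overlap
    return chunks
```

With the defaults, a 5,000-character document yields a handful of chunks of up to 1,000 characters each, and the last 200 characters of one chunk are the first 200 of the next.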

Web Crawling

Built on Crawl4AI for robust web scraping
Sitemap.xml parsing for comprehensive crawls
Follow internal links with configurable depth
Duplicate prevention (crawl mode vs recrawl mode)
Crawl metadata tracking (root URL, session ID, timestamp)

Collection Management

Organize documents by topic/domain
Many-to-many relationships (documents can belong to multiple collections)
Search can be scoped to a specific collection
Collection statistics and crawl history
Required descriptions for better organization (enforced by database constraint)

Full Document Lifecycle

Create: Ingest from text, files, directories, URLs
Read: Search chunks, retrieve full documents
Update: Edit content with automatic re-chunking/re-embedding
Delete: Remove outdated documents and their chunks

Architecture

Database Schema

Source documents and chunks:

source_documents - Full original documents
document_chunks - Searchable chunks with embeddings (vector(1536))
collections - Named groupings (description required with NOT NULL constraint)
chunk_collections - Junction table (N:M relationship)

Indexes:

HNSW on document_chunks.embedding for fast vector search
GIN on metadata columns for efficient JSONB queries

Migrations:

Managed by Alembic (see docs/DATABASE_MIGRATION_GUIDE.md)
Version tracking in alembic_version table
Run migrations: uv run rag migrate

Python Application

src/
├── cli.py                 # Command-line interface
├── core/
│   ├── config_loader.py   # Three-tier environment configuration
│   ├── first_run.py       # Interactive setup wizard
│   ├── database.py        # PostgreSQL connection management
│   ├── embeddings.py      # OpenAI embeddings with normalization
│   ├── collections.py     # Collection CRUD operations
│   └── chunking.py        # Document text splitting
├── ingestion/
│   ├── document_store.py  # High-level document management
│   ├── web_crawler.py     # Web page crawling (Crawl4AI)
│   └── website_analyzer.py # Sitemap analysis
├── retrieval/
│   └── search.py          # Semantic search with pgvector
└── mcp/
    ├── server.py          # MCP server (FastMCP)
    └── tools.py           # 14 MCP tool implementations

Documentation

.reference/OVERVIEW.md - Quick overview for slash command
.reference/MCP_QUICK_START.md - MCP setup guide
docs/ENVIRONMENT_VARIABLES.md - Configuration system explained
docs/MCP_SERVER_GUIDE.md - Complete MCP tool reference (14 tools)
docs/DATABASE_MIGRATION_GUIDE.md - Database schema migration guide (Alembic)
CLAUDE.md - Development guide and CLI reference

Prerequisites

Docker & Docker Compose - For the PostgreSQL and Neo4j containers
uv - Fast Python package manager (curl -LsSf https://astral.sh/uv/install.sh | sh)
Python 3.12+ - Managed by uv
OpenAI API Key - For embedding generation (https://platform.openai.com/api-keys)

Technology Stack

Database: PostgreSQL 17 + pgvector extension (RAG layer), Neo4j (knowledge graph layer)
Language: Python 3.12
Package Manager: uv (Astral)
Embedding Model: OpenAI text-embedding-3-small (1536 dims)
Web Crawling: Crawl4AI (Playwright-based)
MCP Server: FastMCP (Anthropic)
CLI Framework: Click + Rich
Testing: pytest

Cost Analysis

OpenAI text-embedding-3-small: $0.02 per 1M tokens

Example usage:

10,000 documents × 750 tokens avg = 7.5M tokens
One-time embedding cost: $0.15
Per-query cost: ~$0.00003 (negligible)

At this rate, embedding cost stays well under a dollar for a corpus of this size.
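
The arithmetic behind the example can be reproduced with a short calculation. The price constant is the figure quoted above; the function name embedding_cost is illustrative.

```python
# USD per 1M tokens for text-embedding-3-small, as quoted above
PRICE_PER_MILLION_TOKENS = 0.02


def embedding_cost(total_tokens, price_per_million=PRICE_PER_MILLION_TOKENS):
    """Return the embedding cost in USD for a given number of tokens."""
    return total_tokens / 1_000_000 * price_per_million


# 10,000 documents at an average of 750 tokens each
total_tokens = 10_000 * 750
one_time_cost = embedding_cost(total_tokens)

print(f"{total_tokens:,} tokens cost ${one_time_cost:.2f}")  # 7,500,000 tokens cost $0.15
```

At the same rate, the quoted per-query figure of about $0.00003 corresponds to roughly 1,500 tokens, so short queries cost less than that.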

Development

Running Tests

uv run pytest                          # All tests
uv run pytest tests/test_embeddings.py # Specific file

Code Quality

uv run black src/ tests/               # Format
uv run ruff check src/ tests/          # Lint

Troubleshooting

Database connection errors

docker-compose ps                      # Check if running
docker-compose logs postgres           # View logs
docker-compose restart                 # Restart
docker-compose down -v && docker-compose up -d  # Reset

Configuration issues

# Check global config
cat ~/.rag-memory-env

# Re-run first-run wizard
rm ~/.rag-memory-env
rag status

# Check environment variables
env | grep -E '(DATABASE_URL|OPENAI_API_KEY)'

MCP server not showing in agent

Check JSON syntax in MCP config (no trailing commas!)
Verify that both DATABASE_URL and OPENAI_API_KEY are set in the env section of the MCP config
Check MCP logs: ~/Library/Logs/Claude/mcp*.log (macOS)
Restart AI agent completely (quit and reopen)
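
The first two checks can be automated with Python's standard json module, which rejects trailing commas. This is a minimal sketch, assuming the Claude Desktop config layout shown earlier with an optional env object per server; the function name find_config_problems is hypothetical and not part of RAG Memory.

```python
import json

# Variables the checklist above says must appear in the env section
REQUIRED_ENV = ("DATABASE_URL", "OPENAI_API_KEY")


def find_config_problems(text, server_name="rag-memory"):
    """Return a list of problems found in an MCP config string.

    An empty list means none of these particular checks failed.
    """
    try:
        config = json.loads(text)
    except json.JSONDecodeError as exc:
        # Trailing commas and other syntax errors end up here.
        return [f"invalid JSON at line {exc.lineno}, column {exc.colno}: {exc.msg}"]

    servers = config.get("mcpServers") if isinstance(config, dict) else None
    if not isinstance(servers, dict) or server_name not in servers:
        return [f'no "{server_name}" entry under "mcpServers"']

    entry = servers[server_name]
    env = entry.get("env", {}) if isinstance(entry, dict) else {}
    return [f"missing env variable {key}" for key in REQUIRED_ENV if key not in env]
```

To check a real file, read it first, for example with pathlib.Path(path).read_text(), and pass the contents to find_config_problems.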

See docs/ENVIRONMENT_VARIABLES.md troubleshooting section for more.

License

MIT License - See LICENSE file for details.

Support

For help getting started:

Run /getting-started slash command in Claude Code
Read .reference/OVERVIEW.md
Check docs/ENVIRONMENT_VARIABLES.md

For MCP server setup:

See .reference/MCP_QUICK_START.md
Read docs/MCP_SERVER_GUIDE.md

For issues:

Check troubleshooting sections above
Review documentation in docs/ directory
Check database logs: docker-compose logs -f

Built with PostgreSQL + pgvector for production-grade semantic search.

Release history

0.25.1

Jan 19, 2026

0.25.0

Jan 15, 2026

0.24.0

Jan 14, 2026

0.23.0

Jan 14, 2026

0.22.0

Jan 6, 2026

0.21.0

Dec 23, 2025

0.20.0

Dec 4, 2025

0.19.4

Nov 10, 2025

0.19.3

Nov 10, 2025

0.19.2

Nov 10, 2025

0.19.1

Nov 10, 2025

0.19.0

Nov 10, 2025

0.18.0

Nov 10, 2025

0.17.0

Oct 31, 2025

0.16.0

Oct 30, 2025

0.15.0

Oct 30, 2025

0.14.0

Oct 30, 2025

0.13.0

Oct 28, 2025

This version

0.12.1

Oct 28, 2025

0.12.0

Oct 28, 2025

0.11.0

Oct 28, 2025

0.10.0

Oct 28, 2025

0.9.0

Oct 22, 2025

0.8.0

Oct 22, 2025

0.7.0

Oct 13, 2025

0.6.0

Oct 13, 2025

0.5.3

Oct 12, 2025

0.5.2

Oct 12, 2025

0.5.1

Oct 12, 2025

0.5.0

Oct 12, 2025

0.4.0

Oct 12, 2025

0.3.0

Oct 12, 2025

0.2.0

Oct 12, 2025

0.1.0

Oct 12, 2025

Download files

Source Distribution

rag_memory-0.12.1.tar.gz (1.1 MB view details)

Uploaded Oct 28, 2025 Source

Built Distribution

rag_memory-0.12.1-py3-none-any.whl (119.3 kB view details)

Uploaded Oct 28, 2025 Python 3

File details

Details for the file rag_memory-0.12.1.tar.gz.

File metadata

Download URL: rag_memory-0.12.1.tar.gz
Upload date: Oct 28, 2025
Size: 1.1 MB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.12

File hashes

Hashes for rag_memory-0.12.1.tar.gz
Algorithm	Hash digest
SHA256	`ceb55833aa53a007dc58b7e7ecaf1ded550de906edbb1a6a8c1aa4ef12d1030d`
MD5	`56da7e23f5bf9d2fb0e028b0a6ef3446`
BLAKE2b-256	`6d8d7a5c1ab87218573cb0a8604bfcd3d2e0013dac76f9470c0498c80fa0a503`

File details

Details for the file rag_memory-0.12.1-py3-none-any.whl.

File metadata

Download URL: rag_memory-0.12.1-py3-none-any.whl
Upload date: Oct 28, 2025
Size: 119.3 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.12

File hashes

Hashes for rag_memory-0.12.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`995a48bbf02ab6f40564d2362b451496a9d018e519cd2871d86411b3663af568`
MD5	`4cf0e9924b45ab3ecf5d0cb9455ee17e`
BLAKE2b-256	`71cd899e9fed38b230b53fa23f60e0d10ea1eeffd490514ea3e70ed04d2548fd`

