The fastest, simplest way to add knowledge-graph-powered RAG to any app — backed by PostgreSQL.
pg-raggraph
PostgreSQL-native GraphRAG. Vector search, full-text search, and knowledge-graph traversal — all in a single SQL query. No Neo4j. No Pinecone. No Apache AGE. Just the Postgres you already run.
What this is
pg-raggraph is a Python library for GraphRAG on plain PostgreSQL. You point it at a directory of documents, it ingests them — chunks, embeddings, entities, relationships, full-text index — and you get back a query API that combines vector similarity, BM25, and graph traversal. All retrieval happens in one round-trip to Postgres.
It is also a full toolkit around that library: a CLI (pgrg), an optional FastAPI server with a web UI, and an MCP server for Claude Desktop / Cursor / Zed.
Two retrieval workloads are first-class:
- Classic GraphRAG — static corpora, code Q&A, technical docs, multi-hop entity reasoning. Validated at +18.9% accuracy lift over plain vector search on a real 909-doc dev codebase.
- Evolving knowledge — corpora where the right answer depends on time, version, or retraction status. Validated on Python 3.10/3.11/3.12 docs (13/13 perfect version-filter purity) and PubMed HRT retractions (15/15 perfect on retraction-aware + time-travel queries).
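The evolving-knowledge semantics are easiest to see in miniature. Below is a plain-Python sketch of the visibility rules — not the library's implementation, which lives in SQL — using the evolution fields named elsewhere in this README (`effective_from`, `retracted_at`, `as_of`, `retracted_behavior`); the document names and dates are invented for illustration:

```python
from datetime import date

# Illustrative chunk metadata mirroring the Tier 1 evolution columns.
chunks = [
    {"doc": "hrt-1998.md", "effective_from": date(1998, 1, 1), "retracted_at": date(2002, 7, 9)},
    {"doc": "whi-2002.md", "effective_from": date(2002, 7, 9), "retracted_at": None},
]

def visible(chunk, as_of=None, retracted_behavior="hide"):
    """Is this chunk retrievable under time-travel + retraction rules?"""
    if as_of is not None:
        # Time travel: the fact must already be effective at as_of,
        # and a later retraction has not "happened yet".
        if chunk["effective_from"] > as_of:
            return False
        return chunk["retracted_at"] is None or chunk["retracted_at"] > as_of
    if retracted_behavior == "hide":
        return chunk["retracted_at"] is None
    return True  # "show" keeps retracted chunks, e.g. for provenance

# A present-day query hides the superseded 1998 consensus...
assert [c["doc"] for c in chunks if visible(c)] == ["whi-2002.md"]
# ...while as_of=2001-12-31 resurfaces it and hides the future WHI paper.
assert [c["doc"] for c in chunks if visible(c, as_of=date(2001, 12, 31))] == ["hrt-1998.md"]
```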
Why it exists
Most GraphRAG today means stitching together two or three databases:
- A vector DB (Pinecone, Weaviate, Qdrant) for semantic search.
- A graph DB (Neo4j) for relationship traversal.
- An orchestrator on top — LangChain, LlamaIndex, or hand-rolled.
That's three deploy targets, three connection pools, three sets of credentials, three failure modes, three vendors to negotiate with. And the killer GraphRAG operation — "find chunks similar to X, then expand via the entity graph" — needs at least two round-trips, often more, because vector and graph live in different worlds.
pg-raggraph proves you don't need any of that. PostgreSQL already has:
- pgvector — vector similarity search with HNSW or IVFFlat indexes.
- pg_trgm — trigram fuzzy matching, perfect for entity resolution.
- Recursive CTEs — fast, well-indexed graph traversal that the planner understands.
- tsvector + to_tsquery — production-grade full-text search with BM25-equivalent ranking.
Combine them in one SQL query and you have a complete GraphRAG stack. One ACID-compliant database. One backup story. One thing to monitor. Works on every managed Postgres — AWS RDS, Supabase, Neon, GCP Cloud SQL, Azure, self-hosted — anywhere modern PostgreSQL runs.
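To make the single-round-trip claim concrete, here is the rough shape of such a combined query. Table and column names (`chunks`, `chunk_entities`, `relationships`) are illustrative stand-ins, not pg-raggraph's actual schema, and the 1-hop join generalizes to deeper hops with a recursive CTE:

```sql
-- One round-trip: vector seed + 1-hop graph expansion + merged ranking.
-- Names are illustrative; deeper traversal would use WITH RECURSIVE.
WITH seeds AS (
  SELECT id, 1 - (embedding <=> :query_vec) AS score   -- pgvector cosine
  FROM chunks
  ORDER BY embedding <=> :query_vec
  LIMIT 10
),
expanded AS (
  SELECT ce2.chunk_id AS id, 0.5 * s.score AS score    -- decay expanded hits
  FROM seeds s
  JOIN chunk_entities ce1 ON ce1.chunk_id  = s.id
  JOIN relationships  r   ON r.source_id   = ce1.entity_id
  JOIN chunk_entities ce2 ON ce2.entity_id = r.target_id
)
SELECT id, MAX(score) AS score
FROM (SELECT id, score FROM seeds
      UNION ALL
      SELECT id, score FROM expanded) u
GROUP BY id
ORDER BY score DESC
LIMIT 10;
```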
The thesis is decided by benchmark, not opinion. See Tests and benchmarks below.
Wait — isn't it called graphrag, not raggraph?
The name flip is deliberate. Most "GraphRAG" systems lead with the graph: docs get converted to entities and relationships up front, the graph is the corpus, and retrieval is graph-walks looking for relevant subgraphs. That's the Microsoft GraphRAG / LightRAG / Neo4j-GraphRAG model.
That model misreads what most corpora actually are. Documentation, technical articles, code, support tickets, papers, chat logs — none of these start out as graphs. They're prose. They answer most questions through plain semantic similarity. Forcing them through an entity-extraction pipeline first, then querying the resulting graph, adds latency, LLM cost, and information loss without buying you much for the bulk of queries.
pg-raggraph inverts the order. The graph is an enhancer, not the main attraction. A query starts as RAG — vector similarity + BM25 — and the graph layer kicks in only when retrieval needs help: re-ranking the top-K via 1-hop entity connectivity (naive_boost), or expanding to chunks reachable through entity relationships when the seed retrieval is weak (local / hybrid). Graph helps finish the story, not start it.
This isn't aesthetic preference. The bake-off confirms it: on clean technical corpora, graph-only retrieval modes don't beat plain vector + BM25. They earn their cost when the chunker is weak, when the corpus has cross-document entity reasoning, or when you need explainability and provenance trails. Calling it "raggraph" rather than "graphrag" reflects that ordering: RAG first, graph second, and only when it pays.
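The 1-hop boost idea fits in a few lines. This is a toy re-implementation in plain Python to show the mechanism — the entity names, adjacency structure, and `alpha` weight are made up; the real re-rank happens in SQL:

```python
# Toy naive_boost: re-rank vector-retrieval hits by how well connected
# their entities are to entities mentioned in the query (1 hop).
query_entities = {"auth-service"}

# (chunk_id, vector score, entities mentioned in the chunk)
hits = [
    ("chunk-a", 0.62, {"billing"}),
    ("chunk-b", 0.58, {"auth-service", "platform-team"}),
]

# 1-hop adjacency in the entity graph
graph = {"auth-service": {"platform-team", "jwt"}}

def boost(score, entities, alpha=0.1):
    """Add a bonus per entity reachable within 1 hop of the query's entities."""
    reachable = set(query_entities)
    for e in query_entities:
        reachable |= graph.get(e, set())
    return score + alpha * len(entities & reachable)

ranked = sorted(hits, key=lambda h: boost(h[1], h[2]), reverse=True)
assert [h[0] for h in ranked] == ["chunk-b", "chunk-a"]  # the boost flips the order
```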
Quickstart — 5 minutes, works cold
This is verified to reproduce on a fresh clone. Every command is copy-pasteable.
```bash
# 1. Clone (pg-raggraph is alpha and not yet on PyPI)
git clone https://github.com/yonk-labs/pg_raggraph
cd pg_raggraph

# 2. Install Python deps (uv recommended; falls back to pip if you must)
uv sync

# 3. Start a local Postgres with pgvector + pg_trgm pre-installed
docker compose up -d postgres

# 4. Pick an LLM endpoint (skip if you only want pure vector RAG)
# Option A — OpenAI:
export PGRG_LLM_BASE_URL=https://api.openai.com/v1
export PGRG_LLM_API_KEY=sk-...          # your key
export PGRG_LLM_MODEL=gpt-4o-mini
# Option B — local Ollama (free):
# ollama pull llama3.2 && ollama serve  # leave running in another shell
# (PGRG defaults to Ollama at http://localhost:11434/v1, so no env needed)

# 5. Ingest a directory and ask questions
uv run pgrg devmem ingest ./my-repo/
uv run pgrg devmem ask "who owns the authentication service?"
```
If your LLM endpoint is up and your repo has docs/code, you'll see something like:
```text
Found 12 files to process.
[1/12] README.md: 8 entities, 14 rels
[2/12] auth/service.py: 5 entities, 11 rels
...
Done: 12 ingested, 0 skipped. 87 entities, 156 relationships.

Answer: The authentication service is owned by the platform team.
Sarah Chen leads platform; auth.py was last touched by alex@acme.com
in commit 4f2c8a1 ("rotate JWT signing key").

Sources:
  [0.79] auth/README.md
  [0.71] team/platform.md
  [0.68] commits/4f2c8a1.md
```
That's the whole loop. From git clone to a grounded answer in five minutes.
One thing to know about
`pgrg serve` — the bundled FastAPI web UI is for local development and demos only. It ships without authentication, rate limiting, or upload size caps. Do not expose it directly to the public internet. For production, put it behind a reverse proxy that adds auth, TLS, and rate limits — or embed `create_app()` in your own FastAPI application. See `docs/user-guide.md#production-deployment` for the recommended setup.
Tests and benchmarks
Real numbers from real corpora. No cherry-picking.
Classic GraphRAG — pg-agents real dev codebase (909 docs, 17K entities, 38K relationships):
| Mode | Avg top score | Latency p50 | vs naive |
|---|---|---|---|
| naive (vector + BM25) | 0.602 | 109 ms | baseline |
| naive_boost ⭐ | 0.716 | 107 ms | +18.9% |
| smart (default) | 0.716 | 127 ms | +18.9% at routing |
| local (graph traversal) | 0.614 | 423 ms | +1.9% |
| hybrid (local + global) | 0.614 | 482 ms | +1.9% |
Evolving knowledge — versioned docs (benchmarks/python-versioned-docs/):
12 docs (Python 3.10 / 3.11 / 3.12), 1364 chunks, 15 hand-written gold questions.
| Threshold | Result | Pass? |
|---|---|---|
| ≥ 80% of version_filter-tagged Qs return top-5 chunks ONLY from matching version | 100% (13/13) | ✅ |
| ≥ 1 unfiltered_target Q has expected version in top-3 | 1/2 | ✅ |
Evolving knowledge — medical retractions (benchmarks/medical-hrt/):
48 PubMed abstracts on HRT + cardiovascular outcomes (1998–2025), 7 of them epistemically retracted (WHI 2002 superseded the prior consensus), 15 hand-written gold questions.
| Threshold | Result | Pass? |
|---|---|---|
| ≥ 4/5 retraction_aware Qs return top-5 with zero retracted in retracted_behavior="hide" mode | 5/5 | ✅ |
| ≥ 1/5 time-travel Qs (as_of=2001-12-31) return ≥1 pre-2002 paper in top-5 | 5/5 | ✅ |
Versus Apache AGE — SCOTUS bake-off (772 docs, 30 questions × 3 runs × 6 modes per engine):
| Axis | pg-raggraph | Apache AGE |
|---|---|---|
| Accuracy (fully_correct/30) | 17–18 | 17–18 (tie) |
| Retrieval p50 latency | 32–73 ms | 3,079–3,906 ms (42–111× slower) |
| Cloud compatibility | RDS, Supabase, Neon, Cloud SQL, Azure, self-host | Azure only |
Full bake-off report: benchmarks/age-bakeoff/results/REPORT-VERDICT.md.
Test suite: 195 passing tests across tests/unit/ and tests/integration/, including a 15-test error-path suite that asserts specific exception types on bad DSNs, naive as_of, oversize /ingest, path traversal, etc. CI runs the full suite against pgvector containers on Python 3.12 and 3.13.
Where to go next
┌──────────────────────────────────────────────────┐
│ I want to … │
├──────────────────────────────────────────────────┤
│ Pick the right workload → USE-CASES.md │
│ Walk a worked example → blog series │
│ Get the full API surface → user-guide.md │
│ Tier-1 evolving-knowledge → cookbook │
│ Avoid common API gotchas → API-QUICKREF │
│ Read the architecture decisions → research/ │
│ See the unvarnished critique → ASSESSMENT.md │
└──────────────────────────────────────────────────┘
| Document | What's inside |
|---|---|
| docs/USE-CASES.md | Decision matrix: classic GraphRAG vs evolving knowledge. Corpus shape → recommended config. |
| docs/blogs/01-intro-classic-vs-evolving.md | Series intro: two workloads, one Postgres database, when each one applies. |
| docs/blogs/02-path-a-versioned-python-docs.md | Walkthrough: ingest Python 3.10/3.11/3.12 docs, query with version_filter. |
| docs/blogs/03-path-b-medical-retractions.md | Walkthrough: ingest PubMed HRT abstracts, demonstrate retracted_behavior and as_of. |
| docs/cookbook/evolution-tracking.md | Tier 1 quickstart — effective_from, retracted, version_label ingest + query patterns. |
| docs/EVOLUTION-API-QUICKREF.md | Common assumptions vs reality for the Tier 1 API (which kwargs are per-query vs config-only, schema column locations, semantics of as_of × retracted_at). |
| docs/user-guide.md | Full user guide: installation, all 6 modes, configuration, REST API, production deployment, troubleshooting. |
| docs/devmem-guide.md | pgrg devmem — the developer-knowledge-base flavor with code-aware chunking + dev-tuned extraction. |
| research/ | Architecture rationale, vs-AGE evaluation, competitor analyses (LightRAG, Neo4j, Zep). |
| ASSESSMENT.md | No-BS project evaluation: strengths, gaps, where you should and shouldn't use it. |
| benchmarks/ | Every benchmark corpus + runner + results document. Re-runnable from clone. |
The weeds
Below this line is the reference material — architecture, the retrieval-mode menu, every environment variable, the schema, and the prior-art rebuttals. Read on if you want to go deep; skip if you just want to get something working.
Architecture
```mermaid
graph TB
    subgraph PGRG["pg-raggraph (Python, ~4K LOC core)"]
        CLI[pgrg CLI]
        API[FastAPI server]
        MCP[MCP server]
        SDK[GraphRAG SDK]
        CLI --> SDK
        API --> SDK
        MCP --> SDK
        SDK --> ING[Ingestion Pipeline]
        SDK --> RET[Retrieval Engine]
        ING --> CHK[Chunker<br/>markdown / code / text]
        ING --> EMB[fastembed<br/>local 384-dim]
        ING --> EXT[LLM extractor<br/>OpenAI-compatible]
        ING --> RES[Entity resolver<br/>pg_trgm + vector]
        RET --> SM[Smart Router]
        SM --> NV[naive: vector + BM25]
        SM --> GB[graph boost: 1-hop re-rank]
        SM --> LC[local / global / hybrid:<br/>recursive CTEs]
    end
    subgraph PG["PostgreSQL 16+"]
        PGV[pgvector HNSW]
        PGT[pg_trgm GIN]
        FTS[tsvector full-text]
        TBL[(documents · chunks ·<br/>entities · relationships ·<br/>document_versions ·<br/>facts · fact_edges)]
    end
    NV --> PGV
    NV --> FTS
    GB --> TBL
    LC --> TBL
    RES --> PGT
    RES --> PGV
```
Two extensions — pgvector (vector search) and pg_trgm (built into Postgres in most builds). Auto-bootstrapped schema. Migrations applied on first connect under a per-project advisory lock. Everything else is plain SQL.
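The advisory-lock migration pattern is the standard Postgres one. A sketch — the actual lock-key derivation is internal to the library; hashing a project string with `hashtext` is just an illustration:

```sql
-- Serialize schema bootstrap across concurrent connections.
-- Key derivation here is illustrative, not pg-raggraph's internals.
SELECT pg_advisory_lock(hashtext('pg_raggraph:default'));
-- ... apply any pending migrations idempotently ...
SELECT pg_advisory_unlock(hashtext('pg_raggraph:default'));
```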
Retrieval modes
smart (the default) routes between three strategies based on confidence: ship-as-is when the naive top score is high, apply a cheap graph boost when medium, escalate to graph expansion when low. Manually pin to a specific mode with mode="..." if you know your access pattern.
| Mode | What it does | Typical latency |
|---|---|---|
| smart ⭐ | Routes between naive / boost / expand based on confidence | 85–220 ms |
| naive | Vector similarity + BM25 | ~85 ms |
| naive_boost | Naive + 1-hop graph re-rank | ~90 ms |
| local | Seed → recursive CTE traversal → rank | ~220 ms |
| global | Relationship-centric retrieval | ~150 ms |
| hybrid | local + global merged | ~450 ms |
Full deep-dive with selection guidance and per-mode SQL: docs/modes.md. Schema diagram + ER relationships: docs/user-guide.md#schema-overview.
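The smart router's decision logic is simple enough to sketch in a few lines of plain Python. The thresholds below are invented for illustration — they are not pg-raggraph's actual defaults (see docs/modes.md for those):

```python
def route(naive_top_score, high=0.75, low=0.45):
    """Toy version of the smart router. Thresholds are illustrative only."""
    if naive_top_score >= high:
        return "naive"        # confident: ship vector + BM25 results as-is
    if naive_top_score >= low:
        return "naive_boost"  # medium: apply the cheap 1-hop graph re-rank
    return "hybrid"           # weak seed: escalate to full graph expansion

assert route(0.82) == "naive"
assert route(0.60) == "naive_boost"
assert route(0.30) == "hybrid"
```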
Configuration (essentials)
All settings via env vars prefixed PGRG_ (also work as kwargs to GraphRAG(...)). The most-used ones:
| Variable | Default | What it does |
|---|---|---|
| PGRG_DSN | postgresql://postgres:postgres@localhost:5434/pg_raggraph | Database connection. Refuses to start if PGRG_ENV=production and DSN unchanged. |
| PGRG_NAMESPACE | default | Data isolation key. |
| PGRG_LLM_BASE_URL | http://localhost:11434/v1 | OpenAI-compatible LLM endpoint. |
| PGRG_LLM_API_KEY | "" | Bearer token (empty for Ollama). |
| PGRG_EVOLUTION_TIER | off | off / structural (Tier 1 evolution-aware). |
| PGRG_INGEST_PROFILE | balanced | conservative / balanced / aggressive / max. |
| PGRG_LOG_FORMAT | (unset) | Set to json for structured logging (Datadog / ELK / Loki). |
| PGRG_SERVER_API_KEY | (unset) | Enables Bearer auth on the FastAPI server. |
Full reference (~25 vars including evolution scoring weights, entity-resolution thresholds, server upload caps, Origin allowlists): docs/user-guide.md#configuration.
CLI reference
```bash
# Core
pgrg init                                  # Bootstrap schema, verify connection
pgrg ingest PATH... [-n NS] [-p PROFILE]   # Ingest files / directories
pgrg query "question" [-m MODE] [-n NS]    # Query (default: smart mode)
pgrg ask "question" [-m MODE] [-n NS]      # Query + grounded LLM answer
pgrg status [-n NS]                        # Graph statistics
pgrg delete -n NS                          # Delete a namespace's data

# Servers
pgrg serve --port 8080                     # FastAPI + web UI (local/demo only)
pgrg demo                                  # Auto-ingest sample data + launch UI
pgrg mcp-serve                             # MCP stdio server for Claude Desktop / Cursor / Zed

# Developer-knowledge-base flavor (code-aware chunking + dev extraction prompt)
pgrg devmem ingest ./repo/ -p aggressive
pgrg devmem ask "who owns the auth service?"
```
Throttle profiles tune CPU-yield + parallel ingest knobs:
| Profile | doc_concurrency | extract_concurrency | embed_batch_size | Use case |
|---|---|---|---|---|
| conservative | 1 | 4 | 8 | Shared servers, laptops on battery |
| balanced | 2 | 8 | 16 | Default — most dev machines |
| aggressive | 4 | 16 | 32 | Dedicated dev box |
| max | 8 | 32 | 64 | One-off batch jobs on a beefy machine |
Why not Apache AGE?
We evaluated AGE (PostgreSQL's graph extension) before writing a line of code. We rejected it for four reasons:
- Cloud killed. AGE requires shared_preload_libraries — among managed providers, only Azure supports it. No RDS, Supabase, Neon, or Cloud SQL.
- Can't combine with pgvector in a single query. AGE Cypher and pgvector live in different worlds. The killer GraphRAG operation needs two round-trips with AGE; one query with recursive CTEs.
- Slower for GraphRAG patterns. Bake-off measurements: AGE is 42–111× slower on retrieval than recursive CTEs for the typical 1–3 hop pattern.
- Production disaster. LightRAG Issue #2255: a 17-hour migration with AGE, caused by a query plan estimating 49 billion intermediate rows for a 681K-row join. Closed NOT_PLANNED.
Full analysis: research/apache-age-evaluation.md. Bake-off verdict: benchmarks/age-bakeoff/results/REPORT-VERDICT.md.
Comparison
| | pg-raggraph | LightRAG | Neo4j GraphRAG | Zep |
|---|---|---|---|---|
| PostgreSQL-native | ✅ | AGE adapter (Azure only) | ❌ | ❌ |
| Single-query hybrid retrieval | ✅ | ❌ | ❌ | ❌ |
| Works on RDS / Supabase / Neon | ✅ | ❌ | n/a | n/a |
| License | MIT | MIT | Apache 2.0 | Apache 2.0 |
| Pricing | free | free | $65+/mo Aura | $1.25/1K msgs |
| Local embeddings by default | ✅ | ✅ | ❌ | ❌ |
| Directed relationships | ✅ | ❌ (undirected) | ✅ | ✅ |
| Time-aware / retraction-aware | ✅ Tier 1 | ❌ | ❌ | partial |
| Stars | new | 33K+ | 2K+ | 24.8K |
Full feature matrix: research/competition-comparison.md.
Requirements
- Python 3.12+
- PostgreSQL 16+ with pgvector and pg_trgm extensions
- (Recommended) An OpenAI-compatible LLM endpoint for entity extraction. Without one, ingest still works as pure-vector RAG and graph features stay empty.
License
MIT. See LICENSE.
Built with honest benchmarks and real corpora. Real numbers throughout this README come from benchmarks/ runs that ship with the repo — re-runnable from clone. The unvarnished evaluation is in ASSESSMENT.md.