
The fastest, simplest way to add knowledge-graph-powered RAG to any app — backed by PostgreSQL.


pg-raggraph

PostgreSQL-native GraphRAG. Vector search, full-text search, and knowledge-graph traversal — all in a single SQL query. No Neo4j. No Pinecone. No Apache AGE. Just the Postgres you already run.

License: MIT · Python · Status: alpha


What this is

pg-raggraph is a Python library for GraphRAG on plain PostgreSQL. You point it at a directory of documents, it ingests them — chunks, embeddings, entities, relationships, full-text index — and you get back a query API that combines vector similarity, BM25, and graph traversal. All retrieval happens in one round-trip to Postgres.

It is also a full toolkit around that library: a CLI (pgrg), an optional FastAPI server with a web UI, and an MCP server for Claude Desktop / Cursor / Zed.

Two retrieval workloads are first-class:

  • Classic GraphRAG — static corpora, code Q&A, technical docs, multi-hop entity reasoning. Validated at +18.9% accuracy lift over plain vector search on a real 909-doc dev codebase.
  • Evolving knowledge — corpora where the right answer depends on time, version, or retraction status. Validated on Python 3.10/3.11/3.12 docs (13/13 perfect version-filter purity) and PubMed HRT retractions (15/15 perfect on retraction-aware + time-travel queries).

Why it exists

Most GraphRAG today means stitching together two or three databases:

  • A vector DB (Pinecone, Weaviate, Qdrant) for semantic search.
  • A graph DB (Neo4j) for relationship traversal.
  • An orchestrator on top — LangChain, LlamaIndex, or hand-rolled.

That's three deploy targets, three connection pools, three sets of credentials, three failure modes, three vendors to negotiate with. And the killer GraphRAG operation — "find chunks similar to X, then expand via the entity graph" — needs at least two round-trips, often more, because vector and graph live in different worlds.

pg-raggraph proves you don't need any of that. PostgreSQL already has:

  • pgvector — vector similarity search with HNSW or IVFFlat indexes.
  • pg_trgm — trigram fuzzy matching, perfect for entity resolution.
  • Recursive CTEs — fast, well-indexed graph traversal that the planner understands.
  • tsvector + to_tsquery — production-grade full-text search with BM25-equivalent ranking.

Combine them in one SQL query and you have a complete GraphRAG stack. One ACID-compliant database. One backup story. One thing to monitor. Works on every managed Postgres — AWS RDS, Supabase, Neon, GCP Cloud SQL, Azure, self-hosted — anywhere modern PostgreSQL runs.
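A sketch of the kind of single-statement hybrid those pieces enable: seed by pgvector similarity, then expand one hop through the entity graph with a recursive CTE. The table and column names here (`chunks.embedding`, `chunk_entities`, `relationships.source_id/target_id`) are illustrative assumptions, not pg-raggraph's actual schema or SQL:

```sql
-- Hypothetical single-query GraphRAG: vector seed + 1-hop graph expansion.
WITH RECURSIVE seeds AS (
    SELECT id
    FROM chunks
    ORDER BY embedding <=> $1      -- pgvector cosine distance to the query embedding
    LIMIT 5
),
hops (entity_id, depth) AS (
    -- entities mentioned in the seed chunks
    SELECT ce.entity_id, 0
    FROM chunk_entities ce
    JOIN seeds s ON s.id = ce.chunk_id
  UNION ALL
    -- walk outgoing relationships, capped at one hop
    SELECT r.target_id, h.depth + 1
    FROM hops h
    JOIN relationships r ON r.source_id = h.entity_id
    WHERE h.depth < 1
)
SELECT DISTINCT c.id, c.content
FROM chunks c
JOIN chunk_entities ce ON ce.chunk_id = c.id
JOIN hops h ON h.entity_id = ce.entity_id;
```

One planner, one round-trip: the vector index scan and the graph walk are costed and executed together, which is exactly what two separate databases cannot do.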

The thesis is decided by benchmark, not opinion. See Tests and benchmarks below.

Wait — isn't it called graphrag, not raggraph?

The name flip is deliberate. Most "GraphRAG" systems lead with the graph: docs get converted to entities and relationships up front, the graph is the corpus, and retrieval is graph-walks looking for relevant subgraphs. That's the Microsoft GraphRAG / LightRAG / Neo4j-GraphRAG model.

That model misreads what most corpora actually are. Documentation, technical articles, code, support tickets, papers, chat logs — none of these start out as graphs. They're prose. They answer most questions through plain semantic similarity. Forcing them through an entity-extraction pipeline first, then querying the resulting graph, adds latency, LLM cost, and information loss without buying you much for the bulk of queries.

pg-raggraph inverts the order. The graph is an enhancer, not the main attraction. A query starts as RAG — vector similarity + BM25 — and the graph layer kicks in only when retrieval needs help: re-ranking the top-K via 1-hop entity connectivity (naive_boost), or expanding to chunks reachable through entity relationships when the seed retrieval is weak (local / hybrid). Graph helps finish the story, not start it.

This isn't aesthetic preference. The bake-off confirms it: on clean technical corpora, graph-only retrieval modes don't beat plain vector + BM25. They earn their cost when the chunker is weak, when the corpus has cross-document entity reasoning, or when you need explainability and provenance trails. Calling it "raggraph" rather than "graphrag" reflects that ordering: RAG first, graph second, and only when it pays.

Quickstart — 5 minutes, works cold

This is verified to reproduce on a fresh clone. Every command is copy-pasteable.

# 1. Clone (pg-raggraph is alpha and not yet on PyPI)
git clone https://github.com/yonk-labs/pg_raggraph
cd pg_raggraph

# 2. Install Python deps (uv recommended; plain pip works too)
uv sync

# 3. Start a local Postgres with pgvector + pg_trgm pre-installed
docker compose up -d postgres

# 4. Pick an LLM endpoint (skip if you only want pure vector RAG)
#    Option A — OpenAI:
export PGRG_LLM_BASE_URL=https://api.openai.com/v1
export PGRG_LLM_API_KEY=sk-...   # your key
export PGRG_LLM_MODEL=gpt-4o-mini

#    Option B — local Ollama (free):
# ollama pull llama3.2 && ollama serve   # leave running in another shell
# (PGRG defaults to Ollama at http://localhost:11434/v1, so no env needed)

# 5. Ingest a directory and ask questions
uv run pgrg devmem ingest ./my-repo/
uv run pgrg devmem ask "who owns the authentication service?"

If your LLM endpoint is up and your repo has docs/code, you'll see something like:

Found 12 files to process.
[1/12] README.md: 8 entities, 14 rels
[2/12] auth/service.py: 5 entities, 11 rels
...
Done: 12 ingested, 0 skipped. 87 entities, 156 relationships.

Answer: The authentication service is owned by the platform team.
Sarah Chen leads platform; auth.py was last touched by alex@acme.com
in commit 4f2c8a1 ("rotate JWT signing key").

Sources:
  [0.79] auth/README.md
  [0.71] team/platform.md
  [0.68] commits/4f2c8a1.md

That's the whole loop. From git clone to a grounded answer in five minutes.

One thing to know about pgrg serve — the bundled FastAPI web UI is for local development and demos only. It ships without authentication, rate limiting, or upload size caps. Do not expose it directly to the public internet. For production, put it behind a reverse proxy that adds auth, TLS, and rate limits — or embed create_app() in your own FastAPI application. See docs/user-guide.md#production-deployment for the recommended setup.

Tests and benchmarks

Real numbers from real corpora. No cherry-picking.

Classic GraphRAG — pg-agents real dev codebase (909 docs, 17K entities, 38K relationships):

| Mode                    | Avg top score | Latency p50 | vs naive           |
|-------------------------|---------------|-------------|--------------------|
| naive (vector + BM25)   | 0.602         | 109 ms      | baseline           |
| naive_boost             | 0.716         | 107 ms      | +18.9%             |
| smart (default)         | 0.716         | 127 ms      | +18.9% at routing  |
| local (graph traversal) | 0.614         | 423 ms      | +1.9%              |
| hybrid (local + global) | 0.614         | 482 ms      | +1.9%              |
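The +18.9% headline is just the relative lift of the boosted average top score (0.716) over the naive baseline (0.602):

```python
# Relative lift of naive_boost over the naive baseline, from the table above.
baseline, boosted = 0.602, 0.716
lift = (boosted - baseline) / baseline
print(f"{lift:+.1%}")  # prints "+18.9%"
```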

Evolving knowledge — versioned docs (benchmarks/python-versioned-docs/):

12 docs (Python 3.10 / 3.11 / 3.12), 1364 chunks, 15 hand-written gold questions.

| Threshold                                                                          | Result        | Pass? |
|------------------------------------------------------------------------------------|---------------|-------|
| ≥ 80% of version_filter-tagged Qs return top-5 chunks only from the matching version | 100% (13/13)  | ✅    |
| ≥ 1 unfiltered_target Q has the expected version in top-3                           | 1/2           | ✅    |

Evolving knowledge — medical retractions (benchmarks/medical-hrt/):

48 PubMed abstracts on HRT + cardiovascular outcomes (1998–2025), 7 epistemically retracted (WHI 2002 superseded the prior consensus), 15 hand-written gold questions.

| Threshold                                                                                      | Result | Pass? |
|------------------------------------------------------------------------------------------------|--------|-------|
| ≥ 4/5 retraction_aware Qs return top-5 with zero retracted papers in retracted_behavior="hide" mode | 5/5    | ✅    |
| ≥ 1/5 time-travel Qs (as_of=2001-12-31) return ≥ 1 pre-2002 paper in top-5                      | 5/5    | ✅    |
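The time-travel semantics being tested reduce to a visibility window per document. This toy sketch shows one plausible reading of the `as_of` × `retracted_at` interaction (field names follow the Tier 1 docs; the filtering logic is an illustration, not pg-raggraph's implementation):

```python
from datetime import date

# Toy corpus: the 1998 consensus was epistemically retracted when WHI landed in 2002.
papers = [
    {"id": "whi-2002",       "effective_from": date(2002, 7, 17), "retracted_at": None},
    {"id": "consensus-1998", "effective_from": date(1998, 1, 1),  "retracted_at": date(2002, 7, 17)},
]

def visible(p: dict, as_of: date) -> bool:
    """A paper is visible at `as_of` if already published and not yet retracted."""
    published = p["effective_from"] <= as_of
    not_retracted = p["retracted_at"] is None or p["retracted_at"] > as_of
    return published and not_retracted

# Asking "what did we believe on 2001-12-31?" surfaces only the old consensus.
print([p["id"] for p in papers if visible(p, date(2001, 12, 31))])
```

Run the same filter with a post-2002 date and the answer flips: the consensus paper disappears and WHI takes its place, which is the behavior the retraction-aware thresholds assert.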

Versus Apache AGE — SCOTUS bake-off (772 docs, 30 questions × 3 runs × 6 modes per engine):

| Axis                       | pg-raggraph                                      | Apache AGE                        |
|----------------------------|--------------------------------------------------|-----------------------------------|
| Accuracy (fully_correct/30) | 17–18                                           | 17–18 (tie)                       |
| Retrieval p50 latency      | 32–73 ms                                         | 3,079–3,906 ms (42–111× slower)   |
| Cloud compatibility        | RDS, Supabase, Neon, Cloud SQL, Azure, self-host | Azure only                        |

Full bake-off report: benchmarks/age-bakeoff/results/REPORT-VERDICT.md.

Test suite: 195 passing tests across tests/unit/ and tests/integration/, including a 15-test error-path suite that asserts specific exception types on bad DSNs, naive as_of, oversize /ingest, path traversal, etc. CI runs the full suite against pgvector containers on Python 3.12 and 3.13.

Where to go next

       ┌──────────────────────────────────────────────────┐
       │  I want to …                                     │
       ├──────────────────────────────────────────────────┤
       │  Pick the right workload         → USE-CASES.md  │
       │  Walk a worked example           → blog series   │
       │  Get the full API surface        → user-guide.md │
       │  Tier-1 evolving-knowledge       → cookbook      │
       │  Avoid common API gotchas        → API-QUICKREF  │
       │  Read the architecture decisions → research/     │
       │  See the unvarnished critique    → ASSESSMENT.md │
       └──────────────────────────────────────────────────┘
| Document | What's inside |
|----------|---------------|
| docs/USE-CASES.md | Decision matrix: classic GraphRAG vs evolving knowledge. Corpus shape → recommended config. |
| docs/blogs/01-intro-classic-vs-evolving.md | Series intro: two workloads, one Postgres database, when each one applies. |
| docs/blogs/02-path-a-versioned-python-docs.md | Walkthrough: ingest Python 3.10/3.11/3.12 docs, query with version_filter. |
| docs/blogs/03-path-b-medical-retractions.md | Walkthrough: ingest PubMed HRT abstracts, demonstrate retracted_behavior and as_of. |
| docs/cookbook/evolution-tracking.md | Tier 1 quickstart — effective_from, retracted, version_label ingest + query patterns. |
| docs/EVOLUTION-API-QUICKREF.md | Common assumptions vs reality for the Tier 1 API (which kwargs are per-query vs config-only, schema column locations, semantics of as_of × retracted_at). |
| docs/user-guide.md | Full user guide: installation, all 6 modes, configuration, REST API, production deployment, troubleshooting. |
| docs/devmem-guide.md | pgrg devmem — the developer-knowledge-base flavor with code-aware chunking + dev-tuned extraction. |
| research/ | Architecture rationale, vs-AGE evaluation, competitor analyses (LightRAG, Neo4j, Zep). |
| ASSESSMENT.md | No-BS project evaluation: strengths, gaps, where you should and shouldn't use it. |
| benchmarks/ | Every benchmark corpus + runner + results document. Re-runnable from clone. |

The weeds

Below this line is the reference material — architecture, the retrieval-mode menu, every environment variable, the schema, and the prior-art rebuttals. Read on if you want to go deep; skip if you just want to get something working.

Architecture

graph TB
    subgraph PGRG["pg-raggraph (Python, ~4K LOC core)"]
        CLI[pgrg CLI]
        API[FastAPI server]
        MCP[MCP server]
        SDK[GraphRAG SDK]
        CLI --> SDK
        API --> SDK
        MCP --> SDK
        SDK --> ING[Ingestion Pipeline]
        SDK --> RET[Retrieval Engine]
        ING --> CHK[Chunker<br/>markdown / code / text]
        ING --> EMB[fastembed<br/>local 384-dim]
        ING --> EXT[LLM extractor<br/>OpenAI-compatible]
        ING --> RES[Entity resolver<br/>pg_trgm + vector]
        RET --> SM[Smart Router]
        SM --> NV[naive: vector + BM25]
        SM --> GB[graph boost: 1-hop re-rank]
        SM --> LC[local / global / hybrid:<br/>recursive CTEs]
    end
    subgraph PG["PostgreSQL 16+"]
        PGV[pgvector HNSW]
        PGT[pg_trgm GIN]
        FTS[tsvector full-text]
        TBL[(documents · chunks ·<br/>entities · relationships ·<br/>document_versions ·<br/>facts · fact_edges)]
    end
    NV --> PGV
    NV --> FTS
    GB --> TBL
    LC --> TBL
    RES --> PGT
    RES --> PGV

Two extensions — pgvector (vector search) and pg_trgm (built into Postgres in most builds). Auto-bootstrapped schema. Migrations applied on first connect under a per-project advisory lock. Everything else is plain SQL.
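The advisory-lock migration boils down to the standard Postgres pattern: take a session lock keyed on the project so only one connecting instance runs DDL at a time. The lock-key derivation below is illustrative, not pg-raggraph's actual key:

```sql
-- Serialize first-connect schema setup across concurrent app instances.
SELECT pg_advisory_lock(hashtext('pg_raggraph:default'));

CREATE EXTENSION IF NOT EXISTS vector;
CREATE EXTENSION IF NOT EXISTS pg_trgm;
-- ... idempotent CREATE TABLE IF NOT EXISTS migrations run here ...

SELECT pg_advisory_unlock(hashtext('pg_raggraph:default'));
```

Because the lock is session-scoped and the DDL is idempotent, a second instance connecting mid-migration simply blocks, then finds the schema already in place.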

Retrieval modes

smart (the default) routes between three strategies based on confidence: ship-as-is when the naive top score is high, apply a cheap graph boost when medium, escalate to graph expansion when low. Manually pin to a specific mode with mode="..." if you know your access pattern.

| Mode        | What it does                                              | Typical latency |
|-------------|-----------------------------------------------------------|-----------------|
| smart       | Routes between naive / boost / expand based on confidence | 85–220 ms       |
| naive       | Vector similarity + BM25                                  | ~85 ms          |
| naive_boost | Naive + 1-hop graph re-rank                               | ~90 ms          |
| local       | Seed → recursive CTE traversal → rank                     | ~220 ms         |
| global      | Relationship-centric retrieval                            | ~150 ms         |
| hybrid      | local + global merged                                     | ~450 ms         |
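The confidence routing in smart mode can be sketched as a threshold ladder. The cutoff values here are made up for illustration, not pg-raggraph's actual defaults:

```python
# Hedged sketch of confidence-based mode routing; thresholds are illustrative.
def route(top_score: float, high: float = 0.75, low: float = 0.45) -> str:
    """Pick a retrieval strategy from the naive retrieval's top similarity score."""
    if top_score >= high:
        return "naive"        # seed retrieval is strong: ship as-is
    if top_score >= low:
        return "naive_boost"  # medium confidence: cheap 1-hop graph re-rank
    return "local"            # weak seed: escalate to graph expansion

print(route(0.82), route(0.60), route(0.30))  # prints "naive naive_boost local"
```

The payoff of this shape is visible in the latency column above: most queries exit at the ~85 ms naive tier, and only low-confidence queries pay for the ~220 ms graph traversal.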

Full deep-dive with selection guidance and per-mode SQL: docs/modes.md. Schema diagram + ER relationships: docs/user-guide.md#schema-overview.

Configuration (essentials)

All settings via env vars prefixed PGRG_ (also work as kwargs to GraphRAG(...)). The most-used ones:

| Variable | Default | What it does |
|----------|---------|--------------|
| PGRG_DSN | postgresql://postgres:postgres@localhost:5434/pg_raggraph | Database connection. Refuses to start if PGRG_ENV=production and the DSN is unchanged. |
| PGRG_NAMESPACE | default | Data isolation key. |
| PGRG_LLM_BASE_URL | http://localhost:11434/v1 | OpenAI-compatible LLM endpoint. |
| PGRG_LLM_API_KEY | "" | Bearer token (empty for Ollama). |
| PGRG_EVOLUTION_TIER | off | off / structural (Tier 1 evolution-aware). |
| PGRG_INGEST_PROFILE | balanced | conservative / balanced / aggressive / max. |
| PGRG_LOG_FORMAT | (unset) | Set to json for structured logging (Datadog / ELK / Loki). |
| PGRG_SERVER_API_KEY | (unset) | Enables Bearer auth on the FastAPI server. |
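Pulling the variables above together, a deployment-leaning configuration might look like this. Variable names come from the table; every value is an example, not a recommendation:

```shell
# Example configuration using the variables above (values illustrative).
export PGRG_ENV="production"                   # makes the default DSN a startup error
export PGRG_DSN="postgresql://app:s3cret@db.internal:5432/rag"
export PGRG_NAMESPACE="docs-site"
export PGRG_LLM_BASE_URL="https://api.openai.com/v1"
export PGRG_LLM_API_KEY="sk-..."               # your key
export PGRG_EVOLUTION_TIER="structural"        # Tier 1 evolution-aware retrieval
export PGRG_INGEST_PROFILE="aggressive"
export PGRG_LOG_FORMAT="json"                  # structured logs for Datadog / ELK / Loki
export PGRG_SERVER_API_KEY="$(openssl rand -hex 32)"   # Bearer auth on the server
```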

Full reference (~25 vars including evolution scoring weights, entity-resolution thresholds, server upload caps, Origin allowlists): docs/user-guide.md#configuration.

CLI reference

# Core
pgrg init                                # Bootstrap schema, verify connection
pgrg ingest PATH... [-n NS] [-p PROFILE] # Ingest files / directories
pgrg query "question" [-m MODE] [-n NS]  # Query (default: smart mode)
pgrg ask "question" [-m MODE] [-n NS]    # Query + grounded LLM answer
pgrg status [-n NS]                      # Graph statistics
pgrg delete -n NS                        # Delete a namespace's data

# Servers
pgrg serve --port 8080                   # FastAPI + web UI (local/demo only)
pgrg demo                                # Auto-ingest sample data + launch UI
pgrg mcp-serve                           # MCP stdio server for Claude Desktop / Cursor / Zed

# Developer-knowledge-base flavor (code-aware chunking + dev extraction prompt)
pgrg devmem ingest ./repo/ -p aggressive
pgrg devmem ask "who owns the auth service?"

Throttle profiles tune CPU-yield + parallel ingest knobs:

| Profile      | doc_concurrency | extract_concurrency | embed_batch_size | Use case                               |
|--------------|-----------------|---------------------|------------------|----------------------------------------|
| conservative | 1               | 4                   | 8                | Shared servers, laptops on battery     |
| balanced     | 2               | 8                   | 16               | Default — most dev machines            |
| aggressive   | 4               | 16                  | 32               | Dedicated dev box                      |
| max          | 8               | 32                  | 64               | One-off batch jobs on a beefy machine  |

Why not Apache AGE?

We evaluated AGE (PostgreSQL's graph extension) before writing a line of code. We rejected it for four reasons:

  1. Cloud killed. AGE requires shared_preload_libraries — only Azure supports it among managed providers. No RDS, Supabase, Neon, or Cloud SQL.
  2. Can't combine with pgvector in a single query. AGE Cypher and pgvector live in different worlds. The killer GraphRAG operation needs two round-trips with AGE; one query with recursive CTEs.
  3. Slower for GraphRAG patterns. Bake-off measurements: AGE is 42–111× slower on retrieval than recursive CTEs for the typical 1-3 hop pattern.
  4. Production disaster. LightRAG Issue #2255: 17-hour migration with AGE caused by a query plan estimating 49 billion intermediate rows for a 681K-row join. Closed NOT_PLANNED.

Full analysis: research/apache-age-evaluation.md. Bake-off verdict: benchmarks/age-bakeoff/results/REPORT-VERDICT.md.

Comparison

|                                  | pg-raggraph | LightRAG                 | Neo4j GraphRAG | Zep          |
|----------------------------------|-------------|--------------------------|----------------|--------------|
| PostgreSQL-native                | ✅          | AGE adapter (Azure only) | ❌             | ❌           |
| Single-query hybrid retrieval    | ✅          |                          |                |              |
| Works on RDS / Supabase / Neon   | ✅          |                          | n/a            | n/a          |
| License                          | MIT         | MIT                      | Apache 2.0     | Apache 2.0   |
| Pricing                          | free        | free                     | $65+/mo Aura   | $1.25/1K msgs |
| Local embeddings by default      | ✅          |                          |                |              |
| Directed relationships           | ✅          | ❌ (undirected)          |                |              |
| Time-aware / retraction-aware    | ✅ Tier 1   |                          |                | partial      |
| Stars                            | new         | 33K+                     | 2K+            | 24.8K        |

Full feature matrix: research/competition-comparison.md.

Requirements

  • Python 3.12+
  • PostgreSQL 16+ with pgvector and pg_trgm extensions
  • (Recommended) An OpenAI-compatible LLM endpoint for entity extraction. Without one, ingest still works as pure-vector RAG and graph features stay empty.

License

MIT. See LICENSE.


Built with honest benchmarks and real corpora. Real numbers throughout this README come from benchmarks/ runs that ship with the repo — re-runnable from clone. The unvarnished evaluation is in ASSESSMENT.md.



Download files

  • Source distribution: pg_raggraph-0.3.0a2.tar.gz (91.3 kB)
  • Built distribution: pg_raggraph-0.3.0a2-py3-none-any.whl (85.1 kB, Python 3)

File details — pg_raggraph-0.3.0a2.tar.gz

  • Size: 91.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing: yes
  • Uploaded via: uv 0.11.8 (Ubuntu 24.04, CI)

| Algorithm   | Hash digest |
|-------------|-------------|
| SHA256      | bc3878b0194cd4ca08cdc6b75b5b6330cf6d82117c29fb6a9f6a439c8013da45 |
| MD5         | 929738bc65b03b0985ec7bcbfd4a0e48 |
| BLAKE2b-256 | 444ff0777dcba59cc0ef4ab20661bd2e334573988c40b9ae89b0e2abd7810c38 |

File details — pg_raggraph-0.3.0a2-py3-none-any.whl

  • Size: 85.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing: yes
  • Uploaded via: uv 0.11.8 (Ubuntu 24.04, CI)

| Algorithm   | Hash digest |
|-------------|-------------|
| SHA256      | b67508472d5e534acd439fbac2950c21f2e2a071cc3e1248da8395a88d7bdf60 |
| MD5         | f6dd3ca83924b9e87043efb29b80fbaf |
| BLAKE2b-256 | 635b705b66de7f07687a49d7e1769a90f3c26f4c7028dc10b8e657dc3cab0673 |
