A lean GraphRAG library using Postgres/pgvector as the sole database

Postgres Graph RAG

A high-performance, Postgres-native GraphRAG library using a migration-safe "Forever Schema". No complex orchestration frameworks—just pure Python and SQL.

Core Philosophy

  • Infrastructure: Postgres is the only database (via pgvector).
  • Intelligence: Hosted SLMs (GPT-5.2 or Gemini 2.5) for extraction.
  • Simplicity: Native Async Python + SQL.
  • Scalability: High-performance connection pooling and namespace-aware design (Multi-tenancy).

Installation

Using uv (recommended):

uv sync --extra test

Or using pip:

pip install "postgres-graph-rag[pool]"

Getting Started (Interactive & Frictionless)

The library is designed to be interactive-friendly. You can instantiate it normally and use await at the top level in Notebooks or REPLs. The database connection pool is initialized lazily upon the first request.

from postgres_graph_rag import PostgresGraphRAG

# 1. Simple Instantiation
rag = PostgresGraphRAG(
    postgres_url="postgresql://user:password@localhost:5432/dbname",
    openai_api_key="sk-..." # Or use google_api_key
)

async def quick_start():
    # 2. Setup (Creates tables and pgvector extension if missing)
    await rag.setup()

    # 3. Add Knowledge (Atomic upserts with automatic entity resolution)
    await rag.add_texts(
        "Johny Srouji leads the hardware team at Apple.", 
        namespace="apple_research"
    )

    # 4. Hybrid Query (Vector Search + Recursive Graph Traversal)
    context = await rag.query(
        "Who is leading the hardware efforts?", 
        namespace="apple_research",
        hops=2
    )
    print(context)
    
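    # 5. Cleanup (Closes the connection pool)
    await rag.close()

# In a notebook or async REPL: await quick_start()
# In a plain script: import asyncio; asyncio.run(quick_start())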

Advanced Usage & Modes

1. The Production Way: Async Context Manager

For applications (like FastAPI or background workers), use the async with pattern to ensure the connection pool is always closed correctly, even if errors occur.

async with PostgresGraphRAG(postgres_url=DSN, openai_api_key=KEY) as rag:
    await rag.add_texts("The M4 chip uses ARM architecture.")
    # No need to call rag.close(), it happens automatically!
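
The cleanup guarantee comes from Python's async context manager protocol rather than anything library-specific. The sketch below illustrates it with a toy class; `ToyRAG` and `FakePool` are hypothetical stand-ins invented for this example and are not part of postgres_graph_rag.

```python
import asyncio


class FakePool:
    """Hypothetical stand-in for a connection pool (not the library's pool)."""

    def __init__(self) -> None:
        self.closed = False

    async def close(self) -> None:
        self.closed = True


class ToyRAG:
    """Toy resource that implements the async context manager protocol."""

    def __init__(self) -> None:
        self.pool = FakePool()

    async def __aenter__(self) -> "ToyRAG":
        return self

    async def __aexit__(self, exc_type, exc, tb) -> bool:
        # Runs on normal exit and also when the body raises.
        await self.pool.close()
        return False  # do not swallow the exception


async def demo() -> bool:
    rag = ToyRAG()
    try:
        async with rag:
            raise RuntimeError("ingestion failed")
    except RuntimeError:
        pass
    return rag.pool.closed


pool_closed = asyncio.run(demo())
```

Here `pool_closed` ends up `True` even though the body raised, which is the behavior the async with pattern is meant to provide.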

2. Custom Chunking (Inversion of Control)

Don't like the default character splitter? Inject your own. You can pass any callable that takes a string and returns a list of strings.

from langchain_text_splitters import RecursiveCharacterTextSplitter

# Create your favorite chunker
splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)

# Inject it into the library
rag = PostgresGraphRAG(
    postgres_url=DSN,
    openai_api_key=KEY,
    chunker=splitter.split_text # Just pass the method
)
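
If you would rather avoid an extra dependency, a plain function also satisfies the "string in, list of strings out" contract. The following is a minimal sketch of a sentence-based chunker; `sentence_chunker` and its `max_chars` parameter are illustrative names, not part of the library.

```python
import re


def sentence_chunker(text: str, max_chars: int = 200) -> list[str]:
    """Group whole sentences into chunks of roughly max_chars characters.

    A single sentence longer than max_chars is kept whole rather than cut.
    """
    # Split after sentence-ending punctuation followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            # Adding this sentence would overflow, so close the current chunk.
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


example_chunks = sentence_chunker("One. Two. Three.", max_chars=9)
```

Passing `chunker=sentence_chunker` should then work the same way as the splitter method above; to change `max_chars`, wrap the function with `functools.partial`.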

3. Custom Provider Configuration

You can control exactly which models are used for extraction and embeddings.

from postgres_graph_rag.models import ProviderConfig

custom_config: ProviderConfig = {
    "extraction_model": "gpt-5-nano-2025-08-07",
    "embedding_model": "text-embedding-3-large",
    "dimension": 3072 # Must match the model's output
}

rag = PostgresGraphRAG(..., config=custom_config)
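
A mismatch between "dimension" and the embedding model's actual output is easy to introduce and tends to surface only at insert time. As a hedged sketch, a small pre-flight check can catch it earlier. `check_dimension` and `KNOWN_DIMENSIONS` are hypothetical helpers written for this example, and the table lists the default output sizes of a few OpenAI embedding models; extend it for whatever models you use.

```python
# Default output sizes of some OpenAI embedding models.
KNOWN_DIMENSIONS = {
    "text-embedding-3-small": 1536,
    "text-embedding-3-large": 3072,
    "text-embedding-ada-002": 1536,
}


def check_dimension(config: dict) -> None:
    """Raise ValueError if 'dimension' contradicts a known model default."""
    model = config["embedding_model"]
    expected = KNOWN_DIMENSIONS.get(model)
    if expected is None:
        return  # Unknown model: nothing to compare against.
    if config["dimension"] != expected:
        raise ValueError(
            f"{model} produces {expected}-dimensional vectors, "
            f"but the config declares {config['dimension']}"
        )


good_config = {
    "extraction_model": "gpt-5-nano-2025-08-07",
    "embedding_model": "text-embedding-3-large",
    "dimension": 3072,
}
check_dimension(good_config)  # passes silently
```

Calling `check_dimension` before constructing the client is one way to fail fast; it is not something the library is known to do for you.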

4. Multi-Tenancy (Namespacing)

Isolate data for different users or projects within the same database tables.

# User A's private graph
await rag.add_texts("My secret key is 123.", namespace="user_a")

# User B's private graph
await rag.add_texts("My secret key is 999.", namespace="user_b")

# Queries are strictly isolated
res = await rag.query("What is my key?", namespace="user_a") # Returns 123
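
One common way to get this kind of isolation inside shared tables is to make the namespace part of every key and every filter. The sketch below shows the idea with Python's built-in sqlite3 so that it runs without a Postgres server. The column layout (`namespace`, `name`, `summary`) is an assumption made for illustration and may not match the library's real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE graph_nodes (
        namespace TEXT NOT NULL,
        name      TEXT NOT NULL,
        summary   TEXT,
        PRIMARY KEY (namespace, name)  -- same name may exist per namespace
    )
    """
)


def upsert_node(namespace: str, name: str, summary: str) -> None:
    """Insert a node, or update it if (namespace, name) already exists."""
    conn.execute(
        """
        INSERT INTO graph_nodes (namespace, name, summary)
        VALUES (?, ?, ?)
        ON CONFLICT (namespace, name) DO UPDATE SET summary = excluded.summary
        """,
        (namespace, name, summary),
    )


def get_summary(namespace: str, name: str):
    """Look up a node, scoped strictly to one namespace."""
    row = conn.execute(
        "SELECT summary FROM graph_nodes WHERE namespace = ? AND name = ?",
        (namespace, name),
    ).fetchone()
    return row[0] if row else None


upsert_node("user_a", "secret key", "123")
upsert_node("user_b", "secret key", "999")
upsert_node("user_a", "secret key", "124")  # updates only user_a's row
```

Because the namespace is part of the primary key, the third call overwrites user_a's row and leaves user_b's untouched.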

Features under the Hood

  • Forever Schema: Uses graph_nodes and graph_edges with JSONB metadata. No ALTER TABLE acrobatics needed for future metadata fields.
  • Connection Pooling: Uses psycopg_pool.AsyncConnectionPool for high-concurrency performance.
  • Cycle Detection: The recursive CTE uses a visited array to prevent infinite loops in complex graphs.
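  • Atomic JSONB Merges: Metadata is merged with the Postgres || operator during ingestion, so keys absent from the new payload are preserved. The merge is shallow: when both sides contain the same top-level key, the right-hand operand's value wins.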
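
To make the cycle-detection idea concrete, here is a hedged sketch of a hop-limited recursive CTE that carries a record of visited nodes. It runs on Python's built-in sqlite3 so it can be tried without a server; SQLite has no array type, so a comma-delimited string stands in for the Postgres visited array. The `graph_edges` columns (`namespace`, `source`, `target`) and the `reachable` helper are assumptions for illustration, not the library's actual query.

```python
import sqlite3

TRAVERSAL_SQL = """
WITH RECURSIVE walk(node, depth, visited) AS (
    SELECT :start, 0, ',' || :start || ','
    UNION ALL
    SELECT e.target, w.depth + 1, w.visited || e.target || ','
    FROM walk AS w
    JOIN graph_edges AS e ON e.source = w.node
    WHERE e.namespace = :ns
      AND w.depth < :hops
      -- Skip nodes already on this path; this is what stops cycles.
      AND instr(w.visited, ',' || e.target || ',') = 0
)
SELECT DISTINCT node FROM walk ORDER BY node
"""


def reachable(conn, namespace: str, start: str, hops: int) -> list[str]:
    """Return nodes reachable from start within `hops` edges."""
    rows = conn.execute(
        TRAVERSAL_SQL, {"start": start, "ns": namespace, "hops": hops}
    )
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE graph_edges (namespace TEXT, source TEXT, target TEXT)"
)
conn.executemany(
    "INSERT INTO graph_edges VALUES (?, ?, ?)",
    [
        ("demo", "A", "B"),
        ("demo", "B", "C"),
        ("demo", "C", "A"),  # closes a cycle A -> B -> C -> A
        ("demo", "C", "D"),
        ("other", "A", "Z"),  # different namespace, never visited
    ],
)

two_hops = reachable(conn, "demo", "A", 2)
many_hops = reachable(conn, "demo", "A", 10)
```

The query terminates even with a generous hop limit because the C -> A edge is rejected once A is on the path. Node names containing commas would break this string-based stand-in; a real array, as in Postgres, does not have that limitation.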

Dependencies

  • psycopg[pool]
  • pgvector
  • openai
  • google-genai
  • pandas
  • pydantic
