Graph Pydantic DataBase - stores a graph of Pydantic objects in a Postgres database

gpdb

A Python library for storing graphs of typed, schema-validated data in PostgreSQL. Nodes and edges carry arbitrary JSON payloads backed by Pydantic models, JSONB columns, and async SQLAlchemy.

When to use it

Use gpdb when you need a lightweight graph-like data layer on top of Postgres without deploying a dedicated graph database. It's a good fit for applications that need:

  • Typed nodes and edges with flexible JSON data
  • Schema validation and versioned schema evolution
  • Binary payload storage on nodes (files, images, embeddings)
  • Filtered search with pagination over graph records
  • Multi-tenant table isolation via prefixes
  • An async-first Python API

Install

Choose the package that matches how you want to use GPDB:

  • pip install gpdb — core graph database library only
  • pip install gpdb-admin — installs the core library plus the gpdb admin command, web UI, REST API, and MCP server
  • pip install gpdb[dev] — core library plus development dependencies
  • pip install gpdb-admin[dev] — full admin + development/test dependencies

Requires Python 3.9+.

The core gpdb package uses your PostgreSQL database. The optional gpdb-admin package adds an admin runtime on top of the core library.

Quick start

import asyncio

from gpdb import GPGraph, NodeUpsert, EdgeUpsert, SearchQuery, Filter, Op

async def main():
    db = GPGraph("postgresql://user:pass@localhost/mydb")
    await db.create_tables()

    # Create nodes
    alice = (await db.set_nodes([NodeUpsert(type="user", name="alice", data={"role": "admin"})]))[0]
    bob = (await db.set_nodes([NodeUpsert(type="user", name="bob", data={"role": "member"})]))[0]

    # Connect them with an edge
    await db.set_edges([EdgeUpsert(type="follows", source_id=alice.id, target_id=bob.id)])

    # Search
    result = await db.search_nodes(
        SearchQuery(filter=Filter(field="type", op=Op.EQ, value="user"), limit=10)
    )
    for node in result.items:
        print(node.name, node.data)

asyncio.run(main())

Admin add-on

gpdb-admin is an optional package that layers an admin runtime on top of the core graph library. When installed, it provides:

  • the gpdb console command
  • a browser-based admin app
  • a REST API under /api
  • an MCP server exposed over Streamable HTTP
  • an embeddable runtime for mounting into host applications

Admin install

pip install gpdb-admin

Import the admin module as gpdb.admin when gpdb-admin is installed.

Start the admin service

gpdb start

By default the admin service listens on 127.0.0.1:8747. You can override the bind address at startup:

gpdb start --host 0.0.0.0 --port 9000

Docker deployment

Mount a volume on the data dir so config and database persist. Use --data-dir /data (or set GPDB_DATA_DIR=/data):

docker run -e GPDB_PUBLIC_URL=https://gpdb.example.com \
  -p 8747:8747 \
  -v gpdb-data:/data \
  gpdb-admin \
  gpdb start --data-dir /data --host 0.0.0.0

Or in docker-compose.yml:

services:
  gpdb-admin:
    image: gpdb-admin
    ports:
      - "8747:8747"
    environment:
      - GPDB_PUBLIC_URL=https://gpdb.example.com
    volumes:
      - gpdb-data:/data
    command: gpdb start --data-dir /data --host 0.0.0.0

The public_url is used when generating absolute URLs for:

  • Email notifications (if implemented)
  • API responses with links
  • Shared resource URLs
  • Webhook callbacks

If not set, the application uses relative URLs and auto-detects the base URL from request headers when behind a proxy.
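The auto-detection described above typically reads the standard forwarding headers set by a reverse proxy. A minimal sketch of that kind of logic — the header names here are the conventional X-Forwarded-* ones, not confirmed from gpdb-admin's source:

```python
from typing import Optional

def resolve_base_url(headers: dict, public_url: Optional[str] = None) -> str:
    """Prefer a configured public_url; otherwise rebuild the base URL from proxy headers."""
    if public_url:
        return public_url.rstrip("/")
    scheme = headers.get("X-Forwarded-Proto", "http")
    host = headers.get("X-Forwarded-Host") or headers.get("Host", "127.0.0.1:8747")
    return f"{scheme}://{host}"

print(resolve_base_url({}, "https://gpdb.example.com/"))
print(resolve_base_url({"X-Forwarded-Proto": "https", "X-Forwarded-Host": "gpdb.example.com"}))
```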

Once the service is running, the current runtime exposes:

  • the admin web app at /
  • a health endpoint at /health
  • the REST status endpoint at POST /api/status
  • the MCP endpoint at /mcp/gpdb/mcp

The CLI also exposes the current status command directly:

gpdb status

First-run setup

On a fresh install, opening the admin web app takes you through initial owner setup. After the first owner account is created, the app requires login and uses an authenticated browser session for access to the admin pages.

Configuration

gpdb-admin uses a single data directory. All runtime state (config file and database) lives under that directory. You specify only the data dir; the config file is always admin.toml inside it.

Data directory is resolved in this order:

  1. --data-dir or -d (e.g. gpdb start --data-dir /data)
  2. GPDB_DATA_DIR environment variable
  3. The default user data dir for gpdb (platform-dependent, e.g. ~/.local/share/gpdb/admin)

The config file at {data_dir}/admin.toml can contain:

  • server.host
  • server.port
  • server.public_url — Optional public base URL (e.g., https://gpdb.example.com). Can also be set via GPDB_PUBLIC_URL environment variable. Used for generating absolute URLs in emails, API responses, and shared links.
  • auth.session_secret

At startup, gpdb-admin will generate and persist auth.session_secret automatically if it is missing and the data dir is writable.
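Putting the keys above together, a complete admin.toml might look like this (values are illustrative; session_secret is normally generated for you at first start):

```toml
# {data_dir}/admin.toml
[server]
host = "127.0.0.1"
port = 8747
public_url = "https://gpdb.example.com"

[auth]
session_secret = "generated-on-first-start"
```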

Admin storage model

The admin runtime manages its own local data directory and starts a captive PostgreSQL instance for admin state. Admin identity data is stored using GPDB tables with the admin table prefix, separate from the application graph data you manage with the core library.

Embedding admin in host applications

The admin runtime can be embedded into existing ToolAccess-based applications using the AdminRuntime container and attach_admin_to_manager() function. This allows you to mount the admin UI and APIs under custom prefixes within your own application's ServerManager.

from gpdb.admin.entry import attach_admin_to_manager
from toolaccess import ServerManager

def build_main_manager() -> ServerManager:
    # Host creates its own manager
    manager = ServerManager(name="my-main-app")

    # Attach admin under /gpdb
    admin = attach_admin_to_manager(
        manager,
        http_root="/gpdb",
        api_path_prefix="/api",
        mcp_name="gpdb",
        cli_root_name=None,  # Host controls CLI
    )

    # Host can mount admin ToolServices into its own CLI
    # my_cli.mount(admin.graph_service)

    return manager

The AdminRuntime exposes the following ToolServices for host integration:

  • admin_service — Admin tools (status, etc.)
  • graph_service — Graph content tools
  • cli_api_key_service — CLI API key management
  • mcp_api_key_service — MCP API key management

Mount point parameters:

  • http_root — Web UI mount prefix (e.g., /gpdb)
  • api_path_prefix — REST API mount prefix (e.g., /api)
  • mcp_name — MCP server name (e.g., "gpdb" or "gpdb-admin")
  • cli_root_name — CLI root command (set to None to skip CLI creation)

When embedded, the admin runtime shares the host's ServerManager lifecycle, and upgrading gpdb-admin automatically updates all embedded surfaces.

Instance and Graph CRUD APIs

gpdb-admin provides full CRUD operations for managing PostgreSQL instances and graphs (table prefixes). All endpoints are available on REST, CLI, and MCP surfaces.

Instance APIs:

  • instance_list — List all managed instances
  • instance_get — Get a single instance by ID
  • instance_create — Create a new external PostgreSQL instance
  • instance_update — Update instance metadata and connection fields
  • instance_delete — Delete an instance and its graph metadata

Graph APIs:

  • graph_list — List all graphs (optionally filtered by instance)
  • graph_get — Get a single graph by ID
  • graph_create — Create a new graph (table prefix) within an instance
  • graph_update — Update graph display name
  • graph_delete — Delete a graph and drop its tables

For detailed API documentation including parameters, response models, error conditions, and examples, see Instance and Graph CRUD APIs.

Current scope

gpdb-admin provides a full admin surface over the core graph library. It supports:

  • Service — startup via gpdb start; health and status across CLI, REST, and MCP
  • Auth — first-run owner bootstrap; login/logout for the web app; API keys for CLI and MCP
  • Instances and graphs — create and manage Postgres instances and graphs (table prefixes)
  • Graph content — schemas (list/get/create/update/delete); nodes and edges (list/get/create/update/delete); node payload get/set; node and edge list endpoints accept an optional filter query parameter (DSL string, same syntax as the core Query DSL)
  • Surfaces — all graph and admin tools are exposed on REST, CLI, and MCP
  • Web UI — dashboard, graph browser, and forms for nodes, edges, and schemas

It does not yet provide a full multi-user administration console (e.g. roles and multiple admin users beyond the owner and API-key callers).

Core concepts

Nodes

Nodes are the primary records. Each has an id, type, optional name, and a data dict for arbitrary JSON. Nodes also support:

  • Parent-child hierarchy — parent_id with a unique constraint on (parent_id, name)
  • Ownership — optional owner_id for access control patterns
  • Binary payloads — store bytes with auto-computed payload_size, payload_hash, plus optional payload_mime and payload_filename
  • Tags — a JSONB list for lightweight categorization

node = (await db.set_nodes([
    NodeUpsert(
        type="document",
        name="notes.md",
        parent_id=folder.id,
        data={"word_count": 350},
        tags=["draft", "personal"],
        payload=b"# My notes\n...",
        payload_mime="text/markdown",
        payload_filename="notes.md",
    )
]))[0]

Payloads are deferred by default — get_nodes() skips the blob, get_node_payloads() includes it.
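The auto-computed payload metadata is easy to reason about; a sketch of the equivalent computation (the exact hash algorithm gpdb uses is an assumption here — SHA-256 is typical for content hashing):

```python
import hashlib

payload = b"# My notes\n..."
payload_size = len(payload)                         # byte length stored alongside the blob
payload_hash = hashlib.sha256(payload).hexdigest()  # assumed algorithm, illustrative only
print(payload_size, payload_hash[:12])
```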

Edges

Edges connect two nodes with a source_id and target_id, a type, and their own data and tags.

edge = (await db.set_edges([
    EdgeUpsert(
        type="authored",
        source_id=user.id,
        target_id=document.id,
        data={"timestamp": "2025-01-15"},
    )
]))[0]

Search

Search nodes or edges with filters, sorting, and pagination. Filters work on top-level columns and on nested JSONB paths using dot notation.

# Programmatic filters (FilterGroup, Logic, and Sort are imported from gpdb
# alongside SearchQuery, Filter, and Op)
result = await db.search_nodes(SearchQuery(
    filter=FilterGroup(logic=Logic.AND, filters=[
        Filter(field="type", op=Op.EQ, value="user"),
        Filter(field="data.role", op=Op.EQ, value="admin"),
    ]),
    sort=[Sort(field="created_at", desc=True)],
    limit=25,
))

# Or use the DSL string syntax
result = await db.search_nodes(SearchQuery(filter="type = user and data.role = admin"))

Query DSL

The DSL string syntax supports natural comparison operators for filtering nodes and edges:

# Equality (= or ==)
result = await db.search_nodes(SearchQuery(filter='name == "alice"'))
result = await db.search_nodes(SearchQuery(filter="type = user"))

# Not equal (!=)
result = await db.search_nodes(SearchQuery(filter='status != "deleted"'))

# Greater than (>) and greater than or equal (>=)
result = await db.search_nodes(SearchQuery(filter="age >= 18"))
result = await db.search_nodes(SearchQuery(filter="score > 100"))

# Less than (<) and less than or equal (<=)
result = await db.search_nodes(SearchQuery(filter="created_at < 2024-01-01"))
result = await db.search_nodes(SearchQuery(filter="price <= 50.00"))

# Contains (~) - case-insensitive substring match
result = await db.search_nodes(SearchQuery(filter='name ~ "john"'))

# In - match any value in a list
result = await db.search_nodes(SearchQuery(filter="type in (user, admin, guest)"))

# Combining conditions with and/or
result = await db.search_nodes(SearchQuery(filter='type = user and age >= 18'))
result = await db.search_nodes(SearchQuery(filter='status = active or role = admin'))

# Parentheses for grouping
result = await db.search_nodes(SearchQuery(filter='(type = user and active = true) or role = superuser'))

Supported operators:

Operator   Aliases      Meaning
=          ==, :, eq    Equal
!=         ne           Not equal
>          gt, after    Greater than
>=         gte          Greater than or equal
<          lt, before   Less than
<=         lte          Less than or equal
~          contains     Contains (case-insensitive)
in                      Match any value in list

JSON path filtering:

Use dot notation to filter on nested JSONB data:

result = await db.search_nodes(SearchQuery(filter="data.role = admin"))
result = await db.search_nodes(SearchQuery(filter="data.metadata.version >= 2"))
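Under the hood, dotted paths like these translate to Postgres JSONB extraction operators. The translation below is an assumed illustration of the idea, not lifted from gpdb's source:

```python
def jsonb_path_sql(field: str) -> str:
    """Translate a dotted filter field into a Postgres JSONB text-extraction expression."""
    column, *path = field.split(".")
    if not path:
        return column  # top-level column, no JSON traversal
    *inner, leaf = path
    expr = column + "".join(f"->'{p}'" for p in inner)  # -> keeps JSONB along the path
    return f"{expr}->>'{leaf}'"                         # ->> extracts the leaf as text

print(jsonb_path_sql("data.metadata.version"))
```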

Field projections are available via search_nodes_projection() for returning only selected columns.

Schemas

Register JSON schemas (or Pydantic models) to validate node or edge data on every write. Each schema is scoped to either nodes or edges and is versioned with automatic semver bumps.

from typing import Optional

from pydantic import BaseModel
from gpdb import NodeUpsert, SchemaUpsert

class UserData(BaseModel):
    role: str
    email: Optional[str] = None

await db.set_schemas([SchemaUpsert(name="user_data", json_schema=UserData, kind="node")])

# This node's data will be validated against the schema
await db.set_nodes([NodeUpsert(
    type="user",
    schema_name="user_data",
    data={"role": "admin", "email": "a@b.com"},
)])

Schema updates are classified automatically:

  • Patch — description/title changes only
  • Minor — new optional fields (backward compatible)
  • Major — removed fields, type changes, or new required fields (rejected unless migrated)
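The classification rules above can be illustrated with a simplified classifier over JSON Schema property sets — a sketch of the rules, not gpdb's actual implementation:

```python
def classify_change(old: dict, new: dict) -> str:
    """Classify a schema update as patch / minor / major (simplified sketch)."""
    old_props = old.get("properties", {})
    new_props = new.get("properties", {})
    removed = set(old_props) - set(new_props)
    new_required = set(new.get("required", [])) - set(old.get("required", []))
    type_changed = any(
        old_props[p].get("type") != new_props[p].get("type")
        for p in set(old_props) & set(new_props)
    )
    if removed or new_required or type_changed:
        return "major"   # breaking: needs migration
    if set(new_props) - set(old_props):
        return "minor"   # new optional fields only
    return "patch"       # description/title-level changes
```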

Use migrate_schema() to atomically transform existing data and update the schema in one transaction.
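A typical migration pairs the new schema with a pure transform over existing data dicts. The transform below is illustrative; the migrate_schema() signature follows the API reference table (name, func, schema, kind):

```python
def split_name(data: dict) -> dict:
    """Example transform: replace a single 'name' field with 'first'/'last'."""
    first, _, last = data.pop("name", "").partition(" ")
    return {**data, "first": first, "last": last}

# Applied atomically to all matching records along with the schema update:
# await db.migrate_schema("user_data", split_name, new_schema, kind="node")
print(split_name({"name": "Ada Lovelace", "role": "admin"}))
```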

Schema Visualization

Schemas support optional visualization options to improve graph readability:

Alias

Set a short display name or emoji for your schema that will appear in the graph viewer instead of the full schema name:

from gpdb import SchemaUpsert

schema = SchemaUpsert(
    name="user_profile",
    json_schema={"type": "object", "properties": {"name": {"type": "string"}}},
    kind="node",
    alias="👤 User",  # Short display name or emoji
)
await graph.set_schemas([schema])

SVG Icon

Add a custom SVG icon that will be displayed on nodes/edges in the graph viewer:

from gpdb import SchemaUpsert

schema = SchemaUpsert(
    name="user_profile",
    json_schema={"type": "object", "properties": {"name": {"type": "string"}}},
    kind="node",
    svg_icon='<svg viewBox="0 0 24 24" fill="currentColor"><circle cx="12" cy="12" r="10"/></svg>',
)
await graph.set_schemas([schema])

Display Order

The graph viewer displays schema information in this priority order:

  1. SVG icon (if set) — on the node body for node schemas; for edge schemas, a small icon sits at the edge midpoint (Cytoscape cannot paint images on the edge line itself)
  2. Alias (if set) — displayed as label text
  3. Schema name — fallback if neither alias nor icon is set

The /viewer/data JSON schemas map is keyed as node:<schema_name> or edge:<schema_name> so a node schema and an edge schema with the same name do not collide.

SVG Security & Requirements

  • Size limit: SVG icons must be 20KB or smaller
  • Sanitization: All SVGs are automatically sanitized to remove:
    • JavaScript code (<script> tags)
    • Event handlers (onclick, onload, etc.)
    • Dangerous attributes
  • Allowed elements: svg, path, circle, rect, ellipse, line, polyline, polygon, text, g, defs, use, symbol, marker, clipPath, mask, pattern, gradient, stop, linearGradient, radialGradient
  • Allowed attributes: Presentation attributes only (fill, stroke, stroke-width, transform, etc.)
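A simplified illustration of the kind of sanitization described above — regex-based here for brevity, whereas a real sanitizer parses the XML:

```python
import re

def sanitize_svg(svg: str) -> str:
    """Strip <script> blocks and on* event-handler attributes (illustrative only)."""
    svg = re.sub(r"<script\b[^>]*>.*?</script>", "", svg, flags=re.S | re.I)
    svg = re.sub(r"\son\w+\s*=\s*\"[^\"]*\"", "", svg, flags=re.I)
    return svg

dirty = '<svg onclick="evil()"><script>alert(1)</script><circle r="10"/></svg>'
print(sanitize_svg(dirty))
```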

Best Practices

  • Keep SVGs simple and performant
  • Use vector graphics that scale well
  • Consider both light and dark themes
  • Test icons at different sizes (32x32px recommended)
  • Use meaningful aliases or emojis for quick identification

For detailed SVG guidelines, see SVG Icon Guidelines.

Domain models (ODM)

Subclass NodeModel or EdgeModel for strongly-typed domain objects that serialize to/from the graph.

from typing import Optional

from gpdb import NodeModel

class User(NodeModel):
    node_type: str = "user"
    role: str = "member"
    email: Optional[str] = None

user = User(role="admin", email="a@b.com")
created = (await db.set_nodes([user.to_upsert()]))[0]
loaded = User.from_read((await db.get_nodes([created.id]))[0])
print(loaded.role)  # "admin"

Transactions

Wrap multiple operations in an atomic transaction.

async with db.transaction():
    node = (await db.set_nodes([NodeUpsert(type="account")]))[0]
    await db.set_edges([EdgeUpsert(type="owns", source_id=owner.id, target_id=node.id)])

Table prefixes

Isolate data into separate tables by passing a table_prefix. Each prefix gets its own nodes, edges, and schemas tables.

main_db = GPGraph(url)
scratch = GPGraph(url, table_prefix="scratch")

await main_db.create_tables()   # creates: nodes, edges, schemas
await scratch.create_tables()   # creates: scratch_nodes, scratch_edges, scratch_schemas
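The prefix-to-table-name mapping is simple; a sketch consistent with the example above (the separator is inferred from the `scratch_nodes` naming shown, not from gpdb's source):

```python
def table_name(base: str, prefix: str = "") -> str:
    """Derive the physical table name for a given prefix, matching the example above."""
    return f"{prefix}_{base}" if prefix else base

for base in ("nodes", "edges", "schemas"):
    print(table_name(base), table_name(base, "scratch"))
```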

API reference

Method                                         Description
create_tables()                                Create tables (idempotent)
drop_tables()                                  Drop this instance's tables
set_nodes([NodeUpsert])                        Create or update nodes
get_nodes([ids])                               Get nodes without payload
get_node_payloads([ids])                       Get nodes with payload
get_node_payload(id)                           Get only the payload bytes
set_node_payload(id, bytes)                    Set payload on existing node
get_node_child(parent_id, name)                Get child node by name
delete_nodes([ids])                            Delete nodes
search_nodes(SearchQuery)                      Search nodes with filters/sort/pagination
search_nodes_projection(SearchQuery)           Search with field projection
set_edges([EdgeUpsert])                        Create or update edges
get_edges([ids])                               Get edges
delete_edges([ids])                            Delete edges
search_edges(SearchQuery)                      Search edges
search_edges_projection(SearchQuery)           Search edges with field projection
set_schemas([SchemaUpsert])                    Register or update node/edge JSON schemas
get_schemas([names])                           Get schemas
delete_schemas([names])                        Delete schemas (fails if in use)
list_schemas(kind=None)                        List all schema names, optionally filtered by kind
migrate_schema(name, func, schema, kind=None)  Atomically migrate data + schema
transaction()                                  Context manager for atomic operations

License

MIT
