
pgraf turns Postgres into a lightning-fast property graph engine for use in AI agents

This project has been archived by its maintainers; no new releases are expected.

Project description

pgraf


pgraf turns PostgreSQL into a lightning-fast property graph engine with vector search capabilities, designed for use in AI agents and applications.

Features

  • Typed Models: Strong typing with Pydantic models for nodes, edges, and content
  • Vector Search: Built-in support for embeddings and semantic search
  • Property Graph: Full property graph capabilities with typed nodes and labeled edges
  • Asynchronous API: Modern async/await API for high-performance applications
  • PostgreSQL Backend: Uses PostgreSQL's power for reliability and scalability
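
The typed-model idea can be sketched with plain dataclasses. This is only an illustration of the shape of the data, not pgraf's actual API: the real models are Pydantic classes, and the field names below are assumptions.

```python
import uuid
from dataclasses import dataclass, field
from typing import Any

# Illustrative sketch only: pgraf's real models are Pydantic classes
# with validation; these field names are assumptions, not the actual API.
@dataclass
class Node:
    labels: list[str]
    properties: dict[str, Any]
    id: uuid.UUID = field(default_factory=uuid.uuid4)

@dataclass
class Edge:
    source: uuid.UUID
    target: uuid.UUID
    labels: list[str]
    properties: dict[str, Any] = field(default_factory=dict)

# Typed nodes and a labeled edge between them
alice = Node(labels=["person"], properties={"name": "Alice", "age": 30})
doc = Node(labels=["document"], properties={"title": "Sample"})
created = Edge(source=alice.id, target=doc.id, labels=["CREATED"])
```

The benefit of typed models is that malformed payloads fail at construction time rather than surfacing later as bad rows in the database.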

📚 Documentation | 🚀 Quick Start | 📖 API Reference

Installation

Prerequisites

  • Python 3.12+
  • PostgreSQL 14+ with pgvector extension installed

Installing pgraf

# From PyPI
pip install pgraf

# From source
git clone https://github.com/gmr/pgraf.git
cd pgraf
pip install -e .

Database Setup

  1. Create a database:

    createdb pgraf
    
  2. Apply the schema (includes pgvector extension creation):

    psql -d pgraf -f schema/pgraf.sql
    

The schema file creates the pgvector extension, necessary tables, indexes, and stored procedures for the graph functionality.
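
After applying the schema, you can confirm the pgvector extension is in place with a quick query (run inside `psql -d pgraf`):

```sql
-- Should return one row if the schema applied cleanly
SELECT extname, extversion FROM pg_extension WHERE extname = 'vector';
```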

Usage

Basic Example

import asyncio
from pgraf import graph

async def main():
    # Initialize the graph with PostgreSQL connection
    pgraf = graph.PGraf(url="postgresql://postgres:postgres@localhost:5432/pgraf")

    try:
        # Add a simple node
        person = await pgraf.add_node(
            labels=["person"],
            properties={"name": "Alice", "age": 30}
        )

        # Add a node with content and vector embeddings
        document = await pgraf.add_node(
            labels=["document"],
            properties={
                "tags": ["example"],
                "title": "Sample Document",
                "url": "https://example.com"
            },
            mimetype="text/plain",
            content="This is a sample document that will be embedded in vector space."
        )

        # Create a relationship between nodes
        await pgraf.add_edge(
            source=person.id,
            target=document.id,
            labels=["CREATED"],
            properties={"timestamp": "2023-01-01"}
        )

        # Retrieve nodes
        all_people = []
        async for node in pgraf.get_nodes(
            labels=["person"],
            properties={"name": "Alice"}
        ):
            all_people.append(node)

        # Traverse the graph
        traversal_results = await pgraf.traverse(
            start_node=person.id,
            edge_labels=["CREATED"],
            direction="outgoing",
            max_depth=2
        )

        # Print traversal results
        for node, edge in traversal_results:
            print(f"Node: {node.labels[0] if node.labels else 'Unknown'} {node.id}")
            if edge:
                print(f"  via edge: {edge.labels[0] if edge.labels else 'Unknown'}")

    finally:
        await pgraf.aclose()


if __name__ == "__main__":
    asyncio.run(main())

Semantic Search Example

import asyncio
from pgraf import graph

async def main():
    # Initialize the graph
    pgraf = graph.PGraf(url="postgresql://postgres:postgres@localhost:5432/pgraf")

    try:
        # Add some documents with content for vector embedding
        await pgraf.add_node(
            labels=["document"],
            properties={"title": "Climate Change Overview"},
            content="Climate change is the long-term alteration of temperature and weather patterns.",
            mimetype="text/plain"
        )

        await pgraf.add_node(
            labels=["document"],
            properties={"title": "Machine Learning Basics"},
            content="Machine learning is a branch of AI focused on building models that learn from data.",
            mimetype="text/plain"
        )

        await pgraf.add_node(
            labels=["document"],
            properties={"title": "Graph Databases"},
            content="Graph databases store data in nodes and edges, representing entities and relationships.",
            mimetype="text/plain"
        )

        # No need to explicitly generate embeddings - they're created
        # automatically when nodes with content are added

        # Perform semantic search
        # This automatically generates an embedding for the query text
        results = await pgraf.search(
            query="How do databases represent connections between data points?",
            labels=["document"],
            limit=2
        )

        # Print results sorted by relevance
        for result in results:
            print(f"Match: {result.properties.get('title')} (Score: {result.similarity:.4f})")
            print(f"Content: {result.content[:100]}...")
            print()

        # search() also generates the embedding for ad-hoc query text
        query_text = "AI techniques for data analysis"

        custom_results = await pgraf.search(
        custom_results = await pgraf.search(
            query=query_text,
            labels=["document"],
            similarity_threshold=0.3,  # Adjust similarity threshold as needed
            limit=2
        )

        for result in custom_results:
            print(f"Custom search match: {result.properties.get('title')} (Score: {result.similarity:.4f})")

    finally:
        await pgraf.aclose()

if __name__ == "__main__":
    asyncio.run(main())
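
The `similarity` scores above come from comparing embedding vectors. As a rough, stdlib-only illustration of the idea (not pgraf's implementation, which delegates distance computation to pgvector inside Postgres), cosine similarity between two vectors looks like this:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 means identical direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy 3-dimensional "embeddings"
query = [0.9, 0.1, 0.0]
doc_graphs = [0.8, 0.2, 0.1]   # topically close to the query
doc_cooking = [0.0, 0.1, 0.9]  # topically distant

# The topically closer document scores higher
print(cosine_similarity(query, doc_graphs) > cosine_similarity(query, doc_cooking))  # True
```

In practice the vectors have hundreds of dimensions and the comparison happens in the database, so results can be filtered by label and thresholded in a single query.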

Requirements

  • Python 3.12+
  • PostgreSQL 14+ with the pgvector extension

License

See LICENSE for details.

Download files

Source Distribution

pgraf-1.0.0a0.tar.gz (19.8 kB)

  • Uploaded via: twine/6.1.0 CPython/3.12.9
  • Trusted Publishing: No

Hashes for pgraf-1.0.0a0.tar.gz
Algorithm Hash digest
SHA256 c86a091c6b1b2e695c09f3a0f0f26ea4f3a82c1286f0d69054e290ea1fba9da4
MD5 6600e83f7e24b2045810f3b8e30690ef
BLAKE2b-256 1bbd9f11a8a165f860007ea0f160a1918ea0409d65b92436edd1b4d2b1cea158

Built Distribution

pgraf-1.0.0a0-py3-none-any.whl (26.3 kB)

  • Uploaded via: twine/6.1.0 CPython/3.12.9
  • Trusted Publishing: No

Hashes for pgraf-1.0.0a0-py3-none-any.whl
Algorithm Hash digest
SHA256 9760aafad23d7b5a8643e946b14218d2ce669c7742d6c7aba58ee62153f43410
MD5 e625fbef2bf9b3d8eaa709c8c1740ab1
BLAKE2b-256 a0a23be87479b321d58f6d5821a1aba701d462d3153ffbefd1fcdc1c2860c732
