

This project has been archived by its maintainers; no new releases are expected.


pgraf


pgraf turns PostgreSQL into a lightning-fast property graph engine with vector search capabilities, designed for use in AI agents and applications.

Features

  • Typed Models: Strong typing with Pydantic models for nodes, edges, and content
  • Vector Search: Built-in support for embeddings and semantic search
  • Property Graph: Full property graph capabilities with typed nodes and labeled edges
  • Asynchronous API: Modern async/await API for high-performance applications
  • PostgreSQL Backend: Builds on PostgreSQL for transactional reliability and scalability
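The typed-model idea can be sketched with plain dataclasses. This is a hypothetical illustration only; pgraf's actual models are Pydantic classes with their own field definitions:

```python
import uuid
from dataclasses import dataclass, field
from typing import Any

@dataclass
class Node:
    """Illustrative node shape: labels plus a free-form property map."""
    labels: list[str]
    properties: dict[str, Any] = field(default_factory=dict)
    id: uuid.UUID = field(default_factory=uuid.uuid4)

@dataclass
class Edge:
    """Illustrative edge shape: source/target node ids plus labels."""
    source: uuid.UUID
    target: uuid.UUID
    labels: list[str]
    properties: dict[str, Any] = field(default_factory=dict)

person = Node(labels=["person"], properties={"name": "Alice", "age": 30})
doc = Node(labels=["document"], properties={"title": "Sample Document"})
edge = Edge(source=person.id, target=doc.id, labels=["CREATED"])
```

The point is only the shape: nodes carry labels and arbitrary properties, and edges reference node ids, which is what the API calls below operate on.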

📚 Documentation | 🚀 Quick Start | 📖 API Reference

Installation

Prerequisites

  • Python 3.12+
  • PostgreSQL 14+ with pgvector extension installed

Installing pgraf

# From PyPI
pip install pgraf

# From source
git clone https://github.com/gmr/pgraf.git
cd pgraf
pip install -e .

Database Setup

  1. Create a database:

    createdb pgraf
    
  2. Apply the schema (includes pgvector extension creation):

    psql -d pgraf -f schema/pgraf.sql
    

The schema file creates the pgvector extension, necessary tables, indexes, and stored procedures for the graph functionality.

Usage

Basic Example

import asyncio
from pgraf import graph

async def main():
    # Initialize the graph with PostgreSQL connection
    pgraf = graph.PGraf(url="postgresql://postgres:postgres@localhost:5432/pgraf")

    try:
        # Add a simple node
        person = await pgraf.add_node(
            labels=["person"],
            properties={"name": "Alice", "age": 30}
        )

        # Add a node with content and vector embeddings
        document = await pgraf.add_node(
            labels=["document"],
            properties={
                "tags": ["example"],
                "title": "Sample Document",
                "url": "https://example.com"
            },
            mimetype="text/plain",
            content="This is a sample document that will be embedded in vector space."
        )

        # Create a relationship between nodes
        await pgraf.add_edge(
            source=person.id,
            target=document.id,
            labels=["CREATED"],
            properties={"timestamp": "2023-01-01"}
        )

        # Retrieve nodes
        all_people = []
        async for node in pgraf.get_nodes(
            labels=["person"],
            properties={"name": "Alice"}
        ):
            all_people.append(node)

        # Traverse the graph
        traversal_results = await pgraf.traverse(
            start_node=person.id,
            edge_labels=["CREATED"],
            direction="outgoing",
            max_depth=2
        )

        # Print traversal results
        for node, edge in traversal_results:
            print(f"Node: {node.labels[0] if node.labels else 'Unknown'} {node.id}")
            if edge:
                print(f"  via edge: {edge.labels[0] if edge.labels else 'Unknown'}")

    finally:
        await pgraf.aclose()


if __name__ == "__main__":
    asyncio.run(main())
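The traverse call above can be understood as a bounded breadth-first walk over labeled edges. Here is a minimal in-memory sketch of that idea; pgraf itself performs traversal inside PostgreSQL, so this is purely conceptual:

```python
from collections import deque

def traverse(edges, start, edge_labels, max_depth):
    """Breadth-first walk following labeled edges up to max_depth hops.

    edges: list of (source, target, label) tuples.
    Returns visited node ids in discovery order, starting with `start`.
    """
    seen = {start}
    order = [start]
    frontier = deque([(start, 0)])
    while frontier:
        node, depth = frontier.popleft()
        if depth == max_depth:
            continue  # do not expand past the depth bound
        for source, target, label in edges:
            if source == node and label in edge_labels and target not in seen:
                seen.add(target)
                order.append(target)
                frontier.append((target, depth + 1))
    return order

edges = [("alice", "doc1", "CREATED"), ("doc1", "doc2", "CITES")]
print(traverse(edges, "alice", {"CREATED"}, max_depth=2))  # ['alice', 'doc1']
```

Note how the edge-label filter prunes the walk: the CITES edge is never followed, so doc2 is not reached even though it is within two hops.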

Semantic Search Example

import asyncio
from pgraf import graph

async def main():
    # Initialize the graph
    pgraf = graph.PGraf(url="postgresql://postgres:postgres@localhost:5432/pgraf")

    try:
        # Add some documents with content for vector embedding
        await pgraf.add_node(
            labels=["document"],
            properties={"title": "Climate Change Overview"},
            content="Climate change is the long-term alteration of temperature and weather patterns.",
            mimetype="text/plain"
        )

        await pgraf.add_node(
            labels=["document"],
            properties={"title": "Machine Learning Basics"},
            content="Machine learning is a branch of AI focused on building models that learn from data.",
            mimetype="text/plain"
        )

        await pgraf.add_node(
            labels=["document"],
            properties={"title": "Graph Databases"},
            content="Graph databases store data in nodes and edges, representing entities and relationships.",
            mimetype="text/plain"
        )

        # No need to explicitly generate embeddings - they're created
        # automatically when nodes with content are added

        # Perform semantic search
        # This automatically generates an embedding for the query text
        results = await pgraf.search(
            query="How do databases represent connections between data points?",
            labels=["document"],
            limit=2
        )

        # Print results sorted by relevance
        for result in results:
            print(f"Match: {result.properties.get('title')} (Score: {result.similarity:.4f})")
            print(f"Content: {result.content[:100]}...")
            print()

        # Any query string works the same way; the search method
        # generates the query embedding internally
        query_text = "AI techniques for data analysis"
        custom_results = await pgraf.search(
            query=query_text,
            labels=["document"],
            similarity_threshold=0.3,  # Adjust similarity threshold as needed
            limit=2
        )

        for result in custom_results:
            print(f"Custom search match: {result.properties.get('title')} (Score: {result.similarity:.4f})")

    finally:
        await pgraf.aclose()

if __name__ == "__main__":
    asyncio.run(main())


License

See LICENSE for details.
