pgraf turns Postgres into a lightning-fast property graph engine for use in AI agents
This project has been archived by its maintainers. No new releases are expected.
pgraf
pgraf turns PostgreSQL into a lightning-fast property graph engine with vector search capabilities, designed for use in AI agents and applications.
Features
- Typed Models: Strong typing with Pydantic models for nodes, edges, and content
- Vector Search: Built-in support for embeddings and semantic search
- Property Graph: Full property graph capabilities with typed nodes and labeled edges
- Asynchronous API: Modern async/await API for high-performance applications
- PostgreSQL Backend: Uses PostgreSQL's power for reliability and scalability
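To make the "typed nodes and labeled edges" idea concrete, here is a minimal in-memory sketch. It uses standard-library dataclasses instead of Pydantic so that it runs without extra dependencies. The `Node` and `Edge` classes and the `_check_labels` helper are hypothetical illustrations, not pgraf's actual models.

```python
import uuid
from dataclasses import dataclass, field
from typing import Any


def _check_labels(labels: list[str]) -> None:
    """Reject anything that is not a list of non-empty strings."""
    if not all(isinstance(label, str) and label for label in labels):
        raise TypeError("labels must be non-empty strings")


@dataclass
class Node:
    """Hypothetical node model: labels plus free-form properties."""
    labels: list[str] = field(default_factory=list)
    properties: dict[str, Any] = field(default_factory=dict)
    id: uuid.UUID = field(default_factory=uuid.uuid4)

    def __post_init__(self) -> None:
        _check_labels(self.labels)


@dataclass
class Edge:
    """Hypothetical edge model: a labeled, directed link between two node ids."""
    source: uuid.UUID
    target: uuid.UUID
    labels: list[str] = field(default_factory=list)
    properties: dict[str, Any] = field(default_factory=dict)

    def __post_init__(self) -> None:
        _check_labels(self.labels)


# Build a two-node graph with one directed, labeled edge.
alice = Node(labels=["person"], properties={"name": "Alice", "age": 30})
doc = Node(labels=["document"], properties={"title": "Sample Document"})
created = Edge(source=alice.id, target=doc.id, labels=["CREATED"])
print(created.labels, created.source == alice.id)
```

Validating in `__post_init__` means a malformed label fails when the object is built, which is the same benefit a Pydantic model would give at the API boundary.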
Installation
Prerequisites
- Python 3.12+
- PostgreSQL 14+ with the pgvector extension installed
Installing pgraf
# From PyPI
pip install pgraf
# From source
git clone https://github.com/gmr/pgraf.git
cd pgraf
pip install -e .
Database Setup
- Create a database:
  createdb pgraf
- Apply the schema (includes pgvector extension creation):
  psql -d pgraf -f schema/pgraf.sql
The schema file creates the pgvector extension along with the tables, indexes, and stored procedures that the graph functionality depends on.
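The two setup steps can also be scripted. The sketch below only wraps the `createdb` and `psql` commands shown above; the `setup_commands` and `run_setup` helpers are hypothetical names, and the `-v ON_ERROR_STOP=1` flag is an addition of mine (it makes `psql` exit with an error on the first failed statement) rather than something the project prescribes.

```python
import subprocess


def setup_commands(dbname: str = "pgraf",
                   schema: str = "schema/pgraf.sql") -> list[list[str]]:
    """Return the argv lists for the two setup steps, in order."""
    return [
        ["createdb", dbname],
        # Assumption: stop at the first failing statement in the schema file.
        ["psql", "-v", "ON_ERROR_STOP=1", "-d", dbname, "-f", schema],
    ]


def run_setup(dbname: str = "pgraf",
              schema: str = "schema/pgraf.sql") -> None:
    """Run each step in order; check=True raises if a command fails."""
    for argv in setup_commands(dbname, schema):
        subprocess.run(argv, check=True)


# Show what would be executed without touching a database.
for argv in setup_commands():
    print(" ".join(argv))
```

Calling `run_setup()` from the repository root would execute both commands, assuming `createdb` and `psql` are on the `PATH` and the current user may create databases.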
Usage
Basic Example
import asyncio

from pgraf import graph


async def main():
    # Initialize the graph with PostgreSQL connection
    pgraf = graph.PGraf(url="postgresql://postgres:postgres@localhost:5432/pgraf")
    try:
        # Add a simple node
        person = await pgraf.add_node(
            labels=["person"],
            properties={"name": "Alice", "age": 30}
        )

        # Add a node with content and vector embeddings
        document = await pgraf.add_node(
            labels=["document"],
            properties={
                "tags": ["example"],
                "title": "Sample Document",
                "url": "https://example.com"
            },
            mimetype="text/plain",
            content="This is a sample document that will be embedded in vector space."
        )

        # Create a relationship between nodes
        await pgraf.add_edge(
            source=person.id,
            target=document.id,
            labels=["CREATED"],
            properties={"timestamp": "2023-01-01"}
        )

        # Retrieve nodes
        all_people = []
        async for node in pgraf.get_nodes(
            labels=["person"],
            properties={"name": "Alice"}
        ):
            all_people.append(node)

        # Traverse the graph
        traversal_results = await pgraf.traverse(
            start_node=person.id,
            edge_labels=["CREATED"],
            direction="outgoing",
            max_depth=2
        )

        # Print traversal results
        for node, edge in traversal_results:
            print(f"Node: {node.labels[0] if node.labels else 'Unknown'} {node.id}")
            if edge:
                print(f"  via edge: {edge.labels[0] if edge.labels else 'Unknown'}")
    finally:
        await pgraf.aclose()


if __name__ == "__main__":
    asyncio.run(main())
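To show what a depth-limited, label-filtered traversal like the `traverse` call above means, here is a small in-memory sketch. It is an illustration only: pgraf runs its traversal inside PostgreSQL, and the toy `EDGES` list, the local `traverse` function, and its `(node, label, depth)` return shape are all assumptions made for this example, not the library's implementation or return type.

```python
from collections import deque

# Each edge is (source, label, target); a toy stand-in for rows in an edge table.
EDGES = [
    ("alice", "CREATED", "doc1"),
    ("doc1", "CITES", "doc2"),
    ("doc1", "CREATED", "summary1"),
    ("bob", "CREATED", "doc2"),
]


def traverse(edges, start, edge_labels, max_depth):
    """Breadth-first walk over outgoing edges.

    Returns (node, via_label, depth) for every node reached within
    max_depth hops using only edges whose label is in edge_labels.
    """
    allowed = set(edge_labels)
    seen = {start}
    queue = deque([(start, 0)])
    results = []
    while queue:
        node, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand nodes that already sit at the depth limit
        for source, label, target in edges:
            if source == node and label in allowed and target not in seen:
                seen.add(target)
                results.append((target, label, depth + 1))
                queue.append((target, depth + 1))
    return results


reachable = traverse(EDGES, "alice", ["CREATED"], max_depth=2)
print(reachable)
# → [('doc1', 'CREATED', 1), ('summary1', 'CREATED', 2)]
```

Note that `doc2` is not reached from `alice`: the only path to it goes through a `CITES` edge, which the label filter excludes.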
Semantic Search Example
import asyncio

from pgraf import graph, models
from sentence_transformers import SentenceTransformer


async def main():
    # Initialize the graph
    pgraf = graph.PGraf(url="postgresql://postgres:postgres@localhost:5432/pgraf")
    try:
        # Add some documents with content for vector embedding
        await pgraf.add_node(
            labels=["document"],
            properties={"title": "Climate Change Overview"},
            content="Climate change is the long-term alteration of temperature and weather patterns.",
            mimetype="text/plain"
        )
        await pgraf.add_node(
            labels=["document"],
            properties={"title": "Machine Learning Basics"},
            content="Machine learning is a branch of AI focused on building models that learn from data.",
            mimetype="text/plain"
        )
        await pgraf.add_node(
            labels=["document"],
            properties={"title": "Graph Databases"},
            content="Graph databases store data in nodes and edges, representing entities and relationships.",
            mimetype="text/plain"
        )

        # No need to explicitly generate embeddings - they're created
        # automatically when nodes with content are added

        # Perform semantic search
        # This automatically generates an embedding for the query text
        results = await pgraf.search(
            query="How do databases represent connections between data points?",
            labels=["document"],
            limit=2
        )

        # Print results sorted by relevance
        for result in results:
            print(f"Match: {result.properties.get('title')} (Score: {result.similarity:.4f})")
            print(f"Content: {result.content[:100]}...")
            print()

        # For custom queries, the search method automatically converts query to embeddings
        # You just need to provide the query text, and it will use the internal embedding model
        query_text = "AI techniques for data analysis"

        # The search method handles embedding generation internally
        custom_results = await pgraf.search(
            query=query_text,
            labels=["document"],
            similarity_threshold=0.3,  # Adjust similarity threshold as needed
            limit=2
        )
        for result in custom_results:
            print(f"Custom search match: {result.properties.get('title')} (Score: {result.similarity:.4f})")
    finally:
        await pgraf.aclose()


if __name__ == "__main__":
    asyncio.run(main())
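The roles of `similarity_threshold` and `limit` in the search calls above can be illustrated with a small in-memory ranking sketch. This is not pgraf's code: the library performs the search in PostgreSQL with pgvector, and the README does not say which distance metric it uses, so cosine similarity is an assumption here. The three-dimensional vectors are invented toy values standing in for real embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(query_vec, documents, similarity_threshold=0.0, limit=10):
    """Score every document, drop those below the threshold, return the top `limit`."""
    scored = [(title, cosine_similarity(query_vec, vec))
              for title, vec in documents.items()]
    kept = [item for item in scored if item[1] >= similarity_threshold]
    kept.sort(key=lambda item: item[1], reverse=True)
    return kept[:limit]


# Toy embeddings (assumed values, for illustration only).
DOCS = {
    "Climate Change Overview": [0.9, 0.1, 0.0],
    "Machine Learning Basics": [0.1, 0.9, 0.2],
    "Graph Databases": [0.0, 0.3, 0.9],
}

hits = search([0.0, 0.2, 1.0], DOCS, similarity_threshold=0.3, limit=2)
for title, score in hits:
    print(f"{title}: {score:.4f}")
```

Raising the threshold trims weak matches before the limit is applied, so a query can return fewer than `limit` results.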
Requirements
- Python 3.12+
- PostgreSQL 14+ with the pgvector extension
License
See LICENSE for details.
Download files
File details
Details for the file pgraf-1.0.0a0.tar.gz.
File metadata
- Download URL: pgraf-1.0.0a0.tar.gz
- Upload date:
- Size: 19.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | c86a091c6b1b2e695c09f3a0f0f26ea4f3a82c1286f0d69054e290ea1fba9da4 |
| MD5 | 6600e83f7e24b2045810f3b8e30690ef |
| BLAKE2b-256 | 1bbd9f11a8a165f860007ea0f160a1918ea0409d65b92436edd1b4d2b1cea158 |
File details
Details for the file pgraf-1.0.0a0-py3-none-any.whl.
File metadata
- Download URL: pgraf-1.0.0a0-py3-none-any.whl
- Upload date:
- Size: 26.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 9760aafad23d7b5a8643e946b14218d2ce669c7742d6c7aba58ee62153f43410 |
| MD5 | e625fbef2bf9b3d8eaa709c8c1740ab1 |
| BLAKE2b-256 | a0a23be87479b321d58f6d5821a1aba701d462d3153ffbefd1fcdc1c2860c732 |