OpenCypher PetGraph - 100% openCypher-compliant graph database engine

ocpg - OpenCypher PetGraph

ocpg is the Python package for nxcypher, a 100% openCypher-compliant graph database engine written in Rust.

OpenCypher + PetGraph = Fast, memory-safe graph queries in Python.

Features

  • 100% openCypher TCK compliant (3,893 scenarios passing)
  • 🚀 High performance - Pure Rust implementation with Python bindings
  • 🔒 Memory safe - Written in Rust with no unsafe code
  • 🧪 Well tested - Comprehensive test coverage
  • 📦 Easy to install - Single pip install command
  • 🐍 Pythonic API - Natural Python interface

Installation

pip install ocpg

Quick Start

import ocpg

# Create a graph
g = ocpg.Graph()

# Create nodes and relationships
g.execute("""
    CREATE (alice:Person {name: 'Alice', age: 30})
    CREATE (bob:Person {name: 'Bob', age: 25})
    CREATE (alice)-[:KNOWS]->(bob)
""")

# Query the graph
result = g.execute("MATCH (p:Person) RETURN p.name, p.age ORDER BY p.age")

# Iterate over results
for row in result:
    print(f"Name: {row['p.name']}, Age: {row['p.age']}")

# Or convert to list
data = result.to_list()
print(data)

API Reference

Graph

The main graph database class.

Methods

  • execute(query: str, params: dict = None) -> QueryResult

    Execute a Cypher query with optional parameters.

    result = g.execute(
        "MATCH (n:Person) WHERE n.age > $min_age RETURN n",
        {"min_age": 18}
    )
    
  • node_count() -> int

    Get the number of nodes in the graph.

  • edge_count() -> int

    Get the number of relationships in the graph.

  • clear()

    Remove all nodes and relationships from the graph.

  • create_node(labels: list[str], properties: dict = None) -> int 🚀 NEW

    Create a node directly, without parsing Cypher, and return its ID. Intended for bulk loading: in the benchmark below it was about 99x faster than equivalent CREATE queries.

    # Create a node and get its ID
    alice_id = g.create_node(["Person"], {"name": "Alice", "age": 30})
    bob_id = g.create_node(["Person"], {"name": "Bob", "age": 25})
    
    # No properties
    empty_id = g.create_node(["Label"])
    
  • create_relationship(from_id: int, to_id: int, rel_type: str, properties: dict = None) -> int 🚀 NEW

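    Create a relationship directly between two existing nodes, without parsing Cypher, and return its ID. Intended for bulk loading: reported as ~99x faster than CREATE queries (the benchmark below measures node creation).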

    # Create nodes first
    alice_id = g.create_node(["Person"], {"name": "Alice"})
    bob_id = g.create_node(["Person"], {"name": "Bob"})
    
    # Create relationship between them
    edge_id = g.create_relationship(
        alice_id,
        bob_id,
        "KNOWS",
        {"since": 2020}
    )
    

QueryResult

Query result object containing rows and columns.

Properties

  • columns: list[str] - List of column names

Methods

  • __len__() -> int - Number of rows
  • __getitem__(index: int) -> dict - Get row by index
  • __iter__() - Iterate over rows
  • to_list() -> list[dict] - Convert all rows to a list of dictionaries

Example

result = g.execute("MATCH (n) RETURN n, n.name LIMIT 5")

print(f"Columns: {result.columns}")  # ['n', 'n.name']
print(f"Rows: {len(result)}")        # at most 5, because of LIMIT 5

# Access by index
first_row = result[0]
print(first_row['n.name'])

# Iterate
for row in result:
    print(row['n.name'])

# Convert to list
all_data = result.to_list()
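
For column-wise processing, the rows can be pivoted by column name. The following is a minimal sketch, not part of ocpg: `rows_to_columns` is a hypothetical helper, and the `columns` and `rows` values are hand-written stand-ins for `result.columns` and `result.to_list()` so the snippet runs on its own.

```python
def rows_to_columns(columns, rows):
    """Pivot a list of row dictionaries into {column_name: [values, ...]}."""
    return {name: [row[name] for row in rows] for name in columns}


# Hand-written stand-ins shaped like result.columns and result.to_list().
columns = ["p.name", "p.age"]
rows = [
    {"p.name": "Bob", "p.age": 25},
    {"p.name": "Alice", "p.age": 30},
]

table = rows_to_columns(columns, rows)
print(table["p.name"])  # ['Bob', 'Alice']
print(table["p.age"])   # [25, 30]
```

With a real query result, the call would presumably be `rows_to_columns(result.columns, result.to_list())`.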

Supported Data Types

Python types are automatically converted to/from Cypher types:

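| Python Type | Cypher Type | Notes                  |
|-------------|-------------|------------------------|
| None        | NULL        | -                      |
| bool        | BOOLEAN     | -                      |
| int         | INTEGER     | -                      |
| float       | FLOAT       | -                      |
| str         | STRING      | -                      |
| list        | LIST        | Nested lists supported |
| dict        | MAP         | -                      |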
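
Parameters can be checked against the table above before they are sent to the engine. This is a sketch under assumptions: `is_cypher_compatible` is a hypothetical helper, not an ocpg function; it assumes map keys must be strings, and it treats every type missing from the table (tuples, sets, custom objects) as unsupported, which may be stricter than what ocpg actually accepts.

```python
def is_cypher_compatible(value):
    """Return True if `value` uses only the Python types listed in the table."""
    # None, bool, int, float and str map directly to scalar Cypher types.
    if value is None or isinstance(value, (bool, int, float, str)):
        return True
    # Lists map to LIST; every element must itself be convertible.
    if isinstance(value, list):
        return all(is_cypher_compatible(item) for item in value)
    # Dicts map to MAP; string keys are an assumption of this sketch.
    if isinstance(value, dict):
        return all(
            isinstance(key, str) and is_cypher_compatible(item)
            for key, item in value.items()
        )
    # Anything else is not listed in the table.
    return False


params = {"name": "Alice", "scores": [1, 2.5, [3]], "meta": {"active": True}}
print(is_cypher_compatible(params))                # True
print(is_cypher_compatible({"tags": {"a", "b"}}))  # False (set is not listed)
```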

Graph types (Node, Relationship, Path) are returned as dictionaries with a _type field:

result = g.execute("MATCH (n:Person {name: 'Alice'}) RETURN n")
node = result[0]['n']

print(node['_type'])        # 'Node'
print(node['id'])           # Node ID
print(node['labels'])       # ['Person']
print(node['properties'])   # {'name': 'Alice', 'age': 30}
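
Because graph values arrive as plain dictionaries, code that handles mixed results can dispatch on the `_type` field. A minimal sketch: `node_summary` is a hypothetical helper, and `sample` is a hand-written dictionary using only the Node keys shown above (the `id` value is made up). The fields of Relationship and Path dictionaries are not documented here, so the helper ignores them.

```python
def node_summary(value):
    """Format a Node dictionary as '(id:Label {props})'; return None otherwise."""
    if not isinstance(value, dict) or value.get("_type") != "Node":
        return None
    labels = ":".join(value["labels"])
    return f"({value['id']}:{labels} {value['properties']})"


# Hand-written sample shaped like the Node dictionary shown above.
sample = {
    "_type": "Node",
    "id": 0,
    "labels": ["Person"],
    "properties": {"name": "Alice", "age": 30},
}

print(node_summary(sample))  # (0:Person {'name': 'Alice', 'age': 30})
print(node_summary("Alice")) # None
```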

Advanced Examples

Parameterized Queries

# Create with parameters
g.execute(
    "CREATE (p:Person {name: $name, age: $age})",
    {"name": "Charlie", "age": 35}
)

# Query with parameters (Cypher has no BETWEEN; use a chained comparison)
result = g.execute(
    "MATCH (p:Person) WHERE $min <= p.age <= $max RETURN p",
    {"min": 25, "max": 35}
)

Working with Relationships

# Create relationship with properties
g.execute("""
    MATCH (a:Person {name: 'Alice'})
    MATCH (b:Person {name: 'Bob'})
    CREATE (a)-[:KNOWS {since: 2020, strength: 0.8}]->(b)
""")

# Query relationships
result = g.execute("""
    MATCH (a:Person)-[r:KNOWS]->(b:Person)
    RETURN a.name, r.since, r.strength, b.name
""")

for row in result:
    print(f"{row['a.name']} knows {row['b.name']} since {row['r.since']}")

Aggregations

result = g.execute("""
    MATCH (p:Person)
    RETURN
        count(p) as total,
        avg(p.age) as avg_age,
        min(p.age) as min_age,
        max(p.age) as max_age
""")

stats = result[0]
print(f"Total: {stats['total']}, Avg Age: {stats['avg_age']:.1f}")

Path Queries

# Find shortest path
result = g.execute("""
    MATCH path = shortestPath(
        (a:Person {name: 'Alice'})-[:KNOWS*]-(b:Person {name: 'Charlie'})
    )
    RETURN path
""")

path = result[0]['path']
print(f"Path length: {len(path['nodes']) - 1}")

Bulk Loading for High Performance 🚀

For loading large amounts of data, use the native bulk loader API instead of Cypher queries.

Performance Comparison

Benchmark: Creating 1,000 nodes

| Method         | Time    | Throughput     | Speedup    |
|----------------|---------|----------------|------------|
| Bulk Loader    | 0.0018s | 555K nodes/sec | 99x faster |
| Cypher Queries | 0.178s  | 5.6K nodes/sec | baseline   |
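
The throughput and speedup columns follow directly from the two timings. The arithmetic, using only the numbers reported in the table:

```python
nodes = 1_000
bulk_seconds = 0.0018
cypher_seconds = 0.178

# Throughput is nodes created per second of wall-clock time.
bulk_throughput = nodes / bulk_seconds
cypher_throughput = nodes / cypher_seconds

# Speedup is the ratio of the two timings.
speedup = cypher_seconds / bulk_seconds

print(f"Bulk:    {bulk_throughput:,.0f} nodes/sec")    # Bulk:    555,556 nodes/sec
print(f"Cypher:  {cypher_throughput:,.0f} nodes/sec")  # Cypher:  5,618 nodes/sec
print(f"Speedup: {speedup:.0f}x")                      # Speedup: 99x
```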

Bulk Loading Example

import ocpg
import time

g = ocpg.Graph()

# Traditional approach (slow): one parsed Cypher query per node
start = time.perf_counter()
for i in range(1000):
    g.execute(f"CREATE (p:Person {{name: 'Person{i}', age: {20 + i}}})")
print(f"Cypher: {time.perf_counter() - start:.4f}s")

# Bulk loader approach (about 99x faster in the benchmark above)
g2 = ocpg.Graph()
start = time.perf_counter()
node_ids = []
for i in range(1000):
    node_id = g2.create_node(["Person"], {"name": f"Person{i}", "age": 20 + i})
    node_ids.append(node_id)
# Four decimals, because the bulk run finishes in a few milliseconds
print(f"Bulk: {time.perf_counter() - start:.4f}s")

# Create relationships between nodes
for i in range(len(node_ids) - 1):
    g2.create_relationship(node_ids[i], node_ids[i + 1], "KNOWS", {})

When to Use Bulk Loading

Use bulk loaders for:

  • Initial data import
  • Migrating from other databases
  • Loading CSV/JSON files
  • Creating large test datasets
  • Any time you create >100 nodes/relationships

Use Cypher for:

  • Complex pattern matching and creation
  • Conditional logic (MERGE, conditional SET)
  • When you need to match existing nodes first
  • One-off operations
  • When query readability matters more than performance
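
As an illustration of the CSV case, the sketch below feeds rows from Python's `csv` module into `create_node` and `create_relationship`. The helpers `load_people` and `load_knows` are hypothetical, as are the column names `name` and `age`. `RecordingGraph` is a stand-in that only mimics the two method signatures documented above so the snippet runs without ocpg installed; in real use an `ocpg.Graph()` would be passed instead.

```python
import csv
import io


def load_people(graph, csv_text):
    """Create one Person node per CSV row and return {name: node_id}."""
    ids = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        # CSV values are strings, so convert numeric columns explicitly.
        props = {"name": row["name"], "age": int(row["age"])}
        ids[row["name"]] = graph.create_node(["Person"], props)
    return ids


def load_knows(graph, ids, pairs):
    """Create a KNOWS relationship for each (source, target) name pair."""
    for src, dst in pairs:
        graph.create_relationship(ids[src], ids[dst], "KNOWS")


class RecordingGraph:
    """Stand-in for ocpg.Graph (assumption): records calls, returns list indexes as IDs."""

    def __init__(self):
        self.nodes, self.edges = [], []

    def create_node(self, labels, properties=None):
        self.nodes.append((labels, properties or {}))
        return len(self.nodes) - 1

    def create_relationship(self, from_id, to_id, rel_type, properties=None):
        self.edges.append((from_id, to_id, rel_type, properties or {}))
        return len(self.edges) - 1


csv_text = "name,age\nAlice,30\nBob,25\n"
g = RecordingGraph()
ids = load_people(g, csv_text)
load_knows(g, ids, [("Alice", "Bob")])
print(len(g.nodes), len(g.edges))  # 2 1
```

Keeping the name-to-ID mapping in Python avoids a MATCH query per relationship, which is the main reason to stay on the bulk path for imports like this.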

Performance Tips

  1. Use bulk loaders (create_node, create_relationship) for loading data - about 99x faster for node creation in the benchmark above
  2. Use parameters instead of string formatting for better performance and security
  3. Batch operations when creating many nodes/relationships
  4. Use LIMIT for large result sets
  5. Create indexes for frequently queried properties (when index support is added)
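
The second tip is easiest to see with a value that contains a quote. The snippet below only builds strings and runs without ocpg; the name `O'Brien` is an illustrative value, and the parameterized form mirrors the `execute(query, params)` signature documented above.

```python
name = "O'Brien"

# String formatting splices the value into the query text: the apostrophe
# ends the string literal early and leaves malformed Cypher.
formatted = f"CREATE (p:Person {{name: '{name}'}})"
print(formatted)  # CREATE (p:Person {name: 'O'Brien'})

# With parameters the query text stays constant and the value travels
# separately, so no escaping is needed.
query = "CREATE (p:Person {name: $name})"
params = {"name": name}
print(query)      # CREATE (p:Person {name: $name})
print(params)     # {'name': "O'Brien"}
```

The pair would then be passed as `g.execute(query, params)`.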

Development

Building from source:

# Install maturin
pip install maturin

# Build and install in development mode
cd python-bindings
maturin develop

# Build release wheel
maturin build --release

# Run tests
python -m pytest

License

Apache 2.0 - See LICENSE for details.
