Python bindings for the lance-graph Cypher engine

Lance Graph

A high-performance Cypher-capable graph query engine with Python bindings for building scalable, serverless knowledge graphs.

Lance Graph combines a Rust-powered Cypher query engine with Python APIs for:

  • Fast graph queries using the Cypher query language
  • AI-powered knowledge extraction from text (via LLM)
  • Lance-backed storage for efficient graph data management
  • Natural language Q&A over your knowledge graphs
  • FastAPI web service for graph queries

Installation

pip install lance-graph

Quick Start

1. Simple Cypher Query

import pyarrow as pa
from lance_graph import CypherQuery, GraphConfig

cfg = (
    GraphConfig.builder()
    .with_node_label("Person", "id")
    .with_node_label("City", "id")
    .with_relationship("lives_in", "src", "dst")
    .build()
)

datasets = {
    "Person": pa.table({"id": [1, 2], "name": ["Alice", "Bob"]}),
    "City": pa.table({"id": [10, 20], "name": ["London", "Sydney"]}),
    "lives_in": pa.table({"src": [1, 2], "dst": [10, 20]}),
}

query = """
    MATCH (p:Person)-[:lives_in]->(c:City)
    RETURN p.name, c.name
"""

result = CypherQuery(query).with_config(cfg).execute(datasets)
print(result.to_pylist())

[{'p.name': 'Alice', 'c.name': 'London'}, {'p.name': 'Bob', 'c.name': 'Sydney'}]
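
To make the result above easy to check by hand, here is a minimal pure-Python sketch of what the pattern MATCH (p:Person)-[:lives_in]->(c:City) selects: one row per lives_in record whose src matches a Person id and whose dst matches a City id. It illustrates the semantics only and says nothing about how the engine executes the query; the helper name match_lives_in is hypothetical and not part of lance_graph.

```python
# Illustrative only: plain-Python equivalent of the pattern match above.
# `match_lives_in` is a hypothetical helper, not part of lance_graph.

people = [{"id": 1, "name": "Alice"}, {"id": 2, "name": "Bob"}]
cities = [{"id": 10, "name": "London"}, {"id": 20, "name": "Sydney"}]
lives_in = [{"src": 1, "dst": 10}, {"src": 2, "dst": 20}]


def match_lives_in(people, cities, lives_in):
    """Return one row per lives_in edge whose endpoints both exist."""
    person_by_id = {p["id"]: p for p in people}
    city_by_id = {c["id"]: c for c in cities}
    rows = []
    for edge in lives_in:
        person = person_by_id.get(edge["src"])
        city = city_by_id.get(edge["dst"])
        # Edges pointing at unknown ids produce no row, as in an inner join.
        if person is not None and city is not None:
            rows.append({"p.name": person["name"], "c.name": city["name"]})
    return rows


rows = match_lives_in(people, cities, lives_in)
print(rows)
# [{'p.name': 'Alice', 'c.name': 'London'}, {'p.name': 'Bob', 'c.name': 'Sydney'}]
```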

2. Multi-Query Execution with CypherEngine

To run several queries against the same datasets, use CypherEngine. It builds the catalog once and reuses it for every query, so that setup work is not repeated:

import pyarrow as pa
from lance_graph import CypherEngine, GraphConfig

cfg = (
    GraphConfig.builder()
    .with_node_label("Person", "id")
    .with_node_label("City", "id")
    .with_relationship("lives_in", "src", "dst")
    .build()
)

datasets = {
    "Person": pa.table({"id": [1, 2], "name": ["Alice", "Bob"], "age": [30, 25]}),
    "City": pa.table({"id": [10, 20], "name": ["London", "Sydney"]}),
    "lives_in": pa.table({"src": [1, 2], "dst": [10, 20]}),
}

# Create engine once - builds catalog
engine = CypherEngine(cfg, datasets)

# Execute multiple queries efficiently - catalog is reused
result1 = engine.execute("MATCH (p:Person) WHERE p.age > 25 RETURN p.name")
result2 = engine.execute("MATCH (p:Person)-[:lives_in]->(c:City) RETURN p.name, c.name")
result3 = engine.execute("MATCH (p:Person) RETURN count(*) as total")

print(result1.to_pylist())
# [{'p.name': 'Alice'}]

3. Direct SQL Queries

For data analytics workflows where you prefer standard SQL over Cypher, use SqlQuery or SqlEngine. No GraphConfig is needed; you write explicit JOINs against your tables directly:

import pyarrow as pa
from lance_graph import SqlQuery, SqlEngine

person = pa.table({
    "id": [1, 2, 3],
    "name": ["Alice", "Bob", "Carol"],
    "age": [28, 34, 29],
})
knows = pa.table({"src_id": [1, 1, 2], "dst_id": [2, 3, 3]})
datasets = {"person": person, "knows": knows}

# One-off query
result = SqlQuery(
    "SELECT p.name, p.age FROM person p WHERE p.age > 30"
).execute(datasets)
print(result.to_pylist())
# [{'name': 'Bob', 'age': 34}]

# Multi-query with cached context
engine = SqlEngine(datasets)
r1 = engine.execute("SELECT COUNT(*) AS cnt FROM person")
r2 = engine.execute(
    "SELECT p1.name AS person, p2.name AS friend "
    "FROM person p1 "
    "JOIN knows k ON p1.id = k.src_id "
    "JOIN person p2 ON p2.id = k.dst_id"
)
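
The two engine queries above have no printed output, so the sketch below cross-checks what they should return for this sample data. It runs the same SQL through Python's built-in sqlite3 module as a stand-in; it does not use SqlEngine, and SQL dialect details may differ between the two. The ORDER BY clause is an addition made here so the row order is deterministic.

```python
import sqlite3

# Stand-in for cross-checking the SQL above; this uses SQLite, not SqlEngine.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (id INTEGER, name TEXT, age INTEGER)")
conn.execute("CREATE TABLE knows (src_id INTEGER, dst_id INTEGER)")
conn.executemany(
    "INSERT INTO person VALUES (?, ?, ?)",
    [(1, "Alice", 28), (2, "Bob", 34), (3, "Carol", 29)],
)
conn.executemany("INSERT INTO knows VALUES (?, ?)", [(1, 2), (1, 3), (2, 3)])

count = conn.execute("SELECT COUNT(*) AS cnt FROM person").fetchone()[0]

# ORDER BY is added only to make the row order deterministic.
friends = conn.execute(
    "SELECT p1.name AS person, p2.name AS friend "
    "FROM person p1 "
    "JOIN knows k ON p1.id = k.src_id "
    "JOIN person p2 ON p2.id = k.dst_id "
    "ORDER BY p1.name, p2.name"
).fetchall()
conn.close()

print(count)    # 3
print(friends)  # [('Alice', 'Bob'), ('Alice', 'Carol'), ('Bob', 'Carol')]
```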

4. Unity Catalog Integration

Connect to Unity Catalog (OSS) to discover and query Delta Lake or Parquet tables without manually loading data:

from lance_graph import UnityCatalog

# Connect to Unity Catalog
uc = UnityCatalog("http://localhost:8080/api/2.1/unity-catalog")

# Browse catalog hierarchy
catalogs = uc.list_catalogs()
schemas = uc.list_schemas("unity")
tables = uc.list_tables("unity", "default")

# Inspect table metadata
table = uc.get_table("unity", "default", "marksheet")
print(table.data_source_format)  # "Delta"
print(table.columns())           # [{"name": "id", "type_name": "INT", ...}, ...]

# Auto-register all tables and query via SQL
engine = uc.create_sql_engine("unity", "default")
result = engine.execute("SELECT * FROM marksheet WHERE mark > 80")
print(result.to_pandas())

For tables on cloud storage (S3, Azure, GCS), pass credentials through storage_options. This example uses Azure:

uc = UnityCatalog(
    "http://localhost:8080/api/2.1/unity-catalog",
    storage_options={
        "azure_storage_account_name": "myaccount",
        "azure_storage_account_key": "...",
    }
)
engine = uc.create_sql_engine("unity", "default")
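
To keep the account key out of source files, the storage_options dictionary can be built from environment variables. The following is a minimal sketch under stated assumptions: the helper azure_storage_options and the environment variable names AZURE_STORAGE_ACCOUNT_NAME and AZURE_STORAGE_ACCOUNT_KEY are hypothetical choices for this example; only the two dictionary keys come from the snippet above.

```python
import os


def azure_storage_options(env=None):
    """Build the storage_options dict from environment variables.

    The variable names below are assumptions chosen for this sketch.
    """
    env = os.environ if env is None else env
    try:
        return {
            "azure_storage_account_name": env["AZURE_STORAGE_ACCOUNT_NAME"],
            "azure_storage_account_key": env["AZURE_STORAGE_ACCOUNT_KEY"],
        }
    except KeyError as missing:
        raise RuntimeError(f"missing environment variable: {missing}") from None


# Explicit mapping so the sketch runs without real credentials.
options = azure_storage_options(
    {
        "AZURE_STORAGE_ACCOUNT_NAME": "myaccount",
        "AZURE_STORAGE_ACCOUNT_KEY": "placeholder-key",
    }
)
print(sorted(options))
```

The resulting dictionary can then be passed as storage_options when constructing UnityCatalog, in place of the inline literal.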

5. Build a Knowledge Graph from Text

from pathlib import Path
from knowledge_graph import (
    KnowledgeGraphConfig,
    LanceKnowledgeGraph,
    LanceGraphStore,
    get_extractor,
)
from knowledge_graph.cli.ingest import extract_and_add

# Initialize knowledge graph
config = KnowledgeGraphConfig.from_root(Path("./my_graph"))
config.ensure_directories()

# Create schema
schema_path = config.resolved_schema_path()
if not schema_path.exists():
    schema_content = """
nodes:
  Entity:
    id_field: entity_id

relationships:
  RELATIONSHIP:
    source: source_entity_id
    target: target_entity_id
"""
    schema_path.write_text(schema_content, encoding="utf-8")

store = LanceGraphStore(config)
store.ensure_layout()

graph_config = config.load_graph_config()
kg = LanceKnowledgeGraph(graph_config, storage=store)
kg.ensure_initialized()

# Extract and add entities/relationships from text
# Using heuristic extractor for testing without API key
extractor = get_extractor("heuristic")
# or using LLM extractor (requires API key)
# extractor = get_extractor("llm", llm_model="gpt-4o-mini")
text = """
Albert Einstein developed the theory of relativity at Princeton.
Marie Curie discovered radioactivity in Paris.
"""

extract_and_add(text, kg, extractor, embedding_generator=None)

# Query the graph
result = kg.query("""
    MATCH (e:Entity)
    RETURN e.name, e.entity_type
    LIMIT 10
""")
print(result.to_pylist())

6. Natural Language Q&A

from knowledge_graph.llm.qa import ask_question

# Ask questions in natural language
answer = ask_question(
    "Who discovered radioactivity?",
    kg,
    llm_model="gpt-4o-mini"
)
print(answer)
# Example output (LLM answers can vary): Marie Curie discovered radioactivity.

Command-Line Interface

Lance Graph includes a CLI for building and querying knowledge graphs:

# Initialize and extract
knowledge_graph --root ./my_graph --init
knowledge_graph --root ./my_graph --extract-and-add notes.txt

# Query with Cypher
knowledge_graph --root ./my_graph "MATCH (e:Entity) RETURN e.name LIMIT 10"

# Natural language Q&A
knowledge_graph --root ./my_graph --ask "Who discovered DNA?"
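
When scripting these commands from Python, it helps to build each invocation as an argument list so that Cypher text and questions containing spaces are passed as single arguments. The sketch below only assembles and prints the command lines shown above; kg_command is a hypothetical helper, and it uses no flags beyond those in the examples.

```python
import shlex


def kg_command(root, *args):
    """Build an argv list for the knowledge_graph CLI shown above."""
    return ["knowledge_graph", "--root", str(root), *args]


commands = [
    kg_command("./my_graph", "--init"),
    kg_command("./my_graph", "--extract-and-add", "notes.txt"),
    kg_command("./my_graph", "MATCH (e:Entity) RETURN e.name LIMIT 10"),
    kg_command("./my_graph", "--ask", "Who discovered DNA?"),
]

for argv in commands:
    # shlex.join quotes arguments that contain spaces or shell metacharacters.
    print(shlex.join(argv))
    # To actually run a command: subprocess.run(argv, check=True)
```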

For complete CLI documentation and examples, see the main README.

Requirements

  • Python 3.11+
  • Optional: OpenAI API key for LLM extraction

Contributing

Lance Graph is open source! Contributions are welcome.

Quick start

cd python
uv venv --python 3.11 .venv
source .venv/bin/activate
uv pip install 'maturin[patchelf]'
uv pip install -e '.[tests]'
maturin develop
pytest python/tests/ -v

Development workflow

For linting and type checks:

# Install dev dependencies
uv pip install -e '.[dev]'

# Run linters and type checker
ruff format python/              # format code
ruff check python/               # lint code
pyright                          # type check

# Run specific tests
pytest python/tests/test_graph.py::test_basic_node_selection -v

# Rebuild extension after Rust changes
maturin develop

If another virtual environment is already active, run deactivate (or unset VIRTUAL_ENV) before invoking uv run so uv binds to .venv.
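
One way to detect that situation before running any commands is to compare the VIRTUAL_ENV variable with the project's .venv directory. This is a minimal sketch; conflicting_virtualenv is a hypothetical helper, and it assumes it is run from the directory that contains .venv.

```python
import os
from pathlib import Path


def conflicting_virtualenv(project_venv, env=None):
    """Return the active VIRTUAL_ENV if it differs from project_venv, else None."""
    env = os.environ if env is None else env
    active = env.get("VIRTUAL_ENV")
    if not active:
        return None
    if Path(active).resolve() == Path(project_venv).resolve():
        return None
    return active


conflict = conflicting_virtualenv(".venv")
if conflict:
    print(f"Another environment is active: {conflict}; run `deactivate` first.")
else:
    print("No conflicting virtual environment detected.")
```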

Repository layout

  • crates/lance-graph-python/src/ – PyO3 bridge that exposes graph APIs to Python
  • python/python/lance_graph/ – pure-Python wrapper and __init__
  • python/python/knowledge_graph/ – CLI, FastAPI, and extractor utilities built on Lance
  • python/python/tests/ – graph-centric functional tests

For more information on development setup, building from source, running tests, and code quality guidelines, see DEVELOPMENT.md.

License

Apache 2.0
