Skip to main content

Parse, validate, and transpile graph query languages (GQL and Neo4j Cypher).

Project description

GraphGlot

CI License Python 3.11+

Parse, validate, transpile, and analyze graph query languages.

GraphGlot is a pure-Python toolkit for GQL (ISO/IEC 39075:2024) and Neo4j Cypher. It lets you parse queries into ASTs, transpile between dialects, validate syntax and feature compatibility, and analyze data lineage without requiring a database.

GraphGlot is pre-v1 and evolving quickly. APIs and behavior may still change as coverage and semantics expand.

Why GraphGlot?

  • GQL parser — parses the GQL language defined by ISO/IEC 39075:2024, including the core language and many optional language features
  • Standards-aligned feature flagging — reports which optional GQL features a query uses, following the GQL Flagger model in the standard
  • Validate queries — check syntax, feature compatibility, and semantic rules before hitting the database
  • Transpile between dialects — parse Neo4j Cypher and generate standard GQL, or vice versa
  • 100% openCypher TCK parse rate — parses all 3,897 tracked conformance scenarios
  • Track data lineage — understand how data flows from MATCH patterns to RETURN outputs
  • Build tooling — power linters, formatters, migration tools, and IDE integrations with a complete AST

Playground

Try it online here.

Validation

Check if a query is valid for a specific dialect and see which GQL features it requires:

from graphglot.dialect import Dialect

neo4j = Dialect.get_or_raise("neo4j")
result = neo4j.validate("MATCH (n:Person) RETURN n.name")

print(result.success)      # True
print(result.features)     # Set of required GQL features
print(result.diagnostics)  # Semantic warnings/errors

Transpilation

Parse a query with one dialect, generate it in another:

from graphglot.dialect import Dialect

neo4j = Dialect.get_or_raise("neo4j")
gql = Dialect.get_or_raise("fullgql")  # standard GQL

# Parse Cypher, generate GQL
ast = neo4j.parse("UNWIND [1, 2, 3] AS x RETURN x")
print(gql.generate(ast[0]))
# FOR x IN [1, 2, 3] RETURN x

ast = neo4j.parse("MATCH (n)-[r:KNOWS*1..3]->(m) RETURN n, m")
print(gql.generate(ast[0]))
# MATCH (n) -[r :KNOWS]-> {1,3} (m) RETURN n, m

ast = neo4j.parse("MATCH (n) WHERE n.score ^ 2 > 100 RETURN n")
print(gql.generate(ast[0]))
# MATCH (n) WHERE POWER(n.score, 2) > 100 RETURN n

Cypher-specific syntax is automatically converted to GQL equivalents: UNWIND becomes FOR...IN, variable-length paths [*1..3] become quantifiers {1,3}, ^ becomes POWER(), and more.

Data Lineage

Track how data flows through a query — which patterns introduce which variables, what each output depends on:

from graphglot.dialect import Dialect
from graphglot.lineage import LineageAnalyzer

query = "MATCH (n:Person)-[r:KNOWS]->(m:Person) WHERE n.age > 21 RETURN n.name AS person, m.name AS friend"

neo4j = Dialect.get_or_raise("neo4j")
ast = neo4j.parse(query)

analyzer = LineageAnalyzer()
result = analyzer.analyze(ast[0], query_text=query)

for b in result.bindings.values():
    print(f"{b.name}: {b.kind.value} label_expression={b.label_expression}")
# n: node label_expression=Person
# r: edge label_expression=KNOWS
# m: node label_expression=Person

for o in result.outputs.values():
    print(f"{o.alias}: {o.id}")
# person: o_0
# friend: o_1

Export lineage as JSON or upstream summary:

gg lineage "MATCH (n:Person)-[r:KNOWS]->(m) RETURN n.name" -o json
gg lineage "MATCH (n:Person)-[r:KNOWS]->(m) RETURN n.name" -o upstream

CLI

GraphGlot ships with the gg command-line tool:

# Parse and visualize the AST
gg tree "MATCH (n:Person)-[r:KNOWS]->(m) RETURN n.name, m.name"

# Validate against a dialect
gg validate --dialect neo4j "MATCH (n:Person) RETURN n.name"

# Tokenize
gg tokenize "MATCH (n:Person) RETURN n"

# Lineage analysis
gg lineage "MATCH (n:Person)-[r:KNOWS]->(m) RETURN n.name"

# Transpile between dialects
gg transpile -r neo4j -w fullgql "MATCH (n) WITH n.age AS age RETURN age"

# Parse and display the raw AST (JSON)
gg parse "MATCH (n) RETURN n"

# Infer types
gg type "MATCH (n:Person) RETURN n.name"

# List available dialects and features
gg dialects
gg features --dialect neo4j

Supported Dialects

Dialect Description
fullgql Full GQL — all extension and optional features enabled
coregql Core GQL — mandatory extension features only, no optional features
neo4j Neo4j Cypher 2025+ — GQL subset plus Cypher extensions

The dialect system is extensible. To add a new dialect (e.g., Memgraph, Amazon Neptune), subclass Dialect or CypherDialect and declare your supported features:

from graphglot.dialect.base import Dialect
from graphglot.features import ALL_FEATURES, G002

class MyDialect(Dialect):
    SUPPORTED_FEATURES = ALL_FEATURES - {G002}
    KEYWORD_OVERRIDES = {"OFFSET": "SKIP"}

Installation

pip install graphglot

For development:

pip install -e ".[dev]"

Documentation

Contributing

See CONTRIBUTING.md for guidelines. We use:

make test      # unit tests (pytest)
make pre       # linter/formatter (ruff + pre-commit)
make type      # type checking (mypy)
make neo4j     # integration tests (requires Neo4j)
make tck       # openCypher TCK conformance

Acknowledgments

GraphGlot is inspired by SQLGlot, the excellent SQL parser and transpiler. SQLGlot demonstrated that a pure-Python, dialect-aware parser with AST-based transpilation is a powerful and practical approach. GraphGlot applies the same philosophy to graph query languages.

License

Apache 2.0, see LICENSE.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

graphglot-0.8.2.tar.gz (685.9 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

graphglot-0.8.2-py3-none-any.whl (309.7 kB view details)

Uploaded Python 3

File details

Details for the file graphglot-0.8.2.tar.gz.

File metadata

  • Download URL: graphglot-0.8.2.tar.gz
  • Upload date:
  • Size: 685.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for graphglot-0.8.2.tar.gz
Algorithm Hash digest
SHA256 1a006566e2d71a6363c99b3d91916232c90c90e9d3fa6b9db69cf80219cc5ef1
MD5 0ae256bdbbb7c7aae93210c41889abb8
BLAKE2b-256 e238bf0c2e58c2210a2eedfd4b285cb29c12b95373c23d69074b848f15cad183

See more details on using hashes here.

Provenance

The following attestation bundles were made for graphglot-0.8.2.tar.gz:

Publisher: release.yml on averdeny/graphglot

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file graphglot-0.8.2-py3-none-any.whl.

File metadata

  • Download URL: graphglot-0.8.2-py3-none-any.whl
  • Upload date:
  • Size: 309.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for graphglot-0.8.2-py3-none-any.whl
Algorithm Hash digest
SHA256 870d9202bfaa52a3399a4c819b9f7ee603cf7c8a99b86e9234bf07b16a967ae7
MD5 fa7460944bd7df240cac88342cbe503c
BLAKE2b-256 9c7f8ea633a5174b92c5807968b91aadf77b3684036ba6b076d4ba35c9a07c2e

See more details on using hashes here.

Provenance

The following attestation bundles were made for graphglot-0.8.2-py3-none-any.whl:

Publisher: release.yml on averdeny/graphglot

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page