Skip to main content

Python port of Google's ZetaSQL - SQL analysis and parsing library

Project description

ZetaSQL Python

Python 3.10+ License: Apache 2.0

Python port of Google's ZetaSQL - a powerful SQL analysis and parsing library.

Overview

ZetaSQL Python brings Google's SQL analyzer to the Python ecosystem, providing:

  • SQL Analysis: Parse and analyze SQL statements with full type checking
  • Name Resolution: Resolve table and column references against catalogs
  • Query Building: Construct and manipulate SQL AST programmatically
  • Expression Evaluation: Execute queries and expressions with parameter binding
  • Java-like API: Familiar builder patterns and fluent interfaces

This project is built on top of zetasql-wasi, which provides the WebAssembly build of ZetaSQL, enabling ZetaSQL functionality in Python environments.

Installation

pip install zetasql

Requirements: Python 3.10+

Quick Start

from zetasql.api import Analyzer, CatalogBuilder, TableBuilder
from zetasql.types import AnalyzerOptions, LanguageOptions, TypeKind, ZetaSQLBuiltinFunctionOptions

# Create a catalog with a table
table = (
    TableBuilder("users")
    .add_column("id", TypeKind.TYPE_INT64)
    .add_column("name", TypeKind.TYPE_STRING)
    .add_column("email", TypeKind.TYPE_STRING)
    .build()
)

# Build catalog with builtin functions
lang_opts = LanguageOptions.maximum_features()
builtin_opts = ZetaSQLBuiltinFunctionOptions(language_options=lang_opts)

catalog = (
    CatalogBuilder("mydb")
    .add_table(table)
    .with_builtin_functions(builtin_opts)
    .build()
)

# Analyze SQL
options = AnalyzerOptions(language_options=lang_opts)
analyzer = Analyzer(options, catalog)
stmt = analyzer.analyze_statement("SELECT name, email FROM users WHERE id > 100")

print(f"Analyzed {len(stmt.output_column_list)} output columns")
# Output: Analyzed 2 output columns

Key Features

🔍 SQL Parsing & Analysis

Parse SQL syntax or perform full semantic analysis:

from zetasql.api import Parser, Analyzer

# Parser: Syntax-only parsing (fast, no catalog needed)
ast_stmt = Parser.parse_statement_static("SELECT * FROM users")

# Analyzer: Full semantic analysis with type checking
resolved_stmt = analyzer.analyze_statement("SELECT * FROM users")

🌲 AST Traversal

Visit and inspect SQL parse trees with the Visitor pattern:

from zetasql.api import Parser, ASTNodeVisitor

class TableNameCollector(ASTNodeVisitor):
    def __init__(self):
        super().__init__()
        self.tables = []
    
    def visit_ASTPathExpression(self, node):
        if node.names:
            self.tables.append(".".join([n.identifier for n in node.names]))

stmt = Parser.parse_statement_static("SELECT * FROM users JOIN orders")
visitor = TableNameCollector()
visitor.visit(stmt)
print(visitor.tables)  # ['users', 'orders']

🏗️ Builder Pattern APIs

Fluent interfaces for constructing catalogs, tables, and functions:

from zetasql.api import CatalogBuilder
from zetasql.types import LanguageOptions, ZetaSQLBuiltinFunctionOptions

lang_opts = LanguageOptions.maximum_features()
builtin_opts = ZetaSQLBuiltinFunctionOptions(language_options=lang_opts)

catalog = (
    CatalogBuilder("shop")
    .add_table(orders_table)
    .add_table(products_table)
    .with_builtin_functions(builtin_opts)
    .build()
)

Query Execution

Execute queries with parameter binding and table data:

from zetasql.api import PreparedQuery, create_table_content

data = create_table_content([[1, "Alice"], [2, "Bob"]])
query = PreparedQuery("SELECT * FROM users WHERE id = @user_id", options, catalog)
result = query.execute(parameters={"user_id": 1}, table_content={"users": data})

🎯 Type-Safe Values

Create and manipulate typed SQL values:

from zetasql.api import Value

int_val = Value.int64(42)
str_val = Value.string("hello")
array_val = Value.array([Value.int64(1), Value.int64(2)])

📦 ProtoModel System

Pythonic wrappers around protobuf messages with real inheritance:

from zetasql.types import ResolvedLiteral

literal = ResolvedLiteral(...)
type_kind = literal.type.type_kind  # Direct access, no parent chain
isinstance(literal, ResolvedExpr)   # True - real inheritance!

Documentation

Development

# Clone the repository
git clone https://github.com/heoh/zetasql-py.git
cd zetasql-py

# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install in development mode
pip install -e ".[dev]"

# Run tests
pytest

# Run linter
ruff check src/ tests/

See CONTRIBUTING.md for detailed contribution guidelines.

Project Status

This project is in alpha stage. The API is functional but may change as we refine the design. Feedback and contributions are welcome!

License

Apache License 2.0 - see LICENSE file for details.

This is an unofficial Python port of Google's ZetaSQL (also Apache 2.0 licensed) and is not affiliated with Google.

Acknowledgments

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

zetasql-0.1.5.tar.gz (7.6 MB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

zetasql-0.1.5-py3-none-any.whl (7.8 MB view details)

Uploaded Python 3

File details

Details for the file zetasql-0.1.5.tar.gz.

File metadata

  • Download URL: zetasql-0.1.5.tar.gz
  • Upload date:
  • Size: 7.6 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for zetasql-0.1.5.tar.gz
Algorithm Hash digest
SHA256 c61ebe1f42744047305fdebd6293aa4f842e2c1cc26e4d1b9b022cf67c9f3003
MD5 946183ec87eddf4fe022e65952c2ef4b
BLAKE2b-256 73d561547dbfa3eb5373396142e59837ecfc0e2d80f0dc64c6c33b058299daf8

See more details on using hashes here.

File details

Details for the file zetasql-0.1.5-py3-none-any.whl.

File metadata

  • Download URL: zetasql-0.1.5-py3-none-any.whl
  • Upload date:
  • Size: 7.8 MB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for zetasql-0.1.5-py3-none-any.whl
Algorithm Hash digest
SHA256 17fa5e5aa927fbb647c1812b08ee8ef8c5d36c8e2e83eaa4651d9bc4a8a462f8
MD5 259fd8c186cefc252e41c5a4b9583078
BLAKE2b-256 21ed79975a3ebd7142266e9ac94bb77380e81b0e5e8db1e53d55a1380e57d1bb

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page