Python port of Google's ZetaSQL - SQL analysis and parsing library
Project description
ZetaSQL Python
Python port of Google's ZetaSQL - a powerful SQL analysis and parsing library.
Overview
ZetaSQL Python brings Google's SQL analyzer to the Python ecosystem, providing:
- SQL Analysis: Parse and analyze SQL statements with full type checking
- Name Resolution: Resolve table and column references against catalogs
- Query Building: Construct and manipulate SQL AST programmatically
- Expression Evaluation: Execute queries and expressions with parameter binding
- Java-like API: Familiar builder patterns and fluent interfaces
This project is built on top of zetasql-wasi, which provides the WebAssembly build of ZetaSQL, enabling ZetaSQL functionality in Python environments.
Installation
pip install zetasql
Requirements: Python 3.10+
Quick Start
from zetasql.api import Analyzer, CatalogBuilder, TableBuilder
from zetasql.types import AnalyzerOptions, LanguageOptions, TypeKind, ZetaSQLBuiltinFunctionOptions
# Create a catalog with a table
table = (
TableBuilder("users")
.add_column("id", TypeKind.TYPE_INT64)
.add_column("name", TypeKind.TYPE_STRING)
.add_column("email", TypeKind.TYPE_STRING)
.build()
)
# Build catalog with builtin functions
lang_opts = LanguageOptions.maximum_features()
builtin_opts = ZetaSQLBuiltinFunctionOptions(language_options=lang_opts)
catalog = (
CatalogBuilder("mydb")
.add_table(table)
.with_builtin_functions(builtin_opts)
.build()
)
# Analyze SQL
options = AnalyzerOptions(language_options=lang_opts)
analyzer = Analyzer(options, catalog)
stmt = analyzer.analyze_statement("SELECT name, email FROM users WHERE id > 100")
print(f"Analyzed {len(stmt.output_column_list)} output columns")
# Output: Analyzed 2 output columns
Key Features
🔍 SQL Parsing & Analysis
Parse SQL syntax or perform full semantic analysis:
from zetasql.api import Parser, Analyzer
# Parser: Syntax-only parsing (fast, no catalog needed)
ast_stmt = Parser.parse_statement_static("SELECT * FROM users")
# Analyzer: Full semantic analysis with type checking
resolved_stmt = analyzer.analyze_statement("SELECT * FROM users")
🌲 AST Traversal
Visit and inspect SQL parse trees with the Visitor pattern:
from zetasql.api import Parser, ASTNodeVisitor
class TableNameCollector(ASTNodeVisitor):
def __init__(self):
super().__init__()
self.tables = []
def visit_ASTPathExpression(self, node):
if node.names:
self.tables.append(".".join([n.identifier for n in node.names]))
stmt = Parser.parse_statement_static("SELECT * FROM users JOIN orders")
visitor = TableNameCollector()
visitor.visit(stmt)
print(visitor.tables) # ['users', 'orders']
🏗️ Builder Pattern APIs
Fluent interfaces for constructing catalogs, tables, and functions:
from zetasql.api import CatalogBuilder
from zetasql.types import LanguageOptions, ZetaSQLBuiltinFunctionOptions
lang_opts = LanguageOptions.maximum_features()
builtin_opts = ZetaSQLBuiltinFunctionOptions(language_options=lang_opts)
catalog = (
CatalogBuilder("shop")
.add_table(orders_table)
.add_table(products_table)
.with_builtin_functions(builtin_opts)
.build()
)
⚡ Query Execution
Execute queries with parameter binding and table data:
from zetasql.api import PreparedQuery, create_table_content
data = create_table_content([[1, "Alice"], [2, "Bob"]])
query = PreparedQuery("SELECT * FROM users WHERE id = @user_id", options, catalog)
result = query.execute(parameters={"user_id": 1}, table_content={"users": data})
🎯 Type-Safe Values
Create and manipulate typed SQL values:
from zetasql.api import Value
int_val = Value.int64(42)
str_val = Value.string("hello")
array_val = Value.array([Value.int64(1), Value.int64(2)])
📦 ProtoModel System
Pythonic wrappers around protobuf messages with real inheritance:
from zetasql.types import ResolvedLiteral
literal = ResolvedLiteral(...)
type_kind = literal.type.type_kind # Direct access, no parent chain
isinstance(literal, ResolvedExpr) # True - real inheritance!
Documentation
- Getting Started Guide - Detailed tutorials and examples
- API Reference - Complete API documentation
- ProtoModel Inheritance Hierarchy - The inheritance structure of the ProtoModel classes
- Architecture - Project structure and design
Development
# Clone the repository
git clone https://github.com/heoh/zetasql-py.git
cd zetasql-py
# Create virtual environment
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install in development mode
pip install -e ".[dev]"
# Run tests
pytest
# Run linter
ruff check src/ tests/
See CONTRIBUTING.md for detailed contribution guidelines.
Project Status
This project is in alpha stage. The API is functional but may change as we refine the design. Feedback and contributions are welcome!
License
Apache License 2.0 - see LICENSE file for details.
This is an unofficial Python port of Google's ZetaSQL (also Apache 2.0 licensed) and is not affiliated with Google.
Acknowledgments
- Original ZetaSQL project by Google
- Built on zetasql-wasi - WebAssembly build of ZetaSQL
- Inspired by the Java ZetaSQL API design
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file zetasql-0.1.5.tar.gz.
File metadata
- Download URL: zetasql-0.1.5.tar.gz
- Upload date:
- Size: 7.6 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
c61ebe1f42744047305fdebd6293aa4f842e2c1cc26e4d1b9b022cf67c9f3003
|
|
| MD5 |
946183ec87eddf4fe022e65952c2ef4b
|
|
| BLAKE2b-256 |
73d561547dbfa3eb5373396142e59837ecfc0e2d80f0dc64c6c33b058299daf8
|
File details
Details for the file zetasql-0.1.5-py3-none-any.whl.
File metadata
- Download URL: zetasql-0.1.5-py3-none-any.whl
- Upload date:
- Size: 7.8 MB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
17fa5e5aa927fbb647c1812b08ee8ef8c5d36c8e2e83eaa4651d9bc4a8a462f8
|
|
| MD5 |
259fd8c186cefc252e41c5a4b9583078
|
|
| BLAKE2b-256 |
21ed79975a3ebd7142266e9ac94bb77380e81b0e5e8db1e53d55a1380e57d1bb
|