SQL extension for declarative data visualization
Project description
ggsql
Python bindings for ggsql, a SQL extension for declarative data visualization.
This package provides Python bindings to the Rust ggsql crate, enabling Python users to create visualizations using ggsql's VISUALISE syntax with native Altair chart output.
Installation
From PyPI (when published)
pip install ggsql
From source
Building from source requires:
# Clone the monorepo
git clone https://github.com/georgestagg/ggsql.git
cd ggsql/ggsql-python
# Create a virtual environment
python -m venv .venv
source .venv/bin/activate # or `.venv\Scripts\activate` on Windows
# Install build dependencies
pip install maturin
# Build and install in development mode
maturin develop
# Or build a wheel
maturin build --release
pip install target/wheels/ggsql-*.whl
Quick Start
Simple Usage with render_altair
For quick visualizations, use the render_altair convenience function:
import ggsql
import polars as pl
# Create a DataFrame
df = pl.DataFrame({
"x": [1, 2, 3, 4, 5],
"y": [10, 20, 15, 30, 25],
"category": ["A", "B", "A", "B", "A"]
})
# Render to Altair chart
chart = ggsql.render_altair(df, "VISUALISE x, y DRAW point")
# Display or save
chart.display() # In Jupyter
chart.save("chart.html") # Save to file
Two-Stage API
For more control, use the two-stage API with explicit reader and writer:
import ggsql
import polars as pl
# 1. Create a DuckDB reader
reader = ggsql.DuckDBReader("duckdb://memory")
# 2. Register your DataFrame as a table
df = pl.DataFrame({
"date": ["2024-01-01", "2024-01-02", "2024-01-03"],
"revenue": [100, 150, 120],
"region": ["North", "South", "North"]
})
reader.register("sales", df)
# 3. Execute the ggsql query
spec = reader.execute(
"""
SELECT * FROM sales
VISUALISE date AS x, revenue AS y, region AS color
DRAW line
LABEL title => 'Sales by Region'
"""
)
# 4. Inspect metadata
print(f"Rows: {spec.metadata()['rows']}")
print(f"Columns: {spec.metadata()['columns']}")
print(f"Layers: {spec.layer_count()}")
# 5. Inspect SQL/VISUALISE portions and data
print(f"SQL: {spec.sql()}")
print(f"Visual: {spec.visual()}")
print(spec.layer_data(0)) # Returns polars DataFrame
# 6. Render to Vega-Lite JSON
writer = ggsql.VegaLiteWriter()
vegalite_json = writer.render(spec)
print(vegalite_json)
API Reference
Classes
DuckDBReader(connection: str)
Database reader that executes SQL and manages DataFrames.
reader = ggsql.DuckDBReader("duckdb://memory") # In-memory database
reader = ggsql.DuckDBReader("duckdb:///path/to/file.db") # File database
Methods:
register(name: str, df: polars.DataFrame, replace: bool = False)- Register a DataFrame as a queryable tableunregister(name: str)- Unregister a previously registered tableexecute_sql(sql: str) -> polars.DataFrame- Execute SQL and return results
VegaLiteWriter()
Writer that generates Vega-Lite v6 JSON specifications.
writer = ggsql.VegaLiteWriter()
json_output = writer.render(spec)
Validated
Result of validate() containing query analysis without SQL execution.
Methods:
valid() -> bool- Whether the query is syntactically and semantically validhas_visual() -> bool- Whether the query contains a VISUALISE clausesql() -> str- The SQL portion (before VISUALISE)visual() -> str- The VISUALISE portionerrors() -> list[dict]- Validation errors with messages and locationswarnings() -> list[dict]- Validation warnings
Spec
Result of reader.execute(), containing resolved visualization ready for rendering.
Methods:
metadata() -> dict- Get{"rows": int, "columns": list[str], "layer_count": int}sql() -> str- The executed SQL queryvisual() -> str- The VISUALISE clauselayer_count() -> int- Number of DRAW layersdata() -> polars.DataFrame | None- Main query result DataFramelayer_data(index: int) -> polars.DataFrame | None- Layer-specific data (if filtered)stat_data(index: int) -> polars.DataFrame | None- Statistical transform datalayer_sql(index: int) -> str | None- Layer filter SQLstat_sql(index: int) -> str | None- Stat transform SQLwarnings() -> list[dict]- Validation warnings from execution
Functions
validate(query: str) -> Validated
Validate query syntax and semantics without executing SQL.
validated = ggsql.validate("SELECT x, y FROM data VISUALISE x, y DRAW point")
if validated.valid():
print("Query is valid!")
else:
for error in validated.errors():
print(f"Error: {error['message']}")
reader.execute(query: str) -> Spec
Execute a ggsql query and return the visualization specification.
reader = ggsql.DuckDBReader("duckdb://memory")
spec = reader.execute("SELECT 1 AS x, 2 AS y VISUALISE x, y DRAW point")
render_altair(df, viz: str, **kwargs) -> altair.Chart
Convenience function to render a DataFrame with a VISUALISE spec to an Altair chart.
Parameters:
df- Any narwhals-compatible DataFrame (polars, pandas, etc.). LazyFrames are collected automatically.viz- The VISUALISE specification string**kwargs- Additional arguments passed toaltair.Chart.from_json()(e.g.,validate=False)
Returns: An Altair chart object (Chart, LayerChart, FacetChart, etc.)
import polars as pl
import ggsql
df = pl.DataFrame({"x": [1, 2, 3], "y": [10, 20, 30]})
chart = ggsql.render_altair(df, "VISUALISE x, y DRAW point")
Examples
Mapping Styles
df = pl.DataFrame({"x": [1, 2, 3], "y": [10, 20, 30], "category": ["A", "B", "A"]})
# Explicit mapping
ggsql.render_altair(df, "VISUALISE x AS x, y AS y DRAW point")
# Implicit mapping (column name = aesthetic name)
ggsql.render_altair(df, "VISUALISE x, y DRAW point")
# Wildcard mapping (map all matching columns)
ggsql.render_altair(df, "VISUALISE * DRAW point")
# With color encoding
ggsql.render_altair(df, "VISUALISE x, y, category AS color DRAW point")
Custom Readers
You can use any Python object with an execute_sql(sql: str) -> polars.DataFrame method as a reader. This enables integration with any data source.
import ggsql
import polars as pl
class CSVReader:
"""Custom reader that loads data from CSV files."""
def __init__(self, data_dir: str):
self.data_dir = data_dir
def execute_sql(self, sql: str) -> pl.DataFrame:
# Simple implementation: ignore SQL and return fixed data
# A real implementation would parse SQL to determine which file to load
return pl.read_csv(f"{self.data_dir}/data.csv")
# Use custom reader with ggsql.execute()
reader = CSVReader("/path/to/data")
spec = ggsql.execute(
"SELECT * FROM data VISUALISE x, y DRAW point",
reader
)
writer = ggsql.VegaLiteWriter()
json_output = writer.render(spec)
Additional methods for custom readers:
register(name: str, df: polars.DataFrame, replace: bool = False) -> None- Register a DataFrame as a queryable table (required)unregister(name: str) -> None- Unregister a previously registered table (optional)
class AdvancedReader:
"""Custom reader with registration support."""
def __init__(self):
self.tables = {}
def execute_sql(self, sql: str) -> pl.DataFrame:
# Your SQL execution logic here
...
def register(self, name: str, df: pl.DataFrame, replace: bool = False) -> None:
self.tables[name] = df
def unregister(self, name: str) -> None:
del self.tables[name]
Native readers like DuckDBReader use an optimized fast path, while custom Python readers are automatically bridged via IPC serialization.
Ibis Reader Example
Ibis provides a unified Python API for SQL operations across multiple backends. Here's how to create an ibis-based custom reader:
import ggsql
import polars as pl
import ibis
class IbisReader:
"""Custom reader using ibis as the SQL backend."""
def __init__(self, backend="duckdb"):
if backend == "duckdb":
self.con = ibis.duckdb.connect()
elif backend == "sqlite":
self.con = ibis.sqlite.connect()
# Add other backends as needed
def execute_sql(self, sql: str) -> pl.DataFrame:
return self.con.con.execute(sql).pl()
def register(self, name: str, df: pl.DataFrame, replace: bool = False) -> None:
self.con.create_table(name, df.to_arrow(), overwrite=replace)
def unregister(self, name: str) -> None:
self.con.drop_table(name)
# Usage
reader = IbisReader()
df = pl.DataFrame({
"date": ["2024-01-01", "2024-01-02", "2024-01-03"],
"revenue": [100, 150, 120],
})
reader.register("sales", df)
spec = ggsql.execute(
"SELECT * FROM sales VISUALISE date AS x, revenue AS y DRAW line",
reader
)
writer = ggsql.VegaLiteWriter()
print(writer.render(spec))
Development
Keeping in sync with the monorepo
The ggsql-python package is part of the ggsql monorepo and depends on the Rust ggsql crate via a path dependency. When the Rust crate is updated, you may need to rebuild:
cd ggsql-python
# Rebuild after Rust changes
maturin develop
# If tree-sitter grammar changed, clean and rebuild
cd .. && cargo clean -p tree-sitter-ggsql && cd ggsql-python
maturin develop
Running tests
# Install test dependencies
pip install pytest
# Run all tests
pytest tests/ -v
Requirements
- Python >= 3.10
- altair >= 5.0
- narwhals >= 2.15
- polars >= 1.0
License
MIT
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distributions
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file ggsql-0.2.3.tar.gz.
File metadata
- Download URL: ggsql-0.2.3.tar.gz
- Upload date:
- Size: 881.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
f78e874e9908b7f224382e796da68c10a5ea09d519ed2d323256fb600b5f4587
|
|
| MD5 |
d9415524aaeb73b270f42efd7f22024d
|
|
| BLAKE2b-256 |
6056815b88957e49a7a971bd274225a98cecf95e5aa3199e56d2b50c26ae2960
|
Provenance
The following attestation bundles were made for ggsql-0.2.3.tar.gz:
Publisher:
release-python.yml on posit-dev/ggsql
-
Statement:
-
Statement type:
https://in-toto.io/Statement/v1 -
Predicate type:
https://docs.pypi.org/attestations/publish/v1 -
Subject name:
ggsql-0.2.3.tar.gz -
Subject digest:
f78e874e9908b7f224382e796da68c10a5ea09d519ed2d323256fb600b5f4587 - Sigstore transparency entry: 1329757750
- Sigstore integration time:
-
Permalink:
posit-dev/ggsql@5015d5217f40525f8a5c8877728f99788baff4ca -
Branch / Tag:
refs/tags/py/v0.2.3 - Owner: https://github.com/posit-dev
-
Access:
public
-
Token Issuer:
https://token.actions.githubusercontent.com -
Runner Environment:
github-hosted -
Publication workflow:
release-python.yml@5015d5217f40525f8a5c8877728f99788baff4ca -
Trigger Event:
push
-
Statement type:
File details
Details for the file ggsql-0.2.3-cp310-abi3-win_amd64.whl.
File metadata
- Download URL: ggsql-0.2.3-cp310-abi3-win_amd64.whl
- Upload date:
- Size: 31.0 MB
- Tags: CPython 3.10+, Windows x86-64
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
b1b911ef81ef79358f9f8e248aedf176bc14a60d324befec4ffd6775f4404b97
|
|
| MD5 |
da6ea5b5ab3332a87fda066f7dbb8d4f
|
|
| BLAKE2b-256 |
5ad95c3ff40869a27ff747d9df2b3f5fe10ffc4b896c151b54105e903c892a57
|
Provenance
The following attestation bundles were made for ggsql-0.2.3-cp310-abi3-win_amd64.whl:
Publisher:
release-python.yml on posit-dev/ggsql
-
Statement:
-
Statement type:
https://in-toto.io/Statement/v1 -
Predicate type:
https://docs.pypi.org/attestations/publish/v1 -
Subject name:
ggsql-0.2.3-cp310-abi3-win_amd64.whl -
Subject digest:
b1b911ef81ef79358f9f8e248aedf176bc14a60d324befec4ffd6775f4404b97 - Sigstore transparency entry: 1329759395
- Sigstore integration time:
-
Permalink:
posit-dev/ggsql@5015d5217f40525f8a5c8877728f99788baff4ca -
Branch / Tag:
refs/tags/py/v0.2.3 - Owner: https://github.com/posit-dev
-
Access:
public
-
Token Issuer:
https://token.actions.githubusercontent.com -
Runner Environment:
github-hosted -
Publication workflow:
release-python.yml@5015d5217f40525f8a5c8877728f99788baff4ca -
Trigger Event:
push
-
Statement type:
File details
Details for the file ggsql-0.2.3-cp310-abi3-manylinux_2_28_x86_64.whl.
File metadata
- Download URL: ggsql-0.2.3-cp310-abi3-manylinux_2_28_x86_64.whl
- Upload date:
- Size: 38.0 MB
- Tags: CPython 3.10+, manylinux: glibc 2.28+ x86-64
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
cd70195b1014472397d5f3b52b07856d7ef4a25a6a37ef51d6224d41ea8e6056
|
|
| MD5 |
7451ead19bccf43f58a321eaa4ced2f5
|
|
| BLAKE2b-256 |
8319a5c90ee56d1bc868bac9082990ae8177b783f07cab7cfc68400f7334144e
|
Provenance
The following attestation bundles were made for ggsql-0.2.3-cp310-abi3-manylinux_2_28_x86_64.whl:
Publisher:
release-python.yml on posit-dev/ggsql
-
Statement:
-
Statement type:
https://in-toto.io/Statement/v1 -
Predicate type:
https://docs.pypi.org/attestations/publish/v1 -
Subject name:
ggsql-0.2.3-cp310-abi3-manylinux_2_28_x86_64.whl -
Subject digest:
cd70195b1014472397d5f3b52b07856d7ef4a25a6a37ef51d6224d41ea8e6056 - Sigstore transparency entry: 1329758360
- Sigstore integration time:
-
Permalink:
posit-dev/ggsql@5015d5217f40525f8a5c8877728f99788baff4ca -
Branch / Tag:
refs/tags/py/v0.2.3 - Owner: https://github.com/posit-dev
-
Access:
public
-
Token Issuer:
https://token.actions.githubusercontent.com -
Runner Environment:
github-hosted -
Publication workflow:
release-python.yml@5015d5217f40525f8a5c8877728f99788baff4ca -
Trigger Event:
push
-
Statement type:
File details
Details for the file ggsql-0.2.3-cp310-abi3-manylinux_2_28_aarch64.whl.
File metadata
- Download URL: ggsql-0.2.3-cp310-abi3-manylinux_2_28_aarch64.whl
- Upload date:
- Size: 37.0 MB
- Tags: CPython 3.10+, manylinux: glibc 2.28+ ARM64
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
018a967e02e145f46a117ef974e86478af77c86bc7f47333435f650ebdc1640e
|
|
| MD5 |
a50f569628ed5dacc35c5ab86a1d03a1
|
|
| BLAKE2b-256 |
46fab495981317a4746d40db83add841ffc248f329d949a865fcac057be78a78
|
Provenance
The following attestation bundles were made for ggsql-0.2.3-cp310-abi3-manylinux_2_28_aarch64.whl:
Publisher:
release-python.yml on posit-dev/ggsql
-
Statement:
-
Statement type:
https://in-toto.io/Statement/v1 -
Predicate type:
https://docs.pypi.org/attestations/publish/v1 -
Subject name:
ggsql-0.2.3-cp310-abi3-manylinux_2_28_aarch64.whl -
Subject digest:
018a967e02e145f46a117ef974e86478af77c86bc7f47333435f650ebdc1640e - Sigstore transparency entry: 1329758964
- Sigstore integration time:
-
Permalink:
posit-dev/ggsql@5015d5217f40525f8a5c8877728f99788baff4ca -
Branch / Tag:
refs/tags/py/v0.2.3 - Owner: https://github.com/posit-dev
-
Access:
public
-
Token Issuer:
https://token.actions.githubusercontent.com -
Runner Environment:
github-hosted -
Publication workflow:
release-python.yml@5015d5217f40525f8a5c8877728f99788baff4ca -
Trigger Event:
push
-
Statement type:
File details
Details for the file ggsql-0.2.3-cp310-abi3-macosx_11_0_arm64.whl.
File metadata
- Download URL: ggsql-0.2.3-cp310-abi3-macosx_11_0_arm64.whl
- Upload date:
- Size: 32.6 MB
- Tags: CPython 3.10+, macOS 11.0+ ARM64
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
736f9fad3489f0b4803863114e8628d47c49afb3fe3348f070d89ea89b182e13
|
|
| MD5 |
1b7033801f51524a0585ce41e79b79bf
|
|
| BLAKE2b-256 |
8b71bc85ee28baa9f62e7da4dba115ab0f96b75ba8c6ccb5c509ef90caae8d6b
|
Provenance
The following attestation bundles were made for ggsql-0.2.3-cp310-abi3-macosx_11_0_arm64.whl:
Publisher:
release-python.yml on posit-dev/ggsql
-
Statement:
-
Statement type:
https://in-toto.io/Statement/v1 -
Predicate type:
https://docs.pypi.org/attestations/publish/v1 -
Subject name:
ggsql-0.2.3-cp310-abi3-macosx_11_0_arm64.whl -
Subject digest:
736f9fad3489f0b4803863114e8628d47c49afb3fe3348f070d89ea89b182e13 - Sigstore transparency entry: 1329760099
- Sigstore integration time:
-
Permalink:
posit-dev/ggsql@5015d5217f40525f8a5c8877728f99788baff4ca -
Branch / Tag:
refs/tags/py/v0.2.3 - Owner: https://github.com/posit-dev
-
Access:
public
-
Token Issuer:
https://token.actions.githubusercontent.com -
Runner Environment:
github-hosted -
Publication workflow:
release-python.yml@5015d5217f40525f8a5c8877728f99788baff4ca -
Trigger Event:
push
-
Statement type:
File details
Details for the file ggsql-0.2.3-cp310-abi3-macosx_10_12_x86_64.whl.
File metadata
- Download URL: ggsql-0.2.3-cp310-abi3-macosx_10_12_x86_64.whl
- Upload date:
- Size: 35.1 MB
- Tags: CPython 3.10+, macOS 10.12+ x86-64
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
8c656032647488283bb1017715a0a36f19530442d29891f86b3c32f0ed3df9f9
|
|
| MD5 |
87d14b6f9d4bea7abf8143812e7fc590
|
|
| BLAKE2b-256 |
6b450b66b2de4912c237d9342e36db5f7a336bd0a6418fce8364ad3d433b95bd
|
Provenance
The following attestation bundles were made for ggsql-0.2.3-cp310-abi3-macosx_10_12_x86_64.whl:
Publisher:
release-python.yml on posit-dev/ggsql
-
Statement:
-
Statement type:
https://in-toto.io/Statement/v1 -
Predicate type:
https://docs.pypi.org/attestations/publish/v1 -
Subject name:
ggsql-0.2.3-cp310-abi3-macosx_10_12_x86_64.whl -
Subject digest:
8c656032647488283bb1017715a0a36f19530442d29891f86b3c32f0ed3df9f9 - Sigstore transparency entry: 1329759705
- Sigstore integration time:
-
Permalink:
posit-dev/ggsql@5015d5217f40525f8a5c8877728f99788baff4ca -
Branch / Tag:
refs/tags/py/v0.2.3 - Owner: https://github.com/posit-dev
-
Access:
public
-
Token Issuer:
https://token.actions.githubusercontent.com -
Runner Environment:
github-hosted -
Publication workflow:
release-python.yml@5015d5217f40525f8a5c8877728f99788baff4ca -
Trigger Event:
push
-
Statement type: