Skip to main content

SQL extension for declarative data visualization

Project description

ggsql

Python bindings for ggsql, a SQL extension for declarative data visualization.

This package provides Python bindings to the Rust ggsql crate, enabling Python users to create visualizations using ggsql's VISUALISE syntax with native Altair chart output.

Installation

From PyPI (when published)

pip install ggsql

From source

Building from source requires:

# Clone the repository
git clone https://github.com/posit-dev/ggsql-python.git
cd ggsql-python

# Create a virtual environment
python -m venv .venv
source .venv/bin/activate  # or `.venv\Scripts\activate` on Windows

# Install build dependencies
pip install maturin

# Build and install in development mode
maturin develop

# Or build a wheel
maturin build --release
pip install target/wheels/ggsql-*.whl

Quick Start

Simple Usage with render_altair

For quick visualizations, use the render_altair convenience function. It accepts any narwhals-compatible DataFrame (polars, pandas, pyarrow, etc.):

import ggsql
import pyarrow as pa

# Create a table
table = pa.table({
    "x": [1, 2, 3, 4, 5],
    "y": [10, 20, 15, 30, 25],
    "category": ["A", "B", "A", "B", "A"]
})

# Render to Altair chart
chart = ggsql.render_altair(table, "VISUALISE x, y DRAW point")

# Display or save
chart.display()  # In Jupyter
chart.save("chart.html")  # Save to file

Two-Stage API

For more control, use the two-stage API with explicit reader and writer:

import ggsql
import pyarrow as pa

# 1. Create a DuckDB reader
reader = ggsql.DuckDBReader("duckdb://memory")

# 2. Register your data as a table (accepts pyarrow, polars, pandas, etc.)
table = pa.table({
    "date": ["2024-01-01", "2024-01-02", "2024-01-03"],
    "revenue": [100, 150, 120],
    "region": ["North", "South", "North"]
})
reader.register("sales", table)

# 3. Execute the ggsql query
spec = reader.execute(
    """
    SELECT * FROM sales
    VISUALISE date AS x, revenue AS y, region AS color
    DRAW line
    LABEL title => 'Sales by Region'
    """
)

# 4. Inspect metadata
print(f"Rows: {spec.metadata()['rows']}")
print(f"Columns: {spec.metadata()['columns']}")
print(f"Layers: {spec.layer_count()}")

# 5. Inspect SQL/VISUALISE portions and data
print(f"SQL: {spec.sql()}")
print(f"Visual: {spec.visual()}")
print(spec.layer_data(0))  # Returns pyarrow.Table

# 6. Render to Vega-Lite JSON
writer = ggsql.VegaLiteWriter()
vegalite_json = writer.render(spec)
print(vegalite_json)

API Reference

Classes

DuckDBReader(connection: str)

Database reader that executes SQL and manages DataFrames.

reader = ggsql.DuckDBReader("duckdb://memory")  # In-memory database
reader = ggsql.DuckDBReader("duckdb:///path/to/file.db")  # File database

Methods:

  • register(name: str, table, replace: bool = False) - Register data as a queryable table (accepts pyarrow.Table, polars.DataFrame, pandas.DataFrame, etc.)
  • unregister(name: str) - Unregister a previously registered table
  • execute_sql(sql: str) -> pyarrow.Table - Execute SQL and return results

VegaLiteWriter()

Writer that generates Vega-Lite v6 JSON specifications.

writer = ggsql.VegaLiteWriter()
json_output = writer.render(spec)

Validated

Result of validate() containing query analysis without SQL execution.

Methods:

  • valid() -> bool - Whether the query is syntactically and semantically valid
  • has_visual() -> bool - Whether the query contains a VISUALISE clause
  • sql() -> str - The SQL portion (before VISUALISE)
  • visual() -> str - The VISUALISE portion
  • errors() -> list[dict] - Validation errors with messages and locations
  • warnings() -> list[dict] - Validation warnings

Spec

Result of reader.execute(), containing resolved visualization ready for rendering.

Methods:

  • metadata() -> dict - Get {"rows": int, "columns": list[str], "layer_count": int}
  • sql() -> str - The executed SQL query
  • visual() -> str - The VISUALISE clause
  • layer_count() -> int - Number of DRAW layers
  • data() -> pyarrow.Table | None - Main query result data
  • layer_data(index: int) -> pyarrow.Table | None - Layer-specific data (if filtered)
  • stat_data(index: int) -> pyarrow.Table | None - Statistical transform data
  • layer_sql(index: int) -> str | None - Layer filter SQL
  • stat_sql(index: int) -> str | None - Stat transform SQL
  • warnings() -> list[dict] - Validation warnings from execution

Functions

validate(query: str) -> Validated

Validate query syntax and semantics without executing SQL.

validated = ggsql.validate("SELECT x, y FROM data VISUALISE x, y DRAW point")
if validated.valid():
    print("Query is valid!")
else:
    for error in validated.errors():
        print(f"Error: {error['message']}")

reader.execute(query: str) -> Spec

Execute a ggsql query and return the visualization specification.

reader = ggsql.DuckDBReader("duckdb://memory")
spec = reader.execute("SELECT 1 AS x, 2 AS y VISUALISE x, y DRAW point")

render_altair(df, viz: str, **kwargs) -> altair.Chart

Convenience function to render a DataFrame with a VISUALISE spec to an Altair chart.

Parameters:

  • df - Any narwhals-compatible DataFrame (polars, pandas, etc.). LazyFrames are collected automatically.
  • viz - The VISUALISE specification string
  • **kwargs - Additional arguments passed to altair.Chart.from_json() (e.g., validate=False)

Returns: An Altair chart object (Chart, LayerChart, FacetChart, etc.)

import pyarrow as pa
import ggsql

table = pa.table({"x": [1, 2, 3], "y": [10, 20, 30]})
chart = ggsql.render_altair(table, "VISUALISE x, y DRAW point")

Examples

Mapping Styles

import pyarrow as pa

table = pa.table({"x": [1, 2, 3], "y": [10, 20, 30], "category": ["A", "B", "A"]})

# Explicit mapping
ggsql.render_altair(table, "VISUALISE x AS x, y AS y DRAW point")

# Implicit mapping (column name = aesthetic name)
ggsql.render_altair(table, "VISUALISE x, y DRAW point")

# Wildcard mapping (map all matching columns)
ggsql.render_altair(table, "VISUALISE * DRAW point")

# With color encoding
ggsql.render_altair(table, "VISUALISE x, y, category AS color DRAW point")

Custom Readers

You can use any Python object with an execute_sql(sql: str) method as a reader. The method should return a pyarrow.Table (or any type that pyarrow.table() can convert, such as a polars.DataFrame).

import ggsql
import pyarrow as pa
import pyarrow.csv

class CSVReader:
    """Custom reader that loads data from CSV files."""

    def __init__(self, data_dir: str):
        self.data_dir = data_dir

    def execute_sql(self, sql: str) -> pa.Table:
        # Simple implementation: ignore SQL and return fixed data
        # A real implementation would parse SQL to determine which file to load
        return pyarrow.csv.read_csv(f"{self.data_dir}/data.csv")

# Use custom reader with ggsql.execute()
reader = CSVReader("/path/to/data")
spec = ggsql.execute(
    "SELECT * FROM data VISUALISE x, y DRAW point",
    reader
)
writer = ggsql.VegaLiteWriter()
json_output = writer.render(spec)

Additional methods for custom readers:

  • register(name: str, table, replace: bool = False) -> None - Register data as a queryable table (required). Receives a pyarrow.Table.
  • unregister(name: str) -> None - Unregister a previously registered table (optional)
class AdvancedReader:
    """Custom reader with registration support."""

    def __init__(self):
        self.tables = {}

    def execute_sql(self, sql: str) -> pa.Table:
        # Your SQL execution logic here
        ...

    def register(self, name: str, table: pa.Table, replace: bool = False) -> None:
        self.tables[name] = table

    def unregister(self, name: str) -> None:
        del self.tables[name]

Native readers like DuckDBReader use an optimized fast path, while custom Python readers are automatically bridged via IPC serialization.

Ibis Reader Example

Ibis provides a unified Python API for SQL operations across multiple backends. Here's how to create an ibis-based custom reader:

import ggsql
import pyarrow as pa
import ibis

class IbisReader:
    """Custom reader using ibis as the SQL backend."""

    def __init__(self, backend="duckdb"):
        if backend == "duckdb":
            self.con = ibis.duckdb.connect()
        elif backend == "sqlite":
            self.con = ibis.sqlite.connect()
        # Add other backends as needed

    def execute_sql(self, sql: str) -> pa.Table:
        return self.con.con.execute(sql).arrow()

    def register(self, name: str, table: pa.Table, replace: bool = False) -> None:
        self.con.create_table(name, table, overwrite=replace)

    def unregister(self, name: str) -> None:
        self.con.drop_table(name)

# Usage
reader = IbisReader()
table = pa.table({
    "date": ["2024-01-01", "2024-01-02", "2024-01-03"],
    "revenue": [100, 150, 120],
})
reader.register("sales", table)

spec = ggsql.execute(
    "SELECT * FROM sales VISUALISE date AS x, revenue AS y DRAW line",
    reader
)
writer = ggsql.VegaLiteWriter()
print(writer.render(spec))

Development

Building

# Rebuild after Rust changes
maturin develop

To pick up a new version of the upstream ggsql Rust crate, bump its version in Cargo.toml and re-run maturin develop.

Running tests

# Install test dependencies
pip install pytest

# Run all tests
pytest tests/ -v

Requirements

  • Python >= 3.10
  • altair >= 5.0
  • narwhals >= 2.15
  • pyarrow >= 14.0

License

MIT

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

ggsql-0.3.1.tar.gz (195.9 kB view details)

Uploaded Source

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.

ggsql-0.3.1-cp310-abi3-win_amd64.whl (13.2 MB view details)

Uploaded CPython 3.10+Windows x86-64

ggsql-0.3.1-cp310-abi3-manylinux_2_28_x86_64.whl (18.8 MB view details)

Uploaded CPython 3.10+manylinux: glibc 2.28+ x86-64

ggsql-0.3.1-cp310-abi3-manylinux_2_28_aarch64.whl (17.1 MB view details)

Uploaded CPython 3.10+manylinux: glibc 2.28+ ARM64

ggsql-0.3.1-cp310-abi3-macosx_11_0_arm64.whl (14.8 MB view details)

Uploaded CPython 3.10+macOS 11.0+ ARM64

ggsql-0.3.1-cp310-abi3-macosx_10_12_x86_64.whl (16.3 MB view details)

Uploaded CPython 3.10+macOS 10.12+ x86-64

File details

Details for the file ggsql-0.3.1.tar.gz.

File metadata

  • Download URL: ggsql-0.3.1.tar.gz
  • Upload date:
  • Size: 195.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for ggsql-0.3.1.tar.gz
Algorithm Hash digest
SHA256 951a76505c32d64f819b8e85b8bfb806a1c4aafd1d7c80a3289e7622c2c837c3
MD5 84ae95072a18269d97b3a3c7ab8de088
BLAKE2b-256 b59f8a9862957dae8dcdc75700f1899f08a86e522d4e0f0b97c657fbced9080f

See more details on using hashes here.

Provenance

The following attestation bundles were made for ggsql-0.3.1.tar.gz:

Publisher: release-python.yml on posit-dev/ggsql-python

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file ggsql-0.3.1-cp310-abi3-win_amd64.whl.

File metadata

  • Download URL: ggsql-0.3.1-cp310-abi3-win_amd64.whl
  • Upload date:
  • Size: 13.2 MB
  • Tags: CPython 3.10+, Windows x86-64
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for ggsql-0.3.1-cp310-abi3-win_amd64.whl
Algorithm Hash digest
SHA256 d3bad31593516ee55cf38c797f4f9871117313a931b7c514f780ae38ec1ed6fe
MD5 9547fc4b3e4d2c45707f4992897f01d2
BLAKE2b-256 9224c876b5d5a17b73122625f709969c152cab15c6071c38e63bcf0c85e70050

See more details on using hashes here.

Provenance

The following attestation bundles were made for ggsql-0.3.1-cp310-abi3-win_amd64.whl:

Publisher: release-python.yml on posit-dev/ggsql-python

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file ggsql-0.3.1-cp310-abi3-manylinux_2_28_x86_64.whl.

File metadata

File hashes

Hashes for ggsql-0.3.1-cp310-abi3-manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 5e5bb33674f9959800419520cf382b8d430eb05fd4da8ecdff50193cf8cde0f7
MD5 85005fbb40a41d7f25e7b30291880313
BLAKE2b-256 974cb19def855227175f68cf7e6e59c182959a04a4df5302e949e9746d27e93f

See more details on using hashes here.

Provenance

The following attestation bundles were made for ggsql-0.3.1-cp310-abi3-manylinux_2_28_x86_64.whl:

Publisher: release-python.yml on posit-dev/ggsql-python

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file ggsql-0.3.1-cp310-abi3-manylinux_2_28_aarch64.whl.

File metadata

File hashes

Hashes for ggsql-0.3.1-cp310-abi3-manylinux_2_28_aarch64.whl
Algorithm Hash digest
SHA256 0bf165657b314817e97c7c4b6941f02144709c6530eda66333767add8e150e25
MD5 42fbeed118866b89dcfdf79a6b4b2c49
BLAKE2b-256 d58073434c880affba2849cdef1cb222c24b2d8ca650ebcfded55b6c9e35e2f1

See more details on using hashes here.

Provenance

The following attestation bundles were made for ggsql-0.3.1-cp310-abi3-manylinux_2_28_aarch64.whl:

Publisher: release-python.yml on posit-dev/ggsql-python

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file ggsql-0.3.1-cp310-abi3-macosx_11_0_arm64.whl.

File metadata

  • Download URL: ggsql-0.3.1-cp310-abi3-macosx_11_0_arm64.whl
  • Upload date:
  • Size: 14.8 MB
  • Tags: CPython 3.10+, macOS 11.0+ ARM64
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for ggsql-0.3.1-cp310-abi3-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 6bcb9ace4b668f14c6d572f37ca92401c2c7d49dc3d208fdc9a61d82733373cf
MD5 ae12bbf6cd4db7c6e6879b786d828b0e
BLAKE2b-256 1346ac10ae14bc38237ef80193f68b6285536e02945a2c1882422c248ed03006

See more details on using hashes here.

Provenance

The following attestation bundles were made for ggsql-0.3.1-cp310-abi3-macosx_11_0_arm64.whl:

Publisher: release-python.yml on posit-dev/ggsql-python

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file ggsql-0.3.1-cp310-abi3-macosx_10_12_x86_64.whl.

File metadata

File hashes

Hashes for ggsql-0.3.1-cp310-abi3-macosx_10_12_x86_64.whl
Algorithm Hash digest
SHA256 de1bafe2b612c6b4545784c5e5b9dc626062ddd7d27d1a28d25f35dc14eab458
MD5 5b1805342f02c1a3e3662f2e7d6470b7
BLAKE2b-256 5c9ca76aef3118c9775d0fd8da043a5112803cc011fcd73a27f4285fc91b993b

See more details on using hashes here.

Provenance

The following attestation bundles were made for ggsql-0.3.1-cp310-abi3-macosx_10_12_x86_64.whl:

Publisher: release-python.yml on posit-dev/ggsql-python

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page