Skip to main content

SQL extension for declarative data visualization

Project description

ggsql

Python bindings for ggsql, a SQL extension for declarative data visualization.

This package provides Python bindings to the Rust ggsql crate, enabling Python users to create visualizations using ggsql's VISUALISE syntax with native Altair chart output.

Installation

From PyPI (when published)

pip install ggsql

From source

Building from source requires:

# Clone the repository
git clone https://github.com/posit-dev/ggsql-python.git
cd ggsql-python

# Create a virtual environment
python -m venv .venv
source .venv/bin/activate  # or `.venv\Scripts\activate` on Windows

# Install build dependencies
pip install maturin

# Build and install in development mode
maturin develop

# Or build a wheel
maturin build --release
pip install target/wheels/ggsql-*.whl

Quick Start

Simple Usage with render_altair

For quick visualizations, use the render_altair convenience function. It accepts any narwhals-compatible DataFrame (polars, pandas, pyarrow, etc.):

import ggsql
import pyarrow as pa

# Create a table
table = pa.table({
    "x": [1, 2, 3, 4, 5],
    "y": [10, 20, 15, 30, 25],
    "category": ["A", "B", "A", "B", "A"]
})

# Render to Altair chart
chart = ggsql.render_altair(table, "VISUALISE x, y DRAW point")

# Display or save
chart.display()  # In Jupyter
chart.save("chart.html")  # Save to file

Two-Stage API

For more control, use the two-stage API with explicit reader and writer:

import ggsql
import pyarrow as pa

# 1. Create a DuckDB reader
reader = ggsql.DuckDBReader("duckdb://memory")

# 2. Register your data as a table (accepts pyarrow, polars, pandas, etc.)
table = pa.table({
    "date": ["2024-01-01", "2024-01-02", "2024-01-03"],
    "revenue": [100, 150, 120],
    "region": ["North", "South", "North"]
})
reader.register("sales", table)

# 3. Execute the ggsql query
spec = reader.execute(
    """
    SELECT * FROM sales
    VISUALISE date AS x, revenue AS y, region AS color
    DRAW line
    LABEL title => 'Sales by Region'
    """
)

# 4. Inspect metadata
print(f"Rows: {spec.metadata()['rows']}")
print(f"Columns: {spec.metadata()['columns']}")
print(f"Layers: {spec.layer_count()}")

# 5. Inspect SQL/VISUALISE portions and data
print(f"SQL: {spec.sql()}")
print(f"Visual: {spec.visual()}")
print(spec.layer_data(0))  # Returns pyarrow.Table

# 6. Render to Vega-Lite JSON
writer = ggsql.VegaLiteWriter()
vegalite_json = writer.render(spec)
print(vegalite_json)

API Reference

Classes

DuckDBReader(connection: str)

Database reader that executes SQL and manages DataFrames.

reader = ggsql.DuckDBReader("duckdb://memory")  # In-memory database
reader = ggsql.DuckDBReader("duckdb:///path/to/file.db")  # File database

Methods:

  • register(name: str, table, replace: bool = False) - Register data as a queryable table (accepts pyarrow.Table, polars.DataFrame, pandas.DataFrame, etc.)
  • unregister(name: str) - Unregister a previously registered table
  • execute_sql(sql: str) -> pyarrow.Table - Execute SQL and return results

VegaLiteWriter()

Writer that generates Vega-Lite v6 JSON specifications.

writer = ggsql.VegaLiteWriter()
json_output = writer.render(spec)

Validated

Result of validate() containing query analysis without SQL execution.

Methods:

  • valid() -> bool - Whether the query is syntactically and semantically valid
  • has_visual() -> bool - Whether the query contains a VISUALISE clause
  • sql() -> str - The SQL portion (before VISUALISE)
  • visual() -> str - The VISUALISE portion
  • errors() -> list[dict] - Validation errors with messages and locations
  • warnings() -> list[dict] - Validation warnings

Spec

Result of reader.execute(), containing resolved visualization ready for rendering.

Methods:

  • metadata() -> dict - Get {"rows": int, "columns": list[str], "layer_count": int}
  • sql() -> str - The executed SQL query
  • visual() -> str - The VISUALISE clause
  • layer_count() -> int - Number of DRAW layers
  • data() -> pyarrow.Table | None - Main query result data
  • layer_data(index: int) -> pyarrow.Table | None - Layer-specific data (if filtered)
  • stat_data(index: int) -> pyarrow.Table | None - Statistical transform data
  • layer_sql(index: int) -> str | None - Layer filter SQL
  • stat_sql(index: int) -> str | None - Stat transform SQL
  • warnings() -> list[dict] - Validation warnings from execution

Functions

validate(query: str) -> Validated

Validate query syntax and semantics without executing SQL.

validated = ggsql.validate("SELECT x, y FROM data VISUALISE x, y DRAW point")
if validated.valid():
    print("Query is valid!")
else:
    for error in validated.errors():
        print(f"Error: {error['message']}")

reader.execute(query: str) -> Spec

Execute a ggsql query and return the visualization specification.

reader = ggsql.DuckDBReader("duckdb://memory")
spec = reader.execute("SELECT 1 AS x, 2 AS y VISUALISE x, y DRAW point")

render_altair(df, viz: str, **kwargs) -> altair.Chart

Convenience function to render a DataFrame with a VISUALISE spec to an Altair chart.

Parameters:

  • df - Any narwhals-compatible DataFrame (polars, pandas, etc.). LazyFrames are collected automatically.
  • viz - The VISUALISE specification string
  • **kwargs - Additional arguments passed to altair.Chart.from_json() (e.g., validate=False)

Returns: An Altair chart object (Chart, LayerChart, FacetChart, etc.)

import pyarrow as pa
import ggsql

table = pa.table({"x": [1, 2, 3], "y": [10, 20, 30]})
chart = ggsql.render_altair(table, "VISUALISE x, y DRAW point")

Examples

Mapping Styles

import pyarrow as pa

table = pa.table({"x": [1, 2, 3], "y": [10, 20, 30], "category": ["A", "B", "A"]})

# Explicit mapping
ggsql.render_altair(table, "VISUALISE x AS x, y AS y DRAW point")

# Implicit mapping (column name = aesthetic name)
ggsql.render_altair(table, "VISUALISE x, y DRAW point")

# Wildcard mapping (map all matching columns)
ggsql.render_altair(table, "VISUALISE * DRAW point")

# With color encoding
ggsql.render_altair(table, "VISUALISE x, y, category AS color DRAW point")

Custom Readers

You can use any Python object with an execute_sql(sql: str) method as a reader. The method should return a pyarrow.Table (or any type that pyarrow.table() can convert, such as a polars.DataFrame).

import ggsql
import pyarrow as pa
import pyarrow.csv

class CSVReader:
    """Custom reader that loads data from CSV files."""

    def __init__(self, data_dir: str):
        self.data_dir = data_dir

    def execute_sql(self, sql: str) -> pa.Table:
        # Simple implementation: ignore SQL and return fixed data
        # A real implementation would parse SQL to determine which file to load
        return pyarrow.csv.read_csv(f"{self.data_dir}/data.csv")

# Use custom reader with ggsql.execute()
reader = CSVReader("/path/to/data")
spec = ggsql.execute(
    "SELECT * FROM data VISUALISE x, y DRAW point",
    reader
)
writer = ggsql.VegaLiteWriter()
json_output = writer.render(spec)

Additional methods for custom readers:

  • register(name: str, table, replace: bool = False) -> None - Register data as a queryable table (required). Receives a pyarrow.Table.
  • unregister(name: str) -> None - Unregister a previously registered table (optional)
class AdvancedReader:
    """Custom reader with registration support."""

    def __init__(self):
        self.tables = {}

    def execute_sql(self, sql: str) -> pa.Table:
        # Your SQL execution logic here
        ...

    def register(self, name: str, table: pa.Table, replace: bool = False) -> None:
        self.tables[name] = table

    def unregister(self, name: str) -> None:
        del self.tables[name]

Native readers like DuckDBReader use an optimized fast path, while custom Python readers are automatically bridged via IPC serialization.

Ibis Reader Example

Ibis provides a unified Python API for SQL operations across multiple backends. Here's how to create an ibis-based custom reader:

import ggsql
import pyarrow as pa
import ibis

class IbisReader:
    """Custom reader using ibis as the SQL backend."""

    def __init__(self, backend="duckdb"):
        if backend == "duckdb":
            self.con = ibis.duckdb.connect()
        elif backend == "sqlite":
            self.con = ibis.sqlite.connect()
        # Add other backends as needed

    def execute_sql(self, sql: str) -> pa.Table:
        return self.con.con.execute(sql).arrow()

    def register(self, name: str, table: pa.Table, replace: bool = False) -> None:
        self.con.create_table(name, table, overwrite=replace)

    def unregister(self, name: str) -> None:
        self.con.drop_table(name)

# Usage
reader = IbisReader()
table = pa.table({
    "date": ["2024-01-01", "2024-01-02", "2024-01-03"],
    "revenue": [100, 150, 120],
})
reader.register("sales", table)

spec = ggsql.execute(
    "SELECT * FROM sales VISUALISE date AS x, revenue AS y DRAW line",
    reader
)
writer = ggsql.VegaLiteWriter()
print(writer.render(spec))

Development

Building

# Rebuild after Rust changes
maturin develop

To pick up a new version of the upstream ggsql Rust crate, bump its version in Cargo.toml and re-run maturin develop.

Running tests

# Install test dependencies
pip install pytest

# Run all tests
pytest tests/ -v

Requirements

  • Python >= 3.10
  • altair >= 5.0
  • narwhals >= 2.15
  • pyarrow >= 14.0

License

MIT

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

ggsql-0.3.2.tar.gz (195.8 kB view details)

Uploaded Source

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.

ggsql-0.3.2-cp310-abi3-win_amd64.whl (13.2 MB view details)

Uploaded CPython 3.10+Windows x86-64

ggsql-0.3.2-cp310-abi3-manylinux_2_28_x86_64.whl (18.9 MB view details)

Uploaded CPython 3.10+manylinux: glibc 2.28+ x86-64

ggsql-0.3.2-cp310-abi3-manylinux_2_28_aarch64.whl (17.2 MB view details)

Uploaded CPython 3.10+manylinux: glibc 2.28+ ARM64

ggsql-0.3.2-cp310-abi3-macosx_11_0_arm64.whl (14.8 MB view details)

Uploaded CPython 3.10+macOS 11.0+ ARM64

ggsql-0.3.2-cp310-abi3-macosx_10_12_x86_64.whl (16.3 MB view details)

Uploaded CPython 3.10+macOS 10.12+ x86-64

File details

Details for the file ggsql-0.3.2.tar.gz.

File metadata

  • Download URL: ggsql-0.3.2.tar.gz
  • Upload date:
  • Size: 195.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for ggsql-0.3.2.tar.gz
Algorithm Hash digest
SHA256 15cde053cbfdea255a4049beda7b5533f7d7e347ea4d87d465115df26ad4c257
MD5 a5a0767e42d5d100a9abeb436b0ab76e
BLAKE2b-256 d0d1b5ac0f09a3b11d451f89f8aa363f9f262cddb4ef3c1c9ea2e506e196d91c

See more details on using hashes here.

Provenance

The following attestation bundles were made for ggsql-0.3.2.tar.gz:

Publisher: release-python.yml on posit-dev/ggsql-python

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file ggsql-0.3.2-cp310-abi3-win_amd64.whl.

File metadata

  • Download URL: ggsql-0.3.2-cp310-abi3-win_amd64.whl
  • Upload date:
  • Size: 13.2 MB
  • Tags: CPython 3.10+, Windows x86-64
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for ggsql-0.3.2-cp310-abi3-win_amd64.whl
Algorithm Hash digest
SHA256 ae8b9a571016a1a40c6a25e00f099f7cb1894198a189598e29602a2c85f239ad
MD5 f9936392b0d8eb2ad584d0479466724a
BLAKE2b-256 464e5c8745147f942a16457dbbd3bfa8e730137f6f777ad496ccedb413af3cca

See more details on using hashes here.

Provenance

The following attestation bundles were made for ggsql-0.3.2-cp310-abi3-win_amd64.whl:

Publisher: release-python.yml on posit-dev/ggsql-python

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file ggsql-0.3.2-cp310-abi3-manylinux_2_28_x86_64.whl.

File metadata

File hashes

Hashes for ggsql-0.3.2-cp310-abi3-manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 5aad697663de008cc37b738f06f6cca1804a647b254b15e7b2fe1065095c458f
MD5 1cdb5bbd49b2246ac3ba5ed270dac63a
BLAKE2b-256 575da9f1a07990ece5ae6d955821e59c600dce5bccdf2e70d047a1011ddeb03a

See more details on using hashes here.

Provenance

The following attestation bundles were made for ggsql-0.3.2-cp310-abi3-manylinux_2_28_x86_64.whl:

Publisher: release-python.yml on posit-dev/ggsql-python

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file ggsql-0.3.2-cp310-abi3-manylinux_2_28_aarch64.whl.

File metadata

File hashes

Hashes for ggsql-0.3.2-cp310-abi3-manylinux_2_28_aarch64.whl
Algorithm Hash digest
SHA256 f15a9ed55d0496c86f10095d204aac6df92cb62a0262554dc724fd29b1008b86
MD5 bc3e7d99aa921a02b3468a3e129cee84
BLAKE2b-256 8076185f36671fcebcb9faf7fef0dcb7812e2ad9b4086197ae2723c5a4097ad1

See more details on using hashes here.

Provenance

The following attestation bundles were made for ggsql-0.3.2-cp310-abi3-manylinux_2_28_aarch64.whl:

Publisher: release-python.yml on posit-dev/ggsql-python

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file ggsql-0.3.2-cp310-abi3-macosx_11_0_arm64.whl.

File metadata

  • Download URL: ggsql-0.3.2-cp310-abi3-macosx_11_0_arm64.whl
  • Upload date:
  • Size: 14.8 MB
  • Tags: CPython 3.10+, macOS 11.0+ ARM64
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for ggsql-0.3.2-cp310-abi3-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 4a77e7bb552144a716349677469b4a6c518ca20bd39425fb5d2b2f7147ffe4e6
MD5 229d94169dfe941795b4f0ba4e02e237
BLAKE2b-256 3d8d2666e2e766dbb62b272700cd973a2872e36a78f7b8ddf47b2a5e5af07ae4

See more details on using hashes here.

Provenance

The following attestation bundles were made for ggsql-0.3.2-cp310-abi3-macosx_11_0_arm64.whl:

Publisher: release-python.yml on posit-dev/ggsql-python

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file ggsql-0.3.2-cp310-abi3-macosx_10_12_x86_64.whl.

File metadata

File hashes

Hashes for ggsql-0.3.2-cp310-abi3-macosx_10_12_x86_64.whl
Algorithm Hash digest
SHA256 8a536b07238c6c47e13995f557ab49fd3653e1dfe1bdc8a355ca9c2f76886b92
MD5 fbef9c04451f8e269c515a0a5710658c
BLAKE2b-256 9743c241464a4df2a5126b8f39715cb11a6e7ba58c524413cf85ea5b405da1eb

See more details on using hashes here.

Provenance

The following attestation bundles were made for ggsql-0.3.2-cp310-abi3-macosx_10_12_x86_64.whl:

Publisher: release-python.yml on posit-dev/ggsql-python

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page