SQL extension for declarative data visualization

ggsql

Python bindings for ggsql, a SQL extension for declarative data visualization.

This package provides Python bindings to the Rust ggsql crate, enabling Python users to create visualizations using ggsql's VISUALISE syntax with native Altair chart output.

Installation

From PyPI

pip install ggsql

From source

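Building from source requires a Rust toolchain (the bindings compile the Rust ggsql crate), Python 3.10 or later, and maturin, which is installed in the steps below: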

# Clone the monorepo
git clone https://github.com/georgestagg/ggsql.git
cd ggsql/ggsql-python

# Create a virtual environment
python -m venv .venv
source .venv/bin/activate  # or `.venv\Scripts\activate` on Windows

# Install build dependencies
pip install maturin

# Build and install in development mode
maturin develop

# Or build a wheel
maturin build --release
pip install target/wheels/ggsql-*.whl
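
After either install path, it can help to confirm that the package is visible to the interpreter in the active virtual environment. The following is a minimal sketch using only the standard library; the helper name installed_version is hypothetical and is not part of ggsql.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str = "ggsql") -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version()
    if version is None:
        print("ggsql is not installed in this environment")
    else:
        print(f"ggsql {version} is installed")
```

If this reports that ggsql is missing right after maturin develop, the most likely cause is that the build ran in a different virtual environment from the one now active.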

Quick Start

Simple Usage with render_altair

For quick visualizations, use the render_altair convenience function:

import ggsql
import polars as pl

# Create a DataFrame
df = pl.DataFrame({
    "x": [1, 2, 3, 4, 5],
    "y": [10, 20, 15, 30, 25],
    "category": ["A", "B", "A", "B", "A"]
})

# Render to Altair chart
chart = ggsql.render_altair(df, "VISUALISE x, y DRAW point")

# Display or save
chart.display()  # In Jupyter
chart.save("chart.html")  # Save to file

Two-Stage API

For more control, use the two-stage API with explicit reader and writer:

import ggsql
import polars as pl

# 1. Create a DuckDB reader
reader = ggsql.DuckDBReader("duckdb://memory")

# 2. Register your DataFrame as a table
df = pl.DataFrame({
    "date": ["2024-01-01", "2024-01-02", "2024-01-03"],
    "revenue": [100, 150, 120],
    "region": ["North", "South", "North"]
})
reader.register("sales", df)

# 3. Execute the ggsql query
spec = reader.execute(
    """
    SELECT * FROM sales
    VISUALISE date AS x, revenue AS y, region AS color
    DRAW line
    LABEL title => 'Sales by Region'
    """
)

# 4. Inspect metadata
print(f"Rows: {spec.metadata()['rows']}")
print(f"Columns: {spec.metadata()['columns']}")
print(f"Layers: {spec.layer_count()}")

# 5. Inspect SQL/VISUALISE portions and data
print(f"SQL: {spec.sql()}")
print(f"Visual: {spec.visual()}")
print(spec.layer_data(0))  # polars DataFrame, or None if the layer has no layer-specific data

# 6. Render to Vega-Lite JSON
writer = ggsql.VegaLiteWriter()
vegalite_json = writer.render(spec)
print(vegalite_json)
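
Once the writer has produced its output, a common next step is to keep it on disk for another tool to pick up. The sketch below assumes that writer.render(spec) returns the Vega-Lite specification as a JSON string (the example above only prints it); save_vegalite is a hypothetical helper, not a ggsql function.

```python
import json
from pathlib import Path


def save_vegalite(vegalite_json: str, path: str) -> dict:
    """Check that the text is a JSON object, write it to `path`, and return it parsed."""
    parsed = json.loads(vegalite_json)  # raises json.JSONDecodeError on malformed text
    if not isinstance(parsed, dict):
        raise ValueError("expected a JSON object at the top level")
    Path(path).write_text(vegalite_json, encoding="utf-8")
    return parsed
```

With the objects from the example above, usage would look like save_vegalite(writer.render(spec), "chart.vl.json"). Parsing before writing means a truncated or malformed string fails loudly instead of leaving a broken file behind.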

API Reference

Classes

DuckDBReader(connection: str)

Database reader that executes SQL and manages DataFrames.

reader = ggsql.DuckDBReader("duckdb://memory")  # In-memory database
reader = ggsql.DuckDBReader("duckdb:///path/to/file.db")  # File database

Methods:

  • register(name: str, df: polars.DataFrame, replace: bool = False) - Register a DataFrame as a queryable table
  • unregister(name: str) - Unregister a previously registered table
  • execute_sql(sql: str) -> polars.DataFrame - Execute SQL and return results
  • execute(query: str) -> Spec - Execute a ggsql query and return the visualization specification

VegaLiteWriter()

Writer that generates Vega-Lite v6 JSON specifications.

writer = ggsql.VegaLiteWriter()
json_output = writer.render(spec)

Validated

Result of validate() containing query analysis without SQL execution.

Methods:

  • valid() -> bool - Whether the query is syntactically and semantically valid
  • has_visual() -> bool - Whether the query contains a VISUALISE clause
  • sql() -> str - The SQL portion (before VISUALISE)
  • visual() -> str - The VISUALISE portion
  • errors() -> list[dict] - Validation errors with messages and locations
  • warnings() -> list[dict] - Validation warnings

Spec

Result of reader.execute(), containing resolved visualization ready for rendering.

Methods:

  • metadata() -> dict - Get {"rows": int, "columns": list[str], "layer_count": int}
  • sql() -> str - The executed SQL query
  • visual() -> str - The VISUALISE clause
  • layer_count() -> int - Number of DRAW layers
  • data() -> polars.DataFrame | None - Main query result DataFrame
  • layer_data(index: int) -> polars.DataFrame | None - Layer-specific data (if filtered)
  • stat_data(index: int) -> polars.DataFrame | None - Statistical transform data
  • layer_sql(index: int) -> str | None - Layer filter SQL
  • stat_sql(index: int) -> str | None - Stat transform SQL
  • warnings() -> list[dict] - Validation warnings from execution
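
The per-layer accessors listed above can be combined into a quick overview of what each DRAW layer carries. This is a sketch only: summarize_layers is a hypothetical helper, and it relies on nothing beyond the method names and the None return values documented in the list.

```python
def summarize_layers(spec) -> list:
    """Collect one dict per layer from a Spec-like object."""
    summary = []
    for index in range(spec.layer_count()):
        summary.append({
            "layer": index,
            # The data accessors are documented as returning None when absent.
            "has_layer_data": spec.layer_data(index) is not None,
            "has_stat_data": spec.stat_data(index) is not None,
            "layer_sql": spec.layer_sql(index),
            "stat_sql": spec.stat_sql(index),
        })
    return summary
```

Calling summarize_layers(reader.execute(query)) and printing each entry is a compact way to see which layers were filtered or passed through a statistical transform.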

Functions

validate(query: str) -> Validated

Validate query syntax and semantics without executing SQL.

validated = ggsql.validate("SELECT x, y FROM data VISUALISE x, y DRAW point")
if validated.valid():
    print("Query is valid!")
else:
    for error in validated.errors():
        print(f"Error: {error['message']}")
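
Because validation does not touch the database, it can act as a cheap gate in front of execution. The sketch below wraps that pattern; execute_if_valid is a hypothetical helper, and rejecting queries that lack a VISUALISE clause is a choice made for this example, not documented ggsql behavior. The reader and the validate callable are passed in, so the helper itself imports nothing.

```python
def execute_if_valid(reader, query: str, validate):
    """Validate `query` first and call reader.execute(query) only if it passes.

    `reader` needs an execute(query) method; `validate` is a callable such as
    ggsql.validate that returns an object with valid(), has_visual() and errors().
    """
    validated = validate(query)
    if not validated.valid():
        messages = [error["message"] for error in validated.errors()]
        raise ValueError("invalid ggsql query: " + "; ".join(messages))
    if not validated.has_visual():
        raise ValueError("query has no VISUALISE clause")
    return reader.execute(query)
```

A call would look like execute_if_valid(reader, query, ggsql.validate), with reader being a DuckDBReader as in the examples in this document.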

reader.execute(query: str) -> Spec

Execute a ggsql query and return the visualization specification.

reader = ggsql.DuckDBReader("duckdb://memory")
spec = reader.execute("SELECT 1 AS x, 2 AS y VISUALISE x, y DRAW point")

render_altair(df, viz: str, **kwargs) -> altair.Chart

Convenience function to render a DataFrame with a VISUALISE spec to an Altair chart.

Parameters:

  • df - Any narwhals-compatible DataFrame (polars, pandas, etc.). LazyFrames are collected automatically.
  • viz - The VISUALISE specification string
  • **kwargs - Additional arguments passed to altair.Chart.from_json() (e.g., validate=False)

Returns: An Altair chart object (Chart, LayerChart, FacetChart, etc.)

import polars as pl
import ggsql

df = pl.DataFrame({"x": [1, 2, 3], "y": [10, 20, 30]})
chart = ggsql.render_altair(df, "VISUALISE x, y DRAW point")

Examples

Mapping Styles

import ggsql
import polars as pl

df = pl.DataFrame({"x": [1, 2, 3], "y": [10, 20, 30], "category": ["A", "B", "A"]})

# Explicit mapping
ggsql.render_altair(df, "VISUALISE x AS x, y AS y DRAW point")

# Implicit mapping (column name = aesthetic name)
ggsql.render_altair(df, "VISUALISE x, y DRAW point")

# Wildcard mapping (map all matching columns)
ggsql.render_altair(df, "VISUALISE * DRAW point")

# With color encoding
ggsql.render_altair(df, "VISUALISE x, y, category AS color DRAW point")
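
When mappings are chosen at runtime, the explicit and implicit forms shown above can be generated from a dictionary. This is a minimal sketch: build_visualise is a hypothetical helper, it covers only the "column AS aesthetic" and bare-column forms used in this document, and it does no quoting or escaping of column names.

```python
def build_visualise(mappings: dict, geom: str) -> str:
    """Build a VISUALISE clause from {aesthetic: column} pairs and a DRAW layer type."""
    if not mappings:
        raise ValueError("at least one mapping is required")
    parts = []
    for aesthetic, column in mappings.items():
        # Use the implicit form when the column already has the aesthetic's name.
        parts.append(column if column == aesthetic else f"{column} AS {aesthetic}")
    return f"VISUALISE {', '.join(parts)} DRAW {geom}"


print(build_visualise({"x": "x", "y": "y", "color": "category"}, "point"))
# prints: VISUALISE x, y, category AS color DRAW point
```

The resulting string can be passed as the viz argument of render_altair.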

Custom Readers

You can use any Python object with an execute_sql(sql: str) -> polars.DataFrame method as a reader. This enables integration with any data source.

import ggsql
import polars as pl

class CSVReader:
    """Custom reader that loads data from CSV files."""

    def __init__(self, data_dir: str):
        self.data_dir = data_dir

    def execute_sql(self, sql: str) -> pl.DataFrame:
        # Simple implementation: ignore SQL and return fixed data
        # A real implementation would parse SQL to determine which file to load
        return pl.read_csv(f"{self.data_dir}/data.csv")

# Use custom reader with ggsql.execute()
reader = CSVReader("/path/to/data")
spec = ggsql.execute(
    "SELECT * FROM data VISUALISE x, y DRAW point",
    reader
)
writer = ggsql.VegaLiteWriter()
json_output = writer.render(spec)

Additional methods for custom readers:

  • register(name: str, df: polars.DataFrame, replace: bool = False) -> None - Register a DataFrame as a queryable table (required)
  • unregister(name: str) -> None - Unregister a previously registered table (optional)

class AdvancedReader:
    """Custom reader with registration support."""

    def __init__(self):
        self.tables = {}

    def execute_sql(self, sql: str) -> pl.DataFrame:
        # Your SQL execution logic here
        ...

    def register(self, name: str, df: pl.DataFrame, replace: bool = False) -> None:
        self.tables[name] = df

    def unregister(self, name: str) -> None:
        del self.tables[name]

Native readers like DuckDBReader use an optimized fast path, while custom Python readers are automatically bridged via IPC serialization.
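
Since a custom reader is identified by its methods and not by a base class, a structural type can catch a missing execute_sql before the object is handed to ggsql. The names SupportsExecuteSql and check_reader below are hypothetical and not part of the ggsql API; this is a sketch using typing.Protocol from the standard library.

```python
from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class SupportsExecuteSql(Protocol):
    """Structural type for the one method every custom reader must provide."""

    def execute_sql(self, sql: str) -> Any:  # documented as returning polars.DataFrame
        ...


def check_reader(reader: object) -> None:
    """Raise TypeError early if `reader` has no execute_sql attribute."""
    if not isinstance(reader, SupportsExecuteSql):
        raise TypeError(f"{type(reader).__name__} has no execute_sql(sql) method")
```

Note that a runtime isinstance check against a protocol only confirms that the attribute exists; it does not verify the signature or the return type, so it complements type checking but does not replace it.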

Ibis Reader Example

Ibis provides a unified Python API for SQL operations across multiple backends. Here's how to create an ibis-based custom reader:

import ggsql
import polars as pl
import ibis

class IbisReader:
    """Custom reader using ibis as the SQL backend."""

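    def __init__(self, backend="duckdb"):
        if backend == "duckdb":
            self.con = ibis.duckdb.connect()
        elif backend == "sqlite":
            self.con = ibis.sqlite.connect()
        else:
            # Add other backends as needed
            raise ValueError(f"Unsupported backend: {backend!r}")

    def execute_sql(self, sql: str) -> pl.DataFrame:
        # self.con.con is the raw driver connection. The .pl() call is
        # DuckDB-specific, so other backends need their own conversion
        # to a polars DataFrame here.
        return self.con.con.execute(sql).pl()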

    def register(self, name: str, df: pl.DataFrame, replace: bool = False) -> None:
        self.con.create_table(name, df.to_arrow(), overwrite=replace)

    def unregister(self, name: str) -> None:
        self.con.drop_table(name)

# Usage
reader = IbisReader()
df = pl.DataFrame({
    "date": ["2024-01-01", "2024-01-02", "2024-01-03"],
    "revenue": [100, 150, 120],
})
reader.register("sales", df)

spec = ggsql.execute(
    "SELECT * FROM sales VISUALISE date AS x, revenue AS y DRAW line",
    reader
)
writer = ggsql.VegaLiteWriter()
print(writer.render(spec))

Development

Keeping in sync with the monorepo

The ggsql-python package is part of the ggsql monorepo and depends on the Rust ggsql crate via a path dependency. When the Rust crate is updated, you may need to rebuild:

cd ggsql-python

# Rebuild after Rust changes
maturin develop

# If tree-sitter grammar changed, clean and rebuild
cd .. && cargo clean -p tree-sitter-ggsql && cd ggsql-python
maturin develop

Running tests

# Install test dependencies
pip install pytest

# Run all tests
pytest tests/ -v

Requirements

  • Python >= 3.10
  • altair >= 5.0
  • narwhals >= 2.15
  • polars >= 1.0

License

MIT

Download files

Source Distribution

  • ggsql-0.2.4.tar.gz (881.5 kB) - Source

Built Distributions

  • ggsql-0.2.4-cp310-abi3-win_amd64.whl (31.0 MB) - CPython 3.10+, Windows x86-64
  • ggsql-0.2.4-cp310-abi3-manylinux_2_28_x86_64.whl (38.0 MB) - CPython 3.10+, manylinux: glibc 2.28+ x86-64
  • ggsql-0.2.4-cp310-abi3-manylinux_2_28_aarch64.whl (37.0 MB) - CPython 3.10+, manylinux: glibc 2.28+ ARM64
  • ggsql-0.2.4-cp310-abi3-macosx_11_0_arm64.whl (32.6 MB) - CPython 3.10+, macOS 11.0+ ARM64
  • ggsql-0.2.4-cp310-abi3-macosx_10_12_x86_64.whl (35.1 MB) - CPython 3.10+, macOS 10.12+ x86-64

File details

Details for the file ggsql-0.2.4.tar.gz.

File metadata

  • Download URL: ggsql-0.2.4.tar.gz
  • Upload date:
  • Size: 881.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for ggsql-0.2.4.tar.gz
Algorithm Hash digest
SHA256 c596af9e9baa3cefb7a9d5414327eadda1391ede7ac74463860498b4149e65c5
MD5 3e962ffa4b7608ec535de1db759a7418
BLAKE2b-256 d7c591e46571f3cec1a25f51fe2b9f40768f1789a1947ec030c2e9c43f7b5480
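
A downloaded file can be checked against the SHA256 digest listed for it. The following is a minimal sketch using hashlib from the standard library; sha256_matches is a hypothetical helper, and the local file path is whatever location the file was saved to.

```python
import hashlib


def sha256_matches(path: str, expected_hex: str, chunk_size: int = 1 << 20) -> bool:
    """Return True when the file's SHA256 digest equals `expected_hex`."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large wheels are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_hex.strip().lower()
```

For the source distribution, the call would be sha256_matches("ggsql-0.2.4.tar.gz", "c596af9e9baa3cefb7a9d5414327eadda1391ede7ac74463860498b4149e65c5"), assuming the archive sits in the current directory.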

See more details on using hashes here.

Provenance

The following attestation bundles were made for ggsql-0.2.4.tar.gz:

Publisher: release-python.yml on posit-dev/ggsql

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file ggsql-0.2.4-cp310-abi3-win_amd64.whl.

File metadata

  • Download URL: ggsql-0.2.4-cp310-abi3-win_amd64.whl
  • Upload date:
  • Size: 31.0 MB
  • Tags: CPython 3.10+, Windows x86-64
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for ggsql-0.2.4-cp310-abi3-win_amd64.whl
Algorithm Hash digest
SHA256 74b35b083d2d1dc941ac5387a03a4438a584e44f88e968a8ea9a138df9251333
MD5 3369a4e6f7c06b7e0a0850667d49890f
BLAKE2b-256 1cf6550f84b4a08fbaa63e1e27c99865cc92e5fa1bd91103f2aee5dff1ea6071

Provenance

The following attestation bundles were made for ggsql-0.2.4-cp310-abi3-win_amd64.whl:

Publisher: release-python.yml on posit-dev/ggsql

File details

Details for the file ggsql-0.2.4-cp310-abi3-manylinux_2_28_x86_64.whl.

File hashes

Hashes for ggsql-0.2.4-cp310-abi3-manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 321e7bbc85498b23a432cb3edc78fae379e13370a1c6855442ba9af3602fd2cb
MD5 44c6f8ae55a54d0c99a620864f09284c
BLAKE2b-256 19145919877ed1b628a53374b83d7a1d2a25ff92c63fe524928326558e29dc23

Provenance

The following attestation bundles were made for ggsql-0.2.4-cp310-abi3-manylinux_2_28_x86_64.whl:

Publisher: release-python.yml on posit-dev/ggsql

File details

Details for the file ggsql-0.2.4-cp310-abi3-manylinux_2_28_aarch64.whl.

File hashes

Hashes for ggsql-0.2.4-cp310-abi3-manylinux_2_28_aarch64.whl
Algorithm Hash digest
SHA256 159867d05e8ef5d45d624b8aa27a6688cd226f8b7a039c64abee448ef40220e7
MD5 43eeb4da24d3d3fe2be7422c6f5bb277
BLAKE2b-256 20cad7ffd245c35733cf3502c70c04887620bc50f4e5434f923d6a1dc9815f7c

Provenance

The following attestation bundles were made for ggsql-0.2.4-cp310-abi3-manylinux_2_28_aarch64.whl:

Publisher: release-python.yml on posit-dev/ggsql

File details

Details for the file ggsql-0.2.4-cp310-abi3-macosx_11_0_arm64.whl.

File metadata

  • Download URL: ggsql-0.2.4-cp310-abi3-macosx_11_0_arm64.whl
  • Upload date:
  • Size: 32.6 MB
  • Tags: CPython 3.10+, macOS 11.0+ ARM64
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for ggsql-0.2.4-cp310-abi3-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 2b3071cd19c1dab2473f324ef0260dd2e5709d4a1bdf2ea682105efa39bc96e9
MD5 734b8d799fd0450f41e66fe0992051ed
BLAKE2b-256 93b705a3ad9319241ad5a12af80f5a7c8a4ead630543b5a152064bcfd8056cf8

Provenance

The following attestation bundles were made for ggsql-0.2.4-cp310-abi3-macosx_11_0_arm64.whl:

Publisher: release-python.yml on posit-dev/ggsql

File details

Details for the file ggsql-0.2.4-cp310-abi3-macosx_10_12_x86_64.whl.

File hashes

Hashes for ggsql-0.2.4-cp310-abi3-macosx_10_12_x86_64.whl
Algorithm Hash digest
SHA256 1ec99748658718bedd7f1def3bcb18461ef33ed09682eea92ceb8f045eb83dfd
MD5 c5ad03e4d5da98b9e931bece1c0c6010
BLAKE2b-256 98df921996186604fd937346799222846f79d81f6dc272b5547b1873e8c6e035

Provenance

The following attestation bundles were made for ggsql-0.2.4-cp310-abi3-macosx_10_12_x86_64.whl:

Publisher: release-python.yml on posit-dev/ggsql
