Ibis backend for Hotdata federated SQL API (depends on the hotdata SDK only; not hotdata-runtime)
hotdata-ibis
Experimental Ibis backend for Hotdata: compile expressions with Ibis, run federated SQL over the Hotdata API. REST calls use the official hotdata Python SDK. Repo examples use httpx (listed under the dev dependency group).
Requirements: Python 3.10+, ibis-framework 10.x, hotdata ≥0.2.
Install
uv pip install hotdata-ibis
# or: python -m pip install hotdata-ibis
Features
- Ibis connection API: connect with ibis.hotdata.connect(...) or ibis.connect("hotdata://...").
- Hotdata catalog mapping: expose Hotdata connections, schemas, and tables through Ibis catalogs, databases, and tables.
- SQL-backed expression execution: compile Ibis expressions with the Postgres SQLGlot compiler and execute them through Hotdata query APIs.
- Typed table discovery: load schema metadata from Hotdata information schema and map SQL types into Ibis types.
- Arrow and pandas results: materialize expressions as pandas DataFrames, PyArrow tables, or local Arrow record batches.
- Raw SQL escape hatch: use con.sql(..., dialect="postgres") when Hotdata-specific federated SQL is clearer than modeled Ibis expressions.
- Managed database writes: create managed connections with create_database, load local pandas or PyArrow data through create_table, and clean up with drop_table / drop_database.
Connect
Programmatic API:
import ibis

con = ibis.hotdata.connect(
    api_url="https://api.hotdata.dev",
    token="YOUR_API_TOKEN",
    workspace_id="ws_…",
    session_id=None,  # optional: X-Session-Id (sandbox)
    verify_ssl=True,
    timeout=120.0,
    default_connection=None,  # Hotdata connection id → Ibis catalog
    default_schema=None,  # remote schema → Ibis database
    poll_interval_s=0.25,
    poll_timeout_s=600.0,
)
URL style (token may live in the query string or the URL “password” segment):
con = ibis.connect(
"hotdata://api.hotdata.dev/?token=…&workspace_id=ws_…&verify_ssl=true"
)
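Tokens can contain characters such as & or / that are not safe in a query string. The sketch below is one way to assemble the URL with percent-encoding using only the standard library. The helper name build_hotdata_url and the ws_example workspace id are hypothetical, and the "false" spelling for verify_ssl is an assumption; only the token, workspace_id, and verify_ssl parameter names come from the URL shown above.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_hotdata_url(host: str, token: str, workspace_id: str, verify_ssl: bool = True) -> str:
    """Build a hotdata:// URL with the token in the query string (hypothetical helper)."""
    query = urlencode(
        {
            "token": token,
            "workspace_id": workspace_id,
            # "true" matches the example above; "false" is an assumed spelling.
            "verify_ssl": "true" if verify_ssl else "false",
        }
    )
    return f"hotdata://{host}/?{query}"


# Placeholder values; the token deliberately contains reserved characters.
url = build_hotdata_url("api.hotdata.dev", "s3cr&t/token", "ws_example")
print(url)

# Round trip: the reserved characters survive percent-encoding.
assert parse_qs(urlsplit(url).query)["token"] == ["s3cr&t/token"]
```

The resulting string can then be passed to ibis.connect(...) in place of a hand-written URL.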
Mapping: Ibis catalog = Hotdata connection id; database = remote schema; table = table name. SQL references look like connection.schema.table. With a single connection and schema, defaults are inferred; otherwise set default_connection / default_schema or qualify con.table(..., database=(conn_id, schema)).
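The defaulting rule can be pictured with a small standalone function. This is an illustration of the rule as described, not the backend's actual code; the qualify function and the shape of its connections argument are hypothetical.

```python
def qualify(table, connections, connection=None, schema=None):
    """Return a connection.schema.table reference (illustrative only).

    connections maps each Hotdata connection id to its list of remote schemas.
    """
    if connection is None:
        if len(connections) != 1:
            raise ValueError("ambiguous connection: set default_connection or qualify the table")
        (connection,) = connections  # the single connection id
    schemas = connections[connection]
    if schema is None:
        if len(schemas) != 1:
            raise ValueError("ambiguous schema: set default_schema or qualify the table")
        (schema,) = schemas  # the single schema name
    return f"{connection}.{schema}.{table}"


# One connection with one schema: both parts are inferred.
single = qualify("customer", {"tpch": ["tpch_sf1"]})
print(single)

# Several schemas: the schema has to be named explicitly.
explicit = qualify("customer", {"tpch": ["tpch_sf1", "main"]}, schema="main")
print(explicit)
```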
Execution: SQL is compiled with Ibis’s Postgres SQLGlot compiler. The client submits queries asynchronously with POST /v1/query, polls GET /v1/query-runs/{id}, then downloads ready results as Arrow IPC from GET /v1/results/{id}. Tuning: poll_interval_s, poll_timeout_s on connect().
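To show how poll_interval_s and poll_timeout_s interact, here is a minimal polling loop. It is a sketch, not the client's implementation: poll_until_done is a hypothetical name, the status strings "running", "succeeded", and "failed" are assumptions, and fetch_status stands in for the GET /v1/query-runs/{id} call.

```python
import time
from typing import Callable


def poll_until_done(
    fetch_status: Callable[[], str],
    poll_interval_s: float = 0.25,
    poll_timeout_s: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until a terminal status arrives or the deadline passes."""
    deadline = time.monotonic() + poll_timeout_s
    while True:
        status = fetch_status()
        if status in ("succeeded", "failed"):  # assumed terminal status names
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"query run still {status!r} after {poll_timeout_s}s")
        sleep(poll_interval_s)


# Stand-in for the status endpoint: two pending polls, then a terminal status.
statuses = iter(["running", "running", "succeeded"])
result = poll_until_done(lambda: next(statuses), poll_interval_s=0.0, poll_timeout_s=5.0)
print(result)
```

A shorter poll_interval_s lowers latency for fast queries at the cost of more requests; poll_timeout_s bounds how long a slow query can block the caller.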
Types: Typed tables come from Hotdata’s information schema. con.sql(...) types are inferred from a small preview query and Arrow schema; see Hotdata SQL for server behavior.
Ibis Support Overview
hotdata-ibis is a read-oriented SQL backend. It is useful for exploring Hotdata workspaces with Ibis expressions, running federated SQL, and materializing results locally, but it is not a full mutable database backend.
Supported today:
- Connection setup: ibis.hotdata.connect(...) and ibis.connect("hotdata://...") with token, workspace, optional sandbox session, TLS, timeout, and polling settings.
- Catalog discovery: list_catalogs, list_databases, list_tables, current_catalog, and current_database map Hotdata connections and remote schemas into Ibis' catalog/database/table hierarchy.
- Table schemas: con.table(...) uses Hotdata information schema column metadata and maps SQL types through Ibis' Postgres type parser.
- SQL-backed expressions: Ibis expressions compile with the Postgres SQLGlot compiler and execute through Hotdata. Common SELECT workloads such as projection, filtering, joins, grouping, aggregation, ordering, limits, scalar expressions, and con.sql(...) work when the generated SQL is accepted by Hotdata.
- Result materialization: .execute() returns pandas objects. .to_pyarrow() and .to_pyarrow_batches() use the Arrow IPC result data exposed by Hotdata without converting through JSON rows; batches are split locally after the result is downloaded.
- Raw SQL escape hatch: con.sql("SELECT ...", dialect="postgres") is the most reliable way to use Hotdata-specific federated table names or SQL that Ibis does not model directly.
- Managed database lifecycle: create_database("sales", schema="public", tables=["orders"]) registers a managed connection (Ibis catalog). create_table("orders", pandas_df, database=("sales", "public")) uploads Parquet and loads it with replace mode. Query as sales.public.orders in SQL. drop_table clears a managed table; drop_database deletes the connection.
- Parquet uploads: create_table accepts pandas DataFrames, PyArrow tables, or schema-only empty tables. Tables must live in a managed connection, so declare them with create_database(..., tables=[...]) first. Loads always use replace mode; pass overwrite=True to replace an existing synced table (the default overwrite=False raises if the table already exists).
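The managed database lifecycle above follows a fixed order: declare, load, query, clean up. The sketch below walks through that order against a recording stand-in so it runs without a workspace. RecordingBackend and load_and_drop are hypothetical names, and the argument shapes for drop_table and drop_database are assumptions based on Ibis conventions; the create_database and create_table calls mirror the ones quoted in the list.

```python
class RecordingBackend:
    """Stand-in for a Hotdata connection; records calls instead of sending them."""

    def __init__(self):
        self.calls = []

    def __getattr__(self, name):
        # Any method call is captured as (name, args, kwargs).
        def record(*args, **kwargs):
            self.calls.append((name, args, kwargs))

        return record


def load_and_drop(con, orders):
    """Declare a managed table, load it, then clean up (illustrative order only)."""
    con.create_database("sales", schema="public", tables=["orders"])
    con.create_table("orders", orders, database=("sales", "public"), overwrite=True)
    # ... query sales.public.orders here ...
    con.drop_table("orders", database=("sales", "public"))  # assumed argument shape
    con.drop_database("sales")  # assumed argument shape


con = RecordingBackend()
# In real use, orders would be a pandas DataFrame or a PyArrow table.
load_and_drop(con, orders=[{"o_orderkey": 1}])
print([name for name, _, _ in con.calls])
```

With a real connection from ibis.hotdata.connect(...), the same function body would issue the calls against Hotdata, subject to the assumptions noted in the comments.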
Not supported as full Ibis backend features:
- General DDL and mutations: Arbitrary remote DDL, inserts, updates, deletes, and schema-altering operations on external connections are not implemented. Managed-database writes are limited to create_database, create_table, drop_table, and drop_database as described above.
- Temporary tables and in-memory registration: supports_temporary_tables is false, and in-memory tables are not uploaded automatically for joins.
- Python UDFs: supports_python_udfs is false.
- Transactions and sessions as database state: Hotdata sandbox sessions can be passed as session_id, but the backend does not expose transaction APIs.
- Backend-native SQL dialect: Compilation uses Ibis' Postgres dialect as the closest fit. Hotdata SQL and federation rules are authoritative, so not every Ibis expression that compiles is guaranteed to execute remotely.
- Complete Ibis compliance: The backend is experimental and has focused test coverage for connection, discovery, schema mapping, execution, uploads, and Arrow results. It has not yet been validated against the full Ibis backend test suite.
- Hotdata platform APIs beyond SQL and managed databases: embeddings, indexes, query history management, sandbox lifecycle management, and other Hotdata-specific APIs are outside the Ibis backend surface.
Development
uv sync # installs dev group by default (pytest, ruff, httpx for examples)
uv run pytest
uv run ruff check src tests examples
Lockfile CI: uv sync --locked && uv run pytest.
TPC-H for the examples
Examples assume something like tpch.tpch_sf1.customer. Provision TPC-H in your workspace (commonly a DuckDB connection, then DuckDB’s tpch extension and CALL dbgen(sf = 1) — see DuckDB TPC-H and Hotdata Quick Start). If your data lives under main instead, pass --default-schema / --default-connection or set HOTDATA_DEFAULT_* (see examples/_helpers.py).
Examples
Needs HOTDATA_API_KEY and HOTDATA_WORKSPACE.
uv sync
export HOTDATA_API_KEY=…
export HOTDATA_WORKSPACE=…
uv run python examples/01_catalog_introspection.py
uv run python examples/02_execute_sql.py 'SELECT COUNT(*) AS n FROM tpch.tpch_sf1.customer'
uv run python examples/03_connect_via_url.py
uv run python examples/04_ibis_table_workflows.py
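For scripts of your own, the same two environment variables can be turned into connect() arguments. This is a sketch under assumptions: connect_kwargs_from_env is a hypothetical helper (it is not the code in examples/_helpers.py), and mapping HOTDATA_API_KEY to token and HOTDATA_WORKSPACE to workspace_id is an assumption based on the parameter names in the Connect section.

```python
import os


def connect_kwargs_from_env(env=os.environ):
    """Collect connect() keyword arguments from the environment (hypothetical helper)."""
    required = ("HOTDATA_API_KEY", "HOTDATA_WORKSPACE")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return {
        "api_url": "https://api.hotdata.dev",
        "token": env["HOTDATA_API_KEY"],  # assumed mapping
        "workspace_id": env["HOTDATA_WORKSPACE"],  # assumed mapping
    }


# Placeholder values; a plain dict stands in for os.environ here.
kwargs = connect_kwargs_from_env(
    {"HOTDATA_API_KEY": "example-token", "HOTDATA_WORKSPACE": "ws_example"}
)
print(sorted(kwargs))
```

The returned dictionary can be unpacked into ibis.hotdata.connect(**kwargs); failing early on a missing variable gives a clearer error than a rejected API call.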
Ibis tables → pandas DataFrames
Calling .execute() on a table expression runs the compiled SQL on Hotdata and returns a pandas DataFrame (Ibis’s default for this backend).
Hotdata’s SQL often uses a federated prefix (for example tpch.tpch_sf1) that may not match the Ibis catalog string (the connection id). A reliable pattern is to start from con.sql("SELECT * FROM tpch.tpch_sf1.mytable", dialect="postgres"), then chain filters and aggregates—see examples/04_ibis_table_workflows.py.
When con.table("mytable") is enough (single connection/schema and names align with compiled SQL), the same operations apply:
t = con.table("customer")  # or con.table("customer", database=(conn_id, "tpch_sf1"))

# Filter, project, and limit; the compiled SQL runs on Hotdata.
df = (
    t.filter(t.c_mktsegment == "AUTOMOBILE")
    .select("c_custkey", "c_name")
    .limit(100)
    .execute()
)

# Row count per market segment.
by_seg = t.group_by(t.c_mktsegment).agg(n=t.count()).execute()

# Join customers to their orders.
o = con.table("orders")
orders_with_names = (
    t.join(o, t.c_custkey == o.o_custkey)
    .select(t.c_name, o.o_totalprice)
    .limit(50)
    .execute()
)

# Scalar aggregate: the result is a single value rather than a DataFrame.
total = t.c_acctbal.sum().execute()
Other useful paths: .to_pyarrow() / .to_pyarrow_batches() for Arrow; con.sql("SELECT …", dialect="postgres") then chain the returned table expression.