
Onehouse Python SDK — modular data-plane clients for LakeBase and beyond

onehouse-python-sdk

Python SDK for connecting to Onehouse data-plane and control-plane services.

The base package has zero required dependencies. Drivers ship as optional extras so you only install what you need.

pip install onehouse-python-sdk[lakebase]    # PostgreSQL/LakeBase SQL client
pip install onehouse-python-sdk[resources]   # Control-plane REST client

LakeBase

LakeBase is Onehouse's PostgreSQL-compatible managed lakehouse engine. This SDK provides a psycopg2-backed client that handles the browser-based and federated authentication flows LakeBase clusters require, on top of standard username/password auth.

Installation

pip install onehouse-python-sdk[lakebase]

Requires Python 3.9+.

Quickstart

from onehouse_python_sdk import LakebaseClient

# Username / password
with LakebaseClient().connect(
    host="<cluster-host>",
    port=5432,
    dbname="mydb",
    user="admin",
    password="secret",
) as client:
    rows = client.fetchall("SELECT * FROM mytable WHERE id = %s", (42,))

Authentication flows

Flow                               Parameters
Username / password                user, password
OIDC Device Flow (Okta / Auth0)    browser_auth=true, oidc_client_id, oidc_issuer_url, oidc_iam_role
Azure AD OAuth2                    browser_auth=true, azure_oauth_tenant_id, azure_oauth_client_id, azure_oauth_client_secret
Azure Entra ID SAML                browser_auth=true, azure_tenant_id, azure_entity_id
Built-in login form                browser_auth=true (default when no IdP params are set)
External redirect                  browser_auth=true, auth_redirect_url

# OIDC Device Flow
client = LakebaseClient().connect(
    host="<cluster-host>",
    port=5432,
    dbname="mydb",
    browser_auth="true",
    oidc_client_id="0oaXXXXX",
    oidc_issuer_url="https://myorg.okta.com",
    oidc_iam_role="arn:aws:iam::123456789012:role/LakeBaseRole",
)

# Azure AD OAuth2
client = LakebaseClient().connect(
    host="<cluster-host>",
    port=5432,
    dbname="mydb",
    browser_auth="true",
    azure_oauth_tenant_id="your-tenant-id",
    azure_oauth_client_id="your-client-id",
    azure_oauth_client_secret="your-client-secret",
)

DSN string form is also supported:

client = LakebaseClient().connect(
    "postgresql://<cluster-host>:5432/mydb"
    "?browser_auth=true"
    "&oidc_client_id=0oaXXXX"
    "&oidc_issuer_url=https://myorg.okta.com"
    "&oidc_iam_role=arn:aws:iam::123456789012:role/LakeBaseRole"
)
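
The DSN form carries the same parameters as the keyword form, encoded as URL query parameters. The sketch below uses only the standard library to show how such a DSN breaks down into keyword arguments. It illustrates the URL structure only and is not the SDK's own parser: dsn_to_kwargs is a hypothetical helper, and cluster.example.com is a placeholder host.

```python
from urllib.parse import parse_qsl, urlsplit


def dsn_to_kwargs(dsn: str) -> dict:
    """Split a postgresql:// DSN into connect()-style keyword arguments.

    Hypothetical helper for illustration; the SDK parses DSNs itself.
    """
    parts = urlsplit(dsn)
    kwargs = {
        "host": parts.hostname,
        "port": parts.port,
        "dbname": parts.path.lstrip("/"),
    }
    if parts.username:
        kwargs["user"] = parts.username
    if parts.password:
        kwargs["password"] = parts.password
    # Query parameters carry the auth-flow options (browser_auth, oidc_*, ...).
    kwargs.update(parse_qsl(parts.query))
    return kwargs


example = dsn_to_kwargs(
    "postgresql://cluster.example.com:5432/mydb"
    "?browser_auth=true"
    "&oidc_issuer_url=https://myorg.okta.com"
    "&oidc_iam_role=arn:aws:iam::123456789012:role/LakeBaseRole"
)
print(example["host"], example["port"], example["browser_auth"])
```

Values containing &, #, or spaces would need percent-encoding in a real DSN; parse_qsl decodes them on the way back out.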

Client API

All clients implement the SqlClient interface:

Method                                Description
connect(dsn=None, **kwargs) → self    Establish connection, returns self for chaining
execute(sql, params)                  Run a statement, return rowcount
fetchall(sql, params)                 Run a query, return list[tuple]
fetchone(sql, params)                 Run a query, return first row or None
cursor()                              Raw cursor for advanced use
raw_connection                        Underlying psycopg2 connection
close()                               Close the connection
__enter__ / __exit__                  Context manager; closes on exit
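
To make the method shapes concrete without a LakeBase cluster, here is a toy client with the same surface, backed by the standard library's sqlite3 module. SqliteSqlClient is a hypothetical stand-in, not part of the SDK, and sqlite3 uses ? placeholders where the psycopg2-backed client uses %s.

```python
import sqlite3
from typing import Any, List, Optional, Sequence, Tuple


class SqliteSqlClient:
    """Toy client with SqlClient-like methods, backed by sqlite3 (not psycopg2)."""

    def __init__(self) -> None:
        self.raw_connection: Optional[sqlite3.Connection] = None

    def connect(self, dsn: Optional[str] = None, **kwargs: Any) -> "SqliteSqlClient":
        self.raw_connection = sqlite3.connect(dsn or ":memory:")
        return self  # returning self allows `with Client().connect(...) as c:`

    def execute(self, sql: str, params: Sequence[Any] = ()) -> int:
        cur = self.raw_connection.execute(sql, params)
        self.raw_connection.commit()
        return cur.rowcount

    def fetchall(self, sql: str, params: Sequence[Any] = ()) -> List[Tuple]:
        return self.raw_connection.execute(sql, params).fetchall()

    def fetchone(self, sql: str, params: Sequence[Any] = ()) -> Optional[Tuple]:
        return self.raw_connection.execute(sql, params).fetchone()

    def cursor(self) -> sqlite3.Cursor:
        return self.raw_connection.cursor()

    def close(self) -> None:
        if self.raw_connection is not None:
            self.raw_connection.close()
            self.raw_connection = None

    def __enter__(self) -> "SqliteSqlClient":
        return self

    def __exit__(self, *exc_info: Any) -> None:
        self.close()  # the context manager closes on exit


with SqliteSqlClient().connect() as client:
    client.execute("CREATE TABLE mytable (id INTEGER, name TEXT)")
    inserted = client.execute("INSERT INTO mytable VALUES (?, ?)", (42, "answer"))
    rows = client.fetchall("SELECT * FROM mytable WHERE id = ?", (42,))
    first = client.fetchone("SELECT name FROM mytable WHERE id = ?", (0,))
closed = client.raw_connection is None
```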

Notes

  • Credential caching: auth tokens are cached for 4 minutes, keyed by the connection parameters, so repeated connections within the same process do not trigger repeated browser prompts.
  • Callback port: the local auth callback server listens on port 8888 at path /lakebase by default. Override these with auth_callback_port and auth_callback_path.
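
One way to picture the caching note is an in-process cache with a time-to-live, keyed by the connection parameters. The sketch below is an illustration under that assumption, not the SDK's implementation; TokenCache and its methods are hypothetical names, and the clock is injectable so the expiry can be demonstrated without waiting.

```python
import time
from typing import Callable, Dict, Optional, Tuple

TOKEN_TTL_SECONDS = 4 * 60  # the 4-minute window mentioned in the note


class TokenCache:
    """In-process token cache keyed by connection parameters (illustrative only)."""

    def __init__(self, ttl: float = TOKEN_TTL_SECONDS,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl
        self._clock = clock
        self._entries: Dict[Tuple, Tuple[float, str]] = {}

    @staticmethod
    def _key(params: dict) -> Tuple:
        # Sort items so the same parameters always map to the same key.
        return tuple(sorted(params.items()))

    def get(self, params: dict) -> Optional[str]:
        key = self._key(params)
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, token = entry
        if self._clock() >= expires_at:
            # Expired: drop it so the caller re-authenticates.
            del self._entries[key]
            return None
        return token

    def put(self, params: dict, token: str) -> None:
        self._entries[self._key(params)] = (self._clock() + self._ttl, token)


# Demonstration with a fake clock instead of real waiting.
now = [0.0]
cache = TokenCache(clock=lambda: now[0])
params = {"host": "cluster.example.com", "dbname": "mydb"}
cache.put(params, "token-abc")
hit = cache.get(params)    # still inside the 4-minute window
now[0] = 241.0             # just past 4 minutes
miss = cache.get(params)   # expired, so the caller would prompt again
```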

Onehouse Resources (Control-Plane API)

OnehouseResources wraps the Onehouse REST API for managing platform resources — clusters, lakes, flows, jobs, table services, and more. It posts SQL statements to https://api.onehouse.ai/v1/resource/ and polls /v1/status/{requestId} for the result.

Installation

pip install onehouse-python-sdk[resources]

Quickstart

from onehouse_python_sdk import OnehouseResources

client = OnehouseResources(
    account_uid="...", project_uid="...",
    api_key="...", api_secret="...",
    link_uid="...", region="us-west-2", user_uid="...",
)

# Typed helper — blocks until the operation reaches a terminal status.
result = client.create_cluster(
    "prod",
    type="Managed",
    max_ocu=10,
    min_ocu=1,
    options={"worker.type": "oh-general-4"},
)
print(result.api_status)        # ApiStatus.SUCCESS
print(result.api_response)      # raw API payload

Credentials

Credentials resolve from three sources, listed in descending order of precedence:

  1. Explicit constructor arguments (shown above).
  2. Environment variables: ONEHOUSE_ACCOUNT_UID, ONEHOUSE_PROJECT_UID, ONEHOUSE_API_KEY, ONEHOUSE_API_SECRET, ONEHOUSE_LINK_UID, ONEHOUSE_REGION, ONEHOUSE_USER_UID. Optional: ONEHOUSE_BASE_URL, ONEHOUSE_PROFILE, ONEHOUSE_CREDENTIALS_FILE.
  3. INI credentials file at ~/.onehouse/credentials (override with ONEHOUSE_CREDENTIALS_FILE).

# ~/.onehouse/credentials
[default]
account_uid = 92e5f1ab-...
project_uid = 3afe72cd-...
api_key     = j+m8wRhgpKYFTLxCHNDzQA==
api_secret  = tXpzrqfUBNK9yhS5+FmLM37xwfhVeZygJntCzHG4Dpq=
link_uid    = da56fe8b-...
region      = us-west-2
user_uid    = ...

[staging]
account_uid = ...

# Read from environment / default profile.
client = OnehouseResources()

# Pick a named profile.
client = OnehouseResources(profile="staging")

Missing fields raise an AuthError that lists which fields are unset and how to supply them. A world-readable credentials file triggers a warning; restrict it with chmod 600 ~/.onehouse/credentials.
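
The precedence rules can be sketched with the standard library alone. The following is a minimal illustration, not the SDK's resolver: resolve_credentials is a hypothetical function, it assumes precedence is applied field by field, and ValueError stands in for the SDK's AuthError.

```python
import configparser
import os
import tempfile
from typing import Mapping, Optional

FIELDS = ("account_uid", "project_uid", "api_key", "api_secret",
          "link_uid", "region", "user_uid")


def resolve_credentials(explicit: Mapping[str, Optional[str]],
                        environ: Mapping[str, str],
                        ini_path: str,
                        profile: str = "default") -> dict:
    """Merge the three sources; earlier sources win (assumed: field by field)."""
    # interpolation=None keeps '%' characters in secrets from being interpreted.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read(ini_path)  # a missing file is silently skipped
    from_file = dict(parser[profile]) if parser.has_section(profile) else {}

    resolved, missing = {}, []
    for field in FIELDS:
        value = (explicit.get(field)
                 or environ.get("ONEHOUSE_" + field.upper())
                 or from_file.get(field))
        if value:
            resolved[field] = value
        else:
            missing.append(field)
    if missing:
        # The SDK raises AuthError here; ValueError stands in for it.
        raise ValueError("missing credential fields: " + ", ".join(missing))
    return resolved


# Demonstration against a throwaway credentials file.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "credentials")
    with open(path, "w") as fh:
        fh.write("[default]\n")
        fh.writelines(f"{name} = file-{name}\n" for name in FIELDS)
    creds = resolve_credentials(
        explicit={"region": "us-west-2"},
        environ={"ONEHOUSE_API_KEY": "env-key", "ONEHOUSE_REGION": "eu-west-1"},
        ini_path=path,
    )
```

In the demonstration, region comes from the explicit argument, api_key from the environment, and every other field from the file.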

Three ways to run a command

import time  # needed by the polling loop in (2)

# ApiStatus is the SDK's status enum; import it from the SDK before running (2).

# (1) Blocking: submit, poll, return the terminal status. Most common.
result = client.execute("SHOW CLUSTERS")

# (2) Non-blocking: submit now, poll later. Good for long-running ops or
# parallel orchestration where you don't want to hold a thread.
submitted = client.submit("CREATE CLUSTER `prod` TYPE = 'Managed' MAX_OCU = 10 MIN_OCU = 1")
# ... do other work, persist submitted.request_id ...
status = client.get_status(submitted.request_id)
while status.api_status == ApiStatus.PENDING:
    time.sleep(5)
    status = client.get_status(submitted.request_id)

# (3) Typed helpers: same blocking semantics as execute(), but build the SQL for you.
client.create_cluster("prod", type="Managed", max_ocu=10, min_ocu=1)
client.show_clusters()
client.delete_cluster("prod")

Typed helpers

OnehouseResources exposes one method per Phase 1 SQL command (~50 methods across 13 resource families): Clusters, Lakes, Databases, Tables, Catalogs, Sources, Flows, Transformations, Validations, Table Services, Jobs, Service Principals, API Tokens. Every typed helper accepts the same trailing kwargs: unsafe_raw, timeout, poll_interval.

client.create_lake(
    "analytics",
    lake_type="MANAGED",
    bucket_path="s3://my-bucket/lake",
    default_services_cluster="services",
)

from onehouse_python_sdk.resources.sql.commands import PartitionKeyField

client.create_flow(
    "events_pipeline",
    source="my_kafka_source",
    lake="analytics",
    database="events",
    table_name="page_views",
    write_mode="MUTABLE",
    cluster="ingest_cluster",
    catalogs=["my_glue_catalog"],
    record_key_fields=["id"],
    partition_key_fields=[
        PartitionKeyField("date", partition_type="DATE_STRING",
                          input_format="yyyy-mm-dd", output_format="yyyy-mm-dd"),
    ],
    min_sync_frequency_mins=5,
    options={"kafka.topic.name": "page_views"},
)

ACL / privilege / role / group commands aren't exposed as typed helpers yet — use client.execute("GRANT ...") until they land in a future release.

unsafe_raw escape hatch

The builder validates resource names against ^[A-Za-z][A-Za-z0-9_-]*$ and rejects unknown enum values to catch obvious typos. If a name doesn't match (e.g. legacy resources with dots in the name) or you're using a SQL feature the SDK hasn't been updated for, pass unsafe_raw=True on the call to skip validation:

client.create_cluster("legacy.name", type="CustomType", unsafe_raw=True)

The argument is deliberately named so that its uses are easy to find with grep.
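
As a rough picture of what validation and the escape hatch do, the sketch below builds a CREATE CLUSTER statement in the shape shown earlier. It is an illustration, not the SDK's builder: build_create_cluster is a hypothetical function, and the set of allowed cluster types is an assumption limited to the one value this README shows.

```python
import re

NAME_PATTERN = re.compile(r"^[A-Za-z][A-Za-z0-9_-]*$")  # pattern quoted above
CLUSTER_TYPES = {"Managed"}  # assumption: only the value shown in this README


def build_create_cluster(name: str, type: str, max_ocu: int, min_ocu: int,
                         unsafe_raw: bool = False) -> str:
    """Build the SQL text; `type` mirrors the keyword used by create_cluster()."""
    if not unsafe_raw:
        if not NAME_PATTERN.match(name):
            raise ValueError(f"invalid resource name: {name!r}")
        if type not in CLUSTER_TYPES:
            raise ValueError(f"unknown cluster type: {type!r}")
    # With unsafe_raw=True the values are passed through unchecked.
    return (f"CREATE CLUSTER `{name}` TYPE = '{type}' "
            f"MAX_OCU = {max_ocu} MIN_OCU = {min_ocu}")


checked = build_create_cluster("prod", "Managed", 10, 1)
unchecked = build_create_cluster("legacy.name", "CustomType", 10, 1, unsafe_raw=True)
```

Skipping validation also skips the protection it gives against malformed names reaching the server, which is why the flag is worth auditing.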

Error handling

OnehouseSdkError
└── ResourcesError                     # raised by the resources/ subpackage
    ├── AuthError                      # missing/invalid credentials
    ├── SqlParseError                  # HTTP 400 + grpc-message — server rejected the SQL
    ├── OperationFailedError           # terminal status FAILED or INVALID
    └── OperationTimeoutError          # polling exceeded the configured timeout

OperationTimeoutError carries the request_id of the in-flight operation — you can resume polling with get_status(request_id) rather than re-submitting (which would create a duplicate resource).
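
The resume-after-timeout pattern looks roughly like this. The sketch is self-contained and uses stand-ins: PollTimeout plays the role of OperationTimeoutError, plain strings play the role of ApiStatus values, and poll_until_terminal is a hypothetical helper rather than an SDK function. The clock and sleep functions are injectable so the example runs instantly.

```python
import time
from typing import Callable

# Assumption: PENDING is the only non-terminal status.
TERMINAL = {"SUCCESS", "FAILED", "INVALID"}


class PollTimeout(Exception):
    """Stand-in for OperationTimeoutError; carries the in-flight request id."""

    def __init__(self, request_id: str) -> None:
        super().__init__(f"timed out waiting for {request_id}")
        self.request_id = request_id


def poll_until_terminal(get_status: Callable[[str], str], request_id: str,
                        timeout: float, poll_interval: float,
                        sleep: Callable[[float], None] = time.sleep,
                        clock: Callable[[], float] = time.monotonic) -> str:
    deadline = clock() + timeout
    while True:
        status = get_status(request_id)
        if status in TERMINAL:
            return status
        if clock() >= deadline:
            raise PollTimeout(request_id)
        sleep(poll_interval)


# Fake server: three PENDING answers, then SUCCESS. Fake clock advances on sleep.
statuses = iter(["PENDING", "PENDING", "PENDING", "SUCCESS"])
fake_now = [0.0]


def fake_sleep(seconds: float) -> None:
    fake_now[0] += seconds


def fake_get_status(request_id: str) -> str:
    return next(statuses)


resumed_id = None
try:
    final = poll_until_terminal(fake_get_status, "req-123", timeout=5,
                                poll_interval=5, sleep=fake_sleep,
                                clock=lambda: fake_now[0])
except PollTimeout as exc:
    # Resume with the same id instead of re-submitting the statement.
    resumed_id = exc.request_id
    final = poll_until_terminal(fake_get_status, resumed_id, timeout=60,
                                poll_interval=5, sleep=fake_sleep,
                                clock=lambda: fake_now[0])
```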

Client API

Method                                                           Description
submit(statement) → SubmitResponse                               POST /v1/resource/, return the requestId.
get_status(request_id) → StatusResponse                          GET /v1/status/{id}, return the parsed status.
execute(statement, timeout=, poll_interval=) → StatusResponse    Submit + poll until terminal. Blocks.
<verb>_<resource>(...) (~50 methods)                             Typed wrappers around execute() that build SQL for you.

Notes

  • Lazy import: from onehouse_python_sdk import OnehouseResources works even when the [resources] extra isn't installed; the first network call raises a clear "install [resources] extra" error.
  • Rate limit: projects are capped at 10 QPS. The transport retries 429 responses with bounded exponential backoff.
  • Not a SqlClient: OnehouseResources is a control-plane HTTP client and intentionally does not implement the SqlClient interface (no cursor, fetchall, etc.).
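
For readers unfamiliar with the term, bounded exponential backoff on 429 responses can be sketched as follows. This is a generic illustration, not the SDK's transport: send_with_backoff is a hypothetical function, and the retry count and delay values are assumptions chosen for the example.

```python
import time
from typing import Callable, Tuple

Response = Tuple[int, str]  # (HTTP status code, body)


def send_with_backoff(send: Callable[[], Response],
                      max_retries: int = 4,
                      base_delay: float = 0.5,
                      max_delay: float = 8.0,
                      sleep: Callable[[float], None] = time.sleep) -> Response:
    """Retry while the server answers 429, doubling the delay up to a cap."""
    status, body = send()
    for attempt in range(max_retries):
        if status != 429:
            break
        sleep(min(max_delay, base_delay * 2 ** attempt))
        status, body = send()
    return status, body


# Fake transport: two rate-limited answers, then success. Delays are recorded
# instead of slept so the example runs instantly.
responses = iter([(429, ""), (429, ""), (200, "ok")])
delays = []
result = send_with_backoff(lambda: next(responses), sleep=delays.append)
```

A production transport would usually add random jitter to each delay so that many clients do not retry in lockstep; it is left out here to keep the example deterministic.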
