Lightweight stateless SQL execution for Databricks with minimal dependencies

Maintainers

databricks-labs nfx

Project description

Databricks Labs LSQL

Lightweight execution of SQL queries through Databricks SDK for Python.

Databricks Labs LSQL
Installation
Executing SQL
SQL backend abstraction
Project Support

Installation

pip install databricks-labs-lsql

Executing SQL

The primary use case of the fetch_all and execute methods is executing SQL queries statelessly, directly from the Databricks SDK for Python, without any external dependencies. Results are fetched in JSON format through presigned external links. This suits serverless applications such as AWS Lambda, Azure Functions, or other short-lived containerised applications, where a smaller dependency set means faster container startup.

Applications that need a more traditional Python SQL API with cursors, efficient transfer of hundreds of megabytes or gigabytes of data serialized in Apache Arrow format, and low result-fetching latency should use the stateful Databricks SQL Connector for Python.

The constructor and most of the methods accept the common parameters described under Parameters.

from databricks.sdk import WorkspaceClient
from databricks.labs.lsql.core import StatementExecutionExt
w = WorkspaceClient()
see = StatementExecutionExt(w)
for (pickup_zip, dropoff_zip) in see('SELECT pickup_zip, dropoff_zip FROM samples.nyctaxi.trips LIMIT 10'):
    print(f'pickup_zip={pickup_zip}, dropoff_zip={dropoff_zip}')

Iterating over results

The fetch_all method returns an iterator of objects that resemble pyspark.sql.Row, although full compatibility is not a goal of this implementation. The method accepts the common parameters.

import os
from databricks.sdk import WorkspaceClient
from databricks.labs.lsql.core import StatementExecutionExt

results = []
w = WorkspaceClient()
see = StatementExecutionExt(w, warehouse_id=os.environ.get("TEST_DEFAULT_WAREHOUSE_ID"))
for pickup_zip, dropoff_zip in see.fetch_all("SELECT pickup_zip, dropoff_zip FROM samples.nyctaxi.trips LIMIT 10"):
    results.append((pickup_zip, dropoff_zip))
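
To make "resembles pyspark.sql.Row" concrete, here is a minimal sketch that uses collections.namedtuple as a stand-in for a Row-like object. The Trip type and the sample values are hypothetical, and this is not the lsql row implementation; it only shows the two access styles a Row-like tuple offers: positional unpacking and access by column name.

```python
from collections import namedtuple

# Hypothetical stand-in for a Row-like result; not the lsql implementation.
Trip = namedtuple("Trip", ["pickup_zip", "dropoff_zip"])

# Made-up sample data standing in for an iterator of query results.
rows = iter([Trip(10001, 10002), Trip(11201, 11215)])

pairs = []
for row in rows:
    # Positional unpacking, as in the fetch_all loop above.
    pickup_zip, dropoff_zip = row
    # Access by column name, in the style of pyspark.sql.Row.
    assert row.pickup_zip == pickup_zip
    pairs.append((pickup_zip, dropoff_zip))

# A Row-like tuple can also be viewed as a mapping of column name to value.
first_as_dict = Trip(10001, 10002)._asdict()
```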

Executing without iterating

When you only need to execute a query and do not need to iterate over its results, use the execute method, which accepts the common parameters.

from databricks.sdk import WorkspaceClient
from databricks.labs.lsql.core import StatementExecutionExt

w = WorkspaceClient()
see = StatementExecutionExt(w)
see.execute("CREATE TABLE foo AS SELECT * FROM range(10)")

Fetching one record

The fetch_one method returns a single record from the result set. If the result set is empty, it returns None. If the result set contains more than one record, it raises ValueError.

from databricks.sdk import WorkspaceClient
from databricks.labs.lsql.core import StatementExecutionExt

w = WorkspaceClient()
see = StatementExecutionExt(w)
pickup_zip, dropoff_zip = see.fetch_one("SELECT pickup_zip, dropoff_zip FROM samples.nyctaxi.trips LIMIT 1")
print(f'pickup_zip={pickup_zip}, dropoff_zip={dropoff_zip}')
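
The three outcomes described for fetch_one (one record, None, or ValueError) can be sketched over any iterable. The helper one_or_none below is a hypothetical name used only for illustration; it is not part of lsql and says nothing about how the library implements the check.

```python
from typing import Iterable, Optional, TypeVar

T = TypeVar("T")

# Sentinel that distinguishes "no more items" from a legitimate None item.
_MISSING = object()


def one_or_none(rows: Iterable[T]) -> Optional[T]:
    """Return the only item, None if there are none, or raise ValueError if there are several."""
    iterator = iter(rows)
    first = next(iterator, _MISSING)
    if first is _MISSING:
        return None  # empty result set
    if next(iterator, _MISSING) is not _MISSING:
        raise ValueError("expected at most one record, got more")
    return first
```

Calling one_or_none([]) gives None, one_or_none([(10001, 10002)]) gives the single tuple, and any longer input raises ValueError.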

Fetching one value

The fetch_value method returns a single value from the result set. If the result set is empty, it returns None.

from databricks.sdk import WorkspaceClient
from databricks.labs.lsql.core import StatementExecutionExt

w = WorkspaceClient()
see = StatementExecutionExt(w)
count = see.fetch_value("SELECT COUNT(*) FROM samples.nyctaxi.trips")
print(f'count={count}')

Parameters

warehouse_id (str, optional) - Warehouse upon which to execute a statement. If not given, it will use the warehouse specified in the constructor or the first available warehouse that is not in the DELETED or DELETING state.
byte_limit (int, optional) - Applies the given byte limit to the statement's result size. Byte counts are based on internal representations and may not match measurable sizes in the JSON format.
catalog (str, optional) - Sets default catalog for statement execution, similar to USE CATALOG in SQL. If not given, it will use the default catalog or the catalog specified in the constructor.
schema (str, optional) - Sets default schema for statement execution, similar to USE SCHEMA in SQL. If not given, it will use the default schema or the schema specified in the constructor.
timeout (timedelta, optional) - Timeout after which the query is cancelled. If timeout is less than 50 seconds, it is handled on the server side. If the timeout is greater than 50 seconds, Databricks SDK for Python cancels the statement execution and throws TimeoutError. If not given, it will use the timeout specified in the constructor.
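
The fallback order for warehouse_id and the 50-second timeout rule can be sketched as plain functions. Everything named here (Warehouse, resolve_warehouse_id, timeout_handling) is hypothetical and only mirrors the descriptions above; it is not the library's code. Two points are assumptions: the text does not say what happens at exactly 50 seconds, so the sketch treats that case as server-side, and it raises ValueError when no usable warehouse exists.

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import Optional, Sequence


@dataclass
class Warehouse:
    """Hypothetical stand-in for one entry of a warehouse listing."""
    id: str
    state: str


def resolve_warehouse_id(
    per_call: Optional[str],
    from_constructor: Optional[str],
    available: Sequence[Warehouse],
) -> str:
    # 1. A value passed to the method call wins.
    if per_call:
        return per_call
    # 2. Otherwise fall back to the value given to the constructor.
    if from_constructor:
        return from_constructor
    # 3. Otherwise take the first warehouse that is not being deleted.
    for warehouse in available:
        if warehouse.state not in ("DELETED", "DELETING"):
            return warehouse.id
    # Assumption: the failure mode is not described in the text above.
    raise ValueError("no usable warehouse found")


SERVER_SIDE_LIMIT = timedelta(seconds=50)


def timeout_handling(timeout: timedelta) -> str:
    # Assumption: exactly 50 seconds is treated as server-side here.
    return "client" if timeout > SERVER_SIDE_LIMIT else "server"
```

With this sketch, timeout_handling(timedelta(seconds=30)) is "server" and timeout_handling(timedelta(minutes=5)) is "client", the case where the SDK cancels the statement and raises TimeoutError.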

SQL backend abstraction

This framework maps between SQL and the Python runtime using strongly typed dataclasses. It derives the schema creation logic purely from Python data structures.
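
As a rough illustration of deriving a schema from a Python data structure, the sketch below builds a CREATE TABLE statement from a dataclass. The type mapping, the create_table_ddl helper, and the Trip dataclass are assumptions made for this example; the library's actual mapping and generated SQL may differ.

```python
import dataclasses
from dataclasses import dataclass

# Assumed Python-to-SQL type mapping, for illustration only.
_SQL_TYPES = {str: "STRING", int: "LONG", float: "FLOAT", bool: "BOOLEAN"}


def create_table_ddl(full_name: str, klass: type) -> str:
    """Build a CREATE TABLE statement from the fields of a dataclass."""
    columns = []
    for field in dataclasses.fields(klass):
        # field.type is the annotation itself, e.g. the class `int`.
        sql_type = _SQL_TYPES[field.type]
        columns.append(f"{field.name} {sql_type}")
    return f"CREATE TABLE IF NOT EXISTS {full_name} ({', '.join(columns)})"


@dataclass
class Trip:
    pickup_zip: int
    dropoff_zip: int
    fare: float


ddl = create_table_ddl("main.default.trips", Trip)
```

Here ddl is "CREATE TABLE IF NOT EXISTS main.default.trips (pickup_zip LONG, dropoff_zip LONG, fare FLOAT)". Unsupported field types raise KeyError in this sketch, which keeps the mapping explicit.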

SqlBackend is used to define the methods that are required to be implemented by any SQL backend that is used by the library. The methods defined in this class are used to execute SQL statements, fetch results from SQL statements, and save data to tables. Available backends are:

StatementExecutionBackend used for reading/writing records purely through REST API
DatabricksConnectBackend used for reading/writing records through Databricks Connect
RuntimeBackend used for execution within Databricks Runtime
MockBackend used for unit testing

Common methods are:

execute(str) - Execute a SQL statement and wait till it finishes
fetch(str) - Execute a SQL statement and iterate over all results
save_table(full_name: str, rows: Sequence[DataclassInstance], klass: Dataclass) - Save a sequence of dataclass instances to a table
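
To show how the three common methods fit together, here is a hypothetical in-memory backend that records what it is asked to do, in the spirit of a backend used for unit testing. RecordingBackend, its attributes, and the Foo dataclass are invented for this sketch and are not the library's MockBackend.

```python
import dataclasses
from dataclasses import dataclass
from typing import Dict, Iterator, List, Optional, Sequence


class RecordingBackend:
    """Hypothetical in-memory backend exposing execute, fetch and save_table."""

    def __init__(self, canned_rows: Optional[Dict[str, List[tuple]]] = None):
        self.queries: List[str] = []             # every SQL string received
        self.saved: Dict[str, List[tuple]] = {}  # rows written, keyed by table
        self._canned_rows = canned_rows or {}    # query text -> rows to return

    def execute(self, sql: str) -> None:
        # Nothing to wait for in memory; just record the statement.
        self.queries.append(sql)

    def fetch(self, sql: str) -> Iterator[tuple]:
        self.queries.append(sql)
        return iter(self._canned_rows.get(sql, []))

    def save_table(self, full_name: str, rows: Sequence, klass: type) -> None:
        # Reject rows that are not instances of the declared dataclass.
        for row in rows:
            if not isinstance(row, klass):
                raise TypeError(f"expected {klass.__name__}, got {type(row).__name__}")
        self.saved.setdefault(full_name, []).extend(dataclasses.astuple(row) for row in rows)


@dataclass
class Foo:
    first: str
    second: bool


backend = RecordingBackend(canned_rows={"SELECT 1": [(1,)]})
backend.execute("CREATE TABLE foo AS SELECT * FROM range(10)")
fetched = list(backend.fetch("SELECT 1"))
backend.save_table("main.default.foo", [Foo("a", True), Foo("b", False)], Foo)
```

Code written against these three methods can be exercised in tests with a stand-in like this one and then run against a real backend without changes.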

Project Support

Please note that all projects in the /databrickslabs GitHub account are provided for your exploration only, and are not formally supported by Databricks with Service Level Agreements (SLAs). They are provided AS-IS and we do not make any guarantees of any kind. Please do not submit a support ticket relating to any issues arising from the use of these projects.

Any issues discovered through the use of this project should be filed as GitHub Issues on the repository. They will be reviewed as time permits, but there are no formal SLAs for support.

Release history

Version	Release date
0.16.0 (this version)	Feb 27, 2025
0.15.1	Feb 27, 2025
0.14.2	Feb 27, 2025
0.14.1	Nov 19, 2024
0.14.0	Nov 15, 2024
0.13.0	Nov 8, 2024
0.12.1	Sep 26, 2024
0.12.0	Sep 19, 2024
0.11.0	Sep 18, 2024
0.10.0	Sep 11, 2024
0.9.3	Sep 4, 2024
0.9.2	Sep 3, 2024
0.9.1	Aug 30, 2024
0.9.0	Aug 26, 2024
0.8.0	Aug 13, 2024
0.7.5	Jul 30, 2024
0.7.4	Jul 30, 2024
0.7.3	Jul 25, 2024
0.7.2	Jul 22, 2024
0.7.1	Jul 16, 2024
0.7.0	Jul 15, 2024
0.6.0	Jul 11, 2024
0.5.0	Jul 3, 2024
0.4.3	May 8, 2024
0.4.2	Apr 19, 2024
0.4.1	Apr 12, 2024
0.4.0	Apr 11, 2024
0.3.1	Apr 2, 2024
0.3.0	Mar 27, 2024
0.2.5	Mar 26, 2024
0.2.4	Mar 25, 2024
0.2.3	Mar 18, 2024
0.2.2	Mar 15, 2024
0.2.1	Mar 13, 2024
0.2.0	Mar 12, 2024
0.1.1	Mar 11, 2024
0.1.0	Mar 11, 2024

Download files

Source Distribution

databricks_labs_lsql-0.16.0.tar.gz (52.9 kB)

Uploaded Feb 27, 2025 Source

Built Distribution

databricks_labs_lsql-0.16.0-py3-none-any.whl (48.1 kB)

Uploaded Feb 27, 2025 Python 3

File details

Details for the file databricks_labs_lsql-0.16.0.tar.gz.

File metadata

Download URL: databricks_labs_lsql-0.16.0.tar.gz
Upload date: Feb 27, 2025
Size: 52.9 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for databricks_labs_lsql-0.16.0.tar.gz
Algorithm	Hash digest
SHA256	`b6680471b198bd5c01fe2bc690a5bf4d0c7504519019fd45241435464dcb47a3`
MD5	`546d5ee8710ab8d70053d82320c5a826`
BLAKE2b-256	`8061b2e2fad8d3680b5db7ce90dea2fec84f5504747f6a648467e083c1aa3c91`
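
A downloaded file can be checked against the SHA256 digest listed above with the standard library. This is a minimal sketch: the helper names and the chunk size are arbitrary choices, and the path of the downloaded file is whatever you saved it as.

```python
import hashlib

# SHA256 digest listed above for databricks_labs_lsql-0.16.0.tar.gz.
EXPECTED_SHA256 = "b6680471b198bd5c01fe2bc690a5bf4d0c7504519019fd45241435464dcb47a3"


def sha256_of_file(path: str, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so large downloads need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path: str, expected: str = EXPECTED_SHA256) -> bool:
    # Compare case-insensitively, since hex digests may be shown in either case.
    return sha256_of_file(path) == expected.lower()
```

For example, matches_expected("databricks_labs_lsql-0.16.0.tar.gz") should be True for an intact download of the source distribution.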

Provenance

The following attestation bundles were made for databricks_labs_lsql-0.16.0.tar.gz:

Publisher: release.yml on databrickslabs/lsql

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: databricks_labs_lsql-0.16.0.tar.gz
- Subject digest: b6680471b198bd5c01fe2bc690a5bf4d0c7504519019fd45241435464dcb47a3
- Sigstore transparency entry: 175146501
- Sigstore integration time: Feb 27, 2025
Source repository:
- Permalink: databrickslabs/lsql@eababdb046f9b524170d0209a5becb116914044e
- Branch / Tag: refs/tags/v0.16.0
- Owner: https://github.com/databrickslabs
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@eababdb046f9b524170d0209a5becb116914044e
- Trigger Event: push

File details

Details for the file databricks_labs_lsql-0.16.0-py3-none-any.whl.

File metadata

Download URL: databricks_labs_lsql-0.16.0-py3-none-any.whl
Upload date: Feb 27, 2025
Size: 48.1 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for databricks_labs_lsql-0.16.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`d3004edd7ac2089156ee24db4bb2f4fce7f49bf45e2a96287d1243ac4e4a9cc0`
MD5	`452524aac7fdd980db5e000bb366bba0`
BLAKE2b-256	`ac08fc6eaa5d1b7c7cae0029170edd1916f97c430589f5c0f9c3cdec98c2c6f5`

Provenance

The following attestation bundles were made for databricks_labs_lsql-0.16.0-py3-none-any.whl:

Publisher: release.yml on databrickslabs/lsql

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: databricks_labs_lsql-0.16.0-py3-none-any.whl
- Subject digest: d3004edd7ac2089156ee24db4bb2f4fce7f49bf45e2a96287d1243ac4e4a9cc0
- Sigstore transparency entry: 175146506
- Sigstore integration time: Feb 27, 2025
Source repository:
- Permalink: databrickslabs/lsql@eababdb046f9b524170d0209a5becb116914044e
- Branch / Tag: refs/tags/v0.16.0
- Owner: https://github.com/databrickslabs
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@eababdb046f9b524170d0209a5becb116914044e
- Trigger Event: push
