Databricks DBAPI.
A thin wrapper around pyhive for connecting to an interactive Databricks cluster.
Installation
Install using pip:
pip install databricks-dbapi
Usage
The connect() function returns a pyhive Hive connection object, which internally wraps a thrift connection.
Using a Databricks API token (recommended):
import os
from databricks_dbapi import databricks
token = os.environ["DATABRICKS_TOKEN"]
host = os.environ["DATABRICKS_HOST"]
# host = <account_name>.cloud.databricks.com
cluster = os.environ["DATABRICKS_CLUSTER"]
connection = databricks.connect(
host=host,
cluster=cluster,
token=token,
)
cursor = connection.cursor()
cursor.execute("SELECT * FROM some_table LIMIT 100")
print(cursor.fetchone())
print(cursor.fetchall())
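Rows come back as plain tuples, so it can help to pair them with column names. The sketch below assumes the cursor exposes the standard DB-API 2.0 (PEP 249) `description` attribute, which pyhive cursors are expected to provide. `rows_as_dicts` is a hypothetical helper, not part of databricks-dbapi, and the demo runs against an in-memory SQLite database only so that it works without a cluster.

```python
import sqlite3


def rows_as_dicts(cursor):
    """Return the remaining rows of an executed DB-API cursor as dicts.

    Hypothetical helper: relies only on PEP 249's cursor.description,
    where the first item of each entry is the column name.
    """
    columns = [column[0] for column in cursor.description]
    return [dict(zip(columns, row)) for row in cursor.fetchall()]


if __name__ == "__main__":
    # Stand-in for a Databricks connection so the sketch runs anywhere.
    demo = sqlite3.connect(":memory:")
    cursor = demo.cursor()
    cursor.execute("CREATE TABLE some_table (id INTEGER, name TEXT)")
    cursor.executemany("INSERT INTO some_table VALUES (?, ?)", [(1, "a"), (2, "b")])
    cursor.execute("SELECT * FROM some_table LIMIT 100")
    for row in rows_as_dicts(cursor):
        print(row)
```

With a Databricks connection, the same helper would be called right after `cursor.execute(...)` in place of `cursor.fetchall()`.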
Using your username and password (not recommended):
import os
from databricks_dbapi import databricks
user = os.environ["DATABRICKS_USER"]
password = os.environ["DATABRICKS_PASSWORD"]
host = os.environ["DATABRICKS_HOST"]
# host = <account_name>.cloud.databricks.com
cluster = os.environ["DATABRICKS_CLUSTER"]
connection = databricks.connect(
host=host,
cluster=cluster,
user=user,
password=password
)
cursor = connection.cursor()
cursor.execute("SELECT * FROM some_table LIMIT 100")
print(cursor.fetchone())
print(cursor.fetchall())
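Since both authentication styles take the same `host` and `cluster` arguments, the choice between them can be made in one place. The following is a minimal sketch, assuming the environment variable names used in the examples above. `connect_kwargs_from_env` is a hypothetical helper, not part of databricks-dbapi. It prefers the token and falls back to username and password only when no token is set.

```python
import os


def connect_kwargs_from_env(env=None):
    """Build keyword arguments for databricks.connect() from environment variables.

    Hypothetical helper: prefers DATABRICKS_TOKEN (recommended) and falls back
    to DATABRICKS_USER / DATABRICKS_PASSWORD when no token is present.
    """
    env = os.environ if env is None else env
    kwargs = {
        "host": env["DATABRICKS_HOST"],
        "cluster": env["DATABRICKS_CLUSTER"],
    }
    token = env.get("DATABRICKS_TOKEN")
    if token:
        kwargs["token"] = token
    else:
        # Raises KeyError if neither a token nor user/password is configured.
        kwargs["user"] = env["DATABRICKS_USER"]
        kwargs["password"] = env["DATABRICKS_PASSWORD"]
    return kwargs


# Intended use (requires databricks-dbapi and a reachable cluster):
# from databricks_dbapi import databricks
# connection = databricks.connect(**connect_kwargs_from_env())
```

Passing a plain dict as `env` makes the helper easy to exercise without touching the real environment.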
The pyhive connection also provides async functionality:
import os
from databricks_dbapi import databricks
from TCLIService.ttypes import TOperationState
token = os.environ["DATABRICKS_TOKEN"]
host = os.environ["DATABRICKS_HOST"]
# host = <account_name>.cloud.databricks.com
cluster = os.environ["DATABRICKS_CLUSTER"]
connection = databricks.connect(
host=host,
cluster=cluster,
token=token,
)
cursor = connection.cursor()
cursor.execute("SELECT * FROM some_table LIMIT 100", async_=True)
status = cursor.poll().operationState
while status in (TOperationState.INITIALIZED_STATE, TOperationState.RUNNING_STATE):
    logs = cursor.fetch_logs()
    for message in logs:
        print(message)
    # If needed, an asynchronous query can be cancelled at any time with:
    # cursor.cancel()
    status = cursor.poll().operationState
print(cursor.fetchall())
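The polling loop above checks the query state continuously and never gives up. One possible variation, sketched here, pauses between polls and cancels the query after a deadline. `wait_for_query` is a hypothetical helper, not part of databricks-dbapi or pyhive. It uses only the `poll()` and `cancel()` calls shown above and takes the set of "still running" states as a parameter, so the sketch itself does not need to import `TCLIService`.

```python
import time


def wait_for_query(cursor, running_states, poll_interval=1.0, timeout=None):
    """Poll an asynchronous query until it leaves running_states.

    Hypothetical helper. Returns the final operation state. If timeout
    (in seconds) elapses first, cancels the query and raises TimeoutError.
    """
    deadline = None if timeout is None else time.monotonic() + timeout
    status = cursor.poll().operationState
    while status in running_states:
        if deadline is not None and time.monotonic() >= deadline:
            cursor.cancel()
            raise TimeoutError("query did not finish within %s seconds" % timeout)
        time.sleep(poll_interval)
        status = cursor.poll().operationState
    return status


# Intended use with the async example above:
# running = (TOperationState.INITIALIZED_STATE, TOperationState.RUNNING_STATE)
# cursor.execute("SELECT * FROM some_table LIMIT 100", async_=True)
# wait_for_query(cursor, running, poll_interval=2.0, timeout=300)
# print(cursor.fetchall())
```

The helper returns whatever state ended the loop, so the caller should still check that state before fetching results, since a query can also end in an error or cancelled state.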