
Production-focused Databricks API toolkit by Rehla Digital Inc for workspace and account automation on AWS.


Unified Databricks API

Single Python package to call Databricks Workspace and Account APIs and convert JSON responses into Pandas or PySpark DataFrames.

About Rehla Digital Inc

Rehla Digital Inc builds cloud and data engineering solutions that help teams standardize platform operations, accelerate delivery, and reduce integration risk. This package is maintained as part of that effort to provide a practical, production-oriented Databricks API toolkit.

Install

pip install rehla-dbx-tools

The distribution name uses hyphens, but the package is imported in Python with underscores:

from rehla_dbx_tools import DatabricksApiClient

Install Spark extras if needed:

pip install "rehla-dbx-tools[spark]"

Quick Start

from rehla_dbx_tools import DatabricksApiClient

client = DatabricksApiClient.from_env()
if client.workspace is not None:
    jobs = client.workspace.list_jobs()
    df = jobs.to_pandas()
    print(df.head())

# Force both workspace/account config to a target cloud
client = DatabricksApiClient.from_env_for_cloud("azure")

Simple host/token setup:

from rehla_dbx_tools import DatabricksApiClient

client = DatabricksApiClient.simple(
    host="https://dbc-xxxx.cloud.databricks.com",
    token="dapi...token...",
)

for run in client.list_active_job_runs(limit=25):
    print(run.get("run_id"))

The token can be omitted if you want guided authentication:

client = DatabricksApiClient.simple(
    host="https://dbc-xxxx.cloud.databricks.com",
    open_browser_for_token=True,  # opens Access Tokens page
    prompt_for_token=True,         # prompts to paste token
)

Windows SSO flow (Databricks CLI login):

client = DatabricksApiClient.from_windows_sso(
    host="https://dbc-xxxx.cloud.databricks.com",
)

Notebook Context Bootstrap

Inside Databricks notebooks:

from rehla_dbx_tools import DatabricksApiClient

client = DatabricksApiClient.from_notebook_context()
if client.workspace is not None:
    clusters = client.workspace.list_clusters()
    spark_df = clusters.to_spark()
    display(spark_df)

Account API

The account client is enabled when the DATABRICKS_ACCOUNT_HOST and DATABRICKS_ACCOUNT_ID environment variables are set.

if client.account is not None:
    workspaces = client.account.list_workspaces()
    print(workspaces.to_pandas().head())
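Before relying on `client.account`, you can sanity-check the environment up front. A minimal sketch (the helper name `account_env_ready` is not part of the package, just an illustration):

```python
import os

def account_env_ready() -> bool:
    """True when both account-scoped variables are set and non-empty."""
    required = ("DATABRICKS_ACCOUNT_HOST", "DATABRICKS_ACCOUNT_ID")
    return all(os.environ.get(var) for var in required)
```

This keeps the failure mode explicit: a missing variable yields a clear boolean check rather than a `None` account client discovered deep in a pipeline.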

Version-Aware Generic Request

response = client.workspace.request_versioned(
    "GET",
    service="unity-catalog",
    endpoint="metastores",
    api_version="2.1",
)
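List endpoints reached through the generic request are typically paginated. A hedged sketch of a pagination loop, assuming `request_versioned` returns a parsed JSON dict and that the endpoint follows the common `next_page_token` convention (the `fetch` callable would be a small lambda wrapping the client call):

```python
def paginate(fetch, list_key: str, token_key: str = "next_page_token"):
    """Yield items from every page of a token-paginated list endpoint.

    `fetch` is any callable accepting page_token=... and returning a parsed
    JSON dict, e.g.:
        lambda page_token=None: client.workspace.request_versioned(
            "GET", service="unity-catalog", endpoint="metastores",
            api_version="2.1",
        )
    (exact parameter plumbing depends on the endpoint).
    """
    token = None
    while True:
        payload = fetch(page_token=token)
        # Emit this page's items, then follow the token if one is present.
        yield from payload.get(list_key, [])
        token = payload.get(token_key)
        if not token:
            break
```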

Expanded Convenience Wrappers

import getpass

if client.workspace is not None:
    run = client.workspace.run_job_now(job_id=123)
    runs = client.workspace.list_job_runs(job_id=123, active_only=True, limit=10)
    run_export = client.workspace.export_job_run(run_id=987, views_to_export="CODE")
    run_output = client.workspace.get_job_run_output(run_id=987)
    run_submit = client.workspace.submit_job_run({"run_name": "ad-hoc-check"})
    run_delete = client.workspace.delete_job_run(run_id=987)
    job_permissions = client.workspace.get_job_permissions(job_id=123)
    permission_levels = client.workspace.get_job_permission_levels(job_id=123)
    permission_update = client.workspace.update_job_permissions(
        job_id=123,
        access_control_list=[{"group_name": "admins", "permission_level": "CAN_MANAGE"}],
    )
    cluster_permissions = client.workspace.get_cluster_permissions(cluster_id="0123-abc")
    cluster_permission_levels = client.workspace.get_cluster_permission_levels(cluster_id="0123-abc")
    repo_permissions = client.workspace.get_repo_permissions(repo_id=12345)
    repo_permission_levels = client.workspace.get_repo_permission_levels(repo_id=12345)
    repair = client.workspace.repair_job_run(run_id=987, rerun_all_failed_tasks=True)
    cancel_all = client.workspace.cancel_all_job_runs(job_id=123, all_queued_runs=True)
    cluster = client.workspace.get_cluster(cluster_id="0123-abc")
    catalogs = client.workspace.list_catalogs(max_results=25)
    warehouses = client.workspace.list_sql_warehouses()
    dbfs_files = client.workspace.list_dbfs("dbfs:/tmp")
    token = client.workspace.create_token(lifetime_seconds=3600, comment="ci-short-lived")
    rotated_token = client.workspace.rotate_token(
        token_id_to_revoke="old-token-id",
        lifetime_seconds=3600,
        comment="ci-rotation",
    )
    repos = client.workspace.list_repos(path_prefix="/Repos/team")
    repo = client.workspace.get_repo(repo_id=12345)
    client.workspace.put_secret(
        scope="app-prod",
        key="api-token",
        string_value=getpass.getpass("Secret value: "),
    )
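A common pattern on top of the run wrappers is blocking until a run finishes. A minimal sketch, assuming run payloads follow the Jobs API shape with a `state.life_cycle_state` field; the `get_run` callable is injected so you can adapt it to whichever wrapper your client exposes:

```python
import time

def wait_for_run(get_run, run_id: int, poll_seconds: float = 10.0,
                 timeout: float = 3600.0) -> dict:
    """Poll a job run until its life_cycle_state is terminal.

    `get_run` is any callable taking a run id and returning a run dict,
    e.g. a lambda wrapping one of the client's run-fetching wrappers.
    """
    terminal = {"TERMINATED", "SKIPPED", "INTERNAL_ERROR"}
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        run = get_run(run_id)
        state = run.get("state", {}).get("life_cycle_state")
        if state in terminal:
            return run
        time.sleep(poll_seconds)
    raise TimeoutError(f"run {run_id} not terminal after {timeout}s")
```

Pair this with `run_job_now` or `submit_job_run` to get simple synchronous orchestration in CI scripts.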

if client.account is not None:
    ws = client.account.get_workspace(workspace_id=101)
    creds = client.account.list_credentials()
    storage_cfgs = client.account.list_storage_configurations()
    networks = client.account.list_networks()
    private_access = client.account.list_private_access_settings()
    vpc_endpoints = client.account.list_vpc_endpoints()
    cmks = client.account.list_customer_managed_keys()
    users = client.account.list_users()
    user = client.account.get_user("user-101")
    groups = client.account.list_groups()
    group = client.account.get_group("group-101")
    budgets = client.account.list_budget_policies()
    log_delivery_configs = client.account.list_log_delivery_configurations()
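The account wrappers return tabular-friendly data, so small summaries are easy to build without Spark. A sketch that counts workspaces per AWS region, assuming each row from `list_workspaces()` carries an `aws_region` field (the usual shape for AWS workspace objects; rows without it are bucketed as "unknown"):

```python
from collections import Counter

def workspaces_by_region(workspaces: list) -> Counter:
    """Count workspaces per AWS region from list_workspaces() rows."""
    return Counter(ws.get("aws_region", "unknown") for ws in workspaces)
```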

For detailed setup and examples, see docs/USAGE.md.
