# dbx-tools-config

Build and fingerprint `databricks.sdk` `Config` instances from server-supplied env or kwargs.

Tiny wrapper around `databricks.sdk.config.Config` for services that
build a `Config` per request from caller-supplied inputs: MCP
servers, brokers, multi-tenant backends, sidecars, agent frameworks,
etc.
The Databricks SDK auto-discovers config from `os.environ` of the host
process, which is the wrong scope for a service serving many callers.
This module lets each request bring its own config-shaped inputs and
materialise a `Config` (or just a fingerprint) from them. A typical
mapping for an HTTP/MCP-style request:
| Source | dbx-tools-config layer |
|---|---|
| Request headers (env-shaped) | `env=` |
| POST body / RPC payload `Config` fields | `**kwargs` |
| Pre-resolved `Config` baseline | `config=` |
Precedence is `**kwargs` > `env` > `config` (last write wins). Every
layer is optional.
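The layering amounts to a dict merge. A minimal sketch of the semantics (with a hypothetical two-entry env-name table; the real module derives the mapping from the SDK's `ConfigAttribute` declarations, and `merge_params` is an illustrative name, not this module's API):

```python
# Sketch of the precedence: config= baseline, then env=, then **kwargs.
# Later layers overwrite earlier ones; every layer is optional.

def merge_params(config=None, env=None, **kwargs):
    # Hypothetical env-name -> Config-field table for illustration only.
    env_to_field = {"DATABRICKS_HOST": "host", "DATABRICKS_TOKEN": "token"}
    merged = dict(config or {})             # config= baseline
    for key, value in (env or {}).items():  # env= layer; unknown keys ignored
        field = env_to_field.get(key)
        if field is not None:
            merged[field] = value
    merged.update(kwargs)                   # kwargs win
    return merged

params = merge_params(
    config={"host": "https://baseline", "cluster_id": "c1"},
    env={"DATABRICKS_HOST": "https://from-env", "UNRELATED": "x"},
    host="https://from-kwargs",
)
# -> {"host": "https://from-kwargs", "cluster_id": "c1"}
```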
## API

Three public helpers, all with the same signature:

```python
def config_params(
    config: Config | None = None,
    env: Mapping[str, Iterable[str] | None] | None = None,
    **kwargs,
) -> dict[str, Any]: ...

def create_config(
    config: Config | None = None,
    env: Mapping[str, Iterable[str] | None] | None = None,
    **kwargs,
) -> Config: ...

def config_params_hash(
    config: Config | None = None,
    env: Mapping[str, Iterable[str] | None] | None = None,
    **kwargs,
) -> str: ...
```
- `config_params(...)` merges `config.as_dict()` + recognised `env` keys + `**kwargs` into a single dict suitable for `Config(**...)`.
- `create_config(...)` is a one-liner for `Config(**config_params(...))`. Expensive: triggers `Config.__init__`'s host-metadata HTTP probe, `~/.databrickscfg` read and credential-strategy bootstrap.
- `config_params_hash(...)` returns a SHA-256 hex digest of the merged kwargs after dropping fields in `_HASH_IGNORE_FIELDS`. Cheap: pure in-memory compute, no `Config` constructed. See Hashing below.
## Install

Not published to PyPI; install directly from GitHub via a PEP 508
direct URL. Works with pip, uv, poetry, etc., since they all read
`[project].dependencies` from `pyproject.toml`.

`pyproject.toml`:

```toml
[project]
dependencies = [
    "dbx-tools-config @ git+https://github.com/reggie-db/dbx-tools-config",
]
```
Pin to a tag, branch or commit with the standard `@<ref>` suffix:

```toml
[project]
dependencies = [
    # tag
    "dbx-tools-config @ git+https://github.com/reggie-db/dbx-tools-config@v0.1.4",
    # branch
    "dbx-tools-config @ git+https://github.com/reggie-db/dbx-tools-config@main",
    # commit SHA
    "dbx-tools-config @ git+https://github.com/reggie-db/dbx-tools-config@<sha>",
]
```
Then install with whichever tool you use:

```shell
pip install .  # or: pip install -e .
uv sync        # or: uv add 'dbx-tools-config @ git+https://github.com/reggie-db/dbx-tools-config'
```
## Usage

### Server-style: per-request Config from headers + body

```python
import dbx_tools_config
from databricks.sdk import WorkspaceClient

# An MCP-style handler. Headers carry env-shaped names, the body
# carries Config field overrides.
def handle_request(request):
    config = dbx_tools_config.create_config(
        env=request.headers,  # e.g. {"DATABRICKS_HOST": "...",
                              #       "DATABRICKS_TOKEN": "..."}
        **request.json(),     # e.g. {"warehouse_id": "abc",
                              #       "cluster_id": "xyz"}
    )
    return WorkspaceClient(config=config).do_work(...)
```
### Other shapes

```python
# From an arbitrary env-shaped mapping
config = dbx_tools_config.create_config(env={
    "DATABRICKS_HOST": "https://myworkspace.cloud.databricks.com",
    "DATABRICKS_TOKEN": "dapi...",
})

# From the process environment (single-tenant CLIs, scripts, tests)
import os
config = dbx_tools_config.create_config(env=os.environ)

# Kwargs always win over env
config = dbx_tools_config.create_config(
    host="https://override.cloud.databricks.com",
    env=client_env,
)

# Round-trip an existing Config (e.g. as a baseline)
config = dbx_tools_config.create_config(config=other_config, host="https://override...")

# Just the merged kwargs, without constructing a Config
kwargs = dbx_tools_config.config_params(config=other_config, env=client_env)
```
## Env value semantics

Each value in the `env` mapping may be:

| Value | Behavior |
|---|---|
| `str` | Used directly. |
| `None` | Sets the field to `None` (clears any baseline from `config=`). |
| `Iterable[str]` | First element wins; matches multi-value HTTP / multidict frames. |
| empty iterable | Field is left untouched. |
## Env key resolution

Each key in the `env` mapping is matched against the SDK's declared
`ConfigAttribute.env` (and any `env_aliases`) on `Config`. Examples that
the SDK declares today:

- `DATABRICKS_HOST` -> `host`
- `DATABRICKS_TOKEN` -> `token`
- `DATABRICKS_CLUSTER_ID` -> `cluster_id`
- `DATABRICKS_OIDC_TOKEN_FILE` -> `oidc_token_filepath` (alias)
- `DATABRICKS_AZURE_RESOURCE_ID` -> `azure_workspace_resource_id`
- `ARM_TENANT_ID` -> `azure_tenant_id`
- `GOOGLE_CREDENTIALS` -> `google_credentials`

Keys that don't match a declared env name (or alias) are silently ignored.
Note: this module does not perform string-to-bool/int/float coercion. Values are forwarded to
`Config(**kwargs)` as-is and the SDK's descriptor `transform` (typically just the annotated type) does any conversion. Be aware that the SDK uses `bool(value)` for boolean fields, so the string `"false"` will resolve to `True`. Pass real Python booleans via `kwargs` if you care.
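The coercion caveat is easy to reproduce with plain Python, since it is just `bool()` on a non-empty string:

```python
# Any non-empty string is truthy, so "false" does not mean False.
assert bool("false") is True
assert bool("0") is True
assert bool("") is False
# A real boolean passed through kwargs keeps its meaning.
assert bool(False) is False
```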
## Out of scope: ambient env vars

A handful of databricks-sdk features read env vars directly from
`os.environ` instead of going through `Config`:

- `DATABRICKS_RUNTIME_VERSION` (DBR detection / user-agent)
- `IS_IN_DB_MODEL_SERVING_ENV`, `IS_IN_DATABRICKS_MODEL_SERVING_ENV`, `DATABRICKS_MODEL_SERVING_HOST_URL`, `DB_MODEL_SERVING_HOST_URL` (model serving auto-auth)
- `ACTIONS_ID_TOKEN_REQUEST_TOKEN`, `ACTIONS_ID_TOKEN_REQUEST_URL` (GitHub Actions OIDC)
- `SYSTEM_ACCESSTOKEN`, `SYSTEM_*` (Azure DevOps OIDC)
- `AGENT` (user-agent)
Forwarding these through `dbx_tools_config.create_config(env=...)` has
no effect because they bypass `Config` entirely. If you need them in a
service context, set them on `os.environ` of the worker process before
constructing the SDK client.
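A minimal sketch of that workaround (the variable name comes from the list above; the `"15.4"` value is purely illustrative):

```python
import os

# Ambient variables are read straight from os.environ by the SDK, so they
# must be set on the worker process itself; passing them via env= is a no-op.
os.environ.setdefault("DATABRICKS_RUNTIME_VERSION", "15.4")
# ...then construct the SDK client / call create_config as usual.
```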
## Hashing

`dbx_tools_config.config_params_hash(...)` returns a stable SHA-256 hex
digest of the resolved kwargs without constructing a `Config`. This
matters because `Config.__init__` is not free; it does (in order):

1. `_resolve_host_metadata` - HTTP `GET host/.well-known/databricks-config` to discover `account_id`, `workspace_id`, `cloud`, `discovery_url`.
2. `_known_file_config_loader` - reads `~/.databrickscfg` from disk if no auth is configured directly.
3. `_validate` - checks for conflicting auth methods.
4. `init_auth` - bootstraps the credential strategy (which itself may shell out to the Databricks CLI, fetch a token from disk, etc.).
For a service that fans many requests over a small set of logical identities, hashing first lets you cache (or rate-limit) clients without paying any of the above per request:

```python
import dbx_tools_config
from databricks.sdk import WorkspaceClient

_clients: dict[str, WorkspaceClient] = {}

def client_for(request):
    key = dbx_tools_config.config_params_hash(env=request.headers, **request.json())
    client = _clients.get(key)
    if client is None:
        config = dbx_tools_config.create_config(env=request.headers, **request.json())
        client = _clients[key] = WorkspaceClient(config=config)
    return client
```
The digest deliberately ignores fields that don't change which workspace / account is being addressed or how it's being authenticated:

| Group | Fields ignored |
|---|---|
| Source / lookup | `profile`, `config_file`, `databricks_cli_path` |
| Derived during init | `auth_type`, `databricks_environment` |

So two configs that resolve to the same identity but were loaded from
different `DATABRICKS_CONFIG_PROFILE` / `DATABRICKS_CONFIG_FILE` paths,
via a different CLI binary, or that happened to be tagged with a
different derived `auth_type`, fingerprint the same way.
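The effect of the ignore set can be sketched with a plain dict filter (a hypothetical helper, with the ignore set copied from the table above):

```python
# Fields in the ignore set never influence the fingerprint.
HASH_IGNORE_FIELDS = {
    "profile", "config_file", "databricks_cli_path",  # source / lookup
    "auth_type", "databricks_environment",            # derived during init
}

def fingerprint_view(params: dict) -> dict:
    """The subset of params that would actually feed the digest."""
    return {k: v for k, v in params.items() if k not in HASH_IGNORE_FIELDS}

a = fingerprint_view({"host": "https://w", "profile": "dev"})
b = fingerprint_view({"host": "https://w", "profile": "prod", "auth_type": "pat"})
# a == b: same identity, so both would fingerprint the same way
```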
Normalisation:

- Mapping keys are sorted (after JSON-encoding) so dict ordering does not affect the digest.
- `None` collapses with the empty string, so an explicit `None` value hashes the same as an explicit `""`.
- Iterables (other than strings) preserve their order.
- Scalar values are stringified via `str()` and JSON-quoted before being streamed into the digest.
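Those rules can be sketched roughly as follows. This is a hypothetical reimplementation for intuition only; the byte layout differs from the real `config_params_hash`, so the digests it produces will not match the module's:

```python
import hashlib
import json

def sketch_hash(params: dict) -> str:
    h = hashlib.sha256()
    # Sort keys after JSON-encoding so dict ordering is irrelevant.
    for key in sorted(params, key=json.dumps):
        value = params[key]
        if value is None:
            value = ""  # None collapses with the empty string
        h.update(json.dumps(key).encode())
        h.update(json.dumps(str(value)).encode())  # str() then JSON-quoted
    return h.hexdigest()

# Dict ordering does not affect the digest:
assert sketch_hash({"a": 1, "b": 2}) == sketch_hash({"b": 2, "a": 1})
# None and "" fingerprint the same:
assert sketch_hash({"token": None}) == sketch_hash({"token": ""})
```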
## Development

```shell
uv sync
uv build
uv run pytest
```