DataBricks CLI eXtensions aka dbx
DataBricks CLI eXtensions, aka dbx, is a CLI tool for advanced Databricks jobs management.
Concept
dbx simplifies the job launch and deployment process across multiple environments.
It also helps to package your project and deliver it to your Databricks environment in a versioned fashion.
Designed in a CLI-first manner, it is built to be actively used both inside CI/CD pipelines and as a part of local tooling for fast prototyping.
Requirements
- Python Version > 3.6
- pip or conda
Installation
With pip:
pip install dbx
Quickstart
Please refer to the Quickstart section.
Documentation
Please refer to the docs page.
Differences from other tools
Tool | Comment |
---|---|
databricks-cli | dbx is NOT a replacement for databricks-cli. Quite the opposite - dbx is heavily dependent on databricks-cli and uses most of the APIs exactly from databricks-cli SDK. |
MLflow CLI | dbx is NOT a replacement for mlflow cli. dbx uses some of the MLflow APIs under the hood to store serialized job objects, but doesn’t use mlflow CLI directly. |
Databricks Terraform Provider | While dbx is primarily oriented toward versioned job management, Databricks Terraform Provider provides a much wider set of infrastructure settings. In comparison, dbx doesn’t provide infrastructure management capabilities, but brings more flexible deployment and launch options. |
Databricks Stack CLI | Databricks Stack CLI is a great component for managing a stack of objects. dbx concentrates on versioning and packaging jobs together, not treating files and notebooks as a separate component. |
Limitations
Development:
- dbx currently doesn’t provide interactive debugging capabilities. If you want to use interactive debugging, you can use Databricks Connect + dbx for deployment operations.
- dbx execute only supports Python-based projects which use local files (Notebooks or Repos are not supported in dbx execute).
- dbx execute can only be used on clusters with Databricks ML Runtime 7.X or higher.

General:
- dbx doesn’t support Delta Live Tables at the moment.
- host in your profile configuration in ~/.databrickscfg shall only consist of two parts: {scheme}://{netloc}, e.g. https://some-host.cloud.databricks.com. Strings like https://some-host.cloud.databricks.com/?o=XXXX# are not supported. As a symptom of this, you might see the error below:
raise MlflowException("%s. Response body: '%s'" % (base_msg, response.text))
mlflow.exceptions.MlflowException: API request to endpoint was successful but the response body was not in a valid JSON format.
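To catch this before running dbx, the host value can be checked with the Python standard library. The sketch below is not part of dbx; the helper names is_supported_host and normalize_host are hypothetical, and it assumes that "two parts" means no path, query, or fragment at all (so even a trailing slash is rejected).

```python
from urllib.parse import urlsplit, urlunsplit


def is_supported_host(host: str) -> bool:
    """Return True if host is exactly {scheme}://{netloc} with nothing after it.

    Assumption: any path (even a lone "/"), query, or fragment makes the
    value unsupported.
    """
    parts = urlsplit(host.strip())
    has_extras = bool(parts.path or parts.query or parts.fragment)
    # A literal "#" or "?" with nothing after it parses to an empty
    # fragment/query, so also compare against the rebuilt two-part form.
    rebuilt = urlunsplit((parts.scheme, parts.netloc, "", "", ""))
    return bool(parts.scheme and parts.netloc) and not has_extras and rebuilt == host.strip()


def normalize_host(host: str) -> str:
    """Strip everything after the netloc, keeping only {scheme}://{netloc}."""
    parts = urlsplit(host.strip())
    if not (parts.scheme and parts.netloc):
        raise ValueError(f"Cannot extract scheme and netloc from {host!r}")
    return urlunsplit((parts.scheme, parts.netloc, "", "", ""))


if __name__ == "__main__":
    bad = "https://some-host.cloud.databricks.com/?o=XXXX#"
    print(is_supported_host(bad))
    print(normalize_host(bad))
```

The normalized value is what would go into the host entry of the profile in ~/.databrickscfg.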
Versioning
For CLI interfaces, we follow the SemVer approach. However, for API components we don’t use SemVer as of now.
This may lead to instability when using dbx API methods directly.
Legal Information
This software is provided as-is and is not officially supported by Databricks through customer technical support channels. Support, questions, and feature requests can be communicated through the Issues page of this repo. Please see the legal agreement and understand that issues with the use of this code will not be answered or investigated by Databricks Support.
Feedback
Issues with dbx? Found a bug? Have a great idea for an addition? Feel free to file an issue.
Contributing
Please find more details about contributing to dbx in the contributing doc.