Synchronize Databricks workspace content with a local directory.

dbx-sync

         __ __
    ,___/ // /____   __      _____ __  ______  _____
   / __  // __ \\ \ / /_____/ ___// / / / __ \/ ___/
  / /_/ // /_/ //  X  \____(__  )/ /_/ / / / / /__
 /_____//_____//__/ \__\  /____/ \__, /_/ /_/\___/
                                 /____/

Are you tired of bouncing between the Databricks workspace UI and your local editor, copying changes by hand, and pretending that counts as a workflow? Well, now there's dbx-sync.

dbx-sync keeps a Databricks workspace folder or file and a local directory or file in sync so you can work with your favorite tools and still stay aligned with what is running in Databricks.

Build locally, run in Databricks, tweak it there, then jump back to local coding. Skip the usual copy-paste ritual or one-way imports to weird folders.

Great for AI coding-agent workflows, including GitHub Copilot and Claude-based setups that work best against a real local folder.

Worried about losing files? dbx-sync does not delete files locally or remotely, but it can overwrite content if both sides changed while you were not syncing. Use version control locally and Databricks revision history remotely when you need rollback.

Current scope notes:

  • Sync is limited to a single local folder/workspace folder pair or one local/workspace file pair.
  • File and folder discovery is not recursive.
  • Local tracking currently covers notebook files with Databricks notebook extensions: .py, .sql, .scala, .r, and .ipynb.
  • When syncing a single local file, notebook extensions are imported as notebooks and other files are imported as workspace files.
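
To make the scope notes concrete, here is a minimal Python sketch of non-recursive discovery and of the notebook versus workspace-file split. It is an illustration only, not dbx-sync's actual code: the names NOTEBOOK_EXTENSIONS, is_notebook, discover_local_notebooks, and import_kind are hypothetical, and the case-insensitive extension match is an assumption.

```python
from pathlib import Path

# Extensions listed in the scope notes. Matching them case-insensitively
# is an assumption made for this sketch.
NOTEBOOK_EXTENSIONS = {".py", ".sql", ".scala", ".r", ".ipynb"}


def is_notebook(path: Path) -> bool:
    """Return True when the file extension marks a Databricks notebook."""
    return path.suffix.lower() in NOTEBOOK_EXTENSIONS


def discover_local_notebooks(folder: Path) -> list[Path]:
    """List notebook files directly inside `folder`, without recursing."""
    return sorted(p for p in folder.iterdir() if p.is_file() and is_notebook(p))


def import_kind(path: Path) -> str:
    """Classify a single local file as described in the last scope note."""
    return "notebook" if is_notebook(path) else "workspace file"
```

In this sketch a file such as sub/query.sql inside a subfolder is never returned, because only the direct children of the folder are inspected.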

Prerequisites

Install

Recommended: install as a uv tool

Install dbx-sync as a tool so you can run it directly from your shell:

uv tool install dbx-sync

Update tool

uv tool upgrade dbx-sync

Alternative: install with pip

If you prefer a standard virtual environment workflow, install the package with pip:

python -m pip install dbx-sync

Alternative: run from a local checkout

If you are developing on the project itself, install the local environment and run it with uv run:

uv sync --dev
uv run dbx-sync ./local-project /Workspace/Users/me/project

Usage

The command takes two positional arguments: the first is always the local path (file or directory) and the second is always the remote Databricks workspace path (file or folder):

dbx-sync <local-path> <remote-workspace-path>

Sync a single workspace folder with a single local folder (one-time):

dbx-sync ./local-project /Workspace/Users/me/project

Sync a single local file to a workspace folder, using the source filename for the target object:

dbx-sync ./local-project/notebook.py /Workspace/Users/me/project

Sync a single workspace file or notebook to a local folder, using the source filename locally:

dbx-sync ./local-project /Workspace/Users/me/project/notebook

Sync explicit local and workspace file paths:

dbx-sync ./local-project/notebook.py /Workspace/Users/me/project/notebook

Preview actions without applying them:

dbx-sync ./local-project /Workspace/Users/me/project --dry-run

Continuously watch and resync (default polling happens every second):

dbx-sync ./local-project /Workspace/Users/me/project --watch
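
Conceptually, watch mode is a polling loop. The Python sketch below shows that idea under stated assumptions: watch, sync_once, and max_passes are hypothetical names that are not part of dbx-sync, and the 1.0 second default mirrors the polling default mentioned above.

```python
import time
from typing import Callable, Optional


def watch(sync_once: Callable[[], None],
          poll_interval: float = 1.0,
          max_passes: Optional[int] = None) -> int:
    """Call `sync_once` repeatedly, sleeping `poll_interval` seconds between passes.

    `max_passes` is a hypothetical knob so the loop can be exercised in tests;
    with None the loop runs until it is interrupted (Ctrl+C).
    """
    passes = 0
    try:
        while max_passes is None or passes < max_passes:
            sync_once()
            passes += 1
            time.sleep(poll_interval)
    except KeyboardInterrupt:
        pass  # stop quietly when the user interrupts the watcher
    return passes
```

A longer interval, such as the 5 seconds passed via --poll-interval in the settings example, trades freshness for fewer workspace API calls.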

Use --force to clear saved sync state before a fresh pass. This can be useful to handle conflicts.

Pro-tip: add --dry-run to check force behavior before running it for real.

Force options are mutually exclusive and only apply to a single sync pass:

  • --force clears saved sync state before comparing local and remote files.
  • --force-upload uploads matching local files even when saved sync state would otherwise skip them.
  • --force-download downloads matching remote files even when saved sync state would otherwise skip them.

dbx-sync ./local-project /Workspace/Users/me/project --force
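
One way to picture how saved sync state interacts with the force options is the Python sketch below. It is purely illustrative: how dbx-sync actually stores and compares state is not described here, so the SHA-256 fingerprint and the names fingerprint and can_skip_local_file are assumptions.

```python
import hashlib
from typing import Optional


def fingerprint(content: bytes) -> str:
    """Content hash standing in for the saved sync state of one file (assumed)."""
    return hashlib.sha256(content).hexdigest()


def can_skip_local_file(local_content: bytes,
                        saved_fingerprint: Optional[str],
                        force: bool = False,
                        force_upload: bool = False) -> bool:
    """Return True when saved state alone lets a pass skip this local file."""
    if force or force_upload:
        # --force discards saved state; --force-upload ignores it for uploads.
        return False
    return saved_fingerprint is not None and saved_fingerprint == fingerprint(local_content)
```

In this model, an unchanged file is skipped on a normal pass but is always examined again when either force option is set.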

Override optional settings when needed:

dbx-sync ./local-project /Workspace/Users/me/project \
    --profile WORKSPACE \
    --poll-interval 5 \
    --log-level DEBUG

Watch mode cannot be combined with force options or dry-run mode. Use --watch for continuous syncing, or use --dry-run, --force, --force-upload, or --force-download for one-time sync passes.
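
The option rules above can be expressed with Python's argparse. This is a sketch of the documented constraints, not the tool's real parser: build_parser and parse_args are hypothetical names, and only the flags relevant to the rules are modeled.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Parser modeling only the flags involved in the combination rules."""
    parser = argparse.ArgumentParser(prog="dbx-sync", allow_abbrev=False)
    parser.add_argument("local_path")
    parser.add_argument("remote_path")
    parser.add_argument("--dry-run", action="store_true")
    parser.add_argument("--watch", action="store_true")
    # At most one force option may be given per invocation.
    force = parser.add_mutually_exclusive_group()
    force.add_argument("--force", action="store_true")
    force.add_argument("--force-upload", action="store_true")
    force.add_argument("--force-download", action="store_true")
    return parser


def parse_args(argv: list[str]) -> argparse.Namespace:
    """Parse argv and reject --watch combined with any one-time option."""
    parser = build_parser()
    args = parser.parse_args(argv)
    one_time = args.dry_run or args.force or args.force_upload or args.force_download
    if args.watch and one_time:
        parser.error("--watch cannot be combined with --dry-run or force options")
    return args
```

Combining --force with --dry-run is accepted in this sketch, matching the pro-tip above, while --watch together with --dry-run, or two force options at once, exits with an argparse usage error.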

If your local directory does not exist, the tool will attempt to create it for you (when not in dry-run mode).

Notes on Jupyter Notebooks

The Databricks CLI command databricks workspace list reports Jupyter notebooks the same way as other notebooks. When a remote notebook has no matching local .ipynb file, dbx-sync therefore exports it as .py.

You can manually export them as .ipynb first if you wish to avoid this, using databricks workspace export <FILE> --format JUPYTER --file <FILE>.ipynb.
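
If you have several notebooks to export this way, the command can be wrapped in a small Python helper. This is a convenience sketch, not part of dbx-sync: jupyter_export_command and export_as_jupyter are hypothetical names, the local file name is assumed to be taken from the last segment of the workspace path, and running it requires an installed and authenticated Databricks CLI.

```python
import subprocess
from pathlib import PurePosixPath


def jupyter_export_command(workspace_path: str, local_dir: str = ".") -> list[str]:
    """Build the argv for exporting one workspace notebook as an .ipynb file."""
    name = PurePosixPath(workspace_path).name
    target = f"{local_dir.rstrip('/')}/{name}.ipynb"
    return ["databricks", "workspace", "export", workspace_path,
            "--format", "JUPYTER", "--file", target]


def export_as_jupyter(workspace_path: str, local_dir: str = ".") -> None:
    """Run the export command and raise if the CLI reports a failure."""
    subprocess.run(jupyter_export_command(workspace_path, local_dir), check=True)
```

Once the .ipynb file exists locally, there is a matching local .ipynb file for that notebook, which is the condition the note above describes for avoiding the .py export.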

Alternatives

Yes, I recognize there are a variety of official ways to do something close to this, but none of them fit my desired workflow well. So here are some references for alternatives.

Development

See CONTRIBUTING.md for local development, testing, release, and repository workflow details.

License

MIT. See LICENSE.
