jupyter-databricks-kernel

A Jupyter kernel for complete remote execution on Databricks clusters.
1. Features
- Execute Python code entirely on Databricks clusters
- Works with VS Code, JupyterLab, and other Jupyter frontends
2. Requirements
- Python 3.11 or later
- Databricks workspace with Personal Access Token
- Classic all-purpose cluster
3. Quick Start

- Install the kernel:

  ```bash
  # With uv
  uv pip install jupyter-databricks-kernel
  uv run python -m jupyter_databricks_kernel.install

  # With pip
  pip install jupyter-databricks-kernel
  python -m jupyter_databricks_kernel.install
  ```
  Install options:

  | Option | Description |
  |---|---|
  | (default) | Install to the current venv (`sys.prefix`) |
  | `--user` | Install to the user directory (`~/.local/share/jupyter/kernels/`) |
  | `--prefix PATH` | Install to a custom path |
- Configure authentication and cluster:

  ```bash
  # Recommended: use the Databricks CLI to set up everything
  databricks auth login --configure-cluster
  ```

  This creates `~/.databrickscfg` with authentication credentials and the cluster ID.

  Alternatively, use environment variables:

  ```bash
  # Override cluster ID (optional, takes priority over ~/.databrickscfg)
  export DATABRICKS_CLUSTER_ID=your-cluster-id

  # Authentication (if not using ~/.databrickscfg)
  export DATABRICKS_HOST=https://your-workspace.cloud.databricks.com
  export DATABRICKS_TOKEN=your-personal-access-token

  # Use a specific profile from ~/.databrickscfg (optional)
  export DATABRICKS_CONFIG_PROFILE=your-profile-name
  ```

  For more authentication options, see Databricks SDK Authentication.
- Open a notebook and select the "Databricks" kernel:

  VS Code:

  - Install the Jupyter extension
  - Open a `.ipynb` file
  - Click "Select Kernel" and choose "Databricks"

  JupyterLab:

  ```bash
  jupyter-lab
  ```

  Select "Databricks" from the kernel list.
- Run a simple test:

  ```python
  spark.version
  ```

  If the cluster is stopped, the first execution may take 5-6 minutes while the cluster starts.
4. Configuration
4.1. Cluster ID
The cluster ID is read from (in order of priority):

1. `DATABRICKS_CLUSTER_ID` environment variable
2. `~/.databrickscfg` (from the active profile)

The active profile is determined by the `DATABRICKS_CONFIG_PROFILE` environment variable, falling back to `DEFAULT` if not set.
Example `~/.databrickscfg`:

```ini
[DEFAULT]
host = https://your-workspace.cloud.databricks.com
token = dapi...
cluster_id = 0123-456789-abcdef12
```
4.2. Sync Settings
You can configure file synchronization in `pyproject.toml`:

```toml
[tool.jupyter-databricks-kernel.sync]
enabled = true
source = "."
exclude = ["*.log", "data/"]
max_size_mb = 100.0
max_file_size_mb = 10.0
use_gitignore = true
```
| Option | Description | Default |
|---|---|---|
| `sync.enabled` | Enable file synchronization | `true` |
| `sync.source` | Source directory to sync | `"."` |
| `sync.exclude` | Additional exclude patterns | `[]` |
| `sync.max_size_mb` | Maximum total project size in MB | No limit |
| `sync.max_file_size_mb` | Maximum individual file size in MB | No limit |
| `sync.use_gitignore` | Respect `.gitignore` patterns | `true` |
5. Known Limitations
- Serverless compute is not supported (Command Execution API limitation)
- `input()` and interactive prompts do not work
- Interactive widgets (ipywidgets) are not supported
6. Troubleshooting
6.1. Kernel feels slow
File sync may be uploading unnecessary files. Check your sync settings:
- Ensure `.gitignore` includes large/unnecessary files:

  ```
  .venv/
  __pycache__/
  *.pyc
  data/
  *.parquet
  node_modules/
  ```

- Add exclude patterns in `pyproject.toml`:

  ```toml
  [tool.jupyter-databricks-kernel.sync]
  exclude = ["data/", "models/", "*.csv"]
  ```

- Set size limits to catch unexpectedly large files:

  ```toml
  [tool.jupyter-databricks-kernel.sync]
  max_size_mb = 50.0
  max_file_size_mb = 10.0
  ```

- Disable sync entirely if not needed:

  ```toml
  [tool.jupyter-databricks-kernel.sync]
  enabled = false
  ```
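To diagnose which files would trip a size limit before touching the config, a small audit script can walk the project and apply the same kind of filtering. This is an illustrative sketch using `fnmatch`-style patterns, not the kernel's own sync filter (`oversized_files` is a hypothetical helper):

```python
import fnmatch
import tempfile
from pathlib import Path


def oversized_files(root: str, exclude: list[str], max_file_size_mb: float) -> list[str]:
    """List files under `root` larger than the per-file limit, skipping
    excluded patterns. A trailing-slash pattern like "data/" skips the
    whole subtree; other patterns match the relative path by fnmatch."""
    limit = max_file_size_mb * 1024 * 1024
    flagged = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        rel = path.relative_to(root).as_posix()
        if any(
            rel.startswith(p.rstrip("/") + "/") if p.endswith("/")
            else fnmatch.fnmatch(rel, p)
            for p in exclude
        ):
            continue
        if path.stat().st_size > limit:
            flagged.append(rel)
    return flagged


# Demo on a throwaway project layout
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "data").mkdir()
    (Path(tmp) / "data" / "big.parquet").write_bytes(b"x" * 2_000_000)
    (Path(tmp) / "model.bin").write_bytes(b"x" * 2_000_000)
    (Path(tmp) / "main.py").write_text("print('hi')")
    print(oversized_files(tmp, exclude=["data/"], max_file_size_mb=1.0))
    # ['model.bin']
```

Anything the audit flags is a candidate for `.gitignore` or `sync.exclude` before raising the limits.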
7. Development
See CONTRIBUTING.md for development setup and guidelines.
8. License
Apache License 2.0
File details

Details for the file `jupyter_databricks_kernel-1.1.1.tar.gz`.

File metadata

- Download URL: jupyter_databricks_kernel-1.1.1.tar.gz
- Size: 138.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

| Algorithm | Hash digest |
|---|---|
| SHA256 | `d1e5eaf138b44795581ca5d3cb83d6a41a964f70e7352ecb12a45c865684abb2` |
| MD5 | `1ec8a2038cef5905b7a679a50fa79ffc` |
| BLAKE2b-256 | `49da8ecf760e5cffefe7a96843e0cb130be2acdd03b06e64b8d03238c3ed2e37` |
Provenance

The following attestation bundle was made for `jupyter_databricks_kernel-1.1.1.tar.gz`:

Publisher: publish.yaml on i9wa4/jupyter-databricks-kernel

Statement:

- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: jupyter_databricks_kernel-1.1.1.tar.gz
- Subject digest: d1e5eaf138b44795581ca5d3cb83d6a41a964f70e7352ecb12a45c865684abb2
- Sigstore transparency entry: 756215672
- Permalink: i9wa4/jupyter-databricks-kernel@fb95544b8f33a10b1de11a6184298a5e6bf874fb
- Branch / Tag: refs/tags/v1.1.1
- Owner: https://github.com/i9wa4
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yaml@fb95544b8f33a10b1de11a6184298a5e6bf874fb
- Trigger Event: push
File details

Details for the file `jupyter_databricks_kernel-1.1.1-py3-none-any.whl`.

File metadata

- Download URL: jupyter_databricks_kernel-1.1.1-py3-none-any.whl
- Size: 26.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

| Algorithm | Hash digest |
|---|---|
| SHA256 | `495c8d14449a721ac11e7a4541cf5a8df3b692cc5907e99c1d9f6c99020d34f6` |
| MD5 | `4438a8cc2f3414be2109122ce7a77e72` |
| BLAKE2b-256 | `6ebe3ab2ae571d63dd934bf731841ce99ad3afe6008211070f0a2acef3e0c822` |
Provenance

The following attestation bundle was made for `jupyter_databricks_kernel-1.1.1-py3-none-any.whl`:

Publisher: publish.yaml on i9wa4/jupyter-databricks-kernel

Statement:

- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: jupyter_databricks_kernel-1.1.1-py3-none-any.whl
- Subject digest: 495c8d14449a721ac11e7a4541cf5a8df3b692cc5907e99c1d9f6c99020d34f6
- Sigstore transparency entry: 756215678
- Permalink: i9wa4/jupyter-databricks-kernel@fb95544b8f33a10b1de11a6184298a5e6bf874fb
- Branch / Tag: refs/tags/v1.1.1
- Owner: https://github.com/i9wa4
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yaml@fb95544b8f33a10b1de11a6184298a5e6bf874fb
- Trigger Event: push