A Jupyter kernel for complete remote execution on Databricks clusters

These details have not been verified by PyPI

Development Status
- 4 - Beta
Framework
- Jupyter
Intended Audience
- Developers
License
- OSI Approved :: Apache Software License
Operating System
- OS Independent
Programming Language

Project description

jupyter-databricks-kernel

A Jupyter kernel for complete remote execution on Databricks clusters.

1. Features

- Execute Python code entirely on Databricks clusters
  - Works with VS Code, JupyterLab, and other Jupyter frontends
  - CLI execution support with the `jupyter execute` command
- Automatic file synchronization to the Databricks cluster driver node
  - Syncs your local project files to the cluster driver node before each execution
  - Respects `.gitignore` patterns and configurable exclude rules
  - Configurable size limits to prevent syncing large files

2. Requirements

- Python 3.11 or later
- Databricks workspace with authentication configured (Personal Access Token, OAuth M2M with a service principal, and other supported methods)
- Classic all-purpose cluster

3. Quick Start

Install the kernel:
```
pip install jupyter-databricks-kernel
python -m jupyter_databricks_kernel.install
```
Install options:

| Option | Description |
| --- | --- |
| (default) | Install to current venv (`sys.prefix`) |
| `--user` | Install to user site (`~/.local/share/jupyter/kernels/`) |
| `--prefix PATH` | Install to custom path |
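
To make the three install targets concrete, the sketch below computes the directory each option would be expected to use. It assumes the usual Jupyter layout `<prefix>/share/jupyter/kernels/<kernel name>` and a kernelspec directory named `databricks` (the kernel name used elsewhere in this README). The function name `kernelspec_dir` is hypothetical and is not part of the installer.

```python
import sys
from pathlib import Path
from typing import Optional


def kernelspec_dir(user: bool = False, prefix: Optional[str] = None,
                   kernel_name: str = "databricks") -> Path:
    """Return the directory a kernelspec would be expected to land in.

    Assumption: the installer follows the common Jupyter convention of
    <prefix>/share/jupyter/kernels/<kernel_name>.
    """
    if user and prefix is not None:
        raise ValueError("choose either user=True or a prefix, not both")
    if user:
        # Linux-style user data directory, as listed in the options table.
        base = Path.home() / ".local"
    elif prefix is not None:
        base = Path(prefix)
    else:
        # Default: the active virtual environment.
        base = Path(sys.prefix)
    return base / "share" / "jupyter" / "kernels" / kernel_name


if __name__ == "__main__":
    print("default :", kernelspec_dir())
    print("--user  :", kernelspec_dir(user=True))
    print("--prefix:", kernelspec_dir(prefix="/opt/jupyter"))
```

If the "Databricks" kernel does not show up in a frontend, comparing these paths with the output of `jupyter kernelspec list` is a quick way to see which environment the frontend is searching.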

Configure authentication and cluster:

```
# Recommended: Use Databricks CLI to set up everything
databricks auth login --configure-cluster
```

This creates ~/.databrickscfg with authentication credentials and cluster ID.

Alternatively, use environment variables:

```
# Override cluster ID (optional, takes priority over ~/.databrickscfg)
export DATABRICKS_CLUSTER_ID=your-cluster-id

# Authentication (if not using ~/.databrickscfg)
export DATABRICKS_HOST=https://your-workspace.cloud.databricks.com
export DATABRICKS_TOKEN=your-personal-access-token

# Service Principal authentication (alternative to PAT)
export DATABRICKS_CLIENT_ID=your-client-id
export DATABRICKS_CLIENT_SECRET=your-client-secret

# Use specific profile from ~/.databrickscfg (optional)
export DATABRICKS_CONFIG_PROFILE=your-profile-name
```

For authentication options, see Databricks SDK Authentication.
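
Before starting the kernel, it can help to check which of the variables above are actually set. The following is a minimal pre-flight sketch, not the kernel's or the Databricks SDK's real resolution logic: the function name `detect_auth_method`, the returned labels, and the order of the checks are all assumptions made for illustration.

```python
import os
from typing import Mapping, Optional


def detect_auth_method(env: Optional[Mapping[str, str]] = None) -> str:
    """Guess which credentials the environment provides (illustrative only)."""
    env = os.environ if env is None else env

    def has(name: str) -> bool:
        return bool(env.get(name))

    # Personal Access Token: host plus token.
    if has("DATABRICKS_HOST") and has("DATABRICKS_TOKEN"):
        return "personal-access-token"
    # Service principal (OAuth M2M): host plus client ID and secret.
    if (has("DATABRICKS_HOST") and has("DATABRICKS_CLIENT_ID")
            and has("DATABRICKS_CLIENT_SECRET")):
        return "service-principal"
    # Otherwise fall back to a profile in ~/.databrickscfg.
    return "profile:" + (env.get("DATABRICKS_CONFIG_PROFILE") or "DEFAULT")


if __name__ == "__main__":
    print(detect_auth_method())
```

The actual credential lookup is performed by the Databricks SDK, so treat the result as a hint about your shell environment rather than a guarantee of what the kernel will use.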

Open a notebook and select "Databricks" kernel:

VS Code:

1. Install the Jupyter extension
2. Open a .ipynb file
3. Click "Select Kernel" and choose "Databricks"

JupyterLab:

```
jupyter-lab
```

Select "Databricks" from the kernel list.

Run a simple test:
```
spark.version
```

If the cluster is stopped, the first execution may take 5-6 minutes while the cluster starts.

4. Configuration

4.1. Cluster ID

The cluster ID is read from the following sources, in order of priority:

1. The DATABRICKS_CLUSTER_ID environment variable
2. ~/.databrickscfg (from the active profile)

The active profile is determined by the DATABRICKS_CONFIG_PROFILE environment variable, or DEFAULT if it is not set.

Example ~/.databrickscfg:

```
[DEFAULT]
host = https://your-workspace.cloud.databricks.com
token = dapi...
cluster_id = 0123-456789-abcdef12
```
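
The lookup order described in this section can be sketched with the standard library. This is an illustration of the documented priority, not the kernel's own code; the function name `resolve_cluster_id` and its parameters are hypothetical.

```python
import configparser
import os
from pathlib import Path
from typing import Mapping, Optional, Union


def resolve_cluster_id(env: Optional[Mapping[str, str]] = None,
                       config_path: Union[str, Path, None] = None) -> Optional[str]:
    """Return the cluster ID using the documented priority, or None."""
    env = os.environ if env is None else env

    # 1. The environment variable takes priority.
    cluster_id = env.get("DATABRICKS_CLUSTER_ID")
    if cluster_id:
        return cluster_id

    # 2. Otherwise read the active profile from ~/.databrickscfg.
    path = Path(config_path) if config_path else Path.home() / ".databrickscfg"
    parser = configparser.ConfigParser(interpolation=None)
    parser.read(path)  # a missing file is silently ignored

    profile = env.get("DATABRICKS_CONFIG_PROFILE") or "DEFAULT"
    if profile not in parser:
        return None
    # Caveat: configparser lets named sections inherit keys from [DEFAULT];
    # whether the kernel behaves the same way is not stated in this README.
    return parser[profile].get("cluster_id")


if __name__ == "__main__":
    print(resolve_cluster_id())
```

Passing `env` and `config_path` explicitly keeps the function easy to test without touching your real configuration.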

4.2. Sync Settings

You can configure file synchronization in pyproject.toml:

```
[tool.jupyter-databricks-kernel.sync]
enabled = true
source = "."
exclude = ["*.log", "data/"]
max_size_mb = 100.0
max_file_size_mb = 10.0
use_gitignore = true
```

| Option | Description | Default |
| --- | --- | --- |
| `sync.enabled` | Enable file synchronization | `true` |
| `sync.source` | Source directory to sync | `"."` |
| `sync.exclude` | Additional exclude patterns | `[]` |
| `sync.max_size_mb` | Maximum total project size in MB | No limit |
| `sync.max_file_size_mb` | Maximum individual file size in MB | No limit |
| `sync.use_gitignore` | Respect .gitignore patterns | `true` |
| `sync.workspace_extract_dir` | Custom extraction directory on cluster | `null` (auto) |

The extraction directory can also be set via the JUPYTER_DATABRICKS_KERNEL_EXTRACT_DIR environment variable, which takes priority over pyproject.toml.

By default, files are extracted to /Workspace/Users/<your-user>/jupyter_databricks_kernel/<session>/ on the cluster driver node. This is a cluster-local path, not the Databricks UI Workspace file browser. A fallback path under /tmp/ is used for service principals or when workspace permissions are denied.
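
The precedence for the extraction directory can be illustrated as follows. This is a sketch under stated assumptions: the function name `resolve_extract_dir` and its parameters are hypothetical, and the `/tmp/` fallback is deliberately not modeled because its exact path is not given here.

```python
import os
from typing import Mapping, Optional

ENV_VAR = "JUPYTER_DATABRICKS_KERNEL_EXTRACT_DIR"


def resolve_extract_dir(configured: Optional[str], user: str, session: str,
                        env: Optional[Mapping[str, str]] = None) -> str:
    """Pick the extraction directory: env var, then pyproject.toml, then default."""
    env = os.environ if env is None else env

    # 1. The environment variable takes priority over pyproject.toml.
    override = env.get(ENV_VAR)
    if override:
        return override

    # 2. sync.workspace_extract_dir from pyproject.toml, if set.
    if configured:
        return configured

    # 3. Documented default layout on the cluster driver node.
    return f"/Workspace/Users/{user}/jupyter_databricks_kernel/{session}/"


if __name__ == "__main__":
    print(resolve_extract_dir(None, "someone@example.com", "session-id"))
```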

5. CLI Execution

You can execute notebooks from the command line using jupyter execute:

```
jupyter execute notebook.ipynb --kernel_name=databricks --inplace
```

To save the output to a different file:

```
jupyter execute notebook.ipynb --kernel_name=databricks --output=output.ipynb
```

5.1. Options

| Option | Description |
| --- | --- |
| `--kernel_name` | Kernel name (use `databricks`) |
| `--output` | Output file name |
| `--inplace` | Overwrite input file with results |
| `--timeout` | Cell execution timeout in seconds |
| `--startup_timeout` | Kernel startup timeout in seconds (default: 60) |
| `--allow-errors` | Continue execution even if a cell raises an error |

5.2. Notes

If the cluster is stopped, kernel startup may take 5-6 minutes. Increase --startup_timeout to avoid timeout errors:

```
jupyter execute notebook.ipynb --kernel_name=databricks --startup_timeout=600
```
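
When `jupyter execute` is driven from a script or CI job, assembling the argument list in one place keeps the options from the table consistent. The helper below is a hypothetical sketch that only builds the command; it uses no flags beyond those listed above and does not run anything itself.

```python
import shlex
from typing import List, Optional


def build_execute_command(notebook: str, *, output: Optional[str] = None,
                          inplace: bool = False, timeout: Optional[int] = None,
                          startup_timeout: Optional[int] = None,
                          allow_errors: bool = False) -> List[str]:
    """Build a `jupyter execute` argument list for the databricks kernel."""
    cmd = ["jupyter", "execute", notebook, "--kernel_name=databricks"]
    if output is not None:
        cmd.append(f"--output={output}")
    if inplace:
        cmd.append("--inplace")
    if timeout is not None:
        cmd.append(f"--timeout={timeout}")
    if startup_timeout is not None:
        cmd.append(f"--startup_timeout={startup_timeout}")
    if allow_errors:
        cmd.append("--allow-errors")
    return cmd


if __name__ == "__main__":
    # Allow ten minutes for a stopped cluster to start.
    cmd = build_execute_command("notebook.ipynb", startup_timeout=600)
    print(shlex.join(cmd))
    # To actually run it: subprocess.run(cmd, check=True)
```

Passing the list to `subprocess.run` instead of a shell string avoids quoting problems with notebook paths that contain spaces.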

6. Papermill Integration

papermill supports parameter injection for notebook pipelines. Use it with this kernel for parameterized remote execution on Databricks clusters.

Install papermill:

```
pip install papermill
```

Run a notebook with parameter injection:

```
papermill input.ipynb output.ipynb --kernel databricks -p param1 value1 -p param2 value2
```

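Always write the result to a separate output notebook. papermill takes explicit input and output paths so that the source notebook stays unchanged while the output captures the injected parameters and cell results; pointing both paths at the same file overwrites the source and defeats this purpose. The --inplace flag shown under CLI Execution belongs to jupyter execute, not to the papermill commands here.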

If the cluster is stopped, increase the startup timeout:

```
papermill input.ipynb output.ipynb --kernel databricks --start_timeout 600 -p param1 value1
```
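
The same run can be started from Python through papermill's `execute_notebook` function, which is convenient inside a pipeline script. The sketch below is illustrative: `papermill_kwargs` and `run_notebook` are hypothetical helper names, and the 600-second default simply mirrors the command above.

```python
from typing import Any, Dict, Mapping


def papermill_kwargs(input_path: str, output_path: str,
                     parameters: Mapping[str, Any],
                     start_timeout: int = 600) -> Dict[str, Any]:
    """Collect keyword arguments for papermill.execute_notebook."""
    if input_path == output_path:
        # Keep the source notebook untouched; write results elsewhere.
        raise ValueError("output_path must differ from input_path")
    return {
        "input_path": input_path,
        "output_path": output_path,
        "parameters": dict(parameters),
        "kernel_name": "databricks",
        "start_timeout": start_timeout,  # seconds to wait for kernel startup
    }


def run_notebook(input_path: str, output_path: str,
                 parameters: Mapping[str, Any], start_timeout: int = 600):
    """Execute the notebook on the databricks kernel via papermill."""
    # Imported lazily so papermill_kwargs works without papermill installed.
    import papermill as pm

    return pm.execute_notebook(
        **papermill_kwargs(input_path, output_path, parameters, start_timeout)
    )


if __name__ == "__main__":
    print(papermill_kwargs("input.ipynb", "output.ipynb", {"param1": "value1"}))
```

Unlike the CLI, the Python API receives parameter values with their Python types, so numbers and booleans do not need to be parsed from strings.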

7. Known Limitations

- Serverless compute is not supported (Command Execution API limitation)
- input() and interactive prompts do not work
- Interactive widgets (ipywidgets) are not supported

8. Troubleshooting

8.1. Kernel feels slow

File sync may be uploading unnecessary files. Check your sync settings:

Ensure .gitignore includes large or unnecessary files:

```
.venv/
__pycache__/
*.pyc
data/
*.parquet
node_modules/
```

Add exclude patterns in pyproject.toml:

```
[tool.jupyter-databricks-kernel.sync]
exclude = ["data/", "models/", "*.csv"]
```

Set size limits to catch unexpectedly large files:

```
[tool.jupyter-databricks-kernel.sync]
max_size_mb = 50.0
max_file_size_mb = 10.0
```

Disable sync entirely if not needed:

```
[tool.jupyter-databricks-kernel.sync]
enabled = false
```
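
To find out which files are worth excluding, a quick local audit can list everything above a size threshold. The script below is a rough sketch, not the kernel's matcher: `is_excluded` and `find_large_files` are hypothetical names, the pattern handling is a simplification that ignores full `.gitignore` semantics, and 1 MB is assumed to be 1024 * 1024 bytes.

```python
import fnmatch
import os
from pathlib import Path
from typing import Iterable, List, Tuple


def is_excluded(rel_path: str, patterns: Iterable[str]) -> bool:
    """Simplified matching: 'name/' matches a directory, others match file names."""
    parts = rel_path.rstrip("/").split("/")
    for pattern in patterns:
        if pattern.endswith("/"):
            # Directory pattern such as "data/": match any directory component.
            dir_parts = parts if rel_path.endswith("/") else parts[:-1]
            if pattern.rstrip("/") in dir_parts:
                return True
        elif fnmatch.fnmatch(parts[-1], pattern):
            # File pattern such as "*.csv": match the final component only.
            return True
    return False


def find_large_files(root, max_file_size_mb: float,
                     exclude: Iterable[str] = ()) -> List[Tuple[str, int]]:
    """Return (relative path, size in bytes) for files above the limit, largest first."""
    root = Path(root)
    patterns = list(exclude)
    limit = max_file_size_mb * 1024 * 1024
    hits = []
    for dirpath, dirnames, filenames in os.walk(root):
        rel_dir = Path(dirpath).relative_to(root)
        # Prune excluded directories so os.walk does not descend into them.
        dirnames[:] = [d for d in dirnames
                       if not is_excluded((rel_dir / d).as_posix() + "/", patterns)]
        for name in filenames:
            rel = (rel_dir / name).as_posix()
            if is_excluded(rel, patterns):
                continue
            size = (Path(dirpath) / name).stat().st_size
            if size > limit:
                hits.append((rel, size))
    return sorted(hits, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for path, size in find_large_files(".", 10.0, exclude=[".venv/", ".git/"]):
        print(f"{size / (1024 * 1024):8.1f} MB  {path}")
```

Anything this reports is a candidate for `.gitignore` or the `exclude` list shown above.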

9. Development

See CONTRIBUTING.md for development setup and guidelines.

10. License

Apache License 2.0

Release history

1.3.1

May 1, 2026

This version

1.3.0

Apr 25, 2026

1.2.7

Apr 8, 2026

1.2.6

Apr 8, 2026

1.2.5

Feb 12, 2026

1.2.4

Feb 12, 2026

1.2.1

Feb 7, 2026

1.2.0

Feb 7, 2026

1.1.5

Dec 12, 2025

1.1.4

Dec 12, 2025

1.1.2

Dec 10, 2025

1.1.1

Dec 10, 2025

1.1.0

Dec 9, 2025

1.0.0

Dec 8, 2025

0.2.0

Dec 8, 2025

0.1.0

Dec 8, 2025

Download files

Source Distribution

jupyter_databricks_kernel-1.3.0.tar.gz (215.3 kB)

Uploaded Apr 25, 2026 Source

Built Distribution

jupyter_databricks_kernel-1.3.0-py3-none-any.whl (33.2 kB)

Uploaded Apr 25, 2026 Python 3

File details

Details for the file jupyter_databricks_kernel-1.3.0.tar.gz.

File metadata

Download URL: jupyter_databricks_kernel-1.3.0.tar.gz
Upload date: Apr 25, 2026
Size: 215.3 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.13

File hashes

Hashes for jupyter_databricks_kernel-1.3.0.tar.gz
| Algorithm | Hash digest |
| --- | --- |
| SHA256 | `e9925833994f523bb7aa82e616a213eb4f408ee1551edf60cd9b6e2edfa819c0` |
| MD5 | `9d645b926466ccaf05a11d0960507259` |
| BLAKE2b-256 | `0802cd679635894513d99e8ba310229730b334d3c6382d7da9ca440731c9c3da` |

Provenance

The following attestation bundles were made for jupyter_databricks_kernel-1.3.0.tar.gz:

Publisher: release.yaml on i9wa4/jupyter-databricks-kernel

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: jupyter_databricks_kernel-1.3.0.tar.gz
- Subject digest: e9925833994f523bb7aa82e616a213eb4f408ee1551edf60cd9b6e2edfa819c0
- Sigstore transparency entry: 1383142542
- Sigstore integration time: Apr 25, 2026
Source repository:
- Permalink: i9wa4/jupyter-databricks-kernel@e85e09d5201ad348667030b25213ee0b43b421d2
- Branch / Tag: refs/tags/v1.3.0
- Owner: https://github.com/i9wa4
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yaml@e85e09d5201ad348667030b25213ee0b43b421d2
- Trigger Event: push

File details

Details for the file jupyter_databricks_kernel-1.3.0-py3-none-any.whl.

File metadata

Download URL: jupyter_databricks_kernel-1.3.0-py3-none-any.whl
Upload date: Apr 25, 2026
Size: 33.2 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.13

File hashes

Hashes for jupyter_databricks_kernel-1.3.0-py3-none-any.whl
| Algorithm | Hash digest |
| --- | --- |
| SHA256 | `e265249c40cf16669159d6de0578038ad9605bd52688417beb4b90330e1d02c1` |
| MD5 | `8d3453ec56f4ac2a8363baeb63acc875` |
| BLAKE2b-256 | `48bcd80e821b675694ca3414efc9f3876ababd11f1f4efe31907f8468aea3300` |

Provenance

The following attestation bundles were made for jupyter_databricks_kernel-1.3.0-py3-none-any.whl:

Publisher: release.yaml on i9wa4/jupyter-databricks-kernel

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: jupyter_databricks_kernel-1.3.0-py3-none-any.whl
- Subject digest: e265249c40cf16669159d6de0578038ad9605bd52688417beb4b90330e1d02c1
- Sigstore transparency entry: 1383142555
- Sigstore integration time: Apr 25, 2026
Source repository:
- Permalink: i9wa4/jupyter-databricks-kernel@e85e09d5201ad348667030b25213ee0b43b421d2
- Branch / Tag: refs/tags/v1.3.0
- Owner: https://github.com/i9wa4
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yaml@e85e09d5201ad348667030b25213ee0b43b421d2
- Trigger Event: push
