dask-databricks
Cluster tools for running Dask on Databricks multi-node clusters.
Quickstart
To launch a Dask cluster on Databricks you need to create an init script with the following contents and configure your multi-node cluster to use it.
#!/bin/bash
# Install Dask + Dask Databricks
/databricks/python/bin/pip install --upgrade "dask[complete]" dask-databricks
# Start Dask cluster components
dask databricks run
Then, from your Databricks notebook, you can connect a Dask Client to the scheduler running on the Spark driver node.
import dask_databricks
client = dask_databricks.get_client()
Now you can submit work from your notebook to the multi-node Dask cluster.
def inc(x):
    return x + 1

x = client.submit(inc, 10)
x.result()
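The same submit-and-collect pattern extends to several inputs. The following is a minimal sketch, not part of dask-databricks: `submit_all` is a hypothetical helper, and a standard-library `ThreadPoolExecutor` stands in for the Dask client so the example runs outside Databricks. On a real cluster you would pass the client returned by `dask_databricks.get_client()` instead.

```python
from concurrent.futures import ThreadPoolExecutor


def inc(x):
    return x + 1


def submit_all(client, func, values):
    """Submit func(v) for every value, then collect the results in input order.

    `client` is any object with a submit(func, *args) method that returns a
    future exposing result(); a Dask Client follows this pattern.
    """
    futures = [client.submit(func, v) for v in values]
    return [f.result() for f in futures]


# Local stand-in (assumption for illustration only): a thread pool replaces
# the Dask client so this sketch can run without a cluster.
with ThreadPoolExecutor(max_workers=2) as local_stand_in:
    results = submit_all(local_stand_in, inc, range(5))

print(results)  # [1, 2, 3, 4, 5]
```

Submitting all tasks before calling `result()` lets them run concurrently; calling `result()` right after each `submit` would serialize the work.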
Dashboard
You can access the Dask dashboard via the Databricks driver-node proxy. The link appears in the Client or DatabricksCluster repr and is also available via client.dashboard_link.
>>> print(client.dashboard_link)
https://dbc-dp-xxxx.cloud.databricks.com/driver-proxy/o/xxxx/xx-xxx-xxxx/8087/status
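If you need the pieces of that link programmatically (for example, to check which port the dashboard is proxied on), the standard library is enough to take it apart. This is a sketch under assumptions: `parse_driver_proxy_link` is a hypothetical helper, and the reading of the path segments as organization, cluster, and port is inferred from the example link above rather than from a documented contract.

```python
from urllib.parse import urlsplit


def parse_driver_proxy_link(link):
    """Split a driver-proxy dashboard link into its visible parts."""
    parts = urlsplit(link)
    segments = [s for s in parts.path.split("/") if s]
    # Assumed layout: driver-proxy / o / <org> / <cluster> / <port> / ...
    if len(segments) < 5 or segments[0] != "driver-proxy" or segments[1] != "o":
        raise ValueError(f"not a driver-proxy link: {link!r}")
    return {
        "host": parts.netloc,
        "org": segments[2],
        "cluster": segments[3],
        "port": int(segments[4]),
    }


# Placeholder link copied from the example above.
example = "https://dbc-dp-xxxx.cloud.databricks.com/driver-proxy/o/xxxx/xx-xxx-xxxx/8087/status"
info = parse_driver_proxy_link(example)
print(info["host"], info["port"])
```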
Releasing
Releases of this project are automated using GitHub Actions and the pypa/gh-action-pypi-publish action.
To create a new release, push a tag to the upstream repo in the format x.x.x. The package will be built and pushed to PyPI automatically and later picked up by conda-forge.
# Make sure you have an upstream remote
git remote add upstream git@github.com:dask-contrib/dask-databricks.git
# Create a tag and push it upstream
git tag x.x.x && git push upstream main --tags