dbtunnel
Run an app and get a cluster proxy URL for it on Databricks clusters.
A proxy solution for running elegant web UIs natively inside Databricks notebooks.
**You can only use this in Databricks notebooks with a running cluster.**
**For secure access, please use a single-user cluster (anyone with Attach permission can access the UIs).**
Description
An easy way to try out the following frameworks on a Databricks cluster from notebooks.
Framework Support
- fastapi: fastapi.py
- gradio: gradio-demo.py
- stable diffusion webui: stable-diffusion-example.py
- streamlit: streamlit_example.py
- nicegui: nicegui-example.py
- flask: flask-app.py
- dash: dask-example.py
- bokeh: bokeh-example.py
- shiny for python: shiny-python-example.py
- panel
- solara: solara-example.py
- chainlit: chainlit-foundation-model-rag-example.py
- code-server (on Databricks Repos): code-server-example.py
It is also an easy way to test out LLM chatbots; look in examples/gradio.
File or Directory Support
This supports decoupling your UI code from your Databricks notebooks. These entry points usually take a script_path argument instead of an "app" object passed directly, which is convenient for shipping your app outside of a notebook.
- fastapi: fastapi.py
- gradio: gradio-demo.py
- streamlit: streamlit_example.py (partially implemented; it only works with a single file)
- nicegui: nicegui-example.py
- flask: flask-app.py
- dash: dask-example.py
- bokeh: bokeh-example.py
- shiny for python
- panel
- solara: solara-example.py
- chainlit: chainlit-foundation-model-rag-example.py
Frameworks that leverage asgiproxy
DBTunnel provides a proxy layer, using a fork of asgiproxy, to support UIs that do not handle proxy root paths, etc. It also comes with a simple token-based auth provider that only works on Databricks, to help you get access to user information.
DBTunnel Proxy features:
- Token-based auth: a simple token-based auth that only works on Databricks. The token is kept in app memory in a Python TTLCache object.
- Support for frameworks that don't support proxies: the proxy intercepts requests and rewrites JS and HTML files so apps can be hosted behind dynamic proxy paths. This is a temporary measure until a way of exposing root-path details is researched, at which point this step will be skipped. If you run into issues with this, please file a GitHub issue.
- Audit logging: simply logging tracked users and saving them to a file. Not yet implemented.
- Support for frameworks that do not support root paths
- Injected auth headers: provides user information directly to your app via the request object. Most frameworks allow access to the request object to read headers, etc.
- fastapi
- gradio
- streamlit
- nicegui
- flask
- dash
- bokeh
- shiny for python
- panel
- solara
- chainlit
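The token-based auth above keeps tokens in a TTLCache, i.e. a dictionary whose entries expire after a fixed lifetime. As a rough illustration of the idea (a simplified stand-in, not dbtunnel's actual implementation), a TTL cache can be sketched with the standard library:

```python
import time


class SimpleTTLCache:
    """Minimal TTL cache: entries expire `ttl` seconds after insertion.

    A simplified stand-in for the in-memory TTLCache that dbtunnel
    stores auth tokens in; real code would also bound the cache size.
    """

    def __init__(self, ttl: float):
        self.ttl = ttl
        self._store = {}  # key -> (value, expiry timestamp)

    def __setitem__(self, key, value):
        self._store[key] = (value, time.monotonic() + self.ttl)

    def get(self, key, default=None):
        item = self._store.get(key)
        if item is None:
            return default
        value, expires_at = item
        if time.monotonic() >= expires_at:
            # Entry has expired; evict it lazily on access.
            del self._store[key]
            return default
        return value


cache = SimpleTTLCache(ttl=0.05)
cache["token-abc"] = {"user": "alice"}
print(cache.get("token-abc"))  # still fresh
time.sleep(0.06)
print(cache.get("token-abc"))  # expired -> None
```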
Chatbot Support
You must use A10 GPU instances or higher
- Mistral-7b: gradio-chat-mistral7b-demo.py
- Mixtral 8x7B: chainlit-foundation-model.py
- Llama-2-7b
- mpt-7b
- Streaming support (vllm, etc.)
- Streaming support foundation model api
- Typewriter effect
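The typewriter effect above is commonly implemented in streaming chat UIs (e.g. Gradio chat demos) by yielding progressively longer prefixes of the response, which the UI re-renders on each yield. A minimal sketch of the pattern, not tied to any particular framework:

```python
import time
from typing import Iterator


def typewriter(text: str, delay: float = 0.0) -> Iterator[str]:
    """Yield progressively longer prefixes of `text`, one word at a time.

    A chat UI that re-renders each yielded value (as Gradio does with
    generator functions) shows the response "typing" itself out.
    """
    out = ""
    for token in text.split():
        out = f"{out} {token}".strip()
        if delay:
            time.sleep(delay)  # pacing between "keystrokes"
        yield out


for partial in typewriter("hello from the chatbot"):
    print(partial)
```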
Tunnel Support:
- ngrok
- devtunnels
- cloudflared
- dbtunnel custom relay (private only)
Setup
Please do not use this in production!
- Clone this repo into Databricks Repos.
- Go to any of the examples to see how to use them.
- Enjoy your proxy experience :-)
- If you want to share the link, ensure that the other user has permission to attach to your cluster.
Passing databricks auth to your app via inject_auth
You can pass Databricks user auth from your notebook session to any of the frameworks by doing the following:

```python
from dbtunnel import dbtunnel

dbtunnel.<framework>(<script_path>).inject_auth().run()
```

For example:

```python
from dbtunnel import dbtunnel

dbtunnel.gradio(demo).inject_auth().run()
```

This exposes the user information via the environment variables DATABRICKS_HOST and DATABRICKS_TOKEN.
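Inside your app, those two variables can then be read with plain os.environ. The values below are hypothetical stand-ins for what inject_auth() would set; in a real app you would hand them to a Databricks SDK or REST client:

```python
import os

# Simulate what inject_auth() sets (hypothetical values for illustration).
os.environ["DATABRICKS_HOST"] = "https://example.cloud.databricks.com"
os.environ["DATABRICKS_TOKEN"] = "dapi-example-token"


def databricks_credentials() -> tuple[str, str]:
    """Read the host and token injected by inject_auth() from the environment."""
    host = os.environ["DATABRICKS_HOST"]
    token = os.environ["DATABRICKS_TOKEN"]
    return host, token


host, token = databricks_credentials()
print(host)  # -> https://example.cloud.databricks.com
```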
Passing a warehouse to your app via inject_sql_warehouse
You can pass Databricks warehouse auth from your notebook session to any of the frameworks by doing the following:

```python
from dbtunnel import dbtunnel

dbtunnel.<framework>(<script_path>).inject_sql_warehouse().run()
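Inside your app, the three variables map naturally onto the arguments of databricks-sql-connector's sql.connect(server_hostname=..., http_path=..., access_token=...). A sketch of collecting them (hypothetical values; the actual connector call is left commented out):

```python
import os
from urllib.parse import urlparse

# Simulate what inject_sql_warehouse() sets (hypothetical values for illustration).
os.environ["DATABRICKS_HOST"] = "https://example.cloud.databricks.com"
os.environ["DATABRICKS_TOKEN"] = "dapi-example-token"
os.environ["DATABRICKS_HTTP_PATH"] = "/sql/1.0/warehouses/abc123"


def warehouse_connect_kwargs() -> dict:
    """Map the injected variables onto databricks-sql-connector arguments."""
    host = os.environ["DATABRICKS_HOST"]
    # The connector expects a bare hostname, not a full URL.
    hostname = urlparse(host).netloc or host
    return {
        "server_hostname": hostname,
        "http_path": os.environ["DATABRICKS_HTTP_PATH"],
        "access_token": os.environ["DATABRICKS_TOKEN"],
    }


kwargs = warehouse_connect_kwargs()
print(kwargs["server_hostname"])
# from databricks import sql
# connection = sql.connect(**kwargs)
```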
Passing custom environment variables via inject_env
You can pass custom environment variables from your notebook to any of the frameworks by doing the following:

```python
from dbtunnel import dbtunnel

dbtunnel.<framework>(<script_path>).inject_env(**{
    "MY_CUSTOM_ENV": "my_custom_env_value"
}).run()
```

For example:

```python
from dbtunnel import dbtunnel

dbtunnel.gradio(demo).inject_env(**{
    "MY_CUSTOM_ENV": "my_custom_env_value"
}).run()
```

Alternatively:

```python
from dbtunnel import dbtunnel

dbtunnel.gradio(demo).inject_env(MY_CUSTOM_ENV="my_custom_env_value").run()
```

Keep in mind that environment variables must be passed as keyword arguments!
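The keyword-argument requirement follows from Python's **kwargs semantics: a function declared with only **env accepts string-keyed keyword arguments and nothing else, so Python itself rejects positional values. A hypothetical inject_env-style signature (not dbtunnel's real code) makes this concrete:

```python
import os


def inject_env(**env: str) -> dict:
    """Hypothetical stand-in for inject_env().

    Collects keyword arguments into an environment-variable mapping.
    Positional arguments are rejected by Python itself, because the
    signature declares only **env.
    """
    for key, value in env.items():
        os.environ[key] = value
    return dict(env)


# Both call styles from the docs above end up as keyword arguments:
print(inject_env(MY_CUSTOM_ENV="my_custom_env_value"))
print(inject_env(**{"MY_CUSTOM_ENV": "my_custom_env_value"}))
```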
Exposing to internet using ngrok
WARNING: THE APP WILL BE PUBLICLY AVAILABLE TO ANYONE WITH THE LINK, SO DO NOT EXPOSE ANYTHING SENSITIVE.
The reason for doing this is to test something with a friend or colleague who is not logged in to Databricks; the proxy option requires you to be logged in to Databricks.
- Go to ngrok, create an account, and get an API token and a tunnel auth token.
- Go to a Databricks notebook.
- If you are using ngrok's free tier, you can only have one tunnel and one session at a time, so enable kill_all_tunnel_sessions=True.
Take a look at the full example here: streamlit-example-ngrok.py
```python
from dbtunnel import dbtunnel

# This example uses streamlit, but it works with any framework.
dbtunnel.streamlit("<script_path>").share_to_internet_via_ngrok(
    ngrok_api_token="<ngrok api token>",
    ngrok_tunnel_auth_token="<ngrok tunnel auth token>"
).run()

# If you need to kill tunnels because you are on the free tier:
dbtunnel.streamlit("<script_path>").share_to_internet_via_ngrok(
    ngrok_api_token="<ngrok api token>",
    ngrok_tunnel_auth_token="<ngrok tunnel auth token>",
    kill_all_tunnel_sessions=True,
).run()
```
Killing processes on a specific port
```python
from dbtunnel import dbtunnel

dbtunnel.kill_port(<port number as int>)
```
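kill_port is handy when a previous app crashed and left its port occupied. A quick standard-library way to check whether a port is actually in use before (or after) killing whatever holds it — this is an illustration, not dbtunnel's implementation:

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if something is accepting connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        return s.connect_ex((host, port)) == 0


# Occupy a free port, then check it.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))  # port 0 -> the OS picks a free port
listener.listen(1)
port = listener.getsockname()[1]
print(port_in_use(port))  # True while the listener is open
listener.close()
```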
Disclaimer
dbtunnel is not developed, endorsed, or supported by Databricks. It is provided as-is; no warranty is derived from using this package. For more details, please refer to the license.
File details: dbtunnel-0.16.0.tar.gz

- Size: 99.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.10.8

Algorithm | Hash digest
---|---
SHA256 | 40fe8732aba47780f99eb35561155770a410852b64bf4a6e8bf2519a6203cd7b
MD5 | b3d90de2523f943b5dd7fb4f1762233c
BLAKE2b-256 | ef37945a216f00af072a1f9a15ca0e901aaef567267131e4c21f193cc31b6e3b
File details: dbtunnel-0.16.0-py3-none-any.whl

- Size: 54.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.10.8

Algorithm | Hash digest
---|---
SHA256 | 52cea1020229d9ec6df598b4fad87c8b07591078aff85719e21ad877dd31a59b
MD5 | 53238de3e759719edea3f07fb6b68525
BLAKE2b-256 | 66d190c10be2d61c1a827dc8b051342ea523e64978da094aaa1ffe74904967d9