Skip to main content

Ease-of-use utility tools for databricks notebooks.

Project description


Python version Pyspark version Build Status

databricks-utils is a python package that provide several utility classes/func that improve ease-of-use in databricks notebook.


pip install databricks-utils


  • S3Bucket class to easily interact with a S3 bucket via dbfs and databricks spark.

  • vega_embed to render charts from Vega and Vega-Lite specifications.


API documentation can be found at

Quick start


import json
from import S3Bucket

# need to attach notebook's dbutils
# before S3Bucket can be used

# create an instance of the s3 bucket
bucket = (S3Bucket("somebucketname", "SOMEACCESSKEY", "SOMESECRETKEY")
          .allow_spark(sc) # local spark context
          .mount("somebucketname")) # mount location name (resolves as `/mnt/somebucketname`)

# show list of files/folders in the bucket "resource" folder"resource/")

# read in a json file from the bucket
data = json.load(open(bucket.local("resource/somefile.json", "r")))

# read from parquet via spark
dataframe ="resource/somedf.parquet"))

# umount

Vega and Vega-Lite are high-level grammars of interactive graphics. They provide concise JSON syntax for rapidly generating visualizations to support analysis.

from databricks_utils.vega import vega_embed

# vega-lite spec for a bar chart
spec = {
  "data": {
    "values": [
      {"a": "A","b": 28}, {"a": "B","b": 55}, {"a": "C","b": 43},
      {"a": "D","b": 91}, {"a": "E","b": 81}, {"a": "F","b": 53},
      {"a": "G","b": 19}, {"a": "H","b": 87}, {"a": "I","b": 52}
  "mark": "bar",
  "encoding": {
    "x": {"field": "a", "type": "ordinal"},
    "y": {"field": "b", "type": "quantitative"}

# plot out the vega chart in databricks notebook


# add a version to git tag and publish to pypi

Project details

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Filename, size & hash SHA256 hash help File type Python version Upload date
databricks-utils-0.0.7.tar.gz (4.5 kB) Copy SHA256 hash SHA256 Source None

Supported by

Elastic Elastic Search Pingdom Pingdom Monitoring Google Google BigQuery Sentry Sentry Error logging AWS AWS Cloud computing DataDog DataDog Monitoring Fastly Fastly CDN SignalFx SignalFx Supporter DigiCert DigiCert EV certificate StatusPage StatusPage Status page