
FUSE driver for Databricks Unity Catalog Volumes.


fuse4dbricks

A filesystem in userspace for mounting the Unity Catalog from Databricks.

The filesystem is read-only.

This filesystem uses the public Databricks API to retrieve files, directories, and access permissions from the Unity Catalog.

To reduce latency and improve performance, file metadata is cached in memory. File data is cached in a local cache directory (--disk-cache-dir) and partially in RAM as well. Options to control the sizes of those caches are available.

Credentials are stored in RAM while the filesystem is mounted, and must be passed by writing a personal access token to a virtual file:

echo "dapi0000000-2" > /Volumes/.auth/personal_access_token
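Typing the token directly into an echo command usually leaves it in the shell history. The following is a minimal sketch of an alternative, assuming the default /Volumes mount point; provide_token is a hypothetical helper name, not something shipped with fuse4dbricks. It reads the token from standard input (without echoing it when run in a terminal) and writes it to the virtual file:

```shell
# Hypothetical helper (not part of fuse4dbricks): read a token from stdin
# and write it to the virtual auth file, so it never appears on the
# command line. The default path assumes the filesystem is mounted at /Volumes.
provide_token() {
    auth_file="${1:-/Volumes/.auth/personal_access_token}"
    token=""
    if [ -t 0 ]; then
        # Interactive terminal: prompt and hide the typed characters.
        printf 'Databricks token: ' >&2
        stty -echo
        IFS= read -r token || true
        stty echo
        printf '\n' >&2
    else
        # Non-interactive: take the first line of stdin as the token.
        IFS= read -r token || true
    fi
    if [ -z "$token" ]; then
        echo "No token provided" >&2
        return 1
    fi
    # Same effect as the echo command: the token followed by a newline.
    printf '%s\n' "$token" > "$auth_file"
}
```

Run provide_token with no arguments to be prompted, or pass a different path as the first argument if the filesystem is mounted elsewhere.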

If FUSE has user_allow_other enabled (in /etc/fuse.conf), this driver supports the --allow-other option, so multiple users can access the mount. In this case, the process should typically run as a system user (consider creating a dedicated fuse4dbricks user) that has exclusive access to --disk-cache-dir. Each user should provide their own personal access token as described above. Permissions are respected for each user. The cache is shared among all users in this scenario.
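Before starting the driver with --allow-other, it can help to confirm the two prerequisites mentioned above. The sketch below is illustrative only: fuse_allows_other and prepare_cache_dir are hypothetical helper names, and the check assumes user_allow_other appears on a line of its own in the configuration file.

```shell
# Hypothetical helpers (not part of fuse4dbricks).

# Succeeds if the given fuse.conf contains an uncommented user_allow_other line.
fuse_allows_other() {
    conf="${1:-/etc/fuse.conf}"
    grep -Eq '^[[:space:]]*user_allow_other[[:space:]]*$' "$conf" 2>/dev/null
}

# Creates the cache directory and restricts it to its owner (mode 700).
# Run this as the user that will run fuse4dbricks, so that user is the owner.
prepare_cache_dir() {
    cache_dir="$1"
    mkdir -p "$cache_dir" && chmod 700 "$cache_dir"
}
```

For example, `fuse_allows_other || echo "enable user_allow_other in /etc/fuse.conf first"` reports a missing setting, and the directory created by prepare_cache_dir can then be passed to --disk-cache-dir.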

When an access token is missing, revoked, or expired, the Unity Catalog is no longer accessible and only a virtual /Volumes/README.txt file appears, with instructions on how to add the access token.

In the future, other authentication options may be integrated.
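Scripts that depend on the mount can use this behavior to detect a missing or expired token. The following sketch assumes that README.txt is present only while no valid token is loaded, as described above, and that it disappears once a valid token has been provided; needs_token is a hypothetical helper name.

```shell
# Hypothetical helper (not part of fuse4dbricks).
# Succeeds if the mount currently shows the README.txt placeholder,
# which is taken here as a sign that a token must be (re)supplied.
needs_token() {
    mount_dir="${1:-/Volumes}"
    [ -f "$mount_dir/README.txt" ]
}
```

A script could then run `if needs_token; then echo "Provide a token first" >&2; exit 1; fi` before reading any catalog files.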

Installation

You can install the released version from PyPI:

pip install "fuse4dbricks"

Or install the development version from GitHub:

pip install "git+https://github.com/zeehio/fuse4dbricks.git"

Quickstart

Assuming you are the only user:

sudo mkdir "/Volumes" # or any other directory, e.g. one in your home directory
fuse4dbricks --workspace "https://adb-xxxx.azuredatabricks.net" /Volumes

Open a new terminal:

# Provide your Databricks access token:
echo "dapi0000000-2" > /Volumes/.auth/personal_access_token
# Access your catalog files:
ls /Volumes
# Your catalogs will appear

Download files


Source Distribution

fuse4dbricks-0.5.0.tar.gz (85.1 kB)

Uploaded Source

Built Distribution


fuse4dbricks-0.5.0-py3-none-any.whl (35.3 kB)

Uploaded Python 3

File details

Details for the file fuse4dbricks-0.5.0.tar.gz.

File metadata

  • Download URL: fuse4dbricks-0.5.0.tar.gz
  • Upload date:
  • Size: 85.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.12

File hashes

Hashes for fuse4dbricks-0.5.0.tar.gz
Algorithm Hash digest
SHA256 ba38b12919de45500b171c9ec78e7d9ad34e7bf1e1e806ddddff494344fc6314
MD5 7ea05b71af5258525b112c57b6502273
BLAKE2b-256 29cb16344157bbd8f9aec4cbd8360dfad1864da32cc0c1cb7fb8021a8c95e277


File details

Details for the file fuse4dbricks-0.5.0-py3-none-any.whl.

File metadata

  • Download URL: fuse4dbricks-0.5.0-py3-none-any.whl
  • Upload date:
  • Size: 35.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.12

File hashes

Hashes for fuse4dbricks-0.5.0-py3-none-any.whl
Algorithm Hash digest
SHA256 185f2d944320729a895914719698bf1a53a9be5addf3e913c1a15acf64d4db33
MD5 2b40be92364c8354d45e3aae1765048f
BLAKE2b-256 c99a54b5df91196b6e6c23823f2b4d4fa69baf652be2e2cc82bf9c8ff732db57
