FUSE driver for Databricks Unity Catalog Volumes.
fuse4dbricks
A filesystem in userspace for mounting the Unity Catalog from Databricks.
Disclaimer
This is not an official Databricks package. I, the author of this package, am not affiliated with Databricks. My capacity to support this package is very limited or none. I may review issues and pull requests, but I won't commit to timelines or features.
Features
The filesystem is read-only.
This filesystem uses the public Databricks API to retrieve files, directories and access permissions from the Unity Catalog.
To mitigate latency and improve performance, file metadata is cached in-memory. Data is cached
to a local cache directory (--disk-cache-dir) and partially to RAM as well. Options to control
the sizes of those caches are available.
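The disk cache size has to fit on the disk that holds --disk-cache-dir. A minimal sketch, assuming a POSIX shell with df and awk: the helper name free_gib is hypothetical and not part of fuse4dbricks. It prints the free space of the file system holding a directory, which may help when picking a value for the --disk-cache-gb option used in the multi-user setup. Whether that option counts in GB or GiB is not stated here, so treat the comparison as approximate.

```shell
# Hypothetical helper, not part of fuse4dbricks.
# Prints the free space, in whole GiB, of the file system that holds
# the given directory. "df -P" gives one POSIX-format line per file
# system, and "-k" reports sizes in 1024-byte blocks.
free_gib() {
    df -Pk "$1" | awk 'NR == 2 { print int($4 / 1024 / 1024) }'
}

# Example (the path is the cache directory used later in this document):
# free_gib /var/cache/fuse4dbricks
```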
Credentials are stored in RAM while the filesystem is mounted, and must be passed by writing a personal access token to a virtual file:
echo "dapi0000000-2" > /Volumes/.auth/personal_access_token
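Typing the token as a command argument leaves it in the shell history. A minimal sketch of an alternative, assuming a POSIX shell: the helper below (the name write_token is hypothetical, not part of fuse4dbricks) reads the token from standard input and writes it to the given file followed by a newline, as echo would.

```shell
# Hypothetical helper, not part of fuse4dbricks.
# Reads one line (the token) from stdin and writes it to the file
# given as the first argument. Fails if the token is empty.
write_token() {
    wt_token=""
    IFS= read -r wt_token || true
    if [ -z "$wt_token" ]; then
        echo "write_token: empty token" >&2
        return 1
    fi
    printf '%s\n' "$wt_token" > "$1"
}

# Example, reading the token from a file only you can read
# (the path ~/.databricks_token is an assumption for illustration):
# write_token /Volumes/.auth/personal_access_token < ~/.databricks_token
```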
If FUSE (/etc/fuse.conf) has user_allow_other enabled, this driver supports the --allow-other option, so multiple users can access the mount. In this case, the process should typically run as a system user (consider creating a dedicated fuse4dbricks user) with exclusive access to --disk-cache-dir. Each user must provide their own personal access token as described above. Permissions are respected for each user. The cache is shared among all users in this scenario.
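Before relying on --allow-other, it can help to confirm that user_allow_other is actually enabled. A minimal sketch, assuming the option appears on its own line in the configuration file: the helper name fuse_allows_other is hypothetical and not part of fuse4dbricks.

```shell
# Hypothetical helper, not part of fuse4dbricks.
# Succeeds if user_allow_other is present and not commented out in the
# given FUSE configuration file (defaults to /etc/fuse.conf).
fuse_allows_other() {
    grep -Eq '^[[:space:]]*user_allow_other[[:space:]]*$' "${1:-/etc/fuse.conf}"
}

# Example:
# fuse_allows_other || echo "user_allow_other is not enabled in /etc/fuse.conf"
```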
When an access token is missing, revoked or expired, the Unity Catalog is no longer accessible and only
a virtual /Volumes/README.txt file appears, with instructions on how to add the access token.
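A script can use this behaviour to detect that a token is needed. A minimal sketch, assuming that README.txt is the only non-hidden entry at the mount root while the token is missing or invalid: the helper name needs_token is hypothetical and not part of fuse4dbricks, and a workspace where the user sees no catalogs at all might look the same.

```shell
# Hypothetical helper, not part of fuse4dbricks.
# Succeeds if the only non-hidden entry in the given directory is
# README.txt, which is taken here as a sign that no valid token is set.
needs_token() {
    [ "$(ls "$1")" = "README.txt" ]
}

# Example:
# needs_token /Volumes && cat /Volumes/README.txt
```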
In the future other auth options may be integrated.
Installation
You can install this package from PyPI:
pip install "fuse4dbricks"
Or the development version:
pip install "git+https://github.com/zeehio/fuse4dbricks.git"
Quickstart
Assuming you are the only user:
sudo mkdir "/Volumes" # or any other directory, in your home, it's up to you
fuse4dbricks --workspace "https://adb-xxxx.azuredatabricks.net" /Volumes
Open a new terminal:
# Provide your databricks access token:
echo "dapi0000000-2" > /Volumes/.auth/personal_access_token
# Access your catalog files:
ls /Volumes
# Your catalogs will appear
Multi user setup
- Create a virtual environment and install fuse4dbricks there:
# Note that fuse4dbricks requires python>=3.11
sudo mkdir /opt/fuse4dbricks
sudo chmod 755 /opt/fuse4dbricks
sudo python3.11 -m venv /opt/fuse4dbricks/venv
source /opt/fuse4dbricks/venv/bin/activate
python3 -m pip install fuse4dbricks
deactivate
- Create a system user account:
sudo useradd --system --shell /usr/sbin/nologin fuse4dbricks
- Create the mount directory:
sudo mkdir /Volumes
sudo chown fuse4dbricks /Volumes
sudo chmod 0700 /Volumes
- Create the cache directory:
sudo mkdir /var/cache/fuse4dbricks
sudo chmod 0700 /var/cache/fuse4dbricks
sudo chown fuse4dbricks /var/cache/fuse4dbricks
- Create a starting script and make it executable.
Replace the workspace URL and the cache settings with your own values:
cat << EOF | sudo tee /opt/fuse4dbricks/fuse4dbricks_start.sh
#!/bin/bash
source /opt/fuse4dbricks/venv/bin/activate
fuse4dbricks \
  --workspace "https://adb-xxxx.azuredatabricks.net" \
  --disk-cache-dir /var/cache/fuse4dbricks \
  --allow-other \
  --ram-cache-mb 512 \
  --disk-cache-gb 1024 \
  --disk-cache-max-days 30 \
  /Volumes
EOF
sudo chmod +x /opt/fuse4dbricks/fuse4dbricks_start.sh
- Create a systemd unit:
cat << EOF | sudo tee /etc/systemd/system/fuse4dbricks.service
[Unit]
Description=fuse4dbricks
After=network.target

[Service]
Type=simple
User=fuse4dbricks
WorkingDirectory=/opt/fuse4dbricks
ExecStart=/opt/fuse4dbricks/fuse4dbricks_start.sh
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target
EOF
- Reload the systemd configuration:
sudo systemctl daemon-reload
- Enable and start the service:
sudo systemctl enable fuse4dbricks
sudo systemctl start fuse4dbricks
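Once the service is started, it may be worth confirming that the mount is in place. A minimal sketch, assuming a Linux system where /proc/self/mounts lists the active mounts: the helper name is_mounted is hypothetical and not part of fuse4dbricks.

```shell
# Hypothetical helper, not part of fuse4dbricks.
# Succeeds if the given directory appears as a mount point (second
# field) in a mounts table. The table defaults to /proc/self/mounts.
is_mounted() {
    awk -v dir="$1" '$2 == dir { found = 1 } END { exit !found }' \
        "${2:-/proc/self/mounts}"
}

# Example:
# is_mounted /Volumes && echo "/Volumes is mounted"
```

If the mount is missing, the standard systemd commands `systemctl status fuse4dbricks` and `journalctl -u fuse4dbricks` show the state and the log of the service.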