Cluster mode plugin for Tusk - distributed queries with DataFusion
tusk-cluster
Cluster mode plugin for Tusk - distributed queries with DataFusion and Arrow Flight.
Installation
pip install tusk-cluster
Or for development:
cd plugins/tusk-cluster
pip install -e .
Usage
CLI Commands
# Start a local dev cluster (scheduler + workers)
tusk cluster-dev --workers 3
# Or start components separately:
tusk cluster-scheduler --port 8814
tusk cluster-worker --scheduler localhost:8814 --port 8815
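When the components are started separately from a script, the command lines above can be assembled programmatically. The following is a minimal sketch and not part of tusk-cluster: `build_cluster_commands` is a hypothetical helper, and giving each worker the next consecutive port after the scheduler is an assumption made for illustration. It uses only the flags shown above and builds argument lists without starting any process.

```python
import shlex


def build_cluster_commands(workers, scheduler_host="localhost", scheduler_port=8814):
    """Return argv lists for one scheduler and `workers` workers.

    Hypothetical helper: only the flags shown in this README are used.
    Assumption: worker i listens on scheduler_port + 1 + i.
    """
    if workers < 0:
        raise ValueError("workers must be non-negative")

    # The scheduler command comes first so it can be started before any worker.
    commands = [["tusk", "cluster-scheduler", "--port", str(scheduler_port)]]
    for i in range(workers):
        commands.append([
            "tusk", "cluster-worker",
            "--scheduler", f"{scheduler_host}:{scheduler_port}",
            "--port", str(scheduler_port + 1 + i),
        ])
    return commands


if __name__ == "__main__":
    # Print shell-quoted command lines for a scheduler and two workers.
    for argv in build_cluster_commands(workers=2):
        print(shlex.join(argv))
```

Each returned list can be passed to `subprocess.Popen` as-is, which avoids shell quoting problems when the host or ports come from configuration.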
Web UI
Once installed, a "Cluster" tab will appear in Tusk Studio where you can:
- Connect to remote schedulers
- Start/stop local clusters
- Monitor workers and jobs
- Submit distributed queries
Architecture
- Scheduler: Coordinates job distribution using Arrow Flight
- Worker: Executes queries using DataFusion
- Communication: Arrow Flight for efficient data transfer
Requirements
- tuskdata>=0.1.0
- datafusion>=43.0.0
- pyarrow>=18.0.0
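To confirm that an environment satisfies these minimums before starting a cluster, a small check along the following lines may help. It is a sketch using only the standard library: the helper names are hypothetical, and the version comparison is deliberately simplified (it looks at the leading numeric components only and ignores pre-release or local suffixes), so it is not a full PEP 440 implementation.

```python
from importlib import metadata

# Minimum versions copied from the Requirements list above.
MINIMUMS = {"tuskdata": "0.1.0", "datafusion": "43.0.0", "pyarrow": "18.0.0"}


def release_tuple(version):
    """Parse leading numeric components, e.g. '18.0.0' -> (18, 0, 0).

    Simplification: parsing stops at the first non-numeric component.
    """
    parts = []
    for piece in version.split("."):
        if not piece.isdecimal():
            break
        parts.append(int(piece))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Compare two versions after zero-padding them to the same length."""
    a, b = release_tuple(installed), release_tuple(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_requirements(minimums=MINIMUMS):
    """Map each package name to 'ok', 'too old (<version>)' or 'missing'."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        if meets_minimum(installed, minimum):
            report[name] = "ok"
        else:
            report[name] = f"too old ({installed})"
    return report


if __name__ == "__main__":
    print(check_requirements())
```

In practice `pip install tusk-cluster` resolves these dependencies itself; a check like this is mainly useful for diagnosing an existing environment.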
File details
Details for the file tusk_cluster-0.1.0-py3-none-any.whl.
File metadata
- Download URL: tusk_cluster-0.1.0-py3-none-any.whl
- Upload date:
- Size: 27.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.8
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 9cfd7a32c6dfef5b14227fbbc1c6e694730abe0c0bb880e01998b07854b5e8c0 |
| MD5 | 3a311b64362feb1e0d85fdc87fca6b9a |
| BLAKE2b-256 | 5eaffc12def74f8ddbe8c53b3331ca26dff001177139d8ce3464ad78751f76e8 |
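A downloaded wheel can be checked against the digests above with Python's standard `hashlib` module. The sketch below is illustrative: the helper names are hypothetical, and the file path passed to them is whatever location the wheel was saved to. BLAKE2b-256 is computed as BLAKE2b with a 32-byte digest.

```python
import hashlib

# SHA256 digest copied from the file hashes listed above.
EXPECTED_SHA256 = "9cfd7a32c6dfef5b14227fbbc1c6e694730abe0c0bb880e01998b07854b5e8c0"


def file_digest(path, algorithm="sha256", chunk_size=1 << 20):
    """Return the hex digest of a file, reading it in chunks."""
    if algorithm == "blake2b-256":
        # BLAKE2b-256 is BLAKE2b truncated to a 32-byte digest.
        hasher = hashlib.blake2b(digest_size=32)
    else:
        hasher = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def matches(path, expected, algorithm="sha256"):
    """Return True if the file's digest equals the expected hex string."""
    return file_digest(path, algorithm) == expected.strip().lower()
```

For example, `matches("tusk_cluster-0.1.0-py3-none-any.whl", EXPECTED_SHA256)` should return `True` for an intact copy of the wheel saved in the current directory.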