Launches a Dask Gateway cluster in K8s and joins HTCondor workers to it

NCSA HTCdaskGateway

Subclasses the Dask Gateway client to launch Dask clusters in Kubernetes, but with HTCondor workers. This is a fork of the ingenious original idea by Maria Acosta at Fermilab, developed as part of their Elastic Analysis Facility project.

How it Works

This is a drop-in replacement for the official Dask Gateway client. It keeps the same authentication and interaction with the gateway server (which is assumed to be running in a Kubernetes cluster). When the user requests a new cluster, this client communicates with the gateway server and instructs it to launch a cluster. We run a modified Docker image in the cluster that launches only the scheduler and assumes that HTC workers will eventually join.

The client then uses the user's credentials to build an HTC Job file and submits it to the cluster. These jobs run the dask worker and have the necessary certs to present themselves to the scheduler.

The scheduler then accepts them into the cluster, and we are ready to compute.
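
As a rough illustration of the job-file step described above, the sketch below renders an HTCondor submit description for a batch of Dask workers. It is not the package's actual implementation: the function name, the `worker.sh` wrapper script, and the certificate file names are assumptions made for this example. Only the submit-file keywords (`universe`, `executable`, `arguments`, `transfer_input_files`, `queue`) are standard HTCondor syntax.

```python
from pathlib import Path


def build_submit_description(scheduler_address, cert_dir, n_workers):
    """Render an HTCondor submit description for n_workers Dask workers.

    Hypothetical helper: the wrapper script name and certificate file
    names are illustrative and not taken from htcdaskgateway.
    """
    if n_workers < 1:
        raise ValueError("n_workers must be at least 1")
    cert_dir = Path(cert_dir)
    # TLS credentials the worker presents to the scheduler (assumed names).
    credentials = [cert_dir / "dask.crt", cert_dir / "dask.pem"]
    lines = [
        "universe = vanilla",
        "executable = worker.sh",  # assumed wrapper that starts the Dask worker
        f"arguments = {scheduler_address}",
        "should_transfer_files = YES",
        "when_to_transfer_output = ON_EXIT",
        "transfer_input_files = " + ", ".join(str(p) for p in credentials),
        f"queue {n_workers}",  # one HTCondor job per requested worker
    ]
    return "\n".join(lines) + "\n"


print(build_submit_description("tls://scheduler.example:8786", "/tmp/certs", 5))
```

Queuing one job per worker keeps the mapping between a scale request and the number of HTCondor jobs easy to follow, which is presumably why scaling is expressed as a worker count.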

  • A Dask Gateway client extension for heterogeneous cluster mode combining the Kubernetes backend for pain-free scheduler networking, with COFFEA-powered HTCondor workers and/or OKD [coming soon].
  • Latest PyPI version is installed by default and deployed to the COFFEA-DASK notebook on EAF (https://analytics-hub.fnal.gov). A few lines will get you going!
  • The current image for workers/schedulers is: coffeateam/coffea-dask-cc7-gateway:0.7.12-fastjet-3.3.4.0rc9-g8a990fa

Basic usage @ Fermilab EAF

  • Make sure the notebook launched supports this functionality (COFFEA-DASK notebook)
from htcdaskgateway import HTCGateway

gateway = HTCGateway()
cluster = gateway.new_cluster()
cluster

# Scale my cluster to 5 HTCondor workers
cluster.scale(5)

# Obtain a client for connecting to your cluster scheduler
# Your cluster should be ready to take requests
client = cluster.get_client()
client

# When computations are finished, shut down the cluster
cluster.shutdown()
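
Because a forgotten `cluster.shutdown()` can leave HTCondor worker jobs running, it may help to wrap the steps above in a context manager. The following is a minimal sketch, not part of the package: `temporary_cluster` is a hypothetical helper that relies only on the `new_cluster()`, `scale()`, and `shutdown()` calls shown above.

```python
from contextlib import contextmanager


@contextmanager
def temporary_cluster(gateway, n_workers):
    """Create a cluster, scale it, and always shut it down on exit."""
    cluster = gateway.new_cluster()
    try:
        cluster.scale(n_workers)
        yield cluster
    finally:
        # Runs even if the body raises, so the cluster is always shut down.
        cluster.shutdown()


# Intended use on EAF (requires the htcdaskgateway package):
#
#     from htcdaskgateway import HTCGateway
#
#     with temporary_cluster(HTCGateway(), n_workers=5) as cluster:
#         client = cluster.get_client()
#         print(client.submit(sum, [1, 2, 3]).result())
```

The helper takes the gateway as an argument rather than constructing it, so the same code works with any object exposing the gateway client interface.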

Other functions worth checking out

  • This is a multi-tenant environment, and you are authenticated via JupyterHub OAuth, which means that you can create as many* clusters as you wish
  • To list your clusters:
# Verify that the gateway is responding to requests by asking to list all its clusters
clusters = gateway.list_clusters()
clusters
  • To connect to a specific cluster from the list:
cluster = gateway.connect(cluster_name)
cluster
cluster.shutdown()
  • To gracefully close the cluster and remove the HTCondor worker jobs associated with it:
cluster.shutdown()
  • Dask Gateway implements notebook widgets. To try them from your EAF COFFEA notebook, execute the client and cluster commands (after properly initializing them) in separate cells, like:
-------------
cluster = gateway.new_cluster()
cluster
< Widget will appear after this step >
-------------
client = cluster.get_client()
client
< Widget will appear after this step >
-------------
cluster
< Widget will appear after this step >
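
The list, connect, and shutdown calls above can be combined to clean up every cluster you own in one pass. This is a sketch under one assumption: that each entry returned by `list_clusters()` exposes a `name` attribute, as the cluster reports in upstream Dask Gateway do. The helper name `shutdown_all_clusters` is hypothetical.

```python
def shutdown_all_clusters(gateway):
    """Connect to every cluster the gateway lists and shut it down.

    Returns the names of the clusters that were shut down.
    """
    stopped = []
    for report in gateway.list_clusters():
        # Assumption: each listed entry carries the cluster name as `.name`.
        cluster = gateway.connect(report.name)
        cluster.shutdown()
        stopped.append(report.name)
    return stopped
```

Connecting first and calling `cluster.shutdown()` mirrors the graceful shutdown step above, which is the call described as removing the associated HTCondor worker jobs.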

Download files

Source Distribution

ncsa_htcdaskgateway-1.0.0rc1.tar.gz (7.9 kB view details)

Uploaded Source

Built Distribution

ncsa_htcdaskgateway-1.0.0rc1-py3-none-any.whl (7.4 kB view details)

Uploaded Python 3

File details

Details for the file ncsa_htcdaskgateway-1.0.0rc1.tar.gz.

File metadata

  • Download URL: ncsa_htcdaskgateway-1.0.0rc1.tar.gz
  • Upload date:
  • Size: 7.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for ncsa_htcdaskgateway-1.0.0rc1.tar.gz
Algorithm Hash digest
SHA256 0974c41b509ab9e66833e28d5c862168a8f972612bfca11a4aa7bc3ecd4d5169
MD5 2d9be180b0235675a7e8ff2285311351
BLAKE2b-256 a4c68ae8c2ad86c195c29fb3d875615d06fe4042394370d77bf674ac192f0a65

Provenance

The following attestation bundles were made for ncsa_htcdaskgateway-1.0.0rc1.tar.gz:

Publisher: cd.yml on ncsa/htcdaskgateway

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file ncsa_htcdaskgateway-1.0.0rc1-py3-none-any.whl.

File hashes

Hashes for ncsa_htcdaskgateway-1.0.0rc1-py3-none-any.whl
Algorithm Hash digest
SHA256 e01214533165829f4f8a935a9d39d0c2d3cd13a81708b929a1bdfd8f4a3fb099
MD5 7cd26872e617c71edea314dde1566382
BLAKE2b-256 2a912e63fab5c62bcfea4bb090a70d113de9f1c915b76f4970879ad1e2abf4e8

Provenance

The following attestation bundles were made for ncsa_htcdaskgateway-1.0.0rc1-py3-none-any.whl:

Publisher: cd.yml on ncsa/htcdaskgateway
