Skip to main content

The Kafka Slurm Agent is a distributed computing and stream processing engine that can be used to run python code acrossmultiple SLURM managed HPC clusters or individual workstations.It uses Kafka to asynchronously communicate with agents installed on clusters and workstations.It contains a monitoring tool with a Web JSON API and a job submitter.It is a pure Python implementation using faust stream processing

Project description

Kafka Slurm Agent

The Kafka Slurm Agent is a distributed computing and stream processing engine that can be used to run python code across multiple SLURM managed HPC clusters or individual workstations. It uses Kafka (3.x) to asynchronously communicate with agents installed on clusters and workstations. It contains a monitoring tool with a Web JSON API and a job submitter. It is a pure Python implementation using faust stream processing

Installation.

Use the standard pip tool to install. The recommended way is to use a Python virtual environment:

python3 -m venv venv

source venv/bin/activate

pip install kafka-slurm-agent

Quick User Guide

Create a new project

In the folder in which you created the venv subfolder run the following command:

kafka-slurm create-project --folder .

This will generate the following files:

  1. kafkaslurm_cfg.py - the configuration file
  2. Startup scripts for worker-agent, cluster-agent and monitor-agent
  3. An example file to run your code (run.py)
  4. The job submitter example (submitter.py)
  5. The class that can be optionally used to override the existing implementation of the worker-agent (my_worker_agent.py)

Configuration

Please adjust the config file. The config file name can be also injected using the KSA_CONFIG_FILE environment variable

  1. Modify the configuration of the connection to Apache Kafka. The default one assumes that kafka is running on localhost and default port (9092) and doesn't use authentication or SSL. In the comments you will find parameters necessary to connect to Kafka configured using SASL and plaintext password. If you use this type of connection please uncomment also the line that starts with: # KAFKA_FAUST_BROKER_CREDENTIALS
  2. Make sure that PREFIX points to the location of your project
  3. Change the names of topics used for your project to avoid any conflict with projects sharing the same kafka instance.
  4. If you want to use a SLURM cluster please change the job CLUSTER_JOB_NAME_SUFFIX = '_KSA' to avoid conflicts with other projects running on your slurm cluster. The jobs managed by cluster-agent will be named "JOBID_SUFFIX" where the JOBID is the identifier that you assign when submitting a job and SUFFIX is handled by this configuration parameter.

For a full list of configuration parameters refer to the documentation.

Creating topics on Kafka with appropriate partitions

Use the built-in command kafka-slurm to create topics. You should set the --new-topic-partitions paramter to at least the number of planned clusters and workstations that will be used simultaneously. For example:

kafka-slurm --new-topic-partitions 4 topics-create

Implement your code

An example of a script that you can implement is generated in run.py

Submitting jobs

Once this is ready, you can test your project locally:

  1. Open a new terminal and start the worker-agent (./start_worker_agent)
  2. Open a new terminal and start the monitor-agent (./start_monitor_agent)
  3. Submit new jobs using the submitter.py

You can monitor the execution by opening http://localhost:6067/mon/stats/ on the host on which you've started the monitor-agent.

Kafka

For development and testing purposes, you can use Kafka in Docker:

docker run -d -p 9092:9092 --name broker apache/kafka:3.9.1

Demo project

You can download and directly run a demonstration project: https://github.com/ilbsm/ksa_demo

Limitations

Due to the underlying libraries (faust-streaming and it's dependency: aiokafka) currently Kafka 4.0 is not supported.

Cite

If you use Kafka Slurm Agent in your research, please cite:

Applying Large-Scale Distributed Computing to Structural Bioinformatics -- Bridging Legacy HPC Clusters with Big Data Technologies using kafka-slurm-agent

You can also read more about the motivation and applications of this tool.

Project details


Release history Release notifications | RSS feed

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distributions

No source distribution files available for this release.See tutorial on generating distribution archives.

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

kafka_slurm_agent-1.4.1-py3-none-any.whl (33.6 kB view details)

Uploaded Python 3

File details

Details for the file kafka_slurm_agent-1.4.1-py3-none-any.whl.

File metadata

File hashes

Hashes for kafka_slurm_agent-1.4.1-py3-none-any.whl
Algorithm Hash digest
SHA256 54a39e829cda51136115a581cf793b70b36c75892de9d0b71542808985c08309
MD5 53d66942cd4c6c03bb302ed4a88eda23
BLAKE2b-256 e53d2e46f9375697058d5df6b36969c95bdb7dc4ecd315c9725bece803ce4069

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page