The Kafka Slurm Agent is a distributed computing and stream processing engine that can be used to run python code acrossmultiple SLURM managed HPC clusters or individual workstations.It uses Kafka to asynchronously communicate with agents installed on clusters and workstations.It contains a monitoring tool with a Web JSON API and a job submitter.It is a pure Python implementation using faust stream processing
Project description
Kafka Slurm Agent
The Kafka Slurm Agent is a distributed computing and stream processing engine that can be used to run python code across multiple SLURM managed HPC clusters or individual workstations. It uses Kafka to asynchronously communicate with agents installed on clusters and workstations. It contains a monitoring tool with a Web JSON API and a job submitter. It is a pure Python implementation using faust stream processing
Installation.
Use the standard pip
tool to install. The recommended way is to use a Python virtual environment:
python3 -m venv venv
source venv/bin/activate
pip install kafka-slurm-agent
Quick User Guide
Create a new project
In the folder in which you created the venv
subfolder run the following command:
kafka-slurm create-project --folder .
This will generate the following files:
kafkaslurm_cfg.py
- the configuration file- Startup scripts for worker-agent, cluster-agent and monitor-agent
- An example file to run your code (
run.py
) - The job submitter example (
submitter.py
) - The class that can be optionally used to override the existing implementation of the worker-agent (``my_worker_agent.py`)
Configuration
Please adjust the config file.
- Modify the configuration of the connection to Apache Kafka. The default one assumes that kafka is running on localhost and default port (9092) and doesn't use authentication or SSL.
In the comments you will find parameters necessary to connect to Kafka configured using SASL and plaintext password. If you use this type of connection please uncomment also the line that starts with:
# KAFKA_FAUST_BROKER_CREDENTIALS
- Make sure that ``PREFIX` points to the location of your project
- Change the names of topics used for your project to avoid any conflict with projects sharing the same kafka instance.
- If you want to use a SLURM cluster please change the job
CLUSTER_JOB_NAME_SUFFIX = '_KSA'
to avoid conflicts with other projects running on your slurm cluster. The jobs managed by cluster-agent will be named "JOBID_SUFFIX" where the JOBID is the identifier that you assign when submitting a job and SUFFIX is handled by this configuration parameter.
For a full list of configuration parameters refer to the documentation.
Creating topics on Kafka with appropriate partitions
Use the built-in command kafka-slurm
to create topics. You should set the --new-topic-partitions
paramter to at least the number of planned clusters and workstations that will be used simultaneously.
For example:
kafka-slurm --new-topic-partitions 4 topics-create
Implement your code
An example of a script that you can implement is generated in run.py
Submitting jobs
Once this is ready, you can test your project locally:
- Open a new terminal and start the worker-agent (
./start_worker_agent
) - Open a new terminal and start the monitor-agent (
./start_monitor_agent
) - Submit new jobs using the
submitter.py
You can monitor the execution by opening http://localhost:6067/mon/stats/ on the host on which you've started the monitor-agent.
Demo project
You can download and directly run a demonstration project: https://github.com/ilbsm/ksa_demo
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distributions
Built Distribution
File details
Details for the file kafka_slurm_agent-1.3.0-py3-none-any.whl
.
File metadata
- Download URL: kafka_slurm_agent-1.3.0-py3-none-any.whl
- Upload date:
- Size: 31.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.1 CPython/3.12.3
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | 018eeaf7c1cdd38e875b4b27bd55c9f93cd1d3b441c706147f3dfb1f62299089 |
|
MD5 | 3b307154979597315dd65043ca97c7a6 |
|
BLAKE2b-256 | e964bdbe513729ced178a6fbb0e8d753ec21122d701860ffba0d9800557567de |