Prefect-Slurm
A Prefect worker for running flows on Slurm HPC clusters
Execute your Prefect flows on high-performance computing clusters using the Slurm workload manager. This worker seamlessly integrates with Slurm's REST API to submit, monitor, and manage flow runs as Slurm jobs.
Features
✨ Automatic API Version Detection - Detects the Slurm REST API version in use and supports versions 0.0.40 through 0.0.42
🔒 Secure Token Management - JWT-based authentication with file locking and proper permissions
🔄 Zombie Job Recovery - Automatically detects and handles orphaned flow runs after worker restarts
📊 Resource Management - Full Slurm job specification support for CPU, memory, and time limits
🛠️ CLI Tools - Built-in utilities for token management and worker administration
🧪 Comprehensive Testing - Both unit and integration tests
Quick Start
Installation
pip install prefect-slurm
Basic Setup
- Create a work pool using the Slurm worker type:
  prefect work-pool create slurm-pool --type slurm
- Configure authentication by setting your Slurm credentials:
  export PREFECT_SLURM_USER_NAME=your_username
  export PREFECT_SLURM_API_URL=http://your-slurm-server:6820
- Set up the authentication token:
  # Generate and store token using built-in CLI
  scontrol token username=$USER lifespan=3600 | prefect-slurm token
  # Or set token directly via environment variable
  export PREFECT_SLURM_USER_TOKEN=your_jwt_token
- Start the worker:
  prefect worker start --pool slurm-pool --type slurm
Configuration
Environment Variables
| Variable | Description | Default |
|---|---|---|
| PREFECT_SLURM_USER_NAME | Slurm username | Required |
| PREFECT_SLURM_API_URL | Slurm REST API URL | Required |
| PREFECT_SLURM_USER_TOKEN | JWT authentication token | Optional |
| PREFECT_SLURM_TOKEN_FILE | Path to token file | ~/.prefect_slurm.jwt |
| PREFECT_SLURM_LOCK_TIMEOUT | File lock timeout (seconds) | 60 |
| PREFECT_SLURM_ENV_FILE | Override environment file path | Optional |
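As an illustration of how these variables and their defaults fit together, here is a minimal sketch that reads them into a single settings object. The `SlurmSettings` class and `load_settings` function are hypothetical names chosen for this example and are not part of the prefect-slurm API; only the variable names and defaults come from the table.

```python
import os
from dataclasses import dataclass
from pathlib import Path
from typing import Mapping, Optional


@dataclass(frozen=True)
class SlurmSettings:
    """Hypothetical container for the variables listed in the table."""
    user_name: str
    api_url: str
    user_token: Optional[str]
    token_file: Path
    lock_timeout: float


def load_settings(environ: Mapping[str, str] = os.environ) -> SlurmSettings:
    """Read settings from a mapping, applying the documented defaults."""
    required = ("PREFECT_SLURM_USER_NAME", "PREFECT_SLURM_API_URL")
    missing = [name for name in required if not environ.get(name)]
    if missing:
        raise ValueError(f"missing required variables: {', '.join(missing)}")
    return SlurmSettings(
        user_name=environ["PREFECT_SLURM_USER_NAME"],
        # Drop a trailing slash so URL paths can be appended predictably.
        api_url=environ["PREFECT_SLURM_API_URL"].rstrip("/"),
        user_token=environ.get("PREFECT_SLURM_USER_TOKEN") or None,
        token_file=Path(
            environ.get("PREFECT_SLURM_TOKEN_FILE", "~/.prefect_slurm.jwt")
        ).expanduser(),
        lock_timeout=float(environ.get("PREFECT_SLURM_LOCK_TIMEOUT", "60")),
    )
```

Passing a plain dictionary instead of `os.environ` makes the function easy to exercise in tests without touching the real environment.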
Environment Files
The worker supports loading configuration from environment files using a hierarchical discovery system. Files are loaded in priority order (later files override earlier ones):
- System-wide: /etc/prefect-slurm/.env
- XDG Config: ~/.config/prefect-slurm/.env (or $XDG_CONFIG_HOME/prefect-slurm/.env)
- User Home: ~/.prefect_slurm.env
- Current Directory (app-specific): ./.prefect_slurm.env
- Current Directory: ./.env
- Environment Variable Override: $PREFECT_SLURM_ENV_FILE
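The discovery order can be sketched roughly as follows. This is an illustrative approximation, not the package's actual loader: the function names are hypothetical, the parser handles only simple `KEY=value` lines, and it assumes the `PREFECT_SLURM_ENV_FILE` path is applied last as the highest-priority entry, following the list order.

```python
from pathlib import Path
from typing import Dict, List, Mapping


def candidate_env_files(cwd: Path, home: Path, environ: Mapping[str, str]) -> List[Path]:
    """Return candidate files ordered from lowest to highest priority."""
    xdg = Path(environ.get("XDG_CONFIG_HOME") or home / ".config")
    files = [
        Path("/etc/prefect-slurm/.env"),
        xdg / "prefect-slurm" / ".env",
        home / ".prefect_slurm.env",
        cwd / ".prefect_slurm.env",
        cwd / ".env",
    ]
    override = environ.get("PREFECT_SLURM_ENV_FILE")
    if override:
        # Assumption: the override is layered on top of the discovered files.
        files.append(Path(override).expanduser())
    return files


def parse_env_file(path: Path) -> Dict[str, str]:
    """Parse KEY=value lines, skipping blank lines and # comments."""
    values: Dict[str, str] = {}
    for raw in path.read_text().splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("'\"")
    return values


def load_env_files(files: List[Path]) -> Dict[str, str]:
    """Merge existing files in order, so later files override earlier ones."""
    merged: Dict[str, str] = {}
    for path in files:
        if path.is_file():
            merged.update(parse_env_file(path))
    return merged
```

With this layering, a value set in `./.env` replaces the same key from `~/.prefect_slurm.env`, while keys defined only in the lower-priority file are still kept.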
Example environment file (.prefect_slurm.env):
# Slurm connection settings
PREFECT_SLURM_USER_NAME=your_username
PREFECT_SLURM_API_URL=http://your-slurm-server:6820
# Optional token (alternative to token file)
PREFECT_SLURM_USER_TOKEN=your_jwt_token_here
# Optional custom token file location
PREFECT_SLURM_TOKEN_FILE=~/my_custom_token.jwt
# Optional custom lock timeout
PREFECT_SLURM_LOCK_TIMEOUT=120
You can override the automatic discovery by setting PREFECT_SLURM_ENV_FILE to point to a specific file:
export PREFECT_SLURM_ENV_FILE=/path/to/my/custom.env
prefect worker start --pool slurm-pool --type slurm
Note: CLI commands (prefect-slurm token) also support environment files, though only PREFECT_SLURM_TOKEN_FILE and PREFECT_SLURM_LOCK_TIMEOUT are relevant for CLI operations.
Work Pool Configuration
Configure your Slurm work pool with job specifications:
job_configuration:
  partition: "compute"
  cpu: 4
  memory: 8
  time_limit: 2
  working_dir: "/path/to/working/directory"
  source_files: # Optional - omit for default Python environment
    - "~/.bashrc"
    - "~/envs/conda/bin/activate"
Environment Setup
The worker supports two environment configuration modes:
Custom Environment (when source_files are specified):
job_configuration:
  source_files:
    - "~/.bashrc"
    - "/opt/conda/bin/activate"
    - "/opt/modules/init.sh"
The worker will source these files before executing your flow. Use this for conda environments, module systems, or custom shell configurations.
Default Python Environment (when source_files is empty or omitted):
job_configuration:
  partition: "compute"
  cpu: 4
  memory: 8
The worker automatically creates a temporary Python virtual environment with the matching Prefect version installed. The environment is created in $TMPDIR/.venv_$SLURM_JOB_ID and cleaned up after job completion.
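To make the two modes concrete, the sketch below generates the shell lines a job script might begin with in each case. It is a guess at the general shape, not the script the worker actually produces: `build_prologue` is a hypothetical helper, and the exact commands (for example `python3 -m venv` and the `trap` cleanup) are assumptions.

```python
import shlex
from typing import Sequence


def _shell_path(path: str) -> str:
    """Quote a path for bash while keeping a leading ~/ expandable."""
    if path.startswith("~/"):
        # A quoted tilde is not expanded by bash, so use $HOME instead.
        return '"$HOME"/' + shlex.quote(path[2:])
    return shlex.quote(path)


def build_prologue(source_files: Sequence[str], prefect_version: str) -> str:
    """Return the environment-setup lines for a job script."""
    lines = ["#!/bin/bash"]
    if source_files:
        # Custom environment: source each file in the order given.
        lines += [f"source {_shell_path(p)}" for p in source_files]
    else:
        # Default environment: temporary venv with a matching Prefect version.
        requirement = shlex.quote(f"prefect=={prefect_version}")
        lines += [
            'VENV_DIR="$TMPDIR/.venv_$SLURM_JOB_ID"',
            'python3 -m venv "$VENV_DIR"',
            # Remove the venv when the script exits, whether or not it succeeded.
            "trap 'rm -rf \"$VENV_DIR\"' EXIT",
            'source "$VENV_DIR/bin/activate"',
            f"pip install --quiet {requirement}",
        ]
    return "\n".join(lines) + "\n"
```

Calling `build_prologue(["~/.bashrc", "/opt/conda/bin/activate"], "3.4.13")` yields only `source` lines, while an empty list yields the virtual environment setup.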
CLI Tools
The package includes a command-line utility for token management:
# Store token from scontrol output at default location
scontrol token username=$USER lifespan=3600 | prefect-slurm token
# Store token to custom location
echo "jwt_token_here" | prefect-slurm token ~/my_token.jwt
# Get help
prefect-slurm token --help
By default the token is stored at ~/.prefect_slurm.jwt (override this by setting PREFECT_SLURM_TOKEN_FILE). The file is created with permissions 600, so only the owning user can read or write it.
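For readers curious what storing a token with restrictive permissions involves, here is a minimal sketch for Unix systems. The helper names are hypothetical and this is not the CLI's source. It assumes the input is either a bare token or a `SLURM_JWT=<token>` line as printed by `scontrol token`, and it leaves out the file locking that the package performs.

```python
import os
import re
from pathlib import Path

# Matches a bare token or a "SLURM_JWT=<token>" line.
_TOKEN_LINE = re.compile(r"^\s*(?:SLURM_JWT=)?(\S+)\s*$")


def extract_token(text: str) -> str:
    """Return the first token found in piped input."""
    for line in text.splitlines():
        match = _TOKEN_LINE.match(line)
        if match:
            return match.group(1)
    raise ValueError("no token found in input")


def store_token(token: str, path: Path) -> None:
    """Write the token so only the owner can read or write it (mode 600)."""
    path = path.expanduser()
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        # Enforce the mode even if the file already existed or umask interfered.
        os.fchmod(fd, 0o600)
    except BaseException:
        os.close(fd)
        raise
    with os.fdopen(fd, "w") as handle:
        handle.write(token + "\n")
```

Opening the file with mode `0o600` at creation time avoids a window in which the token would be readable by other users.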
Running the Examples
You can test the examples in the examples/ directory using the local Docker Compose Slurm cluster:
- Start the local cluster:
  cd slurm_environment/
  docker-compose up -d
- Wait for services to be healthy (check with docker-compose ps)
- Deploy and run example flows (from the prefect_server container):
  # Enter the Prefect server container
  docker-compose exec prefect_server bash
  # Navigate to examples and deploy the hello world example interactively
  cd /opt/data/examples
  prefect deploy
  # Run the deployment
  prefect deployment run slurm-hello-world/slurm-hello-world-deployment
- Monitor execution:
  - Prefect UI: http://localhost:4200
  - Check Slurm jobs (from slurm_node container): docker-compose exec slurm_node squeue
  - View worker logs: docker-compose logs slurm_submitter
The Docker environment provides a complete Slurm cluster with the worker automatically configured and example flows ready to deploy.
Architecture
The Slurm worker integrates with Prefect's execution model:
- Worker Polling - Continuously polls Prefect API for scheduled flow runs
- Job Submission - Converts flow runs to Slurm job specifications
- Execution - Submits jobs via Slurm REST API with proper resource allocation
- Monitoring - Tracks job status and reports back to Prefect
- Cleanup - Handles zombie jobs and ensures proper flow run state management
graph TB
A[Prefect Server] -->|polls for flows| B[Slurm Worker]
B -->|submits jobs| C[Slurm REST API]
C -->|schedules| D[Slurm Cluster]
D -->|executes| E[Flow Run]
E -->|reports status| B
B -->|updates state| A
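The submission and monitoring steps can be illustrated with a small sketch of how a client might address the Slurm REST API and wait for a job to finish. This is not the worker's implementation. The function names are hypothetical, the URL layout and the `X-SLURM-USER-NAME` / `X-SLURM-USER-TOKEN` headers follow the usual slurmrestd JWT conventions, and fetching and parsing the job state is left to a caller-supplied function because the response format differs between API versions.

```python
import time
import urllib.request
from typing import Callable

# Slurm job states after which no further transitions are expected.
TERMINAL_STATES = {
    "COMPLETED", "FAILED", "CANCELLED", "TIMEOUT", "NODE_FAIL", "OUT_OF_MEMORY",
}


def slurm_request(api_url: str, api_version: str, path: str,
                  user_name: str, token: str) -> urllib.request.Request:
    """Build (but do not send) an authenticated request to the REST API."""
    url = f"{api_url.rstrip('/')}/slurm/v{api_version}/{path.lstrip('/')}"
    headers = {"X-SLURM-USER-NAME": user_name, "X-SLURM-USER-TOKEN": token}
    return urllib.request.Request(url, headers=headers)


def wait_for_job(fetch_state: Callable[[], str],
                 interval: float = 5.0, max_polls: int = 1000) -> str:
    """Poll until the job reaches a terminal state, then return that state."""
    for _ in range(max_polls):
        state = fetch_state()
        if state in TERMINAL_STATES:
            return state
        time.sleep(interval)
    raise TimeoutError("job did not reach a terminal state")
```

For example, `slurm_request("http://your-slurm-server:6820", "0.0.40", "job/42", user, token)` targets the status endpoint for job 42. A worker would then translate the returned terminal state into the corresponding Prefect flow run state.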
Requirements
- Python: 3.11+ (< 3.14)
- Prefect: 3.4.13+
- Slurm: Cluster with REST API enabled (versions 0.0.40-0.0.42 supported)
- Network: Access from worker node to both Prefect API and Slurm REST API
Development
Running Tests
# Unit tests only
pytest -m unit
# Integration tests (requires Docker)
pytest -m integration
# CLI tests
pytest -m cli
# All tests
pytest
Test Environment
The project includes a Docker-based Slurm cluster for integration testing:
cd slurm_environment/
docker-compose up -d
Contributing
Contributions are welcome! This project is developed by the EBI Metagenomics team.
Development Workflow
- Fork the repository
- Create a feature branch
- Make your changes with tests
- Run the full test suite
- Submit a pull request
License
Licensed under the Apache License 2.0. See LICENSE for details.
Support
- Issues: Report bugs and request features via GitHub Issues
- Documentation: See tests/README.md for detailed testing information