
Airflow Provider Slurm


🚀 Apache Airflow Executor for Slurm HPC Clusters

Execute Apache Airflow tasks on High-Performance Computing (HPC) clusters using the Slurm REST API. Validated against live Slurm 25.11.1 infrastructure.

Features

  • 🚀 Submit Airflow tasks as Slurm jobs via REST API
  • 📊 Monitor job status and update task states in real-time
  • 📝 Stream logs from Slurm to Airflow UI
  • 🔧 Configure resources (CPU, memory, time limits) per task
  • 🐳 Support for both containerized and virtual environment execution
  • 🔄 Automatic job recovery after scheduler restarts
  • ⚡ Efficient batch job submission for high-throughput workloads

✨ Key Highlights

  • 🎯 Live Tested: Validated against live Slurm 25.11.1 clusters
  • 🔧 HPC Ready: Supports HPC, ML, bioinformatics, and distributed computing workloads
  • ⚡ High Performance: Optimized for large-scale workflow orchestration
  • 🛡️ Reliable: Comprehensive error handling and job recovery mechanisms
  • 📈 Scalable: Dynamic resource allocation and multi-partition support

Requirements

  • Python: 3.8, 3.9, 3.10, 3.11
  • Apache Airflow: 2.5.0+ (including 3.x support)
  • Slurm: 20.02+ with REST API (slurmrestd) v0.0.40-v0.0.44 (a quick connectivity check is sketched after this list)
  • System: scontrol binary available in PATH
  • Storage: Shared filesystem between Airflow and Slurm compute nodes
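
If JWT authentication is enabled for slurmrestd, a quick connectivity check confirms the endpoint and API version before going further. A minimal sketch, assuming the v0.0.40 endpoint and the scontrol binary required above; the API_URL value and the version segment of the path should match your cluster:

# Hypothetical connectivity check -- illustration only, not part of this provider's API
import getpass
import subprocess
import requests

API_URL = "https://your-slurm-cluster:6820"  # same value as [slurm] api_url below

# "scontrol token" prints SLURM_JWT=<token> when auth/jwt is configured
output = subprocess.check_output(["scontrol", "token"], text=True).strip()
token = output.split("=", 1)[1]

resp = requests.get(
    f"{API_URL}/slurm/v0.0.40/ping",
    headers={
        "X-SLURM-USER-NAME": getpass.getuser(),
        "X-SLURM-USER-TOKEN": token,
    },
    timeout=10,
)
resp.raise_for_status()
print(resp.json())  # reports per-controller ping status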

Installation

pip install airflow-provider-slurm
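
As a sanity check, the executor class should then be importable under the module path used in the Quick Start configuration below:

python -c "from airflow_slurm_executor import SlurmExecutor"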

Quick Start

  1. Configure Airflow to use the Slurm executor:
# airflow.cfg
[core]
executor = airflow_slurm_executor.SlurmExecutor

[slurm]
api_url = https://your-slurm-cluster:6820
default_partition = compute
  2. Create a DAG that leverages Slurm resources:
from airflow import DAG
from airflow.decorators import task
from datetime import datetime

with DAG(
    'slurm_example',
    start_date=datetime(2024, 1, 1),
    schedule=None,
) as dag:
    
    @task(executor_config={
        'partition': 'gpu',
        'cpus_per_task': 4,
        'mem': '16G',
        'time_limit': '02:00:00',
    })
    def gpu_task():
        import torch
        # Your GPU workload here
        return "Task completed on Slurm!"
    
    gpu_task()
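
The same executor_config keys should work with classic operators as well, since Airflow hands executor_config to the executor regardless of how the task is defined. A short sketch (the operator, command, and resource values here are illustrative):

from airflow import DAG
from airflow.operators.bash import BashOperator
from datetime import datetime

with DAG(
    'slurm_classic_example',
    start_date=datetime(2024, 1, 1),
    schedule=None,
) as dag:

    BashOperator(
        task_id='run_alignment',
        bash_command='bwa mem ref.fa reads.fq > out.sam',
        # Same resource keys as the decorator example above
        executor_config={
            'partition': 'compute',
            'cpus_per_task': 8,
            'mem': '32G',
            'time_limit': '04:00:00',
        },
    )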

Configuration

See Configuration Guide for detailed options.
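
As with any Airflow settings, the [slurm] options can also be supplied through environment variables using Airflow's standard AIRFLOW__SECTION__OPTION naming, for example:

export AIRFLOW__CORE__EXECUTOR=airflow_slurm_executor.SlurmExecutor
export AIRFLOW__SLURM__API_URL=https://your-slurm-cluster:6820
export AIRFLOW__SLURM__DEFAULT_PARTITION=compute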

Development

# Clone the repository
git clone https://github.com/jontk/airflow-slurm-executor
cd airflow-slurm-executor

# Install in development mode
pip install -e ".[dev]"

# Run tests
pytest

# Run linting
black . && isort . && flake8

License

Apache License 2.0 - see LICENSE file for details.

Acknowledgments

This executor was developed to bridge the gap between modern data orchestration and traditional HPC infrastructure, enabling organizations to leverage their existing Slurm clusters for Airflow workflows.
