
Airflow Provider Slurm


🚀 Apache Airflow Executor for Slurm HPC Clusters

Execute Apache Airflow tasks on High-Performance Computing (HPC) clusters using the Slurm REST API. Validated against live Slurm 25.11.1 infrastructure.

Features

  • 🚀 Submit Airflow tasks as Slurm jobs via REST API
  • 📊 Monitor job status and update task states in real-time
  • 📝 Stream logs from Slurm to Airflow UI
  • 🔧 Configure resources (CPU, memory, time limits) per task
  • 🐳 Support for both containerized and virtual environment execution
  • 🔄 Automatic job recovery after scheduler restarts
  • ⚡ Efficient batch job submission for high-throughput workloads

✨ Key Highlights

  • 🎯 Live Tested: Validated against live Slurm 25.11.1 clusters
  • 🔧 HPC Ready: Supports HPC, ML, bioinformatics, and distributed computing workloads
  • ⚡ High Performance: Optimized for large-scale workflow orchestration
  • 🛡️ Reliable: Comprehensive error handling and job recovery mechanisms
  • 📈 Scalable: Dynamic resource allocation and multi-partition support

Requirements

  • Python: 3.8, 3.9, 3.10, 3.11
  • Apache Airflow: 2.5.0+ (including 3.x support)
  • Slurm: 20.02+ with REST API (slurmrestd) v0.0.40-v0.0.44
  • System: scontrol binary available in PATH
  • Storage: Shared filesystem between Airflow and Slurm compute nodes
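The executor communicates with Slurm through slurmrestd over HTTP. As a quick sanity check before wiring up Airflow, you can hit the REST API's ping endpoint directly. The sketch below uses only the standard library; the base URL is a placeholder for your cluster, and the JWT token is assumed to be in the `SLURM_JWT` environment variable (the variable `scontrol token` exports):

```python
import os
import urllib.request

BASE_URL = "https://your-slurm-cluster:6820"  # placeholder host:port
API_VERSION = "v0.0.40"                       # any supported version v0.0.40-v0.0.44


def ping_url(base_url: str, version: str) -> str:
    """Build the slurmrestd ping endpoint URL."""
    return f"{base_url}/slurm/{version}/ping"


def ping(base_url: str = BASE_URL, version: str = API_VERSION) -> bool:
    """Return True if slurmrestd answers the ping endpoint."""
    req = urllib.request.Request(
        ping_url(base_url, version),
        headers={
            # slurmrestd authenticates via the caller's user name and a JWT
            "X-SLURM-USER-NAME": os.environ.get("USER", "airflow"),
            "X-SLURM-USER-TOKEN": os.environ.get("SLURM_JWT", ""),
        },
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return resp.status == 200
```

If the ping fails, check that slurmrestd is running, the port is reachable from the Airflow host, and the JWT has not expired.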

Installation

```bash
pip install airflow-provider-slurm
```

Quick Start

  1. Configure Airflow to use the Slurm executor:

```ini
# airflow.cfg
[core]
executor = airflow_slurm_executor.SlurmExecutor

[slurm]
api_url = https://your-slurm-cluster:6820
default_partition = compute
```

  2. Create a DAG that leverages Slurm resources:
```python
from airflow import DAG
from airflow.decorators import task
from datetime import datetime

with DAG(
    'slurm_example',
    start_date=datetime(2024, 1, 1),
    schedule=None,
) as dag:

    @task(executor_config={
        'partition': 'gpu',
        'cpus_per_task': 4,
        'mem': '16G',
        'time_limit': '02:00:00',
    })
    def gpu_task():
        import torch
        # Your GPU workload here
        return "Task completed on Slurm!"

    gpu_task()
```
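The airflow.cfg settings from step 1 can equivalently be supplied through Airflow's standard `AIRFLOW__<SECTION>__<KEY>` environment variables, which is convenient in containerized deployments. The section and key names below are taken from the config shown above:

```shell
# Equivalent of the [core] and [slurm] settings in airflow.cfg
export AIRFLOW__CORE__EXECUTOR=airflow_slurm_executor.SlurmExecutor
export AIRFLOW__SLURM__API_URL=https://your-slurm-cluster:6820
export AIRFLOW__SLURM__DEFAULT_PARTITION=compute
```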

Configuration

See Configuration Guide for detailed options.
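One pattern worth noting: the per-task `executor_config` dicts from Quick Start can be centralized as named resource profiles and reused across tasks. A sketch, where the key names are the ones shown in the Quick Start example and the profiles themselves are hypothetical values you would tune for your cluster:

```python
# Hypothetical resource profiles built from the executor_config keys
# used in Quick Start (partition, cpus_per_task, mem, time_limit).
SLURM_PROFILES = {
    "small": {
        "partition": "compute",
        "cpus_per_task": 2,
        "mem": "4G",
        "time_limit": "00:30:00",
    },
    "gpu": {
        "partition": "gpu",
        "cpus_per_task": 4,
        "mem": "16G",
        "time_limit": "02:00:00",
    },
}

# Usage inside a DAG (requires Airflow installed):
# @task(executor_config=SLURM_PROFILES["gpu"])
# def train(): ...
```

Keeping the profiles in one module makes it easy to adjust partitions or limits cluster-wide without touching individual DAGs.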

Development

```bash
# Clone the repository
git clone https://github.com/jontk/airflow-slurm-executor
cd airflow-slurm-executor

# Install in development mode
pip install -e ".[dev]"

# Run tests
pytest

# Run linting
black . && isort . && flake8
```


License

Apache License 2.0 - see LICENSE file for details.

Acknowledgments

This executor was developed to bridge the gap between modern data orchestration and traditional HPC infrastructure, enabling organizations to leverage their existing Slurm clusters for Airflow workflows.
