
SlurmRay is an official tool from DESI @ HEC UNIL for effortlessly distributing tasks on Slurm clusters (e.g., Curnagl) or standalone servers (e.g., ISIPOL09/Desi) using the Ray library.


SlurmRay v9.10.0 - Autonomous Distributed Ray on Slurm

[!IMPORTANT] Bug Reports: SlurmRay is in beta. If you find a bug, please report it on GitHub.

[!TIP] Full Documentation: Access the complete documentation here.

The intelligent bridge between your local terminal and High-Performance Computing (HPC) power.

SlurmRay allows you to transparently distribute your Python tasks across Slurm clusters (like Curnagl) or standalone servers (like Desi). It handles environment synchronization, local package detection, and task distribution automatically, turning your local machine into a control center for massive compute resources.

Current State: Version 9.10.0 (Feb 12).

Recursive Callable Args Scanning: The scanner now recursively walks all data structures (nested dicts, lists, tuples) passed as function arguments to detect local callable objects and their dependencies at any depth.

  • Only project-local callables are scanned; installed libraries are filtered out via inspect.getfile.
  • Circular references are handled safely.
  • The aggressive parent-directory expansion in file sync, which caused cross-job contamination, has been removed.
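The recursive scan described above can be pictured with a minimal sketch. This is not SlurmRay's actual scanner: the names `make_local_filter` and `find_local_callables` are hypothetical, and the "site-packages" check is an assumed heuristic for what counts as an installed library. It only illustrates walking nested containers, filtering with inspect.getfile, and guarding against circular references.

```python
import inspect
from pathlib import Path


def make_local_filter(project_root):
    """Build a predicate that accepts objects defined under project_root."""
    root = Path(project_root).resolve()

    def is_local(obj):
        try:
            source = Path(inspect.getfile(obj)).resolve()
        except (TypeError, OSError):
            # Builtins, C extensions and plain callable instances have no
            # source file that inspect.getfile can report.
            return False
        # Assumed heuristic: anything under site-packages is "installed".
        if "site-packages" in source.parts:
            return False
        return root in source.parents

    return is_local


def find_local_callables(args, is_local):
    """Collect callables accepted by is_local at any nesting depth."""
    found = []
    seen = set()      # ids of visited objects, guards against cycles
    stack = [args]
    while stack:
        item = stack.pop()
        if id(item) in seen:
            continue
        seen.add(id(item))
        if isinstance(item, dict):
            stack.extend(item.keys())
            stack.extend(item.values())
        elif isinstance(item, (list, tuple, set, frozenset)):
            stack.extend(item)
        elif callable(item) and is_local(item):
            found.append(item)
    return found
```

A caller would pass the function arguments as a single container, for example `find_local_callables((args, kwargs), make_local_filter("."))`. An explicit stack is used instead of recursion so that deeply nested arguments cannot hit Python's recursion limit.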

[!NOTE] Ray Multiprocessing Patch (v9.0.2): Uses a proxy module that preserves all multiprocessing attributes (Queue, Process, Lock, reduction, etc.) while overriding only Pool with Ray's distributed version. Fixes all ImportError issues from v9.0.0-9.0.1.
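The proxy-module idea in the note above can be sketched as follows. This is an illustration under assumptions, not SlurmRay's implementation: `make_multiprocessing_proxy`, `install_proxy`, and `DemoPool` are hypothetical names, and `DemoPool` stands in for `ray.util.multiprocessing.Pool` so the sketch runs without Ray installed.

```python
import multiprocessing
import sys
import types


class _MultiprocessingProxy(types.ModuleType):
    """Module object that falls back to the real multiprocessing package."""

    def __getattr__(self, name):
        # Only reached when normal lookup fails, e.g. for a submodule that
        # was imported after the proxy was built.
        return getattr(multiprocessing, name)


def make_multiprocessing_proxy(pool_cls):
    """Build a stand-in module identical to multiprocessing except for Pool."""
    proxy = _MultiprocessingProxy("multiprocessing")
    # Copy every existing attribute (Queue, Process, Lock, reduction, ...).
    proxy.__dict__.update(vars(multiprocessing))
    # Override only Pool.
    proxy.Pool = pool_cls
    return proxy


def install_proxy(proxy):
    """Make later `import multiprocessing` statements resolve to the proxy."""
    sys.modules["multiprocessing"] = proxy


class DemoPool:
    """Hypothetical placeholder for ray.util.multiprocessing.Pool."""


proxy = make_multiprocessing_proxy(DemoPool)
```

Because everything except Pool is copied or delegated, code such as `from multiprocessing import Queue` keeps working once the proxy is installed, which is the property the note attributes to the v9.0.2 fix.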

Libraries Supported: sentence-transformers, ColBERT, torch.multiprocessing, and any library using standard multiprocessing.

⚠️ Infrastructure Warning (Jan 28 2026)

Python 3.12.1 on Desi is currently unstable (Ray Segfaults). While the new uv integration fixes the installation issues, runtime crashes (Exit 245) have been observed. Recommendation: Use Python 3.11.6 for critical workloads until the Ray binary incompatibility is resolved.
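If you want your own launch script to refuse the affected interpreter, a small guard along these lines could help. It is only a sketch: `UNSTABLE_VERSIONS` and `is_known_unstable` are hypothetical names and not part of SlurmRay, and the version set simply mirrors the warning above.

```python
import sys

# Interpreter versions reported as unstable in the warning above.
UNSTABLE_VERSIONS = {(3, 12, 1)}


def is_known_unstable(version=None):
    """Return True if the (major, minor, micro) version is flagged."""
    if version is None:
        version = sys.version_info[:3]
    return tuple(version) in UNSTABLE_VERSIONS
```

A script could call `is_known_unstable()` before submitting a critical workload and exit with a message recommending Python 3.11.6.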

🌟 Key Features (SlurmRay v9.2.0)

  • Smart Hash Sync with Delete Detection: Uses local mtime/size cache for instant scans, verifies remote file existence, and automatically removes stale files on the cluster when files are renamed or deleted locally.
  • Ray Multiprocessing Patch: Transparently replaces multiprocessing.Pool with ray.util.multiprocessing.Pool for distributed execution.
  • Local Wheel Packages Auto-Upload: Reads [tool.hatch.build.targets.wheel].packages from your pyproject.toml and automatically uploads declared local packages (e.g. vendored libraries) to the cluster. Excludes them from requirements.txt to prevent failed PyPI installs.
  • Zero-Config Launch: No project_name required. Auto-git detection.
  • Robust Venv: Uses uv venv to safely create environments even on broken system Pythons.
  • Precision Logging: Explicitly reports why a venv is reused or rebuilt (Hash Match vs Missing).
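The Smart Hash Sync bullet above can be sketched as a planning step. This is a simplified illustration, not the code in file_sync.py: `scan_tree` and `plan_sync` are hypothetical names, and the set of remote files is assumed to have been listed beforehand (for example over SSH).

```python
from pathlib import Path


def scan_tree(root):
    """Map each relative file path to its (mtime_ns, size) signature."""
    root = Path(root)
    signatures = {}
    for path in root.rglob("*"):
        if path.is_file():
            stat = path.stat()
            key = path.relative_to(root).as_posix()
            signatures[key] = (stat.st_mtime_ns, stat.st_size)
    return signatures


def plan_sync(cache, current, remote_files):
    """Decide which files to upload and which stale remote files to delete.

    cache        -- signatures recorded at the previous sync
    current      -- signatures from a fresh local scan
    remote_files -- set of relative paths present on the cluster
    """
    # Upload anything new, changed, or missing on the remote side.
    to_upload = sorted(
        path
        for path, signature in current.items()
        if cache.get(path) != signature or path not in remote_files
    )
    # Delete only files that were synced before and no longer exist locally,
    # so unrelated remote files (e.g. job outputs) are left alone.
    to_delete = sorted(
        path
        for path in cache
        if path not in current and path in remote_files
    )
    return to_upload, to_delete
```

Comparing mtime and size avoids reading file contents on every scan. A renamed file appears as one upload under the new name plus one deletion under the old name.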

Main Entry Scripts

| Script/Command | Description | Usage / Example |
| --- | --- | --- |
| slurmray curnagl | Connect to Curnagl cluster via CLI | slurmray curnagl |
| slurmray desi | Connect to Desi server via CLI | slurmray desi |
| pytest tests/... | Run test suites | pytest tests/test_local_complete_suite.py |

Installation

pip install -e .

Prerequisites

  • Local: Python 3.9+
  • Remote: SSH access to a Slurm cluster or a standalone server with Ray support.
  • Configuration: Create a .env file at the root.

Key Results (Performance Baseline)

| Scenario | Mode | Status | Avg Time |
| --- | --- | --- | --- |
| CPU Task (Simple) | Local | ✅ Pass | < 2s |
| GPU Task (Detection) | Desi | ✅ Pass | ~15s |
| Dependency Detection | Slurm | ✅ Pass | < 1s |
| Concurrent Launch (3 jobs) | Local | ✅ Pass | ~5s |
| Multiprocessing Patch | Local | ✅ Pass | ~30s |

Repository Map

root/
├── slurmray/              # Core logic
│   ├── backend/           # Backends (Slurm, Desi, Local)
│   ├── assets/            # Templates & Wrappers
│   ├── scanner.py         # AST Dependency Detection
│   ├── file_sync.py       # File Synchronization Logic
│   ├── RayLauncher.py     # Main API Entry Point
│   └── cli.py             # Interactive CLI
├── scripts/               # Maintenance & Cleanup utilities
├── tests/                 # Comprehensive test suites
├── documentation/         # HTML/Markdown docs
├── install.sh             # Installation Helper
└── README.md              # Documentation source

Utility Scripts (scripts/)

| Script | Technical role | Execution context |
| --- | --- | --- |
| diagnose_uv.py | Validates uv-based environment handling | Local/Remote |
| diagnose_ray_segfault.py | Diagnoses Python 3.12.1 segfaults on Desi | Remote |
| check_desi_locks.py | Inspects lock files on Desi | Local (connects to Remote) |
| check_desi_resources.py | Checks CPU/GPU availability | Local (connects to Remote) |
| cleanup_desi_projects.py | Removes old projects/venvs | Maintenance |

Roadmap

| Priority | Task | Description |
| --- | --- | --- |
| 🔥 High | Global Venv Caching | Optimization of setup times. |
| Medium | Live Dashboard | Real-time monitoring UI. |
| 🌱 Low | Container Support | Apptainer/Singularity support on Slurm. |

👥 Credits & License

Bugs & Support: This library is currently in beta. If you encounter any bugs, please report them on the GitHub Issues page.

Maintained by the DESI Department @ HEC UNIL. License: MIT.


