SlurmRay is an official tool from DESI @ HEC UNIL for effortlessly distributing tasks on Slurm clusters (e.g., Curnagl) or standalone servers (e.g., ISIPOL09/Desi) using the Ray library.

SlurmRay v9.10.0 - Autonomous Distributed Ray on Slurm

[!IMPORTANT] Bug Reports: SlurmRay is in beta. If you find a bug, please report it on GitHub.

[!TIP] Full Documentation: Access the complete documentation here.

The intelligent bridge between your local terminal and High-Performance Computing (HPC) power.

SlurmRay allows you to transparently distribute your Python tasks across Slurm clusters (like Curnagl) or standalone servers (like Desi). It handles environment synchronization, local package detection, and task distribution automatically, turning your local machine into a control center for massive compute resources.

Current State: Version 9.10.0 (Feb 12).

  • Desi Backend Modernization: Removed all pyenv dependencies on Desi in favor of uv exclusively. The client's local Python version is strictly enforced during virtualenv creation via uv venv --python <version>, eliminating silent fallbacks and implementing a strict fail-fast model.
  • Recursive Callable Args Scanning: The scanner now recursively walks all data structures (nested dicts, lists, tuples) passed as function arguments to detect local callable objects and their dependencies at any depth. Only project-local callables are scanned.
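To make the recursive argument scan more concrete, here is a minimal sketch of the idea. It is not SlurmRay's scanner: the names `find_local_callables` and `is_project_local`, and the rule "a callable is project-local if its source file sits under the project root", are assumptions made for illustration. For dicts, only values are walked.

```python
import inspect
from pathlib import Path
from typing import Any, Callable, List, Optional, Set


def is_project_local(obj: Any, project_root: Path) -> bool:
    """Hypothetical rule: obj is local if its source file is under project_root."""
    try:
        source = inspect.getsourcefile(obj)
    except (TypeError, OSError):  # builtins, C extensions, callable instances
        return False
    if not source:
        return False
    try:
        Path(source).resolve().relative_to(project_root.resolve())
        return True
    except ValueError:  # source file lives outside the project
        return False


def find_local_callables(
    value: Any,
    is_local: Callable[[Any], bool],
    _seen: Optional[Set[int]] = None,
) -> List[Callable]:
    """Collect local callables nested at any depth in dicts, lists, and tuples."""
    seen = set() if _seen is None else _seen
    if id(value) in seen:  # skip repeats and self-referential containers
        return []
    seen.add(id(value))

    found: List[Callable] = []
    if isinstance(value, dict):
        children = list(value.values())
    elif isinstance(value, (list, tuple)):
        children = list(value)
    else:
        children = []
        if callable(value) and is_local(value):
            found.append(value)
    for child in children:
        found.extend(find_local_callables(child, is_local, seen))
    return found


def scale(x):
    return 2 * x


# `scale` appears twice at different depths; `len` is a builtin and is ignored.
args = {"steps": [scale, len], "opts": ({"post": scale},)}
found = find_local_callables(args, is_local=lambda f: f is scale)
print([f.__name__ for f in found])  # ['scale']
```

In a real project the predicate would more likely be `lambda f: is_project_local(f, Path.cwd())`; the demo uses an identity check only so that it behaves the same wherever it is run.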

[!NOTE] Ray Multiprocessing Patch (v9.0.2): Uses a proxy module that preserves all multiprocessing attributes (Queue, Process, Lock, reduction, etc.) while overriding only Pool with Ray's distributed version. Fixes all ImportError issues from v9.0.0-9.0.1.
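The proxy-module idea in the note above can be sketched as follows. This is an illustration of the general technique, not SlurmRay's actual patch: `MultiprocessingProxy` and `DistributedPoolStandIn` are hypothetical names, and the stand-in class takes the place of `ray.util.multiprocessing.Pool` so the example runs without Ray installed.

```python
import multiprocessing
import types


class DistributedPoolStandIn:
    """Hypothetical placeholder for ray.util.multiprocessing.Pool."""


class MultiprocessingProxy(types.ModuleType):
    """Module object that overrides Pool and delegates everything else."""

    def __init__(self, real_module, pool_cls):
        super().__init__(real_module.__name__, real_module.__doc__)
        self.__dict__["_real"] = real_module
        self.Pool = pool_cls  # the only attribute that is replaced

    def __getattr__(self, name):
        # Invoked only when normal lookup fails, so Pool is never delegated,
        # while Queue, Process, Lock, reduction, etc. come from the real module.
        return getattr(self.__dict__["_real"], name)


proxy = MultiprocessingProxy(multiprocessing, DistributedPoolStandIn)
print(proxy.Pool is DistributedPoolStandIn)
print(proxy.Lock is multiprocessing.Lock)
print(proxy.reduction is multiprocessing.reduction)
```

Because every attribute other than `Pool` is forwarded to the real module, code that imports `Queue`, `Process`, or `reduction` through the proxy keeps working. How and when such a proxy would be installed (for example under `sys.modules["multiprocessing"]`) is left out here, since the source does not describe it.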

Libraries Supported: sentence-transformers, ColBERT, torch.multiprocessing, and any library using standard multiprocessing.

⚠️ Infrastructure Warning (Jan 28 2026)

Python 3.12.1 on Desi is currently unstable: Ray segfaults at runtime. The new uv integration fixes the installation issues, but runtime crashes (exit code 245) have still been observed. Recommendation: Use Python 3.11.6 for critical workloads until the Ray binary incompatibility is resolved.

🌟 Key Features (SlurmRay v9.10.0)

  • Strict uv-Only Environments on Desi: Virtual environments on Desi are created exclusively via uv venv using the client's local Python version. No pyenv, no silent system Python fallback, and strict fail-fast errors on build issues.
  • Smart Hash Sync with Delete Detection: Uses local mtime/size cache for instant scans, verifies remote file existence, and automatically removes stale files on the cluster when files are renamed or deleted locally.
  • Ray Multiprocessing Patch: Transparently replaces multiprocessing.Pool with ray.util.multiprocessing.Pool for distributed execution.
  • Local Wheel Packages Auto-Upload: Reads [tool.hatch.build.targets.wheel].packages from your pyproject.toml and automatically uploads declared local packages (e.g. vendored libraries) to the cluster. Excludes them from requirements.txt to prevent failed PyPI installs.
  • Zero-Config Launch: No project_name required. Auto-git detection.
  • Robust Venv: Uses uv venv to safely create environments even on broken system Pythons.
  • Precision Logging: Explicitly reports why a venv is reused or rebuilt (Hash Match vs Missing).
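The sync-with-delete-detection feature listed above can be illustrated with a small planning function. This is a simplified sketch under stated assumptions, not the logic in file_sync.py: `scan_tree` and `plan_sync` are hypothetical names, the remote side is modeled as a plain set of relative paths, and content hashing is omitted so that only the mtime/size fingerprint is compared.

```python
import tempfile
from pathlib import Path
from typing import Dict, List, Set, Tuple

Fingerprint = Tuple[int, int]  # (mtime in nanoseconds, size in bytes)


def scan_tree(root: Path) -> Dict[str, Fingerprint]:
    """Map each file's relative path to its (mtime, size) fingerprint."""
    state = {}
    for path in root.rglob("*"):
        if path.is_file():
            st = path.stat()
            state[path.relative_to(root).as_posix()] = (st.st_mtime_ns, st.st_size)
    return state


def plan_sync(
    current: Dict[str, Fingerprint],
    cached: Dict[str, Fingerprint],
    remote_files: Set[str],
) -> Tuple[List[str], List[str]]:
    """Return (files to upload, stale remote files to delete)."""
    # Upload anything new, changed since the cached scan, or missing remotely.
    upload = sorted(
        rel for rel, fp in current.items()
        if cached.get(rel) != fp or rel not in remote_files
    )
    # A file is stale if it was synced before, is gone locally, and still
    # exists on the remote. Remote files never synced are left alone.
    delete = sorted(
        rel for rel in cached if rel not in current and rel in remote_files
    )
    return upload, delete


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "train.py").write_text("print('v1')")
    (root / "old_name.py").write_text("x = 1")
    cached = scan_tree(root)   # state recorded at the previous sync
    remote = set(cached)       # pretend everything was uploaded

    (root / "old_name.py").rename(root / "new_name.py")
    upload, delete = plan_sync(scan_tree(root), cached, remote)
    print(upload, delete)  # ['new_name.py'] ['old_name.py']
```

A rename shows up as one upload plus one deletion, which is why stale files do not accumulate on the cluster. Restricting deletions to paths found in the cache is a design choice of this sketch that protects remote files the sync never created, such as job outputs.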

Main Entry Scripts

| Script/Command | Description | Usage / Example |
| --- | --- | --- |
| slurmray curnagl | Connect to Curnagl cluster via CLI | slurmray curnagl |
| slurmray desi | Connect to Desi server via CLI | slurmray desi |
| pytest tests/... | Run test suites | pytest tests/test_local_complete_suite.py |

Installation

pip install -e .

Prerequisites

  • Local: Python 3.9+
  • Remote: SSH access to a Slurm cluster or a standalone server with Ray support.
  • Configuration: Create a .env file at the root.

Key Results (Performance Baseline)

| Scenario | Mode | Status | Avg Time |
| --- | --- | --- | --- |
| CPU Task (Simple) | Local | ✅ Pass | < 2s |
| GPU Task (Detection) | Desi | ✅ Pass | ~15s |
| Dependency Detection | Slurm | ✅ Pass | < 1s |
| Concurrent Launch (3 jobs) | Local | ✅ Pass | ~5s |
| Multiprocessing Patch | Local | ✅ Pass | ~30s |

Repository Map

root/
├── slurmray/              # Core logic
│   ├── backend/           # Backends (Slurm, Desi, Local)
│   ├── assets/            # Templates & Wrappers
│   ├── scanner.py         # AST Dependency Detection
│   ├── file_sync.py       # File Synchronization Logic
│   ├── RayLauncher.py     # Main API Entry Point
│   └── cli.py             # Interactive CLI
├── scripts/               # Maintenance & Cleanup utilities
├── tests/                 # Comprehensive test suites
├── documentation/         # HTML/Markdown docs
├── install.sh             # Installation Helper
└── README.md              # Documentation source

Utility Scripts (scripts/)

| Script | Technical role | Execution context |
| --- | --- | --- |
| diagnose_uv.py | Validates uv-based environment handling | Local/Remote |
| diagnose_ray_segfault.py | Diagnoses Python 3.12.1 segfaults on Desi | Remote |
| check_desi_locks.py | Inspects lock files on Desi | Local (connects to Remote) |
| check_desi_resources.py | Checks CPU/GPU availability | Local (connects to Remote) |
| cleanup_desi_projects.py | Removes old projects/venvs | Maintenance |

Roadmap

| Priority | Task | Description |
| --- | --- | --- |
| 🔥 High | Global Venv Caching | Optimization of setup times. |
| Medium | Live Dashboard | Real-time monitoring UI. |
| 🌱 Low | Container Support | Apptainer/Singularity support on Slurm. |

👥 Credits & License

Bugs & Support: This library is currently in beta. If you encounter any bugs, please report them on the GitHub Issues page.

Maintained by the DESI Department @ HEC UNIL. License: MIT.
