SlurmRay is an official tool from DESI @ HEC UNIL for effortlessly distributing tasks on Slurm clusters (e.g., Curnagl) or standalone servers (e.g., ISIPOL09/Desi) using the Ray library.

SlurmRay v9.10.0 - Autonomous Distributed Ray on Slurm

[!IMPORTANT] Bug Reports: SlurmRay is in beta. If you find a bug, please report it on GitHub.

[!TIP] Full Documentation: Access the complete documentation here.

The intelligent bridge between your local terminal and High-Performance Computing (HPC) power.

SlurmRay allows you to transparently distribute your Python tasks across Slurm clusters (like Curnagl) or standalone servers (like Desi). It handles environment synchronization, local package detection, and task distribution automatically, turning your local machine into a control center for massive compute resources.

Current State: Version 9.10.0 (Feb 12). Recursive Callable Args Scanning: The scanner now recursively walks all data structures (nested dicts, lists, tuples) passed as function arguments to detect local callable objects and their dependencies at any depth. Only project-local callables are scanned (installed libraries are filtered via inspect.getfile). Handles circular references safely. Removed aggressive parent directory expansion in file sync that caused cross-job contamination.
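The recursive scan described above can be illustrated with a minimal sketch. This is not SlurmRay's actual scanner: the names `find_local_callables` and `_is_project_local` are hypothetical, and the sketch assumes that "project-local" means "the source file reported by inspect.getfile lives under the project root".

```python
import inspect
import os
from typing import Any, Callable, List, Optional, Set


def _is_project_local(fn: Callable, project_root: str) -> bool:
    """True when the callable's source file lives inside project_root."""
    try:
        path = os.path.realpath(inspect.getfile(fn))
    except (TypeError, OSError):
        # Builtins and C extensions have no Python source file.
        return False
    root = os.path.realpath(project_root)
    try:
        return os.path.commonpath([path, root]) == root
    except ValueError:
        # Paths on different drives (Windows) cannot share a root.
        return False


def find_local_callables(obj: Any, project_root: str,
                         _seen: Optional[Set[int]] = None) -> List[Callable]:
    """Collect project-local callables at any depth of dicts, lists, tuples."""
    seen = _seen if _seen is not None else set()
    if id(obj) in seen:
        # Already visited: this is what makes circular references safe.
        return []
    seen.add(id(obj))

    found: List[Callable] = []
    if isinstance(obj, dict):
        children = list(obj.values())
    elif isinstance(obj, (list, tuple)):
        children = list(obj)
    else:
        children = []
        if callable(obj) and _is_project_local(obj, project_root):
            found.append(obj)

    for child in children:
        found.extend(find_local_callables(child, project_root, seen))
    return found
```

Tracking visited objects by id() keeps the walk linear in the number of distinct objects and also deduplicates a callable that appears in several places of the argument structure.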

[!NOTE] Ray Multiprocessing Patch (v9.0.2): Uses a proxy module that preserves all multiprocessing attributes (Queue, Process, Lock, reduction, etc.) while overriding only Pool with Ray's distributed version. Fixes all ImportError issues from v9.0.0-9.0.1.
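The proxy-module idea in the note above can be sketched with a module-level `__getattr__` (PEP 562). This is an illustration under assumptions, not SlurmRay's implementation: `make_pool_proxy` and `PlaceholderPool` are hypothetical names, and the placeholder class stands in for Ray's distributed Pool so that the example runs without Ray installed.

```python
import multiprocessing
import types


def make_pool_proxy(pool_cls, real_module=multiprocessing):
    """Build a stand-in module: Pool is overridden, the rest delegates."""
    proxy = types.ModuleType(real_module.__name__)
    proxy.Pool = pool_cls

    def __getattr__(name):
        # Invoked only for names missing from the proxy itself, so
        # Queue, Process, Lock, reduction, ... resolve to the real module.
        return getattr(real_module, name)

    proxy.__getattr__ = __getattr__
    return proxy


class PlaceholderPool:
    """Hypothetical stand-in for ray.util.multiprocessing.Pool."""


proxy = make_pool_proxy(PlaceholderPool)
print(proxy.Pool is PlaceholderPool)                    # overridden
print(proxy.reduction is multiprocessing.reduction)     # preserved
```

To take effect for third-party libraries, such a proxy would have to be registered under the name multiprocessing in sys.modules before those libraries import it; the sketch deliberately leaves sys.modules untouched.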

Libraries Supported: sentence-transformers, ColBERT, torch.multiprocessing, and any library using standard multiprocessing.

⚠️ Infrastructure Warning (Jan 28 2026)

Python 3.12.1 on Desi is currently unstable (Ray Segfaults). While the new uv integration fixes the installation issues, runtime crashes (Exit 245) have been observed. Recommendation: Use Python 3.11.6 for critical workloads until the Ray binary incompatibility is resolved.
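One way to act on this recommendation is a small guard at the top of a critical script. The sketch below is an assumption for illustration, not part of SlurmRay: `KNOWN_UNSTABLE` and `is_known_unstable` are hypothetical names, and the only version listed is the one named in the warning above.

```python
import sys
import warnings

# Interpreter versions reported as unstable in the warning above.
KNOWN_UNSTABLE = {(3, 12, 1)}


def is_known_unstable(version=None):
    """Return True if the (major, minor, micro) version is flagged."""
    if version is None:
        version = sys.version_info
    return tuple(version[:3]) in KNOWN_UNSTABLE


if is_known_unstable():
    warnings.warn(
        "This Python version is flagged as unstable with Ray on Desi; "
        "consider Python 3.11.6 for critical workloads."
    )
```

The guard only inspects the interpreter it runs under, so it is meaningful when executed on the machine that actually runs the Ray workload.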

🌟 Key Features (SlurmRay v9.2.0)

  • Smart Hash Sync with Delete Detection: Uses local mtime/size cache for instant scans, verifies remote file existence, and automatically removes stale files on the cluster when files are renamed or deleted locally.
  • Ray Multiprocessing Patch: Transparently replaces multiprocessing.Pool with ray.util.multiprocessing.Pool for distributed execution.
  • Local Wheel Packages Auto-Upload: Reads [tool.hatch.build.targets.wheel].packages from your pyproject.toml and automatically uploads declared local packages (e.g. vendored libraries) to the cluster. Excludes them from requirements.txt to prevent failed PyPI installs.
  • Zero-Config Launch: No project_name required. Auto-git detection.
  • Robust Venv: Uses uv venv to safely create environments even on broken system Pythons.
  • Precision Logging: Explicitly reports why a venv is reused or rebuilt (Hash Match vs Missing).
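The mtime/size cache and delete detection behind Smart Hash Sync can be sketched as follows. This is a simplified local-only illustration, not the code in file_sync.py: the function name `scan`, the cache layout, and the choice of SHA-256 are assumptions, and the remote existence check mentioned above is omitted.

```python
import hashlib
import os
from typing import Dict, List, Tuple

# Cache layout (assumed): relative path -> (mtime_ns, size, sha256 hex digest)
Cache = Dict[str, Tuple[int, int, str]]


def scan(root: str, cache: Cache) -> Tuple[Cache, List[str], List[str]]:
    """Return (new_cache, changed_paths, deleted_paths) for files under root."""
    new_cache: Cache = {}
    changed: List[str] = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, root)
            st = os.stat(full)
            old = cache.get(rel)
            if old is not None and old[:2] == (st.st_mtime_ns, st.st_size):
                # Fast path: metadata unchanged, reuse the cached digest.
                new_cache[rel] = old
                continue
            with open(full, "rb") as fh:
                digest = hashlib.sha256(fh.read()).hexdigest()
            new_cache[rel] = (st.st_mtime_ns, st.st_size, digest)
            if old is None or old[2] != digest:
                changed.append(rel)
    # Paths cached earlier but absent now were renamed or deleted locally;
    # these are the candidates for removal on the remote side.
    deleted = sorted(set(cache) - set(new_cache))
    return new_cache, sorted(changed), deleted
```

Only files whose mtime or size differ from the cache are re-hashed, which is what keeps repeated scans of an unchanged project cheap; a renamed file shows up as one deleted path plus one changed path.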

Main Entry Scripts

| Script/Command | Description | Usage / Example |
| --- | --- | --- |
| slurmray curnagl | Connect to Curnagl cluster via CLI | slurmray curnagl |
| slurmray desi | Connect to Desi server via CLI | slurmray desi |
| pytest tests/... | Run test suites | pytest tests/test_local_complete_suite.py |

Installation

pip install -e .

Prerequisites

  • Local: Python 3.9+
  • Remote: SSH access to a Slurm cluster or a standalone server with Ray support.
  • Configuration: Create a .env file at the root.

Key Results (Performance Baseline)

| Scenario | Mode | Status | Avg Time |
| --- | --- | --- | --- |
| CPU Task (Simple) | Local | ✅ Pass | < 2s |
| GPU Task (Detection) | Desi | ✅ Pass | ~15s |
| Dependency Detection | Slurm | ✅ Pass | < 1s |
| Concurrent Launch (3 jobs) | Local | ✅ Pass | ~5s |
| Multiprocessing Patch | Local | ✅ Pass | ~30s |

Repository Map

root/
├── slurmray/              # Core logic
│   ├── backend/           # Backends (Slurm, Desi, Local)
│   ├── assets/            # Templates & Wrappers
│   ├── scanner.py         # AST Dependency Detection
│   ├── file_sync.py       # File Synchronization Logic
│   ├── RayLauncher.py     # Main API Entry Point
│   └── cli.py             # Interactive CLI
├── scripts/               # Maintenance & Cleanup utilities
├── tests/                 # Comprehensive test suites
├── documentation/         # HTML/Markdown docs
├── install.sh             # Installation Helper
└── README.md              # Documentation source

Utility Scripts (scripts/)

| Script | Technical role | Execution context |
| --- | --- | --- |
| diagnose_uv.py | Validates uv-based environment handling | Local/Remote |
| diagnose_ray_segfault.py | Diagnoses Python 3.12.1 segfaults on Desi | Remote |
| check_desi_locks.py | Inspects lock files on Desi | Local (connects to Remote) |
| check_desi_resources.py | Checks CPU/GPU availability | Local (connects to Remote) |
| cleanup_desi_projects.py | Removes old projects/venvs | Maintenance |

Roadmap

| Priority | Task | Description |
| --- | --- | --- |
| 🔥 High | Global Venv Caching | Optimization of setup times. |
| Medium | Live Dashboard | Real-time monitoring UI. |
| 🌱 Low | Container Support | Apptainer/Singularity support on Slurm. |

👥 Credits & License

Bugs & Support: This library is currently in beta. If you encounter any bugs, please report them on the GitHub Issues page.

Maintained by the DESI Department @ HEC UNIL. License: MIT.
