A Snakemake executor plugin for Huawei Donau scheduler
Snakemake Executor Plugin for Huawei Donau Scheduler
This is a Snakemake executor plugin designed specifically for the Huawei Donau High Performance Computing (HPC) scheduler. It enables Snakemake to interact directly with the Donau scheduling system, handling job submission, status monitoring, and resource management automatically.
✨ Key Features
- Native Adaptation: Deeply integrated with the `dsub`, `djob`, and `dkill` commands.
- Smart Resource Mapping: Automatically translates Snakemake resources (`threads`, `mem_mb`, `runtime`, `account`, `mpi`) into Donau resource request parameters (e.g., `-R cpu=X,mem=YMB`, `-T`, `-A`, `--mpi`).
- Robust Status Checking: Implements a dual-query mechanism ("Active Queue" + "History Database") to prevent false status judgments caused by jobs finishing instantly or rapid scheduler cleanup.
- Detailed Audit Logs: Integrated with `loguru` to provide end-to-end debugging logs (command construction, raw output, status changes) for easy troubleshooting.
- Safe Cancellation: Supports batch, forced, and non-interactive job cancellation via Ctrl+C.
- Async Performance: Uses `asyncio` for non-blocking status polling, suitable for large-scale workflows.
🛠️ Installation
Ensure you have Python 3.8+ and Snakemake 8.0+ installed.
Source Installation (Recommended)
Since Snakemake is often used with Conda & Mamba, it is recommended to install the plugin via pip after activating your Snakemake environment:
git clone https://github.com/xsx123123/snakemake_executor_donau.git
cd snakemake_executor_donau
pip install -e .
🚀 Quick Start
1. Basic Usage
Once installed, use the --executor donau argument to enable this plugin:
snakemake --executor donau --jobs 100
🧪 Testing
A test environment is provided in the Test/ directory. You can verify the plugin's functionality using the following command:
# Run from the project root
snakemake --snakefile Test/snakefile --executor donau --jobs 10 --latency-wait 60
- `--jobs 10`: Limits the maximum number of concurrent jobs to 10.
- `--latency-wait 60`: Waits up to 60 seconds for output files to appear on the filesystem (recommended for HPC shared filesystems).
2. Snakefile Example
Define resources in your Snakefile, and the plugin will automatically convert them to scheduler parameters:
rule complex_task:
    input:
        "data/raw.txt"
    output:
        "results/final.txt"
    # 1. Set Job Priority (Maps to dsub -p)
    priority: 50
    # 2. Set Resources
    resources:
        queue = "fat_node",     # -q fat_node
        mem_mb = 8192,          # -R mem=8192MB
        runtime = 120,          # -T 7200 (120 min -> seconds)
        nodes = 2,              # -N 2 (Replicas/Nodes)
        exclusive = True,       # -x (Exclusive mode)
        tag = "group=bio",      # --tag group=bio
        account = "proj_01",    # -A proj_01
        mpi = "openmpi"         # --mpi openmpi
    threads: 8                  # -R cpu=8
    shell:
        "echo 'Running on Donau' > {output}"
⚙️ Resource Mapping Details
The plugin maps Snakemake resource definitions to dsub parameters as follows:
| Snakemake Keyword | Meaning | Donau Parameter | Notes |
|---|---|---|---|
| `threads` | CPU Cores | `-R cpu=<threads>` | Defaults to 1 |
| `priority` | Priority | `-p <int>` | Maps Snakemake priority (1-9999) |
| `resources.mem_mb` | Memory (MB) | `-R mem=<mem_mb>MB` | Defaults to 1024MB |
| `resources.queue` | Queue Name | `-q <queue>` | `partition` is also supported |
| `resources.runtime` | Runtime (min) | `-T <seconds>` | Converted to seconds. `time_min` is also supported |
| `resources.nodes` | Replicas/Nodes | `-N <count>` | `replica` is also supported |
| `resources.exclusive` | Exclusive | `-x job` | Set to `True` or `1` to enable |
| `resources.tag` | Custom Tag | `--tag <string>` | e.g., `"key=value"` |
| `resources.account` | Account | `-A <account>` | For billing/permissions |
| `resources.mpi` | MPI Type | `--mpi <type>` | e.g., `openmpi`, `intelmpi` |
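
The mapping in the table can be sketched in a few lines of Python. This is an illustrative sketch only, not the plugin's actual code: `build_dsub_args` is a hypothetical helper, and combining `cpu` and `mem` into a single `-R` value is an assumption based on the `-R cpu=X,mem=YMB` form shown in the feature list.

```python
from typing import Any, Dict, List, Optional


def build_dsub_args(
    threads: int = 1,
    priority: Optional[int] = None,
    resources: Optional[Dict[str, Any]] = None,
) -> List[str]:
    """Hypothetical helper: translate Snakemake settings into dsub flags."""
    res = resources or {}
    mem_mb = res.get("mem_mb", 1024)  # table: defaults to 1024MB
    # Assumption: cpu and mem share one -R value, as in "-R cpu=X,mem=YMB".
    args = ["-R", f"cpu={threads},mem={mem_mb}MB"]

    if priority is not None:
        args += ["-p", str(priority)]

    # "partition" is accepted as an alias for "queue".
    queue = res.get("queue", res.get("partition"))
    if queue:
        args += ["-q", str(queue)]

    # Runtime is given in minutes and passed to -T in seconds.
    runtime_min = res.get("runtime", res.get("time_min"))
    if runtime_min is not None:
        args += ["-T", str(int(runtime_min) * 60)]

    # "replica" is accepted as an alias for "nodes".
    nodes = res.get("nodes", res.get("replica"))
    if nodes is not None:
        args += ["-N", str(nodes)]

    if res.get("exclusive") in (True, 1):
        args += ["-x", "job"]

    for key, flag in (("tag", "--tag"), ("account", "-A"), ("mpi", "--mpi")):
        if res.get(key):
            args += [flag, str(res[key])]
    return args
```

Called with `threads=8` and `resources={"mem_mb": 8192, "runtime": 120}`, the helper would produce `-R cpu=8,mem=8192MB -T 7200`, matching the conversions listed in the table.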
📝 Logging & Troubleshooting
1. Executor Log (Workdir)
Scheduling actions and errors are now logged directly to your working directory:
- Path: `./donau_executor.log`
- Content: Detailed timestamps, UUIDs, executed shell commands, and debugging info.
2. Job Standard Output (Per Rule)
The stdout and stderr of each specific job are redirected to:
- Path: `.snakemake/donau_logs/rule_<name>/<wildcards>/<jobid>.log`
- Usage: Check this file for job-specific errors or program output.
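
When a rule has many jobs, it can help to list its logs newest first. The sketch below relies only on the directory layout given above; `find_job_logs` is a hypothetical helper and not part of the plugin.

```python
from pathlib import Path
from typing import List


def find_job_logs(rule_name: str, workdir: str = ".") -> List[Path]:
    """Return the per-job logs of one rule, newest first (hypothetical helper)."""
    log_root = Path(workdir) / ".snakemake" / "donau_logs" / f"rule_{rule_name}"
    # Layout described above: rule_<name>/<wildcards>/<jobid>.log
    logs = list(log_root.glob("*/*.log"))
    return sorted(logs, key=lambda p: p.stat().st_mtime, reverse=True)
```

For example, `find_job_logs("complex_task")[:3]` would give the three most recently written logs of that rule, or an empty list if the rule has not run yet.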
🔧 Underlying Logic
This plugin relies on the following Donau commands (ensure they are available in $PATH):
- Job Submission (`dsub`)
  - Uses `-n` to specify the job name.
  - Uses `-oo` to capture both stdout and stderr.
  - Uses `--cwd` to lock the working directory.
  - Includes automatic retry logic for network stability.
- Status Query (`djob`)
  - Command: `djob -o "jobid state" --no-header <id_list>`
  - Logic: Prioritizes querying the active list. If an ID is missing, it automatically appends the `-D` flag to query the completed/history database, ensuring accurate status retrieval.
- Job Cancellation (`dkill`)
  - Command: `dkill -y --force <id_list>`
  - Logic: Uses `-y` to skip interactive confirmation and `--force` to ensure jobs are thoroughly cleaned up.
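
The dual-query status check described above can be sketched with `asyncio`. This is a minimal illustration under stated assumptions, not the plugin's implementation: `parse_djob_output`, `run_command`, and `query_states` are hypothetical names, the exact position of `-D` on the command line is assumed, and the output is assumed to be one whitespace-separated `jobid state` pair per line.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List

Runner = Callable[[List[str]], Awaitable[str]]


def parse_djob_output(text: str) -> Dict[str, str]:
    """Parse 'jobid state' lines into {jobid: state}; malformed lines are skipped."""
    states: Dict[str, str] = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) >= 2:
            states[parts[0]] = parts[1]
    return states


async def run_command(cmd: List[str]) -> str:
    """Run a command without blocking the event loop and return its stdout."""
    proc = await asyncio.create_subprocess_exec(
        *cmd,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE,
    )
    stdout, _ = await proc.communicate()
    return stdout.decode()


async def query_states(job_ids: List[str], run: Runner = run_command) -> Dict[str, str]:
    """Query the active list first, then the history database for missing IDs."""
    base = ["djob", "-o", "jobid state", "--no-header"]
    states = parse_djob_output(await run(base + job_ids))
    missing = [job_id for job_id in job_ids if job_id not in states]
    if missing:
        # Assumption: -D is placed before the ID list.
        states.update(parse_djob_output(await run(base + ["-D"] + missing)))
    return states
```

Passing the runner as a parameter keeps the sketch testable on machines without `djob`: a stub coroutine returning canned output can stand in for `run_command`.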
📂 Project Structure
Following the official Snakemake plugin conventions:
snakemake_executor_donau/
├── pyproject.toml # Poetry configuration (deps & entry points)
├── README.md # Documentation (English)
├── docs/
│ └── README_zh.md # Documentation (Chinese)
└── snakemake_executor_plugin_donau/ # Core code directory (must follow strict naming)
├── __init__.py # Plugin entry point
├── executor.py # Core logic (submit/query/cancel)
└── logging.py # Logging configuration
📦 Development & Building Guide
If you intend to develop your own Snakemake plugin or contribute to this project, please adhere to the following standards:
1. Naming Convention (Strict)
Snakemake's plugin discovery mechanism enforces strict naming:
- Code Directory: Must be named `snakemake_executor_plugin_<name>` (e.g., `snakemake_executor_plugin_donau`).
- Project Name (PyPI): Recommended to be `snakemake-executor-plugin-<name>`.
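
To see why the directory name matters, here is a rough sketch of prefix-based discovery using the standard library. It is an illustration of the idea only: `plugin_name` and `installed_executor_plugins` are hypothetical helpers, and Snakemake's own registry may differ in detail (for example, it may normalise underscores in the name, which this sketch does not).

```python
import pkgutil
from typing import List, Optional

PREFIX = "snakemake_executor_plugin_"


def plugin_name(module_name: str) -> Optional[str]:
    """Return the executor name encoded in a module name, or None if it does not match."""
    if module_name.startswith(PREFIX) and len(module_name) > len(PREFIX):
        return module_name[len(PREFIX):]
    return None


def installed_executor_plugins() -> List[str]:
    """List executor names for all importable top-level modules carrying the prefix."""
    names = (plugin_name(module.name) for module in pkgutil.iter_modules())
    return sorted(name for name in names if name is not None)
```

With this scheme, a package directory named `snakemake_executor_donau` would never be found, while `snakemake_executor_plugin_donau` yields the name `donau`.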
2. Configuration (pyproject.toml)
This project uses the Poetry standard format, which is recommended by Snakemake. The key configuration is:
[tool.poetry.plugins."snakemake.executors"]
donau = "snakemake_executor_plugin_donau:Executor"
This line tells Snakemake: "When the user specifies --executor donau, load the Executor class from the snakemake_executor_plugin_donau module."
3. Local Development Flow
- Clone the Repository:
    git clone https://github.com/xsx123123/snakemake_executor_donau.git
    cd snakemake_executor_donau
- Install in Editable Mode: Within your Snakemake environment, run:
    pip install -e .
  Note: You do not need to install `poetry` explicitly; `pip` handles the build via `pyproject.toml`.
- Verify:
    snakemake --help | grep donau
  If `donau` appears in the output, the plugin is successfully registered.
⚠️ Notes
- Runtime Configuration: Avoid setting `runtime` or `time_min` in your `resources` unless strictly necessary. A hard limit may cause the scheduler to kill long-running jobs prematurely and, if misconfigured, may affect Snakemake's status polling (although the plugin handles timeouts gracefully). Where possible, let the scheduler apply its default walltime.
- Queue Names: Ensure the `queue` specified in your Snakefile actually exists on your cluster.
- Memory Units: The plugin enforces `MB` as the unit when interacting with the scheduler.
- Shared Filesystem: The default configuration assumes all compute nodes share a filesystem. If they do not, storage plugins need to be configured.
📝 Logging & Troubleshooting
1. Executor Log (Workdir)
Scheduling actions, status updates, and errors are now logged directly to your working directory:
- Path: `./donau_executor.log`
- Content:
  - Detailed timestamps and UUIDs.
  - Executed shell commands (`dsub`, `djob`, etc.).
  - Job Completion: Clear "Job (ID: ) finished successfully" messages.
  - Debugging info for development.