A Mixed-Integer Linear Programming (MILP) scheduler for Snakemake that optimally schedules jobs across heterogeneous compute resources

MILP Snakemake Scheduler Plugin

A pip-installable scheduler plugin for Snakemake that uses Mixed-Integer Linear Programming (MILP) to schedule jobs optimally across heterogeneous compute resources. It provides two schedulers:

  • milp: schedules only ready jobs via MILP.
  • milp-ext: schedules across all pending jobs using MILP + critical-path analysis.
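
The difference between the two schedulers comes down to which jobs the solver sees. As a rough illustration only (this is not the plugin's code; the job names, the `deps` mapping, and both helper functions are hypothetical), a job is ready once all of its dependencies have finished, while pending covers every job that has not finished yet:

```python
# Hypothetical workflow: job name -> set of jobs it depends on.
deps = {
    "align": set(),
    "sort": {"align"},
    "call": {"sort"},
    "qc": set(),
    "report": {"call", "qc"},
}
finished = {"align"}


def pending_jobs(deps, finished):
    """All jobs that have not finished yet (the view assumed for milp-ext)."""
    return {job for job in deps if job not in finished}


def ready_jobs(deps, finished):
    """Pending jobs whose dependencies have all finished (the view assumed for milp)."""
    return {job for job in pending_jobs(deps, finished) if deps[job] <= finished}


print(sorted(ready_jobs(deps, finished)))    # ['qc', 'sort']
print(sorted(pending_jobs(deps, finished)))  # ['call', 'qc', 'report', 'sort']
```

Under this reading, milp would optimize only over the first set at each scheduling step, whereas milp-ext would plan over the second.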

Features

  • Resource-Aware Scheduling: Considers CPU cores, memory, GPU, custom resources, and I/O constraints.
  • Critical-Path Analysis: Optionally detects and prioritizes the longest dependency chain.
  • Historical Estimation: Learns from past executions to refine runtime and I/O size predictions.
  • Multi-Objective Optimization: Balances makespan vs. energy consumption with configurable weights.
  • Graceful Fallbacks: Falls back to greedy or ILP strategies if MILP fails or times out.
  • Plugin Auto-Discovery: Integrates seamlessly via Snakemake entry points.
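
To make the critical-path idea concrete, here is a minimal sketch of finding the longest dependency chain in a small acyclic workflow. It is an illustration under stated assumptions, not the plugin's implementation: the runtimes, job names, and the `critical_path` helper are all hypothetical.

```python
from functools import lru_cache

# Hypothetical runtimes (minutes) and dependencies for an acyclic workflow.
runtime = {"align": 30, "sort": 10, "call": 45, "qc": 5, "report": 2}
deps = {
    "align": [],
    "sort": ["align"],
    "call": ["sort"],
    "qc": [],
    "report": ["call", "qc"],
}


def critical_path(runtime, deps):
    """Return (total_runtime, jobs) for the longest-running dependency chain."""

    @lru_cache(maxsize=None)
    def longest_to(job):
        # Longest chain ending at `job`: the best predecessor chain plus this job.
        best_len, best_chain = 0, ()
        for parent in deps[job]:
            length, chain = longest_to(parent)
            if length > best_len:
                best_len, best_chain = length, chain
        return best_len + runtime[job], best_chain + (job,)

    return max((longest_to(job) for job in deps), key=lambda item: item[0])


length, chain = critical_path(runtime, deps)
print(length, list(chain))  # 87 ['align', 'sort', 'call', 'report']
```

Jobs on this chain bound the makespan from below, which is why a scheduler might choose to start them first.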

Installation

From PyPI

pip install milp-snakemake-scheduler

From Source

git clone https://github.com/AasishKumarSharma/milp_snakemake_scheduler.git
cd milp_snakemake_scheduler
pip install .

For development (including test dependencies):

pip install -e ".[dev]"

Quick Start

  1. Create a System Profile: system_profile.json describes your clusters/nodes.

  2. (Optional) Customize: edit scheduler_config.yaml to adjust estimation, objectives, and fallbacks.

  3. Run Snakemake:

    3.1. Run in MILP-only mode for ready jobs:

    snakemake --scheduler milp \
              --cores 8 \
              --scheduler-config scheduler_config.yaml \
              --system-profile system_profile.json \
              --jobs 16
    

    3.2. Run in extended MILP mode with critical-path analysis across all pending jobs in a workflow:

    snakemake --scheduler milp-ext \
              --cores 8 \
              --scheduler-config scheduler_config.yaml \
              --system-profile system_profile.json \
              --jobs 16
    
  4. Dry-Run:

    snakemake --scheduler milp -n --cores 8 --scheduler-config scheduler_config.yaml --system-profile system_profile.json
    

Configuration

system_profile.json

Defines compute clusters and nodes. Example:

{
  "clusters": {
    "local": {
      "nodes": {
        "default": {
          "resources": {"cores": 8, "memory_mb": 32768, "gpu_count": 0},
          "features": ["cpu", "x86_64", "avx2"],
          "properties": {"cpu_flops": 1e11, "memory_bandwidth_mbps": 25600}
        }
      }
    },
    "gpu_cluster": { ... },
    "high_memory": { ... }
  }
}
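
As a quick sanity check before running a workflow, a profile with this layout can be loaded and summarized in a few lines. The sketch below assumes the clusters → nodes → resources nesting shown above and embeds only the "local" cluster, since the other two are elided in the example; `total_resources` is a hypothetical helper, not part of the plugin.

```python
import json

# Only the "local" cluster from the example; the elided clusters are omitted.
PROFILE_TEXT = """
{
  "clusters": {
    "local": {
      "nodes": {
        "default": {
          "resources": {"cores": 8, "memory_mb": 32768, "gpu_count": 0},
          "features": ["cpu", "x86_64", "avx2"],
          "properties": {"cpu_flops": 1e11, "memory_bandwidth_mbps": 25600}
        }
      }
    }
  }
}
"""


def total_resources(profile):
    """Sum each numeric resource over all nodes in all clusters."""
    totals = {}
    for cluster in profile["clusters"].values():
        for node in cluster["nodes"].values():
            for name, amount in node["resources"].items():
                totals[name] = totals.get(name, 0) + amount
    return totals


profile = json.loads(PROFILE_TEXT)
print(total_resources(profile))  # {'cores': 8, 'memory_mb': 32768, 'gpu_count': 0}
```

With a real file, `json.load` on the opened system_profile.json would replace the embedded string.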

scheduler_config.yaml

Controls scheduler behavior. Example:

scheduler:
  type: milp
  paths:
    system_profile: system_profile.json
  estimation:
    auto_estimate_file_sizes: true
    history:
      enabled: true
      adaptation_weight: 0.7
  optimization:
    objective_weights:
      makespan: 0.8
      energy: 0.2
    time_limit_seconds: 30
    fallback: greedy

Key options:

  • estimation.auto_estimate_file_sizes: infer I/O if not provided.
  • optimization.time_limit_seconds: MILP solver cutoff.
  • optimization.fallback: strategy to use (greedy or ilp) if the MILP fails or times out.
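
The exact formulas behind adaptation_weight and objective_weights are not documented here, so the following is only one plausible reading: an exponentially weighted update in which the weight applies to the newest observation, and a weighted sum of two objectives that are assumed to be normalized to a comparable scale. Both function names are hypothetical.

```python
def blend_estimate(previous, observed, adaptation_weight=0.7):
    """Exponentially weighted update; the weight is ASSUMED to apply to the new observation."""
    return adaptation_weight * observed + (1 - adaptation_weight) * previous


def weighted_objective(makespan, energy, weights):
    """Combine two pre-normalized objectives into a single value to minimize."""
    return weights["makespan"] * makespan + weights["energy"] * energy


# A rule previously estimated at 10 minutes that just took 20 minutes.
estimate = blend_estimate(previous=10.0, observed=20.0)

# Normalized makespan 0.5 and normalized energy 0.25 under the example weights.
score = weighted_objective(0.5, 0.25, {"makespan": 0.8, "energy": 0.2})

print(round(estimate, 6), round(score, 6))  # 17.0 0.45
```

Under this interpretation, a higher adaptation_weight makes estimates follow recent runs more closely, and shifting weight from makespan to energy makes the solver favor lower-energy placements at the cost of a longer schedule.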

Example Snakefile

See example_snakefile.py in this repo. A minimal snippet:

configfile: "scheduler_config.yaml"

rule all:
    input: "results/output.txt"

rule demo:
    output: "results/output.txt"
    threads: 2
    resources:
        mem_mb=1024,
        runtime=5
    params:
        job_specification={
            "features": ["cpu"],
            "resources": {"input_size_mb": 10, "output_size_mb": 5},
            "properties": {"cpu_flops": 1e9}
        }
    shell:
        "echo Hello > {output}"

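The job_specification above pairs naturally with the node entries in system_profile.json. As a hedged sketch of how such a match could work (the plugin's actual eligibility rules are not shown here, and `node_can_run` and the flattened job dictionaries are hypothetical), a node is eligible when it offers every requested feature and has enough cores and memory:

```python
def node_can_run(job, node):
    """True if the node offers every required feature and enough cores and memory."""
    has_features = set(job["features"]) <= set(node["features"])
    has_cores = job["threads"] <= node["resources"]["cores"]
    has_memory = job["mem_mb"] <= node["resources"]["memory_mb"]
    return has_features and has_cores and has_memory


# Node taken from the system_profile.json example.
node = {
    "resources": {"cores": 8, "memory_mb": 32768, "gpu_count": 0},
    "features": ["cpu", "x86_64", "avx2"],
}

# The demo rule (threads: 2, mem_mb=1024, features ["cpu"]) and a hypothetical GPU job.
demo_job = {"threads": 2, "mem_mb": 1024, "features": ["cpu"]}
gpu_job = {"threads": 2, "mem_mb": 1024, "features": ["gpu"]}

print(node_can_run(demo_job, node))  # True
print(node_can_run(gpu_job, node))   # False
```
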
Testing

Run tests with:

pytest tests/

Include additional tests in tests/ following the existing patterns.


Packaging & Publishing

  • Bump version in setup.py.
  • Tag a release:

    git tag v0.1.0
    git push --tags

  • Build and upload:

    python3 -m build
    twine upload dist/*

Contributing

  1. Fork and clone the repo
  2. Create a branch: git checkout -b feature/new-flag
  3. Implement changes and add tests
  4. Ensure all tests pass: pytest
  5. Submit a pull request

Please follow Conventional Commits for commit messages.


License

This project is licensed under the MIT License.
