nshrunner

nshrunner is a Python library that provides a unified way to run functions in different environments, such as local development machines, cloud VMs, and SLURM clusters. It was created to simplify running ML training jobs across multiple machines and environments.

Motivation

ML training jobs often have to run on different machines and environments, and managing the specifics of each one is tedious. nshrunner addresses this by exposing a consistent interface, so the same function can be run on any supported environment without handling the details of each environment by hand.

Features

  • Supports running functions locally, on SLURM clusters, and in GNU Screen sessions
  • Provides a unified interface for running functions across different environments
  • Allows for easy configuration of job options, such as resource requirements and environment variables
  • Supports snapshotting the environment to ensure reproducibility, using the nshsnap library
  • Provides utilities for logging, seeding, and signal handling

Installation

nshrunner can be installed using pip:

pip install nshrunner

Usage

Here's a simple example showing the different ways to run a function:

import nshrunner as R

def train_model(batch_size: int, learning_rate: float):
    # Training logic here
    return {"accuracy": 0.95}

# Define runs with different hyperparameters
runs = [
    (32, 0.001),  # (batch_size, learning_rate)
    (64, 0.0005),
]

# Run locally
results = R.run_local(train_model, runs)

# Run in a GNU Screen session
R.submit_screen(
    train_model,
    runs,
    screen={
        "name": "training",
        "logging": {
            "output_file": "logs/output.log",
            "error_file": "logs/error.log"
        },
        "attach": False  # Run detached
    }
)

# Run on SLURM
R.submit_slurm(
    train_model,
    runs,
    slurm={
        "name": "training",
        "partition": "gpu",
        "resources": {
            "nodes": 1,
            "cpus": 4,
            "gpus": 1,
            "memory_gb": 32,
            "time": "12:00:00"
        },
        "output_dir": "logs"
    }
)
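
To make the calling convention above concrete, here is a minimal pure-Python sketch of what running a list of argument tuples locally amounts to. `run_all_locally` is a hypothetical helper written for illustration only; it is not nshrunner's implementation, and the assumption that the result is one value per run, in order, is ours rather than something the library documents here.

```python
from typing import Any, Callable, Sequence


def train_model(batch_size: int, learning_rate: float) -> dict:
    # Stand-in for real training logic; echoes its inputs so runs are distinguishable.
    return {"batch_size": batch_size, "learning_rate": learning_rate}


def run_all_locally(fn: Callable[..., Any], runs: Sequence[tuple]) -> list:
    # Hypothetical helper (not part of nshrunner):
    # call fn once per argument tuple, in order, and collect the return values.
    return [fn(*args) for args in runs]


runs = [
    (32, 0.001),  # (batch_size, learning_rate)
    (64, 0.0005),
]
results = run_all_locally(train_model, runs)
print(results)
```

Each tuple in `runs` is unpacked into positional arguments, which is why the order of values must match the function signature.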

The library provides a consistent interface across different execution environments while handling the complexities of:

  • Job submission and management
  • Resource allocation
  • Environment setup
  • Output logging
  • Error handling
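
As an illustration of the resource-allocation step, the sketch below turns options shaped like the `slurm` dictionary above into standard `#SBATCH` directives. The function `sbatch_header` and the mapping it uses are assumptions made for this example; nshrunner may generate a different script or use different flags (for instance, some clusters prefer `--gpus` over `--gres`).

```python
def sbatch_header(name: str, partition: str, resources: dict, output_dir: str) -> str:
    # Hypothetical mapping from the option names used above to sbatch directives.
    # This is an illustration, not the script nshrunner actually writes.
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={name}",
        f"#SBATCH --partition={partition}",
        f"#SBATCH --nodes={resources['nodes']}",
        f"#SBATCH --cpus-per-task={resources['cpus']}",
        f"#SBATCH --gres=gpu:{resources['gpus']}",
        f"#SBATCH --mem={resources['memory_gb']}G",
        f"#SBATCH --time={resources['time']}",
        # %j is replaced by SLURM with the job ID.
        f"#SBATCH --output={output_dir}/%j.out",
    ]
    return "\n".join(lines)


header = sbatch_header(
    name="training",
    partition="gpu",
    resources={"nodes": 1, "cpus": 4, "gpus": 1, "memory_gb": 32, "time": "12:00:00"},
    output_dir="logs",
)
print(header)
```

Keeping the options in a plain dictionary means the same values can be logged, stored alongside results, or reused for another submission.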

For more advanced usage, you can pass additional options, such as a working directory, an environment snapshot, and random seeds:

# Configure environment snapshot for reproducibility
R.submit_slurm(
    train_model,
    runs,
    runner={
        "working_dir": "experiments",
        "snapshot": True,  # Snapshot code and dependencies
        "seed": {"seed": 42}  # Set random seeds
    },
    slurm={...}
)
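
The purpose of the `seed` option can be shown with a small standard-library sketch: seeding the random number generator before a run makes its random draws repeatable. `seed_everything` is a hypothetical helper for this example and seeds only Python's built-in `random` module; what nshrunner's own seeding utility covers (for example NumPy or PyTorch generators) is not specified here.

```python
import random


def seed_everything(seed: int) -> None:
    # Hypothetical helper: seeds only Python's built-in RNG.
    # A real training setup would usually seed its numerical libraries as well.
    random.seed(seed)


def sample(seed: int) -> list:
    # Seed, then draw three pseudo-random numbers.
    seed_everything(seed)
    return [random.random() for _ in range(3)]


first = sample(42)
second = sample(42)
other = sample(7)

print(first == second)  # same seed, same draws
print(first == other)   # different seed, different draws
```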

Contributing

Contributions are welcome! For feature requests, bug reports, or questions, please open an issue on GitHub. If you'd like to contribute code, please submit a pull request with your changes.

License

nshrunner is released under the MIT License. See LICENSE for more information.
