
dstack

ML workflows as code

The easiest way to define ML workflows and run them on any cloud platform


What is dstack?

dstack makes it very easy to define ML workflows and run them on any cloud platform. It provisions infrastructure, manages data, and monitors usage for you.

Ideal for processing data, training models, running apps, and other ML development tasks.

Install the CLI

Use pip to install dstack:

pip install dstack

Define workflows

Define ML workflows, their output artifacts, hardware requirements, and dependencies via YAML.

workflows:
  - name: train-mnist
    provider: bash
    commands:
      - pip install torchvision pytorch-lightning tensorboard
      - python examples/mnist/train_mnist.py
    artifacts:
      - path: ./lightning_logs
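Workflows can also build on each other's artifacts. As a sketch (the `deps` syntax below is an assumption based on this version's YAML format, not confirmed by this page; consult the docs for the exact schema), a follow-up workflow might reuse the logs produced by `train-mnist`:

```yaml
workflows:
  - name: train-mnist
    provider: bash
    commands:
      - pip install torchvision pytorch-lightning tensorboard
      - python examples/mnist/train_mnist.py
    artifacts:
      - path: ./lightning_logs

  # Hypothetical follow-up workflow; the `deps` field is an assumption,
  # not confirmed by this page - check the dstack docs for the exact schema.
  - name: inspect-logs
    provider: bash
    deps:
      - workflow: train-mnist
    commands:
      - ls ./lightning_logs
```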

Run locally

By default, workflows run locally on your machine.

dstack run train-mnist

RUN        WORKFLOW     SUBMITTED  STATUS     TAG  BACKENDS
penguin-1  train-mnist  now        Submitted       local

Provisioning... It may take up to a minute. ✓

To interrupt, press Ctrl+C.

GPU available: True, used: True

Epoch 1: [00:03<00:00, 280.17it/s, loss=1.35, v_num=0]

Run remotely

To run workflows remotely in a configured cloud, you need the Hub application. It can be installed on a dedicated server for team use or directly on your local machine.

Start the Hub application

To start the Hub application, use this command:

$ dstack hub start

The hub is available at http://127.0.0.1:3000?token=b934d226-e24a-4eab-a284-eb92b353b10f

To log in as an administrator, visit the URL in the output.

Create a project

Go ahead and create a new project.

Choose a backend type (such as AWS or GCP), provide cloud credentials, and specify settings like artifact storage bucket and the region where to run workflows.

Configure the CLI

Copy the CLI command from the project settings and execute it in your terminal to configure the project as a remote.

$ dstack config hub --url http://127.0.0.1:3000 \
  --project my-awesome-project \
  --token b934d226-e24a-4eab-a284-eb92b353b10f

Now you can run workflows remotely in the created project by adding the --remote flag to the dstack run command and requesting the hardware resources you need (GPU, memory, interruptible instances, etc.).

dstack run train-mnist --remote --gpu 1

RUN       WORKFLOW     SUBMITTED  STATUS     TAG  BACKENDS
turtle-1  train-mnist  now        Submitted       aws

Provisioning... It may take up to a minute. ✓

To interrupt, press Ctrl+C.

GPU available: True, used: True

Epoch 1: [00:03<00:00, 280.17it/s, loss=1.35, v_num=0]

The command automatically provisions the required resources in the configured cloud when the workflow starts and tears them down when it finishes.
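Hardware requirements can also live in the workflow file instead of on the command line. The `resources` block below is a hedged sketch of how that might look in this version's YAML schema; the exact field names are an assumption, so check the docs before relying on them:

```yaml
workflows:
  - name: train-mnist
    provider: bash
    commands:
      - pip install torchvision pytorch-lightning tensorboard
      - python examples/mnist/train_mnist.py
    artifacts:
      - path: ./lightning_logs
    # Assumed schema: request one GPU for remote runs.
    # Field names may differ in your dstack version.
    resources:
      gpu: 1
```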

More information

For additional information and examples, see the dstack documentation, tutorials, and blog.

License

Mozilla Public License 2.0

Project details


Release history

This version

0.8

Download files

Download the file for your platform.

Source Distribution

dstack-0.8.tar.gz (120.3 kB)

Built Distribution

dstack-0.8-py3-none-any.whl (13.7 MB)
