Easy-to-run ML workflows on any cloud
Reason this release was yanked:
A critical bug: AttributeError: module 'dstack.version' has no attribute 'miniforge_image'
Project description
A better way to run ML workflows
Define ML workflows as code and run via CLI. Use any cloud. Collaborate within teams.
Docs • Installation • Quick start • Usage
What is dstack?
dstack allows you to define machine learning workflows as code and run them on any cloud. It helps you set up a reproducible environment, reuse artifacts, and launch interactive development environments and apps.
Installation
Use pip to install dstack:
pip install dstack --upgrade
Configure a remote
To run workflows remotely (e.g. in a configured cloud account), configure a remote using the dstack config command.
dstack config
? Choose backend. Use arrows to move, type to filter
> [aws]
[gcp]
[hub]
If you intend to run remote workflows directly in the cloud using local cloud credentials, choose aws or gcp. Refer to the AWS and GCP docs respectively for details.
If you would like to manage cloud credentials, users, and other settings centrally via a user interface, choose hub.
The hub remote is currently in an experimental phase. If you are interested in trying it out, please contact us via Slack.
Define workflows
Define ML workflows, their output artifacts, hardware requirements, and dependencies via YAML.
workflows:
- name: mnist-data
provider: bash
commands:
- pip install torchvision
- python mnist/mnist_data.py
artifacts:
- path: ./data
- name: train-mnist
provider: bash
deps:
- workflow: mnist-data
commands:
- pip install torchvision pytorch-lightning tensorboard
- python mnist/train_mnist.py
artifacts:
- path: ./lightning_logs
YAML eliminates the need to modify code in your scripts, giving you the freedom to choose frameworks, experiment trackers, and cloud providers.
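To make the deps relationship in the YAML above concrete, here is a small, hypothetical Python sketch (not part of dstack) that represents the two workflows as dicts and checks that every deps entry refers to a workflow defined in the same file:

```python
# Hypothetical sketch: the workflows from the YAML above, as plain dicts.
workflows = [
    {"name": "mnist-data", "provider": "bash"},
    {"name": "train-mnist", "provider": "bash",
     "deps": [{"workflow": "mnist-data"}]},
]

def undefined_deps(workflows):
    """Return names referenced in deps that no workflow in the file defines."""
    defined = {w["name"] for w in workflows}
    return [d["workflow"]
            for w in workflows
            for d in w.get("deps", [])
            if d["workflow"] not in defined]

print(undefined_deps(workflows))  # prints []
```

Here train-mnist depends on mnist-data, so at run time dstack makes the ./data artifact of mnist-data available to the training workflow.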
Providers
dstack supports multiple providers that enable you to set up the environment, run scripts, launch interactive dev environments and apps, and perform many other tasks.
Run workflows
Once a workflow is defined, you can use the dstack run command to run it either locally or remotely.
Run locally
By default, workflows run locally on your machine.
dstack run mnist-data
RUN WORKFLOW SUBMITTED STATUS TAG BACKENDS
penguin-1 mnist-data now Submitted local
Provisioning... It may take up to a minute. ✓
To interrupt, press Ctrl+C.
Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz
The artifacts from local workflows are also stored and can be reused in other local workflows.
Run remotely
To run a workflow remotely (e.g. in a configured cloud account), add the --remote flag to the dstack run command:
dstack run mnist-data --remote
RUN WORKFLOW SUBMITTED STATUS TAG BACKENDS
mangust-1 mnist-data now Submitted aws
Provisioning... It may take up to a minute. ✓
To interrupt, press Ctrl+C.
Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz
The output artifacts from remote workflows are also stored remotely and can be reused by other remote workflows.
The necessary hardware resources can be configured either via YAML or through arguments to the dstack run command, such as --gpu and --gpu-name.
dstack run train-mnist --remote --gpu 1
RUN WORKFLOW SUBMITTED STATUS TAG BACKENDS
turtle-1 train-mnist now Submitted aws
Provisioning... It may take up to a minute. ✓
To interrupt, press Ctrl+C.
GPU available: True, used: True
Epoch 1: [00:03<00:00, 280.17it/s, loss=1.35, v_num=0]
Upon running a workflow remotely, dstack automatically creates resources in the configured cloud account and destroys them once the workflow is complete.
Ports
When a workflow uses ports to host interactive dev environments or applications, the dstack run command automatically forwards these ports to your local machine, allowing you to access them.
Refer to Providers and Apps for the details.
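As a sketch of how ports might be used, the hypothetical workflow below serves TensorBoard on a port that dstack forwards to your local machine. The ports field and the PORT_0 environment variable are assumptions for illustration; consult the Providers docs for the exact syntax in your version:

```yaml
workflows:
  - name: tensorboard
    provider: bash
    deps:
      - workflow: train-mnist   # mounts the ./lightning_logs artifact
    ports: 1                    # hypothetical: ask dstack for one forwarded port
    commands:
      - pip install tensorboard
      - tensorboard --port $PORT_0 --logdir ./lightning_logs
```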
More information
For additional information and examples, refer to the dstack docs.
Licence
Project details
Release history
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
File details
Details for the file dstack-0.2.4.tar.gz.
File metadata
- Download URL: dstack-0.2.4.tar.gz
- Upload date:
- Size: 109.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.9.16
File hashes
Algorithm | Hash digest
---|---
SHA256 | 7d8ab5328b96105049fba37fba4b9eb0c87f6cf05ba8cfa624d37c3f38a2c4df
MD5 | 7f8f33565c708bc9268481b8e0dc0530
BLAKE2b-256 | 3928e1871e35a3347f028126550bd1af1fd229b81772dde03921d8aa3eb0ef00
File details
Details for the file dstack-0.2.4-py3-none-any.whl.
File metadata
- Download URL: dstack-0.2.4-py3-none-any.whl
- Upload date:
- Size: 13.6 MB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.9.16
File hashes
Algorithm | Hash digest
---|---
SHA256 | 945d8e15d7c513660cc671afc94f045d786bce09c4cd4f4122a8dba485ca05d9
MD5 | a788808262d90f703d0b60602e30057e
BLAKE2b-256 | 3c0b62878908af12a485d1cf0b914b3cf9d236ae0d1e9d64daf383ec63844115