An open-source tool for teams to build reproducible ML workflows
Easy-to-run & reproducible ML pipelines on any cloud
dstack is an open-source tool that streamlines the process of creating reproducible ML training pipelines that are
independent of any specific vendor. It also promotes collaboration on data and models.
Highlighted features
- Define ML pipelines via YAML and run either locally or on any cloud
- Have cloud instances created and destroyed automatically
- Use spot instances efficiently to save costs
- Save artifacts and reuse them conveniently across workflows
- Use interactive dev environments, such as notebooks or IDEs
- Use any frameworks or experiment trackers. No code changes are required.
- No need for Kubernetes or custom Docker images
Installation
Use pip to install the dstack CLI:
pip install dstack --upgrade
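To confirm that the install succeeded, one option is to ask Python for the installed package version. This is a minimal sketch that uses only the standard library; the helper name `installed_version` is hypothetical and not part of dstack.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version("dstack")
    print(f"dstack version: {version}" if version else "dstack is not installed")
```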
Example
Here's an example from the Quick start.
workflows:
  - name: mnist-data
    provider: bash
    commands:
      - pip install torchvision
      - python mnist/mnist_data.py
    artifacts:
      - path: ./data
  - name: train-mnist
    provider: bash
    deps:
      - workflow: mnist-data
    commands:
      - pip install torchvision pytorch-lightning tensorboard
      - python mnist/train_mnist.py
    artifacts:
      - path: ./lightning_logs
With workflows defined this way, dstack can run them either locally or in a configured cloud
account, and a workflow can reuse the artifacts of another. Here, train-mnist declares mnist-data under deps, so it can use the ./data artifact that mnist-data saves.
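The deps relation in the example implies an ordering: mnist-data has to produce its artifacts before train-mnist can use them. The sketch below illustrates that ordering with a plain Python structure that mirrors only the `name` and `deps` fields of the YAML. It is not dstack's implementation, and the `run_order` helper is hypothetical. It assumes Python 3.9 or later for `graphlib`.

```python
from graphlib import TopologicalSorter

# Plain-dict mirror of the example workflows (only the fields needed here).
workflows = [
    {"name": "mnist-data", "deps": []},
    {"name": "train-mnist", "deps": ["mnist-data"]},
]


def run_order(workflows):
    """Return workflow names so that every dependency precedes its dependents."""
    # TopologicalSorter expects a mapping of node -> set of predecessors.
    graph = {wf["name"]: set(wf.get("deps", [])) for wf in workflows}
    return list(TopologicalSorter(graph).static_order())


if __name__ == "__main__":
    print(run_order(workflows))  # ['mnist-data', 'train-mnist']
```

A cyclic set of deps would make `static_order()` raise `graphlib.CycleError`, which is a convenient way to catch a misconfigured workflow file early.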
Run locally
Use the dstack CLI to run workflows locally:
dstack run mnist-data
Configure a remote
To run workflows remotely (e.g. in the cloud) or share artifacts outside your machine,
you must configure your remote settings using the dstack config command:
dstack config
This command prompts you for an AWS profile (used for AWS credentials), an AWS region (where workflows will run), an S3 bucket (to store remote artifacts and metadata), and an EC2 subnet.
AWS profile: default
AWS region: eu-west-1
S3 bucket: dstack-142421590066-eu-west-1
EC2 subnet: none
For more details on how to configure a remote, check the installation guide.
Run remotely
Once a remote is configured, use the --remote flag with the dstack run command to run the
workflow in the configured cloud:
dstack run mnist-data --remote
You can specify the resources a workflow requires either via the resources property in YAML
or via arguments to the dstack run command, such as --gpu and --gpu-name:
dstack run train-mnist --remote --gpu 1
When you run a workflow remotely, dstack automatically creates resources in the configured cloud,
and releases them once the workflow is finished.
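When runs are launched from a script, it can help to assemble the command line in one place. The sketch below builds an argument list using only the flags mentioned above (--remote, --gpu, --gpu-name). The helper `build_run_command` is hypothetical, and treating --gpu-name as a flag that takes a value is an assumption.

```python
from typing import List, Optional


def build_run_command(workflow: str, remote: bool = False,
                      gpu: Optional[int] = None,
                      gpu_name: Optional[str] = None) -> List[str]:
    """Build an argv list for `dstack run` (hypothetical helper)."""
    cmd = ["dstack", "run", workflow]
    if remote:
        cmd.append("--remote")
    if gpu is not None:
        cmd += ["--gpu", str(gpu)]
    if gpu_name is not None:
        # Assumption: --gpu-name is followed by the GPU name as its value.
        cmd += ["--gpu-name", gpu_name]
    return cmd


if __name__ == "__main__":
    # Prints: dstack run train-mnist --remote --gpu 1
    print(" ".join(build_run_command("train-mnist", remote=True, gpu=1)))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` to launch the run from Python.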
More information
For additional information and examples, see the following links:
Licence
File details
Details for the file dstack-0.1.2.tar.gz.
File metadata
- Download URL: dstack-0.1.2.tar.gz
- Upload date:
- Size: 80.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.9.16
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 70f5fbb22e52e2f9533cf07808cb53794ac215904065128ac2706e6765592172 |
| MD5 | 72802cddf740e72ef6eb161960d5acaa |
| BLAKE2b-256 | d71c41a171ebe14bfa5d2084e75baa580bb6800732c6c722ec50ed3502827264 |
File details
Details for the file dstack-0.1.2-py3-none-any.whl.
File metadata
- Download URL: dstack-0.1.2-py3-none-any.whl
- Upload date:
- Size: 13.6 MB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.9.16
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 0aaa4e5d1fe7a6c770fe2025c9849573f4bebd0bc5e31277172394f2cdb5dfec |
| MD5 | 954f5c9328ae1481537fb042344bb03b |
| BLAKE2b-256 | d308e83ac7d24aec2a1bd7bb2a0ddf74e50c4c7cb43a2ef0bf4f7ecde2a0dcb2 |
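The SHA256 digests listed for each file can be used to check a manually downloaded copy. The following is a minimal sketch using Python's standard `hashlib`; the helper names `sha256_of_file` and `matches` are hypothetical, and the local file path is whatever location you downloaded the file to.

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() with a sentinel stops once read() returns an empty bytes object.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path: str, expected_hex: str) -> bool:
    """Return True if the file's SHA256 digest equals the expected hex string."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, `matches("dstack-0.1.2.tar.gz", "70f5fbb22e52e2f9533cf07808cb53794ac215904065128ac2706e6765592172")` should return `True` for an intact copy of the source distribution saved in the current directory.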