
An open-source tool for teams to build reproducible ML workflows


dstack

Easy-to-run & reproducible ML pipelines on any cloud

dstack is an open-source tool that streamlines the process of creating reproducible ML training pipelines that are independent of any specific vendor.


dstack also promotes collaboration on data and models.

Highlighted features

  • Define ML pipelines via YAML and run either locally or on any cloud
  • Have cloud instances created and destroyed automatically
  • Use spot instances efficiently to save costs
  • Save artifacts and reuse them conveniently across workflows
  • Use interactive dev environments, such as notebooks or IDEs
  • Use any framework or experiment tracker without changing your code
  • Run without Kubernetes or custom Docker images

Installation

Use pip to install the dstack CLI:

pip install dstack --upgrade

Example

Here's an example from the Quick start.

workflows:
  - name: mnist-data
    provider: bash
    commands:
      - pip install torchvision
      - python mnist/mnist_data.py
    artifacts:
      - path: ./data

  - name: train-mnist
    provider: bash
    deps:
      - workflow: mnist-data
    commands:
      - pip install torchvision pytorch-lightning tensorboard
      - python mnist/train_mnist.py
    artifacts:
      - path: ./lightning_logs

With workflows defined this way, dstack can run them either locally or in a configured cloud account, and artifacts saved by one workflow can be reused by another.
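To make the role of deps concrete, here is a minimal Python sketch that derives a valid execution order for the two workflows above. It mirrors the YAML as plain Python data so it needs no third-party parser, and it uses the standard library's graphlib module (Python 3.9+). The execution_order helper is hypothetical and only illustrates the dependency relationship; it is not dstack's implementation, and the source does not say how dstack itself resolves dependencies.

```python
from graphlib import TopologicalSorter

# The workflow definitions above, mirrored as Python data (commands omitted).
WORKFLOWS = [
    {"name": "mnist-data", "artifacts": ["./data"]},
    {"name": "train-mnist", "deps": ["mnist-data"], "artifacts": ["./lightning_logs"]},
]


def execution_order(workflows):
    """Return workflow names ordered so that each one follows its deps."""
    # TopologicalSorter expects a mapping of node -> set of predecessors.
    graph = {wf["name"]: set(wf.get("deps", [])) for wf in workflows}
    return list(TopologicalSorter(graph).static_order())


if __name__ == "__main__":
    print(execution_order(WORKFLOWS))
```

Because train-mnist lists mnist-data under deps, mnist-data always comes first in the returned order, which matches the idea that its ./data artifacts must exist before training starts.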

Run locally

Use the dstack CLI to run workflows locally:

dstack run mnist-data

Configure a remote

To run workflows remotely (e.g. in the cloud) or share artifacts outside your machine, you must configure your remote settings using the dstack config command:

dstack config

This command asks you to choose an AWS profile (used for AWS credentials), an AWS region (where workflows run), an S3 bucket (where remote artifacts and metadata are stored), and an EC2 subnet (none in the example below):

AWS profile: default
AWS region: eu-west-1
S3 bucket: dstack-142421590066-eu-west-1
EC2 subnet: none

For more details on how to configure a remote, check the installation guide.
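If you want to record or check these answers in a script, the prompt output shown above is simple enough to parse as "key: value" lines. The following is a minimal sketch; parse_remote_settings is a hypothetical helper, and nothing here reflects how dstack itself stores its configuration.

```python
def parse_remote_settings(text):
    """Parse 'Key: value' lines into a dict, mapping the literal 'none' to None."""
    settings = {}
    for line in text.strip().splitlines():
        # Split on the first colon only, so values may contain colons.
        key, _, value = line.partition(":")
        value = value.strip()
        settings[key.strip()] = None if value.lower() == "none" else value
    return settings


# The sample answers from the dstack config prompt above.
SAMPLE = """
AWS profile: default
AWS region: eu-west-1
S3 bucket: dstack-142421590066-eu-west-1
EC2 subnet: none
"""

if __name__ == "__main__":
    print(parse_remote_settings(SAMPLE))
```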

Run remotely

Once a remote is configured, use the --remote flag with the dstack run command to run the workflow in the configured cloud:

dstack run mnist-data --remote

You can specify the resources a workflow requires either via the resources property in YAML or via arguments to the dstack run command, such as --gpu and --gpu-name:

dstack run train-mnist --remote --gpu 1

When you run a workflow remotely, dstack automatically creates resources in the configured cloud, and releases them once the workflow is finished.
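When launching runs from a script, it can help to assemble the command line programmatically. The sketch below builds an argument list using only the flags mentioned above (--remote, --gpu, --gpu-name). The build_run_command helper is an illustrative assumption, not part of dstack, and the code only prints the command rather than running it.

```python
import shlex


def build_run_command(workflow, remote=False, gpu=None, gpu_name=None):
    """Build a `dstack run` argument list from the options described above."""
    cmd = ["dstack", "run", workflow]
    if remote:
        cmd.append("--remote")
    if gpu is not None:
        cmd += ["--gpu", str(gpu)]
    if gpu_name is not None:
        # The accepted GPU names are not listed in this document.
        cmd += ["--gpu-name", gpu_name]
    return cmd


if __name__ == "__main__":
    # Print the command for inspection instead of executing it.
    print(shlex.join(build_run_command("train-mnist", remote=True, gpu=1)))
```

To actually execute the command, the returned list can be passed to subprocess.run(cmd, check=True); keeping it as a list avoids shell quoting issues.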

More information

For additional information and examples, see the following links:

Licence

Mozilla Public License 2.0


This version

0.1.2

