dstack is an open-source framework for orchestrating GPU workloads and developing generative AI models across multiple clouds.
Project description
dstack is an open-source framework for orchestrating GPU workloads across multiple cloud GPU providers. It provides a simple, cloud-agnostic interface for developing and deploying generative AI models.
Latest news ✨
- [2023/09] Deploying LLMs using Python API (Example)
- [2023/09] Managed gateways (Release)
- [2023/08] Fine-tuning Llama 2 using QLoRA (Example)
- [2023/08] Deploying Stable Diffusion using FastAPI (Example)
- [2023/07] Deploying LLMs using TGI (Example)
- [2023/07] Deploying LLMs using vLLM (Example)
Installation
To use dstack, install it with pip and start the server:

```shell
pip install "dstack[all]" -U
dstack start
```
Configure clouds
Upon startup, the server sets up the default project called main. Prior to using dstack, make sure to configure clouds.
Once the server is up, you can orchestrate GPU workloads using either the CLI or Python API.
Using CLI
Define a configuration
The CLI allows you to define what you want to run as a YAML file and run it via the dstack run CLI command.
Configurations can be of three types: dev-environment, task, and service.
Dev environments
A dev environment is a virtual machine with a pre-configured IDE.
```yaml
type: dev-environment
python: "3.11" # (Optional) If not specified, your local version is used
setup: # (Optional) Executed once at the first startup
  - pip install -r requirements.txt
ide: vscode
```
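The configuration can then be launched with the dstack run command, covered below. A minimal sketch, assuming the file above is saved as .dstack.yml in the current working directory:

```shell
dstack run . -f .dstack.yml
```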
Tasks
A task can be either a batch job, such as training or fine-tuning a model, or a web application.
```yaml
type: task
python: "3.11" # (Optional) If not specified, your local version is used
ports:
  - 7860
commands:
  - pip install -r requirements.txt
  - python app.py
```
While the task is running in the cloud, the CLI forwards traffic from its ports to localhost for convenient access.
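For instance, while the task above is running, the app listening on port 7860 can be reached locally; a quick check (assuming the app responds to plain HTTP GET requests):

```shell
curl http://127.0.0.1:7860
```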
Services
A service is an application that is accessible through a public endpoint.
```yaml
type: service
port: 7860
commands:
  - pip install -r requirements.txt
  - python app.py
```
Once the service is up, dstack makes it accessible from the Internet through the gateway.
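For example, with a gateway set up on a domain you own (mydomain.com in the sample output below), the service becomes reachable at a run-specific URL; <run-name> here is a placeholder:

```shell
curl https://<run-name>.mydomain.com
```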
Run a configuration
To run a configuration, use the dstack run command followed by the working directory and the path to the configuration file.
```shell
dstack run . -f text-generation-inference/serve.dstack.yml --gpu 80GB -y

 RUN            BACKEND  INSTANCE              SPOT  PRICE  STATUS     SUBMITTED
 tasty-zebra-1  lambda   200GB, 1xA100 (80GB)  no    $1.1   Submitted  now

Provisioning...

Serving on https://tasty-zebra-1.mydomain.com
```
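A run can later be stopped by name; a sketch, assuming the CLI's stop command in your installed version (check dstack --help for the exact flags):

```shell
dstack stop tasty-zebra-1
```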
Using API
As an alternative to the CLI, you can run tasks and services programmatically via the Python API.
```python
import sys

import dstack

task = dstack.Task(
    image="ghcr.io/huggingface/text-generation-inference:latest",
    env={"MODEL_ID": "TheBloke/Llama-2-13B-chat-GPTQ"},
    commands=[
        "text-generation-launcher --trust-remote-code --quantize gptq",
    ],
    ports=["8080:80"],
)

resources = dstack.Resources(gpu=dstack.GPU(memory="20GB"))

if __name__ == "__main__":
    print("Initializing the client...")
    client = dstack.Client.from_config(repo_dir="~/dstack-examples")

    print("Submitting the run...")
    run = client.runs.submit(configuration=task, resources=resources)
    print(f"Run {run.name}: " + run.status())

    print("Attaching to the run...")
    run.attach()

    # After the endpoint is up, http://127.0.0.1:8080/health will return 200 (OK).
    try:
        for log in run.logs():
            sys.stdout.buffer.write(log)
            sys.stdout.buffer.flush()
    except KeyboardInterrupt:
        print("Aborting the run...")
        run.stop(abort=True)
    finally:
        run.detach()
```
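Assuming the script above is saved as serve_llm.py (a hypothetical name), it runs like any Python script; once the run is attached, the endpoint is forwarded to localhost:

```shell
python serve_llm.py
# Once the endpoint is up, http://127.0.0.1:8080/health returns 200 (OK)
```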
More information
For additional information and examples, see the following links:
Licence