# dstack

`dstack` is an open-source platform for training, fine-tuning, and deploying generative AI models across various cloud providers (e.g., AWS, GCP, Azure, Lambda Cloud).
## Latest news ✨
- [2023/10] Simplified cloud setup and refined API (Release)
- [2023/09] RAG with Llama Index and Weaviate (Example)
- [2023/09] Deploying LLMs using Python API (Example)
- [2023/08] Fine-tuning Llama 2 using QLoRA (Example)
- [2023/08] Deploying Stable Diffusion using FastAPI (Example)
- [2023/07] Deploying LLMs using TGI (Example)
- [2023/07] Deploying LLMs using vLLM (Example)
## Installation

Before using `dstack` through the CLI or API, set up a `dstack` server.
### Install the server

The easiest way to install the server is via `pip`:

```shell
$ pip install "dstack[all]" -U
```
### Configure clouds

If you have default AWS, GCP, or Azure credentials on your machine, `dstack` will pick them up automatically. Otherwise, you need to manually specify the cloud credentials in `~/.dstack/server/config.yml`.
For further cloud configuration details, refer to Clouds.
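For reference, a minimal `~/.dstack/server/config.yml` might look like the sketch below. This is an illustrative example only — the project name and credentials are placeholders, and the exact schema can differ between versions, so verify it against the Clouds documentation:

```yaml
# Hypothetical sketch of ~/.dstack/server/config.yml with AWS access-key
# credentials; check the Clouds docs for the schema your version expects.
projects:
  - name: main
    backends:
      - type: aws
        creds:
          type: access_key
          access_key: <your AWS access key>
          secret_key: <your AWS secret key>
```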
### Start the server

To start the server, use the `dstack server` command:

```shell
$ dstack server

The server is running at http://127.0.0.1:3000/.
The admin token is xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
```
## Using the CLI

### Define a configuration

The CLI allows you to define what you want to run as a YAML file and run it via the `dstack run` command.
#### Dev environments
A dev environment is a cloud instance pre-configured with an IDE.
```yaml
type: dev-environment
python: "3.11" # (Optional) If not specified, your local version is used
ide: vscode
```
#### Tasks
A task can be a batch job or a web app.
```yaml
type: task
python: "3.11" # (Optional) If not specified, your local version is used
ports:
  - 6006
commands:
  - pip install -r requirements.txt
  - tensorboard --logdir ./logs &
  - python train.py
```
If you run a task, `dstack` forwards the configured ports to `localhost`.
#### Services
A service is a web app accessible from the Internet.
```yaml
type: service
image: ghcr.io/huggingface/text-generation-inference:latest
env:
  - MODEL_ID=TheBloke/Llama-2-13B-chat-GPTQ
port: 80
commands:
  - text-generation-launcher --hostname 0.0.0.0 --port 80 --trust-remote-code
```
Note: Before you can run a service, you have to set up a gateway. Running a service will make it available at `https://<run-name>.<your-domain>`, using the domain configured for the gateway.
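Once the gateway is set up and the service is running, it can be queried like any text-generation-inference endpoint. A hypothetical request is sketched below — the run name and domain are placeholders, and `/generate` is TGI's standard inference route rather than anything dstack-specific:

```shell
# Placeholders: substitute the actual run name and gateway domain.
curl https://<run-name>.<your-domain>/generate \
    -X POST \
    -H 'Content-Type: application/json' \
    -d '{"inputs": "What is deep learning?", "parameters": {"max_new_tokens": 20}}'
```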
### Run a configuration

To run a configuration, use the `dstack run` command followed by the working directory and the path to the configuration file.

```shell
$ dstack run . -f text-generation-inference/serve.dstack.yml --gpu 80GB -y

 RUN            BACKEND  INSTANCE              SPOT  PRICE  STATUS     SUBMITTED
 tasty-zebra-1  lambda   200GB, 1xA100 (80GB)  no    $1.1   Submitted  now

Provisioning...

Serving on https://tasty-zebra-1.mydomain.com
```
## Using the API
As an alternative to the CLI, you can run tasks and services and manage runs programmatically.
### Create a client

First, create an instance of `dstack.api.Client`:

```python
from dstack.api import Client, ClientError

try:
    client = Client.from_config()
except ClientError:
    print("Can't connect to the server")
```
### Submit a run

Here's an example of how to run a task:

```python
from dstack.api import Task, Resources, GPU

task = Task(
    image="ghcr.io/huggingface/text-generation-inference:latest",
    env={"MODEL_ID": "TheBloke/Llama-2-13B-chat-GPTQ"},
    commands=[
        "text-generation-launcher --trust-remote-code --quantize gptq",
    ],
    ports=["80"],
)

run = client.runs.submit(
    run_name="my-awesome-run",
    configuration=task,
    resources=Resources(gpu=GPU(memory="24GB")),
)
```
To forward the configured ports to `localhost`, use the `attach` and `detach` methods on the run.

```python
try:
    run.attach()
    # ...
except KeyboardInterrupt:
    run.stop(abort=True)
finally:
    run.detach()
```
## More information

For additional information and examples, see the dstack documentation.