Skip to main content

Personal GPU launcher for on-demand LLM inference on Vast.ai

Project description

vasted

CI License: MIT

vasted is a CLI that launches on-demand Vast.ai GPU workers for llama.cpp GGUF inference and exposes a stable OpenAI-compatible /v1 endpoint.

Built by deeflect.com · Follow on X: x.com/deeflectcom

Demo

vasted demo

Why vasted

  • Stable client endpoint while worker URLs rotate.
  • Setup wizard for local machine and VPS deployments.
  • Non-interactive automation mode for agents/CI.
  • OpenAI-compatible proxy for tools that expect /v1 APIs.
  • Session usage and cost tracking.
  • Optional Telegram bot control commands.

Requirements

  • Python 3.12+
  • uv
  • Vast.ai account + API key
  • Optional: Telegram bot token (telegram extra)

Install

Use from source (recommended while iterating)

git clone https://github.com/deeflect/vasted.git
cd vasted
uv sync --extra dev

Run CLI commands from the repo:

uv run vasted --help

Install as a tool

uv tool install "git+https://github.com/deeflect/vasted.git"

Upgrade later:

uv tool upgrade vasted

Quick Start

uv run vasted setup
uv run vasted up
uv run vasted status --verbose

Client connection values after setup:

  • Base URL: http://<host>:<port>/v1
  • Auth header: Authorization: Bearer <token>

When proxy_host is 0.0.0.0, use your real machine/VPS IP or domain in clients.

Automation / Unattended Mode

Use non-interactive commands to avoid prompts:

uv run vasted setup --non-interactive \
  --vast-api-key "$VASTED_API_KEY" \
  --bearer-token "$VASTED_BEARER_TOKEN" \
  --client openclaw \
  --deployment-mode local_pc \
  --model qwen3-coder-30b \
  --quality balanced \
  --gpu-mode auto

uv run vasted up --non-interactive --yes --jinja --model qwen3-coder-30b --quality balanced --gpu-mode auto --no-serve
uv run vasted status --verbose
uv run vasted usage
uv run vasted down --force

Environment variables accepted by setup --non-interactive:

  • VASTED_API_KEY
  • VASTED_BEARER_TOKEN
  • VASTED_CLIENT (openclaw, opencode, custom)
  • VASTED_LLAMA_JINJA (true/false)
  • VASTED_MODEL, VASTED_QUALITY, VASTED_GPU_MODE, VASTED_GPU_PRESET
  • VASTED_DEPLOYMENT_MODE, VASTED_PROXY_HOST, VASTED_PROXY_PORT, VASTED_PUBLIC_HOST

Client Profiles and Jinja Behavior

setup supports client presets that define default llama.cpp --jinja behavior:

  • --client openclaw: jinja on by default
  • --client opencode: jinja off by default
  • --client custom: keep/manual behavior

Per launch override is still available:

uv run vasted up --jinja
uv run vasted up --no-jinja

Command Reference

vasted setup [--non-interactive] [--manual] [--client openclaw|opencode|custom]
vasted up [--model ...] [--quality ...] [--gpu-mode auto|manual] [--gpu-preset ...] [--profile ...] [--max-price ...] [--jinja|--no-jinja] [--yes] [--non-interactive] [--serve|--no-serve]
vasted down [--force]
vasted status [--verbose]
vasted logs [--instance-id N] [--tail N]
vasted usage
vasted token show [--full]
vasted token rotate
vasted rotate-token
vasted config show
vasted profile list|add|use|remove
vasted completions <bash|zsh|fish>

Telegram Bot (Optional)

Install telegram extra and run:

uv sync --extra telegram
uv run python bot.py

Development

uv run ruff check .
uv run mypy app tests bot.py
uv run pytest -q

Project Layout

  • app/commands/*: CLI command handlers
  • app/service.py: worker lifecycle + launch policy
  • app/proxy.py: OpenAI-compatible reverse proxy
  • app/vast.py: Vast API integration + startup script generation
  • app/usage.py: token/time/cost accounting
  • app/user_config.py: persistent config + keyring integration
  • app/state.py: runtime state persistence
  • bot.py: optional Telegram control plane

Security

  • Keep Vast API keys and bearer tokens private.
  • Prefer localhost binds unless remote access is required.
  • See SECURITY.md for disclosure policy.

Contributing

See CONTRIBUTING.md and run the validation commands before opening a PR.

License

MIT — see LICENSE.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

vasted-0.1.0.tar.gz (21.6 MB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

vasted-0.1.0-py3-none-any.whl (51.8 kB view details)

Uploaded Python 3

File details

Details for the file vasted-0.1.0.tar.gz.

File metadata

  • Download URL: vasted-0.1.0.tar.gz
  • Upload date:
  • Size: 21.6 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for vasted-0.1.0.tar.gz
Algorithm Hash digest
SHA256 3e1c45d47e371ad4f646fea79f863b04342aad2d1fc54948e25c618eb63dfdb7
MD5 1a5ec3f3b11566c9feea6b46daa247fc
BLAKE2b-256 f0dc01fd428c01390cc4d77918d52b7d093c5158a50f488cacbb392ef7208fbd

See more details on using hashes here.

Provenance

The following attestation bundles were made for vasted-0.1.0.tar.gz:

Publisher: publish.yml on deeflect/vasted

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file vasted-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: vasted-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 51.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for vasted-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 1e85287fec1d35a6ba8a1a42a55a7ec1fa6ba9fbfdadf1984999e21841802354
MD5 03d0f4d9bd1dde7c11a1e642d6715fce
BLAKE2b-256 ce251c01e0156b0aee1328aa3be66a5d5a8861f780f72c0ca8f83a8adb33d3c3

See more details on using hashes here.

Provenance

The following attestation bundles were made for vasted-0.1.0-py3-none-any.whl:

Publisher: publish.yml on deeflect/vasted

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page