Personal GPU launcher for on-demand LLM inference on Vast.ai
Project description
vasted
vasted is a CLI that launches on-demand Vast.ai GPU workers for llama.cpp GGUF inference and exposes a stable OpenAI-compatible /v1 endpoint.
Built by deeflect.com · Follow on X: x.com/deeflectcom
Demo
Why vasted
- Stable client endpoint while worker URLs rotate.
- Setup wizard for local machine and VPS deployments.
- Non-interactive automation mode for agents/CI.
- OpenAI-compatible proxy for tools that expect
/v1APIs. - Session usage and cost tracking.
- Optional Telegram bot control commands.
Requirements
- Python
3.12+ uv- Vast.ai account + API key
- Optional: Telegram bot token (
telegramextra)
Install
Use from source (recommended while iterating)
git clone https://github.com/deeflect/vasted.git
cd vasted
uv sync --extra dev
Run CLI commands from the repo:
uv run vasted --help
Install as a tool
uv tool install "git+https://github.com/deeflect/vasted.git"
Upgrade later:
uv tool upgrade vasted
Quick Start
uv run vasted setup
uv run vasted up
uv run vasted status --verbose
Client connection values after setup:
- Base URL:
http://<host>:<port>/v1 - Auth header:
Authorization: Bearer <token>
When proxy_host is 0.0.0.0, use your real machine/VPS IP or domain in clients.
Automation / Unattended Mode
Use non-interactive commands to avoid prompts:
uv run vasted setup --non-interactive \
--vast-api-key "$VASTED_API_KEY" \
--bearer-token "$VASTED_BEARER_TOKEN" \
--client openclaw \
--deployment-mode local_pc \
--model qwen3-coder-30b \
--quality balanced \
--gpu-mode auto
uv run vasted up --non-interactive --yes --jinja --model qwen3-coder-30b --quality balanced --gpu-mode auto --no-serve
uv run vasted status --verbose
uv run vasted usage
uv run vasted down --force
Environment variables accepted by setup --non-interactive:
VASTED_API_KEYVASTED_BEARER_TOKENVASTED_CLIENT(openclaw,opencode,custom)VASTED_LLAMA_JINJA(true/false)VASTED_MODEL,VASTED_QUALITY,VASTED_GPU_MODE,VASTED_GPU_PRESETVASTED_DEPLOYMENT_MODE,VASTED_PROXY_HOST,VASTED_PROXY_PORT,VASTED_PUBLIC_HOST
Client Profiles and Jinja Behavior
setup supports client presets that define default llama.cpp --jinja behavior:
--client openclaw: jinja on by default--client opencode: jinja off by default--client custom: keep/manual behavior
Per launch override is still available:
uv run vasted up --jinja
uv run vasted up --no-jinja
Command Reference
vasted setup [--non-interactive] [--manual] [--client openclaw|opencode|custom]
vasted up [--model ...] [--quality ...] [--gpu-mode auto|manual] [--gpu-preset ...] [--profile ...] [--max-price ...] [--jinja|--no-jinja] [--yes] [--non-interactive] [--serve|--no-serve]
vasted down [--force]
vasted status [--verbose]
vasted logs [--instance-id N] [--tail N]
vasted usage
vasted token show [--full]
vasted token rotate
vasted rotate-token
vasted config show
vasted profile list|add|use|remove
vasted completions <bash|zsh|fish>
Telegram Bot (Optional)
Install telegram extra and run:
uv sync --extra telegram
uv run python bot.py
Development
uv run ruff check .
uv run mypy app tests bot.py
uv run pytest -q
Project Layout
app/commands/*: CLI command handlersapp/service.py: worker lifecycle + launch policyapp/proxy.py: OpenAI-compatible reverse proxyapp/vast.py: Vast API integration + startup script generationapp/usage.py: token/time/cost accountingapp/user_config.py: persistent config + keyring integrationapp/state.py: runtime state persistencebot.py: optional Telegram control plane
Security
- Keep Vast API keys and bearer tokens private.
- Prefer localhost binds unless remote access is required.
- See SECURITY.md for disclosure policy.
Contributing
See CONTRIBUTING.md and run the validation commands before opening a PR.
License
MIT — see LICENSE.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file vasted-0.1.0.tar.gz.
File metadata
- Download URL: vasted-0.1.0.tar.gz
- Upload date:
- Size: 21.6 MB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
3e1c45d47e371ad4f646fea79f863b04342aad2d1fc54948e25c618eb63dfdb7
|
|
| MD5 |
1a5ec3f3b11566c9feea6b46daa247fc
|
|
| BLAKE2b-256 |
f0dc01fd428c01390cc4d77918d52b7d093c5158a50f488cacbb392ef7208fbd
|
Provenance
The following attestation bundles were made for vasted-0.1.0.tar.gz:
Publisher:
publish.yml on deeflect/vasted
-
Statement:
-
Statement type:
https://in-toto.io/Statement/v1 -
Predicate type:
https://docs.pypi.org/attestations/publish/v1 -
Subject name:
vasted-0.1.0.tar.gz -
Subject digest:
3e1c45d47e371ad4f646fea79f863b04342aad2d1fc54948e25c618eb63dfdb7 - Sigstore transparency entry: 1006970297
- Sigstore integration time:
-
Permalink:
deeflect/vasted@ab5cc0788cc05ba139b47002c9b86363091a3eaf -
Branch / Tag:
refs/heads/main - Owner: https://github.com/deeflect
-
Access:
public
-
Token Issuer:
https://token.actions.githubusercontent.com -
Runner Environment:
github-hosted -
Publication workflow:
publish.yml@ab5cc0788cc05ba139b47002c9b86363091a3eaf -
Trigger Event:
workflow_dispatch
-
Statement type:
File details
Details for the file vasted-0.1.0-py3-none-any.whl.
File metadata
- Download URL: vasted-0.1.0-py3-none-any.whl
- Upload date:
- Size: 51.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
1e85287fec1d35a6ba8a1a42a55a7ec1fa6ba9fbfdadf1984999e21841802354
|
|
| MD5 |
03d0f4d9bd1dde7c11a1e642d6715fce
|
|
| BLAKE2b-256 |
ce251c01e0156b0aee1328aa3be66a5d5a8861f780f72c0ca8f83a8adb33d3c3
|
Provenance
The following attestation bundles were made for vasted-0.1.0-py3-none-any.whl:
Publisher:
publish.yml on deeflect/vasted
-
Statement:
-
Statement type:
https://in-toto.io/Statement/v1 -
Predicate type:
https://docs.pypi.org/attestations/publish/v1 -
Subject name:
vasted-0.1.0-py3-none-any.whl -
Subject digest:
1e85287fec1d35a6ba8a1a42a55a7ec1fa6ba9fbfdadf1984999e21841802354 - Sigstore transparency entry: 1006970326
- Sigstore integration time:
-
Permalink:
deeflect/vasted@ab5cc0788cc05ba139b47002c9b86363091a3eaf -
Branch / Tag:
refs/heads/main - Owner: https://github.com/deeflect
-
Access:
public
-
Token Issuer:
https://token.actions.githubusercontent.com -
Runner Environment:
github-hosted -
Publication workflow:
publish.yml@ab5cc0788cc05ba139b47002c9b86363091a3eaf -
Trigger Event:
workflow_dispatch
-
Statement type: