llama-wrangler

Lightweight web admin panel for llama.cpp server management.

Features

  • Model Browser — Scan local directory for .gguf files, view name/size/modified
  • Model Download — Search HuggingFace for GGUF models, download with progress tracking
  • Server Lifecycle — Start/stop/restart llama-server subprocess from the browser
  • Parameter Config — Visual editor for llama-server flags (context size, GPU layers, batch size, flash attention, etc.)
  • System Monitoring — Real-time GPU (VRAM, temp, utilization, power), CPU, RAM, and disk usage
  • Log Viewer — Stream llama-server stdout/stderr via Server-Sent Events
  • Health Monitoring — Poll /health endpoint, show status badge
  • i18n — English and Chinese interface, switchable at runtime
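As an illustration of the Model Browser feature, here is a minimal sketch of scanning a directory for .gguf files and collecting the name, size, and modification time of each. The names ModelFile and scan_models are hypothetical, and the recursive search is an assumption; the actual llama-wrangler implementation may differ.

```python
from dataclasses import dataclass
from datetime import datetime
from pathlib import Path
from typing import List


@dataclass
class ModelFile:
    """One GGUF file as a model browser might list it (hypothetical type)."""
    name: str
    size_bytes: int
    modified: datetime


def scan_models(models_dir: str) -> List[ModelFile]:
    """Return every .gguf file under models_dir, sorted by file name.

    Assumption: subdirectories are searched too (rglob); a flat scan
    would use glob("*.gguf") instead.
    """
    found = []
    for path in Path(models_dir).expanduser().rglob("*.gguf"):
        if not path.is_file():
            continue  # skip directories that happen to end in .gguf
        stat = path.stat()
        found.append(
            ModelFile(
                name=path.name,
                size_bytes=stat.st_size,
                modified=datetime.fromtimestamp(stat.st_mtime),
            )
        )
    return sorted(found, key=lambda m: m.name)
```

Calling scan_models("/path/to/models") returns a list that a UI could render as a table; an empty or missing directory yields an empty list.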

Install

pip install llama-wrangler

Prerequisites

llama-wrangler manages a llama-server process on the host machine. Make sure you have:

  • llama.cpp built with the llama-server binary (see the llama.cpp build instructions)
  • NVIDIA GPU driver installed (needed only for GPU inference and monitoring; CPU-only setups can skip it)

Quick Start

# Start the admin panel
llama-wrangler --host 0.0.0.0 --port 7860

# With custom config
llama-wrangler --config /path/to/config.json

Then open http://localhost:7860 in your browser.

Configuration

Config is stored at ~/.config/llama-wrangler/config.json:

{
  "llama_server_path": "/path/to/llama-server",
  "models_dir": "/path/to/models",
  "default_args": {
    "host": "0.0.0.0",
    "port": 8080,
    "n_gpu_layers": 99,
    "ctx_size": 8192,
    "flash_attn": true,
    "batch_size": 2048,
    "ubatch_size": 512,
    "threads": 0,
    "parallel": 1,
    "cont_batching": true,
    "metrics": true
  }
}
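To show how a config like this could translate into a llama-server invocation, here is a rough sketch that turns default_args into command-line flags. The function build_command and its mapping rules are assumptions for illustration, not llama-wrangler's actual code: each key is converted to a long flag by swapping underscores for hyphens, true booleans become bare flags, and false booleans are omitted. Some llama.cpp builds expect an explicit value for certain flags (for example --flash-attn), so check llama-server --help for your build.

```python
from typing import List


def build_command(config: dict, model_path: str) -> List[str]:
    """Build an argv list for llama-server from a config dict (illustrative).

    Assumed mapping: "ctx_size": 8192 -> ["--ctx-size", "8192"];
    "metrics": True -> ["--metrics"]; False values are dropped.
    Values are passed through verbatim, including 0.
    """
    cmd = [config["llama_server_path"], "--model", model_path]
    for key, value in config.get("default_args", {}).items():
        flag = "--" + key.replace("_", "-")
        # bool must be checked first because bool is a subclass of int
        if isinstance(value, bool):
            if value:
                cmd.append(flag)
        else:
            cmd.extend([flag, str(value)])
    return cmd
```

The resulting list can be handed to subprocess.Popen directly, which avoids shell quoting problems with paths that contain spaces.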

Docker

Host prerequisites

The following must be set up on the host machine before running the container:

  1. NVIDIA GPU driver — install from NVIDIA or your distro's package manager
  2. NVIDIA Container Toolkit — required for --gpus flag to work:
    # Ubuntu/Debian
    curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | \
      sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
    curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
      sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
      sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
    sudo apt-get update && sudo apt-get install -y nvidia-container-toolkit
    sudo nvidia-ctk runtime configure --runtime=docker
    sudo systemctl restart docker
    
    See the official install guide for other distros.
  3. llama.cpp compiled on the host, including the llama-server binary
  4. Verify that GPU passthrough works: docker run --rm --gpus all ubuntu nvidia-smi

Build and run

# Build
docker build -t llama-wrangler .

# Run
docker run --gpus all -p 7860:7860 \
  -v /path/to/models:/mnt/data/models \
  -v /path/to/llama-server:/opt/llama-server:ro \
  -v /sys:/sys:ro \
  -v ~/.config/llama-wrangler:/root/.config/llama-wrangler \
  llama-wrangler

Volume mounts explained:

Option                                          Purpose
-v /path/to/models:/mnt/data/models             GGUF model files (read/write for downloads)
-v /path/to/llama-server:/opt/llama-server:ro   llama-server binary from host
-v /sys:/sys:ro                                 Sensor data (disk/NVMe temperatures via psutil)
-v ~/.config/llama-wrangler:...                 Persist configuration across restarts
--gpus all                                      GPU access (nvidia-smi, CUDA for llama-server)

Note: CPU and RAM metrics work out of the box in Docker, because psutil reads /proc, which reports host-wide values inside the container. GPU monitoring requires --gpus all, which depends on the NVIDIA Container Toolkit.
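To make the /proc point concrete, the sketch below parses the text of /proc/meminfo by hand and derives a RAM usage percentage. It is only an illustration of the kind of data psutil reads on Linux; psutil itself is more thorough, and the helper names parse_meminfo and ram_used_percent are hypothetical.

```python
from typing import Dict


def parse_meminfo(text: str) -> Dict[str, int]:
    """Parse /proc/meminfo content into {field: number}.

    Most fields are reported in kB; a few (e.g. HugePages_Total)
    are plain counts.
    """
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Name: value" line
        parts = rest.split()
        if parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def ram_used_percent(meminfo: Dict[str, int]) -> float:
    """Share of RAM in use, based on MemTotal and MemAvailable."""
    total = meminfo["MemTotal"]
    available = meminfo["MemAvailable"]
    return 100.0 * (total - available) / total


# Usage on a Linux host or inside a container:
#   with open("/proc/meminfo") as f:
#       print(ram_used_percent(parse_meminfo(f.read())))
```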

Without GPU

llama-wrangler works without a GPU (CPU-only inference). Simply omit --gpus all:

docker run -p 7860:7860 \
  -v /path/to/models:/mnt/data/models \
  -v /path/to/llama-server:/opt/llama-server:ro \
  llama-wrangler

The GPU section on the dashboard will be hidden automatically.
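One way such automatic hiding could work is to check whether nvidia-smi is available and return an empty GPU list when it is not. The following is a minimal sketch under that assumption; read_gpus, parse_gpu_csv, and the chosen query fields are illustrative, not taken from llama-wrangler's source. The nvidia-smi options used (--query-gpu and --format=csv,noheader,nounits) are standard.

```python
import shutil
import subprocess
from typing import Dict, List, Optional

# Fields requested from nvidia-smi, in output column order.
QUERY_FIELDS = [
    "memory.used",
    "memory.total",
    "temperature.gpu",
    "utilization.gpu",
    "power.draw",
]


def parse_gpu_csv(output: str) -> List[Dict[str, Optional[float]]]:
    """Parse csv,noheader,nounits output into one dict per GPU."""
    gpus = []
    for line in output.strip().splitlines():
        cells = [cell.strip() for cell in line.split(",")]
        if len(cells) != len(QUERY_FIELDS):
            continue  # skip malformed rows
        row = {}
        for field, cell in zip(QUERY_FIELDS, cells):
            try:
                row[field] = float(cell)
            except ValueError:
                row[field] = None  # e.g. "[N/A]" on unsupported sensors
        gpus.append(row)
    return gpus


def read_gpus() -> List[Dict[str, Optional[float]]]:
    """Return GPU metrics, or [] when no usable nvidia-smi is present."""
    if shutil.which("nvidia-smi") is None:
        return []
    try:
        result = subprocess.run(
            [
                "nvidia-smi",
                "--query-gpu=" + ",".join(QUERY_FIELDS),
                "--format=csv,noheader,nounits",
            ],
            capture_output=True,
            text=True,
            timeout=5,
        )
    except (OSError, subprocess.TimeoutExpired):
        return []
    if result.returncode != 0:
        return []
    return parse_gpu_csv(result.stdout)
```

A dashboard could then render its GPU section only when read_gpus() returns a non-empty list.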

Tech Stack

  • Backend: Python asyncio (zero-framework, vendored HTTP server)
  • Frontend: Single-file vanilla HTML/CSS/JS
  • Dependencies: huggingface-hub, psutil only
  • No: Flask, FastAPI, React, npm, database

License

MIT
