vLLM Playground

A modern web interface for managing and interacting with vLLM servers (www.github.com/vllm-project/vllm). Supports GPU and CPU modes, with special optimizations for macOS Apple Silicon and enterprise deployment on OpenShift/Kubernetes.

✨ Claude Code Integration

[Screenshot: vLLM Playground Claude Code]

Run Claude Code with open-source models served by vLLM - your private, local coding assistant.

✨ Agentic-Ready with MCP Support

[Screenshot: vLLM Playground MCP Integration]

MCP (Model Context Protocol) integration enables models to use external tools with human-in-the-loop approval.

✨ Tool Calling Support

[Screenshot: vLLM Playground Interface]

✨ Structured Outputs Support

[Screenshot: vLLM Playground with Structured Outputs]

🆕 What's New in v0.1.3

  • 🎮 Multi-Accelerators - NVIDIA CUDA, AMD ROCm, Google TPU support
  • 🤖 Claude Code - Use open-source models as Claude Code backend
  • Metal GPU - Apple Silicon GPU acceleration via vllm-metal
  • 🔧 Custom venv - Use specific vLLM versions or custom builds
  • 🐳 vLLM v0.12.0 - Updated container image with Anthropic Messages API

See Changelog for full details.


🚀 Quick Start

# Install from PyPI
pip install vllm-playground

# Pre-download container image (~10GB for GPU)
vllm-playground pull

# Start the playground
vllm-playground

Open http://localhost:7860 and click "Start Server" - that's it! 🎉

CLI Options

vllm-playground pull                # Pre-download GPU image
vllm-playground pull --cpu          # Pre-download CPU image
vllm-playground --port 8080         # Custom port
vllm-playground stop                # Stop running instance
vllm-playground status              # Check status
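
If the default port (7860) might already be taken, a quick check before launching can tell you whether to pass `--port`. This is a minimal sketch, not part of the vllm-playground CLI: `port_in_use` is a hypothetical helper that simply tries to connect to the port on localhost.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if something is already listening on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        # connect_ex returns 0 on success instead of raising an exception
        return sock.connect_ex((host, port)) == 0


if port_in_use(7860):
    print("Port 7860 is busy: run `vllm-playground stop` or choose another with --port")
else:
    print("Port 7860 is free")
```

A connect-based check only detects listeners reachable on the given host, so treat a "free" result as a hint rather than a guarantee.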

✨ Key Features

| Feature | Description |
| --- | --- |
| 🤖 Claude Code | Use open-source models as Claude Code backend via vLLM |
| 💬 Modern Chat UI | Streamlined ChatGPT-style interface with streaming responses |
| 🔧 Tool Calling | Function calling with Llama, Mistral, Qwen, and more |
| 🔗 MCP Integration | Connect to MCP servers for agentic capabilities |
| 🏗️ Structured Outputs | Constrain responses to JSON Schema, Regex, or Grammar |
| 🐳 Container Mode | Zero-setup vLLM via automatic container management |
| ☸️ OpenShift/K8s | Enterprise deployment with dynamic pod creation |
| 📊 Benchmarking | GuideLLM integration for load testing |
| 📚 Recipes | One-click configs from vLLM community recipes |

📦 Installation Options

| Method | Command | Best For |
| --- | --- | --- |
| PyPI | pip install vllm-playground | Most users |
| With Benchmarking | pip install vllm-playground[benchmark] | Load testing |
| From Source | git clone + python run.py | Development |
| OpenShift/K8s | ./openshift/deploy.sh | Enterprise |

📖 See Installation Guide for detailed instructions.


🔧 Configuration

Tool Calling

Enable tool calling in Server Configuration before starting the server:

  1. Check "Enable Tool Calling"
  2. Select parser (or "Auto-detect")
  3. Start server
  4. Define tools in the 🔧 toolbar panel
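
As an illustration of step 4, here is what a tool definition looks like in the OpenAI function-calling format that vLLM's OpenAI-compatible API accepts. This is a sketch under assumptions: the `get_weather` tool is hypothetical, and whether the toolbar panel expects exactly this shape is not confirmed by this README.

```python
import json

# Hypothetical example tool in OpenAI function-calling format.
get_weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            # JSON Schema describing the arguments the model may pass
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name"},
                "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
            },
            "required": ["city"],
        },
    },
}

print(json.dumps(get_weather_tool, indent=2))
```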

Supported models (tool-call parser in parentheses):

  • Llama 3.x (llama3_json)
  • Mistral (mistral)
  • Qwen (hermes)
  • Hermes (hermes)
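
The pairing above can be captured in a small lookup when you would rather not pick the parser by hand. This is a minimal sketch, not the playground's "Auto-detect" logic: `pick_tool_call_parser` and its substring matching are hypothetical, and only the four model-to-parser pairs come from the list above.

```python
from typing import Optional

# Parser names come from the supported-models list. The keys are lowercase
# substrings matched against the model ID (an assumption of this sketch).
# "hermes" is checked first so Hermes fine-tunes of other base models match it.
PARSER_BY_FAMILY = {
    "hermes": "hermes",
    "llama-3": "llama3_json",
    "mistral": "mistral",
    "qwen": "hermes",
}


def pick_tool_call_parser(model_id: str) -> Optional[str]:
    """Return the tool-call parser for a model ID, or None if unknown."""
    lowered = model_id.lower()
    for family, parser in PARSER_BY_FAMILY.items():
        if family in lowered:
            return parser
    return None


print(pick_tool_call_parser("meta-llama/Llama-3.1-8B-Instruct"))
```

Returning `None` for unknown models leaves the decision to the caller, who can fall back to "Auto-detect" in the UI.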

Claude Code Integration

Use vLLM to serve open-source models as a backend for Claude Code:

  1. Go to Claude Code in the sidebar
  2. Start vLLM with a recommended model (see tips on the page)
  3. The embedded terminal connects automatically

Requirements:

  • vLLM v0.12.0+ (for Anthropic Messages API)
  • Model with native 65K+ context and tool calling support
  • ttyd installed for web terminal

Recommended model and vLLM server flags for most GPUs:

meta-llama/Llama-3.1-8B-Instruct
--max-model-len 65536 --enable-auto-tool-choice --tool-call-parser llama3_json
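
For reference, the model and flags above correspond to a `vllm serve` invocation. The sketch below only assembles and prints that command; `build_serve_command` is a hypothetical helper, and how the playground itself launches vLLM is not described here.

```python
import shlex
from typing import List


def build_serve_command(model: str, max_model_len: int, parser: str) -> List[str]:
    """Assemble a `vllm serve` argv list with tool calling enabled."""
    return [
        "vllm", "serve", model,
        "--max-model-len", str(max_model_len),
        "--enable-auto-tool-choice",
        "--tool-call-parser", parser,
    ]


cmd = build_serve_command("meta-llama/Llama-3.1-8B-Instruct", 65536, "llama3_json")
print(shlex.join(cmd))
```

Keeping the command as a list (rather than a single string) makes it safe to pass to `subprocess.run` without shell quoting issues.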

MCP Servers

Connect to external tools via Model Context Protocol:

  1. Go to MCP Servers in the sidebar
  2. Add a server (presets available: Filesystem, Git, Fetch, Time)
  3. Connect, then enable the server in the chat panel

⚠️ MCP requires Python 3.10+
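
To confirm that the interpreter running the playground meets this requirement, a one-line version check is enough. `mcp_supported` is a hypothetical helper for illustration, not a playground API.

```python
import sys
from typing import Sequence


def mcp_supported(version: Sequence[int] = sys.version_info) -> bool:
    """True if the interpreter meets the Python 3.10+ requirement for MCP."""
    return tuple(version[:2]) >= (3, 10)


print(mcp_supported((3, 9, 18)))   # False
print(mcp_supported((3, 12, 0)))   # True
```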

CPU Mode (macOS)

Edit config/vllm_cpu.env:

# KV cache size in GiB
export VLLM_CPU_KVCACHE_SPACE=40
# CPU core binding for OpenMP threads ("auto" lets vLLM choose)
export VLLM_CPU_OMP_THREADS_BIND=auto
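
To sanity-check the file before starting the server, the settings can be read back with a few lines of Python. This is a minimal sketch: `parse_env_exports` is a hypothetical helper, not a playground API, and it handles only simple `export KEY=VALUE` lines with comments on their own lines.

```python
from typing import Dict

SAMPLE = """\
export VLLM_CPU_KVCACHE_SPACE=40
export VLLM_CPU_OMP_THREADS_BIND=auto
"""


def parse_env_exports(text: str) -> Dict[str, str]:
    """Parse `export KEY=VALUE` lines, skipping blanks and # comments."""
    settings: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("export "):
            line = line[len("export "):]
        key, sep, value = line.partition("=")
        if sep:
            # Drop surrounding quotes, if any
            settings[key.strip()] = value.strip().strip("\"'")
    return settings


settings = parse_env_exports(SAMPLE)
print(settings)
```

In practice you would pass the contents of `config/vllm_cpu.env` instead of `SAMPLE`.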

Metal GPU Support (macOS Apple Silicon)

vLLM Playground supports Apple Silicon GPU acceleration:

  1. Install vllm-metal following official instructions
  2. Configure playground to use Metal:
    • Run Mode: Subprocess
    • Compute Mode: Metal
    • Venv Path: ~/.venv-vllm-metal (or your installation path)

See macOS Metal Guide for details.

Custom vLLM Installations

Use specific vLLM versions or custom builds:

  1. Install vLLM in a virtual environment
  2. Configure playground:
    • Run Mode: Subprocess
    • Venv Path: /path/to/your/venv

See Custom venv Guide for details.


📖 Documentation

Releases

  • Changelog - Version history and changes
  • v0.1.3 - Multi-accelerators, Claude Code, vLLM-Metal
  • v0.1.2 - ModelScope integration, i18n improvements
  • v0.1.1 - MCP integration, runtime detection
  • v0.1.0 - First release, modern UI, tool calling

🏗️ Architecture

┌──────────────────┐
│   User Browser   │
└────────┬─────────┘
         │ http://localhost:7860
         ↓
┌──────────────────┐
│   Web UI (Host)  │  ← FastAPI + JavaScript
└────────┬─────────┘
         │
    ┌────┴────┐
    ↓         ↓
┌─────────┐ ┌────────┐
│ vLLM    │ │  MCP   │  ← Containers / External Servers
│Container│ │Servers │
└─────────┘ └────────┘

📖 See Architecture Overview for details.


🆘 Quick Troubleshooting

| Issue | Solution |
| --- | --- |
| Port in use | vllm-playground stop |
| Container won't start | podman logs vllm-service |
| Tool calling fails | Restart with "Enable Tool Calling" checked |
| Image pull errors | vllm-playground pull --all |

📖 See Troubleshooting Guide for more.


📝 License

Apache 2.0 License - See LICENSE file for details.

🤝 Contributing

Contributions welcome! Please feel free to submit issues and pull requests.


Made with ❤️ for the vLLM community
