Skip to main content

A web interface for managing and interacting with vLLM servers

Project description

vLLM Playground

A modern web interface for managing and interacting with vLLM servers (www.github.com/vllm-project/vllm). Supports GPU and CPU modes, with special optimizations for macOS Apple Silicon and enterprise deployment on OpenShift/Kubernetes.

โœจ Agentic-Ready with MCP Support

vLLM Playground MCP Integration

MCP (Model Context Protocol) integration enables models to use external tools with human-in-the-loop approval.

โœจ Tool Calling Support

vLLM Playground Interface

โœจ Structured Outputs Support

vLLM Playground with Structured Outputs

๐Ÿ†• What's New in v0.1.2

  • ๐ŸŒ ModelScope Support - Alternative model source for China region users
  • ๐ŸŒ i18n Chinese - Comprehensive Chinese language translations
  • ๐Ÿ’ฌ Chat Export - Save conversations with export functionality
  • ๐Ÿ› Bug Fixes - Windows Unicode fix, sidebar UI improvements

See Changelog for full details.


๐Ÿš€ Quick Start

# Install from PyPI
pip install vllm-playground

# Pre-download container image (~10GB for GPU)
vllm-playground pull

# Start the playground
vllm-playground

Open http://localhost:7860 and click "Start Server" - that's it! ๐ŸŽ‰

CLI Options

vllm-playground pull                # Pre-download GPU image
vllm-playground pull --cpu          # Pre-download CPU image
vllm-playground --port 8080         # Custom port
vllm-playground stop                # Stop running instance
vllm-playground status              # Check status

โœจ Key Features

Feature Description
๐Ÿ’ฌ Modern Chat UI Streamlined ChatGPT-style interface with streaming responses
๐Ÿ”ง Tool Calling Function calling with Llama, Mistral, Qwen, and more
๐Ÿ”— MCP Integration Connect to MCP servers for agentic capabilities
๐Ÿ—๏ธ Structured Outputs Constrain responses to JSON Schema, Regex, or Grammar
๐Ÿณ Container Mode Zero-setup vLLM via automatic container management
โ˜ธ๏ธ OpenShift/K8s Enterprise deployment with dynamic pod creation
๐Ÿ“Š Benchmarking GuideLLM integration for load testing
๐Ÿ“š Recipes One-click configs from vLLM community recipes

๐Ÿ“ฆ Installation Options

Method Command Best For
PyPI pip install vllm-playground Most users
With Benchmarking pip install vllm-playground[benchmark] Load testing
From Source git clone + python run.py Development
OpenShift/K8s ./openshift/deploy.sh Enterprise

๐Ÿ“– See Installation Guide for detailed instructions.


๐Ÿ”ง Configuration

Tool Calling

Enable in Server Configuration before starting:

  1. Check "Enable Tool Calling"
  2. Select parser (or "Auto-detect")
  3. Start server
  4. Define tools in the ๐Ÿ”ง toolbar panel

Supported Models:

  • Llama 3.x (llama3_json)
  • Mistral (mistral)
  • Qwen (hermes)
  • Hermes (hermes)

MCP Servers

Connect to external tools via Model Context Protocol:

  1. Go to MCP Servers in the sidebar
  2. Add a server (presets available: Filesystem, Git, Fetch, Time)
  3. Connect and enable in chat panel

โš ๏ธ MCP requires Python 3.10+

CPU Mode (macOS)

Edit config/vllm_cpu.env:

export VLLM_CPU_KVCACHE_SPACE=40
export VLLM_CPU_OMP_THREADS_BIND=auto

๐Ÿ“– Documentation

Getting Started

Features

Deployment

Reference

Releases

  • Changelog - Version history and changes
  • v0.1.2 - ModelScope integration, i18n improvements
  • v0.1.1 - MCP integration, runtime detection
  • v0.1.0 - First release, modern UI, tool calling

๐Ÿ—๏ธ Architecture

โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
โ”‚   User Browser   โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜
         โ”‚ http://localhost:7860
         โ†“
โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
โ”‚   Web UI (Host)  โ”‚  โ† FastAPI + JavaScript
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜
         โ”‚
    โ”Œโ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”
    โ†“         โ†“
โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€-โ”€โ” โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
โ”‚ vLLM    โ”‚ โ”‚  MCP   โ”‚  โ† Containers / External Servers
โ”‚Containerโ”‚ โ”‚Servers โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€-โ”˜ โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜

๐Ÿ“– See Architecture Overview for details.


๐Ÿ†˜ Quick Troubleshooting

Issue Solution
Port in use vllm-playground stop
Container won't start podman logs vllm-service
Tool calling fails Restart with "Enable Tool Calling" checked
Image pull errors vllm-playground pull --all

๐Ÿ“– See Troubleshooting Guide for more.


๐Ÿ”— Related Projects


๐Ÿ“ License

Apache 2.0 License - See LICENSE file for details.

๐Ÿค Contributing

Contributions welcome! Please feel free to submit issues and pull requests.


Made with โค๏ธ for the vLLM community

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

vllm_playground-0.1.2.tar.gz (5.3 MB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

vllm_playground-0.1.2-py3-none-any.whl (5.3 MB view details)

Uploaded Python 3

File details

Details for the file vllm_playground-0.1.2.tar.gz.

File metadata

  • Download URL: vllm_playground-0.1.2.tar.gz
  • Upload date:
  • Size: 5.3 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.8.17

File hashes

Hashes for vllm_playground-0.1.2.tar.gz
Algorithm Hash digest
SHA256 d97fb1a9d9c6bab6319558cc08494598810f6e7f8655a083874350062aade0e7
MD5 3c779d907917af94816cf88b28f81655
BLAKE2b-256 5425abe4c21074b9998efd35f40caadf6fdec4f082d9d36a8377104933f09e74

See more details on using hashes here.

File details

Details for the file vllm_playground-0.1.2-py3-none-any.whl.

File metadata

File hashes

Hashes for vllm_playground-0.1.2-py3-none-any.whl
Algorithm Hash digest
SHA256 230d049f9a1cd7376041e4e1b154b9c7187f1636186480b3cda8d93e308bf4fe
MD5 278bd37442ec79934bac9c590f5afd34
BLAKE2b-256 646a27226daff9e100630f7d6553e41aefd821d136438a9531229a7c8702144b

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page