# vLLM Judge

LLM-as-a-Judge evaluations for vLLM-hosted models.

A lightweight library for LLM-as-a-Judge evaluations using vLLM-hosted models. Please refer to the documentation for usage details.
## Features

- 🚀 **Simple Interface**: A single `evaluate()` method that adapts to any use case
- 🎯 **Pre-built Metrics**: 20+ ready-to-use evaluation metrics
- 🔧 **Template Support**: Dynamic evaluations with template variables
- ⚡ **High Performance**: Optimized for vLLM with automatic batching
- 🌐 **API Mode**: Run as a REST API service
- 🔄 **Async Native**: Built for high-throughput evaluations
## Installation

```shell
# Basic installation
pip install vllm-judge

# With API support
pip install "vllm-judge[api]"

# With Jinja2 template support
pip install "vllm-judge[jinja2]"

# Everything
pip install "vllm-judge[dev]"
```
## Quick Start

```python
from vllm_judge import Judge

# Initialize with the vLLM server URL
judge = Judge.from_url("http://localhost:8000")

# Simple evaluation
result = await judge.evaluate(
    response="The Earth orbits around the Sun.",
    criteria="scientific accuracy"
)
print(f"Decision: {result.decision}")
print(f"Reasoning: {result.reasoning}")

# Using pre-built metrics
from vllm_judge import CODE_QUALITY

result = await judge.evaluate(
    response="def add(a, b): return a + b",
    metric=CODE_QUALITY
)

# With template variables
result = await judge.evaluate(
    response="Essay content here...",
    criteria="Evaluate this {doc_type} for {audience}",
    template_vars={
        "doc_type": "essay",
        "audience": "high school students"
    }
)
```
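Template variables are substituted into the criteria string before it is sent to the model. Conceptually, this works like Python's own `str.format` (a plain-Python illustration of the substitution, not the library's internals, which may use Jinja2 when that extra is installed):

```python
# The criteria template and variables from the example above
criteria = "Evaluate this {doc_type} for {audience}"
template_vars = {"doc_type": "essay", "audience": "high school students"}

# Fill the placeholders before the prompt reaches the judge model
rendered = criteria.format(**template_vars)
print(rendered)  # Evaluate this essay for high school students
```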
## API Server

Run Judge as a REST API:

```shell
vllm-judge serve --base-url http://localhost:8000 --port 9090 --host localhost
```

Then use the HTTP API:

```python
from vllm_judge.api import JudgeClient

client = JudgeClient("http://localhost:9090")
result = await client.evaluate(
    response="Python is great!",
    criteria="technical accuracy"
)
```
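Because `evaluate()` is a coroutine, many evaluations can be fanned out concurrently with `asyncio.gather`, which is where the async-native design pays off for high-throughput workloads. A minimal sketch of the pattern, using a hypothetical stand-in for `judge.evaluate()` so it runs without a server:

```python
import asyncio

# Hypothetical stand-in for judge.evaluate(); the real call returns a
# result object with .decision and .reasoning fields.
async def evaluate_stub(response: str, criteria: str) -> dict:
    await asyncio.sleep(0)  # simulate a round trip to the vLLM server
    return {"response": response, "criteria": criteria}

async def main() -> list:
    responses = ["answer one", "answer two", "answer three"]
    # Launch all evaluations concurrently and collect results in order
    return await asyncio.gather(
        *(evaluate_stub(r, "helpfulness") for r in responses)
    )

results = asyncio.run(main())
print(len(results))  # 3
```

With the real `Judge`, the same `gather` pattern applies to `judge.evaluate(...)` calls directly.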
## File details

### Source distribution: vllm_judge-0.1.1.tar.gz

- Size: 30.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.11.11

| Algorithm | Hash digest |
|---|---|
| SHA256 | `968c52831edaadc2a36ee57f4ec90546bec1380577320d23020eca713b5d7e76` |
| MD5 | `e9c9065aaf66fb2225f2bb014bc061c5` |
| BLAKE2b-256 | `03cdf2eab195ab6e42e486eaa744e18c1c1b3a6741d25536980c89228140c329` |
### Built distribution: vllm_judge-0.1.1-py3-none-any.whl

- Size: 34.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.11.11

| Algorithm | Hash digest |
|---|---|
| SHA256 | `9156022e8d7a5fca80d4dda883d22013ddf737ce67be3aad1cd094258f686f5e` |
| MD5 | `ad36c5e23ced203dc2b9afa2ed4aa235` |
| BLAKE2b-256 | `479fa29959965663d419b4c11cf19f3638b3be186f0d67c1e4c83c900230cc63` |