gRPC frontend server exposing vLLM's V1 engine over protobuf/gRPC
vllm-grpc-frontend
The gRPC frontend server for vllm-grpc.
It wraps vLLM's V1 AsyncLLM engine and serves the project's protobuf
ChatService, CompletionsService, and HealthService over gRPC.
Affiliation: vllm-grpc is an independent, community project and is not affiliated with, endorsed by, or sponsored by the vLLM project. "vLLM" is used here only to identify the inference engine this frontend works with.
Install
pip install vllm-grpc-frontend
The base install does not pull in vLLM, so it succeeds on any platform, including those without a vLLM wheel. vLLM is required only to run the engine.
vLLM prerequisite (V1 engine)
The frontend drives vLLM's V1 AsyncLLM API, so it requires vllm>=0.20.
Install it via the opt-in engine extra:
pip install "vllm-grpc-frontend[engine]" # pulls vllm>=0.20
Or provide vLLM yourself — useful on platforms where the stock wheel does not
build (for example macOS, which uses vllm-metal). If vLLM is missing at
runtime, the server raises an ImportError when it starts; the package still
installs fine without it.
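Because a missing vLLM only surfaces as an ImportError at server start, it can help to check for it beforehand. The following is a minimal sketch using only the standard library; the helper name engine_status is hypothetical and not part of this package, and it does not verify the >=0.20 requirement.

```python
import importlib.metadata
import importlib.util


def engine_status(module_name="vllm"):
    """Return (installed, version) for a top-level module name.

    version is None when the module is missing, or when it is importable
    but no distribution metadata is registered under the same name.
    """
    # find_spec locates a top-level module without importing it.
    if importlib.util.find_spec(module_name) is None:
        return (False, None)
    try:
        return (True, importlib.metadata.version(module_name))
    except importlib.metadata.PackageNotFoundError:
        return (True, None)


installed, version = engine_status()
if installed:
    print(f"vllm found (version: {version or 'unknown'})")
else:
    print('vllm not found; install "vllm-grpc-frontend[engine]" or provide vLLM yourself')
```

If the reported version is unknown, vLLM may have been installed under a distribution name that differs from the module name; compare the version against the stated requirement by hand in that case.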
Run
The package installs a vllm-grpc-frontend console script:
vllm-grpc-frontend # serves gRPC on 0.0.0.0:50051 by default
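One quick way to confirm that something is listening on the default port is a plain TCP connect. This sketch assumes the server runs on the local machine; the helper name is_listening is hypothetical, and a successful connect shows only that the port is open, not that the gRPC services respond.

```python
import socket


def is_listening(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


# 50051 is the documented default port; 127.0.0.1 assumes a local server.
print(is_listening("127.0.0.1", 50051))
```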
Configure with environment variables:
- MODEL_NAME: model to load (default Qwen/Qwen3-0.6B)
- FRONTEND_HOST / FRONTEND_PORT: bind address (default 0.0.0.0:50051)
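To illustrate how these variables and their defaults fit together, here is a minimal sketch of resolving them from the environment. It is not the package's actual configuration code: FrontendSettings and load_settings are hypothetical names, and my-org/my-model is a placeholder model identifier.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class FrontendSettings:
    """Hypothetical container for the three documented variables."""
    model_name: str
    host: str
    port: int


def load_settings(env=None):
    """Resolve settings from a mapping, falling back to the documented defaults."""
    env = os.environ if env is None else env
    return FrontendSettings(
        model_name=env.get("MODEL_NAME", "Qwen/Qwen3-0.6B"),
        host=env.get("FRONTEND_HOST", "0.0.0.0"),
        # Environment values are strings, so the port is converted explicitly.
        port=int(env.get("FRONTEND_PORT", "50051")),
    )


# Override two variables and leave FRONTEND_HOST at its default.
settings = load_settings({"MODEL_NAME": "my-org/my-model", "FRONTEND_PORT": "6000"})
print(f"{settings.model_name} on {settings.host}:{settings.port}")
```

Passing a plain dict instead of os.environ keeps the sketch easy to test; in a shell, the same overrides would be set as environment variables before running vllm-grpc-frontend.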
Links
- Repository: https://github.com/AncientStudying/vllm-grpc
- Changelog: https://github.com/AncientStudying/vllm-grpc/blob/main/CHANGELOG.md
- Issues: https://github.com/AncientStudying/vllm-grpc/issues
License
MIT
Download files
File details
Details for the file vllm_grpc_frontend-0.1.0.tar.gz.
File metadata
- Download URL: vllm_grpc_frontend-0.1.0.tar.gz
- Upload date:
- Size: 16.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 3f2c86c58f68f51fc277fffa12d1eb49ecd6fd482d3b04f29dfadf3ad2eed7c1 |
| MD5 | d35796607baba4f83121f09271d97069 |
| BLAKE2b-256 | 246932a936d3178b15fabfaf2642bb3a5fa17acf28f772ca39381a379bc70b3d |
File details
Details for the file vllm_grpc_frontend-0.1.0-py3-none-any.whl.
File metadata
- Download URL: vllm_grpc_frontend-0.1.0-py3-none-any.whl
- Upload date:
- Size: 13.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | ce3afc41f91c803fbce2863f445ce1ea380f010210ea318cbd90cb4e22cfc39e |
| MD5 | 92674ee2d8f0a6e66ed2a6d85a3c0e35 |
| BLAKE2b-256 | 980353d7ff802d3df3cb0e44daed6b3f5f2159258e23e52de43e4bf048ee2dac |