Enterprise-grade resilient vLLM client network and preflight validation engine.
vllm-sdk
A high-performance, asynchronous resilience gateway client built to connect distributed application services to remote GPU infrastructure safely and efficiently.
Table of Contents
- Overview
- Key Architecture Benefits
- Installation
- Environment Configuration
- Quick Start Usage
- Console Logging Aesthetics
- License
Overview
Managing raw HTTP streaming routes directly to high-throughput LLM clusters can cause major stability issues, such as socket exhaustion, memory crashes, or lost responses.
The vllm-sdk wraps this networking in a clean, production-hardened interface. It manages connections in the background, strips the framing from raw Server-Sent Event (SSE) blocks, and feeds your applications ready-to-use text tokens in real time.
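To make the SSE cleanup step concrete, here is a minimal sketch of turning raw stream lines into text tokens. It assumes the OpenAI-compatible chat streaming format (`data: {...}` frames carrying `choices[].delta.content`, terminated by `data: [DONE]`). The function name `extract_tokens` is hypothetical and is not part of the SDK's documented API.

```python
import json
from typing import Iterable, Iterator


def extract_tokens(sse_lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from OpenAI-style SSE lines (assumed wire format)."""
    for raw in sse_lines:
        line = raw.strip()
        # Skip blank separators, SSE comments, and any non-data fields.
        if not line.startswith("data:"):
            continue
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        try:
            chunk = json.loads(payload)
        except json.JSONDecodeError:
            continue  # this sketch simply drops malformed frames
        for choice in chunk.get("choices", []):
            text = (choice.get("delta") or {}).get("content")
            if text:
                yield text


sample = [
    'data: {"choices": [{"delta": {"role": "assistant"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": "Hello"}}]}',
    ": keep-alive comment",
    'data: {"choices": [{"delta": {"content": ", world"}}]}',
    "data: [DONE]",
]
print("".join(extract_tokens(sample)))  # prints: Hello, world
```

Frames without text (such as the initial role-only delta) are skipped, so the caller only ever sees printable content.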
Key Architecture Benefits
- Asynchronous Concurrency: Built natively on Python's `asyncio` loop. It easily supports 100+ concurrent app instances (like Chatbots, BOM Parsers, and Tender Text Extractors) without stalling performance.
- Keep-Alive Connection Pooling: Reuses active TCP paths over `httpx.AsyncClient` instead of spinning up new sockets for every line, cutting down Time-To-First-Token (TTFT).
- Pre-flight Integrity Checking: Instantly scans system paths and environment flags before booting to prevent downstream configuration crashes.
- Localized Brand Logging: Implements highly scannable terminal tracking designed after modern web frameworks like FastAPI and Uvicorn.
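The concurrency claim above can be illustrated with plain `asyncio`. The sketch below fans out 100 streaming requests while capping how many run at once. `fake_stream`, `run_one`, `run_many`, and `max_in_flight` are hypothetical names, and `fake_stream` is a stand-in for a real streaming call, so this shows the general pattern rather than the SDK's internals.

```python
import asyncio
from typing import AsyncIterator


async def fake_stream(prompt: str) -> AsyncIterator[str]:
    """Stand-in for a streaming inference call; yields the prompt word by word."""
    for word in prompt.split():
        await asyncio.sleep(0)  # hand control back to the loop, as a network read would
        yield word + " "


async def run_one(prompt: str, limit: asyncio.Semaphore) -> str:
    """Collect one streamed response while holding a concurrency slot."""
    async with limit:
        parts = []
        async for token in fake_stream(prompt):
            parts.append(token)
        return "".join(parts).strip()


async def run_many(prompts: list[str], max_in_flight: int = 16) -> list[str]:
    """Fan out all prompts, keeping at most max_in_flight active at once."""
    limit = asyncio.Semaphore(max_in_flight)
    # gather returns results in the same order as the input prompts.
    return await asyncio.gather(*(run_one(p, limit) for p in prompts))


if __name__ == "__main__":
    prompts = [f"request number {i}" for i in range(100)]
    results = asyncio.run(run_many(prompts))
    print(len(results), results[0])
```

Bounding in-flight requests with a semaphore is one way to keep a shared connection pool from being oversubscribed; the right limit depends on the server and is not specified by this README.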
Installation
This project is managed with the uv Python package manager.
# Clone the repository
git clone [inprogress](inprogress)
cd vllm_sdk
# Sync dependencies and create the virtual environment automatically
uv sync
Quick Start Usage
# -*- coding: utf-8 -*-
import asyncio

from src.vllm_resilience_sdk.logging_config import setup_sdk_logging
from src.vllm_resilience_sdk import SystemInitializationEngine
from src.vllm_resilience_sdk.clients import ProductionVLLMClient


async def main():
    # 1. Initialize FAANG-style high-visibility console reporting
    setup_sdk_logging()

    # 2. Run background system verification checks
    verifier = SystemInitializationEngine(target_log_dir="./logs")
    verifier.run_pre_boot_pipeline()

    # 3. Instantiate the connection pooling client
    client = ProductionVLLMClient()
    await client.initialize_vllm_connection()

    # 4. Construct workload payload
    sample_payload = {
        "model": "LocalModel",
        "messages": [{"role": "user", "content": "Analyze PCB raw schematics metadata."}]
    }

    print("\n--- AI Engine Stream Output Response ---")

    # 5. Stream processed text tokens seamlessly
    async for token in client.send_inference_request(sample_payload):
        print(token, end="", flush=True)

    print("\n----------------------------------------\n")

    # 6. Safely flush socket channels on teardown
    await client.close_vllm_connection()


if __name__ == "__main__":
    asyncio.run(main())
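In the Quick Start, the teardown call is only reached if streaming finishes without an exception. One possible hardening, sketched here with a hypothetical `StubClient` that merely mirrors the method names used above, is to wrap the stream in `try`/`finally` so the connection is closed even when the stream fails partway through. How the real `ProductionVLLMClient` behaves on errors is not documented here, so treat this as a pattern rather than a description of the SDK.

```python
import asyncio
from typing import AsyncIterator


class StubClient:
    """Hypothetical stand-in; method names are copied from the Quick Start."""

    def __init__(self) -> None:
        self.closed = False

    async def initialize_vllm_connection(self) -> None:
        self.closed = False

    async def send_inference_request(self, payload: dict) -> AsyncIterator[str]:
        yield "partial "
        raise ConnectionError("simulated mid-stream failure")

    async def close_vllm_connection(self) -> None:
        self.closed = True


async def stream_safely(client: StubClient, payload: dict) -> str:
    """Return whatever text arrived and always close the client afterwards."""
    received = []
    await client.initialize_vllm_connection()
    try:
        async for token in client.send_inference_request(payload):
            received.append(token)
    except ConnectionError:
        pass  # a real caller would log or retry here instead of ignoring the error
    finally:
        await client.close_vllm_connection()
    return "".join(received)


client = StubClient()
text = asyncio.run(stream_safely(client, {"model": "LocalModel", "messages": []}))
print(repr(text), client.closed)  # prints: 'partial ' True
```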
File details
Details for the file open_vllm_sdk-1.0.0.tar.gz.
File metadata
- Download URL: open_vllm_sdk-1.0.0.tar.gz
- Upload date:
- Size: 10.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.8.22
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 72e92cd38d08dfe27a6a56aa0360c480938d4d17e68c79b2bd7453914554cc2d |
| MD5 | db8ec080a2dba44183bfbf236024ebca |
| BLAKE2b-256 | b85e5c54bcd390a34c898982cdad874abd75bd4e8d0d86f3ca130da92dceb1e7 |
File details
Details for the file open_vllm_sdk-1.0.0-py3-none-any.whl.
File metadata
- Download URL: open_vllm_sdk-1.0.0-py3-none-any.whl
- Upload date:
- Size: 12.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.8.22
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | ae8794e3c76b17feb2a73f799ac057858889d012f0e06ca742e872dbb80dc8f8 |
| MD5 | 7731b0a6c06b01f842e1336c001e0bb3 |
| BLAKE2b-256 | 40dca9abc134872df65a75e09c16495a0c4f36bada330b7b5303a2d77a7b2413 |