
A resilient vLLM client with connection pooling and preflight validation.


vllm-sdk

A high-performance, asynchronous resilience gateway client built to connect distributed application services to remote GPU infrastructure safely and efficiently.


Table of Contents

  1. Overview
  2. Key Architecture Benefits
  3. Installation
  4. Environment Configuration
  5. Quick Start Usage
  6. Console Logging Aesthetics
  7. License

Overview

Driving raw HTTP streaming connections straight to a high-throughput LLM cluster can cause serious stability problems, such as socket exhaustion, out-of-memory crashes, or lost responses.

vllm-sdk wraps this networking in a single production-oriented interface. It manages connections in the background, strips the framing from raw Server-Sent Event (SSE) blocks, and yields ready-to-use text tokens to your application in real time.
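To make the SSE cleanup step concrete, here is a minimal sketch of turning raw stream lines into text tokens. It assumes the server emits OpenAI-style chat completion chunks (`data: {...}` lines ending with `data: [DONE]`); the helper name `extract_tokens` is hypothetical and is not part of the SDK's documented API.

```python
import json
from typing import Iterable, Iterator


def extract_tokens(sse_lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from OpenAI-style SSE lines (hypothetical helper)."""
    for raw in sse_lines:
        line = raw.strip()
        # Skip blank separators, SSE comments (": ...") and non-data fields.
        if not line.startswith("data:"):
            continue
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":  # assumed end-of-stream sentinel
            return
        try:
            chunk = json.loads(payload)
        except json.JSONDecodeError:
            continue  # ignore malformed frames instead of aborting the stream
        if not isinstance(chunk, dict):
            continue
        for choice in chunk.get("choices", []):
            text = (choice.get("delta") or {}).get("content")
            if text:
                yield text


sample = [
    'data: {"choices": [{"delta": {"role": "assistant"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": "Hello"}}]}',
    ": keep-alive",
    'data: {"choices": [{"delta": {"content": ", world"}}]}',
    "data: [DONE]",
]
print("".join(extract_tokens(sample)))  # prints: Hello, world
```

The parser drops anything that is not a well-formed data frame, so keep-alive comments and role-only chunks never reach the caller.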

Key Architecture Benefits

  • Asynchronous Concurrency: Built on Python's asyncio event loop. It is designed to serve 100+ concurrent app instances (such as chatbots, BOM parsers, and tender text extractors) without one blocking another.
  • Keep-Alive Connection Pooling: Reuses open TCP connections through httpx.AsyncClient instead of opening a new socket for every request, which reduces Time-To-First-Token (TTFT).
  • Pre-flight Integrity Checking: Validates system paths and environment flags before startup so that configuration errors surface early instead of crashing downstream.
  • Localized Brand Logging: Provides easy-to-scan terminal output modeled on modern web frameworks such as FastAPI and Uvicorn.
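As an illustration of the pre-flight idea, the sketch below checks that required environment variables are set and that a log directory is writable before anything else starts. It uses only the standard library. The function name `preflight_problems` and the variable `VLLM_BASE_URL` are assumptions made for this example, not names taken from the SDK.

```python
import os
import tempfile
from pathlib import Path
from typing import List, Mapping, Sequence


def preflight_problems(
    required_env: Sequence[str],
    log_dir: Path,
    env: Mapping[str, str] = os.environ,
) -> List[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    problems: List[str] = []
    for name in required_env:
        if not env.get(name, "").strip():
            problems.append(f"missing environment variable: {name}")
    try:
        log_dir.mkdir(parents=True, exist_ok=True)
        # Creating and discarding a temp file proves the directory is writable.
        with tempfile.TemporaryFile(dir=log_dir):
            pass
    except OSError as exc:
        problems.append(f"log directory not writable: {log_dir} ({exc})")
    return problems


with tempfile.TemporaryDirectory() as tmp:
    # VLLM_BASE_URL is a hypothetical setting used only for this demo.
    issues = preflight_problems(["VLLM_BASE_URL"], Path(tmp) / "logs", env={})
    print(issues)  # prints: ['missing environment variable: VLLM_BASE_URL']
```

Collecting every problem into a list, rather than raising on the first one, lets a single startup run report all misconfigurations at once.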

Installation

This project is managed with the uv Python package manager.

# Clone the repository
git clone [inprogress](inprogress)
cd vllm_sdk

# Sync dependencies and create the virtual environment automatically
uv sync

Quick Start Usage

# -*- coding: utf-8 -*-
import asyncio
from src.vllm_resilience_sdk.logging_config import setup_sdk_logging
from src.vllm_resilience_sdk import SystemInitializationEngine
from src.vllm_resilience_sdk.clients import ProductionVLLMClient

async def main():
    # 1. Initialize high-visibility console logging
    setup_sdk_logging()
    
    # 2. Run background system verification checks
    verifier = SystemInitializationEngine(target_log_dir="./logs")
    verifier.run_pre_boot_pipeline()
    
    # 3. Instantiate the connection pooling client
    client = ProductionVLLMClient()
    await client.initialize_vllm_connection()
    
    # 4. Construct workload payload
    sample_payload = {
        "model": "LocalModel",
        "messages": [{"role": "user", "content": "Analyze PCB raw schematics metadata."}]
    }
    
    print("\n--- AI Engine Stream Output Response ---")

    try:
        # 5. Stream processed text tokens as they arrive
        async for token in client.send_inference_request(sample_payload):
            print(token, end="", flush=True)
        print("\n----------------------------------------\n")
    finally:
        # 6. Close pooled connections on teardown, even if streaming fails
        await client.close_vllm_connection()

if __name__ == "__main__":
    asyncio.run(main())
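
The concurrency claim above can be pictured with a small, self-contained sketch that runs many streaming requests on one event loop. It does not call the SDK: `fake_stream` is a stand-in for `client.send_inference_request`, and the semaphore cap `max_in_flight` is an assumption added for illustration.

```python
import asyncio
from typing import AsyncIterator, Dict, List


async def fake_stream(prompt: str) -> AsyncIterator[str]:
    """Stand-in for a streaming inference call (not the real SDK method)."""
    for word in f"echo: {prompt}".split():
        await asyncio.sleep(0)  # hand control back, as a network read would
        yield word + " "


async def collect(prompt: str, limit: asyncio.Semaphore) -> str:
    # The semaphore bounds how many streams are open at the same time.
    async with limit:
        return "".join([tok async for tok in fake_stream(prompt)]).strip()


async def run_all(prompts: List[str], max_in_flight: int = 10) -> Dict[str, str]:
    limit = asyncio.Semaphore(max_in_flight)
    results = await asyncio.gather(*(collect(p, limit) for p in prompts))
    return dict(zip(prompts, results))


answers = asyncio.run(run_all([f"job {i}" for i in range(100)]))
print(len(answers), answers["job 7"])  # prints: 100 echo: job 7
```

With a real client, the same pattern would share one pooled client instance across all tasks, so that the 100 jobs reuse connections instead of each opening its own.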
