High-Performance Middleware Engine for LLMs

Velox-Core ⚡

The High-Performance Middleware Engine for LLMs

Velox is not just another API wrapper. It is an engine designed for professional AI developers who need control, observability, and resilience.

🌟 Why Velox?

Building production-grade AI applications is more than just sending a prompt to OpenAI. It requires handling complex infrastructure challenges that often clutter business logic. We built Velox to solve these core pain points:

  1. The "Spaghetti Pipeline" Problem: Most developers manually chain retries, logs, and caches. Velox uses a Middleware Architecture (inspired by Express.js/FastAPI) to keep your code clean and modular.
  2. Uncontrolled Costs: High-end models like GPT-4 are expensive. Velox's Semantic Router automatically routes trivial prompts to cheaper models, while the Cost Optimizer stops recursive agents from draining your budget.
  3. Privacy & Compliance: Sending user data to third-party LLMs is a risk. PII Guard ensures sensitive data (emails, phones, credit cards) is redacted before it leaves your infrastructure.
  4. Observability Vacuum: Standard SDKs don't tell you why a request failed or how much it actually cost in real time. Velox's Advanced Dashboard provides a "Hacker-style" TUI for live telemetry.
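
To make the middleware idea from point 1 concrete, here is a minimal, self-contained sketch of the general pattern: each layer receives the prompt plus a `call_next` handler and decides what to do before and after calling it. The names `build_pipeline`, `logging_layer`, `uppercase_layer`, and `echo_provider` are hypothetical and chosen for illustration; this is not Velox's internal implementation.

```python
import asyncio
from typing import Awaitable, Callable, List

# Hypothetical types for illustration; not Velox's real interfaces.
Handler = Callable[[str], Awaitable[str]]
Middleware = Callable[[str, Handler], Awaitable[str]]


def build_pipeline(layers: List[Middleware], provider: Handler) -> Handler:
    """Wrap the provider so that layers[0] runs outermost."""
    handler = provider
    for layer in reversed(layers):
        # Bind current values as defaults to avoid late-binding closures.
        async def wrapped(prompt: str, _layer=layer, _next=handler) -> str:
            return await _layer(prompt, _next)
        handler = wrapped
    return handler


async def logging_layer(prompt: str, call_next: Handler) -> str:
    # Runs code both before and after the rest of the chain.
    print(f"-> {prompt!r}")
    response = await call_next(prompt)
    print(f"<- {response!r}")
    return response


async def uppercase_layer(prompt: str, call_next: Handler) -> str:
    # Transforms the prompt before handing it to the next layer.
    return await call_next(prompt.upper())


async def echo_provider(prompt: str) -> str:
    # Stand-in for a real LLM call.
    return f"echo: {prompt}"


pipeline = build_pipeline([logging_layer, uppercase_layer], echo_provider)
result = asyncio.run(pipeline("status?"))
```

Because every layer has the same shape, retries, caching, or redaction can be added or reordered without touching the code that sends the prompt.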

🆕 New in v0.1.0

We've just supercharged Velox with these powerful additions:

  • Multi-Provider Support: Seamlessly switch between OpenAI, Claude (Anthropic), Gemini (Google), Mistral, and Groq.
  • Semantic Routing: Native intelligence to route simple prompts to cheaper models automatically.
  • PII Guard: Built-in regex engine to redact sensitive information before it hits the cloud.
  • Cost Optimizer: Strict USD budget enforcement at the middleware level.
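
As a rough illustration of regex-based redaction, the sketch below replaces emails, card-like digit runs, and phone numbers with placeholders. The patterns and the `redact` helper are assumptions made for this example; they are deliberately simple and are not the patterns PII Guard ships with.

```python
import re

# Illustrative patterns only; real PII detection needs broader coverage.
# Order matters: card numbers are checked before the looser phone pattern.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "CARD": re.compile(r"\b\d(?:[ -]?\d){12,15}\b"),
    "PHONE": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}


def redact(text: str) -> str:
    """Replace each match with a bracketed label such as [EMAIL]."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text


print(redact("My email is ceo@company.com, summarize this project."))
```

Running redaction as the first layer means later layers (routing, logging, the provider itself) only ever see the masked text.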

📦 Installation

# Clone the repository
git clone https://github.com/your-repo/velox-core
cd velox-core

# Install in development mode
pip install -e .

🚀 Usage Guide

Velox is designed to be progressive. Start simple, then stack power.

1. The Fundamental Motor

The simplest way to get started. Just a provider and a prompt.

import asyncio
from velox.core.engine import Velox
from velox.providers.mock import MockProvider

async def main():
    motor = Velox()
    motor.use(MockProvider(response_text="Ignition successful."))
    
    response = await motor.prompt("Status?")
    print(f"Velox: {response}")

if __name__ == "__main__":
    asyncio.run(main())

2. The Production Stack (Security & Speed)

This is how you use Velox in a real app: layered with security and optimization.

import asyncio

from velox.core.engine import Velox
from velox.providers.openai import OpenAIProvider
from velox.layers import (
    AdvancedDashboardLayer,
    PIIGuardLayer,
    SemanticRouterLayer,
    CostOptimizerLayer,
)

async def main():
    motor = Velox()

    # Order matters: Redact -> Route -> Limit -> Log
    motor.add(PIIGuardLayer())               # 1. Zero-trust PII masking
    motor.add(SemanticRouterLayer(           # 2. Fast-path trivial queries
        simple_model="gpt-3.5-turbo",
        threshold_chars=40
    ))
    motor.add(CostOptimizerLayer(            # 3. Budget safety net
        max_cost_usd=0.05,
        per_request=True
    ))
    motor.add(AdvancedDashboardLayer())      # 4. Live telemetry UI

    motor.use(OpenAIProvider(model="gpt-4o"))

    # The email address is redacted before the prompt reaches gpt-4o.
    # A trivial message such as "Hi" is routed to gpt-3.5-turbo instead.
    await motor.prompt("My email is ceo@company.com, summarize this project.")

if __name__ == "__main__":
    asyncio.run(main())
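
The routing and budget parameters above can be read as two small decisions made per request. The sketch below shows one plausible interpretation, assuming that `threshold_chars` is compared against the prompt length and that cost is estimated from a token count and a per-1k-token price. The helpers `pick_model` and `check_budget`, the `BudgetExceeded` exception, and the price used are hypothetical; the actual layers may decide differently.

```python
class BudgetExceeded(RuntimeError):
    """Raised when an estimated request cost is above the allowed limit."""


def pick_model(prompt: str, default_model: str = "gpt-4o",
               simple_model: str = "gpt-3.5-turbo",
               threshold_chars: int = 40) -> str:
    # Assumption: prompts shorter than the threshold count as "trivial".
    if len(prompt.strip()) < threshold_chars:
        return simple_model
    return default_model


def check_budget(estimated_tokens: int, usd_per_1k_tokens: float,
                 max_cost_usd: float = 0.05) -> float:
    # Assumption: cost scales linearly with the estimated token count.
    cost = estimated_tokens / 1000 * usd_per_1k_tokens
    if cost > max_cost_usd:
        raise BudgetExceeded(f"estimated ${cost:.4f} exceeds ${max_cost_usd:.4f}")
    return cost


print(pick_model("Hi"))
print(pick_model("My email is ceo@company.com, summarize this project."))
print(check_budget(estimated_tokens=2000, usd_per_1k_tokens=0.01))  # placeholder price
```

Placing the budget check after routing means the limit is evaluated against the model that will actually serve the request.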

3. Agentic & Shadow Mode

Test new models in the background without affecting users.

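from velox.layers.shadow import ShadowLayer

# Assumes `motor` is an existing Velox instance and that `llama_provider`
# and `gpt4_provider` are provider objects created elsewhere.
# The main model handles the user request.
# The shadow model (Llama-3) runs in parallel;
# its results are logged for performance comparison.
motor.add(ShadowLayer(llama_provider, name="Llama-Testing"))
motor.use(gpt4_provider)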
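
For readers curious how a shadow call can run without affecting the user, here is a generic asyncio sketch of the idea: the shadow request is started as a background task, the primary response is returned as soon as it is ready, and shadow failures are logged instead of raised. `call_with_shadow` and the two stand-in providers are hypothetical names; this is not how `ShadowLayer` is necessarily implemented.

```python
import asyncio
from typing import Awaitable, Callable, List, Tuple

Provider = Callable[[str], Awaitable[str]]


async def call_with_shadow(prompt: str, primary: Provider, shadow: Provider,
                           log: List[Tuple[str, str]]):
    async def run_shadow() -> None:
        try:
            log.append(("shadow", await shadow(prompt)))
        except Exception as exc:
            # A failing shadow model must never break the user-facing call.
            log.append(("shadow_error", repr(exc)))

    # Start the shadow call in the background, then serve the user.
    shadow_task = asyncio.create_task(run_shadow())
    response = await primary(prompt)
    log.append(("primary", response))
    # Return the task so the caller can keep a reference and await it later.
    return response, shadow_task


async def primary_model(prompt: str) -> str:
    return "primary answer"


async def shadow_model(prompt: str) -> str:
    await asyncio.sleep(0.01)  # simulate a slower candidate model
    return "shadow answer"


async def demo():
    log: List[Tuple[str, str]] = []
    response, task = await call_with_shadow("Status?", primary_model, shadow_model, log)
    await task  # only so the demo can inspect the complete log
    return response, log


response, log = asyncio.run(demo())
```

In a long-running service the shadow task would normally be left to finish on its own, with the log entries compared offline.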

🧩 Supported Providers

Velox is provider-agnostic. Use any of the major providers through a consistent interface:

| Provider  | Class                | Notes                            |
|-----------|----------------------|----------------------------------|
| OpenAI    | OpenAIProvider       | Native SDK integration           |
| Anthropic | AnthropicProvider    | High-speed httpx implementation  |
| Google    | GoogleGeminiProvider | Official Gemini SDK support      |
| Mistral   | MistralProvider      | OpenAI-compatible via httpx      |
| Groq      | GroqProvider         | Ultra-fast inference via httpx   |

🧩 Middleware Layers Table

| Layer                  | Purpose       | Key Feature                                                 |
|------------------------|---------------|-------------------------------------------------------------|
| PIIGuardLayer          | Data Privacy  | Automatic regex-based PII redaction (email, phone, etc.)    |
| SemanticRouterLayer    | Intelligence  | Routes prompts to different models based on complexity      |
| CostOptimizerLayer     | Safety        | Enforces budget limits in USD to prevent overspending       |
| AdvancedDashboardLayer | Observability | Live "Hacker-style" TUI with metrics and telemetry          |
| AutoToolingLayer       | Agentic       | Simplifies function calling and tool orchestration          |
| ShadowLayer            | Testing       | Runs background models for A/B performance comparison       |
| RetryLayer             | Resilience    | Configurable exponential backoff for API failures           |
| CacheLayer             | Efficiency    | Exact and semantic caching to save time and tokens          |
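
The exponential backoff mentioned for RetryLayer follows a well-known pattern, sketched below in plain asyncio. The function `retry_with_backoff`, its parameters, and the small jitter term are assumptions for this example and do not describe RetryLayer's actual configuration options.

```python
import asyncio
import random
from typing import Awaitable, Callable


async def retry_with_backoff(call: Callable[[], Awaitable[str]],
                             max_attempts: int = 4,
                             base_delay: float = 0.5,
                             max_delay: float = 8.0) -> str:
    for attempt in range(1, max_attempts + 1):
        try:
            return await call()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            # Delay doubles each attempt (base, 2x, 4x, ...) up to max_delay.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            # A little random jitter keeps clients from retrying in lockstep.
            await asyncio.sleep(delay + random.uniform(0, delay * 0.1))
    raise ValueError("max_attempts must be at least 1")


calls = {"count": 0}


async def flaky_call() -> str:
    # Fails twice, then succeeds, to simulate a transient API error.
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("transient failure")
    return "ok"


result = asyncio.run(retry_with_backoff(flaky_call, base_delay=0.01))
```

In practice the `except` clause would usually be narrowed to errors that are worth retrying, such as timeouts or rate limits.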

👨‍💻 Author

Zied Boughdir

📄 License

MIT


