A lightweight LLM API proxy framework with network connectivity and usage analysis

LessLLM v0.2.0

🚀 A lightweight, enterprise-grade LLM API proxy framework with comprehensive analytics and intelligent routing

Python 3.10+ | License: MIT

Do more with less code/gpu/mem

✨ Key Features

🎯 Intelligent API Routing

  • Smart Format Conversion: Converts requests and responses between the Claude Messages API and the OpenAI Chat Completions API
  • Provider Transparency: Use any API format with any provider (Claude ↔ OpenAI)
  • Automatic Routing: Based on model names and endpoint types
  • Streaming Support: Real-time format conversion for streaming requests
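To make the format conversion concrete, here is a minimal sketch of one direction: turning an OpenAI Chat Completions request body into a Claude Messages request body. It is not LessLLM's actual converter. The function name openai_to_claude_request and the default_max_tokens fallback are assumptions made for illustration, and only plain-text message content is handled.

```python
from typing import Any


def openai_to_claude_request(
    body: dict[str, Any], default_max_tokens: int = 1024
) -> dict[str, Any]:
    """Convert an OpenAI Chat Completions body to the Claude Messages shape.

    Hypothetical helper for illustration; not part of LessLLM's API.
    """
    system_parts: list[str] = []
    messages: list[dict[str, Any]] = []
    for msg in body.get("messages", []):
        if msg["role"] == "system":
            # Claude takes the system prompt as a top-level field,
            # not as an entry in the messages list.
            system_parts.append(msg["content"])
        else:
            messages.append({"role": msg["role"], "content": msg["content"]})

    converted: dict[str, Any] = {
        "model": body["model"],
        "messages": messages,
        # max_tokens is required by the Messages API but optional in
        # Chat Completions, so fall back to an assumed default.
        "max_tokens": body.get("max_tokens", default_max_tokens),
    }
    if system_parts:
        converted["system"] = "\n\n".join(system_parts)
    # Pass through sampling and streaming options that share a name.
    for key in ("temperature", "top_p", "stream"):
        if key in body:
            converted[key] = body[key]
    return converted
```

A real converter also has to map responses back, translate tool calls and multimodal content, and rewrite streaming events, none of which this sketch attempts.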

📊 Enterprise Analytics

  • 100% Data Capture: Complete HTTP request/response logging to DuckDB
  • Performance Monitoring: TTFT, TPOT, throughput analysis
  • Cost Tracking: Token usage and cost estimation per request
  • Cache Analytics: Cache hit rates and savings calculation
  • Interactive Dashboard: Web-based analytics with real-time data
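The performance metrics above can be derived from three timestamps and a token count. The sketch below uses common definitions: TTFT is the delay until the first token, TPOT is the average gap between subsequent output tokens, and throughput is output tokens divided by total request time. The StreamTiming class and function names are assumptions, and LessLLM's own formulas may differ.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StreamTiming:
    """Timestamps (in seconds) recorded for one streamed response."""
    request_start: float   # when the request was sent
    first_token_at: float  # when the first output token arrived
    last_token_at: float   # when the final output token arrived
    output_tokens: int     # number of output tokens received


def ttft_ms(t: StreamTiming) -> float:
    """Time to first token, in milliseconds."""
    return (t.first_token_at - t.request_start) * 1000


def tpot_ms(t: StreamTiming) -> Optional[float]:
    """Average time per output token after the first, in milliseconds."""
    if t.output_tokens < 2:
        return None  # undefined with fewer than two tokens
    return (t.last_token_at - t.first_token_at) * 1000 / (t.output_tokens - 1)


def throughput_tps(t: StreamTiming) -> Optional[float]:
    """Output tokens per second over the whole request."""
    total = t.last_token_at - t.request_start
    return t.output_tokens / total if total > 0 else None
```

For example, a response whose first token arrives after 0.5 s and whose 101st and last token arrives at 2.5 s has a TTFT of 500 ms and a TPOT of 20 ms.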

🔧 Multi-Provider Support

  • Claude (Anthropic): Direct Messages API support + Aliyun proxy compatibility
  • OpenAI: Chat Completions API with full feature parity
  • Extensible: Easy to add new providers
  • Load Balancing: Intelligent provider selection
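As an illustration of routing by model name, the sketch below picks a provider from a model-name prefix. The prefix table and the pick_provider function are hypothetical; the provider names match the keys used in lessllm.yaml, but the real routing rules are not documented here.

```python
from typing import Optional

# Hypothetical prefix table; the real routing rules may differ.
PREFIX_ROUTES: tuple[tuple[str, str], ...] = (
    ("gpt-", "openai"),
    ("claude-", "claude"),
)


def pick_provider(model: str, default: Optional[str] = None) -> str:
    """Return the provider name for a model, based on its name prefix."""
    name = model.lower()
    for prefix, provider in PREFIX_ROUTES:
        if name.startswith(prefix):
            return provider
    if default is not None:
        return default
    raise ValueError(f"no provider configured for model {model!r}")
```

A table of prefixes keeps the rule set easy to extend when a new provider is added; a production router would likely also consider the endpoint type and provider health.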

🌐 Web Interface

  • Real-time Monitoring: Live request tracking and analysis
  • Interactive Tables: Click-to-view detailed request information
  • Data Visualization: Charts and graphs for usage patterns
  • SQL Query Interface: Custom analytics with pre-built templates

Quick Start

Installation

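# Install the released package from PyPI
pip install lessllm

# Or install in editable mode from a source checkout
pip install -e .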

Initialize Configuration

lessllm init --output lessllm.yaml

Start the Proxy Server

# Set your API keys
export OPENAI_API_KEY="your-openai-key"
export ANTHROPIC_API_KEY="your-claude-key"

# Start server
lessllm server --config lessllm.yaml --port 8000

Start the Analytics Dashboard GUI

# Install the optional GUI dependencies (quoted so shells such as zsh do not expand the brackets)
pip install -e ".[gui]"

# Start the dashboard
lessllm gui --port 8501 --host localhost

Use with OpenAI Client

import openai

# Point OpenAI client to LessLLM proxy
client = openai.OpenAI(
    base_url="http://localhost:8000/v1",
    api_key="dummy"  # Placeholder; LessLLM uses the keys from its own configuration
)

response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Hello!"}]
)
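Since streaming requests are supported, the same client can also consume a response incrementally. The helper below is a sketch that assumes the proxy relays standard Chat Completions stream chunks; stream_reply is a hypothetical name, and it accepts any object exposing the OpenAI client interface, such as the client created above.

```python
def stream_reply(client, prompt: str, model: str = "gpt-4") -> str:
    """Stream a chat completion, print it as it arrives, and return the text."""
    stream = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        stream=True,  # ask for incremental chunks instead of one response
    )
    parts: list[str] = []
    for chunk in stream:
        if not chunk.choices:
            continue  # some chunks carry no choices (e.g. usage-only chunks)
        delta = chunk.choices[0].delta.content
        if delta:  # content is None on role-only and final chunks
            parts.append(delta)
            print(delta, end="", flush=True)
    print()
    return "".join(parts)
```

Calling stream_reply(client, "Hello!") with the client from the example above prints the reply as it arrives and returns the full text.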

Configuration

Create a lessllm.yaml configuration file by hand, or edit the one generated by lessllm init:

proxy:
  socks_proxy: "socks5://127.0.0.1:1080"  # Optional
  timeout: 30

providers:
  openai:
    api_key: "${OPENAI_API_KEY}"
  claude:
    api_key: "${ANTHROPIC_API_KEY}"

logging:
  enabled: true
  storage:
    type: "duckdb"
    db_path: "./lessllm_logs.db"

analysis:
  enable_cache_estimation: true
  enable_performance_tracking: true
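The ${OPENAI_API_KEY} and ${ANTHROPIC_API_KEY} placeholders suggest that values are filled in from environment variables when the file is loaded. How LessLLM resolves them is not described here, so the following is only a sketch of one way to expand such placeholders in an already-parsed configuration; expand_env is a hypothetical helper.

```python
import os
import re
from typing import Any, Mapping, Optional

# Matches ${NAME} where NAME is a valid environment variable name.
_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_env(value: Any, env: Optional[Mapping[str, str]] = None) -> Any:
    """Recursively replace ${NAME} placeholders in parsed config data."""
    env = os.environ if env is None else env
    if isinstance(value, dict):
        return {key: expand_env(item, env) for key, item in value.items()}
    if isinstance(value, list):
        return [expand_env(item, env) for item in value]
    if isinstance(value, str):
        def _substitute(match: re.Match) -> str:
            name = match.group(1)
            if name not in env:
                # Fail loudly rather than send an empty API key upstream.
                raise KeyError(f"environment variable {name} is not set")
            return env[name]
        return _PLACEHOLDER.sub(_substitute, value)
    return value  # numbers, booleans, and None pass through unchanged
```

Raising on a missing variable surfaces configuration mistakes at startup instead of as authentication errors on the first request.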

Project Status

Current Implementation Status:

✅ Core Framework

  • ✅ Configuration management system
  • ✅ FastAPI proxy server
  • ✅ Network proxy support (HTTP/SOCKS)
  • ✅ Provider abstraction layer

✅ API Support

  • ✅ OpenAI provider implementation
  • ✅ Claude provider implementation
  • ✅ Streaming and non-streaming support

✅ Logging & Analysis

  • ✅ DuckDB storage system
  • ✅ Performance tracking (TTFT/TPOT)
  • ✅ Cache estimation algorithms
  • ✅ Cost calculation utilities

✅ Developer Tools

  • ✅ Command-line interface (CLI)
  • ✅ Example configurations
  • ✅ Test client scripts
  • ✅ Analytics dashboard GUI

Next Steps

🔄 In Progress:

  • Unit and integration tests
  • Documentation improvements
  • Performance optimizations

📋 Planned:

  • More LLM provider integrations
  • Advanced caching strategies
  • Batch request optimization

License

MIT License
