OpenAI HTTP Proxy is an OpenAI-compatible HTTP proxy server for LLM inference that works with the Google, Anthropic, and OpenAI APIs, local PyTorch inference, and more.

OpenAI HTTP Proxy

Lightweight, OpenAI-compatible HTTP proxy server / gateway
unifying access to multiple Large Language Model providers and local inference
through a single, standardized API endpoint.

Built with Python, FastAPI and MicroCore, OpenAI HTTP Proxy seamlessly integrates cloud providers like Google, Anthropic, and OpenAI, as well as local PyTorch-based inference, while maintaining full compatibility with OpenAI's API format.

It works as a drop-in replacement for OpenAI's API, allowing you to switch between cloud providers and local models without modifying your existing client code.

OpenAI HTTP Proxy supports real-time token streaming, secure Virtual API key management, and can be used both as an importable Python library and as a standalone HTTP service. Whether you're building production applications or experimenting with different models, OpenAI HTTP Proxy eliminates integration complexity and keeps your codebase provider-agnostic.

Overview
Features
Getting Started
- Installation
- Quick Start
Configuration
- Basic Structure
- Environment Variables
Proxy API Keys vs. Provider API Keys
API Usage
- Chat Completions Endpoint
- Models List Endpoint
User Groups Configuration
Advanced Usage
Add-on Components
- Database Connector
Request Handlers (Middleware)
Guides & Reference
Known Limitations
Debugging
Contributing
License

✨ Features

Provider Agnostic: Connect to OpenAI, Anthropic, Google AI, local models, and more using a single API
Unified Interface: Access all models through the standard OpenAI API format
Dynamic Routing: Route requests to different LLM providers based on model name patterns
Stream Support: Full streaming support for real-time responses
API Key Management: Configurable API key validation and access control
Easy Configuration: Simple TOML/YAML/JSON/Python configuration files for setup
Extensible by Design: Minimal core with clearly defined extension points, enabling seamless customization and expansion without modifying the core system.

🚀 Getting Started

Requirements

Python 3.11 | 3.12 | 3.13

Installation

pip install openai-http-proxy

For proxying to the Anthropic API or to Google Gemini (via Vertex AI or Google AI Studio), install the optional dependencies:

pip install openai-http-proxy[anthropic,google]

Or install all optional dependencies at once:

pip install openai-http-proxy[all]

Quick Start

1. Create a `config.toml` file:

host = "0.0.0.0"
port = 8000

[connections]
[connections.openai]
api_type = "open_ai"
api_base = "https://api.openai.com/v1/"
api_key = "env:OPENAI_API_KEY"

[connections.anthropic]
api_type = "anthropic"
api_key = "env:ANTHROPIC_API_KEY"

[routing]
"gpt*" = "openai.*"
"claude*" = "anthropic.*"
"*" = "openai.gpt-3.5-turbo"

[groups.default]
api_keys = ["YOUR_API_KEY_HERE"]

Note ℹ️ To enhance security, consider storing upstream API keys in operating system environment variables rather than embedding them directly in the configuration file. You can reference these variables in the configuration using the env:<VAR_NAME> syntax.

2. Start the server:

openai-http-proxy

Alternatively, run it as a Python module:

python -m lm_proxy

3. Use it with any OpenAI-compatible client:

from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY_HERE",
    base_url="http://localhost:8000/v1"
)

completion = client.chat.completions.create(
    model="gpt-5",  # This will be routed to OpenAI based on config
    messages=[{"role": "user", "content": "Hello, world!"}]
)
print(completion.choices[0].message.content)

Or use the same endpoint with Claude models:

completion = client.chat.completions.create(
    model="claude-opus-4-1-20250805",  # This will be routed to Anthropic based on config
    messages=[{"role": "user", "content": "Hello, world!"}]
)

📝 Configuration

OpenAI HTTP Proxy is configured through a TOML/YAML/JSON/Python file that specifies connections, routing rules, and access control.

Basic Structure

host = "0.0.0.0"  # Interface to bind to
port = 8000       # Port to listen on
dev_autoreload = false  # Enable for development

# API key validation function (optional)
api_key_check = "lm_proxy.api_key_check.check_api_key_in_config"

# LLM Provider Connections
[connections]

[connections.openai]
api_type = "open_ai"
api_base = "https://api.openai.com/v1/"
api_key = "env:OPENAI_API_KEY"

[connections.google]
api_type = "google"
api_key = "env:GOOGLE_API_KEY"

[connections.anthropic]
api_type = "anthropic"
api_key  = "env:ANTHROPIC_API_KEY"

# Routing rules (model_pattern = "connection.model")
[routing]
"gpt*" = "openai.*"     # Route all GPT models to OpenAI
"claude*" = "anthropic.*"  # Route all Claude models to Anthropic
"gemini*" = "google.*"  # Route all Gemini models to Google
"*" = "openai.gpt-3.5-turbo"  # Default fallback

# Access control groups
[groups.default]
api_keys = [
    "KEY1",
    "KEY2"
]

# optional
[[loggers]]
class = 'lm_proxy.loggers.BaseLogger'
[loggers.log_writer]
class = 'lm_proxy.loggers.log_writers.JsonLogWriter'
file_name = 'storage/json.log'
[loggers.entry_transformer]
class = 'lm_proxy.loggers.LogEntryTransformer'
completion_tokens = "response.usage.completion_tokens"
prompt_tokens = "response.usage.prompt_tokens"
prompt = "request.messages"
response = "response"
group = "group"
connection = "connection"
api_key_id = "api_key_id"
remote_addr = "remote_addr"
created_at = "created_at"
duration = "duration"
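
The `entry_transformer` keys above map output field names to dotted paths into the request and response data. A minimal sketch of how such a mapping could be resolved, assuming the data is available as plain nested dictionaries; `resolve_path` and `transform_entry` are hypothetical helpers for illustration, not part of lm_proxy:

```python
from typing import Any


def resolve_path(data: dict[str, Any], path: str) -> Any:
    """Follow a dotted path such as 'response.usage.prompt_tokens'."""
    current: Any = data
    for part in path.split("."):
        if not isinstance(current, dict) or part not in current:
            return None  # assumption: missing fields become None instead of raising
        current = current[part]
    return current


def transform_entry(mapping: dict[str, str], data: dict[str, Any]) -> dict[str, Any]:
    """Build a flat log entry from {output_field: dotted_path} pairs."""
    return {field: resolve_path(data, path) for field, path in mapping.items()}


# Example mapping taken from the configuration above.
mapping = {
    "completion_tokens": "response.usage.completion_tokens",
    "prompt_tokens": "response.usage.prompt_tokens",
    "group": "group",
}
sample = {
    "group": "default",
    "response": {"usage": {"completion_tokens": 7, "prompt_tokens": 12}},
}
entry = transform_entry(mapping, sample)
```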

Environment Variables

You can reference environment variables in your configuration file by prefixing values with env:.

For example:

[connections.openai]
api_key = "env:OPENAI_API_KEY"

At runtime, OpenAI HTTP Proxy automatically retrieves the value of the target variable (OPENAI_API_KEY) from your operating system's environment or from a .env file, if present.
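
As an illustration of the `env:` convention, the sketch below walks a parsed configuration and replaces `env:<VAR_NAME>` strings with values from the process environment. The helper name `resolve_env_refs` and the choice to raise on an unset variable are assumptions made for this example; the sketch does not load `.env` files and does not claim to mirror the proxy's internal implementation.

```python
import os
from typing import Any

ENV_PREFIX = "env:"


def resolve_env_refs(value: Any) -> Any:
    """Recursively replace 'env:NAME' strings with the value of os.environ['NAME']."""
    if isinstance(value, str) and value.startswith(ENV_PREFIX):
        name = value[len(ENV_PREFIX):]
        if name not in os.environ:
            # Assumption: fail loudly rather than silently using an empty key.
            raise KeyError(f"environment variable {name!r} is not set")
        return os.environ[name]
    if isinstance(value, dict):
        return {key: resolve_env_refs(item) for key, item in value.items()}
    if isinstance(value, list):
        return [resolve_env_refs(item) for item in value]
    return value  # numbers, booleans, and plain strings pass through unchanged
```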

.env Files

By default, OpenAI HTTP Proxy looks for a .env file in the current working directory and loads environment variables from it.

You can refer to the .env.template file for an example:

OPENAI_API_KEY=sk-u........
GOOGLE_API_KEY=AI........
ANTHROPIC_API_KEY=sk-ant-api03--vE........

# "1", "TRUE", "YES", "ON", "ENABLED", "Y", "+" are true, case-insensitive.
# See https://github.com/Nayjest/ai-microcore/blob/v4.4.3/microcore/configuration.py#L36
LM_PROXY_DEBUG=no

You can also control .env file usage with the --env command-line option:

# Use a custom .env file path
openai-http-proxy --env="path/to/your/.env"
# Disable .env loading
openai-http-proxy --env=""

🔑 Proxy API Keys vs. Provider API Keys

OpenAI HTTP Proxy utilizes two distinct types of API keys to facilitate secure and efficient request handling.

Proxy API Key (Virtual API Key, Client API Key): A unique key generated and managed within OpenAI HTTP Proxy.
- Clients use these keys to authenticate their requests to the proxy's API endpoints.
- Each Client API Key is associated with a specific group, which defines the scope of access and permissions for the client's requests.
- These keys allow users to securely interact with the proxy without direct access to external service credentials.

Provider API Key (Upstream API Key): A key provided by external LLM inference providers (e.g., OpenAI, Anthropic, Mistral, etc.) and configured within OpenAI HTTP Proxy.
- The proxy uses these keys to authenticate and forward validated client requests to the respective external services.
- Provider API Keys remain hidden from end users, ensuring secure and transparent communication with provider APIs.

This distinction ensures a clear separation of concerns: Virtual API Keys manage user authentication and access within the proxy, while Upstream API Keys handle secure communication with external providers.

🔌 API Usage

OpenAI HTTP Proxy implements the OpenAI chat completions API endpoint. You can use any OpenAI-compatible client to interact with it.

Chat Completions Endpoint

POST /v1/chat/completions

Request Format

{
  "model": "gpt-3.5-turbo",
  "messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is the capital of France?"}
  ],
  "temperature": 0.7,
  "stream": false
}

Response Format

{
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "The capital of France is Paris."
      },
      "finish_reason": "stop"
    }
  ]
}
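
For clients that do not use the `openai` package, the same request can be assembled with the Python standard library. This is a sketch: the helper names `build_chat_request` and `first_message` are made up for the example, and it assumes the proxy accepts the Proxy API Key in the `Authorization: Bearer` header, as OpenAI-compatible clients send it.

```python
import json
import urllib.request


def build_chat_request(base_url: str, api_key: str, model: str, prompt: str) -> urllib.request.Request:
    """Prepare a POST to <base_url>/chat/completions without sending it."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def first_message(response_body: str) -> str:
    """Extract the assistant text from a chat completions response body."""
    return json.loads(response_body)["choices"][0]["message"]["content"]


# Sending the request requires a running proxy, for example:
# request = build_chat_request("http://localhost:8000/v1", "YOUR_API_KEY_HERE",
#                              "gpt-3.5-turbo", "What is the capital of France?")
# with urllib.request.urlopen(request) as response:
#     print(first_message(response.read().decode("utf-8")))
```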

Models List Endpoint

List and describe all models available through the API.

GET /v1/models

The OpenAI HTTP Proxy dynamically builds the models list based on routing rules defined in config.routing.
Routing keys can reference both exact model names and model name patterns (e.g., "gpt*", "claude*", etc.).

By default, wildcard patterns are displayed as-is in the models list (e.g., "gpt*", "claude*").
This behavior can be customized via the model_listing_mode configuration option:

model_listing_mode = "as_is" | "ignore_wildcards" | "expand_wildcards"

Available modes:

as_is (default) — Lists all entries exactly as defined in the routing configuration, including wildcard patterns.
ignore_wildcards — Excludes wildcard patterns, showing only explicitly defined model names.
expand_wildcards — Expands wildcard patterns by querying each connected backend for available models (feature not yet implemented).

To obtain a complete and accurate model list in the current implementation, all supported models must be explicitly defined in the routing configuration, for example:

[routing]
"gpt-4" = "my_openai_connection.*"
"gpt-5" = "my_openai_connection.*"
"gpt-8" = "my_openai_connection.gpt-3.5-turbo"
"claude-4.5-sonnet" = "my_anthropic_connection.claude-sonnet-4-5-20250929"
"claude-4.1-opus" = "my_anthropic_connection.claude-opus-4-1-20250805"
[connections]
[connections.my_openai_connection]
api_type = "open_ai"
api_base = "https://api.openai.com/v1/"
api_key  = "env:OPENAI_API_KEY"
[connections.my_anthropic_connection]
api_type = "anthropic"
api_key  = "env:ANTHROPIC_API_KEY"

Response Format

{
  "object": "list",
  "data": [
    {
      "id": "gpt-6",
      "object": "model",
      "created": 1686935002,
      "owned_by": "organization-owner"
    },
    {
      "id": "claude-5-sonnet",
      "object": "model",
      "created": 1686935002,
      "owned_by": "organization-owner"
    }
  ]
}
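
The listing modes described above can be pictured with the following sketch, which derives a models list from the routing keys alone. The function `list_models` and the placeholder `created` and `owned_by` values are assumptions for illustration, not the proxy's actual code.

```python
import time

WILDCARD = "*"


def list_models(routing: dict[str, str], mode: str = "as_is") -> dict:
    """Build an OpenAI-style models list from routing keys."""
    if mode == "as_is":
        names = list(routing)  # keep every key, wildcard patterns included
    elif mode == "ignore_wildcards":
        names = [name for name in routing if WILDCARD not in name]
    elif mode == "expand_wildcards":
        # Would require querying each upstream provider for its models.
        raise NotImplementedError("expand_wildcards is not covered by this sketch")
    else:
        raise ValueError(f"unknown model_listing_mode: {mode!r}")
    created = int(time.time())  # placeholder timestamp
    return {
        "object": "list",
        "data": [
            {"id": name, "object": "model", "created": created, "owned_by": "organization-owner"}
            for name in names
        ],
    }
```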

🔒 User Groups Configuration

The [groups] section in the configuration defines access control rules for different user groups.
Each group can have its own set of virtual API keys and permitted connections.

Basic Group Definition

[groups.default]
api_keys = ["KEY1", "KEY2"]
allowed_connections = "*"  # Allow access to all connections

Group-based Access Control

You can create multiple groups to segment your users and control their access:

# Admin group with full access
[groups.admin]
api_keys = ["ADMIN_KEY_1", "ADMIN_KEY_2"]
allowed_connections = "*"  # Access to all connections

# Regular users with limited access
[groups.users]
api_keys = ["USER_KEY_1", "USER_KEY_2"]
allowed_connections = "openai,anthropic"  # Only allowed to use specific connections

# Free tier with minimal access
[groups.free]
api_keys = ["FREE_KEY_1", "FREE_KEY_2"]
allowed_connections = "openai"  # Only allowed to use OpenAI connection

Connection Restrictions

The allowed_connections parameter controls which upstream providers a group can access:

"*" - Group can use all configured connections
"openai,anthropic" - Comma-separated list of specific connections the group can use

This allows fine-grained control over which users can access which AI providers, enabling features like:

Restricting expensive models to premium users
Creating specialized access tiers for different user groups
Implementing usage quotas per group
Billing and cost allocation by user group
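
The `allowed_connections` rule reduces to a small membership check. A sketch, assuming the value is either `"*"` or a comma-separated string as shown above; `is_connection_allowed` is a hypothetical helper name:

```python
def is_connection_allowed(allowed_connections: str, connection: str) -> bool:
    """Return True if a group with this allowed_connections value may use the connection."""
    if allowed_connections.strip() == "*":
        return True  # wildcard: every configured connection is permitted
    allowed = {name.strip() for name in allowed_connections.split(",") if name.strip()}
    return connection in allowed
```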

Virtual API Key Validation

Overview

OpenAI HTTP Proxy includes two built-in methods for validating Virtual API keys:

lm_proxy.api_key_check.check_api_key_in_config - verifies API keys against those defined in the config file; used by default
lm_proxy.api_key_check.CheckAPIKeyWithRequest - validates API keys via an external HTTP service

The API key check method can be configured using the api_key_check configuration key.
Its value can be either a reference to a Python function in the format my_module.sub_module1.sub_module2.fn_name, or an object containing parameters for a class-based validator.

In the .py config representation, the validator function can be passed directly as a callable.

Example configuration for external API key validation using HTTP request to Keycloak / OpenID Connect

This example shows how to validate API keys against an external service (e.g., Keycloak):

[api_key_check]
class = "lm_proxy.api_key_check.CheckAPIKeyWithRequest"
method = "POST"
url = "http://keycloak:8080/realms/master/protocol/openid-connect/userinfo"
response_as_user_info = true  # interpret response JSON as user info object for further processing / logging
use_cache = true  # requires installing cachetools if True: pip install cachetools
cache_ttl = 60  # Cache duration in seconds

[api_key_check.headers]
Authorization = "Bearer {api_key}"

Custom API Key Validation / Extending functionality

For more advanced authentication needs, you can implement a custom validator function:

# my_validators.py
def validate_api_key(api_key: str) -> str | None:
    """
    Validate an API key and return the group name if valid.
    
    Args:
        api_key: The API key to validate
        
    Returns:
        The name of the group if valid, None otherwise
    """
    if api_key == "secret-key":
        return "admin"
    elif api_key.startswith("user-"):
        return "users"
    return None

Then reference it in your config:

api_key_check = "my_validators.validate_api_key"

NOTE In this case, the api_keys lists in groups are ignored, and the custom function is responsible for all validation logic.

🛠️ Advanced Usage

Dynamic Model Routing

The routing section allows flexible pattern matching with wildcards:

[routing]
"gpt-4*" = "openai.gpt-4"           # Route gpt-4 requests to OpenAI GPT-4
"gpt-3.5*" = "openai.gpt-3.5-turbo" # Route gpt-3.5 requests to OpenAI
"claude*" = "anthropic.*"           # Pass model name as-is to Anthropic
"gemini*" = "google.*"              # Pass model name as-is to Google
"custom*" = "local.llama-7b"        # Map any "custom*" to a specific local model
"*" = "openai.gpt-3.5-turbo"        # Default fallback for unmatched models

Keys are model name patterns (with * wildcard support), and values are connection/model mappings. Connection names reference those defined in the [connections] section.
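
To make the mapping concrete, here is a sketch of how a requested model name could be resolved against such a table. It assumes rules are tried in the order they are written and that the first match wins, and it uses `fnmatch`-style matching as a stand-in for the proxy's own pattern logic; `resolve_route` is a hypothetical helper.

```python
from fnmatch import fnmatchcase

ROUTING = {
    "gpt-4*": "openai.gpt-4",
    "gpt-3.5*": "openai.gpt-3.5-turbo",
    "claude*": "anthropic.*",
    "custom*": "local.llama-7b",
    "*": "openai.gpt-3.5-turbo",
}


def resolve_route(model: str, routing: dict[str, str]) -> tuple[str, str]:
    """Return (connection, upstream_model) for the first matching pattern."""
    for pattern, target in routing.items():
        if fnmatchcase(model, pattern):
            # Split on the first dot only: model names may contain dots.
            connection, _, upstream_model = target.partition(".")
            if upstream_model == "*":
                upstream_model = model  # "*" passes the requested name through as-is
            return connection, upstream_model
    raise LookupError(f"no route for model {model!r}")
```

Under the first-match assumption, the catch-all `"*"` rule has to come last, otherwise it would shadow the more specific patterns.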

Load Balancing Example

Simple load-balancer configuration: this example demonstrates how to set up a load balancer that randomly distributes requests across multiple language model servers using lm_proxy.

Google Vertex AI Configuration Example

vertex-ai.toml: this example demonstrates how to connect OpenAI HTTP Proxy to a Google Gemini model via the Vertex AI API.

Using Tokens from OIDC Provider as Virtual/Client API Keys

You can configure OpenAI HTTP Proxy to validate tokens from OpenID Connect (OIDC) providers like Keycloak, Auth0, or Okta as API keys.

The following configuration validates Keycloak access tokens by calling the userinfo endpoint:

[api_key_check]
class = "lm_proxy.api_key_check.CheckAPIKeyWithRequest"
method = "POST"
url = "http://keycloak:8080/realms/master/protocol/openid-connect/userinfo"
response_as_user_info = true
use_cache = true
cache_ttl = 60

[api_key_check.headers]
Authorization = "Bearer {api_key}"

Configuration Parameters:

class - The API key validation handler class (lm_proxy.api_key_check.CheckAPIKeyWithRequest)
method - HTTP method for the validation request (typically POST or GET)
url - The OIDC provider's userinfo endpoint URL
response_as_user_info - Parse the response as user information for further usage in OpenAI HTTP Proxy (extend logged info, determine user group, etc.)
use_cache - Enable caching of validation results (requires installing the cachetools package if enabled: pip install cachetools)
cache_ttl - Cache time-to-live in seconds (reduces load on identity provider)
headers - Dictionary of headers to send with the validation request

Note: The {api_key} placeholder can be used in headers or in the URL. OpenAI HTTP Proxy substitutes it with the API key from the client to perform the check.

Usage:

Clients pass their OIDC access token as the API key when making requests to OpenAI HTTP Proxy.
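
The interplay of the `{api_key}` placeholder, `use_cache`, and `cache_ttl` can be sketched as follows. This is not the implementation of `CheckAPIKeyWithRequest`: the class `CachedKeyCheck`, its injected `fetch` callable (standing in for the real HTTP request), and the decision to cache rejected keys as well are all assumptions made for illustration.

```python
import time
from typing import Callable, Optional

# fetch(url, headers) returns the user info dict, or None if the key is rejected.
Fetch = Callable[[str, dict[str, str]], Optional[dict]]


class CachedKeyCheck:
    """Validate a key through an injected HTTP call and cache the result."""

    def __init__(self, url: str, headers: dict[str, str], fetch: Fetch,
                 cache_ttl: float = 60.0, clock: Callable[[], float] = time.monotonic):
        self.url = url
        self.headers = headers
        self.fetch = fetch
        self.cache_ttl = cache_ttl
        self.clock = clock
        self._cache: dict[str, tuple[float, Optional[dict]]] = {}

    def __call__(self, api_key: str) -> Optional[dict]:
        now = self.clock()
        cached = self._cache.get(api_key)
        if cached is not None and now - cached[0] < self.cache_ttl:
            return cached[1]  # still fresh: skip the identity provider
        # Substitute the placeholder in both the URL and the header values.
        url = self.url.replace("{api_key}", api_key)
        headers = {name: value.replace("{api_key}", api_key)
                   for name, value in self.headers.items()}
        user_info = self.fetch(url, headers)
        self._cache[api_key] = (now, user_info)
        return user_info
```

Injecting `fetch` and `clock` keeps the sketch testable without a running identity provider.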

🪝 Request Handlers (Middleware)

Handlers intercept and modify requests before they reach the upstream LLM provider. They enable cross-cutting concerns such as rate limiting, logging, auditing, and header manipulation.

Handlers are defined in the before list within the configuration file and execute sequentially in the order specified.

Built-in Handlers

OpenAI HTTP Proxy includes several built-in handlers for common operational needs.

Rate Limiter

The RateLimiter protects upstream credentials and manages traffic load using a sliding window algorithm.

Parameters:

| Parameter | Type | Description |
|---|---|---|
| `max_requests` | int | Maximum number of requests allowed per window |
| `window_seconds` | int | Duration of the sliding window in seconds |
| `per` | string | Scope of the limit: `api_key`, `ip`, `connection`, `group`, or `global` |

Configuration:

[[before]]
class = "lm_proxy.handlers.RateLimiter"
max_requests = 10
window_seconds = 60
per = "api_key"

[[before]]
class = "lm_proxy.handlers.RateLimiter"
max_requests = 1000
window_seconds = 300
per = "global"
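
For readers unfamiliar with the technique, the following is a generic sliding-window limiter, written as a sketch rather than a copy of `lm_proxy.handlers.RateLimiter`. The class name `SlidingWindowLimiter`, the `allow` method, and the injectable `clock` are assumptions; `scope_key` stands for whatever the `per` setting selects (an API key, an IP address, a fixed string for `global`, and so on).

```python
import time
from collections import defaultdict, deque
from typing import Callable


class SlidingWindowLimiter:
    """Allow at most max_requests per scope key within any window_seconds span."""

    def __init__(self, max_requests: int, window_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock
        self._hits: dict[str, deque[float]] = defaultdict(deque)

    def allow(self, scope_key: str) -> bool:
        now = self.clock()
        hits = self._hits[scope_key]
        # Drop timestamps that have slid out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False  # limit reached; the rejected request is not recorded
        hits.append(now)
        return True
```

Each scope key keeps its own queue of timestamps, so one noisy API key does not consume another key's budget.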

HTTP Headers Forwarder

The HTTPHeadersForwarder passes specific headers from incoming client requests to the upstream provider—useful for distributed tracing or tenant context propagation.

Sensitive headers (Authorization, Host, Content-Length) are stripped by default to prevent protocol corruption and credential leaks.

[[before]]
class = "lm_proxy.handlers.HTTPHeadersForwarder"
white_list_headers = ["x-trace-id", "x-correlation-id", "x-tenant-id"]
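
The filtering rule can be sketched in a few lines. This example assumes case-insensitive header matching and always drops the sensitive headers, even if they appear in the whitelist; `select_forwarded_headers` is a hypothetical helper, not the handler's real code.

```python
SENSITIVE_HEADERS = frozenset({"authorization", "host", "content-length"})


def select_forwarded_headers(incoming: dict[str, str], white_list: list[str]) -> dict[str, str]:
    """Keep only whitelisted headers, never the sensitive ones."""
    allowed = {name.lower() for name in white_list} - SENSITIVE_HEADERS
    return {name: value for name, value in incoming.items() if name.lower() in allowed}
```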

Custom Handlers

Extend functionality by implementing custom handlers in Python. A handler is any callable (function or class instance) that accepts a RequestContext.

Interface

from lm_proxy.base_types import RequestContext

async def my_custom_handler(ctx: RequestContext) -> None:
    # Implementation here
    pass

Example: Audit Logger

# my_extensions.py
import logging
from lm_proxy.base_types import RequestContext

class AuditLogger:
    def __init__(self, prefix: str = "AUDIT"):
        self.prefix = prefix

    async def __call__(self, ctx: RequestContext) -> None:
        user = ctx.user_info.get("name", "anonymous")
        logging.info(f"[{self.prefix}] User '{user}' requested model '{ctx.model}'")

Registration:

[[before]]
class = "my_extensions.AuditLogger"
prefix = "SECURITY_AUDIT"

🧩 Add-on Components

Database Connector

openai-http-proxy-db-connector is a lightweight SQLAlchemy-based connector that enables OpenAI HTTP Proxy to work with relational databases including PostgreSQL, MySQL/MariaDB, SQLite, Oracle, Microsoft SQL Server, and many others.

Key Features:

Configure database connections directly through OpenAI HTTP Proxy configuration
Share database connections across components, extensions, and custom functions
Built-in database logger for structured logging of AI request data

📚 Guides & Reference

For more detailed information, check out these articles:

HTTP Header Management

🚧 Known Limitations

Multiple generations (n > 1): When proxying requests to Google or Anthropic APIs, only the first generation is returned. Multi-generation support is tracked in #35.
Model listing with wildcards / forwarding actual model metadata: The /v1/models endpoint does not query upstream providers to expand wildcard patterns (e.g., gpt*) or fetch model metadata. Only entries defined in the routing configuration are listed. Tracked in #36.

🔍 Debugging

Overview

When debugging mode is enabled, OpenAI HTTP Proxy provides detailed logging information to help diagnose issues:

Stack traces for exceptions are shown in the console
Logging level is set to DEBUG instead of INFO

Warning ⚠️
Never enable debugging mode in production environments, as it may expose sensitive information in the application logs.

Enabling Debugging Mode

To enable debugging, set the LM_PROXY_DEBUG environment variable to a truthy value (e.g., "1", "true", "yes").

Tip 💡
Environment variables can also be defined in a .env file.

Alternatively, you can enable or disable debugging via command-line arguments:

--debug to enable debugging
--no-debug to disable debugging

Note ℹ️
CLI arguments override environment variable settings.
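
The precedence described above (CLI flag first, then environment variable) can be sketched like this. The truthy values follow the list in the `.env` template comment; the helper names `env_flag` and `resolve_debug` are assumptions, and the sketch does not reproduce the proxy's actual argument parser.

```python
import argparse
import os

# Truthy spellings, compared case-insensitively.
TRUTHY = {"1", "TRUE", "YES", "ON", "ENABLED", "Y", "+"}


def env_flag(name: str, default: bool = False) -> bool:
    """Interpret an environment variable as a boolean."""
    raw = os.environ.get(name)
    if raw is None:
        return default
    return raw.strip().upper() in TRUTHY


def resolve_debug(argv: list[str]) -> bool:
    """Return the effective debug setting: CLI flag if given, else LM_PROXY_DEBUG."""
    parser = argparse.ArgumentParser()
    # BooleanOptionalAction provides both --debug and --no-debug.
    parser.add_argument("--debug", action=argparse.BooleanOptionalAction, default=None)
    args = parser.parse_args(argv)
    if args.debug is not None:
        return args.debug  # explicit CLI choice overrides the environment
    return env_flag("LM_PROXY_DEBUG")
```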

🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

1. Fork the repository
2. Create your feature branch (git checkout -b feature/amazing-feature)
3. Commit your changes (git commit -m 'Add some amazing feature')
4. Push to the branch (git push origin feature/amazing-feature)
5. Open a Pull Request

📄 License
