CAPSEM Proxy
Multi-tenant LLM proxy with CAPSEM security policy enforcement. Provides transparent security monitoring and control for OpenAI and Google Gemini API requests while supporting streaming responses and tool calling.
Features
- Multi-Provider Support: OpenAI and Google Gemini API proxying
- Multi-tenant Architecture: API keys passed through from clients, never stored server-side
- CAPSEM Security Integration: Real-time security policy enforcement at multiple interception points
- Streaming Support: Full support for SSE streaming responses (both OpenAI and Gemini)
- Tool Calling: Transparent proxy for tool calling (client-side execution)
- API Compatible: Drop-in replacement for OpenAI and Gemini API base URLs
- CORS Enabled: Ready for web client integration
Architecture
Client (OpenAI SDK / Gemini SDK / HTTP)
↓
CAPSEM Proxy (localhost:8000)
↓ CAPSEM Checks (prompt, tools, response)
↓
OpenAI API / Gemini API
Security Interception Points
- on_model_call: Validates prompts before they are sent to the LLM provider
- on_tool_call: Validates tool definitions
- on_model_response: Validates responses from the LLM provider
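The hook names above are CAPSEM's; the sketch below only illustrates where each one fires in the request lifecycle. The pipeline class and method signatures are assumptions, not the actual CAPSEM interface:

class ProxyPipeline:
    """Illustrative only: shows the order of CAPSEM checks around a proxied call."""

    def __init__(self, policy):
        self.policy = policy

    async def handle(self, body: dict) -> dict:
        # on_model_call: inspect the prompt before it leaves the proxy
        self.policy.on_model_call(body.get("messages", []))
        # on_tool_call: inspect tool definitions, if any
        if "tools" in body:
            self.policy.on_tool_call(body["tools"])
        response = await self._forward(body)
        # on_model_response: inspect the provider's answer before the client sees it
        self.policy.on_model_response(response)
        return response

    async def _forward(self, body: dict) -> dict:
        raise NotImplementedError  # provider-specific httpx client goes here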
Installation
# Install dependencies
uv sync
# Activate virtual environment
source .venv/bin/activate
Configuration
Create a .env file in the capsem/ directory with your API keys:
OPENAI_API_KEY=sk-...
GEMINI_API_KEY=AIza...
You can use one or both providers depending on your needs.
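To confirm the keys are visible before starting the proxy, here is a quick check. It assumes python-dotenv; how the project actually loads the .env file is not shown here:

import os
from dotenv import load_dotenv  # python-dotenv; an assumption about the loading mechanism

load_dotenv("capsem/.env")
assert os.getenv("OPENAI_API_KEY") or os.getenv("GEMINI_API_KEY"), "set at least one provider key"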
Usage
Start the Proxy
uvicorn capsem_proxy.server:app --host 127.0.0.1 --port 8000
Use with OpenAI SDK
from openai import OpenAI
# Point to the proxy
client = OpenAI(
    api_key="your-openai-key",  # Your key, passed through
    base_url="http://localhost:8000/v1"
)

# Use normally
response = client.chat.completions.create(
    model="gpt-5-nano",
    messages=[{"role": "user", "content": "Hello!"}]
)
Streaming Example
stream = client.chat.completions.create(
    model="gpt-5-nano",
    messages=[{"role": "user", "content": "Count to 5"}],
    stream=True
)

for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
Tool Calling Example
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get weather for a location",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {"type": "string"}
            }
        }
    }
}]

response = client.chat.completions.create(
    model="gpt-5-nano",
    messages=[{"role": "user", "content": "Weather in Paris?"}],
    tools=tools
)
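Tool execution stays on the client: the proxy passes the model's tool call back untouched, and you run the function yourself before sending the result in a follow-up turn. A minimal sketch of that round trip, where the weather result is a placeholder for your real implementation:

import json

message = response.choices[0].message
if message.tool_calls:
    call = message.tool_calls[0]
    args = json.loads(call.function.arguments)
    result = {"location": args["location"], "temp_c": 18}  # placeholder: call your real get_weather here
    followup = client.chat.completions.create(
        model="gpt-5-nano",
        messages=[
            {"role": "user", "content": "Weather in Paris?"},
            message,  # the assistant turn containing the tool call
            {"role": "tool", "tool_call_id": call.id, "content": json.dumps(result)}
        ],
        tools=tools
    )
    print(followup.choices[0].message.content)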
Use with Gemini SDK
from google import genai
# Configure client to use the proxy
client = genai.Client(
    api_key="your-gemini-key",  # Your key, passed through
    http_options={'base_url': 'http://localhost:8000', 'timeout': 60000}
)

# Use normally
response = client.models.generate_content(
    model='gemini-2.0-flash-exp',
    contents='Hello!'
)
print(response.text)
Use with Gemini (HTTP Client)
import httpx
# Make requests directly to the proxy
response = httpx.post(
    "http://localhost:8000/v1beta/models/gemini-2.0-flash-exp:generateContent",
    headers={"x-goog-api-key": "your-gemini-key"},  # Your key, passed through
    json={
        "contents": [
            {
                "role": "user",
                "parts": [{"text": "Hello!"}]
            }
        ]
    }
)
Gemini with Tools
tools = [{
    "functionDeclarations": [{
        "name": "get_weather",
        "description": "Get weather for a location",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": "City name"
                }
            },
            "required": ["location"]
        }
    }]
}]

response = httpx.post(
    "http://localhost:8000/v1beta/models/gemini-2.0-flash-exp:generateContent",
    headers={"x-goog-api-key": "your-gemini-key"},
    json={
        "contents": [{
            "role": "user",
            "parts": [{"text": "Weather in Paris?"}]
        }],
        "tools": tools
    }
)
Gemini Streaming
import asyncio
import httpx

async def main():
    async with httpx.AsyncClient() as client:
        async with client.stream(
            "POST",
            "http://localhost:8000/v1beta/models/gemini-2.0-flash-exp:streamGenerateContent",
            headers={"x-goog-api-key": "your-gemini-key"},
            json={
                "contents": [{
                    "role": "user",
                    "parts": [{"text": "Count to 5"}]
                }]
            }
        ) as response:
            async for chunk in response.aiter_bytes():
                print(chunk.decode("utf-8"))

asyncio.run(main())
CAPSEM Security Policies
The proxy integrates with CAPSEM's DebugPolicy by default, which blocks:
- Prompts containing the capsem_block keyword
- Tools with capsem_block in their name
Blocked requests return HTTP 403 with details:
{
  "detail": "Request blocked by security policy: Detected 'capsem_block' in prompt"
}
Testing
The test suite is organized into two categories to keep the development loop fast:
Test Categories
- Fast Tests (run by default) - ~0.6s
  - Unit tests using FastAPI TestClient
  - Mock tests with fake LLM responses
  - Validation and error handling tests
  - No external dependencies required
- Integration Tests (@pytest.mark.integration) - skipped by default
  - Tests requiring real API calls
  - Tests requiring the proxy server running on localhost:8000
  - Require valid API keys in the .env file
Running Tests
# Run fast tests only (DEFAULT - recommended for development)
# Good for quick feedback before git push
pytest -v
# Run ALL tests including integration tests
# Requires API keys and may require proxy server running
pytest -v -m ""
# Run only integration tests
pytest -v -m integration
# Run specific test file
pytest tests/test_openai_proxy_mock.py -v
# Run specific test
pytest tests/test_openai_proxy.py::test_health_check -v
# Run with detailed output
pytest -v -s
Test Files
tests/
├── test_openai_proxy.py # OpenAI integration tests
├── test_openai_proxy_mock.py # OpenAI mock tests (fast)
├── test_gemini_proxy.py # Gemini integration tests
└── test_gemini_proxy_mock.py # Gemini mock tests (fast)
Mock Tests
Mock tests use unittest.mock to fake httpx responses, allowing you to test:
- Request/response handling without real API calls
- CAPSEM security policy enforcement (blocks dangerous requests before reaching provider)
- Error handling and validation
- Tool/function calling flows
Example mock test verifying CAPSEM blocks dangerous prompts:
def test_capsem_blocks_dangerous_prompt_mock(test_client, mock_httpx):
    """CAPSEM blocks prompts with 'capsem_block' keyword"""
    response = test_client.post(
        "/v1/chat/completions",
        headers={"Authorization": "Bearer sk-test-key"},
        json={
            "model": "gpt-5-nano",
            "messages": [{"role": "user", "content": "Tell me about capsem_block"}]
        }
    )

    # Verify blocked by CAPSEM (403), httpx never called
    assert response.status_code == 403
    assert "blocked by security policy" in response.json()["detail"].lower()
Integration Tests
Integration tests verify end-to-end functionality with real APIs:
- Actual LLM responses
- Streaming responses
- Multi-turn tool calling
- CAPSEM blocking with real providers
Requirements:
- Valid API keys in the .env file
- For some tests: proxy server running on localhost:8000
# Start proxy server (in separate terminal)
uvicorn capsem_proxy.server:app --host 127.0.0.1 --port 8000
# Run integration tests
pytest -v -m integration
Test Configuration
Tests are configured in pyproject.toml:
[tool.pytest.ini_options]
markers = [
    "integration: requires proxy server or real API calls",
]
# By default, skip integration tests
addopts = "-m 'not integration'"
API Endpoints
Health Check
GET /health
Returns status and list of available providers
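A quick smoke test (the response shape beyond status and providers is an assumption):

import httpx

resp = httpx.get("http://localhost:8000/health")
resp.raise_for_status()
print(resp.json())  # e.g. {"status": "ok", "providers": ["openai", "gemini"]}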
OpenAI Endpoints
Chat Completions
POST /v1/chat/completions
OpenAI-compatible endpoint supporting:
- Non-streaming responses
- Streaming responses (SSE)
- Tool calling
- CAPSEM security checks
Responses API
POST /v1/responses
OpenAI Responses API endpoint (requires a newer OpenAI SDK version)
Gemini Endpoints
Generate Content
POST /v1beta/models/{model}:generateContent
Gemini API endpoint supporting:
- Non-streaming responses
- Function declarations (tools)
- CAPSEM security checks
Stream Generate Content
POST /v1beta/models/{model}:streamGenerateContent
Gemini streaming endpoint (SSE)
Project Structure
capsem-proxy/
├── capsem_proxy/
│ ├── server.py # FastAPI app
│ ├── api/
│ │ ├── openai.py # OpenAI endpoints
│ │ └── gemini.py # Gemini endpoints
│ ├── providers/
│ │ ├── openai.py # OpenAI HTTP client
│ │ └── gemini.py # Gemini HTTP client
│ ├── security/
│ │ └── identity.py # API key hashing
│ └── capsem_integration.py # CAPSEM SecurityManager
├── tests/
│ ├── test_openai_proxy.py # OpenAI integration tests
│ ├── test_openai_proxy_mock.py # OpenAI mock tests (fast)
│ ├── test_gemini_proxy.py # Gemini integration tests
│ └── test_gemini_proxy_mock.py # Gemini mock tests (fast)
└── pyproject.toml
Multi-Tenant Design
- Each request is identified by a hashed user_id derived from the API key
- API keys are NEVER stored on the server
- All requests are logged with user_id for analytics
- CAPSEM policies apply per-user automatically
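The project structure lists security/identity.py for API key hashing; a plausible sketch of deriving the tenant id is below (the exact algorithm and truncation are assumptions):

import hashlib

def derive_user_id(api_key: str) -> str:
    """Reduce a raw API key to a stable, non-reversible tenant identifier."""
    return hashlib.sha256(api_key.encode("utf-8")).hexdigest()[:16]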
Development
Adding New Endpoints
- Create the endpoint in capsem_proxy/api/openai.py or capsem_proxy/api/gemini.py
- Add CAPSEM security checks at the appropriate interception points
- Forward the request to the provider (using httpx)
- Write tests (see the sketch after this list):
  - Mock tests first (fast feedback)
  - Integration tests for end-to-end validation
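A hypothetical endpoint following these steps. A real build would delegate the check to the SecurityManager in capsem_integration.py; the inline keyword check here just mirrors the DebugPolicy behavior described earlier, and only the FastAPI and httpx usage is concrete:

from fastapi import APIRouter, Header, HTTPException
import httpx

router = APIRouter()

@router.post("/v1/chat/completions")
async def chat_completions(body: dict, authorization: str = Header(...)):
    # Interception point: on_model_call (simplified stand-in for the real policy)
    prompt_text = " ".join(str(m.get("content", "")) for m in body.get("messages", []))
    if "capsem_block" in prompt_text:
        raise HTTPException(
            status_code=403,
            detail="Request blocked by security policy: Detected 'capsem_block' in prompt",
        )
    # Forward upstream with the client's own key; nothing is stored server-side
    async with httpx.AsyncClient() as http:
        upstream = await http.post(
            "https://api.openai.com/v1/chat/completions",
            headers={"Authorization": authorization},
            json=body,
            timeout=60.0,
        )
    return upstream.json()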
Adding New Providers
- Create a provider class in capsem_proxy/providers/
- Implement HTTP client methods using httpx.AsyncClient
- Add an API router in capsem_proxy/api/
- Register the router in capsem_proxy/server.py
- Write a comprehensive test suite (mock + integration)
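A skeleton for a new provider (the class shape is an assumption; only the httpx.AsyncClient usage mirrors the existing providers):

import httpx

class ExampleProvider:
    """Hypothetical provider: forwards requests upstream with the client's own key."""

    BASE_URL = "https://api.example-llm.com"  # placeholder upstream

    async def forward(self, path: str, headers: dict, body: dict) -> httpx.Response:
        async with httpx.AsyncClient(base_url=self.BASE_URL, timeout=60.0) as client:
            return await client.post(path, headers=headers, json=body)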
Test-Driven Development Workflow
- Write mock tests first - Fast feedback on logic without external dependencies
- Run tests frequently - pytest -v runs in ~0.6s
- Add integration tests - Verify end-to-end behavior with real APIs
- Mark integration tests - Use the @pytest.mark.integration decorator
- CI/CD - Fast tests run on every commit, integration tests on demand
License
Copyright 2025 Google LLC
Licensed under the Apache License, Version 2.0