Multi-tenant LLM proxy with CAPSEM security policy enforcement

CAPSEM Proxy

Multi-tenant LLM proxy with CAPSEM security policy enforcement. Provides transparent security monitoring and control for OpenAI and Google Gemini API requests while supporting streaming responses and tool calling.

Features

Multi-Provider Support: OpenAI and Google Gemini API proxying
Multi-tenant Architecture: API keys passed through from clients, never stored server-side
CAPSEM Security Integration: Real-time security policy enforcement at multiple interception points
Streaming Support: Full support for SSE streaming responses (both OpenAI and Gemini)
Tool Calling: Transparent proxy for tool calling (client-side execution)
API Compatible: Drop-in replacement for OpenAI and Gemini API base URLs
CORS Enabled: Ready for web client integration

Architecture

Client (OpenAI SDK / Gemini SDK / HTTP)
    ↓
CAPSEM Proxy (localhost:8000)
    ↓ CAPSEM Checks (prompt, tools, response)
    ↓
OpenAI API / Gemini API

Security Interception Points

on_model_call: Validates prompts before sending to LLM provider
on_tool_call: Validates tool definitions
on_model_response: Validates responses from LLM provider
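
The three interception points can be pictured as hooks that run in a fixed order around a single upstream call. The sketch below is illustrative only: `Verdict`, `allow`, `proxy_request`, and the `run_shell` rule are hypothetical names, not CAPSEM's actual API, and checking the prompt before the tools is an assumption.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional, Tuple


@dataclass
class Verdict:
    """Outcome of one check; `reason` is only set when the check blocks."""
    allowed: bool
    reason: Optional[str] = None


def allow(_value) -> Verdict:
    """Default hook: let everything through."""
    return Verdict(True)


def proxy_request(
    prompt: str,
    tools: Iterable[dict],
    call_upstream: Callable[[str], str],
    on_model_call: Callable[[str], Verdict] = allow,
    on_tool_call: Callable[[dict], Verdict] = allow,
    on_model_response: Callable[[str], Verdict] = allow,
) -> Tuple[int, str]:
    """Run the three checks around one upstream call; stop at the first block."""
    verdict = on_model_call(prompt)        # 1. before the provider sees the prompt
    if not verdict.allowed:
        return 403, verdict.reason or "blocked"
    for tool in tools:                     # 2. each tool definition
        verdict = on_tool_call(tool)
        if not verdict.allowed:
            return 403, verdict.reason or "blocked"
    text = call_upstream(prompt)
    verdict = on_model_response(text)      # 3. before the client sees the response
    if not verdict.allowed:
        return 403, verdict.reason or "blocked"
    return 200, text


# Usage with a stand-in upstream and one custom (hypothetical) tool rule
def no_shell_tools(tool: dict) -> Verdict:
    if tool.get("name") == "run_shell":
        return Verdict(False, "tool 'run_shell' is not permitted")
    return Verdict(True)


echo = lambda prompt: f"echo: {prompt}"
print(proxy_request("Hello!", [], echo))
print(proxy_request("Hello!", [{"name": "run_shell"}], echo, on_tool_call=no_shell_tools))
```

Because the function returns at the first failed check, a blocked prompt or tool never reaches the provider.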

Installation

# Install dependencies
uv sync

# Activate virtual environment
source .venv/bin/activate

Configuration

Create a .env file in the capsem/ directory with your API keys:

OPENAI_API_KEY=sk-...
GEMINI_API_KEY=AIza...

You can use one or both providers depending on your needs.

Usage

Start the Proxy

uvicorn capsem_proxy.server:app --host 127.0.0.1 --port 8000

Use with OpenAI SDK

from openai import OpenAI

# Point to the proxy
client = OpenAI(
    api_key="your-openai-key",  # Your key, passed through
    base_url="http://localhost:8000/v1"
)

# Use normally
response = client.chat.completions.create(
    model="gpt-5-nano",
    messages=[{"role": "user", "content": "Hello!"}]
)

Streaming Example

stream = client.chat.completions.create(
    model="gpt-5-nano",
    messages=[{"role": "user", "content": "Count to 5"}],
    stream=True
)

for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")

Tool Calling Example

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get weather for a location",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {"type": "string"}
            }
        }
    }
}]

response = client.chat.completions.create(
    model="gpt-5-nano",
    messages=[{"role": "user", "content": "Weather in Paris?"}],
    tools=tools
)
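
Since tool execution stays on the client, the application still has to run the requested function and send the result back in a follow-up request. The sketch below shows one way to do that on the dict form of an assistant message in Chat Completions shape. `get_weather`, `LOCAL_TOOLS`, and `run_tool_calls` are hypothetical helpers, and the sample message is hand-written rather than a real model response.

```python
import json


def get_weather(location: str) -> str:
    """Hypothetical local tool; a real one would query a weather service."""
    return f"Sunny in {location}"


# Map tool names (as declared in `tools`) to local Python callables
LOCAL_TOOLS = {"get_weather": get_weather}


def run_tool_calls(assistant_message: dict) -> list:
    """Execute each requested tool locally and build the follow-up `tool` messages."""
    results = []
    for call in assistant_message.get("tool_calls") or []:
        fn = LOCAL_TOOLS[call["function"]["name"]]
        args = json.loads(call["function"]["arguments"])  # arguments arrive as a JSON string
        results.append({
            "role": "tool",
            "tool_call_id": call["id"],
            "content": fn(**args),
        })
    return results


# Hand-written sample in the shape of an assistant message with one tool call
sample = {
    "role": "assistant",
    "content": None,
    "tool_calls": [{
        "id": "call_1",
        "type": "function",
        "function": {"name": "get_weather", "arguments": json.dumps({"location": "Paris"})},
    }],
}
print(run_tool_calls(sample))
```

The returned messages would be appended to `messages`, after the assistant message itself, before the next `client.chat.completions.create` call.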

Use with Gemini SDK

from google import genai

# Configure client to use the proxy
client = genai.Client(
    api_key="your-gemini-key",  # Your key, passed through
    http_options={'base_url': 'http://localhost:8000', 'timeout': 60000}
)

# Use normally
response = client.models.generate_content(
    model='gemini-2.0-flash-exp',
    contents='Hello!'
)
print(response.text)

Use with Gemini (HTTP Client)

import httpx

# Make requests directly to the proxy
response = httpx.post(
    "http://localhost:8000/v1beta/models/gemini-2.0-flash-exp:generateContent",
    headers={"x-goog-api-key": "your-gemini-key"},  # Your key, passed through
    json={
        "contents": [
            {
                "role": "user",
                "parts": [{"text": "Hello!"}]
            }
        ]
    }
)

Gemini with Tools

tools = [{
    "functionDeclarations": [{
        "name": "get_weather",
        "description": "Get weather for a location",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": "City name"
                }
            },
            "required": ["location"]
        }
    }]
}]

response = httpx.post(
    "http://localhost:8000/v1beta/models/gemini-2.0-flash-exp:generateContent",
    headers={"x-goog-api-key": "your-gemini-key"},
    json={
        "contents": [{
            "role": "user",
            "parts": [{"text": "Weather in Paris?"}]
        }],
        "tools": tools
    }
)

Gemini Streaming

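import asyncio
import httpx

async def main():
    # Stream the response through the proxy and print text as it arrives
    async with httpx.AsyncClient() as client:
        async with client.stream(
            "POST",
            "http://localhost:8000/v1beta/models/gemini-2.0-flash-exp:streamGenerateContent",
            headers={"x-goog-api-key": "your-gemini-key"},
            json={
                "contents": [{
                    "role": "user",
                    "parts": [{"text": "Count to 5"}]
                }]
            }
        ) as response:
            # aiter_text() decodes incrementally, so multi-byte characters
            # split across chunks are handled correctly
            async for text in response.aiter_text():
                print(text, end="")

asyncio.run(main())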

CAPSEM Security Policies

The proxy integrates with CAPSEM's DebugPolicy by default, which blocks:

Prompts containing the capsem_block keyword
Tools with capsem_block in their name

Blocked requests return HTTP 403 with details:

{
  "detail": "Request blocked by security policy: Detected 'capsem_block' in prompt"
}
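
The rule described above can be approximated in a few lines. This is a sketch of the behavior, not DebugPolicy's real implementation: `debug_check` is a hypothetical function, the tool shape follows the OpenAI `tools` format, and the wording of the tool-name message is an assumption (only the prompt message appears in the example above).

```python
BLOCK_KEYWORD = "capsem_block"


def debug_check(messages, tools=()):
    """Return (status_code, body) the way a blocking proxy might respond."""
    for msg in messages:
        if BLOCK_KEYWORD in str(msg.get("content", "")):
            return 403, {
                "detail": f"Request blocked by security policy: Detected '{BLOCK_KEYWORD}' in prompt"
            }
    for tool in tools:
        name = tool.get("function", {}).get("name", "")
        if BLOCK_KEYWORD in name:
            # Assumed wording; the source only shows the prompt variant
            return 403, {
                "detail": f"Request blocked by security policy: Detected '{BLOCK_KEYWORD}' in tool name"
            }
    return 200, {}


print(debug_check([{"role": "user", "content": "Tell me about capsem_block"}]))
print(debug_check([{"role": "user", "content": "Hello!"}]))
```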

Testing

The test suite is split into two categories so that everyday development only runs the fast tests:

Test Categories

Fast Tests (run by default) - ~0.6s
- Unit tests using FastAPI TestClient
- Mock tests with fake LLM responses
- Validation and error handling tests
- No external dependencies required

Integration Tests (@pytest.mark.integration) - skipped by default
- Tests requiring real API calls
- Tests requiring proxy server running on localhost:8000
- Requires valid API keys in .env file

Running Tests

# Run fast tests only (DEFAULT - recommended for development)
# Good for quick feedback before git push
pytest -v

# Run ALL tests including integration tests
# Requires API keys and may require proxy server running
pytest -v -m ""

# Run only integration tests
pytest -v -m integration

# Run specific test file
pytest tests/test_openai_proxy_mock.py -v

# Run specific test
pytest tests/test_openai_proxy.py::test_health_check -v

# Run with detailed output
pytest -v -s

Test Files

tests/
├── test_openai_proxy.py          # OpenAI integration tests
├── test_openai_proxy_mock.py     # OpenAI mock tests (fast)
├── test_gemini_proxy.py          # Gemini integration tests
└── test_gemini_proxy_mock.py     # Gemini mock tests (fast)

Mock Tests

Mock tests use unittest.mock to fake httpx responses, allowing you to test:

Request/response handling without real API calls
CAPSEM security policy enforcement (blocks dangerous requests before reaching provider)
Error handling and validation
Tool/function calling flows

Example mock test verifying CAPSEM blocks dangerous prompts:

def test_capsem_blocks_dangerous_prompt_mock(test_client, mock_httpx):
    """CAPSEM blocks prompts with 'capsem_block' keyword"""
    response = test_client.post(
        "/v1/chat/completions",
        headers={"Authorization": "Bearer sk-test-key"},
        json={
            "model": "gpt-5-nano",
            "messages": [{"role": "user", "content": "Tell me about capsem_block"}]
        }
    )

    # Verify blocked by CAPSEM (403), httpx never called
    assert response.status_code == 403
    assert "blocked by security policy" in response.json()["detail"].lower()
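
The "never called" part of such a test can be checked explicitly with `unittest.mock.AsyncMock`. The sketch below uses only the standard library. `handle_chat` is a hypothetical stand-in for a proxy handler, not the project's code, and the mock replaces whatever function would forward the request upstream.

```python
import asyncio
from unittest.mock import AsyncMock


async def handle_chat(body: dict, send_upstream):
    """Hypothetical handler: check the prompt first, forward only if allowed."""
    if any("capsem_block" in str(m.get("content", "")) for m in body["messages"]):
        return 403, {"detail": "Request blocked by security policy"}
    return 200, await send_upstream(body)


# The mock plays the role of the upstream HTTP call
upstream = AsyncMock(return_value={"choices": []})

blocked = {"messages": [{"role": "user", "content": "Tell me about capsem_block"}]}
status, _ = asyncio.run(handle_chat(blocked, upstream))
assert status == 403
upstream.assert_not_awaited()      # the blocked request never left the proxy

allowed = {"messages": [{"role": "user", "content": "Hello!"}]}
status, _ = asyncio.run(handle_chat(allowed, upstream))
assert status == 200
upstream.assert_awaited_once()     # the allowed request was forwarded exactly once
```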

Integration Tests

Integration tests verify end-to-end functionality with real APIs:

Actual LLM responses
Streaming responses
Multi-turn tool calling
CAPSEM blocking with real providers

Requirements:

Valid API keys in .env file
For some tests: proxy server running on localhost:8000

# Start proxy server (in separate terminal)
uvicorn capsem_proxy.server:app --host 127.0.0.1 --port 8000

# Run integration tests
pytest -v -m integration

Test Configuration

Tests are configured in pyproject.toml:

[tool.pytest.ini_options]
markers = [
    "integration: requires proxy server or real API calls",
]
# By default, skip integration tests
addopts = "-m 'not integration'"

API Endpoints

Health Check

GET /health

Returns the proxy status and the list of available providers.

OpenAI Endpoints

Chat Completions

POST /v1/chat/completions

OpenAI-compatible endpoint supporting:

Non-streaming responses
Streaming responses (SSE)
Tool calling
CAPSEM security checks

Responses API

POST /v1/responses

OpenAI Responses API endpoint (requires a newer OpenAI SDK version)

Gemini Endpoints

Generate Content

POST /v1beta/models/{model}:generateContent

Gemini API endpoint supporting:

Non-streaming responses
Function declarations (tools)
CAPSEM security checks

Stream Generate Content

POST /v1beta/models/{model}:streamGenerateContent

Gemini streaming endpoint (SSE)
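
On the client side, an SSE stream arrives as text lines, with each event's payload carried on a `data:` line. The helper below is a minimal sketch for turning those lines into JSON objects. It assumes each event fits on a single `data:` line and treats `[DONE]` as the end marker used by OpenAI-style streams. `iter_sse_json` is a hypothetical name, and the sample lines are hand-written rather than captured from the proxy.

```python
import json
from typing import Iterable, Iterator


def iter_sse_json(lines: Iterable[str]) -> Iterator[dict]:
    """Yield the JSON payload of each `data:` line, stopping at `[DONE]`."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other SSE fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break
        yield json.loads(payload)


# Hand-written sample in the shape of Gemini streaming chunks
sample_lines = [
    'data: {"candidates": [{"content": {"parts": [{"text": "1, 2, "}]}}]}',
    "",
    'data: {"candidates": [{"content": {"parts": [{"text": "3"}]}}]}',
    "",
]
text = "".join(
    chunk["candidates"][0]["content"]["parts"][0]["text"]
    for chunk in iter_sse_json(sample_lines)
)
print(text)
```

The function takes any synchronous iterable of lines. With httpx, `response.iter_lines()` could supply them in synchronous code, while an async client would need an `async for` variant of the same loop.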

Project Structure

capsem-proxy/
├── capsem_proxy/
│   ├── server.py              # FastAPI app
│   ├── api/
│   │   ├── openai.py          # OpenAI endpoints
│   │   └── gemini.py          # Gemini endpoints
│   ├── providers/
│   │   ├── openai.py          # OpenAI HTTP client
│   │   └── gemini.py          # Gemini HTTP client
│   ├── security/
│   │   └── identity.py        # API key hashing
│   └── capsem_integration.py  # CAPSEM SecurityManager
├── tests/
│   ├── test_openai_proxy.py       # OpenAI integration tests
│   ├── test_openai_proxy_mock.py  # OpenAI mock tests (fast)
│   ├── test_gemini_proxy.py       # Gemini integration tests
│   └── test_gemini_proxy_mock.py  # Gemini mock tests (fast)
└── pyproject.toml

Multi-Tenant Design

Each request is identified by a hashed user_id derived from the API key
API keys are NEVER stored on the server
All requests are logged with user_id for analytics
CAPSEM policies apply per-user automatically
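
One way to derive a stable identifier without keeping the key is a one-way hash. The sketch below is an assumption about how this could work, not a description of `security/identity.py`: SHA-256, the 16-character truncation, and the name `user_id_from_api_key` are all illustrative choices.

```python
import hashlib


def user_id_from_api_key(api_key: str, length: int = 16) -> str:
    """Derive a short, stable identifier from an API key without storing the key."""
    digest = hashlib.sha256(api_key.encode("utf-8")).hexdigest()
    return digest[:length]  # truncated for readability in logs


uid = user_id_from_api_key("sk-example-key")
print(uid)
```

The same key always maps to the same `user_id`, so logs and per-user policies can be correlated across requests while the raw key only lives for the duration of the request.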

Development

Adding New Endpoints

Create endpoint in capsem_proxy/api/openai.py or capsem_proxy/api/gemini.py
Add CAPSEM security checks at appropriate interception points
Forward request to provider (using httpx)
Write tests:
- Mock tests first (fast feedback)
- Integration tests for end-to-end validation

Adding New Providers

Create provider class in capsem_proxy/providers/
Implement HTTP client methods using httpx.AsyncClient
Add API router in capsem_proxy/api/
Register router in capsem_proxy/server.py
Write comprehensive test suite (mock + integration)
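
As a rough illustration of the first two steps, a provider class can take the HTTP client as a constructor argument so that tests can substitute a stub. Everything here is hypothetical: `ExampleProvider`, its `chat` method, the `https://api.example.com` base URL, and the `/v1/chat` path do not correspond to any real provider, and `RecordingClient` stands in for `httpx.AsyncClient`.

```python
import asyncio


class ExampleProvider:
    """Hypothetical provider; `client` is expected to behave like httpx.AsyncClient."""

    BASE_URL = "https://api.example.com"  # placeholder, not a real provider

    def __init__(self, client):
        self.client = client

    async def chat(self, api_key: str, body: dict):
        # The caller's key is forwarded per request and never kept on the instance
        return await self.client.post(
            f"{self.BASE_URL}/v1/chat",
            headers={"Authorization": f"Bearer {api_key}"},
            json=body,
        )


class RecordingClient:
    """Stand-in for httpx.AsyncClient that records the outgoing request."""

    def __init__(self):
        self.calls = []

    async def post(self, url, headers=None, json=None):
        self.calls.append((url, headers, json))
        return {"ok": True}


client = RecordingClient()
result = asyncio.run(ExampleProvider(client).chat("client-key", {"prompt": "Hello!"}))
print(client.calls[0][0])
```

Injecting the client keeps the provider easy to cover with mock tests before any integration test is written.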

Test-Driven Development Workflow

Write mock tests first - Fast feedback on logic without external dependencies
Run tests frequently - pytest -v runs in ~0.6s
Add integration tests - Verify end-to-end with real APIs
Mark integration tests - Use @pytest.mark.integration decorator
CI/CD - Fast tests run on every commit, integration tests on demand

License

Licensed under the Apache License, Version 2.0

Project details

Version 0.1.0, released Oct 10, 2025
