🗝️ KeyMesh

Lightweight, concurrency-safe credential orchestration for AI API systems.

KeyMesh is a high-performance, framework-agnostic runtime designed to multiplex multiple API keys (e.g., OpenAI, Anthropic, Gemini) across highly concurrent workloads. It maximizes aggregate throughput by managing rate limits, cooldowns, and scheduling strategies—acting purely as a routing scheduler and cooldown manager.


✨ Features

  • 🚀 Maximized Throughput: Pool multiple lower-tier keys to act as a single high-throughput endpoint.
  • 🛡️ Concurrency Safe: Native asyncio and multi-threaded synchronous support, with fine-grained locks so that keys can be acquired safely at high request rates.
  • 🔌 Sync & Async Native: Identical features available in both async-first runtimes and standard synchronous/threaded architectures.
  • 🔄 Pluggable Schedulers: Choose between RoundRobin, LeastBusy, or Weighted strategies.
  • ❄️ Smart Cooldowns: Automatically detects rate limits (HTTP 429), parses Retry-After headers, and temporarily cools down keys.
  • 📊 Health Monitoring: Tracks latency using Exponential Moving Average (EMA), success rates, and consecutive failures to prune dead credentials.
  • 💾 Flexible Storage: Memory and JSON persistent backends for both async (MemoryStorage, JSONStorage) and sync (SyncMemoryStorage, SyncJSONStorage) runtimes.
  • 🔌 Zero Heavy Couplings: No hard runtime dependencies on specific client SDKs. Integrates natively via HTTP client adapters.
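
The health-monitoring bullet above can be illustrated with a small, self-contained sketch. This is not KeyMesh's actual KeyState: the class name `KeyHealth`, the smoothing factor `alpha`, and the `max_consecutive_failures` threshold are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class KeyHealth:
    """Hypothetical per-key health record (illustrative, not KeyMesh's KeyState)."""

    alpha: float = 0.2                 # EMA smoothing factor (assumed value)
    max_consecutive_failures: int = 3  # pruning threshold (assumed value)
    ema_latency: Optional[float] = None
    successes: int = 0
    failures: int = 0
    consecutive_failures: int = 0

    def record_success(self, latency: float) -> None:
        # The first sample seeds the average; later samples are blended in.
        if self.ema_latency is None:
            self.ema_latency = latency
        else:
            self.ema_latency = self.alpha * latency + (1 - self.alpha) * self.ema_latency
        self.successes += 1
        self.consecutive_failures = 0  # a success resets the failure streak

    def record_failure(self) -> None:
        self.failures += 1
        self.consecutive_failures += 1

    @property
    def success_rate(self) -> float:
        total = self.successes + self.failures
        return self.successes / total if total else 1.0

    @property
    def is_dead(self) -> bool:
        # A key becomes a pruning candidate after too many failures in a row.
        return self.consecutive_failures >= self.max_consecutive_failures
```

A larger `alpha` makes the average react faster to recent latency; a smaller one smooths out spikes.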

📦 Installation

KeyMesh can be installed with uv or pip; the project itself is developed with uv.

# Core package (use either command)
uv add keymesh
pip install keymesh

# With OpenAI SDK integration support (use either command)
uv add "keymesh[openai]"
pip install "keymesh[openai]"

🚀 Recommended Approach: Transparent HTTP Client Handlers

The easiest, most robust way to integrate KeyMesh with the OpenAI SDK is using the built-in OpenAIHandler and AsyncOpenAIHandler.

These handlers subclass httpx.Client and httpx.AsyncClient respectively. When passed directly into the OpenAI SDK client constructor as the http_client, they intercept outgoing requests transparently to:

  1. Acquire a key from the pool automatically before the request starts.
  2. Inject the key dynamically into the request's Authorization header.
  3. Measure the latency of the request and record it on the key's stats upon success.
  4. Cool down the key if the server returns HTTP 429 (automatically parsing the Retry-After header if present).
  5. Mark the key as failed if a connection error or other exception occurs during transmission, so that repeatedly failing keys can be pruned.

[!IMPORTANT] This approach keeps your code clean. You do not need to call pool.acquire() or pool.release(), or wrap key status updates in try/except blocks yourself. KeyMesh manages everything at the HTTP transport layer.
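
Step 4 depends on turning a Retry-After header into a cooldown duration. The sketch below shows one way to do that with the standard library only. It is an illustration, not KeyMesh's internal code: the function name `parse_retry_after` and its `default` and `now` parameters are hypothetical. The header may carry either a number of seconds or an HTTP date, so both forms are handled.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def parse_retry_after(
    value: Optional[str],
    default: float = 60.0,
    now: Optional[datetime] = None,
) -> float:
    """Return a cooldown in seconds derived from a Retry-After header value.

    Hypothetical helper: falls back to `default` when the header is
    missing or cannot be parsed.
    """
    if value is None:
        return default
    value = value.strip()
    # Form 1: delay in whole seconds, e.g. "120".
    if value.isascii() and value.isdigit():
        return float(value)
    # Form 2: HTTP date, e.g. "Wed, 21 Oct 2015 07:28:00 GMT".
    try:
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return default
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    # A date in the past means the key can be used again immediately.
    return max(0.0, (when - now).total_seconds())
```

The returned value could then be added to `time.monotonic()` to get the moment at which the key leaves its cooldown.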

⚡ Asynchronous Integration (Recommended)

import asyncio
from openai import AsyncOpenAI
from keymesh import AsyncOpenAIHandler, SchedulerStrategy

async def main():
    # 1. Initialize the AsyncOpenAIHandler with your keys
    handler = AsyncOpenAIHandler(
        keys=["sk-key-1", "sk-key-2", "sk-key-3"],
        strategy=SchedulerStrategy.LEAST_BUSY,
        default_cooldown=60.0
    )

    # 2. Pass the handler directly as the http_client to AsyncOpenAI
    client = AsyncOpenAI(
        api_key="dummy-key",  # The dummy value is overridden dynamically per-request
        http_client=handler
    )

    try:
        # 3. Call the SDK normally! Key rotation & state management is 100% transparent.
        response = await client.chat.completions.create(
            model="gpt-4o",
            messages=[{"role": "user", "content": "Hello KeyMesh Async!"}]
        )
        print(f"Response: {response.choices[0].message.content}")
    finally:
        # 4. Gracefully close the handler to persist metrics/storage
        await handler.aclose()

if __name__ == "__main__":
    asyncio.run(main())

🔌 Synchronous Integration (Thread-Safe)

from openai import OpenAI
from keymesh import OpenAIHandler, SchedulerStrategy

def main():
    # 1. Initialize the thread-safe OpenAIHandler
    handler = OpenAIHandler(
        keys=["sk-key-1", "sk-key-2", "sk-key-3"],
        strategy=SchedulerStrategy.ROUND_ROBIN
    )

    # 2. Pass the handler directly as the http_client to OpenAI
    client = OpenAI(
        api_key="dummy-key",
        http_client=handler
    )

    try:
        # 3. Use the SDK as usual
        response = client.chat.completions.create(
            model="gpt-4o",
            messages=[{"role": "user", "content": "Hello KeyMesh Sync!"}]
        )
        print(f"Response: {response.choices[0].message.content}")
    finally:
        # 4. Gracefully close the handler
        handler.close()

if __name__ == "__main__":
    main()

💡 Low-Level / Custom Integration Patterns

If you are using a custom HTTP client, a different LLM SDK (like Anthropic, Gemini, or Cohere), or need manual control over the lifecycle of your credentials, you can interface directly with KeyPool or SyncKeyPool.

[!WARNING] Strict Concurrency Rule: Never mutate a shared client's API key globally (e.g. client.api_key = key) inside concurrent tasks or threads. Another task can overwrite the key between the assignment and the request, so calls may go out with the wrong credential. Instead, use one of the patterns below to scope the key to a single request.
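
The race described in the warning can be reproduced without any SDK. In this minimal sketch, `SharedClient` is a stand-in for a real client object and `unsafe_call` is a hypothetical helper; the `await asyncio.sleep(0)` stands for any await point between setting the key and sending the request.

```python
import asyncio


class SharedClient:
    """Stand-in for an SDK client with a mutable api_key attribute."""

    api_key = ""


async def unsafe_call(client: SharedClient, key: str, seen: list) -> None:
    client.api_key = key    # global mutation, visible to every task
    await asyncio.sleep(0)  # any await lets other tasks run and overwrite it
    # Record (key this task set, key the request would actually use).
    seen.append((key, client.api_key))


async def main() -> list:
    client, seen = SharedClient(), []
    keys = ["sk-key-1", "sk-key-2", "sk-key-3"]
    await asyncio.gather(*(unsafe_call(client, k, seen) for k in keys))
    return seen


seen = asyncio.run(main())
mismatched = [pair for pair in seen if pair[0] != pair[1]]
print(f"{len(mismatched)} of {len(seen)} calls would use the wrong key")
```

Every task that resumes after another task's assignment reads the other task's key, which is why the key has to be scoped to the request rather than stored on the shared client.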

Pattern 1: Request-Scoped Client Overrides (with_options)

Modern SDKs support copying a client configuration with an overridden API key while sharing the underlying connection pool.

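# Async (run inside a coroutine; pool is a KeyPool, client an AsyncOpenAI)
import time

key = await pool.acquire()
start = time.monotonic()
try:
    # Scope the key to this request; the connection pool is still shared
    scoped_client = client.with_options(api_key=key)
    response = await scoped_client.chat.completions.create(...)
except Exception:
    # Report the failure so the pool can track the key's health
    await pool.mark_failed(key)
    raise
else:
    # Report success and latency only after the request has completed
    await pool.release(key, latency=time.monotonic() - start)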

Pattern 2: Per-Request Custom Headers (extra_headers)

Pass the key as an HTTP header directly in the API call, bypassing global client state.

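# Async (run inside a coroutine; pool is a KeyPool, client an AsyncOpenAI)
import time

key = await pool.acquire()
start = time.monotonic()
try:
    response = await client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "Query"}],
        # The per-request header replaces the client's default Authorization
        extra_headers={"Authorization": f"Bearer {key}"}
    )
except Exception:
    await pool.mark_failed(key)
    raise
else:
    # Report success and latency only after the request has completed
    await pool.release(key, latency=time.monotonic() - start)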

Pattern 3: Context Managers (key_lifecycle)

Encapsulate the acquire/release/fail lifecycle into a clean Python context manager:

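import time
import contextlib

from keymesh import KeyPool  # assumes KeyPool is importable from the package root

@contextlib.asynccontextmanager
async def key_lifecycle(pool: KeyPool):
    key = await pool.acquire()
    start = time.monotonic()
    try:
        # The body of the `async with` block runs here
        yield key
    except Exception:
        await pool.mark_failed(key)
        raise
    else:
        # Release only when the block finished without raising
        await pool.release(key, latency=time.monotonic() - start)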

# Usage
async with key_lifecycle(pool) as key:
    scoped_client = client.with_options(api_key=key)
    response = await scoped_client.chat.completions.create(...)
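
For threaded code, the same lifecycle can be sketched with a plain context manager. This assumes that SyncKeyPool exposes blocking acquire(), release(key, latency=...), and mark_failed(key) methods mirroring the async ones, which the text above does not confirm. `FakeSyncPool` is a stand-in so that the sketch runs without KeyMesh installed, and `sync_key_lifecycle` is a hypothetical name.

```python
import contextlib
import time


class FakeSyncPool:
    """Stand-in pool (not part of KeyMesh) that records lifecycle events."""

    def __init__(self, keys):
        self.keys = list(keys)
        self.events = []

    def acquire(self):
        return self.keys[0]

    def release(self, key, latency):
        self.events.append(("release", key))

    def mark_failed(self, key):
        self.events.append(("failed", key))


@contextlib.contextmanager
def sync_key_lifecycle(pool):
    """Acquire a key, then release it on success or mark it failed on error."""
    key = pool.acquire()
    start = time.monotonic()
    try:
        yield key
    except Exception:
        pool.mark_failed(key)
        raise
    else:
        pool.release(key, latency=time.monotonic() - start)


# Usage with the stand-in pool
pool = FakeSyncPool(["sk-key-1"])
with sync_key_lifecycle(pool) as key:
    pass  # issue the request with a client scoped to `key`
```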

🛠️ Architecture

KeyMesh follows a modular, thread-safe, and async-safe design:

  • KeyPool / SyncKeyPool: The central async / sync orchestrators.
  • Scheduler: Stateless selection logic for choosing the next key (e.g. RoundRobin, LeastBusy, Weighted).
  • KeyState / SyncKeyState: Lock-guarded runtime diagnostics tracking per API key (failures, latency average, cooldown timers, active requests).
  • Storage: Pluggable persistence layers (In-Memory or JSON-backed) for both asynchronous and synchronous runtimes.
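
To make the scheduler component more concrete, here is a rough sketch of the three named strategies. It does not reproduce KeyMesh's Scheduler interface: `KeySlot` and the three functions are hypothetical, and the weighted variant shown is only one possible reading of "Weighted" (each key repeated `weight` times per round).

```python
import itertools
from dataclasses import dataclass
from typing import Iterator, List


@dataclass
class KeySlot:
    """Hypothetical view of one key's runtime state."""

    key: str
    active_requests: int = 0
    weight: int = 1


def round_robin(slots: List[KeySlot]) -> Iterator[KeySlot]:
    """Cycle through the keys in order, forever."""
    return itertools.cycle(slots)


def least_busy(slots: List[KeySlot]) -> KeySlot:
    """Pick the key with the fewest in-flight requests (ties go to the first)."""
    return min(slots, key=lambda s: s.active_requests)


def weighted(slots: List[KeySlot]) -> Iterator[KeySlot]:
    """Cycle through the keys, repeating each one `weight` times per round."""
    expanded = [s for s in slots for _ in range(s.weight)]
    return itertools.cycle(expanded)


slots = [KeySlot("a", active_requests=2), KeySlot("b", active_requests=0, weight=2)]
print(least_busy(slots).key)
```

In this sketch the iterator holds the rotation position, so the selection functions themselves keep no state of their own.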

🛠️ Development

This project uses uv for development.

# Install dependencies
uv sync

# Run tests
uv run pytest

# Lint and type-check
uv run ruff check .
uv run mypy .

📄 License

MIT License. See LICENSE for details.
