Generic async LLM gateway with concurrency, retries, and JSON helpers
llmgateway
llmgateway is a small async Python library for calling large language models
through a shared runtime configuration. It is meant to be embedded in Python
applications that need one internal API while switching between multiple model
providers.
The package focuses on request routing and runtime behavior. It does not provide a hosted service, login flow, background daemon, or command-line application.
Features
- A single async Gateway API for text, JSON, retryable, and batch requests.
- Provider chains with ordered failover. A working provider is remembered in local provider state and preferred on later calls.
- Task-to-model routing with strong, weak, fallback, and per-task model overrides.
- Model normalization through provider model_map values.
- Concurrency, timeout, transport retry, and validation retry controls.
- JSON helpers that parse plain or fenced JSON responses.
- Environment-variable interpolation for local configuration.
- Built-in transports for OpenAI Responses, OpenAI Chat Completions, Anthropic Messages, and LiteLLM-style backends.
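The ordered-failover behavior in the list above can be pictured with a small standalone sketch. This is not llmgateway's implementation: FailoverChain, the provider tuples, and the in-memory preferred attribute are hypothetical names used only for illustration, and the real library persists its preference in the provider state file.

```python
import asyncio
from typing import Awaitable, Callable, List, Optional, Tuple

# Hypothetical shape: a provider is a name plus an async callable.
Provider = Tuple[str, Callable[[str], Awaitable[str]]]


class FailoverChain:
    """Try providers in order, preferring the last one that worked."""

    def __init__(self, providers: List[Provider]) -> None:
        self.providers = providers
        # In-memory stand-in for persisted provider state (assumption).
        self.preferred: Optional[str] = None

    def _ordered(self) -> List[Provider]:
        # Stable sort: preferred provider first, the rest keep configured order.
        return sorted(self.providers, key=lambda p: p[0] != self.preferred)

    async def call(self, prompt: str) -> str:
        errors: List[str] = []
        for name, send in self._ordered():
            try:
                text = await send(prompt)
            except Exception as exc:  # a real gateway would catch narrower errors
                errors.append(f"{name}: {exc}")
                continue
            self.preferred = name  # remember the provider that worked
            return text
        raise RuntimeError("all providers failed: " + "; ".join(errors))


async def broken(prompt: str) -> str:
    raise ConnectionError("unreachable")


async def working(prompt: str) -> str:
    return f"echo: {prompt}"


async def demo() -> Tuple[str, Optional[str]]:
    chain = FailoverChain([("primary", broken), ("backup", working)])
    text = await chain.call("hello")
    return text, chain.preferred
```

Running asyncio.run(demo()) exercises one failed provider followed by a successful one, after which the successful provider is tried first on later calls.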
Install
Python 3.10 or newer is required. The runtime dependencies are httpx and
PyYAML.
Registry Install
When the seemseam_llmgateway distribution is available from PyPI or your private Python
index, install it with:
python3 -m pip install seemseam_llmgateway
If pip reports that no matching distribution is available, use the GitHub or
local development install path below until the release is published to the
target registry.
GitHub Install
Install directly from the repository when you need the current source before a registry release:
python3 -m pip install "seemseam_llmgateway @ git+https://github.com/SeemSeam/llmgateway.git"
Local Development Install
From a checkout:
git clone https://github.com/SeemSeam/llmgateway.git
cd llmgateway
python3 -m pip install -e ".[dev]"
The LiteLLM transport imports litellm only when api_style: litellm is used.
Install it separately if you use that backend:
python3 -m pip install litellm
There is no maintained npm runtime for this project. For normal use, install the
Python package and import llmgateway from Python code.
Quick Start
Create a user config file:
mkdir -p ~/.llmgateway
# From a repository checkout:
cp llmgateway.example.yaml ~/.llmgateway/config.yaml
# Or create ~/.llmgateway/config.yaml with the minimal config below.
Set provider credentials in the environment instead of hard-coding them in the config file:
export LLM_API_KEY_1="your-provider-api-key"
A minimal config looks like this:
version: 1
providers:
  - provider_type: openai
    api_style: responses
    base_url: ${LLM_API_BASE_URL_1:-https://api.openai.com/v1}
    api_key: ${LLM_API_KEY_1}
    headers: {}
    model_map: {}
settings:
  fallback_model: gpt-5.4-mini
  strong_model: gpt-5.4
  weak_model: gpt-5.4-mini
  strong_reasoning_effort: high
  weak_reasoning_effort: low
  max_concurrent: 8
  retry_max: 2
  transport_retries: 2
  timeout: 30
tasks:
  analysis:
    tier: weak
    max_tokens: 4000
  planner:
    tier: strong
    max_tokens: 8000
Run a text task from Python:
import asyncio

from llmgateway import Gateway, load_user_config, runtime_spec_from_dict


async def main() -> None:
    runtime = runtime_spec_from_dict(load_user_config())
    gateway = Gateway(runtime)
    text = await gateway.run_task(
        "analysis",
        [{"role": "user", "content": "Summarize llmgateway in one sentence."}],
    )
    print(text)


asyncio.run(main())
Parse a JSON response:
import asyncio

from llmgateway import Gateway, load_user_config, runtime_spec_from_dict


async def main() -> None:
    runtime = runtime_spec_from_dict(load_user_config())
    gateway = Gateway(runtime)
    result = await gateway.run_json_task(
        "analysis",
        [{"role": "user", "content": "Return JSON with keys: summary, risk."}],
    )
    print(result.data)


asyncio.run(main())
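To make "plain or fenced JSON" concrete, here is a minimal standalone sketch of that kind of parsing. It is an illustration under assumptions, not the library's parser: parse_json_reply and the regular expression are hypothetical, and the fence marker is built from single backticks so the example itself contains no literal fence.

```python
import json
import re
from typing import Any

FENCE = "`" * 3  # three backticks, built programmatically

# Matches a fenced block, optionally labelled "json", and captures its body.
_FENCED = re.compile(
    re.escape(FENCE) + r"(?:json)?\s*\n(.*?)" + re.escape(FENCE),
    re.DOTALL | re.IGNORECASE,
)


def parse_json_reply(text: str) -> Any:
    """Parse a reply that is either plain JSON or JSON inside a code fence."""
    match = _FENCED.search(text)
    payload = match.group(1) if match else text
    # json.loads raises json.JSONDecodeError when the payload is not valid JSON.
    return json.loads(payload.strip())


plain = '{"summary": "ok", "risk": "low"}'
fenced = "Here is the result:\n" + FENCE + "json\n" + plain + "\n" + FENCE
```

Both parse_json_reply(plain) and parse_json_reply(fenced) yield the same dictionary; any text surrounding the fence is ignored.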
Run multiple tasks with the shared concurrency limit:
import asyncio

from llmgateway import Gateway, TaskRequest, load_user_config, runtime_spec_from_dict


async def main() -> None:
    runtime = runtime_spec_from_dict(load_user_config())
    gateway = Gateway(runtime)
    results = await gateway.run_tasks(
        [
            TaskRequest(
                task="analysis",
                messages=[{"role": "user", "content": "List the main risks."}],
            ),
            TaskRequest(
                task="planner",
                messages=[{"role": "user", "content": "Draft a short plan."}],
            ),
        ]
    )
    for result in results:
        print(result.task, result.text)


asyncio.run(main())
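A shared concurrency limit such as max_concurrent is commonly built on asyncio.Semaphore. The sketch below shows that general technique with a fake request. Whether llmgateway uses a semaphore internally is an assumption, and run_limited is a hypothetical helper, not part of the package.

```python
import asyncio
from typing import List, Tuple


async def run_limited(prompts: List[str], max_concurrent: int) -> Tuple[List[str], int]:
    """Run fake requests under a shared limit; return results and peak concurrency."""
    semaphore = asyncio.Semaphore(max_concurrent)
    active = 0
    peak = 0

    async def one(prompt: str) -> str:
        nonlocal active, peak
        async with semaphore:  # at most max_concurrent bodies run at once
            active += 1
            peak = max(peak, active)
            await asyncio.sleep(0.01)  # stand-in for a provider call
            active -= 1
            return prompt.upper()

    # gather preserves input order regardless of completion order.
    results = await asyncio.gather(*(one(p) for p in prompts))
    return list(results), peak
```

Calling asyncio.run(run_limited(["a", "b", "c"], 2)) returns the results in input order, and the recorded peak never exceeds the limit.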
Configuration
llmgateway can load a config from an explicit path with load_runtime_spec(),
or from the user config location with load_user_config().
Default locations:
- Config file: ~/.llmgateway/config.yaml
- Provider state: ~/.llmgateway/provider-state.json
Environment overrides:
- LLMGATEWAY_CONFIG: path to a config file.
- LLMGATEWAY_USER_CONFIG_DIR: directory that contains config.yaml.
- LLMGATEWAY_PROVIDER_STATE: path to provider-state JSON.
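One plausible way these overrides could combine is sketched below. The precedence shown (explicit file, then directory, then the default location) is an assumption and is not confirmed by this README; resolve_config_path is a hypothetical helper, not a function exported by llmgateway.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_config_path(env: Optional[Mapping[str, str]] = None) -> Path:
    """Pick a config path from the environment (assumed precedence)."""
    env = os.environ if env is None else env
    explicit = env.get("LLMGATEWAY_CONFIG")
    if explicit:
        # An explicit file path wins (assumption).
        return Path(explicit).expanduser()
    config_dir = env.get("LLMGATEWAY_USER_CONFIG_DIR")
    if config_dir:
        # Otherwise look for config.yaml inside the override directory.
        return Path(config_dir).expanduser() / "config.yaml"
    # Fall back to the documented default location.
    return Path.home() / ".llmgateway" / "config.yaml"
```

Passing a plain dictionary as env keeps the function easy to test without modifying the real process environment.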
String values in user config can reference environment variables:
- ${ENV_NAME} resolves to the environment value or an empty string.
- ${ENV_NAME:-default} resolves to the environment value or default.
- env:ENV_NAME resolves to the environment value.
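The three reference forms can be illustrated with a short standalone sketch. It is not the library's interpolation code: interpolate is a hypothetical function, and two details are assumptions. The default is used when the variable is unset or empty (shell-style), and env:ENV_NAME yields an empty string when the variable is missing.

```python
import os
import re
from typing import Mapping, Optional

# Matches ${NAME} and ${NAME:-default}.
_PATTERN = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)(?::-([^}]*))?\}")


def interpolate(value: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Expand environment references in one config string."""
    env = os.environ if env is None else env
    if value.startswith("env:"):
        # Whole-value form; empty string when missing (assumption).
        return env.get(value[len("env:"):], "")

    def replace(match: "re.Match[str]") -> str:
        name, default = match.group(1), match.group(2)
        # Unset or empty falls back to the default, else to "" (assumption).
        return env.get(name) or (default or "")

    return _PATTERN.sub(replace, value)


example_env = {"LLM_API_KEY_1": "sk-test"}
```

With example_env, the base_url line from the minimal config falls back to its default URL because LLM_API_BASE_URL_1 is not set.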
Provider fields:
- provider_type: label such as openai, anthropic, or litellm.
- api_style: one of responses, openai_responses, openai_chat, anthropic, or litellm.
- base_url: provider API base URL.
- api_key: provider API key, usually from an environment variable.
- headers: extra HTTP headers.
- model_map: maps logical model names to provider-specific model names.
Task fields:
- model: exact model to use for the task.
- tier: strong or weak; resolved through settings.strong_model and settings.weak_model.
- temperature: request temperature.
- reasoning_effort: optional reasoning effort hint for compatible backends.
- max_tokens: output token limit.
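The way model, tier, fallback_model, and model_map might fit together is sketched below. The precedence (explicit model, then tier, then fallback) and the pass-through for names missing from model_map are assumptions. resolve_model is a hypothetical helper rather than the library's routing code, and "vendor/mini" is a made-up provider-side name.

```python
from typing import Any, Dict


def resolve_model(task: Dict[str, Any], settings: Dict[str, Any], model_map: Dict[str, str]) -> str:
    """Pick a logical model for a task, then map it to a provider-specific name."""
    # Assumed precedence: explicit model, then tier, then fallback.
    logical = task.get("model")
    if not logical:
        tier = task.get("tier")
        if tier in ("strong", "weak"):
            logical = settings.get(f"{tier}_model")
    if not logical:
        logical = settings["fallback_model"]
    # Names missing from model_map pass through unchanged (assumption).
    return model_map.get(logical, logical)


# Model names taken from the minimal config shown earlier.
example_settings = {
    "fallback_model": "gpt-5.4-mini",
    "strong_model": "gpt-5.4",
    "weak_model": "gpt-5.4-mini",
}
```

For instance, resolve_model({"tier": "strong"}, example_settings, {}) selects the strong model, while a task with neither model nor tier lands on the fallback.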
See llmgateway.example.yaml for a fuller multi-provider template.
API Overview
Most applications use Gateway:
- run_task(task, messages): return text for one task.
- run_task_with_retry(task, messages, validator=...): retry when generation or validation fails.
- run_json_task(task, messages): return a JSONResult with parsed data.
- run_json_task_with_retry(...): combine validation retry and JSON parsing.
- run_tasks([TaskRequest, ...]): run several requests under the configured concurrency limit and return CallResult objects.
- run_tasks_with_retry(...) and run_json_tasks_with_retry(...): batch variants with validation retry.
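The idea behind validation retry can be shown with a generic standalone sketch. It does not reproduce the signature or semantics of run_task_with_retry: call_with_validation is a hypothetical helper, and it assumes that the validator signals bad output by raising and that retry_max counts extra attempts after the first.

```python
import asyncio
from typing import Awaitable, Callable, Optional


async def call_with_validation(
    generate: Callable[[], Awaitable[str]],
    validator: Callable[[str], None],
    retry_max: int,
) -> str:
    """Call generate() until the validator accepts the text or attempts run out."""
    last_error: Optional[Exception] = None
    for _ in range(retry_max + 1):  # first attempt plus retry_max retries
        try:
            text = await generate()
            validator(text)  # assumed to raise on unacceptable output
            return text
        except Exception as exc:  # generation and validation failures both retry
            last_error = exc
    raise RuntimeError(f"gave up after {retry_max + 1} attempts") from last_error


def non_empty(text: str) -> None:
    if not text.strip():
        raise ValueError("empty response")


def make_fake_generate(replies):
    """Return an async callable that yields the given replies in order."""
    pending = iter(replies)

    async def generate() -> str:
        return next(pending)

    return generate
```

With make_fake_generate(["", "ok"]) the first reply fails validation and the second is returned, which mirrors the retry-on-invalid-output flow described above.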
Lower-level helpers such as runtime_spec_from_dict(), load_runtime_spec(),
load_user_config(), and write_user_config() are exported for applications
that manage their own configuration UI or files.
Safety Boundaries
- llmgateway does not create accounts, ask for registry credentials, or manage provider billing.
- Provider API keys should be supplied through environment variables or private local config files. Do not commit real credentials.
- Prompt and response content is sent to the configured provider endpoints. Avoid sending secrets unless the selected provider and account are approved for that data.
- Provider state stores local preference hashes for the configured provider chain. It is not a credential store.
- Config files control outbound API URLs and headers. Treat untrusted config as untrusted code-adjacent input.
Development
Run the test suite:
PYTHONPATH=src python3 -m pytest -q
Build and validate package metadata:
rm -rf build dist
find . -maxdepth 1 -name '*.egg-info' -prune -exec rm -rf {} +
find src -maxdepth 1 -name '*.egg-info' -prune -exec rm -rf {} +
python3 -m build
python3 -m twine check dist/*
Smoke-test a built wheel in a clean virtual environment:
python3 -m venv /tmp/llmgateway-smoke
/tmp/llmgateway-smoke/bin/python -m pip install --upgrade pip
/tmp/llmgateway-smoke/bin/python -m pip install dist/seemseam_llmgateway-*.whl
/tmp/llmgateway-smoke/bin/python - <<'PY'
from llmgateway import Gateway, runtime_spec_from_dict

runtime = runtime_spec_from_dict({
    "providers": [{"provider_type": "openai", "api_style": "responses"}],
    "settings": {"strong_model": "example-model"},
    "tasks": {"analysis": {"tier": "strong"}},
})
gateway = Gateway(runtime)
print(type(gateway).__name__, runtime.task("analysis").tier)
PY
Package Names
- Python distribution name: seemseam_llmgateway
- Python import name: llmgateway
- Command-line name: none
- npm package: none maintained for this runtime
The current package metadata version is 0.1.2 in
pyproject.toml.
License
MIT. See LICENSE.