The universal ModelAdapter Protocol for LLM clients — a vendor-neutral adapter interface (the "JDBC of LLMs"). Zero dependencies.
keel-llm-protocol
The vendor-neutral LLM adapter standard — the "JDBC of LLMs." A small, zero-dependency interface any LLM adapter implements, plus the standard types adapters exchange and a standardized error taxonomy. Implement it once; plug into anything that speaks it.
Part of Keel. No provider lock-in, no base class to inherit, no framework. A reference implementation against the OpenAI-compatible wire format ships as keel-llm-adapter-openai, so the standard is grounded in a working adapter — not just an interface document.
Why a standard
Every product talking to more than one LLM provider rebuilds the same shim: a common interface so the rest of the code doesn't care which provider answered. Everyone's shim is slightly different, so nothing composes. keel-llm-protocol is that interface, written once, vendor-neutral, with the part that actually matters for reliability standardized too:
- A standardized error taxonomy: every adapter raises typed, retryable-tagged failures. A 429 is `RateLimitError` whether it came from Groq, Gemini, or a local model, so circuit breakers, retries, and failover act on types instead of parsing provider strings. (This is what lets a circuit breaker not trip on a 429, which signals a healthy-but-throttled model, and defer to a rate limiter instead.)
- Composable capability protocols: implement only what your backend supports.
- Normalized results: `Usage`, `FinishReason`, `ToolCall` mean the same thing everywhere.
Install

    pip install keel-llm-protocol   # or: uv add keel-llm-protocol

Zero runtime dependencies (stdlib only).
The composable protocols

    from keel_llm_protocol import ModelAdapter, StreamingModelAdapter, ToolCallingModelAdapter

- `ModelAdapter`: the core. `model_key`, `generate(messages, ...) -> AdapterResponse`, `health_check()`. Every adapter implements this.
- `StreamingModelAdapter`: adds `stream(...) -> AsyncIterator[StreamChunk]`. Implement only if your backend streams.
- `ToolCallingModelAdapter`: adds `generate_with_tools(messages, tools, ...)`. Implement only if your backend supports tools.

Consumers type against the capability they need: a plain text model and a streaming tool-using model both fit, and a static type checker tells a consumer before runtime whether the adapter it was handed can stream.
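The capability split can be sketched with plain `typing.Protocol` classes. The protocols below are simplified stand-ins written for this example (string messages, string results), not the package's real definitions, and `PlainAdapter`, `StreamingAdapter`, and `render` are hypothetical names. Note that a runtime `isinstance` check against a `@runtime_checkable` protocol only verifies that the members exist; signature checking is the static type checker's job.

```python
import asyncio
from typing import AsyncIterator, Protocol, runtime_checkable


# Stand-ins for illustration only; the real protocols live in keel_llm_protocol
# and use typed messages, AdapterResponse, and StreamChunk.
@runtime_checkable
class ModelAdapter(Protocol):
    @property
    def model_key(self) -> str: ...

    async def generate(self, messages: list[str]) -> str: ...


@runtime_checkable
class StreamingModelAdapter(ModelAdapter, Protocol):
    def stream(self, messages: list[str]) -> AsyncIterator[str]: ...


class PlainAdapter:
    """Hypothetical text-only backend: it has no stream() method."""

    model_key = "demo:plain"

    async def generate(self, messages: list[str]) -> str:
        return "plain:" + messages[-1]


class StreamingAdapter(PlainAdapter):
    """Hypothetical backend that can also stream."""

    model_key = "demo:streaming"

    async def stream(self, messages: list[str]) -> AsyncIterator[str]:
        # Async generator: yields the prompt back one word at a time.
        for word in messages[-1].split():
            yield word


async def render(adapter: ModelAdapter, prompt: str) -> str:
    """Stream when the adapter supports it, otherwise fall back to generate()."""
    if isinstance(adapter, StreamingModelAdapter):
        return "|".join([piece async for piece in adapter.stream([prompt])])
    return await adapter.generate([prompt])
```

A consumer that requires streaming would annotate its parameter as `StreamingModelAdapter` instead, so passing a `PlainAdapter` is flagged by the type checker rather than discovered at runtime.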
Implement an adapter (structural — no inheritance)

    import asyncio

    from keel_llm_protocol import (
        ModelAdapter, AdapterResponse, HealthStatus, Usage, user,
    )


    class MyAdapter:
        @property
        def model_key(self) -> str:
            return "myprovider:my-model"

        async def generate(self, messages, *, temperature=None, max_tokens=None,
                           stop=None, response_format=None) -> AdapterResponse:
            # ... call the provider, map failures to keel_llm_protocol.errors.* ...
            return AdapterResponse(
                text="...", model_key=self.model_key, model_id="my-model",
                usage=Usage(input_tokens=12, output_tokens=8), finish_reason="stop",
            )

        async def health_check(self) -> HealthStatus:
            return HealthStatus(model_key=self.model_key, healthy=True)


    async def main() -> None:
        adapter: ModelAdapter = MyAdapter()
        assert isinstance(adapter, ModelAdapter)  # @runtime_checkable
        response = await adapter.generate([user("hello")])
        print(response.text)


    asyncio.run(main())

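The "map failures" step in `generate` is where an adapter translates provider responses into the error taxonomy described in the next section. The following is a minimal sketch of that translation for an HTTP-based provider. The exception classes are stand-ins defined here so the example runs on its own (the real ones live in `keel_llm_protocol.errors`, and their constructors may differ), and `map_http_failure` and `parse_retry_after` are hypothetical helper names.

```python
from typing import Optional


# Stand-in exception classes mirroring the taxonomy's names and retryable flags.
class AdapterError(Exception):
    retryable = False

    def __init__(self, message: str, *, retry_after: Optional[float] = None) -> None:
        super().__init__(message)
        self.retry_after = retry_after


class RateLimitError(AdapterError):
    retryable = True


class TransientError(AdapterError):
    retryable = True


class AuthenticationError(AdapterError):
    pass


class BadRequestError(AdapterError):
    pass


class ProviderError(AdapterError):
    pass


def parse_retry_after(header: Optional[str]) -> Optional[float]:
    """Parse a Retry-After value given in seconds; HTTP-date values yield None."""
    if header is None:
        return None
    try:
        return max(0.0, float(header))
    except ValueError:
        return None


def map_http_failure(status: int, body: str,
                     retry_after_header: Optional[str] = None) -> AdapterError:
    """Translate a failed HTTP response into a typed adapter error."""
    message = f"HTTP {status}: {body[:200]}"
    if status == 429:
        return RateLimitError(message, retry_after=parse_retry_after(retry_after_header))
    if status in (401, 403):
        return AuthenticationError(message)
    if status == 400:
        return BadRequestError(message)
    if 500 <= status <= 599:
        return TransientError(message)
    return ProviderError(message)  # anything unexpected
```

An adapter would `raise map_http_failure(...)` after a non-success response. Content-filter and context-length failures usually need provider-specific inspection of the response body, so they are left out of this sketch.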
The error taxonomy (the reliability core)
Every adapter maps its provider's failures to these types, so consuming reliability logic is provider-agnostic:

    import asyncio

    from keel_llm_protocol.errors import (
        AdapterError,         # base: catch this for "any adapter failure"
        RateLimitError,       # 429 / quota: retryable=True, carries retry_after
        AuthenticationError,  # 401/403: retryable=False
        AdapterTimeoutError,  # timed out: retryable=True
        TransientError,       # 5xx / connection: retryable=True
        ContentFilterError,   # content policy: retryable=False
        ContextLengthError,   # context window exceeded: retryable=False
        BadRequestError,      # 400 invalid request: retryable=False
        ProviderError,        # unexpected: retryable=False
    )

    # Inside an async function:
    try:
        resp = await adapter.generate(messages)
    except RateLimitError as e:
        # healthy but throttled: back off, don't trip the breaker
        await asyncio.sleep(e.retry_after or 1.0)
    except AdapterError as e:
        if e.retryable:
            ...  # retry / failover
        else:
            raise

`e.retryable` lets generic retry/breaker logic decide without knowing the provider. That single fact is what every product otherwise reimplements per provider.
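A provider-agnostic retry loop built on nothing but `retryable` and `retry_after` might look like the sketch below. `AdapterError` is a stand-in class with an assumed constructor, `call_with_retry` is a hypothetical helper, and the exponential backoff schedule is an illustrative choice rather than anything the package prescribes. The `sleep` parameter is injectable so the loop can be tested without waiting.

```python
import asyncio
from typing import Awaitable, Callable, Optional, TypeVar

T = TypeVar("T")


class AdapterError(Exception):
    """Stand-in for keel_llm_protocol.errors.AdapterError (constructor assumed)."""

    def __init__(self, message: str, *, retryable: bool = False,
                 retry_after: Optional[float] = None) -> None:
        super().__init__(message)
        self.retryable = retryable
        self.retry_after = retry_after


async def call_with_retry(
    call: Callable[[], Awaitable[T]],
    *,
    max_attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], Awaitable[None]] = asyncio.sleep,
) -> T:
    """Retry `call` on retryable adapter errors, honouring retry_after when set."""
    for attempt in range(1, max_attempts + 1):
        try:
            return await call()
        except AdapterError as e:
            # Non-retryable errors and the final attempt propagate unchanged.
            if not e.retryable or attempt == max_attempts:
                raise
            if e.retry_after is not None:
                delay = e.retry_after  # the provider told us how long to wait
            else:
                delay = base_delay * 2 ** (attempt - 1)  # 0.5s, 1s, 2s, ...
            await sleep(delay)
    raise AssertionError("unreachable: the loop always returns or raises")
```

Usage would be along the lines of `await call_with_retry(lambda: adapter.generate(messages))`.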
Consuming the taxonomy (how to act on a typed error)
Typed errors only improve reliability if the reliability machinery acts on them correctly. The key is that there are three reactions, not two, and a plain `retryable` boolean can't express the most important one. Dispatch on the error's `category`:

- `"backpressure"` (e.g. 429): a rate-limited model is healthy, not failing. Defer (let a rate limiter pace it); do not retry now, and do not record a circuit-breaker failure. Tripping a breaker on a 429 skips a working model, the opposite of what you want.
- `"transient"` (5xx, timeout): a real but temporary failure, so retry or fail over (this one does count as a breaker failure).
- `"terminal"` (auth, bad request, context-length, content-filter): won't change on retry, so fail fast; retrying just burns quota.


    from keel_llm_protocol.errors import AdapterError


    async def generate_with_policy(adapter, messages):
        try:
            return await adapter.generate(messages)
        except AdapterError as e:
            if e.category == "backpressure":
                ...  # defer to the rate limiter; do NOT record a breaker failure
            elif e.category == "transient":
                ...  # retry / fail over (this one DOES count as a breaker failure)
            else:  # "terminal"
                raise  # fail fast: won't change on retry

Grounded result: the `backpressure`-vs-`transient` split (defer the 429 instead of retrying-or-failing it) moved a throttled model from 3/10 to 10/10 availability in a real multi-model fan-out (the model stayed in rotation instead of being spuriously circuit-broken). `retryable` remains as a convenience (`category != "terminal"`), but `category` is the source of truth because only it distinguishes defer from retry-now. (Pattern contributed by the LLMCouncil product team, measured in their pipeline; the machinery lives in `keel-llm-reliability`.)
This is guidance, not behavior baked into the protocol — routing, retry, and breaker policy live in your code (the protocol never decides or holds state). The taxonomy gives you the typed signal; this is how to use it well.
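To make the three reactions concrete, here is a small sketch of consumer-side policy in which only transient errors count against a circuit breaker. Everything in it is an assumption made for illustration: `AdapterError` is a stand-in with an assumed constructor, and `Breaker` and `react` are hypothetical names, not the API of `keel-llm-reliability`.

```python
from dataclasses import dataclass


class AdapterError(Exception):
    """Stand-in for the package's base error; only `category` matters here."""

    def __init__(self, message: str, *, category: str) -> None:
        super().__init__(message)
        self.category = category  # "backpressure" | "transient" | "terminal"


@dataclass
class Breaker:
    """Toy consecutive-failure circuit breaker (hypothetical, consumer-owned)."""

    threshold: int = 3
    failures: int = 0

    @property
    def is_open(self) -> bool:
        return self.failures >= self.threshold

    def record_failure(self) -> None:
        self.failures += 1

    def record_success(self) -> None:
        self.failures = 0


def react(error: AdapterError, breaker: Breaker) -> str:
    """Return the action for an error: "defer", "retry", or "fail"."""
    if error.category == "backpressure":
        # Throttled but healthy: pace the call, leave the breaker untouched.
        return "defer"
    if error.category == "transient":
        # Real but temporary failure: this one counts against the breaker.
        breaker.record_failure()
        return "retry"
    # Terminal: retrying would only burn quota.
    return "fail"
```

With this policy, any number of 429s leaves the breaker closed, while three consecutive transient failures open it at the default threshold.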
Standard types

    from dataclasses import dataclass, field
    from typing import Any

    # FinishReason, Usage, and ToolCall are the package's own types.
    # Mutable defaults are written with default_factory so the listing is valid Python.

    @dataclass
    class AdapterResponse:
        text: str
        model_key: str
        model_id: str
        finish_reason: FinishReason = "stop"  # "stop" | "length" | "tool_calls" | "content_filter" | "error"
        usage: Usage = field(default_factory=Usage)  # input_tokens, output_tokens, cost_usd?, .total_tokens
        tool_calls: list[ToolCall] = field(default_factory=list)
        latency_ms: int = 0
        raw_response: dict[str, Any] | None = None  # escape hatch; not part of the stable contract


    @dataclass
    class StreamChunk:
        delta: str = ""
        finish_reason: FinishReason | None = None
        usage: Usage | None = None  # typically only on the final chunk

Message helpers keep call sites clean: system(...), user(...), assistant(...), tool(..., tool_call_id=...).
What this is not (the framework line)
A Protocol describes the shape of a call and the type of a result. It deliberately does not: execute tools or run agent loops, decide retry/backoff policy, route failover, buffer/reassemble streams, or hold conversation state. Those are the consumer's to own. If it makes a decision or holds state, it isn't in this package.
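Because stream reassembly is left to the consumer, a caller that wants the full text has to collect the chunks itself. A minimal sketch, assuming chunks shaped like the `StreamChunk` listing above: the `StreamChunk` here is a simplified stand-in (with `usage` as a plain dict), and `collect` and `fake_stream` are hypothetical names.

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncIterator, Optional


@dataclass
class StreamChunk:
    """Stand-in for the package's StreamChunk; usage simplified to a dict."""

    delta: str = ""
    finish_reason: Optional[str] = None
    usage: Optional[dict] = None


async def collect(
    chunks: AsyncIterator[StreamChunk],
) -> tuple[str, Optional[str], Optional[dict]]:
    """Join the deltas and keep the last finish_reason and usage that were set."""
    parts: list[str] = []
    finish_reason: Optional[str] = None
    usage: Optional[dict] = None
    async for chunk in chunks:
        parts.append(chunk.delta)
        if chunk.finish_reason is not None:
            finish_reason = chunk.finish_reason
        if chunk.usage is not None:
            usage = chunk.usage  # typically only present on the final chunk
    return "".join(parts), finish_reason, usage


async def fake_stream() -> AsyncIterator[StreamChunk]:
    """Hypothetical stream standing in for adapter.stream(...)."""
    yield StreamChunk(delta="Hel")
    yield StreamChunk(delta="lo")
    yield StreamChunk(finish_reason="stop",
                      usage={"input_tokens": 3, "output_tokens": 2})
```

In real use the argument would be the iterator returned by a streaming adapter, for example `await collect(adapter.stream(messages))`.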
Status
0.1.0 — first release, grounded by the keel-llm-adapter-openai reference implementation. An interface is most valuable when stable; the surface is intentionally minimal-but-complete and stays in 0.x through year one (changes documented in the CHANGELOG; pin exact versions). Source: Keel monorepo.
License
MIT — see LICENSE.