litelm

Lightweight litellm replacement: a thin OpenAI-SDK routing layer for multi-provider LLM calls.
litellm's routing + translation in ~2,300 lines and 2 dependencies (openai, httpx).
litellm routes LLM calls across providers and translates between message formats. That core is buried under 100k+ LOC of proxy servers, caching layers, cost tracking, and dozens of features most users never touch. litelm extracts just the call path — model routing, message translation, streaming, tool use, embeddings — and nothing else. No Router class, no proxy, no caching.
Install
```bash
pip install litelm              # openai + httpx
pip install litelm[anthropic]   # + anthropic SDK
pip install litelm[bedrock]     # + boto3
pip install litelm[all]         # everything
```
Usage
```python
import litelm

# Basic completion
response = litelm.completion("openai/gpt-4o", messages=[{"role": "user", "content": "Hello!"}])
print(response.choices[0].message.content)

# Streaming
for chunk in litelm.completion("groq/llama-3.1-70b-versatile", messages=[...], stream=True):
    print(chunk.choices[0].delta.content or "", end="")

# Embeddings
response = litelm.embedding("openai/text-embedding-3-small", input=["hello world"])
```
Every function has an async variant: acompletion, aembedding, aresponses, atext_completion.
The API mirrors litellm — same function names, same arguments, same response types. If you're using litellm today, switching is s/litellm/litelm/ in your imports.
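A minimal sketch of the async path, assuming `acompletion` takes the same arguments as `completion`:

```python
import asyncio
import litelm

async def main():
    # Awaitable twin of litelm.completion
    response = await litelm.acompletion(
        "openai/gpt-4o",
        messages=[{"role": "user", "content": "Hello!"}],
    )
    print(response.choices[0].message.content)

asyncio.run(main())
```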
What's in / what's out
| Feature | litellm | litelm |
|---|---|---|
| Model routing (provider/model → right endpoint) | ✓ | ✓ |
| Message translation (Anthropic, Bedrock, Cloudflare, Mistral) | ✓ | ✓ |
| Streaming + stream_chunk_builder (example below) | ✓ | ✓ |
| Tool use (function calling) | ✓ | ✓ |
| Embeddings | ✓ | ✓ |
| Text completions | ✓ | ✓ |
| OpenAI Responses API | ✓ | ✓ |
| Mock responses | ✓ | ✓ |
| Router (load balancing, fallbacks) | ✓ | ✗ |
| Proxy server | ✓ | ✗ |
| Caching / budgeting / cost tracking | ✓ | ✗ |
| Token counting | ✓ | ✗ |
| Image gen, audio, OCR, fine-tuning | ✓ | ✗ |
| Agents, guardrails, scheduler | ✓ | ✗ |
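A sketch of the streaming helper, assuming `stream_chunk_builder` mirrors litellm's function of the same name (collect the streamed chunks, then reassemble them into a full response object):

```python
import litelm

messages = [{"role": "user", "content": "Write a haiku about routing."}]

chunks = []
for chunk in litelm.completion("openai/gpt-4o", messages=messages, stream=True):
    chunks.append(chunk)

# Rebuild a regular completion response from the collected stream chunks
response = litelm.stream_chunk_builder(chunks, messages=messages)
print(response.choices[0].message.content)
```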
Providers
Routes to 19 providers via "provider/model-name" syntax. Any OpenAI-compatible endpoint works via api_base.
| Provider | Env Var | Handler | Verified |
|---|---|---|---|
| OpenAI | OPENAI_API_KEY | OpenAI SDK | Yes |
| Anthropic | ANTHROPIC_API_KEY | Custom | Yes |
| Groq | GROQ_API_KEY | OpenAI-compat | Yes |
| Mistral | MISTRAL_API_KEY | Custom | Yes |
| xAI | XAI_API_KEY | OpenAI-compat | Yes |
| OpenRouter | OPENROUTER_API_KEY | OpenAI-compat | Yes |
| Azure | AZURE_API_KEY | OpenAI SDK (Azure) | No |
| Bedrock | AWS_ACCESS_KEY_ID | Custom | No |
| Cloudflare | CLOUDFLARE_API_TOKEN | Custom | No |
| Together | TOGETHERAI_API_KEY | OpenAI-compat | No |
| Fireworks | FIREWORKS_API_KEY | OpenAI-compat | No |
| DeepSeek | DEEPSEEK_API_KEY | OpenAI-compat | No |
| Perplexity | PERPLEXITYAI_API_KEY | OpenAI-compat | No |
| DeepInfra | DEEPINFRA_API_TOKEN | OpenAI-compat | No |
| Gemini | GEMINI_API_KEY | OpenAI-compat | No |
| Cohere | COHERE_API_KEY | OpenAI-compat | No |
| Ollama | — | OpenAI-compat | No |
| vLLM | — | OpenAI-compat | No |
| LM Studio | — | OpenAI-compat | No |
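For illustration, switching providers is just a different prefix on the model string; each provider reads its own environment variable from the table above. The model names below are examples, not pinned to what each provider currently serves:

```python
import litelm

messages = [{"role": "user", "content": "One-sentence summary of HTTP caching."}]

# Same call shape, different providers; API keys come from the env vars listed above
litelm.completion("anthropic/claude-3-5-sonnet-20240620", messages=messages)  # ANTHROPIC_API_KEY
litelm.completion("mistral/mistral-small-latest", messages=messages)          # MISTRAL_API_KEY
litelm.completion("xai/grok-2-latest", messages=messages)                     # XAI_API_KEY
```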
API Keys
Set the environment variable for your provider:
```bash
export OPENAI_API_KEY=sk-...
export ANTHROPIC_API_KEY=sk-ant-...
```
Or pass directly:
litelm.completion("openai/gpt-4o", messages=[...], api_key="sk-...")
litelm.completion("openai/gpt-4o", messages=[...], api_base="http://localhost:8000/v1")
Error Handling
All provider errors are mapped to litelm's exception hierarchy:
```python
from litelm import ContextWindowExceededError, RateLimitError, AuthenticationError

try:
    response = litelm.completion("openai/gpt-4o", messages=messages)
except ContextWindowExceededError:
    # prompt too long — truncate and retry
    pass
except RateLimitError:
    # back off
    pass
except AuthenticationError:
    # bad API key
    pass
```
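A minimal retry sketch built on these exceptions (the retry count and backoff schedule are arbitrary placeholders):

```python
import time
import litelm
from litelm import RateLimitError

def completion_with_backoff(model, messages, retries=3):
    # Retry only on rate limits, doubling the wait after each failed attempt
    for attempt in range(retries):
        try:
            return litelm.completion(model, messages=messages)
        except RateLimitError:
            if attempt == retries - 1:
                raise
            time.sleep(2 ** attempt)
```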
Tool Calling
tools = [{"type": "function", "function": {
"name": "get_weather",
"parameters": {"type": "object", "properties": {"city": {"type": "string"}}},
}}]
response = litelm.completion(
"openai/gpt-4o", messages=[{"role": "user", "content": "Weather in Paris?"}],
tools=tools, tool_choice="required",
)
tool_call = response.choices[0].message.tool_calls[0]
print(tool_call.function.name, tool_call.function.arguments)
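A sketch of the next step, feeding the tool result back to the model, assuming the OpenAI-style `"role": "tool"` message format; `get_weather` is a hypothetical stand-in:

```python
import json
import litelm

tools = [{"type": "function", "function": {
    "name": "get_weather",
    "parameters": {"type": "object", "properties": {"city": {"type": "string"}}},
}}]

def get_weather(city: str) -> str:
    # Hypothetical stand-in for a real weather lookup
    return f"Sunny, 22°C in {city}"

messages = [{"role": "user", "content": "Weather in Paris?"}]
response = litelm.completion("openai/gpt-4o", messages=messages, tools=tools, tool_choice="required")
tool_call = response.choices[0].message.tool_calls[0]

# Re-send the assistant turn containing the tool call, then the tool's result
messages.append(response.choices[0].message)
messages.append({
    "role": "tool",
    "tool_call_id": tool_call.id,
    "content": get_weather(**json.loads(tool_call.function.arguments)),
})
final = litelm.completion("openai/gpt-4o", messages=messages, tools=tools)
print(final.choices[0].message.content)
```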
Custom / Local Providers
Any OpenAI-compatible server works via api_base:
```python
# vLLM
litelm.completion("openai/my-model", messages=[...], api_base="http://localhost:8000/v1")

# Ollama
litelm.completion("ollama/llama3", messages=[...], api_base="http://localhost:11434/v1")

# LM Studio
litelm.completion("openai/local-model", messages=[...], api_base="http://localhost:1234/v1")
```
Status
Alpha. 129 tests of its own, plus 56 litellm tests ported and passing unmodified via sys.modules shimming.
DSPy drop-in verified: all 7 execution paths exercised live (Predict, CoT, typed signatures, streaming, embeddings, tool use, multi-output).
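A minimal sketch of the sys.modules shim behind the drop-in and ported-test claims (illustrative; the actual test harness may differ):

```python
import sys
import litelm

# Register litelm under the name "litellm" before anything else imports litellm,
# so downstream libraries such as DSPy transparently resolve to litelm
sys.modules["litellm"] = litelm

import dspy  # any `import litellm` inside dspy now gets litelm
```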
Tests
```bash
uv run pytest tests/ -x --ignore=tests/ported                 # 129 unit tests
uv run pytest tests/test_live.py -m live --timeout=30         # 37 live provider tests
uv run pytest tests/test_dspy_smoke.py -m live --timeout=60   # 10 DSPy integration tests
```
Live tests require API keys in `.env.test`. They are skipped by default; run them with `-m live`.