
Lightweight litellm replacement — thin OpenAI-SDK routing layer for multi-provider LLM calls


litelm


litellm's routing + translation in ~2,300 lines and 2 dependencies (openai, httpx).

litellm routes LLM calls across providers and translates between message formats. That core is buried under 100k+ LOC of proxy servers, caching layers, cost tracking, and dozens of features most users never touch. litelm extracts just the call path — model routing, message translation, streaming, tool use, embeddings — and nothing else. No Router class, no proxy, no caching.

Install

pip install litelm                # openai + httpx
pip install litelm[anthropic]     # + anthropic SDK
pip install litelm[bedrock]       # + boto3
pip install litelm[all]           # everything

Usage

import litelm

# Basic completion
response = litelm.completion("openai/gpt-4o", messages=[{"role": "user", "content": "Hello!"}])
print(response.choices[0].message.content)

# Streaming
for chunk in litelm.completion("groq/llama-3.1-70b-versatile", messages=[...], stream=True):
    print(chunk.choices[0].delta.content or "", end="")

# Embeddings
response = litelm.embedding("openai/text-embedding-3-small", input=["hello world"])

Every function has an async variant: acompletion, aembedding, aresponses, atext_completion.

The API mirrors litellm — same function names, same arguments, same response types. If you're using litellm today, switching is s/litellm/litelm/ in your imports.

What's in / what's out

| Feature | litellm | litelm |
|---|---|---|
| Model routing (provider/model → right endpoint) | ✓ | ✓ |
| Message translation (Anthropic, Bedrock, Cloudflare, Mistral) | ✓ | ✓ |
| Streaming + stream_chunk_builder | ✓ | ✓ |
| Tool use (function calling) | ✓ | ✓ |
| Embeddings | ✓ | ✓ |
| Text completions | ✓ | ✓ |
| OpenAI Responses API | ✓ | ✓ |
| Mock responses | ✓ | ✓ |
| Router (load balancing, fallbacks) | ✓ | ✗ |
| Proxy server | ✓ | ✗ |
| Caching / budgeting / cost tracking | ✓ | ✗ |
| Token counting | ✓ | ✗ |
| Image gen, audio, OCR, fine-tuning | ✓ | ✗ |
| Agents, guardrails, scheduler | ✓ | ✗ |

Providers

Routes to 19 providers via "provider/model-name" syntax. Any OpenAI-compatible endpoint works via api_base.

| Provider | Env Var | Handler | Verified |
|---|---|---|---|
| OpenAI | OPENAI_API_KEY | OpenAI SDK | Yes |
| Anthropic | ANTHROPIC_API_KEY | Custom | Yes |
| Groq | GROQ_API_KEY | OpenAI-compat | Yes |
| Mistral | MISTRAL_API_KEY | Custom | Yes |
| xAI | XAI_API_KEY | OpenAI-compat | Yes |
| OpenRouter | OPENROUTER_API_KEY | OpenAI-compat | Yes |
| Azure | AZURE_API_KEY | OpenAI SDK (Azure) | No |
| Bedrock | AWS_ACCESS_KEY_ID | Custom | No |
| Cloudflare | CLOUDFLARE_API_TOKEN | Custom | No |
| Together | TOGETHERAI_API_KEY | OpenAI-compat | No |
| Fireworks | FIREWORKS_API_KEY | OpenAI-compat | No |
| DeepSeek | DEEPSEEK_API_KEY | OpenAI-compat | No |
| Perplexity | PERPLEXITYAI_API_KEY | OpenAI-compat | No |
| DeepInfra | DEEPINFRA_API_TOKEN | OpenAI-compat | No |
| Gemini | GEMINI_API_KEY | OpenAI-compat | No |
| Cohere | COHERE_API_KEY | OpenAI-compat | No |
| Ollama | — | OpenAI-compat | No |
| vLLM | — | OpenAI-compat | No |
| LM Studio | — | OpenAI-compat | No |

API Keys

Set the environment variable for your provider:

export OPENAI_API_KEY=sk-...
export ANTHROPIC_API_KEY=sk-ant-...

Or pass directly:

litelm.completion("openai/gpt-4o", messages=[...], api_key="sk-...")
litelm.completion("openai/gpt-4o", messages=[...], api_base="http://localhost:8000/v1")

Error Handling

All provider errors are mapped to litelm's exception hierarchy:

from litelm import ContextWindowExceededError, RateLimitError, AuthenticationError

try:
    response = litelm.completion("openai/gpt-4o", messages=messages)
except ContextWindowExceededError:
    # prompt too long — truncate and retry
    pass
except RateLimitError:
    # back off
    pass
except AuthenticationError:
    # bad API key
    pass
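The RateLimitError branch is the natural place for retries. A minimal exponential-backoff sketch; the helper itself is generic stdlib code, and the commented usage assumes litelm is importable with a key set:

```python
import time


def with_backoff(fn, retries=3, base_delay=1.0, retriable=(Exception,)):
    """Call fn(), retrying with exponential backoff on retriable errors."""
    for attempt in range(retries):
        try:
            return fn()
        except retriable:
            if attempt == retries - 1:
                raise  # out of retries: surface the last error
            time.sleep(base_delay * 2 ** attempt)


# Hedged usage sketch:
# import litelm
# from litelm import RateLimitError
# response = with_backoff(
#     lambda: litelm.completion("openai/gpt-4o", messages=messages),
#     retriable=(RateLimitError,),
# )
```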

Tool Calling

tools = [{"type": "function", "function": {
    "name": "get_weather",
    "parameters": {"type": "object", "properties": {"city": {"type": "string"}}},
}}]

response = litelm.completion(
    "openai/gpt-4o", messages=[{"role": "user", "content": "Weather in Paris?"}],
    tools=tools, tool_choice="required",
)
tool_call = response.choices[0].message.tool_calls[0]
print(tool_call.function.name, tool_call.function.arguments)

Custom / Local Providers

Any OpenAI-compatible server works via api_base:

# vLLM
litelm.completion("openai/my-model", messages=[...], api_base="http://localhost:8000/v1")

# Ollama
litelm.completion("ollama/llama3", messages=[...], api_base="http://localhost:11434/v1")

# LM Studio
litelm.completion("openai/local-model", messages=[...], api_base="http://localhost:1234/v1")

Status

Alpha. 129 tests of its own, plus 56 tests ported from litellm that pass unmodified via sys.modules shimming.

DSPy drop-in verified — all 7 execution paths proven live (Predict, CoT, typed signatures, streaming, embeddings, tool use, multi-output).

Tests

uv run pytest tests/ -x --ignore=tests/ported        # 129 unit tests
uv run pytest tests/test_live.py -m live --timeout=30 # 37 live provider tests
uv run pytest tests/test_dspy_smoke.py -m live --timeout=60  # 10 DSPy integration tests

Live tests require API keys in .env.test. Skipped by default; run with -m live.
