litelm

Lightweight litellm replacement: a thin OpenAI-SDK routing layer for multi-provider LLM calls.

litellm's routing + translation in ~2,300 lines and 2 dependencies (openai, httpx).

litellm routes LLM calls across providers and translates between message formats. That core is buried under 100k+ LOC of proxy servers, caching layers, cost tracking, and dozens of features most users never touch. litelm extracts just the call path — model routing, message translation, streaming, tool use, embeddings — and nothing else. No Router class, no proxy, no caching.

Install

```bash
pip install litelm                # openai + httpx
pip install litelm[anthropic]     # + anthropic SDK
pip install litelm[bedrock]       # + boto3
pip install litelm[all]           # everything
```

Usage

```python
import litelm

# Basic completion
response = litelm.completion("openai/gpt-4o", messages=[{"role": "user", "content": "Hello!"}])
print(response.choices[0].message.content)

# Streaming
for chunk in litelm.completion("groq/llama-3.1-70b-versatile", messages=[...], stream=True):
    print(chunk.choices[0].delta.content or "", end="")

# Embeddings
response = litelm.embedding("openai/text-embedding-3-small", input=["hello world"])
```
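
Continuing from the embedding call above, the vectors live under `data` (a sketch, assuming the OpenAI-style response shape that the mirrored response types imply):

```python
vector = response.data[0].embedding  # list of floats for "hello world"
print(len(vector))                   # dimensionality, e.g. 1536 for text-embedding-3-small
```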

Every function has an async variant: acompletion, aembedding, aresponses, atext_completion.
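
For example, a minimal async sketch (the arguments match the sync call, since the API mirrors litellm):

```python
import asyncio
import litelm

async def main():
    response = await litelm.acompletion(
        "openai/gpt-4o",
        messages=[{"role": "user", "content": "Hello!"}],
    )
    print(response.choices[0].message.content)

asyncio.run(main())
```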

The API mirrors litellm — same function names, same arguments, same response types. If you're using litellm today, switching is `s/litellm/litelm/` in your imports.

What's in / what's out

| Feature | litellm | litelm |
|---|---|---|
| Model routing (`provider/model` → right endpoint) | ✅ | ✅ |
| Message translation (Anthropic, Bedrock, Cloudflare, Mistral) | ✅ | ✅ |
| Streaming + `stream_chunk_builder` | ✅ | ✅ |
| Tool use (function calling) | ✅ | ✅ |
| Embeddings | ✅ | ✅ |
| Text completions | ✅ | ✅ |
| OpenAI Responses API | ✅ | ✅ |
| Mock responses | ✅ | ✅ |
| Router (load balancing, fallbacks) | ✅ | ❌ |
| Proxy server | ✅ | ❌ |
| Caching / budgeting / cost tracking | ✅ | ❌ |
| Token counting | ✅ | ❌ |
| Image gen, audio, OCR, fine-tuning | ✅ | ❌ |
| Agents, guardrails, scheduler | ✅ | ❌ |
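
Two of the less obvious rows, sketched on the assumption that litelm mirrors litellm's `mock_response` kwarg and `stream_chunk_builder` signature: mock responses return canned content without any network call, and `stream_chunk_builder` reassembles streamed chunks into a full response object.

```python
import litelm

messages = [{"role": "user", "content": "Hello!"}]

# Mock response: no provider call, canned content
mocked = litelm.completion("openai/gpt-4o", messages=messages, mock_response="canned reply")
print(mocked.choices[0].message.content)  # "canned reply"

# Rebuild a complete response object from streamed chunks
chunks = list(litelm.completion("openai/gpt-4o", messages=messages, stream=True))
response = litelm.stream_chunk_builder(chunks, messages=messages)
print(response.choices[0].message.content)
```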

Providers

Routes to 19 providers via `provider/model-name` syntax. Any OpenAI-compatible endpoint works via `api_base`.

| Provider | Env Var | Handler | Verified |
|---|---|---|---|
| OpenAI | `OPENAI_API_KEY` | OpenAI SDK | Yes |
| Anthropic | `ANTHROPIC_API_KEY` | Custom | Yes |
| Groq | `GROQ_API_KEY` | OpenAI-compat | Yes |
| Mistral | `MISTRAL_API_KEY` | Custom | Yes |
| xAI | `XAI_API_KEY` | OpenAI-compat | Yes |
| OpenRouter | `OPENROUTER_API_KEY` | OpenAI-compat | Yes |
| Azure | `AZURE_API_KEY` | OpenAI SDK (Azure) | No |
| Bedrock | `AWS_ACCESS_KEY_ID` | Custom | No |
| Cloudflare | `CLOUDFLARE_API_TOKEN` | Custom | No |
| Together | `TOGETHERAI_API_KEY` | OpenAI-compat | No |
| Fireworks | `FIREWORKS_API_KEY` | OpenAI-compat | No |
| DeepSeek | `DEEPSEEK_API_KEY` | OpenAI-compat | No |
| Perplexity | `PERPLEXITYAI_API_KEY` | OpenAI-compat | No |
| DeepInfra | `DEEPINFRA_API_TOKEN` | OpenAI-compat | No |
| Gemini | `GEMINI_API_KEY` | OpenAI-compat | No |
| Cohere | `COHERE_API_KEY` | OpenAI-compat | No |
| Ollama | — | OpenAI-compat | No |
| vLLM | — | OpenAI-compat | No |
| LM Studio | — | OpenAI-compat | No |
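
Switching providers is just a different prefix; the call shape stays the same (model names below are illustrative, not an endorsement of any particular version):

```python
litelm.completion("anthropic/claude-3-5-sonnet-20241022", messages=[...])
litelm.completion("groq/llama-3.1-70b-versatile", messages=[...])
litelm.completion("mistral/mistral-large-latest", messages=[...])
```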

API Keys

Set the environment variable for your provider:

```bash
export OPENAI_API_KEY=sk-...
export ANTHROPIC_API_KEY=sk-ant-...
```

Or pass directly:

litelm.completion("openai/gpt-4o", messages=[...], api_key="sk-...")
litelm.completion("openai/gpt-4o", messages=[...], api_base="http://localhost:8000/v1")

Error Handling

All provider errors are mapped to litelm's exception hierarchy:

```python
from litelm import ContextWindowExceededError, RateLimitError, AuthenticationError

try:
    response = litelm.completion("openai/gpt-4o", messages=messages)
except ContextWindowExceededError:
    # prompt too long — truncate and retry
    pass
except RateLimitError:
    # back off
    pass
except AuthenticationError:
    # bad API key
    pass
```
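
A minimal retry sketch built on that hierarchy (the wrapper function is ours, not part of litelm):

```python
import time
import litelm
from litelm import RateLimitError

def completion_with_backoff(model, messages, retries=3):
    # Hypothetical helper: exponential backoff on rate limits
    for attempt in range(retries):
        try:
            return litelm.completion(model, messages=messages)
        except RateLimitError:
            if attempt == retries - 1:
                raise
            time.sleep(2 ** attempt)  # 1s, 2s, 4s
```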

Tool Calling

```python
tools = [{"type": "function", "function": {
    "name": "get_weather",
    "parameters": {"type": "object", "properties": {"city": {"type": "string"}}},
}}]

response = litelm.completion(
    "openai/gpt-4o", messages=[{"role": "user", "content": "Weather in Paris?"}],
    tools=tools, tool_choice="required",
)
tool_call = response.choices[0].message.tool_calls[0]
print(tool_call.function.name, tool_call.function.arguments)
```
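
To complete the loop, append the assistant turn and a `tool` message carrying your function's output, then call again. This follows the standard OpenAI tool-calling flow, which the mirrored API should accept (a sketch continuing from the example above, with a hard-coded tool result):

```python
import json

messages = [{"role": "user", "content": "Weather in Paris?"}]
messages.append(response.choices[0].message)  # assistant turn containing the tool call
messages.append({
    "role": "tool",
    "tool_call_id": tool_call.id,
    "content": json.dumps({"city": "Paris", "temp_c": 18}),  # your tool's real output goes here
})
final = litelm.completion("openai/gpt-4o", messages=messages, tools=tools)
print(final.choices[0].message.content)
```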

Custom / Local Providers

Any OpenAI-compatible server works via api_base:

```python
# vLLM
litelm.completion("openai/my-model", messages=[...], api_base="http://localhost:8000/v1")

# Ollama
litelm.completion("ollama/llama3", messages=[...], api_base="http://localhost:11434/v1")

# LM Studio
litelm.completion("openai/local-model", messages=[...], api_base="http://localhost:1234/v1")
```

Status

Alpha. 129 tests of its own, plus 56 ported litellm tests passing unmodified via `sys.modules` shimming.

Verified as a drop-in for DSPy: all 7 execution paths exercised live (Predict, CoT, typed signatures, streaming, embeddings, tool use, multi-output).
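
The shimming referred to above is the standard `sys.modules` trick; roughly (a sketch of the idea, not the exact test harness):

```python
import sys
import litelm

# Any later `import litellm` now resolves to litelm instead
sys.modules["litellm"] = litelm

import dspy  # dspy's internal litellm imports pick up the shim
```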

Tests

```bash
uv run pytest tests/ -x --ignore=tests/ported          # 129 unit tests
uv run pytest tests/test_live.py -m live --timeout=30  # 37 live provider tests
uv run pytest tests/test_dspy_smoke.py -m live --timeout=60  # 10 DSPy integration tests
```

Live tests require API keys in `.env.test`. Skipped by default; run with `-m live`.
