Lightweight litellm replacement — thin OpenAI-SDK routing layer for multi-provider LLM calls

litelm

litellm's routing + translation in ~2,200 lines and 2 dependencies (openai, httpx).

litellm routes LLM calls across providers and translates between message formats. That core is buried under 100k+ LOC of proxy servers, caching layers, cost tracking, and dozens of features most users never touch. litelm extracts just the call path — model routing, message translation, streaming, tool use, embeddings — and nothing else. No Router class, no proxy, no caching.

Install

pip install litelm                # openai + httpx
pip install litelm[anthropic]     # + anthropic SDK
pip install litelm[bedrock]       # + boto3
pip install litelm[all]           # everything

Usage

import litelm

# Basic completion
response = litelm.completion("openai/gpt-4o", messages=[{"role": "user", "content": "Hello!"}])
print(response.choices[0].message.content)

# Streaming
for chunk in litelm.completion("groq/llama-3.1-70b-versatile", messages=[...], stream=True):
    print(chunk.choices[0].delta.content or "", end="")

# Embeddings
response = litelm.embedding("openai/text-embedding-3-small", input=["hello world"])
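# the result mirrors litellm's embedding response; the vector lives at
# response.data[0]["embedding"] (litellm-style dict access, assumed here)
print(len(response.data[0]["embedding"]))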

Every function has an async variant: acompletion, aembedding, aresponses, atext_completion.
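
For example, acompletion mirrors completion argument-for-argument. A minimal sketch (model name illustrative):

import asyncio
import litelm

async def main():
    response = await litelm.acompletion(
        "openai/gpt-4o",
        messages=[{"role": "user", "content": "Hello!"}],
    )
    print(response.choices[0].message.content)

asyncio.run(main())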

The API mirrors litellm — same function names, same arguments, same response types. If you're using litellm today, switching is s/litellm/litelm/ in your imports.
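
Concretely, only the import line changes (before/after sketch):

# before
import litellm
result = litellm.completion("openai/gpt-4o", messages=messages)

# after
import litelm
result = litelm.completion("openai/gpt-4o", messages=messages)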

What's in / what's out

| Feature | litellm | litelm |
|---|---|---|
| Model routing (provider/model → right endpoint) | ✅ | ✅ |
| Message translation (Anthropic, Bedrock, Cloudflare, Mistral) | ✅ | ✅ |
| Streaming + stream_chunk_builder | ✅ | ✅ |
| Tool use (function calling) | ✅ | ✅ |
| Embeddings | ✅ | ✅ |
| Text completions | ✅ | ✅ |
| OpenAI Responses API | ✅ | ✅ |
| Mock responses | ✅ | ✅ |
| Router (load balancing, fallbacks) | ✅ | ❌ |
| Proxy server | ✅ | ❌ |
| Caching / budgeting / cost tracking | ✅ | ❌ |
| Token counting | ✅ | ❌ |
| Image gen, audio, OCR, fine-tuning | ✅ | ❌ |
| Agents, guardrails, scheduler | ✅ | ❌ |
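
Tool use and stream_chunk_builder follow the same litellm (OpenAI-style) interfaces. A hedged sketch; the tool schema and model names are purely illustrative:

import litelm

# OpenAI-style function-calling schema (hypothetical get_weather tool)
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

response = litelm.completion(
    "openai/gpt-4o",
    messages=[{"role": "user", "content": "What's the weather in Paris?"}],
    tools=tools,
)
call = response.choices[0].message.tool_calls[0]
print(call.function.name, call.function.arguments)

# reassemble a full response from streamed chunks (litellm-style helper)
chunks = list(litelm.completion(
    "openai/gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}],
    stream=True,
))
full = litelm.stream_chunk_builder(chunks)
print(full.choices[0].message.content)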

Providers

Routes to 19 providers via "provider/model-name" syntax. Any OpenAI-compatible endpoint works via api_base.

| Provider | Env Var | Handler | Verified |
|---|---|---|---|
| OpenAI | OPENAI_API_KEY | OpenAI SDK | Yes |
| Anthropic | ANTHROPIC_API_KEY | Custom | Yes |
| Groq | GROQ_API_KEY | OpenAI-compat | Yes |
| Mistral | MISTRAL_API_KEY | Custom | Yes |
| xAI | XAI_API_KEY | OpenAI-compat | Yes |
| OpenRouter | OPENROUTER_API_KEY | OpenAI-compat | Yes |
| Azure | AZURE_API_KEY | OpenAI SDK (Azure) | No |
| Bedrock | AWS_ACCESS_KEY_ID | Custom | No |
| Cloudflare | CLOUDFLARE_API_TOKEN | Custom | No |
| Together | TOGETHERAI_API_KEY | OpenAI-compat | No |
| Fireworks | FIREWORKS_API_KEY | OpenAI-compat | No |
| DeepSeek | DEEPSEEK_API_KEY | OpenAI-compat | No |
| Perplexity | PERPLEXITYAI_API_KEY | OpenAI-compat | No |
| DeepInfra | DEEPINFRA_API_TOKEN | OpenAI-compat | No |
| Gemini | GEMINI_API_KEY | OpenAI-compat | No |
| Cohere | COHERE_API_KEY | OpenAI-compat | No |
| Ollama | — | OpenAI-compat | No |
| vLLM | — | OpenAI-compat | No |
| LM Studio | — | OpenAI-compat | No |
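
For local or self-hosted endpoints (Ollama, vLLM, LM Studio) and any other OpenAI-compatible server, pass api_base. A sketch; the model name and URL are illustrative:

import litelm

response = litelm.completion(
    "ollama/llama3.1",
    messages=[{"role": "user", "content": "Hello!"}],
    api_base="http://localhost:11434",
)
print(response.choices[0].message.content)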

Status

Alpha. Validated against litellm's own test suite: 206 ported tests pass unmodified via sys.modules shimming, plus 92 tests of litelm's own.
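
The shim is the standard module-aliasing trick; roughly:

# make existing `import litellm` statements resolve to litelm instead
import sys
import litelm

sys.modules["litellm"] = litelm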

DSPy drop-in verified — all 7 execution paths proven live (Predict, CoT, typed signatures, streaming, embeddings, tool use, multi-output).

Roadmap

Validating as a drop-in replacement one litellm consumer at a time — DSPy is done. More consumers, providers, and end-to-end verifications will follow.

Tests

uv run pytest tests/ -x --ignore=tests/ported        # 92 unit tests
uv run pytest tests/test_live.py -m live --timeout=30 # 37 live provider tests
uv run pytest tests/test_dspy_smoke.py -m live --timeout=60  # 10 DSPy integration tests

Live tests require API keys in .env.test. Skipped by default; run with -m live.
