litelm

Lightweight litellm replacement: a thin OpenAI-SDK routing layer for multi-provider LLM calls.

litellm's routing + translation in ~2,300 lines and 2 dependencies (openai, httpx).

litellm routes LLM calls across providers and translates between message formats. That core is buried under 100k+ LOC of proxy servers, caching layers, cost tracking, and dozens of features most users never touch. litelm extracts just the call path — model routing, message translation, streaming, tool use, embeddings — and nothing else. No Router class, no proxy, no caching.

Install

```bash
pip install litelm                # openai + httpx
pip install litelm[anthropic]     # + anthropic SDK
pip install litelm[bedrock]       # + boto3
pip install litelm[all]           # everything
```

Usage

```python
import litelm

# Basic completion
response = litelm.completion("openai/gpt-4o", messages=[{"role": "user", "content": "Hello!"}])
print(response.choices[0].message.content)

# Streaming
for chunk in litelm.completion("groq/llama-3.1-70b-versatile", messages=[...], stream=True):
    print(chunk.choices[0].delta.content or "", end="")

# Embeddings
response = litelm.embedding("openai/text-embedding-3-small", input=["hello world"])
```
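
Continuing from the embedding call above, the vectors live under `data` (a sketch, assuming the OpenAI-style response shape that the mirrored response types imply):

```python
vector = response.data[0].embedding  # list of floats for "hello world"
print(len(vector))                   # dimensionality, e.g. 1536 for text-embedding-3-small
```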

Every function has an async variant: acompletion, aembedding, aresponses, atext_completion.
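
For example, a minimal async sketch (the arguments match the sync call, since the API mirrors litellm):

```python
import asyncio
import litelm

async def main():
    response = await litelm.acompletion(
        "openai/gpt-4o",
        messages=[{"role": "user", "content": "Hello!"}],
    )
    print(response.choices[0].message.content)

asyncio.run(main())
```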

The API mirrors litellm — same function names, same arguments, same response types. If you're using litellm today, switching is `s/litellm/litelm/` in your imports.

What's in / what's out

| Feature | litellm | litelm |
|---|---|---|
| Model routing (`provider/model` → right endpoint) | ✅ | ✅ |
| Message translation (Anthropic, Bedrock, Cloudflare, Mistral) | ✅ | ✅ |
| Streaming + `stream_chunk_builder` | ✅ | ✅ |
| Tool use (function calling) | ✅ | ✅ |
| Embeddings | ✅ | ✅ |
| Text completions | ✅ | ✅ |
| OpenAI Responses API | ✅ | ✅ |
| Mock responses | ✅ | ✅ |
| Router (load balancing, fallbacks) | ✅ | ❌ |
| Proxy server | ✅ | ❌ |
| Caching / budgeting / cost tracking | ✅ | ❌ |
| Token counting | ✅ | ❌ |
| Image gen, audio, OCR, fine-tuning | ✅ | ❌ |
| Agents, guardrails, scheduler | ✅ | ❌ |
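
Two of the less obvious rows, sketched on the assumption that litelm mirrors litellm's `mock_response` kwarg and `stream_chunk_builder` signature: mock responses return canned content without any network call, and `stream_chunk_builder` reassembles streamed chunks into a full response object.

```python
import litelm

messages = [{"role": "user", "content": "Hello!"}]

# Mock response: no provider call, canned content
mocked = litelm.completion("openai/gpt-4o", messages=messages, mock_response="canned reply")
print(mocked.choices[0].message.content)  # "canned reply"

# Rebuild a complete response object from streamed chunks
chunks = list(litelm.completion("openai/gpt-4o", messages=messages, stream=True))
response = litelm.stream_chunk_builder(chunks, messages=messages)
print(response.choices[0].message.content)
```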

Providers

Routes to 19 providers via `provider/model-name` syntax. Any OpenAI-compatible endpoint works via `api_base`.

| Provider | Env Var | Handler | Verified |
|---|---|---|---|
| OpenAI | `OPENAI_API_KEY` | OpenAI SDK | Yes |
| Anthropic | `ANTHROPIC_API_KEY` | Custom | Yes |
| Groq | `GROQ_API_KEY` | OpenAI-compat | Yes |
| Mistral | `MISTRAL_API_KEY` | Custom | Yes |
| xAI | `XAI_API_KEY` | OpenAI-compat | Yes |
| OpenRouter | `OPENROUTER_API_KEY` | OpenAI-compat | Yes |
| Azure | `AZURE_API_KEY` | OpenAI SDK (Azure) | No |
| Bedrock | `AWS_ACCESS_KEY_ID` | Custom | No |
| Cloudflare | `CLOUDFLARE_API_TOKEN` | Custom | No |
| Together | `TOGETHERAI_API_KEY` | OpenAI-compat | No |
| Fireworks | `FIREWORKS_API_KEY` | OpenAI-compat | No |
| DeepSeek | `DEEPSEEK_API_KEY` | OpenAI-compat | No |
| Perplexity | `PERPLEXITYAI_API_KEY` | OpenAI-compat | No |
| DeepInfra | `DEEPINFRA_API_TOKEN` | OpenAI-compat | No |
| Gemini | `GEMINI_API_KEY` | OpenAI-compat | No |
| Cohere | `COHERE_API_KEY` | OpenAI-compat | No |
| Ollama | — | OpenAI-compat | No |
| vLLM | — | OpenAI-compat | No |
| LM Studio | — | OpenAI-compat | No |
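
Switching providers is just a different prefix; the call shape stays the same (model names below are illustrative, not an endorsement of any particular version):

```python
litelm.completion("anthropic/claude-3-5-sonnet-20241022", messages=[...])
litelm.completion("groq/llama-3.1-70b-versatile", messages=[...])
litelm.completion("mistral/mistral-large-latest", messages=[...])
```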

API Keys

Set the environment variable for your provider:

```bash
export OPENAI_API_KEY=sk-...
export ANTHROPIC_API_KEY=sk-ant-...
```

Or pass directly:

litelm.completion("openai/gpt-4o", messages=[...], api_key="sk-...")
litelm.completion("openai/gpt-4o", messages=[...], api_base="http://localhost:8000/v1")

Error Handling

All provider errors are mapped to litelm's exception hierarchy:

```python
from litelm import ContextWindowExceededError, RateLimitError, AuthenticationError

try:
    response = litelm.completion("openai/gpt-4o", messages=messages)
except ContextWindowExceededError:
    # prompt too long — truncate and retry
    pass
except RateLimitError:
    # back off
    pass
except AuthenticationError:
    # bad API key
    pass
```
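
A minimal retry sketch built on that hierarchy (the wrapper function is ours, not part of litelm):

```python
import time
import litelm
from litelm import RateLimitError

def completion_with_backoff(model, messages, retries=3):
    # Hypothetical helper: exponential backoff on rate limits
    for attempt in range(retries):
        try:
            return litelm.completion(model, messages=messages)
        except RateLimitError:
            if attempt == retries - 1:
                raise
            time.sleep(2 ** attempt)  # 1s, 2s, 4s
```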

Tool Calling

```python
tools = [{"type": "function", "function": {
    "name": "get_weather",
    "parameters": {"type": "object", "properties": {"city": {"type": "string"}}},
}}]

response = litelm.completion(
    "openai/gpt-4o", messages=[{"role": "user", "content": "Weather in Paris?"}],
    tools=tools, tool_choice="required",
)
tool_call = response.choices[0].message.tool_calls[0]
print(tool_call.function.name, tool_call.function.arguments)
```
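
To complete the loop, append the assistant turn and a `tool` message carrying your function's output, then call again. This follows the standard OpenAI tool-calling flow, which the mirrored API should accept (a sketch continuing from the example above, with a hard-coded tool result):

```python
import json

messages = [{"role": "user", "content": "Weather in Paris?"}]
messages.append(response.choices[0].message)  # assistant turn containing the tool call
messages.append({
    "role": "tool",
    "tool_call_id": tool_call.id,
    "content": json.dumps({"city": "Paris", "temp_c": 18}),  # your tool's real output goes here
})
final = litelm.completion("openai/gpt-4o", messages=messages, tools=tools)
print(final.choices[0].message.content)
```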

Custom / Local Providers

Any OpenAI-compatible server works via api_base:

```python
# vLLM
litelm.completion("openai/my-model", messages=[...], api_base="http://localhost:8000/v1")

# Ollama
litelm.completion("ollama/llama3", messages=[...], api_base="http://localhost:11434/v1")

# LM Studio
litelm.completion("openai/local-model", messages=[...], api_base="http://localhost:1234/v1")
```

Status

Alpha. 129 tests of its own, plus 56 ported litellm tests passing unmodified via `sys.modules` shimming.

Verified as a drop-in for DSPy: all 7 execution paths exercised live (Predict, CoT, typed signatures, streaming, embeddings, tool use, multi-output).
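
The shimming referred to above is the standard `sys.modules` trick; roughly (a sketch of the idea, not the exact test harness):

```python
import sys
import litelm

# Any later `import litellm` now resolves to litelm instead
sys.modules["litellm"] = litelm

import dspy  # dspy's internal litellm imports pick up the shim
```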

Tests

```bash
uv run pytest tests/ -x --ignore=tests/ported          # 129 unit tests
uv run pytest tests/test_live.py -m live --timeout=30  # 37 live provider tests
uv run pytest tests/test_dspy_smoke.py -m live --timeout=60  # 10 DSPy integration tests
```

Live tests require API keys in `.env.test`. Skipped by default; run with `-m live`.
