lm_deluge

lm_deluge is a lightweight Python helper library for talking to large language model APIs. It wraps several providers under a single interface, handles rate limiting, and exposes utilities for common NLP tasks.
Features
- Unified client – send prompts to OpenAI‑compatible models, Anthropic, Cohere and Vertex hosted Claude models using the same API.
- Async or sync – process prompts concurrently with process_prompts_async or run them synchronously with process_prompts_sync.
- Spray across providers – configure multiple model names with weights so requests are distributed across different providers.
- Caching – optional LevelDB, SQLite or custom caches to avoid duplicate calls.
- Embeddings and reranking – helper functions for embedding text and reranking documents via Cohere/OpenAI endpoints.
- Built‑in tools – simple extract, translate and score_llm helpers for common patterns.
Installation
```
pip install lm_deluge
```
The package relies on environment variables for API keys. Typical variables include OPENAI_API_KEY, ANTHROPIC_API_KEY, COHERE_API_KEY, META_API_KEY (for Llama) and GOOGLE_APPLICATION_CREDENTIALS for Vertex.
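Since a missing key only surfaces as an error at request time, it can help to check the environment up front. A minimal stdlib-only sketch (missing_keys is a hypothetical helper for illustration, not part of lm_deluge):

```python
import os

def missing_keys(required: list[str]) -> list[str]:
    """Return the names of required environment variables that are unset or empty."""
    return [name for name in required if not os.environ.get(name)]

# For example, when using OpenAI and Anthropic models:
absent = missing_keys(["OPENAI_API_KEY", "ANTHROPIC_API_KEY"])
print("missing:", absent)  # an empty list means the keys are set
```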
Quickstart
```python
from lm_deluge import LLMClient

client = LLMClient.basic(
    model=["gpt-4o-mini"],  # any model id from lm_deluge.models.registry
    temperature=0.2,
    max_new_tokens=256,
)

resp = client.process_prompts_sync(["Hello, world!"])  # returns list[APIResponse]
print(resp[0].completion)
```
Asynchronous usage
```python
import asyncio

# Reuses the `client` created in the quickstart above.
async def main():
    responses = await client.process_prompts_async(
        ["an async call"],
        return_completions_only=True,
    )
    print(responses[0])

asyncio.run(main())
```
Distributing requests across models
You can provide multiple model_names and optional model_weights when creating an LLMClient. Each prompt will be sent to one of the models based on those weights.
```python
client = LLMClient(
    model_names=["gpt-4o-mini", "claude-haiku-anthropic"],
    model_weights="rate_limit",  # or a list like [0.7, 0.3]
    max_requests_per_minute=5000,
    max_tokens_per_minute=1_000_000,
    max_concurrent_requests=100,
)
```
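Conceptually, list-style weights behave like weighted random sampling over the configured models. The sketch below illustrates that idea only; it is not the library's actual scheduler, which can also weight by rate limit:

```python
import random

def pick_model(model_names: list[str], model_weights: list[float], rng: random.Random) -> str:
    """Choose one model per prompt, proportionally to its weight."""
    return rng.choices(model_names, weights=model_weights, k=1)[0]

rng = random.Random(0)
picks = [
    pick_model(["gpt-4o-mini", "claude-haiku-anthropic"], [0.7, 0.3], rng)
    for _ in range(1000)
]
print(picks.count("gpt-4o-mini") / 1000)  # roughly 0.7
```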
Provider specific notes
- OpenAI and compatible providers – set OPENAI_API_KEY. Model ids in the registry include OpenAI models as well as Meta Llama, Grok and many others that expose OpenAI-style APIs.
- Anthropic – set ANTHROPIC_API_KEY. Use model ids such as claude-haiku-anthropic or claude-sonnet-anthropic.
- Cohere – set COHERE_API_KEY. Models like command-r are available.
- Vertex Claude – set GOOGLE_APPLICATION_CREDENTIALS and PROJECT_ID. Use a model id such as claude-sonnet-vertex.
The models.py file lists every supported model and the required environment variable.
Built‑in tools
The lm_deluge.llm_tools package exposes a few helper functions:
- extract – structure text or images into a Pydantic model based on a schema.
- translate – translate a list of strings to English if needed.
- score_llm – simple yes/no style scoring with optional log probability output.
Embeddings (embed.embed_parallel_async) and document reranking (rerank.rerank_parallel_async) are also provided.
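Conceptually, reranking takes a query and a list of documents and returns the documents ordered by relevance. The toy sketch below shows only that output shape; the real helpers are async and call Cohere/OpenAI endpoints, and keyword_overlap_score here is a stand-in scoring function, not part of lm_deluge:

```python
def keyword_overlap_score(query: str, doc: str) -> float:
    """Toy relevance score: fraction of query words that appear in the document."""
    q_words = set(query.lower().split())
    d_words = set(doc.lower().split())
    return len(q_words & d_words) / len(q_words) if q_words else 0.0

def rerank(query: str, docs: list[str]) -> list[tuple[str, float]]:
    """Return (document, score) pairs sorted from most to least relevant."""
    scored = [(doc, keyword_overlap_score(query, doc)) for doc in docs]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

docs = ["rate limiting for APIs", "cooking pasta at home", "limiting API request rates"]
print(rerank("API rate limiting", docs)[0][0])
```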
Caching results
lm_deluge.cache includes LevelDB, SQLite and custom dictionary based caches. Pass an instance via LLMClient(..., cache=my_cache) and previously seen prompts will not be re‑sent.
Development notes
Models and costs are defined in src/lm_deluge/models.py. Conversations are built using the Conversation and Message helpers in src/lm_deluge/prompt.py, which also support images.
File details

Details for the file lm_deluge-0.0.3.tar.gz.

File metadata

- Download URL: lm_deluge-0.0.3.tar.gz
- Upload date:
- Size: 50.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.11.10

File hashes

| Algorithm | Hash digest |
|---|---|
| SHA256 | 5b42f2d0d9c4606aa8a483483f8d199f8dbff51cae9fcf25f87af67d391305b5 |
| MD5 | 50ecd6813ba31ccf46ec78262b8b13f6 |
| BLAKE2b-256 | a619789b3f9aa6094e619d06e38b7258d332dad6dea6ddeec07bb4c3b12deacb |
File details

Details for the file lm_deluge-0.0.3-py3-none-any.whl.

File metadata

- Download URL: lm_deluge-0.0.3-py3-none-any.whl
- Upload date:
- Size: 63.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.11.10

File hashes

| Algorithm | Hash digest |
|---|---|
| SHA256 | 23bb52ef883778159c7cc7be61d29438a03e2f7bf37b497f795104831618d78c |
| MD5 | d40e7d9d19555fd4d4e90987102a36b9 |
| BLAKE2b-256 | 3fe15d2bd44f5a101bd92e674cea9fd23cb6b8dbdfe8949da0b9d3b9237a223b |