
Python utility for using LLM API models.

Project description

lm_deluge

lm_deluge is a lightweight helper library for talking to large language model APIs. It wraps several providers under a single interface, handles rate limiting, and exposes a few useful utilities for common NLP tasks.

Features

  • Unified client – send prompts to OpenAI‑compatible models, Anthropic, Cohere, and Vertex‑hosted Claude models through the same interface.
  • Async or sync – process prompts concurrently with process_prompts_async or run them synchronously with process_prompts_sync.
  • Spray across providers – configure multiple model names with weighting so requests are distributed across different providers.
  • Caching – optional LevelDB, SQLite or custom caches to avoid duplicate calls.
  • Embeddings and reranking – helper functions for embedding text and reranking documents via Cohere/OpenAI endpoints.
  • Built‑in tools – simple extract, translate and score_llm helpers for common patterns.

Installation

pip install lm_deluge

The package relies on environment variables for API keys. Typical variables include OPENAI_API_KEY, ANTHROPIC_API_KEY, COHERE_API_KEY, META_API_KEY (for Llama) and GOOGLE_APPLICATION_CREDENTIALS for Vertex.
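Because keys are read from the environment, a quick preflight check can save a confusing authentication error later. A minimal sketch using the variable names listed above (the helper itself is illustrative, not part of lm_deluge):

```python
import os

# API keys read from the environment, per the list above.
REQUIRED_KEYS = {
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "cohere": "COHERE_API_KEY",
}

def missing_keys(providers):
    """Return the env vars that are unset for the providers you plan to use."""
    return [
        REQUIRED_KEYS[p]
        for p in providers
        if not os.environ.get(REQUIRED_KEYS[p])
    ]
```

Run it once at startup and fail fast if the list is non-empty.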

Quickstart

from lm_deluge import LLMClient

client = LLMClient.basic(
    model=["gpt-4o-mini"],    # any model id from lm_deluge.models.registry
    temperature=0.2,
    max_new_tokens=256,
)

resp = client.process_prompts_sync(["Hello, world!"])  # returns list[APIResponse]
print(resp[0].completion)

Asynchronous usage

import asyncio

# Reuses the `client` created in the Quickstart above.
async def main():
    responses = await client.process_prompts_async(
        ["an async call"],
        return_completions_only=True,
    )
    print(responses[0])

asyncio.run(main())

Distributing requests across models

You can provide multiple model_names and optional model_weights when creating an LLMClient. Each prompt will be sent to one of the models based on those weights.

client = LLMClient(
    model_names=["gpt-4o-mini", "claude-haiku-anthropic"],
    model_weights="rate_limit",        # or a list like [0.7, 0.3]
    max_requests_per_minute=5000,
    max_tokens_per_minute=1_000_000,
    max_concurrent_requests=100,
)
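How explicit weights map prompts to models can be pictured as weighted sampling. The sketch below is purely illustrative of that idea, using the weight list from the example above; it is not lm_deluge's actual scheduling code:

```python
import random

def pick_model(model_names, model_weights):
    """Choose one model per prompt, proportional to the given weights."""
    return random.choices(model_names, weights=model_weights, k=1)[0]

models = ["gpt-4o-mini", "claude-haiku-anthropic"]
weights = [0.7, 0.3]

counts = {m: 0 for m in models}
for _ in range(10_000):
    counts[pick_model(models, weights)] += 1
# counts ends up with roughly a 70% / 30% split across the two models.
```

With `model_weights="rate_limit"`, the library derives the weights for you instead of taking a fixed list.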

Provider specific notes

  • OpenAI and compatible providers – set OPENAI_API_KEY. Model ids in the registry include OpenAI models as well as Meta Llama, Grok and many others that expose OpenAI style APIs.
  • Anthropic – set ANTHROPIC_API_KEY. Use model ids such as claude-haiku-anthropic or claude-sonnet-anthropic.
  • Cohere – set COHERE_API_KEY. Models like command-r are available.
  • Vertex Claude – set GOOGLE_APPLICATION_CREDENTIALS and PROJECT_ID. Use a model id such as claude-sonnet-vertex.

The models.py file lists every supported model and the required environment variable.

Built‑in tools

The lm_deluge.llm_tools package exposes a few helper functions:

  • extract – structure text or images into a Pydantic model based on a schema.
  • translate – translate a list of strings to English if needed.
  • score_llm – simple yes/no style scoring with optional log probability output.

Embeddings (embed.embed_parallel_async) and document reranking (rerank.rerank_parallel_async) are also provided.
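Both helpers follow the usual asyncio fan-out pattern: one task per input, gathered in order. A generic sketch of that pattern with a placeholder coroutine (not the library's real API or a real embeddings call):

```python
import asyncio

async def embed_one(text: str) -> list[float]:
    # Placeholder: a real implementation would call a provider's
    # embeddings endpoint here.
    await asyncio.sleep(0)
    return [float(len(text))]

async def embed_parallel(texts: list[str]) -> list[list[float]]:
    # Fan out one task per text; gather preserves input order.
    return await asyncio.gather(*(embed_one(t) for t in texts))

vectors = asyncio.run(embed_parallel(["a", "bb", "ccc"]))
```

The library's versions additionally handle batching, retries, and rate limits for you.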

Caching results

lm_deluge.cache includes LevelDB, SQLite and custom dictionary based caches. Pass an instance via LLMClient(..., cache=my_cache) and previously seen prompts will not be re‑sent.
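The custom dictionary-based option boils down to a prompt-to-response mapping. A minimal sketch of that idea, keyed on a hash of the prompt text (the `get`/`put` method names here are assumptions for illustration, not lm_deluge's exact cache interface):

```python
import hashlib

class DictCache:
    """Toy in-memory cache keyed on a SHA-256 hash of the prompt text."""

    def __init__(self):
        self._store = {}

    @staticmethod
    def _key(prompt: str) -> str:
        return hashlib.sha256(prompt.encode("utf-8")).hexdigest()

    def get(self, prompt: str):
        # Returns None on a cache miss.
        return self._store.get(self._key(prompt))

    def put(self, prompt: str, response: str) -> None:
        self._store[self._key(prompt)] = response

cache = DictCache()
cache.put("Hello, world!", "Hi!")
```

The LevelDB and SQLite backends follow the same get/put shape but persist entries across runs.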

Development notes

Models and costs are defined in src/lm_deluge/models.py. Conversations are built using the Conversation and Message helpers in src/lm_deluge/prompt.py, which also support images.


Download files

Download the file for your platform.

Source Distribution

lm_deluge-0.0.4.tar.gz (50.6 kB)

Built Distribution

lm_deluge-0.0.4-py3-none-any.whl (63.0 kB)

File details

Details for the file lm_deluge-0.0.4.tar.gz.

File metadata

  • Download URL: lm_deluge-0.0.4.tar.gz
  • Upload date:
  • Size: 50.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.11.10

File hashes

Hashes for lm_deluge-0.0.4.tar.gz:

  • SHA256: 544a96376bc4307927895e9ad71cee99d0a20f260365e23e6a65311871dd2ac9
  • MD5: 5faa1eaaa2288a6d6e052a355b9a222f
  • BLAKE2b-256: ce61ad6fc777e989ee78dce8f77ea035c1ecb42e1a1864ac834cf2ad68e0ca18

File details

Details for the file lm_deluge-0.0.4-py3-none-any.whl.

File metadata

  • Download URL: lm_deluge-0.0.4-py3-none-any.whl
  • Upload date:
  • Size: 63.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.11.10

File hashes

Hashes for lm_deluge-0.0.4-py3-none-any.whl:

  • SHA256: 161ebb65a7dcff48219e11c3baf2cb706cb99349f087eb3f90cce052f95bee71
  • MD5: 0034d5d2c2dccd0a435c2c5c0bcefcd8
  • BLAKE2b-256: a1f49d36df66e7f57319a408971936487c2bde36688f525df53830c1b0a50694
