A lightweight Python utility for calling LLM APIs.

lm_deluge

lm_deluge is a lightweight helper library for talking to large language model APIs. It wraps several providers under a single interface, handles rate limiting, and exposes a few useful utilities for common NLP tasks.

Features

  • Unified client – send prompts to OpenAI‑compatible models, Anthropic, Cohere, and Vertex-hosted Claude models through the same interface.
  • Async or sync – process prompts concurrently with process_prompts_async or run them synchronously with process_prompts_sync.
  • Spray across providers – configure multiple model names with weighting so requests are distributed across different providers.
  • Caching – optional LevelDB, SQLite or custom caches to avoid duplicate calls.
  • Embeddings and reranking – helper functions for embedding text and reranking documents via Cohere/OpenAI endpoints.
  • Built‑in tools – simple extract, translate and score_llm helpers for common patterns.

Installation

pip install lm_deluge

The package relies on environment variables for API keys. Typical variables include OPENAI_API_KEY, ANTHROPIC_API_KEY, COHERE_API_KEY, META_API_KEY (for Llama) and GOOGLE_APPLICATION_CREDENTIALS for Vertex.
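Since the client reads keys from the environment, it can help to verify that the variables you need are set before sending any prompts. A minimal standalone check (this helper is illustrative, not part of lm_deluge) might look like:

```python
import os

def check_api_keys(required: list[str]) -> list[str]:
    """Return the names of any required environment variables that are unset or empty."""
    return [name for name in required if not os.environ.get(name)]

missing = check_api_keys(["OPENAI_API_KEY", "ANTHROPIC_API_KEY"])
if missing:
    print(f"Missing API keys: {', '.join(missing)}")
```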

Quickstart

from lm_deluge import LLMClient

client = LLMClient.basic(
    model=["gpt-4o-mini"],    # any model id from lm_deluge.models.registry
    temperature=0.2,
    max_new_tokens=256,
)

resp = client.process_prompts_sync(["Hello, world!"])  # returns list[APIResponse]
print(resp[0].completion)

Asynchronous usage

import asyncio

# reuses the `client` created in the quickstart above
async def main():
    responses = await client.process_prompts_async(
        ["an async call"],
        return_completions_only=True,
    )
    print(responses[0])

asyncio.run(main())

Distributing requests across models

You can provide multiple model_names and optional model_weights when creating an LLMClient. Each prompt will be sent to one of the models based on those weights.

client = LLMClient(
    model_names=["gpt-4o-mini", "claude-haiku-anthropic"],
    model_weights="rate_limit",        # or a list like [0.7, 0.3]
    max_requests_per_minute=5000,
    max_tokens_per_minute=1_000_000,
    max_concurrent_requests=100,
)

Provider-specific notes

  • OpenAI and compatible providers – set OPENAI_API_KEY. Model ids in the registry include OpenAI models as well as Meta Llama, Grok, and many others that expose OpenAI-style APIs.
  • Anthropic – set ANTHROPIC_API_KEY. Use model ids such as claude-haiku-anthropic or claude-sonnet-anthropic.
  • Cohere – set COHERE_API_KEY. Models like command-r are available.
  • Vertex Claude – set GOOGLE_APPLICATION_CREDENTIALS and PROJECT_ID. Use a model id such as claude-sonnet-vertex.

The models.py file lists every supported model and the required environment variable.

Built‑in tools

The lm_deluge.llm_tools package exposes a few helper functions:

  • extract – structure text or images into a Pydantic model based on a schema.
  • translate – translate a list of strings to English if needed.
  • score_llm – simple yes/no style scoring with optional log probability output.

Embeddings (embed.embed_parallel_async) and document reranking (rerank.rerank_parallel_async) are also provided.
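The rerank helper calls a hosted reranking endpoint, so there is nothing to run locally; but purely to illustrate what reranking does, here is a toy embedding-based version (not what the library does internally):

```python
import math

def cosine(u: list[float], v: list[float]) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def rerank_by_similarity(query_emb: list[float], doc_embs: list[list[float]]) -> list[int]:
    """Return document indices ordered by descending similarity to the query."""
    scores = [cosine(query_emb, d) for d in doc_embs]
    return sorted(range(len(doc_embs)), key=lambda i: scores[i], reverse=True)
```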

Caching results

lm_deluge.cache includes LevelDB, SQLite, and custom dictionary-based caches. Pass an instance via LLMClient(..., cache=my_cache) and previously seen prompts will not be re‑sent.

Development notes

Models and costs are defined in src/lm_deluge/models.py. Conversations are built using the Conversation and Message helpers in src/lm_deluge/prompt.py, which also support images.
