any-llm-client
A unified and lightweight asynchronous Python API for communicating with LLMs.
Supports multiple providers, including OpenAI Chat Completions API (and any OpenAI-compatible API, such as Ollama and vLLM) and YandexGPT API.
How To Use
Before you start using any-llm-client, make sure it is installed:
uv add any-llm-client
poetry add any-llm-client
Response API
Here's a full example that uses Ollama and Qwen2.5-Coder:
import asyncio

import any_llm_client

config = any_llm_client.OpenAIConfig(
    url="http://127.0.0.1:11434/v1/chat/completions",
    model_name="qwen2.5-coder:1.5b",
    request_extra={"best_of": 3}
)

async def main() -> None:
    async with any_llm_client.get_client(config) as client:
        print(await client.request_llm_message("Kek, how's life behind bars?"))

asyncio.run(main())
To use YandexGPT, replace the config:
import os

config = any_llm_client.YandexGPTConfig(
    auth_header=os.environ["YANDEX_AUTH_HEADER"], folder_id=os.environ["YANDEX_FOLDER_ID"], model_name="yandexgpt"
)
Streaming API
LLMs often take a long time to respond fully. Here's an example of using the streaming API:
import asyncio

import any_llm_client

config = any_llm_client.OpenAIConfig(
    url="http://127.0.0.1:11434/v1/chat/completions",
    model_name="qwen2.5-coder:1.5b",
    request_extra={"best_of": 3}
)

async def main() -> None:
    async with (
        any_llm_client.get_client(config) as client,
        client.stream_llm_message_chunks("Kek, how's life behind bars?") as message_chunks,
    ):
        async for chunk in message_chunks:
            print(chunk, end="", flush=True)

asyncio.run(main())
Passing chat history and temperature
You can pass a list of messages instead of a str as the first argument, and set the temperature:
async with (
    any_llm_client.get_client(config) as client,
    client.stream_llm_message_chunks(
        messages=[
            any_llm_client.SystemMessage("You are an experienced assistant"),
            any_llm_client.UserMessage("Kek, how's life behind bars?"),
        ],
        temperature=1.0,
    ) as message_chunks,
):
    ...
Reasoning models
You can access OpenAI-like reasoning models and retrieve their reasoning content:
async def main() -> None:
    async with any_llm_client.get_client(config) as client:
        llm_response = await client.request_llm_message("Kek, how's life behind bars?")
        print(f"Just a regular LLM response content: {llm_response.content}")
        print(f"LLM reasoning response content: {llm_response.reasoning_content}")
...
Other
Mock client
You can use a mock client for testing:
config = any_llm_client.MockLLMConfig(
    response_message=...,
    stream_messages=["Hi!"],
)

async with any_llm_client.get_client(config, ...) as client:
    ...
Configuration with environment variables
Credentials
Instead of passing credentials directly, you can set corresponding environment variables:
- OpenAI: ANY_LLM_CLIENT_OPENAI_AUTH_TOKEN,
- YandexGPT: ANY_LLM_CLIENT_YANDEXGPT_AUTH_HEADER, ANY_LLM_CLIENT_YANDEXGPT_FOLDER_ID.
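For example, a minimal sketch for OpenAI (the token value is a placeholder; normally you would export the variable in your shell or deployment environment rather than set it in code): with ANY_LLM_CLIENT_OPENAI_AUTH_TOKEN set, auth_token can be left out of the config.

import os

import any_llm_client

# Set here only to keep the sketch self-contained; in practice export it outside the process.
os.environ["ANY_LLM_CLIENT_OPENAI_AUTH_TOKEN"] = "<your token>"

config = any_llm_client.OpenAIConfig(
    url="https://api.openai.com/v1/chat/completions",
    model_name="gpt-4o-mini",
)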
LLM model config (with pydantic-settings)
import os

import pydantic_settings

import any_llm_client

class Settings(pydantic_settings.BaseSettings):
    llm_model: any_llm_client.AnyLLMConfig

os.environ["LLM_MODEL"] = """{
    "api_type": "openai",
    "url": "http://127.0.0.1:11434/v1/chat/completions",
    "model_name": "qwen2.5-coder:1.5b",
    "request_extra": {"best_of": 3}
}"""

settings = Settings()

async with any_llm_client.get_client(settings.llm_model, ...) as client:
    ...
Combining this with the environment variables from the previous section, you can keep the LLM model configuration and secrets separate.
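As a sketch of that separation (all values are placeholders): the model description lives in LLM_MODEL, while the token comes from the credentials variable described above.

import os

import pydantic_settings

import any_llm_client

class Settings(pydantic_settings.BaseSettings):
    llm_model: any_llm_client.AnyLLMConfig

# Model configuration: no secrets inside.
os.environ["LLM_MODEL"] = """{
    "api_type": "openai",
    "url": "https://api.openai.com/v1/chat/completions",
    "model_name": "gpt-4o-mini"
}"""
# Secret: provided separately, e.g. by your deployment environment.
os.environ["ANY_LLM_CLIENT_OPENAI_AUTH_TOKEN"] = "<your token>"

settings = Settings()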
Using clients directly
The recommended way to get an LLM client is to call any_llm_client.get_client(); this way you can easily swap LLM models. If you prefer, you can use any_llm_client.OpenAIClient or any_llm_client.YandexGPTClient directly:
import os

import pydantic

import any_llm_client

config = any_llm_client.OpenAIConfig(
    url=pydantic.HttpUrl("https://api.openai.com/v1/chat/completions"),
    auth_token=os.environ["OPENAI_API_KEY"],
    model_name="gpt-4o-mini",
    request_extra={"best_of": 3}
)

async with any_llm_client.OpenAIClient(config, ...) as client:
    ...
Errors
any_llm_client.LLMClient.request_llm_message() and any_llm_client.LLMClient.stream_llm_message_chunks() will raise:
- any_llm_client.LLMError or any_llm_client.OutOfTokensOrSymbolsError when the LLM API responds with a failed HTTP status,
- any_llm_client.LLMRequestValidationError when images are passed to the YandexGPT client,
- any_llm_client.LLMResponseValidationError when an invalid response comes from the LLM API (reraised from pydantic.ValidationError).
All these exceptions inherit from the base class any_llm_client.AnyLLMClientError.
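For illustration, a minimal sketch of handling these errors around a single request (the helper function and its fallback behaviour are assumptions for this example, not part of the library):

import any_llm_client

async def ask(config: any_llm_client.AnyLLMConfig, prompt: str) -> str | None:
    async with any_llm_client.get_client(config) as client:
        try:
            llm_response = await client.request_llm_message(prompt)
        except any_llm_client.OutOfTokensOrSymbolsError:
            # Assumption for illustration: treat token/symbol limit errors as "no answer".
            return None
        except any_llm_client.AnyLLMClientError as error:
            # Base class of all any-llm-client exceptions, including LLMError
            # and the validation errors listed above.
            print(f"LLM request failed: {error!r}")
            raise
        return llm_response.content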
Timeouts, proxy & other HTTP settings
Pass custom HTTPX kwargs to any_llm_client.get_client():
import httpx
import any_llm_client
async with any_llm_client.get_client(
    ...,
    mounts={"https://api.openai.com": httpx.AsyncHTTPTransport(proxy="http://localhost:8030")},
    timeout=httpx.Timeout(None, connect=5.0),
) as client:
    ...
The default timeout is httpx.Timeout(None, connect=5.0) (5 seconds to connect; unlimited for read, write, and pool).
Retries
By default, requests are retried 3 times on HTTP status errors. You can change the retry behaviour by supplying the request_retry parameter:
async with any_llm_client.get_client(..., request_retry=any_llm_client.RequestRetryConfig(attempts=5, ...)) as client:
    ...
Passing extra data to LLM
await client.request_llm_message("Kek, how's life behind bars?", extra={"best_of": 3})
The extra parameter is merged with request_extra from OpenAIConfig.
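For example, a minimal sketch combining both (the specific extra keys are only illustrative):

config = any_llm_client.OpenAIConfig(
    url="http://127.0.0.1:11434/v1/chat/completions",
    model_name="qwen2.5-coder:1.5b",
    request_extra={"best_of": 3},  # sent with every request made through this config
)

async with any_llm_client.get_client(config) as client:
    # `extra` is combined with the config's `request_extra` for this request only.
    await client.request_llm_message("Hello!", extra={"seed": 42})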
Passing images
You can pass images to the OpenAI client (YandexGPT doesn't support images yet):
await client.request_llm_message(
    messages=[
        any_llm_client.TextContentItem("What's on the image?"),
        any_llm_client.ImageContentItem("https://upload.wikimedia.org/wikipedia/commons/a/a9/Example.jpg"),
    ]
)
You can also pass a data URL with a base64-encoded image:
import base64

await client.request_llm_message(
    messages=[
        any_llm_client.TextContentItem("What's on the image?"),
        any_llm_client.ImageContentItem(
            f"data:image/jpeg;base64,{base64.b64encode(image_content_bytes).decode('utf-8')}"
        ),
    ]
)