
any-llm-client

A unified and lightweight asynchronous Python API for communicating with LLMs.

Supports multiple providers, including OpenAI Chat Completions API (and any OpenAI-compatible API, such as Ollama and vLLM) and YandexGPT API.

How To Use

Before using any-llm-client, make sure it is installed:

uv add any-llm-client
poetry add any-llm-client

Response API

Here's a full example that uses Ollama and Qwen2.5-Coder:

import asyncio

import any_llm_client


config = any_llm_client.OpenAIConfig(
    url="http://127.0.0.1:11434/v1/chat/completions",
    model_name="qwen2.5-coder:1.5b",
    request_extra={"best_of": 3}
)


async def main() -> None:
    async with any_llm_client.get_client(config) as client:
        print(await client.request_llm_message("Hey, how's it going?"))


asyncio.run(main())

To use YandexGPT, replace the config:

import os

config = any_llm_client.YandexGPTConfig(
    auth_header=os.environ["YANDEX_AUTH_HEADER"],
    folder_id=os.environ["YANDEX_FOLDER_ID"],
    model_name="yandexgpt",
)

Streaming API

LLMs often take a long time to respond fully. Here's an example of the streaming API:

import asyncio

import any_llm_client


config = any_llm_client.OpenAIConfig(
    url="http://127.0.0.1:11434/v1/chat/completions",
    model_name="qwen2.5-coder:1.5b",
    request_extra={"best_of": 3}
)


async def main() -> None:
    async with (
        any_llm_client.get_client(config) as client,
        client.stream_llm_message_chunks("Hey, how's it going?") as message_chunks,
    ):
        async for chunk in message_chunks:
            print(chunk, end="", flush=True)


asyncio.run(main())

Passing chat history and temperature

You can pass a list of messages instead of a str as the first argument, and set the temperature:

async with (
    any_llm_client.get_client(config) as client,
    client.stream_llm_message_chunks(
        messages=[
            any_llm_client.SystemMessage("You are an experienced assistant"),
            any_llm_client.UserMessage("Hey, how's it going?"),
        ],
        temperature=1.0,
    ) as message_chunks,
):
    ...

Reasoning models

You can access OpenAI-compatible reasoning models and retrieve their reasoning content:

async def main() -> None:
    async with any_llm_client.get_client(config) as client:
        llm_response = await client.request_llm_message("Hey, how's it going?")
        print(f"Just a regular LLM response content: {llm_response.content}")
        print(f"LLM reasoning response content: {llm_response.reasoning_content}")

Other

Mock client

You can use a mock client for testing:

config = any_llm_client.MockLLMConfig(
    response_message=...,
    stream_messages=["Hi!"],
)

async with any_llm_client.get_client(config, ...) as client:
    ...

Configuration with environment variables

Credentials

Instead of passing credentials directly, you can set corresponding environment variables:

  • OpenAI: ANY_LLM_CLIENT_OPENAI_AUTH_TOKEN,
  • YandexGPT: ANY_LLM_CLIENT_YANDEXGPT_AUTH_HEADER, ANY_LLM_CLIENT_YANDEXGPT_FOLDER_ID.
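A minimal sketch, using the OpenAI variable name from the list above: export the credential in the environment so no token has to appear in the config in code.

```python
import os

# Sketch: set the provider-specific credential variable (name taken from the
# list above) instead of passing auth_token to the config.
os.environ["ANY_LLM_CLIENT_OPENAI_AUTH_TOKEN"] = "example-token"  # placeholder value
```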

LLM model config (with pydantic-settings)

import os

import pydantic_settings

import any_llm_client


class Settings(pydantic_settings.BaseSettings):
    llm_model: any_llm_client.AnyLLMConfig


os.environ["LLM_MODEL"] = """{
    "api_type": "openai",
    "url": "http://127.0.0.1:11434/v1/chat/completions",
    "model_name": "qwen2.5-coder:1.5b",
    "request_extra": {"best_of": 3}
}"""
settings = Settings()

async with any_llm_client.get_client(settings.llm_model, ...) as client:
    ...

Combining this with the environment variables from the previous section, you can keep LLM model configuration and secrets separate.
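For a stdlib-only view of what pydantic-settings does in the snippet above, here is a sketch that reads the same LLM_MODEL variable and parses its JSON payload into a plain dict:

```python
import json
import os

# Stdlib-only sketch of the parsing step pydantic-settings performs above:
# read the LLM_MODEL environment variable and decode its JSON payload.
os.environ["LLM_MODEL"] = """{
    "api_type": "openai",
    "url": "http://127.0.0.1:11434/v1/chat/completions",
    "model_name": "qwen2.5-coder:1.5b",
    "request_extra": {"best_of": 3}
}"""

llm_model = json.loads(os.environ["LLM_MODEL"])
print(llm_model["model_name"])
```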

Using clients directly

The recommended way to get an LLM client is to call any_llm_client.get_client(); this makes it easy to swap LLM models. If you prefer, you can use any_llm_client.OpenAIClient or any_llm_client.YandexGPTClient directly:

import os

import pydantic

import any_llm_client


config = any_llm_client.OpenAIConfig(
    url=pydantic.HttpUrl("https://api.openai.com/v1/chat/completions"),
    auth_token=os.environ["OPENAI_API_KEY"],
    model_name="gpt-4o-mini",
    request_extra={"best_of": 3},
)

async with any_llm_client.OpenAIClient(config, ...) as client:
    ...

Errors

any_llm_client.LLMClient.request_llm_message() and any_llm_client.LLMClient.stream_llm_message_chunks() will raise:

  • any_llm_client.LLMError or any_llm_client.OutOfTokensOrSymbolsError when the LLM API responds with a failed HTTP status,
  • any_llm_client.LLMRequestValidationError when images are passed to the YandexGPT client,
  • any_llm_client.LLMResponseValidationError when an invalid response comes from the LLM API (reraised from pydantic.ValidationError).

All these exceptions inherit from the base class any_llm_client.AnyLLMClientError.
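As an illustrative stand-in (these are not the real class definitions, and the exact parent of OutOfTokensOrSymbolsError is an assumption), the hierarchy described above means one except clause can cover every library error:

```python
# Illustrative stand-ins mirroring the exception hierarchy listed above;
# the real classes live in any_llm_client.
class AnyLLMClientError(Exception): ...
class LLMError(AnyLLMClientError): ...
class OutOfTokensOrSymbolsError(LLMError): ...
class LLMRequestValidationError(AnyLLMClientError): ...
class LLMResponseValidationError(AnyLLMClientError): ...

try:
    raise OutOfTokensOrSymbolsError("prompt is too long")
except AnyLLMClientError as error:  # catches every library error at once
    caught = type(error).__name__

print(caught)  # OutOfTokensOrSymbolsError
```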

Timeouts, proxy & other HTTP settings

Pass custom HTTPX kwargs to any_llm_client.get_client():

import httpx

import any_llm_client


async with any_llm_client.get_client(
    ...,
    mounts={"https://api.openai.com": httpx.AsyncHTTPTransport(proxy="http://localhost:8030")},
    timeout=httpx.Timeout(None, connect=5.0),
) as client:
    ...

Default timeout is httpx.Timeout(None, connect=5.0) (5 seconds on connect, unlimited on read, write or pool).

Retries

By default, requests are retried 3 times on HTTP status errors. You can change the retry behaviour by supplying the request_retry parameter:

async with any_llm_client.get_client(..., request_retry=any_llm_client.RequestRetryConfig(attempts=5, ...)) as client:
    ...
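The "retry N times on HTTP status errors" behaviour described above can be sketched as follows (a simplified stand-in, not the library's actual implementation):

```python
class HTTPStatusError(Exception):
    """Stand-in for an HTTP error raised by the transport."""

def request_with_retries(send, attempts: int = 3):
    """Call `send` up to `attempts` times, re-raising the last HTTP error."""
    last_error = None
    for _ in range(attempts):
        try:
            return send()
        except HTTPStatusError as error:
            last_error = error
    raise last_error

# A flaky endpoint that fails twice, then succeeds on the third call:
calls = {"count": 0}

def flaky_send():
    calls["count"] += 1
    if calls["count"] < 3:
        raise HTTPStatusError("503 Service Unavailable")
    return "ok"

result = request_with_retries(flaky_send, attempts=3)
print(result)  # ok
```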

Passing extra data to LLM

await client.request_llm_message("Hey, how's it going?", extra={"best_of": 3})

The extra parameter is merged with request_extra from OpenAIConfig.
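The merge can be pictured as a plain dict union; note that the precedence of per-request keys over config keys shown here is an assumption, not verified against the source:

```python
# Hypothetical illustration of how per-request `extra` combines with the
# config's `request_extra`; per-request keys winning is an assumption.
request_extra = {"best_of": 3, "seed": 7}   # from OpenAIConfig
extra = {"best_of": 5}                      # passed per request

merged = {**request_extra, **extra}
print(merged)  # {'best_of': 5, 'seed': 7}
```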

Passing images

You can pass images to the OpenAI client (YandexGPT doesn't support images yet):

await client.request_llm_message(
    messages=[
        any_llm_client.TextContentItem("What's on the image?"),
        any_llm_client.ImageContentItem("https://upload.wikimedia.org/wikipedia/commons/a/a9/Example.jpg"),
    ]
)

You can also pass a data URL with a base64-encoded image:

import base64

await client.request_llm_message(
    messages=[
        any_llm_client.TextContentItem("What's on the image?"),
        any_llm_client.ImageContentItem(
            f"data:image/jpeg;base64,{base64.b64encode(image_content_bytes).decode('utf-8')}"
        ),
    ]
)
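A small stdlib helper (hypothetical, not part of any-llm-client) for building such data URLs from raw bytes:

```python
import base64

def to_data_url(image_bytes: bytes, mime: str = "image/jpeg") -> str:
    """Encode raw image bytes as a data URL suitable for ImageContentItem."""
    encoded = base64.b64encode(image_bytes).decode("utf-8")
    return f"data:{mime};base64,{encoded}"

url = to_data_url(b"\x00\x01")
print(url)  # data:image/jpeg;base64,AAE=
```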
