Unified Python client for OpenAI, Anthropic, Gemini, DeepSeek, Bedrock, and ChatGPT.
llmai
llmai is a Python library for working with multiple LLM providers through a shared set of message, tool, schema, and response primitives.
Today the repository includes adapters for:
- ChatGPT
- OpenAI
- DeepSeek
- Anthropic
- Google Gemini
- Amazon Bedrock
Each provider client exposes the same core entrypoint:
generate(..., stream=False)
Why This Exists
Provider SDKs differ in how they represent messages, tool calls, structured output, and streaming events. llmai smooths those differences out so application code can stay closer to one mental model.
Installation
Install the project locally with uv:
uv sync
Or install it in editable mode with pip:
pip install -e .
Quick Start
from llmai import OpenAIClient
from llmai.shared import UserMessage

client = OpenAIClient(api_key="OPENAI_API_KEY")

result = client.generate(
    model="your-openai-model",
    messages=[
        UserMessage(content="Write a two-line poem about clean interfaces."),
    ],
)

print(result.content)
print(result.usage)
print(result.duration_seconds)
For text-only prompts, UserMessage(content="...") is the simplest form. You can also pass explicit content parts like TextContentPart when you need mixed multimodal input or tighter control over message structure.
If you want to swap providers, the overall call shape stays the same. In most cases you only need to change the client class, credentials, and model name.
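Because only the client class, credentials, and model name change between providers, the choice of provider can be reduced to a lookup. The following is a minimal sketch, not part of llmai: `make_client` and the short provider keys are hypothetical names chosen for illustration, while the class names are the ones used in this README.

```python
import importlib

# Client class names as they appear in this README. The short provider keys
# on the left are hypothetical labels chosen for this sketch.
CLIENT_CLASS_NAMES = {
    "openai": "OpenAIClient",
    "chatgpt": "ChatGPTClient",
    "deepseek": "DeepSeekClient",
    "anthropic": "AnthropicClient",
    "gemini": "GoogleClient",
    "bedrock": "BedrockClient",
}


def make_client(provider, **client_kwargs):
    """Build a provider client from a short provider key (hypothetical helper)."""
    try:
        class_name = CLIENT_CLASS_NAMES[provider]
    except KeyError:
        raise ValueError(f"unknown provider: {provider!r}") from None
    # Import lazily so the lookup table can be inspected without llmai installed.
    module = importlib.import_module("llmai")
    return getattr(module, class_name)(**client_kwargs)
```

With llmai installed, `make_client("anthropic", api_key="ANTHROPIC_API_KEY")` would return the same kind of object as calling `AnthropicClient(...)` directly, and the `generate(...)` call that follows stays unchanged.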
ChatGPT
from llmai import ChatGPTClient
from llmai.shared import UserMessage

client = ChatGPTClient(access_token="CHATGPT_ACCESS_TOKEN")

result = client.generate(
    model="chatgpt-4o-latest",
    messages=[
        UserMessage(content="Write a two-line poem about clean interfaces."),
    ],
)

print(result.content)
ChatGPTClient targets ChatGPT's Codex backend at https://chatgpt.com/backend-api/codex. It always uses the Responses API internally, and reads CHATGPT_ACCESS_TOKEN or CODEX_ACCESS_TOKEN by default, with optional CHATGPT_ACCOUNT_ID or CODEX_ACCOUNT_ID.
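If you prefer to resolve credentials yourself before constructing the client, the environment lookup described above might be approximated as follows. This is a sketch, not llmai's own code: `resolve_chatgpt_credentials` is a hypothetical helper, and the precedence of the `CHATGPT_*` variables over the `CODEX_*` ones is an assumption, since the README does not state the order.

```python
import os


def resolve_chatgpt_credentials(env=None):
    """Return (access_token, account_id) from the environment (hypothetical helper)."""
    env = os.environ if env is None else env
    # Assumption: CHATGPT_* wins over CODEX_* when both are set.
    token = env.get("CHATGPT_ACCESS_TOKEN") or env.get("CODEX_ACCESS_TOKEN")
    if not token:
        raise RuntimeError("set CHATGPT_ACCESS_TOKEN or CODEX_ACCESS_TOKEN")
    # The account id is optional, so a missing value is returned as None.
    account_id = env.get("CHATGPT_ACCOUNT_ID") or env.get("CODEX_ACCOUNT_ID")
    return token, account_id
```

The returned token can then be passed as `ChatGPTClient(access_token=token)`, as in the example above.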
DeepSeek
from llmai import DeepSeekClient
from llmai.shared import JSONSchemaResponse, UserMessage

client = DeepSeekClient()

result = client.generate(
    model="deepseek-chat",
    messages=[
        UserMessage(content="Return a JSON object with one field named answer."),
    ],
    response_format=JSONSchemaResponse(
        name="final_answer",
        json_schema={
            "type": "object",
            "properties": {
                "answer": {"type": "string"},
            },
            "required": ["answer"],
        },
    ),
    stream=True,
)
DeepSeekClient uses the OpenAI SDK against DeepSeek's OpenAI-compatible API and reads DEEPSEEK_API_KEY by default. For structured output, it always uses an internal function-tool schema because DeepSeek does not support response_format={"type":"json_schema"}. During streaming, the internal response tool is surfaced as incremental JSON content chunks, and the stream still ends with parsed JSON in ResponseStreamCompletionChunk.content. If you need DeepSeek's server-side strict tool enforcement, point base_url at https://api.deepseek.com/beta.
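Incremental JSON content chunks are not valid JSON on their own, so a consumer usually buffers them and parses once the stream is complete. The sketch below illustrates that idea with plain strings and the standard library only. `collect_json_stream` and the example fragments are hypothetical, and in practice the parsed value is already available on the final completion chunk.

```python
import json


def collect_json_stream(text_fragments):
    """Join incremental JSON text fragments and parse them once complete."""
    return json.loads("".join(text_fragments))


# Hypothetical fragments, split the way a streamed response might arrive.
fragments = ['{"ans', 'wer": "4', '2"}']

# A single fragment cannot be parsed on its own.
try:
    json.loads(fragments[0])
    partial_is_valid = True
except json.JSONDecodeError:
    partial_is_valid = False

parsed = collect_json_stream(fragments)
```

Buffering like this is mainly useful for showing progress while the response arrives. For the final value, reading the parsed content from the completion chunk avoids parsing twice.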
Amazon Bedrock
from llmai import BedrockClient
from llmai.shared import UserMessage

client = BedrockClient(
    region="us-east-1",
    aws_access_key_id="AWS_ACCESS_KEY_ID",
    aws_secret_access_key="AWS_SECRET_ACCESS_KEY",
)

# Or use Bedrock API-key auth:
# client = BedrockClient(region="us-east-1", api_key="BEDROCK_API_KEY")

result = client.generate(
    model="us.anthropic.claude-3-5-haiku-20241022-v1:0",
    messages=[
        UserMessage(content="Say hello."),
    ],
)

print(result.content)
Structured Output
from pydantic import BaseModel

from llmai import GoogleClient
from llmai.shared import JSONSchemaResponse, UserMessage


class Summary(BaseModel):
    title: str
    bullets: list[str]


client = GoogleClient(api_key="GOOGLE_API_KEY")

result = client.generate(
    model="your-google-model",
    messages=[
        UserMessage(content="Summarize retrieval-augmented generation in simple terms."),
    ],
    response_format=JSONSchemaResponse(json_schema=Summary),
)

print(result.content)
Use JSONSchemaResponse, JSONObjectResponse, or TextResponse to request different response shapes.
Multimodal Content
from llmai import GoogleClient
from llmai.shared import ImageContentPart, TextContentPart, UserMessage

client = GoogleClient(api_key="GOOGLE_API_KEY")

result = client.generate(
    model="your-google-model",
    messages=[
        UserMessage(
            content=[
                TextContentPart(text="Describe this image."),
                ImageContentPart(url="https://example.com/cat.png"),
            ]
        ),
    ],
)

print(result.content)
print(result.thinking)
Use explicit content parts when you need multimodal inputs or want to mix text with images in one message. Normal completion content is surfaced as list[TextContentPart | ImageContentPart] when the provider returns message content, including text-only replies. Reasoning is exposed on ResponseContent.thinking as list[str] when the provider returns one or more thinking blocks, and the same value is also available on the final AssistantMessage.
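Since text-only replies also arrive as a list of parts, callers that just want a string need to join the text parts themselves. Here is a minimal sketch of that step. `FakeTextPart` and `FakeImagePart` are stand-ins for `TextContentPart` and `ImageContentPart` so the example runs without llmai, and `join_text` is a hypothetical helper that assumes text parts expose a `text` attribute, as the constructor call `TextContentPart(text=...)` suggests.

```python
from dataclasses import dataclass


@dataclass
class FakeTextPart:
    """Stand-in for TextContentPart (assumed to expose .text)."""
    text: str


@dataclass
class FakeImagePart:
    """Stand-in for ImageContentPart (assumed to expose .url)."""
    url: str


def join_text(parts):
    """Concatenate the text of every text part, skipping non-text parts."""
    if not parts:
        return ""
    return "".join(
        part.text for part in parts if isinstance(getattr(part, "text", None), str)
    )


reply = [
    FakeTextPart("A "),
    FakeImagePart("https://example.com/cat.png"),
    FakeTextPart("cat."),
]
summary = join_text(reply)
```

Checking for the attribute rather than the class keeps the helper independent of the concrete part types, at the cost of silently skipping anything without a string `text`.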
Tool Calling
from pydantic import BaseModel

from llmai import OpenAIClient
from llmai.shared import Tool, ToolResponseMessage, UserMessage


class WeatherArgs(BaseModel):
    city: str


weather_tool = Tool(
    name="get_weather",
    description="Look up the weather for a city.",
    schema=WeatherArgs,
)

client = OpenAIClient(api_key="OPENAI_API_KEY")

first = client.generate(
    model="your-openai-model",
    messages=[
        UserMessage(content="What is the weather in Kathmandu?"),
    ],
    tools=[weather_tool],
    tool_choice={"tools": ["get_weather"]},
)

for tool_call in first.tool_calls:
    if tool_call.name != "get_weather":
        continue

    follow_up = client.generate(
        model="your-openai-model",
        messages=[
            *first.messages,
            ToolResponseMessage(
                id=tool_call.id,
                content=["It is sunny in Kathmandu."],
            ),
        ],
        tools=[weather_tool],
    )
    print(follow_up.content)
llmai returns tool calls in first.tool_calls and leaves execution to the caller.
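Because execution is left to the caller, applications with several tools often route calls through a small name-to-handler registry. The sketch below shows one way to do that. `FakeToolCall` is a stand-in for the objects in `first.tool_calls`, and its `arguments` attribute is an assumption (the README only shows `name` and `id`). `HANDLERS` and `run_tool_calls` are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class FakeToolCall:
    """Stand-in for a returned tool call; `arguments` is an assumed attribute."""
    id: str
    name: str
    arguments: dict = field(default_factory=dict)


def get_weather(city):
    """Toy handler; a real one would query a weather service."""
    return f"It is sunny in {city}."


# Registry mapping tool names to the Python callables that implement them.
HANDLERS = {"get_weather": get_weather}


def run_tool_calls(tool_calls, handlers):
    """Execute each call and return (tool_call_id, output) pairs in order."""
    results = []
    for call in tool_calls:
        handler = handlers.get(call.name)
        if handler is None:
            # Report unknown tools back to the model instead of raising.
            output = f"Unknown tool: {call.name}"
        else:
            output = handler(**call.arguments)
        results.append((call.id, output))
    return results


outputs = run_tool_calls(
    [FakeToolCall(id="call_1", name="get_weather", arguments={"city": "Kathmandu"})],
    HANDLERS,
)
```

Each `(id, output)` pair could then be wrapped as `ToolResponseMessage(id=..., content=[output])` and appended to `first.messages` for the follow-up request, as in the example above.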
Hosted Web Search
llmai also supports a provider-hosted web search tool that is not a function tool:
from llmai import OpenAIClient
from llmai.shared import UserMessage, WebSearchTool

client = OpenAIClient(api_key="OPENAI_API_KEY")

result = client.generate(
    model="your-openai-model",
    messages=[
        UserMessage(content="What was a positive news story from today? Cite sources."),
    ],
    tools=[WebSearchTool()],
    api_type="responses",
)

print(result.content)
print(result.thinking)
You can also target it explicitly in tool_choice:
tool_choice = {
    "mode": "required",
    "tools": ["web_search"],
}
Current llmai behavior:
- OpenAI Responses: attaches built-in web_search
- ChatGPT/Codex: attaches built-in web_search
- Anthropic: attaches Anthropic's hosted web-search tool
- Google Gemini: attaches google_search
- OpenAI Chat Completions: ignores hosted web_search
- DeepSeek: ignores hosted web_search
- Amazon Bedrock: ignores hosted web_search
web_search can be mixed with normal function tools in the same request.
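Since some backends silently ignore the hosted tool, an application that depends on search results may want to check the target backend before sending a request. The sketch below encodes the behavior listed above as a lookup table. The backend keys and the `ignores_hosted_search` helper are hypothetical, not part of llmai.

```python
# True means the backend attaches a hosted search tool, False means the
# WebSearchTool entry is ignored. Keys are hypothetical labels for this sketch.
ATTACHES_HOSTED_SEARCH = {
    "openai-responses": True,
    "chatgpt": True,
    "anthropic": True,
    "gemini": True,
    "openai-chat-completions": False,
    "deepseek": False,
    "bedrock": False,
}


def ignores_hosted_search(backend):
    """Return True when a hosted web search request would be dropped."""
    if backend not in ATTACHES_HOSTED_SEARCH:
        raise ValueError(f"unknown backend: {backend!r}")
    return not ATTACHES_HOSTED_SEARCH[backend]
```

A caller could use this to log a warning, or to fall back to a function tool that performs the search itself, when `ignores_hosted_search(...)` returns `True`.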
Streaming
from llmai import AnthropicClient
from llmai.shared import UserMessage

client = AnthropicClient(api_key="ANTHROPIC_API_KEY")

for chunk in client.generate(
    model="your-anthropic-model",
    messages=[
        UserMessage(content="Explain recursion in one paragraph."),
    ],
    stream=True,
):
    if chunk.type == "content":
        print(chunk.chunk, end="")
generate(..., stream=True) yields ResponseStreamChunk markers with event="start" and event="end" around each content, thinking, and tool section. If a provider returns multiple reasoning blocks, each block gets its own thinking start/end pair. The final ResponseStreamCompletionChunk also includes top-level content and thinking.
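The start/end markers make it possible to regroup a flat stream into sections, for example to show reasoning and the answer separately. The following sketch assumes that marker chunks carry `type` and `event` attributes and that data chunks carry their text in `chunk`. Only `chunk.type == "content"` and `chunk.chunk` appear in the example above, so the rest of that shape is an assumption. `FakeChunk` and `collect_sections` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FakeChunk:
    """Stand-in for a stream chunk; the field layout is assumed for this sketch."""
    type: str
    event: Optional[str] = None
    chunk: str = ""


def collect_sections(chunks):
    """Group streamed text into (section_type, text) pairs using start/end markers."""
    sections = []
    current_type, current_text = None, []
    for item in chunks:
        event = getattr(item, "event", None)
        if event == "start":
            # A new section begins; reset the buffer.
            current_type, current_text = item.type, []
        elif event == "end":
            # Close the open section and record its accumulated text.
            sections.append((current_type, "".join(current_text)))
            current_type, current_text = None, []
        elif current_type is not None and item.type == current_type:
            current_text.append(item.chunk)
    return sections


stream = [
    FakeChunk("thinking", event="start"),
    FakeChunk("thinking", chunk="hmm"),
    FakeChunk("thinking", event="end"),
    FakeChunk("content", event="start"),
    FakeChunk("content", chunk="Hel"),
    FakeChunk("content", chunk="lo"),
    FakeChunk("content", event="end"),
]
sections = collect_sections(stream)
```

With multiple reasoning blocks, each start/end pair would produce its own `("thinking", ...)` entry, which matches the per-block markers described above.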
Package Layout
- llmai/openai: OpenAI adapter
- llmai/deepseek: DeepSeek adapter
- llmai/anthropic: Anthropic adapter
- llmai/google: Google Gemini adapter
- llmai/bedrock: Amazon Bedrock adapter
- llmai/shared: common message, tool, schema, and response models
Core Types
The shared layer includes the main primitives you will use across providers:
- UserMessage, SystemMessage, AssistantMessage
- TextContentPart, ImageContentPart
- Tool, WebSearchTool, ToolResponseMessage
- JSONSchemaResponse, JSONObjectResponse, TextResponse
- ResponseContent, ResponseStreamChunk, ResponseStreamContentChunk, ResponseStreamThinkingChunk, ResponseStreamToolChunk, ResponseStreamToolCompleteChunk, ResponseStreamCompletionChunk
- ResponseUsage