ALEA LLM client abstraction library for Python
ALEA LLM Client
This is a simple, two-dependency (httpx, pydantic) LLM client for OpenAI-style APIs such as:
- OpenAI (GPT-4, GPT-5, o-series)
- Anthropic (Claude 3.5, Claude 4)
- Google (Vertex AI, Gemini API)
- xAI (Grok)
- vLLM
Supported Patterns
It provides the following patterns for all endpoints:
- complete and complete_async -> str via ModelResponse
- chat and chat_async -> str via ModelResponse
- json and json_async -> dict via JSONModelResponse
- pydantic and pydantic_async -> pydantic models
- responses and responses_async -> structured output with tool use, grammar constraints, and reasoning modes
Model Registry & Capabilities
Version 0.2.0 introduces a comprehensive model registry with detailed capability tracking for 50+ models:
from alea_llm_client.llms import (
    get_models_with_context_window_gte,
    filter_models,
    compare_models,
    get_model_details,
)

# Find models with large context windows
large_context = get_models_with_context_window_gte(1000000)

# Filter by multiple criteria
efficient = filter_models(
    min_context=100000,
    capabilities=["tools", "vision"],
    tiers=["mini", "flash"],  # Can also use ModelTier.MINI, ModelTier.FLASH
    exclude_deprecated=True,
)

# Compare specific models
comparison = compare_models(["gpt-5", "claude-sonnet-4-20250514", "gemini-2.5-pro"])
Advanced Features
Grammar Constraints (GPT-5)
from alea_llm_client import OpenAIModel

model = OpenAIModel(model="gpt-5")
response = model.responses(
    input="Answer yes or no: Is 2+2=4?",
    grammar='start: "yes" | "no"',
    grammar_syntax="lark",
)
Thinking Mode (Claude 4+)
from alea_llm_client import AnthropicModel

model = AnthropicModel(model="claude-sonnet-4-20250514")
response = model.chat(
    messages=[{"role": "user", "content": "Solve this complex problem..."}],
    thinking={"enabled": True, "budget_tokens": 2000},
)
print(response.thinking)  # Access thinking content
Reasoning Tokens (o-series)
from alea_llm_client import OpenAIModel

model = OpenAIModel(model="o3-mini")
response = model.chat(
    messages=[{"role": "user", "content": "Think through this step by step..."}],
    max_completion_tokens=50000,
)
print(f"Used {response.reasoning_tokens} reasoning tokens")
Default Caching
Result caching is enabled by default for all methods.
To disable caching, you can either:
- set ignore_cache=True for each method call (complete, chat, json, pydantic)
- set ignore_cache=True as a kwarg at model construction
Cached objects are stored in ~/.alea/cache/{provider}/{endpoint_model_hash}/{call_hash}.json
in compressed .json.gz format. You can delete these files to clear the cache.
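As a rough illustration of the layout described above, the following sketch derives a cache path from a provider name, an endpoint/model pair, and the call arguments, then writes and reads a gzip-compressed JSON entry. The SHA-256 hashing scheme, the helper names (cache_path, write_cache, read_cache), and the .json.gz file suffix are assumptions made for illustration; they are not taken from the library's implementation.

```python
import gzip
import hashlib
import json
from pathlib import Path
from typing import Any, Optional


def _digest(value: Any) -> str:
    """Hash a JSON-serializable value deterministically (assumed scheme)."""
    payload = json.dumps(value, sort_keys=True, default=str).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def cache_path(root: Path, provider: str, endpoint: str, model: str, call: dict) -> Path:
    """Build {root}/{provider}/{endpoint_model_hash}/{call_hash}.json.gz."""
    endpoint_model_hash = _digest([endpoint, model])
    call_hash = _digest(call)
    return root / provider / endpoint_model_hash / f"{call_hash}.json.gz"


def write_cache(path: Path, response: dict) -> None:
    """Store a response as gzip-compressed JSON, creating parent directories."""
    path.parent.mkdir(parents=True, exist_ok=True)
    with gzip.open(path, "wt", encoding="utf-8") as handle:
        json.dump(response, handle)


def read_cache(path: Path) -> Optional[dict]:
    """Return the cached response, or None on a cache miss."""
    if not path.exists():
        return None
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        return json.load(handle)
```

Because the call arguments are serialized with sorted keys before hashing, two calls with the same arguments map to the same file regardless of keyword order. Deleting that file forces the next identical call to miss the cache.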
Authentication
Authentication is handled in the following priority order:
- an api_key provided at model construction
- a standard environment variable (e.g., ANTHROPIC_API_KEY or OPENAI_API_KEY)
- a key stored in ~/.alea/keys/{provider} (e.g., openai, anthropic, gemini, grok)
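The priority order above can be sketched as a small lookup function. This is a minimal illustration, not the library's code: the resolve_api_key name and its parameters are hypothetical, and the key file is assumed to contain only the key, possibly followed by a newline.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_api_key(
    provider: str,
    env_var: str,
    api_key: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
    keys_dir: Optional[Path] = None,
) -> Optional[str]:
    """Return the first key found, checking sources in priority order."""
    # 1. An explicit api_key argument always wins.
    if api_key:
        return api_key

    # 2. Fall back to the provider's standard environment variable.
    env = os.environ if env is None else env
    value = env.get(env_var)
    if value:
        return value

    # 3. Finally, look for a key file such as ~/.alea/keys/openai.
    keys_dir = Path.home() / ".alea" / "keys" if keys_dir is None else keys_dir
    key_file = keys_dir / provider
    if key_file.is_file():
        return key_file.read_text(encoding="utf-8").strip() or None

    return None
```

Passing env and keys_dir explicitly keeps the function easy to test without touching the real environment or home directory.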
Streaming
Given the research focus of this library, streaming generation is not supported. However,
you can access the httpx objects on .client and .async_client to stream responses
yourself if you prefer.
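If you do stream through the httpx client yourself, OpenAI-style endpoints return server-sent events, so most of the work is parsing the data: lines. The sketch below shows one way to do that with the standard library only. The helper names iter_sse_json and collect_text are hypothetical, and the commented usage assumes a request path and payload that may differ for your provider.

```python
import json
from typing import Iterable, Iterator


def iter_sse_json(lines: Iterable[str]) -> Iterator[dict]:
    """Yield the JSON payload of each data: line in a server-sent event stream."""
    for line in lines:
        line = line.strip()
        # Skip blank separators, comments, and non-data fields such as event:.
        if not line.startswith("data:"):
            continue
        payload = line[len("data:"):].strip()
        # OpenAI-style streams signal completion with a literal [DONE] marker.
        if payload == "[DONE]":
            return
        yield json.loads(payload)


def collect_text(chunks: Iterable[dict]) -> str:
    """Concatenate the text deltas of OpenAI-style chat completion chunks."""
    parts = []
    for chunk in chunks:
        for choice in chunk.get("choices", []):
            parts.append(choice.get("delta", {}).get("content") or "")
    return "".join(parts)


# Hypothetical usage with an httpx client (request path and payload are assumptions):
# with model.client.stream("POST", "/v1/chat/completions", json=payload) as response:
#     print(collect_text(iter_sse_json(response.iter_lines())))
```

Keeping the parser independent of httpx means it works on any iterable of lines, including recorded responses.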
Installation
pip install alea-llm-client
Examples
Basic JSON Example
from alea_llm_client import VLLMModel

if __name__ == "__main__":
    model = VLLMModel(
        endpoint="http://my.vllm.server:8000",
        model="meta-llama/Meta-Llama-3.1-8B-Instruct"
    )
    messages = [
        {
            "role": "user",
            "content": "Give me a JSON object with keys 'name' and 'age' for a person named Alice who is 30 years old.",
        },
    ]
    print(model.json(messages=messages, system="Respond in JSON.").data)
    # Output: {'name': 'Alice', 'age': 30}
Basic Completion Example with KL3M
from alea_llm_client import VLLMModel

if __name__ == "__main__":
    model = VLLMModel(
        model="kl3m-1.7b", ignore_cache=True
    )
    prompt = "My name is "
    print(model.complete(prompt=prompt, temperature=0.5).text)
    # Output: Dr. Hermann Kamenzi, and
Pydantic Example
from pydantic import BaseModel
from alea_llm_client import AnthropicModel, format_prompt, format_instructions

class Person(BaseModel):
    name: str
    age: int

if __name__ == "__main__":
    model = AnthropicModel(ignore_cache=True)
    instructions = [
        "Provide one random record based on the SCHEMA below.",
    ]
    prompt = format_prompt(
        {
            "instructions": format_instructions(instructions),
            "schema": Person,
        }
    )
    person = model.pydantic(prompt, system="Respond in JSON.", pydantic_model=Person)
    print(person)
    # Output: name='Olivia Chen' age=29
Design
Class Inheritance
classDiagram
BaseAIModel <|-- OpenAICompatibleModel
OpenAICompatibleModel <|-- AnthropicModel
OpenAICompatibleModel <|-- OpenAIModel
OpenAICompatibleModel <|-- VLLMModel
OpenAICompatibleModel <|-- GrokModel
BaseAIModel <|-- GoogleModel
class BaseAIModel {
<<abstract>>
}
class OpenAICompatibleModel
class AnthropicModel
class OpenAIModel
class VLLMModel
class GrokModel
class GoogleModel
Example Call Flow
sequenceDiagram
participant Client
participant BaseAIModel
participant OpenAICompatibleModel
participant SpecificModel
participant API
Client->>BaseAIModel: json()
BaseAIModel->>BaseAIModel: _retry_wrapper()
BaseAIModel->>OpenAICompatibleModel: _json()
OpenAICompatibleModel->>OpenAICompatibleModel: format()
OpenAICompatibleModel->>OpenAICompatibleModel: _make_request()
OpenAICompatibleModel->>API: HTTP POST
API-->>OpenAICompatibleModel: Response
OpenAICompatibleModel->>OpenAICompatibleModel: _handle_json_response()
OpenAICompatibleModel-->>BaseAIModel: JSONModelResponse
BaseAIModel-->>Client: JSONModelResponse
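The flow in the diagram above, where a public method on the base class wraps a provider-specific private method in retry logic, can be sketched roughly as follows. This is an illustrative stand-in, not the library's code: BaseModelSketch, FlakyModelSketch, max_retries, and retry_delay are hypothetical names, and the provider here simulates a failure instead of making an HTTP request.

```python
import time
from abc import ABC, abstractmethod
from typing import Any, Callable, Optional


class BaseModelSketch(ABC):
    """Public entry point plus retry logic; subclasses supply the request."""

    def __init__(self, max_retries: int = 3, retry_delay: float = 0.0) -> None:
        self.max_retries = max_retries
        self.retry_delay = retry_delay

    def _retry_wrapper(self, func: Callable[..., Any], *args: Any, **kwargs: Any) -> Any:
        """Call func, retrying on any exception up to max_retries attempts."""
        last_error: Optional[Exception] = None
        for attempt in range(self.max_retries):
            try:
                return func(*args, **kwargs)
            except Exception as error:  # a real client would catch narrower errors
                last_error = error
                if attempt < self.max_retries - 1:
                    time.sleep(self.retry_delay)
        raise RuntimeError(f"gave up after {self.max_retries} attempts") from last_error

    def json(self, prompt: str) -> dict:
        """Public method: delegate to the provider-specific _json with retries."""
        return self._retry_wrapper(self._json, prompt)

    @abstractmethod
    def _json(self, prompt: str) -> dict:
        """Format the request, send it, and parse the response."""


class FlakyModelSketch(BaseModelSketch):
    """Stand-in provider that fails once before succeeding (no network calls)."""

    def __init__(self) -> None:
        super().__init__()
        self.calls = 0

    def _json(self, prompt: str) -> dict:
        self.calls += 1
        if self.calls == 1:
            raise ConnectionError("simulated transient failure")
        return {"prompt": prompt, "attempt": self.calls}
```

Calling FlakyModelSketch().json("hi") fails on the first attempt and succeeds on the second, which mirrors how the retry wrapper sits between the client-facing method and the provider request in the diagram.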
License
The ALEA LLM client is released under the MIT License. See the LICENSE file for details.
Support
If you encounter any issues or have questions about using the ALEA LLM client library, please open an issue on GitHub.
Learn More
To learn more about ALEA and its software and research projects like KL3M and leeky, visit the ALEA website.