SUNWÆE gen — multi-provider LLM engine library.
All LLMs, one response format, one dependency (httpx). Supports switching models mid-conversation (e.g. draft with GPT, refine with Claude).
Handles streaming, tool calls, file attachments, prompt caching, extended thinking, and cost tracking across Anthropic, OpenAI, Google, DeepSeek, xAI, and Moonshot.
Install
```shell
pip install sunwaee
pip install "sunwaee[files]"    # pdf, docx, xlsx, pptx extraction
pip install -e ".[dev,files]"   # development
```
Quick start
```python
import asyncio

from sunwaee.modules.gen.engine import get_engine
from sunwaee.modules.gen.engine.types import Message, Role

engine = get_engine("anthropic", "claude-sonnet-4-6")  # reads ANTHROPIC_API_KEY

async def main():
    messages = [Message(role=Role.USER, content="Hello")]

    response = await engine.chat(messages)
    print(response.content, response.cost.total)

    async for chunk in engine.stream(messages):
        if chunk.content:
            print(chunk.content, end="", flush=True)

asyncio.run(main())
```
Providers
| Provider | `provider=` | Env var |
|---|---|---|
| Anthropic | `"anthropic"` | `ANTHROPIC_API_KEY` |
| OpenAI | `"openai"` | `OPENAI_API_KEY` |
| Google | `"google"` | `GOOGLE_API_KEY` |
| DeepSeek | `"deepseek"` | `DEEPSEEK_API_KEY` |
| xAI | `"xai"` | `XAI_API_KEY` |
| Moonshot | `"moonshot"` | `MOONSHOT_API_KEY` |
Directory structure
```
sunwaee/
├── core/
│   ├── logger.py            # get_logger(name) — scoped under "sunwaee.*"
│   └── tools.py             # @tool decorator, ok(), err()
└── modules/gen/
    ├── __init__.py          # public re-exports (get_engine, run, stream_run, …)
    ├── agent.py             # ReAct loop — run() + stream_run()
    ├── tools.py             # TOOLS list
    └── engine/
        ├── __init__.py      # get_engine, Message, Response, Tool, …
        ├── base.py          # BaseEngine ABC
        ├── factory.py       # get_engine() — provider routing + connection pooling
        ├── model.py         # Model dataclass + compute_cost()
        ├── types.py         # Message, Response, ToolCall, Usage, Cost, Performance, …
        ├── models/          # model registry per provider
        │   ├── __init__.py  # get_model(), list_models()
        │   └── anthropic.py / openai.py / google.py / deepseek.py / xai.py / moonshot.py
        └── providers/
            ├── anthropic.py # AnthropicEngine
            ├── openai.py    # OpenAIEngine (also used by DeepSeek, xAI, Moonshot)
            └── google.py    # GoogleEngine

tests/gen/
├── test_agent.py / test_stream_agent.py / test_tools.py
└── engine/
    ├── test_types.py / test_factory.py / test_model.py
    ├── providers/
    │   └── test_anthropic.py / test_openai.py / test_google.py
    └── live/
        ├── test_providers.py  # real API calls, all providers × all scenarios
        └── run/               # JSON snapshots (gitignored)
```
Core types (engine/types.py)
```python
class Role(Enum): SYSTEM, USER, ASSISTANT, TOOL, CONTEXT
class StopReason(Enum): END_TURN, TOOL_USE, MAX_TOKENS

@dataclass
class Message:
    role: Role
    content: str | None
    reasoning_content: str | None    # thinking, for models that support it
    reasoning_signature: str | None  # opaque blob — echo back verbatim
    tool_call_id: str | None         # set on Role.TOOL messages
    tool_calls: list[ToolCall] | None
    attachments: list[FileAttachment] | None  # Role.USER only

@dataclass
class Error:
    message: str = ""
    status_code: int = 0  # defined but never populated — errors are raised

@dataclass
class Response:
    provider: str; model: str; streaming: bool; synthetic: bool
    content: str | None; reasoning_content: str | None; reasoning_signature: str | None
    tool_calls: list[ToolCall] | None; stop_reason: StopReason | None; error: Error | None
    usage: Usage | None; cost: Cost | None; performance: Performance | None

@dataclass
class ToolCall:
    id: str; name: str; arguments: dict
    thought_signature: str | None  # Google only — echo back every subsequent turn
    error: str | None; duration: float; results: list[dict]

@dataclass
class Usage:
    input_tokens: int; output_tokens: int; total_tokens: int
    cache_read_tokens: int; cache_write_tokens: int

@dataclass
class Cost:
    input: float; output: float; cache_read: float; cache_write: float; total: float

@dataclass
class Performance:
    latency: float  # seconds to first chunk
    reasoning_duration: float; content_duration: float; total_duration: float
    throughput: int  # output tokens / second

@dataclass
class FileAttachment:
    data: bytes; filename: str; media_type: str = ""
    # text/* → <file name="…">…</file> block
    # image/jpeg|png|gif|webp → base64 inline
    # application/pdf|json + OOXML (docx/xlsx/pptx) → extracted text
```
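The `text/*` dispatch above implies text attachments are inlined as tagged blocks. A minimal sketch of that wrapping, assuming UTF-8 content (the helper name is hypothetical, not part of the library):

```python
# Hypothetical sketch of the "<file name=…>…</file>" convention described
# above: a text/* attachment is wrapped so the model can distinguish file
# content from the user's own words.
def render_text_attachment(filename: str, data: bytes) -> str:
    return f'<file name="{filename}">{data.decode("utf-8")}</file>'

block = render_text_attachment("notes.txt", b"hello world")
print(block)  # <file name="notes.txt">hello world</file>
```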
Usage
With tools
```python
from sunwaee.core.tools import tool, ok, err

@tool("Return the current UTC time.")
def get_time() -> str:
    from datetime import datetime, timezone
    return ok({"time": datetime.now(timezone.utc).isoformat()})

response = await engine.chat(messages, tools=[get_time._tool])
```
File and image attachments
```python
from sunwaee.modules.gen.engine.types import FileAttachment, Message, Role

with open("report.pdf", "rb") as f:
    att = FileAttachment(data=f.read(), filename="report.pdf")

response = await engine.chat([Message(role=Role.USER, content="Summarise.", attachments=[att])])
```

Supported: `text/*`, `application/json`, `image/jpeg|png|gif|webp`, `application/pdf`, `.docx`, `.xlsx`, `.pptx`
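`media_type` defaults to an empty string. One plausible way to fill it from the filename is the standard-library `mimetypes` module; this is a caller-side sketch, not necessarily how sunwaee infers the type internally:

```python
import mimetypes

def guess_media_type(filename: str) -> str:
    # Fall back to "", matching FileAttachment's media_type default
    media_type, _encoding = mimetypes.guess_type(filename)
    return media_type or ""

print(guess_media_type("report.pdf"))  # application/pdf
print(guess_media_type("photo.jpeg"))  # image/jpeg
```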
ReAct agent loop
```python
from sunwaee.modules.gen.agent import stream_run

new_messages = []
async for chunk in stream_run(messages, tools, engine, new_messages=new_messages):
    if chunk.content:
        print(chunk.content, end="", flush=True)

# new_messages has all assistant + tool turns appended during the run
```
Runs up to 10 iterations by default. Concurrent tool calls are executed via asyncio.gather; sync tools are dispatched via run_in_executor.
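The dispatch behavior described above (concurrent execution via `asyncio.gather`, sync tools pushed to a thread pool via `run_in_executor`) can be sketched independently of the library. Every name here is illustrative, not sunwaee's internals:

```python
import asyncio
import json

# Illustrative tools: one async, one sync, since the agent loop supports both.
async def fetch(url: str) -> str:
    await asyncio.sleep(0)  # stand-in for real I/O
    return json.dumps({"ok": True, "data": url})

def add(a: int, b: int) -> str:
    return json.dumps({"ok": True, "data": a + b})

async def dispatch(calls: list[tuple]) -> list[str]:
    loop = asyncio.get_running_loop()
    tasks = []
    for fn, args in calls:
        if asyncio.iscoroutinefunction(fn):
            tasks.append(fn(*args))  # async tool: schedule directly
        else:
            tasks.append(loop.run_in_executor(None, fn, *args))  # sync tool: thread pool
    return await asyncio.gather(*tasks)  # all tool calls run concurrently

results = asyncio.run(dispatch([(fetch, ("https://example.com",)), (add, (2, 3))]))
print(results)
```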
Listing models
```python
from sunwaee.modules.gen.engine.models import list_models, get_model

all_models = list_models()              # list[Model]
model = get_model("claude-sonnet-4-6")  # Model | None
```
Testing
```shell
pytest tests/gen/ -m "not live"   # unit (no keys needed)
pytest tests/gen/ -m live         # live (real API calls)
pytest tests/gen/ -m "not live" --cov=sunwaee --cov-report=term-missing
```
Unit test conventions:
- Mock `httpx.AsyncClient` — never make real HTTP calls
- Assert `response.cost`, `response.usage`, `response.performance` populated on final chunk
- For streaming, use an async generator as mock transport
Live test scenarios (all providers × chat + stream):
| Scenario | What it tests |
|---|---|
| ONLY_SYSTEM | System-only input edge case; lenient assertions |
| ONLY_USER | Single user message |
| SYSTEM_AND_USER | System prompt respected in response |
| TOOL_CALL | Model must issue at least one tool call |
| TOOL_CALL_RESULT | Full multi-turn with real tool IDs/signatures |
| FILE_ATTACHMENT | Text file attached; asserts content populated |
| CONTEXT_ROLE | Role.CONTEXT message handled without errors |
Image attachments tested separately, parametrized over vision-capable engines only.
How to add a model
File: `sunwaee/modules/gen/engine/models/<provider>.py`
```python
Model(
    name="provider-model-name",
    display_name="Human Readable Name",
    provider="anthropic",
    context_window=200_000,
    max_output_tokens=64_000,
    input_price_per_mtok=3.0,
    output_price_per_mtok=15.0,
    cache_read_price_per_mtok=0.3,
    cache_write_price_per_mtok=3.75,
    input_price_per_mtok_200k=6.0,  # omit if no >200k tier
    output_price_per_mtok_200k=22.5,
    supports_vision=True, supports_tools=True,
    supports_thinking=True, supports_reasoning_tokens=True,
    cache_min_tokens=1_024,  # omit (None) if caching is undocumented
    release_date="2025-01-01",
)
```
Pricing tiers (`engine/model.py`): base required; `_128k` when `input_tokens > 128_000` (xAI only); `_200k` when `> 200_000`; `_272k` when `> 272_000` (OpenAI only). Thresholds are strict `>` — exactly at the boundary uses the lower tier.
`cache_min_tokens` — minimum tokens required at a cache breakpoint for prompt caching to activate. `None` = no caching. `0` = no minimum (caches everything). Known values:
| Provider | Minimum | Models |
|---|---|---|
| Anthropic | 4,096 | Opus 4.6, Opus 4.5, Haiku 4.5 |
| Anthropic | 2,048 | Sonnet 4.6 |
| Anthropic | 1,024 | Sonnet 4.5 |
| OpenAI | 1,024 | All models (automatic prefix caching) |
| Google | 1,024 | All models (explicit context caching) |
| xAI | 0 | All models (automatic, no minimum) |
| DeepSeek | 64 | All models (automatic prefix caching) |
| Moonshot | 0 | All models (automatic, no minimum) |
Tests: add assertions in `engine/test_model.py` for non-standard pricing tiers.
How to add an OpenAI-compatible provider
1. `engine/models/<provider>.py` — `MODELS` list
2. `engine/models/__init__.py` — import + add to `_ALL`
3. `engine/factory.py` — add to `_OPENAI_COMPATIBLE: dict[str, str]` (env var auto-derived as `PROVIDER_API_KEY`)
4. `tests/gen/engine/live/test_providers.py` — add `("provider", "cheapest-model")` to `ENGINES`
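The auto-derived env var name presumably comes from uppercasing the provider key, consistent with the `PROVIDER_API_KEY` convention above. A sketch (function name is illustrative):

```python
def derive_env_var(provider: str) -> str:
    # "deepseek" -> "DEEPSEEK_API_KEY", matching the factory convention
    return f"{provider.upper()}_API_KEY"

print(derive_env_var("deepseek"))  # DEEPSEEK_API_KEY
```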
How to add a provider with a custom API
1. `engine/models/<provider>.py` + register in `__init__.py`
2. `engine/providers/<provider>.py` — implement `BaseEngine`:
   - `async def chat(self, messages, tools=None) -> Response`
   - `async def stream(self, messages, tools=None) -> AsyncIterator[Response]`
   - Accept `client: httpx.AsyncClient | None = None`
   - Call `resolve_tokens(usage)` before `compute_cost`
   - Strip `reasoning_content`/`reasoning_signature` from all but the last assistant turn
   - Handle system-only input: promote to `Role.USER`
   - On 4xx/5xx in streaming: read full body before raising
   - Buffer tool call JSON across SSE chunks; parse only on stop
3. `engine/factory.py` — wire into `get_engine()`
4. Tests: unit (`providers/test_<provider>.py`) + live entry
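The buffering requirement in the checklist (tool-call argument JSON arrives as fragments across SSE chunks and must only be parsed on the stop event) can be sketched like this; the chunk shape is illustrative, not any provider's actual wire format:

```python
import json

def accumulate_tool_call(chunks: list[dict]) -> dict:
    # Argument JSON is streamed in fragments; parsing a fragment raises
    # JSONDecodeError, so buffer raw text and parse once at the end.
    name, buf = None, []
    for chunk in chunks:
        if "name" in chunk:
            name = chunk["name"]
        if "arguments_delta" in chunk:
            buf.append(chunk["arguments_delta"])
    return {"name": name, "arguments": json.loads("".join(buf))}

chunks = [
    {"name": "get_time"},
    {"arguments_delta": '{"tim'},
    {"arguments_delta": 'ezone": "UTC"}'},
]
result = accumulate_tool_call(chunks)
print(result)  # {'name': 'get_time', 'arguments': {'timezone': 'UTC'}}
```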
How to add a tool to the agent
```python
from typing import Annotated

from sunwaee.core.tools import tool, ok, err

@tool("Search the web for current information.")
def web_search(
    query: Annotated[str, "The search query"],
    num_results: Annotated[int, "Number of results"] = 5,
) -> str:
    try:
        return ok(_do_search(query, num_results))
    except Exception as e:
        return err(str(e))
```
Register: add `web_search._tool` to `TOOLS` in `sunwaee/modules/gen/tools.py`.
Tests: `tests/gen/test_<tool_name>.py` — call directly, assert JSON output shape, test the error path. Never call real external APIs.
@tool decorator
Introspects signature to build JSON Schema parameters automatically.
Supports: `str`, `int`, `float`, `bool`, `list[T]`, `Literal[...]`, `Optional[T]`, `Annotated[T, "description"]`

- Parameters with defaults → not required
- Both sync and async supported
- Must return JSON string: `ok(data)` / `err(message)` / `json.dumps(...)`

```python
ok({"id": "123"})  # '{"ok": true, "data": {"id": "123"}}'
err("Not found")   # '{"ok": false, "error": "Not found"}'
```
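Consistent with the outputs shown, `ok()`/`err()` can be approximated as thin `json.dumps` wrappers; a sketch, not the library source:

```python
import json

def ok(data) -> str:
    # Success envelope: {"ok": true, "data": ...}
    return json.dumps({"ok": True, "data": data})

def err(message: str) -> str:
    # Failure envelope: {"ok": false, "error": "..."}
    return json.dumps({"ok": False, "error": message})

print(ok({"id": "123"}))  # {"ok": true, "data": {"id": "123"}}
print(err("Not found"))   # {"ok": false, "error": "Not found"}
```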
Provider-specific quirks
| # | Rule |
|---|---|
| 1 | resolve_tokens() before compute_cost() — xAI/Google exclude reasoning tokens from output_tokens; resolve_tokens back-calculates from total_tokens |
| 2 | Strip reasoning from all but last assistant turn — stale reasoning_signature breaks APIs |
| 3 | OpenAI uses max_completion_tokens, not max_tokens |
| 4 | OpenAI reasoning models: yield synthetic chunk immediately — stream is silent during thinking; Response(reasoning_content="Reasoning in progress…", synthetic=True) |
| 5 | Google: thoughtSignature on functionCall part → ToolCall.thought_signature; echo every subsequent turn |
| 6 | Google: no tool call IDs — use function name as correlation ID |
| 7 | Google streaming: ?alt=sse required on streamGenerateContent |
| 8 | System-only input — promote system message to Role.USER (Anthropic + Google) |
| 9 | Anthropic thinking budget: 1024 ≤ thinking_budget < max_tokens; default max(1024, max_tokens - 1024) |
| 10 | Connection pooling: one httpx.AsyncClient per (event_loop_id, base_url) in factory.py |
| 11 | Role.CONTEXT mapping: Anthropic → standalone {"role":"user","content":"<context>…</context>"} (consecutive user messages are accepted); OpenAI → {"role":"system","content":"…"}; Google → {"role":"user","parts":[{"text":"<context>…</context>"}]} |
| 12 | Anthropic cache tokens suppressed when thinking is enabled — cache_read_tokens/cache_write_tokens are 0 in the API response even when caching is active; caching still happens transparently |
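Quirk 1's back-calculation can be sketched as follows; the function and field names are illustrative (the real `resolve_tokens` lives in the engine):

```python
def resolve_reasoning_tokens(input_tokens: int, output_tokens: int, total_tokens: int) -> int:
    # xAI/Google report output_tokens *excluding* reasoning tokens, so the
    # gap between total and (input + output) is the reasoning spend.
    return max(0, total_tokens - input_tokens - output_tokens)

print(resolve_reasoning_tokens(100, 50, 400))  # 250
```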