SUNWÆE gen — multi-provider LLM engine library.


All LLMs, one response format, one dependency (httpx). Supports switching models mid-conversation (e.g. draft with GPT, refine with Claude).

Handles streaming, tool calls, file attachments, prompt caching, reasoning on/off, and cost tracking across Anthropic, OpenAI, Google, DeepSeek, xAI, and Moonshot.


Install

pip install sunwaee
pip install "sunwaee[files]"   # pdf, docx, xlsx, pptx extraction
pip install -e ".[dev,files]"  # development

Quick start

import asyncio
from sunwaee.modules.gen.engine import get_engine
from sunwaee.modules.gen.engine.types import Message, Role

# enable_reasoning=False (default) — reasoning disabled / non-reasoning variant used
engine = get_engine("anthropic", "claude-sonnet-4-6")

# enable_reasoning=True — activates thinking for all providers
engine_reasoning = get_engine("anthropic", "claude-sonnet-4-6", enable_reasoning=True)

async def main():
    messages = [Message(role=Role.USER, content="Hello")]

    response = await engine.chat(messages)
    print(response.content, response.cost.total)

    async for chunk in engine.stream(messages):
        if chunk.content:
            print(chunk.content, end="", flush=True)

asyncio.run(main())

Providers

Provider   provider=    Env var
Anthropic  "anthropic"  ANTHROPIC_API_KEY
OpenAI     "openai"     OPENAI_API_KEY
Google     "google"     GOOGLE_API_KEY
DeepSeek   "deepseek"   DEEPSEEK_API_KEY
xAI        "xai"        XAI_API_KEY
Moonshot   "moonshot"   MOONSHOT_API_KEY

Directory structure

sunwaee/
├── core/
│   ├── logger.py                 # get_logger(name) — scoped under "sunwaee.*"
│   └── tools.py                  # @tool decorator, ok(), err()
└── modules/gen/
    ├── __init__.py               # public re-exports (get_engine, run, stream_run, …)
    ├── agent.py                  # ReAct loop — run() + stream_run()
    ├── tools.py                  # TOOLS list
    └── engine/
        ├── __init__.py           # get_engine, Message, Response, Tool, …
        ├── base.py               # BaseEngine ABC
        ├── factory.py            # get_engine() — provider routing + connection pooling
        ├── model.py              # Model dataclass + compute_cost()
        ├── types.py              # Message, Response, ToolCall, Usage, Cost, Performance, …
        ├── models/               # model registry per provider
        │   ├── __init__.py       # get_model(), list_models()
        │   ├── anthropic.py / openai.py / google.py / deepseek.py / xai.py / moonshot.py
        └── providers/
            ├── anthropic.py      # AnthropicEngine
            ├── openai.py         # OpenAIEngine (also used by DeepSeek, xAI, Moonshot)
            └── google.py         # GoogleEngine

tests/gen/
├── test_agent.py / test_stream_agent.py / test_tools.py
└── engine/
    ├── test_types.py / test_factory.py / test_model.py
    ├── providers/
    │   └── test_anthropic.py / test_openai.py / test_google.py
    └── live/
        ├── _shared.py            # shared config, data, helpers for all live tests
        ├── test_scenarios.py     # all providers × all scenarios × chat + stream
        ├── test_tool_call_result.py  # TOOL_CALL → execute → reply, all providers
        ├── test_attachments.py   # image attachments, vision-capable providers
        ├── test_chain.py         # three-provider conversation chain
        ├── test_caching.py       # prompt-cache hit on turn 2
        ├── test_reasoning.py     # reasoning ON / OFF per model category
        └── run/                  # JSON snapshots (gitignored)

Core types (engine/types.py)

class Role(Enum):       SYSTEM, USER, ASSISTANT, TOOL, CONTEXT
class StopReason(Enum): END_TURN, TOOL_USE, MAX_TOKENS

@dataclass class Message:
    role: Role
    content: str | None
    reasoning_content: str | None       # thinking for models that support it
    reasoning_signature: str | None     # opaque blob — echo back verbatim
    tool_call_id: str | None            # set on Role.TOOL messages
    tool_calls: list[ToolCall] | None
    attachments: list[FileAttachment] | None   # Role.USER only

@dataclass class Response:
    provider: str; model: str; streaming: bool; synthetic: bool
    content: str | None; reasoning_content: str | None; reasoning_signature: str | None
    tool_calls: list[ToolCall] | None; stop_reason: StopReason | None; error: Error | None
    usage: Usage | None; cost: Cost | None; performance: Performance | None

@dataclass class ToolCall:
    id: str; name: str; arguments: dict
    thought_signature: str | None    # Google only — echo back every subsequent turn
    error: str | None; duration: float; results: list[dict]

@dataclass class Usage:
    input_tokens: int; output_tokens: int; total_tokens: int
    cache_read_tokens: int; cache_write_tokens: int

@dataclass class Cost:
    input: float; output: float; cache_read: float; cache_write: float; total: float

@dataclass class Performance:
    latency: float            # seconds to first chunk
    reasoning_duration: float; content_duration: float; total_duration: float
    throughput: int           # output tokens / second

@dataclass class FileAttachment:
    data: bytes; filename: str; media_type: str = ""
    # text/* → <file name="…">…</file> block
    # image/jpeg|png|gif|webp → base64 inline
    # application/pdf|json + OOXML (docx/xlsx/pptx) → extracted text
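
The routing in the comments above can be sketched as a standalone function (illustrative, not library code; in this sketch unknown media types fall through to the extraction path, which may not match the library's exact behavior):

```python
import mimetypes

# Illustrative sketch of attachment routing keyed on media type,
# mirroring the comment block in the FileAttachment dataclass.
def attachment_route(filename: str, media_type: str = "") -> str:
    mt = media_type or mimetypes.guess_type(filename)[0] or ""
    if mt.startswith("text/"):
        return "file-block"          # wrapped in a <file name="..."> block
    if mt in {"image/jpeg", "image/png", "image/gif", "image/webp"}:
        return "base64-inline"
    # application/pdf, application/json and OOXML go through text extraction
    return "extracted-text"
```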

get_engine() — reasoning control

engine = get_engine(
    provider,
    model,
    api_key=None,          # falls back to <PROVIDER>_API_KEY env var
    max_tokens=8192,
    enable_reasoning=False, # True activates thinking/reasoning for all providers
)

enable_reasoning resolves all provider complexity automatically:

Model reasoning_mode          enable_reasoning=True                  enable_reasoning=False
"dynamic"                     Sends provider thinking config         No thinking config sent
"always" + non_reasoning_id   Uses the model as-is                   Swaps to non_reasoning_id variant
"always", no swap available   Uses the model as-is (always reasons)  Uses the model as-is (cannot disable)
None + reasoning_id           Swaps to reasoning_id variant          Uses the model as-is

Default reasoning configs when enable_reasoning=True:

Provider / series                            Mechanism                                            Value
Anthropic (Opus 4.7/4.6, Sonnet 4.6)         output_config.effort + thinking: {type: "adaptive"}  effort="high"
Anthropic (Opus 4.5, Haiku 4.5, Sonnet 4.5)  thinking: {type: "enabled", budget_tokens: N}        max(1024, max_tokens - 1024)
Google Gemini 3 (has reasoning_efforts)      thinkingConfig.thinkingLevel                         "high"
Google Gemini 2.5 (no reasoning_efforts)     thinkingConfig.thinkingBudget                        -1 (dynamic)
OpenAI-compat (has reasoning_efforts)        reasoning_effort                                     "high"

When enable_reasoning=False for dynamic models: Anthropic effort models use effort="low"; Gemini 3 uses the lowest listed effort (reasoning_efforts[0]); Gemini 2.5 uses thinkingBudget=0.


Model dataclass (engine/model.py)

@dataclass class Model:
    name: str; provider: str; display_name: str; description: str | None

    # specs
    context_window: int; max_output_tokens: int | None

    # features
    supports_vision: bool
    supports_tools: bool
    supports_reasoning: bool           # True if model has any reasoning capability

    # reasoning config
    reasoning_mode: str | None         # "always" | "dynamic" | None
    reasoning_efforts: list[str] | None  # valid effort levels (e.g. ["low","medium","high"])
    reasoning_tokens_type: str | None  # "raw" | "summary" | None
    reasoning_disabled_payload: dict | None  # merged into request when reasoning is disabled
    reasoning_id: str | None           # for non-reasoning variants: name of reasoning counterpart
    non_reasoning_id: str | None       # for reasoning models: name of non-reasoning counterpart

    cache_min_tokens: int | None

    # pricing (per million tokens)
    input_price_per_mtok: float | None; output_price_per_mtok: float | None
    cache_read_price_per_mtok: float | None; cache_write_price_per_mtok: float | None
    input_price_per_mtok_128k: float | None   # xAI only
    output_price_per_mtok_128k: float | None
    input_price_per_mtok_200k: float | None   # most providers
    output_price_per_mtok_200k: float | None; ...
    input_price_per_mtok_272k: float | None   # OpenAI only
    output_price_per_mtok_272k: float | None; ...

    release_date: str | None; deprecated_at: str | None; sunset_at: str | None

reasoning_mode:

  • "always" — model always reasons; disable by swapping to non_reasoning_id (xAI models, DeepSeek Reasoner)
  • "dynamic" — reasoning can be toggled on/off via provider-specific config (Anthropic, Google, OpenAI, Moonshot)
  • None — no reasoning capability

reasoning_tokens_type:

  • "raw" — full reasoning content returned in response.reasoning_content (Anthropic, DeepSeek, grok-4.20, kimi-k2.5)
  • "summary" — summarised thought returned (Google, grok-4.20 on /v1/responses endpoint)
  • None — reasoning tokens tracked internally only; content not exposed (OpenAI, most xAI)

reasoning_efforts:

  • List of valid named effort levels for the model's reasoning API parameter.
  • Anthropic: ["low","medium","high","max"] or ["low","medium","high","xhigh","max"] for Opus 4.7.
  • Google Gemini 3: ["minimal","low","medium","high"] or ["low","medium","high"] depending on model.
  • OpenAI: ["none","low","medium","high","xhigh"] or similar per-model.
  • None for models that use integer budgets (Gemini 2.5) or have no effort control.

Usage

With tools

from sunwaee.core.tools import tool, ok, err
from sunwaee.modules.gen.engine.types import Tool

@tool("Return the current UTC time.")
def get_time() -> str:
    from datetime import datetime, timezone
    return ok({"time": datetime.now(timezone.utc).isoformat()})

response = await engine.chat(messages, tools=[get_time._tool])

File and image attachments

from sunwaee.modules.gen.engine.types import FileAttachment, Message, Role

with open("report.pdf", "rb") as f:
    att = FileAttachment(data=f.read(), filename="report.pdf")

response = await engine.chat([Message(role=Role.USER, content="Summarise.", attachments=[att])])

Supported: text/*, application/json, image/jpeg|png|gif|webp, application/pdf, .docx, .xlsx, .pptx

ReAct agent loop

from sunwaee.modules.gen.agent import stream_run

new_messages = []
async for chunk in stream_run(messages, tools, engine, new_messages=new_messages):
    if chunk.content:
        print(chunk.content, end="", flush=True)
# new_messages has all assistant + tool turns appended during the run

Up to 10 iterations by default. Concurrent tool calls via asyncio.gather. Sync tools dispatched via run_in_executor.
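
The dispatch strategy described above can be sketched as follows (assumed names, not the agent's internals): async tools are awaited directly, sync tools go through run_in_executor, and asyncio.gather joins the concurrent calls in order.

```python
import asyncio
import inspect

async def dispatch(tool_fns, calls):
    """tool_fns: name -> callable; calls: list of (name, kwargs) pairs."""
    loop = asyncio.get_running_loop()

    async def run_one(name, kwargs):
        fn = tool_fns[name]
        if inspect.iscoroutinefunction(fn):
            return await fn(**kwargs)
        # Sync tools run in the default thread pool so they don't block the loop.
        return await loop.run_in_executor(None, lambda: fn(**kwargs))

    # gather preserves the order of the submitted calls.
    return await asyncio.gather(*(run_one(n, kw) for n, kw in calls))
```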

Listing models

from sunwaee.modules.gen.engine.models import list_models, get_model

all_models = list_models()              # list[Model]
model = get_model("claude-sonnet-4-6")  # Model | None

Testing

pytest tests/gen/ -m "not live"                                        # unit (no keys needed)
pytest tests/gen/ -m live                                              # live (real API calls)
pytest tests/gen/ -m "not live" --cov=sunwaee --cov-report=term-missing

# run a single live test file
pytest -m live tests/gen/engine/live/test_caching.py
pytest -m live tests/gen/engine/live/test_reasoning.py

Unit test conventions:

  • Mock httpx.AsyncClient — never make real HTTP calls
  • Assert response.cost, response.usage, response.performance populated on final chunk
  • For streaming, use an async generator as mock transport

Live test files and what they cover:

File                      What it tests
test_scenarios.py         6 scenarios × 6 providers × chat + stream (72 tests)
test_tool_call_result.py  Full TOOL_CALL → execute → reply loop, all providers
test_attachments.py       PNG image attachment, vision-capable providers
test_chain.py             Three-provider conversation chain with shared history
test_caching.py           Prompt-cache hit on turn 2, static system prompt
test_reasoning.py         enable_reasoning ON / OFF per model category

Live scenarios:

Scenario          What it tests
ONLY_SYSTEM       System-only input edge case; lenient assertions
ONLY_USER         Single user message
SYSTEM_AND_USER   System prompt respected in response
TOOL_CALL         Model must issue at least one tool call
TOOL_CALL_RESULT  Full multi-turn with real tool IDs/signatures
FILE_ATTACHMENT   Text file attached; asserts content populated
CONTEXT_ROLE      Role.CONTEXT message handled without errors

All live tests default to enable_reasoning=False. test_reasoning.py is the only file that explicitly passes enable_reasoning=True.


How to add a model

File: sunwaee/modules/gen/engine/models/<provider>.py

Model(
    name="provider-model-name",
    display_name="Human Readable Name",
    provider="anthropic",
    context_window=200_000,
    max_output_tokens=64_000,
    input_price_per_mtok=3.0,
    output_price_per_mtok=15.0,
    cache_read_price_per_mtok=0.3,
    cache_write_price_per_mtok=3.75,
    input_price_per_mtok_200k=6.0,       # omit if no >200k tier
    output_price_per_mtok_200k=22.5,
    supports_vision=True,
    supports_tools=True,
    supports_reasoning=True,
    reasoning_mode="dynamic",             # "always" | "dynamic" | None
    reasoning_efforts=["low", "medium", "high", "max"],  # omit if not applicable
    reasoning_tokens_type="raw",          # "raw" | "summary" | None
    non_reasoning_id="model-non-reasoning",  # omit if no non-reasoning variant
    cache_min_tokens=1_024,              # omit (None) if caching is undocumented
    release_date="2025-01-01",
)

For a non-reasoning variant that pairs with a reasoning model:

Model(
    name="model-non-reasoning",
    ...same pricing...,
    supports_reasoning=False,
    reasoning_id="model",                 # points to the reasoning counterpart
)

Pricing tiers (engine/model.py): base required; _128k when input_tokens > 128_000 (xAI only); _200k when > 200_000; _272k when > 272_000 (OpenAI only). Thresholds are strict > — exactly at the boundary uses the lower tier.

cache_min_tokens — minimum tokens required at a cache breakpoint for prompt caching to activate. None = no caching. 0 = no minimum (caches everything). Known values:

Provider   Minimum  Models
Anthropic  4,096    Opus 4.7, Opus 4.6, Opus 4.5, Haiku 4.5
Anthropic  2,048    Sonnet 4.6
Anthropic  1,024    Sonnet 4.5
OpenAI     1,024    All models (automatic prefix caching)
Google     1,024    All models (explicit context caching)
xAI        0        All models (automatic, no minimum)
DeepSeek   64       All models (automatic prefix caching)
Moonshot   0        All models (automatic, no minimum)

How to add an OpenAI-compatible provider

  1. engine/models/<provider>.py — MODELS list
  2. engine/models/__init__.py — import + add to _ALL
  3. engine/factory.py — add to _OPENAI_COMPATIBLE: dict[str, str] (env var auto-derived as PROVIDER_API_KEY)
  4. tests/gen/engine/live/_shared.py — add ("provider", "cheapest-model") to ENGINES

How to add a provider with a custom API

  1. engine/models/<provider>.py + register in __init__.py
  2. engine/providers/<provider>.py — implement BaseEngine:
    • async def chat(self, messages, tools=None) -> Response
    • async def stream(self, messages, tools=None) -> AsyncIterator[Response]
    • Accept client: httpx.AsyncClient | None = None
    • Call resolve_tokens(usage) before compute_cost
    • Strip reasoning_content/reasoning_signature from all but the last assistant turn
    • Handle system-only input: promote to Role.USER
    • On 4xx/5xx in streaming: read full body before raising
    • Buffer tool call JSON across SSE chunks; parse only on stop
  3. engine/factory.py — wire into get_engine(), handle enable_reasoning for the new provider
  4. Tests: unit (providers/test_<provider>.py) + live entry in _shared.py

How to add a tool to the agent

from typing import Annotated
from sunwaee.core.tools import tool, ok, err

@tool("Search the web for current information.")
def web_search(
    query: Annotated[str, "The search query"],
    num_results: Annotated[int, "Number of results"] = 5,
) -> str:
    try:
        return ok(_do_search(query, num_results))
    except Exception as e:
        return err(str(e))

Register: add web_search._tool to TOOLS in sunwaee/modules/gen/tools.py.

Tests: tests/gen/test_<tool_name>.py — call directly, assert JSON output shape, test error path. Never call real external APIs.


@tool decorator

Introspects signature to build JSON Schema parameters automatically.

Supports: str, int, float, bool, list[T], Literal[...], Optional[T], Annotated[T, "description"]

  • Parameters with defaults → not required
  • Both sync and async supported
  • Must return JSON string: ok(data) / err(message) / json.dumps(...)
ok({"id": "123"})   # '{"ok": true, "data": {"id": "123"}}'
err("Not found")    # '{"ok": false, "error": "Not found"}'
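
These shapes correspond to helpers that could be implemented as follows (a minimal sketch matching the outputs shown above; the real versions live in sunwaee.core.tools):

```python
import json

# ok() wraps successful tool output; err() wraps a failure message.
# Both return JSON strings, as the @tool contract requires.
def ok(data) -> str:
    return json.dumps({"ok": True, "data": data})

def err(message: str) -> str:
    return json.dumps({"ok": False, "error": message})
```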

Provider-specific quirks

# Rule
1 resolve_tokens() before compute_cost() — xAI/Google exclude reasoning tokens from output_tokens; resolve_tokens back-calculates from total_tokens. Always called unconditionally — it's a no-op when counts already match.
2 Strip reasoning from all but last assistant turn — stale reasoning_signature breaks APIs and blocks mid-session provider switches.
3 OpenAI uses max_completion_tokens, not max_tokens.
4 OpenAI reasoning models: yield synthetic chunk immediately — stream is silent during thinking; Response(reasoning_content="Reasoning in progress…", synthetic=True). Never treat synthetic=True as real content.
5 Google: thoughtSignature on functionCall part → ToolCall.thought_signature; echo every subsequent turn.
6 Google: no tool call IDs — use function name as correlation ID.
7 Google streaming: ?alt=sse required on streamGenerateContent.
8 System-only input — promote system message to Role.USER (Anthropic + Google).
9 Anthropic reasoning: two paths. Newer models (Opus 4.7/4.6, Sonnet 4.6) use output_config: {effort: X} + thinking: {type: "adaptive"}. Older models (Opus 4.5, Haiku 4.5, Sonnet 4.5) use thinking: {type: "enabled", budget_tokens: N} with 1024 ≤ budget < max_tokens. The factory selects the path based on whether the model has reasoning_efforts.
10 Connection pooling: one httpx.AsyncClient per (event_loop_id, base_url) in factory.py.
11 Role.CONTEXT mapping: all providers wrap content in <context> tags automatically — Anthropic → {"role":"user","content":"<context>…</context>"}; OpenAI → {"role":"system","content":"<context>…</context>"}; Google → {"role":"user","parts":[{"text":"<context>…</context>"}]}.
12 Google Gemini 3 uses thinkingLevel (string: "minimal"/"low"/"medium"/"high"); Gemini 2.5 uses thinkingBudget (int: -1 = dynamic, 0 = off, N = fixed). The engine selects based on whether the model has reasoning_efforts. Gemini 3.1 Pro and 2.5 Pro cannot disable thinking (reasoning_mode="always").
13 kimi-k2.5 (Moonshot) reasons by default — disabling thinking requires an explicit payload {"thinking": {"type": "disabled"}}. Set via Model.reasoning_disabled_payload; the OpenAI engine merges it when reasoning_effort is None.
14 xAI always-reasoning models (grok-4.20, grok-4-1-fast, grok-4-fast) route to a non-reasoning variant on enable_reasoning=False via non_reasoning_id. Models without a non_reasoning_id (grok-4, grok-3-mini, grok-code-fast-1) cannot have reasoning disabled.
15 grok-4.20 returns reasoning_content on chat/completions — reasoning_tokens_type="summary" refers to the /v1/responses endpoint only; on chat/completions the field carries full raw reasoning text.
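
Rule 1's back-calculation can be sketched as a standalone function (illustrative; the library's resolve_tokens() operates on a Usage object, not bare ints):

```python
# When a provider excludes reasoning tokens from output_tokens,
# recover them from total_tokens so compute_cost() prices every token.
# A no-op when the reported counts already add up.
def resolve_output_tokens(input_tokens: int, output_tokens: int,
                          total_tokens: int) -> int:
    reported = input_tokens + output_tokens
    if reported < total_tokens:
        # The missing tokens (e.g. hidden reasoning) belong to the output side.
        return total_tokens - input_tokens
    return output_tokens
```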
