SUNWÆE

Multi-provider LLM engine library.

All LLMs, one response format, one dependency (httpx). Supports switching models mid-conversation (e.g. draft with GPT, refine with Claude).

Handles streaming, tool calls, file attachments, prompt caching, extended thinking, and cost tracking across Anthropic, OpenAI, Google, DeepSeek, xAI, and Moonshot (and any other OpenAI-compatible provider).
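
For example, a conversation can hop providers between turns. A minimal sketch using the public API (the model names are illustrative and must exist in the model registry):

import asyncio
from sunwaee.modules.gen.engine import get_engine
from sunwaee.modules.gen.engine.types import Message, Role

async def draft_then_refine():
    messages = [Message(role=Role.USER, content="Draft a haiku about httpx.")]
    draft = await get_engine("openai", "gpt-4.1").chat(messages)

    # Feed the draft back in, then let a different provider refine it.
    messages.append(Message(role=Role.ASSISTANT, content=draft.content))
    messages.append(Message(role=Role.USER, content="Tighten the imagery."))
    refined = await get_engine("anthropic", "claude-sonnet-4-6").chat(messages)
    print(refined.content)

asyncio.run(draft_then_refine())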


Install

pip install sunwaee
# with file/image attachment support (pdf, docx, xlsx, pptx extraction)
pip install "sunwaee[files]"
# or for development
pip install -e ".[dev,files]"

Quick start

import asyncio
from sunwaee.modules.gen.engine import get_engine
from sunwaee.modules.gen.engine.types import Message, Role

engine = get_engine("anthropic", "claude-sonnet-4-6")  # reads ANTHROPIC_API_KEY

async def main():
    messages = [Message(role=Role.USER, content="Hello")]

    # Non-streaming
    response = await engine.chat(messages)
    print(response.content)
    print(response.cost.total)  # in USD

    # Streaming
    async for chunk in engine.stream(messages):
        if chunk.content:
            print(chunk.content, end="", flush=True)

asyncio.run(main())

Providers & API keys

Provider    provider=     Env var
Anthropic   "anthropic"   ANTHROPIC_API_KEY
OpenAI      "openai"      OPENAI_API_KEY
Google      "google"      GOOGLE_API_KEY
DeepSeek    "deepseek"    DEEPSEEK_API_KEY
xAI         "xai"         XAI_API_KEY
Moonshot    "moonshot"    MOONSHOT_API_KEY

Directory structure

sunwaee/
├── core/
│   ├── logger.py                 # get_logger(name) — scoped under "sunwaee.*"
│   └── tools.py                  # @tool decorator, ok(), err()
└── modules/gen/
    ├── __init__.py               # public re-exports (get_engine, run, stream_run, …)
    ├── agent.py                  # ReAct loop — run() + stream_run()
    ├── tools.py                  # TOOLS list (extend to add tools to the agent)
    └── engine/
        ├── __init__.py           # get_engine, Message, Response, Tool, …
        ├── base.py               # BaseEngine ABC
        ├── factory.py            # get_engine() — provider routing + connection pooling
        ├── model.py              # Model dataclass + compute_cost()
        ├── types.py              # Message, Response, ToolCall, Usage, Cost, Performance, …
        ├── models/               # model registry per provider
        │   ├── __init__.py       # get_model(), list_models()
        │   ├── anthropic.py
        │   ├── openai.py
        │   ├── google.py
        │   ├── deepseek.py
        │   ├── xai.py
        │   └── moonshot.py
        └── providers/            # engine implementations
            ├── anthropic.py      # AnthropicEngine
            ├── openai.py         # OpenAIEngine (also used by DeepSeek, xAI, Moonshot)
            └── google.py         # GoogleEngine

tests/gen/
├── test_agent.py
├── test_stream_agent.py
├── test_tools.py
└── engine/
    ├── test_types.py
    ├── test_factory.py
    ├── test_model.py
    ├── providers/
    │   ├── test_anthropic.py
    │   ├── test_openai.py
    │   └── test_google.py
    └── live/
        ├── test_providers.py     # real API calls, all providers × all scenarios
        └── run/                  # JSON snapshots written after each live run

Core types (engine/types.py)

These types flow through every layer. Read them before touching any engine code.

class Role(Enum):       SYSTEM, USER, ASSISTANT, TOOL
class StopReason(Enum): END_TURN, TOOL_USE, MAX_TOKENS

@dataclass
class FileAttachment:
    data: bytes                  # raw file bytes
    filename: str                # used for mime-type detection and the <file> wrapper
    media_type: str = ""         # auto-detected from filename if omitted

    # Supported types: text/*, image/jpeg|png|gif|webp,
    #   application/json|pdf + OOXML (docx, xlsx, pptx)
    # is_text  → serialized as <file name="…">…</file> text block
    # is_image → serialized as base64 inline image (provider-specific format)
    # as_text()    → decoded / extracted string (pdf/docx/xlsx/pptx use dedicated parsers)
    # as_base64()  → base64-encoded string for image parts

@dataclass
class Message:
    role: Role
    content: str | None = None
    reasoning_content: str | None = None    # thinking for models that support it
    reasoning_signature: str | None = None  # opaque blob — must be echoed back verbatim
    tool_call_id: str | None = None         # set on Role.TOOL messages
    tool_calls: list[ToolCall] | None = None
    attachments: list[FileAttachment] | None = None  # Role.USER only; ignored on other roles

@dataclass
class ToolCall:
    id: str
    name: str
    arguments: dict
    thought_signature: str | None = None    # Google only — lives on functionCall parts
    error: str | None = None
    duration: float = 0.0
    results: list[dict] = field(default_factory=list)

@dataclass
class Tool:
    name: str
    description: str
    parameters: dict        # JSON Schema object
    fn: Callable | None = None

@dataclass
class Response:
    provider: str
    model: str
    streaming: bool = False
    synthetic: bool = False         # sentinel — not a real model chunk
    content: str | None = None
    reasoning_content: str | None = None
    reasoning_signature: str | None = None
    tool_calls: list[ToolCall] | None = None
    stop_reason: StopReason | None = None
    error: str | None = None
    usage: Usage | None = None
    cost: Cost | None = None
    performance: Performance | None = None

@dataclass
class Usage:
    input_tokens: int = 0
    output_tokens: int = 0
    total_tokens: int = 0
    cache_read_tokens: int = 0
    cache_write_tokens: int = 0

@dataclass
class Cost:
    input: float = 0.0
    output: float = 0.0
    cache_read: float = 0.0
    cache_write: float = 0.0
    total: float = 0.0

@dataclass
class Performance:
    latency: float = 0.0            # seconds to first chunk
    reasoning_duration: float = 0.0
    content_duration: float = 0.0
    total_duration: float = 0.0
    throughput: int = 0             # output tokens / second

Usage

Basic chat

from sunwaee.modules.gen.engine import get_engine
from sunwaee.modules.gen.engine.types import Message, Role

engine = get_engine("anthropic", "claude-sonnet-4-6")
response = await engine.chat([Message(role=Role.USER, content="Hi")])
print(response.content, response.cost.total)

Streaming

async for chunk in engine.stream(messages):
    if chunk.content:
        print(chunk.content, end="", flush=True)
    if chunk.stop_reason is not None:
        print(chunk.usage, chunk.cost)  # only on the final chunk

With tools

from sunwaee.core.tools import tool, ok, err
from sunwaee.modules.gen.engine.types import Tool

@tool("Return the current UTC time.")
def get_time() -> str:
    from datetime import datetime, timezone
    return ok({"time": datetime.now(timezone.utc).isoformat()})

tools = [get_time._tool]
response = await engine.chat(messages, tools=tools)
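
If the model calls the tool, stop_reason is TOOL_USE and the results must be sent back manually (the agent section below automates this). A minimal sketch of the follow-up turn:

from sunwaee.modules.gen.engine.types import StopReason

if response.stop_reason == StopReason.TOOL_USE:
    # Echo the assistant turn (with its tool calls and reasoning fields),
    # then answer each call with a Role.TOOL message keyed by tool_call_id.
    messages.append(Message(
        role=Role.ASSISTANT,
        content=response.content,
        tool_calls=response.tool_calls,
        reasoning_content=response.reasoning_content,
        reasoning_signature=response.reasoning_signature,
    ))
    for call in response.tool_calls:
        messages.append(Message(role=Role.TOOL, tool_call_id=call.id, content=get_time()))
    response = await engine.chat(messages, tools=tools)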

File and image attachments

from sunwaee.modules.gen.engine.types import FileAttachment, Message, Role

# Document file: text is extracted (pypdf for PDFs) and serialized as a
# <file name="…">…</file> block before the user text
with open("report.pdf", "rb") as f:
    pdf_att = FileAttachment(data=f.read(), filename="report.pdf")

response = await engine.chat([
    Message(role=Role.USER, content="Summarise this document.", attachments=[pdf_att])
])

# Image — serialized as base64 inline (provider-specific format)
# Raises ValueError if the model's supports_vision is False
with open("photo.png", "rb") as f:
    img_att = FileAttachment(data=f.read(), filename="photo.png")

response = await engine.chat([
    Message(role=Role.USER, content="What is in this image?", attachments=[img_att])
])

Supported attachment types:

Category     Mime types                                     Extensions
Text         text/*                                         .txt, .csv, .md, .py, …
JSON         application/json                               .json
Image        image/jpeg, image/png, image/gif, image/webp   .jpg, .png, .gif, .webp
PDF          application/pdf                                .pdf
Word         application/vnd…wordprocessingml.document      .docx
Excel        application/vnd…spreadsheetml.sheet            .xlsx
PowerPoint   application/vnd…presentationml.presentation    .pptx

Text and document files are serialized as a <file name="…">…</file> text block (extracted via pypdf, python-docx, openpyxl, or python-pptx). Images are sent as base64 inline — no provider upload API is used.

ReAct agent loop (automatic tool execution)

from sunwaee.modules.gen.agent import stream_run

new_messages = []
async for chunk in stream_run(messages, tools, engine, new_messages=new_messages):
    if chunk.content:
        print(chunk.content, end="", flush=True)
# new_messages contains all assistant + tool turns appended during the run

stream_run runs up to 10 iterations by default. It calls engine.stream(), detects TOOL_USE, runs all tool calls concurrently with asyncio.gather, appends results, and loops. Sync tool functions are dispatched via run_in_executor.
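
Conceptually the loop looks like this (an illustrative sketch, not the actual implementation; consume and execute are hypothetical helpers, and new_messages bookkeeping is omitted):

for _ in range(10):                                   # max iterations
    response = await consume(engine.stream(messages, tools=tools))
    messages.append(Message(role=Role.ASSISTANT, content=response.content,
                            tool_calls=response.tool_calls))
    if response.stop_reason != StopReason.TOOL_USE:
        break                                         # no tool calls: done
    # Run every requested tool concurrently, append results, loop again.
    results = await asyncio.gather(*(execute(tc) for tc in response.tool_calls))
    for tc, result in zip(response.tool_calls, results):
        messages.append(Message(role=Role.TOOL, tool_call_id=tc.id, content=result))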

Listing available models

from sunwaee.modules.gen.engine.models import list_models, get_model

all_models = list_models()              # list[Model]
model = get_model("claude-sonnet-4-6")  # Model | None
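
Model capability flags make filtering straightforward:

vision_models = [m for m in list_models() if m.supports_vision]
anthropic_models = [m for m in list_models() if m.provider == "anthropic"]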

Testing

# Unit tests — mocked HTTP, no API keys needed
pytest tests/gen/ -m "not live"

# Live tests — real API calls, keys required
pytest tests/gen/ -m live

# Coverage
pytest tests/gen/ -m "not live" --cov=sunwaee --cov-report=term-missing

# Single file / filter by name
pytest tests/gen/engine/providers/test_anthropic.py
pytest tests/gen/ -k "tool_call"

Unit test conventions

  • Mock httpx.AsyncClient — never make real HTTP calls in unit tests.
  • Standard pattern (see test_anthropic.py / test_openai.py):
import pytest

@pytest.fixture
def mock_client():
    class FakeResponse:
        status_code = 200
        def raise_for_status(self): pass
        def json(self): return { ... }  # provider response shape

    class FakeClient:
        async def post(self, *a, **kw): return FakeResponse()

    return FakeClient()

async def test_chat_basic(mock_client):
    engine = NewProviderEngine("model-name", "sk-test", client=mock_client)
    response = await engine.chat([Message(role=Role.USER, content="Hi")])
    assert response.content == "Hello"
    assert response.cost.total > 0
  • For streaming, use an async generator as the mock transport (see the sketch after this list).
  • Always assert that response.cost, response.usage, and response.performance are populated on the final chunk.
  • Live tests (-m live) are excluded from coverage and CI — they require real API keys.
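
A streaming mock can follow the same shape. A sketch assuming the engines consume SSE lines through httpx's stream()/aiter_lines() (the payloads below are placeholders, not real provider events):

class FakeStreamResponse:
    status_code = 200
    def raise_for_status(self): pass
    async def aiter_lines(self):
        # Replace these with the real provider's SSE event shapes.
        yield 'data: {"delta": "Hel"}'
        yield 'data: {"delta": "lo"}'
        yield "data: [DONE]"

class FakeStreamContext:
    async def __aenter__(self): return FakeStreamResponse()
    async def __aexit__(self, *exc): return False

class FakeStreamClient:
    def stream(self, *a, **kw): return FakeStreamContext()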

Live test scenarios

All providers are tested against 6 scenarios × chat + stream (12 calls total per provider):

Scenario           What it tests
ONLY_SYSTEM        System-only input edge case; lenient assertions
ONLY_USER          Single user message; asserts content + usage + cost populated
SYSTEM_AND_USER    System prompt is respected in the response
TOOL_CALL          Model must issue at least one tool call
TOOL_CALL_RESULT   Full multi-turn with real tool IDs/signatures captured live first
FILE_ATTACHMENT    Text file attached to a user message; asserts content populated

TOOL_CALL_RESULT runs TOOL_CALL first to capture real tool IDs and reasoning signatures, then replays with tool results. Required because Google has no tool call IDs and Anthropic/Google require reasoning signatures echoed verbatim.

Image attachments are tested separately via test_chat_image_attachment / test_stream_image_attachment, parametrized over vision-capable engines only (all except deepseek-chat).


How to add a model to an existing provider

File: sunwaee/modules/gen/engine/models/<provider>.py

Add a Model(...) entry to the MODELS list. Pricing fields are in USD per million tokens (_per_mtok).

Model(
    name="provider-model-name",         # exact API model identifier
    display_name="Human Readable Name",
    provider="anthropic",
    context_window=200_000,
    max_output_tokens=64_000,
    input_price_per_mtok=3.0,
    output_price_per_mtok=15.0,
    cache_read_price_per_mtok=0.3,      # omit if provider doesn't support caching
    cache_write_price_per_mtok=3.75,
    input_price_per_mtok_200k=6.0,      # omit if no >200k tier
    output_price_per_mtok_200k=22.5,
    supports_vision=True,
    supports_tools=True,
    supports_thinking=True,
    supports_reasoning_tokens=True,
    release_date="2025-01-01",
    deprecated_at=None,
    sunset_at=None,                     # psql/sync_models.py uses this
)

Pricing tiers (engine/model.py): base always required; _128k when input_tokens > 128_000 (xAI only); _200k when > 200_000; _272k when > 272_000 (OpenAI only).
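
Worked example with the illustrative prices above (assuming the tier price applies to the whole input): a 250_000-token input crosses the 200k threshold and is billed at input_price_per_mtok_200k, i.e. 0.25 Mtok × $6.00 = $1.50, while a 100_000-token input stays on the base tier, i.e. 0.1 Mtok × $3.00 = $0.30.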

The registry in engine/models/__init__.py picks up the new model automatically.

Tests: Add assertions in engine/test_model.py if the model has non-standard pricing tiers.


How to add an OpenAI-compatible provider

Use this path when the provider speaks the OpenAI Chat Completions API.

Step 1 — Models: Create engine/models/<provider>.py with a MODELS list.

Step 2 — Register models: In engine/models/__init__.py, import and add to _ALL.

Step 3 — Register base URL: In engine/factory.py, add to _OPENAI_COMPATIBLE:

_OPENAI_COMPATIBLE: dict[str, str] = {
    ...
    "newprovider": "https://api.newprovider.com/v1",
}

The env var is auto-derived as NEWPROVIDER_API_KEY. get_engine("newprovider", "model-name") now works.

Step 4 — Live tests: Add ("newprovider", "cheapest-model") to ENGINES in tests/gen/engine/live/test_providers.py.


How to add a provider with a custom API

Use this path when the provider does not speak the OpenAI schema.

Step 1 — Models: Same as OpenAI-compatible step 1.

Step 2 — Register models: Same as above.

Step 3 — Create the engine: engine/providers/<provider>.py

class NewProviderEngine(BaseEngine):
    BASE_URL = "https://api.newprovider.com"

    def __init__(self, model, api_key, max_tokens=8192, client=None):
        self.model = model
        self.api_key = api_key
        self.max_tokens = max_tokens
        self._client = client or httpx.AsyncClient()

    async def chat(self, messages, tools=None) -> Response: ...
    async def stream(self, messages, tools=None) -> AsyncIterator[Response]: ...

Key rules for every provider implementation:

  1. Accept client: httpx.AsyncClient | None = None — the factory injects a pooled client.
  2. Call resolve_tokens(usage) before compute_cost.
  3. Strip reasoning_content/reasoning_signature from all assistant messages except the last.
  4. Handle system-only input: promote system message to Role.USER if no other messages.
  5. On 4xx/5xx in streaming, read the full body before raising.
  6. Buffer tool call JSON across SSE chunks; parse only on the final stop event (see the sketch after this list).
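
Rule 6 in practice, as a sketch (callback names and field layout are illustrative):

import json

buffers: dict[int, dict] = {}

def on_tool_delta(index: int, call_id: str | None, name: str | None, fragment: str):
    # Argument fragments are NOT valid JSON on their own; just concatenate.
    buf = buffers.setdefault(index, {"id": call_id, "name": name, "arguments": ""})
    buf["arguments"] += fragment

def on_stop() -> list[ToolCall]:
    # Only now is the buffered string complete enough to parse.
    return [ToolCall(id=b["id"], name=b["name"], arguments=json.loads(b["arguments"]))
            for b in buffers.values()]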

Step 4 — Wire into the factory: Import and add a case in get_engine() in engine/factory.py.

Step 5 — Tests:

  • tests/gen/engine/providers/test_newprovider.py — unit tests with mocked HTTP. Cover: payload building, response parsing, streaming events, tool call accumulation, error handling.
  • tests/gen/engine/live/test_providers.py — add ("newprovider", "cheapest-model") to ENGINES.

How to add a tool to the agent

Step 1 — Implement the tool:

from typing import Annotated, Literal
from sunwaee.core.tools import tool, ok, err

@tool("Search the web for current information.")
def web_search(
    query: Annotated[str, "The search query"],
    num_results: Annotated[int, "Number of results to return"] = 5,
) -> str:
    try:
        results = _do_search(query, num_results)
        return ok(results)
    except Exception as e:
        return err(str(e))

Step 2 — Register: In sunwaee/modules/gen/tools.py, add web_search._tool to TOOLS.
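
The registration itself is a one-line list edit (existing entries elided):

# sunwaee/modules/gen/tools.py
TOOLS = [
    # …existing tools…
    web_search._tool,
]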

Step 3 — Tests: Add tests/gen/test_<tool_name>.py. Call the function directly, assert JSON output shape, test the error path. Never call real external APIs in unit tests — mock them.


@tool decorator reference

The @tool decorator introspects the function signature to build a JSON Schema parameters object automatically.

  • Supports: str, int, float, bool, list[T], Literal[...], Optional[T], Annotated[T, "description"]
  • Parameters with defaults are not marked required; Optional parameters are also not required
  • Both sync and async functions supported — agent._execute awaits async, dispatches sync to thread pool
  • Must return a JSON string — use ok() / err(), or json.dumps()
ok({"id": "123"})   # '{"ok": true, "data": {"id": "123"}}'
err("Not found")    # '{"ok": false, "error": "Not found"}'

Provider-specific quirks

Must read before touching any provider code.

1. Token normalisation — resolve_tokens()

xAI and Google (with thinking) exclude reasoning tokens from output_tokens but include them in total_tokens. Always call resolve_tokens(usage) before compute_cost. It treats total_tokens as truth and back-calculates output_tokens.
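
A sketch of the idea (simplified; cache token handling omitted):

def resolve_tokens(usage: Usage) -> Usage:
    # Illustrative only: total_tokens is ground truth, so recover output_tokens
    # to include reasoning tokens the provider left out of the output count.
    if usage.total_tokens:
        usage.output_tokens = usage.total_tokens - usage.input_tokens
    return usage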

2. Reasoning echoed only on the last assistant turn

Strip reasoning_content and reasoning_signature from every assistant message except the very last one before building the request. Stale signatures break the API. This also enables mid-session provider switches.
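
A sketch of the stripping pass:

# Illustrative: keep reasoning fields only on the final assistant message.
assistant_idxs = [i for i, m in enumerate(messages) if m.role == Role.ASSISTANT]
for i in assistant_idxs[:-1]:
    messages[i].reasoning_content = None
    messages[i].reasoning_signature = None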

3. OpenAI uses max_completion_tokens, not max_tokens

Translate at payload-build time. All other providers (including OpenAI-compatible ones) use max_tokens.
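
At payload-build time this is a one-key difference (sketch; variable names are illustrative):

key = "max_completion_tokens" if provider == "openai" else "max_tokens"
payload[key] = max_tokens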

4. OpenAI reasoning models are silent during thinking

When reasoning_effort is set, the stream hangs until the answer starts. OpenAIEngine.stream() immediately yields a synthetic Response(reasoning_content="Reasoning in progress…", synthetic=True) so callers have something to display.

5. Google — thoughtSignature lives on functionCall parts

When thinking is enabled and a tool is called, the thoughtSignature is on the functionCall part, not the top-level thought block. Capture it into ToolCall.thought_signature and echo it back on every subsequent assistant turn.

6. Google — no tool call IDs

Use the function name as the ID for correlation throughout the Google provider.

7. Google streaming — ?alt=sse required

The endpoint is streamGenerateContent?alt=sse. Without it, the response is a JSON array.

8. System-only input

Anthropic and Google reject requests with no non-system messages. Promote the system message to Role.USER.
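
A sketch of the promotion:

# Illustrative: if every message is a system message, resend the text as a user turn.
if all(m.role == Role.SYSTEM for m in messages):
    messages = [Message(role=Role.USER, content=m.content) for m in messages]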

9. Anthropic thinking budget

Requires 1024 ≤ thinking_budget < max_tokens. Default: max(1024, max_tokens - 1024).

10. Connection pooling

factory.py maintains one httpx.AsyncClient per (event_loop_id, base_url). Always default to client or httpx.AsyncClient() so the engine is usable standalone in tests.
