xoin-py
Python LLM client for OpenAI, Claude, DeepSeek & more — async chat completions, Pydantic-validated structured outputs, text embeddings, and provider fallback for production services.
xoin-py is an open source LLM API client for Python 3.10+ that connects to multiple AI providers — OpenAI, Anthropic, Mistral, DeepSeek — through one consistent async API built on httpx.
It helps you ship AI features with:
✅ Chat completions (OpenAI-style where applicable; Anthropic Messages API for Claude)
✅ Structured output validated with Pydantic (BaseModel)
✅ Text embeddings on providers that expose OpenAI-compatible /embeddings (OpenAI, Mistral)
✅ Automatic provider fallback (provider_order, default_provider, fallback_providers)
✅ Retries with backoff on transient provider execution failures
Async-first, minimal dependencies (httpx, pydantic). Sister library to the JavaScript xoin-js client (@xoin/xoin-js).
Table of Contents
- Why xoin-py
- Installation
- Who It Is For
- Works Well In
- Quick Start
- Built-in Providers
- Parity with xoin-js
- Core Concepts
- API Overview
- Xoin / create_xoin configuration
- generate parameters
- generate_many
- StructuredOutput (structured output)
- Schema examples (Pydantic)
- Retry and fallback strategy
- embed parameters
- Provider constructors
- Parameter & types reference
- Examples by use case
- Framework snippets
- Custom providers
- Error handling
- Development
Why xoin-py
Production Python backends that call LLM APIs quickly outgrow one-off SDK calls and ad-hoc JSON parsing.
You usually want:
- one multi-provider surface for OpenAI, Anthropic, Mistral, DeepSeek
- structured outputs validated with Pydantic before business logic runs
- provider fallback when a vendor errors or rate-limits
- embeddings on the same abstraction where supported
- async integration with FastAPI, Starlette, workers, and scripts
xoin-py targets that workflow in a small codebase: configure providers once, await x.generate(...) / await x.embed(...).
Installation
```bash
pip install xoin-py
```
Import the package as xoin (distribution name on PyPI is xoin-py):
```python
from xoin import Xoin, StructuredOutput
from xoin.providers import OpenAIProvider
```
Optional: load secrets from the environment (os.environ, pydantic-settings, etc.) — same idea as dotenv in Node.
Who It Is For
xoin-py fits if you are building:
- FastAPI / Starlette routes that return structured model output
- asyncio services and workers
- extraction, classification, and summarization pipelines
- internal tools that need validated JSON from LLMs
- RAG or search flows that need embeddings (OpenAI / Mistral)
Works Well In
Server-side Python where API keys stay private:
- FastAPI, Starlette, Django ASGI
- asyncio scripts and CLIs
- background workers (Celery with async bridge, arq, etc.)
Do not embed provider API keys in browser-delivered code.
Quick Start
Complete asyncio example:
```python
import asyncio
import os

from pydantic import BaseModel

from xoin import StructuredOutput, Xoin
from xoin.providers import OpenAIProvider


class UserProfile(BaseModel):
    name: str
    age: int


async def main() -> None:
    async with Xoin(
        providers={
            "openai": OpenAIProvider(
                api_key=os.environ["OPENAI_API_KEY"],
                default_model=os.getenv("OPENAI_MODEL", "gpt-4o-mini"),
            ),
        },
        default_provider="openai",
    ) as xoin:
        result = await xoin.generate(
            provider="openai",
            prompt='Extract a JSON object from: "Ava is 31 years old."',
            structured=StructuredOutput(response_model=UserProfile, name="user_profile"),
        )
        print(result.data)


asyncio.run(main())
```
Sample output (live APIs) — captured by running the snippet above with keys from examples/.env (set -a && source examples/.env && set +a). Exact models, token counts, and response IDs vary per request; the raw field from GenResult.model_dump() is omitted here for readability.
OpenAI (gpt-4o-mini, structured native schema):

```json
{
  "provider": "openai",
  "model": "gpt-4o-mini-2024-07-18",
  "text": "{\"name\":\"Ava\",\"age\":31}",
  "data": {
    "name": "Ava",
    "age": 31
  },
  "usage": {
    "input_tokens": 72,
    "output_tokens": 10,
    "total_tokens": 82
  },
  "finish_reason": "stop"
}
```
DeepSeek (same UserProfile pattern, from `python examples/deepseek_structured_output.py`):

```json
{
  "provider": "deepseek",
  "model": "deepseek-v4-flash",
  "text": "{\"name\": \"Kabir\", \"age\": 33}",
  "data": {
    "name": "Kabir",
    "age": 33
  },
  "usage": {
    "input_tokens": 40,
    "output_tokens": 13,
    "total_tokens": 53
  },
  "finish_reason": "stop"
}
```
Why this works:
- one prompt
- native or prompt-based JSON depending on provider capability (`StructuredOutput.mode`)
- Pydantic validates into `result.data`
- failures surface as typed exceptions (`xoin.errors`)
Built-in Providers
xoin-py ships concrete provider classes (each in its own module under xoin.providers):
| Provider class | Typical use |
|---|---|
| `OpenAIProvider` | OpenAI Chat Completions + embeddings |
| `AnthropicProvider` | Claude Messages API + native tool-use structured output |
| `MistralProvider` | Mistral chat (`json_object` structured mode) + embeddings |
| `DeepSeekProvider` | DeepSeek chat (`json_object`); no embeddings in defaults |
```python
import os

from xoin import Xoin
from xoin.providers import AnthropicProvider, DeepSeekProvider, MistralProvider, OpenAIProvider

xoin = Xoin(
    default_provider="openai",
    fallback_providers=["anthropic", "deepseek"],
    providers={
        "openai": OpenAIProvider(
            api_key=os.environ["OPENAI_API_KEY"],
            default_model=os.getenv("OPENAI_MODEL", "gpt-4o-mini"),
            default_embedding_model=os.getenv("OPENAI_EMBEDDING_MODEL", "text-embedding-3-small"),
        ),
        "anthropic": AnthropicProvider(
            api_key=os.environ["ANTHROPIC_API_KEY"],
            default_model=os.getenv("ANTHROPIC_MODEL", "claude-sonnet-4-20250514"),
        ),
        "mistral": MistralProvider(
            api_key=os.environ["MISTRAL_API_KEY"],
            default_model=os.getenv("MISTRAL_MODEL", "mistral-small-latest"),
            default_embedding_model=os.getenv("MISTRAL_EMBEDDING_MODEL", "mistral-embed"),
        ),
        "deepseek": DeepSeekProvider(
            api_key=os.environ["DEEPSEEK_API_KEY"],
            default_model=os.getenv("DEEPSEEK_MODEL", "deepseek-chat"),
        ),
    },
)
```
OpenAI-compatible backends (Groq, Azure OpenAI-style gateways, etc.) use the same OpenAIProvider class with name="groq", a custom base_url, and capabilities=Capabilities(structured_outputs="json-object", embeddings=False) when embeddings are unavailable.
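A sketch of that gateway pattern, reusing `OpenAIProvider` for Groq. The base URL and model id shown are assumptions about the gateway, not values defined by xoin-py:

```python
import os

from xoin import Xoin
from xoin.providers import OpenAIProvider
from xoin.providers.base import Capabilities

xoin = Xoin(
    default_provider="groq",
    providers={
        "groq": OpenAIProvider(
            name="groq",  # key used in logs/errors and fallback chains
            api_key=os.environ["GROQ_API_KEY"],
            base_url="https://api.groq.com/openai/v1",  # assumed gateway URL
            default_model="llama-3.1-8b-instant",  # assumed model id
            # This gateway exposes JSON mode but no /embeddings endpoint:
            capabilities=Capabilities(
                structured_outputs="json-object",
                embeddings=False,
            ),
        ),
    },
)
```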
Parity with xoin-js
xoin-js is the reference JavaScript client (npm install @xoin/xoin-js). xoin-py follows the same ideas with Python idioms.
| Feature | xoin-js | xoin-py |
|---|---|---|
| Structured schemas | Zod | Pydantic `BaseModel` |
| HTTP | `fetch` | `httpx.AsyncClient` (shared on `Xoin`) |
| Templates (`template`, `templateId`, `templateFile`, `variables`) | ✅ | ✅ (`variables` replaces JS `input`; YAML needs PyYAML) |
| `generateMany` | ✅ | ✅ (`await xoin.generate_many(...)`) |
| Priority `providerTargets` | ✅ | ✅ (`provider_targets=[PriorityProviderTarget(...)]`) |
| `registerProvider` | ✅ | ✅ (`xoin.register_provider(...)`) |
| Manual `jsonSchema` alongside schema | ✅ (`structured.jsonSchema`) | ✅ `StructuredOutput(json_schema=...)` accepts `jsonSchema` in dict input |
| `signal` / `AbortSignal` | ✅ | ✅ cooperative `signal=`: `asyncio.Event` (`is_set()`) or any object with `.aborted` truthy → raises `asyncio.CancelledError` before HTTP |
| `metadata` passthrough | ✅ | ✅ merged shallow into outbound bodies (`provider_options` wins on key clashes) |
Fallback vs validation: after a successful HTTP response, if Pydantic validation fails, xoin-py wraps that failure as `ProviderExecutionError`, so the same `provider_order` / `provider_targets` fallback chain used for HTTP errors can try the next provider (matching xoin-js behavior). You can still catch the underlying cause via `exc.__cause__` when needed. Direct callers may also catch `StructuredOutputError` from `validate_response` in lower-level code paths.
Core Concepts
1. One client, many providers
Register all vendor adapters on Xoin(providers={...}).
2. Structured output first
Define a BaseModel and pass StructuredOutput(response_model=...). Parsed output is GenResult.data.
3. Fallback without glue
Pass provider_order=[...] on generate, or configure default_provider / fallback_providers on the client.
4. Embeddings where the API matches OpenAI
Use await xoin.embed(input=[...]) with OpenAI or Mistral defaults.
5. Async context manager
Prefer async with Xoin(...) as xoin: — closes the internal httpx client when Xoin created it.
API Overview
Main exports (from xoin import ...):
`Xoin`, `create_xoin`, `StructuredOutput`, `ChatMessage`, `GenResult`, `EmbedResult`, `Usage`, `RetryCfg`, `GenManyTarget`, `PriorityProviderTarget`, `TemplateDefinition`, `render_template`, `resolve_named_template`, `load_template_file`, `errors` (module with exception classes)
Main methods:
`await xoin.generate(...)`, `await xoin.generate_many(...)`, `await xoin.embed(...)`, `xoin.register_provider(name, provider)`
For a plain-language explanation of every argument and result field, see Parameter & types reference below.
Provider classes (from xoin.providers import ...):
`OpenAIProvider`, `AnthropicProvider`, `MistralProvider`, `DeepSeekProvider`
Protocol / internals (xoin.providers.base): Provider, ChatCompletionParameters, EmbeddingParameters, Capabilities — useful for custom adapters.
Xoin / create_xoin configuration
Xoin(...) (and create_xoin(**kwargs) — identical) accepts:
| Parameter | Type | What it does |
|---|---|---|
| `providers` | `dict[str, Provider]` | Registered providers keyed by name (e.g. `"openai"`). Required. |
| `default_provider` | `str \| None` | Used when a request omits `provider` and as part of fallback ordering. |
| `fallback_providers` | `list[str] \| None` | Append-only fallback chain after primary / `provider_order`. |
| `templates` | `dict[str, TemplateDefinition] \| None` | Named templates referenced via `template_id`. |
| `retry` | `int \| RetryCfg \| None` | Default retry policy for `generate` (`ProviderExecutionError` only). |
| `client` | `httpx.AsyncClient \| None` | Inject a shared client (tests, custom timeouts). If omitted, `Xoin` owns one. |
| `timeout_s` | `float` | Default timeout when `Xoin` creates its own `AsyncClient`. |
Example:
```python
from xoin import RetryCfg, Xoin
from xoin.providers import AnthropicProvider, OpenAIProvider

xoin = Xoin(
    default_provider="openai",
    fallback_providers=["anthropic"],
    retry=RetryCfg(retries=2, delay_ms=300, backoff_multiplier=2.0),
    providers={
        "openai": OpenAIProvider(api_key="..."),
        "anthropic": AnthropicProvider(api_key="..."),
    },
)
```
generate parameters
await xoin.generate(**kwargs) — async chat / structured generation.
| Parameter | Type | What it does |
|---|---|---|
| `provider` | `str \| None` | Primary provider key from `providers`. |
| `provider_order` | `list[str] \| None` | Extra ordering after `provider`, before `default_provider` / `fallback_providers`. |
| `provider_targets` | `list[PriorityProviderTarget \| dict] \| None` | Explicit priority plan (lower `priority` runs first). When set, replaces `provider` / `provider_order` routing. |
| `model` | `str \| None` | Overrides the provider's `default_model`. |
| `prompt` | `str \| None` | Final user message appended after history / system / structured instructions. |
| `template` | `str \| None` | Inline template text containing `{{variables}}`. |
| `template_id` | `str \| None` | Lookup into `Xoin(..., templates={...})`. |
| `template_file` | `str \| Path \| None` | Load YAML/JSON/plain template definitions from disk. |
| `variables` | `Mapping[str, Any] \| None` | Values merged with template defaults (same role as JS `input`). |
| `system` | `str \| None` | System instruction (inserted before conversation messages). |
| `messages` | `Sequence[ChatMessage \| dict] \| None` | Chat history (`role`, `content`). |
| `structured` | `StructuredOutput \| dict \| None` | Enables parsing + Pydantic validation into `GenResult.data`. |
| `temperature` | `float \| None` | Sampling temperature. |
| `max_tokens` | `int \| None` | Max output tokens (Anthropic defaults internally if unset). |
| `timeout_ms` | `int \| None` | Per-request timeout override (converted to seconds for httpx). |
| `metadata` | `Mapping[str, Any] \| None` | Extra fields merged into the provider JSON body before `provider_options`. |
| `provider_options` | `Mapping[str, Any] \| None` | Vendor-specific fields merged after `metadata` (same keys override). |
| `signal` | `Any \| None` | Cooperative cancel check: `asyncio.Event` when set, or truthy `.aborted`. |
| `retry` | `int \| RetryCfg \| None` | Overrides client-level retry for this call. |
Plain text:
```python
result = await xoin.generate(
    provider="openai",
    prompt="Write a short welcome message for a new SaaS customer.",
    temperature=0.7,
    max_tokens=120,
)
print(result.text)
```
Chat-style:
```python
from xoin import ChatMessage

result = await xoin.generate(
    provider="openai",
    system="You are a concise support assistant.",
    messages=[
        ChatMessage(role="user", content="My payment failed yesterday."),
        ChatMessage(role="assistant", content="I can help with that."),
        ChatMessage(role="user", content="What should I check first?"),
    ],
)
```
Fallback chain:
```python
result = await xoin.generate(
    provider="openai",
    provider_order=["anthropic", "mistral"],
    prompt="Extract the order summary from the customer message.",
    structured=StructuredOutput(response_model=OrderSummary),
)
```
generate_many
await xoin.generate_many(**kwargs) fans the same logical request out to multiple (provider, model?) targets in parallel via asyncio.gather. There is no shared fallback chain between targets—pair it with generate when you need resilience.
Shared parameters match generate, except provider, provider_order, provider_targets, and retry are replaced by:
| Parameter | Type | What it does |
|---|---|---|
| `targets` | `Sequence[GenManyTarget \| dict]` | Each entry names a provider key (and optional per-target `model`). |
All other shared knobs (prompt, templates, structured, metadata, signal, temperature, …) behave the same as generate. There is no retry wrapper around generate_many per target (matches xoin-js).
```python
from xoin.types import GenManyTarget

results = await xoin.generate_many(
    targets=[
        GenManyTarget(provider="openai"),
        GenManyTarget(provider="anthropic"),
    ],
    prompt="Summarize why structured outputs matter in two bullets.",
)
for item in results:
    print(item.provider, item.text[:120])
```
Runnable copies of these flows live under examples/ (see examples/README.md).
StructuredOutput (structured output)
Use StructuredOutput when you want validated JSON mapped to a Pydantic model (GenResult.data).
| Field | Type | What it does |
|---|---|---|
| `response_model` | `type[BaseModel]` | Required; model used for validation (and default JSON Schema when native). |
| `json_schema` | `dict[str, Any] \| None` | Optional provider-facing schema override (JS `jsonSchema`). When set, native/prompt paths send this dict instead of `model_json_schema()`. |
| `mode` | `'auto' \| 'native' \| 'prompted'` | Same semantics as xoin-js (native vs prompt-only instructions). |
| `name` | `str` | Logical name / Anthropic tool name (default `"structured_response"`). |
| `description` | `str \| None` | Extra hint for providers that support descriptions. |
Modes
- `auto`: use native structured features when the provider supports JSON Schema / JSON object modes; otherwise prepend strict JSON instructions.
- `native`: require native capability (prompt-only capabilities fall back to prompts).
- `prompted`: always use prompt instructions + local parsing.
Dict shorthand works (`StructuredOutput.model_validate`):

```python
await xoin.generate(
    prompt="…",
    structured={
        "response_model": UserProfile,
        "name": "user_profile",
        "mode": "auto",
        # Optional camelCase parity:
        # "jsonSchema": {"type": "object", "properties": {...}, "required": [...]},
    },
)
```
Extraction example:
```python
from pydantic import BaseModel

from xoin import StructuredOutput


class ShippingAddress(BaseModel):
    line1: str
    city: str
    postal_code: str
    country: str


result = await xoin.generate(
    provider="anthropic",
    prompt='Extract the shipping address from: "Ship this to 10 Park Street, Pune 411001, India."',
    structured=StructuredOutput(
        response_model=ShippingAddress,
        name="shipping_address",
        description="Normalized shipping address extracted from user input",
        mode="auto",
    ),
)
print(result.data)
```
Schema examples (Pydantic)
The examples below mirror common xoin-js Zod patterns using Pydantic v2.
1. Basic object
```python
from pydantic import BaseModel


class User(BaseModel):
    name: str
    age: int


result = await xoin.generate(
    provider="openai",
    prompt='Extract a JSON object from: "Ava is 31 years old."',
    structured=StructuredOutput(response_model=User, name="user_profile"),
)
```
2. List of objects
Use a RootModel (or a small wrapper model) when the model must return a top-level JSON array.
```python
from pydantic import BaseModel, RootModel


class OrderLine(BaseModel):
    product: str
    quantity: int
    price: float


class OrderLines(RootModel[list[OrderLine]]):
    pass


result = await xoin.generate(
    provider="openai",
    prompt=(
        "Extract all purchased items:\n"
        '"2 wireless mice at 25 each, 1 keyboard at 70, and 3 mouse pads at 10 each."'
    ),
    structured=StructuredOutput(response_model=OrderLines, name="order_items"),
)
# Parsed payload is result.data.root
```
3. Nested models
```python
from pydantic import BaseModel


class Customer(BaseModel):
    name: str
    email: str


class Address(BaseModel):
    line1: str
    city: str
    postal_code: str
    country: str


class Item(BaseModel):
    sku: str
    title: str
    quantity: int


class CustomerOrder(BaseModel):
    customer: Customer
    shipping_address: Address
    items: list[Item]


result = await xoin.generate(
    provider="anthropic",
    prompt=f"Extract order details from:\n{email_text}",
    structured=StructuredOutput(response_model=CustomerOrder, name="customer_order"),
)
```
4. Literal enums (strict categories)
```python
from typing import Literal

from pydantic import BaseModel


class Ticket(BaseModel):
    category: Literal["billing", "technical", "account", "other"]
    priority: Literal["low", "medium", "high"]
    summary: str


result = await xoin.generate(
    provider="anthropic",
    prompt="My card was charged twice and I still cannot access premium features.",
    structured=StructuredOutput(response_model=Ticket, name="ticket_classification"),
)
```
5. Optional fields
```python
from pydantic import BaseModel


class Lead(BaseModel):
    name: str
    company: str
    email: str | None = None
    phone: str | None = None
    budget: str | None = None


result = await xoin.generate(
    provider="openai",
    prompt=f"Extract lead details from:\n{lead_message}",
    structured=StructuredOutput(response_model=Lead, name="lead_profile"),
)
```
6. Union / discriminated unions
Plain union:
```python
from typing import Literal, Union

from pydantic import BaseModel


class Refund(BaseModel):
    action: Literal["refund"]
    order_id: str
    reason: str


class Replace(BaseModel):
    action: Literal["replace"]
    order_id: str
    item: str


EmailAction = Union[Refund, Replace]

result = await xoin.generate(
    provider="openai",
    prompt=f"Determine the action:\n{support_message}",
    structured=StructuredOutput(response_model=EmailAction, name="email_action"),
)
```
Discriminated union:
```python
from typing import Annotated, Literal, Union

from pydantic import BaseModel, Field


class EmailNotif(BaseModel):
    channel: Literal["email"]
    subject: str
    body: str


class SmsNotif(BaseModel):
    channel: Literal["sms"]
    message: str


class NotificationEnvelope(BaseModel):
    notification: Annotated[
        Union[EmailNotif, SmsNotif],
        Field(discriminator="channel"),
    ]


result = await xoin.generate(
    provider="openai",
    prompt=f"Build notification payload from:\n{event_text}",
    structured=StructuredOutput(response_model=NotificationEnvelope, name="notification_payload"),
)
```
7. Choosing schema styles (rules of thumb)
- `BaseModel` fields for most business responses
- `list[T]` when the model must return an array at the top level
- `Literal[...]` when downstream code branches on fixed values
- `| None` optional fields when keys may be absent
- unions / discriminators when multiple shapes are valid
JSON Schema sent to providers is derived from model_json_schema() unless you implement a custom provider that overrides behavior.
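To see what that derived schema looks like, you can call `model_json_schema()` directly; this is plain Pydantic v2, independent of xoin-py:

```python
from pydantic import BaseModel


class User(BaseModel):
    name: str
    age: int


# Pydantic v2 derives a JSON Schema from the field annotations.
schema = User.model_json_schema()
print(schema["type"])                        # "object"
print(schema["required"])                    # ["name", "age"]
print(schema["properties"]["age"]["type"])   # "integer"
```

This is the dict a provider receives in native JSON Schema mode unless you pass an explicit `json_schema=` override.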
Retry and fallback strategy
Retry the same provider
Retries apply when ProviderExecutionError is raised inside the generate attempt (HTTP errors, empty completions from the HTTP layer, etc.).
```python
from xoin import RetryCfg

result = await xoin.generate(
    provider="openai",
    retry=2,
    prompt="Extract the user profile from this message.",
    structured=StructuredOutput(response_model=UserProfile),
)
```
Object form:
```python
result = await xoin.generate(
    provider="openai",
    retry=RetryCfg(retries=2, delay_ms=500, backoff_multiplier=2.0),
    prompt="Extract the user profile from this message.",
    structured=StructuredOutput(response_model=UserProfile),
)
```
| RetryCfg field | Meaning |
|---|---|
| `retries` | Extra attempts before giving up on that attempt bundle |
| `delay_ms` | Base delay between retries |
| `backoff_multiplier` | Multiplies delay each retry |
Fallback across providers
Ordering is built from: provider (if set), then provider_order, then default_provider, then fallback_providers — deduplicated.
```python
result = await xoin.generate(
    provider="openai",
    provider_order=["anthropic", "mistral"],
    prompt="Summarize this incident for executives.",
)
```
If every provider in the chain raises ProviderExecutionError, the last failure may surface as AggregateProviderError when multiple providers were tried.
embed parameters
await xoin.embed(**kwargs) — vector embeddings (OpenAI / Mistral defaults).
| Parameter | Type | What it does |
|---|---|---|
| `input` | `str \| list[str]` | Text(s) to embed (keyword-only; shadows builtin name intentionally). |
| `provider` | `str \| None` | Provider key; defaults to `default_provider` or first `fallback_providers` entry. |
| `model` | `str \| None` | Overrides provider `default_embedding_model`. |
| `timeout_ms` | `int \| None` | Per-request timeout override. |
| `metadata` | `Mapping \| None` | Merged into the embeddings JSON body before `provider_options`. |
| `provider_options` | `Mapping \| None` | Extra JSON fields for the embeddings request (overrides `metadata` keys). |
| `signal` | `Any \| None` | Same cooperative cancellation semantics as `generate`. |
Example:
```python
result = await xoin.embed(
    provider="openai",
    model="text-embedding-3-small",
    input=[
        "How do I reset my password?",
        "How do I update my billing card?",
    ],
)
print(len(result.embeddings))
print(len(result.embeddings[0]))
```
DeepSeek and Anthropic defaults do not expose embeddings in xoin-py — configure OpenAI or Mistral for vectors.
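Once vectors come back, ranking by cosine similarity is plain Python. A sketch for RAG-style lookup, with toy three-dimensional vectors standing in for real embeddings:

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


# In practice these would come from result.embeddings.
query = [0.1, 0.3, 0.6]
docs = {
    "reset-password": [0.1, 0.29, 0.61],
    "billing-card": [0.9, 0.05, 0.02],
}
best = max(docs, key=lambda name: cosine(query, docs[name]))
print(best)  # → reset-password
```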
Parameter & types reference
Short tables above are for scanning. This section explains what each argument does, in plain language.
create_xoin(...)
Same keyword arguments as Xoin(...). Returns a new client instance.
Xoin(...) constructor
| Argument | What it is for |
|---|---|
| `providers` | Required. Dict of `{name: Provider}`. Each name is the string you pass as `provider=` or list entries in `provider_order` / `targets`. |
| `default_provider` | Used when a call omits `provider` and when building the fallback list. Must be one of the keys in `providers`. |
| `fallback_providers` | Ordered list of provider names tried after explicit `provider` / `provider_order` entries (deduplicated). |
| `retry` | Default retry policy for `generate` only (`int` = retry count with zero delay, or a `RetryCfg` object). Does not apply to `generate_many` or `embed`. |
| `templates` | Dict of `{id: TemplateDefinition}` for use with `generate(..., template_id="...")`. |
| `client` | Optional shared `httpx.AsyncClient`. If you omit it, `Xoin` creates one and owns it (closed by `aclose()` or `async with`). |
| `timeout_s` | Default socket/read timeout for the internally created client only (seconds). Per-call `timeout_ms` still overrides per request. |
Lifecycle: call await xoin.aclose() when you are done, or use async with Xoin(...) as xoin: so the owned client is closed automatically.
Xoin.generate(...)
All arguments are keyword-only.
Routing & model
| Argument | What it is for |
|---|---|
| `provider` | First provider name to try. |
| `provider_order` | Extra names appended after `provider`, before `default_provider` / `fallback_providers`. Duplicates and unknown names are skipped. |
| `provider_targets` | If non-empty, replaces the `provider` + `provider_order` logic. Each `PriorityProviderTarget` has a `priority` (lower numbers run first), `provider`, and optional `model`. Duplicate `(provider, model)` pairs are deduplicated. |
| `model` | Chat model id for this request. If omitted, each provider's `default_model` is used. When using `provider_targets`, a target's `model` overrides this for that step only. |
Prompt content (pick one style or combine carefully)
| Argument | What it is for |
|---|---|
| `prompt` | Plain user text appended as the last user message (after history and template-driven content). |
| `template` | Inline template string with `{{variable}}` placeholders. If set, it wins over `template_id` / `template_file` and the rendered string becomes the prompt body (via the template pipeline). |
| `template_id` | Looks up a `TemplateDefinition` in `Xoin(..., templates={...})`. |
| `template_file` | Path to `.yaml` / `.yml` / `.json` / plain text template file (JSON/YAML must include a `"template"` string field). Requires PyYAML for YAML. |
| `variables` | Dict merged on top of template defaults when rendering `{{...}}` placeholders. Missing keys raise `TemplateError`. Non-string values are JSON-encoded in the output. |
| `system` | System instruction inserted before conversational messages when building the provider payload. |
| `messages` | Prior turns: each `ChatMessage` (or dict with `role` / `content`). Parsed and combined with `system`, structured instructions, and `prompt`. |
Template precedence: if template is set, it is used and template_id / template_file are ignored. Otherwise template_id is resolved from the client registry; otherwise template_file is loaded. Only when none of those three are set does xoin use a bare prompt string (plus optional messages).
You must end up with at least one message after composition—otherwise ProviderConfigurationError.
Structured output & sampling
| Argument | What it is for |
|---|---|
| `structured` | Optional `StructuredOutput`. When set, the client asks the model for JSON, parses it, and validates into `GenResult.data`. See also modes. |
| `temperature` | Sampling temperature forwarded to the provider when not `None`. |
| `max_tokens` | Cap on completion tokens. Anthropic defaults this internally when unset. |
Timeouts, metadata, cancellation, retries
| Argument | What it is for |
|---|---|
| `timeout_ms` | Overrides the httpx timeout for this HTTP call (milliseconds). Converted to seconds internally. |
| `metadata` | Shallow dict merged into the JSON request body first. Use for cross-cutting fields your vendor accepts. |
| `provider_options` | Second dict merged into the body; wins on duplicate keys over `metadata`. Use for vendor-specific flags (`response_format` extras, `top_p`, etc., whatever the API allows next to `messages`). |
| `signal` | Cooperative cancel hook before network I/O: pass an `asyncio.Event` and call `event.set()` from another task, or any object with a truthy `aborted` attribute. Raises `asyncio.CancelledError`. |
| `retry` | Overrides the client's default retry for this `generate` call only. Retries run the whole fallback chain again on `ProviderExecutionError` (including structured validation failures wrapped as that error). |
Result: GenResult
Xoin.generate_many(...)
Same keywords as generate, except provider, provider_order, provider_targets, and retry are not supported.
| Argument | What it is for |
|---|---|
| `targets` | Required. Non-empty sequence of `GenManyTarget` (or dicts). Each item names a provider and optional `model`. Runs in parallel (`asyncio.gather`). |
There is no automatic fallback between targets: each target performs exactly one provider call. Combine with generate when you need retries or fallback.
Results appear in the same order as targets (after coercion).
Xoin.embed(...)
| Argument | What it is for |
|---|---|
| `input` | Keyword-only (`input=` avoids shadowing the builtin in signatures). One string or a list of strings to embed. |
| `provider` | Embedding provider name. Defaults to `default_provider`, else the first entry in `fallback_providers`. |
| `model` | Embedding model id; defaults to the provider's `default_embedding_model`. |
| `timeout_ms` | Per-request timeout override (milliseconds). |
| `metadata` | Merged into the embeddings JSON body before `provider_options`. |
| `provider_options` | Vendor-specific body fields; overrides `metadata` on key clashes. |
| `signal` | Same cancellation semantics as `generate`. |
Returns EmbedResult. Providers without capabilities.embeddings cannot be used.
Xoin.register_provider(name, provider)
| Argument | What it is for |
|---|---|
| `name` | String key used in `provider=` / ordering / targets. |
| `provider` | Instance implementing the `Provider` protocol. |
Overwrites an existing entry if name collides.
StructuredOutput fields
Used for validated JSON outputs (GenResult.data).
| Field | Type | What it is for |
|---|---|---|
| `response_model` | `type[BaseModel]` | Required. Pydantic model used to validate the model output. |
| `mode` | `"auto" \| "native" \| "prompted"` | Chooses provider-native JSON/schema modes vs instructions-only (see modes). |
| `name` | `str` | Logical schema/tool name (default `"structured_response"`). |
| `description` | optional `str` | Hint for providers that accept a schema description. |
| `json_schema` | optional `dict` | Provider-facing JSON Schema. Accepts `json_schema` or `jsonSchema` in dict input. If omitted, schema is derived from `response_model.model_json_schema()`. Validation always uses `response_model`, not this dict. |
RetryCfg
| Field | Meaning |
|---|---|
| `retries` | Maximum extra attempts after the first failure (>= 0). |
| `delay_ms` | Base pause before each retry (>= 0), in milliseconds. |
| `backoff_multiplier` | Factor >= 1.0. The sleep before retry attempt n (1-based) equals `(delay_ms / 1000) * (backoff_multiplier ** (n - 1))` seconds. |
Only ProviderExecutionError triggers retries (inside generate).
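Plugging numbers into the documented backoff formula (pure arithmetic, independent of the library):

```python
def retry_delays(retries: int, delay_ms: int, backoff_multiplier: float) -> list[float]:
    """Sleep (in seconds) before each retry attempt, per the documented formula."""
    return [
        (delay_ms / 1000) * (backoff_multiplier ** (n - 1))
        for n in range(1, retries + 1)
    ]


# RetryCfg(retries=3, delay_ms=500, backoff_multiplier=2.0):
print(retry_delays(3, 500, 2.0))  # → [0.5, 1.0, 2.0]
```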
GenManyTarget
| Field | Meaning |
|---|---|
| `provider` | Registered provider name (required). |
| `model` | Optional per-target chat model; overrides the request-level `model` for that parallel call. |
PriorityProviderTarget
| Field | Meaning |
|---|---|
| `priority` | Integer sort key; smaller values are tried earlier. |
| `provider` | Registered provider name. |
| `model` | Optional model override for that step in the fallback chain. |
TemplateDefinition
| Field | Meaning |
|---|---|
| `template` | String containing `{{placeholder}}` markers. |
| `defaults` | Default values for placeholders (merged under runtime `variables`). |
| `description` | Optional human-readable note (not sent to the LLM by xoin-py). |
ChatMessage
| Field | Meaning |
|---|---|
| `role` | One of `"system"`, `"user"`, `"assistant"`, `"tool"`. |
| `content` | Message text for that turn. |
GenResult
| Field | Meaning |
|---|---|
| `provider` | Provider name that produced the response. |
| `model` | Model id returned or requested. |
| `text` | Raw assistant text from the provider. |
| `data` | Parsed `BaseModel` when `structured` was set; otherwise `None`. |
| `usage` | Optional `Usage` token counts. |
| `finish_reason` | Provider-specific completion reason string when available. |
| `raw` | Decoded JSON dict (or similar) from the vendor for debugging. |
EmbedResult
| Field | Meaning |
|---|---|
| `provider` | Provider name. |
| `model` | Embedding model id. |
| `embeddings` | List of float vectors (one per input string). |
| `usage` | Optional `Usage`. |
| `raw` | Raw provider payload for debugging. |
Usage
| Field | Meaning |
|---|---|
| `input_tokens` | Prompt tokens when the vendor reports them. |
| `output_tokens` | Completion tokens (chat). |
| `total_tokens` | Sum when reported. |
Any field may be None if the API did not return it.
Capabilities (dataclass)
Used on OpenAIProvider(capabilities=...) or custom providers.
| Field | Values | Meaning |
|---|---|---|
| `structured_outputs` | `"json-schema" \| "json-object" \| "prompt-only"` | What the adapter can express natively: JSON Schema response format, plain JSON mode, or prompts only. |
| `embeddings` | `bool` | Whether `embed` is allowed on this adapter. |
Template helpers (xoin.templates)
| Function | What it does |
|---|---|
| `render_template(definition, variables=None)` | Substitutes `{{keys}}` using defaults ∪ variables. |
| `load_template_file(path)` | Loads a `TemplateDefinition` from disk (YAML needs PyYAML). |
| `resolve_named_template(...)` | Low-level: chooses inline vs id vs file (used internally by `Xoin`). Rarely needed in application code. |
Custom providers: ChatCompletionParameters / EmbeddingParameters
When implementing Provider:
- `ChatCompletionParameters`: `model`, `messages` (`list[ChatMessage]`), `temperature`, `max_tokens`, `response_format` (`PlainTextResponseFormat` / `JsonObjectResponseFormat` / `JsonSchemaResponseFormat`), `provider_options` (already merged `metadata` + options), `timeout` (`float | None`, seconds).
- `EmbeddingParameters`: `model`, `input` (`list[str]`), `provider_options`, `timeout`.
Provider constructors
OpenAIProvider
| Parameter | Description |
|---|---|
| `api_key` | Bearer token (required). |
| `name` | Provider key used in logs/errors and when registering (`"openai"` by default). Use a different value when you register the same class twice (e.g. `"groq"`). |
| `base_url` | API root; default `https://api.openai.com/v1`. Point at any OpenAI-compatible server. |
| `default_model` | Chat model id when `generate(..., model=None)`. |
| `default_embedding_model` | Embedding model when `embed(..., model=None)`. |
| `capabilities` | Override structured-output and embedding support (see Capabilities). Defaults to JSON Schema structured outputs + embeddings enabled. |
| `headers` | Extra HTTP headers merged into every request. |
AnthropicProvider
| Parameter | Description |
|---|---|
| `api_key` | Anthropic API key (sent as `x-api-key`; required). |
| `base_url` | Messages API root; default `https://api.anthropic.com/v1`. |
| `default_model` | Claude model id when `generate(..., model=None)`. |
| `headers` | Extra HTTP headers. |

Fixed on the class: `name = "anthropic"`; structured outputs via the JSON Schema path; `embeddings=False` (use another provider for vectors).
MistralProvider
| Parameter | Description |
|---|---|
| `api_key` | Mistral API key (required). |
| `base_url` | Default `https://api.mistral.ai/v1`. |
| `default_model` | Chat model when `model` is omitted. |
| `default_embedding_model` | Embedding model when `embed(..., model=None)`. |
| `headers` | Extra HTTP headers. |

Fixed on the class: `name = "mistral"`; structured outputs use json-object mode (not full JSON Schema passthrough).
DeepSeekProvider
| Parameter | Description |
|---|---|
| `api_key` | DeepSeek API key (required). |
| `base_url` | Default `https://api.deepseek.com`. |
| `default_model` | Chat model when `model` is omitted. |
| `headers` | Extra HTTP headers. |

Fixed on the class: `name = "deepseek"`; json-object structured mode; `embeddings=False`.
Examples by use case
Extract CRM-style fields
```python
from pydantic import BaseModel

from xoin import StructuredOutput  # assumes a configured `xoin` client (see Quick Start)

class Lead(BaseModel):
    company: str
    contact_name: str
    email: str
    budget: str

result = await xoin.generate(
    provider="openai",
    prompt=(
        'Extract company, contact, email, and budget from: '
        '"Hi, this is Sarah from Northwind. Reach me at sarah@northwind.com. '
        'Our budget is around $15k."'
    ),
    structured=StructuredOutput(response_model=Lead, name="lead_info"),
)
```
Classify support tickets
```python
from typing import Literal

from pydantic import BaseModel

class Ticket(BaseModel):
    category: Literal["billing", "technical", "account", "other"]
    priority: Literal["low", "medium", "high"]
    summary: str

result = await xoin.generate(
    provider="anthropic",
    system="You classify support tickets.",
    prompt="My card was charged twice and I still cannot access premium features.",
    structured=StructuredOutput(response_model=Ticket, name="ticket_classification"),
)
```
Summarize transcript
```python
result = await xoin.generate(
    provider="openai",
    prompt=f"Summarize this meeting transcript in 5 bullet points:\n\n{transcript}",
    temperature=0.2,
    max_tokens=250,
)
```
Embedding documents for search / RAG
```python
vectors = await xoin.embed(input=[doc.content for doc in documents])
```
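Once you have the vectors, a dependency-free cosine-similarity search is enough for small corpora. This is an illustrative sketch, not part of xoin-py; production RAG setups typically hand the vectors to a vector store instead:

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def top_k(query: list[float], doc_vectors: list[list[float]], k: int = 2) -> list[int]:
    """Indices of the k document vectors most similar to the query vector."""
    ranked = sorted(range(len(doc_vectors)),
                    key=lambda i: cosine(query, doc_vectors[i]), reverse=True)
    return ranked[:k]

doc_vectors = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
print(top_k([1.0, 0.1], doc_vectors))  # [0, 2]
```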
Framework snippets
asyncio CLI script
```python
import asyncio
import os
from typing import Literal

from pydantic import BaseModel

from xoin import StructuredOutput, Xoin
from xoin.providers import OpenAIProvider

class Sentiment(BaseModel):
    label: Literal["positive", "neutral", "negative"]

async def main() -> None:
    async with Xoin(
        providers={"openai": OpenAIProvider(api_key=os.environ["OPENAI_API_KEY"])},
        default_provider="openai",
    ) as xoin:
        result = await xoin.generate(
            prompt='Classify sentiment of: "The onboarding was surprisingly smooth."',
            structured=StructuredOutput(response_model=Sentiment),
        )
        print(result.data)

asyncio.run(main())
```
FastAPI route
```python
import os

from fastapi import FastAPI, HTTPException
from pydantic import BaseModel

from xoin import StructuredOutput, Xoin
from xoin.errors import StructuredOutputError
from xoin.providers import OpenAIProvider

class Summary(BaseModel):
    summary: str

class Body(BaseModel):
    text: str

app = FastAPI()
_xoin = Xoin(
    providers={"openai": OpenAIProvider(api_key=os.environ["OPENAI_API_KEY"])},
    default_provider="openai",
)

@app.post("/summarize")
async def summarize(body: Body) -> Summary:
    try:
        result = await _xoin.generate(
            provider="openai",
            prompt=f"Summarize:\n{body.text}",
            structured=StructuredOutput(response_model=Summary),
        )
    except StructuredOutputError as exc:
        raise HTTPException(status_code=502, detail=str(exc)) from exc
    assert result.data is not None
    return result.data
```
Lifecycle: in production, prefer a shared `httpx.AsyncClient` passed into `Xoin(client=...)`, or manage startup/shutdown hooks, instead of creating a new `Xoin` per request.
Custom providers
Implement the `Provider` protocol from `xoin.providers.base`: supply `name`, `capabilities`, `default_model`, `default_embedding_model`, and async `generate` / `embed` methods accepting an `httpx.AsyncClient` plus `ChatCompletionParameters` / `EmbeddingParameters`.
```python
import httpx

from xoin import Xoin
from xoin.providers.base import (
    Capabilities,
    ChatCompletionParameters,
    EmbeddingParameters,
    ProviderChatResponse,
    ProviderEmbeddingResponse,
)
from xoin.types import Usage  # available if your gateway reports token counts

class GatewayProvider:
    name = "gateway"
    capabilities = Capabilities(structured_outputs="prompt-only", embeddings=True)
    default_model = "gateway-chat"
    default_embedding_model = "gateway-embed"

    async def generate(
        self, client: httpx.AsyncClient, parameters: ChatCompletionParameters
    ) -> ProviderChatResponse:
        response = await client.post(
            "https://my-gateway.example.com/chat",
            json={
                "model": parameters.model,
                "messages": [m.model_dump() for m in parameters.messages],
                "temperature": parameters.temperature,
                "max_tokens": parameters.max_tokens,
                **parameters.provider_options,
            },
        )
        response.raise_for_status()
        payload = response.json()
        return ProviderChatResponse(
            model=payload.get("model", parameters.model),
            text=payload["text"],
            structured_data=None,
            usage=None,
            finish_reason=payload.get("finish_reason"),
            raw=payload,
        )

    async def embed(
        self, client: httpx.AsyncClient, parameters: EmbeddingParameters
    ) -> ProviderEmbeddingResponse:
        response = await client.post(
            "https://my-gateway.example.com/embed",
            json={"model": parameters.model, "input": parameters.input, **parameters.provider_options},
        )
        response.raise_for_status()
        payload = response.json()
        return ProviderEmbeddingResponse(
            model=payload.get("model", parameters.model),
            embeddings=payload["embeddings"],
            usage=None,
            raw=payload,
        )

xoin = Xoin(providers={"gateway": GatewayProvider()})
```
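Because `GatewayProvider` declares `structured_outputs="prompt-only"`, structured requests against it cannot use a native JSON Schema parameter; the schema has to travel inside the prompt text. A sketch of that pattern follows (the exact wording xoin-py injects is an internal detail, so treat this as illustrative):

```python
import json

def schema_prompt(user_prompt: str, schema: dict) -> str:
    # Append machine-readable schema instructions to the user prompt.
    return (
        f"{user_prompt}\n\n"
        "Respond ONLY with a single JSON object matching this JSON Schema:\n"
        f"{json.dumps(schema)}"
    )

schema = {"type": "object", "properties": {"label": {"type": "string"}}, "required": ["label"]}
print(schema_prompt("Classify the sentiment.", schema))
```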
Error handling
Exceptions live under `xoin.errors`:

| Class | When |
|---|---|
| `XoinError` | Base class (`code` attribute). |
| `TemplateError` | Missing variables / malformed template files. |
| `StructuredOutputError` | JSON parse / Pydantic validation failure. |
| `ProviderExecutionError` | Provider HTTP/runtime failures surfaced by xoin-py. |
| `ProviderConfigurationError` | Missing provider, model, embedding capability, etc. |
| `EmbeddingError` | Embedding not supported on the provider. |
| `AggregateProviderError` | All fallback providers failed. |
```python
from xoin.errors import (
    AggregateProviderError,
    ProviderConfigurationError,
    ProviderExecutionError,
    StructuredOutputError,
    TemplateError,
)

try:
    result = await xoin.generate(
        provider="openai",
        prompt="Extract a user.",
        structured=StructuredOutput(response_model=UserProfile),
    )
    print(result.data)
except TemplateError:
    print("Prompt template configuration failed.")
except StructuredOutputError:
    print("Model output did not match the schema.")
except ProviderConfigurationError:
    print("Misconfigured provider or missing default model.")
except ProviderExecutionError as exc:
    print(f"{exc.provider} failed:", exc)
except AggregateProviderError as exc:
    print("All providers failed:", exc.errors)
```
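`AggregateProviderError` arises from the fallback loop: each provider in order is tried, each failure is recorded, and only when every provider fails does the aggregate error surface. A self-contained sketch of that control flow; the class and function names here are illustrative stand-ins, not xoin-py internals:

```python
import asyncio

class AllProvidersFailed(Exception):
    """Illustrative stand-in for xoin's AggregateProviderError."""
    def __init__(self, errors: dict[str, Exception]):
        super().__init__(f"all providers failed: {sorted(errors)}")
        self.errors = errors

async def generate_with_fallback(provider_order, call):
    """Try providers in order; collect failures; raise only when all fail."""
    errors: dict[str, Exception] = {}
    for name in provider_order:
        try:
            return await call(name)
        except Exception as exc:  # a real client narrows this to provider errors
            errors[name] = exc
    raise AllProvidersFailed(errors)

async def demo() -> str:
    async def flaky(name: str) -> str:
        if name == "openai":
            raise RuntimeError("429 rate limited")
        return f"{name}: ok"
    return await generate_with_fallback(["openai", "anthropic"], flaky)

print(asyncio.run(demo()))  # anthropic: ok
```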
Examples
See `examples/README.md` for fully commented scripts (structured outputs, embeddings, `generate_many`, priority `provider_targets`, templates, retries, and runtime `register_provider`).
Development
```bash
python -m venv .venv
source .venv/bin/activate  # Windows: .venv\Scripts\activate
pip install -e ".[dev]"
pytest tests -v --tb=short
```
Tests use `httpx.MockTransport`; no real provider keys are required. See TEST_REPORT.md for the latest local run summary.
Related: JavaScript / TypeScript client — xoin-js (@xoin/xoin-js).