Drop-in Tessera integration for Pydantic AI. One line of config routes your Pydantic AI Agent's LLM calls through Tessera's auto-route + auto-cache + auto-compress + auto-batch proxy. Free 60M tokens/mo. Production: 20% of measured savings.

These details have not been verified by PyPI

Project links

Project description

`tessera-pydantic-ai`

Drop-in cost optimization for Pydantic AI. Two function calls (one for the underlying SDK client kwargs, one for the Pydantic AI Provider wrapper) and every agent.run_sync() / agent.run() call lands on the Tessera optimization proxy. Auto-route to cheaper-equivalent models, exact + provider-prompt-cache hits, prompt compression with per-stack quality canary, batch arbitrage on async-tolerant calls. Free Sandbox tier: 60M tokens/month, no card. Production: 20% of measured savings, $0 if we save you nothing.

Companion to tessera-sdk (vanilla provider SDKs), tessera-langchain (LangChain integration), tessera-vercel-ai (Vercel AI SDK integration), tessera-llamaindex (LlamaIndex integration), tessera-mastra (Mastra Agent framework integration), tessera-crewai (CrewAI multi-agent integration), and tessera-autogen (AutoGen 0.4+ multi-agent integration). Same proxy, same mechanic stack, Pydantic AI-shaped API.

What it looks like

from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIChatModel
from tessera_pydantic_ai import tessera_openai_provider

provider = tessera_openai_provider(
    openai_api_key="sk-...",        # your OpenAI key
    tessera_api_key="tk_...",      # free from tesseraai.io/dev
)

agent = Agent(OpenAIChatModel("gpt-4o", provider=provider))

result = agent.run_sync("Summarize this customer support ticket in 2 sentences.")

One factory call, one Model wrap, one Agent. Your existing Pydantic AI code (tools, structured outputs, run_sync/run, streaming) works unchanged. Anthropic mirror:

from pydantic_ai import Agent
from pydantic_ai.models.anthropic import AnthropicModel
from tessera_pydantic_ai import tessera_anthropic_provider

provider = tessera_anthropic_provider(
    anthropic_api_key="sk-ant-...",
    tessera_api_key="tk_...",
)
agent = Agent(AnthropicModel("claude-sonnet-4-6", provider=provider))

Prefer explicit control over the underlying AsyncOpenAI / AsyncAnthropic client? Use the pass-through config functions:

from openai import AsyncOpenAI
from pydantic_ai.providers.openai import OpenAIProvider
from pydantic_ai.models.openai import OpenAIChatModel
from pydantic_ai import Agent
from tessera_pydantic_ai import tessera_openai_config

client = AsyncOpenAI(
    api_key="sk-...",
    organization="org-mine",
    **tessera_openai_config(api_key="tk_..."),
)
provider = OpenAIProvider(openai_client=client)
agent = Agent(OpenAIChatModel("gpt-4o", provider=provider))

Install

pip install tessera-pydantic-ai pydantic-ai openai anthropic

Pydantic AI + the underlying SDK clients (openai, anthropic) are NOT declared as dependencies of this package. Install whichever providers you actually use. Tessera's factories import them lazily at call time so importing this package never blows up over missing optional deps.

Get a free Tessera API key (60M tokens/mo, no card) at tesseraai.io/dev. Sign-up takes ~30 seconds and returns an instant tk_… key plus magic-link dashboard access.

Provider support

Provider	Pydantic AI Provider class	Tessera config function	Convenience factory
OpenAI	`OpenAIProvider`	`tessera_openai_config`	`tessera_openai_provider`
Anthropic	`AnthropicProvider`	`tessera_anthropic_config`	`tessera_anthropic_provider`

v0.1 ships OpenAI + Anthropic. Mistral / Groq / Cohere are queued for v0.2; the Pydantic AI Provider classes for those have a similar custom-client pattern but the exact signature has not been end-to-end verified. Per our diagnostic-vocab-in-writing discipline (every public-surface claim verified before publish), we'd rather ship 2 honest-and-tested providers than 5 claimed-but-unverified ones. See tessera-langchain, tessera-llamaindex, or tessera-vercel-ai for verified Mistral / Groq / Cohere integrations on those frameworks today.

The generic dispatcher tessera_config("openai" | "anthropic", api_key="tk_...") returns the right kwargs dict regardless of provider. Unknown-provider calls raise ValueError with a v0.2 pointer.

Worked example

Real customer-support agent on gpt-4o, 5B tokens/month, OpenAI list prices:

Stage	Cost / mo	Saved
Baseline (OpenAI direct via Pydantic AI)	$24,000	n/a
+ Tessera (route, cache, prompt-cache headers, compress, M9 ceiling, batch)	$9,400	$14,600
Tessera fee (20% × savings)	$2,920	n/a
You net pay	$12,320	$11,680 / mo saved

Quality canary across the full mechanic stack: mean-score 0.96 (floor 0.95). 0.95 SLA held all 30 days. Full breakdown: worked example with mechanic-level numbers + canary results.

What Tessera does on every Agent call

Same mechanic stack as the main tessera-sdk. Each mechanic is opt-in per workload, observable per request, and bypasses when its quality canary drops below the per-stack 0.95 floor.

Mechanic	What it does	Typical savings
Auto-route _(m1)	Route to a cheaper-equivalent model gated by a daily promptfoo canary on your eval set	15–35% on routed calls
Auto-cache _(m2)	sha256 cache on the canonical request body, 7-day TTL, Cloudflare edge KV	5–40% depending on prompt repetition
Auto-compress _(m3)	Per-role heuristic compression (system + user toggles independent). Preserves code fences and JSON shapes.	5–15% on prompt tokens
Prompt cache _(m6)	Inject provider-native cache headers: OpenAI cached-input (50% off), Anthropic `cache_control: ephemeral` (90% off cache reads)	50–90% on cached prefixes
Context prune _(m7)	Conservative trim on long conversations (system + last 8 turns; TF-IDF rerank on RAG attachments)	5–25% on multi-turn workloads
Output-length ceiling _(m9)	Daily compute fits p90 of completion length per workload, injects `maxTokens = p90 × 1.3`	5–15% on completion cost
Batch arbitrage _(m10)	Route async-tolerant Agent calls to provider Batch APIs (OpenAI Batch + Anthropic Message Batches both 50% off)	50% on batch-eligible traffic
Cross-provider failover _(m11)	When primary upstream returns 5xx / connection error / timeout, retry on OpenRouter (opt-in, default OFF)	Reliability primitive, n/a cost
Per-provider circuit breaker	Rolling 5xx-rate state machine per upstream. When a provider degrades, auto-route skips its intra-provider alternative mappings until the half-open probe succeeds.	n/a (keeps the savings stack honest)

Pricing

Free Sandbox: 60M tokens/month, 30 requests/minute, observability-only mechanics, no card. Forever.
Production: over 60M tokens/month or higher rate limit. 20% of measured savings only. Zero savings, zero fee. Prepaid Stripe balance, $100 minimum top-up. No subscription, no commit, no minimum monthly.

Existing customers of tessera-sdk, tessera-langchain, tessera-llamaindex, tessera-vercel-ai, or tessera-mastra keep their rate_locked_pct (if any) on this package too. Same tk_… key, same billing record.

FAQ

Q: How is this different from the other tessera-* packages?

Same proxy. Same mechanics. Same billing. The six packages target different code surfaces:

tessera-sdk: patches OpenAI / Anthropic / etc. client constructors directly via tessera.activate(key). Use when calling provider SDKs without a framework.
tessera-langchain: wires into LangChain ChatModel constructors. Use when you're on LangChain.
tessera-llamaindex: wires into LlamaIndex LLM adapter constructors. Use when you're on LlamaIndex.
tessera-vercel-ai: wires into the Vercel AI SDK provider factories. Use when you're on ai core + @ai-sdk/*.
tessera-mastra: Vercel AI SDK shape but Mastra-positioned. Use when you're on Mastra Agents.
tessera-pydantic-ai (this package): wires into Pydantic AI Provider classes. Use when you're on Pydantic AI.

Pick whichever fits your codebase. Side-by-side install is supported: all six resolve to the same proxy and same billing record.

Q: Why only OpenAI and Anthropic in v0.1?

Honesty over feature breadth. The Pydantic AI Provider class for each LLM accepts a custom underlying-SDK client (e.g. OpenAIProvider(openai_client=...)); we have end-to-end verified that pattern against the OpenAI and Anthropic SDK client constructors. The Mistral / Groq / Cohere Provider classes exist in Pydantic AI but their custom-client shape has not been tested in our CI. Per our public-surface-claim discipline (every claim verified before ship), the unverified providers stay queued for v0.2. The two shipped providers cover ~85% of customer LLM traffic per our outreach research.

Q: Does this break my tools / structured outputs / streaming?

No. The Pydantic AI Provider object that wraps the underlying SDK client is unchanged in shape (agent.run_sync(), agent.run(), tool calls, structured outputs, and streaming all work unchanged). Auto-route gates on tool-calling capability so an agent using tools never gets routed to a non-tool-capable model.

Q: What happens if Tessera's proxy is down?

Your Agent gets HTTP errors instead of LLM responses. On the proxy side, a per-provider circuit breaker tracks rolling 5xx rates and skips degraded providers in auto-route decisions. Cross-provider failover (m11) is opt-in and re-routes to OpenRouter when the primary upstream is down. See the workload toggle in the dashboard.

Q: What happens to my OpenAI / Anthropic rate limits?

They pass through. Tessera does not aggregate quotas across customers. Your provider rate limits apply normally; the proxy enforces only the Tessera tier limits (30 rpm Free Sandbox, 60 rpm Production by default; higher on request).

Q: Are you storing my prompts and completions?

No. We log only token counts, cost deltas, mechanics_stack, and provider response status. Prompts and completions are never persisted. Full data handling on tesseraai.io/security.

Q: Why are there two API surfaces (`tessera_openai_config` vs `tessera_openai_provider`)?

The config function returns the kwargs dict you spread into AsyncOpenAI(...): explicit, easy to combine with other settings (organization, custom http_client, retry config, etc.). The convenience factory imports AsyncOpenAI + OpenAIProvider for you and pre-merges. Use whichever you find more readable. Both ship in the same package.

About Tessera

Tessera is the substrate layer for LLM cost optimization, also called the Optimize Layer in our product surface. A thin proxy that sits in your application's request-path, applies a conservative cascade of optimization mechanics, and measures every saved dollar against an audit-immutable baseline. We bill 20% of verified savings, prepaid. Zero savings = zero fee. No per-token gateway fee, no subscription, no minimum monthly commitment; the category we operate in is "success-fee LLM optimizer," distinct from per-token AI gateways and observability dashboards.

Where observability tools tell you what you spent and AI gateways re-shape the request without measuring the cost delta, Tessera is the layer that does both, and only takes a cut when the measured savings are positive. The verified-savings ledger at ledger.tesseraai.io shows every original-vs-actual cost pair, snapshot-pinned to a pricing_catalog version captured at request time. Mid-contract price changes don't retroactively alter past savings. This is the FinOps-friendly model for AI inference: every line of the bill traces to a code-enforced rule.

Apache-2.0. Operated by Fintechagency OÜ (Tallinn, Estonia, registry code 16638667). Issues: github.com/tessera-llm/tessera-pydantic-ai/issues.

Developer entry: tesseraai.io/dev
Mechanic reference: tesseraai.io/how-it-works
Dashboard: ledger.tesseraai.io
Engineering blog: tesseraai.io/blog

Project details

These details have not been verified by PyPI

Project links

Release history Release notifications | RSS feed

This version

0.1.1

May 24, 2026

0.1.0

May 19, 2026

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

tessera_pydantic_ai-0.1.1.tar.gz (19.3 kB view details)

Uploaded May 24, 2026 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

tessera_pydantic_ai-0.1.1-py3-none-any.whl (14.3 kB view details)

Uploaded May 24, 2026 Python 3

File details

Details for the file tessera_pydantic_ai-0.1.1.tar.gz.

File metadata

Download URL: tessera_pydantic_ai-0.1.1.tar.gz
Upload date: May 24, 2026
Size: 19.3 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.10

File hashes

Hashes for tessera_pydantic_ai-0.1.1.tar.gz
Algorithm	Hash digest
SHA256	`a2a842616ecbd61e1df73cafc8962acd2f2264995467400d4398bfd50353b818`
MD5	`083e83b9b7959a3758e64500e60a8692`
BLAKE2b-256	`c88c963481c47a01e95b16951b37088b3a62d839d44e5e2fe3d0f30ef8b58721`

See more details on using hashes here.

File details

Details for the file tessera_pydantic_ai-0.1.1-py3-none-any.whl.

File metadata

Download URL: tessera_pydantic_ai-0.1.1-py3-none-any.whl
Upload date: May 24, 2026
Size: 14.3 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.10

File hashes

Hashes for tessera_pydantic_ai-0.1.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`38441b55f7f0136407ad4fbe70c536e20377fd01e2bd89ae3729ee44e0ac8018`
MD5	`0bfc3dca2ee7e6568f75c2f6761a26e8`
BLAKE2b-256	`11a6b5c668f8ea0a0432f6e0789c2f603e4c4da27d0386a5bd2e967fd5b3151d`

See more details on using hashes here.

tessera-pydantic-ai 0.1.1

Navigation

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Project description

tessera-pydantic-ai

What it looks like

Install

Provider support

Worked example

What Tessera does on every Agent call

Pricing

FAQ

Q: How is this different from the other tessera-* packages?

Q: Why only OpenAI and Anthropic in v0.1?

Q: Does this break my tools / structured outputs / streaming?

Q: What happens if Tessera's proxy is down?

Q: What happens to my OpenAI / Anthropic rate limits?

Q: Are you storing my prompts and completions?

Q: Why are there two API surfaces (tessera_openai_config vs tessera_openai_provider)?

Links

About Tessera

Project details

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

File details

File metadata

File hashes

`tessera-pydantic-ai`

Q: Why are there two API surfaces (`tessera_openai_config` vs `tessera_openai_provider`)?