# covenance

Type-safe LLM outputs across any provider. Track every call and its cost.
```python
from covenance import ask_llm

review = ask_llm("Write a short review of Inception", model="gpt-4.1-nano")
is_positive = ask_llm(
    f"Is this review positive? '{review}'",
    model="gemini-2.5-flash-lite",
    response_type=bool,
)
print(is_positive)  # True/False
```
## Use cases

- **Structured outputs that work**: same code, any provider. Pydantic models, primitives, lists, tuples.
- **Zero routing code**: the model name determines the provider automatically (`gemini-*`, `claude-*`, `gpt-*`).
- **Convenience**: automatic retries on TPM (tokens-per-minute) rate limits, and when the LLM fails to return the type you requested.
- **Visibility**: know what you're calling and spending. Every call is logged with token counts and cost: `print_usage()` for totals, `print_call_timeline()` for a visual waterfall.
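The automatic retry behavior can be pictured as a simple backoff loop. A minimal sketch using only the standard library (illustrative, not covenance's actual implementation; `RuntimeError` stands in for a provider's rate-limit exception):

```python
import time

def with_backoff(call, max_attempts=4, base_delay=1.0):
    """Retry call() with exponential backoff on (stand-in) rate-limit errors."""
    for attempt in range(max_attempts):
        try:
            return call()
        except RuntimeError:  # stand-in for a provider rate-limit error
            if attempt == max_attempts - 1:
                raise  # out of attempts, surface the error
            time.sleep(base_delay * 2 ** attempt)  # wait 1s, 2s, 4s, ...
```

Covenance applies this kind of logic for you, so callers never write retry loops themselves.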
## Installation

Install only the providers you need:

```shell
pip install covenance[openai]     # OpenAI, Grok, OpenRouter
pip install covenance[anthropic]  # Anthropic Claude
pip install covenance[google]     # Google Gemini
pip install covenance[mistral]    # Mistral

# Multiple providers
pip install covenance[openai,anthropic]

# All providers
pip install covenance[all]
```
## Structured outputs

Pass `response_type` to get validated, typed results:

```python
from pydantic import BaseModel
from covenance import ask_llm

# Pydantic models
class Evaluation(BaseModel):
    reasoning: str
    is_correct: bool

result = ask_llm("Is 2+2=5?", model="gemini-2.5-flash-lite", response_type=Evaluation)
print(result.reasoning)   # "2+2 equals 4, not 5"
print(result.is_correct)  # False

# Primitives
answer = ask_llm("Is Python interpreted?", model="gpt-4.1-nano", response_type=bool)
print(answer)  # True

# Collections
items = ask_llm("List 3 prime numbers", model="claude-sonnet-4-20250514", response_type=list[int])
print(items)  # [2, 3, 5]
```
Works identically across OpenAI, Gemini, Anthropic, Mistral, Grok, and OpenRouter.
## Cost tracking

Every call is recorded with token counts and cost:

```python
from covenance import ask_llm, print_usage, print_call_timeline, get_records

ask_llm("Hello", model="gpt-4.1-nano")
ask_llm("Hello", model="gemini-2.5-flash-lite")

print_usage()
# ==================================================
# LLM Usage Summary (default client)
# ==================================================
# Calls:  2
# Tokens: 45 (In: 12, Out: 33)
# Cost:   $0.0001
# Models: gemini/gemini-2.5-flash-lite, openai/gpt-4.1-nano

# Access individual records
for record in get_records():
    print(f"{record.model}: {record.cost_usd}")
```
Persist records by setting `COVENANCE_RECORDS_DIR` or calling `set_llm_call_records_dir()`.
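Because records are plain objects with `model` and `cost_usd` fields (as in the loop above), per-model cost totals fall out of a simple reduction. A small sketch (the helper name is ours, not part of covenance):

```python
from collections import defaultdict

def cost_by_model(records):
    """Sum cost_usd per model across call records."""
    totals = defaultdict(float)
    for record in records:
        totals[record.model] += record.cost_usd
    return dict(totals)
```

For example, `cost_by_model(get_records())` would return a dict mapping each model name to its accumulated spend.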
## Call timeline

Visualize call sequences and parallelism in your terminal:

```python
from covenance import print_call_timeline

print_call_timeline()
# LLM Call Timeline (4.4s total, 5 calls)
#                    |0s                              4.4s|
# gpt-4.1-nano  1.3s |████████████████                    |
# g2.5-flash-l  1.1s |    ████████████                    |
# g2.5-flash-l  1.1s |    ████████████                    |
# g2.5-flash-l  1.5s |        ████████████████            |
# g2.5-flash-l  1.5s |                   █████████████████|
```
Each line is a call, sorted by start time. Blocks show when each call was active - parallel calls appear as overlapping bars on different rows.
## Consensus for quality

Run parallel LLM calls and integrate the results for higher quality:

```python
from covenance import llm_consensus

result = llm_consensus(
    "Explain quantum entanglement",
    model="gpt-4.1-nano",
    response_type=Evaluation,  # the Pydantic model defined above
    num_candidates=3,  # 3 parallel calls + integration
)
```
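Conceptually, consensus fans out the candidate calls in parallel and then integrates their answers. A generic standard-library sketch, with a majority vote standing in for the LLM-based integration step (this is not covenance's internal implementation):

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

def consensus(call, num_candidates=3):
    """Run call() num_candidates times in parallel; majority-vote the results."""
    with ThreadPoolExecutor(max_workers=num_candidates) as pool:
        candidates = list(pool.map(lambda _: call(), range(num_candidates)))
    return Counter(candidates).most_common(1)[0][0]
```

The parallel fan-out is why consensus calls appear as overlapping bars in the call timeline.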
## Supported providers

The provider is determined by the model name prefix:

| Prefix | Provider |
|---|---|
| `gpt-*`, `o1-*`, `o3-*` | OpenAI |
| `gemini-*` | Google Gemini |
| `claude-*` | Anthropic |
| `mistral-*`, `codestral-*` | Mistral |
| `grok-*` | xAI Grok |
| `org/model` (contains `/`) | OpenRouter |
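The prefix rule above can be sketched as a small lookup (illustrative only; the provider labels and error behavior here are our assumptions, not covenance's actual routing code):

```python
def infer_provider(model: str) -> str:
    """Map a model name to its provider, mirroring the prefix table above."""
    if "/" in model:  # org/model names go to OpenRouter
        return "openrouter"
    prefixes = {
        "gpt-": "openai", "o1-": "openai", "o3-": "openai",
        "gemini-": "google",
        "claude-": "anthropic",
        "mistral-": "mistral", "codestral-": "mistral",
        "grok-": "xai",
    }
    for prefix, provider in prefixes.items():
        if model.startswith(prefix):
            return provider
    raise ValueError(f"unrecognized model name: {model!r}")
```

Because the mapping is purely name-based, switching providers is a one-argument change at the call site.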
## Structured output reliability
Providers differ in how they enforce JSON schema compliance:
| Provider | Method | Guarantee |
|---|---|---|
| OpenAI | Constrained decoding | 100% schema-valid JSON |
| Google Gemini | Controlled generation | 100% schema-valid JSON |
| Grok | Constrained decoding | 100% schema-valid JSON |
| Anthropic | Structured outputs beta | 100% schema-valid JSON* |
| Mistral | Best-effort | Probabilistic |
| OpenRouter | Varies | Depends on underlying model |
\*Anthropic structured outputs require SDK >= 0.74.1 (uses `anthropic-beta: structured-outputs-2025-11-13`). Mistral uses probabilistic generation; Covenance retries automatically (up to 3 times) on JSON parse errors for Mistral.
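The parse-and-retry loop for probabilistic providers can be sketched as follows (an illustration of the idea, not covenance's internal code; `call` stands for one raw LLM request):

```python
import json

def parse_with_retries(call, max_attempts=3):
    """Re-invoke a JSON-producing call until its output parses, or give up."""
    for attempt in range(1, max_attempts + 1):
        raw = call()
        try:
            return json.loads(raw)
        except json.JSONDecodeError:
            if attempt == max_attempts:
                raise  # still malformed after max_attempts calls
```

Providers with constrained decoding never need this loop, since their output is schema-valid by construction.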
## API keys

Set environment variables for the providers you use:

- `OPENAI_API_KEY`
- `GOOGLE_API_KEY` (or `GEMINI_API_KEY`)
- `ANTHROPIC_API_KEY`
- `MISTRAL_API_KEY`
- `OPENROUTER_API_KEY`
- `XAI_API_KEY` (for Grok)

A `.env` file in the working directory is loaded automatically.
## Isolated clients

Use `Covenance` instances for separate API keys and call records per subsystem:

```python
from covenance import Covenance
from pydantic import BaseModel

# Each client tracks its own usage
question_client = Covenance(label="questions")
review_client = Covenance(label="review")

answer = question_client.ask_llm("Who is David Blaine?", model="gpt-4.1-nano")

class Evaluation(BaseModel):
    reasoning: str
    is_correct: bool

evaluation = review_client.llm_consensus(
    f"Is this accurate? '''{answer}'''",
    model="gemini-2.5-flash-lite",
    response_type=Evaluation,
)

question_client.print_usage()  # Shows only the question call
review_client.print_usage()    # Shows only the review call
```
## How it works: dual backend

Covenance uses two backends for structured output and picks the better one per provider:

- **Native SDK**: calls the provider's API directly (e.g., the OpenAI Responses API with `responses.parse`)
- **pydantic-ai**: uses pydantic-ai as a unified layer
The default routing:
| Provider | Backend | Why |
|---|---|---|
| OpenAI | Native | Responses API with constrained decoding handles enums, recursive types, and large schemas more reliably |
| Grok | Native | OpenAI-compatible API, same benefits |
| Gemini | pydantic-ai | Native SDK hits RecursionError on self-referencing types (e.g., tree nodes) |
| Anthropic | pydantic-ai | No native client implemented |
| Mistral | pydantic-ai | Similar pass rates; pydantic-ai handles recursive types better |
| OpenRouter | pydantic-ai | No native client implemented |
These defaults are based on a stress-test suite that runs 14 test categories across providers with both backends. Results for the cheapest model per provider:

- OpenAI (`gpt-4.1-nano`): native 14/14, pydantic-ai 10/14
- Gemini (`gemini-2.5-flash-lite`): native 11/14, pydantic-ai 13/14
- Mistral (`mistral-small-latest`): native 9/14, pydantic-ai 8/14
Where native beats pydantic-ai on OpenAI: enum adherence (strict values vs. hallucinated ones), recursive types (deeper trees), real-world schemas (fewer empty fields), and extreme schema limits (100+ fields with Literal types).
Where pydantic-ai beats native on Gemini: recursive/self-referencing types (native Google SDK crashes with RecursionError).
### Overriding the backend

Each `Covenance` instance has a `backends` object with a field per provider. You can inspect and override them:
```python
from covenance import Covenance

client = Covenance()
print(client.backends)
# Backends(native=[openai, grok], pydantic=[gemini, anthropic, mistral, openrouter])

# Override a specific provider
client.backends.anthropic = "native"

# Force all providers to one backend (useful for benchmarking)
client.backends.set_all("native")
```
Only `"native"` and `"pydantic"` are accepted; anything else raises `ValueError`.
Every call records which backend was used:

```python
for record in client.get_records():
    print(f"{record.model}: {record.backend}")  # "native" or "pydantic"
```
The backend also shows in `print_call_timeline()` as `(N)` or `(P)`:

```python
print_call_timeline()
# LLM Call Timeline (2.1s total, 2 calls)
#                       |0s                                2.1s|
# gpt-4.1-nano(N)  0.8s |█████████████████                     |
# g2.5-flash-l(P)  1.1s |          ██████████████████████████  |
```
To see routing decisions in real time, enable debug logging:

```python
import logging
logging.basicConfig(level=logging.DEBUG)
# DEBUG:covenance:ask_llm: model=gpt-4.1-nano provider=openai backend=native
```