Online LLM clients for OpenAI, Google Gemini, Mistral, Anthropic Claude, and OpenRouter

covenance

Type-safe LLM outputs across any provider. Track every call and its cost.

from covenance import ask_llm

review = ask_llm("Write a short review of Inception", model="gpt-4.1-nano")
is_positive = ask_llm(
    f"Is this review positive? '{review}'",
    model="gemini-2.5-flash-lite",
    response_type=bool)
print(is_positive)  # True/False

Use cases

  • Structured outputs that work - Same code, any provider. Pydantic models, primitives, lists, tuples.
  • Zero routing code - Model name determines provider automatically (gemini-*, claude-*, gpt-*)
  • Convenience - automatic retries on TPM (tokens-per-minute) rate limits, and when the LLM fails to return the requested type.
  • Visibility - know what you're calling and spending. Every call is logged with token counts and cost: print_usage() for totals, print_call_timeline() for a visual waterfall.

Installation

Install only the providers you need:

pip install covenance[openai]      # OpenAI, Grok, OpenRouter
pip install covenance[anthropic]   # Anthropic Claude
pip install covenance[google]      # Google Gemini
pip install covenance[mistral]     # Mistral

# Multiple providers
pip install covenance[openai,anthropic]

# All providers
pip install covenance[all]

Structured outputs

Pass response_type to get validated, typed results:

# Pydantic models
from pydantic import BaseModel

class Evaluation(BaseModel):
    reasoning: str
    is_correct: bool

result = ask_llm("Is 2+2=5?", model="gemini-2.5-flash-lite", response_type=Evaluation)
print(result.reasoning)  # "2+2 equals 4, not 5"
print(result.is_correct)  # False

# Primitives
answer = ask_llm("Is Python interpreted?", model="gpt-4.1-nano", response_type=bool)
print(answer)  # True

# Collections
items = ask_llm("List 3 prime numbers", model="claude-sonnet-4-20250514", response_type=list[int])
print(items)  # [2, 3, 5]

Works identically across OpenAI, Gemini, Anthropic, Mistral, Grok, and OpenRouter.

Cost tracking

Every call is recorded with token counts and cost:

from covenance import ask_llm, print_usage, print_call_timeline, get_records

ask_llm("Hello", model="gpt-4.1-nano")
ask_llm("Hello", model="gemini-2.5-flash-lite")

print_usage()
# ==================================================
# LLM Usage Summary (default client)
# ==================================================
#   Calls: 2
#   Tokens: 45 (In: 12, Out: 33)
#   Cost: $0.0001
#   Models: gemini/gemini-2.5-flash-lite, openai/gpt-4.1-nano

# Access individual records
for record in get_records():
    print(f"{record.model}: {record.cost_usd}")

Persist records by setting COVENANCE_RECORDS_DIR or calling set_llm_call_records_dir().
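For example, via the environment variable (the directory name here is illustrative):

```shell
export COVENANCE_RECORDS_DIR=./llm_call_records
```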

Call timeline

Visualize call sequences and parallelism in your terminal:

from covenance import print_call_timeline

print_call_timeline()
# LLM Call Timeline (4.4s total, 5 calls)
#                         |0s                                            4.4s|
#   gpt-4.1-nano    1.3s  |████████████████                                  |
#   g2.5-flash-l    1.1s  |                 ████████████                     |
#   g2.5-flash-l    1.1s  |                 ████████████                     |
#   g2.5-flash-l    1.5s  |                 ████████████████                 |
#   g2.5-flash-l    1.5s  |                                 █████████████████|

Each line is a call, sorted by start time. Blocks show when each call was active - parallel calls appear as overlapping bars on different rows.
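The rendering idea can be sketched in a few lines. This is a toy waterfall renderer over (label, start, end) tuples, not covenance's implementation:

```python
def render_timeline(records, width=50):
    """Toy waterfall renderer (illustrative, not covenance's internals).

    `records` is a list of (label, start_s, end_s) tuples; each row's bar
    is positioned proportionally to the call's start/end within the total.
    """
    total = max(end for _, _, end in records)
    lines = [f"Timeline ({total:.1f}s total, {len(records)} calls)"]
    for label, start, end in sorted(records, key=lambda r: r[1]):
        begin = int(start / total * width)
        stop = max(begin + 1, int(end / total * width))
        bar = " " * begin + "█" * (stop - begin) + " " * (width - stop)
        lines.append(f"  {label:<16}{end - start:>5.1f}s |{bar}|")
    return "\n".join(lines)
```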

Consensus for quality

Run parallel LLM calls and integrate results for higher quality:

from covenance import llm_consensus

result = llm_consensus(
    "Explain quantum entanglement",
    model="gpt-4.1-nano",
    response_type=Evaluation,
    num_candidates=3,  # 3 parallel calls + integration
)
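The pattern behind llm_consensus is N independent drafts followed by one integration call. A generic sketch of that pattern, where `ask` stands in for any prompt-to-answer callable (the integration prompt wording is an assumption, not covenance's actual prompt):

```python
from concurrent.futures import ThreadPoolExecutor

def consensus(ask, prompt, num_candidates=3):
    """Candidates-plus-integration sketch (illustrative only).

    Runs `num_candidates` parallel calls on the same prompt, then asks
    once more to synthesize the drafts into a single answer.
    """
    with ThreadPoolExecutor(max_workers=num_candidates) as pool:
        candidates = list(pool.map(ask, [prompt] * num_candidates))
    drafts = "\n---\n".join(candidates)
    return ask(f"Synthesize the single best answer from these drafts:\n{drafts}")
```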

Supported providers

Provider is determined by model name prefix:

Prefix                   Provider
gpt-*, o1-*, o3-*        OpenAI
gemini-*                 Google Gemini
claude-*                 Anthropic
mistral-*, codestral-*   Mistral
grok-*                   xAI Grok
org/model (contains /)   OpenRouter
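The routing rule above can be sketched as plain prefix matching (illustrative; not covenance's actual implementation):

```python
def detect_provider(model: str) -> str:
    """Map a model name to a provider, mirroring the prefix table above."""
    if "/" in model:  # org/model names go to OpenRouter
        return "OpenRouter"
    prefix_map = {
        ("gpt-", "o1-", "o3-"): "OpenAI",
        ("gemini-",): "Google Gemini",
        ("claude-",): "Anthropic",
        ("mistral-", "codestral-"): "Mistral",
        ("grok-",): "xAI Grok",
    }
    for prefixes, provider in prefix_map.items():
        if model.startswith(prefixes):
            return provider
    raise ValueError(f"Unknown model prefix: {model!r}")
```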

Structured output reliability

Providers differ in how they enforce JSON schema compliance:

Provider        Method                    Guarantee
OpenAI          Constrained decoding      100% schema-valid JSON
Google Gemini   Controlled generation     100% schema-valid JSON
Grok            Constrained decoding      100% schema-valid JSON
Anthropic       Structured outputs beta   100% schema-valid JSON*
Mistral         Best-effort               Probabilistic
OpenRouter      Varies                    Depends on underlying model

*The Anthropic structured-outputs beta requires SDK >= 0.74.1 (uses the anthropic-beta: structured-outputs-2025-11-13 header). Mistral's generation is best-effort, so Covenance retries automatically (up to 3 times) on JSON parse errors.
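The retry-on-parse-failure loop for best-effort providers looks roughly like this (a sketch; covenance's internals may differ):

```python
import json

def parse_json_with_retries(call, max_attempts=3):
    """Re-invoke `call` (a zero-argument callable returning a raw string)
    until its output parses as JSON, up to `max_attempts` tries."""
    for attempt in range(1, max_attempts + 1):
        raw = call()
        try:
            return json.loads(raw)
        except json.JSONDecodeError:
            if attempt == max_attempts:
                raise  # give up after the last attempt
```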

API keys

Set environment variables for the providers you use:

  • OPENAI_API_KEY
  • GOOGLE_API_KEY (or GEMINI_API_KEY)
  • ANTHROPIC_API_KEY
  • MISTRAL_API_KEY
  • OPENROUTER_API_KEY
  • XAI_API_KEY (for Grok)

A .env file in the working directory is loaded automatically.
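A minimal .env looks like this (values are placeholders, not real keys):

```
OPENAI_API_KEY=sk-...
GOOGLE_API_KEY=...
ANTHROPIC_API_KEY=...
```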

Isolated clients

Use Covenance instances for separate API keys and call records per subsystem:

from covenance import Covenance
from pydantic import BaseModel

# Each client tracks its own usage
question_client = Covenance(label="questions")
review_client = Covenance(label="review")

answer = question_client.ask_llm("Who is David Blaine?", model="gpt-4.1-nano")

class Evaluation(BaseModel):
    reasoning: str
    is_correct: bool

evaluation = review_client.llm_consensus(
    f"Is this accurate? '''{answer}'''",
    model="gemini-2.5-flash-lite",
    response_type=Evaluation,
)

question_client.print_usage()  # Shows only the question call
review_client.print_usage()    # Shows only the review call

How it works: dual backend

Covenance uses two backends for structured output and picks the better one per provider:

  • Native SDK — calls the provider's API directly (e.g., OpenAI Responses API with responses.parse)
  • pydantic-ai — uses pydantic-ai as a unified layer

The default routing:

Provider     Backend       Why
OpenAI       Native        Responses API with constrained decoding handles enums, recursive types, and large schemas more reliably
Grok         Native        OpenAI-compatible API, same benefits
Gemini       pydantic-ai   Native SDK hits RecursionError on self-referencing types (e.g., tree nodes)
Anthropic    pydantic-ai   No native client implemented
Mistral      pydantic-ai   Similar pass rates; pydantic-ai handles recursive types better
OpenRouter   pydantic-ai   No native client implemented

These defaults are based on a stress test suite that runs 14 test categories across providers with both backends. The results for the cheapest model per provider:

OpenAI  (gpt-4.1-nano):          native 14/14, pydantic-ai 10/14
Gemini  (gemini-2.5-flash-lite): native 11/14, pydantic-ai 13/14
Mistral (mistral-small-latest):  native  9/14, pydantic-ai  8/14

Where native beats pydantic-ai on OpenAI: enum adherence (strict values vs. hallucinated ones), recursive types (deeper trees), real-world schemas (fewer empty fields), and extreme schema limits (100+ fields with Literal types).

Where pydantic-ai beats native on Gemini: recursive/self-referencing types (native Google SDK crashes with RecursionError).

Overriding the backend

Each Covenance instance has a backends object with a field per provider. You can inspect and override them:

from covenance import Covenance

client = Covenance()
print(client.backends)
# Backends(native=[openai, grok], pydantic=[gemini, anthropic, mistral, openrouter])

# Override a specific provider
client.backends.anthropic = "native"

# Force all providers to one backend (useful for benchmarking)
client.backends.set_all("native")

Only "native" and "pydantic" are accepted — anything else raises ValueError.

Every call records which backend was used:

for record in client.get_records():
    print(f"{record.model}: {record.backend}")  # "native" or "pydantic"

The backend also shows in print_call_timeline() as (N) or (P):

print_call_timeline()
# LLM Call Timeline (2.1s total, 2 calls)
#                            |0s                                       2.1s|
#   gpt-4.1-nano(N)    0.8s  |█████████████████                            |
#   g2.5-flash-l(P)    1.1s  |                  ██████████████████████████  |

To see routing decisions in real time, enable debug logging:

import logging
logging.basicConfig(level=logging.DEBUG)
# DEBUG:covenance:ask_llm: model=gpt-4.1-nano provider=openai backend=native
