OmniGate — a litellm-style multi-provider LLM SDK: call OpenAI, Anthropic, Gemini & Azure in-process with routing, retry, fallback, circuit breaking, cost tracking and an opt-in cache — or point it at a hosted OmniGate gateway.

Maintainers

ninja3

Project description

omnigate

A small, fully-typed, litellm-style multi-provider LLM SDK — sync and async, streaming-aware, with typed errors. Depends only on httpx and pydantic.

pip install omnigate

Two ways to use it:

In-process — call OpenAI / Anthropic / Gemini / Azure directly, no server to run. You get routing, retry + backoff, fallbacks, circuit breaking, per-call cost tracking, an opt-in response cache, callbacks and a local spend cap.
Hosted gateway client — point Client / AsyncClient at a running OmniGate server for centralised auth, budgets, rate limiting and metrics.

In-process quick start

Set a provider key the usual way (OPENAI_API_KEY, ANTHROPIC_API_KEY, GEMINI_API_KEY / GOOGLE_API_KEY, or AZURE_OPENAI_API_KEY + AZURE_OPENAI_ENDPOINT) — or pass api_key= explicitly.

import omnigate

r = omnigate.completion(model="gpt-4o-mini", messages="Say hi in French")
print(r.content, r.usage.total_tokens, r.cost_usd, r.model, r.provider)

messages is flexible: pass a bare string (treated as one user message), a single dict/Message, or a list of dicts/Messages. The model name routes to the provider by prefix (gpt-*/o1/o3/o4 → OpenAI, claude-* → Anthropic, gemini-* → Gemini, azure/<deployment> → Azure OpenAI).
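To make the two rules above concrete, here is a minimal standalone sketch of message normalisation and prefix routing. It is not omnigate's code: the helper names (`resolve_provider`, `normalize_messages`) and the prefix table are assumptions that only mirror the behaviour described in this section.

```python
from typing import Union

# Prefix table mirroring the routing rule described above. The exact matching
# logic inside omnigate is an assumption; this is only an illustration.
PREFIX_TO_PROVIDER = [
    ("azure/", "azure"),
    ("gpt-", "openai"),
    ("o1", "openai"),
    ("o3", "openai"),
    ("o4", "openai"),
    ("claude-", "anthropic"),
    ("gemini-", "gemini"),
]


def resolve_provider(model: str) -> str:
    """Return the provider name for a model id, based on its prefix."""
    for prefix, provider in PREFIX_TO_PROVIDER:
        if model.startswith(prefix):
            return provider
    raise ValueError(f"unknown model: {model!r}")


def normalize_messages(messages: Union[str, dict, list]) -> list:
    """Coerce a bare string, a single dict, or a list of dicts into a list."""
    if isinstance(messages, str):
        # A bare string is treated as one user message.
        return [{"role": "user", "content": messages}]
    if isinstance(messages, dict):
        return [messages]
    return list(messages)
```

With this sketch, `resolve_provider("azure/my-gpt4o-deployment")` yields `"azure"`, and `normalize_messages("hi")` yields a one-element list holding a user message.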

Async

import asyncio, omnigate

async def main():
    r = await omnigate.acompletion(
        model="claude-3-5-haiku-latest",
        messages=[{"role": "user", "content": "hi"}],
    )
    print(r.content)

asyncio.run(main())

Streaming

completion(stream=True) returns an iterator of StreamChunk; the async twin returns an async iterator. Content chunks carry text; the final chunk carries usage.

for chunk in omnigate.completion(model="gpt-4o-mini", messages="haiku", stream=True):
    print(chunk.text, end="", flush=True)

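# async: `async for` is only valid inside a coroutine
import asyncio

async def main():
    stream = await omnigate.acompletion(model="gpt-4o-mini", messages="haiku", stream=True)
    async for chunk in stream:
        print(chunk.text, end="", flush=True)

asyncio.run(main())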
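If you need the full text and the token usage after streaming, you can accumulate the chunks yourself. The sketch below uses a hypothetical `FakeChunk` stand-in so it runs without network access; the `usage` attribute name is an assumption based on the description above, not a confirmed field of `StreamChunk`.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass
class FakeChunk:
    """Hypothetical stand-in for StreamChunk: text on content chunks, usage on the last."""
    text: str = ""
    usage: Optional[dict] = None


def collect_stream(chunks: Iterable[FakeChunk]) -> Tuple[str, Optional[dict]]:
    """Join the text of every chunk and keep the usage reported by the final chunk."""
    parts = []
    usage = None
    for chunk in chunks:
        if chunk.text:
            parts.append(chunk.text)
        if chunk.usage is not None:
            usage = chunk.usage
    return "".join(parts), usage


# Simulated stream: two content chunks, then a final chunk carrying usage.
demo = [FakeChunk(text="Bon"), FakeChunk(text="jour"), FakeChunk(usage={"total_tokens": 7})]
text, usage = collect_stream(demo)
```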

Fallbacks

Try models in order until one succeeds. Each may resolve to a different provider. Transient failures (429/5xx/timeout) trip the breaker, while other client errors (4xx) just move on to the next model. response.fallback_used tells you whether a fallback answered.

r = omnigate.completion(
    model="gpt-4o-mini",
    messages="hi",
    fallbacks=["claude-3-5-haiku-latest", "gemini-1.5-flash"],
)
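The control flow behind fallbacks can be sketched in a few lines. This is an illustration of the idea, not omnigate's implementation: `call_with_fallbacks`, `TransientError` and `ClientError` are hypothetical names, and the breaker bookkeeping is left out.

```python
from typing import Callable, Sequence, Tuple


class TransientError(Exception):
    """Stand-in for 429/5xx/timeout failures."""


class ClientError(Exception):
    """Stand-in for other 4xx failures."""


def call_with_fallbacks(
    call: Callable[[str], str], primary: str, fallbacks: Sequence[str]
) -> Tuple[str, bool]:
    """Try the primary model, then each fallback; return (result, fallback_used)."""
    last_error = None
    for index, model in enumerate([primary, *fallbacks]):
        try:
            return call(model), index > 0
        except (TransientError, ClientError) as exc:
            last_error = exc  # remember the failure and move on to the next model
    raise last_error  # every model failed: surface the last error


def fake_call(model: str) -> str:
    """Simulated provider call: the primary model is down, the others answer."""
    if model == "gpt-4o-mini":
        raise TransientError("503 from provider")
    return f"answer from {model}"


result, fallback_used = call_with_fallbacks(
    fake_call, "gpt-4o-mini", ["claude-3-5-haiku-latest", "gemini-1.5-flash"]
)
```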

Cost tracking

Every non-streamed response carries cost_usd computed from a built-in per-model price table (omnigate.pricing). Cached hits are billed as 0.0.
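As a rough picture of how a price-table lookup turns token counts into a dollar figure, consider the sketch below. The table layout, the `example-model` entry and its prices are invented for illustration and are not the contents of omnigate.pricing.

```python
# Hypothetical prices in USD per 1M tokens. These numbers are made up and are
# NOT the values shipped in omnigate.pricing.
PRICES = {
    "example-model": {"input": 1.00, "output": 2.00},
}


def cost_usd(model: str, prompt_tokens: int, completion_tokens: int, cached: bool = False) -> float:
    """Compute the cost of one call; cached hits cost nothing."""
    if cached:
        return 0.0
    price = PRICES[model]
    total = prompt_tokens * price["input"] + completion_tokens * price["output"]
    return total / 1_000_000


fresh = cost_usd("example-model", prompt_tokens=1000, completion_tokens=500)
cached_hit = cost_usd("example-model", prompt_tokens=1000, completion_tokens=500, cached=True)
```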

Response cache (opt-in)

A deterministic, in-memory TTL cache for repeated temperature=0 calls. Enable per call with cache=True, or globally via configure(cache_enabled=True).

r1 = omnigate.completion(model="gpt-4o-mini", messages="2+2?", temperature=0, cache=True)
r2 = omnigate.completion(model="gpt-4o-mini", messages="2+2?", temperature=0, cache=True)
assert r2.cached and r2.cost_usd == 0.0   # served from cache, no second API call
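For intuition, a deterministic in-memory TTL cache can be built from a hash of the request plus an expiry timestamp. The class below is a generic sketch under that assumption; how omnigate actually derives its cache key is not documented here, and `TTLCache` is a hypothetical name.

```python
import hashlib
import json
import time


class TTLCache:
    """Tiny in-memory cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds: float = 300.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._store = {}

    @staticmethod
    def make_key(model: str, messages, **params) -> str:
        """Hash the request deterministically (sorted keys give a stable digest)."""
        payload = json.dumps(
            {"model": model, "messages": messages, "params": params}, sort_keys=True
        )
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def get(self, key: str):
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._store[key]  # expired: drop it and report a miss
            return None
        return value

    def set(self, key: str, value) -> None:
        self._store[key] = (self._clock() + self._ttl, value)


# Demo with a fake clock so no real time has to pass.
now = [0.0]
cache = TTLCache(ttl_seconds=300, clock=lambda: now[0])
key = TTLCache.make_key("gpt-4o-mini", "2+2?", temperature=0)
cache.set(key, "4")
hit = cache.get(key)
now[0] = 301.0  # jump past the TTL
miss = cache.get(key)
```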

Callbacks

omnigate.register_callback(
    on_success=lambda e: print(e.provider, e.model, e.cost_usd, e.latency_ms),
    on_failure=lambda e: print("failed:", e.exception),
)
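A callback does not have to be a lambda; any callable works as a handler. As an example, the hypothetical `SpendByModel` class below aggregates cost per provider and model. It relies only on the event fields used above (provider, model, cost_usd) and is exercised here with simulated events rather than real calls.

```python
from collections import defaultdict
from types import SimpleNamespace


class SpendByModel:
    """Callable usable as an on_success handler; sums cost per (provider, model)."""

    def __init__(self):
        self.totals = defaultdict(float)
        self.calls = defaultdict(int)

    def __call__(self, event) -> None:
        key = (event.provider, event.model)
        self.totals[key] += event.cost_usd or 0.0
        self.calls[key] += 1


tracker = SpendByModel()

# Simulated success events. With omnigate you would instead register the
# tracker via register_callback(on_success=tracker).
tracker(SimpleNamespace(provider="openai", model="gpt-4o-mini", cost_usd=0.25, latency_ms=120))
tracker(SimpleNamespace(provider="openai", model="gpt-4o-mini", cost_usd=0.5, latency_ms=95))
```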

Local spend cap

Set a process-wide USD ceiling; once reached, further calls raise BudgetExceededError.

omnigate.configure(max_spend_usd=5.00)
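Conceptually, a spend cap is a running total checked before each call. The sketch below shows that pattern with local stand-in names (`SpendCap`, `BudgetExceeded`); it assumes the check uses "spent >= cap", which matches "once reached" but is not confirmed by the source.

```python
class BudgetExceeded(Exception):
    """Local stand-in for omnigate's BudgetExceededError."""


class SpendCap:
    """Process-wide running total with a hard ceiling in USD."""

    def __init__(self, max_spend_usd: float):
        self.max_spend_usd = max_spend_usd
        self.spent_usd = 0.0

    def check(self) -> None:
        """Call before each request; raises once the ceiling has been reached."""
        if self.spent_usd >= self.max_spend_usd:
            raise BudgetExceeded(
                f"spent {self.spent_usd:.2f} USD of {self.max_spend_usd:.2f} USD"
            )

    def record(self, cost_usd: float) -> None:
        """Call after each request with the cost it incurred."""
        self.spent_usd += cost_usd


cap = SpendCap(max_spend_usd=5.00)
cap.check()        # under the cap: passes silently
cap.record(5.00)   # a call that used up the whole budget
try:
    cap.check()
    blocked = False
except BudgetExceeded:
    blocked = True
```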

Configuration & keys

configure(...) sets process-global defaults and/or keys; per-call kwargs (timeout=, num_retries=, cache=, api_key=, api_base=, api_version=) override them. Everything also reads from the environment:

Setting	Env var	Default
Request timeout (s)	`OMNIGATE_TIMEOUT_SECONDS`	60
Retry attempts	`OMNIGATE_RETRY_MAX_ATTEMPTS`	3
Retry base delay (s)	`OMNIGATE_RETRY_BASE_DELAY_SECONDS`	0.25
Retry max delay (s)	`OMNIGATE_RETRY_MAX_DELAY_SECONDS`	8.0
Retry jitter (s)	`OMNIGATE_RETRY_JITTER_SECONDS`	0.25
Circuit breaker on	`OMNIGATE_CIRCUIT_BREAKER_ENABLED`	true
Breaker fail threshold	`OMNIGATE_CIRCUIT_BREAKER_FAIL_THRESHOLD`	5
Breaker cooldown (s)	`OMNIGATE_CIRCUIT_BREAKER_COOLDOWN_SECONDS`	30
Cache on	`OMNIGATE_CACHE_ENABLED`	false
Cache TTL (s)	`OMNIGATE_CACHE_TTL_SECONDS`	300
Local spend cap (USD)	`OMNIGATE_MAX_SPEND_USD`	(off)
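To get a feel for the retry defaults in the table, the sketch below computes a delay schedule using a common scheme: exponential backoff capped at the max delay, plus uniform jitter. Whether omnigate uses exactly this formula is an assumption; only the default values (0.25, 8.0, 0.25) come from the table.

```python
import random


def backoff_delay(
    attempt: int,
    base: float = 0.25,
    max_delay: float = 8.0,
    jitter: float = 0.25,
    rng=random.random,
) -> float:
    """Delay in seconds before retry number attempt + 1 (attempt is 0-based)."""
    capped = min(max_delay, base * (2 ** attempt))  # double each time, up to the cap
    return capped + rng() * jitter                  # add up to `jitter` seconds of noise


# Schedule without jitter, to show the deterministic part of the delay.
schedule = [backoff_delay(n, jitter=0.0) for n in range(7)]
```

Under these assumptions the deterministic part of the delay doubles from 0.25 s and flattens out at 8.0 s from the sixth retry onward.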

import omnigate

omnigate.configure(
    openai_api_key="sk-...",
    anthropic_api_key="...",
    azure_endpoint="https://my.openai.azure.com",
    cache_enabled=True,
    num_retries=3,   # note: in configure this is EngineConfig.retry_max_attempts
)

# Azure: deployment is taken from the model id
omnigate.completion(model="azure/my-gpt4o-deployment", messages="hi",
                    api_key="...", api_base="https://my.openai.azure.com")
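The breaker settings in the table above (fail threshold 5, cooldown 30 s) fit a conventional consecutive-failure circuit breaker. The class below is a generic sketch of that pattern, not omnigate's implementation; the class and method names are hypothetical, and the half-open behaviour after the cooldown is an assumption.

```python
import time


class CircuitBreaker:
    """Opens after N consecutive failures, then allows a trial call after a cooldown."""

    def __init__(self, fail_threshold: int = 5, cooldown_seconds: float = 30.0,
                 clock=time.monotonic):
        self.fail_threshold = fail_threshold
        self.cooldown_seconds = cooldown_seconds
        self._clock = clock  # injectable so the cooldown can be tested without sleeping
        self._failures = 0
        self._opened_at = None

    def allow(self) -> bool:
        """Return True if a request may be sent right now."""
        if self._opened_at is None:
            return True
        # After the cooldown, let a trial request through (half-open state).
        return self._clock() - self._opened_at >= self.cooldown_seconds

    def record_success(self) -> None:
        self._failures = 0
        self._opened_at = None

    def record_failure(self) -> None:
        self._failures += 1
        if self._failures >= self.fail_threshold:
            self._opened_at = self._clock()


# Demo with a fake clock.
now = [0.0]
breaker = CircuitBreaker(clock=lambda: now[0])
for _ in range(5):
    breaker.record_failure()
blocked_while_open = not breaker.allow()
now[0] = 30.0  # cooldown has elapsed
allowed_after_cooldown = breaker.allow()
```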

Errors (in-process)

All errors derive from GatewayError.

Exception	When
`AuthError`	provider returned 401/403 (your provider key is bad)
`RateLimitError`	429 — has `.retry_after` (honored by retry)
`BudgetExceededError`	local spend cap reached
`ProviderError`	5xx / network / timeout (retried, then surfaced)
`APIError`	config errors (unknown model, missing key) and other 4xx
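The table reads as a mapping from HTTP status to exception class. The sketch below spells that mapping out with local stand-in classes that reuse the names from the table so it runs without omnigate installed; the `classify` helper is hypothetical, and the real library also maps network errors and timeouts, which are not modelled here.

```python
from typing import Optional


class GatewayError(Exception):
    """Local stand-in for the base class of all omnigate errors."""


class AuthError(GatewayError):
    pass


class RateLimitError(GatewayError):
    def __init__(self, message: str, retry_after: Optional[float] = None):
        super().__init__(message)
        self.retry_after = retry_after  # seconds to wait, if the provider said so


class ProviderError(GatewayError):
    pass


class APIError(GatewayError):
    pass


def classify(status: int, retry_after: Optional[float] = None) -> GatewayError:
    """Map an HTTP error status to the exception type listed in the table above."""
    if status in (401, 403):
        return AuthError(f"HTTP {status}")
    if status == 429:
        return RateLimitError("HTTP 429", retry_after=retry_after)
    if status >= 500:
        return ProviderError(f"HTTP {status}")
    if 400 <= status < 500:
        return APIError(f"HTTP {status}")
    raise ValueError(f"not an error status: {status}")
```

Because every class derives from `GatewayError`, a single `except GatewayError` catches all of them, while more specific handlers can be listed first.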

Hosted gateway client

If you run an OmniGate server, point the client at it for centralised auth, budgets, rate limiting and metrics. The client talks to the gateway's HTTP API; it does not call providers itself.

from omnigate import Client

# Public client (no key) just for signup:
with Client(base_url="https://gw.example.com") as anon:
    acct = anon.signup(email="dev@acme.com", org_name="Acme", project_name="prod")

client = Client(api_key=acct.api_key, base_url="https://gw.example.com", user_id="u-42")
client.set_provider_key(provider="openai", api_key="sk-...")  # stored encrypted by the gateway

resp = client.chat(model="gpt-4o-mini", messages=[{"role": "user", "content": "Hi"}])
print(resp.content, resp.usage.total_tokens, resp.cost_usd)
client.close()

AsyncClient mirrors Client exactly (identical constructor and method names), but every method is async def and chat_stream returns an async iterator. Use async with / await client.aclose().

import asyncio
from omnigate import AsyncClient, BudgetExceededError, RateLimitError

async def main():
    async with AsyncClient(api_key="llmg_...", base_url="https://gw.example.com") as c:
        try:
            async for piece in c.chat_stream(model="claude-3-5-sonnet-latest", messages="hi"):
                print(piece, end="")
        except RateLimitError as e:
            print("slow down; retry after", e.retry_after)
        except BudgetExceededError as e:
            print("budget hit:", e.detail)

asyncio.run(main())

Pointing the OpenAI SDK at the gateway

The gateway exposes an OpenAI-compatible POST /v1/chat/completions, so you can reuse the official OpenAI SDK and just change the base URL + key:

from openai import OpenAI

oai = OpenAI(
    base_url="https://gw.example.com/v1",
    api_key="llmg_...",
    default_headers={"x-api-key": "llmg_...", "x-user-id": "u-42"},
)
oai.chat.completions.create(model="gpt-4o-mini", messages=[{"role": "user", "content": "Hi"}])
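Without any SDK, the same call is a plain JSON POST. The sketch below only builds the request with the standard library and does not send it. The body follows the OpenAI chat-completions shape; sending both an Authorization bearer header and x-api-key mirrors the SDK example above, but which of the two the gateway actually requires is an assumption. `build_chat_request` is a hypothetical helper.

```python
import json
import urllib.request
from typing import Optional


def build_chat_request(base_url: str, api_key: str, model: str, messages: list,
                       user_id: Optional[str] = None) -> urllib.request.Request:
    """Build (but do not send) a POST to the OpenAI-compatible chat endpoint."""
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}",  # what the OpenAI SDK sends for api_key
        "x-api-key": api_key,                  # header used in the SDK example above
    }
    if user_id is not None:
        headers["x-user-id"] = user_id
    body = json.dumps({"model": model, "messages": messages}).encode("utf-8")
    url = f"{base_url.rstrip('/')}/chat/completions"
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


req = build_chat_request(
    "https://gw.example.com/v1",
    "llmg_...",
    "gpt-4o-mini",
    [{"role": "user", "content": "Hi"}],
    user_id="u-42",
)
# To actually send it: urllib.request.urlopen(req), against a running gateway.
```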

Models, metrics & key management (hosted)

for m in client.models():            # GET /v1/models
    print(m.id, m.owned_by, m.provider)

mx = client.metrics(range="7d")      # GET /v1/metrics (1h | 24h | 7d | 30d)
print(mx.totals.requests, mx.totals.cost_usd, mx.totals.p95_latency_ms)

key = client.create_api_key(name="ci")   # POST /v1/keys/api -> ApiKeyCreated (plaintext shown once)
client.me()
client.health()

Gateway-client errors map onto the same exception hierarchy. A provider-surfaced 401 is classified as ProviderError (not AuthError), so you can tell "my gateway key is bad" from "my OpenAI key is bad".

License

MIT

Release history

0.2.0 (this version), Jun 4, 2026

0.1.0, Jun 4, 2026

Download files

Source Distribution

omnigate-0.2.0.tar.gz (42.3 kB)

Uploaded Jun 4, 2026 Source

Built Distribution

omnigate-0.2.0-py3-none-any.whl (47.7 kB)

Uploaded Jun 4, 2026 Python 3

File details

Details for the file omnigate-0.2.0.tar.gz.

File metadata

Download URL: omnigate-0.2.0.tar.gz
Upload date: Jun 4, 2026
Size: 42.3 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for omnigate-0.2.0.tar.gz
Algorithm	Hash digest
SHA256	`4693639aa98619cbcd941a6595f7e670cdda588ae814e158b6b503902ca1c43c`
MD5	`87bde789730c805f76c9c5e264e2bf82`
BLAKE2b-256	`ca01a040fcd95abcf7fa3a3acc03d7996d56c2dbe505204a0296e51bdcb42606`

Provenance

The following attestation bundles were made for omnigate-0.2.0.tar.gz:

Publisher: publish.yml on sreekarp/omnigate

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: omnigate-0.2.0.tar.gz
- Subject digest: 4693639aa98619cbcd941a6595f7e670cdda588ae814e158b6b503902ca1c43c
- Sigstore transparency entry: 1723677947
- Sigstore integration time: Jun 4, 2026
Source repository:
- Permalink: sreekarp/omnigate@6dc0a6e1f1d1e93f05cf2ddf95e34c4f53609404
- Branch / Tag: refs/heads/main
- Owner: https://github.com/sreekarp
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@6dc0a6e1f1d1e93f05cf2ddf95e34c4f53609404
- Trigger Event: workflow_dispatch

File details

Details for the file omnigate-0.2.0-py3-none-any.whl.

File metadata

Download URL: omnigate-0.2.0-py3-none-any.whl
Upload date: Jun 4, 2026
Size: 47.7 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for omnigate-0.2.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`af257ba92ba9c447697bd9a370a57a1d84bfe2dc84c856196359a1d4991744ee`
MD5	`9b174ebb1b507836824917a57ea8a630`
BLAKE2b-256	`0476776702e55797860ec4995133754d460c3656ae3725a995e375203ac13a60`

Provenance

The following attestation bundles were made for omnigate-0.2.0-py3-none-any.whl:

Publisher: publish.yml on sreekarp/omnigate

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: omnigate-0.2.0-py3-none-any.whl
- Subject digest: af257ba92ba9c447697bd9a370a57a1d84bfe2dc84c856196359a1d4991744ee
- Sigstore transparency entry: 1723678047
- Sigstore integration time: Jun 4, 2026
Source repository:
- Permalink: sreekarp/omnigate@6dc0a6e1f1d1e93f05cf2ddf95e34c4f53609404
- Branch / Tag: refs/heads/main
- Owner: https://github.com/sreekarp
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@6dc0a6e1f1d1e93f05cf2ddf95e34c4f53609404
- Trigger Event: workflow_dispatch
