nodus-llm

Multi-credential LLM client with automatic failover and provider abstraction.

Rotates through ordered credential profiles when an LLM provider fails — handling rate limits, auth errors, billing limits, and context overflow — with exponential per-credential backoff (5m→10m→20m→40m→1h). No required external dependencies; provider SDKs are optional extras.

Status: v0.1.0 — prepared, not yet published.


Install

pip install nodus-llm

# With OpenAI support:
pip install "nodus-llm[openai]"

# With Anthropic support:
pip install "nodus-llm[anthropic]"

# Both:
pip install "nodus-llm[all]"

What it provides

Component                         Purpose
CredentialProfile                 One API key + provider + model, with cooldown state
CredentialStore                   Ordered profile list with availability tracking
FailoverClient                    Rotates profiles on failure with exponential backoff
FailoverError                     Raised when all profiles are exhausted
FailoverReason                    RATE_LIMIT | AUTH | BILLING | CONTEXT_OVERFLOW | EXHAUSTED
context_window_for(model)         Token limit for a known model
would_overflow(messages, model)   Context window check before sending
CONTEXT_WINDOWS                   Dict of model name → token limit

Quick start

from nodus_llm import CredentialProfile, CredentialStore, FailoverClient
from nodus_llm.providers.openai import OpenAIProvider

profiles = [
    CredentialProfile(api_key="sk-primary", provider="openai", model="gpt-4o"),
    CredentialProfile(api_key="sk-backup",  provider="openai", model="gpt-4o-mini"),
]
store = CredentialStore(profiles=profiles)

client = FailoverClient(store, provider_fn=OpenAIProvider)
reply = client.chat([{"role": "user", "content": "Hello!"}])

CredentialProfile and CredentialStore

from nodus_llm import CredentialProfile, CredentialStore

profile = CredentialProfile(
    api_key="sk-abc123",
    provider="openai",     # "openai" | "anthropic" | custom string
    model="gpt-4o",
    max_tokens=4096,       # optional per-profile override
)

store = CredentialStore(profiles=[profile])

available = store.available_profiles()   # profiles not in cooldown
store.mark_failed(profile)               # starts exponential cooldown
store.mark_success(profile)              # clears cooldown

Cooldown schedule: 5m → 10m → 20m → 40m → 1h (exponential, capped at 1h).
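
The schedule above amounts to doubling a 5-minute base delay on each consecutive failure and clamping the result at one hour. A minimal sketch of that arithmetic follows. The helper `cooldown_seconds` and the two constants are hypothetical names for illustration, not part of the nodus-llm API, and the library's internals may differ.

```python
BASE_COOLDOWN_S = 5 * 60   # first failure: 5 minutes
MAX_COOLDOWN_S = 60 * 60   # cap: 1 hour


def cooldown_seconds(consecutive_failures: int) -> int:
    """Return the cooldown, in seconds, after N consecutive failures.

    Hypothetical helper; it only mirrors the documented schedule.
    """
    if consecutive_failures < 1:
        return 0  # no failures yet, so no cooldown
    doubled = BASE_COOLDOWN_S * 2 ** (consecutive_failures - 1)
    return min(doubled, MAX_COOLDOWN_S)


# Cooldown in minutes for failure counts 1 through 6.
schedule = [cooldown_seconds(n) // 60 for n in range(1, 7)]
print(schedule)  # [5, 10, 20, 40, 60, 60]
```

The fifth failure would double to 80 minutes, so the cap is what produces the final 1h step.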


FailoverClient

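from nodus_llm import FailoverClient, FailoverError, FailoverReason
from nodus_llm.providers.openai import OpenAIProvider

# `store` is the CredentialStore built in the previous section.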

client = FailoverClient(
    store,
    provider_fn=OpenAIProvider,   # factory: (profile) → LLMClient
)

try:
    reply = client.chat(
        messages=[{"role": "user", "content": "Hello!"}],
        max_tokens=256,
    )
except FailoverError as exc:
    print(exc.reason)   # FailoverReason.EXHAUSTED

FailoverClient tries profiles in order. On failure it:

  1. Classifies the error → FailoverReason
  2. Marks the profile failed (starts cooldown)
  3. Tries the next available profile
  4. Raises FailoverError if all profiles are exhausted
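
The four steps above can be sketched as a small loop. This is not the library's implementation: `Profile`, `Reason`, `AllProfilesFailed`, `classify`, and `chat_with_failover` are hypothetical stand-ins so the example runs without nodus-llm installed, the error classification is an assumed mapping, and the cooldown is a fixed 5 minutes rather than the exponential schedule.

```python
import time
from enum import Enum, auto


class Reason(Enum):                      # stand-in for FailoverReason
    RATE_LIMIT = auto()
    AUTH = auto()
    EXHAUSTED = auto()


class AllProfilesFailed(Exception):      # stand-in for FailoverError
    def __init__(self, reason, failures):
        super().__init__(reason.name)
        self.reason = reason
        self.failures = failures         # (profile name, reason) per failed attempt


class Profile:                           # stand-in for CredentialProfile
    def __init__(self, name):
        self.name = name
        self.cooldown_until = 0.0        # monotonic timestamp; 0.0 means available


def classify(exc):
    # Assumed mapping; a real classifier would inspect provider error types.
    return Reason.AUTH if isinstance(exc, PermissionError) else Reason.RATE_LIMIT


def chat_with_failover(profiles, send, cooldown_s=300.0):
    failures = []
    for profile in profiles:
        if profile.cooldown_until > time.monotonic():
            continue                                             # in cooldown: skip
        try:
            return send(profile)
        except Exception as exc:
            failures.append((profile.name, classify(exc)))       # step 1
            profile.cooldown_until = time.monotonic() + cooldown_s  # step 2
            # step 3: the loop moves on to the next profile
    raise AllProfilesFailed(Reason.EXHAUSTED, failures)          # step 4


def fake_send(profile):
    # Simulated provider call: the primary key is rate limited.
    if profile.name == "primary":
        raise TimeoutError("429 rate limited")
    return f"reply via {profile.name}"


profiles = [Profile("primary"), Profile("backup")]
reply = chat_with_failover(profiles, fake_send)
print(reply)  # reply via backup
```

After this call the primary profile carries a cooldown timestamp, so an immediate second call goes straight to the backup.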

Providers

OpenAI

from nodus_llm.providers.openai import OpenAIProvider

client = FailoverClient(store, provider_fn=OpenAIProvider)

Requires pip install "nodus-llm[openai]".

Anthropic

from nodus_llm.providers.anthropic import AnthropicProvider

client = FailoverClient(store, provider_fn=AnthropicProvider)

Requires pip install "nodus-llm[anthropic]".

OpenAI-compatible endpoints

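from nodus_llm import CredentialProfile, CredentialStore, FailoverClient
from nodus_llm.providers.compat import OpenAICompatProvider

profile = CredentialProfile(
    api_key="local-key",
    provider="local",
    model="llama-3.1-8b",
    base_url="http://localhost:11434/v1",   # server exposing an OpenAI-style API
)
store = CredentialStore(profiles=[profile])   # the store must contain the local profile
client = FailoverClient(store, provider_fn=OpenAICompatProvider)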

Context window utilities

from nodus_llm import context_window_for, would_overflow, CONTEXT_WINDOWS

limit = context_window_for("gpt-4o")   # 128000, or None for an unknown model

messages = [{"role": "user", "content": "..." * 10000}]
if would_overflow(messages, "gpt-4o-mini"):
    # truncate or summarise before sending
    ...

# All known limits
print(CONTEXT_WINDOWS)  # {"gpt-4o": 128000, "claude-3-5-sonnet-20241022": 200000, ...}
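
One way to act on a positive overflow check is to drop the oldest non-system messages until the conversation fits. The sketch below is self-contained, so it uses a crude stand-in check based on roughly 4 characters per token. `estimate_tokens` and `trim_to_fit` are hypothetical helpers, and the 4-characters estimate is an assumption, not how nodus-llm counts tokens. With the library installed, the `fits` argument could instead be something like `lambda m: not would_overflow(m, "gpt-4o-mini")`.

```python
def estimate_tokens(messages):
    # Crude assumption: about 4 characters per token.
    return sum(len(m["content"]) for m in messages) // 4


def trim_to_fit(messages, fits):
    """Drop the oldest non-system messages until fits(messages) is true.

    Returns a new list; the input is left untouched. If only one
    non-system message remains and it still does not fit, the list is
    returned as-is and the caller has to summarise or shorten it.
    """
    trimmed = list(messages)
    while not fits(trimmed):
        droppable = [i for i, m in enumerate(trimmed) if m["role"] != "system"]
        if len(droppable) <= 1:
            break  # keep the latest message; nothing else to drop
        del trimmed[droppable[0]]
    return trimmed


limit = 50  # hypothetical tiny window, in tokens, to keep the example small
history = [
    {"role": "system", "content": "Be brief."},
    {"role": "user", "content": "a" * 120},
    {"role": "assistant", "content": "b" * 120},
    {"role": "user", "content": "c" * 40},
]

result = trim_to_fit(history, lambda m: estimate_tokens(m) <= limit)
print([m["role"] for m in result])  # ['system', 'assistant', 'user']
```

Dropping from the front keeps the system prompt and the most recent turn, which is usually the part of the conversation the model needs most.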

Custom provider

Any callable (CredentialProfile) -> LLMClient works as provider_fn. LLMClient is a structural protocol — any object with chat(messages, **kwargs) -> str satisfies it.

class MyProvider:
    def __init__(self, profile):
        self.profile = profile

    def chat(self, messages, **kwargs):
        return "custom response"

client = FailoverClient(store, provider_fn=MyProvider)
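
To illustrate what "structural protocol" means here, the sketch below defines a protocol with the chat signature described above and checks a plain class against it. `ChatClient` is a hypothetical stand-in for LLMClient, whose actual definition is not shown in this README, and `EchoProvider` is an example class that does not inherit from anything.

```python
from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class ChatClient(Protocol):
    """Hypothetical stand-in for the LLMClient protocol."""

    def chat(self, messages: list, **kwargs: Any) -> str: ...


class EchoProvider:
    """Built from a profile and exposes chat(); no base class needed."""

    def __init__(self, profile: Any) -> None:
        self.profile = profile

    def chat(self, messages: list, **kwargs: Any) -> str:
        # Echo the content of the last message back to the caller.
        return messages[-1]["content"]


provider = EchoProvider(profile=None)

# isinstance() on a runtime-checkable protocol only checks that the
# method exists, not its signature or return type.
print(isinstance(provider, ChatClient))   # True
print(isinstance(object(), ChatClient))   # False
```

Because the class itself is a callable taking a profile and returning such an object, passing the class as provider_fn matches the factory shape described above.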

Design

  • No required dependencies. Core credential management and failover are pure stdlib. Provider SDKs are opt-in extras.
  • nodus-circuit-breaker is needed for type checking only. The LLMClient protocol is imported under TYPE_CHECKING, so nothing from that package is imported at runtime.
  • Exponential backoff per credential. Each profile has independent cooldown state; a failed profile is skipped until its cooldown expires.
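
The type-check-only point in the second bullet relies on the standard `typing.TYPE_CHECKING` pattern, sketched below. The import path `nodus_circuit_breaker` is a guess at the module name and is never executed, so the example runs without that package installed. `first_reply` and `Stub` are hypothetical names for illustration.

```python
import sys
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Evaluated by type checkers only; skipped at runtime.
    # Hypothetical module path, assumed for illustration.
    from nodus_circuit_breaker import LLMClient


def first_reply(client: "LLMClient", prompt: str) -> str:
    # The quoted annotation is never resolved at runtime.
    return client.chat([{"role": "user", "content": prompt}])


class Stub:
    """Any object with a matching chat() method is accepted."""

    def chat(self, messages, **kwargs):
        return "ok"


answer = first_reply(Stub(), "ping")
print(answer)  # ok
```

At runtime `TYPE_CHECKING` is False, so the guarded import is skipped and the dependency stays optional for users who do not run a type checker.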

Development

pip install -e ".[dev]"
pytest tests/ -q

License

MIT — see LICENSE.
