
Production hardening pack for Hermes Agent — per-topic model routing, fallback chains, rate limiting, and cost tracking via gateway hooks


hermes-kit

Production hardening pack for Hermes Agent.

Self-hosted Hermes gateways are powerful but built for single-user setups. Multi-user deployments hit walls: there is no per-topic model routing, API failures surface as hard errors, and one heavy user can burn through your API budget without any alert.

hermes-kit fills these gaps with production-grade hooks.

⚠️ How it works: hermes-kit monkey-patches Hermes Agent's internal model resolver at runtime. This is intentionally fragile — Hermes Agent updates may break your setup. We're working on an upstream PR to replace the patch with native hook return values. Until then, test after every Hermes upgrade.
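To illustrate why a runtime patch is fragile, here is a minimal sketch of monkey-patching in Python. The `FakeAgent` class and its `_resolve_model` method are hypothetical stand-ins, not Hermes Agent's real internals; the attribute hermes-kit actually patches is not described here.

```python
# Hypothetical stand-in for an upstream class; not Hermes Agent's real code.
class FakeAgent:
    def _resolve_model(self, topic_id: str) -> str:
        return "upstream/default-model"


def apply_patch(overrides: dict[str, str]) -> None:
    """Replace FakeAgent._resolve_model with a wrapper that consults overrides."""
    original = FakeAgent._resolve_model  # AttributeError here if upstream renames it

    def patched(self, topic_id: str) -> str:
        # Fall through to the original resolver when no override exists.
        return overrides.get(topic_id) or original(self, topic_id)

    FakeAgent._resolve_model = patched


apply_patch({"42": "opencode-go/deepseek-v4-pro"})
agent = FakeAgent()
routed = agent._resolve_model("42")    # served by the override
unrouted = agent._resolve_model("7")   # served by the original resolver
```

In a sketch like this, an upstream rename breaks the patch when it is applied, and an upstream signature change breaks it at call time, which is why testing after each upgrade matters.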

Prerequisites

  • Python ≥ 3.11
  • Hermes Agent v0.16.0 (pinned)
  • A configured gateway (Telegram, Discord, etc.)

Install

pip install hermes-agent-kit

🔵 Naming — same project, two names:

Context              Name
PyPI / pip install   hermes-agent-kit
GitHub repo          srmdn/hermes-agent-kit
CLI command          hermes-kit

Why the split? The PyPI name matches the repo (hermes-agent-kit), while the short CLI alias (hermes-kit) keeps commands terse: hermes-kit install router instead of hermes-agent-kit install router.

Quickstart

# Install all hooks in one command
hermes-kit install router fallback rate-limiter cost-tracker model-switch

# Verify
hermes-kit doctor

# Start gateway with bridge auto-patched
hermes-kit gateway run --accept-hooks

# If new users get "I don't recognize you":
GATEWAY_ALLOW_ALL_USERS=true hermes-kit gateway run --accept-hooks

Hooks land in ~/.hermes/hooks/<name>/. Hermes discovers them on restart.

Modules

router — Per-Topic Model Routing

Route Telegram topics to different AI models. For example, a finance chat can use Qwen, a coding chat can use DeepSeek, and everything else falls back to GPT-4o-mini.

Via CLI:

hermes-kit router set-default --model opencode-go/gpt-4o-mini
hermes-kit router add 42 --model opencode-go/deepseek-v4-pro
hermes-kit router show

Via YAML (~/.hermes/hooks/router/topic_router.yaml):

default:
  model: "opencode-go/gpt-4o-mini"

topics:
  "42":
    model: "opencode-go/deepseek-v4-pro"

Multi-provider — route specific topics to native providers:

hermes-kit router add 42 --model gpt-4o --provider openai
hermes-kit router add 7 --model claude-sonnet-4-20250514 --provider anthropic

Hermes resolves API keys from ~/.hermes/.env (OPENAI_API_KEY, ANTHROPIC_API_KEY, etc.). See providers guide for all supported providers and model IDs.
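The lookup the router performs can be sketched as a dictionary lookup with a default. This is an illustration only: `Route`, `resolve_route`, and the `provider` key in the config are assumptions made for the example (the source shows `--provider` on the CLI but not how it is stored in YAML), and the config is written as a Python dict so the sketch needs no YAML library.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Route:
    model: str
    provider: Optional[str] = None  # None: no explicit provider set (assumed meaning)


# Mirrors topic_router.yaml; the "provider" key is an assumption of this sketch.
CONFIG = {
    "default": {"model": "opencode-go/gpt-4o-mini"},
    "topics": {
        "42": {"model": "opencode-go/deepseek-v4-pro"},
        "7": {"model": "claude-sonnet-4-20250514", "provider": "anthropic"},
    },
}


def resolve_route(config: dict, topic_id) -> Route:
    """Return the route for a topic, falling back to the default entry."""
    # Topic keys are quoted strings in the YAML, so normalise integer IDs first.
    entry = config.get("topics", {}).get(str(topic_id), config["default"])
    return Route(model=entry["model"], provider=entry.get("provider"))
```

Normalising the topic ID to a string matters because the YAML example quotes its keys ("42"), so an integer ID from a gateway event would otherwise never match.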

fallback — Automatic Fallback Chain

Define a chain of models to try when the primary fails.

Via YAML (~/.hermes/hooks/fallback/fallback_chain.yaml):

chains:
  global:
    - "opencode-go/deepseek-v4-pro"     # primary
    - "opencode-go/claude-sonnet-4"      # fallback
    - "opencode-go/gpt-4o-mini"          # last resort

After a failure, call hermes_kit.bridge.retry_with_fallback(session_key) to advance to the next model.
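As a rough picture of what walking a fallback chain means, the sketch below tries each model in order and returns the first successful reply. It does not use or reproduce `retry_with_fallback`; `call_with_fallback`, `AllModelsFailed`, and `flaky_call` are hypothetical names invented for this example.

```python
from typing import Callable, Sequence

CHAIN = [
    "opencode-go/deepseek-v4-pro",  # primary
    "opencode-go/claude-sonnet-4",  # fallback
    "opencode-go/gpt-4o-mini",      # last resort
]


class AllModelsFailed(Exception):
    """Raised when every model in the chain has failed (hypothetical name)."""


def call_with_fallback(chain: Sequence[str],
                       call_model: Callable[[str], str]) -> tuple[str, str]:
    """Try each model in order; return (model, reply) from the first success."""
    errors = []
    for model in chain:
        try:
            return model, call_model(model)
        except Exception as exc:  # a real hook would catch narrower error types
            errors.append(f"{model}: {exc}")
    raise AllModelsFailed("; ".join(errors))


def flaky_call(model: str) -> str:
    # Hypothetical stand-in for a provider call: the primary is "down".
    if model == "opencode-go/deepseek-v4-pro":
        raise ConnectionError("upstream 503")
    return f"reply from {model}"


used_model, reply = call_with_fallback(CHAIN, flaky_call)
```

Collecting the per-model errors before raising keeps the final failure message useful when the whole chain is exhausted.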

rate-limiter — Per-User Rate Limiting

Prevent a single user or chat from draining your API budget.

Via YAML (~/.hermes/hooks/rate-limiter/rate_limits.yaml):

limits:
  global:
    max_messages_per_window: 100
    window_seconds: 3600
  per_user:
    "123456789":
      max_messages_per_window: 50

The rate limiter enforces limits at the bridge level: users who exceed their limit get RateLimitExceeded on every request until their window resets. Set window_seconds to control how long each window lasts.
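One way to picture the behaviour is a fixed-window counter per user. The sketch below is an assumption-heavy illustration, not hermes-kit's implementation: it assumes a fixed (not sliding) window, treats the global entry as the default per-user limit, and defines its own `RateLimitExceeded` class as a local stand-in.

```python
import time
from typing import Callable, Optional


class RateLimitExceeded(Exception):
    """Local stand-in; hermes-kit's own exception class may differ."""


class FixedWindowLimiter:
    def __init__(self, max_messages: int, window_seconds: float,
                 per_user: Optional[dict[str, int]] = None,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_messages = max_messages
        self.window_seconds = window_seconds
        self.per_user = per_user or {}  # user ID -> override limit
        self.clock = clock              # injectable so tests can control time
        self._windows: dict[str, tuple[float, int]] = {}  # user -> (start, count)

    def check(self, user_id: str) -> None:
        """Count one message for user_id; raise if the user's limit is exceeded."""
        now = self.clock()
        start, count = self._windows.get(user_id, (now, 0))
        if now - start >= self.window_seconds:
            start, count = now, 0  # window expired: start a fresh one
        limit = self.per_user.get(user_id, self.max_messages)
        if count >= limit:
            self._windows[user_id] = (start, count)
            raise RateLimitExceeded(f"user {user_id} exceeded {limit} messages")
        self._windows[user_id] = (start, count + 1)


# Values taken from the YAML example above.
limiter = FixedWindowLimiter(100, 3600, per_user={"123456789": 50})
limiter.check("123456789")  # first message in the window is allowed
```

Using `time.monotonic` rather than wall-clock time keeps windows stable if the system clock is adjusted.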

cost-tracker — Real-Time Cost Tracking

Track token costs per session and alert when thresholds are exceeded.

Via YAML (~/.hermes/hooks/cost-tracker/cost_tracker.yaml):

alert_threshold_usd: 1.0

Set alert_threshold_usd to 0 to disable alerts while still tracking costs.
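The threshold semantics can be sketched as a per-session running total. Everything here is illustrative: `CostTracker`, `record`, and the prices in `PRICE_PER_1K` are hypothetical, and alerting only once per session is an assumption of this sketch rather than documented behaviour.

```python
from collections import defaultdict

# Hypothetical price in USD per 1,000 tokens; real prices vary by provider and model.
PRICE_PER_1K = {"opencode-go/gpt-4o-mini": 0.5}


class CostTracker:
    def __init__(self, alert_threshold_usd: float) -> None:
        self.alert_threshold_usd = alert_threshold_usd
        self.totals = defaultdict(float)  # session key -> accumulated USD
        self.alerted = set()              # sessions that have already alerted

    def record(self, session_key: str, model: str, tokens: int) -> bool:
        """Add the cost of one call; return True if this call triggers an alert."""
        self.totals[session_key] += tokens / 1000 * PRICE_PER_1K[model]
        if self.alert_threshold_usd <= 0:
            return False  # alerts disabled, tracking continues
        if session_key in self.alerted:
            return False  # alert once per session (assumption of this sketch)
        if self.totals[session_key] > self.alert_threshold_usd:
            self.alerted.add(session_key)
            return True
        return False


tracker = CostTracker(alert_threshold_usd=1.0)
first_alert = tracker.record("chat-1", "opencode-go/gpt-4o-mini", 1000)
```

Note that with a threshold of 0 the totals are still updated on every call; only the alert branch is skipped.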

Docs

License

MIT


