
HY Memory provider plugin for Hermes Agent (native, 100% passive injection)


HY Memory Provider for Hermes

A native Memory Provider plugin for Hermes Agent: first-class, 100% passive injection.

How it works

Hermes calls prefetch(query) before every LLM turn. This provider:

  1. searches HY Memory with the query (chat path, three-channel recall),
  2. formats the hits by layer (§ [profile] / § [intent] / §),
  3. lets Hermes inject that block into the system prompt.

No user action required — memory works automatically.
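The formatting step can be pictured with a minimal sketch. It assumes each hit is a dict with a `text` key and an optional `layer` key; that shape, the `format_hits` name, and the bullet style are hypothetical and not taken from the plugin. The cap mirrors HY_MEMORY_PREFETCH_MAX_CHARS (default 2000).

```python
from collections import defaultdict


def format_hits(hits, max_chars=2000):
    """Group hits by layer and render one text block capped at max_chars."""
    by_layer = defaultdict(list)
    for hit in hits:
        # "layer" and "text" are hypothetical keys used only for illustration.
        by_layer[hit.get("layer", "")].append(hit["text"])

    lines = []
    for layer, texts in by_layer.items():
        # Hits without a layer go under a bare section marker.
        lines.append(f"§ [{layer}]" if layer else "§")
        lines.extend(f"- {text}" for text in texts)

    # Hard cap so the injected block cannot grow without bound.
    return "\n".join(lines)[:max_chars]


example_hits = [
    {"layer": "profile", "text": "likes jazz"},
    {"layer": "intent", "text": "planning a trip"},
    {"text": "met Bob on Monday"},
]
block = format_hits(example_hits)
```

A plain character cut is the simplest way to respect the limit; a real implementation might prefer to drop whole entries instead of truncating mid-line.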

At the end of each turn Hermes calls sync_turn(user, assistant). To save tokens and avoid duplicate extraction, the provider buffers turns and writes once every HY_MEMORY_WRITE_TURN_WINDOW turns (default 5) as a single batched, asynchronous write — so the main loop is never blocked. A partial tail (fewer than N turns, e.g. a session that ends after 3 turns) is always flushed on session end / pre-compress / shutdown, so no turns are lost.
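The turn-window behavior described above can be sketched as follows. This is an illustration under stated assumptions, not the plugin's code: the `TurnBuffer` class and the `write_batch` callback are hypothetical names, and the sketch assumes `sync_turn` is only called from one thread.

```python
from concurrent.futures import ThreadPoolExecutor


class TurnBuffer:
    """Buffer turns and hand them to write_batch once per window, off-thread."""

    def __init__(self, write_batch, window=5, workers=2):
        self._write_batch = write_batch      # hypothetical callback: list of turns -> None
        self._window = max(1, window)        # 1 means write every turn
        self._turns = []
        self._pool = ThreadPoolExecutor(max_workers=workers)

    def sync_turn(self, user, assistant):
        self._turns.append((user, assistant))
        if len(self._turns) >= self._window:
            self.flush()

    def flush(self):
        """Submit whatever is buffered, including a partial tail."""
        if not self._turns:
            return
        batch, self._turns = self._turns, []
        self._pool.submit(self._write_batch, batch)   # returns immediately

    def close(self):
        """Session end / shutdown: flush the tail, then wait for in-flight writes."""
        self.flush()
        self._pool.shutdown(wait=True)


written = []
buffer = TurnBuffer(written.append, window=5)
for i in range(7):
    buffer.sync_turn(f"user {i}", f"assistant {i}")
buffer.close()
```

With seven turns and a window of five, one full batch is written during the session and the two-turn tail is written on `close()`.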

Install

Two names, don't confuse them:

  • hermes-hy-memory — the PyPI package you pip install.
  • hy-memory — the provider name Hermes registers it under (set in config.yaml).

The SDK package hy-memory is on public PyPI — a plain pip install works, no extra index or credentials needed.

The chroma backend (the default, MEMORY_VECTOR_STORE=chroma) needs sqlite3 >= 3.35. On systems with an older sqlite (some CentOS/Linux installs fail with "unsupported version of sqlite3"), either pip install pysqlite3-binary and swap it in at process entry (import sys; sys.modules["sqlite3"] = __import__("pysqlite3")), or switch to MEMORY_VECTOR_STORE=qdrant / faiss.
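A slightly more defensive form of the swap only replaces the module when the bundled sqlite is actually too old. This is a sketch: the `ensure_modern_sqlite` helper is a hypothetical name, and it has to run at process entry, before anything imports the chroma backend.

```python
import sqlite3
import sys

MIN_SQLITE = (3, 35, 0)


def ensure_modern_sqlite():
    """Swap in pysqlite3 when the stdlib sqlite3 is older than 3.35.

    Returns a short string naming the module that ends up in use.
    """
    if sqlite3.sqlite_version_info >= MIN_SQLITE:
        return "sqlite3"
    try:
        import pysqlite3  # provided by `pip install pysqlite3-binary`
    except ImportError:
        return "sqlite3 (too old, pysqlite3-binary not installed)"
    # Later `import sqlite3` statements now resolve to pysqlite3.
    sys.modules["sqlite3"] = pysqlite3
    return "pysqlite3"


sqlite_in_use = ensure_modern_sqlite()
```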

1. Install the package

pip install hermes-hy-memory        # pulls the hy-memory SDK as a dependency

This registers the hy-memory memory provider with Hermes via entry points.

2. Configure + activate (recommended: the wizard)

Run the standalone wizard — it configures everything and activates the provider for you, so this is the only command you need:

hermes-hy-memory init

The 4-step wizard (LLM → embedding → vector store → userId):

  1. writes the SDK configuration to ~/.hermes/.env, and
  2. sets memory.provider: hy-memory in ~/.hermes/config.yaml automatically.

Requires questionary (pip install "hermes-hy-memory[init]"). The processing mode defaults to pro (the wizard doesn't ask) — set HY_MEMORY_MODE in ~/.hermes/.env if you want ultra or lite.

Use hermes-hy-memory (standalone), not hermes hy-memory, for the first run. The Hermes main CLI only exposes hermes hy-memory ... after the provider is active in config.yaml — so before activation hermes hy-memory init fails with invalid choice: 'hy-memory'. The standalone hermes-hy-memory binary (installed by pip) always works. Once activated, both forms are equivalent.

If you prefer to do it by hand, set memory.provider yourself:

# ~/.hermes/config.yaml
memory:
  provider: hy-memory

and set the environment variables manually (see Configuration).

3. Verify

hermes-hy-memory doctor
# once the provider is active, this also works:
hermes hy-memory doctor

Configuration

Environment variables

| Variable | Required | Default | Description |
|---|---|---|---|
| HY_MEMORY_USER_ID | | | First-level isolation key (your memory namespace) |
| HY_MEMORY_AGENT_ID | | hermes | Second-level isolation key |
| HY_MEMORY_MODE | | pro | Processing mode: lite / pro / ultra |
| HY_MEMORY_WRITE_TURN_WINDOW | | 5 | Write throttle: buffer turns and persist once every N turns (one batched extraction → saves tokens, avoids per-turn dup writes). 1 = write every turn. Tail (< N turns) is flushed on session end / pre-compress / shutdown. |
| HY_MEMORY_PREFETCH_MAX_CHARS | | 2000 | Max chars of injected prefetch text |
| HY_MEMORY_SYNC_WORKERS | | 2 | sync_turn background thread-pool size |
| HY_MEMORY_SHUTDOWN_GRACE_SEC | | 10 | Max seconds to wait for in-flight writes on shutdown |
| OPENAI_API_KEY (or matching LLM/embedder key) | ✅* | | Required for pro/ultra; lite only needs the embedder |

*The HY Memory SDK's own LLM/Embedder configuration (see the hy-memory docs).

Environment variables take precedence and can be persisted in ~/.hermes/.env (written automatically by the init wizard).
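For orientation, the variables and defaults listed above could be read like this. The `HySettings` dataclass and `load_settings` function are hypothetical names for illustration, not the plugin's API; treating a missing HY_MEMORY_USER_ID as an error is an assumption based on the Troubleshooting section.

```python
import os
from dataclasses import dataclass

VALID_MODES = ("lite", "pro", "ultra")


@dataclass(frozen=True)
class HySettings:
    user_id: str
    agent_id: str = "hermes"
    mode: str = "pro"
    write_turn_window: int = 5
    prefetch_max_chars: int = 2000
    sync_workers: int = 2
    shutdown_grace_sec: float = 10.0


def load_settings(env=None):
    """Build settings from a mapping of environment variables (os.environ by default)."""
    env = os.environ if env is None else env
    user_id = env.get("HY_MEMORY_USER_ID", "").strip()
    if not user_id:
        # Assumption: the provider cannot work without a namespace.
        raise ValueError("HY_MEMORY_USER_ID is not set")
    mode = env.get("HY_MEMORY_MODE", "pro").strip().lower()
    if mode not in VALID_MODES:
        raise ValueError(f"HY_MEMORY_MODE must be one of {VALID_MODES}, got {mode!r}")
    return HySettings(
        user_id=user_id,
        agent_id=env.get("HY_MEMORY_AGENT_ID", "hermes"),
        mode=mode,
        write_turn_window=max(1, int(env.get("HY_MEMORY_WRITE_TURN_WINDOW", "5"))),
        prefetch_max_chars=int(env.get("HY_MEMORY_PREFETCH_MAX_CHARS", "2000")),
        sync_workers=int(env.get("HY_MEMORY_SYNC_WORKERS", "2")),
        shutdown_grace_sec=float(env.get("HY_MEMORY_SHUTDOWN_GRACE_SEC", "10")),
    )


settings = load_settings({"HY_MEMORY_USER_ID": "alice", "HY_MEMORY_WRITE_TURN_WINDOW": "3"})
```

Passing a plain dict instead of `os.environ` keeps the function easy to test without touching the real environment.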

Processing modes (HY_MEMORY_MODE)

| Mode | Write pipeline | Speed | Recall quality |
|---|---|---|---|
| lite | pure embedding (no LLM) | fastest | vector similarity only |
| pro (default) | LLM fact/identity extraction + reconcile/evolution | medium | profile + fact layers |
| ultra | pro + System2 async cognition (Schema/Intention) | fullest | + cross-domain induction + proactive intent |

⚠️ lite mode is not suitable for Hermes passive injection / recall. lite only embeds and skips LLM extraction, so memories stay at the L1_RAW layer; the SDK's list / search filter out L1_RAW, meaning lite-written memories cannot be recalled by prefetch (writes succeed but search is always empty). Use pro (default) or ultra for Hermes; lite only fits write-only / no-semantic-recall scenarios.
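The effect of the L1_RAW filter can be illustrated with a toy example. The record shape (a dict with a `layer` key) and the `EXTRACTED` layer name are hypothetical placeholders; only `L1_RAW` comes from the text above.

```python
def recallable(records):
    """Drop raw-layer records, mimicking a search that filters out L1_RAW."""
    return [record for record in records if record.get("layer") != "L1_RAW"]


# In lite mode nothing is promoted past the raw layer, so recall comes back empty.
lite_writes = [
    {"layer": "L1_RAW", "text": "I like K-Pop but prefer Jazz"},
    {"layer": "L1_RAW", "text": "I work in Berlin"},
]
# "EXTRACTED" stands in for whatever layer LLM extraction produces.
pro_writes = lite_writes + [{"layer": "EXTRACTED", "text": "prefers jazz"}]

lite_recall = recallable(lite_writes)
pro_recall = recallable(pro_writes)
```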

CLI

After pip install, the hermes-hy-memory subcommands let you run diagnostics and manual operations (outside the Hermes main process):

# health check: env present / client constructs / list works (read-only)
hermes-hy-memory doctor

# manual write
hermes-hy-memory add "I like K-Pop but prefer Jazz"

# manual search
hermes-hy-memory search "music taste" --limit 5

# list recent 20
hermes-hy-memory list

# cross-user test (override env)
hermes-hy-memory search "x" --user-id other_user --agent-id test

When loaded by the Hermes main CLI:

hermes hy-memory doctor

Hooks

| Hook | When | Behavior |
|---|---|---|
| prefetch(query) | before each LLM call | search memory → inject into system prompt |
| sync_turn(user, ast) | end of each turn | buffer the turn; flush a batched async write every HY_MEMORY_WRITE_TURN_WINDOW turns (default 5) |
| on_session_end(msgs) | session end | flush the partial (< window) tail buffer + wait for in-flight |
| on_pre_compress(msgs) | before context compression | same as on_session_end, preserves about-to-be-trimmed content |
| on_memory_write(action, target, content) | Hermes built-in memory commands | add syncs to HY Memory; delete is skipped (target IDs are not interchangeable) |

Tools (LLM-invoked, optional)

| Tool | Purpose |
|---|---|
| memory_search(query, limit) | search memories |
| memory_add(content) | write a memory |
| memory_delete(memory_id) | delete a memory |
| memory_list(limit) | list memories for the current user/agent |

Even with tools disabled, the passive prefetch injection guarantees relevant memories reach every LLM call.

Troubleshooting

provider not initialized or all hooks silently no-op

Run the health check:

hermes-hy-memory doctor

Common causes:

  1. HY_MEMORY_USER_ID not set
  2. embed/LLM key not set (OPENAI_API_KEY, etc.) → HyMemoryClient(mode="pro") fails to construct
  3. SDK not installed: pip install hy-memory

prefetch injects nothing

  • There really is no relevant memory — confirm with hermes-hy-memory list
  • Query too short (< 3 chars) or in the skip list (ok / thanks, etc.) — by design
  • Using lite mode? lite memories stay at L1_RAW and are never recalled — use pro/ultra
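The skip rule in the second bullet amounts to a small guard in front of the search. This is a sketch: the function name is hypothetical, and the skip list here contains only the two words this README names, while the plugin's actual list is longer ("etc.") and is not reproduced.

```python
MIN_QUERY_CHARS = 3
# Only the two examples given in this README; the real list is not shown here.
SKIP_WORDS = frozenset({"ok", "thanks"})


def should_skip_prefetch(query):
    """Return True for queries too short or too generic to be worth a search."""
    normalized = query.strip().lower()
    return len(normalized) < MIN_QUERY_CHARS or normalized in SKIP_WORDS
```

For example, `should_skip_prefetch("Thanks")` is true, while `should_skip_prefetch("music taste")` is false and proceeds to the search.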

sync_turn doesn't seem to write

  • By default, writes run on daemon threads, which are killed when the main process exits. Use Hermes daemon mode in production
  • Raise HY_MEMORY_SHUTDOWN_GRACE_SEC to let shutdown wait a few more seconds
  • Check the log for [hermes] sync_turn failed: ...
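The idea behind a shutdown grace period is a bounded wait on in-flight writes. The sketch below uses only the standard library; the `drain` helper is a hypothetical name and `time.sleep` stands in for a real write.

```python
import time
from concurrent.futures import ThreadPoolExecutor, wait


def drain(pending, grace_sec=10.0):
    """Wait up to grace_sec for in-flight writes; return how many did not finish."""
    _done, not_done = wait(pending, timeout=grace_sec)
    return len(not_done)


pool = ThreadPoolExecutor(max_workers=2)
# Three short stand-in "writes" queued on a two-worker pool.
pending = [pool.submit(time.sleep, 0.05) for _ in range(3)]
unfinished = drain(pending, grace_sec=2.0)
pool.shutdown(wait=False)
```

A nonzero return value means the grace period was too short for the queued writes, which is the situation where raising HY_MEMORY_SHUTDOWN_GRACE_SEC helps.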

cross-loop errors (multi-client deployments)

If your Hermes deployment runs multiple HyMemoryClient instances (e.g. a multi-tenant server), use SharedRuntime:

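from hy_memory import HyMemoryClient, SharedRuntime

async def build_client(base_config, cfg):  # build_client is an illustrative name
    # `await` must run inside a coroutine; create one runtime and share it across clients
    runtime = await SharedRuntime.create(base_config)
    return HyMemoryClient(cfg, runtime=runtime)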

A single-process Hermes deployment uses a solo-mode client by default and does not need this.

Comparison with Mem0-style integrations

Mem0's Hermes integration relies on a TypeScript SDK; this plugin uses the Python SDK. HY Memory offers three processing depths — lite/pro/ultra (lite skips the LLM, pro does standard extraction, ultra adds System2 cognition) — whereas Mem0 is single-tier LLM extraction.

Development

cd plugins/native/hermes
python -m pytest tests/ -v

Tests mock HyMemoryClient and need no external dependencies (no OPENAI_API_KEY, no running Qdrant).

Versions

| Plugin | SDK dependency | Notes |
|---|---|---|
| 0.1.8 | hy-memory>=1.2.17 | Turn-window write throttle (HY_MEMORY_WRITE_TURN_WINDOW, default 5): batch writes every N turns, flush partial tail on session end; wizard no longer asks for mode (defaults to pro) |
| 0.1.7 | hy-memory>=1.2.17 | Standalone CLI (doctor/add/search/list) now loads ~/.hermes/.env, so wizard-written settings are visible outside the Hermes daemon |
| 0.1.6 | hy-memory>=1.2.17 | init auto-activates memory.provider in config.yaml; docs steer to standalone hermes-hy-memory init (avoids pre-activation invalid choice error) |
| 0.1.5 | hy-memory>=1.2.17 | Single-level standalone CLI; corrected install docs (real pip + config.yaml flow) |
| 0.1.4 | hy-memory>=1.2.17 | English README; init wizard (questionary) |
| 0.1.3 | hy-memory>=1.2.17 | init interactive setup wizard |
| 0.1.2 | hy-memory>=1.2.17 | Published to public PyPI; channel-dict flatten fix |
