HY Memory provider plugin for Hermes Agent (native, 100% passive injection)

HY Memory Provider for Hermes

A native Memory Provider plugin for Hermes Agent — first-class, 100% passive injection.

How it works

Hermes calls prefetch(query) before every LLM turn. This provider:

searches HY Memory with the query (chat path, three-channel recall),
formats the hits by layer (§ [profile] / § [intent] / §),
lets Hermes inject that block into the system prompt.

No user action required — memory works automatically.
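The formatting step can be pictured with a minimal sketch. It assumes each hit is a `(layer, text)` pair; the function name `format_prefetch_block`, the hit shape, and the simple character cut-off are illustrative assumptions, not the plugin's actual code. The `max_chars` argument stands in for `HY_MEMORY_PREFETCH_MAX_CHARS`.

```python
from collections import OrderedDict


def format_prefetch_block(hits, max_chars=2000):
    """Group (layer, text) hits under "§ [layer]" headers and cap the length.

    Hypothetical helper: the hit shape and header layout are assumptions.
    """
    grouped = OrderedDict()
    for layer, text in hits:
        grouped.setdefault(layer, []).append(text)

    lines = []
    for layer, texts in grouped.items():
        lines.append(f"§ [{layer}]")
        lines.extend(f"- {t}" for t in texts)

    block = "\n".join(lines)
    # Hard cap so the injected text never exceeds the configured budget.
    return block[:max_chars]


hits = [
    ("profile", "Likes K-Pop but prefers Jazz"),
    ("intent", "Wants music recommendations"),
    ("profile", "Works mostly in Python"),
]
print(format_prefetch_block(hits))
```

Grouping before rendering keeps all hits of one layer together even when the search returns them interleaved.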

At the end of each turn Hermes calls sync_turn(user, assistant). To save tokens and avoid duplicate extraction, the provider buffers turns and writes once every HY_MEMORY_WRITE_TURN_WINDOW turns (default 5) as a single batched, asynchronous write — so the main loop is never blocked. A partial tail (fewer than N turns, e.g. a session that ends after 3 turns) is always flushed on session end / pre-compress / shutdown, so no turns are lost.
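The turn-window behavior described above might look roughly like the following sketch. The class `TurnBuffer`, its `flush` method, and the `write_batch` callable are hypothetical names chosen for illustration; only the `HY_MEMORY_WRITE_TURN_WINDOW` variable and its default of 5 come from this document.

```python
import os
from concurrent.futures import ThreadPoolExecutor


class TurnBuffer:
    """Buffer turns and hand them to a writer in batches (illustrative only)."""

    def __init__(self, write_batch, window=None, workers=2):
        # write_batch is a hypothetical callable that persists a list of turns.
        self._write_batch = write_batch
        self._window = window or int(os.environ.get("HY_MEMORY_WRITE_TURN_WINDOW", "5"))
        self._pending = []
        self._futures = []
        self._pool = ThreadPoolExecutor(max_workers=workers)

    def sync_turn(self, user, assistant):
        """Record one turn; submit a batch once the window is full."""
        self._pending.append((user, assistant))
        if len(self._pending) >= self._window:
            self._submit()

    def flush(self):
        """Write the partial tail, then wait for all in-flight writes."""
        if self._pending:
            self._submit()
        for future in self._futures:
            future.result()
        self._futures.clear()

    def _submit(self):
        # Swap the buffer out so the caller never waits on the write itself.
        batch, self._pending = self._pending, []
        self._futures.append(self._pool.submit(self._write_batch, batch))


written = []
buffer = TurnBuffer(written.append, window=5)
for i in range(7):
    buffer.sync_turn(f"user {i}", f"assistant {i}")
buffer.flush()  # what session end / pre-compress / shutdown would trigger
print(sorted(len(batch) for batch in written))
```

With seven turns and a window of five, one full batch is submitted during the loop and the two-turn tail is written by `flush`.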

Install

Two names are involved; don't confuse them:

hermes-hy-memory: the PyPI package you pip install.

hy-memory: the provider name Hermes registers it under (set in config.yaml). It is also the name of the SDK package that hermes-hy-memory pulls in as a dependency.

The SDK package hy-memory is on public PyPI — a plain pip install works, no extra index or credentials needed.

The chroma backend (the default, MEMORY_VECTOR_STORE=chroma) needs sqlite3 >= 3.35. On systems with an older SQLite (some CentOS/Linux installs report "unsupported version of sqlite3"), either pip install pysqlite3-binary and swap it in at process entry (import sys; sys.modules["sqlite3"] = __import__("pysqlite3")), or switch to MEMORY_VECTOR_STORE=qdrant / faiss.
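A slightly more defensive version of that swap checks the linked SQLite version first, so the replacement only happens when it is needed. This is a sketch under stated assumptions: the helper names `sqlite_too_old` and `ensure_modern_sqlite` are made up here, and the code must run at process entry, before anything imports the chroma backend.

```python
import sqlite3
import sys

MIN_SQLITE = (3, 35, 0)


def sqlite_too_old(version_info=None):
    """Return True when the linked SQLite library is older than 3.35."""
    if version_info is None:
        version_info = sqlite3.sqlite_version_info
    return tuple(version_info) < MIN_SQLITE


def ensure_modern_sqlite():
    """Swap in pysqlite3 (from pysqlite3-binary) when the stdlib build is too old."""
    if not sqlite_too_old():
        return "stdlib"
    try:
        import pysqlite3
    except ImportError:
        # Nothing to swap in; install pysqlite3-binary or change vector store.
        return "unavailable"
    sys.modules["sqlite3"] = pysqlite3
    return "pysqlite3"


print(ensure_modern_sqlite())
```

The return value only reports which path was taken; a result of `unavailable` means the backend will still see the old SQLite.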

1. Install the package

pip install hermes-hy-memory        # pulls the hy-memory SDK as a dependency

This registers the hy-memory memory provider with Hermes via entry points.

2. Configure + activate (recommended: the wizard)

Run the standalone wizard — it configures everything and activates the provider for you, so this is the only command you need:

hermes-hy-memory init

The 4-step wizard (LLM → embedding → vector store → userId):

writes the SDK configuration to ~/.hermes/.env, and
sets memory.provider: hy-memory in ~/.hermes/config.yaml automatically.

The wizard requires questionary (pip install "hermes-hy-memory[init]"). The processing mode defaults to pro (the wizard doesn't ask); set HY_MEMORY_MODE in ~/.hermes/.env if you want ultra or lite.

Use hermes-hy-memory (standalone), not hermes hy-memory, for the first run. The Hermes main CLI only exposes hermes hy-memory ... after the provider is active in config.yaml — so before activation hermes hy-memory init fails with invalid choice: 'hy-memory'. The standalone hermes-hy-memory binary (installed by pip) always works. Once activated, both forms are equivalent.

If you prefer to do it by hand, set memory.provider yourself:

# ~/.hermes/config.yaml
memory:
  provider: hy-memory

and set the environment variables manually (see Configuration).

3. Verify

hermes-hy-memory doctor
# once the provider is active, this also works:
hermes hy-memory doctor

Configuration

Environment variables

Variable	Required	Default	Description
`HY_MEMORY_USER_ID`	✅	—	First-level isolation key (your memory namespace)
`HY_MEMORY_AGENT_ID`	❌	`hermes`	Second-level isolation key
`HY_MEMORY_MODE`	❌	`pro`	Processing mode: `lite` / `pro` / `ultra`
`HY_MEMORY_WRITE_TURN_WINDOW`	❌	`5`	Write throttle: buffer turns and persist once every N turns (one batched extraction → saves tokens, avoids per-turn dup writes). `1` = write every turn. Tail (< N turns) is flushed on session end / pre-compress / shutdown.
`HY_MEMORY_PREFETCH_MAX_CHARS`	❌	`2000`	Max chars of injected prefetch text
`HY_MEMORY_SYNC_WORKERS`	❌	`2`	sync_turn background thread-pool size
`HY_MEMORY_SHUTDOWN_GRACE_SEC`	❌	`10`	Max seconds to wait for in-flight writes on shutdown
`OPENAI_API_KEY` (or matching LLM/embedder key)	✅*	—	Required for `pro`/`ultra`; `lite` only needs the embedder

*This key belongs to the HY Memory SDK's own LLM/embedder configuration (see the hy-memory docs).

Environment variables take precedence and can be persisted in ~/.hermes/.env (written automatically by the init wizard).
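For readers setting the variables by hand, the table can be expressed as a small loader that applies the documented defaults and rejects a missing user ID. This is an illustrative sketch: `ProviderSettings` and `load_settings` are hypothetical names, and the plugin's real parsing and validation may differ.

```python
import os
from dataclasses import dataclass

VALID_MODES = ("lite", "pro", "ultra")


@dataclass(frozen=True)
class ProviderSettings:
    user_id: str
    agent_id: str
    mode: str
    write_turn_window: int
    prefetch_max_chars: int
    sync_workers: int
    shutdown_grace_sec: float


def load_settings(env=None):
    """Read the variables from the table, applying the documented defaults."""
    env = os.environ if env is None else env
    user_id = env.get("HY_MEMORY_USER_ID", "").strip()
    if not user_id:
        raise ValueError("HY_MEMORY_USER_ID is required")
    mode = env.get("HY_MEMORY_MODE", "pro")
    if mode not in VALID_MODES:
        raise ValueError(f"unknown HY_MEMORY_MODE: {mode!r}")
    return ProviderSettings(
        user_id=user_id,
        agent_id=env.get("HY_MEMORY_AGENT_ID", "hermes"),
        mode=mode,
        write_turn_window=int(env.get("HY_MEMORY_WRITE_TURN_WINDOW", "5")),
        prefetch_max_chars=int(env.get("HY_MEMORY_PREFETCH_MAX_CHARS", "2000")),
        sync_workers=int(env.get("HY_MEMORY_SYNC_WORKERS", "2")),
        shutdown_grace_sec=float(env.get("HY_MEMORY_SHUTDOWN_GRACE_SEC", "10")),
    )


# A plain dict stands in for the process environment in this example.
settings = load_settings({"HY_MEMORY_USER_ID": "alice", "HY_MEMORY_MODE": "ultra"})
print(settings)
```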

Processing modes (`HY_MEMORY_MODE`)

Mode	Write pipeline	Speed	Recall quality
`lite`	pure embedding (no LLM)	fastest	vector similarity only
`pro` (default)	LLM fact/identity extraction + reconcile/evolution	medium	profile + fact layers
`ultra`	pro + System2 async cognition (Schema/Intention)	fullest	+ cross-domain induction + proactive intent

⚠️ lite mode is not suitable for Hermes passive injection / recall. lite only embeds and skips LLM extraction, so memories stay at the L1_RAW layer; the SDK's list / search filter out L1_RAW, meaning lite-written memories cannot be recalled by prefetch (writes succeed but search is always empty). Use pro (default) or ultra for Hermes; lite only fits write-only / no-semantic-recall scenarios.

CLI

After pip install, the hermes-hy-memory subcommands let you run diagnostics and manual operations (outside the Hermes main process):

# health check: env present / client constructs / list works (read-only)
hermes-hy-memory doctor

# manual write
hermes-hy-memory add "I like K-Pop but prefer Jazz"

# manual search
hermes-hy-memory search "music taste" --limit 5

# list recent 20
hermes-hy-memory list

# cross-user test (override env)
hermes-hy-memory search "x" --user-id other_user --agent-id test

When loaded by the Hermes main CLI:

hermes hy-memory doctor

Hooks

Hook	When	Behavior
`prefetch(query)`	before each LLM call	search memory → inject into system prompt
`sync_turn(user, assistant)`	end of each turn	buffer the turn; flush a batched async write every `HY_MEMORY_WRITE_TURN_WINDOW` turns (default 5)
`on_session_end(msgs)`	session end	flush the partial (< window) tail buffer and wait for in-flight writes
`on_pre_compress(msgs)`	before context compression	same as `on_session_end`, preserves about-to-be-trimmed content
`on_memory_write(action, target, content)`	Hermes built-in memory commands	`add` syncs to HY Memory; `delete` is skipped (target IDs are not interchangeable)
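The `on_memory_write` row can be read as a small dispatch. The sketch below assumes a hypothetical `add_memory` callable standing in for the HY Memory write path; it illustrates the documented behavior and is not the plugin's actual code.

```python
def on_memory_write(action, target, content, add_memory):
    """Mirror built-in memory writes; return True when something was synced.

    add_memory is a hypothetical callable standing in for the SDK write call.
    """
    if action == "add":
        add_memory(content)
        return True
    # "delete" (and anything else) is skipped: the built-in target IDs
    # do not map onto HY Memory IDs, so there is nothing safe to delete.
    return False


synced = []
print(on_memory_write("add", "memory", "I like K-Pop but prefer Jazz", synced.append))
print(on_memory_write("delete", "memory", "", synced.append))
```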

Tools (LLM-invoked, optional)

Tool	Purpose
`memory_search(query, limit)`	search memories
`memory_add(content)`	write a memory
`memory_delete(memory_id)`	delete a memory
`memory_list(limit)`	list memories for the current user/agent

Even with tools disabled, the passive prefetch injection guarantees relevant memories reach every LLM call.

Troubleshooting

`provider not initialized` or all hooks silently no-op

Run the health check:

hermes-hy-memory doctor

Common causes:

HY_MEMORY_USER_ID not set
Embedder/LLM key not set (OPENAI_API_KEY, etc.) → HyMemoryClient(mode="pro") fails to construct
SDK not installed: pip install hy-memory

prefetch injects nothing

There really is no relevant memory — confirm with hermes-hy-memory list
Query too short (< 3 chars) or in the skip list (ok / thanks, etc.) — by design
Using lite mode? lite memories stay at L1_RAW and are never recalled — use pro/ultra
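The "too short or in the skip list" rule can be sketched as a single predicate. The names below are hypothetical, the skip list contains only the two examples mentioned above, and the trimming and lowercasing are assumptions about how the query is normalized.

```python
# Assumed examples only; the real skip list may contain more entries.
SKIP_PHRASES = {"ok", "thanks"}
MIN_QUERY_CHARS = 3


def should_skip_prefetch(query):
    """Return True when a query is too short or is a known filler phrase."""
    normalized = query.strip().lower()
    return len(normalized) < MIN_QUERY_CHARS or normalized in SKIP_PHRASES


for query in ("ok", "hi", "music taste"):
    print(query, should_skip_prefetch(query))
```

If a query you expect to trigger recall is being skipped, rephrasing it as a full sentence is the quickest check.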

sync_turn doesn't seem to write

By default, writes run on daemon threads, which are killed when the main process exits. Use Hermes daemon mode in production.
Raise HY_MEMORY_SHUTDOWN_GRACE_SEC to let shutdown wait a few more seconds
Check the log for [hermes] sync_turn failed: ...
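The effect of the grace period can be illustrated with the standard library alone. In this sketch `drain` is a hypothetical helper and `time.sleep` stands in for a memory write; the `grace_sec` argument plays the role of `HY_MEMORY_SHUTDOWN_GRACE_SEC`.

```python
import time
from concurrent.futures import ThreadPoolExecutor, wait


def drain(futures, grace_sec=10.0):
    """Wait up to grace_sec for in-flight writes; return how many did not finish."""
    done, not_done = wait(futures, timeout=grace_sec)
    return len(not_done)


pool = ThreadPoolExecutor(max_workers=2)
# time.sleep stands in for a slow background write.
futures = [pool.submit(time.sleep, 0.05) for _ in range(2)]
unfinished = drain(futures, grace_sec=2.0)
print(f"unfinished writes: {unfinished}")
pool.shutdown(wait=False)
```

A non-zero result means writes were still running when the grace period ran out, which is the situation a larger grace value is meant to avoid.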

`cross-loop` errors (multi-client deployments)

If your Hermes deployment runs multiple HyMemoryClient instances (e.g. a multi-tenant server), use SharedRuntime:

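from hy_memory import HyMemoryClient, SharedRuntime

async def build_clients(base_config, configs):
    # create the runtime once (await needs a coroutine), then share it
    runtime = await SharedRuntime.create(base_config)
    return [HyMemoryClient(cfg, runtime=runtime) for cfg in configs]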

A single-process Hermes deployment uses a solo-mode client by default and does not need this.

Comparison with Mem0-style integrations

Mem0's Hermes integration relies on a TypeScript SDK; this plugin uses the Python SDK. HY Memory offers three processing depths — lite/pro/ultra (lite skips the LLM, pro does standard extraction, ultra adds System2 cognition) — whereas Mem0 is single-tier LLM extraction.

Development

cd plugins/native/hermes
python -m pytest tests/ -v

Tests mock HyMemoryClient and need no external dependencies (no OPENAI_API_KEY, no running Qdrant).

Versions

Plugin	SDK dependency	Notes
0.1.8	`hy-memory>=1.2.17`	Turn-window write throttle (`HY_MEMORY_WRITE_TURN_WINDOW`, default 5) — batch writes every N turns, flush partial tail on session end; wizard no longer asks for mode (defaults to `pro`)
0.1.7	`hy-memory>=1.2.17`	Standalone CLI (`doctor`/`add`/`search`/`list`) now loads `~/.hermes/.env`, so wizard-written settings are visible outside the Hermes daemon
0.1.6	`hy-memory>=1.2.17`	`init` auto-activates `memory.provider` in `config.yaml`; docs steer to standalone `hermes-hy-memory init` (avoids pre-activation `invalid choice` error)
0.1.5	`hy-memory>=1.2.17`	Single-level standalone CLI; corrected install docs (real pip + `config.yaml` flow)
0.1.4	`hy-memory>=1.2.17`	English README; `init` wizard (questionary)
0.1.3	`hy-memory>=1.2.17`	`init` interactive setup wizard
0.1.2	`hy-memory>=1.2.17`	Published to public PyPI; channel-dict flatten fix

