Postgres + pgvector memory provider plugin for hermes-agent. Multi-agent storage layer with per-minion themes, async writer, no LLM in the memory hot path.
Project description
hermes-memory-pgvector
Postgres + pgvector memory provider for hermes-agent. A shared memory substrate for a fleet of cooperating hermes-agent minions — built on Postgres and a single embedding endpoint you probably already run, with no LLM in the memory hot path.
each minion → X-Hermes-Session-Key: <theme>
→ hermes-agent gateway
→ pgvector plugin
├── memory_entries (mirrors built-in MEMORY.md / USER.md per theme)
└── conversations (every substantive turn, semantically searchable)
Why it exists
Existing memory providers each solve a piece of the problem; the gap for fleet deployments is wide:
- Built-in
memorytool persists to per-hostMEMORY.md/USER.md. Two minions on the same host stomp on each other; minions on different hosts have no shared substrate. - Honcho offers cross-session user modelling but requires a full external service, an LLM in the memory hot path for its deriver + dialectic loops, and its own ontology layered on top of the built-in tool. In high-concurrency fleet use it produces retry storms, embedding-endpoint queue backups, and gateway↔Honcho circular dependencies.
- Holographic is a fine in-process fact store but uses SQLite — a poor fit for many minions writing concurrently from many hosts.
- Other providers (Mem0, Hindsight, OpenViking, ByteRover, RetainDB, Supermemory) all either require a paid cloud, require LLM mediation for memory ops, or both.
What was missing: a storage layer that gives the built-in memory model durable, multi-tenant, semantically-searchable backing, with no LLM in the hot path, scoped cleanly per-minion so a marketing agent's notes don't pollute a trading agent's recall. That's what this plugin provides.
Design philosophy
- Storage layer, not a memory model. The agent keeps using
memory(action='add', target='memory'|'user', …). We mirror those writes viaon_memory_write. No new ontology for the agent to learn. - No LLM in the memory hot path. Embeddings are vector math, not LLM calls. There is no deriver, no dialectic, no dream cycle — the failure modes that hurt Honcho cannot occur here by construction.
- Per-agent themes by default, cross-theme recall on explicit demand. Every row carries
agent_identity(resolved fromX-Hermes-Session-Keyheader, profile name, workspace, or'default'). Recall is scoped to the current theme unless the agent asks forscope='all'. - Fail-soft everywhere. Embed endpoint down → degrade to text-only writes. Async writer queue full → drop with a one-time warning. DB down → log + skip. No exception escapes into the agent loop.
- Admin/runtime separation. DDL (
CREATE EXTENSION vector,CREATE TABLE,CREATE INDEX) runs once with superuser. The runtime user has DML only on the migrated schema.ensure_schema()at runtime is verify-only with a clearSchemaNotAppliederror if the operator forgot the migration.
Features (v0.3.0)
| Hook / surface | Behavior |
|---|---|
initialize() |
Verifies schema, opens psycopg_pool.ConnectionPool, bulk-imports existing MEMORY.md + USER.md content. |
on_memory_write(action, target, content, meta) |
Mirrors built-in memory writes into memory_entries (add / replace / remove). |
sync_turn(user, assistant, session_id) |
Captures every substantive (>= 40 chars + not boilerplate) chat turn into conversations. |
prefetch(query) |
Top-K semantically similar memory_entries in current theme, injected ambient. |
recall_memory(query, scope, target, limit) tool |
Explicit cross-theme search of durable memory entries. |
recall_conversation(query, scope, limit) tool |
Explicit search over past chat turns. scope ∈ {current, session, all, <theme>}. |
Internals:
psycopg_pool.ConnectionPool(min=1, max=4, lazy + thread-safe) shared across the agent thread and the async-writer drain thread.AsyncWriter— bounded queue + daemon drain thread. Memory write hooks return in microseconds. Worker embeds + writes in the background. Crash-resilient (auto-restart on next enqueue).- Single migration (
pgvector/migrations/001_schema.sql) —memory_entries+conversations+ HNSW indexes. Same tuning operators typically use elsewhere. - Boilerplate filter for turn capture — length floor + acknowledgement regex (
"ok","thanks","continue", …) so the recall table stays high-signal.
Multi-agent / per-minion themes
Each systemd-run minion sets one header on its OpenAI client; everything else flows automatically:
client = AsyncOpenAI(
base_url="http://127.0.0.1:8642/v1",
api_key=API_KEY,
default_headers={"X-Hermes-Session-Key": "marketing"}, # ← theme
)
The gateway plumbs X-Hermes-Session-Key through as gateway_session_key=… in MemoryProvider.initialize kwargs. The plugin reads it with priority over the profile default, so agent_identity='default' from unprofiled API traffic does not collapse every minion into one shared scope.
Convention: lowercase, dash-separated, stable. Examples that work well:
marketing,sales,morning-report,incidentintraday-<agent_name>for fan-out workers (e.g.intraday-trading,intraday-sre,intraday-marketing)
Install
Option 1: clone + run the installer script (recommended)
git clone https://github.com/andreab67/hermes-memory-pgvector.git
cd hermes-memory-pgvector
./scripts/install.sh
That:
pip installspsycopg[binary],psycopg-pool,PyYAML(with the upper-bound pins).- Copies
pgvector/into$HERMES_HOME/plugins/pgvector/(defaults to~/.hermes/plugins/pgvector/). - Prints the admin migration + activation commands you run next.
Option 2: manual
# Python deps
pip install 'psycopg[binary]>=3.3.4,<4' 'psycopg-pool>=3.3.1,<4' 'PyYAML>=6.0,<7'
# Plugin module
mkdir -p ~/.hermes/plugins
cp -r pgvector ~/.hermes/plugins/pgvector
Then (admin once)
# Apply the schema migration (CREATE EXTENSION needs superuser)
sudo -u postgres psql -d <your-memory-db> \
-f ~/.hermes/plugins/pgvector/migrations/001_schema.sql
# Hand ownership of the new tables to the hermes runtime role
sudo -u postgres psql -d <your-memory-db> -c "
ALTER TABLE memory_entries OWNER TO hermes;
ALTER SEQUENCE memory_entries_id_seq OWNER TO hermes;
ALTER TABLE conversations OWNER TO hermes;
ALTER SEQUENCE conversations_id_seq OWNER TO hermes;
"
# Activate
hermes config set memory.provider pgvector
sudo systemctl restart hermes.service # or however you run hermes
hermes memory status # expect: Provider: pgvector; Status: available
Configuration
Lives in $HERMES_HOME/config.yaml under plugins.pgvector — every value optional, sensible defaults shown:
plugins:
pgvector:
dsn: "dbname=hermes_memory user=hermes host=/var/run/postgresql"
embed_url: "http://your-embed-endpoint:11434"
embed_model: "nomic-embed-text"
prefetch_limit: 5
min_similarity: 0.30
embed_on_write: true
scope_default: "current"
write_queue_maxsize: 256
bulk_sync_on_init: true
sync_turns: true
turn_min_chars: 40
The embed endpoint can be any OpenAI-compatible /v1/embeddings or Ollama-native /api/embed URL that returns 768-dim vectors (the schema is hard-coded to vector(768) to match nomic-embed-text). Use a different model only if it produces 768-dim output, or edit the migration before applying it.
Schema
CREATE TABLE memory_entries (
id BIGSERIAL PRIMARY KEY,
agent_identity TEXT NOT NULL DEFAULT 'default',
target TEXT NOT NULL CHECK (target IN ('memory', 'user')),
content TEXT NOT NULL,
embedding vector(768),
created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT now(),
metadata JSONB NOT NULL DEFAULT '{}'::jsonb,
UNIQUE (agent_identity, target, content)
);
CREATE TABLE conversations (
id BIGSERIAL PRIMARY KEY,
session_id TEXT NOT NULL,
agent_identity TEXT NOT NULL DEFAULT 'default',
role TEXT NOT NULL CHECK (role IN ('user','assistant','system','tool')),
content TEXT NOT NULL,
ts TIMESTAMPTZ NOT NULL DEFAULT now(),
embedding vector(768),
metadata JSONB NOT NULL DEFAULT '{}'::jsonb
);
Indexes: HNSW on each embedding column (m=16, ef_construction=64) plus per-agent + per-session btree timelines. Full DDL in pgvector/migrations/001_schema.sql.
Tests
pip install -e ".[test]"
# Skip mode (no DB, no embed endpoint): everything skips gracefully
pytest tests/
# Live mode (against a throwaway Postgres + your embed endpoint)
export PG_TEST_DSN='dbname=hermes_test user=postgres host=/var/run/postgresql'
export PG_TEST_EMBED_URL='http://your-embed-endpoint:11434'
pytest tests/
DB tests skip when PG_TEST_DSN is unset; live embed tests skip when PG_TEST_EMBED_URL is unset.
Roadmap
See ROADMAP.md for the full milestone table. Highlights:
- M1 (v0.1, v0.1.1) ✅ Shared storage with per-agent themes, async writer, connection pool, bulk import from
MEMORY.md/USER.md - M2 (v0.2) ✅ Conversation transcript table with
sync_turncapture +recall_conversationtool - M3 (v0.3) ✅ Identity propagation for stateless API minions via
X-Hermes-Session-Key - M4 (v0.4) ⏳
on_delegation()+on_session_end()capture for agent-of-agents observability - M5 (v0.5–v0.6) ⏳ TTL/decay, partial HNSW indexes per-theme, metrics, bulk-import CLI
- M6 (v1.0) ⏳ Stable config schema, full docs, CI coverage
The roadmap exists so the multi-agent positioning isn't a one-off claim — each milestone has to pass the test "does this make N cooperating agents more capable?" before it lands. The What's not on the roadmap section in ROADMAP.md lists what was deliberately rejected (LLM-mediated dialectic, fact-store ontologies, background derivers, in-plugin RBAC) so the boundaries are explicit.
Rollback
hermes config set memory.provider none
sudo systemctl restart hermes.service
# Optional — drop the tables (data loss, irreversible)
sudo -u postgres psql -d <your-memory-db> -c "
DROP TABLE IF EXISTS conversations;
DROP TABLE IF EXISTS memory_entries;
"
# Optional — remove the plugin files
rm -rf ~/.hermes/plugins/pgvector
Why a standalone plugin (not an upstream PR)?
Per the hermes-agent CONTRIBUTING.md:
We are no longer accepting new memory providers into this repo. The set of built-in providers under
plugins/memory/is closed. If you want to add a new memory backend, publish it as a standalone plugin repo that users install into~/.hermes/plugins/(or via a pip entry point).
The discovery system (plugins/memory/__init__.py in hermes-agent) scans $HERMES_HOME/plugins/<name>/ for any directory whose __init__.py calls register_memory_provider. This plugin's pgvector/__init__.py does exactly that — no upstream change required.
Contributing
Bug reports + PRs welcome. Open an issue describing the failure mode + your environment (hermes-agent version, Postgres version, embed endpoint), or a PR with a focused change + test.
License
BSD 3-Clause © 2026 Andrea Borghi.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file hermes_memory_pgvector-0.3.0.tar.gz.
File metadata
- Download URL: hermes_memory_pgvector-0.3.0.tar.gz
- Upload date:
- Size: 27.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.9
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
9187e0c92754053dcc53f6941e014a309d426788fc0befbd1756fce1430206c2
|
|
| MD5 |
e6343d1d58d53ffc21773eccd9778e82
|
|
| BLAKE2b-256 |
3aebfc62c1c60446c6ccd78c52b42849c4eb82092a467e7a4ee5db1cdc367b68
|
File details
Details for the file hermes_memory_pgvector-0.3.0-py3-none-any.whl.
File metadata
- Download URL: hermes_memory_pgvector-0.3.0-py3-none-any.whl
- Upload date:
- Size: 27.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.9
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
4e828b34359054360af025e1afa98cf67518b53629102b22f50ce9f3dc9e7dab
|
|
| MD5 |
c87a689d2f69864e31300afd8f32650f
|
|
| BLAKE2b-256 |
b9bfae5963647ce73850cec67c0864717f8f37022aaf3a004f1f079f41120e93
|