Zeno memory protocols, context-bound handles, and SQLite defaults.

zeno-memory

Memory protocols, per-turn view handles, and SQLite-backed defaults for the Zeno framework.

Install

uv add 'zeno-framework[memory]'

What's in here

  • Protocols: SessionStore, UserMemoryStore, KnowledgeStore, ConversationStore, WorkingMemoryStore, ObservationStore, VectorStore.
  • Per-turn handles: MemoryView, UserMemoryView, KnowledgeView, ConversationHandle, SessionHandle, WorkingMemoryView, ObservationView.
  • SQLite defaults: SqliteSessionStore, SqliteUserMemoryStore, SqliteConversationStore, SqliteWorkingMemoryStore, SqliteObservationStore.
  • Composer: ThreeLayer (session + user memory + knowledge + conversation + working memory + observation log).
  • Adapter: VectorBackedUserMemoryStore — wraps any KnowledgeStore to make user memory semantically searchable.
  • L4 actors: Observer, Reflector, ObservationalMemory (orchestrator wired via ZenoApp(observational_memory=...)).

Wiring

from pathlib import Path

from zeno.app import ZenoApp
from zeno.memory import ThreeLayer
from zeno.memory.sqlite.conversation_store import SqliteConversationStore
from zeno.memory.sqlite.session_store import SqliteSessionStore
from zeno.memory.sqlite.observation_store import SqliteObservationStore
from zeno.memory.sqlite.user_memory_store import SqliteUserMemoryStore
from zeno.memory.sqlite.working_memory_store import SqliteWorkingMemoryStore

data = Path.home() / ".zeno"
data.mkdir(parents=True, exist_ok=True)

# `knowledge=` is a `KnowledgeStore` from a vector backend — see
# `zeno-chroma` or `zeno-qdrant` for concrete adapters. The example below
# uses `SqliteUserMemoryStore` for user memory; swap in
# `VectorBackedUserMemoryStore(knowledge_store=...)` to make user memory
# semantically searchable.
memory = ThreeLayer(
    session=SqliteSessionStore(data / "sessions.db"),
    user_memory=SqliteUserMemoryStore(data / "user_memory.db"),
    knowledge=...,  # ChromaKnowledgeStore | QdrantKnowledgeStore | …
    conversation=SqliteConversationStore(data / "conversations.db"),
    working_memory=SqliteWorkingMemoryStore(data / "working_memory.db"),
    observation_log=SqliteObservationStore(data / "observations.db"),
)

app = ZenoApp(agent=..., memory=memory, channels=..., provider=...)

ZenoApp calls memory.view_for(user_id=, channel=, thread_key=, agent_id=) once per turn and binds the resulting MemoryView into Ctx. Tools use ctx.memory.user, ctx.memory.knowledge, ctx.memory.session, ctx.memory.conversation, and ctx.memory.working_memory without ever seeing the underlying store.
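The per-turn binding described above can be pictured with a small stand-in. This is only a sketch: `InMemoryUserStore`, `UserView`, and this `view_for` function are hypothetical names rather than the zeno-memory API, and the sketch is synchronous for brevity whereas the real handles are awaited.

```python
from dataclasses import dataclass, field


@dataclass
class InMemoryUserStore:
    """Hypothetical stand-in for a user-memory store; rows are keyed by user_id."""

    rows: dict[str, list[str]] = field(default_factory=dict)

    def add(self, user_id: str, text: str) -> None:
        self.rows.setdefault(user_id, []).append(text)

    def all_for(self, user_id: str) -> list[str]:
        return list(self.rows.get(user_id, []))


@dataclass
class UserView:
    """Per-turn handle: user_id is bound once, so tool code never passes it."""

    _store: InMemoryUserStore
    _user_id: str

    def add(self, text: str) -> None:
        self._store.add(self._user_id, text)

    def all(self) -> list[str]:
        return self._store.all_for(self._user_id)


def view_for(store: InMemoryUserStore, *, user_id: str) -> UserView:
    # Called once per turn; the returned view is what a tool would see.
    return UserView(store, user_id)
```

Because the view carries the identity, a tool holding one user's view has no way to read another user's rows, which is the point of hiding the underlying store.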

Choosing a UserMemoryStore

Store | Use when
SqliteUserMemoryStore | You want exact-match recall on stored facts. Cheap, no embedding model required.
VectorBackedUserMemoryStore(knowledge_store=…) | You want semantic recall (e.g. "what did the user say about their job?"). Reuses the same vector backend you wired for knowledge.

Both implement the same UserMemoryStore protocol — swap freely without changing tool code.
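To illustrate why tool code does not change when the store is swapped, here is a minimal structural-typing sketch. `UserMemoryLike`, `ExactStore`, and `OverlapStore` are hypothetical; the real `UserMemoryStore` protocol has its own method names and is async. The word-overlap ranking is a toy stand-in for embedding search, and substring matching is an assumed reading of "exact-match".

```python
from typing import Protocol


class UserMemoryLike(Protocol):
    """Hypothetical, simplified shape shared by both stores."""

    def add(self, text: str) -> None: ...
    def search(self, query: str) -> list[str]: ...


class ExactStore:
    """Substring match only; no embedding model involved."""

    def __init__(self) -> None:
        self._facts: list[str] = []

    def add(self, text: str) -> None:
        self._facts.append(text)

    def search(self, query: str) -> list[str]:
        return [f for f in self._facts if query.lower() in f.lower()]


class OverlapStore:
    """Toy 'semantic' store: ranks facts by shared words, not real vectors."""

    def __init__(self) -> None:
        self._facts: list[str] = []

    def add(self, text: str) -> None:
        self._facts.append(text)

    def search(self, query: str) -> list[str]:
        words = set(query.lower().split())
        scored = [(len(words & set(f.lower().split())), f) for f in self._facts]
        ranked = sorted(scored, key=lambda pair: -pair[0])
        return [fact for score, fact in ranked if score > 0]


def recall_tool(memory: UserMemoryLike, query: str) -> list[str]:
    # Tool code depends only on the protocol, never on a concrete store.
    return memory.search(query)
```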

Memory tiers

Every user-memory row carries a tier (short, long, or permanent) that governs its lifecycle:

Tier | Default lifetime | Promoted when
short | Archived after 7 idle days | Hit ≥ 2 times → long
long | Archived after 30 idle days | Hit ≥ 5 times → permanent
permanent | Never archived | (terminal)
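The rules in the table above can be written down as two small functions. This is a sketch under assumptions: the function names are hypothetical, "after N idle days" is read as strictly more than N days since the last hit, and whether hit counters reset on promotion is not stated in this README, so the sketch does not model it.

```python
from datetime import datetime, timedelta

# Lifetimes and thresholds copied from the table; None means "never archived".
IDLE_LIMIT = {
    "short": timedelta(days=7),
    "long": timedelta(days=30),
    "permanent": None,
}
PROMOTE_AT = {"short": (2, "long"), "long": (5, "permanent")}


def next_tier(tier: str, hits: int) -> str:
    """Return the tier a row moves to (one step at most) given its hit count."""
    rule = PROMOTE_AT.get(tier)
    if rule is not None and hits >= rule[0]:
        return rule[1]
    return tier


def should_archive(tier: str, last_hit: datetime, now: datetime) -> bool:
    """True when the row has been idle longer than its tier allows."""
    limit = IDLE_LIMIT[tier]
    return limit is not None and now - last_hit > limit
```

For example, a short-tier row that has been recalled twice moves to long, and a long-tier row left untouched for eight days is still kept.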

Tier choice is explicit at write time:

# tentative observation — easily decayed
await ctx.memory.user.add("alice mentioned a coffee shop", tier="short")

# default tier (long): promoted to permanent on repeated recall
await ctx.memory.user.add("alice's dog is named Otto")

# pinned fact — never decayed
await ctx.memory.user.add("alice's birthday is march 14", tier="permanent")

The remember built-in tool accepts the same tier argument so the model can choose the right bucket itself.

Observational memory (L4)

Observational memory replaces the legacy MemoryExtractor + MemoryConsolidator + MemoryMaintenance trio with two LLM-driven actors that share the observation_log feed:

  • Observer runs after a turn whose unprocessed conversation window crosses a token threshold (default 8192). It produces a small batch of dated, priority-tagged observations from the new rounds.
  • Reflector wakes when the active observation set crosses its own token threshold (default 16384). It proposes structured edits (Merge / Replace / Delete) that compress the feed into denser observations without losing high-priority detail.
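The two triggers above amount to a pair of threshold checks. The sketch below is illustrative only: `estimate_tokens` is a crude four-characters-per-token guess, `actors_to_run` is a hypothetical helper, and "crosses" is assumed to mean greater than or equal to. The framework's own token counting and scheduling may differ.

```python
OBSERVER_THRESHOLD = 8192    # default from the text above
REFLECTOR_THRESHOLD = 16384  # default from the text above


def estimate_tokens(text: str) -> int:
    """Rough stand-in for a tokenizer: about four characters per token."""
    return max(1, len(text) // 4)


def actors_to_run(unprocessed_rounds: list[str], active_observations: list[str]) -> list[str]:
    """Decide which L4 actors are due after a turn."""
    due = []
    if sum(estimate_tokens(r) for r in unprocessed_rounds) >= OBSERVER_THRESHOLD:
        due.append("observer")
    if sum(estimate_tokens(o) for o in active_observations) >= REFLECTOR_THRESHOLD:
        due.append("reflector")
    return due
```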

ObservationalMemory(observer=…, reflector=…, conversations=…, observations=…) owns the lifecycle for both. Wire it through ZenoApp(observational_memory=…) and the framework auto-registers a post-turn hook plus the rendered ## Observations block.

from zeno.app import ZenoApp
from zeno.memory import ObservationalMemory, Observer, Reflector

om = ObservationalMemory(
    observer=Observer(observations=observations, llm=llm_proposer),
    reflector=Reflector(observations=observations, llm=llm_reflector),
    conversations=conversations,
    observations=observations,
)

app = ZenoApp(
    agent=agent,
    memory=memory,
    channels=[...],
    provider=...,
    observational_memory=om,
)

The orchestrator's priority_markers="emoji" (default) renders rows as 🔴 / 🟡 / 🟢; switch to "tokens" for [high] / [med] / [low] if you'd rather not ship emoji glyphs in the prompt.
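As a rough picture of the two marker styles, the following sketch renders a list of (priority, text) pairs. The row layout and the `render_observations` name are assumptions; only the marker sets and the `## Observations` heading come from this README.

```python
MARKERS = {
    "emoji": {"high": "🔴", "med": "🟡", "low": "🟢"},
    "tokens": {"high": "[high]", "med": "[med]", "low": "[low]"},
}


def render_observations(rows: list[tuple[str, str]], priority_markers: str = "emoji") -> str:
    """Render (priority, text) pairs under an Observations heading."""
    marks = MARKERS[priority_markers]
    lines = ["## Observations"]
    lines += [f"- {marks[priority]} {text}" for priority, text in rows]
    return "\n".join(lines)
```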

When observational_memory= is wired, auto_inject flips its enabled_user default to False — the L4 block carries the durable user-facing context the model needs each turn, and re-injecting top-k user-memory hits on top of it tends to repeat itself. Pass an explicit auto_inject=AutoInjectConfig(enabled_user=True, ...) to opt back in.

Deprecated: MemoryExtractor, MemoryConsolidator, MemoryMaintenance

The pre-L4 trio still imports cleanly but emits a DeprecationWarning from __init__ and will be removed in the next minor release. See docs/MIGRATION.md for concrete before/after snippets.

Recall vs auto-inject — when does the model see a fact?

Two complementary surfaces expose user memory to the LLM:

Surface | Triggered by | Best for
Auto-inject (zeno-core middleware) | Every turn; runs before the handler builds the prompt. Searches ctx.memory.user (and optionally ctx.memory.knowledge) with the inbound text and prepends the top-k hits to system. | Background facts the user expects the model to "just know": name, location, ongoing projects. The model never has to ask.
recall tool (opt-in) | The model decides, usually when auto-inject didn't surface what it needs (different phrasing, deeper search, explicit knowledge lookup). | On-demand lookups the model knows it needs. The tool call is visible in the trace, so it's auditable.

Both go through the same ctx.memory view, so they see the same rows and the same tier filter. Auto-inject is governed by AutoInjectConfig (k, distance_threshold, per-store toggles); recall accepts an explicit k and a memory="user" | "knowledge" selector.

Rule of thumb: turn auto-inject on for low-friction defaults; expose recall (and remember) as tools so the model can extend the same state when auto-inject misses.
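The auto-inject step (search, filter by distance, keep the top k, prepend to the system prompt) can be sketched as below. The function names, the default values, and the "Relevant memory" heading are all hypothetical; `AutoInjectConfig` exposes `k` and `distance_threshold`, but its actual defaults and output format are not given here.

```python
def select_hits(scored: list[tuple[float, str]], k: int = 3, distance_threshold: float = 0.5) -> list[str]:
    """Keep up to k hits whose distance is within the threshold, nearest first."""
    close = [(dist, text) for dist, text in scored if dist <= distance_threshold]
    return [text for dist, text in sorted(close)[:k]]


def inject(system: str, hits: list[str]) -> str:
    """Prepend the selected hits to the system prompt; no hits means no change."""
    if not hits:
        return system
    block = "\n".join(f"- {hit}" for hit in hits)
    return f"Relevant memory:\n{block}\n\n{system}"
```

A recall tool would run the same search on demand, with the model supplying the query and `k` instead of the inbound message text.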

Working memory (L2)

Working memory is a per-(user_id, agent.name) typed key-value scratchpad pinned to the system prompt. It complements similarity-based recall: durable identity facts (name, role, communication style, current focus) live in a stable card the model always sees, instead of depending on whether the embedding happens to retrieve them this turn.

Declare a Pydantic schema (every field str | None) and attach it to the agent — the framework auto-wires the update_working_memory tool and renders a ## Working memory block at the top of every system prompt:

from pydantic import BaseModel
from zeno.agent import Agent


class UserCard(BaseModel):
    name: str | None = None
    role: str | None = None
    communication_style: str | None = None
    current_focus: str | None = None


agent = Agent(
    name="root",
    instructions="You are a helpful personal assistant.",
    working_memory_schema=UserCard,
)

The model writes via the auto-wired tool:

update_working_memory(name="Niels", communication_style="concise, no emoji")

…and on the next turn sees:

## Working memory
- name: Niels (updated 2026-04-29)
- role: (unknown)
- communication_style: concise, no emoji (updated 2026-04-29)
- current_focus: (unknown)

Empty string clears a field; explicit None means "don't touch".
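The update rule (None leaves a field alone, an empty string clears it) is easy to get backwards, so here is a small sketch of it using a plain dict in place of the Pydantic card. `apply_update` and `render` are hypothetical helpers, not framework functions; the rendered layout mirrors the block shown above.

```python
from datetime import date

FIELDS = ("name", "role", "communication_style", "current_focus")


def apply_update(card: dict, today: date, **changes) -> dict:
    """Return a new card: None skips a field, "" clears it, anything else sets it."""
    updated = dict(card)
    for key, value in changes.items():
        if key not in FIELDS:
            raise KeyError(key)
        if value is None:
            continue  # explicit None means "don't touch"
        if value == "":
            updated.pop(key, None)  # empty string clears the field
        else:
            updated[key] = (value, today)
    return updated


def render(card: dict) -> str:
    """Render the card in the same shape as the block above."""
    lines = ["## Working memory"]
    for key in FIELDS:
        if key in card:
            value, when = card[key]
            lines.append(f"- {key}: {value} (updated {when.isoformat()})")
        else:
            lines.append(f"- {key}: (unknown)")
    return "\n".join(lines)
```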

Namespacing rule. Working memory is keyed by (user_id, agent.name). Renaming an agent creates a new namespace — the old card data does not migrate. Treat agent names as stable identifiers.

Storage default. SqliteWorkingMemoryStore(path) — own SQLite file, last-write-wins, owner-only file perms. Migration 007_working_memory provisions the table on first connect.
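The namespacing and storage rules can be seen together in a standard-library sketch. The schema, table layout, and helper names here are hypothetical and are not what migration 007_working_memory creates; the sketch only shows last-write-wins on a (user_id, agent name, field) key and owner-only permissions on a POSIX system.

```python
import os
import sqlite3
import tempfile


def open_store(path: str) -> sqlite3.Connection:
    """Open (or create) the file, ensure the table exists, restrict perms."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS working_memory ("
        " user_id TEXT NOT NULL, agent_name TEXT NOT NULL,"
        " field TEXT NOT NULL, value TEXT NOT NULL,"
        " PRIMARY KEY (user_id, agent_name, field))"
    )
    conn.commit()
    os.chmod(path, 0o600)  # owner read/write only
    return conn


def put(conn: sqlite3.Connection, user_id: str, agent_name: str, field: str, value: str) -> None:
    # INSERT OR REPLACE gives last-write-wins on the primary key.
    conn.execute(
        "INSERT OR REPLACE INTO working_memory (user_id, agent_name, field, value)"
        " VALUES (?, ?, ?, ?)",
        (user_id, agent_name, field, value),
    )
    conn.commit()


def get(conn: sqlite3.Connection, user_id: str, agent_name: str) -> dict:
    rows = conn.execute(
        "SELECT field, value FROM working_memory WHERE user_id = ? AND agent_name = ?",
        (user_id, agent_name),
    ).fetchall()
    return dict(rows)
```

Reading with a different agent name returns an empty card, which is the renaming pitfall described in the namespacing rule.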

When to call update_working_memory vs remember. Use update_working_memory for the structured fields the developer declared on the schema (durable identity facts the agent should always see). Use remember for free-form facts that don't fit the card.

ConversationStore (provider portability)

Non-Claude providers (e.g. OpenAIProvider) write each turn's assistant, tool, and user messages through ConversationStore so the next turn has prior context. ClaudeSDKProvider does not use it — the SDK owns its own session history.
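A minimal sketch of that write-through pattern follows. `InMemoryConversationStore`, `run_turn`, and the `complete` callable are hypothetical stand-ins for the real store and provider, and tool messages are omitted to keep it short.

```python
from collections import defaultdict


class InMemoryConversationStore:
    """Hypothetical stand-in: one ordered message list per thread key."""

    def __init__(self) -> None:
        self._threads = defaultdict(list)

    def append(self, thread_key: str, role: str, content: str) -> None:
        self._threads[thread_key].append({"role": role, "content": content})

    def history(self, thread_key: str) -> list[dict]:
        return list(self._threads[thread_key])


def run_turn(store: InMemoryConversationStore, thread_key: str, user_text: str, complete) -> str:
    """Persist the user message, call the provider with full history, persist the reply."""
    store.append(thread_key, "user", user_text)
    reply = complete(store.history(thread_key))
    store.append(thread_key, "assistant", reply)
    return reply
```

A provider that keeps its own session history, as the text says of ClaudeSDKProvider, would skip this store entirely.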

Vector backends

Concrete KnowledgeStore adapters live in sibling packages, such as zeno-chroma and zeno-qdrant.

See also: zeno-core for Ctx, @tool, and the MemoryBinderProtocol.

Part of the Zeno framework.
