
Dory

+13pp on LongMemEval. The best Python-native, local-first agent memory library.

pip install dory-memory

from dory import DoryMemory

mem = DoryMemory()
mem.observe("User prefers local-first AI")
mem.observe("User switched from llama.cpp to MLX — 25% faster")

print(mem.query("what does the user prefer for inference?"))
# → MLX (updated preference, supersedes llama.cpp)

Dory gives your agent persistent, structured memory across sessions — with spreading activation retrieval, principled forgetting, and an episodic layer that scored 79.8% on LongMemEval (beats Mem0 68.4% and Zep 71.2%). Zero server. Single SQLite file. Works offline.


The problem

Every time you start a new session, your agent starts from zero. Even systems that claim to "remember" you are doing keyword search through a flat list of notes. That's not memory — that's ctrl+F.

The deeper problem: naive memory injection makes things worse. Dumping everything into context creates noise that degrades model performance. Research (Chroma, 2025) shows all major frontier models degrade starting at 500–750 tokens of context.

What Dory does differently

Four memory types, all in one place

Type         What it stores
Episodic     Past events, sessions, experiences
Semantic     Facts, preferences, entities, relationships
Procedural   Skills, workflows, repeatable processes
Working      In-context window (managed by your LLM)

Spreading activation retrieval — not vector similarity search. Relevant memories pull in connected memories through the graph. "AllergyFind" activates "Giovanni's" activates "FastAPI" activates "menu endpoint" because those things co-occurred. That's how human memory works.
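
For intuition, here is a toy sketch of the mechanism (not Dory's actual engine; the graph, decay factor, and threshold are all illustrative):

# Toy spreading activation: energy flows outward along weighted edges,
# decaying at each hop; anything that stays above a threshold is retrieved.
graph = {
    "AllergyFind": [("Giovanni's", 0.9), ("FastAPI", 0.8)],
    "Giovanni's": [("menu endpoint", 0.7)],
    "FastAPI": [("menu endpoint", 0.6)],
    "menu endpoint": [],
}

def spread(seed, decay=0.5, threshold=0.1):
    activation = {seed: 1.0}
    frontier = [seed]
    while frontier:
        node = frontier.pop()
        for neighbor, weight in graph[node]:
            energy = activation[node] * weight * decay
            if energy > activation.get(neighbor, 0.0) and energy > threshold:
                activation[neighbor] = energy
                frontier.append(neighbor)
    return sorted(activation.items(), key=lambda kv: -kv[1])

print(spread("AllergyFind"))
# [('AllergyFind', 1.0), ("Giovanni's", 0.45), ('FastAPI', 0.4), ('menu endpoint', 0.1575)]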

Cacheable prefix output — instead of regenerating your full memory context every turn (which blows prompt caching), Dory splits output into a stable prefix (same until memory actually changes) and a dynamic suffix (query-specific). Result: cache hits every turn. 4–10x cheaper to run agents with memory than without.

Principled forgetting — three decay zones: active, archived, expired. Scores based on recency + frequency + relevance. Nothing is ever deleted — archived memories are queryable for historical context. No other production memory library ships this.

Bi-temporal conflict resolution — when a fact changes, the old version is archived with a SUPERSEDES edge and a timestamp. You can query "what was true in January" and get the right answer.
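
A rough sketch of what an as-of query can look like on top of the zone-query API shown under Decay zones below; the created_at and archived_at attributes are hypothetical stand-ins for whatever timestamps Dory actually stores:

from datetime import datetime

# Hypothetical: reconstruct "what was true in January" by combining
# still-active facts with archived facts whose validity window covers it.
as_of = datetime(2025, 1, 15)

def facts_as_of(graph, as_of):
    active = [n for n in graph.all_nodes() if n.created_at <= as_of]
    superseded = [
        n for n in graph.all_nodes(zone="archived")
        if n.created_at <= as_of and n.archived_at > as_of
    ]
    return active + superseded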

Zero-server stack — everything runs in a single SQLite file. sqlite-vec for vectors, FTS5 for keyword search, adjacency tables for the graph. No Postgres, no Neo4j, no Redis. Works offline.
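
The same three primitives are reachable from stock Python, which is the point of the zero-server claim. A sketch of the pattern (not Dory's actual schema; table and column names are illustrative):

import sqlite3
import sqlite_vec  # pip install sqlite-vec

db = sqlite3.connect("memory.db")
db.enable_load_extension(True)
sqlite_vec.load(db)  # enables the vec0 virtual table for vector search
db.enable_load_extension(False)

# Keyword search: FTS5 ships with the standard sqlite3 module
db.execute("CREATE VIRTUAL TABLE IF NOT EXISTS node_fts USING fts5(content)")

# Vector search: one embedding row per node (768-dim, e.g. nomic-embed-text)
db.execute("CREATE VIRTUAL TABLE IF NOT EXISTS node_vec USING vec0(embedding float[768])")

# Graph: a plain adjacency table is all spreading activation needs
db.execute("""CREATE TABLE IF NOT EXISTS edges (
    src INTEGER, dst INTEGER, type TEXT, weight REAL)""")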


Quick start

from dory import DoryMemory

# No dependencies required — works out of the box
mem = DoryMemory()

# Add memories manually
mem.observe("Alice is migrating payments from Stripe to a custom processor", node_type="EVENT")
mem.observe("Alice prefers async Python over synchronous frameworks", node_type="PREFERENCE")
mem.observe("The migration deadline is end of Q2", node_type="EVENT")

# Query — returns context to inject into your LLM prompt
context = mem.query("payment migration deadline")
print(context)

# End of session: consolidate, decay, promote core memories
mem.flush()

# See your graph in the browser
mem.visualize()

Or from the command line after any session:

dory visualize          # opens graph in browser
dory show               # print stats + core memories
dory query "topic"      # spreading activation from the terminal

With auto-extraction (add a model and Dory extracts memories from conversation turns automatically):

mem = DoryMemory(extract_model="qwen3:14b")                 # local via Ollama
mem = DoryMemory(                                           # Claude
    extract_model="claude-haiku-4-5-20251001",
    extract_backend="anthropic",
    extract_api_key="sk-ant-...",
)
mem = DoryMemory(                                           # GPT / Grok / any compat
    extract_model="gpt-4o-mini",
    extract_backend="openai",
    extract_api_key="sk-...",
)

# Log turns — extraction happens automatically every N turns
mem.add_turn("user", "I'm working on AllergyFind today, need to add a menu endpoint")
mem.add_turn("assistant", "What authentication approach are you using?")

# Build API-ready messages with prompt caching; user_query is this turn's raw user message
result = mem.build_context("menu endpoint authentication")
messages = result.as_anthropic_messages(user_query)   # Anthropic SDK w/ cache_control
messages = result.as_openai_messages(user_query)      # OpenAI / compat

MCP server (Claude Code / Claude Desktop)

pip install 'dory-memory[mcp]'

# Register globally across all Claude Code projects
claude mcp add --scope user dory -- dory-mcp

# Or with a specific DB path
claude mcp add --scope user dory -- dory-mcp --db /path/to/engram.db

Five tools are exposed: dory_query, dory_observe, dory_consolidate, dory_visualize, dory_stats.


Interactive demo

Live graph visualization →

Dory memory graph demo

Force-directed knowledge graph with spreading activation query mode, edge type coloring, archived/superseded nodes, and session summary chain. Click any of the pre-set queries to see retrieval in action.


Framework adapters

LangChain — drop-in BaseMemory replacement:

from dory.adapters.langchain import DoryMemoryAdapter
from langchain.chains import ConversationChain
from langchain_anthropic import ChatAnthropic

memory = DoryMemoryAdapter(
    extract_model="claude-haiku-4-5-20251001",
    extract_backend="anthropic",
    extract_api_key="sk-ant-...",
)
chain = ConversationChain(llm=ChatAnthropic(model="claude-sonnet-4-6"), memory=memory)

LangGraph — graph nodes with the (state) -> state signature:

from dory.adapters.langgraph import DoryMemoryNode, MemoryState
from langgraph.graph import StateGraph, START, END

mem = DoryMemoryNode(extract_model="claude-haiku-4-5-20251001", extract_backend="anthropic")

builder = StateGraph(MemoryState)
builder.add_node("load_memory", mem.load_context)   # or mem.aload_context for async
builder.add_node("record_turn", mem.record_turn)
builder.add_edge(START, "load_memory")
builder.add_edge("load_memory", "record_turn")
builder.add_edge("record_turn", END)
graph = builder.compile()

Multi-agent — shared memory pool with thread-safe writes and agent attribution:

from dory.adapters.multi_agent import SharedMemoryPool

pool = SharedMemoryPool(db_path="shared.db")
pool.observe("User prefers dark mode", agent_id="agent-1")
pool.add_turn("user", "Let's ship it", agent_id="agent-2", session_id="s1")
results = pool.query("UI preferences")
agent_nodes = pool.get_agent_nodes("agent-1")

Async API

All DoryMemory methods have async counterparts — safe to await from FastAPI, LangGraph, and any async framework:

context = await mem.aquery("current topic")
result  = await mem.abuild_context("current topic")
await mem.aadd_turn("user", "message")
node_id = await mem.aobserve("User prefers JWT", node_type="PREFERENCE")
stats   = await mem.aflush()

Export / import

from dory.export.jsonld import JSONLDExporter

exporter = JSONLDExporter(graph)
exporter.export("memory.jsonld.json")           # write to file
data = exporter.export()                         # or get dict

JSONLDExporter.import_into(graph, "memory.jsonld.json")   # round-trip import

Advanced: direct pipeline access

from dory import Graph, Observer, Prefixer

graph = Graph("myapp.db")
obs = Observer(graph, backend="ollama", model="qwen3:14b")
p = Prefixer(graph)
# ... same as DoryMemory but with full control

How it works

Knowledge graph

Every piece of information is a node. Nodes have types: ENTITY, CONCEPT, EVENT, PREFERENCE, BELIEF, PROCEDURE, SESSION (episodic narrative), SESSION_SUMMARY (structured episodic with salient_counts). Edges between them are typed and weighted: USES, WORKS_ON, PREFERS, SUPERSEDES, CO_OCCURS, SUPPORTS_FACT, TEMPORALLY_AFTER, etc.

Salience is computed, not assigned:

salience = α × connectivity + β × activation_frequency + γ × recency

High-salience nodes become core memories — they anchor the stable context prefix.
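
For example, with illustrative weights (Dory's shipped defaults may differ) and all inputs normalized to [0, 1]:

# Hypothetical weights and inputs, purely to make the formula concrete
alpha, beta, gamma = 0.4, 0.3, 0.3
connectivity = 0.8          # well-connected hub node
activation_frequency = 0.5  # activated in about half of recent queries
recency = 0.9               # touched this session

salience = alpha * connectivity + beta * activation_frequency + gamma * recency
# 0.32 + 0.15 + 0.27 = 0.74 -> a strong candidate for core memory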

Observer

Every N conversation turns (configurable), the Observer calls an LLM (local by default) to extract structured memories from the raw conversation. Extractions carry confidence scores — anything below the threshold is logged but not written to the graph, guarding against false memories.

Backends: Ollama (default), Anthropic (Claude), or any OpenAI-compatible endpoint (llama.cpp, Clanker, vLLM, GPT, Grok, etc.).

Prefixer

Builds context in two parts:

[stable prefix]         ← core memories + key relationships
                          same bytes across turns → prompt cache hits

[dynamic suffix]        ← spreading activation for this specific query
                          + recent episodic observations
                          changes per query but small
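
This split is what makes prompt caching land. A hand-rolled sketch of the pattern that result.as_anthropic_messages(...) automates (stable_prefix and dynamic_suffix are stand-ins for Prefixer output):

stable_prefix = "...core memories + key relationships..."   # stand-in
dynamic_suffix = "...activation results for this query..."  # stand-in

# The stable prefix carries a cache_control breakpoint; everything up to
# it is cached, and only the small suffix is reprocessed each turn.
system_blocks = [
    {
        "type": "text",
        "text": stable_prefix,                   # identical bytes across turns
        "cache_control": {"type": "ephemeral"},  # -> prompt cache hit
    },
    {"type": "text", "text": dynamic_suffix},    # query-specific, stays small
]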

Decayer

Runs periodically to score every node:

score = recency_weight  × exp(-λ × days_since_activation)
      + frequency_weight × log(1 + activation_count)
      + relevance_weight × salience

Nodes below the active floor → archived. Below the archive floor → expired. Core memories are shielded with a configurable multiplier.
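
A direct transcription of that scoring rule; the weights, λ, and zone floors below are illustrative, not Dory's shipped defaults:

import math

def decay_score(days_since_activation, activation_count, salience,
                recency_w=0.5, frequency_w=0.3, relevance_w=0.2, lam=0.05):
    return (recency_w * math.exp(-lam * days_since_activation)
            + frequency_w * math.log(1 + activation_count)
            + relevance_w * salience)

# Illustrative zone floors
ACTIVE_FLOOR, ARCHIVE_FLOOR = 0.3, 0.1
score = decay_score(days_since_activation=45, activation_count=2, salience=0.4)
zone = ("active" if score >= ACTIVE_FLOOR
        else "archived" if score >= ARCHIVE_FLOOR
        else "expired")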

Reflector

Finds near-duplicate nodes (Jaccard similarity ≥ 0.82, empirically tuned) and merges them, keeping the higher-salience one. Detects supersession — same subject, newer fact, Jaccard in [0.45, 0.82) — archives the old node and adds a SUPERSEDES provenance edge. Old observations are compressed into summaries. The dedup thresholds are practical defaults chosen conservatively; a sensitivity analysis is planned.
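
Those thresholds amount to a simple overlap classifier. A sketch with naive whitespace tokenization (Dory's real text normalization may differ):

def jaccard(a: str, b: str) -> float:
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb)

def classify(old_fact: str, new_fact: str) -> str:
    j = jaccard(old_fact, new_fact)
    if j >= 0.82:
        return "duplicate"     # merge, keep the higher-salience node
    if j >= 0.45:
        return "supersession"  # also requires same subject + newer fact:
                               # archive old node, add SUPERSEDES edge
    return "distinct"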


Architecture

dory/
├── graph.py          ← nodes, edges, salience computation
├── schema.py         ← NodeType, EdgeType, zone constants
├── activation.py     ← spreading activation engine
├── consolidation.py  ← edge decay, strengthen, prune, promote/demote core
├── session.py        ← session-level helpers: query, observe, write_turn, end_session
├── memory.py         ← DoryMemory — the high-level drop-in API (sync + async)
├── visualize.py      ← D3.js interactive graph visualization
├── mcp_server.py     ← MCP tools (dory_query, dory_observe, dory_consolidate, …)
├── store.py          ← SQLite backend (nodes, edges, FTS5, observations)
│
├── pipeline/
│   ├── observer.py   ← LLM extraction of memories from conversation turns
│   ├── summarizer.py ← episodic layer: SESSION nodes from conversation turns
│   ├── prefixer.py   ← stable prefix + dynamic suffix builder
│   ├── decayer.py    ← node decay scoring + zone management
│   └── reflector.py  ← dedup, supersession, observation compression
│
├── adapters/
│   ├── langchain.py   ← DoryMemoryAdapter — LangChain BaseMemory drop-in
│   ├── langgraph.py   ← DoryMemoryNode — LangGraph StateGraph nodes
│   └── multi_agent.py ← SharedMemoryPool — thread-safe multi-agent memory
│
└── export/
    └── jsonld.py      ← JSONLDExporter — portable JSON-LD round-trip

Local LLM setup

Dory defaults to Ollama for LLM-based extraction (Observer) and embedding (vector search).

# Pull the default models
ollama pull qwen3:14b          # extraction
ollama pull nomic-embed-text   # embeddings (768-dim, offline after pull)

OpenAI-compatible endpoint (Clanker, llama.cpp server, vLLM):

obs = Observer(
    graph,
    backend="openai",
    base_url="http://localhost:8000",
    model="qwen3",
)

Vector search activates automatically once nomic-embed-text is available. Falls back to FTS5 BM25 + substring search if no embedding model is running.


Decay zones

Zone       Behavior                           How to query
active     Retrieved in all normal queries    graph.all_nodes() (default)
archived   Invisible to normal queries        graph.all_nodes(zone="archived")
expired    Completely invisible               graph.all_nodes(zone=None)

User-meaningful memory is never deleted by forgetting — archived and expired nodes retain full provenance and can be restored if reactivated. The one exception: exact structural duplicates detected by the Reflector are hard-merged (the lower-salience copy is removed, all its edges are rewired to the winner).


What's different from other memory libraries

Dory ships everything below out of the box; mem0, Zep, Letta, and Mastra each cover these only partially or not at all:

  • Principled forgetting
  • Spreading activation retrieval
  • Cacheable prefix output (Mastra ships this, but in TypeScript only)
  • Bi-temporal conflict resolution
  • Zero-server local stack (partial in some alternatives)
  • Drop-in Python library (partial in some alternatives)
  • Apache 2.0 license

Roadmap

Shipped (v0.1)

  • MCP server — expose Dory memory as MCP tools for Claude Code / Claude Desktop
  • LangChain adapter — dory.adapters.langchain.DoryMemoryAdapter implements BaseMemory
  • LangGraph adapter — dory.adapters.langgraph.DoryMemoryNode for StateGraph integration
  • Procedural memory — PROCEDURE node type for skills, workflows, and repeatable processes
  • Multi-agent shared memory — dory.adapters.multi_agent.SharedMemoryPool with thread-safe writes and agent attribution
  • Portable import/export format — dory.export.jsonld.JSONLDExporter for JSON-LD round-trips

Shipped (v0.2)

  • Episodic layer — SESSION_SUMMARY nodes with structured salient_counts metadata
  • Retrieval fusion — three-mode routing (graph / episodic / hybrid) via deterministic regex, no extra LLM calls (see the sketch after this list)
  • Staged retrieval — spreading activation → SUPPORTS_FACT traversal → SESSION_SUMMARY injection
  • Behavioral preference synthesis — Reflector detects repeated behavioral patterns across sessions and synthesizes PREFERENCE nodes without LLM calls
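
To give a flavor of the deterministic routing mentioned above; the patterns below are invented for illustration, not Dory's shipped rules:

import re

# Hypothetical routing: temporal/episodic cues -> episodic retrieval,
# entity-flavored questions -> graph, everything ambiguous -> hybrid.
EPISODIC = re.compile(r"\b(yesterday|last (week|month|session)|when did|first time)\b", re.I)
GRAPH = re.compile(r"\b(who|what is|prefer|uses|works on)\b", re.I)

def route(query: str) -> str:
    if EPISODIC.search(query):
        return "episodic"
    if GRAPH.search(query):
        return "graph"
    return "hybrid"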

Shipped (v0.3)

  • Full 500-question LongMemEval run — 79.8% Sonnet/Sonnet (+13.0pp over v0.1)
  • Temporal arithmetic prompt — step-by-step date math before answering
  • Count cross-validation — salient_counts verified against EVENT nodes, low-confidence flagged
  • Behavioral preference synthesis — Reflector synthesizes PREFERENCE nodes from repeated patterns

In progress (v0.4)

  • Preference inference — targeted improvement on single-session-preference (currently 46.7%)
  • Graph topology demo — demo_topology.py showing provenance / evolution queries flat systems can't answer
  • S-split benchmark — longer sessions (~115K tokens), better test of spreading activation value
  • Production hardening — concurrent write safety, adversarial memory injection defense

Research basis

Dory draws from:

  • MemGPT: Towards LLMs as Operating Systems — two-tier memory architecture

  • Zep: A Temporal Knowledge Graph Architecture — bi-temporal provenance

  • MAGMA: Multi-Graph based Agentic Memory — multi-graph retrieval

  • Mastra Observational Memory — cacheable prefix architecture (Python port)

  • LongMemEval (ICLR 2025) — the benchmark we care about. Published scores: Mem0 68.4%, Zep 71.2%, Mastra 94.87%¹.

    Version  Extract  Answer  Questions        Score  Notes
    v0.1     Haiku    Haiku   500 (full)       54.4%  Baseline
    v0.1     Sonnet   Sonnet  500 (full)       66.8%
    v0.3     Haiku    Haiku   40 (spot check)  67.5%  Episodic hybrid, spot check
    v0.3     Sonnet   Sonnet  500 (full)       79.8%  Episodic hybrid, full run

    Category breakdown (v0.3 Sonnet, 500q):

    Category                   v0.1 Sonnet  v0.3 Sonnet  Δ
    temporal-reasoning         46.6%        75.9%        +29.3pp
    knowledge-update           75.6%        84.6%        +9.0pp
    multi-session              70.7%        80.5%        +9.8pp
    single-session-assistant   82.1%        87.5%        +5.4pp
    single-session-user        85.7%        88.6%        +2.9pp
    single-session-preference  43.3%        46.7%        +3.3pp
    Overall                    66.8%        79.8%        +13.0pp

    ¹ Mastra uses GPT-4o-mini (TypeScript); Dory uses Claude (Python). Architecturally different stacks — not directly comparable. See the ablation study for component attribution.

    Disclaimer: LongMemEval oracle split uses pre-filtered context (~15K tokens per question). Production performance with live, noisy, unfiltered conversations will differ.

  • Collins & Loftus (1975) — spreading activation in semantic memory

  • Hebb (1949) — neurons that fire together wire together

  • Hopfield (1982) — Neural networks and physical systems with emergent collective computational abilities — statistical mechanics of associative memory; energy landscape formulation underlying spreading activation (Nobel Prize in Physics, 2024)


License

Apache 2.0 — see LICENSE.


Named after Dory from Finding Nemo, because your AI agent right now is Dory. This fixes it.
