deep-memory

Open-source persistent memory layer for AI agents: Chinese-first, cross-session, forgetting-aware, conflict-aware.

These details have not been verified by PyPI

Project description

deep-memory

It remembers the thing you approved. Everything else has to earn the right.

A local memory notebook for agents that forget too much, remember too much, and rarely ask permission.

deep-memory is the memory layer I wanted while wiring real agents together: boring SQLite on this machine, explicit durable facts and reusable procedures, scoped recall, and a delete button that actually means delete. No hidden cloud state. No transcript hoarding. No mysterious "memory" you have to trust because the UI says so.

English · 简体中文

Python License Status

75/75 Chinese retrieval evals · 20/20 bilingual memory tasks · local SQLite · explicit writes · scoped recall

Quickstart · Agent install guide · Talk to your agent · Connect your agent · Benchmarks & evals · Safety & privacy

You know the feeling. The agent confidently forgets the rule you gave it yesterday. Then, somehow, it remembers a half-wrong preference from three experiments ago and drags it into a new repo.

deep-memory is for people who like agents, but do not want agent memory to become folklore.

Choose the path that fits what you need

If you are...	Start here	What you get
An agent operator who wants the fastest install	Quickstart	A machine-local database, one test memory, and a successful retrieval
An AI agent installing this for a user	Talk to your agent	A direct task prompt plus a verification checklist
Connecting Claude Code, Hermes, Codex, or OpenCode	Connect your agent	MCP or wrapper setup against one shared local DB
Evaluating whether the claims are real	Evidence, not magic	Checked-in evals, benchmark baselines, and reproduction commands
Checking the safety boundary before rollout	Safety boundary	Explicit write rules, scope boundaries, and destructive controls
Inspecting how the system is built	Architecture	Mechanism, storage model, retrieval path, and extension surface

Before / after

Without a shared memory layer, every agent has the same bad habit: it either forgets the thing you explained yesterday, or it remembers something somewhere you cannot inspect. You repeat preferences, repo conventions, safety rules, and all the tiny "please don't do that again" corrections. Then next week you still have no idea what stuck.

With deep-memory, a useful convention becomes a scoped, reviewable record:

project:deep-memory  procedural  "Run uv run pytest -q before review"
user:ben             semantic    "User prefers concise answers with English technical terms"

Claude Code, Codex, OpenCode, and Hermes can pull the same bounded context before work. You can inspect it in the CLI or local WebUI, edit it, export it, soft-delete it, or hard-delete it when a memory should stop steering future behavior.

Why this exists

I do not want agents to become more "personalized" by quietly accumulating invisible state. That is just a new place for bugs to hide.

I want memory to behave like infrastructure: local by default, inspectable when something feels off, scoped narrowly enough to avoid leaking context between projects, and testable enough that "supports Chinese" means an eval passed, not a badge on a README.

Most agent memory fails in one of two ways: it forgets everything useful between sessions, or it remembers too much in a place the user cannot inspect. Both are bad substrates for serious work.

deep-memory is built around a narrower mechanism:

Cross-agent continuity. One shared memory layer for Claude Code, Codex, OpenCode, and Hermes, so useful conventions do not have to be re-taught from scratch.
Inspectable by default. Read, edit, export, soft-delete, hard-delete, and audit records through the CLI, Python SDK, or local WebUI.
Machine-local governance. One local SQLite store can be shared across agents, while scope and scope_id keep records bounded to global, user, tenant, workspace, or project namespaces.
Chinese retrieval as a first-class path. FTS5 plus local Chinese/English token fallback, optional jieba, and checked-in Chinese retrieval fixtures make the claim measurable instead of decorative.
Memory to skill candidate export. Procedural memory can be exported as reviewable skill candidates, so successful workflows become auditable artifacts instead of silently turning into behavior rules.
Regression-tested retrieval. Checked-in evals cover Chinese retrieval, bilingual memory/no-memory tasks, and the core CLI/SDK behavior.

Persistent agent memory is powerful precisely because it changes future behavior. deep-memory keeps the mechanism narrow: store durable facts and reusable procedures, keep them local and inspectable, retrieve only relevant context, and make deletion and policy boundaries explicit.

How it works

Before an agent relies on memory, it walks a short governance path:

1. Is this durable?        → no: do not store it
2. Is it safe to retain?   → secrets / raw private transcripts / temporary status are denied
3. What scope is narrowest?→ project, workspace, user, tenant, or global
4. Can a human inspect it? → CLI, Python SDK, local WebUI, export, audit
5. Is it still useful?     → edit, deprecate, resolve conflict, or delete
6. Is it procedural?       → export as a reviewable skill candidate, not an auto-installed rule

That control surface matters because a useful memory layer needs more than storage:

Inspectability. Humans should be able to see what an agent will carry forward.
Deletion. Wrong, stale, private, or unsafe records must be removable, not merely hidden by ranking.
Scoping. Machine-level memory should stay relevant through fixed scopes plus explicit namespace IDs.
Regression tests. Retrieval quality and safety boundaries should be checked with fixtures, not asserted in copy.

More than an MCP server

MCP is a connection protocol, not the product boundary. deep-memory-mcp is one adapter into deep-memory; it is not the whole system.

The product shape is deliberately broader and still machine-local, scoped, and inspectable:

Memory substrate: a local SQLite store with durable semantic, episodic, and procedural records, fixed scope, and explicit scope_id namespaces.
CLI and Python SDK: direct ways to initialize, write, search, audit, export, delete, and embed memory behavior without requiring MCP.
MCP server: a protocol bridge for MCP-native agents that should point at the same local database, not a separate memory product.
WebUI: a local inspection and editing surface for records that should not remain hidden inside agent context.
Adapters and wrappers: agent-specific integration paths for tools that do not yet speak MCP cleanly.
Skill layer: procedural memories can become reviewable skill candidates after evidence and review, so repeated workflows can be promoted into installable skills without silently changing future behavior.

The last layer is intentionally review-first. deep-memory can export candidate skill markdown from verified procedural memory, but it should not auto-install skills into an agent profile. Automatic installation would turn memory into hidden behavioral drift: the agent would start acting differently because a stored procedure crossed an invisible boundary. The safe loop is evidence → candidate → reviewer gate → explicit installation through the normal skill-management path.

Screenshots

deep-memory WebUI memory inspector screenshot

Architecture

deep-memory architecture

Quickstart

If you want the fastest path, ask your coding agent:

Install deep-memory for this machine, set it up as a shared memory layer across my agent tools, and verify that you can write and retrieve a durable memory.

Agent checklist:

Read docs/AGENT_INSTALL_GUIDE.md.
Inspect the machine-readable install contract at docs/agent-install.json.
Install dependencies and initialize a machine-local memory store.
Connect your agent through MCP or a wrapper.
Verify that one durable memory can be written and retrieved.
Report which scopes you configured (for example: global, workspace, or project).

Prerequisites

Python 3.10–3.12 (requires-python = >=3.10,<3.13 in pyproject.toml)
uv for environment sync and command execution
Package/release maintainers: see docs/PACKAGING.md for the wheel, uv tool, GitHub Release, and PyPI readiness checklist.
Optional extras depending on what you want:
- dev: pytest + ruff for local verification
- mcp: the deep-memory-mcp server for MCP-native agent integrations
- retrieval: optional jieba tokenizer for higher-fidelity Chinese segmentation

Install uv if you do not already have it:

curl -LsSf https://astral.sh/uv/install.sh | sh

Zero-config first run

If you just want to prove the mechanism works before wiring any agent, this path gives you one local DB, one durable memory, and one retrieval roundtrip:

uv sync --extra dev --extra mcp
uv run deep-memory verify-install ~/.deep-memory/deep-memory.db --json
uv run deep-memory add ~/.deep-memory/deep-memory.db \
  "User wants agents to use deep-memory as shared persistent memory" \
  --kind semantic \
  --scope project \
  --scope-id deep-memory \
  --importance 0.8
uv run deep-memory search ~/.deep-memory/deep-memory.db "shared persistent memory" \
  --scope project \
  --scope-id deep-memory

Expected CLI shape:

$ uv run deep-memory search ~/.deep-memory/deep-memory.db "shared persistent memory" --scope project --scope-id deep-memory
score=...  scope=project  scope_id=deep-memory  kind=semantic  content=User wants agents to use deep-memory as shared persistent memory

This is the core loop: install one machine-local memory store, let agents share it, and keep records bounded with a fixed scope plus a custom scope_id.

If you are an AI agent installing this for a user, follow docs/AGENT_INSTALL_GUIDE.md — it covers MCP setup, wrappers, safe memory-write rules, and how to ask the user before starring the repo.

Talk to your agent

If you prefer a direct handoff, copy one of these prompts.

Fastest installation prompt

Install deep-memory for this machine, connect it to my agent tools through MCP or a wrapper, and verify that you can write and retrieve one durable memory. Tell me which scope layout you chose and why.

Shared-agent rollout prompt

Set up deep-memory as a shared machine-local memory layer for Claude Code, Codex, OpenCode, and Hermes. Use the same SQLite database for every tool, keep memory writes explicit, and show me the exact retrieval test you ran.

Safety-first evaluation prompt

Evaluate whether deep-memory fits my workflow. Check the safety boundary, scoping model, deletion path, and benchmark evidence before you install anything, then recommend a rollout plan.

Project-scope memory prompt

Connect deep-memory to this repo and keep retrieval bounded to project scope. Before work, search for this repository's conventions; after verified success, write back only durable project-specific facts or procedures and show me the exact records you added.

Procedural-memory-to-skill prompt

Use deep-memory to capture one successful workflow from this task as procedural memory, then export it as a reviewable skill candidate instead of auto-installing it. Show me the exported artifact and explain why it should stay review-first.

Evidence, not magic

These checks are intentionally modest. They are internal evals and regressions, not a claim that memory is solved.

Evaluation	Current checked-in result	Reproduce
Chinese retrieval v1	55/55 with the default local backend; 55/55 with optional `jieba`; earlier plain SQLite FTS baseline was 24/55	`uv run python evals/chinese_retrieval_eval.py --data evals/data/zh_memory_retrieval.jsonl`
Chinese retrieval v2	20/20 harder multi-memory cases with distractors; local top-1 accuracy 1.0 and MRR 1.0 in this checked-in baseline	`uv run python evals/chinese_retrieval_eval.py --data evals/data/zh_memory_retrieval_v2.jsonl --json`
Memory benchmark v0	20 bilingual tasks; no-memory baseline 0/20; `deep-memory` should pass at least 16/20 in tests and usually 20/20 with the default retrieval limit	`uv run python benchmarks/memory_benchmark.py`
Test suite	Core behavior, policy, import/export, CLI paths, and regressions are covered by pytest and CI	`uv run pytest -q`

Details: docs/CHINESE_RETRIEVAL_EVAL.md, docs/MEMORY_BENCHMARK.md.

Connect your agent

Use MCP when your agent supports it. Use a wrapper when it does not. Either way, point every tool at the same machine-local database, then rely on scope to keep records relevant:

~/.deep-memory/deep-memory.db

For custom MCP clients, generate a reviewable machine-readable launch payload:

deep-memory mcp-config --agent generic --db ~/.deep-memory/deep-memory.db --json

For agents that prefer a full install contract, read docs/agent-install.json. It declares the install mode, default DB, verify command, MCP command/args, safe-write policy, scope policy, and success report schema.

Agent	Integration path	Config file / touchpoint	Difficulty
Claude Code	MCP	`CLAUDE.md` + Claude MCP config	Easy
Hermes	MCP	`~/.hermes/config.yaml`	Easy
Codex / OpenCode / OpenClaw-style tools	Wrapper first, MCP later	task wrapper / launch script	Medium

Claude Code setup

deep-memory mcp-config --agent claude --db ~/.deep-memory/deep-memory.db

This prints the reviewable command to run, for example:

claude mcp add deep-memory -- deep-memory-mcp --db ~/.deep-memory/deep-memory.db

Add this to CLAUDE.md so the policy is explicit:

Before large tasks, search deep-memory for relevant project conventions.
After verified success, add only durable facts or reusable procedures.
Never store secrets, raw credentials, or temporary issue status.

Hermes setup

deep-memory mcp-config --agent hermes --db ~/.deep-memory/deep-memory.db

This prints a reviewable config.yaml snippet, for example:

mcp_servers:
  deep_memory:
    command: "deep-memory-mcp"
    args: ["--db", "~/.deep-memory/deep-memory.db"]
    timeout: 30

Hermes should then expose tools such as mcp_deep_memory_add, mcp_deep_memory_search, and mcp_deep_memory_stats.

Hermes can also import explicit facts JSONL:

cat > /tmp/hermes-session.jsonl <<'JSONL'
{"session_id":"s_demo","facts":[{"content":"User prefers concise answers with English technical terms","kind":"semantic","importance":0.9}]}
{"session_id":"s_demo","facts":[{"content":"Successful workflows should become reviewable skill candidates","kind":"procedural","confidence":0.8}]}
JSONL

uv run deep-memory hermes-import ~/.deep-memory/deep-memory.db /tmp/hermes-session.jsonl

Codex, OpenCode, and OpenClaw-style wrapper setup

Until MCP is wired in, use a wrapper. Search before the task, write only verified facts after:

MEMORY_DB=~/.deep-memory/deep-memory.db
uv run deep-memory search "$MEMORY_DB" "this task's relevant conventions" \
  --scope project \
  --scope-id deep-memory
# pass the result into the agent as a short "relevant memory" block
# ...run the agent...
uv run deep-memory add "$MEMORY_DB" \
  "Workflow: for this repo, run uv run pytest -q and uv run ruff check . before review" \
  --kind procedural \
  --scope project \
  --scope-id deep-memory \
  --importance 0.8 \
  --source codex:manual

Full adapter references

For the full adapter surface — integration points, read/write paths, permissions, risks — see docs/ADAPTERS.md and the per-agent commands in docs/AGENT_QUICKSTART_MATRIX.md.

Memory scopes

deep-memory is machine-local by default, but records can still be bounded explicitly:

Scope	Primary use	Typical content	Cross-project?
`global`	Long-lived facts that should follow the whole machine	durable user preferences, stable conventions, machine-level policy	Yes
`user`	Per-user partitioning on shared hosts	one person's preferences, role, language, recurring workflow habits	Sometimes
`workspace`	Shared context across related repos or folders	adjacent project notes, shared build/test conventions, multi-repo context	Sometimes
`project`	Repo-specific memory	repository conventions, local architecture facts, review checklists	No
`tenant`	Team / environment isolation	org lane separation, staging vs production boundaries, multi-tenant execution state	Depends on tenant design

The database is shared; scope is the fixed governance layer (global, user, tenant, workspace, or project) and scope_id is the custom namespace inside that layer, such as deep-memory, repo-a, or ben. Start with the narrowest scope that preserves the behavior you want, then widen only when the memory should truly travel across projects or agents.

Inspect memory

uv run deep-memory webui ~/.deep-memory/deep-memory.db --host 127.0.0.1 --port 8765
# open http://127.0.0.1:8765

deep-memory webui ... is the supported launch path. deep-memory-webui is not the current console script or launch contract.

The WebUI can list, search, edit, and soft-delete records. It binds to 127.0.0.1 by default, now serves /favicon.svg and /favicon.ico, and uses the same project icon in the browser tab. If port 8765 is already occupied, choose another free port with --port, for example --port 8876.

Export and audit:

uv run deep-memory export ~/.deep-memory/deep-memory.db                      # active records only
uv run deep-memory export ~/.deep-memory/deep-memory.db --include-deprecated # audit / backup
uv run deep-memory hard-delete ~/.deep-memory/deep-memory.db <memory-id>     # physically remove one record

Python API

from pathlib import Path
from deep_memory import DeepMemory

mem = DeepMemory(Path("~/.deep-memory/deep-memory.db").expanduser())
mem.add(
    "User prefers concise answers with English technical terms",
    kind="semantic",
    importance=0.9,
    scope="user",
    scope_id="ben",
)
mem.add(
    "Project convention: use uv for tests",
    kind="procedural",
    importance=0.8,
    scope="project",
    scope_id="deep-memory",
)

for result in mem.search(
    "how should this repo be tested?",
    scope="project",
    scope_id="deep-memory",
    limit=3,
):
    print(result.score, result.record.kind, result.record.content)

What works today

Area	Status	Notes
Local persistence	Implemented	Machine-local SQLite DB controlled by the user, with fixed global/user/tenant/workspace/project scopes and custom scope IDs.
Search	Implemented	FTS5 plus local Chinese/English token fallback.
Optional Chinese tokenizer	Implemented	`jieba` backend via `uv sync --extra retrieval`.
Metadata	Implemented	`kind`, `importance`, `confidence`, `source`, timestamps, conflict states, scope, and decay.
Conflict handling	Implemented	Candidate, resolved, superseded, deprecated.
Python SDK + CLI	Implemented	`add`, `search`, `stats`, `conflicts`, `resolve-conflict`, `export`, `hard-delete`, `hermes-import`, `webui`.
MCP server	Implemented	Stdio tools for `add`, `search`, `stats`, and conflict helpers.
Hermes import	Implemented	Explicit session facts JSONL to `deep-memory` records.
Local WebUI MVP	Implemented	Inspect, search, edit, soft-delete, and favicon-backed browser identity for memory records.
Memory to skill candidate	Implemented	Exports procedural memories as reviewable skill markdown; no auto-install.
Codex wrapper MVP	Implemented	`deep-memory codex-run` injects bounded context and imports only explicit `--facts-out` JSONL after success.
Native adapters for every agent	Spec / prototype	Use MCP or wrapper first. See `docs/ADAPTERS.md`.
Vector retrieval / hosted sync	Roadmap	Later, if evals and privacy boundaries justify it.

Architecture

The core system is small on purpose:

agents or developers produce explicit facts, procedures, and durable conventions;
SDK, CLI, MCP, or wrapper paths validate and write records;
machine-local SQLite + FTS5 stores searchable memory with metadata and scope;
future agents retrieve a bounded context block before work;
humans inspect, edit, export, delete, evaluate, or promote procedural records into skill candidates.

SQLite is boring on purpose. It is easy to install, inspect, test, back up, and replace later. A single machine-local store keeps agents interoperable; scopes keep retrieval bounded. Vector retrieval stays on the roadmap with schema placeholders and an opt-in migration path; see docs/VECTOR_ROADMAP.md.

Safety boundary

Persistent memory changes future behavior. Keep the default narrow:

store explicit durable facts, not raw transcripts;
use machine-local SQLite by default;
keep scope as the fixed layer and scope_id as the custom namespace so global memories are intentional and project/workspace memories stay bounded;
retrieve a small relevant context block;
retrieval telemetry is local-only and can be disabled with DEEP_MEMORY_TELEMETRY=off — see docs/SAFETY_AND_PRIVACY.md;
never store secrets, private keys, auth cookies, raw credentials, raw private transcripts, or temporary task status;
write procedural memories only after tests, review, or user confirmation;
auto-backup destructive operations with a configurable 7-day TTL;
export skill candidates for review instead of auto-installing them.

Read docs/MEMORY_POLICY.md for the allow / deny / requires-confirmation write policy, and docs/SAFETY_AND_PRIVACY.md before adding automatic writes or shared team memory.

Contributing

This is a controlled preview lane, not a broad launch claim. Contributions should make the memory layer more inspectable, reproducible, scoped, or easier to run.

New here? Start with a good first issue, comment to claim one, run its suggested commands, and open a small PR with the evidence.

Good starting paths:

good first issue: small fixtures, docs fixes, CLI output polish, and reproducible failure cases;
adapter: smoke transcripts and wrapper/MCP compatibility notes for Claude Code, Codex, OpenCode, OpenClaw-style tools, and Hermes;
eval: Chinese retrieval, privacy-boundary, memory/no-memory, and Memory × Skill regression cases;
governance: write policy, consent, export/delete, and conflict-lifecycle checks;
docs: quickstarts, troubleshooting, glossary, and contribution paths.

Concrete contribution paths

Adding a new agent adapter. Update the agent-facing command matrix in docs/AGENT_QUICKSTART_MATRIX.md, document the integration surface and trust boundary in docs/ADAPTERS.md, add the implementation or wrapper entrypoint under src/deep_memory/, and cover the path with at least one CLI or integration-oriented test under tests/.
Adding a new eval fixture. Add the fixture data under evals/data/, wire the new case into the relevant eval or benchmark runner under evals/ or benchmarks/, document what it is measuring in docs/CHINESE_RETRIEVAL_EVAL.md or docs/MEMORY_BENCHMARK.md, and add a regression assertion in tests/ if the behavior should stay stable in CI.

More contributing references

Start with CONTRIBUTING.md, docs/COMMUNITY.md, and docs/NEXT_PHASE_BACKLOG.md.

License

deep-memory gives your agents a local memory layer you can inspect and govern.

If this project is useful in your workflow, please consider starring the repo and opening issues or discussions with real deployment feedback.

Contact and feedback:

GitHub Issues: https://github.com/benbenlijie/deep-memory/issues
GitHub Discussions: https://github.com/benbenlijie/deep-memory/discussions

MIT

Project details

These details have not been verified by PyPI

Release history Release notifications | RSS feed

This version

0.1.0

Jul 2, 2026

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

deep_memory-0.1.0.tar.gz (60.1 kB view details)

Uploaded Jul 2, 2026 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

deep_memory-0.1.0-py3-none-any.whl (67.1 kB view details)

Uploaded Jul 2, 2026 Python 3

File details

Details for the file deep_memory-0.1.0.tar.gz.

File metadata

Download URL: deep_memory-0.1.0.tar.gz
Upload date: Jul 2, 2026
Size: 60.1 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.13

File hashes

Hashes for deep_memory-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`6a83683a8a773b9ea8f94220da09842749eaf2d45ae3f7bd413745167d08f078`
MD5	`f7ddcfc3b4329e6e2aea90875d09972a`
BLAKE2b-256	`ff9fd9adf735065c9c399ac3359cc66f637f10592145b9e437d225ea0101e7bf`

See more details on using hashes here.

File details

Details for the file deep_memory-0.1.0-py3-none-any.whl.

File metadata

Download URL: deep_memory-0.1.0-py3-none-any.whl
Upload date: Jul 2, 2026
Size: 67.1 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.13

File hashes

Hashes for deep_memory-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`906d95611ea1370d5fa4af3c483e56e4378786eb86aba73011b569a2b6a0945b`
MD5	`ea5b68cf26b282f2d9daf02040959c0c`
BLAKE2b-256	`f7b1e40beeafb56e1a450010cb1c7442a2752cc5969b75df1c54d04d75ad1aa9`

See more details on using hashes here.

deep-memory 0.1.0

Navigation

Verified details

Maintainers

Unverified details

Meta

Classifiers

Project description

deep-memory

Choose the path that fits what you need

Before / after

Why this exists

How it works

More than an MCP server

Screenshots

Architecture

Quickstart

Prerequisites

Zero-config first run

Talk to your agent

Fastest installation prompt

Shared-agent rollout prompt

Safety-first evaluation prompt

Project-scope memory prompt

Procedural-memory-to-skill prompt

Evidence, not magic

Connect your agent

Memory scopes

Inspect memory

Python API

What works today

Architecture

Safety boundary

Contributing

Concrete contribution paths

License

Project details

Verified details

Maintainers

Unverified details

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

File details

File metadata

File hashes