Memory layer for AI agents — biologically-inspired forgetting, multi-agent trust, and plug-and-play integrations
Engram
The Personal Memory Kernel for AI Agents
A user-owned memory store that any agent can plug into to become instantly personalized.
Agents read via scoped retrieval. Writes land in staging until you approve.
Quick Start · Why Engram · Architecture · Integrations · API & SDK · Changelog
100% free, forever. No Pro tier, no usage limits, no license keys. Bring your own API key (Gemini, OpenAI, or Ollama). Everything runs locally by default.
Why Engram
Every AI agent you use starts with amnesia. Your coding assistant forgets your preferences between sessions. Your planning agent has no idea what your research agent discovered yesterday. You end up re-explaining context that should already be known.
Engram fixes this. It's a Personal Memory Kernel (PMK) — a single memory store that sits between you and all your agents. Any agent can plug in via MCP or REST to become instantly personalized, without you having to repeat yourself.
But unlike "store everything forever" approaches, Engram treats agents as untrusted writers. Writes land in staging. You control what sticks. And memories that stop being useful fade away naturally — just like biological memory.
| Capability | Other Memory Layers | Engram |
|---|---|---|
| Bio-inspired forgetting | No | Ebbinghaus decay curve |
| Untrusted agent writes | Store directly | Staging + verification + conflict stash |
| Episodic narrative memory | No | CAST scenes (time/place/topic) |
| Multi-modal encoding | Rare | 5 retrieval paths (EchoMem) |
| Cross-agent memory sharing | Per-agent silos | Scoped retrieval with masking |
| Knowledge graph | Sometimes | Entity extraction + linking |
| Reference-aware decay | No | If other agents use it, don't delete it |
| Hybrid search | Vector only | Semantic + keyword + episodic |
| Storage efficiency | Store everything | ~45% less |
| MCP + REST | One or the other | Both, plug-and-play |
| Local-first | Cloud-required | 127.0.0.1:8100 by default |
Quick Start
pip install -e ".[all]" # 1. Install
export GEMINI_API_KEY="your-key" # 2. Set one API key (or OPENAI_API_KEY, or OLLAMA_HOST)
engram install # 3. Auto-configure Claude Code, Cursor, Codex
Restart your agent. Done — it now has persistent memory across sessions.
Or with Docker:
docker compose up -d # API at http://localhost:8100
Architecture
Engram is a Personal Memory Kernel — not just a vector store with an API. It has opinions about how memory should work:
- Agents are untrusted writers. Every write is a proposal that lands in staging. Trusted agents can auto-merge; untrusted ones wait for approval.
- Memory has a lifecycle. New memories start in short-term (SML), get promoted to long-term (LML) through repeated access, and fade away through Ebbinghaus decay if unused.
- Scoping is mandatory. Every memory is scoped by user. Agents see only what they're allowed to — everything else gets the "all but mask" treatment (structure visible, details redacted).
┌─────────────────────────────────────────────────────────────────┐
│                       Agent Orchestrator                        │
│              (Claude Code / Cursor / Codex / Custom)            │
└─────────────────────────┬───────────────────────────────────────┘
                          │
              ┌───────────┴───────────┐
              │                       │
              ▼                       ▼
         ┌──────────┐            ┌──────────┐
         │   MCP    │            │   REST   │
         │  Server  │            │   API    │
         └────┬─────┘            └────┬─────┘
              └───────────┬───────────┘
                          ▼
         ┌────────────────────────────────────┐
         │           Policy Gateway           │
         │    Scopes · Masking · Quotas ·     │
         │  Capability Tokens · Trust Score   │
         └────────────────┬───────────────────┘
                          │
               ┌──────────┴──────────┐
               ▼                     ▼
     ┌──────────────────┐  ┌──────────────────┐
     │ Retrieval Engine │  │Ingestion Pipeline│
     │ ┌─────────────┐  │  │                  │
     │ │Semantic     │  │  │  Text → Views    │
     │ │(hybrid+graph│  │  │  Views → Scenes  │
     │ │+categories) │  │  │  Scenes → LML    │
     │ ├─────────────┤  │  │                  │
     │ │Episodic     │  │  └────────┬─────────┘
     │ │(CAST scenes)│  │           │
     │ └─────────────┘  │           ▼
     │                  │  ┌──────────────────┐
     │ Intersection     │  │Write Verification│
     │ Promotion:       │  │                  │
     │ match in both →  │  │ Invariant checks │
     │ boost score      │  │ Conflict → stash │
     └──────────────────┘  │ Trust scoring    │
                           └────────┬─────────┘
                                    │
                ┌───────────────────┼───────────────────┐
                ▼                   ▼                   ▼
      ┌──────────────────┐  ┌──────────────┐  ┌──────────────────┐
      │  Staging (SML)   │  │  Long-Term   │  │     Indexes      │
      │ Proposals+Diffs  │  │ Store (LML)  │  │  Vector + Graph  │
      │ Conflict Stash   │  │  Canonical   │  │   + Episodic     │
      └──────────────────┘  └──────────────┘  └──────────────────┘
                │                   │                   │
                └───────────────────┼───────────────────┘
                                    ▼
                          ┌──────────────────┐
                          │    FadeMem GC    │
                          │ Ref-aware decay  │
                          │ If other agents  │
                          │  use it → keep   │
                          └──────────────────┘
The Memory Stack
Engram combines four bio-inspired memory systems, each handling a different aspect of how humans actually remember:
FadeMem — Decay & Consolidation
Memories fade based on time and access patterns, following the Ebbinghaus forgetting curve. Frequently accessed memories get promoted from short-term (SML) to long-term (LML). Unused memories weaken and eventually get forgotten. Reference-aware: if other agents still reference a memory, it won't be garbage collected — even if the original agent stopped using it.
New Memory → Short-term (SML)
                 │
                 │ Accessed frequently?
                 ▼
            ┌──────────┐
   No  ←    │  Decay   │    →  Yes
            └──────────┘
    │                         │
    ▼                         ▼
Forgotten           Promoted to Long-term (LML)
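The lifecycle above can be sketched as a simple exponential-decay model. This is an illustrative sketch, not Engram's internal API: the class and function names are hypothetical, though the parameter defaults mirror the FadeMemConfig values shown in the Configuration section.

```python
import math
from dataclasses import dataclass, field

@dataclass
class MemoryItem:
    content: str
    owner_agent: str = "self"
    strength: float = 1.0
    access_count: int = 0
    layer: str = "SML"                               # short-term by default
    referenced_by: set = field(default_factory=set)  # agents citing this memory

def apply_decay(item: MemoryItem, hours_elapsed: float,
                sml_decay_rate: float = 0.15, lml_decay_rate: float = 0.02,
                promotion_threshold: int = 3,
                forgetting_threshold: float = 0.1) -> str:
    """Ebbinghaus-style exponential decay with reference-aware GC.
    Returns the item's fate: 'kept', 'promoted', or 'forgotten'."""
    rate = sml_decay_rate if item.layer == "SML" else lml_decay_rate
    item.strength *= math.exp(-rate * hours_elapsed)
    # Consolidation: frequently accessed short-term memories move to long-term.
    if item.layer == "SML" and item.access_count >= promotion_threshold:
        item.layer = "LML"
        return "promoted"
    # Reference-aware: never garbage-collect a memory other agents still cite.
    if item.strength < forgetting_threshold:
        if item.referenced_by - {item.owner_agent}:
            return "kept"
        return "forgotten"
    return "kept"
```

Note how long-term memories decay almost an order of magnitude slower than short-term ones, and a memory another agent references survives even after its strength drops below the forgetting threshold.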
EchoMem — Multi-Modal Encoding
Each memory is encoded through multiple retrieval paths — keywords, paraphrases, implications, and question forms. This creates 5x the retrieval surface area compared to single-embedding approaches. Important memories get deeper processing (1.6x strength multiplier).
Input: "User prefers TypeScript over JavaScript"
↓
raw: "User prefers TypeScript over JavaScript"
paraphrase: "TypeScript is the user's preferred language"
keywords: ["typescript", "javascript", "preference"]
implications: ["values type safety", "modern tooling"]
question: "What language does the user prefer?"
CategoryMem — Dynamic Organization
Categories aren't predefined — they emerge from content and evolve over time. As new memories arrive, the category tree grows, splits, and merges. Categories themselves decay when unused, keeping the taxonomy clean.
CAST Scenes — Episodic Narrative Memory
Inspired by the Contextual Associative Scene Theory of memory, Engram clusters interactions into scenes defined by three dimensions: time, place, and topic. Each scene has characters, a synopsis, and links to the semantic memories extracted from it.
Scene: "Engram v2 architecture session"
Time: 2026-02-09 12:00–12:25
Place: repo:Engram (digital)
Characters: [self, collaborator]
Synopsis: "Designed staged writes and scoped retrieval..."
Views: [view_1, view_2, view_3]
Memories: [mem_1, mem_2] ← semantic facts extracted
Key Flows
Read: Query → Context Packet
Agent calls search_memory or POST /v1/search
→ Policy Gateway enforces scope, quotas, masking
→ Dual retrieval: semantic index + episodic index (parallel)
→ Intersection promotion: results matching in both get boosted
→ Returns Context Packet (token-budgeted, with scene citations)
The dual retrieval approach reduces "similar but wrong time/place" errors. If a memory appears in both semantic search and the relevant episodic scene, it gets a confidence boost.
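The intersection-promotion step can be sketched as a merge of two scored result sets (the function name and the 1.5x boost factor are illustrative assumptions, not Engram internals):

```python
def intersection_promote(semantic_hits: dict[str, float],
                         episodic_hits: dict[str, float],
                         boost: float = 1.5) -> list[tuple[str, float]]:
    """Merge two result sets (memory_id -> score). Results found by BOTH
    retrieval paths get their combined score boosted, so memories that are
    semantically similar AND in the right scene outrank lookalikes."""
    merged: dict[str, float] = {}
    for mem_id in set(semantic_hits) | set(episodic_hits):
        score = semantic_hits.get(mem_id, 0.0) + episodic_hits.get(mem_id, 0.0)
        if mem_id in semantic_hits and mem_id in episodic_hits:
            score *= boost  # matched in both indexes: confidence boost
        merged[mem_id] = score
    return sorted(merged.items(), key=lambda kv: kv[1], reverse=True)
```

A memory with moderate scores in both indexes can outrank one with a single strong semantic match, which is exactly the "similar but wrong time/place" filter described above.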
Write: Agent Proposal → Staging
Agent calls propose_write or POST /v1/memories
→ Lands in Staging SML as a Proposal Commit
→ Provenance recorded (agent, time, scope, trust score)
→ Verification runs:
• Invariant contradiction check → stash if conflict
• Duplication detection
• PII risk detection → require manual approval if high
→ High-trust agents: auto-merge
→ Others: wait for user approval or daily digest
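The routing above can be summarized in one decision function. A minimal sketch under stated assumptions: the predicate parameters and return labels are illustrative, and only the 0.85 auto-merge threshold comes from the documented configuration.

```python
def verify_proposal(content: str, existing: list[str], trust: float,
                    contradicts=lambda new, old: False,
                    pii_risk=lambda text: False,
                    auto_merge_threshold: float = 0.85) -> str:
    """Route a staged write. Returns 'stash' (invariant conflict),
    'duplicate', 'manual_approval' (PII risk), 'auto_merge', or 'pending'."""
    if any(contradicts(content, old) for old in existing):
        return "stash"            # invariant contradiction -> conflict stash
    if content in existing:
        return "duplicate"        # duplication detection
    if pii_risk(content):
        return "manual_approval"  # high PII risk always needs a human
    if trust > auto_merge_threshold:
        return "auto_merge"       # high-trust agents merge immediately
    return "pending"              # everyone else waits for approval/digest
```

The ordering matters: conflicts and PII checks run before trust is consulted, so even a high-trust agent cannot auto-merge a contradictory or sensitive write.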
"All But Mask" Policy
When an agent queries data outside its scope, it sees structure but not details:
{
"type": "private_event",
"time": "2026-02-10T17:00:00Z",
"importance": "high",
"details": "[REDACTED]"
}
Agents can still operate (scheduling, planning) without seeing secrets.
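The masking step can be sketched as a pure function over a memory record (field names follow the JSON example above; the function itself is a hypothetical illustration):

```python
def all_but_mask(memory: dict, agent_scopes: set,
                 detail_fields=("details", "content")) -> dict:
    """Return a copy of a memory record with detail fields redacted when
    the memory's scope is outside the agent's allowed scopes. Structural
    fields (type, time, importance) stay visible so agents can plan."""
    if memory.get("scope") in agent_scopes:
        return dict(memory)  # in scope: full visibility
    masked = dict(memory)
    for f in detail_fields:
        if f in masked:
            masked[f] = "[REDACTED]"
    return masked
```

A scheduling agent scoped to "work" still sees that a high-importance event blocks 17:00, but never learns what the event is.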
Integrations
Engram is plug-and-play. Run engram install and it auto-configures everything:
Claude Code (MCP + Plugin)
engram install # Writes MCP config to ~/.claude.json
MCP tools give Claude reactive memory — it stores and retrieves when you ask.
The optional Claude Code plugin makes memory proactive — relevant context is injected automatically before Claude sees your message:
# Inside Claude Code:
/plugin install engram-memory --path ~/.engram/claude-plugin
What the plugin adds:
| Component | What it does |
|---|---|
| UserPromptSubmit hook | Before each reply, queries Engram and injects matching memories into context. Stdlib-only, no extra deps. Under 2s latency. |
| /engram:remember <text> | Save a fact or preference on the spot |
| /engram:search <query> | Search memories by topic |
| /engram:forget <id> | Delete a memory (confirms before removing) |
| /engram:status | Show memory-store stats at a glance |
| Skill | Standing instructions telling Claude when to save, search, and surface memories |
Without plugin — Claude reacts to explicit requests:
You: Remember that I prefer TypeScript
Claude: [calls remember tool] Done.
With plugin — memory is proactive and invisible:
--- Session A ---
You: /engram:remember I prefer TypeScript for all new projects
--- Session B (new conversation, no history) ---
You: What stack should I use for the new API?
[Hook injects "TypeScript preference" before Claude sees the message]
Claude: Based on your preferences, I'd recommend TypeScript...
Cursor
engram install writes MCP config to ~/.cursor/mcp.json. Restart Cursor to load.
OpenAI Codex
engram install writes MCP config to ~/.codex/config.toml. Restart Codex to load.
OpenClaw
engram install deploys the Engram skill to OpenClaw's skills directory.
Any Agent Runtime
Any tool-calling agent can connect via REST:
engram-api # Starts on http://127.0.0.1:8100
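For example, any Python runtime can hit the /v1/search endpoint with nothing but the standard library. The helper name and token value below are illustrative; the endpoint and payload shape follow the API examples later in this document.

```python
import json
import urllib.request

def build_search_request(query: str, user_id: str, token: str,
                         base: str = "http://127.0.0.1:8100") -> urllib.request.Request:
    """Build a POST /v1/search request. Send it with
    urllib.request.urlopen(req) once engram-api is running."""
    payload = json.dumps({"query": query, "user_id": user_id}).encode()
    return urllib.request.Request(
        f"{base}/v1/search",
        data=payload,
        headers={"Content-Type": "application/json",
                 "Authorization": f"Bearer {token}"},
        method="POST",
    )
```

Separating request construction from sending keeps the sketch testable offline; in a real agent you would obtain the bearer token from the /v1/sessions endpoint first.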
MCP Tools
Once configured, your agent has access to these tools:
| Tool | Description |
|---|---|
| add_memory | Store a new memory (lands in staging by default) |
| search_memory | Semantic + keyword + episodic search |
| get_all_memories | List all stored memories for a user |
| get_memory | Get a specific memory by ID |
| update_memory | Update memory content |
| delete_memory | Remove a memory |
| get_memory_stats | Storage statistics and health |
| apply_memory_decay | Run the forgetting algorithm |
| engram_context | Session-start digest — load top memories from prior sessions |
| remember | Quick-save a fact (no LLM extraction, stores directly) |
| propose_write | Create a staged write proposal (default safe path) |
| list_pending_commits | Inspect staged write queue |
| resolve_conflict | Resolve invariant conflicts (accept proposed or keep existing) |
| search_scenes / get_scene | Episodic CAST scene retrieval with masking policy |
API & SDK
REST API
engram-api # http://127.0.0.1:8100
# Interactive docs at /docs
# 1. Create a capability session token
curl -X POST http://localhost:8100/v1/sessions \
-H "Content-Type: application/json" \
-d '{
"user_id": "u123",
"agent_id": "planner",
"allowed_confidentiality_scopes": ["work", "personal"],
"capabilities": ["search", "propose_write", "read_scene"],
"ttl_minutes": 1440
}'
# 2. Propose a write (default: staging)
curl -X POST http://localhost:8100/v1/memories \
-H "Authorization: Bearer <TOKEN>" \
-H "Content-Type: application/json" \
-d '{"content": "User prefers dark mode", "user_id": "u123", "mode": "staging"}'
# 3. Search (returns context packet with scene citations)
curl -X POST http://localhost:8100/v1/search \
-H "Authorization: Bearer <TOKEN>" \
-H "Content-Type: application/json" \
-d '{"query": "UI preferences", "user_id": "u123"}'
# 4. Review staged commits
curl "http://localhost:8100/v1/staging/commits?user_id=u123&status=PENDING"
curl -X POST http://localhost:8100/v1/staging/commits/<id>/approve
# 5. Episodic scene search
curl -X POST http://localhost:8100/v1/scenes/search \
-H "Content-Type: application/json" \
-d '{"query": "architecture discussion", "user_id": "u123"}'
# 6. Namespace & trust management
curl -X POST http://localhost:8100/v1/namespaces \
-d '{"user_id": "u123", "namespace": "workbench"}'
curl "http://localhost:8100/v1/trust?user_id=u123&agent_id=planner"
# 7. Sleep-cycle maintenance
curl -X POST http://localhost:8100/v1/sleep/run \
-d '{"user_id": "u123", "apply_decay": true, "cleanup_stale_refs": true}'
Python SDK
from engram import Engram
memory = Engram()
# Add a memory
memory.add("User prefers Python over JavaScript", user_id="u123")
# Search with dual retrieval
results = memory.search("programming preferences", user_id="u123")
# Cross-agent knowledge sharing
memory.add(
"The API rate limit is 100 req/min",
user_id="team_alpha",
agent_id="researcher",
categories=["technical", "api"]
)
# Another agent finds it
results = memory.search("rate limits", user_id="team_alpha")
Full Memory interface:
from engram import Memory
memory = Memory()
# Lifecycle
memory.add(content, user_id, agent_id=None, categories=None, metadata=None)
memory.get(memory_id)
memory.update(memory_id, content)
memory.delete(memory_id)
# Search
memory.search(query, user_id, agent_id=None, limit=10, categories=None)
memory.get_all(user_id, agent_id=None, layer=None, limit=100)
# Memory management
memory.promote(memory_id) # SML → LML
memory.demote(memory_id) # LML → SML
memory.fuse(memory_ids) # Combine related memories
memory.decay(user_id=None) # Apply forgetting
memory.history(memory_id) # Access history
# Knowledge graph
memory.get_related_memories(memory_id) # Graph traversal
memory.get_memory_entities(memory_id) # Extracted entities
memory.get_entity_memories(entity_name) # All memories with entity
memory.get_memory_graph(memory_id) # Visualization data
# Categories
memory.get_category_tree()
memory.search_by_category(category_id)
memory.stats(user_id=None, agent_id=None)
Async support:
from engram.memory.async_memory import AsyncMemory
async with AsyncMemory() as memory:
await memory.add("User prefers Python", user_id="u1")
results = await memory.search("programming", user_id="u1")
CLI
engram install # Auto-configure all integrations
engram status # Version, config paths, DB stats
engram serve # Start REST API server
engram add "User prefers Python" # Add a memory
engram search "preferences" # Search
engram list --layer lml # List long-term memories
engram stats # Memory statistics
engram decay # Apply forgetting
engram categories # List categories
engram export -o memories.json # Export
engram import memories.json # Import (Engram or Mem0 format)
Configuration
# LLM & Embeddings (choose one)
export GEMINI_API_KEY="your-key" # Gemini (default)
export OPENAI_API_KEY="your-key" # OpenAI
export OLLAMA_HOST="http://localhost:11434" # Ollama (local, no key)
# v2 features (all enabled by default)
export ENGRAM_V2_POLICY_GATEWAY="true" # Token + scope enforcement
export ENGRAM_V2_STAGING_WRITES="true" # Writes land in staging
export ENGRAM_V2_DUAL_RETRIEVAL="true" # Semantic + episodic search
export ENGRAM_V2_REF_AWARE_DECAY="true" # Preserve referenced memories
export ENGRAM_V2_TRUST_AUTOMERGE="true" # Auto-approve for trusted agents
export ENGRAM_V2_AUTO_MERGE_TRUST_THRESHOLD="0.85" # Trust threshold for auto-merge
Python config:
from engram.configs.base import MemoryConfig, FadeMemConfig, EchoMemConfig, CategoryMemConfig
config = MemoryConfig(
fadem=FadeMemConfig(
enable_forgetting=True,
sml_decay_rate=0.15,
lml_decay_rate=0.02,
promotion_access_threshold=3,
forgetting_threshold=0.1,
),
echo=EchoMemConfig(
enable_echo=True,
auto_depth=True,
deep_multiplier=1.6,
),
category=CategoryMemConfig(
enable_categories=True,
auto_categorize=True,
enable_category_decay=True,
max_category_depth=3,
),
)
Multi-Agent Memory
Engram is designed for agent orchestrators. Every memory is scoped by user_id and optionally agent_id:
# Research agent stores knowledge
memory.add("OAuth 2.0 with JWT tokens",
user_id="project_123", agent_id="researcher")
# Implementation agent searches shared knowledge
results = memory.search("authentication", user_id="project_123")
# → Finds researcher's discovery
# Review agent adds findings
memory.add("Security review passed",
user_id="project_123", agent_id="reviewer")
Agent trust scoring determines write permissions:
- High-trust agents (>0.85): proposals auto-merge
- Medium-trust: queued for daily digest review
- Low-trust: require explicit approval
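These three tiers map directly to a threshold function. A minimal sketch: the 0.85 auto-merge threshold matches the documented ENGRAM_V2_AUTO_MERGE_TRUST_THRESHOLD default, but the 0.5 floor for the digest band is an assumed value for illustration only.

```python
def review_route(trust: float, auto_merge: float = 0.85,
                 digest_floor: float = 0.5) -> str:
    """Map an agent's trust score to a review path."""
    if trust > auto_merge:
        return "auto_merge"          # high-trust: proposals merge directly
    if trust >= digest_floor:
        return "daily_digest"        # medium-trust: batched for review
    return "explicit_approval"       # low-trust: each write needs sign-off
```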
Research
Engram is based on:
FadeMem: Biologically-Inspired Forgetting for Efficient Agent Memory (arXiv:2601.18642)
| Metric | Result |
|---|---|
| Storage Reduction | ~45% |
| Multi-hop Reasoning | +12% accuracy |
| Retrieval Precision | +8% on LTI-Bench |
Biological inspirations:
- Ebbinghaus Forgetting Curve → exponential decay
- Spaced Repetition → access boosts strength
- Sleep Consolidation → SML → LML promotion
- Production Effect → echo encoding
- Elaborative Encoding → deeper processing = stronger memory
Docker
# Quick start
docker compose up -d
# Or build manually
docker build -t engram .
docker run -p 8100:8100 -v engram-data:/data \
-e GEMINI_API_KEY="your-key" engram
Manual Integration Setup
Claude Code / Claude Desktop
Add to ~/.claude.json (CLI) or claude_desktop_config.json (Desktop):
{
"mcpServers": {
"engram-memory": {
"command": "python",
"args": ["-m", "engram.mcp_server"],
"env": {
"GEMINI_API_KEY": "your-api-key"
}
}
}
}
Cursor
Add to ~/.cursor/mcp.json:
{
"mcpServers": {
"engram-memory": {
"command": "python",
"args": ["-m", "engram.mcp_server"],
"env": {
"GEMINI_API_KEY": "your-api-key"
}
}
}
}
OpenAI Codex
Add to ~/.codex/config.toml:
[mcp_servers.engram-memory]
command = "python"
args = ["-m", "engram.mcp_server"]
[mcp_servers.engram-memory.env]
GEMINI_API_KEY = "your-api-key"
Troubleshooting
Claude Code doesn't see the memory tools
- Restart Claude Code after running engram install
- Check that ~/.claude.json has an mcpServers.engram-memory section
- Verify your API key: echo $GEMINI_API_KEY
The hook isn't injecting memories
- Check that engram-api is running: curl http://127.0.0.1:8100/health
- Verify the plugin is activated: run /plugin in Claude Code
- Check script permissions: ls -l ~/.engram/claude-plugin/engram-memory/hooks/prompt_context.py
API won't start (port in use)
- Check: lsof -i :8100
- Kill the process: kill <PID>
- Or use a different port: ENGRAM_API_PORT=8200 engram-api
Contributing
git clone https://github.com/Ashish-dwi99/Engram.git
cd Engram
pip install -e ".[dev]"
pytest tests/ -v
License
MIT License — see LICENSE for details.
Your agents forget everything between sessions. Engram fixes that.
GitHub · Issues · Changelog