trw-memory
AI agent memory engine — persistent memory for AI agents with hybrid search (BM25 + vectors), Q-learning scoring, Ebbinghaus decay curves, tiered storage, and knowledge graph. The standalone memory backend powering TRW Framework.
Part of TRW Framework
trw-memory is the standalone memory engine for TRW (The Real Work) — a methodology layer for AI-assisted development that turns stateless agents into self-improving systems through knowledge compounding. It works alongside trw-mcp, the MCP server that provides 24 tools built on this engine.
- trw-memory (this repo): Standalone AI agent memory engine with hybrid retrieval, scoring, and lifecycle
- trw-mcp: MCP server with 24 tools, 24 skills, 18 agents — uses trw-memory as its backend
What It Does
TRW-Memory is a standalone persistent memory engine for AI agents that gives coding agents searchable, long-lived knowledge storage. It stores learnings (patterns, gotchas, architecture decisions) in SQLite with optional YAML backup, and retrieves them using hybrid search that combines keyword matching (BM25) with dense vector similarity.
Designed as the storage backend for trw-mcp and TRW Framework, but usable independently by any AI agent framework that needs persistent memory with recall.
Features
- MemoryClient SDK -- High-level async Python client with store/recall/forget/search
- Hybrid Search (BM25 + vector) -- BM25 keyword matching + dense vector similarity via sqlite-vec, combined with Reciprocal Rank Fusion (RRF). Learn more
- Tiered Storage -- Hot/warm/cold tiers with automatic promotion/demotion based on access patterns and impact scores. Architecture details
- Semantic Deduplication -- Detects and merges near-duplicate learnings using cosine similarity (0.85 threshold)
- Knowledge Graph for AI -- Tag co-occurrence and similarity edges, BFS traversal, importance boost/decay, cross-validation propagation. Docs
- LLM Consolidation -- Episodic-to-semantic consolidation via complete-linkage clustering and LLM summarization
- Q-learning Memory Scoring -- Q-learning with EMA updates, Ebbinghaus forgetting curve applied at query time, Bayesian MACLA calibration
- Remote Sync -- Publish/fetch learnings across installations with vector clock conflict resolution and SSE live updates
- Security -- AES-256-GCM field encryption, PII detection/redaction, memory poisoning detection (z-score anomaly), RBAC, audit trail
- Agent Integration -- register_tools() for any agent framework, plus the @auto_recall decorator
- Framework Integrations -- LangChain memory, LlamaIndex reader/writer, CrewAI component, OpenAI-compatible adapter
- CLI -- Full command-line interface for store, recall, search, forget, consolidate, export/import
- REST API -- FastAPI server with CRUD, search, namespace management, and background jobs
- MCP Tools -- 6 tools for store, recall, search, consolidate, forget, and status
- Dual Storage Backends -- SQLite with FTS5 (primary) + YAML (backup) with one-time migration
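The semantic deduplication feature above can be illustrated with plain cosine similarity over embeddings. This is a minimal sketch using the 0.85 threshold from the feature list, not the library's actual merge logic (which lives in lifecycle/dedup.py); the function names here are hypothetical:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def is_near_duplicate(a: list[float], b: list[float], threshold: float = 0.85) -> bool:
    """Flag two entries as near-duplicates when their embeddings are close enough."""
    return cosine_similarity(a, b) >= threshold

# Toy 3-dim embeddings; real entries use 384-dim sentence-transformer vectors
print(is_near_duplicate([1.0, 0.2, 0.0], [0.9, 0.3, 0.1]))  # True: nearly parallel
print(is_near_duplicate([1.0, 0.0, 0.0], [0.0, 1.0, 0.0]))  # False: orthogonal
```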
Quick Start
# Install from PyPI
pip install trw-memory
# Or install from source
git clone https://github.com/wallter/trw-memory.git
cd trw-memory
python -m venv .venv && source .venv/bin/activate
pip install -e ".[dev]"
# With all optional features (embeddings, vectors, BM25, LLM, API)
pip install -e ".[all]"
MemoryClient (recommended)
from trw_memory.client import MemoryClient
async with MemoryClient(namespace="project:my-app") as client:
    # Store a learning
    await client.store(
        "Pydantic v2 requires use_enum_values=True for YAML round-trip",
        tags=["pydantic", "gotcha"],
        importance=0.8,
    )

    # Recall by keyword query (hybrid BM25 + vector search)
    results = await client.recall("pydantic serialization", limit=10)

    # Search with filters
    high_impact = await client.search(min_importance=0.7, tags=["gotcha"])

    # Forget an entry
    await client.forget(results[0]["memory_id"])
Agent Framework Integration
from trw_memory.client import MemoryClient
client = MemoryClient(namespace="project:my-app")
# Register tools with any agent that has register_tool() or tool() API
client.register_tools(agent)
# Or use the auto_recall decorator
@client.auto_recall(query_from="prompt")
async def handle_prompt(prompt: str, recalled_memories: list | None = None) -> str:
    # recalled_memories is automatically injected with relevant context
    memories = recalled_memories or []
    return f"Found {len(memories)} relevant memories"
CLI
# Store a learning
trw-memory store "Always use connection pooling for PostgreSQL" --tags db,performance --importance 0.8
# Recall by query
trw-memory recall "database optimization" --limit 5
# Search with filters
trw-memory search --tags security --min-importance 0.7
# Consolidate related entries
trw-memory consolidate --namespace project:my-app --dry-run
# Export/import for backup or migration
trw-memory export --format json > memories.json
trw-memory import memories.json --namespace project:new-app
# Status overview
trw-memory status
Low-Level Backend Access
from trw_memory.storage.sqlite_backend import SQLiteBackend
from trw_memory.models.memory import MemoryEntry
backend = SQLiteBackend(db_path=".trw/memory.db")
entry = MemoryEntry(id="M-abc12345", content="...", namespace="default", ...)
backend.store(entry)
results = backend.search("query", top_k=10, namespace="default")
Architecture
src/trw_memory/
client.py # MemoryClient SDK (recommended entry point)
cli.py # CLI entry point (trw-memory command)
config.py # MemoryConfig (pydantic-settings, env var override)
graph.py # Knowledge graph (similarity/tag/consolidation edges, BFS)
server.py # FastMCP MCP server entry point
decorators.py # @auto_recall decorator
exceptions.py # Custom exception hierarchy
namespace.py # Namespace validation
storage/
sqlite_backend.py # Primary: SQLite with FTS5, WAL mode, sqlite-vec vectors
yaml_backend.py # Legacy: per-entry YAML files (backup/migration)
interface.py # Abstract StorageBackend protocol
persistence.py # Atomic read/write helpers
retrieval/
bm25.py # BM25Okapi sparse retrieval (rank-bm25)
dense.py # Cosine similarity dense vector search
fusion.py # Reciprocal Rank Fusion (RRF, k=60)
pipeline.py # hybrid_search() orchestrator (BM25 + dense + RRF)
lifecycle/
scoring.py # Q-learning, Ebbinghaus decay, Bayesian calibration
tiers.py # Hot/warm/cold tier management (LRU, sweep, archive)
dedup.py # Semantic dedup (cosine threshold, merge/skip logic)
consolidation.py # LLM clustering + summarization (episodic-to-semantic)
sync/
remote.py # Publish/fetch to platform backend
conflict.py # Vector clock comparison + three-way merge
retry_queue.py # Persistent JSONL retry queue
subscriber.py # SSE subscriber for live updates
security/
encryption.py # AES-256-GCM with HKDF per-namespace keys
pii.py # PII detection (email, phone, SSN, API keys, entropy)
poisoning.py # Anomaly detection (z-score frequency/size/pattern)
audit.py # Append-only security event audit trail
rbac.py # Role-based access control for namespaces
keys.py # Master key derivation and rotation
api/
app.py # FastAPI app factory
auth.py # API key middleware
router_memories.py # CRUD + search endpoints
router_namespaces.py # Namespace management
router_jobs.py # Background job management
router_health.py # Health check
tools/ # 6 MCP tool implementations
store.py # memory_store (validate + dedup + write)
recall.py # memory_recall (hybrid search + graph traversal)
search.py # memory_search (filter-based listing)
forget.py # memory_forget (namespace-scoped deletion)
consolidate.py # memory_consolidate (trigger consolidation)
status.py # memory_status (backend stats + tier info)
integrations/
langchain.py # LangChain memory class
llamaindex.py # LlamaIndex reader/writer
crewai.py # CrewAI memory component
factory.py # Auto-detect framework and create adapter
vscode.py # VS Code extension adapter
adapters/
openai_compat.py # OpenAI Memory API compatible endpoint
models/
config.py # MemoryConfig (pydantic-settings)
memory.py # MemoryEntry, MemoryStatus
events.py # Audit event models
namespaces/
manager.py # Namespace lifecycle (create, expire, list)
validation.py # Namespace format validation
path_mapping.py # Namespace-to-path resolution
migration/
from_trw.py # YAML-to-SQLite migration from trw-mcp format
78 source files, ~11,650 lines of code
API Reference
Key Modules and Functions
| Name | Module | Description |
|---|---|---|
| MemoryClient | client | High-level async SDK (store/recall/forget/search) |
| SQLiteBackend | storage.sqlite_backend | Primary storage with FTS5, WAL, and sqlite-vec vectors |
| YAMLBackend | storage.yaml_backend | File-based storage (backup/migration) |
| hybrid_search() | retrieval.pipeline | BM25 + dense vector search with RRF fusion |
| bm25_search() | retrieval.bm25 | BM25Okapi sparse keyword retrieval |
| dense_search() | retrieval.dense | Cosine similarity vector search |
| rrf_fuse() | retrieval.fusion | Reciprocal Rank Fusion combiner |
| KnowledgeGraph functions | graph | Tag/similarity edges, BFS traversal, decay |
| TierSweepResult | lifecycle.tiers | Hot/warm/cold sweep, promote, demote, purge |
| DedupResult | lifecycle.dedup | Duplicate detection (skip/merge/store decisions) |
| compute_utility_score() | lifecycle.scoring | Q-learning + Ebbinghaus + Bayesian scoring |
| MemoryConfig | models.config | Configuration via env vars or dict |
| MemoryEntry | models.memory | Core data model for stored memories |
Storage Backends
SQLite (recommended) -- Fast, transactional, supports FTS5 full-text search, knowledge graph edges, and optional sqlite-vec vector similarity:
from trw_memory.storage.sqlite_backend import SQLiteBackend
backend = SQLiteBackend(db_path=".trw/memory.db")
# Supports: store, get, update, delete, search, count, list_entries,
# list_namespaces, upsert_vector, search_vectors
YAML -- Human-readable, git-friendly, used as backup during migration:
from trw_memory.storage.yaml_backend import YAMLBackend
backend = YAMLBackend(entries_dir=".trw/learnings")
Hybrid Search: BM25 + Vector
The hybrid search pipeline combines sparse keyword retrieval with dense semantic search — ensuring strong results for both exact-match queries and conceptually similar queries. Read the full architecture docs.
Query --> BM25 (keyword, rank-bm25) ---+
                                       +--> RRF Fusion (k=60) --> Ranked Results
Query --> Dense (cosine, sqlite-vec) --+
The pipeline degrades gracefully: if BM25 is unavailable, only dense search runs (and vice versa); if neither is available, it falls back to the storage backend's built-in FTS5 keyword search.
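As a concrete illustration, RRF with k=60 can be sketched in a few lines. This is a hypothetical rrf_fuse_sketch, not the library's rrf_fuse; it assumes each retriever returns memory IDs in rank order:

```python
def rrf_fuse_sketch(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Reciprocal Rank Fusion: score(doc) = sum over rankings of 1 / (k + rank)."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first
    return sorted(scores, key=scores.get, reverse=True)

bm25_hits = ["M-1", "M-2", "M-3"]   # keyword ranking
dense_hits = ["M-2", "M-4", "M-1"]  # vector ranking
print(rrf_fuse_sketch([bm25_hits, dense_hits]))
# M-2 ranks first: it appears near the top of both lists
```

An ID that appears in only one list still gets a score, which is why the fallback to a single retriever is a natural degenerate case of the same formula.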
Scoring System
Learning utility is computed from multiple signals. Full scoring documentation:
- Q-learning: Exponential moving average updated from outcome events (success/failure/mixed)
- Ebbinghaus forgetting curve: Time-based Ebbinghaus decay applied at query time (not mutated in storage) — entries naturally fade unless reinforced by recall
- Access recency boost: Recently accessed entries score higher
- Impact score: Author-assigned importance (0.0-1.0)
- Bayesian calibration: MACLA calibration for impact score accuracy
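The first two signals above can be sketched as two small formulas: an exponential moving average for the Q-value and an exponential retention factor for decay. The alpha and strength constants here are illustrative assumptions, not the library's defaults:

```python
import math

def update_q(q: float, reward: float, alpha: float = 0.3) -> float:
    """EMA update: nudge the Q-value toward the observed outcome reward."""
    return (1 - alpha) * q + alpha * reward

def ebbinghaus_retention(days_since_access: float, strength: float = 30.0) -> float:
    """Retention R = exp(-t/S): computed at query time, never written to storage."""
    return math.exp(-days_since_access / strength)

q = update_q(0.5, reward=1.0)           # a success event pulls q upward -> 0.65
retention = ebbinghaus_retention(30.0)  # one 'strength' period elapsed -> ~0.37
utility = q * retention                 # stale entries fade unless reinforced by recall
```

Because decay is applied at query time, a recalled entry is effectively reinforced: its access timestamp resets and its retention factor jumps back toward 1.0.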
Tiered Storage
Automatic hot/warm/cold tiering keeps frequently-used memories fast and archives stale ones. Architecture overview:
| Tier | Criteria | Storage | Latency |
|---|---|---|---|
| Hot | Created/accessed in last 7 days | In-memory LRU cache | <1ms |
| Warm | 8-90 days, impact >= 0.3 | SQLite + FTS5 index | <50ms |
| Cold | 90+ days OR impact < 0.3 | YAML archive (partitioned by year/month) | <200ms |
Entries are automatically promoted/demoted during periodic sweeps. Cold-tier entries remain queryable.
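The criteria in the table above reduce to a small decision rule, sketched here as a hypothetical helper (the real sweep in lifecycle/tiers.py also handles LRU eviction and archival, which this omits):

```python
def assign_tier(days_since_access: float, impact: float) -> str:
    """Apply the tiering criteria from the table: recency wins, then impact."""
    if days_since_access <= 7:
        return "hot"                      # in-memory LRU cache
    if days_since_access <= 90 and impact >= 0.3:
        return "warm"                     # SQLite + FTS5 index
    return "cold"                         # YAML archive, still queryable

print(assign_tier(3, 0.1))    # hot: recent access keeps it fast regardless of impact
print(assign_tier(30, 0.8))   # warm: within 90 days and impactful enough
print(assign_tier(120, 0.9))  # cold: past the 90-day window
```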
Security
| Feature | Implementation |
|---|---|
| Field encryption | AES-256-GCM with HKDF-SHA256 per-namespace key derivation |
| PII detection | Regex patterns (email, phone, SSN, credit card, API keys) + Shannon entropy analysis |
| Poisoning defense | Z-score anomaly detection on frequency, size, and content patterns |
| Access control | Role-based (admin/editor/viewer) per namespace |
| Audit trail | Append-only security event log |
| Key management | Master key derivation, per-namespace keys, rotation support |
REST API
When installed with [api] extra:
trw-memory-api # Starts FastAPI server
| Endpoint | Method | Purpose |
|---|---|---|
| /memories | POST | Store a new memory entry |
| /memories/{id} | GET | Retrieve a specific entry |
| /memories/{id} | PATCH | Update an entry |
| /memories/{id} | DELETE | Delete an entry |
| /memories/search | POST | Search with filters |
| /namespaces | GET | List namespaces |
| /namespaces/{ns} | DELETE | Delete namespace and entries |
| /jobs/consolidate | POST | Trigger consolidation |
| /jobs/sweep | POST | Trigger tier sweep |
| /health | GET | Health check |
MCP Tools
When installed with [mcp] extra:
trw-memory-server # Starts MCP server (stdio transport)
| Tool | Purpose |
|---|---|
| memory_store | Store entry with dedup check and embedding |
| memory_recall | Hybrid retrieval with optional graph traversal |
| memory_search | Filter-based listing (tags, importance, date range) |
| memory_forget | Namespace-scoped deletion |
| memory_consolidate | Trigger episodic-to-semantic consolidation |
| memory_status | Backend stats, entry counts, tier distribution |
Integration with trw-mcp
trw-mcp is the MCP server layer of TRW Framework — it exposes 24 tools, 24 skills, and 18 agents to Claude Code and other AI coding tools. trw-memory serves as its memory backend:
- trw_learn delegates to SQLiteBackend.store() via memory_adapter.py (YAML dual-write as backup)
- trw_recall delegates to SQLiteBackend.search() / list_entries() as the sole query path
- Scoring functions (compute_utility_score, update_q_value, apply_time_decay, bayesian_calibrate) are canonical in trw-memory and re-exported by trw-mcp
- One-time YAML-to-SQLite migration runs automatically on first access
- Optional vector search via LocalEmbeddingProvider + rrf_fuse when sentence-transformers is installed
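The vector clock comparison behind Remote Sync conflict resolution (sync/conflict.py) can be sketched as follows. This is a minimal illustration of the general technique, not the module's actual API; the function name is hypothetical:

```python
def compare_clocks(a: dict[str, int], b: dict[str, int]) -> str:
    """Compare two vector clocks: 'before', 'after', 'equal', or 'concurrent'."""
    keys = set(a) | set(b)
    a_le_b = all(a.get(k, 0) <= b.get(k, 0) for k in keys)
    b_le_a = all(b.get(k, 0) <= a.get(k, 0) for k in keys)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"   # b strictly dominates: remote copy is newer, safe to fetch
    if b_le_a:
        return "after"    # a strictly dominates: local copy is newer, safe to publish
    return "concurrent"   # neither dominates: conflicting edits, needs a merge

print(compare_clocks({"node1": 3, "node2": 1}, {"node1": 3, "node2": 2}))
```

Only the "concurrent" case requires merging; the other three outcomes let one side win outright.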
Read more about the full TRW Framework architecture.
Development
# Install dev dependencies
pip install -e ".[dev]"
# Run full test suite (1,314 tests, >=80% coverage required)
.venv/bin/python -m pytest tests/ -v --cov=trw_memory --cov-report=term-missing
# Type checking (strict mode, 78 files)
.venv/bin/python -m mypy --strict src/trw_memory/
# Targeted testing
.venv/bin/python -m pytest tests/test_client.py -v
.venv/bin/python -m pytest tests/test_retrieval.py -v
.venv/bin/python -m pytest tests/test_storage_sqlite.py -v
Current metrics: 1,314 tests, 91% coverage, mypy strict clean.
Optional Dependencies
| Extra | Packages | Purpose |
|---|---|---|
| [mcp] | fastmcp | MCP server tools |
| [embeddings] | sentence-transformers | Dense vector embeddings (all-MiniLM-L6-v2, 384-dim) |
| [vectors] | sqlite-vec | Vector similarity search in SQLite |
| [bm25] | rank-bm25 | BM25 keyword search |
| [llm] | anthropic | LLM-augmented consolidation |
| [api] | fastapi, uvicorn | REST API server |
| [langchain] | langchain-core | LangChain memory integration |
| [llamaindex] | llama-index-core | LlamaIndex reader/writer |
| [crewai] | crewai | CrewAI memory component |
| [all] | mcp + embeddings + vectors + bm25 + llm + api | Full feature set |
| [dev] | pytest, mypy, ruff, coverage, etc. | Testing and linting |
Entry Points
| Command | Purpose |
|---|---|
| trw-memory | CLI for store/recall/search/forget/consolidate/export/import |
| trw-memory-server | MCP server (stdio transport) |
| trw-memory-api | FastAPI REST server |
License
Business Source License 1.1 -- source-available, free for non-competing use. Converts to Apache 2.0 on 2030-03-21.
Built by Tyler Wall · TRW Framework · Documentation · License