Lightweight RAG memory system for AI agents — Progressive Disclosure, Auto-Capture, 3-Layer Archive

These details have not been verified by PyPI

Project links

Project description

openclaw-mem

Local-first AI memory — No API keys. No cloud. No vendor lock-in.

Lightweight RAG memory system for AI agents — Progressive Disclosure, Auto-Capture, 3-Layer Archive, built-in injection defense.

Features

Progressive Disclosure — 2-step search: summaries first (--index), then full content on demand (--detail). Saves tokens for LLM agents.
Auto-Capture — Rule-based extraction of decisions, learnings, errors, and insights from session transcripts. No LLM required.
3-Layer Archive — Hot (active files), Warm (indexed in RAG), Cold (archived but still searchable).
Observation Logging — Structured [tag] observations with instant indexing.
Pluggable Embeddings — Local sentence-transformers (default, no API key), OpenAI, or Ollama backends.
Multilingual — Swap in multilingual models for Korean + English support.
Zero Config — Works out of the box with pip install and no API keys. Everything is overridable via environment variables.

Architecture

┌─────────────────────────────────────────────────────┐
│                   openclaw-mem                       │
│                                                     │
│  ┌──────────┐  ┌──────────┐  ┌──────────────────┐  │
│  │  search   │  │  index   │  │  auto-capture    │  │
│  │(2-step)   │  │(incr.)   │  │(rule-based)      │  │
│  └─────┬─────┘  └─────┬────┘  └────────┬─────────┘  │
│        │              │                │             │
│        ▼              ▼                ▼             │
│  ┌─────────────────────────────────────────────┐    │
│  │              LanceDB + Embeddings            │    │
│  │   local (default) │ openai │ ollama          │    │
│  └─────────────────────────────────────────────┘    │
│                                                     │
│  Memory Layers:                                     │
│  ┌────────┐  ┌────────┐  ┌────────┐                │
│  │  HOT   │  │  WARM  │  │  COLD  │                │
│  │(active)│→ │(indexed)│→ │(archive│                │
│  │        │  │  in RAG │  │ in RAG)│                │
│  └────────┘  └────────┘  └────────┘                │
└─────────────────────────────────────────────────────┘

Installation

# Default: local embeddings (no API key needed)
pip install openclaw-mem

# With OpenAI backend support
pip install openclaw-mem[openai]

# With Ollama backend support
pip install openclaw-mem[ollama]

# Everything
pip install openclaw-mem[all]

From source

git clone https://github.com/kjaylee/openclaw-mem.git
cd openclaw-mem
pip install -e ".[dev]"

Quick Start

1. Index your markdown files

# Set your workspace root (where memory/ directory lives)
export OPENCLAW_MEM_ROOT=/path/to/workspace

# Index all configured files
openclaw-mem index --all

# Or index only changed files (incremental)
openclaw-mem index --changed

# Or index a specific file
openclaw-mem index path/to/notes.md

2. Search with Progressive Disclosure

# Step 1: Get summaries (cheap on tokens)
openclaw-mem search "deployment process" --index

# Output:
# 1. [0.8432] memory/2025-01-15.md
#    id: 2025-01-15.md:3:a1b2c3d4
#    Deployed the new API to production...

# Step 2: Get full content for interesting chunks
openclaw-mem search --detail "2025-01-15.md:3:a1b2c3d4"

3. Record observations

openclaw-mem observe "Redis cache reduced latency by 40%" --tag learning
openclaw-mem observe "Switched to Rust for WASM builds" --tag decision
openclaw-mem observe "OOM on 2GB instances with batch size > 100" --tag error

4. Auto-capture from sessions

# Scan recent session transcripts for observations
openclaw-mem auto-capture --since 6h

# Dry run — see what would be captured
openclaw-mem auto-capture --dry-run

5. Archive old files

# See what would be archived (dry run)
openclaw-mem archive

# Actually archive files older than 30 days
openclaw-mem archive --execute

# Re-index archive for search
openclaw-mem archive --reindex

Configuration

All settings can be overridden via environment variables:

Variable	Default	Description
`OPENCLAW_MEM_ROOT`	Package parent dir	Workspace root directory
`OPENCLAW_MEM_DB_PATH`	`$ROOT/lance_db`	LanceDB database path
`OPENCLAW_MEM_TABLE`	`openclaw_memory`	LanceDB table name
`OPENCLAW_MEM_BACKEND`	`local`	Embedding backend: `local`, `openai`, `ollama`
`OPENCLAW_MEM_MODEL`	`intfloat/multilingual-e5-small`	Model name (per backend)
`OPENAI_API_KEY`	(empty)	Required only for `openai` backend
`OPENAI_BASE_URL`	(empty)	Custom OpenAI-compatible endpoint
`OLLAMA_BASE_URL`	`http://localhost:11434`	Ollama server URL
`OPENCLAW_MEM_CHUNK_SIZE`	`500`	Max chunk size (characters)
`OPENCLAW_MEM_CHUNK_OVERLAP`	`50`	Chunk overlap (characters)
`OPENCLAW_MEM_ARCHIVE_DIR`	`$ROOT/memory/archive`	Archive directory
`OPENCLAW_MEM_ARCHIVE_DAYS`	`30`	Days before archiving
`OPENCLAW_MEM_OBSERVATIONS_FILE`	`$ROOT/memory/observations.md`	Observations file
`OPENCLAW_MEM_SESSION_DIR`	`~/.openclaw/agents/main/sessions`	Session transcripts dir

Embedding Backends

# Default: local sentence-transformers (no API key, ~470MB model download)
export OPENCLAW_MEM_BACKEND=local
export OPENCLAW_MEM_MODEL=intfloat/multilingual-e5-small  # default, Korean+English

# English-only lightweight alternative
export OPENCLAW_MEM_MODEL=all-MiniLM-L6-v2

# OpenAI API
export OPENCLAW_MEM_BACKEND=openai
export OPENCLAW_MEM_MODEL=text-embedding-3-small
export OPENAI_API_KEY=sk-...

# Ollama (local server)
export OPENCLAW_MEM_BACKEND=ollama
export OPENCLAW_MEM_MODEL=nomic-embed-text

Python API

from openclaw_mem.search import search, search_index, get_detail
from openclaw_mem.index import index_single, index_observation
from openclaw_mem.observe import append_observation
from openclaw_mem.auto_capture import extract_observations_from_text
from openclaw_mem.embedder import get_embedder, Embedder

# Search
results = search("deployment", top_k=5)
summaries = search_index("deployment", top_k=10)
detail = get_detail("chunk:0:abc123")

# Index
index_single("path/to/file.md")
index_observation("Important finding", tag="learning")

# Observe
append_observation("Cache works great", tag="learning")

# Extract patterns from text
obs = extract_observations_from_text("결정: Redis를 사용한다")
# [{"tag": "decision", "text": "Redis를 사용한다"}]

# Direct embedding access
embedder = get_embedder()  # uses configured backend
vectors = embedder.embed(["text 1", "text 2"])

# Custom backend
embedder = Embedder(backend="openai", model="text-embedding-3-small")

Observation Tags

Tag	Description	Example patterns
`decision`	Decisions made	`결정:`, `Decision:`, `→ 채택`
`learning`	Things learned	`배움:`, `Learned:`, `발견:`, `✅`
`error`	Errors encountered	`에러:`, `Error:`, `FAIL`, `실패`
`insight`	TODOs and insights	`TODO:`, `할일:`, `다음에`

Benchmark — Korean Search Accuracy

Tested with intfloat/multilingual-e5-small on Korean+English mixed project data.

Metric	Result	Target
Accuracy	10/10 (100%)	≥ 80%
Avg Response	0.38s	≤ 1.0s
Similarity Scores	0.83–0.88	-

All 10 queries — pure Korean, pure English, and mixed — returned the correct document sections. See docs/benchmark.md for full results.

Why Local-First?

Cloud-dependent memory	openclaw-mem
API outage → entire memory offline	100% offline — works without internet
Data sent to third-party servers	Data stays on your disk — zero telemetry
API key management & rotation	No API keys needed — `pip install` and go
Vendor lock-in	MIT license — use anywhere, modify freely

Real-world pain: when OpenAI's embedding API went down, cloud-dependent agents lost all memory access. With openclaw-mem, a 470MB local model (intfloat/multilingual-e5-small) gives you Korean + English search that never goes offline.

Your data. Your disk. Your rules.

Security

openclaw-mem includes a built-in memory injection sanitizer that protects against prompt injection attacks stored in memory.

How it works

Observe — All incoming observations are scanned before storage. Detected injection patterns are filtered out and replaced with [FILTERED].
Index — During indexing, each chunk is scanned and warnings are logged for any detected patterns (non-blocking, since these are existing files).

Detected patterns

Direct command injection (ignore previous instructions, you are now, system prompt:, etc.)
Data exfiltration (send api key, curl https://..., fetch(...))
Encoding bypasses (base64.encode, eval(), exec())
Role manipulation (act as, pretend, jailbreak, DAN mode)

Custom patterns

from openclaw_mem.sanitizer import MemorySanitizer

sanitizer = MemorySanitizer(extra_patterns=[
    r"my_custom_attack_pattern",
    r"another_pattern_to_block",
])
is_safe, matches = sanitizer.check(user_input)

Development

# Install with dev dependencies
pip install -e ".[dev]"

# Run tests
pytest

# Run tests with coverage
pytest --cov=openclaw_mem

License

MIT — see LICENSE for details.

Project details

These details have not been verified by PyPI

Project links

Release history Release notifications | RSS feed

This version

0.2.0

Feb 12, 2026

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

openclaw_mem-0.2.0.tar.gz (44.1 kB view details)

Uploaded Feb 12, 2026 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

openclaw_mem-0.2.0-py3-none-any.whl (33.8 kB view details)

Uploaded Feb 12, 2026 Python 3

File details

Details for the file openclaw_mem-0.2.0.tar.gz.

File metadata

Download URL: openclaw_mem-0.2.0.tar.gz
Upload date: Feb 12, 2026
Size: 44.1 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.2

File hashes

Hashes for openclaw_mem-0.2.0.tar.gz
Algorithm	Hash digest
SHA256	`94dc0060753339beb47cdd10bae119220865a67342cff68e6bcd3e56b1ecd3fc`
MD5	`c7b7dc1b4af9863a0e1da53ac212e7f9`
BLAKE2b-256	`670b75b991dfbbc2acf856868c63a5b2d8e8378b1d4287150accd335e364e97a`

See more details on using hashes here.

File details

Details for the file openclaw_mem-0.2.0-py3-none-any.whl.

File metadata

Download URL: openclaw_mem-0.2.0-py3-none-any.whl
Upload date: Feb 12, 2026
Size: 33.8 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.2

File hashes

Hashes for openclaw_mem-0.2.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`2cac83c27bbf46bd257602d013ddd7081a30f257d9d34272413a00a15104c06e`
MD5	`b1c556d06d16e68bc2632dedb386461f`
BLAKE2b-256	`05349d5142b5f8b80503c9814d15caa4517c9c6fcf0ae5af4f1dd9c896445aa6`

See more details on using hashes here.

openclaw-mem 0.2.0

Navigation

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Project description

openclaw-mem

Features

Architecture

Installation

From source

Quick Start

1. Index your markdown files

2. Search with Progressive Disclosure

3. Record observations

4. Auto-capture from sessions

5. Archive old files

Configuration

Embedding Backends

Python API

Observation Tags

Benchmark — Korean Search Accuracy

Why Local-First?

Security

How it works

Detected patterns

Custom patterns

Development

License

Project details

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

File details

File metadata

File hashes