Long-term memory system for AI agents — vector search, decay & reflection
AI-Houkai — Agent Memory System
A long-term memory system for AI agents backed by ChromaDB and exposed over MCP. Agents can remember, recall, and forget information across sessions — with automatic decay of stale memories and periodic reflection that condenses experience into knowledge.
Features
| Feature | Description |
|---|---|
| Vector search | Cosine-space HNSW via ChromaDB + sentence-transformers |
| Memory types | episodic · semantic · procedural · feedback |
| Rich metadata | importance, tags, source, access tracking |
| Decay | Exponential forgetting — prune old, unimportant memories |
| Reflection | Cluster episodic memories → condense into semantic summaries |
| MCP server | Five tools for any MCP client (Claude Code, Claude Desktop) |
| Multi-provider | Claude · OpenAI · Ollama (local) agent examples |
Layout
AI-Houkai/
├── ai_houkai/
│   ├── __init__.py                  # convenience re-exports
│   ├── memory_system/
│   │   ├── __init__.py
│   │   ├── store.py                 # MemoryStore + Memory dataclass
│   │   ├── decay.py                 # DecayEngine: exponential forgetting
│   │   └── reflection.py            # ReflectionEngine: episodic → semantic
│   └── mcp_server/
│       ├── __init__.py
│       └── server.py                # FastMCP server (5 tools)
├── examples/
│   ├── 01_standalone.py             # pure-Python walkthrough, no LLM
│   ├── 02_ollama_local_network.py   # Ollama on LAN, fully offline
│   ├── 03_claude_desktop.py         # MCP auto-install for Claude Desktop
│   ├── 04_openai.py                 # OpenAI GPT-4o / gpt-4o-mini
│   ├── 05_decay_reflection.py       # decay + reflection demo
│   ├── 06_claude_code.py            # Claude Code MCP integration
│   ├── claude_agent.py              # Claude Sonnet REPL (Anthropic SDK)
│   └── pip_package_example.py       # post-install usage walkthrough
├── tests/
│   ├── conftest.py                  # isolated MemoryStore fixture (tmp_path)
│   ├── test_memory.py               # MemoryStore unit tests
│   ├── test_decay.py                # DecayEngine unit tests
│   ├── test_reflection.py           # ReflectionEngine unit tests
│   └── test_dispatch.py             # cross-provider _dispatch_tool tests
├── pyproject.toml
└── requirements.txt
Install
Modern Linux distros protect the system Python (PEP 668). Pick whichever
approach fits your workflow — none requires --break-system-packages.
Virtual environment (recommended for development)
python3 -m venv .venv
source .venv/bin/activate # Windows: .venv\Scripts\activate
pip install ai-houkai
pipx (recommended for the MCP server / CLI tool)
pipx installs CLI tools into isolated venvs and puts the script on your
PATH automatically — no activation step needed.
sudo apt install pipx # or: pip install --user pipx
pipx ensurepath # adds ~/.local/bin to PATH (one-time)
pipx install ai-houkai
ai-houkai-mcp # available everywhere
uv (fastest, modern)
curl -Lsf https://astral.sh/uv/install.sh | sh # one-time install
uv venv && uv pip install ai-houkai # project venv
# or run a script without a persistent install:
uv run --with ai-houkai python examples/pip_package_example.py
Extras
pip install "ai-houkai[claude]" # + Anthropic SDK
pip install "ai-houkai[openai]" # + OpenAI SDK (also covers Ollama)
pip install "ai-houkai[all]" # all providers
pip install "ai-houkai[dev]" # + pytest
The embedding model (all-MiniLM-L6-v2) downloads automatically on first use (~90 MB). Everything runs locally; the memory layer itself needs no API key.
Quick-start
from ai_houkai.memory_system import MemoryStore

store = MemoryStore()  # persists to ./.chroma
store.remember("Python's GIL blocks CPU parallelism",
               type="semantic", importance=0.85, tags=["python"])
for mem, score in store.recall("parallel execution", k=3):
    print(f"{score:.3f} {mem.text}")
Run the tests
pytest tests/ -v # 79 tests
Examples
01 · Standalone (no LLM)
Full memory lifecycle — seed → recall with filters → access tracking → forget.
python examples/01_standalone.py
02 · Ollama (local network)
Conversational REPL using a local model over Ollama's OpenAI-compatible endpoint. No API key, no internet.
ollama pull llama3.1
OLLAMA_MODEL=llama3.1 python examples/02_ollama_local_network.py
| Env var | Default |
|---|---|
| OLLAMA_BASE_URL | http://localhost:11434/v1 |
| OLLAMA_MODEL | llama3.1 |
| AI_HOUKAI_PATH | ./.chroma |
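The defaults in the table can be resolved with a plain environment lookup. The sketch below assumes the example script reads its settings this way; `load_ollama_config` is a hypothetical helper for illustration and is not part of the ai-houkai package.

```python
import os


def load_ollama_config(env=None):
    """Resolve the three settings from the table, falling back to the documented defaults.

    Hypothetical helper for illustration; not part of ai-houkai.
    """
    env = os.environ if env is None else env
    return {
        "base_url": env.get("OLLAMA_BASE_URL", "http://localhost:11434/v1"),
        "model": env.get("OLLAMA_MODEL", "llama3.1"),
        "path": env.get("AI_HOUKAI_PATH", "./.chroma"),
    }
```

Passing a dict instead of reading `os.environ` directly keeps the helper easy to test. To reach an Ollama instance on another machine, only `OLLAMA_BASE_URL` should need to change.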
03 · Claude Desktop (MCP)
Auto-installs the MCP server into Claude Desktop's config.
python examples/03_claude_desktop.py # preview config
python examples/03_claude_desktop.py --install # write config
python examples/03_claude_desktop.py --demo # simulated session
04 · OpenAI
GPT-4o / gpt-4o-mini with function calling.
export OPENAI_API_KEY=sk-...
python examples/04_openai.py
OPENAI_MODEL=gpt-4o AI_HOUKAI_PATH=~/.ai_houkai python examples/04_openai.py
| Env var | Default |
|---|---|
| OPENAI_MODEL | gpt-4o-mini |
| AI_HOUKAI_PATH | temp dir |
05 · Decay + Reflection
Shows both cognitive maintenance features with backdated timestamps.
python examples/05_decay_reflection.py
06 · Claude Code (MCP)
Gives the Claude Code CLI a persistent memory so it remembers project conventions, past debug sessions, and your preferences across every coding session.
# Option A — one-liner (recommended)
claude mcp add ai-houkai -- ai-houkai-mcp
# Option B — auto-patch ~/.claude/settings.json
python examples/06_claude_code.py --install
# Option C — preview the config block
python examples/06_claude_code.py
# Smoke-test
python examples/06_claude_code.py --verify
# Simulated coding session
python examples/06_claude_code.py --demo
# Print a CLAUDE.md snippet that teaches Claude how to use memory
python examples/06_claude_code.py --claudemd
The installed MCP block in ~/.claude/settings.json:
{
"mcpServers": {
"ai-houkai": {
"command": "ai-houkai-mcp",
"env": {
"AI_HOUKAI_PATH": "~/.ai_houkai",
"AI_HOUKAI_COLLECTION": "claude_code"
}
}
}
}
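If you prefer to patch the file yourself rather than run the installer, the merge is small. The sketch below is not the code behind `--install` (which may behave differently); `add_mcp_server` is a hypothetical helper that adds one entry under `mcpServers` in a JSON settings file and leaves every other key untouched.

```python
import json
from pathlib import Path


def add_mcp_server(settings_path, name, command, env=None):
    """Insert or overwrite one entry under "mcpServers", preserving all other keys.

    Hypothetical helper for illustration; not part of ai-houkai.
    """
    path = Path(settings_path).expanduser()
    text = path.read_text() if path.exists() else ""
    settings = json.loads(text) if text.strip() else {}

    entry = {"command": command}
    if env:
        entry["env"] = dict(env)
    settings.setdefault("mcpServers", {})[name] = entry

    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(settings, indent=2) + "\n")
    return settings
```

Called as `add_mcp_server("~/.claude/settings.json", "ai-houkai", "ai-houkai-mcp", {"AI_HOUKAI_PATH": "~/.ai_houkai", "AI_HOUKAI_COLLECTION": "claude_code"})`, it would produce the block shown above. Back up the file first if it already holds other configuration.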
Recommended CLAUDE.md addition
Add the following to your project's CLAUDE.md so Claude Code knows when and
how to use memory tools (run python examples/06_claude_code.py --claudemd
to generate it):
## Memory (AI-Houkai MCP)
- **remember** — store conventions, decisions, preferences
- **recall** — search before starting any task
- **forget** — remove outdated facts
| Situation | Action |
|---|---|
| User states a convention | `remember` with `type="procedural"` |
| User corrects you | `remember` correction, `forget` old fact |
| Starting a new task | `recall` relevant context first |
Claude agent (Anthropic SDK REPL)
export ANTHROPIC_API_KEY=sk-ant-...
AI_HOUKAI_PATH=/tmp/my_memory python examples/claude_agent.py
REPL commands: memories to list recent memories · quit to exit.
MCP server
Exposes the memory store to any MCP client.
ai-houkai-mcp
# or: python -m ai_houkai.mcp_server.server
Exposed tools: remember · recall · forget · list_recent · stats.
Environment variables:
| Variable | Default |
|---|---|
| AI_HOUKAI_PATH | ./.chroma |
| AI_HOUKAI_COLLECTION | ai_houkai |
Claude Code (global):
claude mcp add ai-houkai -- ai-houkai-mcp
Claude Code (manual ~/.claude/settings.json):
{
"mcpServers": {
"ai-houkai": {
"command": "ai-houkai-mcp",
"env": { "AI_HOUKAI_PATH": "/your/memory/path" }
}
}
}
Claude Desktop — use examples/03_claude_desktop.py --install.
Decay
Memories fade over time based on age and importance.
score = importance × exp(−λ × days_since_last_access)
Default λ = 0.1 gives a half-life of ln 2 / λ ≈ 6.9 days: a memory's score halves after about a week without access, regardless of its importance.
procedural memories are protected and never pruned.
from ai_houkai.memory_system import MemoryStore, DecayEngine
store = MemoryStore()
engine = DecayEngine(store, decay_rate=0.1, min_score=0.05)
engine.prune(dry_run=True) # preview
engine.prune() # delete stale memories
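To make the formula concrete, here is a small stand-alone calculation. It reimplements the score expression in plain Python and is not DecayEngine itself; `decay_score` and `days_until_pruned` are hypothetical names, and the assumption that a memory is pruned once its score falls below `min_score` is inferred from the parameter name rather than confirmed by the source.

```python
import math


def decay_score(importance, days_since_last_access, decay_rate=0.1):
    """score = importance * exp(-decay_rate * days), as in the formula above."""
    return importance * math.exp(-decay_rate * days_since_last_access)


def days_until_pruned(importance, decay_rate=0.1, min_score=0.05):
    """Days without access before the score drops to min_score (0 if already there).

    Assumes pruning happens once score < min_score; hypothetical helper.
    """
    if importance <= min_score:
        return 0.0
    return math.log(importance / min_score) / decay_rate


print(f"{math.log(2) / 0.1:.1f}")       # 6.9  (half-life in days at the default rate)
print(f"{decay_score(0.5, 7):.3f}")     # 0.248
print(f"{days_until_pruned(0.5):.1f}")  # 23.0
```

Under these assumptions, a 0.5-importance memory survives about 23 days untouched, and raising importance extends that only logarithmically.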
Reflection
Clusters of semantically similar episodic memories are condensed into a
single semantic summary (the Generative Agents pattern).
from ai_houkai.memory_system import MemoryStore, ReflectionEngine
store = MemoryStore()
engine = ReflectionEngine(store, similarity_threshold=0.75)
engine.clusters() # inspect clusters
engine.reflect(dry_run=True) # preview
engine.reflect(consolidate=True) # create summaries + delete sources
Plug in any summarizer — including an LLM:
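from openai import OpenAI

openai_client = OpenAI()  # reads OPENAI_API_KEY from the environment

def llm_summarizer(memories):
    # Join the cluster's texts and ask the model for a single summary string.
    prompt = "\n".join(m.text for m in memories)
    return openai_client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": f"Summarise: {prompt}"}],
    ).choices[0].message.content

engine = ReflectionEngine(store, summarizer=llm_summarizer)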
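The clustering step can be pictured as grouping vectors whose cosine similarity clears the threshold. What follows is a simplified greedy sketch over toy 2-D vectors, not ReflectionEngine's actual algorithm, which this README does not spell out. `cosine` and `greedy_clusters` are hypothetical helpers; real all-MiniLM-L6-v2 embeddings have 384 dimensions.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def greedy_clusters(vectors, threshold=0.75):
    """Put each vector in the first cluster whose seed it resembles, else start a new one.

    Returns lists of indices into `vectors`; the first index of each list is the seed.
    """
    clusters = []
    for i, vec in enumerate(vectors):
        for cluster in clusters:
            if cosine(vectors[cluster[0]], vec) >= threshold:
                cluster.append(i)
                break
        else:
            clusters.append([i])
    return clusters


toy = [(1.0, 0.0), (0.9, 0.1), (0.0, 1.0), (0.1, 0.9)]
print(greedy_clusters(toy))  # [[0, 1], [2, 3]]
```

In a scheme like this, only clusters with more than one member are worth handing to a summarizer, and a higher threshold yields smaller, tighter clusters.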
File details
Details for the file ai_houkai-0.1.1.tar.gz.
File metadata
- Download URL: ai_houkai-0.1.1.tar.gz
- Upload date:
- Size: 21.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | a911e4cfa73ff80c2651ed8570b5582c361226a93c59b6cbe3baad82d8092d4b |
| MD5 | 0da1d6813989ca7c8a0fd3a3e574e7a2 |
| BLAKE2b-256 | 4124ada42a57fa2f05fadb8546f823ce7b2f220fc3208f3e1f74e740a9f8208a |
File details
Details for the file ai_houkai-0.1.1-py3-none-any.whl.
File metadata
- Download URL: ai_houkai-0.1.1-py3-none-any.whl
- Upload date:
- Size: 14.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | fe00c8acdf719c1afa9098cdb39a67e6a50054628f7bd3211c4b74867d2c52ee |
| MD5 | 5a5982b09673550f2f1b1052ecc0e2a9 |
| BLAKE2b-256 | e4d532e4531c5fcca326cdf888207ae63d558a1668ef00a170819cef5294d044 |