Mimir
A plastic memory well for coding agents.
Unlike vector notebooks, Mimir reshapes its own embedding space as you work. One matrix. Zero cloud. Always local.
Why Mimir?
Most agent memory tools are filing cabinets: they store text, then search it. They don't learn. They don't adapt. They just retrieve.
Mimir is different.
It maintains a small, fixed-size prototype matrix that is updated on every interaction via Hebbian-style local learning. The base embedding model's weights stay frozen; Mimir overlays a fast, plastic layer on top that bends toward your domain, your project, and your habits.
The result: a memory system that feels less like a database and more like a second brain that gets sharper the more you use it.
What makes Mimir unique
| | Other memory tools | Mimir |
|---|---|---|
| Core model | Store and retrieve discrete facts | Learn a plastic prototype matrix |
| Learning | Heuristic scoring, TTL decay, or no learning at all | Hebbian / Oja local updates + Markov prediction |
| State size | Often grows with history (SQLite/vector DB) | Fixed [k × dim] matrix, typically < 100 MB |
| Offline | Needs cloud API or hosted vector DB | Runs locally with llama-server, sentence-transformers, or fake backend |
| Inference | Calls DB/index on every recall | Pure matrix operation, no locks, predictable latency |
| Hook noise | Stores every short reply, including "ok" and "continue" | Language-aware small-talk filtering + importance gating |
| Security | Stores secrets verbatim | Automatic API key / token / password redaction |
| Project context | Manual copy-paste of instruction files | Auto-discovers AGENTS.md, CLAUDE.md, .cursorrules |
| Quality | Duplicate memories accumulate | Duplicate blocking + contradiction hints at store time |
Secret redaction patterns can be customized or disabled entirely via
MimirConfig(redaction_patterns=...) (None = defaults, [] = disabled).
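As a rough illustration of the None / [] semantics described above, here is a minimal regex-based redaction sketch. The two patterns, the `[REDACTED]` placeholder, and the `redact` helper are assumptions made for this example; they are not Mimir's actual defaults or API.

```python
import re

# Hypothetical default patterns for illustration; Mimir's real defaults may differ.
DEFAULT_PATTERNS = [
    r"sk-[A-Za-z0-9]{20,}",                   # API-key-like tokens
    r"(?i)\b(password|token)\s*[:=]\s*\S+",   # key=value style secrets
]


def redact(text, patterns=None, placeholder="[REDACTED]"):
    """Replace every match of the active patterns with a placeholder.

    patterns=None uses the defaults; patterns=[] disables redaction.
    """
    active = DEFAULT_PATTERNS if patterns is None else patterns
    for pattern in active:
        text = re.sub(pattern, placeholder, text)
    return text


print(redact("my key is sk-abcdefghijklmnopqrstuvwx"))
print(redact("password = hunter2", patterns=[]))  # redaction disabled
```

Passing a custom list replaces the defaults entirely in this sketch, which mirrors the None / [] distinction but is otherwise a guess at the behavior.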
Quick Start
pip install mimir-core
Use it from Python
from mimir import Mimir, MimirConfig
config = MimirConfig(
base_model="all-MiniLM-L6-v2",
num_prototypes=64,
top_k=4,
)
mimir = Mimir(config)
# Encode
emb = mimir.encode("hello world")
print(emb.shape) # (1, 384)
# Learn
report = mimir.learn("hello world", importance=1.0)
print(report)
# Save / load
mimir.save("checkpoint.pt")
mimir.load("checkpoint.pt")
Plug it into your coding agent
Mimir exposes an MCP server and Agent CLI hooks for Kimi Code, Claude Code,
Codex, and OpenCode. Each workspace is isolated under ~/.mimir/workspaces/.
Start the MCP server
# Local embedding backend (recommended)
llama-server \
--model Qwen3-Embedding-8B-Q4_K_M.gguf \
--embeddings \
--port 11435
# Or use sentence-transformer if you don't have llama-server
mimir mcp --backend sentence-transformer
Configure Kimi Code / Claude Code / Codex (.mcp.json)
{
"mcpServers": {
"mimir": {
"command": "mimir",
"args": ["mcp", "--backend", "sentence-transformer"]
}
}
}
Configure OpenCode (.opencode/opencode.jsonc)
{
"mcp": {
"mimir": {
"type": "local",
"command": ["mimir", "mcp", "--backend", "sentence-transformer"]
}
}
}
One-shot setup with mimir setup
Instead of editing config files by hand, you can install hooks automatically:
mimir setup kimi-code
mimir setup claude-code
mimir setup codex
mimir setup opencode
Use --base-dir to write the configuration somewhere other than the agent's
default config directory (useful for custom dotfiles layouts or CI):
mimir setup kimi-code --base-dir ~/my-configs/kimi-code
This writes the correct hook definitions for each agent CLI and is safe to run multiple times.
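To show what "safe to run multiple times" can mean in practice, the sketch below merges the mimir server entry from the .mcp.json example into an existing file without touching other servers, and reports whether anything changed. The helper name `ensure_mimir_server` is hypothetical, and whether mimir setup works this way internally is an assumption.

```python
import json
from pathlib import Path

# Server entry copied from the .mcp.json example in this README.
MIMIR_ENTRY = {"command": "mimir", "args": ["mcp", "--backend", "sentence-transformer"]}


def ensure_mimir_server(config_path):
    """Add the mimir entry to an .mcp.json file, preserving other servers.

    Returns True if the file was changed, False if it was already up to date.
    """
    path = Path(config_path)
    config = json.loads(path.read_text()) if path.exists() else {}
    servers = config.setdefault("mcpServers", {})
    if servers.get("mimir") == MIMIR_ENTRY:
        return False  # nothing to do on repeated runs
    servers["mimir"] = MIMIR_ENTRY
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(config, indent=2) + "\n")
    return True
```

Comparing before writing keeps repeated runs from rewriting the file, which is the property that makes a setup step idempotent.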
For AI Agents
If you are an AI agent installing Mimir on behalf of a user, use this checklist.
1. Install Mimir
pip install mimir-core
Or install from the repo root:
pip install -e .
Verify:
mimir --help
python -m mimir.hooks.mimir_turn --help
2. Start an embedding backend
Choose one backend. The default is llama-server at http://127.0.0.1:11435.
- sentence-transformer — easiest, no extra server, first load ~20s:
mimir mcp --backend sentence-transformer
- llama-server — higher quality, requires a running server:
llama-server \
--model Qwen3-Embedding-8B-Q4_K_M.gguf \
--embeddings \
--port 11435
- ollama — if the user already runs Ollama:
mimir mcp --backend ollama --base-url http://127.0.0.1:11434 --model ...
3. Add the MCP server to the user's agent config
- OpenCode (~/.config/opencode/opencode.json):
  { "mcp": { "mimir": { "type": "local", "command": ["mimir", "mcp", "--backend", "sentence-transformer"] } } }
- Claude Code / Kimi Code / Codex (.mcp.json):
  { "mcpServers": { "mimir": { "command": "mimir", "args": ["mcp", "--backend", "sentence-transformer"] } } }
4. Install automatic agent hooks (optional)
mimir setup kimi-code
mimir setup claude-code
mimir setup codex
mimir setup opencode
This adds UserPromptSubmit/Stop hooks so Mimir recalls context on each turn
and stores the user/assistant exchange automatically.
5. Tell the user to restart their agent CLI
MCP servers and hooks are loaded on startup. After restarting, the agent can use:
- store(text) / recall(query) via MCP.
- Automatic recall/store via hooks if installed.
Manual hook configuration
# ~/.kimi-code/config.toml
[[hooks]]
event = "UserPromptSubmit"
command = "python3 -m mimir.hooks.mimir_turn"
timeout = 10
[[hooks]]
event = "Stop"
command = "python3 -m mimir.hooks.mimir_turn"
timeout = 10
Both the MCP store() and the hook Stop path apply the same redaction,
filtering, and duplicate-blocking pipeline, so matched secrets are redacted
before anything is persisted, regardless of how a memory is captured.
See docs/mcp-user-guide.md for the full hook guide.
How It Works
Text
│
▼
[Base embedding model] ───────┐
│ │
▼ │
Base embedding │
│ │
▼ │
[Prototype matrix lookup] │
│ │
▼ │
Sparse prototype activation │
│ │
▼ │
Residual modulation ◄─────────┘
│
▼
Mimir embedding
- Slow weights: the frozen base embedding model gives stable semantic priors.
- Fast weights: a fixed-capacity prototype matrix encodes your local domain.
- Learning: each input activates the nearest prototypes and nudges them toward the new observation.
- Prediction: a first-order Markov transition matrix predicts the next prototype and emits a surprise_score.
- Forgetting: prototype strength decays exponentially; weak prototypes are overwritten when capacity is full.
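The learning and forgetting steps listed above can be sketched in pure Python as follows. The update rule (a plain move-toward-the-input step), the learning rate, the decay factor, and the name `learn_step` are all assumptions for illustration; Mimir's actual Hebbian / Oja rule is not shown in this README.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def learn_step(prototypes, strengths, x, top_k=2, lr=0.1, decay=0.99):
    """One plastic update on a fixed-size prototype list.

    All strengths decay, then the top_k most similar prototypes are
    nudged toward x and strengthened. Returns the activated indices.
    """
    for i in range(len(strengths)):
        strengths[i] *= decay  # exponential forgetting
    sims = [cosine(p, x) for p in prototypes]
    winners = sorted(range(len(prototypes)), key=lambda i: sims[i], reverse=True)[:top_k]
    for i in winners:
        # Move the prototype a fraction lr of the way toward the observation.
        prototypes[i] = [p + lr * (xi - p) for p, xi in zip(prototypes[i], x)]
        strengths[i] += 1.0
    return winners


prototypes = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
strengths = [1.0, 1.0, 1.0]
print(learn_step(prototypes, strengths, [1.0, 0.2], top_k=1))
```

The matrix never grows: only existing rows move. Overwriting the weakest prototype when nothing matches well is left out of this sketch.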
This design is inspired by Prototype Theory in cognitive psychology and Predictive Coding in neuroscience: memory is not a pile of events, but a compressed set of typical examples that continuously updates itself.
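The prediction step can be illustrated with a small first-order Markov model over prototype indices. This is a sketch under stated assumptions: surprise is taken to be the negative log-probability of the observed transition, add-one smoothing is used for unseen transitions, and the class name `MarkovPredictor` is hypothetical. How Mimir actually computes surprise_score is not documented here.

```python
import math
from collections import defaultdict


class MarkovPredictor:
    """First-order transition counts over prototype indices."""

    def __init__(self, num_prototypes):
        self.k = num_prototypes
        self.counts = defaultdict(lambda: defaultdict(int))
        self.prev = None

    def prob(self, prev, nxt):
        # Add-one smoothing so unseen transitions keep a nonzero probability.
        row = self.counts[prev]
        return (row[nxt] + 1) / (sum(row.values()) + self.k)

    def observe(self, current):
        """Return the surprise of `current` given the previous prototype,
        then record the transition. The first observation has surprise 0."""
        surprise = 0.0
        if self.prev is not None:
            surprise = -math.log(self.prob(self.prev, current))
            self.counts[self.prev][current] += 1
        self.prev = current
        return surprise


model = MarkovPredictor(num_prototypes=4)
for index in [0, 1, 0, 1]:
    print(index, round(model.observe(index), 3))
```

A transition that has been seen before scores lower surprise than a novel one, which is the signal a predictive-coding style system would use to decide what deserves attention.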
MCP Tools
| Tool | Purpose |
|---|---|
| store(text, importance=1.0) | Store and learn from text. Secrets are redacted before storage; the response text field is the redacted form. With async store enabled, returns "pending" and processes in the background. |
| recall(query, top_k=5, min_score=0.0) | Hybrid vector + BM25 recall, reranked by lifecycle metadata |
| consolidate() | Consolidate the working memory buffer |
| forget() | Clear the current session's working memory |
| checkpoint(name) | Save a named checkpoint |
| restore(name) | Restore to a named checkpoint |
| status() | Show session stats. When async store is enabled, includes async_store with enabled and pending_count. |
store() may also include reason, similar_memory, or contradictions in its
response when a memory is rejected or appears to contradict an existing memory.
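To make the "hybrid vector + BM25" idea behind recall() concrete, the sketch below scores documents with a standard Okapi BM25 formula and blends the result with precomputed vector similarities. The blend weight `alpha`, the max-normalization of BM25 scores, and both function names are assumptions; the lifecycle reranking step is omitted because the README does not describe it.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Okapi BM25 score of each document for a whitespace-tokenized query."""
    tokenized = [d.lower().split() for d in docs]
    avgdl = sum(len(t) for t in tokenized) / max(len(tokenized), 1)
    df = Counter(term for t in tokenized for term in set(t))
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            idf = math.log(1 + (len(docs) - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def hybrid_rank(vector_scores, keyword_scores, alpha=0.5, top_k=5, min_score=0.0):
    """Blend vector and max-normalized BM25 scores; return (index, score) pairs."""
    top = max(keyword_scores, default=0.0) or 1.0
    blended = [alpha * v + (1 - alpha) * k / top
               for v, k in zip(vector_scores, keyword_scores)]
    ranked = sorted(enumerate(blended), key=lambda pair: pair[1], reverse=True)
    return [(i, s) for i, s in ranked if s >= min_score][:top_k]


docs = ["python quicksort implementation", "rust borrow checker notes", "python sorting tips"]
keyword = bm25_scores("python sorting", docs)
print(hybrid_rank([0.2, 0.9, 0.3], keyword, top_k=2))
```

The keyword signal lets an exact term match outrank a document that is only close in embedding space, which is the usual motivation for combining the two.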
Quality Gate
store() runs a lightweight quality gate before learning:
- Duplicate blocking: near-duplicate memories (cosine similarity ≥ 0.95) are rejected instead of accumulating.
- Contradiction hints: simple negation/polarity checks flag pairs like "I use Python" vs "I don't use Python". The memory is still stored, but the result includes the hint so the agent can ask before acting on stale context.
Both checks can be disabled via MimirConfig:
MimirConfig(quality_gate_enabled=False)
Or tune the thresholds:
MimirConfig(
quality_gate_enabled=True,
quality_gate_duplicate_threshold=0.95,
quality_gate_contradiction_threshold=0.85,
)
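The two quality-gate checks can be approximated in a few lines. In this sketch, duplicate blocking is a cosine-similarity threshold over embeddings, and the contradiction hint is a naive word-set comparison that ignores a small hand-picked list of negation words. The negation list and both helper names are assumptions, and the role of the contradiction threshold is not modeled.

```python
import math
import re

# Hypothetical negation vocabulary for illustration only.
NEGATIONS = {"not", "no", "never", "don't", "doesn't", "dont", "doesnt"}


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def is_duplicate(new_vec, stored_vecs, threshold=0.95):
    """True if any stored embedding is at least `threshold` similar."""
    return any(cosine(new_vec, v) >= threshold for v in stored_vecs)


def contradiction_hint(new_text, old_text):
    """Flag two texts that share the same words apart from a negation."""
    def split(text):
        words = set(re.findall(r"[a-z']+", text.lower()))
        return words - NEGATIONS, bool(words & NEGATIONS)

    new_words, new_negated = split(new_text)
    old_words, old_negated = split(old_text)
    return new_words == old_words and new_negated != old_negated


print(is_duplicate([1.0, 0.0], [[0.99, 0.05]]))
print(contradiction_hint("I use Python", "I don't use Python"))
```

A real polarity check would need to handle paraphrases and other languages; this version only catches pairs that differ by a single negation word.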
Async store
For MCP / agent integrations where the embedding backend is slow, you can defer
store() to a background worker so the tool returns immediately:
MimirConfig(
async_store_enabled=True,
async_store_queue_size=1000,
async_store_flush_timeout=5.0,
)
When async storage is enabled, store() returns one of:
- {"stored": "pending", "text": ..., "memory_count": ..., "pending_count": ...} when the item is enqueued successfully.
- {"stored": False, "text": ..., "memory_count": ..., "reason": "queue_full"} when the queue is at capacity.
The background worker performs duplicate checks, learning, and persistence. On MCP server shutdown the queue is flushed so pending memories are not lost.
status() includes an async_store dictionary with enabled, pending_count,
and worker_alive so you can monitor the queue health.
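The enqueue-or-reject behavior described above maps naturally onto a bounded queue with one worker thread. The sketch below uses only the standard library; the `AsyncStore` class, its method names, and the sentinel-based shutdown are assumptions, and the memory_count field is left out for brevity.

```python
import queue
import threading


class AsyncStore:
    """Bounded background queue: store() returns at once, a worker learns later."""

    def __init__(self, learn_fn, queue_size=1000):
        self._queue = queue.Queue(maxsize=queue_size)
        self._learn = learn_fn
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def _run(self):
        while True:
            item = self._queue.get()
            if item is None:  # sentinel: stop the worker
                return
            self._learn(item)

    def store(self, text):
        try:
            self._queue.put_nowait(text)
        except queue.Full:
            return {"stored": False, "text": text, "reason": "queue_full"}
        return {"stored": "pending", "text": text, "pending_count": self._queue.qsize()}

    def status(self):
        return {"enabled": True, "pending_count": self._queue.qsize(),
                "worker_alive": self._worker.is_alive()}

    def flush(self, timeout=5.0):
        """Let queued items drain, stop the worker, wait up to `timeout` seconds."""
        self._queue.put(None)
        self._worker.join(timeout)
        return not self._worker.is_alive()


learned = []
store = AsyncStore(learned.append, queue_size=10)
print(store.store("remember this")["stored"])
print(store.flush(), learned)
```

Because the sentinel is queued behind pending items, everything stored before flush() is processed first, which matches the "pending memories are not lost" behavior on shutdown.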
Other useful configuration fields
| Field | Default | Purpose |
|---|---|---|
| redaction_enabled | True | Enable secret redaction |
| redaction_patterns | None | Custom regex list (None = defaults, [] = disabled) |
| project_context_enabled | True | Auto-ingest AGENTS.md / CLAUDE.md / .cursorrules |
| project_context_importance | 1.5 | Importance assigned to project context memories |
| async_store_enabled | False | Defer embedding/learning to background worker |
| async_store_queue_size | 1000 | Max pending items for async store |
| async_store_flush_timeout | 5.0 | Seconds to wait for flush on shutdown |
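As an illustration of what project context discovery might involve, the sketch below looks for the three instruction files in a starting directory and each of its parents, tagging what it finds with the default importance of 1.5. The upward search strategy, the returned dictionary shape, and the name `discover_project_context` are assumptions rather than Mimir's documented behavior.

```python
from pathlib import Path

# File names taken from the project_context_enabled description.
CONTEXT_FILES = ("AGENTS.md", "CLAUDE.md", ".cursorrules")


def discover_project_context(start_dir, importance=1.5):
    """Collect known instruction files from start_dir and its parents.

    Results are ordered nearest directory first.
    """
    found = []
    start = Path(start_dir).resolve()
    for directory in (start, *start.parents):
        for name in CONTEXT_FILES:
            candidate = directory / name
            if candidate.is_file():
                found.append({
                    "path": str(candidate),
                    "text": candidate.read_text(encoding="utf-8"),
                    "importance": importance,
                })
    return found


for item in discover_project_context("."):
    print(item["path"], item["importance"])
```

Searching parent directories means a file at the repository root is still found when the agent starts in a subdirectory; a production version would likely stop at the repository boundary.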
Programming Interface
If you want to use Mimir inside your own Python code:
from mimir.adapters.agents import InMemoryAgentAdapter, Message
from mimir.core.config import MimirConfig
adapter = InMemoryAgentAdapter(
config=MimirConfig(base_model="all-MiniLM-L6-v2", top_k=4),
)
adapter.observe([
Message(role="user", content="Please write quicksort in Python"),
Message(role="assistant", content="..."),
])
adapter.consolidate()
memories = adapter.recall("Python sorting", top_k=3)
print(adapter.memory_count)
adapter.clear_memories()
AgentMemoryInterface also exposes encode(texts) to retrieve embeddings for a
list of texts, which is useful for custom duplicate checks or integrations:
embeddings = adapter.encode(["hello world", "goodbye world"])
See docs/agent-integration.md for the adapter API.
CLI
# Encode
mimir encode --backend sentence-transformer "hello world"
# Learn
mimir learn --backend llama-server "important context"
# Evaluate
python -m mimir.eval --backend llama-server --top-k 4
Status & Roadmap
Mimir is currently v0.3.0.
- MVP: encode / learn / save / load
- Top-k sparse prototype activation
- EventBus + PredictionPolicy + surprise score
- MCP server + Agent CLI hooks
- BM25 + lifecycle hybrid recall
- Multi-language small-talk filtering for automatic hook capture
- Secret redaction for API keys, tokens, and passwords
- Project context discovery (AGENTS.md, CLAUDE.md, .cursorrules)
- mimir setup <agent> one-shot configuration
- Duplicate blocking and contradiction hints
- Async embedding queue
- SQLite-backed memory metadata
- HITL preview before storing high-impact memories
See docs/roadmap.md for the full roadmap.
Development
Mimir uses GitHub Actions to run checks on every push and PR. The CI matrix tests Python 3.10, 3.11, and 3.12, plus the OpenCode TypeScript plugin.
Set up a local development environment:
pip install -e ".[dev,server,api]"
The extras are:
- dev: pytest, ruff, mypy
- server: sentence-transformers embedding backend
- api: OpenAI-compatible client dependencies
Run the same checks as CI:
ruff check mimir
mypy mimir
python -m pytest --tb=short
For the OpenCode plugin you need Node.js 20+:
cd plugins/opencode
npm ci
npm run typecheck
Secret scanning
Some test fixtures and evaluation data contain intentionally fake secrets or
public benchmark dialogue. These files are listed in .gitleaksignore so
Gitleaks does not flag them. Do not add real credentials to those files or
to the ignore list.
Run Gitleaks locally before committing:
gitleaks detect --source . --verbose
If it reports a false positive in test/eval data, add the file to
.gitleaksignore only after confirming it contains no real credentials, and
keep the ignore list in sync with CI.
Embedding Backend Performance
Mimir supports multiple embedding backends. Choose based on your hardware and latency requirements.
Measured on Apple Silicon (M-series), 128-sample batch:
| Backend | Model | Dim | Cold Start | Short Text | Long Text | Batch-128 | Throughput | Memory |
|---|---|---|---|---|---|---|---|---|
| sentence-transformer | all-MiniLM-L6-v2 | 384 | ~20.6s* | 4.9ms | 6.1ms | 23ms | ~4,500/s | ~350MB |
| llama-server | Qwen3-Embedding-8B-Q4_K_M | 4096 | ~43ms | 32ms | 641ms | 10,509ms | ~12/s | ~288MB |
* Cold start includes one-time model download/load. Subsequent runs are fast.
Recommendation
- Use sentence-transformer for local development and everyday use.
- Use llama-server when you need higher-quality embeddings and can accept higher latency.
Run the benchmark yourself:
python scripts/benchmark_embedding_backends.py
License
Apache-2.0
Mimir is named after the Norse guardian of the well of wisdom — a source of knowledge that deepens with every visit.