Mimir

A plastic memory well for coding agents.

Python 3.10+ | License: Apache-2.0

Unlike vector notebooks, Mimir reshapes its own embedding space as you work. One matrix. Zero cloud. Always local.


Why Mimir?

Most agent memory tools are filing cabinets: they store text, then search it. They don't learn. They don't adapt. They just retrieve.

Mimir is different.

It maintains a small, fixed-size prototype matrix that is updated on every interaction via Hebbian-style local learning. The base embedding model stays frozen; on top of it, Mimir overlays a fast, plastic layer that bends toward your domain, your project, and your habits.

The result: a memory system that feels less like a database and more like a second brain that gets sharper the more you use it.

What makes Mimir unique

| Aspect | Other memory tools | Mimir |
|---|---|---|
| Core model | Store and retrieve discrete facts | Learn a plastic prototype matrix |
| Learning | Heuristic scoring, TTL decay, or no learning at all | Hebbian / Oja local updates + Markov prediction |
| State size | Often grows with history (SQLite/vector DB) | Fixed [k × dim] matrix, typically < 100 MB |
| Offline | Needs cloud API or hosted vector DB | Runs locally with llama-server, sentence-transformers, or fake backend |
| Inference | Calls DB/index on every recall | Pure matrix operation, no locks, predictable latency |
| Hook noise | Stores every short reply, including "ok" and "continue" | Language-aware small-talk filtering + importance gating |
| Security | Stores secrets verbatim | Automatic API key / token / password redaction |
| Project context | Manual copy-paste of instruction files | Auto-discovers AGENTS.md, CLAUDE.md, .cursorrules |
| Quality | Duplicate memories accumulate | Duplicate blocking + contradiction hints at store time |

Secret redaction patterns can be customized or disabled entirely via MimirConfig(redaction_patterns=...) (None = defaults, [] = disabled).
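
As a rough illustration of what pattern-based redaction does, the sketch below applies a list of regular expressions to a string before it would be stored. The patterns, the `redact` helper, and the `[REDACTED]` placeholder are assumptions made for this example; they are not Mimir's built-in defaults, only a stand-in for the kind of list you might pass as `redaction_patterns`.

```python
import re
from typing import Optional

# Hypothetical patterns for illustration only; not Mimir's built-in defaults.
EXAMPLE_PATTERNS = [
    r"sk-[A-Za-z0-9]{20,}",                       # API-key-like tokens
    r"(?i)(password|passwd|token)\s*[=:]\s*\S+",  # key=value style secrets
]


def redact(text: str, patterns: Optional[list] = None) -> str:
    """Replace every match of every pattern with a placeholder.

    Follows the documented convention: None falls back to the defaults,
    an empty list disables redaction.
    """
    active = EXAMPLE_PATTERNS if patterns is None else patterns
    for pattern in active:
        text = re.sub(pattern, "[REDACTED]", text)
    return text


print(redact("my key is sk-abcdefghijklmnopqrstuvwx"))  # my key is [REDACTED]
print(redact("password = hunter2", patterns=[]))        # password = hunter2
```

Passing an empty list leaves the text untouched, which matches the "[] = disabled" behavior described above.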


Quick Start

pip install mimir-core

Use it from Python

from mimir import Mimir, MimirConfig

config = MimirConfig(
    base_model="all-MiniLM-L6-v2",
    num_prototypes=64,
    top_k=4,
)
mimir = Mimir(config)

# Encode
emb = mimir.encode("hello world")
print(emb.shape)  # (1, 384)

# Learn
report = mimir.learn("hello world", importance=1.0)
print(report)

# Save / load
mimir.save("checkpoint.pt")
mimir.load("checkpoint.pt")

Plug it into your coding agent

Mimir exposes an MCP server and Agent CLI hooks for Kimi Code, Claude Code, Codex, and OpenCode. Each workspace is isolated under ~/.mimir/workspaces/.

Start the MCP server

# Local embedding backend: llama-server (Mimir's default, expected at http://127.0.0.1:11435)
llama-server \
  --model Qwen3-Embedding-8B-Q4_K_M.gguf \
  --embeddings \
  --port 11435

# Or, if you don't have llama-server, start the MCP server with the sentence-transformer backend
mimir mcp --backend sentence-transformer

Configure Kimi Code / Claude Code / Codex (.mcp.json)

{
  "mcpServers": {
    "mimir": {
      "command": "mimir",
      "args": ["mcp", "--backend", "sentence-transformer"]
    }
  }
}

Configure OpenCode (.opencode/opencode.jsonc)

{
  "mcp": {
    "mimir": {
      "type": "local",
      "command": ["mimir", "mcp", "--backend", "sentence-transformer"]
    }
  }
}

One-shot setup with mimir setup

Instead of editing config files by hand, you can install hooks automatically:

mimir setup kimi-code
mimir setup claude-code
mimir setup codex
mimir setup opencode

Use --base-dir to write the configuration somewhere other than the agent's default config directory (useful for custom dotfiles layouts or CI):

mimir setup kimi-code --base-dir ~/my-configs/kimi-code

This writes the correct hook definitions for each agent CLI and is safe to run multiple times.

For AI Agents

If you are an AI agent installing Mimir on behalf of a user, use this checklist.

1. Install Mimir

pip install mimir-core

Or install from the repo root:

pip install -e .

Verify:

mimir --help
python -m mimir.hooks.mimir_turn --help

2. Start an embedding backend

Choose one backend. The default is llama-server at http://127.0.0.1:11435.

  • sentence-transformer — easiest, no extra server, first load ~20s:
    mimir mcp --backend sentence-transformer
    
  • llama-server — higher quality, requires a running server:
    llama-server \
      --model Qwen3-Embedding-8B-Q4_K_M.gguf \
      --embeddings \
      --port 11435
    
  • ollama — if the user already runs Ollama:
    mimir mcp --backend ollama --base-url http://127.0.0.1:11434 --model ...
    

3. Add the MCP server to the user's agent config

  • OpenCode (~/.config/opencode/opencode.json):
    {
      "mcp": {
        "mimir": {
          "type": "local",
          "command": ["mimir", "mcp", "--backend", "sentence-transformer"]
        }
      }
    }
    
  • Claude Code / Kimi Code / Codex (.mcp.json):
    {
      "mcpServers": {
        "mimir": {
          "command": "mimir",
          "args": ["mcp", "--backend", "sentence-transformer"]
        }
      }
    }
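
If you are scripting this step, it is safer to merge the `mimir` entry into an existing file than to overwrite it. The sketch below does that for the `.mcp.json` shape shown above. The `add_mimir_server` helper is hypothetical (it is not part of Mimir), and the demo writes to a temporary directory rather than to a real config.

```python
import json
import tempfile
from pathlib import Path


def add_mimir_server(config_path: Path, backend: str = "sentence-transformer") -> dict:
    """Merge a "mimir" entry into an .mcp.json-style file, keeping other servers."""
    config = {}
    if config_path.exists():
        config = json.loads(config_path.read_text(encoding="utf-8"))
    servers = config.setdefault("mcpServers", {})
    servers["mimir"] = {"command": "mimir", "args": ["mcp", "--backend", backend]}
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config


# Demo against a throwaway file that already lists another server.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / ".mcp.json"
    path.write_text('{"mcpServers": {"other": {"command": "other-server"}}}', encoding="utf-8")
    merged = add_mimir_server(path)
    print(sorted(merged["mcpServers"]))  # ['mimir', 'other']
```

Running it twice produces the same file, so it is safe to repeat.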
    

4. Install automatic agent hooks (optional)

mimir setup kimi-code
mimir setup claude-code
mimir setup codex
mimir setup opencode

This adds UserPromptSubmit/Stop hooks so Mimir recalls context on each turn and stores the user/assistant exchange automatically.

5. Tell the user to restart their agent CLI

MCP servers and hooks are loaded on startup. After restarting, the agent can use:

  • store(text) / recall(query) via MCP.
  • Automatic recall/store via hooks if installed.

Manual hook configuration

# ~/.kimi-code/config.toml
[[hooks]]
event = "UserPromptSubmit"
command = "python3 -m mimir.hooks.mimir_turn"
timeout = 10

[[hooks]]
event = "Stop"
command = "python3 -m mimir.hooks.mimir_turn"
timeout = 10

Both the MCP store() and the hook Stop path apply the same redaction, filtering, and duplicate-blocking pipeline, so secrets that match the redaction patterns are not persisted, regardless of how a memory is captured.

See docs/mcp-user-guide.md for the full hook guide.


How It Works

Text
  │
  ▼
[Base embedding model] ───────┐
  │                            │
  ▼                            │
Base embedding                │
  │                            │
  ▼                            │
[Prototype matrix lookup]     │
  │                            │
  ▼                            │
Sparse prototype activation   │
  │                            │
  ▼                            │
Residual modulation ◄─────────┘
  │
  ▼
Mimir embedding

  • Slow weights: the frozen base embedding model gives stable semantic priors.
  • Fast weights: a fixed-capacity prototype matrix encodes your local domain.
  • Learning: each input activates the nearest prototypes and nudges them toward the new observation.
  • Prediction: a first-order Markov transition matrix predicts the next prototype and emits a surprise_score.
  • Forgetting: prototype strength decays exponentially; weak prototypes are overwritten when capacity is full.
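
The fast-weight steps in the list above can be made concrete with a small NumPy sketch. It is an illustration under assumptions, not Mimir's implementation: the class name `PrototypeLayer`, the softmax weighting, the `mix`, `lr`, and `decay` parameters, and the simple "move toward the input, then renormalize" update are all choices made for this example. Overwriting weak prototypes when capacity is full is left out for brevity.

```python
import numpy as np


def normalize(v: np.ndarray) -> np.ndarray:
    """Scale vectors to unit length along the last axis."""
    return v / (np.linalg.norm(v, axis=-1, keepdims=True) + 1e-12)


class PrototypeLayer:
    """Toy fixed-capacity prototype matrix with top-k activation (illustrative only)."""

    def __init__(self, num_prototypes: int, dim: int, top_k: int = 4,
                 lr: float = 0.1, decay: float = 0.99, mix: float = 0.2, seed: int = 0):
        rng = np.random.default_rng(seed)
        self.prototypes = normalize(rng.normal(size=(num_prototypes, dim)))
        self.strength = np.zeros(num_prototypes)
        self.top_k, self.lr, self.decay, self.mix = top_k, lr, decay, mix

    def _activate(self, base: np.ndarray):
        """Return indices and softmax weights of the top-k most similar prototypes."""
        sims = self.prototypes @ base
        idx = np.argsort(sims)[-self.top_k:]
        w = np.exp(sims[idx] - sims[idx].max())
        return idx, w / w.sum()

    def encode(self, base: np.ndarray) -> np.ndarray:
        """Residual modulation: base embedding plus a small prototype-driven shift."""
        idx, w = self._activate(base)
        return normalize(base + self.mix * (w @ self.prototypes[idx]))

    def learn(self, base: np.ndarray, importance: float = 1.0) -> None:
        """Nudge the active prototypes toward the observation; decay all strengths."""
        idx, w = self._activate(base)
        step = (self.lr * importance * w)[:, None]
        self.prototypes[idx] = normalize(
            self.prototypes[idx] + step * (base - self.prototypes[idx])
        )
        self.strength *= self.decay        # exponential forgetting
        self.strength[idx] += importance * w


layer = PrototypeLayer(num_prototypes=64, dim=384)
x = normalize(np.random.default_rng(1).normal(size=384))
before = float(layer.encode(x) @ x)
for _ in range(50):
    layer.learn(x)
after = float(layer.encode(x) @ x)
print(f"alignment before={before:.4f} after={after:.4f}")
```

The state stays a fixed `[k × dim]` array no matter how many times `learn` is called, which is the property the design relies on.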

This design is inspired by Prototype Theory in cognitive psychology and Predictive Coding in neuroscience: memory is not a pile of events, but a compressed set of typical examples that continuously updates itself.
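
The prediction step from the list above can be sketched in the same spirit. The example keeps first-order transition counts between prototype indices and reports surprise as the negative log-probability of the observed transition. That definition of surprise, the Laplace smoothing, and the `MarkovPredictor` name are assumptions for illustration; the source does not specify how `surprise_score` is computed.

```python
import math
from collections import defaultdict


class MarkovPredictor:
    """First-order transition counts over prototype indices (illustrative only)."""

    def __init__(self, num_states: int, smoothing: float = 1.0):
        self.num_states = num_states
        self.smoothing = smoothing
        self.counts = defaultdict(lambda: defaultdict(float))
        self.prev = None

    def prob(self, src: int, dst: int) -> float:
        """Laplace-smoothed estimate of P(dst | src)."""
        row = self.counts[src]
        total = sum(row.values()) + self.smoothing * self.num_states
        return (row[dst] + self.smoothing) / total

    def observe(self, state: int) -> float:
        """Record a transition and return its surprise, -log P(state | previous)."""
        surprise = 0.0
        if self.prev is not None:
            surprise = -math.log(self.prob(self.prev, state))
            self.counts[self.prev][state] += 1.0
        self.prev = state
        return surprise


predictor = MarkovPredictor(num_states=4)
scores = [predictor.observe(s) for s in [0, 1, 0, 1, 0, 1, 0, 3]]
print([round(s, 3) for s in scores])
```

In this run the repeated 0 → 1 transition becomes less surprising each time it recurs, while the final, never-seen 0 → 3 transition scores highest.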


MCP Tools

| Tool | Purpose |
|---|---|
| store(text, importance=1.0) | Store and learn from text. Secrets are redacted before storage; the response text field is the redacted form. With async store enabled, returns "pending" and processes in the background. |
| recall(query, top_k=5, min_score=0.0) | Hybrid vector + BM25 recall, reranked by lifecycle metadata |
| consolidate() | Consolidate the working memory buffer |
| forget() | Clear the current session's working memory |
| checkpoint(name) | Save a named checkpoint |
| restore(name) | Restore to a named checkpoint |
| status() | Show session stats. When async store is enabled, includes async_store with enabled, pending_count, and worker_alive. |

store() may also include reason, similar_memory, or contradictions in its response when a memory is rejected or appears to contradict an existing memory.
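
An agent-side helper for interpreting these responses might look like the sketch below. It relies only on the keys named in this README (`stored`, `reason`, `similar_memory`, `contradictions`, `pending_count`). The helper name `describe_store_result` and the assumption that a successful synchronous store returns `"stored": True` are hypothetical.

```python
def describe_store_result(result: dict) -> str:
    """Turn a store() response dict into a short human-readable status line."""
    stored = result.get("stored")
    if stored == "pending":
        return f"queued ({result.get('pending_count', '?')} pending)"
    if not stored:
        reason = result.get("reason", "unknown")
        similar = result.get("similar_memory")
        return f"rejected: {reason}" + (f" (similar to: {similar!r})" if similar else "")
    if result.get("contradictions"):
        return f"stored, but may contradict: {result['contradictions']}"
    return "stored"


print(describe_store_result({"stored": False, "reason": "queue_full"}))
# rejected: queue_full
print(describe_store_result({"stored": "pending", "pending_count": 3}))
# queued (3 pending)
```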


Quality Gate

store() runs a lightweight quality gate before learning:

  • Duplicate blocking: near-duplicate memories (cosine similarity ≥ 0.95) are rejected instead of accumulating.
  • Contradiction hints: simple negation/polarity checks flag pairs like "I use Python" vs "I don't use Python". The memory is still stored, but the result includes the hint so the agent can ask before acting on stale context.

Both checks can be disabled via MimirConfig:

MimirConfig(quality_gate_enabled=False)

Or tune the thresholds:

MimirConfig(
    quality_gate_enabled=True,
    quality_gate_duplicate_threshold=0.95,
    quality_gate_contradiction_threshold=0.85,
)
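
To show how two thresholds like these could interact, here is a minimal sketch of a gate built on cosine similarity plus a naive negation check. It is an assumption-laden illustration: the `quality_gate` function, the negation word list, the `"duplicate"` reason string, and the reading of the contradiction threshold as a minimum similarity for comparing polarity are all hypothetical and may differ from what Mimir does internally.

```python
import re
import numpy as np

NEGATIONS = re.compile(r"\b(not|never|no|don't|doesn't|didn't|won't|can't)\b", re.IGNORECASE)


def cosine(a, b) -> float:
    """Cosine similarity between two vectors."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))


def quality_gate(new_text, new_vec, memories, dup_threshold=0.95, contra_threshold=0.85):
    """memories: list of (text, vector) pairs. Returns a dict resembling a store() result."""
    hints = []
    for text, vec in memories:
        sim = cosine(new_vec, vec)
        if sim >= dup_threshold:
            # Near-duplicate: reject instead of accumulating.
            return {"stored": False, "reason": "duplicate", "similar_memory": text}
        polarity_differs = bool(NEGATIONS.search(new_text)) != bool(NEGATIONS.search(text))
        if sim >= contra_threshold and polarity_differs:
            # Similar topic, opposite polarity: store, but attach a hint.
            hints.append(text)
    result = {"stored": True}
    if hints:
        result["contradictions"] = hints
    return result


memories = [("I use Python", [1.0, 0.0, 0.0])]
print(quality_gate("I use Python daily", [0.99, 0.05, 0.0], memories))
print(quality_gate("I don't use Python", [0.9, 0.4, 0.0], memories))
```

The hand-written vectors stand in for real embeddings; in practice they would come from `encode`.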

Async store

For MCP / agent integrations where the embedding backend is slow, you can defer store() to a background worker so the tool returns immediately:

MimirConfig(
    async_store_enabled=True,
    async_store_queue_size=1000,
    async_store_flush_timeout=5.0,
)

When async storage is enabled, store() returns one of:

  • {"stored": "pending", "text": ..., "memory_count": ..., "pending_count": ...} when the item is enqueued successfully.
  • {"stored": False, "text": ..., "memory_count": ..., "reason": "queue_full"} when the queue is at capacity.

The background worker performs duplicate checks, learning, and persistence. On MCP server shutdown the queue is flushed so pending memories are not lost.

status() includes an async_store dictionary with enabled, pending_count, and worker_alive so you can monitor the queue health.
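
The general pattern behind this behavior is a bounded queue drained by one background worker. The sketch below shows that pattern with the standard library only. The `AsyncStore` class is hypothetical and much simpler than a real implementation (for example, it omits `memory_count` and does no error handling in the worker), but it returns responses shaped like the ones described above.

```python
import queue
import threading


class AsyncStore:
    """Bounded queue plus one worker thread (illustrative, not Mimir's implementation)."""

    def __init__(self, process, queue_size: int = 1000):
        self._process = process                      # callable doing the slow work
        self._queue = queue.Queue(maxsize=queue_size)
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def _run(self) -> None:
        while True:
            item = self._queue.get()
            try:
                if item is None:                     # sentinel: stop the worker
                    return
                self._process(item)
            finally:
                self._queue.task_done()

    def store(self, text: str) -> dict:
        """Enqueue without blocking; report queue_full when at capacity."""
        try:
            self._queue.put_nowait(text)
        except queue.Full:
            return {"stored": False, "text": text, "reason": "queue_full"}
        return {"stored": "pending", "text": text, "pending_count": self._queue.qsize()}

    def status(self) -> dict:
        return {"enabled": True, "pending_count": self._queue.qsize(),
                "worker_alive": self._worker.is_alive()}

    def shutdown(self, flush_timeout: float = 5.0) -> None:
        """Queue a stop sentinel behind pending items, then wait up to flush_timeout seconds."""
        self._queue.put(None)
        self._worker.join(timeout=flush_timeout)


seen = []
store = AsyncStore(process=seen.append, queue_size=2)
result = store.store("remember this")
store.shutdown()
print(result["stored"], seen)
```

Because the queue is first-in, first-out, the stop sentinel is only reached after everything enqueued before it has been processed, which is what makes the shutdown a flush.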

Other useful configuration fields

| Field | Default | Purpose |
|---|---|---|
| redaction_enabled | True | Enable secret redaction |
| redaction_patterns | None | Custom regex list (None = defaults, [] = disabled) |
| project_context_enabled | True | Auto-ingest AGENTS.md / CLAUDE.md / .cursorrules |
| project_context_importance | 1.5 | Importance assigned to project context memories |
| async_store_enabled | False | Defer embedding/learning to background worker |
| async_store_queue_size | 1000 | Max pending items for async store |
| async_store_flush_timeout | 5.0 | Seconds to wait for flush on shutdown |
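
For a sense of what project context discovery involves, the sketch below looks for the three instruction files named above, starting in a directory and walking up through its parents. The search order, the "nearest ancestor wins" rule, and the `discover_project_context` helper are assumptions for illustration; the README does not describe the actual discovery algorithm.

```python
import tempfile
from pathlib import Path

CONTEXT_FILES = ("AGENTS.md", "CLAUDE.md", ".cursorrules")


def discover_project_context(start: Path) -> list:
    """Return instruction files found in `start` or its nearest ancestor that has any."""
    start = start.resolve()
    for directory in (start, *start.parents):
        found = [directory / name for name in CONTEXT_FILES if (directory / name).is_file()]
        if found:
            return found
    return []


# Demo: the file lives at the project root, the search starts two levels down.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "AGENTS.md").write_text("Use ruff.\n", encoding="utf-8")
    nested = root / "src" / "pkg"
    nested.mkdir(parents=True)
    print([p.name for p in discover_project_context(nested)])  # ['AGENTS.md']
```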

Programming Interface

If you want to use Mimir inside your own Python code:

from mimir.adapters.agents import InMemoryAgentAdapter, Message
from mimir.core.config import MimirConfig

adapter = InMemoryAgentAdapter(
    config=MimirConfig(base_model="all-MiniLM-L6-v2", top_k=4),
)

adapter.observe([
    Message(role="user", content="Please write quicksort in Python"),
    Message(role="assistant", content="..."),
])
adapter.consolidate()
memories = adapter.recall("Python sorting", top_k=3)

print(adapter.memory_count)
adapter.clear_memories()

AgentMemoryInterface also exposes encode(texts) to retrieve embeddings for a list of texts, which is useful for custom duplicate checks or integrations:

embeddings = adapter.encode(["hello world", "goodbye world"])

See docs/agent-integration.md for the adapter API.


CLI

# Encode
mimir encode --backend sentence-transformer "hello world"

# Learn
mimir learn --backend llama-server "important context"

# Evaluate
python -m mimir.eval --backend llama-server --top-k 4

Status & Roadmap

Mimir is currently v0.3.0.

  • MVP: encode / learn / save / load
  • Top-k sparse prototype activation
  • EventBus + PredictionPolicy + surprise score
  • MCP server + Agent CLI hooks
  • BM25 + lifecycle hybrid recall
  • Multi-language small-talk filtering for automatic hook capture
  • Secret redaction for API keys, tokens, and passwords
  • Project context discovery (AGENTS.md, CLAUDE.md, .cursorrules)
  • mimir setup <agent> one-shot configuration
  • Duplicate blocking and contradiction hints
  • Async embedding queue
  • SQLite-backed memory metadata
  • HITL preview before storing high-impact memories

See docs/roadmap.md for the full roadmap.


Development

Mimir uses GitHub Actions to run checks on every push and PR. The CI matrix tests Python 3.10, 3.11, and 3.12, plus the OpenCode TypeScript plugin.

Set up a local development environment:

pip install -e ".[dev,server,api]"

The extras are:

  • dev — pytest, ruff, mypy
  • server — sentence-transformers embedding backend
  • api — OpenAI-compatible client dependencies

Run the same checks as CI:

ruff check mimir
mypy mimir
python -m pytest --tb=short

For the OpenCode plugin you need Node.js 20+:

cd plugins/opencode
npm ci
npm run typecheck

Secret scanning

Some test fixtures and evaluation data contain intentionally fake secrets or public benchmark dialogue. These files are listed in .gitleaksignore so Gitleaks does not flag them. Do not add real credentials to those files or to the ignore list.

Run Gitleaks locally before committing:

gitleaks detect --source . --verbose

If it reports a false positive in test/eval data, add the file to .gitleaksignore only after confirming it contains no real credentials, and keep the ignore list in sync with CI.


Embedding Backend Performance

Mimir supports multiple embedding backends. Choose based on your hardware and latency requirements.

Measured on Apple Silicon (M-series), 128-sample batch:

| Backend | Model | Dim | Cold Start | Short Text | Long Text | Batch-128 | Throughput | Memory |
|---|---|---|---|---|---|---|---|---|
| sentence-transformer | all-MiniLM-L6-v2 | 384 | ~20.6s* | 4.9ms | 6.1ms | 23ms | ~4,500/s | ~350MB |
| llama-server | Qwen3-Embedding-8B-Q4_K_M | 4096 | ~43ms | 32ms | 641ms | 10,509ms | ~12/s | ~288MB |

* Cold start includes one-time model download/load. Subsequent runs are fast.

Recommendation

  • Use sentence-transformer for local development and everyday use.
  • Use llama-server when you need higher-quality embeddings and can accept higher latency.

Run the benchmark yourself:

python scripts/benchmark_embedding_backends.py
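
If you want a quick number for your own encoder without the full script, a small timing harness is enough. The sketch below is not the project's benchmark script; `time_encoder` and `fake_encode` are hypothetical names, and the fake encoder exists only so the example runs without any model installed. Swap in any callable that accepts a list of strings.

```python
import statistics
import time
from typing import Callable, Sequence


def time_encoder(encode: Callable[[Sequence[str]], object], texts: Sequence[str],
                 repeats: int = 5) -> dict:
    """Time `encode(texts)` several times and report median latency and throughput."""
    encode(texts)  # warm-up so one-time model loading is not counted
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        encode(texts)
        samples.append(time.perf_counter() - start)
    median = statistics.median(samples)
    return {
        "median_ms": median * 1000.0,
        "texts_per_s": len(texts) / median if median > 0 else float("inf"),
    }


def fake_encode(texts):
    """Stand-in encoder: one-dimensional 'embedding' per text."""
    return [[float(len(t))] for t in texts]


print(time_encoder(fake_encode, ["hello world"] * 128))
```

The warm-up call keeps cold-start cost out of the measurement, in line with the footnote on the table above.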

License

Apache-2.0


Mimir is named after the Norse guardian of the well of wisdom — a source of knowledge that deepens with every visit.
