Mimir

A plastic memory well for coding agents.

Unlike vector notebooks, Mimir reshapes its own embedding space as you work. One matrix. Zero cloud. Always local.

Why Mimir?

Most agent memory tools are filing cabinets: they store text, then search it. They don't learn. They don't adapt. They just retrieve.

Mimir is different.

It maintains a small, fixed-size prototype matrix that is updated on every interaction via Hebbian-style local learning. The base embedding model stays frozen, while Mimir overlays a fast, plastic layer that bends toward your domain, your project, and your habits.

The result: a memory system that feels less like a database and more like a second brain that gets sharper the more you use it.

What makes Mimir unique

Aspect	Other memory tools	Mimir
Core model	Store and retrieve discrete facts	Learn a plastic prototype matrix
Learning	Heuristic scoring, TTL decay, or no learning at all	Hebbian / Oja local updates + Markov prediction
State size	Often grows with history (SQLite/vector DB)	Fixed `[k × dim]` matrix, typically < 100 MB
Offline	Needs cloud API or hosted vector DB	Runs locally with llama-server, sentence-transformers, or fake backend
Inference	Calls DB/index on every recall	Pure matrix operation, no locks, predictable latency
Hook noise	Stores every short reply, including "ok" and "continue"	Language-aware small-talk filtering + importance gating
Security	Stores secrets verbatim	Automatic API key / token / password redaction
Project context	Manual copy-paste of instruction files	Auto-discovers `AGENTS.md`, `CLAUDE.md`, `.cursorrules`
Quality	Duplicate memories accumulate	Duplicate blocking + contradiction hints at store time
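The fixed-size claim in the State size row can be checked with quick arithmetic. The sketch below assumes float32 storage and assumes the state is one `[k × dim]` prototype matrix plus a `[k × k]` Markov transition matrix; the real checkpoint layout is not documented here and may differ. The value k = 64 comes from the Quick Start config, and the two dimensions come from the backend table.

```python
# Back-of-the-envelope size of the plastic state.
# Assumptions (not confirmed by the README): float32 values, and a state made
# of a [k x dim] prototype matrix plus a [k x k] transition matrix.

BYTES_PER_FLOAT32 = 4


def state_bytes(num_prototypes: int, dim: int) -> int:
    """Bytes needed for the prototype matrix and the transition matrix."""
    prototypes = num_prototypes * dim * BYTES_PER_FLOAT32
    transitions = num_prototypes * num_prototypes * BYTES_PER_FLOAT32
    return prototypes + transitions


if __name__ == "__main__":
    for k, dim in [(64, 384), (64, 4096)]:
        size = state_bytes(k, dim)
        print(f"k={k}, dim={dim}: {size} bytes ({size / 1024:.0f} KiB)")
```

Under these assumptions the state is about 112 KiB at 384 dimensions and about 1 MiB at 4096 dimensions, and neither figure depends on how much history has been learned.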

Secret redaction patterns can be customized or disabled entirely via MimirConfig(redaction_patterns=...) (None = defaults, [] = disabled).
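As a rough illustration of what a custom regex list does, here is a minimal sketch of pattern-based redaction. The two patterns, the `redact` helper, and the `[REDACTED]` placeholder are all hypothetical; Mimir's built-in defaults and its actual replacement text are not listed in this README.

```python
import re

# Hypothetical patterns for illustration only; not Mimir's defaults.
EXAMPLE_PATTERNS = [
    r"sk-[A-Za-z0-9]{20,}",                  # API-key-like token
    r"(?i)(password|token)\s*[=:]\s*\S+",    # key=value style secrets
]


def redact(text: str, patterns: list[str], placeholder: str = "[REDACTED]") -> str:
    """Replace every match of every pattern with the placeholder."""
    for pattern in patterns:
        text = re.sub(pattern, placeholder, text)
    return text


if __name__ == "__main__":
    print(redact("password = hunter2", EXAMPLE_PATTERNS))
    print(redact("nothing secret here", []))  # empty list: nothing is redacted
```

A list shaped like `EXAMPLE_PATTERNS` is presumably what `redaction_patterns` expects, since the config table describes that field as a custom regex list.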

Quick Start

pip install mimir-core

Use it from Python

from mimir import Mimir, MimirConfig

config = MimirConfig(
    base_model="all-MiniLM-L6-v2",
    num_prototypes=64,
    top_k=4,
)
mimir = Mimir(config)

# Encode
emb = mimir.encode("hello world")
print(emb.shape)  # (1, 384)

# Learn
report = mimir.learn("hello world", importance=1.0)
print(report)

# Save / load
mimir.save("checkpoint.pt")
mimir.load("checkpoint.pt")

Plug it into your coding agent

Mimir exposes an MCP server and Agent CLI hooks for Kimi Code, Claude Code, Codex, and OpenCode. Each workspace is isolated under ~/.mimir/workspaces/.

Start the MCP server

# Local embedding backend (recommended)
llama-server \
  --model Qwen3-Embedding-8B-Q4_K_M.gguf \
  --embeddings \
  --port 11435

# Or use sentence-transformer if you don't have llama-server
mimir mcp --backend sentence-transformer

Configure Kimi Code / Claude Code / Codex (`.mcp.json`)

{
  "mcpServers": {
    "mimir": {
      "command": "mimir",
      "args": ["mcp", "--backend", "sentence-transformer"]
    }
  }
}

Configure OpenCode (`.opencode/opencode.jsonc`)

{
  "mcp": {
    "mimir": {
      "type": "local",
      "command": ["mimir", "mcp", "--backend", "sentence-transformer"]
    }
  }
}

One-shot setup with `mimir setup`

Instead of editing config files by hand, you can install hooks automatically:

mimir setup kimi-code
mimir setup claude-code
mimir setup codex

Use --base-dir to write the configuration somewhere other than the agent's default config directory (useful for custom dotfiles layouts or CI):

mimir setup kimi-code --base-dir ~/my-configs/kimi-code

mimir setup writes the hook definitions for the chosen agent CLI and is safe to run multiple times.

Manual hook configuration

# ~/.kimi-code/config.toml
[[hooks]]
event = "UserPromptSubmit"
command = "python3 -m mimir.hooks.mimir_turn"
timeout = 10

[[hooks]]
event = "Stop"
command = "python3 -m mimir.hooks.mimir_turn"
timeout = 10

The MCP store() tool and the hook Stop path both apply the same redaction, filtering, and duplicate-blocking pipeline, so text that matches a redaction pattern is redacted before it is persisted, however the memory is captured.

See docs/mcp-user-guide.md for the full hook guide.

How It Works

Text
  │
  ▼
[Base embedding model] ───────┐
  │                            │
  ▼                            │
Base embedding                │
  │                            │
  ▼                            │
[Prototype matrix lookup]     │
  │                            │
  ▼                            │
Sparse prototype activation   │
  │                            │
  ▼                            │
Residual modulation ◄─────────┘
  │
  ▼
Mimir embedding
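The diagram above can be read as a three-step function: score the base embedding against every prototype, keep only the top-k, and add a weighted mix of those prototypes back onto the base vector. The sketch below is one plausible reading, not Mimir's implementation: the softmax weighting, the `alpha` blend factor, and the final normalization are assumptions, and `modulate` is a hypothetical name.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def modulate(base, prototypes, top_k=2, alpha=0.5):
    """Return (modulated embedding, indices of the activated prototypes).

    Assumed form: normalize(base + alpha * sum_i w_i * p_i), where w_i is a
    softmax over the cosine similarities of the top_k prototypes only.
    """
    # Prototype matrix lookup: similarity of the base embedding to each row.
    sims = [cosine(base, p) for p in prototypes]
    # Sparse activation: only the top_k most similar prototypes contribute.
    top = sorted(range(len(prototypes)), key=lambda i: sims[i], reverse=True)[:top_k]
    exps = [math.exp(sims[i]) for i in top]
    total = sum(exps)
    # Residual modulation: the base embedding is kept and shifted, not replaced.
    mixed = list(base)
    for e, i in zip(exps, top):
        weight = alpha * e / total
        mixed = [m + weight * p for m, p in zip(mixed, prototypes[i])]
    norm = math.sqrt(sum(m * m for m in mixed)) or 1.0
    return [m / norm for m in mixed], top
```

Because the base vector passes through the residual path unchanged, an empty or untrained prototype matrix leaves the output equal to the normalized base embedding.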

Slow weights: the frozen base embedding model gives stable semantic priors.
Fast weights: a fixed-capacity prototype matrix encodes your local domain.
Learning: each input activates the nearest prototypes and nudges them toward the new observation.
Prediction: a first-order Markov transition matrix predicts the next prototype and emits a surprise_score.
Forgetting: prototype strength decays exponentially; weak prototypes are overwritten when capacity is full.
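The learning, prediction, and forgetting rules above can be sketched as a toy class. This is a simplified stand-in under stated assumptions: it uses a plain move-toward update (online k-means style) in place of the Hebbian/Oja rule, activates a single nearest prototype instead of top-k, and uses Laplace-smoothed transition counts with negative log probability as the surprise score. `PrototypeMemory`, `match_threshold`, and every default value are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class PrototypeMemory:
    """Toy fixed-capacity prototype store with Markov prediction and decay."""

    def __init__(self, k, dim, lr=0.1, decay=0.99, match_threshold=0.5):
        self.prototypes = [[0.0] * dim for _ in range(k)]
        self.strength = [0.0] * k                      # 0.0 marks an unused slot
        self.counts = [[1.0] * k for _ in range(k)]    # smoothed transition counts
        self.lr, self.decay, self.match_threshold = lr, decay, match_threshold
        self.prev = None                               # last activated prototype

    def learn(self, x, importance=1.0):
        """Update the memory with x; return (prototype index, surprise score)."""
        # Forgetting: every prototype loses a fixed fraction of its strength.
        self.strength = [s * self.decay for s in self.strength]

        sims = [cosine(x, p) for p in self.prototypes]
        best = max(range(len(sims)), key=lambda i: sims[i])
        if self.strength[best] > 0 and sims[best] >= self.match_threshold:
            # Learning: nudge the nearest prototype toward the observation.
            step = self.lr * importance
            self.prototypes[best] = [
                p + step * (xi - p) for p, xi in zip(self.prototypes[best], x)
            ]
            self.strength[best] += importance
            idx = best
        else:
            # No close match: overwrite the weakest (or an unused) slot.
            idx = min(range(len(self.strength)), key=lambda i: self.strength[i])
            self.prototypes[idx] = list(x)
            self.strength[idx] = importance

        # Prediction: surprise is -log P(idx | previous prototype).
        surprise = 0.0
        if self.prev is not None:
            row = self.counts[self.prev]
            surprise = -math.log(row[idx] / sum(row))
            row[idx] += 1.0
        self.prev = idx
        return idx, surprise
```

Feeding the same alternating pair of inputs repeatedly makes the transition between their prototypes more probable, so the surprise score for that transition falls over time.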

This design is inspired by Prototype Theory in cognitive psychology and Predictive Coding in neuroscience: memory is not a pile of events, but a compressed set of typical examples that continuously updates itself.

MCP Tools

Tool	Purpose
`store(text, importance=1.0)`	Store and learn from text. Secrets are redacted before storage; the response `text` field is the redacted form. With async store enabled, returns `"pending"` and processes in the background.
`recall(query, top_k=5, min_score=0.0)`	Hybrid vector + BM25 recall, reranked by lifecycle metadata
`consolidate()`	Consolidate the working memory buffer
`forget()`	Clear the current session's working memory
`checkpoint(name)`	Save a named checkpoint
`restore(name)`	Restore to a named checkpoint
`status()`	Show session stats. When async store is enabled, includes `async_store` with `enabled`, `pending_count`, and `worker_alive`.

store() may also include reason, similar_memory, or contradictions in its response when a memory is rejected or appears to contradict an existing memory.
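The `recall` row describes hybrid vector + BM25 retrieval. A minimal sketch of that idea follows, assuming whitespace tokenization, standard Okapi BM25, and a simple weighted blend of the two scores. The function names and the `weight` parameter are hypothetical, the lifecycle reranking step is omitted, and how Mimir actually combines the scores is not documented here.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Okapi BM25 score of each document for a whitespace-tokenized query."""
    if not docs:
        return []
    tokenized = [d.lower().split() for d in docs]
    avg_len = sum(len(t) for t in tokenized) / len(tokenized) or 1.0
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    n = len(docs)
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            idf = math.log(1 + (n - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            length_norm = k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / (tf[term] + length_norm)
        scores.append(score)
    return scores


def hybrid_rank(vector_scores, keyword_scores, weight=0.5, top_k=5, min_score=0.0):
    """Blend vector and keyword scores; return [(doc index, score)] best first."""
    # Max-normalize BM25 so both signals live on a comparable 0..1 scale.
    peak = max(keyword_scores, default=0.0) or 1.0
    blended = [
        weight * v + (1 - weight) * (kw / peak)
        for v, kw in zip(vector_scores, keyword_scores)
    ]
    ranked = sorted(enumerate(blended), key=lambda pair: pair[1], reverse=True)
    return [(i, s) for i, s in ranked if s >= min_score][:top_k]
```

The keyword signal helps with exact identifiers (file names, flags, error codes) that embeddings tend to blur, while the vector signal covers paraphrases.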

Quality Gate

store() runs a lightweight quality gate before learning:

Duplicate blocking: near-duplicate memories (cosine similarity ≥ 0.95) are rejected instead of accumulating.
Contradiction hints: simple negation/polarity checks flag pairs like "I use Python" vs "I don't use Python". The memory is still stored, but the result includes the hint so the agent can ask before acting on stale context.
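The two checks above can be sketched as a single function over precomputed embeddings. This is an illustration under assumptions: the negation word list, the `quality_gate` name, and the `"duplicate"` reason string are hypothetical, and the real polarity check may be more involved. The response keys (`stored`, `reason`, `similar_memory`, `contradictions`) follow the names mentioned in the MCP Tools section.

```python
import math
import re

# Hypothetical negation markers; the real check may use a different list.
NEGATIONS = {"not", "no", "never", "don't", "doesn't", "won't", "cannot", "can't"}


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def is_negated(text):
    """True if the text contains any negation marker."""
    return bool(NEGATIONS & set(re.findall(r"[a-z']+", text.lower())))


def quality_gate(text, emb, memories, dup_threshold=0.95, contra_threshold=0.85):
    """Check a candidate against existing (text, embedding) memories."""
    contradictions = []
    negated_new = is_negated(text)
    for old_text, old_emb in memories:
        sim = cosine(emb, old_emb)
        if sim >= contra_threshold and negated_new != is_negated(old_text):
            # Similar topic, opposite polarity: keep it, but surface a hint.
            contradictions.append(old_text)
        elif sim >= dup_threshold:
            # Near-duplicate: reject instead of accumulating.
            return {"stored": False, "reason": "duplicate", "similar_memory": old_text}
    result = {"stored": True}
    if contradictions:
        result["contradictions"] = contradictions
    return result
```

Polarity is checked before duplication so that "I don't use Python" is flagged against "I use Python" rather than rejected as a near-duplicate of it.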

Both checks can be disabled via MimirConfig:

MimirConfig(quality_gate_enabled=False)

Or tune the thresholds:

MimirConfig(
    quality_gate_enabled=True,
    quality_gate_duplicate_threshold=0.95,
    quality_gate_contradiction_threshold=0.85,
)

Async store

For MCP / agent integrations where the embedding backend is slow, you can defer store() to a background worker so the tool returns immediately:

MimirConfig(
    async_store_enabled=True,
    async_store_queue_size=1000,
    async_store_flush_timeout=5.0,
)

When async store is enabled, store() returns one of:

{"stored": "pending", "text": ..., "memory_count": ..., "pending_count": ...} when the item is enqueued successfully.
{"stored": False, "text": ..., "memory_count": ..., "reason": "queue_full"} when the queue is at capacity.

The background worker performs duplicate checks, learning, and persistence. On MCP server shutdown the queue is flushed so pending memories are not lost.

status() includes an async_store dictionary with enabled, pending_count, and worker_alive so you can monitor the queue health.
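The behaviour described in this section (bounded queue, immediate return, flush on shutdown) matches a standard producer/consumer pattern. The sketch below shows that pattern with the standard library only; `AsyncStore` and its methods are hypothetical names, `memory_count` is left out of the responses, and Mimir's own worker may be structured differently.

```python
import queue
import threading


class AsyncStore:
    """Bounded queue drained by one background worker thread."""

    def __init__(self, process, queue_size=1000):
        self._queue = queue.Queue(maxsize=queue_size)
        self._process = process          # callable that does the slow work
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def _run(self):
        while True:
            item = self._queue.get()
            try:
                if item is None:         # sentinel: everything before it is done
                    return
                self._process(item)
            finally:
                self._queue.task_done()

    def store(self, text):
        """Enqueue without blocking; report queue_full instead of waiting."""
        try:
            self._queue.put_nowait(text)
        except queue.Full:
            return {"stored": False, "text": text, "reason": "queue_full"}
        return {"stored": "pending", "text": text, "pending_count": self._queue.qsize()}

    def status(self):
        return {
            "enabled": True,
            "pending_count": self._queue.qsize(),
            "worker_alive": self._worker.is_alive(),
        }

    def shutdown(self, flush_timeout=5.0):
        """Drain pending items, then stop. Returns True if the flush finished."""
        try:
            self._queue.put(None, timeout=flush_timeout)
        except queue.Full:
            return False
        self._worker.join(timeout=flush_timeout)
        return not self._worker.is_alive()
```

Because the queue is FIFO, the shutdown sentinel is only reached after every earlier item has been processed, which is what makes the flush safe.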

Other useful configuration fields

Field	Default	Purpose
`redaction_enabled`	`True`	Enable secret redaction
`redaction_patterns`	`None`	Custom regex list (`None` = defaults, `[]` = disabled)
`project_context_enabled`	`True`	Auto-ingest `AGENTS.md` / `CLAUDE.md` / `.cursorrules`
`project_context_importance`	`1.5`	Importance assigned to project context memories
`async_store_enabled`	`False`	Defer embedding/learning to background worker
`async_store_queue_size`	`1000`	Max pending items for async store
`async_store_flush_timeout`	`5.0`	Seconds to wait for flush on shutdown

Programming Interface

If you want to use Mimir inside your own Python code:

from mimir.adapters.agents import InMemoryAgentAdapter, Message
from mimir.core.config import MimirConfig

adapter = InMemoryAgentAdapter(
    config=MimirConfig(base_model="all-MiniLM-L6-v2", top_k=4),
)

adapter.observe([
    Message(role="user", content="Please write quicksort in Python"),
    Message(role="assistant", content="..."),
])
adapter.consolidate()
memories = adapter.recall("Python sorting", top_k=3)

print(adapter.memory_count)
adapter.clear_memories()

The adapter's interface, AgentMemoryInterface, also exposes encode(texts), which returns embeddings for a list of texts. This is useful for custom duplicate checks or integrations:

embeddings = adapter.encode(["hello world", "goodbye world"])

See docs/agent-integration.md for the adapter API.

CLI

# Encode
mimir encode --backend sentence-transformer "hello world"

# Learn
mimir learn --backend llama-server "important context"

# Evaluate
python -m mimir.eval --backend llama-server --top-k 4

Status & Roadmap

Mimir is currently v0.3.0.

MVP: encode / learn / save / load
Top-k sparse prototype activation
EventBus + PredictionPolicy + surprise score
MCP server + Agent CLI hooks
BM25 + lifecycle hybrid recall
Multi-language small-talk filtering for automatic hook capture
Secret redaction for API keys, tokens, and passwords
Project context discovery (AGENTS.md, CLAUDE.md, .cursorrules)
mimir setup <agent> one-shot configuration
Duplicate blocking and contradiction hints
Async embedding queue
SQLite-backed memory metadata
Human-in-the-loop (HITL) preview before storing high-impact memories

See docs/roadmap.md for the full roadmap.

Embedding Backend Performance

Mimir supports multiple embedding backends. Choose based on your hardware and latency requirements.

Measured on Apple Silicon (M-series), 128-sample batch:

Backend	Model	Dim	Cold Start	Short Text	Long Text	Batch-128	Throughput	Memory
sentence-transformer	all-MiniLM-L6-v2	384	~20.6s*	4.9ms	6.1ms	23ms	~4,500/s	~350MB
llama-server	Qwen3-Embedding-8B-Q4_K_M	4096	~43ms	32ms	641ms	10,509ms	~12/s	~288MB

* Cold start includes the one-time model download and load. Later runs skip the download and start faster.

Recommendation

Use sentence-transformer for local development and everyday use.
Use llama-server when you need higher-quality embeddings and can accept higher latency.

Run the benchmark yourself:

python scripts/benchmark_embedding_backends.py
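The benchmark script itself is not reproduced in this README. For a quick check against your own backend, a minimal timing harness of the same kind might look like the sketch below. `benchmark` and `fake_embed` are hypothetical names, and `fake_embed` is only a stand-in so the example runs without a model; swap in any callable that takes a list of texts.

```python
import statistics
import time


def benchmark(embed, texts, repeats=5):
    """Time embed(texts) several times; report median latency and throughput."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        embed(texts)
        timings.append(time.perf_counter() - start)
    median = statistics.median(timings)
    return {
        "median_ms": median * 1000,
        "throughput_per_s": len(texts) / median if median > 0 else float("inf"),
    }


def fake_embed(texts, dim=8):
    """Stand-in backend: deterministic vectors, no model involved."""
    return [[float(len(t) % (i + 1)) for i in range(dim)] for t in texts]


if __name__ == "__main__":
    batch = ["hello world"] * 128   # same batch size as the table above
    print(benchmark(fake_embed, batch))
```

Using the median rather than the mean keeps a single slow first call, such as a model load, from dominating the result.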

License

Apache-2.0

Mimir is named after the Norse guardian of the well of wisdom — a source of knowledge that deepens with every visit.
