MCP server that gives LLMs persistent graph-structured memory
Persistent, structured memory for AI agents — 4× fewer tokens than chunk-based retrieval.
Your LLM remembers facts, decisions, and context across every conversation, backed by a real knowledge graph.
Why waggle-mcp?
waggle-mcp is a local-first memory layer for MCP-compatible AI clients, built on a persistent knowledge graph.
MCP is the Model Context Protocol: the tool interface desktop AI clients like Claude Desktop, Cursor, and Codex use to talk to local servers.
Waggle gives your AI a persistent knowledge graph it can read and write through any MCP-compatible client.
| Stuffed context | Structured retrieval |
|---|---|
| Context stuffed into a huge prompt every session | Compact subgraph retrieved at query time |
| Session-local memory | Persistent multi-session memory |
| Flat notes and chunks | Typed nodes and edges: decisions, reasons, contradictions, updates |
| "What changed?" requires replaying logs | Temporal queries and diffs are first-class |
Waggle's core tradeoff is deliberate: it stores structured knowledge instead of replaying entire transcripts. On Waggle's checked-in 27-scenario multi-session corpus, that yields ~4× fewer tokens per retrieval than naive chunked retrieval. The benchmark section below shows the actual numbers and limits.
Quick start
```bash
pip install waggle-mcp
waggle-mcp init
# Restart your MCP client. Done.
```
init detects your MCP client, writes its config, and creates the local database directory. Default mode is local SQLite with on-device embeddings.
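For reference, MCP clients are typically configured with a JSON entry of roughly the following shape; the exact file and keys that `init` writes depend on which client it detects:

```json
{
  "mcpServers": {
    "waggle": {
      "command": "waggle-mcp"
    }
  }
}
```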
See it in action
Session 1 — April 10

```text
User:  Let's use PostgreSQL. MySQL replication has been painful.
Agent: [calls observe_conversation()]
       → stores decision node: "Chose PostgreSQL over MySQL"
       → stores reason node: "MySQL replication painful"
       → links them with a depends_on edge
```

Session 2 — April 12 (fresh context window, no history)

```text
User:  What did we decide about the database?
Agent: [calls query_graph("database decision")]
       → retrieves the decision node + linked reason from April 10
       "You decided on PostgreSQL on April 10. The reason recorded was
       that MySQL replication had been painful."
```

Session 3 — April 14

```text
User:  Actually, let's reconsider — the team is more familiar with MySQL.
Agent: [calls store_node() + store_edge(new_node → old_node, "contradicts")]
       → both positions are preserved, and the contradiction is explicit
```
This is the main difference from chunk replay: the agent does not just recover a transcript snippet, it recovers the decision, the reason, and what changed.
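The typed-node/typed-edge model from the walkthrough can be sketched in plain Python. This is a hypothetical illustration only; the `Node`/`Edge` classes and field names below are ours, not Waggle's internal schema:

```python
from dataclasses import dataclass

# Toy model of typed graph memory: decisions and reasons are nodes,
# relationships like depends_on and contradicts are typed edges.
@dataclass
class Node:
    id: str
    kind: str    # e.g. "decision" or "reason"
    text: str

@dataclass
class Edge:
    src: str
    dst: str
    kind: str    # e.g. "depends_on" or "contradicts"

nodes = {
    "n1": Node("n1", "decision", "Chose PostgreSQL over MySQL"),
    "n2": Node("n2", "reason", "MySQL replication painful"),
    "n3": Node("n3", "decision", "Reconsider: team is more familiar with MySQL"),
}
edges = [
    Edge("n1", "n2", "depends_on"),    # the decision depends on its reason
    Edge("n3", "n1", "contradicts"),   # the later decision contradicts the first
]

def contradictions(node_id: str) -> list:
    """Nodes whose edges mark them as contradicting node_id."""
    return [nodes[e.src] for e in edges
            if e.dst == node_id and e.kind == "contradicts"]

# Both positions survive; the contradiction is an explicit, queryable edge.
print([n.text for n in contradictions("n1")])
```

Because the contradiction is an edge rather than buried in a transcript, "what changed and why" is a graph lookup, not a replay.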
Portable context handoff
Hit a rate limit? Switching models mid-project? Handing context to another AI?
`export_context_bundle` generates a Markdown or JSON context pack that another AI can ingest directly.
Example MCP tool call:
```python
export_context_bundle({
    "mode": "query",
    "query": "database architecture decisions",
    "format": "both",
    "retrieval_mode": "fusion"
})
```
Supported export modes:
- `prime` — compact brief from `prime_context`
- `query` — answer a specific question with supporting graph context
- `graph` — export the whole tenant graph, chunked for large memory sets
Supported retrieval lanes for query-mode export:
- `graph` — graph-native retrieval
- `replay` — raw transcript/session replay
- `fusion` — graph + replay merged with reciprocal-rank fusion
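Reciprocal-rank fusion itself is a standard list-merging technique. A minimal sketch, assuming the textbook formulation with the conventional `k=60` constant (Waggle's actual parameters and result IDs are not documented here, so the lane contents below are hypothetical):

```python
def rrf(rankings, k=60):
    """Reciprocal-rank fusion: score each item by sum(1 / (k + rank))
    across the ranked lists it appears in, then sort by total score."""
    scores = {}
    for ranking in rankings:
        for rank, item in enumerate(ranking, start=1):
            scores[item] = scores.get(item, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical result IDs from the two lanes:
graph_lane = ["decision:postgres", "reason:replication", "node:schema"]
replay_lane = ["turn:apr-10", "decision:postgres"]

# An item ranked well in both lanes outranks one that appears in only one.
print(rrf([graph_lane, replay_lane])[0])
```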
Waggle also supports Obsidian-style round-trip editing:
- `export_markdown_vault`
- `import_markdown_vault`
That writes one Markdown file per node with YAML frontmatter and wikilinks, then re-imports user edits non-destructively.
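A round-tripped node file might look roughly like this; the frontmatter keys and wikilink conventions shown are illustrative assumptions, not Waggle's documented on-disk format:

```markdown
---
id: decision-postgres
type: decision
---

Chose PostgreSQL over MySQL.

Reason: [[mysql-replication-painful]]
```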
The core tool: observe_conversation
Once your client prompt or tool policy nudges the model to call `observe_conversation`, the memory workflow becomes automatic.

```python
observe_conversation(user_message, assistant_response)
```
Each call:
- extracts atomic facts from the turn
- deduplicates against existing nodes
- links related concepts with typed edges
- flags contradictions and updates
- stores the raw turn for replay/fusion retrieval
No separate schema authoring is required. The deterministic parser turns conversation turns into typed graph memory directly.
MCP tools
Core workflow:
| Tool | What it does |
|---|---|
| `observe_conversation` | Ingest a conversation turn into graph memory |
| `query_graph` | Retrieve memory with graph, replay, or fusion mode |
| `prime_context` | Build a compact brief for a fresh session |
| `export_context_bundle` | Hand memory to another AI as Markdown or JSON |
| `export_markdown_vault` | Export Obsidian-compatible Markdown files |
| `import_markdown_vault` | Re-import edited Markdown vault files |
| `timeline` | Build a chronological view of what changed |
| `list_conflicts` / `resolve_conflict` | Inspect and resolve contradictions without deleting history |
Additional graph/admin tools are documented in docs/reference.md.
Installation
Local development:

```bash
python3 -m venv .venv
source .venv/bin/activate
pip install -e ".[dev]"
waggle-mcp init
```

Neo4j backend:

```bash
pip install -e ".[dev,neo4j]"
WAGGLE_BACKEND=neo4j WAGGLE_TRANSPORT=http waggle-mcp
```
Docker, manual client config, environment variables, and admin commands are in docs/reference.md.
Benchmarks
Benchmark summary:
| Area | Corpus | Result |
|---|---|---|
| Extraction | 12-case deterministic fixture | 100% |
| Retrieval | 18-query retrieval fixture | 83% Hit@k |
| Comparative efficiency | 27-scenario / 66-query corpus | 88% Hit@k, 73% exact support, 37.7 mean tokens |
| Query stress | 40 adversarial retrieval-only cases | 98% Hit@k, 98% exact support |
| External baseline | LongMemEval s split, 500 questions | `graph_raw`: 97.0% R@5 / 76.4% Exact@5; `graph_hybrid`: 95.8% R@5 / 82.0% Exact@5 |
What these numbers mean:
- Waggle is strongest when the query benefits from structured reasoning chains, temporal context, and contradiction tracking.
- The ~4× fewer tokens claim comes from the comparative corpus: Waggle averages 37.7 tokens per retrieval vs 150.2 for naive chunked-vector RAG.
- The retrieval engine itself is strong in isolation (98% on the query-stress corpus). End-to-end misses still show up more in broader comparative evaluation than in retrieval-only tests.
- Deduplication is intentionally conservative: best measured 17/22 = 77%, with zero false merges across the threshold sweep.
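The arithmetic behind the headline ratio is just the quotient of the two corpus means reported above:

```python
# Ratio of the two mean-tokens-per-retrieval figures from the comparative corpus.
waggle_mean = 37.7   # mean tokens per retrieval (Waggle)
rag_mean = 150.2     # mean tokens per retrieval (naive chunked-vector RAG)

ratio = rag_mean / waggle_mean
print(f"{ratio:.2f}x")  # 3.98x, rounded to the headline ~4x
```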
Deep dives and saved artifacts:
- Internal benchmark artifacts: tests/artifacts/README.md
- LongMemEval artifacts: benchmarks/longmemeval/README.md
- Evaluation roadmap: docs/evaluation-plan.md
Docs and operations
Detailed reference material lives outside the landing flow:
- Install variants, client config, environment variables, admin commands, and architecture: docs/reference.md
- Kubernetes deployment: deploy/kubernetes/README.md
- Runbooks: docs/runbooks/
- Benchmark artifacts and methodology: tests/artifacts/README.md and benchmarks/longmemeval/README.md
Next steps
- Expand the extraction corpus beyond the current 12 cases so robustness claims are based on larger paraphrase- and temporality-heavy fixtures.
- Publish a short LongMemEval methodology note, including cold vs warm cache runs and the reranked comparison path.
- Tighten replay/fusion ranking for recall-heavy workloads and improve provenance summaries in exported bundles.
- Polish Neo4j query paths and large-vault import reporting.
License
MIT — see LICENSE.