MCP server for Neruva -- reliability infrastructure for production AI agents. Agent stops forgetting / drifting / repeating mistakes. v0.27: qbound KG engine for conflict resolution where docs override LLM priors. 6 KG engines, 7-layer federated recall, counterfactual queries, HD analogy, CBR, belief tracking, code_kg_*. Free tier, no card.
neruva-mcp
MCP server for Neruva — reliability infrastructure for production AI agents. One MCP install. Your agent stops forgetting, stops drifting, stops repeating mistakes — every failure replayable bit-for-bit.
Backed by a 6-engine knowledge graph (incl. qbound for document-vs-LLM-prior conflict resolution), counterfactual queries, HD analogy, episodic CBR, deterministic snapshot/restore.
Drop into Claude Code / Cursor / Codex / Gemini CLI / Goose in one line. Free tier, no card.
For Claude Code users: see neruva.io/claude-code for the 30-second install + first-queries to try.
Benchmarks
| Test | Score | What it means |
|---|---|---|
| Does the agent learn from mistakes? | +34 points | Same Claude Haiku model, no retraining. With Neruva running the agent climbs from 84% to 93% across 2000 tasks. Without Neruva it stays flat at 59%. Three independent runs. |
| Memory QA across long histories (LongMemEval) | 93.3% | top-4 globally, +22pp vs Zep |
| Compliance + audit determinism (DFAH) | 100% / 88% | first system to do both at once, 2.75× prior best |
| Latency (p95 cache-hit) | 80ms | 2.5× faster than Mem0 |
Full breakdown: neruva.io/benchmarks. The only memory stack hitting world-class scores on memory, reasoning, AND audit determinism — simultaneously.
What's new in 0.27.0 — qbound engine for conflict resolution
New 6th KG engine: qbound (question-conditioned binding) for workloads where document-stated facts must override the LLM's training-time priors.
Use it when your agent reads CRM records, compliance docs, or internal policies that contradict what an LLM "knows" from pretraining:
# `session` is assumed to be an open MCP client session, called from async code.
await session.call_tool(
    "hd_kg_add_facts",
    {
        "kg": "compliance",
        "engine": "qbound",  # <-- new
        "facts": [
            {"subject": "PolicyX", "relation": "current_version", "object": "v2024.3"},
            # ...
        ],
    },
)
Drop-in: same shard format as opb, same query surface. Backward-compat with all existing tools.
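A minimal sketch of building the `hd_kg_add_facts` arguments from plain triples, assuming the payload shape shown above. The helper name `build_add_facts_args` is hypothetical, and the engine list is copied from this README rather than from the server's schema.

```python
from typing import Iterable, Tuple

# Engine names as listed in this README; the server may accept others.
KNOWN_ENGINES = {"hadamard", "opb", "qbound", "multishard", "quorum", "feature_bundle"}


def build_add_facts_args(kg: str, engine: str, triples: Iterable[Tuple[str, str, str]]) -> dict:
    """Build the arguments dict for an `hd_kg_add_facts` call (hypothetical helper)."""
    if engine not in KNOWN_ENGINES:
        raise ValueError(f"unknown engine {engine!r}; expected one of {sorted(KNOWN_ENGINES)}")
    facts = [{"subject": s, "relation": r, "object": o} for s, r, o in triples]
    return {"kg": kg, "engine": engine, "facts": facts}


args = build_add_facts_args(
    "compliance",
    "qbound",
    [("PolicyX", "current_version", "v2024.3")],
)
# Inside an MCP client session: await session.call_tool("hd_kg_add_facts", args)
```

Validating the engine name client-side catches typos before the first write to a new KG, which is when the engine is chosen.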
What's new in 0.26.0 — 7-layer federated recall
agent_recall_full lands. One question, seven buckets returned in parallel (~150ms p95):
- records — semantic embedding search (existing)
- kg — entity-overlap + cosine across knowledge graphs (existing)
- rules — HD signature cosine across stored rule libraries (new)
- cbr — structural-distance nearest-neighbour across case episode stores (new)
- scm — variable-name match across causal models (new)
- tom — name-resolved chain lookup across theory-of-mind belief stores (new)
- continual — predictive next-token recall against trained K-gram learners (new)
Records + KG work universally. The other five layers light up when you pass opt-in text labels at ingest time (chain_names/prop_name on agent_model_belief_add, token_names on agent_continual_train, var_names on hd_causal_add_worlds, axis_vocab_names on hd_cbr_add_episodes). Empty buckets return a hint field with the exact param to pass next time.
Per-layer prefix scoping (rules_prefix, cbr_prefix, scm_prefix, tom_prefix, continual_prefix, kg_prefix) bounds wall time on tenants with many resources. Existing agent_recall (records + KG only) unchanged.
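The sketch below shows one way a caller might assemble `agent_recall_full` arguments and collect the hints from empty buckets. The `question` argument name and the `{"items": [...], "hint": "..."}` bucket shape are assumptions for illustration; only the layer names, the prefix parameters, and the existence of a `hint` field come from the text above.

```python
# The seven buckets named in the list above.
LAYERS = ("records", "kg", "rules", "cbr", "scm", "tom", "continual")

# Per-layer prefix parameters named in this README.
PREFIX_PARAMS = {
    "rules_prefix", "cbr_prefix", "scm_prefix",
    "tom_prefix", "continual_prefix", "kg_prefix",
}


def build_recall_full_args(question: str, **prefixes: str) -> dict:
    """Build arguments for `agent_recall_full`. The "question" key is an assumption."""
    unknown = set(prefixes) - PREFIX_PARAMS
    if unknown:
        raise ValueError(f"unknown prefix parameter(s): {sorted(unknown)}")
    return {"question": question, **prefixes}


def empty_bucket_hints(response: dict) -> dict:
    """Return {layer: hint} for every empty bucket (assumed response shape)."""
    hints = {}
    for layer in LAYERS:
        bucket = response.get(layer) or {}
        if not bucket.get("items") and bucket.get("hint"):
            hints[layer] = bucket["hint"]
    return hints


args = build_recall_full_args("What changed in PolicyX?", kg_prefix="compliance")

# A made-up response, used only to exercise the helper.
fake_response = {
    "records": {"items": [{"text": "PolicyX moved to v2024.3"}]},
    "tom": {"items": [], "hint": "pass chain_names/prop_name on agent_model_belief_add"},
}
hints = empty_bucket_hints(fake_response)
```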
What's new in 0.25.0 — qa_optimized + recency_first graduated
The LongMemEval 93.3% recipe lifted into the substrate. agent_recall now accepts two opt-in flags (both default off, backwards compatible):
- mode="qa_optimized": over-fetches the candidate pool 3× and fuses it with a BM25 ranking over the text via reciprocal rank fusion (RRF, k=60). Lifts records whose text contains specific entity tokens (brand names, amounts, dates, IDs) that pure semantic embedding under-weights. Worth ~+20pp on memory-QA benchmarks.
- recency_first=true: re-sorts returned records by ts desc after ranking. Use for knowledge-update questions where the most recent statement wins over older contradictory ones.
Pair them for the strongest memory-QA recipe. Auto-router in neruva-record 0.19+ suggests both automatically when the intent classifier hits recall_extended.
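To make the two flags concrete, here is a sketch of textbook reciprocal rank fusion with k=60 followed by a recency re-sort. It illustrates the general technique only; the record ids, the `ts` values, and the function names are made up, and the server's own fusion may differ in detail.

```python
def rrf_fuse(rankings, k=60):
    """Reciprocal rank fusion: score(d) = sum over rankings of 1 / (k + rank of d)."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by id so the order is deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))


def recency_first(records):
    """Re-sort already-ranked records by timestamp, newest first."""
    return sorted(records, key=lambda r: r["ts"], reverse=True)


semantic_ranking = ["a", "b", "c"]   # e.g. from embedding search
bm25_ranking = ["c", "a", "d"]       # e.g. from keyword search
fused = rrf_fuse([semantic_ranking, bm25_ranking])

records = [{"id": "a", "ts": 1}, {"id": "c", "ts": 3}, {"id": "b", "ts": 2}]
newest_first = [r["id"] for r in recency_first(records)]
```

Here `fused` is `["a", "c", "b", "d"]`: "a" and "c" appear in both lists, so they outrank "b" and "d", which appear in only one.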
What's new in 0.24.0 — Code-graph bare-name resolution
code_kg_module_of and code_kg_class_of now accept short identifiers like bind (no module prefix). When the substrate sees an unqualified name, it falls back to a 1-hop called_by lookup to find the likely qualified target.
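As a toy illustration of the fallback idea (not the substrate's actual algorithm), the sketch below resolves a bare name by scanning `called_by` edges for qualified targets whose last component matches, then picks the one with the most callers. The edge data and the "most callers wins" tie-break are assumptions.

```python
from collections import Counter
from typing import List, Optional, Tuple


def resolve_bare_name(name: str, called_by_edges: List[Tuple[str, str]]) -> Optional[str]:
    """Resolve `name` against (qualified_callee, caller) edges.

    Qualified names pass through unchanged. For a bare name, return the
    qualified callee with the most callers, or None when nothing matches.
    """
    if "." in name:
        return name
    counts = Counter(
        callee for callee, _caller in called_by_edges
        if callee.rsplit(".", 1)[-1] == name
    )
    if not counts:
        return None
    # Most callers first; alphabetical order breaks ties deterministically.
    return min(counts, key=lambda q: (-counts[q], q))


edges = [
    ("net.socket.bind", "server.start"),
    ("net.socket.bind", "client.open"),
    ("ui.events.bind", "app.init"),
]
target = resolve_bare_name("bind", edges)
```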
What's new in 0.22.0 — Auto-pilot surface (the moat)
Two new tools complete the auto-pilot that makes the substrate use itself. The agent automatically routes user intents to the right cognitive primitive AND self-curates memory across sessions, without the user telling it which Neruva tool to call.
- agent_route_intent_prompt: returns the canonical 18-pattern intent classifier (counterfactual / analogy / theory-of-mind / rule induction / causal / planning / recall / comparison / state / composition / decision / mistake + 6 code-graph navigation intents). Pair with NERUVA_AUTO_ROUTE=1 in neruva-record for hands-free routing on every user prompt.
- agent_reflect_prompt: returns the canonical reflection prompt that extracts durable decisions / facts / mistakes / open questions from recent turns. Pair with NERUVA_AUTO_REFLECT=1 in neruva-record for hands-free self-curation. Next session boots with curated context, not raw transcript.
Both endpoints are pattern-C: the substrate emits a prompt, the caller LLM runs it in its normal turn, and the structured result is pushed back via existing tools. Substrate stays $0/call. Combined with the existing hd_kg_extraction_prompt (Layer 1: auto-extract on records_ingest), the three layers form a complete auto-pilot.
See neruva-record v0.11+ for the SDK that wires these into Claude Code's hook system automatically.
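The pattern-C loop described above can be sketched with three callables: one that fetches the prompt, one that runs the caller's LLM, and one that pushes the structured result back. All three are stubs here, and the JSON result shape is an assumption; in a real integration they would wrap MCP tool calls and your own model client.

```python
import json


def run_pattern_c(fetch_prompt, run_llm, push_result, transcript):
    """Fetch a prompt, run it on the caller's LLM, and push the parsed result back."""
    prompt = fetch_prompt()                        # substrate emits a prompt
    raw = run_llm(f"{prompt}\n\n{transcript}")     # caller LLM runs it in its own turn
    result = json.loads(raw)                       # structured result (assumed to be JSON)
    push_result(result)                            # pushed back via an existing tool
    return result


# Stubs standing in for agent_reflect_prompt, the caller's LLM, and a storage tool.
def fake_fetch_prompt():
    return "Extract durable decisions from the turns below as JSON."


def fake_llm(_text):
    return json.dumps({"decisions": ["use qbound for the compliance KG"]})


stored = []
out = run_pattern_c(fake_fetch_prompt, fake_llm, stored.append, "user: let's use qbound")
```

Because the model call happens on the caller's side, the substrate never pays for inference, which is the point of the "$0/call" remark.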
What's new in 0.21.0 — code-graph MCP tools
- 5 new code_kg_* tools for sub-ms structural code queries against KGs built locally via neruva-record-code-index: code_kg_callees, code_kg_callers, code_kg_class_of, code_kg_module_of, code_kg_imports. Thin wrappers over hd_kg_query with "Call this when..." routing nudges.
- Tool-description routing nudges. All high-leverage tools (records_*, agent_recall/context/remember, hd_kg_query, hd_analogy, hd_causal_query, agent_counterfactual_rollout, agent_model_belief(_add), agent_register_action, agent_plan_efe, agent_induce_rule, agent_extract_schema, agent_hierarchical_decode) lead with "Call this when..." so LLMs route into the right substrate primitive without explicit prompting.
What's new in 0.18.3 — depth-unlimited theory of mind + 125× faster cleanup
- Theory of mind is now depth-unlimited (v0.5.4 substrate fix). Position-tagged at every chain index via non-commutative permutation binding. Inner-position swaps correctly reject; recursive self-reference (same agent at multiple chain positions) works natively.
- Cleanup acceleration via FAISS-binary popcount. OPB query stage 2 uses SIMD popcount over sign-quantized atoms with deterministic float32 cosine rerank. Substantially faster on warm queries; replay bit-identical.
- 551× compression on stored OPB pages (rank-12 SVD). Persistence blobs that were >100 MB now fit in under 1 MB at perfect recall on round-trip.
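The two-stage cleanup idea from the second bullet (a cheap popcount shortlist over sign-quantized vectors, then an exact cosine rerank) can be sketched in pure Python. This is a stand-in for the general technique, not the FAISS-backed implementation; the dimensions, shortlist size, and data are arbitrary.

```python
import math
import random


def sign_bits(vec):
    """Pack the sign pattern of a float vector into one int (1 bit per dimension)."""
    bits = 0
    for i, x in enumerate(vec):
        if x >= 0:
            bits |= 1 << i
    return bits


def hamming(a, b):
    """Popcount of the XOR: number of dimensions whose signs differ."""
    return bin(a ^ b).count("1")


def cosine(u, v):
    dot = sum(x * y for x, y in zip(u, v))
    return dot / (math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v)))


def two_stage_search(query, atoms, codes, shortlist=4):
    """Stage 1: shortlist by Hamming distance. Stage 2: exact cosine rerank."""
    qbits = sign_bits(query)
    order = sorted(range(len(atoms)), key=lambda i: (hamming(qbits, codes[i]), i))
    candidates = order[:shortlist]
    return max(candidates, key=lambda i: (cosine(query, atoms[i]), -i))


rng = random.Random(0)  # fixed seed, so the run is reproducible
atoms = [[rng.gauss(0, 1) for _ in range(256)] for _ in range(50)]
codes = [sign_bits(a) for a in atoms]                 # precomputed once
query = [x + 0.1 * rng.gauss(0, 1) for x in atoms[17]]  # noisy copy of atom 17
best = two_stage_search(query, atoms, codes)
```

The float rerank runs over a handful of candidates only, so most of the work is integer XOR and popcount, while the final answer is still decided by exact cosine.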
The 9-level cognitive ladder — no LLM vendor ships rows 3-9
The substrate now exposes the full 9-level cognitive ladder. Every primitive runs sub-100ms, deterministic from seed, behind one MCP install.
| # | Capability | MCP tool(s) | Frontier LLM equivalent |
|---|---|---|---|
| 1 | Vector retrieval (OPB pages + spectral routing) | records_query(engine="opb") | Pinecone/Zep (Level 1 only) |
| 2 | KG + Pearl do-operator + HD analogy + CBR | hd_kg_* · agent_causal_query · hd_analogy · hd_cbr_* | nobody |
| 3 | Theory of Mind (nested belief) | agent_model_belief_add · agent_model_belief | hallucinates at depth |
| 4 | Counterfactual rollouts ("what if k → a'?") | agent_counterfactual_rollout | confabulates |
| 5 | Schema lifting (analogical pattern matching) | agent_extract_schema | needs fine-tuning |
| 6 | Active Inference planning (Friston EFE) | agent_register_action · agent_plan_efe | not a primitive |
| 7 | Few-shot rule induction | agent_induce_rule | fine-tune (many examples) |
| 8 | Persistent rule storage | agent_persist_rule · agent_recall_rule | re-feed demos every recall |
| 9 | Continual learning, zero forgetting | agent_continual_train · agent_continual_predict | catastrophic forgetting |
| + | Hierarchical chunking (recursive L^K decode) | agent_hierarchical_add · agent_hierarchical_decode | not a primitive |
~80 tools across Records, KG, Causal, Analogy, CBR, Blend, federated agent_*, the 9 cognitive primitives above, self-introspection.
Why this is unique
Every primitive in rows 3-9 is a graduated, production-shipped engine. No published memory vendor offers more than rows 1-2. Substrate-augmented small LLMs can match frontier-class agentic capabilities at a fraction of the cost per recall.
Install
# In Claude Code (any directory, user scope):
claude mcp add-json neruva '{"command":"npx","args":["-y","@neruva/mcp@latest"],"env":{"NERUVA_API_KEY":"nv_..."}}'
Or one-line install via npx for any MCP host:
npx -y @neruva/mcp@latest # one-off
npm i -g @neruva/mcp # then `neruva-mcp`
Get an API key at https://app.neruva.io (free tier, no credit card).
Wire into a host
Claude Code
claude mcp add-json neruva '{"command":"npx","args":["-y","@neruva/mcp@latest"],"env":{"NERUVA_API_KEY":"..."}}'
Cursor (~/.cursor/mcp.json)
{
  "mcpServers": {
    "neruva": {
      "command": "npx",
      "args": ["-y", "@neruva/mcp@latest"],
      "env": { "NERUVA_API_KEY": "..." }
    }
  }
}
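If the config file already lists other MCP servers, the entry has to be merged rather than pasted over. A small sketch of that merge, assuming the JSON layout shown above; the helper names are hypothetical and the API key is a placeholder.

```python
import json
from pathlib import Path


def add_neruva_server(config: dict, api_key: str) -> dict:
    """Return a copy of `config` with a `neruva` entry under `mcpServers`."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers["neruva"] = {
        "command": "npx",
        "args": ["-y", "@neruva/mcp@latest"],
        "env": {"NERUVA_API_KEY": api_key},
    }
    merged["mcpServers"] = servers
    return merged


def update_config_file(path: Path, api_key: str) -> None:
    """Read the JSON config at `path` (if present), merge the entry, write it back."""
    existing = json.loads(path.read_text()) if path.exists() else {}
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(add_neruva_server(existing, api_key), indent=2) + "\n")


# Example (not executed here):
# update_config_file(Path.home() / ".cursor" / "mcp.json", "nv_...")
merged = add_neruva_server({"mcpServers": {"other": {"command": "x"}}}, "nv_placeholder")
```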
Codex (~/.codex/config.toml)
[mcp_servers.neruva]
command = "npx"
args = ["-y", "@neruva/mcp@latest"]
env = { NERUVA_API_KEY = "..." }
Gemini CLI (~/.gemini/settings.json)
{ "mcpServers": { "neruva": { "command": "npx", "args": ["-y", "@neruva/mcp@latest"], "env": { "NERUVA_API_KEY": "..." } } } }
Goose (~/.config/goose/config.yaml)
extensions:
  neruva:
    type: stdio
    cmd: npx
    args: ["-y", "@neruva/mcp@latest"]
    env:
      NERUVA_API_KEY: nv_...
For Goose auto-pilot (pattern-C route / reflect / extract via your LLM): pip install neruva-goose.
The substrate, in one paragraph
Five layers, one API. Records = typed agentic events (decisions, mistakes, tool_calls, llm_turns; auto-embedded at D=1024). Knowledge Graph = mutable structured state across 6 engines (hadamard, opb, qbound, multishard, quorum, feature_bundle), sub-ms cosine retrieval, matrix-power N-hop derive. Causal = Pearl's do-operator (observation vs intervention arithmetically distinct). Analogy = a:b::c:? in HD feature space. Concept Blending = provenance-preserving merge of multiple memories. CBR = factored episode store. The new federated agent_* layer (agent_remember / agent_recall / agent_context) routes across all substrates so a single call handles "where is X stored, and how do I get it back?"
Deterministic from a seed. Replayable bit-exactly. Portable as .neruva containers — your data is yours.
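To illustrate what a:b::c:? means in a high-dimensional feature space, and why a fixed seed makes the result repeatable, here is a toy sketch with random bipolar vectors. It is a generic hyperdimensional-computing example under assumed feature names, not Neruva's encoding.

```python
import math
import random

D = 1024                     # dimensionality, matching the D=1024 mentioned above
rng = random.Random(42)      # fixed seed: the same vectors on every run


def random_hv():
    """A random bipolar hypervector with entries in {-1, +1}."""
    return [rng.choice((-1, 1)) for _ in range(D)]


def add(u, v):
    return [x + y for x, y in zip(u, v)]


def sub(u, v):
    return [x - y for x, y in zip(u, v)]


def cosine(u, v):
    dot = sum(x * y for x, y in zip(u, v))
    return dot / (math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v)))


# Each item is a bundle (sum) of two hypothetical feature vectors.
f = {name: random_hv() for name in ("royal", "common", "male", "female")}
items = {
    "king": add(f["royal"], f["male"]),
    "queen": add(f["royal"], f["female"]),
    "man": add(f["common"], f["male"]),
    "woman": add(f["common"], f["female"]),
}


def analogy(a, b, c):
    """Solve a:b::c:? as the item nearest to (b - a + c) by cosine similarity."""
    query = add(sub(items[b], items[a]), items[c])
    return max(items, key=lambda name: cosine(query, items[name]))


answer = analogy("man", "woman", "king")
```

With these bundles, `woman - man + king` reduces exactly to `royal + female`, so the nearest item is "queen".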
Three-line LangChain integration
# pip install neruva-langchain
from neruva_langchain import NeruvaChatMessageHistory
history = NeruvaChatMessageHistory(namespace="user_alice")
# wire into any chain that takes BaseChatMessageHistory
Same pattern: neruva-langgraph (BaseCheckpointSaver + BaseStore), neruva-crewai (Storage interface + 3 memory flavors).
Auto-record for Claude Code
pip install neruva-record && neruva-record-install
Every Claude Code session lands in your Neruva account: tool calls, chat turns, secrets-redacted client-side, queryable across sessions.
Why use this over a vector DB or Zep
| | Vector DB | Zep | Mem0 | Neruva |
|---|---|---|---|---|
| KG engines | 0 | 1 | 1 | 6 |
| Counterfactual queries | ❌ | ❌ | ❌ | ✅ |
| Provable replay (deterministic snapshot/restore) | ❌ | ❌ | ❌ | ✅ |
| Anomaly detection (quorum disagreement) | ❌ | ❌ | ❌ | ✅ |
| Federated context (records+KG one call) | ❌ | partial | partial | ✅ |
| Portable container | ❌ | ❌ | ❌ | ✅ .neruva |
| p95 latency | varies | varies | varies | <100ms |
| Cost per recall vs context-stuffing | varies | varies | varies | dramatically lower |
KG engine selector
Pick engine on first hd_kg_add_facts call to a new KG:
| engine | best for | storage |
|---|---|---|
| hadamard (default) | small KGs (<10k facts), latency-critical | 32 KB/shard |
| opb | large KGs (>10k facts), matrix-power N-hop derivation | 256 MB/shard |
| qbound | conflict-resolution where documents override LLM priors | similar to opb |
| multishard | very large KGs, sharded across K=16 hadamard buckets | scales linearly |
| quorum | adversarial/anomaly detection via n-shard quorum | n × hadamard |
| feature_bundle | typed-feature workloads (color, size, role) | 128 features × D |
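The quorum row relies on disagreement between shards as an anomaly signal. A generic majority-vote sketch of that idea follows; the function name, return shape, and "strict majority" rule are assumptions, not the engine's documented behavior.

```python
from collections import Counter


def quorum_answer(shard_answers):
    """Majority vote over per-shard answers, flagging any disagreement."""
    if not shard_answers:
        raise ValueError("need at least one shard answer")
    counts = Counter(shard_answers)
    top, votes = counts.most_common(1)[0]
    has_quorum = votes > len(shard_answers) / 2   # strict majority
    return {
        "answer": top if has_quorum else None,
        "agreement": votes / len(shard_answers),
        "anomaly": len(counts) > 1,               # at least one shard dissented
    }


clean = quorum_answer(["v2024.3", "v2024.3", "v2024.3"])
tampered = quorum_answer(["v2024.3", "v2024.3", "v2023.1"])
split = quorum_answer(["a", "b", "c"])
```

In the `tampered` case the majority answer is still returned, but the dissenting shard sets `anomaly` so the caller can inspect it.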
Auth
Set NERUVA_API_KEY in env. NERUVA_URL defaults to https://api.neruva.io.
Optional: NERUVA_AUTO_RECORD=namespace[:ttl_days] — every tool call this agent makes auto-records into the named records namespace. Fire-and-forget, never blocks or breaks the call.
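The `namespace[:ttl_days]` format can be read as shown in this sketch. It assumes the namespace itself contains no colon and that `ttl_days` is a whole number; the server's own parsing rules may differ, and the helper name is hypothetical.

```python
from typing import NamedTuple, Optional


class AutoRecord(NamedTuple):
    namespace: str
    ttl_days: Optional[int]   # None when no TTL was given


def parse_auto_record(value: str) -> AutoRecord:
    """Parse a NERUVA_AUTO_RECORD value of the form namespace[:ttl_days]."""
    namespace, sep, ttl = value.partition(":")
    if not namespace:
        raise ValueError("namespace must not be empty")
    if not sep:
        return AutoRecord(namespace, None)
    return AutoRecord(namespace, int(ttl))   # int() raises ValueError on bad input


with_ttl = parse_auto_record("agent_logs:30")
without_ttl = parse_auto_record("agent_logs")
```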
Update flow
The startup banner prints when a newer version is available:
[neruva-mcp] update available: you have 0.16.0, latest is 0.16.1.
If registered with @neruva/mcp@latest, a Claude Code restart auto-updates.
License
MIT