Epistemic grounding substrate for LLMs — knows what it knows, what it doesn't, and why
Nouse
Epistemic grounding for LLMs — knows what it knows, how confidently, and where knowledge runs out.
An 8B model with Nouse outperforms a 70B model without it.
Try NoUse In 60 Seconds
pip install nouse
python - <<'PY'
import nouse
brain = nouse.attach()
result = brain.query("What does this project know about epistemic grounding?")
print(result.context_block())
print("confidence:", round(result.confidence, 2))
PY
If NoUse already knows something relevant, you get back a grounded context block with validated relations, uncertainty, and explicit boundaries instead of a generic answer blob.
If that output feels more useful than plain chat history or chunk retrieval, then the project is doing its job.
Why It Matters
NoUse gives an LLM agent a persistent epistemic memory layer.
- It stores relations, not just retrieved chunks.
- It carries confidence, rationale, and uncertainty with the memory itself.
- It makes the boundary between known, probable, and unknown visible to the model.
That changes agent behavior in the place that actually matters: when a model is close to hallucinating but still sounds fluent.
The Result
| Model | Score | Questions |
|---|---|---|
| llama3.1-8b (no memory) | 46% | 60 |
| llama-3.3-70b (no memory) | 47% | 60 |
| llama3.1-8b + Nouse memory | 96% | 60 |
On this 60-question, domain-specific benchmark, the 8B model with Nouse memory scores 96%, against 47% for the 70B model without it.
The effect is not retrieval. It is epistemic grounding — a small, precise knowledge signal redirects the model's existing priors onto the correct frame, with confidence and evidence attached. We call this the Intent Disambiguation Effect.
→ Full benchmark details: eval/RESULTS.md · Run it yourself
What You Get
| Capability | What it does |
|---|---|
| Structured memory | Stores typed relations between concepts instead of plain text chunks |
| Confidence-aware retrieval | Returns what is known, with evidence and uncertainty attached |
| Gap awareness | Surfaces where knowledge ends instead of bluffing through it |
| Continuous learning | Strengthens or weakens graph paths over time via Hebbian plasticity |
| Local-first runtime | Runs as a local graph and daemon, then injects context into any LLM |
What Nouse Is
Nouse (νοῦς, Gk. mind) is a persistent, self-growing epistemic substrate that attaches to any LLM.
It is informed by brain-inspired plasticity, cognitive research, and the practical failure modes of LLM memory.
Your documents, conversations, research
↓
Nouse knowledge graph
(SQLite WAL + NetworkX + Hebbian learning + evidence scoring)
↓
brain.query("your question")
↓
Structured context injected into any LLM prompt:
— what is known (relations + confidence)
— why it is known (evidence chain)
— what is NOT known (gap map from TDA)
It is not a RAG system. RAG retrieves chunks. Nouse extracts relations — typed, weighted, evidence-scored connections between concepts — and injects a compact, structured context block.
It is not just a memory system. Memory stores and retrieves. Nouse maintains an epistemic account: every relation carries a trust tier (hypothesis / indication / validated), a rationale, and a contradiction flag. The system knows the difference between what it has evidence for and what it is guessing.
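As a rough illustration of such an epistemic account, a relation record could be modelled as below. This is a minimal sketch, not Nouse's actual schema: the class names, field names, and the tier thresholds (0.5 and 0.8) are assumptions chosen for the example.

```python
from dataclasses import dataclass
from enum import Enum


class TrustTier(Enum):
    HYPOTHESIS = "hypothesis"
    INDICATION = "indication"
    VALIDATED = "validated"


@dataclass(frozen=True)
class Relation:
    """Hypothetical relation record; field names are illustrative."""
    source: str
    predicate: str
    target: str
    evidence: float            # evidence score in [0, 1]
    rationale: str             # why the relation is believed
    contradicted: bool = False  # set when conflicting evidence exists

    def tier(self, indication_at: float = 0.5, validated_at: float = 0.8) -> TrustTier:
        # Thresholds are assumptions for this sketch, not Nouse's real cut-offs.
        if self.contradicted or self.evidence < indication_at:
            return TrustTier.HYPOTHESIS
        if self.evidence < validated_at:
            return TrustTier.INDICATION
        return TrustTier.VALIDATED


rel = Relation("attention", "modulates", "token relevance", 0.81, "illustrative rationale")
print(rel.tier().value)
```

In this sketch a contradiction flag always demotes a relation to a hypothesis, whatever its evidence score. That is one possible design choice, not a documented rule.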
It learns continuously. Every interaction strengthens or weakens connections (Hebbian plasticity). There is no retraining. No gradient descent. The graph grows — and the gaps become visible.
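A Hebbian-style update of this kind can be sketched in a few lines. The version below uses a plain dictionary of edge weights rather than NetworkX so it has no dependencies. The function name, the learning rate, and the decay factor are assumptions for illustration and are not taken from the Nouse codebase.

```python
def hebbian_update(weights, active, lr=0.1, decay=0.01):
    """Return new edge weights after one interaction.

    weights: dict mapping (source, target) -> weight in [0, 1]
    active:  set of concepts that were active in this interaction
    lr, decay: illustrative constants, not Nouse's real parameters
    """
    updated = {}
    for (src, dst), w in weights.items():
        if src in active and dst in active:
            # Co-activation: move the weight a step toward 1.
            w = w + lr * (1.0 - w)
        else:
            # No co-activation: let the weight decay slowly toward 0.
            w = w * (1.0 - decay)
        updated[(src, dst)] = w
    return updated


weights = {
    ("transformer", "attention"): 0.5,
    ("attention", "memory routing"): 0.5,
}
weights = hebbian_update(weights, active={"transformer", "attention"})
print(weights)
```

No gradients are involved: each edge changes only as a function of whether its two endpoints were active together.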
How Nouse Differs From Alternatives
| System | Main unit | Knows confidence | Knows what's missing | Learns over time | Local-first |
|---|---|---|---|---|---|
| Basic RAG | text chunk | ✗ | ✗ | ✗ | ✓ |
| Vector memory | embedding | ~ | ✗ | ✗ | ✓ |
| Mem0 | memory objects | ~ | ✗ | ~ | ✓ |
| MemGPT / Letta | conversation pages | ✗ | ✗ | ~ | ✗ |
| Claude Memory | key-value | ✗ | ✗ | ✗ | ✗ |
| Nouse | typed relation + evidence | ✓ | ✓ | ✓ | ✓ |
Nouse is not trying to replace the model. It gives the model a brain-like memory substrate it can query before speaking.
Quick start
pip install nouse
import nouse
# Auto-detects the local daemon if it is running.
# Otherwise falls back to direct local graph access.
brain = nouse.attach()
result = brain.query("transformer attention mechanism")
print(result.context_block())
print(result.confidence)
print(result.strong_axioms())
If the daemon is running, attach() connects over HTTP. Otherwise it falls back to direct local graph access. The same code works either way.
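The fallback described above follows a common pattern: probe for a local service, and use in-process access if nothing answers. The sketch below shows that pattern only. The function name and the port number 8765 are hypothetical, and Nouse's real detection logic may work differently.

```python
import socket


def choose_backend(host="127.0.0.1", port=8765, timeout=0.2):
    """Return "http" if something accepts TCP connections on host:port, else "local".

    The port is a placeholder for this sketch, not Nouse's documented port.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "http"
    except OSError:
        # Connection refused, timed out, or unreachable: use direct graph access.
        return "local"


print(choose_backend())
```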
Works with any provider — OpenAI, Anthropic, Groq, Cerebras, Ollama:
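# You handle the LLM call. Nouse handles the memory.
# Shown with the OpenAI client; any provider's chat call fits here.
from openai import OpenAI
client = OpenAI()
context = brain.query(user_question).context_block()
response = client.chat.completions.create(
    model="gpt-4.1-mini",
    messages=[
        {"role": "system", "content": context},
        {"role": "user", "content": user_question},
    ],
)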
Use With OpenAI, Anthropic, Or Ollama
OpenAI
from openai import OpenAI
import nouse
client = OpenAI()
brain = nouse.attach()
question = "How does residual attention affect token relevance?"
context = brain.query(question).context_block()
response = client.chat.completions.create(
    model="gpt-4.1-mini",
    messages=[
        {"role": "system", "content": context},
        {"role": "user", "content": question},
    ],
)
print(response.choices[0].message.content)
Anthropic
from anthropic import Anthropic
import nouse
client = Anthropic()
brain = nouse.attach()
question = "What does this repo know about topological plasticity?"
context = brain.query(question).context_block()
response = client.messages.create(
    model="claude-3-7-sonnet-latest",
    max_tokens=800,
    system=context,
    messages=[
        {"role": "user", "content": question},
    ],
)
print(response.content[0].text)
Ollama
import ollama
import nouse
brain = nouse.attach()
question = "Summarize what is known about epistemic grounding."
context = brain.query(question).context_block()
response = ollama.chat(
    model="qwen3.5:latest",
    messages=[
        {"role": "system", "content": context},
        {"role": "user", "content": question},
    ],
)
print(response["message"]["content"])
The pattern is always the same: brain.query(...) first, provider call second.
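Since the pattern repeats, it can be wrapped in a small provider-agnostic helper. The sketch below is an illustration, not part of the Nouse API: `grounded_messages` is a hypothetical name, and the stub classes stand in for `nouse.attach()` so the example runs without Nouse installed.

```python
def grounded_messages(brain, question):
    """Build a chat message list with the Nouse context as the system prompt."""
    context = brain.query(question).context_block()
    return [
        {"role": "system", "content": context},
        {"role": "user", "content": question},
    ]


class _StubResult:
    # Mimics only the context_block() method used in the examples above.
    def context_block(self):
        return "[Nouse memory]\n(no relations yet)"


class _StubBrain:
    # Stand-in for nouse.attach(); replace with the real brain in practice.
    def query(self, question):
        return _StubResult()


messages = grounded_messages(_StubBrain(), "What is epistemic grounding?")
print(messages[0]["role"], "|", messages[1]["content"])
```

For OpenAI-style and Ollama-style clients the list can be passed as `messages` directly. For Anthropic, as in the example above, the system entry's content goes in `system=` and the remaining entries in `messages=`.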
Managed NoUse (Coming)
NoUse is local-first today. A managed cloud version is planned:
brain = nouse.attach(api_key="nouse_sk_...")
The plan covers hosted memory graphs, shared project memory across agents and teams, and zero local setup. Interested? Get in touch.
What A Grounded Answer Looks Like
When you query NoUse, the model does not just get a blob of context. It gets an epistemic frame:
[Nouse memory]
• transformer attention: mechanism for routing token influence across context
claim: attention modulates token relevance based on learned relational patterns
Validated relations:
transformer —[uses]→ attention [ev=0.92]
attention —[modulates]→ token relevance [ev=0.81]
Uncertain / under review:
attention —[is_equivalent_to]→ memory routing [ev=0.41] ⚑
That is the real product surface: not storage, but a more honest and better-calibrated answer path.
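If you want to post-process a context block, for example to log which relations were flagged, the relation lines shown above are regular enough to parse. The sketch below assumes the line format is exactly as printed in this README; the regular expression and the function name are illustrative, and the real output may vary between versions.

```python
import re

# Matches lines such as: attention —[modulates]→ token relevance [ev=0.81]
RELATION_LINE = re.compile(
    r"^\s*(?P<source>.+?) —\[(?P<predicate>[^\]]+)\]→ (?P<target>.+?) "
    r"\[ev=(?P<ev>[0-9.]+)\](?P<flag>\s*⚑)?\s*$"
)


def parse_relations(block):
    """Extract (source, predicate, target, evidence, flagged) tuples from a block."""
    relations = []
    for line in block.splitlines():
        match = RELATION_LINE.match(line)
        if match:
            relations.append((
                match["source"], match["predicate"], match["target"],
                float(match["ev"]), match["flag"] is not None,
            ))
    return relations


sample = """Validated relations:
transformer —[uses]→ attention [ev=0.92]
attention —[modulates]→ token relevance [ev=0.81]
Uncertain / under review:
attention —[is_equivalent_to]→ memory routing [ev=0.41] ⚑"""

for relation in parse_relations(sample):
    print(relation)
```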
Run the benchmark yourself
git clone https://github.com/base76-research-lab/NoUse
cd NoUse
pip install -e .
# Generate questions from your own graph
python eval/generate_questions.py --n 60
# Run benchmark (requires Cerebras or Groq API key, or use Ollama)
python eval/run_eval.py \
--small cerebras/llama3.1-8b \
--large groq/llama-3.3-70b-versatile \
--n 60 --no-judge
The current benchmark is domain-specific and intentionally small. Its purpose is to test whether a grounded memory signal can redirect the model onto the right frame, not to claim a universal leaderboard win.
How the graph grows
Read a document / have a conversation
↓
nouse daemon (background)
↓
DeepDive: extract concepts + relations
↓
Hebbian update: strengthen confirmed paths
↓
NightRun: consolidate, prune weak edges
↓
Ghost Q (nightly): ask LLM about weak nodes → enrich graph
The daemon runs as a systemd service. It watches your files, chat history, browser bookmarks — anything you configure. You never manually curate the graph.
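The last two stages of that pipeline, pruning weak edges and picking weak nodes to ask about, can be pictured with the sketch below. It is a simplified model on a plain dictionary of edge weights. The function names, the pruning threshold, and the ranking rule are assumptions and do not describe the actual NightRun or Ghost Q code.

```python
def prune_weak_edges(weights, min_weight=0.05):
    """Drop edges whose weight fell below min_weight (illustrative threshold)."""
    return {edge: w for edge, w in weights.items() if w >= min_weight}


def weakest_nodes(weights, limit=3):
    """Rank nodes by total incident edge weight, weakest first."""
    strength = {}
    for (src, dst), w in weights.items():
        strength[src] = strength.get(src, 0.0) + w
        strength[dst] = strength.get(dst, 0.0) + w
    return sorted(strength, key=strength.get)[:limit]


weights = {
    ("transformer", "attention"): 0.9,
    ("attention", "token relevance"): 0.6,
    ("attention", "memory routing"): 0.02,
    ("plasticity", "graph"): 0.1,
}
kept = prune_weak_edges(weights)
# Nodes returned here would be candidates for an enrichment question to the LLM.
print(weakest_nodes(kept, limit=2))
```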
Good Fits
- Coding agents that need stable project memory across sessions
- Research copilots that must preserve terminology, evidence, and uncertainty
- Domain-specific assistants where bluffing is worse than saying "unknown"
- Local-first AI workflows where you want observability instead of hidden memory state
Architecture
nouse/
├── inject.py # Public API: attach(), NouseBrain, Axiom, QueryResult
├── field/
│ └── surface.py # SQLite WAL + NetworkX graph interface
├── daemon/
│ ├── main.py # Autonomous learning loop
│ ├── nightrun.py # Nightly consolidation (9 phases)
│ ├── node_deepdive.py # 5-step concept extraction
│ └── ghost_q.py # LLM-driven graph enrichment
├── limbic/ # Neuromodulation (relevance, arousal, novelty)
├── memory/ # Episodic + procedural + semantic memory
├── metacognition/ # Self-monitoring and confidence calibration
└── search/
└── escalator.py # 3-level knowledge escalation
The hypothesis (work in progress)
small model + Nouse[domain] > large model without Nouse
Our benchmark gives early evidence for this on a single domain. The next step is to test across more domains and more models, and to score with an LLM judge instead of keyword scoring.
Contributions welcome — especially domain-specific question banks.
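For readers unfamiliar with keyword scoring, one plausible reading is sketched below: an answer earns the fraction of expected keywords it contains, and the benchmark score is the mean over all questions. This is an assumption about the general technique, not a copy of eval/run_eval.py, and the sample answers are made up for the example.

```python
def keyword_score(answer, expected_keywords):
    """Fraction of expected keywords that appear in the answer (case-insensitive)."""
    if not expected_keywords:
        return 0.0
    text = answer.lower()
    hits = sum(1 for kw in expected_keywords if kw.lower() in text)
    return hits / len(expected_keywords)


def benchmark_score(results):
    """Mean keyword score over (answer, expected_keywords) pairs."""
    if not results:
        return 0.0
    return sum(keyword_score(a, kws) for a, kws in results) / len(results)


# Illustrative data only; not taken from the real question bank.
results = [
    ("Attention modulates token relevance.", ["attention", "relevance"]),
    ("I am not sure.", ["hebbian", "plasticity"]),
]
print(f"{benchmark_score(results):.0%}")
```

Substring matching of this kind is cheap but crude, which is one reason an LLM judge is the natural next step.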
Research
The theoretical foundation for Nouse is described in:
- Wikström, B. (2026). The Larynx Problem: Why Large Language Models Are Not Artificial Intelligence. Zenodo · PhilPapers
The paper argues that LLMs model the output channel of intelligence (language), not intelligence itself — and that epistemic grounding through structured, plastic knowledge graphs is a necessary complement.
Install & Run Daemon
pip install nouse
# Start the learning daemon
nouse daemon start
# Interactive REPL with memory
nouse run
# Check graph stats
nouse status
Requires Python 3.11+. Graph stored in ~/.local/share/nouse/.
Roadmap
| Phase | Status | Description |
|---|---|---|
| Core engine | ✅ | SQLite WAL + NetworkX + Hebbian plasticity + TDA gap detection |
| Multi-provider | ✅ | OpenAI, Anthropic, Ollama, Groq, Cerebras |
| MCP integration | ✅ | Model Context Protocol server for Claude and compatible clients |
| Cross-domain benchmarks | 🔄 | Validating on external datasets beyond internal domain |
| Docker support | 📋 | One-command deployment for teams |
| Managed cloud | 📋 | nouse.attach(api_key="nouse_sk_...") — hosted brain for teams |
| Multi-tenant API | 📋 | Shared project memory, team collaboration, SLAs |
License
MIT — see LICENSE
Contact
Björn Wikström / Base76 Research Lab
- 𝕏 / Twitter: @Q_for_qualia
- LinkedIn: bjornshomelab
- Email: bjorn@base76research.com
- Issues: GitHub Issues
For security vulnerabilities, see SECURITY.md.
File details
Details for the file nouse-0.4.0.tar.gz.
File metadata
- Download URL: nouse-0.4.0.tar.gz
- Upload date:
- Size: 386.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.11
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 675f5d1d5c5666a5aad32edb7b1f196b7e74735bdd6bef842611ded265fb1086 |
| MD5 | b7a2e3f13a9da4ac333fffef6bd46537 |
| BLAKE2b-256 | 4520dafe8f3ec91e7f259bf72fc860ab0285a5a2504449219fe1789eb5faa2b1 |
File details
Details for the file nouse-0.4.0-py3-none-any.whl.
File metadata
- Download URL: nouse-0.4.0-py3-none-any.whl
- Upload date:
- Size: 428.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.11
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 87e0dabc40739537c0036d22a524a974bd27641d8da9952615f377e52fe2d879 |
| MD5 | 4f953cae51a0fcbf22e7c4c6c92c4849 |
| BLAKE2b-256 | 3d2cd443de42014d5a49bca62f68d084afceed7196444f32f03e2e781cc04c30 |