Skip to main content

MCP server for Obsidian — semantic knowledge graph with auto-classification, DAG hierarchy, and cross-domain bridge detection

Project description

NOUZ — Semantic Knowledge Graph for Obsidian

One server. Three approaches. Your notes find their own place in the graph.

MIT License Python 3.10+ MCP PyPI


Why NOUZ?

You write in Obsidian. A lot. Over time your vault turns into a mess — hundreds of notes connected somehow, with no system or logic.

NOUZ fixes this. It reads your notes, analyzes the content, and builds the knowledge graph itself — what belongs where, what "sign" each note has, where the semantic connections between different branches are.

No API keys needed. NOUZ works with your own embedding model.


What NOUZ Does

Builds the Graph

Add parents: to a note — and NOUZ automatically places it in the hierarchy. No manual folder sorting.

Classifies by Content

Using embeddings (local model), NOUZ understands what the note is about and assigns it a "sign" — T (technology), S (science), H (humanities) or any other you define.

Finds Hidden Connections

Notes from different branches can be semantically close. NOUZ finds these bridges and suggests linking them.

Tracks Knowledge Drift

Over time the content of a branch changes. NOUZ shows when the actual composition of notes diverges from the declared topic — this is core_drift, a signal that the branch has grown beyond its domain.


Three Modes for Different Tasks

Mode What It Does Embeddings?
LUCA Pure graph — only links and hierarchy
PRIZMA Full semantics — classification, bridges, drift
SLOI Strict 5-level hierarchy with control

Start with LUCA — just connect and add links. Move to PRIZMA or SLOI when you want semantics.


Quick Start

# Install via PyPI (recommended)
pip install nouz-mcp

# Run
OBSIDIAN_ROOT=/path/to/vault nouz-mcp

Or from source:

git clone https://github.com/KVANTRA-dev/NOUZ-MCP
cd NOUZ-MCP
pip install -r requirements.txt
OBSIDIAN_ROOT=./vault python server.py

Connect to Claude Desktop, Cursor, Opencode, or any other MCP client:

{
  "mcpServers": {
    "nouz": {
      "command": "nouz-mcp",
      "env": {
        "OBSIDIAN_ROOT": "/path/to/vault",
        "MODE": "prizma",
        "EMBED_API_URL": "http://127.0.0.1:1234/v1"
      }
    }
  }
}

How It Works

1. You Write a Note

---
type: module
level: 3
sign: T
parents:
  - Machine Learning
---

Your note here.

2. NOUZ Builds the Graph

  • Reads YAML → builds DAG (directed acyclic graph)
  • Vectorizes text → calculates proximity to your cores
  • Proposes sign, level, parents
  • Finds bridges to other branches

3. Structure Emerges from Content

T (core)
 ├── TH (pattern: AI)
 │   ├── TH (module: ML)
 │   │   ├── TH (quant: neural-networks.md) — T
 │   │   └── TS (quant: transformers.md) — T
 │   └── TH (module: Ethics)
 │       └── TH (quant: ai-safety.md) — T
 └── TS (pattern: Physics)
     └── ...

The more you write — the smarter your knowledge base becomes, and the agent working with NOUZ.


How Classification Works

Cores — Coordinate Axes

In config.yaml you define 2–5 domains as text descriptions. The server converts them into reference vectors — coordinate axes in the multidimensional embedding space.

mode: prizma

etalons:
  - sign: T
    name: Technology
    text: "programming software architecture machine learning neural networks"
  - sign: S
    name: Science
    text: "physics chemistry biology mathematics formal logic theorems"
  - sign: H
    name: Humanities
    text: "philosophy psychology sociology history literature ethics"

On core quality: separation between cores matters more than description accuracy. After writing — run calibrate_cores and check pairwise_cosine. Values above 0.55 between any two cores means they're too similar and the classifier will confuse domains. This is analogous to orthogonal basis vectors: the more orthogonal — the more accurate the projection.

Sign Assignment

For each note, its content vector is compared against all reference vectors:

scores     = {S: cosine(note, core_S), T: cosine(note, core_T), H: cosine(note, core_H)}
spread     = max(scores) - min(scores)

If spread < 0.05 — the note is equidistant from all cores, sign is undefined. This is not an error: the note genuinely belongs to multiple domains or its content is insufficient for classification.

adjusted   = {k: scores[k] - min(scores) for k in scores}
percent    = {k: adjusted[k] / sum(adjusted) * 100 for k in adjusted}
sign       = all domains where percent[k] >= 30%

If two domains score ≥ 30% — the sign is composite: TS, SH. This is not a contradiction, but a spectrum: the note lives on the border between domains.

Sign Inheritance

The sign flows top-down through hierarchical links:

L1 core      ← defined manually, never changes
L2 pattern   ← defined manually + embedding adds second sign (if ≥ 30%)
L3 module    ← inherits sign from parent pattern
L4 quant     ← hybrid: sign from L3 parent + embedding content
L5 artifact  ← inherits sign from parent quant

Priority: sign_manual (YAML) > sign_auto (embedding) > inherited

Manual sign in YAML is never overwritten automatically.

Sign Confidence: auto vs weak_auto

Spread-normalization answers "which core is relatively closer". But it doesn't answer "how close is the note to this core in absolute terms".

For this there's confident_cosine — a threshold on max(cosine):

max_cosine >= confident_cosine → sign_source = "auto"      (confident sign)
max_cosine < confident_cosine  → sign_source = "weak_auto" (relative sign)

What this means in practice:

weak_auto occurs when the note has a relative winner among cores (spread is normal), but in absolute terms all cosines are low. This happens when:

  • The note is short or not written in the language of the cores
  • The cores don't cover the note's topic well
  • The embedding model isn't strong in this domain

Impact on semantic bridges:

This is where it matters. Semantic bridges are only proposed between notes with different signs — if signs match, they're assumed to be "the same area". But if the sign is weak_auto — this assumption is unreliable.

sign_source = "auto"      → sign is considered closed. Bridges to the same core are not proposed.
sign_source = "weak_auto" → sign is open. Bridges to notes with the same core are still proposed.

This gives you a choice: either set the sign manually (close the domain), or let the system keep proposing connections from all directions.

Threshold tuning:

thresholds:
  confident_cosine: 0.6  # for most models (e5, BGE, multilingual)
  # confident_cosine: 0.75  # for nomic-embed (high baseline 0.74–0.83)

The higher the threshold — the stricter the confidence requirement, the more notes get weak_auto. Calibrate for your model.


core_mix — Reality Bottom-Up

The sign flows top-down (intent). core_mix flows bottom-up (reality):

quant (L4) → updates core_mix of parent module (L3)
module (L3) → aggregates into core_mix of parent pattern (L2)

Each module accumulates the averaged domain distribution of all its quants. When the declared sign (intent) diverges from core_mix (reality) — the system reports core_drift.

This is not an error. It's information about how the knowledge base has evolved. A branch with sign=T can gradually accumulate 60% of notes about mathematics — and core_drift will show this.

This bidirectional flow — top-down constraints and bottom-up evidence — mirrors the architecture of hierarchical predictive coding systems: upper levels set expectations, lower levels return discrepancies.


Semantic Bridges

suggest_metadata returns candidates for cross-domain links. All bridges are marked with proposed: true and require explicit confirmation.

Algorithm:

For note A (sign=S) and note B (sign=T):
  if cosine(embed(A), embed(B)) >= 0.55 → propose link

Finds notes that are semantically similar overall but belong to different domains. A note about thermodynamic entropy and a note about data compression can have high embedding similarity — both are about efficient information encoding. The bridge detects that they're talking about the same thing from different angles.

The 0.55 threshold is configurable in config.yaml via semantic_bridge_threshold.


Customize It

mode: prizma

etalons:
  - sign: T
    name: Technology & Engineering
    text: "programming software architecture infrastructure machine learning neural networks
           algorithms frameworks database cloud computing"
  - sign: S
    name: Science & Mathematics
    text: "physics chemistry biology mathematics formal logic theorems cosmology quantum
           mechanics research methodology"
  - sign: H
    name: Humanities & Arts
    text: "philosophy psychology sociology history literature art culture ethics
           cognitive science epistemology linguistics"

thresholds:
  sign_spread: 0.05           # minimum spread for classification
  pattern_second_sign_threshold: 30.0  # second sign threshold (%)
  semantic_bridge_threshold: 0.55      # semantic bridge threshold

Tools

All Modes

Tool Description
read_file Read a note with YAML metadata
write_file Create or update a note
list_files List with filters by level, sign, subfolder
get_children Traverse DAG downwards (all children)
get_parents Traverse DAG upwards (parents)
index_all Reindex the entire vault
format_entity_compact Entity formula: (children)[entity]{parents}

PRIZMA / SLOI

Tool Description
calibrate_cores Vectorize core texts → reference vectors
recalc_signs Reclassify all notes by embeddings
recalc_core_mix Recalculate bottom-up aggregation
suggest_metadata Propose sign, level, semantic bridges
suggest_parents Find parents by semantic similarity
embed Get embedding vector for any text

Configuration

Variable Default Description
OBSIDIAN_ROOT ./obsidian Path to the vault
MODE luca Mode: luca, prizma, or sloi
EMBED_ENABLED true Enable embeddings
EMBED_PROVIDER openai Provider: openai, lmstudio, ollama, gigachat
EMBED_API_URL http://127.0.0.1:1234/v1 Endpoint for embeddings
EMBED_API_KEY `` API key if required

Privacy: What Stays Local

Component Local?
Embeddings (LM Studio / Ollama) ✅ Yes
Your notes (raw files) ✅ Yes
NOUZ server ✅ Yes
AI agent context (what the cloud model sees) ❌ No — goes to the cloud

NOUZ runs locally. But when you connect a cloud AI agent (Claude, ChatGPT, etc.), the content it sees — including your notes — goes to that provider's cloud. This is outside NOUZ control.

Your choice: use local agents or accept the trade-off.


Who Is It For?

Order lovers Want structure without spending time on manual organization
Researchers Gather lots of information and want to see connections
AI enthusiasts Building knowledge graphs for RAG or agent systems
Everyone with >100 notes When folders stop coping

Development

pip install -e .

Links


Structure emerges from content — you just write.

MIT License © 2026 KVANTRA


Cosines are calculated, syntax changes, semantics remains.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

nouz_mcp-2.1.4.tar.gz (32.2 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

nouz_mcp-2.1.4-py3-none-any.whl (23.4 kB view details)

Uploaded Python 3

File details

Details for the file nouz_mcp-2.1.4.tar.gz.

File metadata

  • Download URL: nouz_mcp-2.1.4.tar.gz
  • Upload date:
  • Size: 32.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.14.1

File hashes

Hashes for nouz_mcp-2.1.4.tar.gz
Algorithm Hash digest
SHA256 d79081dc5d9c367ede6f0876eff73d7a763f080d60465074029fb355f16ab5a3
MD5 71c9cabce1789bc3af30257a713013ee
BLAKE2b-256 88f8e88b00ddfab046ec6217659ed115fa081e931d4501216289c1a1ccbbe517

See more details on using hashes here.

File details

Details for the file nouz_mcp-2.1.4-py3-none-any.whl.

File metadata

  • Download URL: nouz_mcp-2.1.4-py3-none-any.whl
  • Upload date:
  • Size: 23.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.14.1

File hashes

Hashes for nouz_mcp-2.1.4-py3-none-any.whl
Algorithm Hash digest
SHA256 ab8e8e7e62b9e4dd57496d4742fb6aa54c7165a47d9d74940169fe9a77b29546
MD5 94531f079834db91a78060bd4f56e930
BLAKE2b-256 aaa68c5f23b0ec82a804f1cb1b2608080f8536b5c908055e6560f011982100dc

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page