MCP server for Obsidian — semantic knowledge graph with auto-classification, DAG hierarchy, and cross-domain bridge detection

NOUZ — Semantic MCP Server for Your Knowledge Base

Structure emerges from content.

Works with Obsidian, Logseq, and any directory of Markdown files.

MIT License Python 3.10+ MCP PyPI

🇷🇺 Russian version


Why NOUZ

NOUZ sits between your note base and your AI agent. It helps turn scattered Markdown files into a graph that can be used through MCP:

  1. Automatic Classification (Semantics)
    You define "Cores" — base domains of your knowledge base, such as Systems Analysis, Data & Science, and Engineering. When you add a new note, NOUZ reads its text, compares vectors, and proposes a domain sign or a combination of domains.

  2. Connection Discovery Between Notes
    The server builds a directed acyclic graph (DAG) and proposes links that can be reviewed before they are written:

    • Semantic bridges: two notes from different domains point to the same idea.
    • Explicit tag links can be stored manually in parents_meta; suggest_metadata also proposes read-only tag_bridges from shared canonical YAML tags.
  3. Base Evolution Tracking (Drift)
    NOUZ aggregates data bottom-up. If a module started in one domain while new notes gradually pull it into another, the server shows the divergence (core_drift).
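
The classification step in item 1 can be pictured with a minimal sketch: compare a note vector with each core vector by cosine similarity and propose the best sign, adding a second sign when it scores almost as high. The function names, the toy 3-dimensional vectors, and the `second_ratio` rule are assumptions for illustration, not the scoring NOUZ actually uses.

```python
import math

def cosine(a, b):
    """Plain cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def propose_sign(note_vec, cores, second_ratio=0.9):
    """Return the best-matching sign, plus a second sign when it scores nearly as high.

    second_ratio is a hypothetical knob, not a NOUZ setting.
    """
    scored = sorted(((cosine(note_vec, vec), sign) for sign, vec in cores.items()),
                    reverse=True)
    best_score, best_sign = scored[0]
    proposal = best_sign
    if len(scored) > 1 and best_score > 0 and scored[1][0] >= second_ratio * best_score:
        proposal += scored[1][1]
    return proposal

# Toy core vectors; real ones would come from an embedding model.
cores = {"S": [1.0, 0.0, 0.0], "D": [0.0, 1.0, 0.0], "E": [0.0, 0.0, 1.0]}
print(propose_sign([0.9, 0.1, 0.0], cores))    # clearly one domain
print(propose_sign([0.7, 0.0, 0.68], cores))   # two domains score almost equally
```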

Depending on your needs, NOUZ works in three modes: from a simple graph (LUCA) to a strict 5-level hierarchy (SLOI).


How It Works

  1. You describe domains in config.yaml — what each domain covers and which textual signals identify it.
  2. The server turns descriptions into vector etalons (locally, via LM Studio or Ollama).
  3. Each new note is projected onto these axes. The sign is determined by the content, or set by you.
  4. L4 gets a domain profile from text classification, while L3/L2 aggregate core_mix from child nodes. If a module's sign diverges from core_mix, the server reports core_drift.
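
Step 4 can be sketched as follows, assuming each child carries a domain profile as a mapping from sign to weight and that a parent's core_mix is the normalized sum of its children's profiles. The helper names and the "dominant sign differs from declared sign" rule are illustrative assumptions; the actual aggregation and drift criteria may differ.

```python
from collections import Counter

def core_mix(children):
    """Sum the children's domain profiles and normalize the result to 1.0."""
    total = Counter()
    for profile in children:
        total.update(profile)          # Counter.update adds the values per key
    weight = sum(total.values())
    return {sign: value / weight for sign, value in total.items()} if weight else {}

def core_drift(declared_sign, mix):
    """Return the dominant sign when it differs from the declared one, else None."""
    if not mix:
        return None
    dominant = max(mix, key=mix.get)
    return dominant if dominant != declared_sign else None

# Three hypothetical L4 notes under a module that was declared as "S".
children = [
    {"S": 0.7, "E": 0.3},
    {"E": 0.8, "D": 0.2},
    {"E": 0.9, "S": 0.1},
]
mix = core_mix(children)
print(mix)
print(core_drift("S", mix))
```

In this toy case the children pull the module towards E, so the sketch reports a divergence for a module declared as S and none for a module declared as E.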

Semantic bridges find connections between notes from different domains when texts are close in meaning. Tags remain explicit user metadata.
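
A minimal sketch of the bridge idea, assuming notes are available as (path, sign, vector) triples and that a bridge is any pair from different domains whose cosine similarity reaches a threshold. The default of 0.55 mirrors semantic_bridge_threshold from the sample config; whether NOUZ applies it to raw or mean-centered cosine is not stated here, and the file names and vectors are made up.

```python
import math
from itertools import combinations

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def semantic_bridges(notes, threshold=0.55):
    """Return cross-domain note pairs at or above the threshold, best first."""
    bridges = []
    for (p1, s1, v1), (p2, s2, v2) in combinations(notes, 2):
        if s1 == s2:
            continue                    # same domain: not a bridge
        score = cosine(v1, v2)
        if score >= threshold:
            bridges.append((p1, p2, round(score, 3)))
    return sorted(bridges, key=lambda b: b[2], reverse=True)

# Hypothetical notes with toy vectors.
notes = [
    ("feedback-loops.md", "S", [1.0, 1.0, 0.0]),
    ("pid-controller.md", "E", [1.0, 0.8, 0.1]),
    ("quarks.md", "D", [0.0, 0.0, 1.0]),
    ("autopoiesis.md", "S", [0.9, 1.0, 0.0]),
]
for left, right, score in semantic_bridges(notes):
    print(left, "<->", right, score)
```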


Quick Start

pip install nouz-mcp
OBSIDIAN_ROOT=/path/to/vault nouz-mcp

Without config.yaml, the server starts in LUCA mode — graph without semantics, works immediately.

To enable semantic mode, create a local config from the template:

cp config.template.yaml config.yaml

On Windows PowerShell:

Copy-Item config.template.yaml config.yaml

Or from source:

git clone https://github.com/Semiotronika/NOUZ-MCP
cd NOUZ-MCP
pip install -r requirements.txt
cp config.template.yaml config.yaml
OBSIDIAN_ROOT=./vault python server.py

Connect to Claude Desktop, Cursor, OpenCode, or any MCP client:

{
  "mcpServers": {
    "nouz": {
      "command": "nouz-mcp",
      "env": {
        "OBSIDIAN_ROOT": "/path/to/vault",
        "NOUZ_CONFIG": "/absolute/path/to/config.yaml",
        "EMBED_API_URL": "http://127.0.0.1:1234/v1"
      }
    }
  }
}

MCP Tools

| Tool | Purpose |
| --- | --- |
| suggest_metadata | Sign, level, bridges, drift warnings |
| write_file | Write a note with YAML frontmatter |
| update_metadata | Update YAML only, preserving the note body |
| read_file | Read a note + metadata |
| calibrate_cores | Update core reference vectors |
| recalc_signs | Recalculate signs for all notes |
| recalc_core_mix | Recalculate bottom-up aggregation |
| index_all | Re-index the entire base; with with_embeddings=true, also refresh file/chunk embeddings |
| embed | Get a vector for text |
| chunk_text | Split Markdown text into deterministic retrieval chunks |
| chunk_file | Split one note body into deterministic retrieval chunks |
| search_chunks | Search stored chunk embeddings; defaults to mean-centered scoring for unscoped large searches |
| list_files | List with filters by level, sign |
| get_children | Traverse down the graph |
| get_parents | Traverse up the graph |
| suggest_parents | Find parents for an orphan |
| add_entity | Create an entity in one step (auto sign/parents, explicit tags only) |
| process_orphans | Auto-fill files without markup |

Set NOUZ_READ_ONLY=true to hide and block mutating tools (write_file, update_metadata, index_all, recalculation, orphan processing, and entity creation). Read-only tools such as read_file, suggest_metadata, embed, chunk_text, chunk_file, and search_chunks remain available. With NOUZ_READ_ONLY=true, read-only tools do not refresh the SQLite cache by default, and startup skips DB init/index/calibration; set NOUZ_CACHE_WRITE=true if you want cache writes in read-only mode.
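
The gating described above could look roughly like this. It is a sketch under assumptions: the set of mutating tool names is inferred from the paragraph and the tools table, only the literal value "true" is treated as enabled, and the helper names are hypothetical.

```python
# Tool names inferred from the text; calibrate_cores is left out because
# the paragraph above does not name it among the blocked tools.
MUTATING_TOOLS = {
    "write_file", "update_metadata", "index_all",
    "recalc_signs", "recalc_core_mix",
    "process_orphans", "add_entity",
}

def env_flag(env, name):
    """Treat only the value 'true' (case-insensitive) as enabled; an assumption."""
    return env.get(name, "").strip().lower() == "true"

def visible_tools(all_tools, env):
    """Hide mutating tools when NOUZ_READ_ONLY=true."""
    if env_flag(env, "NOUZ_READ_ONLY"):
        return [tool for tool in all_tools if tool not in MUTATING_TOOLS]
    return list(all_tools)

def cache_writes_allowed(env):
    """In read-only mode, cache writes need an explicit NOUZ_CACHE_WRITE=true."""
    if not env_flag(env, "NOUZ_READ_ONLY"):
        return True
    return env_flag(env, "NOUZ_CACHE_WRITE")

tools = ["read_file", "write_file", "suggest_metadata", "index_all", "search_chunks"]
print(visible_tools(tools, {"NOUZ_READ_ONLY": "true"}))
print(cache_writes_allowed({"NOUZ_READ_ONLY": "true", "NOUZ_CACHE_WRITE": "true"}))
```

Passing the environment as a plain mapping keeps the sketch testable; in a real process the same functions could be called with os.environ.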

chunk_text and chunk_file return chunker_version, stable id, actual chunk text coordinates (start_char/end_char), body coordinates without overlap (body_start_char/body_end_char), and hash fields. index_all with with_embeddings=true stores these chunks in the SQLite chunk_embeddings table, and search_chunks ranks them by semantic score. In score_mode=auto, large unscoped candidate sets use mean-centered cosine to reduce the anisotropic common background of the embedding space. Each match returns the active score plus diagnostic score_raw and score_centered values. Scoped search within path keeps raw ranking by default; use score_mode=raw for legacy cosine behavior or score_mode=centered to force mean-centered scoring.
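
To make the two coordinate pairs concrete, here is a minimal fixed-window sketch: body spans tile the text without gaps, while the chunk text reaches back by an overlap. The window sizes, the id format, the version string, and the hash field name are assumptions; the real chunker is described as Markdown-aware and may split on different boundaries.

```python
import hashlib

CHUNKER_VERSION = "sketch-1"   # hypothetical version label

def chunk_text(text, size=200, overlap=40):
    """Split text into deterministic chunks with overlapping text and tiling bodies."""
    if size <= 0 or overlap < 0:
        raise ValueError("size must be positive and overlap non-negative")
    chunks = []
    body_start = 0
    while body_start < len(text):
        body_end = min(body_start + size, len(text))
        start = max(0, body_start - overlap)     # chunk text reaches back into the previous body
        digest = hashlib.sha256(text[start:body_end].encode("utf-8")).hexdigest()
        chunks.append({
            "chunker_version": CHUNKER_VERSION,
            "id": f"{CHUNKER_VERSION}:{start}:{digest[:12]}",   # stable for identical input
            "start_char": start,
            "end_char": body_end,
            "body_start_char": body_start,
            "body_end_char": body_end,
            "text_hash": digest,
        })
        body_start = body_end
    return chunks

sample = "abcdefghij" * 50
for chunk in chunk_text(sample):
    print(chunk["id"], chunk["start_char"], chunk["end_char"],
          chunk["body_start_char"], chunk["body_end_char"])
```

Because the ids depend only on the chunker version, the offset, and the chunk text, re-running the function on unchanged input yields the same ids, which is what makes stored embeddings reusable.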

parents_meta.link_type supports manual hierarchy, semantic, temporary, tag, analogy, and error links. NOUZ does not auto-generate analogy links. tag_bridges in suggest_metadata are suggestions from explicit YAML tags and are not written back to files.

YAML tags are explicit metadata: NOUZ normalizes them to canonical slug form (agent-context, optionally area/topic) and rejects obvious non-tags such as hex colors, URLs, numeric-only tokens, empty values, and none/null. suggest_metadata returns tag_quality so an agent can see which tags are accepted for future tag_bridges and which raw values were discarded. For tag automation, suggest_metadata also returns read-only tag_candidates: candidates from the already accepted YAML tag vocabulary in the index and explicit inline #tag markers in the note body. Candidates are not written to YAML automatically; once accepted through update_metadata, normal tag_bridges work from those tags. Before writing, possible links are returned separately as candidate_tag_bridges. For each candidate, NOUZ temporarily chunks the current text and returns evidence with chunk_id, heading, coordinates, and a short snippet. This does not require a prebuilt chunk_embeddings table.
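
The normalization and rejection rules can be sketched as below. The regular expressions, the slug rules, and the shape of the returned tag_quality dictionary are assumptions made for illustration; only the list of rejected categories comes from the text.

```python
import re

HEX_COLOR = re.compile(r"^#(?:[0-9a-fA-F]{3}|[0-9a-fA-F]{6}|[0-9a-fA-F]{8})$")
URL = re.compile(r"^[a-z][a-z0-9+.-]*://", re.IGNORECASE)

def _slug(part):
    """Lowercase, turn whitespace/underscores into hyphens, drop other punctuation."""
    part = re.sub(r"[\s_]+", "-", part.strip().lower())
    part = re.sub(r"[^\w-]", "", part)
    return re.sub(r"-{2,}", "-", part).strip("-")

def normalize_tag(raw):
    """Return a canonical slug such as 'agent-context' or 'area/topic', or None if rejected."""
    if raw is None:
        return None
    value = str(raw).strip()
    if value.lower() in {"", "none", "null"}:
        return None
    if URL.match(value) or HEX_COLOR.match(value):
        return None
    parts = [p for p in (_slug(p) for p in value.lstrip("#").split("/")) if p]
    if not parts:
        return None
    slug = "/".join(parts)
    if slug.replace("/", "").replace("-", "").isdigit():   # numeric-only token
        return None
    return slug

def tag_quality(raw_tags):
    """Hypothetical report shape: accepted canonical tags and discarded raw values."""
    accepted, discarded = [], []
    for raw in raw_tags:
        slug = normalize_tag(raw)
        if slug is None:
            discarded.append(raw)
        elif slug not in accepted:
            accepted.append(slug)
    return {"accepted": accepted, "discarded": discarded}

print(tag_quality(["Agent Context", "Area/Topic", "#ff00aa", "https://example.com", "2024", "null"]))
```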


Configuration

Minimal config.yaml:

mode: prizma

etalons:
  - sign: S
    name: Systems Analysis
    text: >
      Methodology for analysing complex objects: feedback loops,
      emergent properties, self-regulation, bifurcation points.
      Cybernetics, synergetics, dissipative structures, catastrophe
      theory, autopoiesis — tools for understanding how the whole
      exceeds the sum of its parts. Not data and not code — a way
      of thinking about how parts form a whole and why systems
      behave non-linearly.
  - sign: D
    name: Data & Science
    text: >
      Physics and cosmology: from subatomic particles to the large-scale
      structure of the Universe. Lagrangians, curvature tensors, scattering
      cross-sections, quarks, bosons, fermions, plasma, vacuum fluctuations,
      cosmic microwave background, cosmological constant, decoherence.
      Pure science about the nature of matter, energy and spacetime.
  - sign: E
    name: Engineering
    text: >
      Software engineering, machine learning and infrastructure: writing
      and debugging code, deployment, containerisation, neural networks,
      inference, tokenisation, data serialisation, microservices, CI/CD,
      automated testing, refactoring, Git, Docker, Kubernetes, APIs.
      The practical discipline of building computational systems from
      architecture to production.

thresholds:
  sign_spread: 0.05
  confident_spread: 60.0
  pattern_second_sign_threshold: 30.0
  semantic_bridge_threshold: 0.55
  parent_link_threshold: 0.55

artifact_signs:
  - sign: n
    name: Note
    text: Short note, observation, fragment.
  - sign: c
    name: Concept
    text: Definition, concept, entity description.
  - sign: r
    name: Reference
    text: External source, documentation, link, citation.
  - sign: l
    name: Log
    text: Session log, chronology, dialogue record.
  - sign: u
    name: Update
    text: Update, release note, changelog entry.
  - sign: h
    name: Hypothesis
    text: Hypothesis, assumption, speculative idea.
  - sign: s
    name: Specification
    text: Technical specification, instruction, requirements.

After setup, run calibrate_cores so that the server creates the reference vectors, then check the pairwise cosines: between different domains, the mean-centered value should be noticeably lower than the raw one. If all pairs are roughly equal, strengthen the differences between the domain texts. You can also run the standalone etalon check from the installed package: nouz-calc-etalons --config config.yaml.

etalons are semantic domains compared through embeddings. artifact_signs describe the material type of L5 artifacts: note, concept, reference, log, update, hypothesis, or specification. This is a heuristic label, not a separate embedding etalon. In the public convention, domains use uppercase signs (S/D/E) while material types use lowercase signs (n/c/r/l/u/h/s); you can replace them in config as long as signs stay short and do not conflict with domain signs. If needed, add keywords to any material type: the server will use your detection words instead of the built-in RU/EN fallback.
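
If you replace the default signs, a small pre-flight check along these lines may help. It assumes the config has already been parsed into a dictionary with the etalons and artifact_signs lists shown above; the function name and the `max_len` limit for "short" are hypothetical. The comparison is case-sensitive, since the public convention relies on S (domain) and s (Specification) being different signs.

```python
def validate_signs(config, max_len=2):
    """Return a list of human-readable problems found in the configured signs."""
    problems = []
    domain = [entry["sign"] for entry in config.get("etalons", [])]
    material = [entry["sign"] for entry in config.get("artifact_signs", [])]

    for sign in domain + material:
        if not sign or len(sign) > max_len:
            problems.append(f"sign {sign!r} should be 1 to {max_len} characters long")

    for group_name, group in (("etalons", domain), ("artifact_signs", material)):
        for sign in sorted({s for s in group if group.count(s) > 1}):
            problems.append(f"sign {sign!r} is repeated in {group_name}")

    # Case-sensitive on purpose: "S" and "s" are distinct signs.
    for sign in sorted(set(domain) & set(material)):
        problems.append(f"sign {sign!r} is used both as a domain and as a material type")
    return problems

config = {
    "etalons": [{"sign": "S"}, {"sign": "D"}, {"sign": "E"}],
    "artifact_signs": [{"sign": s} for s in "ncrluhs"],
}
print(validate_signs(config))   # an empty list means no conflicts were found
```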

Real Calculation Example

Here are actual results for the S/D/E etalons using the text-embedding-granite-embedding-278m-multilingual model:

=== Pairwise Cosine (raw) ===
S↔D: 0.5894    S↔E: 0.5862    D↔E: 0.6022

=== Pairwise Cosine (mean-centered) ===
S↔D: -0.5059   S↔E: -0.5117   D↔E: -0.4822

Negative mean-centered values are a good result here: after subtracting the mean vector, domains are well-separated. Self-classification: S→99.4%, D→97.5%, E→96.9%.
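
Values near -0.5 are what three roughly equidistant vectors should give: once the mean is subtracted, the centered vectors sum to zero, and for three vectors with equal pairwise similarity the centered cosine is exactly -1/2. The sketch below recomputes the centered cosines from the raw ones alone, assuming the mean vector is the mean of the three etalon vectors and that each vector is unit-normalized (self-similarity 1.0). Both are assumptions about the calculation, and the variable and function names are illustrative.

```python
import math

signs = ["S", "D", "E"]
raw = {("S", "D"): 0.5894, ("S", "E"): 0.5862, ("D", "E"): 0.6022}

def gram(i, j):
    """Raw cosine between two etalons; 1.0 on the diagonal (unit vectors assumed)."""
    if i == j:
        return 1.0
    return raw[(i, j)] if (i, j) in raw else raw[(j, i)]

def centered_cosines(signs, gram):
    """Cosines after subtracting the mean vector, computed from dot products only."""
    n = len(signs)
    row = {i: sum(gram(i, j) for j in signs) / n for i in signs}   # v_i . mean
    total = sum(row.values()) / n                                   # mean . mean

    def centered(i, j):
        # (v_i - m) . (v_j - m) = v_i.v_j - v_i.m - v_j.m + m.m
        return gram(i, j) - row[i] - row[j] + total

    return {
        (i, j): centered(i, j) / math.sqrt(centered(i, i) * centered(j, j))
        for a, i in enumerate(signs)
        for j in signs[a + 1:]
    }

result = centered_cosines(signs, gram)
for (i, j), value in result.items():
    print(f"{i}<->{j}: {value:.4f}")
```

Under these assumptions the recomputed values agree with the reported mean-centered figures to within about 0.001, the remaining gap being rounding of the four-decimal raw inputs.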

Environment variables:

| Variable | Default | Description |
| --- | --- | --- |
| OBSIDIAN_ROOT | ./obsidian | Path to vault |
| NOUZ_CONFIG | (empty) | Absolute path to config.yaml; if omitted, the server looks in the current working directory |
| NOUZ_DATABASE_NAME | obsidian_kb.db | SQLite cache filename inside OBSIDIAN_ROOT; useful for isolated public checks, e.g. obsidian_kb.public.db |
| NOUZ_DATABASE_PATH | (empty) | Full SQLite cache path; takes precedence over NOUZ_DATABASE_NAME |
| EMBED_PROVIDER | openai | One of openai, lmstudio, ollama |
| EMBED_API_URL | http://127.0.0.1:1234/v1 | Embedding endpoint |
| EMBED_API_KEY | (empty) | API key, if needed |
| EMBED_MODEL | (empty) | Model name |
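
The precedence rules for the cache and config paths can be sketched like this. The function names are hypothetical and the logic is only a reading of the descriptions above, not the server's actual code.

```python
from pathlib import Path

def resolve_database_path(env):
    """NOUZ_DATABASE_PATH wins; otherwise NOUZ_DATABASE_NAME inside OBSIDIAN_ROOT."""
    explicit = env.get("NOUZ_DATABASE_PATH", "").strip()
    if explicit:
        return Path(explicit)
    root = Path(env.get("OBSIDIAN_ROOT", "").strip() or "./obsidian")
    name = env.get("NOUZ_DATABASE_NAME", "").strip() or "obsidian_kb.db"
    return root / name

def resolve_config_path(env, cwd):
    """NOUZ_CONFIG if set; otherwise config.yaml in the current working directory."""
    explicit = env.get("NOUZ_CONFIG", "").strip()
    return Path(explicit) if explicit else Path(cwd) / "config.yaml"

print(resolve_database_path({}))
print(resolve_database_path({"OBSIDIAN_ROOT": "/path/to/vault",
                             "NOUZ_DATABASE_NAME": "obsidian_kb.public.db"}))
print(resolve_config_path({}, cwd="."))
```

Taking the environment as a plain mapping keeps the functions easy to test; in practice they could be called with os.environ and the process working directory.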

Privacy

| Component | Local? |
| --- | --- |
| Embeddings (LM Studio / Ollama) | ✅ Yes |
| Your notes | ✅ Yes |
| NOUZ server | ✅ Yes |
| AI agent context (Claude, ChatGPT) | ❌ Goes to cloud |

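Your notes, the NOUZ server, and embeddings served by LM Studio or Ollama stay on your machine. Only the context you hand to a cloud-hosted AI agent leaves it.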


Development

git clone https://github.com/Semiotronika/NOUZ-MCP
cd NOUZ-MCP
pip install -e .
python -m compileall -q nouz_mcp pytest_smoke.py scripts
python -m pytest -q
python test_server.py

MIT License © 2026 Semiotronika

Cosines are computed. Syntax changes. Semantics remains.
