Nowledge Graph MCP Server
A local-first personal memory management system that serves as a universal bridge between GenAI tools via MCP (Model Context Protocol). This Python server provides graph-based memory storage, semantic search, and knowledge management capabilities.
Features
- Graph Database: KuzuDB-powered graph storage for memories, threads, entities, and relationships
- Vector Search: Semantic search using sentence transformers and FAISS
- MCP Protocol: Full Model Context Protocol support for GenAI tool integration
- Memory Management: Intelligent memory compaction and entity linking
- Thread Support: Conversation import and management with provenance tracking
- Flexible Labeling: User-driven organization with label nodes
Architecture
┌─────────────────────────────────────┐
│          Python MCP Server          │
│  ┌─────────────┐   ┌─────────────┐  │
│  │   FastMCP   │   │   KuzuDB    │  │
│  │    Tools    │   │  + Vector   │  │
│  └─────────────┘   └─────────────┘  │
└─────────────────────────────────────┘
                  │ HTTP/JSON-RPC
                  ▼
┌─────────────────────────────────────┐
│     Nowledge Graph Desktop App      │
│  ┌─────────────┐   ┌─────────────┐  │
│  │    React    │   │    Rust     │  │
│  │    + G6     │   │   Backend   │  │
│  └─────────────┘   └─────────────┘  │
└─────────────────────────────────────┘
Quick Start
Installation
# Using uv (recommended)
cd nowledge-graph-py
uv sync
# Or using pip
pip install -e .
Development
# Start the MCP server in development mode
uv run mcp dev src/nowledge_graph_server/server.py
# Run with additional dependencies
uv run mcp dev src/nowledge_graph_server/server.py --with pandas --with numpy
# Install for Claude Desktop
uv run mcp install src/nowledge_graph_server/server.py
Production
# Start the server directly
uv run nowledge-server
# Or using the module
python -m nowledge_graph_server
MCP Tools
The server exposes the following MCP tools:
Memory Management
- memory.search(query, limit) - Semantic search of Memory nodes
- memory.add(content, source_thread_id, metadata) - Add single Memory
- memory.get(memory_id) - Retrieve Memory by ID
Thread Management
- thread.add(messages, thread_id, metadata) - Import conversation thread
- thread.search(query, limit) - Semantic search of Thread summaries
- thread.fetch(thread_id) - Retrieve full thread by ID
Entity & Relationship Management
- entity.search(query, limit) - Find entities by name/description
- entity.relate(entity1_id, entity2_id, relationship_type) - Create entity relationships
Search & Discovery
- message.search(query, limit) - Full-text search of Message content
- graph.explore(start_node_id, depth) - Explore graph relationships
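These tools can be exercised from any MCP client. A minimal sketch using the official mcp Python SDK over stdio (the tool name and arguments come from the list above; the server command is the one from the Production section; the example query string is illustrative):

import asyncio
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

# Launch the server as a stdio subprocess
params = StdioServerParameters(command="uv", args=["run", "nowledge-server"])

async def main() -> None:
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            # Call one of the tools listed above
            result = await session.call_tool(
                "memory.search", {"query": "kuzu schema decisions", "limit": 5}
            )
            print(result.content)

asyncio.run(main())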
Graph Schema
Node Types
- Thread: Conversation containers with summaries
- Message: Individual messages within threads
- Memory: Distilled knowledge from conversations
- Entity: Concepts, people, places, etc.
- Label: Flexible organization tags
- Community: Entity clustering for discovery
Relationship Types
- (Thread)-[:CONTAINS {order_index}]->(Message)
- (Thread)-[:COMPACTS_TO]->(Memory)
- (Memory)-[:EXTRACTED_FROM]->(Message)
- (Memory)-[:MENTIONS {confidence}]->(Entity)
- (Entity)-[:RELATES_TO {type, strength}]->(Entity)
- (Thread/Memory)-[:HAS_LABEL]->(Label)
- (Entity)-[:BELONGS_TO {strength}]->(Community)
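Because the graph lives in KuzuDB, it can also be inspected directly with Cypher through the kuzu Python client. A minimal sketch (the relationship type and confidence property come from the schema above, but node property names such as name and content are assumptions):

import kuzu

db = kuzu.Database("./data/nowledge.db")  # KUZU_DB_PATH from the Configuration section
conn = kuzu.Connection(db)

# Memories that mention a given entity, with the mention confidence
result = conn.execute(
    "MATCH (m:Memory)-[r:MENTIONS]->(e:Entity) "
    "WHERE e.name = $name "
    "RETURN m.content, r.confidence",
    {"name": "KuzuDB"},
)
while result.has_next():
    print(result.get_next())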
Configuration
Create a .env file or set environment variables:
# Database
KUZU_DB_PATH=./data/nowledge.db
VECTOR_INDEX_PATH=./data/vectors
# Embeddings
EMBEDDING_MODEL=all-MiniLM-L6-v2
EMBEDDING_DIMENSION=384
# Server
MCP_SERVER_HOST=localhost
MCP_SERVER_PORT=3001
LOG_LEVEL=INFO
# Optional: OpenAI API for enhanced processing
OPENAI_API_KEY=your_key_here
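How these values are loaded is up to config.py; a minimal sketch of one approach using pydantic-settings (field names mirror the variables above, defaults are the documented ones; the actual implementation may differ):

from pydantic_settings import BaseSettings, SettingsConfigDict

class Settings(BaseSettings):
    model_config = SettingsConfigDict(env_file=".env")

    kuzu_db_path: str = "./data/nowledge.db"
    vector_index_path: str = "./data/vectors"
    embedding_model: str = "all-MiniLM-L6-v2"
    embedding_dimension: int = 384
    mcp_server_host: str = "localhost"
    mcp_server_port: int = 3001
    log_level: str = "INFO"
    openai_api_key: str | None = None  # optional, for enhanced processing

settings = Settings()  # values come from the environment or the .env file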
Development
Project Structure
src/nowledge_graph_server/
├── __init__.py # Package initialization
├── server.py # FastMCP server setup
├── cli.py # Command-line interface
├── config.py # Configuration management
├── models/ # Pydantic data models
│ ├── __init__.py
│ ├── graph.py # Graph node/edge models
│ ├── memory.py # Memory-related models
│ └── thread.py # Thread/message models
├── database/ # Database layer
│ ├── __init__.py
│ ├── kuzu_client.py # KuzuDB integration
│ ├── schema.py # Graph schema setup
│ └── migrations.py # Schema migrations
├── tools/ # MCP tool implementations
│ ├── __init__.py
│ ├── memory.py # Memory management tools
│ ├── thread.py # Thread management tools
│ ├── entity.py # Entity tools
│ └── search.py # Search tools
├── services/ # Business logic
│ ├── __init__.py
│ ├── embedding.py # Vector embedding service
│ ├── entity_extraction.py # NLP entity extraction
│ └── memory_compaction.py # Memory creation logic
└── utils/ # Utilities
├── __init__.py
├── logging.py # Structured logging
└── validation.py # Data validation
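server.py wires the tool modules into a FastMCP instance. A minimal sketch of that pattern (the tool body is elided; the tool name matches the MCP Tools section above):

from mcp.server.fastmcp import FastMCP

mcp = FastMCP("nowledge-graph")

@mcp.tool(name="memory.search")
def memory_search(query: str, limit: int = 10) -> list[dict]:
    """Semantic search of Memory nodes."""
    ...  # delegates to services/embedding.py and the database layer

if __name__ == "__main__":
    mcp.run()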
Testing
# Run all tests
uv run pytest
# Run with coverage
uv run pytest --cov=nowledge_graph_server
# Run specific test file
uv run pytest tests/test_memory_tools.py
Code Quality
# Format code
uv run black src tests
uv run isort src tests
# Lint
uv run ruff check src tests
# Type checking
uv run mypy src
Integration with Tauri App
The Python MCP server communicates with the Tauri desktop application via HTTP/JSON-RPC. The Tauri app connects to the server on startup and uses the MCP tools for all memory and graph operations.
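On the wire this is plain JSON-RPC 2.0, so the integration can be probed by hand. A minimal sketch of such a request from Python (tools/call is the standard MCP method for tool invocation; host and port come from the Configuration section, but the /mcp endpoint path is a hypothetical assumption):

import json
import urllib.request

payload = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "memory.search",
        "arguments": {"query": "project notes", "limit": 5},
    },
}

req = urllib.request.Request(
    "http://localhost:3001/mcp",  # hypothetical endpoint path
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read()))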
Offline Deployment & Model Caching
For production deployment in air-gapped environments or packaged applications, you'll need to pre-cache models locally.
Quick Setup for Offline Mode
- Pre-cache models (run this before packaging):
# Cache default models
python -m nowledge_graph_server.utils.model_cache
# Cache specific models
python -m nowledge_graph_server.utils.model_cache \
--embedding-model multi-qa-MiniLM-L6-cos-v1 \
--llm-model mlx-community/Qwen3-4B-Instruct-2507-DDWQ
# Check cache info
python -m nowledge_graph_server.utils.model_cache --info
- Configure for offline mode:
# Environment variables
export NOWLEDGE_EMBEDDING_OFFLINE_MODE=true
export NOWLEDGE_LLM_OFFLINE_MODE=true
export NOWLEDGE_EMBEDDING_CACHE_DIR=~/.cache/huggingface
export NOWLEDGE_LLM_CACHE_DIR=~/.cache/mlx
# Or in config.py
embedding_offline_mode = True
llm_offline_mode = True
Model Caching Strategy
The system uses an offline-first approach:
- Embedding Models: Uses the local sentence-transformers cache
- LLM Models: Uses the local MLX model cache
- Fallback: Hash-based embeddings if models are unavailable (sketched below)
Cache Locations
- Embedding Models: ~/.cache/huggingface/hub/
- MLX Models: ~/.cache/mlx/models/ or ~/.cache/huggingface/hub/
- Custom Cache: Set via NOWLEDGE_EMBEDDING_CACHE_DIR and NOWLEDGE_LLM_CACHE_DIR
Packaging for Binary Distribution
- Pre-cache all models:
# Create cache directory in your package
mkdir -p ./package/cache
python -m nowledge_graph_server.utils.model_cache --cache-dir ./package/cache
- Set environment variables in your package:
export NOWLEDGE_EMBEDDING_OFFLINE_MODE=true
export NOWLEDGE_LLM_OFFLINE_MODE=true
export NOWLEDGE_EMBEDDING_CACHE_DIR=./cache/huggingface
export NOWLEDGE_LLM_CACHE_DIR=./cache/mlx
- Bundle cache directory with your application.
Troubleshooting Offline Issues
Error: "Failed to resolve 'huggingface.co'"
- Solution: Enable offline mode and pre-cache models
Error: "Model not found in cache"
- Solution: Run the model caching script first
Error: "MLX model download failed"
- Solution: Set HF_HUB_OFFLINE=1 and TRANSFORMERS_OFFLINE=1
Performance Considerations
- Cache Size: ~2-4GB for typical models
- Startup Time: Faster with cached models
- Memory Usage: Same as online mode
- Accuracy: Identical to online mode (uses same models)
Self-Contained App Deployment
Cache Directory Configuration
The system now supports proper cache directory configuration for self-contained/bundled applications:
Key Features:
- Automatic Cache Directory Detection: Detects if running in a bundled app (PyInstaller, cx_Freeze, etc.)
- App-Relative Cache Paths: Uses cache directories relative to the executable, not user home directory
- Offline-First Configuration: Defaults to offline mode to prevent internet access in packaged apps
- HuggingFace Cache Support: Properly handles MLX model caching via HuggingFace Hub
Cache Directory Structure:
your-app/
├── cache/
│ ├── embeddings/ # Sentence transformer models
│ ├── llm/ # Local LLM cache (if needed)
│ └── huggingface/ # MLX models from HuggingFace
├── data/
│ └── nowledge_graph.db # Database file
└── your-app-executable
Configuration for Bundled Apps
For bundled applications, the system automatically:
- Detects Bundle Environment: Checks sys.frozen or sys._MEIPASS (PyInstaller)
- Sets App-Relative Cache: Uses a ./cache/ directory relative to the executable
- Configures HuggingFace Environment: Sets HF_HOME and HUGGINGFACE_HUB_CACHE
- Enables Offline Mode: Prevents model downloads in production (see the sketch below)
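A minimal sketch of that detection logic, assuming standard PyInstaller conventions (the helper name is illustrative):

import os
import sys
from pathlib import Path

def app_cache_dir() -> Path:
    # PyInstaller sets sys.frozen (and sys._MEIPASS for one-file builds)
    if getattr(sys, "frozen", False):
        base = Path(sys.executable).parent
    else:
        base = Path(__file__).resolve().parent
    return base / "cache"

hf_cache = app_cache_dir() / "huggingface"
os.environ.setdefault("HF_HOME", str(hf_cache))
os.environ.setdefault("HUGGINGFACE_HUB_CACHE", str(hf_cache))
os.environ.setdefault("HF_HUB_OFFLINE", "1")  # no downloads in production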
Pre-Caching Models
Before packaging your application:
# 1. Cache embedding models
python -m nowledge_graph_server.utils.model_cache cache-embedding
# 2. Cache LLM models
python -m nowledge_graph_server.utils.model_cache cache-llm
# 3. Verify cached models
python -m nowledge_graph_server.utils.model_cache info
Environment Variables
For manual configuration or development:
# Core cache directories
export HUGGINGFACE_CACHE_DIR=./cache/huggingface
export EMBEDDING_CACHE_DIR=./cache/embeddings
export LLM_CACHE_DIR=./cache/llm
# HuggingFace offline mode (for MLX models)
export HF_HOME=./cache/huggingface
export HUGGINGFACE_HUB_CACHE=./cache/huggingface
export HF_HUB_OFFLINE=1
export TRANSFORMERS_OFFLINE=1
# Application offline mode
export EMBEDDING_OFFLINE_MODE=true
export LLM_OFFLINE_MODE=true
Testing Offline Mode
# Test with offline configuration
cp config.offline.env config.env
python -m nowledge_graph_server.cli serve
# Test in air-gapped environment
# (disconnect internet and verify functionality)
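One way to make the air-gap check repeatable is to block sockets in a test. A sketch using pytest's monkeypatch (the test body is illustrative):

import socket
import pytest

@pytest.fixture
def no_network(monkeypatch):
    # Any attempt to open a connection fails the test immediately.
    def guard(self, *args, **kwargs):
        raise RuntimeError("network access attempted during offline test")
    monkeypatch.setattr(socket.socket, "connect", guard)

def test_imports_offline(no_network):
    # With offline mode enabled, importing the package should not
    # trigger any model downloads.
    import nowledge_graph_server  # noqa: F401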
Binary Packaging Checklist
- Pre-cache all models using the model cache utility
- Include cache directory in your package bundle
- Set cache paths relative to executable
- Test offline functionality before packaging
- Verify no internet access required in production environment
Troubleshooting
Issue: MLX models downloading instead of using cache
Solution: Check that HUGGINGFACE_CACHE_DIR is set and models are cached in HuggingFace format
Issue: "Model not found in cache" errors
Solution: Run the model caching utility and verify the cache directory structure
Issue: Internet access required in bundled app
Solution: Enable offline mode and ensure all models are pre-cached
Contributing
- Fork the repository
- Create a feature branch
- Make your changes with tests
- Run the quality checks
- Submit a pull request
License
This project is part of the Nowledge Graph ecosystem and follows the same licensing terms.