
Nowledge Graph MCP Server

A local-first personal memory management system that serves as a universal bridge between GenAI tools via the Model Context Protocol (MCP). This Python server provides graph-based memory storage, semantic search, and knowledge management capabilities.

Features

  • Graph Database: KuzuDB-powered graph storage for memories, threads, entities, and relationships
  • Vector Search: Semantic search using sentence transformers and FAISS
  • MCP Protocol: Full Model Context Protocol support for GenAI tool integration
  • Memory Management: Intelligent memory compaction and entity linking
  • Thread Support: Conversation import and management with provenance tracking
  • Flexible Labeling: User-driven organization with label nodes

Architecture

┌─────────────────────────────────────┐
│          Python MCP Server          │
│  ┌─────────────┐  ┌─────────────┐   │
│  │ FastMCP     │  │   KuzuDB    │   │
│  │ Tools       │  │  + Vector   │   │
│  └─────────────┘  └─────────────┘   │
└─────────────────────────────────────┘
               │ HTTP/JSON-RPC
               ▼
┌─────────────────────────────────────┐
│     Nowledge Graph Desktop App      │
│  ┌─────────────┐  ┌─────────────┐   │
│  │   React     │  │    Rust     │   │
│  │   + G6      │  │   Backend   │   │
│  └─────────────┘  └─────────────┘   │
└─────────────────────────────────────┘

Quick Start

Installation

# Using uv (recommended)
cd nowledge-graph-py
uv sync

# Or using pip
pip install -e .

Development

# Start the MCP server in development mode
uv run mcp dev src/nowledge_graph_server/server.py

# Run with additional dependencies
uv run mcp dev src/nowledge_graph_server/server.py --with pandas --with numpy

# Install for Claude Desktop
uv run mcp install src/nowledge_graph_server/server.py

Production

# Start the server directly
uv run nowledge-server

# Or using the module
python -m nowledge_graph_server

MCP Tools

The server exposes the following MCP tools:

Memory Management

  • memory.search(query, limit) - Semantic search of Memory nodes
  • memory.add(content, source_thread_id, metadata) - Add single Memory
  • memory.get(memory_id) - Retrieve Memory by ID

Thread Management

  • thread.add(messages, thread_id, metadata) - Import conversation thread
  • thread.search(query, limit) - Semantic search of Thread summaries
  • thread.fetch(thread_id) - Retrieve full thread by ID

Entity & Relationship Management

  • entity.search(query, limit) - Find entities by name/description
  • entity.relate(entity1_id, entity2_id, relationship_type) - Create entity relationships

Search & Discovery

  • message.search(query, limit) - Full-text search of Message content
  • graph.explore(start_node_id, depth) - Explore graph relationships
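
As a minimal sketch, the tools above can be invoked from Python with the official mcp SDK over stdio. The launch command and the argument values (query strings, node IDs) below are assumptions; adjust them to how you run the server.

import asyncio

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

async def main() -> None:
    # Assumed launch command; see the Quick Start section
    server = StdioServerParameters(command="uv", args=["run", "nowledge-server"])
    async with stdio_client(server) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            # Semantic search over Memory nodes
            memories = await session.call_tool(
                "memory.search", {"query": "project deadlines", "limit": 5}
            )
            print(memories)
            # Walk relationships outward from a node (hypothetical ID)
            neighborhood = await session.call_tool(
                "graph.explore", {"start_node_id": "memory-123", "depth": 2}
            )
            print(neighborhood)

asyncio.run(main())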

Graph Schema

Node Types

  • Thread: Conversation containers with summaries
  • Message: Individual messages within threads
  • Memory: Distilled knowledge from conversations
  • Entity: Concepts, people, places, etc.
  • Label: Flexible organization tags
  • Community: Entity clustering for discovery

Relationship Types

  • (Thread)-[:CONTAINS {order_index}]->(Message)
  • (Thread)-[:COMPACTS_TO]->(Memory)
  • (Memory)-[:EXTRACTED_FROM]->(Message)
  • (Memory)-[:MENTIONS {confidence}]->(Entity)
  • (Entity)-[:RELATES_TO {type, strength}]->(Entity)
  • (Thread/Memory)-[:HAS_LABEL]->(Label)
  • (Entity)-[:BELONGS_TO {strength}]->(Community)
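
For a sense of how these relationships read in practice, the sketch below queries the graph with the kuzu Python package against the default database path from the Configuration section. The confidence property comes from the schema above; the id and name properties are assumptions.

import kuzu

db = kuzu.Database("./data/nowledge.db")
conn = kuzu.Connection(db)

# Entities mentioned by Memory nodes, with extraction confidence
# (m.id and e.name are assumed property names)
result = conn.execute(
    "MATCH (m:Memory)-[r:MENTIONS]->(e:Entity) "
    "RETURN m.id, e.name, r.confidence LIMIT 10"
)
while result.has_next():
    print(result.get_next())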

Configuration

Create a .env file or set environment variables:

# Database
KUZU_DB_PATH=./data/nowledge.db
VECTOR_INDEX_PATH=./data/vectors

# Embeddings
EMBEDDING_MODEL=all-MiniLM-L6-v2
EMBEDDING_DIMENSION=384

# Server
MCP_SERVER_HOST=localhost
MCP_SERVER_PORT=3001
LOG_LEVEL=INFO

# Optional: OpenAI API for enhanced processing
OPENAI_API_KEY=your_key_here
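
One way config.py might map these variables is with pydantic-settings; this is only a sketch, and the actual field names and defaults in the package may differ.

from pydantic_settings import BaseSettings, SettingsConfigDict

class Settings(BaseSettings):
    # Reads the process environment, with .env as a fallback
    model_config = SettingsConfigDict(env_file=".env")

    kuzu_db_path: str = "./data/nowledge.db"
    vector_index_path: str = "./data/vectors"
    embedding_model: str = "all-MiniLM-L6-v2"
    embedding_dimension: int = 384
    mcp_server_host: str = "localhost"
    mcp_server_port: int = 3001
    log_level: str = "INFO"
    openai_api_key: str | None = None  # optional

settings = Settings()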

Development

Project Structure

src/nowledge_graph_server/
├── __init__.py              # Package initialization
├── server.py               # FastMCP server setup
├── cli.py                  # Command-line interface
├── config.py               # Configuration management
├── models/                 # Pydantic data models
│   ├── __init__.py
│   ├── graph.py           # Graph node/edge models
│   ├── memory.py          # Memory-related models
│   └── thread.py          # Thread/message models
├── database/               # Database layer
│   ├── __init__.py
│   ├── kuzu_client.py     # KuzuDB integration
│   ├── schema.py          # Graph schema setup
│   └── migrations.py      # Schema migrations
├── tools/                  # MCP tool implementations
│   ├── __init__.py
│   ├── memory.py          # Memory management tools
│   ├── thread.py          # Thread management tools
│   ├── entity.py          # Entity tools
│   └── search.py          # Search tools
├── services/               # Business logic
│   ├── __init__.py
│   ├── embedding.py       # Vector embedding service
│   ├── entity_extraction.py # NLP entity extraction
│   └── memory_compaction.py # Memory creation logic
└── utils/                  # Utilities
    ├── __init__.py
    ├── logging.py         # Structured logging
    └── validation.py      # Data validation

Testing

# Run all tests
uv run pytest

# Run with coverage
uv run pytest --cov=nowledge_graph_server

# Run specific test file
uv run pytest tests/test_memory_tools.py

Code Quality

# Format code
uv run black src tests
uv run isort src tests

# Lint
uv run ruff check src tests

# Type checking
uv run mypy src

Integration with Tauri App

The Python MCP server communicates with the Tauri desktop application via HTTP/JSON-RPC. The Tauri app connects to the server on startup and uses the MCP tools for all memory and graph operations.
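
For a sense of the wire format, a single tool call looks roughly like the snippet below. This is a simplified sketch: a real MCP client performs an initialize handshake first, and the /mcp endpoint path is an assumption; the port matches the default configuration.

import httpx

payload = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",  # standard MCP method for tool invocation
    "params": {
        "name": "memory.search",
        "arguments": {"query": "meeting notes", "limit": 5},
    },
}
response = httpx.post("http://localhost:3001/mcp", json=payload)
print(response.json())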

Offline Deployment & Model Caching

For production deployment in air-gapped environments or packaged applications, you'll need to pre-cache models locally.

Quick Setup for Offline Mode

  1. Pre-cache models (run this before packaging):
# Cache default models
python -m nowledge_graph_server.utils.model_cache

# Cache specific models
python -m nowledge_graph_server.utils.model_cache \
  --embedding-model multi-qa-MiniLM-L6-cos-v1 \
  --llm-model mlx-community/Qwen3-4B-Instruct-2507-DDWQ

# Check cache info
python -m nowledge_graph_server.utils.model_cache --info
  2. Configure for offline mode:
# Environment variables
export NOWLEDGE_EMBEDDING_OFFLINE_MODE=true
export NOWLEDGE_LLM_OFFLINE_MODE=true
export NOWLEDGE_EMBEDDING_CACHE_DIR=~/.cache/huggingface
export NOWLEDGE_LLM_CACHE_DIR=~/.cache/mlx

# Or in config.py
embedding_offline_mode = True
llm_offline_mode = True

Model Caching Strategy

The system uses an offline-first approach:

  1. Embedding Models: Uses local sentence-transformers cache
  2. LLM Models: Uses local MLX model cache
  3. Fallback: Hash-based embeddings if models unavailable
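
The fallback in step 3 is not spelled out here; a hypothetical feature-hashing version could look like this, trading accuracy for zero dependence on downloaded models.

import hashlib

import numpy as np

def hash_embed(text: str, dimension: int = 384) -> np.ndarray:
    """Deterministic, model-free embedding via feature hashing (sketch)."""
    vec = np.zeros(dimension, dtype=np.float32)
    for token in text.lower().split():
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        index = int.from_bytes(digest[:4], "little") % dimension
        sign = 1.0 if digest[4] % 2 == 0 else -1.0
        vec[index] += sign
    norm = np.linalg.norm(vec)
    return vec / norm if norm > 0 else vec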

Cache Locations

  • Embedding Models: ~/.cache/huggingface/hub/
  • MLX Models: ~/.cache/mlx/models/ or ~/.cache/huggingface/hub/
  • Custom Cache: Set via NOWLEDGE_EMBEDDING_CACHE_DIR and NOWLEDGE_LLM_CACHE_DIR

Packaging for Binary Distribution

  1. Pre-cache all models:
# Create cache directory in your package
mkdir -p ./package/cache
python -m nowledge_graph_server.utils.model_cache --cache-dir ./package/cache
  2. Set environment variables in your package:
export NOWLEDGE_EMBEDDING_OFFLINE_MODE=true
export NOWLEDGE_LLM_OFFLINE_MODE=true
export NOWLEDGE_EMBEDDING_CACHE_DIR=./cache/huggingface
export NOWLEDGE_LLM_CACHE_DIR=./cache/mlx
  3. Bundle the cache directory with your application.

Troubleshooting Offline Issues

Error: "Failed to resolve 'huggingface.co'"

  • Solution: Enable offline mode and pre-cache models

Error: "Model not found in cache"

  • Solution: Run the model caching script first

Error: "MLX model download failed"

  • Solution: Set HF_HUB_OFFLINE=1 and TRANSFORMERS_OFFLINE=1
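
Note that these flags are read when the Hugging Face libraries are imported, so they must be set beforehand. One way to guarantee that from Python:

import os

os.environ["HF_HUB_OFFLINE"] = "1"
os.environ["TRANSFORMERS_OFFLINE"] = "1"

# Import only after the flags are set
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # served from the local cache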

Performance Considerations

  • Cache Size: ~2-4GB for typical models
  • Startup Time: Faster with cached models
  • Memory Usage: Same as online mode
  • Accuracy: Identical to online mode (uses same models)

Self-Contained App Deployment

Cache Directory Configuration

The system now supports proper cache directory configuration for self-contained/bundled applications:

Key Features:

  • Automatic Cache Directory Detection: Detects if running in a bundled app (PyInstaller, cx_Freeze, etc.)
  • App-Relative Cache Paths: Uses cache directories relative to the executable, not user home directory
  • Offline-First Configuration: Defaults to offline mode to prevent internet access in packaged apps
  • HuggingFace Cache Support: Properly handles MLX model caching via HuggingFace Hub

Cache Directory Structure:

your-app/
├── cache/
│   ├── embeddings/           # Sentence transformer models
│   ├── llm/                  # Local LLM cache (if needed)
│   └── huggingface/          # MLX models from HuggingFace
├── data/
│   └── nowledge_graph.db     # Database file
└── your-app-executable

Configuration for Bundled Apps

For bundled applications, the system automatically:

  1. Detects Bundle Environment: Checks sys.frozen or sys._MEIPASS (PyInstaller)
  2. Sets App-Relative Cache: Uses ./cache/ directory relative to executable
  3. Configures HuggingFace Environment: Sets HF_HOME and HUGGINGFACE_HUB_CACHE
  4. Enables Offline Mode: Prevents model downloads in production
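
A sketch of that detection logic follows; the helper names are hypothetical, but sys.frozen and sys._MEIPASS are the real markers set by PyInstaller-style launchers.

import os
import sys
from pathlib import Path

def is_bundled() -> bool:
    # sys.frozen is set by PyInstaller/cx_Freeze; _MEIPASS by PyInstaller
    return bool(getattr(sys, "frozen", False)) or hasattr(sys, "_MEIPASS")

def app_cache_dir() -> Path:
    base = Path(sys.executable).resolve().parent if is_bundled() else Path.cwd()
    return base / "cache"

hf_cache = app_cache_dir() / "huggingface"
os.environ.setdefault("HF_HOME", str(hf_cache))
os.environ.setdefault("HUGGINGFACE_HUB_CACHE", str(hf_cache))
os.environ.setdefault("HF_HUB_OFFLINE", "1")  # no downloads in production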

Pre-Caching Models

Before packaging your application:

# 1. Cache embedding models
python -m nowledge_graph_server.utils.model_cache cache-embedding

# 2. Cache LLM models  
python -m nowledge_graph_server.utils.model_cache cache-llm

# 3. Verify cached models
python -m nowledge_graph_server.utils.model_cache info
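
Under the hood, the embedding step amounts to downloading the model once into the directory that ships with the app; cache_folder is a standard sentence-transformers argument, though the exact internals of the cache utility may differ.

from sentence_transformers import SentenceTransformer

# One-time download into the bundled cache directory
SentenceTransformer("all-MiniLM-L6-v2", cache_folder="./cache/embeddings")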

Environment Variables

For manual configuration or development:

# Core cache directories
export HUGGINGFACE_CACHE_DIR=./cache/huggingface
export EMBEDDING_CACHE_DIR=./cache/embeddings
export LLM_CACHE_DIR=./cache/llm

# HuggingFace offline mode (for MLX models)
export HF_HOME=./cache/huggingface
export HUGGINGFACE_HUB_CACHE=./cache/huggingface
export HF_HUB_OFFLINE=1
export TRANSFORMERS_OFFLINE=1

# Application offline mode
export EMBEDDING_OFFLINE_MODE=true
export LLM_OFFLINE_MODE=true

Testing Offline Mode

# Test with offline configuration
cp config.offline.env config.env
python -m nowledge_graph_server.cli serve

# Test in air-gapped environment
# (disconnect internet and verify functionality)

Binary Packaging Checklist

  1. Pre-cache all models using the model cache utility
  2. Include cache directory in your package bundle
  3. Set cache paths relative to executable
  4. Test offline functionality before packaging
  5. Verify no internet access required in production environment

Troubleshooting

Issue: MLX models download instead of using the cache

  • Solution: Check that HUGGINGFACE_CACHE_DIR is set and the models are cached in HuggingFace format

Issue: "Model not found in cache" errors

  • Solution: Run the model caching utility and verify the cache directory structure

Issue: Internet access required in bundled app

  • Solution: Enable offline mode and ensure all models are pre-cached


Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes with tests
  4. Run the quality checks
  5. Submit a pull request

License

This project is part of the Nowledge Graph ecosystem and follows the same licensing terms.
