Nowledge Graph MCP Server
A local-first personal memory management system that serves as a universal bridge between GenAI tools via MCP (Model Context Protocol). This Python server provides graph-based memory storage, semantic search, and knowledge management capabilities.
Features
- Graph Database: KuzuDB-powered graph storage for memories, threads, entities, and relationships
- Vector Search: Semantic search using sentence transformers and FAISS
- MCP Protocol: Full Model Context Protocol support for GenAI tool integration
- Memory Management: Intelligent memory compaction and entity linking
- Thread Support: Conversation import and management with provenance tracking
- Flexible Labeling: User-driven organization with label nodes
Architecture
┌─────────────────────────────────────┐
│          Python MCP Server          │
│  ┌─────────────┐   ┌─────────────┐  │
│  │   FastMCP   │   │   KuzuDB    │  │
│  │    Tools    │   │  + Vector   │  │
│  └─────────────┘   └─────────────┘  │
└─────────────────────────────────────┘
                  │ HTTP/JSON-RPC
                  ▼
┌─────────────────────────────────────┐
│     Nowledge Graph Desktop App      │
│  ┌─────────────┐   ┌─────────────┐  │
│  │    React    │   │    Rust     │  │
│  │    + G6     │   │   Backend   │  │
│  └─────────────┘   └─────────────┘  │
└─────────────────────────────────────┘
Quick Start
Installation
# Using uv (recommended)
cd nowledge-graph-py
uv sync
# Or using pip
pip install -e .
Development
# Start the MCP server in development mode
uv run mcp dev src/nowledge_graph_server/server.py
# Run with additional dependencies
uv run mcp dev src/nowledge_graph_server/server.py --with pandas --with numpy
# Install for Claude Desktop
uv run mcp install src/nowledge_graph_server/server.py
Production
# Start the server directly
uv run nowledge-server
# Or using the module
python -m nowledge_graph_server
MCP Tools
The server exposes the following MCP tools:
Memory Management
- memory.search(query, limit) - Semantic search of Memory nodes
- memory.add(content, source_thread_id, metadata) - Add single Memory
- memory.get(memory_id) - Retrieve Memory by ID
Thread Management
- thread.add(messages, thread_id, metadata) - Import conversation thread
- thread.search(query, limit) - Semantic search of Thread summaries
- thread.fetch(thread_id) - Retrieve full thread by ID
Entity & Relationship Management
- entity.search(query, limit) - Find entities by name/description
- entity.relate(entity1_id, entity2_id, relationship_type) - Create entity relationships
Search & Discovery
- message.search(query, limit) - Full-text search of Message content
- graph.explore(start_node_id, depth) - Explore graph relationships
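These tools can be exercised from any MCP client. A minimal sketch using the official mcp Python SDK over stdio (the tool name and arguments come from the list above; the server command is the one from the Production section; the example query string is illustrative):

import asyncio
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

# Launch the server as a stdio subprocess
params = StdioServerParameters(command="uv", args=["run", "nowledge-server"])

async def main() -> None:
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            # Call one of the tools listed above
            result = await session.call_tool(
                "memory.search", {"query": "kuzu schema decisions", "limit": 5}
            )
            print(result.content)

asyncio.run(main())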
Graph Schema
Node Types
- Thread: Conversation containers with summaries
- Message: Individual messages within threads
- Memory: Distilled knowledge from conversations
- Entity: Concepts, people, places, etc.
- Label: Flexible organization tags
- Community: Entity clustering for discovery
Relationship Types
- (Thread)-[:CONTAINS {order_index}]->(Message)
- (Thread)-[:COMPACTS_TO]->(Memory)
- (Memory)-[:EXTRACTED_FROM]->(Message)
- (Memory)-[:MENTIONS {confidence}]->(Entity)
- (Entity)-[:RELATES_TO {type, strength}]->(Entity)
- (Thread/Memory)-[:HAS_LABEL]->(Label)
- (Entity)-[:BELONGS_TO {strength}]->(Community)
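Because the graph lives in KuzuDB, it can also be inspected directly with Cypher through the kuzu Python client. A minimal sketch (the relationship type and confidence property come from the schema above, but node property names such as name and content are assumptions):

import kuzu

db = kuzu.Database("./data/nowledge.db")  # KUZU_DB_PATH from the Configuration section
conn = kuzu.Connection(db)

# Memories that mention a given entity, with the mention confidence
result = conn.execute(
    "MATCH (m:Memory)-[r:MENTIONS]->(e:Entity) "
    "WHERE e.name = $name "
    "RETURN m.content, r.confidence",
    {"name": "KuzuDB"},
)
while result.has_next():
    print(result.get_next())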
Configuration
Create a .env file or set environment variables:
# Database
KUZU_DB_PATH=./data/nowledge.db
VECTOR_INDEX_PATH=./data/vectors
# Embeddings
EMBEDDING_MODEL=all-MiniLM-L6-v2
EMBEDDING_DIMENSION=384
# Server
MCP_SERVER_HOST=localhost
MCP_SERVER_PORT=3001
LOG_LEVEL=INFO
# Optional: OpenAI API for enhanced processing
OPENAI_API_KEY=your_key_here
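How these values are loaded is up to config.py; a minimal sketch of one approach using pydantic-settings (field names mirror the variables above, defaults are the documented ones; the actual implementation may differ):

from pydantic_settings import BaseSettings, SettingsConfigDict

class Settings(BaseSettings):
    model_config = SettingsConfigDict(env_file=".env")

    kuzu_db_path: str = "./data/nowledge.db"
    vector_index_path: str = "./data/vectors"
    embedding_model: str = "all-MiniLM-L6-v2"
    embedding_dimension: int = 384
    mcp_server_host: str = "localhost"
    mcp_server_port: int = 3001
    log_level: str = "INFO"
    openai_api_key: str | None = None  # optional, for enhanced processing

settings = Settings()  # values come from the environment or the .env file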
Development
Project Structure
src/nowledge_graph_server/
├── __init__.py # Package initialization
├── server.py # FastMCP server setup
├── cli.py # Command-line interface
├── config.py # Configuration management
├── models/ # Pydantic data models
│ ├── __init__.py
│ ├── graph.py # Graph node/edge models
│ ├── memory.py # Memory-related models
│ └── thread.py # Thread/message models
├── database/ # Database layer
│ ├── __init__.py
│ ├── kuzu_client.py # KuzuDB integration
│ ├── schema.py # Graph schema setup
│ └── migrations.py # Schema migrations
├── tools/ # MCP tool implementations
│ ├── __init__.py
│ ├── memory.py # Memory management tools
│ ├── thread.py # Thread management tools
│ ├── entity.py # Entity tools
│ └── search.py # Search tools
├── services/ # Business logic
│ ├── __init__.py
│ ├── embedding.py # Vector embedding service
│ ├── entity_extraction.py # NLP entity extraction
│ └── memory_compaction.py # Memory creation logic
└── utils/ # Utilities
├── __init__.py
├── logging.py # Structured logging
└── validation.py # Data validation
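server.py wires the tool modules into a FastMCP instance. A minimal sketch of that pattern (the tool body is elided; the tool name matches the MCP Tools section above):

from mcp.server.fastmcp import FastMCP

mcp = FastMCP("nowledge-graph")

@mcp.tool(name="memory.search")
def memory_search(query: str, limit: int = 10) -> list[dict]:
    """Semantic search of Memory nodes."""
    ...  # delegates to services/embedding.py and the database layer

if __name__ == "__main__":
    mcp.run()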
Testing
# Run all tests
uv run pytest
# Run with coverage
uv run pytest --cov=nowledge_graph_server
# Run specific test file
uv run pytest tests/test_memory_tools.py
Code Quality
# Format code
uv run black src tests
uv run isort src tests
# Lint
uv run ruff check src tests
# Type checking
uv run mypy src
Integration with Tauri App
The Python MCP server communicates with the Tauri desktop application via HTTP/JSON-RPC. The Tauri app connects to the server on startup and uses the MCP tools for all memory and graph operations.
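On the wire this is plain JSON-RPC 2.0, so the integration can be probed by hand. A minimal sketch of such a request from Python (tools/call is the standard MCP method for tool invocation; host and port come from the Configuration section, but the /mcp endpoint path is a hypothetical assumption):

import json
import urllib.request

payload = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "memory.search",
        "arguments": {"query": "project notes", "limit": 5},
    },
}

req = urllib.request.Request(
    "http://localhost:3001/mcp",  # hypothetical endpoint path
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read()))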
Offline Deployment & Model Caching
For production deployment in air-gapped environments or packaged applications, you'll need to pre-cache models locally.
Quick Setup for Offline Mode
- Pre-cache models (run this before packaging):
# Cache default models
python -m nowledge_graph_server.utils.model_cache
# Cache specific models
python -m nowledge_graph_server.utils.model_cache \
--embedding-model multi-qa-MiniLM-L6-cos-v1 \
--llm-model mlx-community/Qwen3-4B-Instruct-2507-DDWQ
# Check cache info
python -m nowledge_graph_server.utils.model_cache --info
- Configure for offline mode:
# Environment variables
export NOWLEDGE_EMBEDDING_OFFLINE_MODE=true
export NOWLEDGE_LLM_OFFLINE_MODE=true
export NOWLEDGE_EMBEDDING_CACHE_DIR=~/.cache/huggingface
export NOWLEDGE_LLM_CACHE_DIR=~/.cache/mlx
# Or in config.py
embedding_offline_mode = True
llm_offline_mode = True
Model Caching Strategy
The system uses an offline-first approach:
- Embedding Models: Uses the local sentence-transformers cache
- LLM Models: Uses the local MLX model cache
- Fallback: Hash-based embeddings if models are unavailable (sketched below)
Cache Locations
- Embedding Models: ~/.cache/huggingface/hub/
- MLX Models: ~/.cache/mlx/models/ or ~/.cache/huggingface/hub/
- Custom Cache: Set via NOWLEDGE_EMBEDDING_CACHE_DIR and NOWLEDGE_LLM_CACHE_DIR
Packaging for Binary Distribution
- Pre-cache all models:
# Create cache directory in your package
mkdir -p ./package/cache
python -m nowledge_graph_server.utils.model_cache --cache-dir ./package/cache
- Set environment variables in your package:
export NOWLEDGE_EMBEDDING_OFFLINE_MODE=true
export NOWLEDGE_LLM_OFFLINE_MODE=true
export NOWLEDGE_EMBEDDING_CACHE_DIR=./cache/huggingface
export NOWLEDGE_LLM_CACHE_DIR=./cache/mlx
- Bundle cache directory with your application.
Troubleshooting Offline Issues
Error: "Failed to resolve 'huggingface.co'"
- Solution: Enable offline mode and pre-cache models
Error: "Model not found in cache"
- Solution: Run the model caching script first
Error: "MLX model download failed"
- Solution: Set HF_HUB_OFFLINE=1 and TRANSFORMERS_OFFLINE=1
Performance Considerations
- Cache Size: ~2-4GB for typical models
- Startup Time: Faster with cached models
- Memory Usage: Same as online mode
- Accuracy: Identical to online mode (uses same models)
Self-Contained App Deployment
Cache Directory Configuration
The system now supports proper cache directory configuration for self-contained/bundled applications:
Key Features:
- Automatic Cache Directory Detection: Detects if running in a bundled app (PyInstaller, cx_Freeze, etc.)
- App-Relative Cache Paths: Uses cache directories relative to the executable, not user home directory
- Offline-First Configuration: Defaults to offline mode to prevent internet access in packaged apps
- HuggingFace Cache Support: Properly handles MLX model caching via HuggingFace Hub
Cache Directory Structure:
your-app/
├── cache/
│ ├── embeddings/ # Sentence transformer models
│ ├── llm/ # Local LLM cache (if needed)
│ └── huggingface/ # MLX models from HuggingFace
├── data/
│ └── nowledge_graph.db # Database file
└── your-app-executable
Configuration for Bundled Apps
For bundled applications, the system automatically:
- Detects Bundle Environment: Checks sys.frozen or sys._MEIPASS (PyInstaller)
- Sets App-Relative Cache: Uses a ./cache/ directory relative to the executable
- Configures HuggingFace Environment: Sets HF_HOME and HUGGINGFACE_HUB_CACHE
- Enables Offline Mode: Prevents model downloads in production (see the sketch below)
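A minimal sketch of that detection logic, assuming standard PyInstaller conventions (the helper name is illustrative):

import os
import sys
from pathlib import Path

def app_cache_dir() -> Path:
    # PyInstaller sets sys.frozen (and sys._MEIPASS for one-file builds)
    if getattr(sys, "frozen", False):
        base = Path(sys.executable).parent
    else:
        base = Path(__file__).resolve().parent
    return base / "cache"

hf_cache = app_cache_dir() / "huggingface"
os.environ.setdefault("HF_HOME", str(hf_cache))
os.environ.setdefault("HUGGINGFACE_HUB_CACHE", str(hf_cache))
os.environ.setdefault("HF_HUB_OFFLINE", "1")  # no downloads in production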
Pre-Caching Models
Before packaging your application:
# 1. Cache embedding models
python -m nowledge_graph_server.utils.model_cache cache-embedding
# 2. Cache LLM models
python -m nowledge_graph_server.utils.model_cache cache-llm
# 3. Verify cached models
python -m nowledge_graph_server.utils.model_cache info
Environment Variables
For manual configuration or development:
# Core cache directories
export HUGGINGFACE_CACHE_DIR=./cache/huggingface
export EMBEDDING_CACHE_DIR=./cache/embeddings
export LLM_CACHE_DIR=./cache/llm
# HuggingFace offline mode (for MLX models)
export HF_HOME=./cache/huggingface
export HUGGINGFACE_HUB_CACHE=./cache/huggingface
export HF_HUB_OFFLINE=1
export TRANSFORMERS_OFFLINE=1
# Application offline mode
export EMBEDDING_OFFLINE_MODE=true
export LLM_OFFLINE_MODE=true
Testing Offline Mode
# Test with offline configuration
cp config.offline.env config.env
python -m nowledge_graph_server.cli serve
# Test in air-gapped environment
# (disconnect internet and verify functionality)
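One way to make the air-gap check repeatable is to block sockets in a test. A sketch using pytest's monkeypatch (the test body is illustrative):

import socket
import pytest

@pytest.fixture
def no_network(monkeypatch):
    # Any attempt to open a connection fails the test immediately.
    def guard(self, *args, **kwargs):
        raise RuntimeError("network access attempted during offline test")
    monkeypatch.setattr(socket.socket, "connect", guard)

def test_imports_offline(no_network):
    # With offline mode enabled, importing the package should not
    # trigger any model downloads.
    import nowledge_graph_server  # noqa: F401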
Binary Packaging Checklist
- Pre-cache all models using the model cache utility
- Include cache directory in your package bundle
- Set cache paths relative to executable
- Test offline functionality before packaging
- Verify no internet access required in production environment
Troubleshooting
Issue: MLX models downloading instead of using cache
Solution: Check that HUGGINGFACE_CACHE_DIR is set and models are cached in HuggingFace format
Issue: "Model not found in cache" errors
Solution: Run the model caching utility and verify the cache directory structure
Issue: Internet access required in bundled app
Solution: Enable offline mode and ensure all models are pre-cached
Contributing
- Fork the repository
- Create a feature branch
- Make your changes with tests
- Run the quality checks
- Submit a pull request
License
This project is part of the Nowledge Graph ecosystem and follows the same licensing terms.