# LocalData MCP Server

A comprehensive MCP server for databases, spreadsheets, structured data files, and directed graphs, with security features, performance optimization, and extensive format support.
## What's New in v1.5.x

### Graph Storage for Directed Graphs

DOT, GML, GraphML, and Mermaid files are stored as directed multigraphs backed by SQLite. Nodes and edges carry typed metadata properties, and the full graph is available for algorithmic analysis via NetworkX.
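The on-disk layout is internal to the server; purely as an illustration of the idea, a directed multigraph with typed properties can be backed by a SQLite schema along these lines (table and column names here are hypothetical, not the server's actual schema):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE nodes (id TEXT PRIMARY KEY, label TEXT);
CREATE TABLE edges (
    id INTEGER PRIMARY KEY,
    source TEXT REFERENCES nodes(id),
    target TEXT REFERENCES nodes(id),
    label TEXT
);
CREATE TABLE node_props (
    node_id TEXT REFERENCES nodes(id),
    key TEXT, value TEXT, type TEXT,   -- typed metadata, serialized as text
    PRIMARY KEY (node_id, key)
);
""")
conn.executemany("INSERT INTO nodes VALUES (?, ?)", [("a", "A"), ("b", "B")])
# A multigraph allows parallel edges between the same node pair:
conn.executemany("INSERT INTO edges (source, target, label) VALUES (?, ?, ?)",
                 [("a", "b", "related"), ("a", "b", "cites")])
out_edges = conn.execute(
    "SELECT label FROM edges WHERE source = ?", ("a",)).fetchall()
print([label for (label,) in out_edges])  # both parallel edges are present
```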
### 14 Graph Tools

| Tool | Purpose |
|---|---|
| `set_node` | Create or update a graph node |
| `delete_node` | Remove a node and cascade edges/properties |
| `get_node` | Inspect a node and its properties |
| `add_edge` | Create an edge (auto-creates endpoints) |
| `remove_edge` | Remove an edge by source/target/label |
| `get_edges` | List edges for a node |
| `get_neighbors` | List adjacent nodes by direction |
| `get_value` | Read a property from a node |
| `set_value` | Set a typed property on a node |
| `delete_key` | Remove a property |
| `list_keys` | List properties on a node |
| `find_path` | Find the shortest path, or all paths, between two nodes |
| `get_graph_stats` | Node/edge counts, density, DAG check, degree stats |
| `export_graph` | Export as DOT, GML, GraphML, or Mermaid |
### Graph Validation Warnings

Import and edit operations automatically check for 13 categories of issues:

- **Structural**: self-loops, duplicate edges, orphan nodes, missing edge labels, cycles, disconnected components
- **Semantic**: contradictory edges (A→B and B→A with the same label), conflicting parallel labels on the same node pair
- **Property**: node IDs that differ only in casing, missing common properties, near-duplicate labels (fuzzy matching)
- **DAG-specific**: redundant transitive edges, diamond ambiguity (unintended polyhierarchy)

Warnings are returned in the `"warnings"` key of tool responses. Expensive checks (cycle detection, transitive reduction) are skipped automatically for graphs exceeding 10,000 nodes.
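The server runs these checks internally (via NetworkX); as a stdlib-only sketch of what two of them amount to, here is cycle detection and the redundant-transitive-edge check on a plain adjacency dict:

```python
def has_cycle(adj):
    """DFS three-color cycle check on an adjacency dict {node: [successors]}."""
    state = {}  # node -> "active" (on the DFS stack) or "done"
    def visit(n):
        state[n] = "active"
        for m in adj.get(n, []):
            if state.get(m) == "active" or (m not in state and visit(m)):
                return True
        state[n] = "done"
        return False
    return any(visit(n) for n in list(adj) if n not in state)

def redundant_edges(adj):
    """Edges (u, v) where v is also reachable from u via a longer path."""
    def reachable(start, skip_edge):
        seen, stack = set(), [start]
        while stack:
            n = stack.pop()
            for m in adj.get(n, []):
                if (n, m) != skip_edge and m not in seen:
                    seen.add(m)
                    stack.append(m)
        return seen
    return [(u, v) for u in adj for v in adj[u]
            if v in reachable(u, (u, v))]

dag = {"a": ["b", "c"], "b": ["c"], "c": []}
assert not has_cycle(dag)
print(redundant_edges(dag))  # [('a', 'c')] - already implied by a -> b -> c
```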
### Metadata-Preserving Export

Graph exports (GraphML, GML, DOT) now include all node and edge metadata properties, not just labels and weights.
## Table of Contents
- Features
- Quick Start
- Available Tools
- Working with Graphs
- Supported Data Sources
- Security Features
- Performance & Scalability
- Examples
- Testing & Quality
- Troubleshooting
- Roadmap
- Contributing
## Features

### Multi-Database Support
- SQL Databases: PostgreSQL, MySQL, SQLite, DuckDB
- Modern Databases: MongoDB, Redis, Elasticsearch, InfluxDB, Neo4j, CouchDB
- Spreadsheets: Excel (.xlsx/.xls), LibreOffice Calc (.ods), Apple Numbers (.numbers)
- Structured Files: CSV, TSV, JSON, YAML, TOML, XML, INI
- Graph Files: DOT (Graphviz), GML, GraphML, Mermaid flowcharts
- Analytical Formats: Parquet, Feather, Arrow, HDF5
### Advanced Security
- Path Security: Restricts file access to current working directory only
- SQL Injection Prevention: Parameterized queries and safe table identifiers
- Connection Limits: Maximum 10 concurrent database connections
- Input Validation: Comprehensive validation and sanitization
### Large Dataset Handling
- Query Buffering: Automatic buffering for results with 100+ rows
- Large File Support: 100MB+ files automatically use temporary SQLite storage
- Chunk Retrieval: Paginated access to large result sets
- Auto-Cleanup: 10-minute expiry with file modification detection
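The buffer bookkeeping is internal to the server; the expiry-plus-modification check described above can be sketched roughly like this (class and attribute names are illustrative, not the actual implementation):

```python
import os
import tempfile
import time

EXPIRY_SECONDS = 600  # 10-minute expiry

class QueryBuffer:
    def __init__(self, source_path, rows):
        self.rows = rows
        self.created = time.time()
        self.source_path = source_path
        self.source_mtime = os.path.getmtime(source_path)

    def is_stale(self):
        # Expired buffers, and buffers whose backing file changed, are invalid.
        expired = time.time() - self.created > EXPIRY_SECONDS
        modified = os.path.getmtime(self.source_path) != self.source_mtime
        return expired or modified

with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False) as f:
    f.write("a,b\n1,2\n")
    path = f.name

buf = QueryBuffer(path, rows=[("1", "2")])
print(buf.is_stale())  # fresh buffer, unmodified file
os.utime(path, (time.time() + 5, time.time() + 5))  # simulate an edit
print(buf.is_stale())  # modification detected, buffer invalidated
```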
### Developer Experience
- Clean Tool Surface: Core database tools, 9 tree tools, and 14 graph tools
- Error Handling: Detailed, actionable error messages
- Thread Safety: Concurrent operation support
- Backward Compatible: All existing APIs preserved
## Quick Start
### Installation

```bash
# Using pip
pip install localdata-mcp

# Using uv (recommended)
uv tool install localdata-mcp

# Development installation
git clone https://github.com/ChrisGVE/localdata-mcp.git
cd localdata-mcp
pip install -e .
```
### Configuration

Add to your MCP client configuration:

```json
{
  "mcpServers": {
    "localdata": {
      "command": "localdata-mcp",
      "env": {}
    }
  }
}
```

**Docker Usage**: See the Docker Usage Guide for container deployment and configuration.
### Usage Examples

#### Connect to Databases

```python
# PostgreSQL
connect_database("analytics", "postgresql", "postgresql://user:pass@localhost/db")

# SQLite
connect_database("local", "sqlite", "./data.sqlite")

# CSV files
connect_database("csvdata", "csv", "./data.csv")

# JSON files
connect_database("config", "json", "./config.json")

# Excel spreadsheets (all sheets)
connect_database("sales", "xlsx", "./sales_data.xlsx")

# Excel with a specific sheet
connect_database("q1data", "xlsx", "./quarterly.xlsx?sheet=Q1_Sales")

# LibreOffice Calc
connect_database("budget", "ods", "./budget_2024.ods")

# Tab-separated values
connect_database("exports", "tsv", "./export_data.tsv")

# XML structured data
connect_database("config_xml", "xml", "./config.xml")

# INI configuration files
connect_database("settings", "ini", "./app.ini")

# Analytical formats
connect_database("analytics", "parquet", "./data.parquet")
connect_database("features", "feather", "./features.feather")
connect_database("vectors", "arrow", "./vectors.arrow")
```
#### Work with Structured Files (TOML, JSON, YAML)

```python
# Connect - returns tree summary (root nodes, depths, property counts)
connect_database("cfg", "toml", "./config.toml")

# Navigate the tree
get_node("cfg")                # Root summary
get_node("cfg", "server")      # Node properties + children info
get_children("cfg", "server")  # List child nodes

# Read and write properties
get_value("cfg", "server", "port")                        # → 8080
set_value("cfg", "server", "port", "9090")
set_value("cfg", "monitoring.alerts", "enabled", "true")  # auto-creates node

# Export back to any format
export_structured("cfg", "toml")  # Modified TOML output
export_structured("cfg", "json")  # Convert to JSON
```
#### Query Data

```python
# Execute queries with automatic result formatting
execute_query("analytics", "SELECT * FROM users LIMIT 50")

# Large result sets use buffering automatically
execute_query_json("analytics", "SELECT * FROM large_table")
```
#### Handle Large Results

```python
# Get chunked results for large datasets
get_query_chunk("analytics_1640995200_a1b2", 101, "100")

# Check buffer status
get_buffered_query_info("analytics_1640995200_a1b2")

# Manual cleanup
clear_query_buffer("analytics_1640995200_a1b2")
```
## Available Tools

### Core Database Tools

| Tool | Description | Use Case |
|---|---|---|
| `connect_database` | Connect to databases/files | Initial setup |
| `disconnect_database` | Close connections | Cleanup |
| `list_databases` | Show active connections | Status check |
| `execute_query` | Run SQL with automatic chunking | All query needs |
| `next_chunk` | Get the next chunk of a large result set | Large data |
| `describe_database` | Show database schema | Exploration |
| `describe_table` | Show table structure | Analysis |
| `find_table` | Locate tables by name | Navigation |
### Tree Tools (TOML, JSON, YAML)

These tools are available when connected to a structured data file.

| Tool | Description | Use Case |
|---|---|---|
| `get_node` | View node properties and children info | Navigation |
| `get_children` | List child nodes with pagination | Navigation |
| `set_node` | Create a node (auto-creates ancestors) | Structure edit |
| `delete_node` | Remove a node and all descendants | Structure edit |
| `list_keys` | List key-value pairs at a node | Inspection |
| `get_value` | Read a single property value | Inspection |
| `set_value` | Create or update a property | Data edit |
| `delete_key` | Remove a property from a node | Data edit |
| `export_structured` | Export tree as TOML, JSON, or YAML | Export |
### Graph Tools (DOT, GML, GraphML, Mermaid)

These tools are available when connected to a graph file.

| Tool | Description | Use Case |
|---|---|---|
| `set_node` | Create or update a graph node | Structure edit |
| `delete_node` | Remove a node, cascading edges and properties | Structure edit |
| `get_node` | Inspect a node's label and properties | Navigation |
| `add_edge` | Create a directed edge between nodes | Structure edit |
| `remove_edge` | Remove an edge by source/target/label | Structure edit |
| `get_edges` | List edges for a node | Navigation |
| `get_neighbors` | List adjacent nodes (in/out/both) | Navigation |
| `get_value` | Read a typed property from a node | Inspection |
| `set_value` | Set a typed property on a node | Data edit |
| `delete_key` | Remove a property from a node | Data edit |
| `list_keys` | List properties on a node with pagination | Inspection |
| `find_path` | Find the shortest path, or all paths, between nodes | Analysis |
| `get_graph_stats` | Node/edge counts, density, DAG test | Analysis |
| `export_graph` | Export as DOT, GML, GraphML, or Mermaid | Export |
## Working with Graphs

```python
# Connect to a graph file
connect_database("taxonomy", "graphml", "./knowledge_graph.graphml")
connect_database("flow", "dot", "./pipeline.dot")
connect_database("diagram", "mermaid", "./architecture.mmd")

# Navigate the graph
get_node("taxonomy", "machine_learning")
get_neighbors("taxonomy", "machine_learning", direction="out")
get_edges("taxonomy", node_id="machine_learning")

# Edit nodes and edges
set_node("taxonomy", "deep_learning", label="Deep Learning")
add_edge("taxonomy", "machine_learning", "deep_learning", label="broader")
set_value("taxonomy", "deep_learning", "field", "Computer Science")

# Algorithmic analysis
find_path("taxonomy", "statistics", "neural_networks")
get_graph_stats("taxonomy")

# Export to any supported format (includes metadata)
export_graph("taxonomy", "graphml")
export_graph("taxonomy", "mermaid")
```

Mutation and import operations return validation warnings when issues are detected:

```python
# Adding a self-loop returns a warning
add_edge("taxonomy", "A", "A", label="related")
# → {"source": "A", "target": "A", ..., "warnings": [{"code": "self_loop", ...}]}

# Importing a file with issues reports them all
connect_database("g", "dot", "./messy_graph.dot")
# → {"node_count": 42, ..., "warnings": [{"code": "orphan_nodes", ...}, ...]}
```
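`export_graph` renders Mermaid output server-side; as a rough illustration of the mapping, a labeled directed edge list translates to Mermaid flowchart syntax along these lines (this helper is a sketch, not the server's exporter):

```python
def to_mermaid(edges):
    """Render (source, target, label) triples as a Mermaid flowchart."""
    lines = ["graph TD"]
    for source, target, label in edges:
        if label:
            lines.append(f"    {source} -->|{label}| {target}")
        else:
            lines.append(f"    {source} --> {target}")
    return "\n".join(lines)

print(to_mermaid([
    ("machine_learning", "deep_learning", "broader"),
    ("deep_learning", "neural_networks", None),
]))
# graph TD
#     machine_learning -->|broader| deep_learning
#     deep_learning --> neural_networks
```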
## Supported Data Sources

**Detailed Connection Guide**: See the Database Connections Guide for setup instructions, connection strings, and security practices.
### SQL Databases
- PostgreSQL: Full support with connection pooling
- MySQL: Complete MySQL/MariaDB compatibility
- SQLite: Local file and in-memory databases
- DuckDB: High-performance analytical SQL database
### Modern Databases
- MongoDB: Document store with collection queries and aggregation
- Redis: High-performance key-value store
- Elasticsearch: Full-text search and analytics engine
- InfluxDB: Time-series database for metrics and IoT data
- Neo4j: Graph database for relationship queries
- CouchDB: Document-oriented database with HTTP API
### Structured Files

#### Spreadsheet Formats
- Excel (.xlsx, .xls): Full multi-sheet support with automatic table creation
- LibreOffice Calc (.ods): Complete ODS support with sheet handling
- Apple Numbers (.numbers): Native support for Numbers documents
- Multi-sheet handling: Each sheet becomes a separate queryable table
#### Text-Based Formats
- CSV: Large file automatic SQLite conversion
- TSV: Tab-separated values with same features as CSV
- JSON: Tree storage with node navigation and CRUD (v1.4.0)
- YAML: Tree storage with multi-document support (v1.4.0)
- TOML: Tree storage with array-of-tables support (v1.4.0)
- XML: Structured XML document parsing
- INI: Configuration file format support
#### Graph Formats (v1.5.0)
- DOT (Graphviz): Directed and undirected graph descriptions
- GML: Graph Modelling Language with nested attributes
- GraphML: XML-based format with typed node/edge properties
- Mermaid: Flowchart syntax with subgraphs, edge labels, and node shapes
#### Analytical Formats
- Parquet: High-performance columnar data format
- Feather: Fast binary format for data interchange
- Arrow: In-memory columnar format support
- HDF5: Hierarchical data format for scientific computing
## Security Features

### Path Security

```python
# ✅ Allowed - current directory and subdirectories
"./data/users.csv"
"data/config.json"
"subdir/file.yaml"

# ❌ Blocked - parent directory and absolute paths
"../etc/passwd"
"../../sensitive.db"
"/etc/hosts"
```
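The server's actual check isn't shown in this README; one common way to enforce this kind of restriction, sketched with `pathlib` (the function name is illustrative):

```python
from pathlib import Path

def is_allowed(user_path, base=None):
    """Allow only paths that resolve inside the working directory."""
    base = (base or Path.cwd()).resolve()
    target = (base / user_path).resolve()  # resolving collapses any ".." segments
    return target == base or base in target.parents

print(is_allowed("./data/users.csv"))  # True - stays under cwd
print(is_allowed("../etc/passwd"))     # False - escapes upward
print(is_allowed("/etc/hosts"))        # False (unless cwd happens to be /)
```

Resolving before comparing is the important step: a naive prefix check on the raw string would let `"./data/../../etc/passwd"` through.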
### SQL Injection Prevention

```python
# ✅ Safe - parameterized queries and validated identifiers
describe_table("mydb", "users")  # Validates the table name

# ❌ Blocked - malicious input
describe_table("mydb", "users; DROP TABLE users; --")
```
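A typical safe-identifier pattern (illustrative, not the server's exact code): validate the table name against a strict regex before quoting it, and pass all values through bound parameters:

```python
import re
import sqlite3

IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

def describe_table(conn, table):
    if not IDENT.fullmatch(table):
        raise ValueError(f"invalid table name: {table!r}")
    # The identifier is now safe to quote; values still use placeholders.
    return conn.execute(f'PRAGMA table_info("{table}")').fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
print([row[1] for row in describe_table(conn, "users")])  # the column names
try:
    describe_table(conn, "users; DROP TABLE users; --")
except ValueError as e:
    print("rejected:", e)
```

Identifiers can't be bound as parameters in SQL, which is why they need this separate validation step.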
### Resource Limits
- Connection Limit: Maximum 10 concurrent connections
- File Size Threshold: 100MB triggers temporary storage
- Query Buffering: Automatic for 100+ row results
- Auto-Cleanup: Buffers expire after 10 minutes
## Performance & Scalability

### Large File Handling
- Files over 100MB automatically use temporary SQLite storage
- Memory-efficient streaming for large datasets
- Automatic cleanup of temporary files
### Query Optimization
- Results with 100+ rows automatically use buffering system
- Chunk-based retrieval for large datasets
- File modification detection for cache invalidation
### Concurrency
- Thread-safe connection management
- Concurrent query execution support
- Resource pooling and limits
## Testing & Quality

### Comprehensive Test Coverage
- 68% test coverage with 500+ test cases
- Import error handling and graceful degradation
- Security vulnerability testing
- Performance benchmarking with large datasets
- Modern database connection testing
### Security Validated
- Path traversal prevention
- SQL injection protection
- Resource exhaustion testing
- Malicious input handling
### Performance Tested
- Large file processing
- Concurrent connection handling
- Memory usage optimization
- Query response times
### API Compatibility
All existing MCP tool signatures remain 100% backward compatible. New functionality is additive only:
- All original tools work unchanged
- Enhanced responses with additional metadata
- New buffering tools for large datasets
- Improved error messages and validation
## Examples

**Production Examples**: See Advanced Examples for production-ready usage patterns and complex scenarios.
### Basic Database Operations

```python
# Connect to SQLite
connect_database("sales", "sqlite", "./sales.db")

# Explore structure
describe_database("sales")
describe_table("sales", "orders")

# Query data
execute_query("sales", "SELECT product, SUM(amount) FROM orders GROUP BY product")
```
### Large Dataset Processing

```python
# Connect to a large CSV
connect_database("bigdata", "csv", "./million_records.csv")

# Queries return buffer info for large results
result = execute_query_json("bigdata", "SELECT * FROM data WHERE category = 'A'")

# Access results in chunks
chunk = get_query_chunk("bigdata_1640995200_a1b2", 1, "1000")
```
### Multi-Database Analysis

```python
# Connect multiple sources
connect_database("postgres", "postgresql", "postgresql://localhost/prod")
connect_database("config", "yaml", "./config.yaml")
connect_database("logs", "json", "./logs.json")

# Query across sources (in application logic)
user_data = execute_query("postgres", "SELECT * FROM users")
config = read_text_file("./config.yaml", "yaml")
```
### Multi-Sheet Spreadsheet Handling

LocalData MCP Server provides comprehensive support for multi-sheet spreadsheets (Excel and LibreOffice Calc).
#### Automatic Multi-Sheet Processing

```python
# Connect to an Excel file - all sheets become separate tables
connect_database("workbook", "xlsx", "./financial_data.xlsx")

# Query specific sheets (table names are sanitized sheet names)
execute_query("workbook", "SELECT * FROM Q1_Sales")
execute_query("workbook", "SELECT * FROM Q2_Budget")
execute_query("workbook", "SELECT * FROM Annual_Summary")
```
#### Single Sheet Selection

```python
# Connect to a specific sheet only, using the ?sheet=SheetName syntax
connect_database("q1only", "xlsx", "./financial_data.xlsx?sheet=Q1 Sales")

# The data is available as the default table
execute_query("q1only", "SELECT * FROM data")
```
#### Sheet Name Sanitization

Sheet names are automatically sanitized for SQL compatibility:

| Original Sheet Name | SQL Table Name |
|---|---|
| "Q1 Sales" | `Q1_Sales` |
| "2024-Budget" | `_2024_Budget` |
| "Summary & Notes" | `Summary__Notes` |
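The exact sanitization rule is internal to the server; a sketch consistent with the table above (spaces and hyphens become underscores, other punctuation is dropped, and names starting with a digit get a leading underscore):

```python
import re

def sanitize_sheet_name(name):
    s = re.sub(r"[ \-]", "_", name)      # spaces and hyphens -> underscore
    s = re.sub(r"[^0-9A-Za-z_]", "", s)  # drop remaining punctuation
    if s and s[0].isdigit():
        s = "_" + s                      # SQL identifiers can't start with a digit
    return s

print(sanitize_sheet_name("Q1 Sales"))         # Q1_Sales
print(sanitize_sheet_name("2024-Budget"))      # _2024_Budget
print(sanitize_sheet_name("Summary & Notes"))  # Summary__Notes
```

Use `describe_database` (below) rather than guessing, if you're unsure what a sheet's table name became.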
#### Discovering Available Sheets

```python
# Connect to a multi-sheet workbook
connect_database("workbook", "xlsx", "./data.xlsx")

# List all available tables (sheets)
describe_database("workbook")

# Get sample data from a specific sheet
get_table_sample("workbook", "Sheet1")
```
## Troubleshooting

For comprehensive troubleshooting guidance, see the Troubleshooting Guide. For common questions, check the FAQ.
## Roadmap

### Completed (v1.5.x)
- Graph Storage: DOT, GML, GraphML, and Mermaid files stored as directed multigraphs
- 14 Graph Tools: Node/edge CRUD, path finding, graph statistics, multi-format export
- Graph Validation: 13 automated checks for structural, semantic, property, and DAG issues
- Metadata-Preserving Export: Graph exports include all node/edge properties
- Mermaid Round-Trip: Parse and export Mermaid flowcharts with subgraphs
### Completed (v1.4.0)
- Tree Storage: TOML, JSON, YAML stored as navigable trees with full CRUD
- 9 Tree Tools: get_node, get_children, set_node, delete_node, list_keys, get_value, set_value, delete_key, export_structured
- Format Conversion: Export between TOML, JSON, and YAML
- Connection Summaries: Schema/tree overview returned on connect
- FastMCP v3: Updated tool registration for latest fastmcp
### Completed (v1.3.x)
- Memory-Safe Streaming: Chunked query execution with configurable limits
- Pre-Query Analysis: Resource estimation before execution
- Structured Logging: JSON logging with Prometheus metrics
### Completed (v1.1.0)
- Spreadsheet Formats: Excel (.xlsx/.xls), LibreOffice Calc (.ods) with full multi-sheet support
- Enhanced File Formats: XML, INI, TSV support
- Analytical Formats: Parquet, Feather, Arrow support
## Contributing

Contributions are welcome! Please read our Contributing Guidelines for details.
### Development Setup

```bash
git clone https://github.com/ChrisGVE/localdata-mcp.git
cd localdata-mcp
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
pip install -e ".[dev]"
pytest
```
## License

MIT License - see the LICENSE file for details.
## Links
- GitHub: localdata-mcp
- PyPI: localdata-mcp
- MCP Protocol: Model Context Protocol
- FastMCP: FastMCP Framework
## Tags

mcp model-context-protocol database postgresql mysql sqlite mongodb spreadsheet excel xlsx ods csv tsv json yaml toml xml ini parquet feather arrow graph graphml dot gml mermaid ai machine-learning data-integration python security performance
Made with care for the MCP Community