
LocalData MCP Server


A comprehensive MCP server for databases, spreadsheets, structured data files, and directed graphs with security features, performance optimization, and extensive format support.


What's New in v1.5.x

Graph Storage for Directed Graphs

DOT, GML, GraphML, and Mermaid files are stored as directed multigraphs backed by SQLite. Nodes and edges carry typed metadata properties, and the full graph is available for algorithmic analysis via NetworkX.
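
A minimal sketch of that model using NetworkX directly — the server manages this for you, and the node IDs and property names below are illustrative:

import networkx as nx

# Directed multigraph: parallel edges between the same node pair
# are distinguished by their key (used here for the edge label).
g = nx.MultiDiGraph()

# Nodes carry typed metadata properties.
g.add_node("machine_learning", label="Machine Learning", field="Computer Science")
g.add_node("deep_learning", label="Deep Learning")

# An edge keyed by its label.
g.add_edge("machine_learning", "deep_learning", key="broader")

# The full graph is available for algorithmic analysis.
print(nx.is_directed_acyclic_graph(g))                           # True
print(nx.shortest_path(g, "machine_learning", "deep_learning"))  # ['machine_learning', 'deep_learning']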

14 Graph Tools

Tool Purpose
set_node Create or update a graph node
delete_node Remove a node and cascade edges/properties
get_node Inspect a node and its properties
add_edge Create an edge (auto-creates endpoints)
remove_edge Remove an edge by source/target/label
get_edges List edges for a node
get_neighbors List adjacent nodes by direction
get_value Read a property from a node
set_value Set a typed property on a node
delete_key Remove a property
list_keys List properties on a node
find_path Shortest or all paths between two nodes
get_graph_stats Node/edge counts, density, DAG check, degree stats
export_graph Export as DOT, GML, GraphML, or Mermaid

Graph Validation Warnings

Import and edit operations automatically run 13 checks across four categories of issues:

  • Structural: self-loops, duplicate edges, orphan nodes, missing edge labels, cycles, disconnected components
  • Semantic: contradictory edges (A→B and B→A with same label), conflicting parallel labels on the same pair
  • Property: duplicate node ID casing, missing common properties, near-duplicate labels (fuzzy matching)
  • DAG-specific: redundant transitive edges, diamond ambiguity (unintended polyhierarchy)

Warnings are returned in the "warnings" key of tool responses. Expensive checks (cycle detection, transitive reduction) are skipped automatically for graphs exceeding 10,000 nodes.

Metadata-Preserving Export

Graph exports (GraphML, GML, DOT) now include all node and edge metadata properties, not just labels and weights.
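
For instance, a DOT export now carries node and edge properties as attributes. A sketch of what that looks like (the exact output formatting may differ):

# Export includes metadata properties, not just labels and weights
dot_text = export_graph("taxonomy", "dot")
# Illustrative output:
# digraph {
#   machine_learning [label="Machine Learning", field="Computer Science"];
#   deep_learning [label="Deep Learning"];
#   machine_learning -> deep_learning [label="broader"];
# }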

Features

Multi-Database Support

  • SQL Databases: PostgreSQL, MySQL, SQLite, DuckDB
  • Modern Databases: MongoDB, Redis, Elasticsearch, InfluxDB, Neo4j, CouchDB
  • Spreadsheets: Excel (.xlsx/.xls), LibreOffice Calc (.ods), Apple Numbers (.numbers)
  • Structured Files: CSV, TSV, JSON, YAML, TOML, XML, INI
  • Graph Files: DOT (Graphviz), GML, GraphML, Mermaid flowcharts
  • Analytical Formats: Parquet, Feather, Arrow, HDF5

Advanced Security

  • Path Security: Restricts file access to current working directory only
  • SQL Injection Prevention: Parameterized queries and safe table identifiers
  • Connection Limits: Maximum 10 concurrent database connections
  • Input Validation: Comprehensive validation and sanitization

Large Dataset Handling

  • Query Buffering: Automatic buffering for results with 100+ rows
  • Large File Support: 100MB+ files automatically use temporary SQLite storage
  • Chunk Retrieval: Paginated access to large result sets
  • Auto-Cleanup: 10-minute expiry with file modification detection

Developer Experience

  • Clean Tool Surface: Core database tools, 9 tree tools, and 14 graph tools
  • Error Handling: Detailed, actionable error messages
  • Thread Safety: Concurrent operation support
  • Backward Compatible: All existing APIs preserved

Quick Start

Installation

# Using pip
pip install localdata-mcp

# Using uv (recommended)
uv tool install localdata-mcp

# Development installation
git clone https://github.com/ChrisGVE/localdata-mcp.git
cd localdata-mcp
pip install -e .

Configuration

Add to your MCP client configuration:

{
  "mcpServers": {
    "localdata": {
      "command": "localdata-mcp",
      "env": {}
    }
  }
}
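
If you installed with uv, an equivalent configuration can run the server through uvx (assuming uvx can resolve the localdata-mcp package on your machine):

{
  "mcpServers": {
    "localdata": {
      "command": "uvx",
      "args": ["localdata-mcp"],
      "env": {}
    }
  }
}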

Docker Usage: See Docker Usage Guide for container deployment and configuration.

Usage Examples

Connect to Databases

# PostgreSQL
connect_database("analytics", "postgresql", "postgresql://user:pass@localhost/db")

# SQLite
connect_database("local", "sqlite", "./data.sqlite")

# CSV Files
connect_database("csvdata", "csv", "./data.csv")

# JSON Files
connect_database("config", "json", "./config.json")

# Excel Spreadsheets (all sheets)
connect_database("sales", "xlsx", "./sales_data.xlsx")

# Excel with specific sheet
connect_database("q1data", "xlsx", "./quarterly.xlsx?sheet=Q1_Sales")

# LibreOffice Calc
connect_database("budget", "ods", "./budget_2024.ods")

# Tab-separated values
connect_database("exports", "tsv", "./export_data.tsv")

# XML structured data
connect_database("config_xml", "xml", "./config.xml")

# INI configuration files
connect_database("settings", "ini", "./app.ini")

# Analytical formats
connect_database("analytics", "parquet", "./data.parquet")
connect_database("features", "feather", "./features.feather")
connect_database("vectors", "arrow", "./vectors.arrow")

Work with Structured Files (TOML, JSON, YAML)

# Connect — returns tree summary (root nodes, depths, property counts)
connect_database("cfg", "toml", "./config.toml")

# Navigate the tree
get_node("cfg")                          # Root summary
get_node("cfg", "server")               # Node properties + children info
get_children("cfg", "server")           # List child nodes

# Read and write properties
get_value("cfg", "server", "port")       # → 8080
set_value("cfg", "server", "port", "9090")
set_value("cfg", "monitoring.alerts", "enabled", "true")  # auto-creates node

# Export back to any format
export_structured("cfg", "toml")         # Modified TOML output
export_structured("cfg", "json")         # Convert to JSON

Query Data

# Execute queries with automatic result formatting
execute_query("analytics", "SELECT * FROM users LIMIT 50")

# Large result sets use buffering automatically
execute_query_json("analytics", "SELECT * FROM large_table")

Handle Large Results

# Get chunked results for large datasets
get_query_chunk("analytics_1640995200_a1b2", 101, "100")

# Check buffer status
get_buffered_query_info("analytics_1640995200_a1b2")

# Manual cleanup
clear_query_buffer("analytics_1640995200_a1b2")

Available Tools

Core Database Tools

Tool Description Use Case
connect_database Connect to databases/files Initial setup
disconnect_database Close connections Cleanup
list_databases Show active connections Status check
execute_query Run SQL with automatic chunking All query needs
next_chunk Get next chunk of large result sets Large data
describe_database Show database schema Exploration
describe_table Show table structure Analysis
find_table Locate tables by name Navigation

Tree Tools (TOML, JSON, YAML)

These tools are available when connected to a structured data file.

Tool Description Use Case
get_node View node properties and children info Navigation
get_children List child nodes with pagination Navigation
set_node Create a node (auto-creates ancestors) Structure edit
delete_node Remove node and all descendants Structure edit
list_keys List key-value pairs at a node Inspection
get_value Read a single property value Inspection
set_value Create or update a property Data edit
delete_key Remove a property from a node Data edit
export_structured Export tree as TOML, JSON, or YAML Export

Graph Tools (DOT, GML, GraphML, Mermaid)

These tools are available when connected to a graph file.

Tool Description Use Case
set_node Create or update a graph node Structure edit
delete_node Remove node, cascade edges and properties Structure edit
get_node Inspect a node's label and properties Navigation
add_edge Create a directed edge between nodes Structure edit
remove_edge Remove an edge by source/target/label Structure edit
get_edges List edges for a node Navigation
get_neighbors List adjacent nodes (in/out/both) Navigation
get_value Read a typed property from a node Inspection
set_value Set a typed property on a node Data edit
delete_key Remove a property from a node Data edit
list_keys List properties on a node with pagination Inspection
find_path Find shortest or all paths between nodes Analysis
get_graph_stats Node/edge counts, density, DAG test Analysis
export_graph Export as DOT, GML, GraphML, or Mermaid Export

Working with Graphs

# Connect to a graph file
connect_database("taxonomy", "graphml", "./knowledge_graph.graphml")
connect_database("flow", "dot", "./pipeline.dot")
connect_database("diagram", "mermaid", "./architecture.mmd")

# Navigate the graph
get_node("taxonomy", "machine_learning")
get_neighbors("taxonomy", "machine_learning", direction="out")
get_edges("taxonomy", node_id="machine_learning")

# Edit nodes and edges
set_node("taxonomy", "deep_learning", label="Deep Learning")
add_edge("taxonomy", "machine_learning", "deep_learning", label="broader")
set_value("taxonomy", "deep_learning", "field", "Computer Science")

# Algorithmic analysis
find_path("taxonomy", "statistics", "neural_networks")
get_graph_stats("taxonomy")

# Export to any supported format (includes metadata)
export_graph("taxonomy", "graphml")
export_graph("taxonomy", "mermaid")

Mutation and import operations return validation warnings when issues are detected:

# Adding a self-loop returns a warning
add_edge("taxonomy", "A", "A", label="related")
# → {"source": "A", "target": "A", ..., "warnings": [{"code": "self_loop", ...}]}

# Importing a file with issues reports them all
connect_database("g", "dot", "./messy_graph.dot")
# → {"node_count": 42, ..., "warnings": [{"code": "orphan_nodes", ...}, ...]}

Supported Data Sources

Detailed Connection Guide: See Database Connections Guide for setup instructions, connection strings, and security practices.

SQL Databases

  • PostgreSQL: Full support with connection pooling
  • MySQL: Complete MySQL/MariaDB compatibility
  • SQLite: Local file and in-memory databases
  • DuckDB: High-performance analytical SQL database

Modern Databases

  • MongoDB: Document store with collection queries and aggregation
  • Redis: High-performance key-value store
  • Elasticsearch: Full-text search and analytics engine
  • InfluxDB: Time-series database for metrics and IoT data
  • Neo4j: Graph database for relationship queries
  • CouchDB: Document-oriented database with HTTP API

Structured Files

Spreadsheet Formats

  • Excel (.xlsx, .xls): Full multi-sheet support with automatic table creation
  • LibreOffice Calc (.ods): Complete ODS support with sheet handling
  • Apple Numbers (.numbers): Native support for Numbers documents
  • Multi-sheet handling: Each sheet becomes a separate queryable table

Text-Based Formats

  • CSV: Comma-separated values; large files are automatically converted to SQLite
  • TSV: Tab-separated values with same features as CSV
  • JSON: Tree storage with node navigation and CRUD (v1.4.0)
  • YAML: Tree storage with multi-document support (v1.4.0)
  • TOML: Tree storage with array-of-tables support (v1.4.0)
  • XML: Structured XML document parsing
  • INI: Configuration file format support

Graph Formats (v1.5.0)

  • DOT (Graphviz): Directed and undirected graph descriptions
  • GML: Graph Modelling Language with nested attributes
  • GraphML: XML-based format with typed node/edge properties
  • Mermaid: Flowchart syntax with subgraphs, edge labels, and node shapes

Analytical Formats

  • Parquet: High-performance columnar data format
  • Feather: Fast binary format for data interchange
  • Arrow: In-memory columnar format support
  • HDF5: Hierarchical data format for scientific computing

Security Features

Path Security

# ✅ Allowed - current directory and subdirectories
"./data/users.csv"
"data/config.json"
"subdir/file.yaml"

# ❌ Blocked - parent directory access
"../etc/passwd"
"../../sensitive.db"
"/etc/hosts"

SQL Injection Prevention

# ✅ Safe - parameterized queries
describe_table("mydb", "users")  # Validates table name

# ❌ Blocked - malicious input
describe_table("mydb", "users; DROP TABLE users; --")

Resource Limits

  • Connection Limit: Maximum 10 concurrent connections
  • File Size Threshold: 100MB triggers temporary storage
  • Query Buffering: Automatic for 100+ row results
  • Auto-Cleanup: Buffers expire after 10 minutes

Performance & Scalability

Large File Handling

  • Files over 100MB automatically use temporary SQLite storage
  • Memory-efficient streaming for large datasets
  • Automatic cleanup of temporary files

Query Optimization

  • Results with 100+ rows automatically use the buffering system
  • Chunk-based retrieval for large datasets
  • File modification detection for cache invalidation (sketched below)
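
A minimal sketch of the mtime-based idea behind that last point (illustrative; the server's actual mechanism may differ):

import os

def is_buffer_stale(path: str, mtime_at_buffer_creation: float) -> bool:
    """Invalidate a cached result once its source file changes."""
    return os.path.getmtime(path) != mtime_at_buffer_creation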

Concurrency

  • Thread-safe connection management
  • Concurrent query execution support
  • Resource pooling and limits

Testing & Quality

Comprehensive Test Coverage

  • 68% test coverage with 500+ test cases
  • Import error handling and graceful degradation
  • Security vulnerability testing
  • Performance benchmarking with large datasets
  • Modern database connection testing

Security Validated

  • Path traversal prevention
  • SQL injection protection
  • Resource exhaustion testing
  • Malicious input handling

Performance Tested

  • Large file processing
  • Concurrent connection handling
  • Memory usage optimization
  • Query response times

API Compatibility

All existing MCP tool signatures remain 100% backward compatible. New functionality is additive only:

  • All original tools work unchanged
  • Enhanced responses with additional metadata
  • New buffering tools for large datasets
  • Improved error messages and validation

Examples

Production Examples: See Advanced Examples for production-ready usage patterns and complex scenarios.

Basic Database Operations

# Connect to SQLite
connect_database("sales", "sqlite", "./sales.db")

# Explore structure
describe_database("sales")
describe_table("sales", "orders")

# Query data
execute_query("sales", "SELECT product, SUM(amount) FROM orders GROUP BY product")

Large Dataset Processing

# Connect to large CSV
connect_database("bigdata", "csv", "./million_records.csv")

# Query returns buffer info for large results
result = execute_query_json("bigdata", "SELECT * FROM data WHERE category = 'A'")

# Access results in chunks
chunk = get_query_chunk("bigdata_1640995200_a1b2", 1, "1000")

Multi-Database Analysis

# Connect multiple sources
connect_database("postgres", "postgresql", "postgresql://localhost/prod")
connect_database("config", "yaml", "./config.yaml")
connect_database("logs", "json", "./logs.json")

# Query across sources (in application logic)
user_data = execute_query("postgres", "SELECT * FROM users")
config = read_text_file("./config.yaml", "yaml")

Multi-Sheet Spreadsheet Handling

LocalData MCP Server provides comprehensive support for multi-sheet spreadsheets (Excel and LibreOffice Calc):

Automatic Multi-Sheet Processing

# Connect to Excel file - all sheets become separate tables
connect_database("workbook", "xlsx", "./financial_data.xlsx")

# Query specific sheet (table names are sanitized sheet names)
execute_query("workbook", "SELECT * FROM Q1_Sales")
execute_query("workbook", "SELECT * FROM Q2_Budget")
execute_query("workbook", "SELECT * FROM Annual_Summary")

Single Sheet Selection

# Connect to specific sheet only using ?sheet=SheetName syntax
connect_database("q1only", "xlsx", "./financial_data.xlsx?sheet=Q1 Sales")

# The data is available as the default table
execute_query("q1only", "SELECT * FROM data")

Sheet Name Sanitization

Sheet names are automatically sanitized for SQL compatibility (a reconstruction of the likely rule appears after the table):

Original Sheet Name SQL Table Name
"Q1 Sales" Q1_Sales
"2024-Budget" _2024_Budget
"Summary & Notes" Summary__Notes

Discovering Available Sheets

# Connect to multi-sheet workbook
connect_database("workbook", "xlsx", "./data.xlsx")

# List all available tables (sheets)
describe_database("workbook")

# Get sample data from specific sheet
get_table_sample("workbook", "Sheet1")

Troubleshooting

For comprehensive troubleshooting guidance, see Troubleshooting Guide. For common questions, check the FAQ.

Roadmap

Completed (v1.5.x)

  • Graph Storage: DOT, GML, GraphML, and Mermaid files stored as directed multigraphs
  • 14 Graph Tools: Node/edge CRUD, path finding, graph statistics, multi-format export
  • Graph Validation: 13 automated checks for structural, semantic, property, and DAG issues
  • Metadata-Preserving Export: Graph exports include all node/edge properties
  • Mermaid Round-Trip: Parse and export Mermaid flowcharts with subgraphs

Completed (v1.4.0)

  • Tree Storage: TOML, JSON, YAML stored as navigable trees with full CRUD
  • 9 Tree Tools: get_node, get_children, set_node, delete_node, list_keys, get_value, set_value, delete_key, export_structured
  • Format Conversion: Export between TOML, JSON, and YAML
  • Connection Summaries: Schema/tree overview returned on connect
  • FastMCP v3: Updated tool registration for latest fastmcp

Completed (v1.3.x)

  • Memory-Safe Streaming: Chunked query execution with configurable limits
  • Pre-Query Analysis: Resource estimation before execution
  • Structured Logging: JSON logging with Prometheus metrics

Completed (v1.1.0)

  • Spreadsheet Formats: Excel (.xlsx/.xls), LibreOffice Calc (.ods) with full multi-sheet support
  • Enhanced File Formats: XML, INI, TSV support
  • Analytical Formats: Parquet, Feather, Arrow support

Contributing

Contributions welcome! Please read our Contributing Guidelines for details.

Development Setup

git clone https://github.com/ChrisGVE/localdata-mcp.git
cd localdata-mcp
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
pip install -e ".[dev]"
pytest

License

MIT License - see the LICENSE file for details.



Made with care for the MCP Community
