

LocalData MCP Server

License: MIT · Python 3.8+ · Built on FastMCP

An MCP server for local databases and structured text files, combining hardened security controls with large-dataset handling.

✨ Features

🗄️ Multi-Database Support

  • SQL Databases: PostgreSQL, MySQL, SQLite
  • Document Databases: MongoDB
  • Structured Files: CSV, JSON, YAML, TOML

🔒 Advanced Security

  • Path Security: Restricts file access to current working directory only
  • SQL Injection Prevention: Parameterized queries and safe table identifiers
  • Connection Limits: Maximum 10 concurrent database connections
  • Input Validation: Comprehensive validation and sanitization

📊 Large Dataset Handling

  • Query Buffering: Automatic buffering for results with 100+ rows
  • Large File Support: 100MB+ files automatically use temporary SQLite storage
  • Chunk Retrieval: Paginated access to large result sets
  • Auto-Cleanup: 10-minute expiry with file modification detection
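The buffering and expiry behavior above can be sketched in Python. `QueryBuffer`, its method names, and the 1-based chunk convention are illustrative stand-ins, not LocalData's actual internals:

```python
import time

BUFFER_TTL = 600  # 10-minute expiry, per the feature list above

class QueryBuffer:
    """In-memory result buffer with TTL and source-file staleness checks."""

    def __init__(self, rows, source_mtime):
        self.rows = rows
        self.created = time.monotonic()
        self.source_mtime = source_mtime

    def expired(self, current_mtime):
        # Expire on age *or* when the underlying file has been modified
        too_old = time.monotonic() - self.created > BUFFER_TTL
        modified = current_mtime != self.source_mtime
        return too_old or modified

    def chunk(self, start_row, count):
        # start_row is 1-based, matching the get_query_chunk usage examples
        return self.rows[start_row - 1 : start_row - 1 + count]
```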

🛠️ Developer Experience

  • Comprehensive Tools: 14 database operation tools
  • Error Handling: Detailed, actionable error messages
  • Thread Safety: Concurrent operation support
  • Backward Compatible: All existing APIs preserved

🚀 Quick Start

Installation

# Using pip
pip install localdata-mcp

# Using uv (recommended)
uv tool install localdata-mcp

# Development installation
git clone https://github.com/ChrisGVE/localdata-mcp.git
cd localdata-mcp
pip install -e .

Configuration

Add to your MCP client configuration:

{
  "mcpServers": {
    "localdata": {
      "command": "localdata-mcp",
      "env": {}
    }
  }
}

Usage Examples

Connect to Databases

# PostgreSQL
connect_database("analytics", "postgresql", "postgresql://user:pass@localhost/db")

# SQLite
connect_database("local", "sqlite", "./data.sqlite")

# CSV Files
connect_database("csvdata", "csv", "./data.csv")

# JSON Files
connect_database("config", "json", "./config.json")

Query Data

# Execute queries with automatic result formatting
execute_query("analytics", "SELECT * FROM users LIMIT 50")

# Large result sets use buffering automatically
execute_query_json("analytics", "SELECT * FROM large_table")

Handle Large Results

# Get chunked results for large datasets
get_query_chunk("analytics_1640995200_a1b2", 101, "100")

# Check buffer status
get_buffered_query_info("analytics_1640995200_a1b2")

# Manual cleanup
clear_query_buffer("analytics_1640995200_a1b2")
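A typical client pages through a buffer until it is exhausted. The helper below is a hypothetical sketch; `fetch` stands in for a call to the `get_query_chunk` tool:

```python
def iter_chunks(fetch, buffer_id, chunk_size=100):
    """Yield successive row chunks until `fetch` returns an empty batch.

    `fetch(buffer_id, start_row, count)` is a placeholder for the
    get_query_chunk tool; start_row is 1-based and count is a string,
    matching the examples above.
    """
    start = 1
    while True:
        rows = fetch(buffer_id, start, str(chunk_size))
        if not rows:
            return
        yield rows
        start += len(rows)
```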

🔧 Available Tools

Tool                     Description                   Use Case
connect_database         Connect to databases/files    Initial setup
disconnect_database      Close connections             Cleanup
list_databases           Show active connections       Status check
execute_query            Run SQL (markdown output)     Small results
execute_query_json       Run SQL (JSON output)         Large results
describe_database        Show schema/structure         Exploration
describe_table           Show table details            Analysis
get_table_sample         Preview table data            Quick look
get_table_sample_json    Preview (JSON format)         Development
find_table               Locate tables by name         Navigation
read_text_file           Read structured files         File access
get_query_chunk          Paginated result access       Large data
get_buffered_query_info  Buffer status info            Monitoring
clear_query_buffer       Manual buffer cleanup         Management

📋 Supported Data Sources

SQL Databases

  • PostgreSQL: Full support with connection pooling
  • MySQL: Complete MySQL/MariaDB compatibility
  • SQLite: Local file and in-memory databases

Document Databases

  • MongoDB: Collection queries and aggregation

Structured Files

  • CSV: Large file automatic SQLite conversion
  • JSON: Nested structure flattening
  • YAML: Configuration file support
  • TOML: Settings and config files
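The nested-structure flattening described for JSON can be illustrated with a small recursive helper; the dot-separated column naming is an assumption about the output format, not a documented guarantee:

```python
def flatten(obj, prefix=""):
    """Flatten nested dicts into a single level of dot-separated keys."""
    flat = {}
    for key, value in obj.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, name + "."))
        else:
            flat[name] = value
    return flat
```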

🛡️ Security Features

Path Security

# ✅ Allowed - current directory and subdirectories
"./data/users.csv"
"data/config.json"
"subdir/file.yaml"

# ❌ Blocked - parent directory access
"../etc/passwd"
"../../sensitive.db"
"/etc/hosts"

SQL Injection Prevention

# ✅ Safe - parameterized queries
describe_table("mydb", "users")  # Validates table name

# ❌ Blocked - malicious input
describe_table("mydb", "users; DROP TABLE users; --")

Resource Limits

  • Connection Limit: Maximum 10 concurrent connections
  • File Size Threshold: 100MB triggers temporary storage
  • Query Buffering: Automatic for 100+ row results
  • Auto-Cleanup: Buffers expire after 10 minutes

📊 Performance & Scalability

Large File Handling

  • Files over 100MB automatically use temporary SQLite storage
  • Memory-efficient streaming for large datasets
  • Automatic cleanup of temporary files
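The spill-to-SQLite step can be sketched with the standard library alone, streaming one row at a time so memory use stays bounded. The function name is illustrative, and the real implementation likely differs:

```python
import csv
import os
import sqlite3
import tempfile

def spill_csv_to_sqlite(csv_path):
    """Stream a CSV into a temporary SQLite database, row by row."""
    fd, db_path = tempfile.mkstemp(suffix=".sqlite")
    os.close(fd)
    with open(csv_path, newline="") as f, sqlite3.connect(db_path) as conn:
        reader = csv.reader(f)
        header = next(reader)
        cols = ", ".join(f'"{c}"' for c in header)
        marks = ", ".join("?" for _ in header)
        conn.execute(f"CREATE TABLE data ({cols})")
        # executemany consumes the reader lazily; rows never all sit in memory
        conn.executemany(f"INSERT INTO data VALUES ({marks})", reader)
    return db_path
```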

Query Optimization

  • Results with 100+ rows automatically use buffering system
  • Chunk-based retrieval for large datasets
  • File modification detection for cache invalidation

Concurrency

  • Thread-safe connection management
  • Concurrent query execution support
  • Resource pooling and limits
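The 10-connection cap and thread-safe management can be sketched together; the class name and error types below are hypothetical:

```python
import threading

class ConnectionRegistry:
    """Thread-safe name-to-connection map with a hard cap (10 by default)."""

    def __init__(self, limit=10):
        self._lock = threading.Lock()
        self._limit = limit
        self._connections = {}

    def add(self, name, conn):
        with self._lock:
            if name in self._connections:
                raise ValueError(f"connection {name!r} already exists")
            if len(self._connections) >= self._limit:
                raise RuntimeError(f"connection limit ({self._limit}) reached")
            self._connections[name] = conn

    def remove(self, name):
        with self._lock:
            return self._connections.pop(name, None)
```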

🧪 Testing & Quality

✅ 100% Test Coverage

  • 100+ comprehensive test cases
  • Security vulnerability testing
  • Performance benchmarking
  • Edge case validation

🔒 Security Validated

  • Path traversal prevention
  • SQL injection protection
  • Resource exhaustion testing
  • Malicious input handling

⚡ Performance Tested

  • Large file processing
  • Concurrent connection handling
  • Memory usage optimization
  • Query response times

🔄 API Compatibility

All existing MCP tool signatures remain 100% backward compatible. New functionality is additive only:

  • ✅ All original tools work unchanged
  • ✅ Enhanced responses with additional metadata
  • ✅ New buffering tools for large datasets
  • ✅ Improved error messages and validation

📖 Examples

Basic Database Operations

# Connect to SQLite
connect_database("sales", "sqlite", "./sales.db")

# Explore structure
describe_database("sales")
describe_table("sales", "orders")

# Query data
execute_query("sales", "SELECT product, SUM(amount) FROM orders GROUP BY product")

Large Dataset Processing

# Connect to large CSV
connect_database("bigdata", "csv", "./million_records.csv")

# Query returns buffer info for large results
result = execute_query_json("bigdata", "SELECT * FROM data WHERE category = 'A'")

# Access results in chunks
chunk = get_query_chunk("bigdata_1640995200_a1b2", 1, "1000")

Multi-Database Analysis

# Connect multiple sources
connect_database("postgres", "postgresql", "postgresql://localhost/prod")
connect_database("config", "yaml", "./config.yaml")
connect_database("logs", "json", "./logs.json")

# Query across sources (in application logic)
user_data = execute_query("postgres", "SELECT * FROM users")
config = read_text_file("./config.yaml", "yaml")

🚧 Roadmap

  • Enhanced File Formats: Excel, Parquet support
  • Caching Layer: Configurable query result caching
  • Connection Pooling: Advanced connection management
  • Streaming APIs: Real-time data processing
  • Monitoring Tools: Connection and performance metrics

🤝 Contributing

Contributions welcome! Please read our Contributing Guidelines for details.

Development Setup

git clone https://github.com/ChrisGVE/localdata-mcp.git
cd localdata-mcp
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
pip install -e ".[dev]"
pytest

📄 License

MIT License - see the LICENSE file for details.

🏷️ Tags

mcp model-context-protocol database postgresql mysql sqlite mongodb csv json yaml toml ai machine-learning data-integration python security performance


Made with ❤️ for the MCP Community

