# LocalData MCP Server

A powerful, secure MCP server for local databases and structured text files with advanced security features and large dataset handling.
## ✨ Features

### 🗄️ Multi-Database Support
- SQL Databases: PostgreSQL, MySQL, SQLite
- Document Databases: MongoDB
- Structured Files: CSV, JSON, YAML, TOML

### 🔒 Advanced Security
- Path Security: Restricts file access to current working directory only
- SQL Injection Prevention: Parameterized queries and safe table identifiers
- Connection Limits: Maximum 10 concurrent database connections
- Input Validation: Comprehensive validation and sanitization

### 📊 Large Dataset Handling
- Query Buffering: Automatic buffering for results with 100+ rows
- Large File Support: 100MB+ files automatically use temporary SQLite storage
- Chunk Retrieval: Paginated access to large result sets
- Auto-Cleanup: 10-minute expiry with file modification detection
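The buffering behavior above can be pictured with a small sketch. This is not the server's actual code; the names (`buffer_result`, `get_chunk`) and in-memory dict are illustrative, but the thresholds (100-row buffering, 10-minute expiry, 1-indexed chunk access) follow the documented behavior:

```python
import time

BUFFER_TTL_SECONDS = 600       # 10-minute expiry
ROW_BUFFER_THRESHOLD = 100     # results with 100+ rows get buffered

_buffers = {}

def buffer_result(query_id, rows):
    """Buffer large results under an id; return small results directly."""
    if len(rows) < ROW_BUFFER_THRESHOLD:
        return {"rows": rows}
    _buffers[query_id] = {"rows": rows, "created": time.time()}
    return {"buffer_id": query_id, "total_rows": len(rows)}

def get_chunk(buffer_id, start_row, chunk_size):
    """Return rows [start_row, start_row + chunk_size) from a buffer (1-indexed)."""
    entry = _buffers.get(buffer_id)
    if entry is None or time.time() - entry["created"] > BUFFER_TTL_SECONDS:
        _buffers.pop(buffer_id, None)  # expired or unknown: drop and fail
        raise KeyError(f"buffer {buffer_id!r} not found or expired")
    return entry["rows"][start_row - 1 : start_row - 1 + chunk_size]

rows = [{"id": i} for i in range(250)]
info = buffer_result("analytics_demo", rows)
chunk = get_chunk(info["buffer_id"], 101, 100)  # second page of 100 rows
print(len(chunk))  # 100
```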

### 🛠️ Developer Experience
- Comprehensive Tools: 14 database operation tools
- Error Handling: Detailed, actionable error messages
- Thread Safety: Concurrent operation support
- Backward Compatible: All existing APIs preserved
## 🚀 Quick Start

### Installation

```bash
# Using pip
pip install localdata-mcp

# Using uv (recommended)
uv tool install localdata-mcp

# Development installation
git clone https://github.com/ChrisGVE/localdata-mcp.git
cd localdata-mcp
pip install -e .
```
### Configuration

Add to your MCP client configuration:

```json
{
  "mcpServers": {
    "localdata": {
      "command": "localdata-mcp",
      "env": {}
    }
  }
}
```
### Usage Examples

#### Connect to Databases

```python
# PostgreSQL
connect_database("analytics", "postgresql", "postgresql://user:pass@localhost/db")

# SQLite
connect_database("local", "sqlite", "./data.sqlite")

# CSV files
connect_database("csvdata", "csv", "./data.csv")

# JSON files
connect_database("config", "json", "./config.json")
```

#### Query Data

```python
# Execute queries with automatic result formatting
execute_query("analytics", "SELECT * FROM users LIMIT 50")

# Large result sets use buffering automatically
execute_query_json("analytics", "SELECT * FROM large_table")
```

#### Handle Large Results

```python
# Get chunked results for large datasets
get_query_chunk("analytics_1640995200_a1b2", 101, "100")

# Check buffer status
get_buffered_query_info("analytics_1640995200_a1b2")

# Manual cleanup
clear_query_buffer("analytics_1640995200_a1b2")
```
## 🔧 Available Tools

| Tool | Description | Use Case |
|---|---|---|
| `connect_database` | Connect to databases/files | Initial setup |
| `disconnect_database` | Close connections | Cleanup |
| `list_databases` | Show active connections | Status check |
| `execute_query` | Run SQL (Markdown output) | Small results |
| `execute_query_json` | Run SQL (JSON output) | Large results |
| `describe_database` | Show schema/structure | Exploration |
| `describe_table` | Show table details | Analysis |
| `get_table_sample` | Preview table data | Quick look |
| `get_table_sample_json` | Preview (JSON format) | Development |
| `find_table` | Locate tables by name | Navigation |
| `read_text_file` | Read structured files | File access |
| `get_query_chunk` | Paginated result access | Large data |
| `get_buffered_query_info` | Buffer status info | Monitoring |
| `clear_query_buffer` | Manual buffer cleanup | Management |
## 📋 Supported Data Sources

### SQL Databases

- PostgreSQL: Full support with connection pooling
- MySQL: Complete MySQL/MariaDB compatibility
- SQLite: Local file and in-memory databases

### Document Databases

- MongoDB: Collection queries and aggregation

### Structured Files

- CSV: Automatic SQLite conversion for large files
- JSON: Nested structure flattening
- YAML: Configuration file support
- TOML: Settings and config files
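The "nested structure flattening" above can be sketched as follows. This is an illustrative implementation of the general technique, not the server's exact code; the dot/bracket key notation is an assumption:

```python
def flatten(obj, prefix=""):
    """Flatten nested dicts/lists into a single level of
    dot/bracket-notation keys, so nested JSON can be queried
    like flat tabular columns."""
    flat = {}
    if isinstance(obj, dict):
        for key, value in obj.items():
            name = f"{prefix}.{key}" if prefix else key
            flat.update(flatten(value, name))
    elif isinstance(obj, list):
        for i, value in enumerate(obj):
            flat.update(flatten(value, f"{prefix}[{i}]"))
    else:
        flat[prefix] = obj  # scalar leaf: record it under the built-up key
    return flat

record = {"user": {"name": "Ada", "tags": ["admin", "dev"]}}
print(flatten(record))
# {'user.name': 'Ada', 'user.tags[0]': 'admin', 'user.tags[1]': 'dev'}
```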
## 🛡️ Security Features

### Path Security

```python
# ✅ Allowed - current working directory and subdirectories
"./data/users.csv"
"data/config.json"
"subdir/file.yaml"

# ❌ Blocked - parent directories and absolute paths outside the working directory
"../etc/passwd"
"../../sensitive.db"
"/etc/hosts"
```
### SQL Injection Prevention

```python
# ✅ Safe - validated identifiers and parameterized queries
describe_table("mydb", "users")  # validates the table name

# ❌ Blocked - malicious input is rejected
describe_table("mydb", "users; DROP TABLE users; --")
```
### Resource Limits

- Connection Limit: Maximum 10 concurrent connections
- File Size Threshold: 100MB triggers temporary storage
- Query Buffering: Automatic for 100+ row results
- Auto-Cleanup: Buffers expire after 10 minutes
## 📊 Performance & Scalability

### Large File Handling

- Files over 100MB automatically use temporary SQLite storage
- Memory-efficient streaming for large datasets
- Automatic cleanup of temporary files
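The spill-to-SQLite strategy can be sketched with the standard library. This is an assumption-labeled illustration of the technique, not the server's code; the `data` table name and batch size are made up:

```python
import csv
import os
import sqlite3
import tempfile

def csv_to_sqlite(csv_path, batch_size=10_000):
    """Stream a CSV into a temporary SQLite database in fixed-size
    batches, so the whole file never has to fit in memory."""
    fd, db_path = tempfile.mkstemp(suffix=".sqlite")
    os.close(fd)
    conn = sqlite3.connect(db_path)
    with open(csv_path, newline="") as f:
        reader = csv.reader(f)
        header = next(reader)
        cols = ", ".join(f'"{c}"' for c in header)
        conn.execute(f"CREATE TABLE data ({cols})")
        insert = "INSERT INTO data VALUES ({})".format(", ".join("?" for _ in header))
        batch = []
        for row in reader:
            batch.append(row)
            if len(batch) >= batch_size:   # flush a full batch
                conn.executemany(insert, batch)
                batch.clear()
        if batch:                          # flush the remainder
            conn.executemany(insert, batch)
    conn.commit()
    return conn

# Usage with a small sample file:
sample = os.path.join(tempfile.gettempdir(), "sample.csv")
with open(sample, "w", newline="") as f:
    csv.writer(f).writerows([["id", "name"], ["1", "Ada"], ["2", "Bob"]])
conn = csv_to_sqlite(sample)
print(conn.execute("SELECT COUNT(*) FROM data").fetchone()[0])  # 2
```

Once the rows are in SQLite, ordinary SQL (with `LIMIT`/`OFFSET`) handles pagination without rereading the source file.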
### Query Optimization

- Results with 100+ rows automatically use buffering system
- Chunk-based retrieval for large datasets
- File modification detection for cache invalidation
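One simple way to implement modification-based invalidation, shown here as a sketch rather than the server's actual mechanism, is to key cached entries on the file's resolved path plus its mtime, so any edit changes the key and orphans the stale entry:

```python
import os
import tempfile

def cache_key(path):
    """Key cache entries on (resolved path, mtime in ns); a modified
    file produces a new key, invalidating the old cached result."""
    stat = os.stat(path)
    return (os.path.realpath(path), stat.st_mtime_ns)

with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False) as f:
    f.write("id\n1\n")
    path = f.name

before = cache_key(path)
# Simulate an edit by bumping the mtime explicitly.
os.utime(path, ns=(before[1] + 1_000_000, before[1] + 1_000_000))
print(cache_key(path) != before)  # True - modification detected
```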
### Concurrency

- Thread-safe connection management
- Concurrent query execution support
- Resource pooling and limits
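A minimal sketch of thread-safe connection management with the 10-connection cap described above (the class and method names are hypothetical, not the server's API):

```python
import threading

class ConnectionManager:
    """Registry of named connections guarded by a lock and capped
    at a fixed maximum, mirroring the documented 10-connection limit."""
    MAX_CONNECTIONS = 10

    def __init__(self):
        self._lock = threading.Lock()
        self._connections = {}

    def add(self, name, conn):
        with self._lock:  # check-then-insert must be atomic
            if len(self._connections) >= self.MAX_CONNECTIONS:
                raise RuntimeError("connection limit (10) reached")
            self._connections[name] = conn

    def remove(self, name):
        with self._lock:
            return self._connections.pop(name, None)

manager = ConnectionManager()
for i in range(10):
    manager.add(f"db{i}", object())
try:
    manager.add("db10", object())  # eleventh connection is rejected
except RuntimeError as e:
    print("rejected:", e)
```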
## 🧪 Testing & Quality

### ✅ 100% Test Coverage
- 100+ comprehensive test cases
- Security vulnerability testing
- Performance benchmarking
- Edge case validation

### 🔒 Security Validated
- Path traversal prevention
- SQL injection protection
- Resource exhaustion testing
- Malicious input handling

### ⚡ Performance Tested
- Large file processing
- Concurrent connection handling
- Memory usage optimization
- Query response times

## 🔄 API Compatibility

All existing MCP tool signatures remain 100% backward compatible. New functionality is additive only:
- ✅ All original tools work unchanged
- ✅ Enhanced responses with additional metadata
- ✅ New buffering tools for large datasets
- ✅ Improved error messages and validation
## 📖 Examples

### Basic Database Operations

```python
# Connect to SQLite
connect_database("sales", "sqlite", "./sales.db")

# Explore structure
describe_database("sales")
describe_table("sales", "orders")

# Query data
execute_query("sales", "SELECT product, SUM(amount) FROM orders GROUP BY product")
```

### Large Dataset Processing

```python
# Connect to a large CSV
connect_database("bigdata", "csv", "./million_records.csv")

# Queries over large results return buffer info
result = execute_query_json("bigdata", "SELECT * FROM data WHERE category = 'A'")

# Access results in chunks
chunk = get_query_chunk("bigdata_1640995200_a1b2", 1, "1000")
```

### Multi-Database Analysis

```python
# Connect multiple sources
connect_database("postgres", "postgresql", "postgresql://localhost/prod")
connect_database("config", "yaml", "./config.yaml")
connect_database("logs", "json", "./logs.json")

# Query across sources (in application logic)
user_data = execute_query("postgres", "SELECT * FROM users")
config = read_text_file("./config.yaml", "yaml")
```
## 🚧 Roadmap

- Enhanced File Formats: Excel, Parquet support
- Caching Layer: Configurable query result caching
- Connection Pooling: Advanced connection management
- Streaming APIs: Real-time data processing
- Monitoring Tools: Connection and performance metrics
## 🤝 Contributing

Contributions welcome! Please read our Contributing Guidelines for details.

### Development Setup

```bash
git clone https://github.com/ChrisGVE/localdata-mcp.git
cd localdata-mcp
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
pip install -e ".[dev]"
pytest
```
## 📄 License

MIT License - see the LICENSE file for details.
## 🔗 Links

- GitHub: localdata-mcp
- PyPI: localdata-mcp
- MCP Protocol: Model Context Protocol
- FastMCP: FastMCP Framework
## 📚 Additional Resources

- FAQ: Common questions and troubleshooting
- Troubleshooting Guide: Comprehensive problem resolution
- Advanced Examples: Production-ready usage patterns
- Blog Post: Technical deep dive and use cases
## 🤔 Need Help?

- Issues: GitHub Issues
- Discussions: GitHub Discussions
- Email: Available in GitHub profile
- Community: Join MCP community forums
## 🏷️ Tags

mcp model-context-protocol database postgresql mysql sqlite mongodb csv json yaml toml ai machine-learning data-integration python security performance
Made with ❤️ for the MCP Community