localdata-mcp

MCP server for databases, spreadsheets, structured files, and directed graphs.

Maintainers

chrisgve

Project description

LocalData MCP Server

LocalData MCP gives LLM agents access to local and remote data — databases, files, graphs, and structured documents — along with a full data science toolkit for analysis and modeling. It exposes 52 MCP tools across 13 database types and 20+ file formats, with memory-bounded streaming so agents can work safely on large datasets without exceeding available RAM.

Quick Start

# Install permanently
uv tool install localdata-mcp

# Or run directly without installing
uvx localdata-mcp

First-run note: Data science dependencies (scipy, scikit-learn, statsmodels, geopandas) total around 200 MB and are downloaded on first use. Subsequent starts reuse the cache. If your MCP client times out on the first launch, reconnect — the next start will be immediate.

Add to your MCP client configuration:

{
  "mcpServers": {
    "localdata": {
      "command": "localdata-mcp"
    }
  }
}

For uvx (no permanent install):

{
  "mcpServers": {
    "localdata": {
      "command": "uvx",
      "args": ["localdata-mcp"]
    }
  }
}

Then connect to any supported source and start querying:

connect_database("sales", "postgresql", "postgresql://user:pass@localhost/db")
execute_query("sales", "SELECT product, SUM(amount) FROM orders GROUP BY product")

connect_database("data", "csv", "./records.csv")
analyze_hypothesis_test("data", "SELECT amount, region FROM data", column="amount", group_column="region")

Feature Overview

Core Database (8 tools)

Connect, query, and inspect databases and files. All queries execute within configurable memory limits (default 2 GB) with automatic chunked streaming for large result sets.

Tool	Description
`connect_database`	Open a connection to any supported database or file
`disconnect_database`	Close a connection
`list_databases`	List active connections
`execute_query`	Run SQL with streaming, chunking, and preflight mode
`describe_database`	Show schema and table list
`describe_table`	Column types, indexes, row count
`find_table`	Locate a table across all active connections
`analyze_query_preview`	Estimate query cost before execution

Streaming and Memory (9 tools)

Tool	Description
`next_chunk`	Retrieve the next chunk of a streamed result
`request_data_chunk`	Fetch a specific chunk by row range
`request_multiple_chunks`	Batch-fetch multiple chunks in one call
`manage_memory_bounds`	View and configure memory limits
`get_streaming_status`	Check active streams and buffer usage
`clear_streaming_buffer`	Free memory from a specific buffer
`get_query_metadata`	Rich metadata for a completed query
`cancel_query_operation`	Cancel a running or buffered query
`get_data_quality_report`	Column statistics, null rates, and quality metrics
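
The chunked retrieval idea behind these tools can be illustrated with a small standalone sketch. It uses only the standard-library `sqlite3` module and is not the server's implementation; the helper name `stream_chunks` and the demo table are assumptions made for illustration. The chunk size of 500 mirrors the documented `LOCALDATA_CHUNK_SIZE` default.

```python
import sqlite3
from typing import Iterator, List, Tuple


def stream_chunks(
    conn: sqlite3.Connection, sql: str, chunk_size: int = 500
) -> Iterator[List[Tuple]]:
    """Yield query results in lists of at most `chunk_size` rows.

    Hypothetical helper: only one chunk is held in memory at a time.
    """
    cursor = conn.execute(sql)
    while True:
        rows = cursor.fetchmany(chunk_size)
        if not rows:
            break
        yield rows


# Demo data: an in-memory table with 1,200 rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL)")
conn.executemany(
    "INSERT INTO orders (amount) VALUES (?)",
    [(float(i),) for i in range(1200)],
)

chunk_sizes = [len(chunk) for chunk in stream_chunks(conn, "SELECT id, amount FROM orders")]
print(chunk_sizes)  # [500, 500, 200]
```

A consumer that processes each chunk and discards it keeps peak memory proportional to the chunk size rather than to the full result set.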

Tree / Structured Data (10 tools)

Navigate and edit TOML, JSON, and YAML files as trees. Supports full CRUD with auto-creation of ancestor nodes and round-trip export to any supported format.

Tool	Description
`get_node` / `get_children`	Navigate the tree
`set_node` / `delete_node`	Create or remove nodes
`get_value` / `set_value` / `delete_key`	Read and write properties
`list_keys`	List key-value pairs at a node
`move_node`	Relocate a node within the tree
`export_structured`	Export as TOML, JSON, or YAML
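
What "auto-creation of ancestor nodes" could look like is sketched below on a plain nested dictionary. The helper name `set_path` and the dotted path syntax are assumptions for illustration; the tools' actual path format is not specified here.

```python
from typing import Any, Dict


def set_path(tree: Dict[str, Any], path: str, value: Any) -> None:
    """Set `value` at a dotted path, creating missing ancestors as dicts.

    Hypothetical helper; the dotted path syntax is an assumption.
    """
    *ancestors, key = path.split(".")
    node = tree
    for name in ancestors:
        child = node.setdefault(name, {})
        if not isinstance(child, dict):
            raise TypeError(f"{name!r} holds a leaf value, not a node")
        node = child
    node[key] = value


config: Dict[str, Any] = {"server": {"host": "localhost"}}
set_path(config, "server.tls.enabled", True)  # "tls" does not exist yet
print(config)
# {'server': {'host': 'localhost', 'tls': {'enabled': True}}}
```

Refusing to descend through a leaf value avoids silently overwriting existing data when a path collides with a scalar.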

Graph (14 tools)

Work with DOT, GML, GraphML, and Mermaid files as directed multigraphs. Supports full CRUD on nodes and edges, shortest-path and all-paths queries, structural statistics, and multi-format export.

Tool	Description
`get_node_graph` / `get_neighbors` / `get_edges`	Navigate the graph
`set_node_graph` / `delete_node_graph`	Create or remove nodes
`add_edge` / `remove_edge`	Manage edges
`get_value_graph` / `set_value_graph` / `delete_key_graph` / `list_keys_graph`	Node properties
`find_path`	Shortest or all paths between two nodes
`get_graph_stats`	Node/edge counts, density, DAG validation
`export_graph`	Export as DOT, GML, GraphML, or Mermaid
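
To make the shortest-path and DAG validation features concrete, here is a minimal standard-library sketch on an edge list. It uses breadth-first search and Kahn's algorithm as stand-ins; the function names are hypothetical and the server may use different algorithms or a graph library internally.

```python
from collections import deque
from typing import Dict, List, Optional, Tuple

Edge = Tuple[str, str]


def build_adjacency(edges: List[Edge]) -> Dict[str, List[str]]:
    """Map each node to its outgoing targets; parallel edges are kept."""
    adjacency: Dict[str, List[str]] = {}
    for source, target in edges:
        adjacency.setdefault(source, []).append(target)
        adjacency.setdefault(target, [])
    return adjacency


def shortest_path(edges: List[Edge], start: str, goal: str) -> Optional[List[str]]:
    """Breadth-first search for a path with the fewest edges."""
    adjacency = build_adjacency(edges)
    if start not in adjacency or goal not in adjacency:
        return None
    previous: Dict[str, Optional[str]] = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path: List[str] = []
            current: Optional[str] = node
            while current is not None:
                path.append(current)
                current = previous[current]
            return path[::-1]
        for neighbor in adjacency[node]:
            if neighbor not in previous:
                previous[neighbor] = node
                queue.append(neighbor)
    return None


def is_dag(edges: List[Edge]) -> bool:
    """Kahn's algorithm: acyclic if every node can be removed in order."""
    adjacency = build_adjacency(edges)
    in_degree = {node: 0 for node in adjacency}
    for targets in adjacency.values():
        for target in targets:
            in_degree[target] += 1
    queue = deque(node for node, degree in in_degree.items() if degree == 0)
    removed = 0
    while queue:
        node = queue.popleft()
        removed += 1
        for target in adjacency[node]:
            in_degree[target] -= 1
            if in_degree[target] == 0:
                queue.append(target)
    return removed == len(adjacency)


# The duplicate ("a", "b") entry models a parallel edge in a multigraph.
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d"), ("a", "b")]
print(shortest_path(edges, "a", "d"))  # ['a', 'c', 'd']
print(is_dag(edges))                   # True
print(is_dag(edges + [("d", "a")]))    # False
```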

Search and Transform (2 tools)

Tool	Description
`search_data`	Regex search across query results
`transform_data`	Apply column transformations to result sets

Schema and Audit (3 tools)

Tool	Description
`export_schema`	Export full schema as JSON
`get_query_log`	Recent query execution history
`get_error_log`	Recent error log

System (2 tools)

Tool	Description
`check_compatibility`	Verify API backward compatibility
`get_metrics`	Server performance and resource metrics

Data Science (12 tools)

Run statistical analysis, modeling, and pattern detection directly on query results from any connected source.

Tool	Domain
`analyze_hypothesis_test`	Statistical Analysis
`analyze_anova`	Statistical Analysis
`analyze_effect_sizes`	Statistical Analysis
`analyze_regression`	Regression and Modeling
`evaluate_model_performance`	Regression and Modeling
`analyze_clusters`	Pattern Recognition
`detect_anomalies`	Pattern Recognition
`reduce_dimensions`	Pattern Recognition
`analyze_time_series`	Time Series
`forecast_time_series`	Time Series
`analyze_rfm`	Business Intelligence
`analyze_ab_test`	Business Intelligence

Supported Data Sources

Databases

Type	Engines
SQL	SQLite, PostgreSQL, MySQL, DuckDB
SQL (enterprise)	Oracle, MS SQL Server (`pip install localdata-mcp[enterprise]`)
Document	MongoDB, CouchDB (`pip install localdata-mcp[modern-databases]`)
Key-value	Redis (`pip install localdata-mcp[modern-databases]`)
Search	Elasticsearch (`pip install localdata-mcp[modern-databases]`)
Time series	InfluxDB (`pip install localdata-mcp[modern-databases]`)
Graph	Neo4j (`pip install localdata-mcp[modern-databases]`)
RDF / SPARQL	Turtle (.ttl), N-Triples (.nt), remote SPARQL endpoints

File Formats

Category	Formats
Tabular	CSV, TSV
Structured	JSON, JSONL, YAML, TOML, XML, INI
Spreadsheet	Excel (.xlsx, .xls), LibreOffice Calc (.ods), Apple Numbers (.numbers)
Analytical	Parquet, Feather, Arrow, HDF5
Graph	DOT (Graphviz), GML, GraphML, Mermaid
RDF	Turtle (.ttl), N-Triples (.nt)

Multi-sheet spreadsheets are fully supported: each sheet becomes a separately queryable table. To connect to a specific sheet, append ?sheet=SheetName to the file path.
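
As an illustration of the `?sheet=` convention, the sketch below splits such a path into a file path and a sheet name. The helper `split_sheet_selector` is hypothetical, and percent-encoding of spaces in sheet names is an assumption rather than documented behavior.

```python
from typing import Optional, Tuple
from urllib.parse import parse_qs


def split_sheet_selector(path: str) -> Tuple[str, Optional[str]]:
    """Split 'file.xlsx?sheet=Name' into ('file.xlsx', 'Name').

    Hypothetical helper; returns None for the sheet when no selector is given.
    """
    file_path, _, query = path.partition("?")
    sheets = parse_qs(query).get("sheet")
    return file_path, (sheets[0] if sheets else None)


print(split_sheet_selector("./budget.xlsx?sheet=Q1%20Sales"))  # ('./budget.xlsx', 'Q1 Sales')
print(split_sheet_selector("./budget.xlsx"))                   # ('./budget.xlsx', None)
```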

Data Science Domains

Statistical Analysis — t-tests, chi-squared, Mann-Whitney, Kruskal-Wallis, and related hypothesis tests; one-way ANOVA with post-hoc tests; Cohen's d, eta-squared, and other effect size measures.
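
As a worked example of one effect size measure, Cohen's d for two independent groups is the difference in means divided by the pooled standard deviation. The sketch below uses only the standard library with made-up sample values; it shows the textbook formula and is not necessarily how `analyze_effect_sizes` computes it.

```python
from math import sqrt
from statistics import mean, variance
from typing import Sequence


def cohens_d(a: Sequence[float], b: Sequence[float]) -> float:
    """Cohen's d with a pooled standard deviation (sample variances)."""
    n_a, n_b = len(a), len(b)
    pooled_variance = (
        (n_a - 1) * variance(a) + (n_b - 1) * variance(b)
    ) / (n_a + n_b - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_variance)


# Illustrative data only: amounts for two hypothetical regions.
north = [12.0, 15.0, 14.0, 10.0, 13.0]
south = [9.0, 11.0, 10.0, 8.0, 12.0]

d = cohens_d(north, south)
print(round(d, 2))  # 1.59
```

Here the means are 12.8 and 10.0, the sample variances are 3.7 and 2.5, and the pooled variance is 3.1, so d = 2.8 / sqrt(3.1).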

Regression and Modeling — linear, polynomial, logistic, ridge, lasso, and elastic net regression; model evaluation with R², RMSE, MAE, and classification metrics; automated feature selection.

Pattern Recognition — K-means, DBSCAN, and hierarchical clustering; anomaly detection via isolation forest, LOF, and one-class SVM; dimensionality reduction with PCA, t-SNE, and UMAP.

Time Series — decomposition, stationarity testing, autocorrelation analysis; ARIMA, SARIMA, and ETS forecasting; change point detection; multivariate analysis with VAR, Granger causality, and cointegration tests.

Business Intelligence — A/B test statistical analysis; RFM customer segmentation; cohort analysis, CLV modeling, and funnel analysis.

Geospatial — distance and coordinate calculations, spatial joins, interpolation, and network analysis.

Optimization — linear programming, constrained optimization, assignment problems, and network optimization.

Sampling and Estimation — bootstrap confidence intervals, Bayesian estimation, Monte Carlo simulation, and stratified sampling.
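
For readers unfamiliar with bootstrap confidence intervals, the percentile method can be sketched in a few lines. The function name, defaults, and sample data below are assumptions for illustration; the toolkit's own estimator may differ.

```python
import random
from statistics import mean
from typing import Sequence, Tuple


def bootstrap_mean_ci(
    values: Sequence[float],
    n_resamples: int = 2000,
    confidence: float = 0.95,
    seed: int = 0,
) -> Tuple[float, float]:
    """Percentile bootstrap confidence interval for the mean."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    # Resample with replacement and record the mean of each resample.
    means = sorted(
        mean(rng.choices(values, k=len(values))) for _ in range(n_resamples)
    )
    tail = (1.0 - confidence) / 2.0
    lower = means[round(tail * n_resamples)]
    upper = means[round((1.0 - tail) * n_resamples) - 1]
    return lower, upper


amounts = [12.0, 15.0, 14.0, 10.0, 13.0, 9.0, 11.0, 10.0, 8.0, 12.0]
lower, upper = bootstrap_mean_ci(amounts)
print(f"95% CI for the mean: [{lower:.2f}, {upper:.2f}]")
```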

Architecture

Intention-driven interface — tools accept semantic parameters ("find strong correlations") rather than requiring statistical procedure names or threshold values
Progressive disclosure — simple calls return high-level insights with sensible defaults; advanced parameters are available when needed
Streaming-first execution — all operations are designed for chunked processing; tools automatically switch strategies based on data size, keeping memory usage within configured bounds
Composition metadata — every tool result includes metadata that downstream tools can use directly, enabling chained analysis without manual wiring

Configuration

LocalData MCP uses environment variables for optional settings. The defaults work for most cases.

Variable	Default	Description
`LOCALDATA_MEMORY_LIMIT_MB`	`2048`	Maximum memory per query result (MB)
`LOCALDATA_MAX_CONNECTIONS`	`10`	Maximum concurrent database connections
`LOCALDATA_CHUNK_SIZE`	`500`	Default rows per streaming chunk
`LOCALDATA_BUFFER_TTL`	`600`	Streaming buffer expiry in seconds
`LOCALDATA_WORKING_DIR`	process cwd	Root directory for file access (file paths are restricted to this tree)

Set these variables in your MCP server configuration under "env", or in a .env file in the working directory.
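
The snippet below builds and prints a client configuration with an "env" block. Placing "env" inside the server entry follows a common MCP client convention and is an assumption here, as are the example values; check your client's documentation for the exact shape.

```python
import json

# Environment variables are passed as strings.
config = {
    "mcpServers": {
        "localdata": {
            "command": "localdata-mcp",
            "env": {
                "LOCALDATA_MEMORY_LIMIT_MB": "1024",  # example value, default is 2048
                "LOCALDATA_CHUNK_SIZE": "1000",       # example value, default is 500
            },
        }
    }
}

print(json.dumps(config, indent=2))
```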

Documentation

Database Connection Guide — connection strings, driver setup, and security practices
Docker Usage Guide — container deployment and configuration
Advanced Examples — production-ready usage patterns
Troubleshooting Guide — common issues and solutions
FAQ — frequently asked questions
API Reference — full tool and parameter reference

Development

git clone https://github.com/ChrisGVE/localdata-mcp.git
cd localdata-mcp
uv sync --all-extras
uv run pytest

The test suite includes 1,600+ unit tests, 234+ integration tests, and 62 enterprise-scale tests across 7 database types with 100K rows each.

Contributing

Contributions are welcome. Please read CONTRIBUTING.md before submitting a pull request.

License

MIT — see LICENSE for details.

Release history

Version	Release date
2.0.0 (this version)	Apr 6, 2026
1.7.1	Apr 2, 2026
1.7.0	Mar 28, 2026
1.6.0	Mar 28, 2026
1.5.3	Mar 26, 2026
1.5.2	Mar 26, 2026
1.5.1	Mar 26, 2026
1.3.0	Aug 30, 2025
1.2.0	Aug 30, 2025
1.0.3	Aug 29, 2025
1.0.2	Aug 29, 2025
1.0.0	Aug 15, 2025

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

localdata_mcp-2.0.0.tar.gz (663.1 kB view details)

Uploaded Apr 6, 2026 Source

Built Distribution

localdata_mcp-2.0.0-py3-none-any.whl (880.8 kB view details)

Uploaded Apr 6, 2026 Python 3

File details

Details for the file localdata_mcp-2.0.0.tar.gz.

File metadata

Download URL: localdata_mcp-2.0.0.tar.gz
Upload date: Apr 6, 2026
Size: 663.1 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for localdata_mcp-2.0.0.tar.gz
Algorithm	Hash digest
SHA256	`c13528ff6915951de3ac55eb775234631ff776b231f2db862a75dd1772dcf2de`
MD5	`012090b443af41348bb358ae6e3e3f43`
BLAKE2b-256	`bef49541e76b742417bc8205263a04d81bfc4cf032d2bfd59d952b8fd31446f9`

See more details on using hashes here.
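
A downloaded archive can be checked against the SHA256 digest listed above with the standard library. This is a minimal sketch; the helper name `sha256_of` is hypothetical, and it assumes the file was saved under its published name in the current directory.

```python
import hashlib
from pathlib import Path

# Digest copied from the hash table for localdata_mcp-2.0.0.tar.gz.
EXPECTED_SHA256 = "c13528ff6915951de3ac55eb775234631ff776b231f2db862a75dd1772dcf2de"


def sha256_of(path: Path, block_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, read in 1 MiB blocks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()


archive = Path("localdata_mcp-2.0.0.tar.gz")
if archive.exists():
    print("hash matches:", sha256_of(archive) == EXPECTED_SHA256)
else:
    print(f"{archive} not found; download it first")
```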

Provenance

The following attestation bundles were made for localdata_mcp-2.0.0.tar.gz:

Publisher: publish-to-pypi.yml on ChrisGVE/localdata-mcp

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: localdata_mcp-2.0.0.tar.gz
- Subject digest: c13528ff6915951de3ac55eb775234631ff776b231f2db862a75dd1772dcf2de
- Sigstore transparency entry: 1244495714
- Sigstore integration time: Apr 6, 2026
Source repository:
- Permalink: ChrisGVE/localdata-mcp@c349dc45c7db0e62c8af63e6c6d73293e9872ef2
- Branch / Tag: refs/tags/v2.0.0
- Owner: https://github.com/ChrisGVE
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish-to-pypi.yml@c349dc45c7db0e62c8af63e6c6d73293e9872ef2
- Trigger Event: push

File details

Details for the file localdata_mcp-2.0.0-py3-none-any.whl.

File metadata

Download URL: localdata_mcp-2.0.0-py3-none-any.whl
Upload date: Apr 6, 2026
Size: 880.8 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for localdata_mcp-2.0.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`ea6c11b9de4bb98fabb82b2327f9bea100583e794324d2c1f640f7a43b6bc5eb`
MD5	`fa3231ab3d94a5e8cc6f76aef379b63b`
BLAKE2b-256	`d90dac29124465910840f4344ed19b82ee7247d4f764a6bc1663c21c38e05d92`

Provenance

The following attestation bundles were made for localdata_mcp-2.0.0-py3-none-any.whl:

Publisher: publish-to-pypi.yml on ChrisGVE/localdata-mcp

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: localdata_mcp-2.0.0-py3-none-any.whl
- Subject digest: ea6c11b9de4bb98fabb82b2327f9bea100583e794324d2c1f640f7a43b6bc5eb
- Sigstore transparency entry: 1244495768
- Sigstore integration time: Apr 6, 2026
Source repository:
- Permalink: ChrisGVE/localdata-mcp@c349dc45c7db0e62c8af63e6c6d73293e9872ef2
- Branch / Tag: refs/tags/v2.0.0
- Owner: https://github.com/ChrisGVE
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish-to-pypi.yml@c349dc45c7db0e62c8af63e6c6d73293e9872ef2
- Trigger Event: push
