Model Context Protocol server for DataForge data-quality tools.
dataforge-mcp
dataforge-mcp exposes DataForge's shipped CSV profiling, detection, repair,
verification, and transaction-revert paths as Model Context Protocol tools.
cd dataforge-mcp
python -m pip install -e ".[dev]"
dataforge-mcp serve --allowed-root /path/to/csv/workspace
For local development from this repository:
cd dataforge-mcp
python -m pip install -e ".[dev]"
dataforge-mcp serve --allowed-root ..
The default transport is stdio, which is what local desktop MCP clients expect. For local Streamable HTTP experiments:
dataforge-mcp serve --transport streamable-http --host 127.0.0.1 --port 8000
dry_run is the safe default. To allow file mutation through MCP, start the
server with an explicit allowed root and --enable-apply:
dataforge-mcp serve --allowed-root /path/to/csv/workspace --enable-apply
Tools
- dataforge_profile(path: str): summarize CSV shape plus detected issues.
- dataforge_detect_errors(path: str): return detected issues only.
- dataforge_verify_fix(fix_spec: dict): run one candidate fix through stale value checks, safety, and verification.
- dataforge_apply_repairs(path: str, mode: "dry_run" | "apply"): propose verified repairs and optionally write a reversible transaction.
- dataforge_revert(txn_id: str): restore a transaction's original bytes.
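On the wire, an MCP client invokes each of these tools with a JSON-RPC 2.0 `tools/call` request. The sketch below only builds such request payloads so the argument shapes are visible. The helper name `tool_call_request` and the file name `orders.csv` are illustrative assumptions, not part of dataforge-mcp.

```python
import json
from itertools import count

_ids = count(1)  # JSON-RPC request ids should be unique within a session


def tool_call_request(name: str, arguments: dict) -> dict:
    """Build a JSON-RPC 2.0 `tools/call` request in the shape MCP defines."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# Hypothetical CSV path; it must sit under one of the server's allowed roots.
csv_path = "/path/to/csv/workspace/orders.csv"

profile = tool_call_request("dataforge_profile", {"path": csv_path})
dry_run = tool_call_request(
    "dataforge_apply_repairs",
    {"path": csv_path, "mode": "dry_run"},  # dry_run proposes repairs without writing
)

print(json.dumps(dry_run, indent=2))
```

In practice an MCP client library sends these messages for you; the payloads are shown only to make the tool names and arguments concrete.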
Client Configuration
Use the same server command for Claude Desktop, Cursor, Windsurf, or any local MCP client that supports stdio servers:
{
  "mcpServers": {
    "dataforge": {
      "command": "dataforge-mcp",
      "args": ["serve", "--allowed-root", "/path/to/csv/workspace"]
    }
  }
}
If your client cannot resolve the console script, replace command with the
absolute path returned by your shell:
which dataforge-mcp
On Windows PowerShell:
Get-Command dataforge-mcp
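If you would rather generate the client configuration than edit it by hand, the lookup can be scripted. This is a minimal sketch that uses the standard library's `shutil.which` to resolve the console script and falls back to the bare command name when it is not on PATH. The function name `build_client_config` is an assumption made for illustration.

```python
import json
import shutil


def build_client_config(allowed_root: str, command: str = "dataforge-mcp") -> dict:
    """Return an mcpServers entry, using an absolute command path when one is found."""
    # shutil.which returns None if the command is not on PATH.
    resolved = shutil.which(command) or command
    return {
        "mcpServers": {
            "dataforge": {
                "command": resolved,
                "args": ["serve", "--allowed-root", allowed_root],
            }
        }
    }


print(json.dumps(build_client_config("/path/to/csv/workspace"), indent=2))
```

Paste the printed JSON into your client's MCP configuration file.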
Before describing a build as agent-ready, run an MCP Inspector smoke check against a fixture directory and confirm the profile, detect, verify, dry-run apply, and disabled-apply paths:
npx @modelcontextprotocol/inspector dataforge-mcp serve --allowed-root /path/to/csv/workspace
Safety Model
apply mode uses DataForge's detector -> repairer -> SafetyFilter ->
SMTVerifier -> transaction-log path. The tool writes the transaction journal and
source snapshot before mutating the CSV, and dataforge_revert restores the
snapshot only when the current file still matches the recorded post-state hash.
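The snapshot-then-mutate ordering and the hash-guarded revert can be illustrated with a small standalone sketch. It is not DataForge's implementation: the journal here is an in-memory dict, and the names `apply_with_snapshot` and `revert` are hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    return hashlib.sha256(path.read_bytes()).hexdigest()


def apply_with_snapshot(csv_path: Path, new_bytes: bytes) -> dict:
    """Record the journal entry and snapshot first, then mutate the CSV."""
    snapshot = csv_path.with_name(csv_path.name + ".snapshot")
    snapshot.write_bytes(csv_path.read_bytes())
    journal = {
        "path": csv_path,
        "snapshot": snapshot,
        # The post-state hash is known before writing, from the bytes to be written.
        "post_sha256": hashlib.sha256(new_bytes).hexdigest(),
    }
    csv_path.write_bytes(new_bytes)  # mutation happens last
    return journal


def revert(txn: dict) -> bool:
    """Restore the snapshot only if the file still matches the recorded post-state."""
    if sha256_of(txn["path"]) != txn["post_sha256"]:
        return False  # file changed after the apply; refuse to overwrite it
    txn["path"].write_bytes(txn["snapshot"].read_bytes())
    return True


with tempfile.TemporaryDirectory() as tmp:
    csv_file = Path(tmp) / "orders.csv"
    csv_file.write_bytes(b"id,qty\n1,-5\n")

    txn = apply_with_snapshot(csv_file, b"id,qty\n1,5\n")
    reverted = revert(txn)            # file untouched since apply, so this succeeds
    restored = csv_file.read_bytes()

    txn2 = apply_with_snapshot(csv_file, b"id,qty\n1,5\n")
    csv_file.write_bytes(b"id,qty\n1,7\n")  # someone edits the file afterwards
    refused = not revert(txn2)        # hash mismatch, so the revert is refused
```

The guard means a revert never silently discards edits made after the transaction was applied.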
The MCP server does not enable live LLM repair fallback by default. It does not
send CSV contents to any external model provider. It also rejects CSV and schema
paths outside the configured allowed roots, and apply mode is disabled unless
the server is started with --enable-apply or DATAFORGE_MCP_ENABLE_APPLY=1.
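The two gates described above, path containment and explicit opt-in for apply mode, might look roughly like the following. This is a sketch under stated assumptions rather than the server's actual code: the function names are hypothetical, and treating only the value `1` as enabling apply is inferred from the environment variable shown above.

```python
import os
from pathlib import Path


def is_under_allowed_root(candidate: str, allowed_roots: list[str]) -> bool:
    """True when the resolved path sits inside at least one allowed root."""
    # resolve() normalizes ".." segments and symlinks before the comparison.
    resolved = Path(candidate).resolve()
    return any(resolved.is_relative_to(Path(root).resolve()) for root in allowed_roots)


def apply_enabled(cli_flag: bool, environ=os.environ) -> bool:
    """Apply mode stays off unless the flag or the environment variable opts in."""
    # Assumption: only the literal value "1" counts as enabled.
    return cli_flag or environ.get("DATAFORGE_MCP_ENABLE_APPLY") == "1"


roots = ["/path/to/csv/workspace"]
print(is_under_allowed_root("/path/to/csv/workspace/orders.csv", roots))
print(is_under_allowed_root("/path/to/csv/workspace/../secrets.csv", roots))
print(apply_enabled(cli_flag=False, environ={}))
```

Resolving the path before comparing it is what stops `..` segments from escaping the workspace.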
Release
The package is intended to be released independently from the nested
dataforge-mcp/ source directory as the dataforge_07_mcp distribution, but
it is not published yet. After PyPI Trusted Publishing is configured, the
workflow will build on tags matching:
dataforge-mcp-v*
The package depends on dataforge_07 and the official Python mcp SDK; it does
not vendor DataForge or add MCP dependencies to the core package.