Model Context Protocol server for DataForge data-quality tools.

dataforge-mcp

dataforge-mcp exposes DataForge's shipped CSV profiling, detection, repair, verification, and transaction-revert paths as Model Context Protocol tools.

cd dataforge-mcp
python -m pip install -e ".[dev]"
dataforge-mcp serve --allowed-root /path/to/csv/workspace

For local development from this repository:

cd dataforge-mcp
python -m pip install -e ".[dev]"
dataforge-mcp serve --allowed-root ..

The default transport is stdio, which is what local desktop MCP clients expect. For local Streamable HTTP experiments:

dataforge-mcp serve --transport streamable-http --host 127.0.0.1 --port 8000

dry_run is the safe default. To allow file mutation through MCP, start the server with an explicit allowed root and --enable-apply:

dataforge-mcp serve --allowed-root /path/to/csv/workspace --enable-apply

Tools

  • dataforge_profile(path: str) - summarize CSV shape plus detected issues.
  • dataforge_detect_errors(path: str) - return detected issues only.
  • dataforge_verify_fix(fix_spec: dict) - run one candidate fix through stale-value checks, the safety filter, and verification.
  • dataforge_apply_repairs(path: str, mode: "dry_run" | "apply") - propose verified repairs and optionally write a reversible transaction.
  • dataforge_revert(txn_id: str) - restore a transaction's original bytes.
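
As a rough illustration of how a client might call one of the tools above, the following sketch uses the official Python mcp SDK to start the server over stdio and request a dry-run repair proposal. The helper names apply_repairs_args and propose_repairs are hypothetical, and the shape of the returned result is not documented here, so treat it as opaque.

```python
from typing import Any


def apply_repairs_args(path: str, mode: str = "dry_run") -> dict[str, Any]:
    """Build arguments for dataforge_apply_repairs, rejecting unknown modes."""
    if mode not in ("dry_run", "apply"):
        raise ValueError(f"mode must be 'dry_run' or 'apply', got {mode!r}")
    return {"path": path, "mode": mode}


async def propose_repairs(workspace: str, csv_path: str) -> Any:
    """Start the server over stdio and ask for a dry-run repair proposal."""
    # Imported lazily so the helper above works without the SDK installed.
    from mcp import ClientSession, StdioServerParameters
    from mcp.client.stdio import stdio_client

    server = StdioServerParameters(
        command="dataforge-mcp",
        args=["serve", "--allowed-root", workspace],
    )
    async with stdio_client(server) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            # Tool name and parameters are taken from the list above.
            return await session.call_tool(
                "dataforge_apply_repairs",
                apply_repairs_args(csv_path),
            )
```

To try it, run asyncio.run(propose_repairs("/path/to/csv/workspace", "/path/to/csv/workspace/data.csv")) with both dataforge-mcp and the mcp SDK installed.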

Client Configuration

Use the same server command for Claude Desktop, Cursor, Windsurf, or any local MCP client that supports stdio servers:

{
  "mcpServers": {
    "dataforge": {
      "command": "dataforge-mcp",
      "args": ["serve", "--allowed-root", "/path/to/csv/workspace"]
    }
  }
}

If your client cannot resolve the console script, replace command with the absolute path returned by your shell:

which dataforge-mcp

On Windows PowerShell:

Get-Command dataforge-mcp
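
The lookup can also be scripted. This is a minimal sketch, assuming only the standard library; client_config is a hypothetical helper, not part of dataforge-mcp. It resolves the console script with shutil.which and prints a configuration entry in the shape shown above.

```python
import json
import shutil
from typing import Optional


def client_config(workspace: str, command: Optional[str] = None) -> dict:
    """Return an mcpServers entry, preferring an absolute command path."""
    # shutil.which returns None when the script is not on PATH,
    # in which case the bare command name is kept.
    resolved = command or shutil.which("dataforge-mcp") or "dataforge-mcp"
    return {
        "mcpServers": {
            "dataforge": {
                "command": resolved,
                "args": ["serve", "--allowed-root", workspace],
            }
        }
    }


print(json.dumps(client_config("/path/to/csv/workspace"), indent=2))
```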

Before describing a build as agent-ready, run an MCP Inspector smoke check against a fixture directory and confirm the profile, detect, verify, dry-run apply, and disabled-apply paths:

npx @modelcontextprotocol/inspector dataforge-mcp serve --allowed-root /path/to/csv/workspace
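
A scripted check can complement the Inspector run. The sketch below, which assumes the official Python mcp SDK, only confirms that the five tools listed under Tools are registered; it does not exercise the profile, detect, verify, dry-run apply, or disabled-apply paths. EXPECTED_TOOLS, missing_tools, and smoke_check are hypothetical names.

```python
from typing import Iterable

# Tool names as listed in the Tools section.
EXPECTED_TOOLS = {
    "dataforge_profile",
    "dataforge_detect_errors",
    "dataforge_verify_fix",
    "dataforge_apply_repairs",
    "dataforge_revert",
}


def missing_tools(registered: Iterable[str]) -> list[str]:
    """Return the expected tool names that the server did not report."""
    return sorted(EXPECTED_TOOLS - set(registered))


async def smoke_check(workspace: str) -> list[str]:
    """Start the server over stdio and list any missing expected tools."""
    # Imported lazily so missing_tools works without the SDK installed.
    from mcp import ClientSession, StdioServerParameters
    from mcp.client.stdio import stdio_client

    server = StdioServerParameters(
        command="dataforge-mcp",
        args=["serve", "--allowed-root", workspace],
    )
    async with stdio_client(server) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            listing = await session.list_tools()
            return missing_tools(tool.name for tool in listing.tools)
```

An empty list from asyncio.run(smoke_check("/path/to/csv/workspace")) means every expected tool was reported.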

Safety Model

apply mode uses DataForge's detector -> repairer -> SafetyFilter -> SMTVerifier -> transaction-log path. The tool writes the transaction journal and source snapshot before mutating the CSV, and dataforge_revert restores the snapshot only when the current file still matches the recorded post-state hash.
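
The snapshot-then-mutate ordering and the hash-guarded revert can be illustrated with a small standalone sketch. This is not DataForge's implementation: the file names snapshot.bin and post_state.sha256 and both function names are assumptions made for the example.

```python
import hashlib
from pathlib import Path


def apply_with_snapshot(csv: Path, new_bytes: bytes, txn_dir: Path) -> str:
    """Snapshot the source, record the post-state hash, then mutate the file."""
    txn_dir.mkdir(parents=True, exist_ok=True)
    (txn_dir / "snapshot.bin").write_bytes(csv.read_bytes())
    post_hash = hashlib.sha256(new_bytes).hexdigest()
    (txn_dir / "post_state.sha256").write_text(post_hash)
    csv.write_bytes(new_bytes)  # the mutation happens last
    return post_hash


def revert(csv: Path, txn_dir: Path) -> bool:
    """Restore the snapshot only if the file is unchanged since the apply."""
    recorded = (txn_dir / "post_state.sha256").read_text()
    current = hashlib.sha256(csv.read_bytes()).hexdigest()
    if current != recorded:
        return False  # edited after the transaction; refuse to overwrite
    csv.write_bytes((txn_dir / "snapshot.bin").read_bytes())
    return True
```

Refusing to revert when the hash differs protects edits made after the transaction from being silently overwritten by an older snapshot.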

The MCP server does not enable live LLM repair fallback by default. It does not send CSV contents to any external model provider. It also rejects CSV and schema paths outside the configured allowed roots, and apply mode is disabled unless the server is started with --enable-apply or DATAFORGE_MCP_ENABLE_APPLY=1.
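
The two gates described above can be sketched as follows. The server's actual checks may differ; is_within_allowed_roots and apply_enabled are hypothetical names, and only the --enable-apply flag and the DATAFORGE_MCP_ENABLE_APPLY variable come from this README.

```python
import os
from pathlib import Path
from typing import Mapping, Optional, Sequence


def is_within_allowed_roots(candidate: str, allowed_roots: Sequence[str]) -> bool:
    """True when the resolved path sits inside one of the allowed roots."""
    # resolve() collapses ".." segments and symlinks before comparing.
    resolved = Path(candidate).resolve()
    for root in allowed_roots:
        root_path = Path(root).resolve()
        if resolved == root_path or root_path in resolved.parents:
            return True
    return False


def apply_enabled(cli_flag: bool, env: Optional[Mapping[str, str]] = None) -> bool:
    """Apply mode is on only with the CLI flag or the environment variable set to 1."""
    env = os.environ if env is None else env
    return cli_flag or env.get("DATAFORGE_MCP_ENABLE_APPLY") == "1"
```

Resolving both paths before comparing means a request such as /path/to/csv/workspace/../secrets.csv is rejected even though it begins with the allowed root as a string.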

Release

The package releases independently from the nested dataforge-mcp/ source directory as the dataforge_07_mcp distribution. Version 0.1.0 is published on PyPI through Trusted Publishing; the publish-dataforge-mcp.yml workflow builds on tags matching:

dataforge-mcp-v*

The package depends on dataforge_07 and the official Python mcp SDK; it does not vendor DataForge or add MCP dependencies to the core package.

Source Distribution

dataforge_07_mcp-0.1.0.tar.gz (9.6 kB view details)

Uploaded Source

Built Distribution

dataforge_07_mcp-0.1.0-py3-none-any.whl (8.5 kB view details)

Uploaded Python 3

File details

Details for the file dataforge_07_mcp-0.1.0.tar.gz.

File metadata

  • Download URL: dataforge_07_mcp-0.1.0.tar.gz
  • Upload date:
  • Size: 9.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for dataforge_07_mcp-0.1.0.tar.gz
Algorithm Hash digest
SHA256 3dd372aed4a113f954e05b4476606e96b36915c8312d2c22092b94aa38a47042
MD5 ff46bbcb4bf0b143b0db5c2fdbbba319
BLAKE2b-256 01da7c4275ce74249acdf1fed0f672a24def874c9250986c124ff1fac8c2a866

Provenance

The following attestation bundles were made for dataforge_07_mcp-0.1.0.tar.gz:

Publisher: publish-dataforge-mcp.yml on Aegis15/dataforge

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file dataforge_07_mcp-0.1.0-py3-none-any.whl.

File hashes

Hashes for dataforge_07_mcp-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 84062215ee32c80ff4bfa989ec3d30e50413ddebc11cfb7604fdb7676a973d32
MD5 7b9b5a62ad67f39456a7f624e6b21630
BLAKE2b-256 def9d67786feede387a090391abf270dfb4a681e933709fb544a4307aff4f4fc

Provenance

The following attestation bundles were made for dataforge_07_mcp-0.1.0-py3-none-any.whl:

Publisher: publish-dataforge-mcp.yml on Aegis15/dataforge
