An AWS Labs Model Context Protocol (MCP) server for synthetic data

Synthetic Data MCP Server

A Model Context Protocol (MCP) server for generating, validating, and managing synthetic data.

Overview

This MCP server provides tools for generating synthetic data from business descriptions, executing pandas code safely, validating data structures, and loading data to storage systems such as Amazon S3.

Features

  • Business-Driven Generation: Generate synthetic data instructions based on business descriptions
  • Data Generation Instructions: Generate structured data generation instructions from business descriptions
  • Safe Pandas Code Execution: Run pandas code in a restricted environment with automatic DataFrame detection
  • JSON Lines Validation: Validate and convert JSON Lines data to CSV format
  • Data Validation: Validate data structure, referential integrity, and save as CSV files
  • Referential Integrity Checking: Validate relationships between tables
  • Data Quality Assessment: Identify potential issues in data models (3NF validation)
  • Storage Integration: Load data to various storage targets (S3) with support for:
    • Multiple file formats (CSV, JSON, Parquet)
    • Partitioning options
    • Storage class configuration
    • Encryption settings
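
To make the JSON Lines feature above concrete, here is a minimal sketch of validating JSON Lines text and converting it to CSV using only the Python standard library. The function name `jsonl_to_csv` and its error messages are hypothetical and are not part of the server's API; the server's own validation rules may differ.

```python
import csv
import io
import json


def jsonl_to_csv(jsonl_text: str) -> str:
    """Validate JSON Lines text and return it as CSV text.

    Illustrative only: raises ValueError on the first line that is not a
    JSON object. This is an assumed rule, not the server's documented one.
    """
    rows = []
    for line_no, line in enumerate(jsonl_text.splitlines(), start=1):
        if not line.strip():
            continue  # skip blank lines
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            raise ValueError(f"line {line_no}: invalid JSON ({exc.msg})") from exc
        if not isinstance(record, dict):
            raise ValueError(f"line {line_no}: expected a JSON object")
        rows.append(record)

    # Column order follows the first appearance of each key across records.
    fieldnames = list(dict.fromkeys(key for row in rows for key in row))
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


sample = '{"id": 1, "name": "John"}\n{"id": 2, "name": "Ana"}\n'
csv_text = jsonl_to_csv(sample)
print(csv_text)
```

Records that lack a key are written with an empty cell for that column, which is the default behavior of `csv.DictWriter`.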

Prerequisites

  1. Install uv from Astral or the GitHub README
  2. Install Python using uv python install 3.10
  3. Set up AWS credentials with access to AWS services
    • You need an AWS account with appropriate permissions
    • Configure AWS credentials with aws configure or environment variables

Installation

{
  "mcpServers": {
    "awslabs.syntheticdata-mcp-server": {
      "command": "uvx",
      "args": ["awslabs.syntheticdata-mcp-server"],
      "env": {
        "FASTMCP_LOG_LEVEL": "ERROR",
        "AWS_PROFILE": "your-aws-profile",
        "AWS_REGION": "us-east-1"
      },
      "autoApprove": [],
      "disabled": false
    }
  }
}
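
If your MCP client configuration already lists other servers, the entry above has to be merged in rather than pasted over the whole file. The sketch below shows one way to do that on an in-memory dictionary. The helper `add_server` and the `some-other-server` entry are hypothetical, and reading or writing the actual configuration file is left out because its location depends on your client.

```python
import json

# Server entry copied from the configuration shown above.
SERVER_NAME = "awslabs.syntheticdata-mcp-server"
SERVER_ENTRY = {
    "command": "uvx",
    "args": ["awslabs.syntheticdata-mcp-server"],
    "env": {
        "FASTMCP_LOG_LEVEL": "ERROR",
        "AWS_PROFILE": "your-aws-profile",
        "AWS_REGION": "us-east-1",
    },
    "autoApprove": [],
    "disabled": False,
}


def add_server(config: dict) -> dict:
    """Return a copy of config with the synthetic data server registered."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))  # copy so the input is untouched
    servers[SERVER_NAME] = SERVER_ENTRY
    merged["mcpServers"] = servers
    return merged


# Hypothetical existing configuration with one unrelated server.
existing = {"mcpServers": {"some-other-server": {"command": "uvx", "args": ["other"]}}}
updated = add_server(existing)
print(json.dumps(updated, indent=2))
```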

Windows Installation

For Windows users, the MCP server configuration format is slightly different:

{
  "mcpServers": {
    "awslabs.syntheticdata-mcp-server": {
      "disabled": false,
      "timeout": 60,
      "type": "stdio",
      "command": "uv",
      "args": [
        "tool",
        "run",
        "--from",
        "awslabs.syntheticdata-mcp-server@latest",
        "awslabs.syntheticdata-mcp-server.exe"
      ],
      "env": {
        "FASTMCP_LOG_LEVEL": "ERROR",
        "AWS_PROFILE": "your-aws-profile",
        "AWS_REGION": "us-east-1"
      }
    }
  }
}

NOTE: You must keep your AWS credentials refreshed on the host machine.

AWS Authentication

The MCP server uses the AWS profile named in the AWS_PROFILE environment variable. If AWS_PROFILE is not set, the server falls back to the "default" profile in your AWS configuration file.

"env": {
  "AWS_PROFILE": "your-aws-profile"
}
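
The profile fallback described above can be sketched in a few lines. This is an illustration of the stated behavior, not the server's code. The function name `resolve_aws_profile` is hypothetical, and treating an empty AWS_PROFILE value the same as an unset one is an assumption.

```python
import os


def resolve_aws_profile(env=None) -> str:
    """Return the AWS profile name to use.

    Uses AWS_PROFILE when it is set; otherwise falls back to "default".
    Assumption: an empty string is treated like an unset variable.
    """
    env = os.environ if env is None else env
    return env.get("AWS_PROFILE") or "default"


# Passing a plain dict instead of os.environ keeps the example deterministic.
print(resolve_aws_profile({"AWS_PROFILE": "your-aws-profile"}))
print(resolve_aws_profile({}))
```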

Usage

Getting Data Generation Instructions

response = await server.get_data_gen_instructions(
    business_description="An e-commerce platform with customers, orders, and products"
)

Executing Pandas Code

response = await server.execute_pandas_code(
    code="your_pandas_code_here",
    workspace_dir="/path/to/workspace",
    output_dir="data"
)
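
The `code` argument above is a placeholder. As a rough sketch, the string below shows the kind of pandas code you might pass, along with a local syntax check that lists the top-level variables it assigns. The snippet content and the helper `assigned_names` are hypothetical. Whether the restricted environment allows the `import pandas` line or provides `pd` itself is not stated here, so treat that line as an assumption.

```python
import ast

# Hypothetical pandas snippet to pass as the `code` argument.
PANDAS_CODE = """
import pandas as pd

customers = pd.DataFrame({"id": [1, 2], "name": ["John", "Ana"]})
orders = pd.DataFrame({"id": [101, 102], "customer_id": [1, 2]})
"""


def assigned_names(code: str) -> list[str]:
    """Parse code without running it and list top-level assigned names.

    ast.parse raises SyntaxError if the code is not valid Python.
    """
    tree = ast.parse(code)
    names = []
    for node in tree.body:
        if isinstance(node, ast.Assign):
            for target in node.targets:
                if isinstance(target, ast.Name):
                    names.append(target.id)
    return names


print(assigned_names(PANDAS_CODE))
```

Checking the string locally catches syntax errors before the request is sent. It does not execute the code, so pandas does not need to be installed for the check itself.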

Validating and Saving Data

response = await server.validate_and_save_data(
    data={
        "customers": [{"id": 1, "name": "John"}],
        "orders": [{"id": 101, "customer_id": 1}]
    },
    workspace_dir="/path/to/workspace",
    output_dir="data"
)
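
To illustrate what a referential integrity check on data shaped like the example above involves, here is a small pure-Python sketch. It is not the server's implementation. The helper `find_orphans` is hypothetical, and order 102 is added only to show a failing row.

```python
def find_orphans(data, child, fk, parent, pk="id"):
    """Return rows of the child table whose foreign key has no parent row."""
    parent_keys = {row[pk] for row in data[parent]}
    return [row for row in data[child] if row.get(fk) not in parent_keys]


data = {
    "customers": [{"id": 1, "name": "John"}],
    "orders": [
        {"id": 101, "customer_id": 1},
        {"id": 102, "customer_id": 2},  # illustrative row: no customer with id 2
    ],
}

orphans = find_orphans(data, child="orders", fk="customer_id", parent="customers")
print(orphans)
```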

Loading to Storage

response = await server.load_to_storage(
    data={
        "customers": [{"id": 1, "name": "John"}]
    },
    targets=[{
        "type": "s3",
        "config": {
            "bucket": "my-bucket",
            "prefix": "data/",
            "format": "parquet"
        }
    }]
)
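
Before sending a request like the one above, it can help to sanity-check each target locally. The sketch below checks only the fields that appear in this document. The helper `check_target` is hypothetical, and the rules it applies (only "s3" as a type, a required bucket, case-insensitive format names) are assumptions rather than the server's documented validation.

```python
# Formats named in the Features list of this document.
SUPPORTED_FORMATS = {"csv", "json", "parquet"}


def check_target(target: dict) -> list[str]:
    """Return a list of problems found in one storage target (empty if none)."""
    problems = []
    if target.get("type") != "s3":
        problems.append(f"unsupported type: {target.get('type')!r}")
    config = target.get("config", {})
    if not config.get("bucket"):
        problems.append("missing bucket")
    fmt = str(config.get("format", "")).lower()
    if fmt not in SUPPORTED_FORMATS:
        problems.append(f"unsupported format: {config.get('format')!r}")
    return problems


target = {
    "type": "s3",
    "config": {"bucket": "my-bucket", "prefix": "data/", "format": "parquet"},
}
print(check_target(target))
```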

