An AWS Labs Model Context Protocol (MCP) server for synthetic data
Synthetic Data MCP Server
A Model Context Protocol (MCP) server for generating, validating, and managing synthetic data.
Overview
This MCP server provides tools for generating synthetic data based on business descriptions, executing pandas code safely, validating data structures, and loading data to storage systems like S3.
Features
- Business-Driven Generation: Generate synthetic data instructions based on business descriptions
- Data Generation Instructions: Generate structured data generation instructions from business descriptions
- Safe Pandas Code Execution: Run pandas code in a restricted environment with automatic DataFrame detection
- JSON Lines Validation: Validate and convert JSON Lines data to CSV format
- Data Validation: Validate data structure, referential integrity, and save as CSV files
- Referential Integrity Checking: Validate relationships between tables
- Data Quality Assessment: Identify potential issues in data models (3NF validation)
- Storage Integration: Load data to various storage targets (S3) with support for:
  - Multiple file formats (CSV, JSON, Parquet)
  - Partitioning options
  - Storage class configuration
  - Encryption settings
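As an illustration of the JSON Lines validation feature listed above, here is a minimal sketch using only the Python standard library. The function name `jsonl_to_csv` and its error messages are hypothetical and are not taken from the server's code; the sketch only shows the general idea of checking each line and writing the records out as CSV.

```python
import csv
import io
import json


def jsonl_to_csv(jsonl_text: str) -> str:
    """Validate JSON Lines text and return the records as CSV text.

    Hypothetical helper for illustration; not the server's implementation.
    """
    rows = []
    for line_number, line in enumerate(jsonl_text.splitlines(), start=1):
        if not line.strip():
            continue  # ignore blank lines
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            raise ValueError(f"line {line_number}: invalid JSON ({exc.msg})") from exc
        if not isinstance(record, dict):
            raise ValueError(f"line {line_number}: expected a JSON object")
        rows.append(record)

    # Use the union of all keys, in first-seen order, as the CSV header.
    fieldnames = list(dict.fromkeys(key for row in rows for key in row))
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)  # missing keys are written as empty cells
    return buffer.getvalue()
```

Each line must parse as a JSON object; anything else raises a `ValueError` that names the offending line.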
Prerequisites
- Install uv from Astral or the GitHub README
- Install Python using uv python install 3.10
- Set up AWS credentials with access to AWS services
  - You need an AWS account with appropriate permissions
  - Configure AWS credentials with aws configure or environment variables
Installation
{
  "mcpServers": {
    "awslabs.syntheticdata-mcp-server": {
      "command": "uvx",
      "args": ["awslabs.syntheticdata-mcp-server"],
      "env": {
        "FASTMCP_LOG_LEVEL": "ERROR",
        "AWS_PROFILE": "your-aws-profile",
        "AWS_REGION": "us-east-1"
      },
      "autoApprove": [],
      "disabled": false
    }
  }
}
Windows Installation
For Windows users, the MCP server configuration format is slightly different:
{
  "mcpServers": {
    "awslabs.syntheticdata-mcp-server": {
      "disabled": false,
      "timeout": 60,
      "type": "stdio",
      "command": "uv",
      "args": [
        "tool",
        "run",
        "--from",
        "awslabs.syntheticdata-mcp-server@latest",
        "awslabs.syntheticdata-mcp-server.exe"
      ],
      "env": {
        "FASTMCP_LOG_LEVEL": "ERROR",
        "AWS_PROFILE": "your-aws-profile",
        "AWS_REGION": "us-east-1"
      }
    }
  }
}
NOTE: You must keep your AWS credentials refreshed on the host machine.
AWS Authentication
The MCP server uses the AWS profile named in the AWS_PROFILE environment variable. If AWS_PROFILE is not set, the server falls back to the "default" profile in your AWS configuration file.
"env": {
  "AWS_PROFILE": "your-aws-profile"
}
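The fallback described above can be sketched in a few lines of Python. The helper name `resolve_aws_profile` is hypothetical and the server's actual lookup may differ; this only mirrors the documented behavior of reading AWS_PROFILE and falling back to "default".

```python
import os
from typing import Mapping, Optional


def resolve_aws_profile(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the profile named by AWS_PROFILE, or "default" if it is unset.

    Hypothetical helper; `env` defaults to the process environment.
    """
    env = os.environ if env is None else env
    return env.get("AWS_PROFILE", "default")
```

Passing a plain dictionary instead of the real environment makes the behavior easy to check, for example `resolve_aws_profile({"AWS_PROFILE": "your-aws-profile"})`.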
Usage
Getting Data Generation Instructions
response = await server.get_data_gen_instructions(
    business_description="An e-commerce platform with customers, orders, and products"
)
Executing Pandas Code
response = await server.execute_pandas_code(
    code="your_pandas_code_here",
    workspace_dir="/path/to/workspace",
    output_dir="data"
)
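To give a rough idea of what "restricted execution with automatic DataFrame detection" can mean, the sketch below runs a code string in a namespace with no builtins and then collects every new variable of a given type. It is an assumption-laden illustration, not the server's implementation: `run_and_collect` is a hypothetical name, and emptying `__builtins__` is not a real security boundary. For pandas, one would presumably pass `pd.DataFrame` as `target_type` and `{"pd": pd}` as `allowed_globals`; the standard-library version here uses lists so it runs without pandas installed.

```python
from typing import Any, Dict, Optional


def run_and_collect(
    code: str,
    target_type: type,
    allowed_globals: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Execute `code` with no builtins and return new variables of `target_type`.

    Hypothetical sketch: this limits accidental misuse, it does not sandbox
    hostile code.
    """
    namespace: Dict[str, Any] = {"__builtins__": {}}
    namespace.update(allowed_globals or {})
    injected = set(namespace)  # names that existed before the code ran

    exec(code, namespace)

    return {
        name: value
        for name, value in namespace.items()
        if name not in injected and isinstance(value, target_type)
    }


# Usage: only the list-valued variable is detected.
tables = run_and_collect("customers = [{'id': 1}]\ncount = 1", list)
```

Because the namespace has no `__import__`, a statement such as `import os` inside the executed code fails with an `ImportError`.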
Validating and Saving Data
response = await server.validate_and_save_data(
    data={
        "customers": [{"id": 1, "name": "John"}],
        "orders": [{"id": 101, "customer_id": 1}]
    },
    workspace_dir="/path/to/workspace",
    output_dir="data"
)
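The referential integrity check mentioned in the feature list can be pictured with the same customers/orders data. The sketch below is a minimal, hypothetical version (the name `find_orphans` and its parameters are assumptions, not the server's API): it reports child rows whose foreign key does not match any parent key.

```python
from typing import Any, Dict, List

Table = List[Dict[str, Any]]


def find_orphans(
    data: Dict[str, Table],
    child_table: str,
    foreign_key: str,
    parent_table: str,
    parent_key: str = "id",
) -> Table:
    """Return child rows whose foreign key has no matching parent row.

    Hypothetical helper for illustration only.
    """
    parent_ids = {row[parent_key] for row in data[parent_table]}
    return [
        row for row in data[child_table]
        if row.get(foreign_key) not in parent_ids
    ]


data = {
    "customers": [{"id": 1, "name": "John"}],
    "orders": [
        {"id": 101, "customer_id": 1},
        {"id": 102, "customer_id": 2},  # no customer with id 2
    ],
}
orphans = find_orphans(data, "orders", "customer_id", "customers")
```

Here `orphans` contains only order 102, because no customer with id 2 exists.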
Loading to Storage
response = await server.load_to_storage(
    data={
        "customers": [{"id": 1, "name": "John"}]
    },
    targets=[{
        "type": "s3",
        "config": {
            "bucket": "my-bucket",
            "prefix": "data/",
            "format": "parquet"
        }
    }]
)
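Before calling the tool, it can help to sanity-check a target definition locally. The sketch below is a hypothetical pre-flight check, not part of the server: the function name `check_target` is invented, and treating `bucket` as required is an assumption. The accepted formats come from the feature list (CSV, JSON, Parquet).

```python
from typing import Any, Dict, List

# Formats named in the feature list; the lowercase spelling is an assumption
# based on the "parquet" value used in the example above.
SUPPORTED_FORMATS = {"csv", "json", "parquet"}


def check_target(target: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in a storage target definition.

    Hypothetical helper; an empty list means no problems were found.
    """
    problems: List[str] = []
    if target.get("type") != "s3":
        problems.append(f"unsupported target type: {target.get('type')!r}")

    config = target.get("config") or {}
    if not config.get("bucket"):
        problems.append("config.bucket is missing")

    fmt = config.get("format")
    if fmt not in SUPPORTED_FORMATS:
        problems.append(f"unsupported format: {fmt!r}")
    return problems


target = {
    "type": "s3",
    "config": {"bucket": "my-bucket", "prefix": "data/", "format": "parquet"},
}
problems = check_target(target)
```

For the target shown, `problems` is an empty list; changing `"format"` to an unlisted value adds one entry.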