Warning: mlp_sdk_v3 is an example that demonstrates how to develop an ML Platform SDK wrapper library that simplifies infrastructure configuration management and standardizes ML workflows across teams. It is intended as a reference for customers who want to build their own customized SDK wrappers. The library is provided for illustrative purposes only and should not be used directly in production environments.

mlp_sdk_v3

A Python wrapper library for SageMaker SDK v3 with configuration-driven defaults.

Overview

mlp_sdk_v3 simplifies SageMaker operations by providing a session-based interface with configuration-driven defaults. It is built on top of the SageMaker Python SDK v3 and hides infrastructure details such as networking, IAM roles, and S3 locations, while still exposing the underlying SDK objects when you need them.

Key Features

Configuration-driven defaults: Define AWS resources (VPCs, security groups, S3 buckets) in YAML configuration files
Simple session interface: Single entry point for all SageMaker operations
Runtime parameter override: Override any default configuration at runtime
Full SageMaker SDK compatibility: Access underlying SageMaker SDK objects for advanced use cases
Comprehensive error handling: Clear error messages with actionable guidance
Encryption support: AES-256-GCM encryption for sensitive configuration values
Audit trail: Track all operations for debugging and compliance

Installation

# The distribution is published on PyPI as sagemaker-mlp-sdk; the import name is mlp_sdk_v3
pip install sagemaker-mlp-sdk

Quick Start

Generate Configuration

First, generate your configuration file:

# Interactive mode (recommended)
python examples/generate_admin_config.py --interactive

# Or use defaults
python examples/generate_admin_config.py --output /home/sagemaker-user/.config/admin-config.yaml

See examples/QUICKSTART.md for a complete quick start guide.

Basic Usage

from mlp_sdk_v3 import MLP_Session

# Initialize session with default configuration
session = MLP_Session()

# Create a feature group
feature_group = session.create_feature_group(
    feature_group_name="customer-features",
    record_identifier_name="customer_id",
    event_time_feature_name="event_time",
    feature_definitions=[
        {"FeatureName": "customer_id", "FeatureType": "String"},
        {"FeatureName": "age", "FeatureType": "Integral"},
        {"FeatureName": "income", "FeatureType": "Fractional"},
        {"FeatureName": "event_time", "FeatureType": "String"}
    ]
)

# Run a processing job
processor = session.run_processing_job(
    job_name="data-preprocessing",
    processing_script="preprocess.py",
    inputs=[{"source": "s3://my-bucket/raw-data/", "destination": "/opt/ml/processing/input"}],
    outputs=[{"source": "/opt/ml/processing/output", "destination": "s3://my-bucket/processed-data/"}]
)

# Run a training job
trainer = session.run_training_job(
    job_name="model-training",
    training_image="763104351884.dkr.ecr.us-west-2.amazonaws.com/pytorch-training:2.0.0-cpu-py310",
    source_code_dir="training-scripts",
    entry_script="train.py",
    inputs={"train": "s3://my-bucket/processed-data/"}
)

# Create a pipeline
from sagemaker.workflow.steps import ProcessingStep, TrainingStep

pipeline = session.create_pipeline(
    pipeline_name="ml-workflow",
    steps=[
        ProcessingStep(name="preprocess", processor=processor),
        TrainingStep(name="train", estimator=trainer)
    ]
)
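The feature group call above passes a record identifier, an event-time feature, and a list of feature definitions that have to agree with each other. The sketch below shows the kind of pre-flight check a wrapper could run before calling SageMaker. `check_feature_definitions` is a hypothetical helper, not part of mlp_sdk_v3, and it assumes only the three scalar Feature Store types and an event-time feature typed as String or Fractional.

```python
# Hypothetical pre-flight check; names and rules are assumptions for illustration.
ALLOWED_TYPES = {"String", "Integral", "Fractional"}
EVENT_TIME_TYPES = {"String", "Fractional"}


def check_feature_definitions(definitions, record_identifier_name, event_time_feature_name):
    """Return a list of problems; an empty list means the inputs look consistent."""
    problems = []
    types_by_name = {}
    for definition in definitions:
        name = definition.get("FeatureName")
        feature_type = definition.get("FeatureType")
        if not name:
            problems.append("feature definition without a FeatureName")
            continue
        if name in types_by_name:
            problems.append(f"duplicate feature name: {name}")
        if feature_type not in ALLOWED_TYPES:
            problems.append(f"{name}: unsupported FeatureType {feature_type!r}")
        types_by_name[name] = feature_type
    if record_identifier_name not in types_by_name:
        problems.append(f"record identifier {record_identifier_name!r} is not defined")
    if event_time_feature_name not in types_by_name:
        problems.append(f"event time feature {event_time_feature_name!r} is not defined")
    elif types_by_name[event_time_feature_name] not in EVENT_TIME_TYPES:
        problems.append(f"event time feature {event_time_feature_name!r} must be String or Fractional")
    return problems


definitions = [
    {"FeatureName": "customer_id", "FeatureType": "String"},
    {"FeatureName": "age", "FeatureType": "Integral"},
    {"FeatureName": "income", "FeatureType": "Fractional"},
    {"FeatureName": "event_time", "FeatureType": "String"},
]
print(check_feature_definitions(definitions, "customer_id", "event_time"))
```

Returning a list of problems rather than raising on the first one lets a caller report every mismatch in a single pass.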

Configuration

Configuration File Location

By default, mlp_sdk_v3 loads configuration from:

/home/sagemaker-user/.config/admin-config.yaml

You can specify a custom configuration path:

session = MLP_Session(config_path="/path/to/custom-config.yaml")
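The lookup rule described above (explicit path if given, otherwise the default location) can be sketched in a few lines. `resolve_config_path` and `config_exists` are hypothetical names used only for this illustration; how the library behaves when the file is missing is not stated here, so the sketch only reports whether the file exists.

```python
from pathlib import Path

# Default location taken from the section above.
DEFAULT_CONFIG_PATH = Path("/home/sagemaker-user/.config/admin-config.yaml")


def resolve_config_path(config_path=None):
    """Use the explicit path when given, otherwise fall back to the default."""
    path = Path(config_path) if config_path is not None else DEFAULT_CONFIG_PATH
    return path.expanduser()


def config_exists(config_path=None):
    """Report whether the resolved configuration file is present on disk."""
    return resolve_config_path(config_path).is_file()


print(resolve_config_path())
print(resolve_config_path("/path/to/custom-config.yaml"))
```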

Configuration File Format

Create a YAML configuration file with the following structure:

defaults:
  # S3 Configuration
  s3:
    default_bucket: "my-sagemaker-bucket"
    input_prefix: "input/"
    output_prefix: "output/"
    model_prefix: "models/"
    
  # Networking Configuration  
  networking:
    vpc_id: "vpc-12345678"
    security_group_ids: ["sg-12345678"]
    subnets: ["subnet-12345678", "subnet-87654321"]
    
  # Compute Configuration
  compute:
    processing_instance_type: "ml.m5.large"
    training_instance_type: "ml.m5.xlarge"
    processing_instance_count: 1
    training_instance_count: 1
    
  # Feature Store Configuration
  feature_store:
    offline_store_s3_uri: "s3://my-sagemaker-bucket/feature-store/"
    enable_online_store: false
    
  # IAM Configuration
  iam:
    execution_role: "arn:aws:iam::123456789012:role/SageMakerExecutionRole"
    
  # KMS Configuration (optional)
  kms:
    key_id: "arn:aws:kms:REGION:ACCOUNT-ID:key/KEY-ID"
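Most values in this file follow a recognizable AWS format, so a typo can be caught before any job is submitted. The following is a minimal sketch of such a check, written against the dict that a YAML parser would return for the `defaults` section. The patterns are deliberately simplified assumptions and `find_format_errors` is a hypothetical helper, not the library's own validation.

```python
import re

# Simplified patterns, assumed for illustration; real AWS identifiers have more variants.
SCALAR_PATTERNS = {
    ("networking", "vpc_id"): re.compile(r"^vpc-[0-9a-f]+$"),
    ("iam", "execution_role"): re.compile(r"^arn:aws:iam::\d{12}:role/.+$"),
    ("feature_store", "offline_store_s3_uri"): re.compile(r"^s3://[^/]+(/.*)?$"),
}
LIST_PATTERNS = {
    ("networking", "security_group_ids"): re.compile(r"^sg-[0-9a-f]+$"),
    ("networking", "subnets"): re.compile(r"^subnet-[0-9a-f]+$"),
}


def find_format_errors(defaults):
    """Return messages for values that do not match the expected shape."""
    errors = []
    for (section, field), pattern in SCALAR_PATTERNS.items():
        value = defaults.get(section, {}).get(field)
        if value is not None and not pattern.match(str(value)):
            errors.append(f"{section}.{field}: unexpected value {value!r}")
    for (section, field), pattern in LIST_PATTERNS.items():
        for item in defaults.get(section, {}).get(field, []):
            if not pattern.match(str(item)):
                errors.append(f"{section}.{field}: unexpected value {item!r}")
    return errors


# Same values as the YAML above, written as a plain dict.
example_defaults = {
    "networking": {
        "vpc_id": "vpc-12345678",
        "security_group_ids": ["sg-12345678"],
        "subnets": ["subnet-12345678", "subnet-87654321"],
    },
    "iam": {"execution_role": "arn:aws:iam::123456789012:role/SageMakerExecutionRole"},
    "feature_store": {"offline_store_s3_uri": "s3://my-sagemaker-bucket/feature-store/"},
}
print(find_format_errors(example_defaults))
```

Sections that are absent are skipped rather than reported, which matches the optional KMS block in the file format.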

Configuration Precedence

Configuration values are applied in the following order (later values override earlier ones):

1. SageMaker SDK defaults - Built-in defaults from the SageMaker SDK
2. YAML configuration - Values from your configuration file
3. Runtime parameters - Values passed directly to method calls

Example:

# This will use the training_instance_type from config (ml.m5.xlarge)
trainer = session.run_training_job(job_name="my-job", ...)

# This will override the config and use ml.p3.2xlarge
trainer = session.run_training_job(
    job_name="my-job",
    instance_type="ml.p3.2xlarge",  # Runtime override
    ...
)
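The three-layer precedence can be pictured as two nested dictionary merges. The sketch below is one way to implement it, not the library's actual code: `deep_merge` and `resolve` are hypothetical helpers, the SDK default value is a placeholder, and treating a runtime value of `None` as "not provided" is an assumption.

```python
def deep_merge(base, override):
    """Return a new dict where values in `override` win, recursing into nested dicts."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        elif value is not None:  # assumption: None means "not provided"
            merged[key] = value
    return merged


def resolve(sdk_defaults, yaml_config, runtime_params):
    """Apply the layers in order: SDK defaults, then YAML, then runtime parameters."""
    return deep_merge(deep_merge(sdk_defaults, yaml_config), runtime_params)


# Placeholder SDK default, chosen only to make the layering visible.
sdk_defaults = {"compute": {"training_instance_type": "ml.m5.large", "training_instance_count": 1}}
yaml_config = {"compute": {"training_instance_type": "ml.m5.xlarge"}}

from_config = resolve(sdk_defaults, yaml_config, {})
overridden = resolve(sdk_defaults, yaml_config, {"compute": {"training_instance_type": "ml.p3.2xlarge"}})

print(from_config["compute"]["training_instance_type"])
print(overridden["compute"]["training_instance_type"])
```

Because each merge returns a new dictionary, the configured defaults are never mutated by a per-call override.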

Encryption Setup

mlp_sdk_v3 supports AES-256-GCM encryption for sensitive configuration values.

Generating an Encryption Key

from mlp_sdk_v3.config import ConfigurationManager

# Generate a new encryption key
key = ConfigurationManager.generate_key()
print(f"Encryption key: {key}")
# Save this key securely!
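AES-256 uses a 32-byte key, and the environment-variable example further down refers to a base64-encoded key. The sketch below shows what generating and validating such a key could look like using only the standard library. `new_key_b64` and `decode_key` are hypothetical helpers, and the exact format returned by `ConfigurationManager.generate_key()` is an assumption here.

```python
import base64
import secrets

KEY_BYTES = 32  # 256 bits, the key size AES-256 requires


def new_key_b64():
    """Generate a random 256-bit key and return it as base64 text."""
    return base64.b64encode(secrets.token_bytes(KEY_BYTES)).decode("ascii")


def decode_key(key_b64):
    """Decode a base64 key and reject anything that is not exactly 32 bytes."""
    raw = base64.b64decode(key_b64, validate=True)
    if len(raw) != KEY_BYTES:
        raise ValueError(f"expected {KEY_BYTES} key bytes, got {len(raw)}")
    return raw


key_text = new_key_b64()
print(len(key_text), len(decode_key(key_text)))
```

Checking the decoded length early gives a clearer failure than an error raised later by the cipher.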

Loading Encryption Keys

From Environment Variable

import os
from mlp_sdk_v3 import MLP_Session
from mlp_sdk_v3.config import ConfigurationManager

# Set environment variable (in practice, set this outside your code)
os.environ['MLP_SDK_ENCRYPTION_KEY'] = 'your-base64-encoded-key'

# Load key from environment
key = ConfigurationManager.load_key_from_env()
session = MLP_Session(config_path="encrypted-config.yaml")

From File

from mlp_sdk_v3.config import ConfigurationManager

# Load key from file
key = ConfigurationManager.load_key_from_file("/path/to/keyfile")
config_manager = ConfigurationManager(
    config_path="encrypted-config.yaml",
    encryption_key=key
)

From AWS KMS

from mlp_sdk_v3.config import ConfigurationManager

# Load key from KMS
key = ConfigurationManager.load_key_from_kms(
    key_id="arn:aws:kms:REGION:ACCOUNT-ID:key/KEY-ID",
    region="us-west-2"
)
config_manager = ConfigurationManager(
    config_path="encrypted-config.yaml",
    encryption_key=key
)

Encrypting Configuration Files

from mlp_sdk_v3.config import ConfigurationManager

# Generate or load encryption key
key = ConfigurationManager.generate_key()

# Create configuration manager
config_manager = ConfigurationManager(encryption_key=key)

# Encrypt specific fields in configuration file
config_manager.encrypt_config_file(
    input_path="plain-config.yaml",
    output_path="encrypted-config.yaml",
    fields_to_encrypt=[
        "defaults.iam.execution_role",
        "defaults.kms.key_id"
    ]
)
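The entries in `fields_to_encrypt` are dotted paths into the nested configuration. The sketch below illustrates how such paths can select individual values while leaving the rest of the file untouched. The helpers are hypothetical, and the transform is a visible placeholder rather than real encryption, which in the library is AES-256-GCM.

```python
import copy


def get_path(tree, dotted):
    """Follow a dotted path such as 'defaults.iam.execution_role' into nested dicts."""
    node = tree
    for part in dotted.split("."):
        node = node[part]
    return node


def set_path(tree, dotted, value):
    """Replace the value at a dotted path in place."""
    *parents, leaf = dotted.split(".")
    node = tree
    for part in parents:
        node = node[part]
    node[leaf] = value


def transform_fields(config, fields, transform):
    """Return a copy of `config` with `transform` applied to the selected fields only."""
    result = copy.deepcopy(config)
    for dotted in fields:
        set_path(result, dotted, transform(get_path(result, dotted)))
    return result


def placeholder(value):
    """Stand-in for an encryption call; it only marks the field as processed."""
    return f"<protected {len(value)} chars>"


config = {
    "defaults": {
        "s3": {"default_bucket": "my-sagemaker-bucket"},
        "iam": {"execution_role": "arn:aws:iam::123456789012:role/SageMakerExecutionRole"},
        "kms": {"key_id": "arn:aws:kms:REGION:ACCOUNT-ID:key/KEY-ID"},
    }
}
protected = transform_fields(config, ["defaults.iam.execution_role", "defaults.kms.key_id"], placeholder)
print(protected["defaults"]["iam"]["execution_role"])
print(protected["defaults"]["s3"]["default_bucket"])
```

A path that does not exist raises `KeyError` in this sketch, which surfaces a misspelled field name instead of silently skipping it.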

Decrypting Configuration Files

from mlp_sdk_v3.config import ConfigurationManager

# Load encryption key
key = ConfigurationManager.load_key_from_env()

# Create configuration manager
config_manager = ConfigurationManager(encryption_key=key)

# Decrypt specific fields
config_manager.decrypt_config_file(
    input_path="encrypted-config.yaml",
    output_path="decrypted-config.yaml",
    fields_to_decrypt=[
        "defaults.iam.execution_role",
        "defaults.kms.key_id"
    ]
)

Advanced Usage

Accessing Underlying SageMaker SDK Objects

from mlp_sdk_v3 import MLP_Session

session = MLP_Session()

# Access SageMaker session
sagemaker_session = session.sagemaker_session

# Access boto3 clients
s3_client = session.boto_session.client('s3')
sagemaker_client = session.sagemaker_client
runtime_client = session.sagemaker_runtime_client

# Get session properties
print(f"Region: {session.region_name}")
print(f"Account ID: {session.account_id}")
print(f"Default bucket: {session.default_bucket}")

Audit Trail

Track all operations for debugging and compliance:

from mlp_sdk_v3 import MLP_Session

# Initialize session with audit trail enabled (default)
session = MLP_Session(enable_audit_trail=True)

# Perform operations
session.create_feature_group(...)
session.run_processing_job(...)

# Get audit trail entries
entries = session.get_audit_trail(operation="create_feature_group")
print(f"Found {len(entries)} feature group operations")

# Get audit trail summary
summary = session.get_audit_trail_summary()
print(f"Total operations: {summary['total_entries']}")
print(f"Failed operations: {len(summary['failed_operations'])}")

# Export audit trail
session.export_audit_trail("audit-trail.json", format="json")
session.export_audit_trail("audit-trail.csv", format="csv")
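To make the filtering, summary, and export calls above more concrete, here is a small in-memory sketch of an audit trail. It is an illustration under assumptions, not the library's implementation: the `AuditTrail` class, the `success` and `failed` status values, and the entry fields are all hypothetical. Only the `total_entries` and `failed_operations` summary keys come from the example above.

```python
import csv
import io
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


@dataclass
class AuditEntry:
    operation: str
    status: str
    timestamp: str = field(default_factory=lambda: datetime.now(timezone.utc).isoformat())


class AuditTrail:
    def __init__(self):
        self._entries = []

    def record(self, operation, status):
        self._entries.append(AuditEntry(operation, status))

    def get(self, operation=None, status=None, limit=None):
        """Filter by operation and/or status; `limit` keeps the first N matches."""
        matches = [
            entry for entry in self._entries
            if (operation is None or entry.operation == operation)
            and (status is None or entry.status == status)
        ]
        return matches[:limit]

    def summary(self):
        failed = [asdict(entry) for entry in self._entries if entry.status == "failed"]
        return {"total_entries": len(self._entries), "failed_operations": failed}

    def export(self, fmt="json"):
        """Serialize all entries as JSON or CSV text."""
        rows = [asdict(entry) for entry in self._entries]
        if fmt == "json":
            return json.dumps(rows, indent=2)
        if fmt == "csv":
            buffer = io.StringIO()
            writer = csv.DictWriter(buffer, fieldnames=["operation", "status", "timestamp"])
            writer.writeheader()
            writer.writerows(rows)
            return buffer.getvalue()
        raise ValueError(f"unsupported format: {fmt}")


trail = AuditTrail()
trail.record("create_feature_group", "success")
trail.record("run_processing_job", "failed")
trail.record("create_feature_group", "success")
print(len(trail.get(operation="create_feature_group")), trail.summary()["total_entries"])
```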

Logging Configuration

import logging
from mlp_sdk_v3 import MLP_Session

# Initialize with custom log level
session = MLP_Session(log_level=logging.DEBUG)

# Change log level at runtime
session.set_log_level(logging.WARNING)
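A session-level log setting like the one above typically maps onto a named logger from the standard `logging` module. The sketch below shows that pattern. The logger name and both helper functions are assumptions made for illustration and may not match what mlp_sdk_v3 does internally.

```python
import logging

LOGGER_NAME = "mlp_sdk_v3_sketch"  # hypothetical logger name


def configure_logging(level=logging.INFO):
    """Create (or reuse) the named logger and attach a single stream handler."""
    logger = logging.getLogger(LOGGER_NAME)
    logger.setLevel(level)
    if not logger.handlers:  # avoid duplicate handlers on repeated calls
        handler = logging.StreamHandler()
        handler.setFormatter(logging.Formatter("%(asctime)s %(name)s %(levelname)s %(message)s"))
        logger.addHandler(handler)
    return logger


def set_log_level(level):
    """Change the level at runtime without touching the handlers."""
    logging.getLogger(LOGGER_NAME).setLevel(level)


logger = configure_logging(logging.DEBUG)
logger.debug("visible at DEBUG")
set_log_level(logging.WARNING)
logger.info("suppressed once the level is WARNING")
```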

Runtime Configuration Updates

from mlp_sdk_v3 import MLP_Session

session = MLP_Session()

# Update session configuration at runtime
session.update_session_config(default_bucket="new-bucket-name")

# Get current configuration
config = session.get_config()
print(config)

Error Handling

mlp_sdk_v3 provides detailed error messages with AWS error details:

from mlp_sdk_v3 import MLP_Session, ValidationError, AWSServiceError, ConfigurationError

try:
    session = MLP_Session()
    feature_group = session.create_feature_group(
        feature_group_name="",  # Invalid: empty name
        ...
    )
except ValidationError as e:
    print(f"Validation error: {e}")
except AWSServiceError as e:
    print(f"AWS error: {e}")
    print(f"Error code: {e.error_code}")
    print(f"Request ID: {e.request_id}")
    print(f"Details: {e.get_error_details()}")
except ConfigurationError as e:
    print(f"Configuration error: {e}")
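The attributes used above (`error_code`, `request_id`, `get_error_details()`) suggest an exception that carries fields from the AWS error response. The sketch below shows one way such a hierarchy could be shaped. The class bodies and the `from_response` constructor are assumptions, not the library's source; the response layout follows the botocore error format, with `Error` and `ResponseMetadata` keys.

```python
class MLPSketchError(Exception):
    """Base class for the hypothetical hierarchy in this sketch."""


class ValidationError(MLPSketchError):
    """Raised for invalid arguments, before any AWS call is made."""


class ConfigurationError(MLPSketchError):
    """Raised when the configuration file is missing or malformed."""


class AWSServiceError(MLPSketchError):
    """Wraps an AWS error response and keeps the fields useful for support tickets."""

    def __init__(self, message, error_code=None, request_id=None):
        super().__init__(message)
        self.error_code = error_code
        self.request_id = request_id

    def get_error_details(self):
        return {
            "message": str(self),
            "error_code": self.error_code,
            "request_id": self.request_id,
        }

    @classmethod
    def from_response(cls, response):
        """Build the exception from a botocore-style error response dict."""
        error = response.get("Error", {})
        metadata = response.get("ResponseMetadata", {})
        return cls(
            error.get("Message", "unknown AWS error"),
            error_code=error.get("Code"),
            request_id=metadata.get("RequestId"),
        )


# Example response with made-up values.
response = {
    "Error": {"Code": "ResourceInUse", "Message": "Feature group already exists"},
    "ResponseMetadata": {"RequestId": "example-request-id"},
}
error = AWSServiceError.from_response(response)
print(error.get_error_details())
```

A shared base class lets callers catch every wrapper error with a single `except` clause while still handling AWS failures separately when they need the request ID.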

API Reference

MLP_Session

Main interface for all mlp_sdk_v3 operations.

Methods

__init__(config_path=None, log_level=logging.INFO, enable_audit_trail=True, **kwargs) - Initialize session
create_feature_group(feature_group_name, record_identifier_name, event_time_feature_name, feature_definitions, **kwargs) - Create feature group
run_processing_job(job_name, processing_script=None, inputs=None, outputs=None, **kwargs) - Execute processing job
run_training_job(job_name, training_image, source_code_dir=None, entry_script=None, requirements=None, inputs=None, **kwargs) - Execute training job
create_pipeline(pipeline_name, steps, parameters=None, **kwargs) - Create pipeline
upsert_pipeline(pipeline, **kwargs) - Create or update pipeline
start_pipeline_execution(pipeline_name, **kwargs) - Start pipeline execution
get_config() - Get current configuration
get_execution_role() - Get IAM execution role
set_log_level(level) - Set logging level
get_audit_trail(operation=None, status=None, limit=None) - Get audit trail entries
export_audit_trail(file_path, format='json') - Export audit trail

Properties

sagemaker_session - Underlying SageMaker session
boto_session - Underlying boto3 session
sagemaker_client - SageMaker boto3 client
sagemaker_runtime_client - SageMaker Runtime boto3 client
region_name - AWS region name
default_bucket - Default S3 bucket
account_id - AWS account ID

ConfigurationManager

Handles configuration loading and encryption.

Methods

__init__(config_path=None, encryption_key=None) - Initialize configuration manager
get_default(key, fallback=None) - Get configuration value
merge_with_runtime(runtime_config) - Merge runtime parameters with defaults
encrypt_value(plaintext, key=None) - Encrypt a value
decrypt_value(encrypted, key=None) - Decrypt a value
encrypt_config_file(input_path, output_path, fields_to_encrypt, key=None) - Encrypt configuration file
decrypt_config_file(input_path, output_path, fields_to_decrypt, key=None) - Decrypt configuration file

Static Methods

generate_key() - Generate new encryption key
load_key_from_env(env_var='MLP_SDK_ENCRYPTION_KEY') - Load key from environment
load_key_from_file(file_path) - Load key from file
load_key_from_kms(key_id, region=None) - Load key from AWS KMS

Development

Setup

# Clone the repository
git clone https://github.com/example/mlp_sdk_v3.git
cd mlp_sdk_v3

# Install in development mode with test dependencies
pip install -e ".[dev]"

Testing

# Run all tests
pytest

# Run unit tests only
pytest tests/unit/

# Run property-based tests only
pytest tests/property/

# Run with coverage
pytest --cov=mlp_sdk_v3

# Run specific test file
pytest tests/unit/test_session.py

Code Quality

# Format code
black mlp_sdk_v3 tests

# Sort imports
isort mlp_sdk_v3 tests

# Lint code
flake8 mlp_sdk_v3 tests

# Type checking
mypy mlp_sdk_v3

Requirements

Python >= 3.8
sagemaker >= 3.0.0
boto3 >= 1.26.0
pyyaml >= 6.0
pydantic >= 2.0.0
cryptography >= 41.0.0

License

MIT License - see LICENSE file for details.

Contributing

Contributions are welcome! Please read our contributing guidelines and submit pull requests to our GitHub repository.

Support

For issues, questions, or contributions, please visit our GitHub repository.

Examples

The examples/ directory contains helpful scripts and guides:

generate_admin_config.py - Generate configuration files
basic_usage.py - Basic SDK usage examples
sagemaker_operations.py - SageMaker operations examples
xgboost_training_example.ipynb - XGBoost training notebook ⭐
xgboost_training_script.py - XGBoost training script
QUICKSTART.md - 5-minute quick start guide
TRAINING_EXAMPLES.md - Detailed training guide
README.md - Examples documentation

Run examples:

# Generate config
python examples/generate_admin_config.py --interactive

# Run basic examples
python examples/basic_usage.py

# Run SageMaker operations examples
python examples/sagemaker_operations.py

# Run XGBoost training (script)
python examples/xgboost_training_script.py --wait

# Run XGBoost training (notebook)
jupyter notebook examples/xgboost_training_example.ipynb

Release history

0.1.1 (this version) - Mar 23, 2026
0.1.0 - Feb 4, 2026