SageMaker Python SDK

SageMaker Python SDK is an open source library for training and deploying machine learning models on Amazon SageMaker.

With the SDK, you can train and deploy models using the popular deep learning frameworks Apache MXNet and TensorFlow. You can also train and deploy models with Amazon algorithms, which are scalable implementations of core machine learning algorithms that are optimized for SageMaker and GPU training. If you have your own algorithms built into SageMaker-compatible Docker containers, you can train and host models using these as well.

For a detailed API reference, see Read the Docs.

Table of Contents

  1. Installing SageMaker Python SDK

  2. Using the SageMaker Python SDK

  3. MXNet SageMaker Estimators

  4. TensorFlow SageMaker Estimators

  5. Chainer SageMaker Estimators

  6. PyTorch SageMaker Estimators

  7. Scikit-learn SageMaker Estimators

  8. XGBoost SageMaker Estimators

  9. SageMaker Reinforcement Learning Estimators

  10. SageMaker SparkML Serving

  11. AWS SageMaker Estimators

  12. Using SageMaker AlgorithmEstimators

  13. Consuming SageMaker Model Packages

  14. BYO Docker Containers with SageMaker Estimators

  15. SageMaker Automatic Model Tuning

  16. SageMaker Batch Transform

  17. Secure Training and Inference with VPC

  18. BYO Model

  19. Inference Pipelines

  20. Amazon SageMaker Operators for Kubernetes

  21. SageMaker Workflow

  22. SageMaker Autopilot

  23. Model Monitoring

  24. SageMaker Debugger

  25. SageMaker Processing

Installing the SageMaker Python SDK

The SageMaker Python SDK is published to PyPI and can be installed with pip as follows:

pip install sagemaker

You can install from source by cloning this repository and running a pip install command in the root directory of the repository:

git clone https://github.com/aws/sagemaker-python-sdk.git
cd sagemaker-python-sdk
pip install .
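To confirm that either installation route worked, a minimal sketch like the following can report the installed version. It assumes Python 3.8 or later for importlib.metadata, which is newer than the interpreters this release is tested on, and the helper name installed_version is hypothetical.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None
```

Calling installed_version("sagemaker") should return a version string after a successful install, and None otherwise.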

Supported Operating Systems

SageMaker Python SDK supports Unix/Linux and Mac.

Supported Python Versions

SageMaker Python SDK is tested on:

  • Python 2.7

  • Python 3.6

AWS Permissions

As a managed service, Amazon SageMaker performs operations on your behalf on the AWS hardware that is managed by Amazon SageMaker. Amazon SageMaker can perform only operations that the user permits. You can read more about which permissions are necessary in the AWS Documentation.

The SageMaker Python SDK should not require any additional permissions aside from what is required for using SageMaker. However, if you are using an IAM role with a path in it, you should grant permission for iam:GetRole.
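As a rough illustration, the extra permission could be expressed as an IAM policy document like the one built below. The role ARN, account ID, and helper name get_role_policy are hypothetical; substitute the ARN of your own role.

```python
import json


def get_role_policy(role_arn):
    """Build an IAM policy document that allows iam:GetRole on one role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "iam:GetRole",
                "Resource": role_arn,
            }
        ],
    }


# Hypothetical role ARN whose role name is preceded by the path "service-role/".
example_arn = "arn:aws:iam::123456789012:role/service-role/MySageMakerRole"
policy_json = json.dumps(get_role_policy(example_arn), indent=2)
```

The resulting JSON string can be pasted into the IAM console or attached with whichever tooling you already use to manage policies.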

Licensing

SageMaker Python SDK is licensed under the Apache 2.0 License. It is copyright 2018 Amazon.com, Inc. or its affiliates. All Rights Reserved. The license is available at: http://aws.amazon.com/apache2.0/

Running tests

SageMaker Python SDK has unit tests and integration tests.

You can install the libraries needed to run the tests by running pip install --upgrade .[test] or, for Zsh users: pip install --upgrade .\[test\]

Unit tests

We run unit tests with tox, which is a program that lets you run unit tests for multiple Python versions and also makes sure the code fits our style guidelines. We run tox with Python 2.7 and 3.6, so to run unit tests with the same configuration we do, you’ll need to have interpreters for Python 2.7 and Python 3.6 installed.

To run the unit tests with tox, run:

tox tests/unit
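Before invoking tox, it can help to check that both interpreters are discoverable. The sketch below assumes they are exposed on PATH under the conventional executable names python2.7 and python3.6, which may differ on your system; find_interpreters is a hypothetical helper.

```python
import shutil


def find_interpreters(names=("python2.7", "python3.6")):
    """Map each interpreter name to its path on PATH, or None if not found."""
    return {name: shutil.which(name) for name in names}


# Names of any interpreters that could not be located.
missing = [name for name, path in find_interpreters().items() if path is None]
```

An empty missing list suggests that tox should be able to find both interpreters.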

Integration tests

To run the integration tests, the following prerequisites must be met:

  1. AWS account credentials are available in the environment for the boto3 client to use.

  2. The AWS account has an IAM role named SageMakerRole. It should have the AmazonSageMakerFullAccess policy attached as well as a policy with the necessary permissions to use Elastic Inference.

We recommend selectively running just those integration tests you’d like to run. You can filter by individual test function names with:

tox -- -k 'test_i_care_about'

You can also run all of the integration tests with the following command. It runs them in sequence, which may take a while:

tox -- tests/integ

You can also run them in parallel:

tox -- -n auto tests/integ

MXNet SageMaker Estimators

By using MXNet SageMaker Estimators, you can train and host MXNet models on Amazon SageMaker.

Supported versions of MXNet: 0.12.1, 1.0.0, 1.1.0, 1.2.1, 1.3.0, 1.4.0, 1.4.1, 1.6.0.

Supported versions of MXNet for Elastic Inference: 1.3.0, 1.4.0, 1.4.1.

We recommend that you use the latest supported version, because that’s where we focus most of our development efforts.

For more information, see Using MXNet with the SageMaker Python SDK.
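The version lists above can be turned into a small lookup, sketched here, for picking the latest supported release and checking Elastic Inference coverage. The version numbers are copied from this section; the constant and function names are hypothetical and not part of the SDK.

```python
# Version lists copied from this section of the README.
MXNET_VERSIONS = ["0.12.1", "1.0.0", "1.1.0", "1.2.1", "1.3.0", "1.4.0", "1.4.1", "1.6.0"]
MXNET_EI_VERSIONS = ["1.3.0", "1.4.0", "1.4.1"]


def version_key(version):
    """Turn '1.4.1' into (1, 4, 1) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def latest_supported(versions):
    """Return the highest version in the list."""
    return max(versions, key=version_key)


def supports_elastic_inference(version):
    """Check whether an MXNet version is listed for Elastic Inference."""
    return version in MXNET_EI_VERSIONS
```

Comparing versions as integer tuples rather than strings matters once a component reaches two digits, for example 1.10.0 versus 1.9.0 in the TensorFlow list.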

TensorFlow SageMaker Estimators

By using TensorFlow SageMaker Estimators, you can train and host TensorFlow models on Amazon SageMaker.

Supported versions of TensorFlow: 1.4.1, 1.5.0, 1.6.0, 1.7.0, 1.8.0, 1.9.0, 1.10.0, 1.11.0, 1.12.0, 1.13.1, 1.14.0, 1.15.0, 2.0.0.

Supported versions of TensorFlow for Elastic Inference: 1.11.0, 1.12.0, 1.13.1, 1.14.0.

We recommend that you use the latest supported version, because that’s where we focus most of our development efforts.

For more information, see Using TensorFlow with the SageMaker Python SDK.

Chainer SageMaker Estimators

By using Chainer SageMaker Estimators, you can train and host Chainer models on Amazon SageMaker.

Supported versions of Chainer: 4.0.0, 4.1.0, 5.0.0.

We recommend that you use the latest supported version, because that’s where we focus most of our development efforts.

For more information about Chainer, see https://github.com/chainer/chainer.

For more information about Chainer SageMaker Estimators, see Using Chainer with the SageMaker Python SDK.

PyTorch SageMaker Estimators

With PyTorch SageMaker Estimators, you can train and host PyTorch models on Amazon SageMaker.

Supported versions of PyTorch: 0.4.0, 1.0.0, 1.1.0, 1.2.0, 1.3.1.

We recommend that you use the latest supported version, because that’s where we focus most of our development efforts.

For more information about PyTorch, see https://github.com/pytorch/pytorch.

For more information about PyTorch SageMaker Estimators, see Using PyTorch with the SageMaker Python SDK.

Scikit-learn SageMaker Estimators

With Scikit-learn SageMaker Estimators, you can train and host Scikit-learn models on Amazon SageMaker.

Supported versions of Scikit-learn: 0.20.0.

We recommend that you use the latest supported version, because that’s where we focus most of our development efforts.

For more information about Scikit-learn, see https://scikit-learn.org/stable/

For more information about Scikit-learn SageMaker Estimators, see Using Scikit-learn with the SageMaker Python SDK.

XGBoost SageMaker Estimators

With XGBoost SageMaker Estimators, you can train and host XGBoost models on Amazon SageMaker.

Supported versions of XGBoost: 0.90-1.

We recommend that you use the latest supported version, because that’s where we focus most of our development efforts.

For more information about XGBoost, see https://xgboost.readthedocs.io/en/latest/

For more information about XGBoost SageMaker Estimators, see Using XGBoost with the SageMaker Python SDK.

SageMaker Reinforcement Learning Estimators

With Reinforcement Learning (RL) Estimators, you can use reinforcement learning to train models on Amazon SageMaker.

Supported versions of Coach: 0.10.1, 0.11.1 with TensorFlow, 0.11.0 with TensorFlow or MXNet. For more information about Coach, see https://github.com/NervanaSystems/coach

Supported versions of Ray: 0.5.3, 0.6.5 with TensorFlow. For more information about Ray, see https://github.com/ray-project/ray

For more information about SageMaker RL Estimators, see SageMaker Reinforcement Learning Estimators.

SageMaker SparkML Serving

With SageMaker SparkML Serving, you can now perform predictions against a SparkML model in SageMaker. To host a SparkML model in SageMaker, it should be serialized with the MLeap library.

For more information on MLeap, see https://github.com/combust/mleap.

Supported major version of Spark: 2.2 (MLeap version - 0.9.6)

Here is an example of how to create an instance of the SparkMLModel class and use its deploy() method to create an endpoint that can perform predictions against your trained SparkML model.

from sagemaker.sparkml.model import SparkMLModel

# `schema` is a JSON string that you define to describe the model's input and output columns
sparkml_model = SparkMLModel(model_data='s3://path/to/model.tar.gz', env={'SAGEMAKER_SPARKML_SCHEMA': schema})
model_name = 'sparkml-model'
endpoint_name = 'sparkml-endpoint'
predictor = sparkml_model.deploy(initial_instance_count=1, instance_type='ml.c4.xlarge', endpoint_name=endpoint_name)

Once the model is deployed, we can invoke the endpoint with a CSV payload like this:

payload = 'field_1,field_2,field_3,field_4,field_5'
predictor.predict(payload)
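When the field values come from Python objects rather than a hand-written string, one way to build the CSV payload is with the standard csv module, which also quotes values that contain commas. This is only a sketch; to_csv_payload is a hypothetical helper and the sample values are placeholders.

```python
import csv
import io


def to_csv_payload(row):
    """Serialize one record as a single CSV line without a trailing newline."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="").writerow(row)
    return buffer.getvalue()


# Placeholder values standing in for one input record.
payload = to_csv_payload(["field_1", 2, 3.5])
```

The resulting string can be passed to predictor.predict() in the same way as the literal payload above.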

For more information about the different content-type and Accept formats as well as the structure of the schema that SageMaker SparkML Serving recognizes, please see SageMaker SparkML Serving Container.
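For orientation, a schema value for SAGEMAKER_SPARKML_SCHEMA might be assembled as sketched below. This assumes the input/output layout described in the SageMaker SparkML Serving Container documentation; the column names and types are placeholders, so check that documentation for the authoritative structure.

```python
import json

# Illustrative schema: column names and types are placeholders for your own model.
schema_dict = {
    "input": [
        {"name": "field_1", "type": "string"},
        {"name": "field_2", "type": "double"},
    ],
    "output": {"name": "prediction", "type": "double"},
}

# The environment variable carries the schema as a JSON string.
schema = json.dumps(schema_dict)
```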

AWS SageMaker Estimators

Amazon SageMaker provides several built-in machine learning algorithms that you can use to solve a variety of problems.

The full list of algorithms is available at: https://docs.aws.amazon.com/sagemaker/latest/dg/algos.html

The SageMaker Python SDK includes estimator wrappers for the AWS K-means, Principal Components Analysis (PCA), Linear Learner, Factorization Machines, Latent Dirichlet Allocation (LDA), Neural Topic Model (NTM), Random Cut Forest, k-nearest neighbors (k-NN), Object2Vec, and IP Insights algorithms.

For more information, see AWS SageMaker Estimators and Models.

Amazon SageMaker Operators for Kubernetes

You can use Amazon SageMaker Operators for Kubernetes to optimize hyperparameters for a given model, run batch transform jobs over existing models, and set up inference endpoints.

For more information, see Amazon SageMaker Operators for Kubernetes.

SageMaker Workflow

You can use Apache Airflow to author, schedule, and monitor SageMaker workflows.

For more information, see SageMaker Workflow in Apache Airflow.

SageMaker Autopilot

Amazon SageMaker Autopilot is an automated machine learning solution (commonly referred to as “AutoML”) for tabular datasets. It automatically trains and tunes the best machine learning models for classification or regression based on your data, and hosts a series of models on an Inference Pipeline.

For more information about SageMaker Autopilot, see SageMaker Autopilot.

Amazon SageMaker Model Monitoring

You can use Amazon SageMaker Model Monitoring to automatically detect concept drift by monitoring your machine learning models.

For more information, see Amazon SageMaker Model Monitoring.

Amazon SageMaker Debugger

You can use Amazon SageMaker Debugger to automatically detect anomalies while training your machine learning models.

For more information, see Amazon SageMaker Debugger.

Amazon SageMaker Processing

You can use Amazon SageMaker Processing to perform data processing tasks such as data pre- and post-processing, feature engineering, data validation, and model evaluation.

For more information, see Amazon SageMaker Processing.
