
Open source library for creating TensorFlow containers to run on Amazon SageMaker.

Project description

SageMaker TensorFlow Containers is an open source library for making the TensorFlow framework run on Amazon SageMaker.

This repository also contains Dockerfiles which install this library, TensorFlow, and dependencies for building SageMaker TensorFlow images.

For information on running TensorFlow jobs on Amazon SageMaker, see the SageMaker Python SDK.

For notebook examples, see the SageMaker Notebook Examples.

Table of Contents

  1. Getting Started

  2. Building your Image

  3. Running the tests

Getting Started


Make sure you have installed all of the following prerequisites on your development machine:

For Testing on GPU

Building your Image

Amazon SageMaker uses Docker containers to run all training jobs and inference endpoints.

The Docker images are built from the Dockerfiles specified in docker/.

The Dockerfiles are grouped based on TensorFlow version and separated based on Python version and processor type.

The Docker files for TensorFlow 2.0 are available in the tf-2 branch, in docker/2.0.0/.
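To see how the Dockerfiles in a checkout are organized, a quick listing like the following can help; the exact version directories present depend on the branch you have checked out:

```shell
# List every Dockerfile in the repository, grouped by version directory
find docker -name 'Dockerfile.*' | sort
```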

The Docker images used to run training and inference jobs are built from corresponding “base” and “final” Dockerfiles.

Base Images

The “base” Dockerfile encompasses the installation of the framework and all of its dependencies. A base image is required when building images for TensorFlow 1.8.0 and earlier; building a base image is not required for TensorFlow 1.9.0 and onwards.

The tagging scheme is based on <tensorflow_version>-<processor>-<python_version> (e.g. 1.4.1-cpu-py2).

All “final” Dockerfiles build images using base images that use the tagging scheme above.

Before building these images, you need to have a pip-installable binary of this repository saved locally. To create the SageMaker TensorFlow Container Python package:


# Create the binary
git clone https://github.com/aws/sagemaker-tensorflow-container.git
cd sagemaker-tensorflow-container
python setup.py sdist
cp dist/sagemaker_tensorflow_training*.tar.gz docker/<tensorflow_version>/sagemaker_tensorflow_training.tar.gz

Once you have copied sagemaker_tensorflow_training.tar.gz to the desired location (the same directory as the Dockerfile), you can then build the image.

If you want to build your “base” Docker image, then use:

# All build instructions assume you're building from the same directory as the Dockerfile.

docker build -t tensorflow-base:<tensorflow_version>-cpu-<python_version> -f Dockerfile.cpu .

docker build -t tensorflow-base:<tensorflow_version>-gpu-<python_version> -f Dockerfile.gpu .
# Example

docker build -t tensorflow-base:1.4.1-cpu-py2 -f Dockerfile.cpu .

docker build -t tensorflow-base:1.4.1-gpu-py2 -f Dockerfile.gpu .
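After the build finishes, you can confirm that the base image exists under the expected repository name and tag (the tensorflow-base name follows the tagging scheme above):

```shell
# List locally built base images and their tags
docker images tensorflow-base
```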

Final Images

The “final” Dockerfiles encompass the installation of the SageMaker specific support code.

For images of TensorFlow 1.8.0 and before, all “final” Dockerfiles use base images for building.

These “base” images are specified with the naming convention of tensorflow-base:<tensorflow_version>-<processor>-<python_version>.

Before building “final” images:

Build your “base” image. Make sure it is named and tagged in accordance with your “final” Dockerfile. Skip this step if you are building an image for TensorFlow 1.9.0 or later.

If you want to build “final” Docker images for versions 1.6 and above, you will first need to download the appropriate TensorFlow pip wheel and then pass in its location as a build argument. These wheels can be obtained from PyPI.

Note that you need to use the tensorflow-gpu wheel when building the GPU image.
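One way to fetch the wheels without a browser is pip download; the version below is just an example, and --no-deps keeps pip from also downloading the wheel's dependencies:

```shell
# Download the CPU and GPU TensorFlow wheels for a given version into the current directory
pip download tensorflow==1.6.0 --no-deps -d .
pip download tensorflow-gpu==1.6.0 --no-deps -d .
```

Note that pip only downloads wheels compatible with the interpreter it runs under, so run it with the same Python version your image targets.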

Then run:

# All build instructions assume you're building from the same directory as the Dockerfile.

docker build -t <image_name>:<tag> --build-arg py_version=<py_version> --build-arg framework_installable=<path to tensorflow binary> -f Dockerfile.cpu .

docker build -t <image_name>:<tag> --build-arg py_version=<py_version> --build-arg framework_installable=<path to tensorflow binary> -f Dockerfile.gpu .
# Example
docker build -t preprod-tensorflow:1.6.0-cpu-py2 --build-arg py_version=2 \
    --build-arg framework_installable=tensorflow-1.6.0-cp27-cp27mu-manylinux1_x86_64.whl -f Dockerfile.cpu .

The Dockerfiles for 1.4 and 1.5 build TensorFlow from source instead, so when building those, you don’t need to download the wheel beforehand:

# All build instructions assume you're building from the same directory as the Dockerfile.

docker build -t <image_name>:<tag> -f Dockerfile.cpu .

docker build -t <image_name>:<tag> -f Dockerfile.gpu .
# Example

docker build -t preprod-tensorflow:1.4.1-cpu-py2 -f Dockerfile.cpu .

docker build -t preprod-tensorflow:1.4.1-gpu-py2 -f Dockerfile.gpu .
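As a quick sanity check (not part of the official instructions), you can run the freshly built image and print the TensorFlow version it ships; the tag below assumes the 1.4.1 CPU example above:

```shell
# Smoke-test the image: import TensorFlow inside the container and print its version
docker run --rm preprod-tensorflow:1.4.1-cpu-py2 \
    python -c "import tensorflow as tf; print(tf.__version__)"
```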

Running the tests

Running the tests requires installation of the SageMaker TensorFlow Container code and its test dependencies.

git clone https://github.com/aws/sagemaker-tensorflow-containers.git
cd sagemaker-tensorflow-containers
pip install -e .[test]

Tests are defined in test/ and include unit, integration and functional tests.

Unit Tests

If you want to run unit tests, then use:

# All test instructions should be run from the top level directory

pytest test/unit
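pytest's usual selection flags work here too; for instance, to run only the tests whose names match a keyword (the keyword below is hypothetical) with verbose output:

```shell
# Run a subset of the unit tests by keyword, with verbose test names
pytest test/unit -k "serve" -v
```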

Integration Tests

Running the integration tests requires Docker and AWS credentials, as the integration tests make calls to a couple of AWS services. The integration and functional tests require configurations specified within their respective config files; make sure to update the account-id and region at a minimum.

Integration tests on GPU require Nvidia-Docker.
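Before running the GPU tests, you can confirm that TensorFlow inside the container actually sees a GPU. This sketch uses the legacy nvidia-docker wrapper and assumes you tagged your GPU image preprod-tensorflow:1.6.0-gpu-py2 (tf.test.is_gpu_available is the TensorFlow 1.x API):

```shell
# Check that TensorFlow inside the GPU image can see a GPU device
nvidia-docker run --rm preprod-tensorflow:1.6.0-gpu-py2 \
    python -c "import tensorflow as tf; print(tf.test.is_gpu_available())"
```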

Before running integration tests:

  1. Build your Docker image.

  2. Pass in the correct pytest arguments to run tests against your Docker image.

If you want to run local integration tests, then use:

# Required arguments for integration tests are found in test/integration/

pytest test/integration --docker-base-name <your_docker_image> \
                        --tag <your_docker_image_tag> \
                        --framework-version <tensorflow_version> \
                        --processor <cpu_or_gpu>
# Example
pytest test/integration --docker-base-name preprod-tensorflow \
                        --tag 1.0 \
                        --framework-version 1.4.1 \
                        --processor cpu

Functional Tests

Functional tests have been removed from the current branch; see the older r1.0 branch for them.


Please read CONTRIBUTING.md for details on our code of conduct and the process for submitting pull requests to us.


SageMaker TensorFlow Containers is licensed under the Apache 2.0 License. Copyright 2018 Amazon.com, Inc. or its affiliates. All Rights Reserved. The license is available at: http://aws.amazon.com/apache2.0/
