Docker Utilities/Patterns for Python Projects

These details have not been verified by PyPI

Project links

Homepage

Development Status
- 5 - Production/Stable
Intended Audience
- Developers
License
- OSI Approved :: MIT License
Natural Language
- English
Programming Language

Project description

Docker Utilities/Patterns

Dockerutils is a set of utilities and conventions around their use. The intent behind these utilities is to provide a very light layer of abstraction to: simplify interaction with docker; support seamlessly running docker locally or on ec2 instance in AWS; allow for multiple images per project; etc.

Perhaps the best way to think of dockerutils is that it embodies two entities, commonly used when working with docker, and a set of complementary commands for working with those entities. The two entities are:

"dock" - the server that is hosting docker (by default, localhost)
image - the standard docker image

The commands used to operate against these entities are:

CommandSet / Entity	Creation	Execution	Jupyter Notebook	Utility
Dock	create-dock destroy-dock	start-dock stop-dock	nb-dock	source dock ls-dock ssh-dock
Image	build-image	run-image	run-notebook	publish-image transfer-image

Possible use cases include:

seperating development/test dependencies out of production container, e.g. production container vs. dev/test container
seperating data science notebook container from execution container
environment experimentation
cases in which you want to "freeze" any external dependencies in one container and use that as a base for containers that are dependent solely on the project

Conventions

Create a docker directory tree at the root of the project

In this directory tree there should be one sub-directory for each unique docker container type that is desired. Each of these sub-directories would contain the Dockerfile that will be used to create the image as well as any source specific to that image.
Use versioneer for project versioning (Optional).

As part of the image build, a file, _version.py.bld, will be generated and placed at the project root. A Dockerfile can add that file to the image on creation to prevent the need for including the .git directory tree in the container context (usually quite expensive).
Create a docker/base directory to make use of built in external dependency isolation (optional)

This capability supports environments where a docker build isn't able to access external dependencies (Docker Hub, pypi, etc.), for instance a server in a "locked-down" environment. A base image can be defined to isolate any dependencies that are required. That image can then be built and transfer-image used to transfer the base image to the target environment.

Subsequent images can be built based off of that image that are "self-contained" (relying only on source from the project). The remote docker api can then be used to quickly iterate only requiring the more cumbersome transfer-image to be used when external dependencies change.
Building and running images controlled through configuration (<project_dir>/docker/dockerutils.cfg)

Includes setting most docker parameters, i.e. volume mounts, ports, networks, commands, etc. with replacement varilable support for things like user, project root, etc.

Command-line Interface

Image cli

build-image takes the name of one of the sub-directories in the docker directory and builds the image defined therein. The image is named <project>-<subdir>:<user>

run-image takes the name of one of the sub-directories (or one of the synthetic images defined in dockerutils.cfg), together with any of the configuration for that image defined in dockerutils.cfg and starts a docker container

transfer-image takes a docker image name and uses docker save and load to transfer the image to a remote host

publish-image takes the name of one of the sub-directories in the docker directory and pushes the image built by the docker file to the defined repository (AWS or Docker)

Notebook cli

run-notebook will start a docker container using either the notebook container found in the docker/notebook directory if it exists, or resero-labs/docker-ds otherwise. The current directory will be mounted into the container for use in the Juypter notebook environment. There are a couple of environment variable to be aware of with this command:

DOCKER_DS_DONT_PULL - if set, the version of resero-labs/docker-ds currently available will be used rather than pulling the latest version from docker hub.
RESERO_JUPYTER_DIFFS - if set, on save, .py files and .html files for the notebook will be created in a .diffs subdirectory.

Dock cli

A "dock" is a remote system that you can connect to through ssh. You can "dock" your terminal to a remote instance and any docker commands, including image and notebook cli above will be run against the remote docker server. Once a "dock" is created, you can dock your terminal by issuing the command source dock <server IP or moniker>

create-dock (register-dock if provisioning from AWS console) is used to add a remote system to the dock list with all its configuration (username, ip and a moniker)

destroy-dock (unregister-dock if provisioned from AWS console) is used to remove the reference to the remote system

stop-dock will change the instances state of a remote dock to stopped

start-dock will change the instance state of a remote dock to running

ssh-dock opens a terminal on the remote dock with ssh

ls-dock list (including state) any created docks

nb-dock run jupyter on the bare AMI of the dock and open a browser window to the notebook server

`dockerutils.cfg` Format

Configuration in docker/dockerutils.cfg is used to configure behavior of the dockerutils scripts.

Image Section

The dockerutils.cfg file format allows for configuration sections named corresponding to the sub-directories in the docker directory tree. Each of these sections may contain one of the following:

environment - in the form of (-e VAR=value)+ to pass environment into the container
interactive - either -e or -it
gpu - conforms to NVIDIA Docker 2.0 specification, e.g. --runtime=nvidia -e NVIDIA_VISIBLE_DEVICES=all, run-image -g will add this automatically
network - the name of the network for the container
volumes - in the form of (-v <host-dir>:<container-dir>)+ or (--mount ...)+ or both
ports - in the form of (-p <host-port>:<container-port>)+
cmd - any valid Docker CMD specification
pull_FROM_on_force - defaults to False, if True, add --pull to build command when force building image (or base image)
image_repo - the repository to publish the image to
publication_tag - the tag for publication (full image name + tag)
pre_build_script - A shell command or script to run before a docker build is issued
post_build_script - A shell command or script to run after a docker build has been compeleted (successfully)

Synthetic Images

Additionally, "synthetic" images can be specified by adding a run-image section with a synthetic_images definition that contains a list of "synthetic" images. Each of these may also have a named section as defined for the docker sub-directories, but must also contain a name value that resolves to one of the docker sub-directories. For example:

[run_image]
synthetic_images=shell

[shell]
name=dev
...

Configuration-only Images

If there is a docker container that does what you want already, you can create a configuration-only image by specifying name, tag and prefix=False in the configuration section for the image. For example the base notebook image resero-labs/docker-ds is often sufficient for running a Jupyter notebook against your code, as it auto detects a setup.py upon container start and installs the module into the notebook environment.

Image Tagging

The default tag for any image created/run/etc. is the user name in the host environment when running the utility. This can be overriden by adding a tag value to the desired section. For example:

[dev]
tag=experiment.2017.12.16
...

Volume Replacement Variables

The volume specification may contain either environment variables ($name and ${name} formats) as well as specific variable replacement designations of the form {var}. The supported variables include:

project_root - will be replaced with the root directory name of the project
user - will be replaced with the user name of the user running the command
project - will be replaced with project name
home - will be replaced with the user's home directory

Image Push Replacement Variables

The publication_tag may contain either environment variables ($name and ${name} formats) as well as specific variable replacement designations of the form {var}. The supported variables include:

account - AWS account designation
region - AWS region
image - Image name
tag - Image tag
user - will be replaced with the user name of the user running the command

Publish to AWS

In order to publish to AWS, the repository will be the image name and the following should be configured for images that are published to AWS

image_repo=aws
publication_tag={account}.dkr.ecr.{region}.amazonaws.com/{image}:{tag}"

Patterns

Running your code in container, making live modifications outside container in your editor of choice

If you're like me, you have a whole set of tools in your host environment that you use to work with your project. One of the disadvantages of working with Docker can be the difficulty of transplanting those tools into the container environment. Perhaps there is a way to have your cake and eat it too! The dev example does a reasonable job of doing just this.

With this pattern, you create a Dockerfile that has everything in the image except for your project source. An empty WORKDIR is created and then ENTRYPOINT even does a pip install -e of the contents of the empty WORKDIR. We get the desired results by mounting the source project directory into the container's WORKDIR (see dev section of docker/dockertuils.cfg).

With this pattern you can run tests, experiment, etc. in container, make changes to the project in your host environment toolset and immediately observe the changes that were made.

Working with a server in a locked-down environment

You may find yourself in a situation in which you need to work with a server hosting Docker in an environment that has limited access to the "outside world". This pattern can be used to capture all external dependencies in a base image that is built in an environment that is open, use transfer-image to send this base image to the server and then utilize a derived image dependent just on project sources and the base image to iterate without requiring open access on the server.

Adding test frameworks, code analysis tools, etc. to a container for testing/validation

Tensorflow for both CPU and GPU in the same container

It is sometimes useful to try both the CPU and GPU versions of Tensorflow. The example in docker/tensorflow provides a pattern to do so. All tensorflow depdendencies are installed into global python site-packages. Then virtual environments are created for both cpu and gpu versions and the appropriate version of tensorflow is install into the respective virtual environments.

The run-image script makes use of NVIDIA Docker 2.0 being installed on the host os. When run with the -g option, the NVIDIA runtime will be used to make GPUs available in container.

Upon container startup, the appropriate virtual environment (cpu or gpu) will be activated dependant upon the -g option.

To switch between virtual envrionments utilized the symbolic links, /cpu-env and /gpu-env, e.g. source /cpu-env.

Versioneer support

If you aren't using versioneer to version your Python projects, you should be. Versioneer utilizes .git to determine the project version. Having correct version information in container is desireable in many cases, but pushing .git into container usually isn't desireable. To capture version information for container, the genversion utility is available and uses capabilities of versioneer to generate a _version.py.bld file in the project root. This file is available for Dockerfile to add/overwrite the version.py file within the Docker image. An example of this is found in the cicd sub-directory.

`.dockerignore`

In order to minimize the context sent to Docker to build images, please see examples in .dockerignore in this repository.

A Note on Implementation

We selected to use the Docker command line from python scripts rather than the Docker API available to Python as it integrates better into my development loop. As each CLI prints the Docker command it's using, if it does something unexpected, it's each to copy and paste the command used, modify it and be on your way without having to debug through CLI utility code, allowing bugs/additions to the CLI to be addressed at a later point outside the context of a project development loop.

Project details

These details have not been verified by PyPI

Project links

Homepage

Development Status
- 5 - Production/Stable
Intended Audience
- Developers
License
- OSI Approved :: MIT License
Natural Language
- English
Programming Language

Release history Release notifications | RSS feed

This version

2.0.7

Feb 5, 2021

2.0.6

Sep 23, 2019

2.0.5

Aug 20, 2019

2.0.4

Aug 9, 2019

2.0.3

Aug 8, 2019

2.0.2

Aug 6, 2019

2.0.1

Aug 6, 2019

2.0.0

Aug 2, 2019

1.5.6

May 31, 2019

1.5.5

Apr 19, 2019

1.5.4

Apr 3, 2019

1.5.3

Mar 22, 2019

1.5.2

Jan 30, 2019

1.5.1

Jan 29, 2019

1.5.0

Jan 25, 2019

1.1.15

Jan 22, 2019

1.1.14

Jan 16, 2019

1.1.13

Dec 21, 2018

1.1.12

Dec 17, 2018

1.1.11

Dec 12, 2018

1.1.10

Dec 10, 2018

1.1.9

Dec 3, 2018

1.1.8

Oct 23, 2018

1.1.7

Oct 15, 2018

1.1.6

Oct 10, 2018

1.1.5

Oct 8, 2018

1.1.4

Oct 1, 2018

1.1.3

Sep 20, 2018

1.1.2

Sep 13, 2018

1.1.1

Sep 12, 2018

1.1

Sep 10, 2018

1.0.2

Aug 25, 2018

1.0.1

Aug 1, 2018

1.0.0

Jul 28, 2018

0.9.15

Jul 28, 2018

0.9.14

Jul 6, 2018

0.9.13

Jul 6, 2018

0.9.12

Apr 25, 2018

0.9.11

Apr 20, 2018

0.9.10

Apr 19, 2018

0.9.9

Apr 19, 2018

0.9.8

Apr 5, 2018

0.9.7

Mar 13, 2018

0.9.5

Feb 8, 2018

0.9.3

Dec 19, 2017

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distributions

No source distribution files available for this release.See tutorial on generating distribution archives.

Built Distribution

dockerutils-2.0.7-py3-none-any.whl (31.9 kB view details)

Uploaded Feb 5, 2021 Python 3

File details

Details for the file dockerutils-2.0.7-py3-none-any.whl.

File metadata

Download URL: dockerutils-2.0.7-py3-none-any.whl
Upload date: Feb 5, 2021
Size: 31.9 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/3.3.0 pkginfo/1.7.0 requests/2.25.1 setuptools/52.0.0.post20210125 requests-toolbelt/0.9.1 tqdm/4.55.1 CPython/3.8.5

File hashes

Hashes for dockerutils-2.0.7-py3-none-any.whl
Algorithm	Hash digest
SHA256	`6c11171ec2e22272c712438cafc736d6298896fdb14609dda03010152cf8f090`
MD5	`773e538b4793103bbdd46589b1f10866`
BLAKE2b-256	`dffcb091131861a67ceec55231f35d45b48470be6e5578d19e36b0764997a2bd`

See more details on using hashes here.

dockerutils 2.0.7

Navigation

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Project description

Docker Utilities/Patterns

Conventions

Command-line Interface

Image cli

Notebook cli

Dock cli

dockerutils.cfg Format

Image Section

Synthetic Images

Configuration-only Images

Image Tagging

Volume Replacement Variables

Image Push Replacement Variables

Publish to AWS

Patterns

Running your code in container, making live modifications outside container in your editor of choice

Working with a server in a locked-down environment

Adding test frameworks, code analysis tools, etc. to a container for testing/validation

Tensorflow for both CPU and GPU in the same container

Versioneer support

.dockerignore

A Note on Implementation

Project details

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distributions

Built Distribution

File details

File metadata

File hashes

`dockerutils.cfg` Format

`.dockerignore`