Python/sh wrapper for executing commands inside docker containers and mapping host:container volumes

shoosh

Wrapper for sh to run Shell commands inside a Docker container and handle volume mappings seamlessly.

How to

1: Instantiate a container

The first step is to instantiate the (Docker) container we are going to use. (Use whatever method/interface you are used to; the lib is not meant to manage the container's lifecycle.)

For example, let's run a simple debian container, mapping the host directory /tmp/test to /some/path inside the container:

$ docker run -dt \
        --name some_container \
        -v /tmp/test:/some/path \
        debian

2: Create a handle to the container

Then, we create a handle to the container with the volume(s) set:

>>> import shoosh
>>> sh = shoosh.init('some_container')
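How the handle learns about the volumes is not described here. As a rough sketch only, and not shoosh's actual implementation, the host-to-container mapping could be read from the JSON that `docker inspect --format '{{json .Mounts}}' some_container` prints. The `parse_mounts` function and the hand-written `sample` string below are hypothetical; the sample only imitates the shape of that output for the container started in step 1.

```python
import json


def parse_mounts(inspect_json):
    """Return {host_path: container_path} from a Docker 'Mounts' JSON list."""
    mounts = json.loads(inspect_json)
    # Each mount entry carries the host side as "Source" and the
    # container side as "Destination".
    return {m["Source"]: m["Destination"] for m in mounts}


# Hand-written sample shaped like the output of:
#   docker inspect --format '{{json .Mounts}}' some_container
sample = (
    '[{"Type": "bind", "Source": "/tmp/test", '
    '"Destination": "/some/path", "Mode": "", "RW": true, '
    '"Propagation": "rprivate"}]'
)

volumes = parse_mounts(sample)
print(volumes)  # {'/tmp/test': '/some/path'}
```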

3: And execute a command through the handle

By default, whenever you cite the (host) volume, it is translated to the internal path:

>>> echo = sh.wrap('echo')
>>> echo("Internal path:", '/tmp/test')
Internal path: /some/path
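The translation seen above can be pictured with a small standalone function. This is a minimal sketch assuming a plain dictionary of host-to-container directories; `to_container_path` is a hypothetical name and not part of shoosh's API.

```python
from pathlib import PurePosixPath


def to_container_path(arg, volumes):
    """Translate `arg` if it lies under a mapped host directory."""
    if not arg.startswith("/"):
        return arg  # not an absolute path, leave it untouched
    path = PurePosixPath(arg)
    # Try the longest host directories first, so nested mounts
    # resolve to the most specific mapping.
    for host in sorted(volumes, key=len, reverse=True):
        try:
            relative = path.relative_to(host)
        except ValueError:
            continue  # `arg` is not under this host directory
        return str(PurePosixPath(volumes[host]) / relative)
    return arg


volumes = {"/tmp/test": "/some/path"}
print(to_container_path("/tmp/test", volumes))        # /some/path
print(to_container_path("/tmp/test/a.tif", volumes))  # /some/path/a.tif
print(to_container_path("/tmp/testing", volumes))     # /tmp/testing
```

Comparing whole path components rather than raw string prefixes keeps a sibling directory such as /tmp/testing from being rewritten by mistake.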

Rationale

The typical use of (Docker) containers is to spin up a container with the software packages we need for a given task, process our data, and eventually get the results back to the host system.

When we run (Docker) containers, though, the filesystem structure inside the container is completely detached from that of the host system. The way to exchange data files between host and container is through container mount points -- or volumes.

In practice, it means that the paths we see inside a container -- where our data is sitting -- are different from those on the host system -- where the very same data files are stored and shared with the container through volumes.

Consider the following scenario:

  • For some reason we use an osgeo/gdal container to process geographical data;
  • The data in our host system is stored under /home/user/data/some_project;
  • When we instantiate the osgeo/gdal image as the some_gdal container, we bind-mount that path to the container's /data/geo;
  • Suppose all we want to do is convert our data files from GeoTiff to Cloud Optimized GeoTiff (https://gdal.org/drivers/raster/cog.html):
    $ gdal_translate /data/geo/raster.tif /data/geo/raster_cog.tif -of COG
    

Everything works very well if the interaction with the container is decoupled from the host system, i.e. if we execute the commands from inside the container. But if a command has to be requested from the host system and executed inside the container, one must account for the different paths that result from the host-container volume binding.
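For the scenario above, requesting the conversion from the host could look roughly like the following. This is a sketch under stated assumptions, not how shoosh does it internally: `remap` and `docker_exec_argv` are hypothetical helpers, and the code only builds the `docker exec` command line with the host paths rewritten, without running it.

```python
import shlex

# Host directory -> container directory, as in the scenario above.
VOLUMES = {"/home/user/data/some_project": "/data/geo"}


def remap(arg, volumes):
    """Rewrite a host path under a mapped directory to its container path."""
    for host, inner in volumes.items():
        if arg == host or arg.startswith(host + "/"):
            return inner + arg[len(host):]
    return arg  # options and unmapped paths pass through unchanged


def docker_exec_argv(container, command, *args, volumes):
    """Build the argument list for `docker exec` with remapped paths."""
    return ["docker", "exec", container, command,
            *(remap(a, volumes) for a in args)]


argv = docker_exec_argv(
    "some_gdal", "gdal_translate",
    "/home/user/data/some_project/raster.tif",
    "/home/user/data/some_project/raster_cog.tif",
    "-of", "COG",
    volumes=VOLUMES,
)
print(shlex.join(argv))
# docker exec some_gdal gdal_translate /data/geo/raster.tif /data/geo/raster_cog.tif -of COG
```

With the some_gdal container running, the resulting list could be handed to `subprocess.run(argv, check=True)`.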

Requests (from the host system) to run processing tasks inside a container can have different motivations. It can be a personal preference: keeping the host system's software set clean while not having to get inside the container every time a simple, atomic task has to be performed. But it can also be part of a more complex software application running on the host system -- like a data processing pipeline -- where different components demand different third-party software tools (available through containers).

Motivation

In my case, I wrote this software because I use Python to transform planetary (Mars) geo-located data in different ways, using both OSGEO's GDAL and USGS's ISIS software packages, in a couple of cloud computing projects (GMAP, NEANIAS). For my own convenience, I had to find a way to merge data visualization, metadata selection, download, cleaning, map-projection, formatting, etc., on the host system and in containers seamlessly through Python.

Application (value)

The primary feature of this (Python) library is the execution of Shell commands inside (Docker) containers with the correct re-mapping of paths on the fly. It also provides a seamless interface to execute Shell commands in the current environment. Re-mapping of paths is not necessary in this case, but it allows you to move your code from a "host-container" scenario to a "container-only" one -- in a cloud infrastructure -- seamlessly.

All this seamlessness is possible because of the sh package.
