Python/sh wrapper for executing commands inside docker containers and mapping host:container volumes
Project description
shoosh
Wrapper for sh to run Shell commands inside a Docker container and handle volume mappings seamlessly.
How to
1: Instantiate a container
The first step is to instantiate the (Docker) container we are going to use. (Use whatever method or interface you are used to; the library is not meant to manage the container's lifecycle.)
For example, let's run a simple Debian container, mapping the local /tmp/test directory to /some/path inside the container:
$ docker run -dt \
--name some_container \
-v /tmp/test:/some/path \
debian
2: Create a handle to container
Then we create a handle to the container with the volume(s) set:
>>> import shoosh
>>> sh = shoosh.init('some_container')
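How the handle learns about the container's volumes is not shown here. As a rough sketch, and only as an assumption about one possible approach (not necessarily what shoosh does), the bind mounts of a running container can be read from `docker inspect`. The helper names `parse_mounts` and `container_volumes` are hypothetical, and the sample JSON is hand-written to match the container started in step 1:

```python
import json
import subprocess


def parse_mounts(inspect_json):
    """Turn the JSON list printed by `docker inspect --format '{{json .Mounts}}'`
    into a {host_path: container_path} dictionary (bind mounts only)."""
    mounts = json.loads(inspect_json)
    return {m["Source"]: m["Destination"] for m in mounts if m.get("Type") == "bind"}


def container_volumes(name):
    """Ask Docker for the mounts of container `name` (requires the docker CLI)."""
    out = subprocess.run(
        ["docker", "inspect", "--format", "{{json .Mounts}}", name],
        check=True, capture_output=True, text=True,
    ).stdout
    return parse_mounts(out)


# Hand-written sample of what Docker could report for `some_container`
sample = (
    '[{"Type": "bind", "Source": "/tmp/test", '
    '"Destination": "/some/path", "Mode": "", "RW": true}]'
)
print(parse_mounts(sample))
```

Only `parse_mounts` runs in this sketch; `container_volumes("some_container")` would need the container from step 1 to be up.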
3: And execute a command through the handle
By default, whenever you cite the (host) volume, it is translated to the internal path:
>>> echo = sh.wrap('echo')
>>> echo("Internal path:", '/tmp/test')
Internal path: /some/path
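To make the translation above concrete, here is a minimal, standalone sketch of host-to-container path mapping. It does not use shoosh and is not its actual implementation; the function name `to_container_path` and the `volumes` dictionary are assumptions made for illustration:

```python
from pathlib import PurePosixPath


def to_container_path(path, volumes):
    """Return the container-side path for `path` if it sits under a mapped
    host directory; otherwise return `path` unchanged."""
    p = PurePosixPath(path)
    # Try the longest host prefixes first so nested mounts resolve correctly
    for host, cont in sorted(volumes.items(), key=lambda kv: len(kv[0]), reverse=True):
        try:
            rel = p.relative_to(host)
        except ValueError:
            continue  # `path` is not under this host directory
        return str(PurePosixPath(cont) / rel)
    return path


volumes = {"/tmp/test": "/some/path"}  # host -> container, as in step 1
print(to_container_path("/tmp/test", volumes))
print(to_container_path("/tmp/test/data.txt", volumes))
print(to_container_path("/etc/hosts", volumes))
```

Matching is done per path component, so a sibling directory such as /tmp/testing is left untouched rather than being mistaken for something under /tmp/test.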
Rationale
The typical use of (Docker) containers is to spin up a container with the software packages we need for a given task, process our data, and eventually get the results back to the host system.
When we run (Docker) containers, though, the filesystem structure inside the container is completely detached from that of the host system. The way to exchange data files between host and container is through container mount points, or volumes.
In practice, this means that the paths we see inside a container, where our data is sitting, are different from those on the host system, where the very same data files are stored and shared with the container through volumes.
Consider the following scenario:
- For some reason we use an [osgeo/gdal] container to process geographical data;
- The data in our host system is stored under /home/user/data/some_project;
- When we instantiate the [osgeo/gdal] image as the some_gdal container, we bind-mount that path to the container's /data/geo;
- Suppose all we want to do is convert our data files from GeoTiff to Cloud Optimized GeoTiff (https://gdal.org/drivers/raster/cog.html):
$ gdal_translate /data/geo/raster.tif /data/geo/raster_cog.tif -of COG
Everything works well if the interaction with the container is decoupled from the host system, i.e. if we execute the commands from inside the container. But if we need to request, from the host system, the execution of a command inside the container, we must be aware of the different paths that result from the host-container volume bindings.
There are different motivations for requesting, from the host system, tasks that execute inside a container. It can be a personal preference: keeping the host system's software set clean without having to get inside the container every time a simple, atomic task has to be performed. But it can also be part of a more complex software application running on the host system, like a data processing pipeline, where different components demand different third-party software tools (available through containers).
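For the scenario above, the host-side request amounts to a `docker exec` call whose path arguments have been rewritten. The following is a sketch under stated assumptions, not shoosh's code: `remap` and `docker_exec_argv` are hypothetical helpers, and the command is only assembled, not executed.

```python
def remap(arg, host_dir, container_dir):
    """Swap the host prefix for the container prefix; leave other arguments alone."""
    if arg == host_dir or arg.startswith(host_dir + "/"):
        return container_dir + arg[len(host_dir):]
    return arg


def docker_exec_argv(container, command, args, host_dir, container_dir):
    """Build the argument list for `docker exec CONTAINER COMMAND ARGS...`."""
    mapped = [remap(a, host_dir, container_dir) for a in args]
    return ["docker", "exec", container, command] + mapped


argv = docker_exec_argv(
    "some_gdal",
    "gdal_translate",
    [
        "/home/user/data/some_project/raster.tif",
        "/home/user/data/some_project/raster_cog.tif",
        "-of", "COG",
    ],
    host_dir="/home/user/data/some_project",
    container_dir="/data/geo",
)
print(" ".join(argv))
```

Passing `argv` to `subprocess.run(argv, check=True)` would run the conversion, provided the some_gdal container is up with the volume mounted as described.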
Motivation
In my case, the reason I wrote this software is that I use Python to transform planetary (Mars) geo-located data in different ways, using both OSGEO's GDAL and USGS's ISIS software packages, in a couple of cloud computing projects (GMAP, NEANIAS). I had to find a way, for my own convenience, to merge data visualization, metadata selection, download, cleaning, map-projection, formatting, etc., on the host system and in containers seamlessly through Python.
Application (value)
This (Python) library's primary feature is the execution of Shell commands inside (Docker) containers with the correct re-mapping of paths on the fly. It also provides a seamless interface to execute Shell commands in the current environment. Re-mapping of paths is not necessary in this case, but it allows you to move your code from a "host-container" scenario to a "container-only" one (in a cloud infrastructure) seamlessly.
All this seamlessness is possible because of the sh package.