occystrap: docker and OCI container tools

Occy Strap

Occy Strap is a simple set of Docker and OCI container tools, which can be used either for container forensics or for implementing an OCI orchestrator, depending on your needs. This is a very early implementation, so be braced for impact.

Quick Start with URI-Style Commands

The recommended way to use Occy Strap is with the new URI-style process and search commands:

# Download from registry to tarball
occystrap process registry://docker.io/library/busybox:latest tar://busybox.tar

# Download from registry to directory
occystrap process registry://docker.io/library/centos:7 dir://centos7

# Export from local Docker to tarball with timestamp normalization
occystrap process docker://myimage:v1 tar://output.tar -f normalize-timestamps

# Search for files in an image
occystrap search registry://docker.io/library/busybox:latest "bin/*sh"

The process Command

The process command takes a source URI, a destination URI, and optional filters:

occystrap process SOURCE DESTINATION [-f FILTER]...

Input URI Schemes

  • registry://HOST/IMAGE:TAG - Docker/OCI registry
  • docker://IMAGE:TAG - Local Docker daemon
  • dockerpush://IMAGE:TAG - Local Docker via push (fast, see below)
  • tar:///path/to/file.tar - Docker-save format tarball

Output URI Schemes

  • tar:///path/to/output.tar - Create tarball
  • dir:///path/to/directory - Extract to directory
  • oci:///path/to/bundle - Create OCI runtime bundle
  • mounts:///path/to/directory - Create overlay mounts
  • docker://IMAGE:TAG - Load into local Docker daemon
  • registry://HOST/IMAGE:TAG - Push to Docker/OCI registry

URI Options

Options can be passed as query parameters:

# Extract with unique names and expansion
occystrap process registry://docker.io/library/busybox:latest \
    "dir://merged?unique_names=true&expand=true"

# Use custom Docker socket
occystrap process "docker://myimage:v1?socket=/run/podman/podman.sock" \
    tar://output.tar
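
As a rough illustration of how a URI like the ones above breaks down into a scheme, a target, and query options, here is a minimal Python sketch using only the standard library. The split_uri helper is hypothetical and is not occystrap's own parser; how occystrap interprets each part may differ.

```python
from urllib.parse import parse_qsl, urlsplit


def split_uri(uri):
    """Split a URI into (scheme, target, options).

    Hypothetical helper for illustration only, not occystrap's parser.
    """
    parts = urlsplit(uri)
    # For registry://HOST/IMAGE:TAG the host lands in netloc and the image
    # in path; for tar:///abs/path the netloc is empty. Joining both gives
    # one "target" string in either case.
    target = parts.netloc + parts.path
    options = dict(parse_qsl(parts.query))
    return parts.scheme, target, options


print(split_uri("dir://merged?unique_names=true&expand=true"))
print(split_uri("docker://myimage:v1?socket=/run/podman/podman.sock"))
```

Note that option values arrive as strings ("true", not True), so any consumer of such a dictionary would need to convert them itself.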

Filters

Filters transform or inspect image elements as they pass through the pipeline:

# Normalize timestamps for reproducible builds
occystrap process registry://docker.io/library/busybox:latest \
    tar://busybox.tar -f normalize-timestamps

# Normalize with custom timestamp
occystrap process registry://docker.io/library/busybox:latest \
    tar://busybox.tar -f "normalize-timestamps:ts=1609459200"

# Search while creating output (prints matches AND creates tarball)
occystrap process registry://docker.io/library/busybox:latest \
    tar://busybox.tar -f "search:pattern=*.conf"

# Chain multiple filters
occystrap process registry://docker.io/library/busybox:latest \
    tar://busybox.tar -f normalize-timestamps -f "search:pattern=bin/*"

# Record layer metadata to a JSONL file (inspect filter)
occystrap process docker://myimage:v1 registry://myregistry/myimage:v1 \
    -f "inspect:file=layers-before.jsonl" \
    -f normalize-timestamps \
    -f "inspect:file=layers-after.jsonl"

# Exclude files matching glob patterns from layers
occystrap process registry://docker.io/library/python:3.11 \
    tar://python.tar -f "exclude:pattern=**/.git/**"

# Exclude multiple patterns (comma-separated)
occystrap process registry://docker.io/library/python:3.11 \
    tar://python.tar -f "exclude:pattern=**/.git/**,**/__pycache__/**,**/*.pyc"

# Load image directly into local Docker daemon
occystrap process registry://docker.io/library/busybox:latest \
    docker://busybox:latest

# Load into Podman
occystrap process registry://docker.io/library/busybox:latest \
    "docker://busybox:latest?socket=/run/podman/podman.sock"

# Push image to a registry
occystrap process docker://myimage:v1 \
    registry://myregistry.example.com/myuser/myimage:v1

# Push to registry with authentication
occystrap --username myuser --password mytoken \
    process tar://image.tar registry://ghcr.io/myorg/myimage:latest

# Push with zstd compression (better ratio, requires Docker 20.10+/containerd 1.5+)
occystrap --compression=zstd \
    process docker://myimage:v1 registry://myregistry.example.com/myimage:v1
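
The filter arguments above follow a name:key=value shape, with the exclude filter taking comma-separated patterns. The following Python sketch shows one way such a string could be split. It is an illustration under assumptions: parse_filter_spec and exclude_patterns are hypothetical names, only a single key=value option per filter is handled, and occystrap's real parsing rules may differ.

```python
def parse_filter_spec(spec):
    """Split 'name' or 'name:key=value' into (name, options).

    Hypothetical helper; assumes at most one key=value option.
    """
    name, _, rest = spec.partition(":")
    options = {}
    if rest:
        key, _, value = rest.partition("=")
        options[key] = value
    return name, options


def exclude_patterns(spec):
    """Return the list of glob patterns from an 'exclude' filter spec."""
    name, options = parse_filter_spec(spec)
    if name != "exclude":
        return []
    # Patterns are comma-separated in the examples above.
    return [p for p in options.get("pattern", "").split(",") if p]


print(parse_filter_spec("normalize-timestamps:ts=1609459200"))
print(exclude_patterns("exclude:pattern=**/.git/**,**/__pycache__/**,**/*.pyc"))
```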

The search Command

Search for files in container image layers:

occystrap search SOURCE PATTERN [--regex] [--script-friendly]

Examples:

# Search registry image
occystrap search registry://docker.io/library/busybox:latest "bin/*sh"

# Search local Docker image
occystrap search docker://myimage:v1 "*.conf"

# Search tarball with regex
occystrap search --regex tar://image.tar '.*\.py$'

# Machine-parseable output
occystrap search --script-friendly registry://docker.io/library/busybox:latest "*sh"
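
To show the difference between the two pattern styles, here is a small Python sketch that matches a list of paths with either a shell-style glob or a regular expression. It uses the standard fnmatch and re modules as stand-ins; the find_matches helper is hypothetical, and occystrap's own matching rules (for example, whether a glob must match the full path) are not confirmed here.

```python
import fnmatch
import re


def find_matches(paths, pattern, regex=False):
    """Return the paths matching a glob, or a regex when regex=True.

    Illustration only; not occystrap's matching logic.
    """
    if regex:
        compiled = re.compile(pattern)
        return [p for p in paths if compiled.search(p)]
    # fnmatchcase matches the whole string and lets '*' cross '/'.
    return [p for p in paths if fnmatch.fnmatchcase(p, pattern)]


paths = ["bin/sh", "bin/bash", "bin/ls", "usr/lib/app/main.py"]
print(find_matches(paths, "bin/*sh"))
print(find_matches(paths, r".*\.py$", regex=True))
```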

The info Command

Display information about a container image without downloading layers:

occystrap info SOURCE

The output format is controlled by the global -O / --output-format option:

# Human-readable text output (default)
occystrap info registry://docker.io/library/busybox:latest

# JSON output for scripting
occystrap -O json info registry://docker.io/library/busybox:latest

# From local Docker daemon
occystrap info docker://myimage:v1

# From tarball
occystrap info tar://image.tar

Registry sources show full detail (compressed sizes, media types, compression format). Docker and tarball sources show config-derived info (architecture, OS, diff_ids, history, labels, env, etc.).

The check Command

Check validity of a container image. Validates structural integrity, history consistency, compression compatibility, and filesystem correctness:

occystrap check SOURCE [--fast]

Use --fast to skip layer downloads and only check metadata consistency (manifest and config). The exit code is non-zero if any errors are found, making it suitable for CI integration.

# Full check (downloads and verifies all layers)
occystrap check registry://docker.io/library/busybox:latest

# Fast metadata-only check
occystrap check --fast docker://myimage:v1

# JSON output for CI scripting
occystrap -O json check tar://image.tar

# Validate output of a process pipeline
occystrap process docker://myimage:v1 tar://output.tar
occystrap check tar://output.tar
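
Because the exit code signals validity, a CI script can gate on it directly. The Python sketch below wraps the command shown above; image_is_valid is a hypothetical helper, and the run parameter exists only so the function can be exercised without occystrap installed.

```python
import subprocess


def image_is_valid(source, fast=False, run=subprocess.run):
    """Return True when 'occystrap check' exits with status 0.

    Hypothetical wrapper; 'run' can be swapped for a stub in tests.
    """
    cmd = ["occystrap", "check"]
    if fast:
        cmd.append("--fast")
    cmd.append(source)
    return run(cmd).returncode == 0
```

In a pipeline this might be called as image_is_valid("tar://output.tar") after a process step, failing the job when it returns False.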

Legacy Commands (Deprecated)

The following commands are deprecated but still work for backwards compatibility. They will be removed in a future version.

Downloading an image from a repository and storing as a tarball

Let's say we want to download an image from a repository and store it as a local tarball. This is a common need in airgapped environments, for example. With Docker you would run docker pull followed by docker save. The Occy Strap equivalent is:

occystrap fetch-to-tarfile registry-1.docker.io library/busybox latest busybox.tar

New equivalent:

occystrap process registry://registry-1.docker.io/library/busybox:latest tar://busybox.tar

In this example we pull from Docker Hub (registry-1.docker.io) and download the latest busybox image into a tarball named busybox.tar. The tarball can then be loaded with docker load -i busybox.tar in an airgapped Docker environment.

Repeatable builds with normalized timestamps

To make builds more repeatable, you can normalize file access and modification times in the image layers. This is useful when you want to ensure that the same image content produces the same tarball hash, regardless of when the files were originally created:

occystrap fetch-to-tarfile --normalize-timestamps registry-1.docker.io library/busybox latest busybox.tar

New equivalent:

occystrap process registry://registry-1.docker.io/library/busybox:latest tar://busybox.tar -f normalize-timestamps

This will set all timestamps in the layer tarballs to 0 (Unix epoch: January 1, 1970). You can also specify a custom timestamp:

occystrap fetch-to-tarfile --normalize-timestamps --timestamp 1609459200 registry-1.docker.io library/busybox latest busybox.tar

New equivalent:

occystrap process registry://registry-1.docker.io/library/busybox:latest tar://busybox.tar -f "normalize-timestamps:ts=1609459200"

When timestamps are normalized, the layer SHAs are recalculated and the manifest is updated to reflect the new hashes. This ensures the tarball structure remains consistent and valid.
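
The idea can be sketched in Python with the standard tarfile and hashlib modules: rewrite every entry of a layer with a fixed modification time, then hash the result. This is a minimal illustration, not occystrap's implementation; normalize_layer and make_layer are hypothetical helpers, and the real tool also updates the manifest, which is omitted here.

```python
import hashlib
import io
import tarfile


def normalize_layer(layer_bytes, ts=0):
    """Rewrite a layer tarball with every mtime set to ts.

    Returns (new_bytes, "sha256:<hex>"). Illustration only.
    """
    out = io.BytesIO()
    with tarfile.open(fileobj=io.BytesIO(layer_bytes), mode="r:") as src, \
            tarfile.open(fileobj=out, mode="w:", format=tarfile.PAX_FORMAT) as dst:
        for member in src:
            member.mtime = ts
            # Drop extended time headers that would override the fixed value.
            for key in ("mtime", "atime", "ctime"):
                member.pax_headers.pop(key, None)
            data = src.extractfile(member) if member.isfile() else None
            dst.addfile(member, data)
    normalized = out.getvalue()
    return normalized, "sha256:" + hashlib.sha256(normalized).hexdigest()


def make_layer(mtime):
    """Build a one-file sample layer with the given mtime."""
    buf = io.BytesIO()
    payload = b"hello\n"
    with tarfile.open(fileobj=buf, mode="w:") as tar:
        info = tarfile.TarInfo("etc/motd")
        info.size = len(payload)
        info.mtime = mtime
        tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


_, digest_a = normalize_layer(make_layer(1000))
_, digest_b = normalize_layer(make_layer(2000))
print(digest_a == digest_b)
```

Two layers that differ only in file timestamps hash differently before normalization and identically afterwards, which is why the digests must be recomputed.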

Downloading an image from a repository and storing as an extracted tarball

The tarball in the previous example contains two JSON configuration files and a series of image layers, each stored as a tarball inside the main tarball. You can write these elements to a directory instead of to a tarball if you'd like to inspect them:

occystrap fetch-to-extracted registry-1.docker.io library/centos 7 centos7

New equivalent:

occystrap process registry://registry-1.docker.io/library/centos:7 dir://centos7
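
For a sense of what those elements look like, the Python sketch below reads manifest.json from a docker-save style tarball and reports the config file and layer names it lists. The sample tarball is built in memory with made-up file names (abc123.json, layer0/layer.tar); describe_image_tarball is a hypothetical helper and does not reflect how occystrap reads images.

```python
import io
import json
import tarfile


def describe_image_tarball(fileobj):
    """Return the config name, tags, and layer list from manifest.json."""
    with tarfile.open(fileobj=fileobj, mode="r:*") as tar:
        manifest = json.load(tar.extractfile("manifest.json"))
    entry = manifest[0]
    return {
        "config": entry["Config"],
        "tags": entry.get("RepoTags") or [],
        "layers": entry["Layers"],
    }


def _add(tar, name, payload):
    info = tarfile.TarInfo(name)
    info.size = len(payload)
    tar.addfile(info, io.BytesIO(payload))


def make_sample_tarball():
    """Build a tiny stand-in tarball; all names here are invented."""
    manifest = [{
        "Config": "abc123.json",
        "RepoTags": ["busybox:latest"],
        "Layers": ["layer0/layer.tar"],
    }]
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        _add(tar, "manifest.json", json.dumps(manifest).encode("utf-8"))
        _add(tar, "abc123.json", b"{}")
        _add(tar, "layer0/layer.tar", b"")
    buf.seek(0)
    return buf


print(describe_image_tarball(make_sample_tarball()))
```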

Downloading an image to a merged directory

In scenarios where image layers are likely to be reused between images, you can save disk space by downloading images to a directory which contains more than one image:

occystrap fetch-to-extracted --use-unique-names registry-1.docker.io \
    homeassistant/home-assistant latest merged_images

New equivalent:

occystrap process registry://registry-1.docker.io/homeassistant/home-assistant:latest \
    "dir://merged_images?unique_names=true"

Storing an image tarfile in a merged directory

Sometimes you have image tarfiles instead of images in a registry:

occystrap tarfile-to-extracted --use-unique-names file.tar merged_images

New equivalent:

occystrap process tar://file.tar "dir://merged_images?unique_names=true"

Exploring the contents of layers and overwritten files

If you'd like the layers to be expanded from their tarballs to the filesystem:

occystrap fetch-to-extracted --expand quay.io \
    ukhomeofficedigital/centos-base latest ukhomeoffice-centos

New equivalent:

occystrap process registry://quay.io/ukhomeofficedigital/centos-base:latest \
    "dir://ukhomeoffice-centos?expand=true"

Generating an OCI runtime bundle

occystrap fetch-to-oci registry-1.docker.io library/hello-world latest bar

New equivalent:

occystrap process registry://registry-1.docker.io/library/hello-world:latest oci://bar

Searching image layers for files

occystrap search-layers registry-1.docker.io library/busybox latest "bin/*sh"

New equivalent:

occystrap search registry://registry-1.docker.io/library/busybox:latest "bin/*sh"

Working with local Docker or Podman daemon

occystrap docker-to-tarfile library/busybox latest busybox.tar

New equivalent:

occystrap process docker://library/busybox:latest tar://busybox.tar

For faster local Docker image processing, use the dockerpush:// input:

occystrap process dockerpush://library/busybox:latest tar://busybox.tar

The dockerpush:// input starts an embedded Docker Registry V2 server on localhost and has Docker push the image to it. This is significantly faster than docker:// for multi-layer images because Docker's push mechanism transfers layers individually and in parallel, whereas the Docker Engine API (docker://) exports the entire image as a single sequential tarball.

The embedded server uses HTTPS with an ephemeral self-signed certificate generated via the openssl command-line tool, which must be installed on the system (e.g. apt-get install openssl). Docker treats 127.0.0.0/8 as insecure (skips cert verification), so the self-signed certificate is accepted without daemon.json changes.

For Podman:

occystrap process "docker://myimage:latest?socket=/run/podman/podman.sock" tar://output.tar

Note: Podman doesn't run a daemon by default. You need to start the socket service first:

# For rootless Podman
systemctl --user start podman.socket

# For rootful Podman
sudo systemctl start podman.socket

Authenticating with private registries

To fetch images from private registries (such as GitLab Container Registry, AWS ECR, or private Docker Hub repositories), use the --username and --password global options:

occystrap --username myuser --password mytoken \
    process registry://registry.gitlab.com/mygroup/myimage:latest tar://output.tar

You can also use environment variables to avoid putting credentials on the command line:

export OCCYSTRAP_USERNAME=myuser
export OCCYSTRAP_PASSWORD=mytoken
occystrap process registry://registry.gitlab.com/mygroup/myimage:latest tar://output.tar

For GitLab Container Registry, the username is typically your GitLab username and the password is a personal access token with read_registry scope.

Parallel Downloads and Uploads

When working with registries, occystrap downloads and uploads layers in parallel for improved performance. By default, 4 threads are used:

# Default: 4 parallel operations
occystrap process registry://docker.io/library/busybox:latest tar://busybox.tar

# Use 8 parallel threads
occystrap -j 8 process registry://docker.io/library/busybox:latest tar://busybox.tar

# Sequential operations (1 thread)
occystrap --parallel 1 process docker://myimage:v1 registry://myregistry/myimage:v1

You can also set the parallelism via environment variable:

export OCCYSTRAP_PARALLEL=8
occystrap process registry://docker.io/library/busybox:latest tar://busybox.tar

Or via URI query parameter:

occystrap process docker://myimage:v1 "registry://myregistry/myimage:v1?max_workers=8"
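
The general pattern of bounded parallel transfers can be sketched with Python's concurrent.futures. This is only an illustration of the technique with a default of 4 workers; fetch_layers and the fetch_one callback are hypothetical, and occystrap's internals may be organized differently.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_layers(digests, fetch_one, max_workers=4):
    """Call fetch_one for each digest using up to max_workers threads.

    Results are returned in the same order as the input digests.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fetch_one, digests))


# Stand-in "download" that just reports the digest it was given.
print(fetch_layers(["sha256:aa", "sha256:bb"], lambda d: "fetched " + d))
```

Setting max_workers=1 degrades this to sequential processing, which mirrors the --parallel 1 example above.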

Layer Compression

When pushing images to registries, occystrap supports both gzip (default) and zstd compression for image layers:

# Use gzip (default, maximum compatibility)
occystrap process docker://myimage:v1 registry://myregistry/myimage:v1

# Use zstd for better compression ratio and speed
occystrap --compression=zstd process docker://myimage:v1 registry://myregistry/myimage:v1

You can also set the compression via environment variable:

export OCCYSTRAP_COMPRESSION=zstd
occystrap process docker://myimage:v1 registry://myregistry/myimage:v1

Or via URI query parameter:

occystrap process docker://myimage:v1 "registry://myregistry/myimage:v1?compression=zstd"

Compatibility notes:

  • gzip (default): Works with all Docker/container runtimes
  • zstd: Requires Docker 20.10+ or containerd 1.5+ on the pulling client; offers ~30% better compression ratio and faster compression

When pulling images, occystrap automatically detects and handles both gzip and zstd compressed layers from registries or OCI tarballs.
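
One common way to tell the two formats apart is by their leading magic bytes: gzip streams start with 1f 8b and zstd frames with 28 b5 2f fd. The Python sketch below shows that check. Whether occystrap relies on magic bytes, on layer media types, or on both is not stated here, so treat detect_compression as a hypothetical helper.

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"
ZSTD_MAGIC = b"\x28\xb5\x2f\xfd"


def detect_compression(blob_start):
    """Guess the compression of a layer from its first few bytes."""
    if blob_start.startswith(GZIP_MAGIC):
        return "gzip"
    if blob_start.startswith(ZSTD_MAGIC):
        return "zstd"
    return "uncompressed"


print(detect_compression(gzip.compress(b"layer data")))
```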

Cross-Invocation Layer Cache

When pushing multiple images that share base layers (common in CI), occystrap can cache layer processing results across invocations. This avoids re-fetching, re-filtering, re-compressing, and re-uploading layers that have already been processed:

# First push: processes all layers
occystrap --layer-cache /tmp/layer-cache.json \
    process docker://myimage1:v1 registry://myregistry/myimage1:v1

# Second push: skips shared base layers
occystrap --layer-cache /tmp/layer-cache.json \
    process docker://myimage2:v1 registry://myregistry/myimage2:v1

You can also set the cache path via environment variable:

export OCCYSTRAP_LAYER_CACHE=/tmp/layer-cache.json
occystrap process docker://myimage:v1 registry://myregistry/myimage:v1

The cache records the mapping from input layer DiffIDs to compressed output digests. On subsequent runs, if a cached layer's compressed blob still exists in the target registry, the layer is skipped entirely (no fetch, no filter, no compress, no upload). The cache is filter-aware: layers processed with different filter configurations get separate cache entries.
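
The lookup logic described above can be sketched in Python as follows. The key format, the function names, and the blob_exists callback are all assumptions made for illustration; the actual layout of the cache file is not documented here.

```python
import hashlib
import json


def cache_key(diff_id, filters):
    """Combine a layer DiffID with a signature of the filter configuration."""
    encoded = json.dumps(filters, sort_keys=True).encode("utf-8")
    signature = hashlib.sha256(encoded).hexdigest()[:16]
    return diff_id + "+" + signature


def cached_digest(cache, diff_id, filters, blob_exists):
    """Return the cached output digest if the registry still has the blob.

    'cache' is a plain dict; 'blob_exists' is a hypothetical callback that
    would ask the target registry whether a digest is present.
    """
    digest = cache.get(cache_key(diff_id, filters))
    if digest is not None and blob_exists(digest):
        return digest
    return None


cache = {cache_key("sha256:aaa", ["normalize-timestamps"]): "sha256:bbb"}
print(cached_digest(cache, "sha256:aaa", ["normalize-timestamps"], lambda d: True))
print(cached_digest(cache, "sha256:aaa", [], lambda d: True))
```

The second lookup misses because the same layer with a different filter list produces a different key, which is the filter-aware behaviour the paragraph describes.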

Verbosity and Debugging

By default, occystrap logs only milestones (start/end, summary statistics, layer counts) at INFO level. Per-layer and per-request detail is logged at DEBUG level.

# Enable debug logging for occystrap modules only
occystrap --verbose process docker://myimage:v1 tar://output.tar

# Enable debug logging for all modules (includes library output)
occystrap --debug process docker://myimage:v1 tar://output.tar

When running in a terminal, registry downloads and uploads display interactive tqdm progress bars. In non-TTY environments (CI, pipes), periodic log messages are emitted instead.
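
The distinction between the two flags maps naturally onto Python's logger hierarchy: raising only the package logger to DEBUG leaves third-party libraries at INFO, while raising the root logger affects everything. The sketch below illustrates that idea; configure_logging is a hypothetical function, and the assumption that occystrap's loggers live under the "occystrap" name is not confirmed by the text above.

```python
import logging


def configure_logging(verbose=False, debug=False):
    """Set log levels in the spirit of --verbose and --debug.

    Illustration only; assumes package loggers are named 'occystrap.*'.
    """
    root = logging.getLogger()
    root.setLevel(logging.DEBUG if debug else logging.INFO)
    package = logging.getLogger("occystrap")
    # NOTSET makes the package logger inherit the root level.
    package.setLevel(logging.DEBUG if (verbose or debug) else logging.NOTSET)


configure_logging(verbose=True)
print(logging.getLogger("occystrap.inputs").isEnabledFor(logging.DEBUG))
print(logging.getLogger("some_library").isEnabledFor(logging.DEBUG))
```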

Supporting non-default architectures

Docker image repositories can store multiple versions of a single image, each corresponding to a different (operating system, CPU architecture, CPU variant) tuple. You can select which one to use with global command-line flags. If you don't specify anything, Occy Strap defaults to linux/amd64:

occystrap --os linux --architecture arm64 --variant v8 \
    process registry://registry-1.docker.io/library/busybox:latest dir://busybox

Or via URI query parameters:

occystrap process "registry://registry-1.docker.io/library/busybox:latest?os=linux&arch=arm64&variant=v8" \
    dir://busybox
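
Conceptually, platform selection means picking one entry from a multi-platform image index by its platform fields. The Python sketch below does this for a hand-written index shaped like an OCI image index; the digests are placeholders and select_manifest is a hypothetical helper, not occystrap's code.

```python
def select_manifest(index, os_name="linux", architecture="amd64", variant=None):
    """Return the digest of the first manifest matching the platform."""
    for entry in index.get("manifests", []):
        platform = entry.get("platform", {})
        if platform.get("os") != os_name:
            continue
        if platform.get("architecture") != architecture:
            continue
        # Only compare the variant when the caller asked for one.
        if variant is not None and platform.get("variant") != variant:
            continue
        return entry["digest"]
    return None


# Placeholder index with made-up digests.
index = {
    "manifests": [
        {"digest": "sha256:1111",
         "platform": {"os": "linux", "architecture": "amd64"}},
        {"digest": "sha256:2222",
         "platform": {"os": "linux", "architecture": "arm64", "variant": "v8"}},
    ]
}

print(select_manifest(index))
print(select_manifest(index, architecture="arm64", variant="v8"))
```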

Development

Install for Development

pip install -e ".[test]"

Pre-commit Hooks

This project uses pre-commit hooks to validate code before commits. Install them with:

pip install pre-commit
pre-commit install

The hooks run:

  • actionlint - GitHub Actions workflow validation
  • shellcheck - Shell script linting
  • check-log-levels - Enforces max LOG.info() calls per file
  • tox -eflake8 - Python code style checks
  • tox -epy3 - Unit tests

To run the hooks manually:

pre-commit run --all-files

Running Tests

Unit tests are in occystrap/tests/ and can be run with:

tox -epy3

Functional tests are in deploy/occystrap_ci/tests/ and are run in CI.

Releasing

Releases are automated via GitHub Actions. Push a version tag to trigger the pipeline:

git tag -s v0.5.0 -m "Release v0.5.0"
git push origin v0.5.0

The workflow builds the package, signs the tag with Sigstore, publishes to PyPI, and creates a GitHub Release. See RELEASE-SETUP.md for one-time configuration steps.

Developer Automation

This project supports automated CI helpers via PR comments. To use these commands, comment on a pull request with one of the following:

  • @shakenfist-bot please retest - Re-run the functional test suite
  • @shakenfist-bot please attempt to fix - Have Claude Code attempt to fix test failures
  • @shakenfist-bot please re-review - Request another automated code review
  • @shakenfist-bot please address comments - Have Claude Code address the automated review comments

These commands are only available to repository collaborators with write access.

Claude Code Skills

The .claude/skills/ directory contains guidance for AI agents working on this codebase, covering documentation updates, testing discipline, and PR preparation.

Documentation

For more detailed documentation, see the docs/ directory.
