A CLI tool to extract and analyze Docker images

docker-assemble

docker-assemble is a Python CLI tool for extracting Docker image filesystems, inspecting image contents, finding large files, and rebuilding optimized Docker images.

It helps developers, researchers, and DevOps engineers see what is inside a Docker image by exporting the image filesystem into a local directory. From there you can inspect individual files, identify oversized ones, and optionally build a new Docker image after removing the files you select.

Features

  • Extract the filesystem of a Docker image into a local directory
  • Pull an image automatically if it is not available locally
  • Inspect Docker image contents for research, debugging, and optimization
  • Detect files larger than a configurable size limit
  • Optionally remove selected large files
  • Rebuild a new Docker image from the filtered filesystem
  • Simple command-line interface built with Python

Why use docker-assemble?

Docker images can contain unnecessary files, large artifacts, cached dependencies, logs, build leftovers, or other filesystem content that increases image size. docker-assemble makes it easier to inspect the full filesystem of an image and understand what contributes to its size.

This can be useful for:

  • Docker image analysis
  • Container image optimization
  • DevOps research
  • Security and filesystem inspection
  • Finding large files inside Docker images
  • Rebuilding smaller Docker images
  • Understanding image contents without manually creating containers

Installation

Install from PyPI:

pip install docker-assemble

Requirements

  • Python 3.8+
  • Docker installed and running
  • Access to the Docker daemon

Basic usage

Extract a Docker image filesystem into a local directory:

docker-assemble -d ubuntu:20.04 output_dir

This extracts the filesystem of ubuntu:20.04 into output_dir.

Analyze large files

You can scan the extracted filesystem for files larger than a given size:

docker-assemble -d ubuntu:20.04 output_dir --maximum-file-size 100M

Supported size suffixes include:

  • K for kilobytes
  • M for megabytes
  • G for gigabytes
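As a rough illustration of how a size string with these suffixes could be turned into a byte count, here is a minimal sketch. The function name `parse_size` is hypothetical, and the use of binary multiples (1 K = 1024 bytes) is an assumption; docker-assemble itself may interpret the suffixes differently.

```python
import re

# Assumption: binary multiples (1 K = 1024 bytes). The real tool may use
# decimal multiples (1 K = 1000 bytes) instead.
_UNITS = {"": 1, "K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def parse_size(text: str) -> int:
    """Convert a size string such as '100M' into a number of bytes."""
    match = re.fullmatch(r"(\d+)([KMG]?)", text.strip().upper())
    if match is None:
        raise ValueError(f"unrecognized size: {text!r}")
    number, suffix = match.groups()
    return int(number) * _UNITS[suffix]
```

Under these assumptions, `parse_size("10M")` yields 10 × 1024 × 1024 bytes, and a string with an unknown suffix raises `ValueError`.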

Examples:

docker-assemble -d ubuntu:20.04 output_dir --maximum-file-size 10M
docker-assemble -d python:3.11 output_dir --maximum-file-size 500M
docker-assemble -d node:20 output_dir --maximum-file-size 1G
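The kind of scan these commands perform can be approximated with the standard library alone. The sketch below walks an extracted directory and reports regular files above a byte limit. The name `find_large_files` is hypothetical, and this is not necessarily how docker-assemble implements the check.

```python
import os
from typing import Iterator, Tuple


def find_large_files(root: str, limit_bytes: int) -> Iterator[Tuple[str, int]]:
    """Yield (path, size) for regular files under root larger than limit_bytes."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks: their targets may point outside the extracted tree.
            if os.path.islink(path):
                continue
            try:
                size = os.path.getsize(path)
            except OSError:
                # Unreadable or vanished entries are ignored in this sketch.
                continue
            if size > limit_bytes:
                yield path, size
```

For example, `sorted(find_large_files("output_dir", 100 * 1024 ** 2), key=lambda item: item[1], reverse=True)` would list the matches largest first.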

Rebuild a filtered Docker image

After detecting large files, docker-assemble can prompt you to choose which files to remove and then build a new Docker image from what remains:

docker-assemble -d ubuntu:20.04 output_dir \
  --maximum-file-size 100M \
  --new-image-name ubuntu-optimized
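The README does not describe how the rebuild is performed internally. One plausible approach, shown here only as a sketch, is to delete the selected files from the extracted directory and pack what remains into a tar archive that Docker can import. The function name `pack_filtered_filesystem` and its parameters are hypothetical.

```python
import os
import tarfile
from typing import Iterable


def pack_filtered_filesystem(
    root: str, relative_paths_to_remove: Iterable[str], archive_path: str
) -> None:
    """Delete the selected files under root, then pack root into a tar archive.

    archive_path should live outside root so the archive does not include itself.
    """
    for relative_path in relative_paths_to_remove:
        target = os.path.join(root, relative_path)
        if os.path.lexists(target):
            os.remove(target)
    with tarfile.open(archive_path, "w") as archive:
        # arcname="." stores entries relative to the image's filesystem root.
        archive.add(root, arcname=".")
```

An archive produced this way could be loaded with `docker import rootfs.tar ubuntu-optimized`. Note that an image created by `docker import` does not carry over metadata from the original image, such as its default command or environment variables.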

Package

docker-assemble is available on PyPI: https://pypi.org/project/docker-assemble/

Debug mode

Enable debug logging with:

docker-assemble --debug -d ubuntu:20.04 output_dir

Example workflow

# Extract a Docker image filesystem
docker-assemble -d python:3.11 python-image-filesystem

# Find files larger than 100 MB
docker-assemble -d python:3.11 python-image-filesystem --maximum-file-size 100M

# Rebuild a new image after removing selected large files
docker-assemble -d python:3.11 python-image-filesystem \
  --maximum-file-size 100M \
  --new-image-name python-optimized

Use cases

docker-assemble is useful when you need to:

  • inspect the contents of a Docker image
  • analyze why a Docker image is large
  • identify unnecessary files in a container image
  • export an image filesystem for research
  • compare Docker image contents
  • create a smaller image after removing selected files
  • debug container filesystem structure

How it works

docker-assemble uses the Docker SDK for Python to access Docker images. If the requested image is not available locally, it pulls the image. It then creates a temporary container, exports the container filesystem, extracts it into the selected output directory, and optionally rebuilds a new image from a filtered filesystem.
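The steps above can be sketched with the Docker SDK for Python roughly as follows. This is an illustration under stated assumptions, not the project's actual source: the function names `extract_tar_stream` and `export_image_filesystem` are hypothetical, and error handling is kept to a minimum.

```python
import os
import tarfile
import tempfile
from typing import Iterable


def extract_tar_stream(chunks: Iterable[bytes], output_dir: str) -> None:
    """Spool a stream of tar chunks to a temporary file and unpack it."""
    os.makedirs(output_dir, exist_ok=True)
    with tempfile.TemporaryFile() as buffer:
        for chunk in chunks:
            buffer.write(chunk)
        buffer.seek(0)
        with tarfile.open(fileobj=buffer, mode="r:") as archive:
            # Extraction filters only exist on newer Python releases.
            kwargs = {"filter": "tar"} if hasattr(tarfile, "tar_filter") else {}
            archive.extractall(output_dir, **kwargs)


def export_image_filesystem(image_name: str, output_dir: str) -> None:
    """Pull the image if needed, then export its filesystem via a temporary container."""
    import docker  # third-party Docker SDK for Python, imported lazily

    client = docker.from_env()
    try:
        client.images.get(image_name)
    except docker.errors.ImageNotFound:
        client.images.pull(image_name)
    # The container is created but never started; it only serves as an export source.
    container = client.containers.create(image_name)
    try:
        extract_tar_stream(container.export(), output_dir)
    finally:
        container.remove()
```

Calling `export_image_filesystem("ubuntu:20.04", "output_dir")` would correspond to the basic usage command shown earlier, assuming a running Docker daemon and the `docker` package installed.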

Project status

This project is in active development. Contributions, issues, and suggestions are welcome.

License

This project is licensed under the Apache License 2.0.

Docker is a trademark of Docker, Inc. This project is not affiliated with or endorsed by Docker, Inc.
