CLI tool for extracting Docker image filesystems, inspecting large files, and rebuilding optimized Docker images.

docker-assemble

docker-assemble is a Python CLI tool for extracting Docker image filesystems, inspecting image contents, finding large files, and rebuilding optimized Docker images.

It helps developers, researchers, and DevOps engineers understand what is inside a Docker image by exporting the image filesystem into a local directory. You can use it to analyze container images, inspect files, identify oversized files, and optionally create a new Docker image after removing selected files.

Features

  • Extract the filesystem of a Docker image into a local directory
  • Pull an image automatically if it is not available locally
  • Inspect Docker image contents for research, debugging, and optimization
  • Detect files larger than a configurable size limit
  • Optionally remove selected large files
  • Rebuild a new Docker image from the filtered filesystem
  • Simple command-line interface built with Python

Why use docker-assemble?

Docker images can contain unnecessary files, large artifacts, cached dependencies, logs, build leftovers, or other filesystem content that increases image size. docker-assemble makes it easier to inspect the full filesystem of an image and understand what contributes to its size.

This can be useful for:

  • Docker image analysis
  • Container image optimization
  • DevOps research
  • Security and filesystem inspection
  • Finding large files inside Docker images
  • Rebuilding smaller Docker images
  • Understanding image contents without manually creating containers

Installation

Install from PyPI:

pip install docker-assemble

Requirements

  • Python 3.8+
  • Docker installed and running
  • Access to the Docker daemon

Basic usage

Extract a Docker image filesystem into a local directory:

docker-assemble -d ubuntu:20.04 output_dir

This extracts the filesystem of ubuntu:20.04 into output_dir.

Analyze large files

You can scan the extracted filesystem for files larger than a given size:

docker-assemble -d ubuntu:20.04 output_dir --maximum-file-size 100M
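
The source does not show how docker-assemble performs this scan. As a rough illustration only, a size check over an extracted directory could look like the sketch below. `find_large_files` is a hypothetical helper, not part of the tool's API, and it assumes that only regular files are compared against the limit.

```python
import os
import stat


def find_large_files(root, max_bytes):
    """Return (path, size) pairs for regular files under root above max_bytes.

    Hypothetical helper for illustration; not docker-assemble's own code.
    """
    large = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # lstat does not follow symlinks, which may be broken or point
            # outside the extracted tree when viewed from the host.
            info = os.lstat(path)
            if stat.S_ISREG(info.st_mode) and info.st_size > max_bytes:
                large.append((path, info.st_size))
    # Report the largest files first.
    large.sort(key=lambda item: item[1], reverse=True)
    return large
```

For example, `find_large_files("output_dir", 100 * 1024 * 1024)` would list regular files above 100 MiB in a previously extracted tree.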

Supported size suffixes include:

  • K for kilobytes
  • M for megabytes
  • G for gigabytes

Examples:

docker-assemble -d ubuntu:20.04 output_dir --maximum-file-size 10M
docker-assemble -d python:3.11 output_dir --maximum-file-size 500M
docker-assemble -d node:20 output_dir --maximum-file-size 1G
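
To make the suffixes concrete, here is a minimal sketch of how a value such as 100M could be turned into a byte count. It is an assumption-laden illustration: `parse_size` is a hypothetical helper, the use of binary multiples (1K = 1024 bytes) is a guess, and the source does not say whether the tool accepts lowercase suffixes or bare numbers.

```python
import re

# Assumption: suffixes are binary multiples (1K = 1024 bytes). The tool may
# use decimal multiples instead; adjust this table if so.
_UNITS = {"": 1, "K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def parse_size(text):
    """Convert a string such as '100M' into a number of bytes.

    Hypothetical helper for illustration; not docker-assemble's own parser.
    """
    match = re.fullmatch(r"(\d+)([KMG]?)", text.strip().upper())
    if match is None:
        raise ValueError(f"unrecognized size: {text!r}")
    number, suffix = match.groups()
    return int(number) * _UNITS[suffix]
```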

Rebuild a filtered Docker image

After detecting large files, docker-assemble can prompt you to choose which of them to remove and then build a new Docker image from the remaining filesystem:

docker-assemble -d ubuntu:20.04 output_dir \
  --maximum-file-size 100M \
  --new-image-name ubuntu-optimized
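
The source does not describe how the new image is assembled internally. One possible approach, shown purely as a sketch, is to pack the filtered directory into a tar archive while skipping the files chosen for removal, and then pass that archive to the `docker import` command. Both function names below are hypothetical. Note that `docker import` produces a single-layer image and does not carry over metadata such as CMD or ENV; whether docker-assemble preserves that metadata is not stated.

```python
import os
import subprocess
import tarfile


def pack_filtered_tree(root, excluded, tar_path):
    """Archive root into tar_path, skipping the relative paths in excluded.

    Hypothetical helper; excluded paths are relative to root, e.g. "var/log/big.log".
    """
    excluded = {os.path.normpath(p) for p in excluded}

    def keep(info):
        # Returning None from a tarfile filter drops the member
        # (and, for a directory, everything below it).
        return None if os.path.normpath(info.name) in excluded else info

    with tarfile.open(tar_path, "w") as tar:
        # arcname="." stores member names relative to the image root.
        tar.add(root, arcname=".", filter=keep)


def import_image(tar_path, image_name):
    """Create an image from the archive with the docker CLI (requires Docker)."""
    subprocess.run(["docker", "import", tar_path, image_name], check=True)
```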

Package

docker-assemble is available on PyPI:

PyPI: https://pypi.org/project/docker-assemble/

Debug mode

Enable debug logging with:

docker-assemble --debug -d ubuntu:20.04 output_dir

Example workflow

# Extract a Docker image filesystem
docker-assemble -d python:3.11 python-image-filesystem

# Find files larger than 100 MB
docker-assemble -d python:3.11 python-image-filesystem --maximum-file-size 100M

# Rebuild a new image after removing selected large files
docker-assemble -d python:3.11 python-image-filesystem \
  --maximum-file-size 100M \
  --new-image-name python-optimized

Use cases

docker-assemble is useful when you need to:

  • inspect the contents of a Docker image
  • analyze why a Docker image is large
  • identify unnecessary files in a container image
  • export an image filesystem for research
  • compare Docker image contents
  • create a smaller image after removing selected files
  • debug container filesystem structure

How it works

docker-assemble uses the Docker SDK for Python to access Docker images. If the requested image is not available locally, it pulls the image. It then creates a temporary container, exports the container filesystem, extracts it into the selected output directory, and optionally rebuilds a new image from a filtered filesystem.
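
The steps above can be sketched with the Docker SDK for Python roughly as follows. This is an illustration under stated assumptions, not the project's actual code: the function names are hypothetical, the exported stream is spooled to a temporary file before unpacking, and the rebuild step is left out.

```python
import tarfile
import tempfile


def extract_tar_stream(chunks, output_dir):
    """Spool an iterable of byte chunks to a temp file and unpack it as a tar."""
    with tempfile.TemporaryFile() as spool:
        for chunk in chunks:
            spool.write(chunk)
        spool.seek(0)
        with tarfile.open(fileobj=spool, mode="r:") as tar:
            # For untrusted images, review tarfile's extraction filters
            # before unpacking onto the host.
            tar.extractall(output_dir)


def export_image_filesystem(image_name, output_dir):
    """Pull the image if needed, export a temporary container, and unpack it."""
    # Imported lazily so extract_tar_stream can be used without the SDK.
    import docker

    client = docker.from_env()
    try:
        client.images.get(image_name)
    except docker.errors.ImageNotFound:
        client.images.pull(image_name)

    # The container is created but never started; export only needs its filesystem.
    container = client.containers.create(image_name)
    try:
        extract_tar_stream(container.export(), output_dir)
    finally:
        container.remove()
```

Calling `export_image_filesystem("ubuntu:20.04", "output_dir")` requires a running Docker daemon and the `docker` package to be installed.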

Project status

This project is in active development. Contributions, issues, and suggestions are welcome.

License

This project is licensed under the Apache License 2.0.

Docker is a trademark of Docker, Inc. This project is not affiliated with or endorsed by Docker, Inc.
