CLI tool for extracting Docker image filesystems, inspecting large files, and rebuilding optimized Docker images.
docker-assemble
docker-assemble is a Python CLI tool for extracting Docker image filesystems, inspecting image contents, finding large files, and rebuilding optimized Docker images.
It helps developers, researchers, and DevOps engineers understand what is inside a Docker image by exporting the image filesystem into a local directory. You can use it to analyze container images, inspect files, identify oversized files, and optionally create a new Docker image after removing selected files.
Features
- Extract the filesystem of a Docker image into a local directory
- Pull an image automatically if it is not available locally
- Inspect Docker image contents for research, debugging, and optimization
- Detect files larger than a configurable size limit
- Optionally remove selected large files
- Rebuild a new Docker image from the filtered filesystem
- Simple command-line interface built with Python
Why use docker-assemble?
Docker images can contain unnecessary files, large artifacts, cached dependencies, logs, build leftovers, or other filesystem content that increases image size. docker-assemble makes it easier to inspect the full filesystem of an image and understand what contributes to its size.
This can be useful for:
- Docker image analysis
- Container image optimization
- DevOps research
- Security and filesystem inspection
- Finding large files inside Docker images
- Rebuilding smaller Docker images
- Understanding image contents without manually creating containers
Installation
Install from PyPI:
pip install docker-assemble
Requirements
- Python 3.8+
- Docker installed and running
- Access to the Docker daemon
Basic usage
Extract a Docker image filesystem into a local directory:
docker-assemble -d ubuntu:20.04 output_dir
This extracts the filesystem of ubuntu:20.04 into output_dir.
Analyze large files
You can scan the extracted filesystem for files larger than a given size:
docker-assemble -d ubuntu:20.04 output_dir --maximum-file-size 100M
Supported size suffixes include:
- K for kilobytes
- M for megabytes
- G for gigabytes
Examples:
docker-assemble -d ubuntu:20.04 output_dir --maximum-file-size 10M
docker-assemble -d python:3.11 output_dir --maximum-file-size 500M
docker-assemble -d node:20 output_dir --maximum-file-size 1G
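The size check described above can be sketched in Python. This is a minimal illustration, not the tool's actual code: the names `parse_size` and `find_large_files` are hypothetical, and the sketch assumes binary multiples (1K = 1024 bytes), which the project description does not confirm.

```python
import os
from typing import Iterator, Tuple

# Assumption: suffixes are binary multiples (1K = 1024 bytes).
# The real tool may use decimal multiples instead.
_MULTIPLIERS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def parse_size(text: str) -> int:
    """Convert a string such as '100M' or '2048' into a byte count."""
    text = text.strip().upper()
    if not text:
        raise ValueError("empty size")
    if text[-1] in _MULTIPLIERS:
        return int(text[:-1]) * _MULTIPLIERS[text[-1]]
    return int(text)


def find_large_files(root: str, limit: int) -> Iterator[Tuple[str, int]]:
    """Yield (path, size) for regular files under root larger than limit bytes."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks so a link's target is not measured through the link
            if os.path.islink(path):
                continue
            size = os.path.getsize(path)
            if size > limit:
                yield path, size
```

Calling `find_large_files("output_dir", parse_size("100M"))` would then walk an extracted filesystem and report every regular file above the threshold.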
Rebuild a Docker image
Pass --new-image-name to rebuild the extracted filesystem as a single-layer image (FROM scratch + COPY . /). --maximum-file-size is optional:
- Without --maximum-file-size: no files are filtered out. The new image contains the same content as the original, consolidated into one layer. This is useful for comparing a multi-layer original against a squashed single-layer version without conflating filtering effects.
docker-assemble -d ubuntu:20.04 output_dir \
--new-image-name ubuntu-squashed
- With --maximum-file-size: docker-assemble lists files above the threshold, asks which should be removed, and rebuilds the image without them.
docker-assemble -d ubuntu:20.04 output_dir \
--maximum-file-size 100M \
--new-image-name ubuntu-optimized
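The FROM scratch + COPY . / rebuild can be approximated with the Docker SDK for Python. The sketch below is an assumption about how such a rebuild might look, not the project's implementation; `write_dockerfile` and `build_single_layer_image` are hypothetical names, while `docker.from_env()` and `client.images.build(path=..., tag=...)` are existing SDK calls.

```python
from pathlib import Path

# The two-line Dockerfile described in the text above
DOCKERFILE_TEXT = "FROM scratch\nCOPY . /\n"


def write_dockerfile(context_dir: str) -> Path:
    """Write the two-line Dockerfile into the extracted filesystem directory."""
    dockerfile = Path(context_dir) / "Dockerfile"
    dockerfile.write_text(DOCKERFILE_TEXT)
    return dockerfile


def build_single_layer_image(context_dir: str, tag: str):
    """Build an image from context_dir.

    Requires the `docker` package and a running Docker daemon.
    """
    import docker  # imported lazily so write_dockerfile works without the SDK

    # Caveat: the Dockerfile lives inside the build context, so COPY . /
    # may copy it into the image too; a real tool would likely exclude it.
    write_dockerfile(context_dir)
    client = docker.from_env()
    image, _logs = client.images.build(path=context_dir, tag=tag)
    return image
```

Because the only instruction that adds content is a single COPY, the resulting image has one filesystem layer regardless of how many layers the original had.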
Package
docker-assemble is available on PyPI:
pip install docker-assemble
PyPI: https://pypi.org/project/docker-assemble/
Debug mode
Enable debug logging with:
docker-assemble --debug -d ubuntu:20.04 output_dir
Example workflow
# Extract a Docker image filesystem
docker-assemble -d python:3.11 python-image-filesystem
# Find files larger than 100 MB
docker-assemble -d python:3.11 python-image-filesystem --maximum-file-size 100M
# Rebuild a new image after removing selected large files
docker-assemble -d python:3.11 python-image-filesystem \
--maximum-file-size 100M \
--new-image-name python-optimized
Use cases
docker-assemble is useful when you need to:
- inspect the contents of a Docker image
- analyze why a Docker image is large
- identify unnecessary files in a container image
- export an image filesystem for research
- compare Docker image contents
- create a smaller image after removing selected files
- debug container filesystem structure
How it works
docker-assemble uses the Docker SDK for Python to access Docker images. If the requested image is not available locally, it pulls the image. It then creates a temporary container, exports the container filesystem, extracts it into the selected output directory, and optionally rebuilds a new image from a filtered filesystem.
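The pull, create, export, and extract steps above can be sketched with the Docker SDK for Python. This is an illustrative outline under stated assumptions rather than the project's code: `unpack_export_stream` and `extract_image` are hypothetical names, and the sketch spools the export to a temporary file before unpacking, which may differ from what the tool does.

```python
import tarfile
import tempfile
from typing import Iterable


def unpack_export_stream(chunks: Iterable[bytes], output_dir: str) -> None:
    """Spool a stream of tar chunks to a temporary file, then extract it."""
    with tempfile.TemporaryFile() as spool:
        for chunk in chunks:
            spool.write(chunk)
        spool.seek(0)
        with tarfile.open(fileobj=spool, mode="r:") as archive:
            # Real image filesystems contain symlinks, device nodes, and
            # ownership data that may need extra handling or privileges.
            archive.extractall(output_dir)


def extract_image(image_name: str, output_dir: str) -> None:
    """Pull the image if needed, export a temporary container, unpack it.

    Requires the `docker` package and a running Docker daemon.
    """
    import docker  # imported lazily so the helper above needs only the stdlib
    from docker.errors import ImageNotFound

    client = docker.from_env()
    try:
        client.images.get(image_name)
    except ImageNotFound:
        client.images.pull(image_name)

    # The container is created but never started; images without a default
    # command may need an explicit command argument here.
    container = client.containers.create(image_name)
    try:
        # export() yields the container filesystem as chunks of a tar archive
        unpack_export_stream(container.export(), output_dir)
    finally:
        container.remove()
```

Keeping the unpacking step separate from the Docker calls means it can be exercised with any tar stream, without a daemon.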
Project status
This project is in active development. Contributions, issues, and suggestions are welcome.
License
This project is licensed under the Apache License 2.0.
Docker is a trademark of Docker, Inc. This project is not affiliated with or endorsed by Docker, Inc.
Download files
File details
Details for the file docker_assemble-0.5.3.tar.gz.
File metadata
- Download URL: docker_assemble-0.5.3.tar.gz
- Upload date:
- Size: 10.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.15
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | a8d4ac9ab3253a2f49d8116654c5b0e911c6c8f62a0f277a7ee4eb49fb6856d6 |
| MD5 | e0db703eb18fbd36fa3667d134dad7f1 |
| BLAKE2b-256 | 91c2e2beb123e6d3172517f6b5106bf7a7568fb8b23821bec2be423f9bd1d486 |
File details
Details for the file docker_assemble-0.5.3-py3-none-any.whl.
File metadata
- Download URL: docker_assemble-0.5.3-py3-none-any.whl
- Upload date:
- Size: 11.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.15
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 6152577fe4f417fd558c000e1890e7c1d33de1c325d8b4f36a0ee395cf5a312a |
| MD5 | b97fd6ccbb1e9f4cf924fc03d9113113 |
| BLAKE2b-256 | 436ad9e9a480a4bac2152cd6500472da6536c7c83add9ec9bc9c0dcf9f85b795 |