Extract files from any kind of container format

unblob

Accurate, fast, and easy-to-use extraction suite for binary blobs.

unblob parses unknown binary blobs for 78+ archive, compression, and file-system formats, extracts their content recursively, and carves out unknown chunks. It is the perfect companion for extracting, analyzing, and reverse engineering firmware images.



Features

  • 78+ supported formats — archives, compression streams, and file systems including SquashFS, JFFS2, UBI/UBIFS, ext, CPIO, ZIP, 7-Zip, gzip, XZ, LZMA, LZ4, and many more. See the full list.
  • Recursive extraction — extracts containers within containers up to a configurable depth (default: 10 levels).
  • Precise chunk detection — identifies both start and end offsets of each chunk according to the format standard, minimizing false positives.
  • Unknown chunk carving — carves out and reports data that does not match any known format, automatically identifying null/0xFF padding.
  • Entropy analysis — calculates Shannon entropy and chi-square probability for unknown chunks, useful for spotting encrypted or compressed data.
  • JSON metadata reports — generates structured reports with chunk offsets, sizes, entropy, file ownership, permissions, timestamps, and more.
  • Multi-processing — uses all available CPU cores by default for fast extraction.
  • Extensible plugin system — write custom format handlers and extractors and load them at runtime with --plugins-path.
  • No elevated privileges required — runs safely as a regular user.
  • Battle-tested — fuzz tested against a large corpus of firmware images; relies on audited, pinned dependencies.
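To make the entropy and padding features above concrete, here is a minimal standard-library sketch of Shannon entropy over a byte string and a check for all-0x00 or all-0xFF padding. It illustrates the general technique only and is not unblob's own implementation; the function names `shannon_entropy` and `is_padding` are hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of `data` in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)  # frequency of each byte value
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def is_padding(data: bytes) -> bool:
    """True if `data` is non-empty and is only 0x00 bytes or only 0xFF bytes."""
    size = len(data)
    return size > 0 and (data.count(0x00) == size or data.count(0xFF) == size)
```

Values close to 8 bits per byte usually indicate compressed or encrypted data, while constant runs such as padding score 0.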

Installation

pip (recommended for most users)

pip install unblob

Then install the required external extractor tools. On Ubuntu/Debian:

sudo apt install android-sdk-libsparse-utils e2fsprogs p7zip-full unar zlib1g-dev liblzo2-dev lzop lziprecover libhyperscan-dev zstd lz4

For SquashFS support, also install sasquatch:

curl -L -o sasquatch_1.0.deb "https://github.com/onekey-sec/sasquatch/releases/download/sasquatch-v4.5.1-6/sasquatch_1.0_$(dpkg --print-architecture).deb"
sudo dpkg -i sasquatch_1.0.deb && rm sasquatch_1.0.deb

Verify that all extractors are available:

unblob --show-external-dependencies
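If you want a similar check inside your own provisioning script, the sketch below looks up executables on PATH with the standard library. The tool names are illustrative assumptions, not the authoritative list; `unblob --show-external-dependencies` remains the source of truth.

```python
import shutil

# Hypothetical tool names for illustration only; the real list is whatever
# `unblob --show-external-dependencies` reports on your system.
EXAMPLE_TOOLS = ["7z", "unar", "lzop", "zstd", "lz4", "sasquatch"]


def find_missing(tools):
    """Return the names from `tools` that are not found on PATH."""
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    missing = find_missing(EXAMPLE_TOOLS)
    print("missing:", ", ".join(missing) if missing else "none")
```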

Docker (batteries included)

The Docker image bundles all extractors — no extra setup needed:

docker run \
  --rm \
  --pull always \
  -v /path/to/extract-dir:/data/output \
  -v /path/to/files:/data/input \
  ghcr.io/onekey-sec/unblob:latest /data/input/firmware.bin

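Note: Mount directories must be owned by the same uid:gid. On multi-user systems, add -u $UID:$GID to the command so the container process runs with your user and group IDs. Bash sets $UID but not $GID, so -u "$(id -u):$(id -g)" is the portable form.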
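When the Docker command is launched from a script, the arguments can be assembled programmatically. The following is a minimal sketch for Unix-like hosts; `docker_argv` is a hypothetical helper, and it assumes the input file's parent directory is what gets mounted at /data/input.

```python
import os
from pathlib import Path


def docker_argv(input_file: Path, output_dir: Path, as_current_user: bool = True):
    """Build the argv for the docker invocation shown above."""
    input_file = input_file.resolve()
    output_dir = output_dir.resolve()
    argv = ["docker", "run", "--rm", "--pull", "always"]
    if as_current_user:
        # Equivalent to -u "$(id -u):$(id -g)" in a shell (Unix only).
        argv += ["-u", f"{os.getuid()}:{os.getgid()}"]
    argv += [
        "-v", f"{output_dir}:/data/output",
        "-v", f"{input_file.parent}:/data/input",
        "ghcr.io/onekey-sec/unblob:latest",
        f"/data/input/{input_file.name}",
    ]
    return argv
```

The returned list can be passed directly to `subprocess.run`.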

Kali Linux

sudo apt install unblob

Nix

nix profile install nixpkgs#unblob

Or add it to your NixOS/home-manager configuration — see the installation docs for flake and overlay examples.

From source

git clone https://github.com/onekey-sec/unblob.git
cd unblob
uv sync --no-dev
uv run unblob --show-external-dependencies

Requires Python ≥ 3.10, uv, and a Rust toolchain (for the compiled extensions).


Usage

Command line

Extract a file (output goes to <filename>_extract/ by default):

unblob firmware.bin

Specify a custom output directory:

unblob -e /tmp/output firmware.bin

Generate a JSON metadata report:

unblob --report report.json firmware.bin
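The report is plain JSON, so it can be inspected with the standard library. The report schema is not described here, so the sketch below makes no assumption about the layout: it walks any nested JSON structure and collects every value stored under a given key. The helper `find_key` and the sample data are made up for illustration.

```python
import json


def find_key(node, key):
    """Yield every value stored under `key` anywhere in a nested JSON structure."""
    if isinstance(node, dict):
        for k, v in node.items():
            if k == key:
                yield v
            yield from find_key(v, key)
    elif isinstance(node, list):
        for item in node:
            yield from find_key(item, key)


# Made-up sample, NOT the real unblob report layout.
sample = json.loads('[{"size": 10, "children": [{"size": 4}, {"name": "x"}]}]')
sizes = list(find_key(sample, "size"))
```

For a real report, load the file with `json.load` and pass the result to `find_key` with whichever key you are interested in.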

Limit recursion depth to 5 and set the entropy calculation depth to 2 (entropy analysis is on by default at depth 1):

unblob -d 5 -n 2 firmware.bin

Skip files matching a magic string prefix:

unblob --skip-magic "POSIX tar archive" firmware.bin

Load a custom handler plugin:

unblob -P ./myplugins/ firmware.bin

Full CLI reference

Usage: unblob [OPTIONS] FILE

Options:
  -e, --extract-dir DIRECTORY     Extract the files to this directory.
  -f, --force                     Force extraction even if outputs already exist.
  -d, --depth INTEGER             Recursion depth (default: 10).
  -n, --entropy-depth INTEGER     Entropy calculation depth (default: 1; 0 = off).
  -P, --plugins-path PATH         Load plugins from the provided path.
  -S, --skip-magic TEXT           Skip files with a given magic prefix.
  -p, --process-num INTEGER       Number of parallel worker processes (default: CPU count).
  --report PATH                   Write a JSON metadata report to this file.
  -k, --keep-extracted-chunks     Keep extracted chunks on disk.
  --delete-extracted-files TEXT   Delete intermediate files after extraction.
  -v, --verbose                   Increase verbosity (-v, -vv, -vvv).
  --show-external-dependencies    List required external tools and their status.
  -h, --help                      Show this message and exit.
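For automation, the documented flags can be driven from a script. This is a minimal sketch that uses only options listed in the reference above; `unblob_argv` and `run_unblob` are hypothetical helper names, and `run_unblob` assumes the unblob executable is on PATH.

```python
import subprocess


def unblob_argv(firmware, extract_dir=None, depth=None, report=None, force=False):
    """Translate keyword arguments into the documented unblob flags."""
    argv = ["unblob"]
    if extract_dir is not None:
        argv += ["--extract-dir", str(extract_dir)]
    if depth is not None:
        argv += ["--depth", str(depth)]
    if report is not None:
        argv += ["--report", str(report)]
    if force:
        argv.append("--force")
    argv.append(str(firmware))
    return argv


def run_unblob(firmware, **options):
    """Run unblob and return its exit code; requires unblob on PATH."""
    return subprocess.run(unblob_argv(firmware, **options), check=False).returncode
```

Passing a list to `subprocess.run` avoids shell quoting problems with unusual file names.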

Python API

from pathlib import Path
from unblob.processing import ExtractionConfig, process_file

config = ExtractionConfig(
    extract_root=Path("/tmp/output"),
    randomness_depth=1,
)

result = process_file(config, Path("firmware.bin"))

To also write a JSON report:

process_file(config, Path("firmware.bin"), report_file=Path("report.json"))

ExtractionConfig accepts the same options as the CLI: max_depth, process_num, skip_magic, force_extract, keep_extracted_chunks, and more. See the API reference for the full list.


Testing

unblob uses pytest. Integration test fixtures are stored in Git LFS.

# Install Git LFS (one-time setup)
git lfs install

# Install all development dependencies
uv sync --all-extras --dev

# Run the full test suite
uv run pytest tests/ -v

Documentation

Full documentation is available at https://unblob.org.


Contributing

Contributions are welcome! If you would like to add support for a new format or improve an existing one:

  1. Open an issue to describe the format (hex dumps, spec links, and sample files help a lot).
  2. Read the development guide to learn how to write handlers and extractors.
  3. Fork the repository, implement your changes, and open a pull request.

If you just need a format supported and don't want to implement it yourself, open an issue — we'll consider adding it.

See CONTRIBUTING for more details.


License

unblob is licensed under the MIT License.

Download files

Source Distribution

unblob-26.3.30.tar.gz (145.5 kB): Source

Built Distributions

unblob-26.3.30-cp39-abi3-musllinux_1_1_x86_64.whl (716.3 kB): CPython 3.9+, musllinux (musl 1.1+), x86-64

unblob-26.3.30-cp39-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (512.1 kB): CPython 3.9+, manylinux (glibc 2.17+), x86-64

unblob-26.3.30-cp39-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (507.6 kB): CPython 3.9+, manylinux (glibc 2.17+), ARM64

unblob-26.3.30-cp39-abi3-macosx_11_0_arm64.whl (460.9 kB): CPython 3.9+, macOS 11.0+, ARM64

unblob-26.3.30-cp39-abi3-macosx_10_12_x86_64.whl (465.0 kB): CPython 3.9+, macOS 10.12+, x86-64

File details

Details for the file unblob-26.3.30.tar.gz.

File metadata

  • Download URL: unblob-26.3.30.tar.gz
  • Upload date:
  • Size: 145.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: maturin/1.12.6

File hashes

Hashes for unblob-26.3.30.tar.gz
Algorithm Hash digest
SHA256 ee53a63fe91f1bc08a2158dc1019dec5315832044f83ca44e0302fdecf50639b
MD5 92e8046406dd1ac5ad527a0f58a64dbb
BLAKE2b-256 289c2fc5d8f2a91f7339e63deb2a527ce8f798982b997187964fe0a99b7b4a51

See more details on using hashes here.

File details

Details for the file unblob-26.3.30-cp39-abi3-musllinux_1_1_x86_64.whl.

File hashes

Hashes for unblob-26.3.30-cp39-abi3-musllinux_1_1_x86_64.whl
Algorithm Hash digest
SHA256 7757bcccecaf45e6915d12e8c32dc39e03336900de1d6502abc68cbdf048e239
MD5 908f22cec4554347b26b0cdb9561b422
BLAKE2b-256 edcb8fd8004bdbd26aa6ffb0c9158a0b684c241ac4c3ca23b2bf34f50fcbc8bd

File details

Details for the file unblob-26.3.30-cp39-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for unblob-26.3.30-cp39-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 25aa542d4b61eba3a4cacd3e6c83ff4dae0fcc203a8d206ee887fc89d3a67885
MD5 802c891492580ea408f38599d390b227
BLAKE2b-256 9d1271ad9c15c7d9e6c237acee8ecf50693d74f3dbb9caff1ddf1c455c65c00f

File details

Details for the file unblob-26.3.30-cp39-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl.

File hashes

Hashes for unblob-26.3.30-cp39-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
Algorithm Hash digest
SHA256 9ba16a2270f02d0a0b032c2d0b0066b7ccf9664b46791276b1370d01f385d353
MD5 c392884aa16e2f57303db0d0a6e2bf71
BLAKE2b-256 5057bf3322c8a3c5322e076a4071499db1245f6018df7edb34e2799aac59161a

File details

Details for the file unblob-26.3.30-cp39-abi3-macosx_11_0_arm64.whl.

File hashes

Hashes for unblob-26.3.30-cp39-abi3-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 65fe1e05659c4d4d41be9ca12f1c915d8c0f2748b64108de0c580462dc4972e4
MD5 aa568f19c90998ba24ddc1610793d4e5
BLAKE2b-256 6ac8edf5cb35fb66126c421da633d85c13675815f4d3af7b32870f4a32b1d036

File details

Details for the file unblob-26.3.30-cp39-abi3-macosx_10_12_x86_64.whl.

File hashes

Hashes for unblob-26.3.30-cp39-abi3-macosx_10_12_x86_64.whl
Algorithm Hash digest
SHA256 69742fffacc060c31383a56db9e97f3d7ef850f8de3061024ccbbcc4241b0ce4
MD5 f777e226022400df671fda29d4e774b3
BLAKE2b-256 70ae9582490da4af98533dbb1d1e85e2b22b11d1dace1ad3aff836713ec2bbe8
