Extract files from any kind of container format

unblob

Accurate, fast, and easy-to-use extraction suite for binary blobs.

unblob parses unknown binary blobs for 78+ archive, compression, and file-system formats, extracts their content recursively, and carves out unknown chunks. It is the perfect companion for extracting, analyzing, and reverse engineering firmware images.


Features

  • 78+ supported formats — archives, compression streams, and file systems including SquashFS, JFFS2, UBI/UBIFS, ext, CPIO, ZIP, 7-Zip, gzip, XZ, LZMA, LZ4, and many more. See the full list.
  • Recursive extraction — extracts containers within containers up to a configurable depth (default: 10 levels).
  • Precise chunk detection — identifies both start and end offsets of each chunk according to the format standard, minimizing false positives.
  • Unknown chunk carving — carves out and reports data that does not match any known format, automatically identifying null/0xFF padding.
  • Entropy analysis — calculates Shannon entropy and chi-square probability for unknown chunks, useful for spotting encrypted or compressed data.
  • JSON metadata reports — generates structured reports with chunk offsets, sizes, entropy, file ownership, permissions, timestamps, and more.
  • Multi-processing — uses all available CPU cores by default for fast extraction.
  • Extensible plugin system — write custom format handlers and extractors and load them at runtime with --plugins-path.
  • No elevated privileges required — runs safely as a regular user.
  • Battle-tested — fuzz tested against a large corpus of firmware images; relies on audited, pinned dependencies.
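To make the chunk carving and entropy features above more concrete, here is a minimal standard-library sketch. It is not unblob's implementation: the function names (`shannon_entropy`, `is_padding`, `unknown_gaps`) are hypothetical, and the chunk offsets are assumed to be half-open `(start, end)` pairs.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of data in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    # Sum p * log2(1/p) over every byte value that occurs in the data.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


def is_padding(data: bytes) -> bool:
    """True if data is non-empty and consists only of 0x00 or only of 0xFF bytes."""
    if not data:
        return False
    return data.count(0x00) == len(data) or data.count(0xFF) == len(data)


def unknown_gaps(known: list[tuple[int, int]], size: int) -> list[tuple[int, int]]:
    """Return the ranges of a blob of `size` bytes not covered by known chunks.

    `known` holds (start, end) offsets with an exclusive end (an assumption
    made for this sketch).
    """
    gaps = []
    cursor = 0
    for start, end in sorted(known):
        if start > cursor:
            gaps.append((cursor, start))
        cursor = max(cursor, end)
    if cursor < size:
        gaps.append((cursor, size))
    return gaps


if __name__ == "__main__":
    # Toy blob: null padding, one "known" chunk, then 0xFF padding.
    blob = b"\x00" * 16 + b"PAYLOAD!" + b"\xff" * 8
    for start, end in unknown_gaps([(16, 24)], len(blob)):
        chunk = blob[start:end]
        print(start, end, is_padding(chunk), shannon_entropy(chunk))
```

Entropy close to 8 bits per byte usually points to compressed or encrypted data, while values near 0 indicate padding or highly repetitive content.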

Installation

pip (recommended for most users)

pip install unblob

Then install the required external extractor tools. On Ubuntu/Debian:

sudo apt install android-sdk-libsparse-utils e2fsprogs p7zip-full unar zlib1g-dev liblzo2-dev lzop lziprecover libhyperscan-dev zstd lz4

For SquashFS support, also install sasquatch:

curl -L -o sasquatch_1.0.deb "https://github.com/onekey-sec/sasquatch/releases/download/sasquatch-v4.5.1-6/sasquatch_1.0_$(dpkg --print-architecture).deb"
sudo dpkg -i sasquatch_1.0.deb && rm sasquatch_1.0.deb

Verify that all extractors are available:

unblob --show-external-dependencies
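If you want a similar check from a script, for example in a CI job, the following sketch looks up executables on PATH. The executable names in `ASSUMED_TOOLS` are assumptions (package names and binary names can differ), and `missing_tools` is a hypothetical helper; the command above remains the authoritative check.

```python
import shutil


def missing_tools(names):
    """Return the subset of `names` that cannot be found on PATH."""
    return [name for name in names if shutil.which(name) is None]


# Assumed executable names for some of the packages listed above.
ASSUMED_TOOLS = ["7z", "unar", "lzop", "zstd", "lz4", "sasquatch"]

if __name__ == "__main__":
    missing = missing_tools(ASSUMED_TOOLS)
    print("missing:", ", ".join(missing) if missing else "none")
```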

Docker (batteries included)

The Docker image bundles all extractors — no extra setup needed:

docker run \
  --rm \
  --pull always \
  -v /path/to/extract-dir:/data/output \
  -v /path/to/files:/data/input \
  ghcr.io/onekey-sec/unblob:latest /data/input/firmware.bin

Note: The mounted directories must be owned by the uid:gid that the container runs as. On multi-user systems, add -u "$(id -u):$(id -g)" to the command so the container runs as your own user and group.

Kali Linux

sudo apt install unblob

Nix

nix profile install nixpkgs#unblob

Or add it to your NixOS/home-manager configuration — see the installation docs for flake and overlay examples.

From source

git clone https://github.com/onekey-sec/unblob.git
cd unblob
uv sync --no-dev
uv run unblob --show-external-dependencies

Requires Python ≥ 3.10, uv, and a Rust toolchain (for the compiled extensions).


Usage

Command line

Extract a file (output goes to <filename>_extract/ by default):

unblob firmware.bin

Specify a custom output directory:

unblob -e /tmp/output firmware.bin
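After an extraction finishes, it can help to get a quick overview of what landed in the output directory. The sketch below uses only the standard library and makes no assumption about unblob's internal layout beyond "a directory tree of extracted files"; `summarize_tree` is a hypothetical helper, and the demo builds a stand-in directory instead of running unblob.

```python
import os
import tempfile
from pathlib import Path


def summarize_tree(root):
    """Count regular files and their total size under root, skipping symlinks."""
    files = 0
    total = 0
    # os.walk does not descend into symlinked directories by default.
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            if path.is_symlink():
                # Extracted firmware often contains symlinks that point
                # outside the tree or to nothing at all; do not follow them.
                continue
            files += 1
            total += path.stat().st_size
    return {"files": files, "bytes": total}


if __name__ == "__main__":
    # Stand-in for a real extraction directory such as /tmp/output.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp) / "firmware.bin_extract"
        (root / "rootfs").mkdir(parents=True)
        (root / "rootfs" / "init").write_bytes(b"#!/bin/sh\n")
        print(summarize_tree(root))
```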

Generate a JSON metadata report:

unblob --report report.json firmware.bin

Limit recursion depth to 5 and set the entropy calculation depth to 2:

unblob -d 5 -n 2 firmware.bin

Skip files matching a magic string prefix:

unblob --skip-magic "POSIX tar archive" firmware.bin

Load a custom handler plugin:

unblob -P ./myplugins/ firmware.bin

Full CLI reference

Usage: unblob [OPTIONS] FILE

Options:
  -e, --extract-dir DIRECTORY     Extract the files to this directory.
  -f, --force                     Force extraction even if outputs already exist.
  -d, --depth INTEGER             Recursion depth (default: 10).
  -n, --entropy-depth INTEGER     Entropy calculation depth (default: 1; 0 = off).
  -P, --plugins-path PATH         Load plugins from the provided path.
  -S, --skip-magic TEXT           Skip files with a given magic prefix.
  -p, --process-num INTEGER       Number of parallel worker processes (default: CPU count).
  --report PATH                   Write a JSON metadata report to this file.
  -k, --keep-extracted-chunks     Keep extracted chunks on disk.
  --delete-extracted-files TEXT   Delete intermediate files after extraction.
  -v, --verbose                   Increase verbosity (-v, -vv, -vvv).
  --show-external-dependencies    List required external tools and their status.
  -h, --help                      Show this message and exit.
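When unblob is driven from another script, the options above can be assembled into an argument list instead of a shell string. This is a sketch only: `build_unblob_argv` and `run_unblob` are hypothetical helpers, and they cover just the -e, -d, -n, and --report options from the reference above.

```python
import shlex
import subprocess
from pathlib import Path


def build_unblob_argv(firmware, extract_dir=None, depth=None,
                      entropy_depth=None, report=None):
    """Build an argv list for the unblob CLI from optional settings."""
    argv = ["unblob"]
    if extract_dir is not None:
        argv += ["-e", str(extract_dir)]
    if depth is not None:
        argv += ["-d", str(depth)]
    if entropy_depth is not None:
        argv += ["-n", str(entropy_depth)]
    if report is not None:
        argv += ["--report", str(report)]
    argv.append(str(firmware))
    return argv


def run_unblob(argv):
    """Run unblob and raise CalledProcessError on a non-zero exit status."""
    return subprocess.run(argv, check=True)


if __name__ == "__main__":
    argv = build_unblob_argv(
        Path("firmware.bin"),
        extract_dir="/tmp/output",
        depth=5,
        report="report.json",
    )
    # Print the command only; call run_unblob(argv) once unblob is installed.
    print(shlex.join(argv))
```

Passing a list to `subprocess.run` avoids shell quoting problems with unusual firmware file names.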

Python API

from pathlib import Path
from unblob.processing import ExtractionConfig, process_file

config = ExtractionConfig(
    extract_root=Path("/tmp/output"),
    randomness_depth=1,
)

result = process_file(config, Path("firmware.bin"))

To also write a JSON report:

process_file(config, Path("firmware.bin"), report_file=Path("report.json"))

ExtractionConfig accepts the same options as the CLI: max_depth, process_num, skip_magic, force_extract, keep_extracted_chunks, and more. See the API reference for the full list.


Testing

unblob uses pytest. Integration test fixtures are stored in Git LFS.

# Install Git LFS (one-time setup)
git lfs install

# Install all development dependencies
uv sync --all-extras --dev

# Run the full test suite
uv run pytest tests/ -v

Documentation

Full documentation is available at https://unblob.org.


Contributing

Contributions are welcome! If you would like to add support for a new format or improve an existing one:

  1. Open an issue to describe the format (hex dumps, spec links, and sample files help a lot).
  2. Read the development guide to learn how to write handlers and extractors.
  3. Fork the repository, implement your changes, and open a pull request.

If you just need a format supported and don't want to implement it yourself, open an issue — we'll consider adding it.

See CONTRIBUTING for more details.


License

unblob is licensed under the MIT License.
