WebAssembly decoder & disassembler

wasm-tob

Python module capable of decoding and disassembling WebAssembly modules and bytecode, according to the MVP specification of the WASM binary format.

As there is no official text format defined yet, the implemented text format doesn't correspond to any existing definition; it is a simple mnemonic op1, op2, ... format. Functions are formatted similarly to how Google Chrome displays them in its debug console.
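To make the "mnemonic op1, op2, ..." shape concrete, here is a minimal standalone sketch. The helper name format_insn is hypothetical and is not part of wasm-tob; the library's own formatter is format_instruction, shown in the examples further down.

```python
def format_insn(mnemonic, operands=()):
    """Render one instruction as 'mnemonic op1, op2, ...'.

    Hypothetical helper for illustration only; not the wasm-tob API.
    """
    if not operands:
        return mnemonic
    return mnemonic + " " + ", ".join(str(op) for op in operands)


print(format_insn("i32.const", [24]))  # i32.const 24
print(format_insn("i32.load", [2, 0]))  # i32.load 2, 0
print(format_insn("end"))  # end
```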

❗ Important

This is a fork of the original project that the author was no longer able to spend time on: https://github.com/athre0z/wasm. The changes made here are primarily to support the Manticore project.

New issues and pull requests will be reviewed on a best-effort basis. Please open an issue first if you think fixing the problem will be complex; this is so we can evaluate whether a fix or feature is in scope before committing time to review. When opening an issue, please include information on how to reproduce what you are seeing. If you feel comfortable, please submit a well-crafted, minimal pull request that we can review.

Installation

# From PyPI
pip install wasm-tob

# From GitHub
pip install git+https://github.com/trail-of-forks/wasm-tob.git

Examples

Parsing a WASM module, printing the types of sections found.

from wasm_tob import decode_module

# Read the whole module into memory; use a separate name for the file handle
with open('input-samples/hello/hello.wasm', 'rb') as wasm_file:
    raw = wasm_file.read()

mod_iter = iter(decode_module(raw))
header, header_data = next(mod_iter)

for cur_sec, cur_sec_data in mod_iter:
    print(cur_sec_data.get_decoder_meta()['types']['payload'])

Possible output:

<wasm_tob.modtypes.TypeSection object at 0x108249b90>
<wasm_tob.modtypes.ImportSection object at 0x108249bd0>
<wasm_tob.modtypes.FunctionSection object at 0x108249c10>
<wasm_tob.modtypes.GlobalSection object at 0x108249cd0>
<wasm_tob.modtypes.ExportSection object at 0x108249d10>
<wasm_tob.modtypes.ElementSection object at 0x108249d90>
<wasm_tob.modtypes.CodeSection object at 0x108249dd0>
<wasm_tob.modtypes.DataSection object at 0x108249e10>
<wasm_tob.types.BytesField object at 0x108249b10>
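For orientation, the section list above reflects the layout of the MVP binary format: an 8-byte header (magic and version) followed by sections, each made of an id, a LEB128-encoded payload length, and the payload. The following is a rough standard-library sketch of that walk, not how wasm-tob implements it. The names read_varuint, list_sections, and the hand-built sample bytes are assumptions made for illustration.

```python
import struct

# Section ids as defined by the MVP binary format.
SECTION_NAMES = {
    0: "custom", 1: "type", 2: "import", 3: "function", 4: "table",
    5: "memory", 6: "global", 7: "export", 8: "start", 9: "element",
    10: "code", 11: "data",
}


def read_varuint(buf, pos):
    """Decode an unsigned LEB128 integer; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def list_sections(raw):
    """Return (version, [(section_name, payload_size), ...])."""
    magic, version = struct.unpack_from("<4sI", raw, 0)
    if magic != b"\x00asm":
        raise ValueError("not a WASM module")
    pos = 8
    sections = []
    while pos < len(raw):
        sec_id, pos = read_varuint(raw, pos)
        size, pos = read_varuint(raw, pos)
        sections.append((SECTION_NAMES.get(sec_id, "unknown"), size))
        pos += size  # skip the payload without decoding it
    return version, sections


# Hand-built sample: header, an empty type section, and a custom section.
sample = b"\x00asm\x01\x00\x00\x00" + b"\x01\x01\x00" + b"\x00\x05\x04name"
print(list_sections(sample))
```

Unlike decode_module, this sketch only reads section headers and never interprets the payloads.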

Parsing specific sections (e.g. GlobalSection, ElementSection, DataSection) in a WASM module, printing each section's content:

from wasm_tob import (
    decode_module,
    format_instruction,
    format_lang_type,
    format_mutability,
    SEC_DATA,
    SEC_ELEMENT,
    SEC_GLOBAL,
)

# Read the whole module into memory; use a separate name for the file handle
with open('input-samples/hello/hello.wasm', 'rb') as wasm_file:
    raw = wasm_file.read()

mod_iter = iter(decode_module(raw))
header, header_data = next(mod_iter)

for cur_sec, cur_sec_data in mod_iter:
    if cur_sec_data.id == SEC_GLOBAL:
        print("GlobalSection:")
        for idx, entry in enumerate(cur_sec_data.payload.globals):
            print(
                format_mutability(entry.type.mutability),
                format_lang_type(entry.type.content_type),
            )

            for cur_insn in entry.init:
                print(format_instruction(cur_insn))

    if cur_sec_data.id == SEC_ELEMENT:
        print("ElementSection:")
        for idx, entry in enumerate(cur_sec_data.payload.entries):
            print(entry.index, entry.num_elem, entry.elems)
            for cur_insn in entry.offset:
                print(format_instruction(cur_insn))

    if cur_sec_data.id == SEC_DATA:
        print("DataSection:")
        for idx, entry in enumerate(cur_sec_data.payload.entries):
            print(entry.index, entry.size, entry.data.tobytes())
            for cur_insn in entry.offset:
                print(format_instruction(cur_insn))

Output:

GlobalSection:
mut i32
get_global 0
end
mut i32
get_global 1
end
[...]
mut f32
f32.const 0x0
end
mut f32
f32.const 0x0
end
ElementSection:
0 12576 [856, 856, 856, [...], 888]
i32.const 0
end
DataSection:
0 16256 b'\x98&\x00\x00\xfe4\x00\x00\x10\x04\x00\x00\x00...\x00N10__cxxabiv121__vmi_class_type_infoE'
get_global 8
end

Manually disassemble WASM bytecode, printing each instruction.

from wasm_tob import (
    decode_bytecode,
    format_instruction,
    INSN_ENTER_BLOCK,
    INSN_LEAVE_BLOCK,
)

raw = bytearray([2, 127, 65, 24, 16, 28, 65, 0, 15, 11])
indent = 0
for cur_insn in decode_bytecode(raw):
    if cur_insn.op.flags & INSN_LEAVE_BLOCK:
        indent -= 1
    print('  ' * indent + format_instruction(cur_insn))
    if cur_insn.op.flags & INSN_ENTER_BLOCK:
        indent += 1

Output:

block -1
  i32.const 24
  call 28
  i32.const 0
  return
end
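To see where each output line comes from, the same ten bytes can be decoded by hand. The sketch below is standard-library only and is not the wasm-tob decoder: it knows just the five MVP opcodes that occur in the sample, and the names read_leb128, OPCODES, and disassemble are made up for this illustration. It also shows why the block is printed with -1: the block type byte 127 (0x7f, i32) is read as a signed LEB128 value.

```python
def read_leb128(buf, pos, signed):
    """Decode one LEB128 integer from buf at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:
            break
    if signed and byte & 0x40:
        result -= 1 << shift  # sign-extend the final group
    return result, pos


# opcode -> (mnemonic, immediate kind)
# True = signed immediate, False = unsigned immediate, None = no immediate
OPCODES = {
    0x02: ("block", True),
    0x0B: ("end", None),
    0x0F: ("return", None),
    0x10: ("call", False),
    0x41: ("i32.const", True),
}


def disassemble(raw):
    """Return indented text lines for the few opcodes in OPCODES."""
    lines, pos, indent = [], 0, 0
    while pos < len(raw):
        mnemonic, signed = OPCODES[raw[pos]]
        pos += 1
        text = mnemonic
        if signed is not None:
            value, pos = read_leb128(raw, pos, signed)
            text += f" {value}"
        if mnemonic == "end":
            indent -= 1  # dedent before printing the closing instruction
        lines.append("  " * indent + text)
        if mnemonic == "block":
            indent += 1  # indent everything inside the block
    return lines


for line in disassemble(bytes([2, 127, 65, 24, 16, 28, 65, 0, 15, 11])):
    print(line)
```

Any opcode outside the small table raises a KeyError, which is acceptable for a sketch but is exactly the gap that decode_bytecode fills.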

wasmdump command-line tool

The module also comes with a simple command-line tool called wasmdump, which dumps all module structures in a tree format. When invoked with --disas, it also disassembles all functions found (slow).

Version support

This project aims to support all Python releases that are still actively supported and maintained. If you encounter issues with a particular Python version, please open an issue.

Source Distribution

wasm-tob-1.0.1.tar.gz (15.7 kB)

Uploaded Source

Built Distribution

wasm_tob-1.0.1-py3-none-any.whl (17.3 kB)

Uploaded Python 3

File details

Details for the file wasm-tob-1.0.1.tar.gz.

File metadata

  • Download URL: wasm-tob-1.0.1.tar.gz
  • Upload date:
  • Size: 15.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.1 CPython/3.11.2

File hashes

Hashes for wasm-tob-1.0.1.tar.gz

Algorithm    Hash digest
SHA256       0f78c55c619ab9e459b3dcf00f8aa267053be7b957e1a3ea89747f2dd97c2b16
MD5          5baac6661c02973087230e59e66235e7
BLAKE2b-256  7b73fe4593d7193fbe93cacfe8be17a7d87accd794f3d1c8e79e5a4b1b2117a2
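A downloaded archive can be checked against the SHA256 digest listed above using Python's hashlib. This is a minimal sketch; the helper name sha256_matches and the local file path in the commented usage are assumptions, not part of wasm-tob or PyPI tooling.

```python
import hashlib

# SHA256 digest listed above for wasm-tob-1.0.1.tar.gz
EXPECTED_SHA256 = "0f78c55c619ab9e459b3dcf00f8aa267053be7b957e1a3ea89747f2dd97c2b16"


def sha256_matches(data, expected_hex):
    """Return True if the SHA256 of data equals the expected hex digest."""
    return hashlib.sha256(data).hexdigest() == expected_hex.lower()


# Usage, assuming the sdist was saved to the current directory (hypothetical path):
# with open("wasm-tob-1.0.1.tar.gz", "rb") as archive:
#     print(sha256_matches(archive.read(), EXPECTED_SHA256))
```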

File details

Details for the file wasm_tob-1.0.1-py3-none-any.whl.

File metadata

  • Download URL: wasm_tob-1.0.1-py3-none-any.whl
  • Upload date:
  • Size: 17.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.1 CPython/3.11.2

File hashes

Hashes for wasm_tob-1.0.1-py3-none-any.whl

Algorithm    Hash digest
SHA256       43964558326a943f7291615650b54da2a964a80029c8c952e49011901c89405a
MD5          b10e4b3fc5d1a3fad5d65af7a396470b
BLAKE2b-256  742cf8f5bcc5939063e15061dfaa59491ee480b1c4a1aac30c75e6235428528d
