
A powerful declarative symmetric parser/builder for binary data


Malstruct

Malstruct is a powerful declarative and symmetrical parser and builder for binary data that was forked from construct as of release 2.10.70.

Instead of writing imperative code to parse a piece of data, you declaratively define a data structure that describes your data. As this data structure is not code, you can use it in one direction to parse data into Pythonic objects, and in the other direction, to build objects into binary data.

The library provides both simple, atomic constructs (such as integers of various sizes) and composite ones that let you form hierarchical and sequential structures of increasing complexity. It features bit and byte granularity, easy debugging and testing, an easy-to-extend subclass system, and many primitive constructs to make your work easier:

  • Fields: raw bytes or numerical types

  • Structs and Sequences: combine simpler constructs into more complex ones

  • Bitwise: splitting bytes into bit-grained fields

  • Adapters: change how data is represented

  • Arrays/Ranges: duplicate constructs

  • Meta-constructs: use the context (history) to compute the size of data

  • If/Switch: branch the computational path based on the context

  • On-demand (lazy) parsing: read and parse only what you require

  • Pointers: jump from here to there in the data stream

  • Tunneling: prefix data with a byte count or compress it

Example

A Struct is a collection of ordered, named fields:

>>> format = Struct(
...     "signature" / Const(b"BMP"),
...     "width" / Int8ub,
...     "height" / Int8ub,
...     "pixels" / Array(this.width * this.height, Byte),
... )
>>> format.build(dict(width=3,height=2,pixels=[7,8,9,11,12,13]))
b'BMP\x03\x02\x07\x08\t\x0b\x0c\r'
>>> format.parse(b'BMP\x03\x02\x07\x08\t\x0b\x0c\r')
Container(signature=b'BMP')(width=3)(height=2)(pixels=[7, 8, 9, 11, 12, 13])
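
For contrast, here is a minimal sketch of the imperative code that the declarative Struct above stands in for, written with only the standard library. The names parse_image and build_image are hypothetical and not part of Malstruct. Note that the parsing and building directions have to be written, and kept in sync, separately.

```python
import struct

SIGNATURE = b"BMP"


def parse_image(data: bytes) -> dict:
    """Parse signature, width (u8), height (u8), then width * height pixel bytes."""
    if data[:3] != SIGNATURE:
        raise ValueError("bad signature")
    width, height = struct.unpack_from(">BB", data, 3)
    count = width * height
    pixels = list(data[5:5 + count])
    if len(pixels) != count:
        raise ValueError("truncated pixel data")
    return {"signature": SIGNATURE, "width": width, "height": height, "pixels": pixels}


def build_image(width: int, height: int, pixels: list) -> bytes:
    """Build the same layout in the opposite direction."""
    if len(pixels) != width * height:
        raise ValueError("pixel count does not match width * height")
    return SIGNATURE + struct.pack(">BB", width, height) + bytes(pixels)


raw = build_image(3, 2, [7, 8, 9, 11, 12, 13])
parsed = parse_image(raw)
```

With the declarative form, a single definition covers both directions, so the two functions cannot drift apart.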

A Sequence is a collection of ordered fields, and differs from Array and GreedyRange in that those two are homogeneous:

>>> format = Sequence(PascalString(Byte, "utf8"), GreedyRange(Byte))
>>> format.build([u"lalaland", [255,1,2]])
b'\x08lalaland\xff\x01\x02'
>>> format.parse(b"\x004361789432197")
['', [52, 51, 54, 49, 55, 56, 57, 52, 51, 50, 49, 57, 55]]
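
To make the byte layout of the Sequence above concrete, the following standard-library sketch mimics a one-byte length-prefixed (Pascal) string followed by a greedy run of bytes. The helpers build_pascal and parse_sequence are hypothetical illustrations, not Malstruct functions.

```python
def build_pascal(text: str, encoding: str = "utf8") -> bytes:
    """Encode text as a single length byte followed by the encoded payload."""
    payload = text.encode(encoding)
    if len(payload) > 255:
        raise ValueError("payload too long for a one-byte length prefix")
    return bytes([len(payload)]) + payload


def parse_sequence(data: bytes, encoding: str = "utf8") -> list:
    """Read a length-prefixed string, then treat every remaining byte as an integer."""
    if not data:
        raise ValueError("missing length byte")
    length = data[0]
    text = data[1:1 + length].decode(encoding)
    rest = list(data[1 + length:])
    return [text, rest]


result = parse_sequence(b"\x004361789432197")
prefixed = build_pascal("hi")
```

A length byte of zero yields an empty string, so every byte after it falls into the greedy part, which is why the parse example above returns an empty string followed by the ASCII codes of the digits.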

Malware Analysis

Helpers and utilities have been added to Malstruct to aid in malware analysis and configuration parser development, ranging from simple Windows structure extensions to constructs and adapters that help process binary file types (e.g. PE, ELF, and Mach-O).

For example, the following spec extracts a referenced string from a 64-bit PE file:

>>> spec = FocusLast(
    "re" / RegexSearch(
        re.compile(
            # test64.exe @ 0x14000101d
            br"""
                \x45\x33\xc9                    # xor     r9d, r9d; lpNumberOfCharsWritten
                \x41\xb8(?P<size>.{4})          # mov     r8d, 0Eh; nNumberOfCharsToWrite
                \x48\x8d\x15(?P<ro>.{4})(?P<e>) # lea     rdx, aHelloWorld; "Hello, World!\n"
                \x48\x8b\x4c\x24.               # mov     rcx, [rsp+48h+hConsoleOutput]; hConsoleOutput
                \xff\x15.{4}                    # call    cs:WriteConsoleA
                \x33\xc9                        # xor     ecx, ecx; uExitCode
            """,
            re.DOTALL | re.VERBOSE
        ),
        size=Int32ul,
        ro=Int32ul,
        e=Tell
    ),
    PEPointer64(this.re.ro, this.re.e, String(this.re.size))
)
>>> spec.parse(data, pe=pe)
'Hello, World!\n'
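
The regex part of the spec above can be illustrated with the standard library alone. This sketch assumes, based only on the example, that each named group is decoded with the construct passed as the matching keyword (Int32ul for a 4-byte little-endian integer) and that the empty group e records a stream position. The sample bytes and the extract_fields helper are hypothetical, and this is not a description of how RegexSearch is implemented.

```python
import re
import struct

# Shortened version of the pattern: two instructions with captured operands.
PATTERN = re.compile(
    br"""
        \x41\xb8(?P<size>.{4})           # mov r8d, imm32
        \x48\x8d\x15(?P<ro>.{4})(?P<e>)  # lea rdx, [rip + disp32]
    """,
    re.DOTALL | re.VERBOSE,
)

# Synthetic code bytes (made up for this sketch, not taken from test64.exe).
SAMPLE = (
    b"\x90\x90"                                    # padding before the match
    + b"\x45\x33\xc9"                              # xor r9d, r9d
    + b"\x41\xb8" + struct.pack("<I", 14)          # mov r8d, 0Eh
    + b"\x48\x8d\x15" + struct.pack("<I", 0x0FD3)  # lea rdx, [rip + 0FD3h]
    + b"\x48\x8b\x4c\x24\x20"                      # mov rcx, [rsp + 20h]
)


def extract_fields(data: bytes) -> dict:
    """Return the decoded operands and the offset just past the lea instruction."""
    match = PATTERN.search(data)
    if match is None:
        raise ValueError("pattern not found")
    (size,) = struct.unpack("<I", match.group("size"))
    (ro,) = struct.unpack("<I", match.group("ro"))
    # An empty group matches at a position, so its end is that offset.
    return {"size": size, "ro": ro, "e": match.end("e")}


fields = extract_fields(SAMPLE)
```

The position captured by e matters because a RIP-relative displacement is measured from the end of the instruction that contains it.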

As an alternative to PEPointer64, the PEMemoryAddress adapter can convert the file offset to a memory address so that the target address can be computed directly:

>>> spec = FocusLast(
    "re" / RegexSearch(
        re.compile(
            # test64.exe @ 0x14000101d
            br"""
                \x45\x33\xc9                    # xor     r9d, r9d; lpNumberOfCharsWritten
                \x41\xb8(?P<size>.{4})          # mov     r8d, 0Eh; nNumberOfCharsToWrite
                \x48\x8d\x15(?P<ro>.{4})(?P<e>) # lea     rdx, aHelloWorld; "Hello, World!\n"
                \x48\x8b\x4c\x24.               # mov     rcx, [rsp+48h+hConsoleOutput]; hConsoleOutput
                \xff\x15.{4}                    # call    cs:WriteConsoleA
                \x33\xc9                        # xor     ecx, ecx; uExitCode
            """,
            re.DOTALL | re.VERBOSE
        ),
        size=Int32ul,
        ro=Int32ul,
        e=PEMemoryAddress(Tell)
    ),
    PEPointer(this.re.ro + this.re.e, String(this.re.size))
)
>>> spec.parse(data, pe=pe)
'Hello, World!\n'
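
The address arithmetic behind both variants can be sketched as follows. The image base and section layout are hypothetical values chosen for illustration, and the helper names are not Malstruct APIs. The end address 0x14000102d follows from the comment in the examples: the pattern starts at 0x14000101d and the three matched instructions occupy 3 + 6 + 7 bytes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Section:
    name: str
    virtual_address: int  # RVA at which the section is mapped
    raw_offset: int       # file offset of the section data
    raw_size: int         # size of the section data in the file


# Hypothetical layout; a real one comes from the PE section table.
IMAGE_BASE = 0x140000000
SECTIONS = [
    Section(".text", 0x1000, 0x400, 0x1000),
    Section(".rdata", 0x2000, 0x1400, 0x1000),
]


def offset_to_va(offset: int) -> int:
    """Convert a file offset to a virtual (memory) address."""
    for s in SECTIONS:
        if s.raw_offset <= offset < s.raw_offset + s.raw_size:
            return IMAGE_BASE + s.virtual_address + (offset - s.raw_offset)
    raise ValueError("offset is not inside any section")


def va_to_offset(va: int) -> int:
    """Convert a virtual address back to a file offset."""
    rva = va - IMAGE_BASE
    for s in SECTIONS:
        if s.virtual_address <= rva < s.virtual_address + s.raw_size:
            return s.raw_offset + (rva - s.virtual_address)
    raise ValueError("address is not inside any section")


def rip_relative_target(instruction_end_offset: int, displacement: int) -> int:
    """File offset referenced by a RIP-relative operand.

    The displacement is added to the address of the next instruction.
    x86-64 displacements are signed; this sketch expects a decoded integer.
    """
    return va_to_offset(offset_to_va(instruction_end_offset) + displacement)


end_va = offset_to_va(0x42D)
target = rip_relative_target(0x42D, 0x0FD3)
```

Under this layout the instruction ending at file offset 0x42d maps to 0x14000102d, and adding the displacement 0x0fd3 lands at 0x140002000, the start of the hypothetical .rdata section.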

PEcon

Malstruct includes the pecon (PE file reconstruction utility) package. See the pecon API documentation for more information.

Changelog

The format is based on Keep a Changelog, and this project adheres to Calendar Versioning with the schema MAJOR.MINOR.YYYY0M0D.
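
As a small illustration of the MAJOR.MINOR.YYYY0M0D schema, the sketch below splits a version string into its numeric parts and a release date. The Version type and parse_version function are hypothetical helpers and are not shipped with Malstruct.

```python
from datetime import date
from typing import NamedTuple


class Version(NamedTuple):
    major: int
    minor: int
    released: date


def parse_version(text: str) -> Version:
    """Parse 'MAJOR.MINOR.YYYY0M0D', where the last part is a zero-padded date."""
    major, minor, stamp = text.split(".")
    if len(stamp) != 8 or not stamp.isdigit():
        raise ValueError(f"not a YYYY0M0D date component: {stamp!r}")
    released = date(int(stamp[:4]), int(stamp[4:6]), int(stamp[6:]))
    return Version(int(major), int(minor), released)


version = parse_version("3.0.20260518")
```

Version strings whose last component is not an eight-digit date, such as 2.10.71 further down, are rejected by this sketch because they do not follow the schema.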

3.0.20260518 - 2026-05-18

Added

  • Alias VarIntl to VarInt

  • Add VarIntb for big-endian parsing

  • PEImport, PEImportPointer, and PEImportSymbol to process imported APIs from memory address references in PE files
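
For readers unfamiliar with variable-length integers, the sketch below shows the two byte orders in plain Python. The little-endian form follows the Protocol Buffers style scheme that construct documents for VarInt: seven payload bits per byte, with the high bit set on every byte except the last. The big-endian form is an assumption about what VarIntb does (most significant group first, as in VLQ encodings); the changelog does not specify its exact layout. All function names are hypothetical.

```python
def encode_varint_le(value: int) -> bytes:
    """Least significant 7-bit group first; high bit marks continuation."""
    if value < 0:
        raise ValueError("only unsigned values are supported")
    out = bytearray()
    while True:
        group = value & 0x7F
        value >>= 7
        if value:
            out.append(group | 0x80)
        else:
            out.append(group)
            return bytes(out)


def decode_varint_le(data: bytes) -> int:
    result = 0
    for index, byte in enumerate(data):
        result |= (byte & 0x7F) << (7 * index)
        if not byte & 0x80:
            return result
    raise ValueError("truncated varint")


def encode_varint_be(value: int) -> bytes:
    """Most significant 7-bit group first (assumed layout, see lead-in)."""
    if value < 0:
        raise ValueError("only unsigned values are supported")
    groups = [value & 0x7F]  # final byte carries no continuation bit
    value >>= 7
    while value:
        groups.append((value & 0x7F) | 0x80)
        value >>= 7
    return bytes(reversed(groups))


def decode_varint_be(data: bytes) -> int:
    result = 0
    for byte in data:
        result = (result << 7) | (byte & 0x7F)
        if not byte & 0x80:
            return result
    raise ValueError("truncated varint")


little = encode_varint_le(300)
big = encode_varint_be(300)
```

For the value 300 the two encodings contain the same 7-bit groups in opposite order, which is the only difference between the variants in this sketch.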

3.0.20260429 - 2026-04-29

Changed

  • Split out core functionality across adapters, alignment, analysis, bytes_, conditional, exceptions, expr, helpers, integers, lazy, mappings, miscellaneous, stream, strings, and transforms

  • Moved binary file analysis to malstruct.binaryfiles

  • Moved remaining malstruct.utils functionality to base level

  • Added pecon utility as an installed package

  • Move from “flat” layout to “src” layout

  • Use pyproject.toml configuration file for packaging

Removed

  • Removed usage of __all__ in init

  • Removed usage of compilation feature and benchmarks

  • Removed py3compat functionality

  • Removed pefileutils and elffileutils

  • Removed functionality from machoutils unrelated to malstructs/adapters

2.10.71

Changed

  • Reverted default behavior changed by https://github.com/construct/construct/pull/1015
    • OffsettedEnd, Prefixed, FixedSized, NullTerminated, NullStripped, and ProcessXor use offsets relative to the last occurrence of these subconstructs

    • To use offsets relative to the beginning of the stream set absolute=True when constructing these constructs

  • Moved optional dependencies to required dependencies
