Pythonic C-style structs for parsing binary data

deconstruct

deconstruct provides a Pythonic analogue of C's struct, primarily for the purpose of interpreting (i.e. deconstructing) contiguous binary data.

Internally, deconstruct uses Python's struct module and can be considered an abstraction over it. struct (the module) can be frustrating to use: its format strings appear arcane and, furthermore, separate the description of the data from its representation. Keeping the two together is a definite strength of C's struct.

In contrast, deconstruct allows structs to be defined and used with a syntax that is Pythonic while maintaining close correspondence to C.

Usage

With deconstruct, the struct definition (adapted from input.h):

#include <stdint.h>

struct input_event {  
    uint64_t time[2];
    int16_t type;
    int16_t code;
    int32_t value;
};

can be represented as:

import deconstruct as c

class InputEvent(c.Struct):
    time: c.uint64[2]
    type: c.int16
    code: c.int16
    value: c.int32

This definition can be used to interpret and then access binary data:

>>> buffer = b'Some  arbitrary  buffer!'
>>> event = InputEvent(buffer)
>>> event.code
26229
>>> event.time
(8241904116577431379, 2340027244253309282)
>>> print(event)
struct InputEvent [ByteOrder.NATIVE, TypeWidth.STANDARD] {
    time: uint64 = (8241904116577431379, 2340027244253309282)
    type: int16 = 25120
    code: int16 = 26229
    value: int32 = 561145190
}

Of course, in reality the buffer passed in is more likely to come from something more useful, like a file. Notice that fixed-size, n-dimensional arrays can be specified using the syntax type[length], a further improvement on Python's struct.
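For comparison, here is a rough sketch of the same parse done with the standard library's struct module directly. The format string '<2Qhhl' is my own assumed equivalent of the InputEvent layout, not something taken from deconstruct. Little-endian order is spelled out so the result does not depend on the host; the output shown above is what a little-endian machine produces.

```python
import struct

buffer = b'Some  arbitrary  buffer!'  # the same 24-byte buffer as above

# '<' selects little-endian byte order with standard sizes and no padding.
# 2Q = uint64[2], h = int16, h = int16, l = int32 (assumed mapping).
fields = struct.unpack('<2Qhhl', buffer)

# The caller has to remember which position holds which field.
time = fields[0:2]
type_, code, value = fields[2:]

print(time)
print(type_, code, value)
```

The field names and the array grouping live only in the surrounding code here, which is the separation of description and representation that the class-based definition avoids.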

Installation

deconstruct is now on PyPI:

pip install deconstruct

Alternatively you can install straight from this repository:

pip install https://github.com/biqqles/deconstruct/archive/master.zip

Built wheels are also available under Releases, as is a changelog. The latest release is version 0.6.

deconstruct has no dependencies but requires Python >= 3.6, as it makes use of the variable annotation syntax added in that release (see PEP 526).

API listing

Struct(buffer: bytes)

Subclass this to define your own structs. Subclasses should only declare fields of C types defined in this package.

When you instantiate your Struct with a bytes-like object, deconstruct creates a format string and uses it to unpack that buffer. In the instance, C types will be replaced with their equivalent Python types for use (e.g. bytes for char, int for schar and float for double). All types available for use in struct field definitions and their details are documented in the table below.

Attributes

| Name | Type | Description |
| --- | --- | --- |
| __byte_order__ | ByteOrder | Set this in your subclass definition to define the byte order used when unpacking the struct. One of: ByteOrder.NATIVE (default value), ByteOrder.BIG_ENDIAN, ByteOrder.LITTLE_ENDIAN. |
| __type_width__ | TypeWidth | Set this in your subclass definition to define the type width and padding used for the struct. One of: TypeWidth.NATIVE, TypeWidth.STANDARD (default value). When TypeWidth.NATIVE is set, the struct will use native type widths and alignment. When TypeWidth.STANDARD is used, the struct will use Python's struct's "standard" widths [1] and no padding. |

Note that TypeWidth.NATIVE can only be used with ByteOrder.NATIVE. This is a limitation of Python's struct.
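The limitation comes from the prefix characters of struct format strings: only '@' selects native sizes and alignment, while '=', '<' and '>' all imply standard sizes without padding. The sketch below uses the standard library only, and the two-field layout in it is a made-up example. How deconstruct maps its settings onto these prefixes is not shown here.

```python
import struct

# Hypothetical layout: one signed char followed by one int.
layout = 'bi'

# Standard sizes, no padding: 1 + 4 bytes regardless of byte order.
standard_native_order = struct.calcsize('=' + layout)
little_endian = struct.calcsize('<' + layout)
big_endian = struct.calcsize('>' + layout)

# Native sizes and alignment: platform dependent, padding may be inserted.
native = struct.calcsize('@' + layout)

print(standard_native_order, little_endian, big_endian)
print(native)

# Byte order changes the bytes produced for the same value.
low_byte_first = struct.pack('<h', 1)
high_byte_first = struct.pack('>h', 1)
print(low_byte_first, high_byte_first)
```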

Class methods

| Signature | Return type | Description |
| --- | --- | --- |
| new(*args) | Struct | Construct a new struct instance with field values specified as positional arguments, passed in order of definition. Note that arguments are not type checked. |

Class properties

| Name | Type | Description |
| --- | --- | --- |
| format_string | str | The format string for this struct, compatible with Python's struct module |
| sizeof | int | The total size in bytes of the struct. Equivalent to C's sizeof |

Instance methods

| Signature | Return type | Description |
| --- | --- | --- |
| to_bytes() | bytes | Returns the in-memory ("packed") representation of this struct instance |
| _require() | bool | Override this method to specify your own instance validation logic. This method is called each time the struct is initialised; a ValueError will be raised if it returns False. |

You can also print Struct instances for easier debugging and compare them using the == operator.
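As a rough illustration of what packing, size and validation amount to, here is a sketch using only the standard library. The format string is my assumed equivalent of InputEvent, and check_fields is a hypothetical rule written for this example, not part of deconstruct.

```python
import struct

layout = struct.Struct('<2Qhhl')  # assumed equivalent of the InputEvent layout

values = (1, 2, 3, 4, 5)
packed = layout.pack(*values)      # comparable in spirit to to_bytes()
size = layout.size                 # comparable in spirit to sizeof
unpacked = layout.unpack(packed)   # the round trip gives back the same values


def check_fields(fields):
    """Hypothetical validation rule in the spirit of _require()."""
    # Field 2 is 'type' in this layout; reject negative values as an example.
    if fields[2] < 0:
        raise ValueError('validation failed')
    return fields


check_fields(unpacked)
print(size, unpacked == values)
```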

C types

deconstruct defines the following special types for use in Struct definitions: [2]

| deconstruct type | C99 type | Python format character | "Standard" width (bytes) [1] | Resolves to Python type |
| --- | --- | --- | --- | --- |
| char | char | c | 1 | bytes of length 1 |
| schar | signed char | b | 1 | int |
| uchar | unsigned char | B | 1 | int |
| short | short | h | 2 | int |
| ushort | unsigned short | H | 2 | int |
| int | int | i | 4 | int |
| uint | unsigned int | I | 4 | int |
| long | long | l | 4 | int |
| ulong | unsigned long | L | 4 | int |
| longlong | long long | q | 8 | int |
| ulonglong | unsigned long long | Q | 8 | int |
| bool | bool (_Bool) | ? | 1 | bool |
| float | float | f | 4 | float |
| double | double | d | 8 | float |
| int8 | int8_t | b* | 1 | int |
| uint8 | uint8_t | B* | 1 | int |
| int16 | int16_t | h* | 2 | int |
| uint16 | uint16_t | H* | 2 | int |
| int32 | int32_t | l* | 4 | int |
| uint32 | uint32_t | L* | 4 | int |
| int64 | int64_t | q* | 8 | int |
| uint64 | uint64_t | Q* | 8 | int |
| ptr | void*/intptr_t/uintptr_t | P | N/A** | int |
| size | size_t | N | N/A** | int |
| ssize | ssize_t | n | N/A** | int |

* format character with __type_width__ = TypeWidth.STANDARD - platform specific otherwise.
** only available with __type_width__ = TypeWidth.NATIVE.
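The "standard" widths can be checked against Python's struct module directly. This sketch uses the standard library only; it also shows that the P, n and N format characters are rejected when standard sizes are requested.

```python
import struct

# Width in bytes of each format character under standard sizes ('=' prefix).
standard_widths = {code: struct.calcsize('=' + code) for code in 'cbBhHiIlLqQ?fd'}
print(standard_widths)

# P, n and N only exist in native mode; standard mode raises struct.error.
native_only = {}
for code in 'PnN':
    try:
        struct.calcsize('=' + code)
        native_only[code] = False
    except struct.error:
        native_only[code] = True
print(native_only)

# In native mode their sizes depend on the platform.
native_widths = {code: struct.calcsize('@' + code) for code in 'PnN'}
print(native_widths)
```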

Arrays

As mentioned earlier, all the types above support a type[length] syntax to define arrays. Multidimensional arrays work as you would expect, with int[2][2] declaring a 2-D array of type int and total length 4. When a Struct is used to unpack a buffer, each array will resolve to a tuple (or, in the case of a multidimensional array, a nested tuple) of the element type's equivalent Python type, as documented in the table above. The only exception to this is char, an array of which will be automatically concatenated into a single bytes object (if this behaviour is undesirable, use schar or uchar instead).
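To make the resulting shapes concrete, the sketch below reproduces them by hand with the standard library. It only illustrates the output described above (a nested tuple for int[2][2], a single bytes object for a char array) and makes no claim about how deconstruct builds them internally.

```python
import struct

# int[2][2]: four standard 4-byte ints, stored row by row as in C.
raw = struct.pack('<4i', 1, 2, 3, 4)
flat = struct.unpack('<4i', raw)

# Group the flat tuple into rows to get the nested tuple.
rows, cols = 2, 2
nested = tuple(flat[r * cols:(r + 1) * cols] for r in range(rows))
print(nested)

# char[4]: '4c' yields four single-byte bytes objects,
# while '4s' yields one bytes object of length 4.
separate_chars = struct.unpack('<4c', b'abcd')
joined_chars = struct.unpack('<4s', b'abcd')
print(separate_chars, joined_chars)
```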

Pointers

ptr uniquely supports an optional notation format using the > operator, allowing you to denote the type it points to. This notation is purely for programmer convenience - it, for example, has no effect on the size of the struct as all pointers are assumed to be of the size of void* (which is guaranteed to be able to hold any pointer).

To illustrate this syntax, f: c.ptr > c.double denotes a pointer to double (double* f;). Arrays of pointers and pointers to arrays are supported. For example, c.ptr[2] > c.int indicates an array of int*, while c.ptr > c.int[2] indicates a pointer to an int array.

You can also use Struct subtypes as the pointed-to type.
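The claim that the pointed-to type does not affect the size can be checked with the standard library. This sketch uses struct and ctypes rather than deconstruct, and it assumes a mainstream platform where all object pointers share one size.

```python
import ctypes
import struct

# Size of void* as seen by struct (native mode) and by ctypes.
void_ptr_size = struct.calcsize('P')
ctypes_void_ptr_size = ctypes.sizeof(ctypes.c_void_p)

# Pointers to different types: double* and a pointer to int[2].
double_ptr_size = ctypes.sizeof(ctypes.POINTER(ctypes.c_double))
int_array_ptr_size = ctypes.sizeof(ctypes.POINTER(ctypes.c_int * 2))

print(void_ptr_size, ctypes_void_ptr_size, double_ptr_size, int_array_ptr_size)
```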


1. Python's struct has the concept of "standard" type sizes. This is somewhat confusing coming from C, as its standards go to some length not to define a standard ABI. However, as this terminology is so fundamental to the documentation of Python's struct, it is replicated here for simplicity's sake. For most of the integer types these sizes match the minimum widths implied by the C standard; int is the exception, being 4 bytes here although C only guarantees 16 bits.

2. Because some of these conflict with Python's primitives, it is not recommended to use from deconstruct import *, as this will severely pollute your namespace (in fact this is a bad idea in general). I like to import deconstruct as c, as shown above.
