Python package to translate C structs into Python classes

cstructimpl

A Python package for translating C structs into Python classes.


Quick Start

Install from PyPI:

pip install cstructimpl

Define your struct and parse raw bytes:

>>> from typing import Annotated
>>> from cstructimpl import *
>>> class Info(CStruct):
...     age: Annotated[int, CInt.U8]
...     height: Annotated[int, CInt.U8]
...
>>> class Person(CStruct):
...     info: Info
...     name: Annotated[str, CStr(6)]
...
>>> Person.c_decode(bytes([18, 170]) + b"Pippo\x00")
Person(info=Info(age=18, height=170), name='Pippo')

Introduction

cstructimpl makes working with binary data in Python simple and intuitive.
By subclassing CStruct, you can define Python classes that map directly to C-style structs and parse raw bytes into fully typed objects.

No manual parsing, no boilerplate: just define your struct and let the library do the heavy lifting.
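
For comparison, here is roughly what parsing the Quick Start payload by hand looks like with Python's standard struct module. This sketch does not use cstructimpl; the format string and the variable names are choices made for this illustration only.

```python
import struct

# Same payload as the Quick Start example: two unsigned bytes, then a 6-byte string buffer.
raw = bytes([18, 170]) + b"Pippo\x00"

# "<" = little-endian with no implicit padding, "B" = unsigned 8-bit, "6s" = 6 raw bytes.
age, height, name_raw = struct.unpack("<BB6s", raw)

# Keep only the bytes before the first NUL terminator and decode them.
name = name_raw.split(b"\x00", 1)[0].decode("ascii")

print(age, height, name)
```

Every field needs its own format code and post-processing step here, which is the kind of boilerplate a declarative struct definition is meant to replace.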


Type System

At the core of the library is the BaseType protocol, which defines how types behave in the C world:

class BaseType(Protocol[T]):

    def c_size(self) -> int: ...
    def c_align(self) -> int: ...

    def c_decode(
        self,
        raw: bytes,
        *,
        is_little_endian: bool = True,
    ) -> T | None: ...

    def c_encode(
        self,
        data: T,
        *,
        is_little_endian: bool = True,
    ) -> bytes: ...

Any class that follows this protocol can act as a BaseType, controlling its own parsing, size, and alignment.
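
As a minimal sketch of what "following the protocol" means, the block below declares a local copy of the protocol shape (BaseTypeLike, not imported from cstructimpl) and a hypothetical U16 class that satisfies it structurally, without inheriting from it. The names BaseTypeLike and U16 are assumptions made for this illustration.

```python
from typing import Optional, Protocol, TypeVar, runtime_checkable

T = TypeVar("T")


@runtime_checkable
class BaseTypeLike(Protocol[T]):
    """Local copy of the protocol shape shown above (not the library's class)."""

    def c_size(self) -> int: ...
    def c_align(self) -> int: ...
    def c_decode(self, raw: bytes, *, is_little_endian: bool = True) -> Optional[T]: ...
    def c_encode(self, data: T, *, is_little_endian: bool = True) -> bytes: ...


class U16:
    """Hypothetical unsigned 16-bit integer type; it does not inherit from the protocol."""

    def c_size(self) -> int:
        return 2

    def c_align(self) -> int:
        return 2

    def c_decode(self, raw: bytes, *, is_little_endian: bool = True) -> int:
        return int.from_bytes(raw[:2], "little" if is_little_endian else "big")

    def c_encode(self, data: int, *, is_little_endian: bool = True) -> bytes:
        return data.to_bytes(2, "little" if is_little_endian else "big")


u16 = U16()
# runtime_checkable protocols only verify that the methods exist, not their signatures.
conforms = isinstance(u16, BaseTypeLike)
little = u16.c_decode(b"\x01\x00")                       # bytes read as 0x0001
big = u16.c_decode(b"\x01\x00", is_little_endian=False)  # bytes read as 0x0100
```

The same two bytes decode to different integers depending on is_little_endian, which is why the flag is part of both c_decode and c_encode.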

When parsing a struct:

  • If a field type is itself a BaseType, parsing happens automatically.
  • Otherwise, annotate the field with Annotated[..., BaseType] to tell the parser how to interpret it.
  • Types such as int have a default BaseType that is used when no annotation is provided. To change this behavior, override the corresponding entries in the cstructimpl.c_lib.DEFAULT_TYPE_TO_BASETYPE dictionary.
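
The rules above rely on metadata attached through Annotated. The following sketch shows how such metadata can be read back with the standard typing module. It illustrates the general mechanism only and is not necessarily how cstructimpl does it; U8Marker and field_info are hypothetical names.

```python
from typing import Annotated, get_args, get_origin, get_type_hints


class U8Marker:
    """Hypothetical stand-in for a BaseType such as CInt.U8."""

    def __repr__(self) -> str:
        return "U8Marker()"


U8 = U8Marker()


class Point:
    x: Annotated[int, U8]
    y: Annotated[int, U8]
    label: str  # plain annotation, no extra metadata


# include_extras=True keeps the Annotated wrapper instead of stripping it.
hints = get_type_hints(Point, include_extras=True)

field_info = {}
for name, hint in hints.items():
    if get_origin(hint) is Annotated:
        # The first argument is the Python type, the rest is the attached metadata.
        python_type, *metadata = get_args(hint)
        field_info[name] = (python_type, metadata)
    else:
        field_info[name] = (hint, [])

print(field_info)
```

A parser built this way can pick the first metadata entry that behaves like a BaseType and fall back to a default lookup for plain annotations.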

The library comes with a set of ready-to-use type definitions that cover the majority of C primitive types.

Class / Type    Description
BaseType[T]     Protocol that defines the interface for any encodable/decodable C-compatible type.
HasBaseType     Protocol for classes that return their own associated BaseType.
GetType         Wrapper calling c_get_type() on classes implementing HasBaseType. Enables automatic size, alignment, decode, encode access.
CInt            Enum covering signed/unsigned C integer types (I8/U8 → I128/U128).
CBool           Boolean BaseType with a C-compatible single-byte representation.
CFloat          Enum of IEEE‑754‑compliant floating‑point formats (F32, F64).
CArray[T]       Generic BaseType for fixed‑length arrays of a given element type.
CPadding        Represents unused/padding bytes between struct fields.
CStr            C‑style null‑terminated string of fixed max length.
CMapper[T,U]    Adapts between a BaseType[U] and custom Python type T. Useful for enums or custom conversions.

Examples

Here are a few practical examples showing how cstructimpl works in real-world scenarios.

Basic Deserialization

Define a simple struct with two fields:

>>> class Point(CStruct):
...     x: Annotated[int, CInt.U8]
...     y: Annotated[int, CInt.U8]
...
>>> Point.c_size()
2
>>> Point.c_align()
1
>>> Point.c_decode(bytes([1, 2]))
Point(x=1, y=2)

Serializing a Class

Create a class instance and serialize it to raw bytes:

>>> class Rect(CStruct):
...     width: Annotated[int, CInt.U8]
...     height: Annotated[int, CInt.U8] = 10
...
>>> rect = Rect(2)
>>> list(rect.c_encode())
[2, 10]
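
To make the resulting byte layout concrete, here is a hand-written equivalent using only the standard library. RectSketch is a hypothetical dataclass for this illustration and does not involve cstructimpl.

```python
import struct
from dataclasses import dataclass


@dataclass
class RectSketch:
    width: int
    height: int = 10  # default value, used when only width is given

    def encode(self) -> bytes:
        # "<BB": two unsigned 8-bit integers, little-endian, no padding.
        return struct.pack("<BB", self.width, self.height)


encoded = RectSketch(2).encode()
print(list(encoded))
```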

Nested Structs

You can embed structs inside other structs:

>>> class Dimensions(CStruct):
...     width: Annotated[int, CInt.U8]
...     height: Annotated[int, CInt.U8]
...
>>> class Rectangle(CStruct):
...     id: Annotated[int, CInt.U16]
...     dims: Dimensions
...
>>> Rectangle.c_size()
4
>>> Rectangle.c_align()
2
>>> Rectangle.c_decode(bytes([1, 0, 2, 3]))
Rectangle(id=1, dims=Dimensions(width=2, height=3))
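
The size and alignment reported above match the usual C layout rules: each field starts at the next multiple of its alignment, and the total size is rounded up to the largest field alignment. The sketch below applies those rules by hand. It assumes cstructimpl follows this conventional scheme, which is consistent with the values in this example; align_up and layout are hypothetical helpers.

```python
def align_up(offset: int, alignment: int) -> int:
    """Round offset up to the next multiple of alignment."""
    return (offset + alignment - 1) // alignment * alignment


def layout(fields):
    """fields: list of (name, size, align). Returns (offsets, total_size, struct_align)."""
    offsets = {}
    offset = 0
    struct_align = 1
    for name, size, align in fields:
        offset = align_up(offset, align)  # insert padding before the field if needed
        offsets[name] = offset
        offset += size
        struct_align = max(struct_align, align)
    # Trailing padding so that arrays of this struct stay aligned.
    return offsets, align_up(offset, struct_align), struct_align


# Rectangle: a U16 id (size 2, align 2) followed by Dimensions (size 2, align 1).
rectangle = layout([("id", 2, 2), ("dims", 2, 1)])

# A hypothetical struct with a 1-byte field followed by a 4-byte field.
padded = layout([("tag", 1, 1), ("value", 4, 4)])

print(rectangle)
print(padded)
```

In the second case three padding bytes are inserted after tag so that value starts at offset 4.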

Strings in Structs

Support for C-style null-terminated strings:

>>> class Message(CStruct):
...     length: Annotated[int, CInt.U16]
...     text: Annotated[str, CStr(6)]
...
>>> raw = bytes([5, 0]) + b"Hello\x00"
>>> Message.c_decode(raw)
Message(length=5, text='Hello')
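
A fixed-length, null-terminated string field can be handled by hand as sketched below. The helpers decode_cstr and encode_cstr are hypothetical, and the choices of UTF-8 and of rejecting text that leaves no room for the terminator are assumptions; CStr may behave differently in those edge cases.

```python
def decode_cstr(raw: bytes, encoding: str = "utf-8") -> str:
    """Decode the bytes that precede the first NUL terminator."""
    return raw.split(b"\x00", 1)[0].decode(encoding)


def encode_cstr(text: str, length: int, encoding: str = "utf-8") -> bytes:
    """Encode text into a buffer of exactly `length` bytes, NUL-padded."""
    data = text.encode(encoding)
    if len(data) >= length:
        # Assumption: always keep at least one byte for the terminator.
        raise ValueError(f"{text!r} does not fit in {length} bytes with a terminator")
    return data.ljust(length, b"\x00")


decoded = decode_cstr(b"Hello\x00")
encoded = encode_cstr("Hi", 6)
print(decoded, encoded)
```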

Enums with Autocast

Automatically cast numeric values into Python Enums:

>>> from enum import IntEnum
>>> class Mood(IntEnum):
...     HAPPY = 0
...     SAD = 1
...
>>> class Person(CStruct):
...     age: Annotated[int, CInt.U16]
...     mood: Annotated[Mood, CInt.U8, Autocast()]
...
>>> raw = bytes([18, 0, 1, 0])
>>> Person.c_decode(raw)
Person(age=18, mood=<Mood.SAD: 1>)

Arrays of Structs

Define fixed-size arrays of structs inside another struct:

>>> class Item(CStruct, align=2):
...     a: Annotated[int, CInt.U8]
...     b: Annotated[int, CInt.U8]
...     c: Annotated[int, CInt.U8]
...
>>> class ItemList(CStruct):
...     items: Annotated[list[Item], CArray(Item, 3)]
...
>>> data = bytes(range(1, 13))  # 3 items x 4 bytes each (3 fields + 1 padding byte)
>>> parsed = ItemList.c_decode(data)
>>> parsed == ItemList([
...     Item(1, 2, 3),
...     Item(5, 6, 7),
...     Item(9, 10, 11),
... ])
True
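
The values 4, 8, and 12 are missing from the parsed items because align=2 pads each 3-byte Item to 4 bytes. The sketch below reproduces that stride with plain slicing; it is an illustration of the layout, not the library's implementation, and the constant names are hypothetical.

```python
data = bytes(range(1, 13))

ITEM_FIELDS = 3  # three U8 fields per item
ITEM_ALIGN = 2   # alignment requested with align=2

# Round the item size up to a multiple of its alignment: 3 -> 4.
stride = (ITEM_FIELDS + ITEM_ALIGN - 1) // ITEM_ALIGN * ITEM_ALIGN

# Field bytes of each item, and the padding byte that follows them.
items = [tuple(data[start:start + ITEM_FIELDS]) for start in range(0, len(data), stride)]
padding = [data[start + ITEM_FIELDS] for start in range(0, len(data), stride)]

print(stride, items, padding)
```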

Custom BaseType

Hey! Is there a type that serializes a hash map of lists of structs of ...?

Yeah, sure there is! You can do it yourself!

cstructimpl lets you define your own BaseType implementations to handle any kind of data that is not present among the built-in primitives.

For example, here's a custom type that interprets a raw integer as a Unix timestamp, returning a Python datetime object:

>>> from datetime import datetime
>>> class UnixTimestamp(BaseType[datetime]):
...     def c_size(self) -> int:
...         return 4
...
...     def c_align(self) -> int:
...         return 4
...
...     def c_decode(self, raw: bytes, *, is_little_endian: bool = True) -> datetime:
...         byteorder = "little" if is_little_endian else "big"
...         ts = int.from_bytes(raw, byteorder=byteorder, signed=False)
...         return datetime.fromtimestamp(ts)
...
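...     def c_encode(self, data: datetime, *, is_little_endian: bool = True) -> bytes:
...         byteorder = "little" if is_little_endian else "big"
...         return int(data.timestamp()).to_bytes(4, byteorder=byteorder, signed=False)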
...
>>> class LogEntry(CStruct):
...     timestamp: Annotated[datetime, UnixTimestamp()]
...     level: Annotated[int, CInt.U8]
...
>>> LogEntry.c_decode(bytes([55, 0, 0, 0, 3, 0, 0, 0]))
LogEntry(timestamp=datetime.datetime(1970, ..., 55), level=3)
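
Note that datetime.fromtimestamp(ts) without a tz argument returns a naive datetime in the machine's local time zone, so the decoded value depends on where the code runs. A timezone-aware variant gives a deterministic result. The sketch below shows the conversion on its own with hypothetical helper functions; it is one possible design, not part of cstructimpl.

```python
from datetime import datetime, timezone


def decode_unix_timestamp(raw: bytes, *, is_little_endian: bool = True) -> datetime:
    """Interpret raw as an unsigned integer of seconds since the epoch, in UTC."""
    ts = int.from_bytes(raw, "little" if is_little_endian else "big", signed=False)
    return datetime.fromtimestamp(ts, tz=timezone.utc)


def encode_unix_timestamp(value: datetime, *, is_little_endian: bool = True) -> bytes:
    """Pack an aware datetime back into a 4-byte unsigned integer."""
    seconds = int(value.timestamp())
    return seconds.to_bytes(4, "little" if is_little_endian else "big", signed=False)


decoded = decode_unix_timestamp(bytes([55, 0, 0, 0]))
round_trip = encode_unix_timestamp(decoded)
print(decoded.isoformat(), list(round_trip))
```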

Bit-Fields

Bit-fields are especially useful in networking, where naming individual bit ranges makes packed headers far easier to work with. cstructimpl can reinterpret those bits through its own type system, so all of its tools, such as autocasting and mapping, remain available.

Example of a header with bit-fields decoded as an enumeration and a set of flags:

>>> from enum import IntFlag, IntEnum
>>> class Flags(IntFlag):
...     ACK = 1 << 0
...     SYN = 1 << 1
...     URG = 1 << 2
...
>>> class State(IntEnum):
...     PENDING = 0
...     ERROR = 1
...     SUCCESS = 2
...
>>> class Header(CStruct):
...     port: Annotated[int, CInt.U8, BitField(4)]
...     id: Annotated[int, CInt.U8, BitField(4)]
...     state: Annotated[State, CInt.U8, BitField(2), Autocast()]
...     flags: Annotated[Flags, CInt.U8, BitField(3), Autocast()]
...     len: Annotated[int, CInt.U8]
...
>>> raw = 0x101A21.to_bytes(3, byteorder="little", signed=False)
>>> Header.c_decode(raw)
Header(port=1, id=2, state=<State.SUCCESS: 2>, flags=<Flags.SYN|URG: 6>, len=16)
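
The same header can be unpacked by hand with shifts and masks, which shows where each value comes from. This sketch assumes bit-fields are allocated starting from the least significant bit of each U8 storage unit, an assumption that is consistent with the output above; the bits helper and the variable names are hypothetical.

```python
from enum import IntEnum, IntFlag


class Flags(IntFlag):
    ACK = 1 << 0
    SYN = 1 << 1
    URG = 1 << 2


class State(IntEnum):
    PENDING = 0
    ERROR = 1
    SUCCESS = 2


def bits(value: int, shift: int, width: int) -> int:
    """Extract `width` bits of `value`, starting at bit position `shift`."""
    return (value >> shift) & ((1 << width) - 1)


raw = 0x101A21.to_bytes(3, byteorder="little", signed=False)
b0, b1, b2 = raw  # 0x21, 0x1A, 0x10

port = bits(b0, 0, 4)          # low nibble of the first byte
ident = bits(b0, 4, 4)         # high nibble of the first byte
state = State(bits(b1, 0, 2))  # first byte is full, so a new byte starts here
flags = Flags(bits(b1, 2, 3))  # next three bits of the second byte
length = b2                    # plain U8 field

print(port, ident, state, flags, length)
```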

Autocast

Sometimes raw numeric values carry semantic meaning. In C, this is usually handled with enums.
With cstructimpl, you can automatically reinterpret values into enums (or other types) using Autocast.

>>> class ResultType(IntEnum):
...     OK = 0
...     ERROR = 1
...
>>> class Person(CStruct):
...     kind: Annotated[ResultType, CInt.U8, Autocast()]
...     error_code: Annotated[int, CInt.I32]
...

This is equivalent to writing a custom builder:

>>> class ResultType(IntEnum):
...     OK = 0
...     ERROR = 1
...
>>> class Person(CStruct):
...     kind: Annotated[ResultType, CMapper(CInt.U8, ResultType, int)]
...     error_code: Annotated[int, CInt.I32]
...

The Autocast version is much simpler and less error-prone.
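
Conceptually, this kind of cast amounts to decoding the raw integer and then calling the target type on it. The sketch below shows that idea with the standard library only; decode_u8_as is a hypothetical helper, and how cstructimpl itself reports values that do not match any enum member is not covered here.

```python
from enum import IntEnum


class ResultType(IntEnum):
    OK = 0
    ERROR = 1


def decode_u8_as(enum_type, raw: bytes):
    """Decode one unsigned byte, then convert it by calling the target type."""
    value = int.from_bytes(raw[:1], "little")
    return enum_type(value)


kind = decode_u8_as(ResultType, b"\x01")

# Calling an Enum with a value it does not define raises ValueError.
try:
    decode_u8_as(ResultType, b"\x07")
    error = None
except ValueError as exc:
    error = str(exc)

print(kind, error)
```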


Features

  • Define Python classes that map directly to C structs
  • Parse raw bytes into typed objects with a single method call
  • Serialize class instances to raw bytes using the built-in type system
  • Built-in type system for common C primitives
  • Support for nested structs
  • Flexible extension via the BaseType protocol

Use Cases

  • Parsing binary network protocols
  • Working with binary file formats
  • Interfacing with C libraries and data structures
  • Replacing boilerplate parsing code with clean, type-safe classes

Documentation

More detailed usage examples and advanced topics are available in the documentation.


Contributing

Contributions are welcome!

If you'd like to improve cstructimpl, please open an issue or submit a pull request on GitHub.


License

This project is licensed under the terms of the Apache-2.0 License.
