
A Python toolset for encoding and decoding Protocol Data Units (PDUs)

Project description

ProtocolDataUnits

ProtocolDataUnits is a Python toolset for encoding and decoding Protocol Data Units (PDUs). It is inspired by ProtocolDataUnits.jl by Dr. Mandar Chitre, ARL.

Installation

You can install ProtocolDataUnits using pip:

pip install ProtocolDataUnits  # install the released package from PyPI
pip install .  # or install from a local checkout of the source repository

🔧 Command-Line Interface (CLI) Usage

After installation, you can quickly test the functionality using the built-in demo CLI tool:

pdu-demo

This runs a collection of predefined PDU encoding/decoding examples showcasing:

  • Basic field types (e.g., uint8, float, double)
  • Fixed-length and length-prefixed strings
  • Variable-length arrays
  • Nested PDUs and pdu_fragment
  • JSON serialization and deserialization
  • Compression and decompression support

This is a great way to validate the installation and explore the library’s features interactively.

Features

ProtocolDataUnits includes the following features:

  • Base PDU Definition: Define the structure and format of your PDUs.
  • PDU Encoding/Decoding: Encode and decode PDUs with various field types.
  • Nested PDU / PDU_FRAGMENT Support: Supports PDUs within PDUs, allowing for complex data structures.
  • CRC32 Checksum: Automatically compute and validate CRC32 checksums for data integrity.
  • Field Encoding/Decoding: Supports a variety of field types including integers, floats, strings, and arrays.
  • Byte Order Conversion: Flexibly handle big-endian and little-endian byte orders.
  • Metadata Storage: Store additional metadata within PDUs.
  • Stream Writing/Reading: Efficiently write and read PDUs to and from streams.
  • Variable Length Encoding/Decoding: Handle fields with variable length data.
  • Pretty Printing of PDUs: Generate human-readable representations of PDUs.
  • PDU Equality based on Fields: Compare PDUs based on their field values.
  • Serialization and Deserialization: Serialize PDU definitions to JSON and deserialize them back to PDU objects.
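
The CRC32 feature listed above can be pictured with Python's standard zlib and struct modules. This is a minimal sketch under stated assumptions, not the library's implementation: the helper names append_crc32 and verify_crc32 are hypothetical, and placing the checksum in the last 4 bytes as a big-endian integer is an assumption made for illustration.

```python
import struct
import zlib


def append_crc32(payload: bytes) -> bytes:
    """Return the payload followed by its CRC32 as a 4-byte big-endian integer."""
    return payload + struct.pack(">I", zlib.crc32(payload))


def verify_crc32(frame: bytes) -> bytes:
    """Check the trailing CRC32 and return the payload, or raise ValueError."""
    if len(frame) < 4:
        raise ValueError("frame too short to hold a CRC32")
    payload = frame[:-4]
    (expected,) = struct.unpack(">I", frame[-4:])
    if zlib.crc32(payload) != expected:
        raise ValueError("CRC32 mismatch")
    return payload


frame = append_crc32(b"\x07hello")
payload = verify_crc32(frame)  # returns the original payload when the checksum matches
```

A receiver that gets a ValueError here would discard the frame rather than decode its fields.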

Usage

Creating PDU Formats

Use create_pdu_format to define a PDU's structure, then call encode() to turn field values into bytes and decode() to turn those bytes back into structured data.

from ProtocolDataUnits.pdu import create_pdu_format, PDU

# Create a simple PDU format with uint8, float, and double fields
my_pdu_format = create_pdu_format(
    24, 'big',  # Total length: 24 bytes, big-endian order
    ('uint8', 'type'),  # 1 byte
    ('float', 'value1'),  # 4 bytes
    ('double', 'value2')  # 8 bytes
)

# Encode data into binary format
encoded_bytes = my_pdu_format.encode({'type': 7, 'value1': 3.14, 'value2': 6.28})
print(f"Encoded Bytes: {encoded_bytes}")

# Decode binary data back into structured format
decoded_data = my_pdu_format.decode(encoded_bytes)
print(f"Decoded Data: {decoded_data}")
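
The field sizes noted in the comments above can be checked with Python's standard struct module. This is an illustrative sketch of the byte layout only, not how ProtocolDataUnits is implemented, and it does not show how the library fills the remaining bytes up to the declared 24-byte length.

```python
import struct

# '>' selects big-endian byte order with no alignment padding:
# B = uint8 (1 byte), f = float (4 bytes), d = double (8 bytes)
LAYOUT = ">Bfd"

packed = struct.pack(LAYOUT, 7, 3.14, 6.28)
field_bytes = len(packed)  # 1 + 4 + 8 bytes of field data

type_, value1, value2 = struct.unpack(LAYOUT, packed)
# value1 was stored as a 32-bit float, so it comes back close to 3.14
# but not bit-for-bit equal to the 64-bit Python float 3.14.
print(field_bytes, type_, value1, value2)
```

The same precision effect applies to any 'float' field, so compare decoded float values with a tolerance rather than with ==.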

Working with Nested PDUs

Nested PDUs allow you to embed one PDU structure within another, enabling more complex data structures.

# Define a nested PDU format
nested_pdu = create_pdu_format(
    8, 'big',  # Length: 8 bytes, big-endian order
    ('uint8', 'nested_type'),  # 1 byte
    ('uint8', 'nested_value')  # 1 byte
)

# Define the main PDU format containing the nested PDU
main_pdu = create_pdu_format(
    16, 'big',  # Length: 16 bytes, big-endian order
    ('uint8', 'type'),  # 1 byte
    ('pdu_fragment', 'nested', nested_pdu)  # Nested PDU takes up 8 bytes
)

# Encode data with nested PDU
encoded_bytes = main_pdu.encode({'type': 7, 'nested': {'nested_type': 1, 'nested_value': 2}})
print(f"Encoded Bytes: {encoded_bytes}")

# Decode data with nested PDU
decoded_data = main_pdu.decode(encoded_bytes)
print(f"Decoded Data: {decoded_data}")
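
The idea of a fragment occupying a fixed slice of its parent can be sketched with the struct module. This is not the library's code: the helper names are hypothetical, and zero-filling the unused bytes of each PDU up to its declared length is an assumption made for illustration.

```python
import struct

NESTED_LEN = 8   # declared length of the nested PDU
MAIN_LEN = 16    # declared length of the main PDU


def encode_nested(nested_type: int, nested_value: int) -> bytes:
    """Pack two uint8 fields, then zero-fill to NESTED_LEN (assumed fill)."""
    body = struct.pack(">BB", nested_type, nested_value)
    return body.ljust(NESTED_LEN, b"\x00")


def encode_main(type_: int, nested: bytes) -> bytes:
    """Pack the uint8 type, append the nested bytes, zero-fill to MAIN_LEN."""
    body = struct.pack(">B", type_) + nested
    return body.ljust(MAIN_LEN, b"\x00")


def decode_main(data: bytes) -> dict:
    """Read the type at offset 0 and the nested fields starting at offset 1."""
    (type_,) = struct.unpack_from(">B", data, 0)
    nested_type, nested_value = struct.unpack_from(">BB", data, 1)
    return {
        "type": type_,
        "nested": {"nested_type": nested_type, "nested_value": nested_value},
    }


encoded = encode_main(7, encode_nested(1, 2))
decoded = decode_main(encoded)
```

Because the nested PDU has a fixed length, the decoder knows where the fragment ends without needing a length prefix.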

Serialization and Deserialization

You can serialize a PDU's structure to JSON, allowing you to save and reload PDU definitions, making it easy to share PDU formats or store them for later use.

# Create a complex PDU with various field types
my_pdu = (
    PDU()
    .length(68)
    .order('big')
    .uint8('type')
    .float('value1')
    .double('value2')
    .fixed_string('fixed_str', 10)
    .length_prefixed_string('length_str')
    .variable_length_array('array', 'uint8')
    .padding(0xff)
)

# Encode data into the PDU format
encoded_bytes = my_pdu.encode({'type': 7, 'value1': 3.14, 'value2': 6.28, 'fixed_str': 'hello', 'length_str': 'dynamic string', 'array': [1, 2, 3, 4, 5]}, compress=True)
print(f"Encoded Bytes: {encoded_bytes}")

# Decode the PDU back into structured data
decoded_data = my_pdu.decode(encoded_bytes, decompress=True)
print(f"Decoded Data: {decoded_data}")

# Serialize the PDU definition to JSON
json_str = my_pdu.to_json()
print(f"Serialized PDU to JSON: {json_str}")

# Deserialize the PDU from JSON
new_pdu = PDU.from_json(json_str)
print(f"Deserialized PDU from JSON: {new_pdu.to_json()}")

# Encode and decode using the deserialized PDU
encoded_bytes_new = new_pdu.encode({'type': 7, 'value1': 3.14, 'value2': 6.28, 'fixed_str': 'hello', 'length_str': 'dynamic string', 'array': [1, 2, 3, 4, 5]}, compress=True)
print(f"Encoded Bytes (new PDU): {encoded_bytes_new}")

decoded_data_new = new_pdu.decode(encoded_bytes_new, decompress=True)
print(f"Decoded Data (new PDU): {decoded_data_new}")

Advanced Example: API Usage

# Create a PDU format with various fields using the API
my_pdu_format = create_pdu_format(
    48, 'big',  # Length: 48 bytes, big-endian order
    ('uint8', 'type'),
    ('float', 'value1'),
    ('double', 'value2'),
    ('fixed_string', 'fixed_str', 10),
    ('length_prefixed_string', 'length_str'),
    ('variable_length_array', 'array', 'uint8'),
    ('padding', 0xff)
)

# Nested PDU example
nested_pdu = create_pdu_format(
    8, 'big',
    ('uint8', 'nested_type'),
    ('uint8', 'nested_value')
)

main_pdu = create_pdu_format(
    16, 'big',
    ('uint8', 'type'),
    ('pdu_fragment', 'nested', nested_pdu)
)

# Encode data
encoded_bytes = main_pdu.encode({'type': 7, 'nested': {'nested_type': 1, 'nested_value': 2}})
print(f"Encoded Bytes: {encoded_bytes}")

# Decode data
decoded_data = main_pdu.decode(encoded_bytes)
print(f"Decoded Data: {decoded_data}")

Notes

  • Compression and Decompression: Using compress=True in encode() will compress the resulting byte array with zlib, reducing size. Use decompress=True in decode() to decode compressed data.
  • Serialization: Convert your PDU definitions to JSON for easy saving and sharing of structures.
  • Flexible Field Types: Supports integers, floats, fixed-length strings, length-prefixed strings, arrays, nested PDUs, and more.
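
To get a feel for when zlib compression pays off, the standard zlib module can be run directly on sample bytes. This sketch only demonstrates zlib itself; the sample frame is made up, and how ProtocolDataUnits frames compressed data internally is not shown here.

```python
import zlib

# A made-up 64-byte frame that is mostly padding, as fixed-length PDUs often are.
raw = b"\x07" + b"\x00" * 63

compressed = zlib.compress(raw)
restored = zlib.decompress(compressed)

# Repetitive data shrinks, but zlib adds a small header and checksum,
# so a very short payload can come out larger than it went in.
tiny = zlib.compress(b"\x07")

print(len(raw), len(compressed), len(tiny))
```

Compression tends to help most on PDUs with long runs of padding or repeated values, and least on short or already dense payloads.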

Data Encoding Details

The ProtocolDataUnits library provides support for various types of data, including fixed-length strings, length-prefixed strings, and variable-length arrays. Below is an explanation of how each of these types is encoded:

1. Fixed-Length Strings

A fixed-length string is encoded as a byte sequence of a specified length. If the string provided is shorter than the specified length, it is padded with null bytes (\x00). If the string is longer, it is truncated to fit the specified length.

  • Field Definition: fixed_string('name', length=10)
  • Example:
    • Input: "hello" (5 characters)
    • Length: 10
    • Encoded: b'hello\x00\x00\x00\x00\x00' (5 characters plus 5 null bytes)
    • If the input was "helloworld!!!", only b'helloworld' (10 characters) would be encoded.
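
The padding and truncation rule can be sketched in a few lines of plain Python. The helper names below are hypothetical and this is not the library's code; UTF-8 is assumed as the text encoding, which matches the ASCII examples above.

```python
def encode_fixed_string(text: str, length: int) -> bytes:
    """Truncate to `length` bytes, then right-pad with null bytes."""
    data = text.encode("utf-8")[:length]
    # Note: truncation happens on bytes, so a multi-byte UTF-8 character
    # at the boundary could be cut in half. ASCII input is unaffected.
    return data.ljust(length, b"\x00")


def decode_fixed_string(data: bytes) -> str:
    """Strip trailing null padding and decode the remaining bytes."""
    return data.rstrip(b"\x00").decode("utf-8")


short = encode_fixed_string("hello", 10)
long_ = encode_fixed_string("helloworld!!!", 10)
```

One consequence of null padding is that a string which itself ends in null bytes cannot be told apart from padding after decoding.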

2. Length-Prefixed Strings

A length-prefixed string is encoded as an integer representing the length of the string, followed by the string data itself. The length is typically stored as a 4-byte unsigned integer (uint32), allowing strings up to 2^32 - 1 bytes.

  • Field Definition: length_prefixed_string('name')

  • Example:

    • Input: "dynamic string"
    • Encoded: b'\x0e\x00\x00\x00dynamic string'
      • b'\x0e\x00\x00\x00' represents the length 14 (0x0e in hexadecimal) as a 4-byte little-endian integer (least significant byte first).
      • b'dynamic string' is the actual string data.

    This encoding ensures that the length of the string is known during decoding, making it possible to handle strings of varying lengths.
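
    The example bytes above can be reproduced with the struct module. This is a sketch with hypothetical helper names, not the library's code; the '<I' format (little-endian uint32) is chosen only because it matches the bytes shown in the example.

```python
import struct


def encode_length_prefixed(text: str) -> bytes:
    """Prefix the UTF-8 bytes with their length as a little-endian uint32."""
    data = text.encode("utf-8")
    return struct.pack("<I", len(data)) + data


def decode_length_prefixed(buf: bytes, offset: int = 0):
    """Return (text, next_offset) for the string starting at `offset`."""
    (n,) = struct.unpack_from("<I", buf, offset)
    start = offset + 4
    return buf[start:start + n].decode("utf-8"), start + n


encoded = encode_length_prefixed("dynamic string")
text, next_offset = decode_length_prefixed(encoded)
```

Returning the next offset lets a caller continue decoding whatever field follows the string.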

3. Variable-Length Arrays

A variable-length array is encoded similarly to length-prefixed strings. The array is prefixed with a 4-byte integer (uint32) that indicates the number of elements in the array, followed by the serialized form of each element.

  • Field Definition: variable_length_array('name', element_type='uint8')

  • Example:

    • Input: [1, 2, 3, 4, 5] (an array of uint8)
    • Encoded: b'\x05\x00\x00\x00\x01\x02\x03\x04\x05'
      • b'\x05\x00\x00\x00' represents the element count 5 as a 4-byte little-endian integer.
      • b'\x01\x02\x03\x04\x05' contains the array elements encoded as uint8 values.

    Each element of the array is encoded according to the specified element_type. For instance, if the array type is uint16, each element would occupy 2 bytes.
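
    A count-prefixed array can be sketched with struct as follows. The helper names and the ELEMENT_CODES table are hypothetical, and little-endian order is assumed only so that the output matches the example bytes above.

```python
import struct

# Hypothetical mapping from element type names to struct format codes.
ELEMENT_CODES = {"uint8": "B", "uint16": "H", "uint32": "I"}


def encode_array(values, element_type="uint8") -> bytes:
    """Write a uint32 element count, then each element in order."""
    code = ELEMENT_CODES[element_type]
    return struct.pack(f"<I{len(values)}{code}", len(values), *values)


def decode_array(buf: bytes, element_type="uint8", offset: int = 0):
    """Return (values, next_offset) for the array starting at `offset`."""
    code = ELEMENT_CODES[element_type]
    (count,) = struct.unpack_from("<I", buf, offset)
    body_format = f"<{count}{code}"
    values = struct.unpack_from(body_format, buf, offset + 4)
    return list(values), offset + 4 + struct.calcsize(body_format)


as_uint8 = encode_array([1, 2, 3, 4, 5], "uint8")
as_uint16 = encode_array([1, 2, 3, 4, 5], "uint16")
```

The prefix counts elements, not bytes, so the same five values take 9 bytes as uint8 and 14 bytes as uint16.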

4. Padding

Padding is used to align data to a specific byte boundary or to ensure that the PDU reaches a predefined length. Padding bytes are usually filled with a specific value (often zero or a user-defined value).

  • Field Definition: padding(value=0xff)
  • Example:
    • If 4 bytes of padding are needed, the encoded value might be b'\xff\xff\xff\xff' when value=0xff.
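
Padding to a fixed total length amounts to appending a repeated byte. The helper below is a hypothetical sketch, not the library's code; raising an error when the data is already too long is an assumption about reasonable behavior.

```python
def pad_to_length(data: bytes, total_length: int, value: int = 0x00) -> bytes:
    """Append `value` bytes until `data` is exactly `total_length` bytes long."""
    if len(data) > total_length:
        raise ValueError("data is longer than the target length")
    return data + bytes([value]) * (total_length - len(data))


padded = pad_to_length(b"\x07", 5, value=0xFF)  # 1 data byte plus 4 padding bytes
```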

Summary Table

| Field Type | Prefix/Length | Data Format | Example (in bytes) |
|---|---|---|---|
| Fixed-Length String | None | Data padded or truncated to length | b'hello\x00\x00\x00\x00\x00' (length 10) |
| Length-Prefixed String | 4 bytes | Length + Data | b'\x0e\x00\x00\x00dynamic string' (length 14) |
| Variable-Length Array | 4 bytes | Length + Element Data | b'\x05\x00\x00\x00\x01\x02\x03\x04\x05' |
| Padding | None | Repeated value bytes | b'\xff\xff\xff\xff' |

Additional Notes

  • The use of length prefixes allows the decoder to know precisely how many bytes to read for a string or an array, making variable-length fields easier to handle.
  • Each length prefix adds a fixed 4 bytes (uint32) to the encoded data. The overhead is negligible for large arrays or strings but is proportionally significant for PDUs that contain many short ones.
  • Padding ensures data alignment but may add extra bytes to the encoded PDU.
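
The first note can be illustrated by reading two prefixed fields back to back from one buffer. This is a sketch with a hypothetical helper, assuming little-endian uint32 prefixes as in the examples above; it works on raw bytes, so the array here is limited to uint8 elements, where the element count equals the byte count.

```python
import struct


def read_prefixed(buf: bytes, offset: int):
    """Read a little-endian uint32 length, then that many bytes."""
    (n,) = struct.unpack_from("<I", buf, offset)
    start = offset + 4
    end = start + n
    if end > len(buf):
        raise ValueError("buffer ends before the declared length")
    return buf[start:end], end


# A length-prefixed string immediately followed by a uint8 array.
buf = b"\x0e\x00\x00\x00dynamic string" + b"\x05\x00\x00\x00\x01\x02\x03\x04\x05"

text_bytes, pos = read_prefixed(buf, 0)
array_bytes, pos = read_prefixed(buf, pos)
values = list(array_bytes)
```

Without the prefixes, the decoder would have no way to tell where the string stops and the array begins.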

Citation

  • If you use ProtocolDataUnits in your research, please cite it as follows:
    @software{Patel_ProtocolDataUnits_2023,
    author = {Patel, Jay and Chitre, Mandar},
    license = {MIT},
    month = oct,
    title = {{ProtocolDataUnits}},
    url = {https://github.com/patel999jay/ProtocolDataUnits},
    version = {1.0.0},
    year = {2023}
    }
    

Contributing

Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.

Please make sure to update tests as appropriate.

License

MIT

References:

  1. ProtocolDataUnits.jl by Dr. Mandar Chitre, ARL
