Facilities associated with binary data parsing and transcription.
Note: this module requires Python 3 and recommends Python 3.6+. It uses abc.ABC; a Python 2 bytes object is too weak (it is just a str), as is my cs.py3.bytes hack class; and the keyword based Packet initialisation benefits from keyword argument ordering.
In the description below I use the word "chunk" to mean a piece
of binary data obeying the buffer protocol, almost always a
bytes instance or a memoryview, but in principle also things
like bytearray.
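As a small illustration of what "obeying the buffer protocol" means in practice, the sketch below measures several chunk types the same way. The helper name chunk_length is hypothetical and is not part of this module.

```python
def chunk_length(chunk):
    """Return the byte length of anything obeying the buffer protocol."""
    # memoryview() accepts any buffer-protocol object without copying its data.
    return memoryview(chunk).nbytes

data = b"\x00\x01\x02\x03"
# bytes, a sliced memoryview and a bytearray are all acceptable chunks.
for chunk in (data, memoryview(data)[1:], bytearray(data)):
    print(type(chunk).__name__, chunk_length(chunk))
```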
The classes in this module support easy parsing of binary data structures.
The functions and classes in this module include:
- PacketField: an abstract class for a binary field, with a factory method to parse it, a transcription method to transcribe it back out in binary form and usually a .value attribute holding the parsed value.
- several presupplied subclasses for common basic types such as UInt32BE (an unsigned 32 bit big endian integer).
- struct_field: a factory for making PacketField classes for struct formats with a single value field.
- multi_struct_field and structtuple: factories for making PacketFields from struct formats with multiple value fields; structtuple makes PacketFields which are also namedtuples, supporting trivial access to the parsed values.
- Packet: a PacketField subclass for parsing multiple PacketFields into a larger structure with ordered named fields. The fields themselves may be Packets for complex structures.
You don't need to make fields only from binary data; because
PacketField.__init__ takes a post parse value, you can also
construct PacketFields from scratch with their values and
transcribe the resulting binary form.
Each PacketField subclass has the following methods:
- transcribe: easily return the binary transcription of this field, either directly as a chunk (or for convenience, also None or an ASCII str) or by yielding successive binary data.
- from_buffer: a factory to parse this field from a cs.buffer.CornuCopyBuffer.
- from_bytes: a factory to parse this field from a chunk with an optional starting offset; this is a convenience wrapper for from_buffer.
That may sound a little arcane, but we also supply:
- flatten: a recursive function to take the return from any transcribe method and yield chunks, so copying a packet to a file or elsewhere can always be done by iterating over flatten(field.transcribe()) or via the convenience field.transcribe_flat() method which calls flatten itself.
- a CornuCopyBuffer is an easy to use wrapper for parsing any iterable of chunks, which may come from almost any source. It has a bunch of convenient factories including: from_bytes, make a buffer from a chunk; from_fd, make a buffer from a file descriptor; from_file, make a buffer from a file-like object; from_mmap, make a buffer from a file descriptor using a memory map (the mmap module) of the file, so that chunks can use the file itself as backing store instead of allocating and copying memory. See the cs.buffer module for further detail.
When parsing a complex structure
one must choose between subclassing PacketField or Packet.
An effective guideline is the degree of substructure.
A Packet is designed for deeper structures;
all of its attributes are themselves PacketFields
(or Packets, which are PacketField subclasses).
The leaves of this hierarchy will be PacketFields,
whose attributes are ordinary types.
By contrast, a PacketField's attributes are "flat" values:
the plain post-parse value, such as a str or an int
or some other conventional Python type.
The base case for PacketField is a single such value, named .value,
and the natural implementation
is to provide a .value_from_buffer method
which returns the basic single value
and the corresponding .transcribe_value method
to return or yield its binary form
(directly or in pieces respectively).
However, you can handle multiple attributes with this class by instead implementing:
- __init__: to compose an instance from post-parse values (and thus from scratch rather than parsed from existing binary data)
- from_buffer: class method to parse the values from a CornuCopyBuffer and call the class constructor
- transcribe: to return or yield the binary form of the attributes
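The three-method shape described above can be sketched with a plain class. This is only an illustration under stated assumptions: PointSketch is a hypothetical name, it does not subclass PacketField, and it parses from a bytes chunk (returning the instance and the new offset) instead of from a CornuCopyBuffer so that it runs with the standard library alone.

```python
import struct

class PointSketch:
    """Hypothetical two-attribute field: signed 16 bit big endian x and y."""

    _struct = struct.Struct('>hh')

    def __init__(self, x, y):
        # Compose an instance from post-parse values.
        self.x = x
        self.y = y

    @classmethod
    def from_bytes(cls, chunk, offset=0):
        # Parse the values, then call the class constructor.
        x, y = cls._struct.unpack_from(chunk, offset)
        return cls(x, y), offset + cls._struct.size

    def transcribe(self):
        # Yield the binary form of the attributes.
        yield self._struct.pack(self.x, self.y)

point, offset = PointSketch.from_bytes(b'\x00\x01\xff\xfe trailing')
print(point.x, point.y, offset)
```

Because the constructor takes post-parse values, PointSketch(1, -2) can also be built from scratch and transcribed without any source data.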
Cameron Simpson cs@cskk.id.au 22jul2018
Class BSData
MRO: PacketField, abc.ABC
A run length encoded data chunk, with the length encoded as a BSUInt.
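A length-prefixed layout of this kind might look like the sketch below. The function names are hypothetical, and the prefix encoding is my reading of the BSUInt description in this document (7 value bits per octet, big endian, high bit set on every octet except the last); the module's own code may differ in detail.

```python
def transcribe_bsdata(data):
    """Yield a length prefix followed by the data itself."""
    n = len(data)
    prefix = [n & 0x7f]                   # final octet: high bit clear
    n >>= 7
    while n:
        prefix.append(0x80 | (n & 0x7f))  # earlier octets: high bit set
        n >>= 7
    yield bytes(reversed(prefix))
    yield bytes(data)

def parse_bsdata(chunk, offset=0):
    """Return (data, new_offset) parsed from chunk starting at offset."""
    length = 0
    while True:
        octet = chunk[offset]
        offset += 1
        length = (length << 7) | (octet & 0x7f)
        if not octet & 0x80:
            break
    end = offset + length
    if end > len(chunk):
        raise ValueError("chunk too short for the declared length")
    return bytes(chunk[offset:end]), end

encoded = b''.join(transcribe_bsdata(b'hello'))
print(encoded, parse_bsdata(encoded))
```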
Class BSSFloat
MRO: PacketField, abc.ABC
A float transcribed as a BSString of str(float).
Class BSString
MRO: PacketField, abc.ABC
A run length encoded string, with the length encoded as a BSUInt.
Class BSUInt
MRO: PacketField, abc.ABC
A binary serialised unsigned int.
This uses a big endian byte encoding where continuation octets have their high bit set. The bits contributing to the value are in the low order 7 bits.
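To make the encoding concrete, here is a minimal sketch of an encoder and decoder written from the description above. The names encode_bsuint and decode_bsuint are hypothetical, and the sketch assumes that every octet except the last is a continuation octet.

```python
def encode_bsuint(n):
    """Encode a non-negative int, most significant 7-bit group first."""
    if n < 0:
        raise ValueError("n must be >= 0")
    octets = [n & 0x7f]                   # last octet: high bit clear
    n >>= 7
    while n > 0:
        octets.append(0x80 | (n & 0x7f))  # continuation octet: high bit set
        n >>= 7
    return bytes(reversed(octets))

def decode_bsuint(data, offset=0):
    """Return (value, new_offset) decoded from data starting at offset."""
    n = 0
    while True:
        octet = data[offset]
        offset += 1
        n = (n << 7) | (octet & 0x7f)
        if not octet & 0x80:
            return n, offset

print(encode_bsuint(300).hex())           # 300 = (2 << 7) | 44
print(decode_bsuint(b'\x82\x2c'))
```

Values below 128 occupy a single octet, so small lengths cost only one byte of overhead.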
Class BytesesField
MRO: PacketField, abc.ABC
A field containing a list of bytes chunks.
The following attributes are defined:
- value: the gathered data as a list of bytes instances, or None if the field was gathered with discard_data true.
- offset: the starting offset of the data.
- end_offset: the ending offset of the data.
The offset and end_offset values are recorded during the
parse, and may become irrelevant if the field's contents are
changed.
Class BytesField
MRO: PacketField, abc.ABC
A field of bytes.
Class BytesRunField
MRO: PacketField, abc.ABC
A field containing a continuous run of a single bytes value.
The following attributes are defined:
- length: the length of the run
- bytes_value: the repeated bytes value
The property value is computed on the fly on every reference
and returns a value obeying the buffer protocol: a bytes or
memoryview object.
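The idea of storing only the length and the repeated byte, and building the data on demand, can be sketched as follows. BytesRunSketch is a hypothetical stand-in and not the module's class; it assumes bytes_value is a single byte, and the chunked transcribe method is my own addition to show why a run need not be held in memory.

```python
class BytesRunSketch:
    """Hypothetical run of one repeated byte, stored as (length, byte)."""

    def __init__(self, length, bytes_value=b'\0'):
        if len(bytes_value) != 1:
            raise ValueError("bytes_value must be exactly one byte")
        self.length = length
        self.bytes_value = bytes_value

    @property
    def value(self):
        # Computed on every reference; nothing large is stored.
        return self.bytes_value * self.length

    def transcribe(self, chunk_size=65536):
        # Yield the run in bounded pieces instead of one big allocation.
        remaining = self.length
        block = self.bytes_value * min(remaining, chunk_size)
        while remaining > 0:
            piece = block[:remaining]
            yield piece
            remaining -= len(piece)

run = BytesRunSketch(10, b'a')
print(run.value, [len(p) for p in run.transcribe(chunk_size=4)])
```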
Class EmptyPacketField
MRO: PacketField, abc.ABC
An empty data field, used as a placeholder for optional
fields when they are not present.
The singleton EmptyField is a predefined instance.
Function fixed_bytes_field(length, class_name=None)
Factory for BytesField subclasses built from fixed length byte strings.
Function flatten(chunks)
Flatten chunks into an iterable of bytes instances.
This exists to allow subclass methods to easily return ASCII
strings or bytes or iterables or even None, in turn allowing
them simply to return their superclass' chunks iterators
directly instead of having to unpack them.
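A recursive flattener along these lines could be written as below. This is a sketch of the behaviour described here, not the module's code; the name flatten_sketch is hypothetical.

```python
def flatten_sketch(chunks):
    """Yield chunks from None, a chunk, an ASCII str or a nested iterable."""
    if chunks is None:
        return                            # None contributes nothing
    if isinstance(chunks, (bytes, bytearray, memoryview)):
        yield chunks                      # already a chunk
    elif isinstance(chunks, str):
        yield chunks.encode('ascii')      # convenience: ASCII text
    else:
        for subchunk in chunks:           # any other iterable: recurse
            yield from flatten_sketch(subchunk)

nested = [b'ab', None, 'cd', [b'ef', ['gh']]]
print(list(flatten_sketch(nested)))
```

With such a helper a transcribe method can return whatever is most convenient, including the unexpanded result of a superclass transcribe call.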
Class Float64BE
MRO: PacketField, abc.ABC
A PacketField which parses and transcribes the struct format '>d'.
Class Float64LE
MRO: PacketField, abc.ABC
A PacketField which parses and transcribes the struct format '<d'.
Class Int16BE
MRO: PacketField, abc.ABC
A PacketField which parses and transcribes the struct format '>h'.
Class Int16LE
MRO: PacketField, abc.ABC
A PacketField which parses and transcribes the struct format '<h'.
Class Int32BE
MRO: PacketField, abc.ABC
A PacketField which parses and transcribes the struct format '>l'.
Class Int32LE
MRO: PacketField, abc.ABC
A PacketField which parses and transcribes the struct format '<l'.
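The paired BE and LE classes differ only in the byte order prefix of their struct format. As a rough illustration using the standard struct module directly (not the classes in this module), the same signed 32 bit value packs to mirrored byte sequences:

```python
import struct

value = -2
big = struct.pack('>l', value)     # big endian: most significant byte first
little = struct.pack('<l', value)  # little endian: least significant byte first
print(big.hex(), little.hex())

# Each form unpacks back to the same value with its own format.
print(struct.unpack('>l', big), struct.unpack('<l', little))
```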
Class ListField
MRO: PacketField, abc.ABC
A field which is a list of other fields.
Function multi_struct_field(struct_format, subvalue_names=None, class_name=None)
Factory for PacketField subclasses built around complex struct formats.
Parameters:
- struct_format: the struct format string
- subvalue_names: an optional namedtuple field name list; if supplied then the field value will be a namedtuple with these names
- class_name: optional name for the generated class
Class Packet
MRO: PacketField, abc.ABC
Base class for compound objects derived from binary data.
Class PacketField
MRO: abc.ABC
A record for an individual packet field.
Function struct_field(struct_format, class_name)
Factory for PacketField subclasses built around a single struct format.
Parameters:
- struct_format: the struct format string, specifying a single struct field
- class_name: the class name for the generated class
Example:
>>> UInt16BE = struct_field('>H', class_name='UInt16BE')
>>> UInt16BE.__name__
'UInt16BE'
>>> UInt16BE.format
'>H'
>>> UInt16BE.struct #doctest: +ELLIPSIS
<Struct object at ...>
>>> field, offset = UInt16BE.from_bytes(bytes((2,3,4)))
>>> field
UInt16BE(515)
>>> offset
2
>>> field.value
515
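For comparison, the same parse can be reproduced with the standard struct module alone. The helper parse_uint16be below is hypothetical; it only mirrors the (value, offset) result seen in the example above and is not how the generated class is implemented.

```python
import struct

UINT16BE = struct.Struct('>H')

def parse_uint16be(chunk, offset=0):
    """Return (value, new_offset) for a big endian unsigned 16 bit int."""
    (value,) = UINT16BE.unpack_from(chunk, offset)
    return value, offset + UINT16BE.size

value, offset = parse_uint16be(bytes((2, 3, 4)))
print(value, offset)               # 2 * 256 + 3 = 515, then offset 2
print(UINT16BE.pack(value))        # transcribing gives back the two bytes
```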
Function structtuple(class_name, struct_format, subvalue_names)
Convenience wrapper for multi_struct_field.
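The combination of a multi-value struct format with a namedtuple can be sketched using only the standard library. The factory below borrows the parameter order shown above, but structtuple_sketch and the from_bytes and transcribe methods on its result are illustrative assumptions, not the module's implementation.

```python
import struct
from collections import namedtuple

def structtuple_sketch(class_name, struct_format, subvalue_names):
    """Make a namedtuple class which parses and packs struct_format."""
    packer = struct.Struct(struct_format)
    base = namedtuple(class_name, subvalue_names)

    class StructTuple(base):
        __slots__ = ()

        @classmethod
        def from_bytes(cls, chunk, offset=0):
            # One unpack call fills every named field.
            return cls(*packer.unpack_from(chunk, offset)), offset + packer.size

        def transcribe(self):
            return packer.pack(*self)

    StructTuple.__name__ = class_name
    return StructTuple

Size = structtuple_sketch('Size', '>HH', 'width height')
size, offset = Size.from_bytes(b'\x00\x10\x00\x20')
print(size.width, size.height, offset)
```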
Class UInt16BE
MRO: PacketField, abc.ABC
A PacketField which parses and transcribes the struct format '>H'.
Class UInt16LE
MRO: PacketField, abc.ABC
A PacketField which parses and transcribes the struct format '<H'.
Class UInt32BE
MRO: PacketField, abc.ABC
A PacketField which parses and transcribes the struct format '>L'.
Class UInt32LE
MRO: PacketField, abc.ABC
A PacketField which parses and transcribes the struct format '<L'.
Class UInt64BE
MRO: PacketField, abc.ABC
A PacketField which parses and transcribes the struct format '>Q'.
Class UInt64LE
MRO: PacketField, abc.ABC
A PacketField which parses and transcribes the struct format '<Q'.
Class UInt8
MRO: PacketField, abc.ABC
A PacketField which parses and transcribes the struct format 'B'.
Class UTF8NULField
MRO: PacketField, abc.ABC
A NUL terminated UTF-8 string.
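Parsing and transcribing such a string might be sketched as follows. Both function names are hypothetical, and the sketch works on a single chunk with an offset instead of on a buffer.

```python
def parse_utf8nul(chunk, offset=0):
    """Return (text, new_offset); new_offset is just past the NUL."""
    data = bytes(chunk)                 # accept any buffer-protocol chunk
    end = data.find(b'\0', offset)
    if end < 0:
        raise ValueError("no NUL terminator found")
    return data[offset:end].decode('utf-8'), end + 1

def transcribe_utf8nul(text):
    """Return the UTF-8 encoding of text followed by a NUL byte."""
    return text.encode('utf-8') + b'\0'

encoded = transcribe_utf8nul('caf\u00e9')
print(encoded, parse_utf8nul(encoded + b'rest'))
```

The search is safe on encoded text because a zero byte only appears in UTF-8 as the encoding of U+0000 itself.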