Parse binary files by describing their structure.
Project description
Parse Binary File
Install with
python -m pip install parse_binary_file
Parse binary files by describing their structure.
Components
FieldDescription
Describes a field.
Properties
-
type: The type of the field being described. When loading data into the field it will be parsed based on the type. Valid types are:
- 'bytes': Leaves the value as bytes.
- 'str': String
- 'int': Integer, must be 4 bytes
- 'float': Float, must be 4 bytes
- Any type may be wrapped in brackets
[]
to indicate the field is an array of that value type. Details of the elements are set using theelements
property. [Default: 'bytes']
-
name: Name of the field. This is used to label the field.
-
value: The expected value of the field. This can be usedto indicate a specific value is expected for the field.
-
size: Number of bytes in the field. The last field can have a size of -1 which indicates the field continues to the end of the file.
-
terminator: Sequence of values indicating the end of the field.
-
null_terminated: Indicates the field is null terminated. Setting to
True
is equivalent to settingterminator
to0x00
. -
is_null: Indicates the field only contains null bytes. Settings to true is equivalent to setting
value
equal to all null bytes. Must setsize
ifTrue
. -
description: Description of the field.
-
elements: Used to specifiy the structure of the elements of an array. [TODO]
Methods
- *FieldDescription(*properties): Initializes a new
FieldDescription
with the provided properties.
Field
Contains information about a field, including its description and loaded value.
Properties
-
desc:
Field
's description. -
value:
Field
's value. -
expected_value:
self.desc.value
. Provided for convenience. -
size: If the
Field
's value is set, returns its size. If theField
's value has not yet been set return theField
's expected size if available. -
data: Original data as bytes.
-
value_matches: If the
Field
's loaded value matches that expected by the description. If no value was specified by the description returnsTrue
-
Values of the
Field
's description are available directly as well.
Methods
-
Field(desc: FieldDescription): Initializes a new
Field
.desc
should describe the field. -
from_dict(desc: dict): Initializes a new
Field
usingdesc
as its description. -
*set_value(val: bytes, *options): Sets the value of the
Field
based on its description and the provided value.**options
are additional arguments to specify how to parse the value.
Data
Represents data from a file.
Properties
-
info:
-
fields:
-
defaults:
-
is_loaded: If data has been loaded yet.
-
value: Value of the loaded data, or
None
if it has not yet been loaded. If loaded, the value is a list of teh value for each field. -
named_field_values: Returns a dictionary of name-vlaue pairs for named
Field
s.
Methods
-
from_dicts(desc: list[dict]): Instantiates a new
Data
from a list of dictionaries describing its fields. -
keys(): Returns a list of named
Field
s names. -
load(stream: io.RawIOBase): Loads data from a readable stream.
Use
This library is intended to be used by describing the struture of a binary file
format in a configuration file. That file is then loaded and used to create a
Data
instance. The Data
instance can then load()
data from a file and
assort it into the correct fields.
Example
We would like to describe a new binary file format of .msg
.
Offset | Size | Name | Description |
---|---|---|---|
0x00 | 0x02 | head | 0xff 0x00 Indicates the file is a .msg file. |
0x02 | 0x04 | size | Total size of the file. |
0x06 | 0x04 | -- | Null bytes to separate header from body. |
0x0a | -- | greeting | Message greeting. |
-- | -- | message | Message content. |
We can describe this structure in a YAML file.
msg.yaml
- name: 'head'
value: "\xff\x00"
- type: 'int'
name: 'file_size'
description: 'Size of file in bytes'
- size: 4
is_null: true
- type: 'str'
null_terminated: True
name: 'greeting'
- type: 'str'
size: -1
name: 'message'
To read a .msg
file we can use the description of it.
import os
import yaml # use pyyaml to load `msg.yaml`
import parse_binary_file as pbf
# load description
with open('msg.yaml') as f:
msg_desc = yaml.safe_load(f)
# create fields
fields = [pbf.Field.from_dict(f) for f in msg_desc]
# create data
data = pbf.Data.from_dicts(msg_desc, info = {'byte_order': 'little'})
with open('my_message.msg', 'rb') as f:
data.load(f)
is_corrupt = (data['file_size'] != os.path.getsize('my_message.msg'))
greeting = data['greeting'].value
message = data[-1].value
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Hashes for parse_binary_file-0.0.1-py3-none-any.whl
Algorithm | Hash digest | |
---|---|---|
SHA256 | 21188f5f7a72279543837223921e4f5fc7d77a272f0434d7f74e6e561de814af |
|
MD5 | 9a28bee5bcd918d8d7fa47753293a6c9 |
|
BLAKE2b-256 | c30666a3d90ac57fd005e60735e0b4396ebbadbed1daa437ecff5fb3394d3f03 |