Skip to main content

Parse binary data fields (bit maps, flag sets) represented as hex strings (helpful for parsing separate protocol elements found in trace files for example)

Project description

Upload Python Package PyPi version Supported Python versions License Downloads codecov

bit-parser

This is a configurable parser allowing to define your own low-level protocol and parse its representation provided as hexadecimal string and convert it into a human-readable form.

With bit-parser you can parse bit maps consisting of several bytes, where each bit has its own unique meaning or where several bits are grouped together to represent some kind of status, error code or counter. Also, special helpers are provided for cases where several consecutive bits represent the same value (for example RFU - Reserved for Future Use bits).

This module is not intended for parsing complex streaming protocols or protocols containing many hundreds and thousands of bytes of information, but it can be useful as a sub-component for a more complex parsers or as a parser for a log files containing much of useful information provided in a hexadecimal format.

Byte arrays and hexadecimal strings are accepted as an input.

Contents

Motivation

In software development it's quite often happens to deal with different data represented in not-so-easy-readable formats. Usually microcontrollers used in IoT or other embedded systems, does not have enough resources to output "novels" into their log files describing what just happened in the system. As a result, most of such log files contain a lot of hexadecimal numbers representing statuses, error codes, counters, levels and many more.

Because of that, it is always worth to write additional tooling enabling fast and error-prone reading of such a files.

bit-parser can definitely serve here as a corner-stone component for implementing such a tooling.

Usage

Simple example

Let's start from something simple. Let's imagine in our current IoT project we need to deal with some very simple protocol: controller sends command to one of its peripherals and gets back status of its I/O pins. As a result of such a request we get several bytes of a response, one byte of which encodes situation on I/O pins. There "1" means high voltage level on CPU pin and "0" means low. We can describe our byte of interest as follows:

# Byte 0:
    Bit 7: I/O pin Nr7 high level
    Bit 6: I/O pin Nr6 high level
    Bit 5: I/O pin Nr5 high level
    Bit 4: I/O pin Nr4 high level
    Bit 3: I/O pin Nr3 high level
    Bit 2: I/O pin Nr2 high level
    Bit 1: I/O pin Nr1 high level
    Bit 0: I/O pin Nr0 high level

Controller then writes request and response pair into a log file (or console). And we would like to create a tool allowing us to pass such a log line and get human-readable representation of response bytes.

Here is how we can do that with a help of bit-parser:

# 0. Import parse_bits function
from BitParser import parse_bits
from pprint import pprint

# 1. First we describe bits as python list:
bits_meaning = [ "I/O pin Nr7 high level",
                 "I/O pin Nr6 high level",
                 "I/O pin Nr5 high level",
                 "I/O pin Nr4 high level",
                 "I/O pin Nr3 high level",
                 "I/O pin Nr2 high level",
                 "I/O pin Nr1 high level",
                 "I/O pin Nr0 high level",]

# 2. Just parse..
pprint(parse_bits("A3", bits_meaning))

Console output:

['I/O pin Nr7 high level',
 'I/O pin Nr5 high level',
 'I/O pin Nr1 high level',
 'I/O pin Nr0 high level']

Advanced example

Although it's not a rare case when one bit represents one "thing" in a byte, most of the time information is packed into a bytes in a more efficient way. For example, it is quite often to have situation when part of a byte (some of its bits located one after another) is dedicated to encode some number or a code. For example, we can say that bits 2-4 in a byte 2 are representing some status code. Now, instead of being able to encode only 3 different statuses with 3 bits we are able to encode 8 different statuses (000, 001, 010, .. 111). At the same time other bits in a byte can still represent only one "thing" per bit.

Having that in mind, let's consider more sophisticatyed case.

Imagine we have a thermostat controller and the main module. Main manages temperature controller by sending it commands and obtaining back responses. For example, one of the responses could be represented by two bytes like following:

# Byte 0:
 "sensor ID",            | bit 7:    10000000
 "sensor ID",            | bit 6:    01000000
 "sensor ID",            | bit 5:    00100000
 "temperature status",   | bit 4:    00010000
 "temperature status",   | bit 3:    00001000
 "LED is ON",            | bit 2:    00000100
 "heating mode",         | bits 0-1: 000000xx
# Byte 1:                |
 "heating mode",         | bits 6-7: xx000000
 "heating module 1 on",  | bit 5:    00100000
 "heating module 2 on",  | bit 4:    00010000
 "heating module 3 on",  | bit 3:    00001000
 "heating module 4 on",  | bit 2:    00000100
 "RFU",                  | bit 1:    00000010
 "RFU",                  | bit 0:    00000001

Where
sensor ID - 3 bits representing device ID (meaning we can handle only 8 devices max in the system)
Temperature status (2 bits) can have following values:
    "00" - temperature OK
    "01" - temperature too low
    "10" - temperature too high
    "11" - broken sensor

Heating mode (4 bits):
     0 - 0000 - mode off
     1 - 0001 - mode 1
     2 - 0010 - mode 2
     3 - 0011 - mode 3
     4 - 0100 - mode 4
     5 - 0101 - mode 5
     6 - 0110 - mode 6
     7 - 0111 - mode 7
     8 - 1000 - mode 8
     9 - 1001 - RFU
    10 - 1010 - RFU
    11 - 1011 - RFU
    12 - 1100 - RFU
    13 - 1101 - RFU
    14 - 1110 - RFU
    15 - 1111 - RFU        

From above description we can make conclusion that in this particular case we have 4 different sutuations to handle:

  1. Bits like Byte1.bit2-bit5 ("heating module N on") or Byte1.bit1 ("RFU") represent something on their own (as in the simplest case we had in the first example above)
  2. Byte0.bit5-bit7 represent device ID which means we are interested in a value itself (1,2,3,4,5..) and not in a label ("device id") here, because label will be able to tell only that device "has some id assigned" - which we already know anyway.
  3. Byte0.bit0,bit1-Byte1.bit6,bit7 encode heating mode. Although it would be not too smart to organise these four bits in a way it is shown in our example (bit pairs located in a different bytes), let's assume we got it "as is" and there is no chance to change this protocol. Shortly we will ensure that bit-parser is able to handle even cases like this without a problem. Also note, we have codes from 9 to 15 "Reserved for Future Use. This situation is also quite common in embedded world when we whant to leave some space for future improvements (or vice versa - sometimes empty spaces in a protocol might appear after we improve something)
  4. Byte0.bit2 ("LED is ON") in general looks the same as a bit representing one "thing" (case described in bullet 1). The difference here is that compared to simple case we are interested not only in getting to know when LED is ON, but also to know if LED is OFF. In other words, there should be always a line amongst our parsed lines saying whether LED is ON or OFF.

Now having all these peculiarities in mind, let's define our parser for these two bytes:

from BitParser import parse_bits, MultiBitValueParser, SameValueRange
from pprint import pprint

# describing sensor_id 
sensor_id = MultiBitValueParser(SameValueRange(0b000, 0b111, 3, "sensor ID", return_value_instead_of_name=True))


# describing Heating Mode
heating_mode = MultiBitValueParser({ "0000": "heating mode off",
                                     "0001": "heating mode 1",
                                     "0010": "heating mode 2",
                                     "0011": "heating mode 3",
                                     "0100": "heating mode 4",  # important to describe full range here and not leave 
                                     "0101": "heating mode 5",
                                     "0110": "heating mode 6",
                                     "0111": "heating mode 7",
                                     "1000": "heating mode 8"},  # here dictionary ends. We can have unlimited number of dictionaries or SameValueRange objects separated by comas inside MultiBitValueParser constructor.
                                     SameValueRange(0b1001, 0b1111, 4, "RFU"))  # in this way we can define the whole range having same values

# describing Status
status = MultiBitValueParser({  "00": "temperature OK",
                                "01": "temperature too low",
                                "10": "temperature too high",
                                "11": "broken sensor"})

# LED ON/OFF
led_status = MultiBitValueParser({ "0": "LED is OFF",
                                   "1": "LED is ON"})

# bringing all together
advanced_protocol = [
                        # Byte 0:
                         sensor_id,             # bit 7: 10000000   
                         sensor_id,             # bit 6: 01000000
                         sensor_id,             # bit 5: 00100000
                         status,                # bit 4: 00010000
                         status,                # bit 3: 00001000
                         led_status,            # bit 2: 00000100
                         heating_mode,          # bit 1: 00000010
                         heating_mode,          # bit 0: 00000001
                        # Byte 1:
                         heating_mode,          # bit 7: 10000000
                         heating_mode,          # bit 6: 01000000
                         "heating module 1 on", # bit 5: 00100000
                         "heating module 2 on", # bit 4: 00010000
                         "heating module 3 on", # bit 3: 00001000
                         "heating module 4 on", # bit 2: 00000100
                         "RFU",                 # bit 1: 00000010
                         "RFU",                 # bit 0: 00000001
                      ]

def create_advanced_protocol_parser():
  def advanced_protocol_parser(bytes_to_parse):
      return parse_bits(bytes_to_parse, advanced_protocol)
  return advanced_protocol_parser

advanced_parser = create_advanced_protocol_parser()

pprint(advanced_parser("40 F0")) # spaces are allowed but are not mandatory

Output:

['sensor ID: 2',
 'temperature OK',
 'LED is OFF',
 'heating mode 3',
 'heating module 1 on',
 'heating module 2 on']

Let's try to switch LED ON...

pprint(advanced_parser("4C F0"))

Output:

['sensor ID: 2',
 'temperature too low',
 'LED is ON',
 'heating mode 3',
 'heating module 1 on',
 'heating module 2 on']

Installation

pip install -U bit-parser

API Overview

TBD

Tests

PyTest is used for tests. Python 2 is not supported.

Install PyTest

$ pip install pytest

Run tests

$ pytest test/*

Check test coverage

In order to generate test coverage report install pytest-cov:

$ pip install pytest-cov

Then inside test subdirectory call:

pytest --cov=../BitParser --cov-report=html

License

License Copyright (C) 2022 Vitalij Gotovskij

bit-parser binaries and source code can be used according to the MIT License

Contribute

TBD

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

bit-parser-1.0.4.tar.gz (7.5 kB view details)

Uploaded Source

Built Distribution

bit_parser-1.0.4-py3-none-any.whl (8.0 kB view details)

Uploaded Python 3

File details

Details for the file bit-parser-1.0.4.tar.gz.

File metadata

  • Download URL: bit-parser-1.0.4.tar.gz
  • Upload date:
  • Size: 7.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.1 CPython/3.8.14

File hashes

Hashes for bit-parser-1.0.4.tar.gz
Algorithm Hash digest
SHA256 82f042dfe2b735f916fa98c37b4db3da238bca5f20d06d193b25be856e75efe3
MD5 ce45d6594e706a8fbf856480fa63e105
BLAKE2b-256 b131708ff95d9ff40fd0fa950ddfee8097c6ea5b4cdca39d015ffb9e4a398a64

See more details on using hashes here.

File details

Details for the file bit_parser-1.0.4-py3-none-any.whl.

File metadata

  • Download URL: bit_parser-1.0.4-py3-none-any.whl
  • Upload date:
  • Size: 8.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.1 CPython/3.8.14

File hashes

Hashes for bit_parser-1.0.4-py3-none-any.whl
Algorithm Hash digest
SHA256 ba2725989d490280f4d2fe827434a6aebf633e22426763fb745beb9a2f3aa302
MD5 eb47b12a3d1db0828f987cbe9a380793
BLAKE2b-256 5e06b009679d187ba6caf8fcc2328714ee39579fc3c12e87fc8321cff8ba1aba

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page