Library to pack and unpack structured binary data.
Caterpillar - 🐛
Caterpillar is a Python library for packing and unpacking structured binary data. It targets Python 3.12+, with support for 3.10+. It extends the capabilities of Python's `struct` module by letting you declare binary layouts directly as classes. More information about the different configuration options will be added in the future. See the documentation for details.
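For comparison, here is a minimal sketch of the same kind of task using only the standard-library `struct` module, where the layout lives in a format string and values come back as a positional tuple. The record layout (`uint8`, `int32`, `uint8`) is an assumption chosen for illustration and is not part of Caterpillar.

```python
import struct

# Hypothetical record layout for illustration: uint8, int32, uint8.
# "<" selects little-endian without padding, ">" selects big-endian.
RECORD_LE = struct.Struct("<BiB")
RECORD_BE = struct.Struct(">BiB")

packed_le = RECORD_LE.pack(1, 2, 3)
packed_be = RECORD_BE.pack(1, 2, 3)

# Fields come back as an anonymous tuple; the caller has to remember
# which position means what.
a, b, length = RECORD_LE.unpack(packed_le)

print(packed_le)  # b'\x01\x02\x00\x00\x00\x03'
print(packed_be)  # b'\x01\x00\x00\x00\x02\x03'
```

Only the four bytes of the `int32` change between the two byte orders; the single-byte fields are unaffected.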
Caterpillar is able to:
- pack and unpack data just by processing Python class definitions (including support for powerful bitfields, C++-like templates and C-like unions!),
- apply a wide range of data types (with endianness and architecture configuration),
- dynamically adapt structs based on their inheritance layout,
- reduce the used memory space using `__slots__`,
- let you place conditional statements into class definitions,
- insert proper types into the class definition to support documentation, and
- help you create cleaner and more compact code.
- There is also a feature that lets you dynamically change the endianness within a struct!
- You can even extend Caterpillar and write your parsing logic in C or C++.
- All struct definitions can be typing compliant (tested with pyright)!
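The `__slots__` point can be illustrated without Caterpillar at all. Below is a minimal sketch in plain Python, assuming two hypothetical classes `PlainPoint` and `SlottedPoint`: the slotted class stores its attributes in fixed slots and carries no per-instance `__dict__`, which is where the memory saving comes from. How Caterpillar enables this for struct classes is not shown here.

```python
class PlainPoint:
    """Regular class: every instance owns a __dict__."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


class SlottedPoint:
    """Slotted class: attributes live in fixed slots, no __dict__."""

    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y


plain = PlainPoint(1, 2)
slotted = SlottedPoint(1, 2)

has_dict_plain = hasattr(plain, "__dict__")      # True
has_dict_slotted = hasattr(slotted, "__dict__")  # False

# Slotted instances also reject attributes that were never declared.
try:
    slotted.z = 3
    rejected = False
except AttributeError:
    rejected = True

print(has_dict_plain, has_dict_slotted, rejected)
```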
Give me some code!
Two variants of the same example follow. The first uses the default syntax; the second is typing compliant, meaning your static type checker won't scream at you when developing with it.

    from caterpillar.py import *
    from caterpillar.types import *

    @bitfield(order=LittleEndian)
    class Header:
        version : 4                   # 4bit integer
        valid   : 1                   # 1bit flag (boolean)
        ident   : (8, CharFactory)    # 8bit char
        # automatic alignment to 16bits

    THE_KEY = b"ITS MAGIC"

    @struct(order=LittleEndian, kw_only=True)
    class Format:
        magic  : THE_KEY                  # Supports string and byte constants directly
        header : Header
        a      : uint8                    # Primitive data types
        b      : Dynamic + int32          # dynamic endian based on global config
        length : uint8                    # String fields with computed lengths
        name   : String(this.length)      # -> you can also use Prefixed(uint8)

        # custom actions, e.g. for hashes
        _hash_begin : DigestField.begin("hash", Md5_Algo)

        # Sequences with prefixed, computed lengths  -+ part of the MD5 hash
        names : CString[uint8::]                    #  |
        #                                            -+

        # automatic hash creation and verification + default value
        hash : Md5_Field("hash", verify=True)

    # Creation, packing and unpacking remain the same

The typing-compliant variant:

    from caterpillar.py import *
    from caterpillar.types import *

    @bitfield(order=LittleEndian)
    class Header:
        version : int4_t                      # 4bit integer
        valid   : int1_t                      # 1bit flag (boolean)
        ident   : f[str, (8, CharFactory)]    # 8bit char
        # automatic alignment to 16bits

    THE_KEY = b"ITS MAGIC"

    @struct(order=LittleEndian, kw_only=True)
    class Format:
        magic  : f[bytes, THE_KEY] = THE_KEY      # Supports string and byte constants directly
        header : Header
        a      : uint8_t                          # Primitive data types
        b      : f[int, Dynamic + int32]          # dynamic endian based on global config
        length : uint8_t                          # String fields with computed lengths
        name   : f[str, String(this.length)]      # -> you can also use Prefixed(uint8)

        # custom actions, e.g. for hashes
        _hash_begin : f[None, DigestField.begin("hash", Md5_Algo)] = None

        # Sequences with prefixed, computed lengths  -+ part of the MD5 hash
        names : f[list[str], CString[uint8::]]      #  |
        #                                            -+

        # automatic hash creation and verification + default value
        hash : f[bytes, Md5_Field("hash", verify=True)] = b""

    # Creation (keyword-only arguments, magic is auto-inferred):
    obj = Format(
        header=Header(version=2, valid=True, ident="F"),
        a=1,
        b=2,
        length=3,
        name="foo",
        names=["a", "b"]
    )

    # Packing the object; reads as 'PACK obj FROM Format'
    # objects of struct classes can be packed right away
    data_le = pack(obj, Format)
    # results in: b'ITS MAGIC0*\x01\x02\x00\x00\x00\x03foo\x02a\x00b\x00)\x9a...'

    # Unpacking the binary data, reads as 'UNPACK Format FROM blob'
    obj2 = unpack(Format, data_le)
    assert obj2.names == obj.names

    # to pack with a different endian for fields 'a' and 'b', use 'order'
    data_be = pack(obj, Format, order=BigEndian)
    assert data_le != data_be

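To make the byte string in the example concrete, the following sketch rebuilds everything before the hash by hand with the standard-library `struct` module. It does not use Caterpillar. The bit layout of `Header` (fields placed from the most significant bit downwards, three zero padding bits at the bottom) is an assumption inferred from the two bytes `0*` in the sample output, and `pack_header` and `pack_body` are hypothetical helper names.

```python
import struct

MAGIC = b"ITS MAGIC"


def pack_header(version: int, valid: bool, ident: str) -> bytes:
    """Pack 4 + 1 + 8 bits into one 16-bit little-endian word.

    Assumed layout: version in bits 15..12, valid in bit 11,
    ident in bits 10..3, bits 2..0 are zero padding.
    """
    word = ((version & 0xF) << 12) | ((int(valid) & 0x1) << 11) | ((ord(ident) & 0xFF) << 3)
    return struct.pack("<H", word)


def pack_body(a: int, b: int, name: str, names: list[str]) -> bytes:
    encoded_name = name.encode("ascii")
    out = struct.pack("<Bi", a, b)                       # uint8, little-endian int32
    out += struct.pack("<B", len(encoded_name)) + encoded_name  # length-prefixed string
    out += struct.pack("<B", len(names))                 # uint8 element count
    for item in names:
        out += item.encode("ascii") + b"\x00"            # NUL-terminated string
    return out


prefix = MAGIC + pack_header(2, True, "F") + pack_body(1, 2, "foo", ["a", "b"])
print(prefix)  # b'ITS MAGIC0*\x01\x02\x00\x00\x00\x03foo\x02a\x00b\x00'
```

The result matches the sample output up to the point where the MD5 digest begins.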
[!NOTE] Python 3.14 breaks `with` statements in class definitions since `__annotations__` are added at the end of a class definition. Therefore, `Digest` and conditional statements ARE NOT SUPPORTED using the `with` syntax in Python 3.14+. As of version 2.4.5, the `Digest` class has a counterpart (`DigestField`), which can be used to manually specify a digest without the need for a `with` statement.
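Independent of the syntax used to declare it, the idea behind a digest field is to hash a marked region of the packed data and store the result next to it, so that unpacking can verify it. Here is a minimal sketch of that idea with the standard-library `hashlib`. The helper names `seal` and `open_sealed` are hypothetical, and which bytes Caterpillar actually covers is an assumption here (the `names` region from the example).

```python
import hashlib

DIGEST_SIZE = hashlib.md5().digest_size  # 16 bytes for MD5


def seal(region: bytes) -> bytes:
    """Append the MD5 digest of `region` to it."""
    return region + hashlib.md5(region).digest()


def open_sealed(blob: bytes) -> bytes:
    """Split off the trailing digest and verify it against the region."""
    region, stored = blob[:-DIGEST_SIZE], blob[-DIGEST_SIZE:]
    if hashlib.md5(region).digest() != stored:
        raise ValueError("MD5 mismatch: data was modified or truncated")
    return region


# Assumed region: the packed 'names' field from the example above.
sealed = seal(b"\x02a\x00b\x00")
recovered = open_sealed(sealed)
print(len(sealed), recovered)
```

Flipping any byte of the region or the digest makes `open_sealed` raise `ValueError`, which corresponds to what `verify=True` suggests in the example.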
This library offers extensive functionality beyond basic struct handling. For further details on its powerful features, explore the official documentation, examples, and test cases.
Installation
[!NOTE] As of Caterpillar v2.1.2, the library can be installed without compiling the C extension.
PIP installation (Python-only)
pip install caterpillar-py
Python-only installation
pip install "caterpillar[all]@git+https://github.com/MatrixEditor/caterpillar"
Installation + C-extension
pip install "caterpillar[all]@git+https://github.com/MatrixEditor/caterpillar/#subdirectory=src/ccaterpillar"
Starting Point
Please visit the documentation; it contains a complete tutorial on how to use this library.
Other Approaches
The documentation provides a comparison to similar approaches for parsing structured binary data with Python.
License
Distributed under the GNU General Public License (V3). See License for more information.