Skip to main content

Represent bytes with printable characters

Project description

reprb

Represent bytes with printable characters, similar to how python built-in functions repr() and eval() do.

why

Bytes objects in Python 3 can already be read from and written to files (like load/dump), but you can't easily understand or edit them when you open a binary file directly in a text editor.

reprb's goal is to dump bytes to printable bytes and load them back to the original bytes object quickly (at least faster than the built-in repr()), especially when you analysis/dump/load bytes contain both printable and unprintable characters(like http message), reprb make it more editable and understandable.

how

Install

from pip

python3 -m pip install reprb

from source

git clone git@github.com:Testzero-wz/reprb.git

cd reprb && pip3 install .

Usage

repr bytes object:

>>> from reprb import reprb, evalb
>>> msg = "abc123\x00\x07\x11\x90\xff中文№".encode()
>>> repr_bytes = reprb(msg)
>>> repr_bytes
b'abc123\\0\\a\\x11\\xc2\\x90\\xc3\\xbf\\xe4\\xb8\\xad\\xe6\\x96\\x87\\xe2\\x84\\x96'
>>> eval_bytes = evalb(repr_bytes)
>>> eval_bytes == msg
True

dump/load bytes from/to file:

from reprb import dump, load, load_iter

dump_bytes = b"abc123\x00\x07\x84\x96"
dump_file = "dump.txt"

# dump bytes to file, seperate by "\n" default
with open(dump_file, "wb") as f:

    # dump bytes object
    dump(dump_bytes, f)

    # dump all bytes object in list
    dump_bytes_list = [dump_bytes, dump_bytes, dump_bytes]
    dump(dump_bytes_list, f)

# load all bytes from file, seperate by "\n" default
load_bytes_from_path = load(dump_file)

# load all bytes from file handler
with open(dump_file, "rb") as f:
    load_bytes_from_file = load(f)

# load iter
load_bytes_from_iter = list(load_iter(dump_file))

assert (
    [dump_bytes] + dump_bytes_list
    == load_bytes_from_path
    == load_bytes_from_file
    == load_bytes_from_iter
)

If you want to store bytes with a more formatable structure like json:

from reprb import reprb, evalb

# you should decode reprb bytes since json.dump() only accept string object.
# btw, you can decode reprb(msg) bytes safely, because eprb(msg) bytes only contain ascii printable chars
stru = {
    "msg": reprb(http_msg).decode(),
    "extra_info": "whatever",
}

json.dump(stru)

Benchmark

$ python3 test.py
Test:
(6/6) Testcases passed. 
dump/load test passed.
Bench:
built-in repr: 1.2180822410s, 183074666.47 bytes/s
built-in eval: 4.8808067660s, 131310258.88 bytes/s
reprb/dumpb: 0.7567997570s, 294661828.23 bytes/s
evalb/loadb: 1.2524397970s, 491999696.49 bytes/s

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

reprb-1.0.6.tar.gz (7.3 kB view details)

Uploaded Source

File details

Details for the file reprb-1.0.6.tar.gz.

File metadata

  • Download URL: reprb-1.0.6.tar.gz
  • Upload date:
  • Size: 7.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.0 CPython/3.8.10

File hashes

Hashes for reprb-1.0.6.tar.gz
Algorithm Hash digest
SHA256 baf877e14df99d63c2e8d69088eb75d25579600c94ed51cda5c1b24865f77c2e
MD5 48d1f7827e6d94bb90b0c72902622772
BLAKE2b-256 eb80dffb8502ffc7ee1641291d7669e573a0b2997199654a651dbad492620cfb

See more details on using hashes here.

Provenance

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page