base91, z85b and z85p encodings

z85base91

Compact binary-to-text encoding library providing three codecs — Base91, Z85B, and Z85P — for efficient serialisation of binary payloads over text-oriented channels. Combines fast C implementations with pure-Python fallbacks for cross-platform compatibility. Packaged for the HiveMind mesh.

Why z85base91?

Base64 expands data by 33% (1.33x) and base32 by 60% (1.6x). Base91 brings the overhead down to about 23% (1.23x). The Z85 variants (Z85B: 1.25x, Z85P: 1.28x) sit in between, with character safety guarantees useful for serialising encrypted frames over websockets and other text transports.

Comparison

Codec    Expansion         Use case
Base91   1.23x (smallest)  Maximum compactness; printable ASCII
Z85B     1.25x             Z85 padding scheme; per-group framing
Z85P     1.28x             Z85 padding scheme; prepended size byte
base64   1.33x             Standard; less compact
base32   1.60x             Case-insensitive; largest
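
The base64 and base32 figures can be checked directly with the standard library. The sketch below is only an illustration: expansion_ratio is a hypothetical helper, and base64.b85encode is used as a stand-in for the 4-bytes-to-5-characters ratio shared by Z85-style codecs (it uses a different alphabet from Z85).

```python
import base64
import os

def expansion_ratio(encode, payload: bytes) -> float:
    """Return len(encoded) / len(payload) for one encoder callable."""
    return len(encode(payload)) / len(payload)

# 3000 is a multiple of 3, 4 and 5, so none of the codecs needs padding.
payload = os.urandom(3000)

ratios = {
    "base64": expansion_ratio(base64.b64encode, payload),
    "base32": expansion_ratio(base64.b32encode, payload),
    "base85 (stdlib)": expansion_ratio(base64.b85encode, payload),
}
for name, ratio in ratios.items():
    print(f"{name:16s} {ratio:.3f}x")
```

On unpadded input this prints 1.333x, 1.600x and 1.250x respectively.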

Installation

pip install z85base91

The library bundles precompiled C extensions for x86_64, i386, and aarch64 architectures. It automatically falls back to pure Python if the C library fails to load, with a logged warning.

Quick Start

from z85base91 import B91, Z85B, Z85P

# Base91 (most compact)
data = b"Hello, World!"
encoded = B91.encode(data)        # b'>OwJh>}AQ;r@@Y?F'
decoded = B91.decode(encoded)     # b'Hello, World!'

# Z85B (Z85 with independent groups)
encoded = Z85B.encode(data)
decoded = Z85B.decode(encoded)

# Z85P (Z85 with prepended padding byte)
encoded = Z85P.encode(data)
decoded = Z85P.decode(encoded)

All three accept str or bytes input and return bytes.

Public API

Base91 — B91

from z85base91 import B91

Encoding

B91.encode(data: Union[str, bytes], encoding: str = "utf-8") -> bytes

Encodes binary data using Base91, which uses 91 printable ASCII characters (A–Z, a–z, 0–9, plus 29 symbols). Achieves about 1.23x size expansion.

Arguments:

  • data: Input as str (encoded using the encoding argument) or raw bytes.
  • encoding: Character encoding if data is str. Default: "utf-8".

Returns: Base91-encoded bytes.

Example:

B91.encode(b"test")           # b'fPNKd'
B91.encode("test")            # b'fPNKd' (auto UTF-8)
B91.encode(b"\x00\x01\x02")   # b':CQA'
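
The outputs above match the widely used basE91 bit-packing scheme. The following is a minimal pure-Python sketch of that scheme, assuming the library follows it; B91_ALPHABET and b91_encode_sketch are illustrative names and not part of the z85base91 API.

```python
# Standard basE91 alphabet: A-Z, a-z, 0-9, then 29 symbols.
B91_ALPHABET = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789"
    "!#$%&()*+,./:;<=>?@[]^_`{|}~\""
)
assert len(B91_ALPHABET) == 91

def b91_encode_sketch(data: bytes) -> bytes:
    """Encode bytes with the standard basE91 scheme (illustrative only)."""
    acc = 0      # bit accumulator, filled from the low end
    nbits = 0    # number of valid bits currently in acc
    out = []
    for byte in data:
        acc |= byte << nbits
        nbits += 8
        if nbits > 13:
            value = acc & 0x1FFF          # try to consume 13 bits
            if value > 88:
                acc >>= 13
                nbits -= 13
            else:
                value = acc & 0x3FFF      # small values can carry a 14th bit
                acc >>= 14
                nbits -= 14
            # Each 13- or 14-bit value becomes two characters.
            out.append(B91_ALPHABET[value % 91])
            out.append(B91_ALPHABET[value // 91])
    if nbits:
        # Flush the remaining bits as one or two characters.
        out.append(B91_ALPHABET[acc % 91])
        if nbits > 7 or acc > 90:
            out.append(B91_ALPHABET[acc // 91])
    return "".join(out).encode("ascii")

print(b91_encode_sketch(b"test"))          # b'fPNKd'
print(b91_encode_sketch(b"\x00\x01\x02"))  # b':CQA'
```

Because each pair of characters carries 13 or 14 bits, the expansion depends on the data and ranges between roughly 1.14x and 1.23x.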

Decoding

B91.decode(encoded_data: Union[str, bytes], encoding: str = "utf-8") -> bytes

Decodes Base91-encoded input back to raw bytes.

Arguments:

  • encoded_data: Base91 string or bytes.
  • encoding: Character encoding to use for string conversion. Default: "utf-8".

Returns: Decoded bytes.

Raises: ValueError if the input contains invalid Base91 characters (outside the 91-character alphabet).

Example:

B91.decode(b'>OwJh>}AQ;r@@Y?F')  # b'Hello, World!'
B91.decode('>OwJh>}AQ;r@@Y?F')   # b'Hello, World!' (str input)
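
The reverse direction can be sketched the same way. This assumes the standard basE91 scheme plus the ValueError behaviour described above (the reference basE91 decoder skips unknown characters instead); b91_decode_sketch and B91_INDEX are illustrative names, not library API.

```python
B91_ALPHABET = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789"
    "!#$%&()*+,./:;<=>?@[]^_`{|}~\""
)
B91_INDEX = {ch: i for i, ch in enumerate(B91_ALPHABET)}

def b91_decode_sketch(encoded) -> bytes:
    """Decode standard basE91 text, rejecting characters outside the alphabet."""
    if isinstance(encoded, bytes):
        encoded = encoded.decode("latin-1")  # never fails; bad chars are caught below
    value = -1   # first character of the current pair, or -1 if none pending
    acc = 0      # bit accumulator
    nbits = 0    # number of valid bits in acc
    out = bytearray()
    for ch in encoded:
        digit = B91_INDEX.get(ch)
        if digit is None:
            raise ValueError(f"invalid Base91 character: {ch!r}")
        if value < 0:
            value = digit
            continue
        value += digit * 91
        acc |= value << nbits
        # The encoder used 13 bits when the low 13 bits exceed 88, else 14.
        nbits += 13 if (value & 0x1FFF) > 88 else 14
        while nbits > 7:
            out.append(acc & 0xFF)
            acc >>= 8
            nbits -= 8
        value = -1
    if value >= 0:
        # A trailing single character carries the final partial byte.
        out.append((acc | value << nbits) & 0xFF)
    return bytes(out)

print(b91_decode_sketch(b"fPNKd"))  # b'test'
print(b91_decode_sketch(":CQA"))    # b'\x00\x01\x02'
```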

Z85B — Z85B

from z85base91 import Z85B

Encoding

Z85B.encode(data: Union[str, bytes]) -> bytes

Encodes binary data using Z85B, a variant of Z85 that processes 4-byte chunks independently. Uses the 85-character Z85 alphabet: 0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ.-:+=^!/*?&<>()[]{}@%$#. Achieves ~1.25x expansion.

Arguments:

  • data: Input as str (encoded to UTF-8) or raw bytes.

Returns: Z85B-encoded bytes.

Example:

Z85B.encode(b"Hello")         # b'NV0&yq1' (5 bytes -> 7 bytes)
Z85B.encode("Hello")          # b'NV0&yq1' (str auto UTF-8)
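
The documented output for b"Hello" is consistent with reading each 4-byte group as a little-endian integer and emitting its base-85 digits least significant first, with a short tail of n bytes producing n + 1 characters. The sketch below is inferred from that single example, so the library may behave differently on other inputs; z85b_encode_sketch is a hypothetical name.

```python
Z85_ALPHABET = (
    "0123456789"
    "abcdefghijklmnopqrstuvwxyz"
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    ".-:+=^!/*?&<>()[]{}@%$#"
)
assert len(Z85_ALPHABET) == 85

def z85b_encode_sketch(data: bytes) -> bytes:
    """Encode each 4-byte group independently (assumed Z85B layout)."""
    out = bytearray()
    for i in range(0, len(data), 4):
        chunk = data[i:i + 4]
        value = int.from_bytes(chunk, "little")
        # A full group yields 5 characters; a tail of n bytes yields n + 1.
        for _ in range(len(chunk) + 1):
            value, digit = divmod(value, 85)
            out.append(ord(Z85_ALPHABET[digit]))
    return bytes(out)

print(z85b_encode_sketch(b"Hello"))  # b'NV0&yq1'
```

Under this layout no padding bytes are needed, which is why a 5-byte input produces 7 characters instead of 10.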

Decoding

Z85B.decode(encoded_data: Union[str, bytes]) -> bytes

Decodes Z85B-encoded input.

Arguments:

  • encoded_data: Z85B-encoded string or bytes.

Returns: Decoded bytes.

Raises: ValueError if input contains invalid Z85 characters.

Example:

original = b"Hello, World!"
encoded = Z85B.encode(original)
decoded = Z85B.decode(encoded)
assert decoded == original
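
A matching decoder sketch, under the same inferred layout (little-endian groups, least significant digit first). The names are hypothetical and the validation rules are assumptions; only the ValueError on invalid characters is stated by this document.

```python
Z85_ALPHABET = (
    "0123456789"
    "abcdefghijklmnopqrstuvwxyz"
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    ".-:+=^!/*?&<>()[]{}@%$#"
)
Z85_INDEX = {ord(ch): i for i, ch in enumerate(Z85_ALPHABET)}

def z85b_decode_sketch(encoded) -> bytes:
    """Decode the assumed Z85B layout: 5 chars -> 4 bytes, m chars -> m - 1 bytes."""
    if isinstance(encoded, str):
        encoded = encoded.encode("utf-8")
    out = bytearray()
    for i in range(0, len(encoded), 5):
        group = encoded[i:i + 5]
        if len(group) == 1:
            raise ValueError("dangling character at end of input")
        value = 0
        for ch in reversed(group):      # last character is most significant
            digit = Z85_INDEX.get(ch)
            if digit is None:
                raise ValueError(f"invalid Z85 character: {chr(ch)!r}")
            value = value * 85 + digit
        try:
            out += value.to_bytes(len(group) - 1, "little")
        except OverflowError:
            raise ValueError("group value does not fit its byte width") from None
    return bytes(out)

print(z85b_decode_sketch(b"NV0&yq1"))  # b'Hello'
```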

Z85P — Z85P

from z85base91 import Z85P

Encoding

Z85P.encode(data: Union[str, bytes]) -> bytes

Encodes binary data using Z85P, a Z85 variant with an explicit padding byte prepended. The first byte of the output indicates how many padding bytes were added (0–3), allowing the decoder to strip them correctly. Uses the same 85-character Z85 alphabet as Z85B. Achieves ~1.28x expansion.

Arguments:

  • data: Input as str (encoded to UTF-8) or raw bytes.

Returns: Z85P-encoded bytes (first byte is padding indicator, 0–3).

Example:

Z85P.encode(b"A")      # b'\x03k(Z(+' (1 byte + 3 padding = 4, encodes to 5)
Z85P.encode(b"AB")     # b'\x02k%+LK' (2 bytes + 2 padding = 4)
Z85P.encode(b"ABCD")   # b'\x00k%^}b' (4 bytes, no padding)
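
The three outputs above are consistent with a one-byte padding count followed by standard big-endian Z85 over the zero-padded input. A minimal sketch under that assumption (z85p_encode_sketch is an illustrative name, and the choice of zero bytes as padding is inferred from the examples):

```python
Z85_ALPHABET = (
    "0123456789"
    "abcdefghijklmnopqrstuvwxyz"
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    ".-:+=^!/*?&<>()[]{}@%$#"
)

def z85p_encode_sketch(data: bytes) -> bytes:
    """Prefix the padding count, then Z85-encode the zero-padded payload."""
    pad = (-len(data)) % 4                 # 0..3 bytes needed to reach a multiple of 4
    padded = data + b"\x00" * pad
    out = bytearray([pad])                 # first output byte is the raw padding count
    for i in range(0, len(padded), 4):
        value = int.from_bytes(padded[i:i + 4], "big")
        group = bytearray(5)
        for j in range(4, -1, -1):         # fill digits from least to most significant
            value, digit = divmod(value, 85)
            group[j] = ord(Z85_ALPHABET[digit])
        out += group
    return bytes(out)

print(z85p_encode_sketch(b"A"))     # b'\x03k(Z(+'
print(z85p_encode_sketch(b"AB"))    # b'\x02k%+LK'
print(z85p_encode_sketch(b"ABCD"))  # b'\x00k%^}b'
```

Note that the prefix is a raw byte with value 0 to 3, not a character from the Z85 alphabet, so the output is not purely printable.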

Decoding

Z85P.decode(encoded_data: Union[str, bytes]) -> bytes

Decodes Z85P-encoded input, automatically stripping padding based on the first byte.

Arguments:

  • encoded_data: Z85P-encoded string or bytes (must include the padding indicator byte).

Returns: Decoded bytes with padding removed.

Raises: ValueError if input length is invalid or contains invalid Z85 characters.

Example:

original = b"X"
encoded = Z85P.encode(original)
decoded = Z85P.decode(encoded)
assert decoded == original
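
Decoding under the same assumed layout reads the padding count, decodes 5-character groups, and drops the padding from the end. The specific length and range checks below are assumptions about what counts as an invalid length; z85p_decode_sketch is a hypothetical name.

```python
Z85_ALPHABET = (
    "0123456789"
    "abcdefghijklmnopqrstuvwxyz"
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    ".-:+=^!/*?&<>()[]{}@%$#"
)
Z85_INDEX = {ord(ch): i for i, ch in enumerate(Z85_ALPHABET)}

def z85p_decode_sketch(encoded) -> bytes:
    """Decode the assumed Z85P layout: padding-count byte + big-endian Z85 body."""
    if isinstance(encoded, str):
        encoded = encoded.encode("utf-8")
    if not encoded:
        raise ValueError("missing padding indicator byte")
    pad, body = encoded[0], encoded[1:]
    if pad > 3 or len(body) % 5 != 0 or (pad and not body):
        raise ValueError("invalid Z85P length or padding indicator")
    out = bytearray()
    for i in range(0, len(body), 5):
        value = 0
        for ch in body[i:i + 5]:           # first character is most significant
            digit = Z85_INDEX.get(ch)
            if digit is None:
                raise ValueError(f"invalid Z85 character: {chr(ch)!r}")
            value = value * 85 + digit
        if value > 0xFFFFFFFF:
            raise ValueError("group value exceeds 32 bits")
        out += value.to_bytes(4, "big")
    return bytes(out[:len(out) - pad])     # strip the padding bytes

print(z85p_decode_sketch(b"\x03k(Z(+"))  # b'A'
```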

Performance Notes

C vs. Python

The C implementations are substantially faster (~1.5x for decoding, up to 2x for encoding). The library automatically selects the C path; Python fallback engages only if compilation failed or the C library failed to load.

The package ships a benchmark harness (z85base91/bench.py) that compares all codecs against stdlib base64/base32 across input sizes. It pulls in click, tabulate, pybase64, and hivemind-bus-client (for the pure-Python reference codecs), so install those before running:

pip install click tabulate pybase64 hivemind-bus-client
python -m z85base91.bench

Expansion ratios are deterministic: Base91 ≈ 1.23x, Z85B ≈ 1.25x, Z85P ≈ 1.25x for aligned payloads (up to 1.28x for short, heavily-padded inputs), versus 1.33x for base64 and 1.6x for base32. Absolute throughput depends on hardware; the C path is consistently faster than stdlib base64 for encoding, and faster than the pure-Python fallback in both directions.
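
For a quick check without the extra dependencies, a timing loop along the following lines may be enough. This is not the bundled harness: measure is a hypothetical helper, and the stdlib codecs are used here only so the sketch runs without z85base91 installed.

```python
import base64
import os
import time

def measure(encode, decode, payload: bytes, repeats: int = 20) -> dict:
    """Return average encode/decode time in seconds for one codec."""
    encoded = encode(payload)
    if decode(encoded) != payload:
        raise AssertionError("codec did not round-trip")

    start = time.perf_counter()
    for _ in range(repeats):
        encode(payload)
    encode_s = (time.perf_counter() - start) / repeats

    start = time.perf_counter()
    for _ in range(repeats):
        decode(encoded)
    decode_s = (time.perf_counter() - start) / repeats

    return {"encode_s": encode_s, "decode_s": decode_s}

payload = os.urandom(60_000)
results = {
    "base64": measure(base64.b64encode, base64.b64decode, payload),
    "base32": measure(base64.b32encode, base64.b32decode, payload),
}
for name, r in results.items():
    print(f"{name}: encode {r['encode_s'] * 1e6:.0f} us, decode {r['decode_s'] * 1e6:.0f} us")
```

To compare the library's codecs, pass B91.encode and B91.decode (or the Z85B and Z85P equivalents) as the first two arguments.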


Architecture

The library has three layers:

  1. Public API (z85base91/__init__.py): Exports B91, Z85B, Z85P classes. At import time, each tries to load its architecture-specific C library (.so) via ctypes.CDLL. If loading fails, the name is rebound to the pure-Python implementation.

  2. C implementations (src/*.c, prebuilt as z85base91/lib*-{arch}.so):

    • libbase91-{x86_64,aarch64,i386}.so — B91 codec
    • libz85b-{x86_64,aarch64,i386}.so — Z85B codec
    • libz85p-{x86_64,aarch64,i386}.so — Z85P codec
  3. Pure-Python fallbacks (z85base91/{b91,z85b,z85p}.py): Self-contained reference implementations.

Data flow: normalise input to bytes → wrap in ctypes.c_ubyte array → call C function → read output via ctypes.string_at.
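
The data flow can be illustrated with ctypes alone. In this sketch ctypes.memmove stands in for the codec's C function, since the real exported symbols and their signatures are not documented here; to_bytes and call_through_ctypes are hypothetical names.

```python
import ctypes

def to_bytes(data, encoding: str = "utf-8") -> bytes:
    """Step 1: normalise str or bytes-like input to bytes."""
    return data.encode(encoding) if isinstance(data, str) else bytes(data)

def call_through_ctypes(data) -> bytes:
    raw = to_bytes(data)
    # Step 2: wrap the input in a ctypes.c_ubyte array.
    in_buf = (ctypes.c_ubyte * len(raw)).from_buffer_copy(raw)
    out_buf = (ctypes.c_ubyte * len(raw))()
    # Step 3: call the C function. memmove is only a stand-in that copies
    # the input; a real codec would write encoded output of a different size.
    ctypes.memmove(out_buf, in_buf, len(raw))
    # Step 4: read the result back as Python bytes.
    return ctypes.string_at(ctypes.addressof(out_buf), len(raw))

print(call_through_ctypes("héllo"))  # b'h\xc3\xa9llo'
```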


Error Handling

Invalid input characters

All decoders raise ValueError if the input contains characters outside the allowed alphabet:

try:
    Z85P.decode("Hello🎉World")
except ValueError as e:
    print(f"Invalid character: {e}")

Missing C library

If the C library fails to load, a WARNING is logged and the Python fallback is used transparently:

WARNING - Z85P C library not available: Library load error. Falling back to pure Python implementation.

No exception is raised; encoding/decoding work identically, just slower.
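
The load-or-fall-back pattern might look roughly like this. It is a sketch rather than the library's actual code: try_load_c_library, the logger name, and the library path are all hypothetical.

```python
import ctypes
import logging

LOG = logging.getLogger("z85base91_sketch")  # hypothetical logger name

def try_load_c_library(lib_path: str, codec_name: str):
    """Return a loaded ctypes.CDLL, or None after logging a warning."""
    try:
        return ctypes.CDLL(lib_path)
    except OSError as exc:
        LOG.warning(
            "%s C library not available: %s. "
            "Falling back to pure Python implementation.",
            codec_name, exc,
        )
        return None

# A path that does not exist, to show the fallback branch.
lib = try_load_c_library("/nonexistent/libz85p-x86_64.so", "Z85P")
backend = "c" if lib is not None else "python"
print(backend)  # python
```

Catching OSError covers both a missing file and a library built for another architecture, which is what makes the fallback silent apart from the log line.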


HiveMind Integration

This library is used by the HiveMind mesh to efficiently pack binary payloads (encrypted frames, keys, metadata) for transmission over websocket channels, where text-mode framing is required. The ~23% overhead (Base91) versus 33% (base64) yields measurable bandwidth savings at scale.

Example: a 1 MB encrypted message expands to about 1.23 MB with Base91 versus 1.33 MB with base64, a saving of roughly 100 KB per message that compounds across thousands of mesh nodes.
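
The per-message saving follows from the two ratios. A small worked calculation, assuming a 1,000,000-byte payload and using the 1.23x figure quoted above as the Base91 estimate (both helper names are illustrative):

```python
import math

def base64_size(n_bytes: int) -> int:
    """Exact padded base64 length: 4 output characters per 3 input bytes."""
    return 4 * math.ceil(n_bytes / 3)

def base91_size_estimate(n_bytes: int) -> int:
    """Estimate using the 1.23x ratio; real Base91 output varies with the data."""
    return n_bytes * 123 // 100

payload = 1_000_000
saving = base64_size(payload) - base91_size_estimate(payload)
print(base64_size(payload))           # 1333336
print(base91_size_estimate(payload))  # 1230000
print(saving)                         # 103336
```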


Requirements

Runtime: Python 3.8+ (no external dependencies; ctypes is stdlib)

Build: python3-dev, swig (for C extension compilation)

Tests: pytest~=7.1, pytest-cov~=4.1


Testing

pip install -e . --no-deps
pip install -r test/requirements.txt
pytest test/ -q

All codecs are tested for:

  • Round-trip encode/decode (identity)
  • Edge cases (empty input, single byte, odd lengths)
  • Invalid input handling
  • Unicode string support
  • Large payloads (1000+ bytes)
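
A round-trip check covering these cases could be sketched as below. It is not taken from the project's test suite; check_roundtrip and CASES are hypothetical, and stdlib base64 is used as a stand-in so the sketch runs without the package installed.

```python
import base64

# Empty input, single byte, odd lengths, UTF-8 text, and a payload over 1000 bytes.
CASES = [
    b"",
    b"\x00",
    b"abc",
    b"Hello, World!",
    "héllo wörld".encode("utf-8"),
    bytes(range(256)) * 4,
]

def check_roundtrip(encode, decode, cases=CASES) -> int:
    """Assert decode(encode(x)) == x for every case; return the number checked."""
    for payload in cases:
        assert decode(encode(payload)) == payload, payload
    return len(cases)

print(check_roundtrip(base64.b64encode, base64.b64decode))  # 6
```

The same helper can be pointed at B91, Z85B, or Z85P by passing their encode and decode methods.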

License

Apache 2.0
