Unicode (and other integer) table packer

packTab

I first wrote something like this back in 2001 when I needed it in FriBidi:

https://github.com/fribidi/fribidi/blob/master/gen.tab/packtab.c

In 2019 I wanted to use that to produce more compact Unicode data tables for HarfBuzz, but for convenience I wanted to use it from Python. While I considered wrapping the C code in a module, it occurred to me that I could rewrite it in pure Python in a much cleaner way. The C code remains a stain on my resume in terms of readability (or lack thereof!). :D

This Python version builds on the same ideas, but is different from the C version in two major ways:

  1. Whereas the C version uses backtracking to find the best split opportunities, I found that the same can be achieved using dynamic programming. So the Python version implements the DP approach, which is much faster.

  2. The C version does not try packing multiple items into a single byte. The Python version does. I.e., if items fit, they might get packed into 1, 2, or 4 bits per item.
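To make the idea of a "split" in point 1 concrete, here is a minimal sketch of a single-level split: the table is cut into blocks of 2**shift entries, identical blocks are stored once, and an index array points at them. The names `split_table`, `lookup`, `cost`, and `best_shift` are hypothetical and not part of packTab's API. The search here is a plain brute force over one level with a crude entry-count cost, not the dynamic-programming search the package implements.

```python
def split_table(data, shift):
    """Split data into (index, blocks) using blocks of 2**shift entries."""
    size = 1 << shift
    # Pad to a whole number of blocks; padding with 0 is an assumption.
    padded = list(data) + [0] * (-len(data) % size)
    blocks = []   # unique blocks, in first-seen order
    seen = {}     # block contents -> position in `blocks`
    index = []    # one entry per block of the input
    for start in range(0, len(padded), size):
        block = tuple(padded[start:start + size])
        if block not in seen:
            seen[block] = len(blocks)
            blocks.append(block)
        index.append(seen[block])
    return index, blocks


def lookup(index, blocks, shift, i):
    """Two-step lookup: high bits pick the block, low bits pick the entry."""
    return blocks[index[i >> shift]][i & ((1 << shift) - 1)]


def cost(index, blocks):
    """Crude cost model: total number of stored entries."""
    return len(index) + sum(len(b) for b in blocks)


def best_shift(data, max_shift=8):
    """Brute-force the shift with the lowest cost (illustration only)."""
    return min(range(max_shift + 1),
               key=lambda s: cost(*split_table(data, s)))


data = [0] * 64 + [1, 2, 3, 4] * 4 + [0] * 48
shift = best_shift(data)
index, blocks = split_table(data, shift)
```

Tables with long runs of repeated values, which is typical of Unicode property data, are where such a split pays off: most blocks are duplicates and collapse into a single stored copy.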

There are also a number of other optimizations, which (eventually, when complete) make the Python version more generic and usable for a wider variety of data tables.
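The sub-byte packing mentioned in point 2 above can be sketched as follows. This is an illustration under stated assumptions, not packTab's actual code: the helper names `bits_needed`, `pack`, and `unpack_one` are hypothetical, values are assumed to be non-negative integers below 256, and items are placed starting from the least significant bits of each byte.

```python
def bits_needed(values):
    """Smallest of 1, 2, 4, or 8 bits that can hold every value."""
    top = max(values, default=0)
    for width in (1, 2, 4, 8):
        if top < (1 << width):
            return width
    raise ValueError("values do not fit in one byte")


def pack(values, width):
    """Pack `values` into bytes, `width` bits per item, low bits first."""
    per_byte = 8 // width
    out = bytearray((len(values) + per_byte - 1) // per_byte)
    for i, v in enumerate(values):
        out[i // per_byte] |= v << ((i % per_byte) * width)
    return bytes(out)


def unpack_one(packed, width, i):
    """Read item `i` back out of the packed bytes."""
    per_byte = 8 // width
    byte = packed[i // per_byte]
    return (byte >> ((i % per_byte) * width)) & ((1 << width) - 1)


values = [0, 3, 1, 2, 3, 0, 1]
width = bits_needed(values)    # every value is below 4, so 2 bits suffice
packed = pack(values, width)   # 7 items fit in 2 bytes instead of 7
```

Reading an item costs one extra shift and mask compared with a plain byte array, which is the trade-off for the smaller table.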

TODO:

  • Reduce code duplication between Inner/Outer genCode().
  • Handle empty data array.
  • Bake the width multiplier into the array data if doing so doesn't enlarge the data type. That would save ops.
  • If an array is not larger than 64 bits, inline it in code directly as one integer.
  • Currently we only cull the array of defaults at the end. Do it at the beginning as well, and adjust the split code to find the optimum shift.
  • Byte reuse! Much bigger work item.
