syaz0

Library for data compression using the Yaz0 algorithm

syaz0 is a native module for Python 3.6+ that provides fast data compression and decompression using Nintendo's Yaz0 algorithm.

Performance

Decompression performance is on par with existing Yaz0 decoders.

As of late December 2019, syaz0 compresses files much faster than existing Yaz0 encoders. Files representative of Breath of the Wild assets were compressed 20x to 30x faster than with existing public tools at an equivalent or better compression ratio, and 70x to 80x faster (with a slightly worse ratio) in extreme cases.

At the default compression level, file sizes are typically within 1% of Nintendo's.

For detailed benchmarks, see the results files in the test directory.

Usage

syaz0 can be installed with pip3 install syaz0. Binary builds are provided for 64-bit Windows only. On all other platforms, building from source is required; see "Building from source" at the end of this README.

syaz0.decompress(data)

Decompresses Yaz0-compressed data from a bytes-like object data. Returns a bytes object containing the uncompressed data.

syaz0.decompress_unsafe(data)

Decompresses Yaz0-compressed data from a bytes (not bytes-like) object data. Returns a bytes object containing the uncompressed data.

Unlike syaz0.decompress, this function assumes that the input data is well-formed. In exchange for slightly improved performance, no sanity checks are performed. Warning: Do not use on untrusted data.
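To make "sanity checks" concrete, here is a minimal pure-Python sketch of a checked Yaz0 decoder. It is not syaz0's implementation (which is native code), and the function name yaz0_decompress_checked is hypothetical. It assumes the commonly documented Yaz0 layout: a 16-byte header followed by groups of one flag byte and up to eight items, each either a literal byte or a back-reference.

```python
import struct


def yaz0_decompress_checked(src: bytes) -> bytes:
    """Decode a Yaz0 stream, validating every read and back-reference."""
    if len(src) < 16 or src[:4] != b"Yaz0":
        raise ValueError("not a Yaz0 stream")
    (size,) = struct.unpack_from(">I", src, 4)  # uncompressed size, big-endian
    out = bytearray()
    pos = 16  # compressed data starts after the 16-byte header

    def need(count: int) -> None:
        # Sanity check: never read past the end of the input.
        if pos + count > len(src):
            raise ValueError("truncated Yaz0 stream")

    while len(out) < size:
        need(1)
        flags = src[pos]
        pos += 1
        for bit in range(7, -1, -1):  # flag bits are consumed MSB first
            if len(out) >= size:
                break
            if flags & (1 << bit):
                need(1)
                out.append(src[pos])  # set bit: copy one literal byte
                pos += 1
                continue
            need(2)
            b0, b1 = src[pos], src[pos + 1]
            pos += 2
            distance = (((b0 & 0x0F) << 8) | b1) + 1
            length = (b0 >> 4) + 2
            if length == 2:  # high nibble 0: length is stored in a third byte
                need(1)
                length = src[pos] + 0x12
                pos += 1
            # Sanity checks: the match must point into already-decoded data
            # and must not overrun the declared output size.
            if distance > len(out) or len(out) + length > size:
                raise ValueError("invalid back-reference")
            for _ in range(length):
                out.append(out[-distance])  # byte by byte, so overlaps work
    return bytes(out)


# Two literals ("a", "b") followed by a match of length 4 at distance 2.
sample = b"Yaz0" + struct.pack(">I", 6) + bytes(8) + bytes([0xC0, 0x61, 0x62, 0x20, 0x01])
decoded = yaz0_decompress_checked(sample)
```

An unchecked decoder would drop the need() calls and the back-reference test, which is presumably where the small speed gain comes from, and also why malformed input becomes dangerous in native code.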

syaz0.compress(data, data_alignment=0, level=7)

Compresses a bytes-like object data. Returns a bytes-like object containing the Yaz0-compressed data.

data_alignment is a hint for decoders to allocate buffers with the required data alignment. Defaults to 0, which indicates that no particular alignment is required.
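As an illustration of where such a hint can live, the sketch below reads the Yaz0 header with the standard struct module. It assumes the header layout used by newer Yaz0 files (magic, uncompressed size, data alignment, 4 reserved bytes, all big-endian); the names Yaz0Header and read_yaz0_header are hypothetical and not part of syaz0.

```python
import struct
from typing import NamedTuple


class Yaz0Header(NamedTuple):
    uncompressed_size: int
    data_alignment: int  # 0 means no particular alignment is required


def read_yaz0_header(data: bytes) -> Yaz0Header:
    """Parse the 16-byte Yaz0 header (assumed layout, see the text above)."""
    if len(data) < 16:
        raise ValueError("too short to contain a Yaz0 header")
    # 4-byte magic, then three big-endian 32-bit fields (the last is reserved).
    magic, size, alignment, _reserved = struct.unpack_from(">4sIII", data, 0)
    if magic != b"Yaz0":
        raise ValueError("not a Yaz0 stream")
    return Yaz0Header(size, alignment)


# A hand-built header: 0x1000 bytes of output, 0x2000-byte alignment hint.
header = read_yaz0_header(b"Yaz0" + struct.pack(">III", 0x1000, 0x2000, 0))
```

A decoder that cares about alignment would read this field before allocating its output buffer; one that does not can ignore it.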

level is the compression level (6-9). 6 is fastest and 9 is slowest. Higher compression levels result in better compression. 7 is a good compromise between compression ratio and performance.
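One way to choose a level for your own data is to compare output sizes across the documented range. The helper below is a hypothetical sketch that works with any compress function accepting a level keyword. So that the snippet runs without syaz0 installed, the demo uses zlib.compress as a stand-in; passing syaz0.compress instead should work if, as the signature above suggests, level can be given by keyword.

```python
import zlib
from typing import Callable, Dict, Iterable


def size_by_level(
    compress: Callable[..., bytes],
    data: bytes,
    levels: Iterable[int] = range(6, 10),
) -> Dict[int, int]:
    """Return {level: compressed size in bytes} for each requested level."""
    return {level: len(compress(data, level=level)) for level in levels}


# Stand-in demo: zlib.compress happens to accept the same `level` keyword.
# With syaz0 installed, the call would be size_by_level(syaz0.compress, data).
sample = bytes(range(256)) * 64
sizes = size_by_level(zlib.compress, sample)
print(sizes)
```

Timing each call with time.perf_counter in the same loop would show the speed side of the trade-off.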

Project information

syaz0 was written with two goals in mind: to improve performance for Yaz0 compression — which is excruciatingly slow if one also desires decent compression ratios — and to let me learn a bit more about compression.

After doing more research on compression algorithms and finding out just how similar Yaz0 and LZ77-style compression algorithms are, I tried to implement some common tricks to help improve the extremely poor compression performance due to the sliding window search.

But the implementation was still extremely slow compared to gzip, and probably poorly written.

It turns out that many more tricks were needed for fast compression.

After stumbling upon zlib-ng and seeing how well optimised it is, intrinsics and all, I decided it was best not to reinvent the wheel. Thus syaz0 uses a copy of zlib-ng for all the heavy lifting (match searching). The following modifications were made:

  • The window size was reduced to 4K (2^12) to match Yaz0.
  • The compress function and the stream structures were changed to take a callback that is invoked every time a distance/length pair or a literal is emitted. (I'm not proud, but it works.)
  • MAX_MATCH was not increased. zlib assumes it is equal to 258 in too many places, and increasing it actually gives worse compression ratios.
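The callback approach can be pictured as follows: the match finder reports literals and distance/length pairs, and a small writer packs them into Yaz0 groups. This is a pure-Python sketch under the commonly documented Yaz0 encoding, not the actual C++ code in syaz0; the class Yaz0Writer and its method names are hypothetical. It also shows why the limits above fit: a 4K window covers distances up to 0x1000, and zlib's 258-byte maximum match is below the 0x111 (273) bytes a Yaz0 long-form match can encode.

```python
import struct


class Yaz0Writer:
    """Collects literal/match events and packs them into a Yaz0 stream."""

    def __init__(self, uncompressed_size: int, data_alignment: int = 0) -> None:
        self.out = bytearray(b"Yaz0")
        self.out += struct.pack(">III", uncompressed_size, data_alignment, 0)
        self._flag_pos = 0   # index of the current group's flag byte
        self._bits_left = 0  # flag bits still unused in the current group

    def _next_flag(self, is_literal: bool) -> None:
        if self._bits_left == 0:  # start a new group of up to 8 items
            self._flag_pos = len(self.out)
            self.out.append(0)
            self._bits_left = 8
        self._bits_left -= 1  # flag bits are assigned MSB first
        if is_literal:
            self.out[self._flag_pos] |= 1 << self._bits_left

    def on_literal(self, byte: int) -> None:
        self._next_flag(True)
        self.out.append(byte)

    def on_match(self, distance: int, length: int) -> None:
        # Representable range: 1..0x1000 for distance, 3..0x111 for length.
        if not (1 <= distance <= 0x1000 and 3 <= length <= 0x111):
            raise ValueError("match not representable in Yaz0")
        self._next_flag(False)
        d = distance - 1
        if length < 0x12:  # short form: 2 bytes, length - 2 in the high nibble
            self.out.append(((length - 2) << 4) | (d >> 8))
            self.out.append(d & 0xFF)
        else:  # long form: 3 bytes, high nibble 0, length - 0x12 in byte 3
            self.out.append(d >> 8)
            self.out.append(d & 0xFF)
            self.out.append(length - 0x12)


# Encode "ababab" as two literals plus one match (distance 2, length 4).
writer = Yaz0Writer(uncompressed_size=6)
writer.on_literal(ord("a"))
writer.on_literal(ord("b"))
writer.on_match(distance=2, length=4)
body = bytes(writer.out[16:])
```

Keeping the writer separate from the match search is what lets the search code stay close to upstream zlib-ng while the output format differs completely from DEFLATE.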

Building from source

Building syaz0 from source requires:

  • CMake 3.10+
  • A compiler that supports C++17
  • Everything needed to build zlib-ng
  • pybind11 2.4+ (including CMake config files)
  • setuptools

When no binary build is available, pip will automatically build from source during the install process.

To build from source manually, run python3 setup.py bdist_wheel.

License

This software is licensed under the terms of the GNU General Public License, version 2 or later.

Download files

Source Distribution

  • syaz0-1.0.0rc4.tar.gz (688.1 kB): Source

Built Distributions

  • syaz0-1.0.0rc4-cp38-cp38-win_amd64.whl (124.4 kB): CPython 3.8, Windows x86-64
  • syaz0-1.0.0rc4-cp37-cp37m-win_amd64.whl (124.5 kB): CPython 3.7m, Windows x86-64
  • syaz0-1.0.0rc4-cp36-cp36m-win_amd64.whl (124.5 kB): CPython 3.6m, Windows x86-64
