Library for data compression using Nintendo's Yaz0 algorithm

syaz0

Library for data compression using the Yaz0 algorithm

syaz0 is a native module for Python 3.6+ that provides fast data compression and decompression using Nintendo's Yaz0 algorithm.

Performance

Decompression performance is on par with existing Yaz0 decoders.

As of late December 2019, syaz0 is able to compress files much faster than existing Yaz0 encoders. Files that are representative of Breath of the Wild assets were compressed 20x to 30x faster than with existing public tools for an equivalent or better compression ratio, and 70-80x faster (with a slightly worse ratio) in extreme cases.

At the default compression level, file sizes are typically within 1% of Nintendo's.

For detailed benchmarks, see the results files in the test directory.

Usage

syaz0 can be installed with pip3 install syaz0. Binary builds are provided for 64-bit Windows only; on all other platforms, building from source is required. See "Building from source" near the end of this README for more information.

syaz0.get_header(data)

Returns a syaz0.Header corresponding to the Yaz0 file header with the fields magic, uncompressed_size, data_alignment, reserved.
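
For readers unfamiliar with the header, here is a minimal pure-Python sketch of what parsing it involves. It assumes the commonly documented Yaz0 layout (a 4-byte magic followed by three big-endian 32-bit fields). Yaz0Header and parse_yaz0_header are hypothetical names used for illustration and are not part of syaz0; how syaz0.get_header itself reacts to malformed input is not stated here, so the error handling below is an assumption.

```python
import struct
from typing import NamedTuple


class Yaz0Header(NamedTuple):
    """Hypothetical stand-in for syaz0.Header; field names follow the README."""
    magic: bytes
    uncompressed_size: int
    data_alignment: int
    reserved: int


def parse_yaz0_header(data: bytes) -> Yaz0Header:
    """Parse the 16-byte header: 4-byte magic plus three big-endian u32 fields."""
    if len(data) < 16:
        raise ValueError("a Yaz0 header needs at least 16 bytes")
    magic, size, alignment, reserved = struct.unpack_from(">4sIII", data, 0)
    if magic != b"Yaz0":
        raise ValueError("missing Yaz0 magic")
    return Yaz0Header(magic, size, alignment, reserved)


# Hand-built header: 0x1234 bytes of uncompressed data, 0x20-byte alignment hint.
sample = b"Yaz0" + struct.pack(">III", 0x1234, 0x20, 0)
header = parse_yaz0_header(sample)
print(header.uncompressed_size, header.data_alignment)
```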

syaz0.decompress(data)

Decompresses Yaz0-compressed data from a bytes-like object data. Returns a bytes object containing the uncompressed data.
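
To show what a decoder has to do, here is a slow reference sketch in pure Python, assuming the commonly documented Yaz0 stream layout: groups of one flag byte followed by up to eight chunks, where a set bit means a literal byte and a clear bit means a back-reference. yaz0_decompress_reference is a hypothetical name; this is not syaz0's implementation and makes no attempt to be fast.

```python
import struct


def yaz0_decompress_reference(data: bytes) -> bytes:
    """Decode a Yaz0 stream byte by byte, checking bounds along the way."""
    if len(data) < 16 or data[:4] != b"Yaz0":
        raise ValueError("not a Yaz0 stream")
    (size,) = struct.unpack_from(">I", data, 4)
    out = bytearray()
    pos = 16  # compressed chunks start right after the 16-byte header
    try:
        while len(out) < size:
            flags = data[pos]
            pos += 1
            for bit in range(8):  # flags are consumed most significant bit first
                if len(out) >= size:
                    break
                if flags & (0x80 >> bit):
                    out.append(data[pos])  # literal byte
                    pos += 1
                    continue
                b1, b2 = data[pos], data[pos + 1]
                pos += 2
                distance = (((b1 & 0x0F) << 8) | b2) + 1
                length = b1 >> 4
                if length == 0:
                    length = data[pos] + 0x12  # 3-byte form: lengths 18..273
                    pos += 1
                else:
                    length += 2  # 2-byte form: lengths 3..17
                if distance > len(out):
                    raise ValueError("back-reference points before the output start")
                for _ in range(length):
                    if len(out) >= size:
                        break
                    out.append(out[-distance])  # copies may overlap themselves
    except IndexError:
        raise ValueError("truncated Yaz0 stream") from None
    return bytes(out)


# One literal "A" followed by a back-reference (distance 1, length 7).
stream = b"Yaz0" + struct.pack(">III", 8, 0, 0) + bytes([0x80, 0x41, 0x50, 0x00])
result = yaz0_decompress_reference(stream)
print(result)
```

The bounds checks in this sketch are the kind of sanity checks a decoder needs before it can be used on untrusted input.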

syaz0.decompress_unsafe(data)

Decompresses Yaz0-compressed data from a bytes (not bytes-like) object data. Returns a bytes object containing the uncompressed data.

Unlike syaz0.decompress, this function assumes that the input data is well-formed. In exchange for slightly improved performance, no sanity checks are performed. Warning: Do not use on untrusted data.

syaz0.compress(data, data_alignment=0, level=7)

Compresses a bytes-like object data. Returns a bytes-like object containing the Yaz0-compressed data.

data_alignment is a hint for decoders to allocate buffers with the required data alignment. Defaults to 0, which indicates that no particular alignment is required.

level is the compression level, from 6 (fastest) to 9 (slowest). Higher levels give better compression ratios at the cost of speed. 7 is a good compromise between compression ratio and performance.

Project information

syaz0 was written with two goals in mind: to improve performance for Yaz0 compression — which is excruciatingly slow if one also desires decent compression ratios — and to let me learn a bit more about compression.

After doing more research on compression algorithms and finding out just how similar Yaz0 and LZ77-style compression algorithms are, I tried to implement some common tricks to help improve the extremely poor compression performance due to the sliding window search.

But the implementation was still extremely slow compared to gzip, and probably badly written.

It turns out that many more tricks were needed for fast compression.
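
To illustrate why the sliding window search is the bottleneck, here is a brute-force sketch of the longest-match search that a naive LZ77-style encoder runs at every input position. The constants assume the usual Yaz0 limits (4096-byte window, match lengths from 3 to 273); find_longest_match is a hypothetical helper, not code from syaz0 or zlib-ng.

```python
WINDOW_SIZE = 0x1000       # Yaz0 back-reference distances span 1..4096
MIN_MATCH = 3              # shortest length a Yaz0 back-reference can encode
MAX_MATCH = 0xFF + 0x12    # 273, the longest length a back-reference can encode


def find_longest_match(data: bytes, pos: int, max_match: int = MAX_MATCH):
    """Return (distance, length) of the longest match for data[pos:], or (0, 0)."""
    best_distance, best_length = 0, 0
    limit = min(max_match, len(data) - pos)
    # Try every earlier position in the window: up to 4096 candidates,
    # each compared for up to max_match bytes.
    for candidate in range(max(0, pos - WINDOW_SIZE), pos):
        length = 0
        while length < limit and data[candidate + length] == data[pos + length]:
            length += 1
        if length > best_length:
            best_distance, best_length = pos - candidate, length
    if best_length < MIN_MATCH:
        return 0, 0  # too short to be worth a back-reference
    return best_distance, best_length


print(find_longest_match(b"abcabcabc", 3))
```

In the worst case this does on the order of WINDOW_SIZE × MAX_MATCH byte comparisons per input position. Avoiding that cost is what the hash chains and other tricks in zlib-style encoders are for.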

After stumbling upon zlib-ng and seeing how well-optimised it is and all the intrinsics it uses, I decided it was best not to reinvent the wheel. Thus syaz0 uses a copy of zlib-ng for all the heavy lifting (match searching). The following modifications were made:

  • The window size was reduced to 4K (2^12) to match Yaz0.
  • The compress function and the stream structures were changed to take a callback that is invoked every time a distance/length pair or a literal is emitted. (I'm not proud, but it works.)
  • MAX_MATCH was not increased. zlib assumes it is equal to 258 in too many places and increasing it actually gives worse compression ratios.
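
The callback approach described above can be sketched as a small sink that receives literals and distance/length pairs and packs them into Yaz0 groups. This is an illustrative Python sketch under the commonly documented Yaz0 chunk encoding; Yaz0TokenWriter, on_literal and on_match are hypothetical names, and syaz0's actual callback lives in its native code.

```python
import struct


class Yaz0TokenWriter:
    """Hypothetical sink for match-finder callbacks; not syaz0's actual code."""

    def __init__(self, uncompressed_size: int, data_alignment: int = 0):
        header = struct.pack(">III", uncompressed_size, data_alignment, 0)
        self.out = bytearray(b"Yaz0" + header)
        self._code_pos = -1   # index of the current group's flag byte
        self._bits_used = 8   # forces a new group on the first token

    def _push_flag(self, is_literal: bool) -> None:
        if self._bits_used == 8:  # current group is full: start a new one
            self._code_pos = len(self.out)
            self.out.append(0)
            self._bits_used = 0
        if is_literal:
            self.out[self._code_pos] |= 0x80 >> self._bits_used
        self._bits_used += 1

    def on_literal(self, byte: int) -> None:
        self._push_flag(True)
        self.out.append(byte)

    def on_match(self, distance: int, length: int) -> None:
        if not 1 <= distance <= 0x1000:
            raise ValueError("distance does not fit the 4K window")
        if not 3 <= length <= 0x111:
            raise ValueError("length cannot be encoded")
        self._push_flag(False)
        offset = distance - 1
        if length < 0x12:
            # 2-byte form: high nibble holds length - 2
            self.out.append(((length - 2) << 4) | (offset >> 8))
            self.out.append(offset & 0xFF)
        else:
            # 3-byte form: zero high nibble, then length - 0x12 in a third byte
            self.out.append(offset >> 8)
            self.out.append(offset & 0xFF)
            self.out.append(length - 0x12)


writer = Yaz0TokenWriter(uncompressed_size=8)
writer.on_literal(ord("A"))
writer.on_match(distance=1, length=7)
encoded = bytes(writer.out)
print(encoded.hex())
```

Under this encoding, zlib's MAX_MATCH of 258 always fits in a Yaz0 back-reference, so every match reported by the search can be written out unchanged.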

Building from source

Building syaz0 from source requires:

  • CMake 3.10+
  • A compiler that supports C++17
  • Everything needed to build zlib-ng
  • pybind11 2.4+ (including CMake config files)
  • setuptools

When no binary build is available, pip will automatically build from source during the install process.

To build from source manually, run python3 setup.py bdist_wheel.

License

This software is licensed under the terms of the GNU General Public License, version 2 or later.

