
Rewrite zip files, modifying compression type and parameters.


zipmanip


This is a command-line utility that rewrites a zip file. It attempts to leave the metadata and ordering of the archive's contents as unchanged as possible, while re-compressing or decompressing the contents themselves.

This was written for use as a git smudge/clean filter when using git to keep a history of zip files. (See below for more on that.) Note that various programs store their native project files as zip files.

  • FreeCAD's .FCStd files are zip archives, as are some .amf and .3mf files. (Being able to reasonably version-control FreeCAD drawings was my motivation for writing this.)

  • Many formats written by Open/LibreOffice and Microsoft Office programs are zip-based. These include .odt, .ods, .odp, .odg, .docx, .xlsx, and .pptx files. (How well they work with this is still untested, but they may well do.)

  • Java's .jar and .war files are zip archives. (Though these may be digitally signed. At present, I'm unsure how those signatures may interact with the techniques used here.)

  • The .epub e-book format is zip-based.

This program may be useful for other, non-git-related purposes as well.

Installation

The recommended method of installation is to install the distribution from PyPI using, e.g., pipx.

  1. Install pipx.
  2. Run pipx install zipmanip.

Any standard Python installation method (e.g. installing to a virtual environment using pip) should also work.

Quick and Dirty method

At present, there are no external dependencies and the code is all contained in a single file, so you could just copy the zipmanip.py file to some location in your PATH, and make it executable.

Usage

$ zipmanip -h
usage: zipmanip [-h] [--output-file OUTPUT_FILE]
                [--compression-method {store,deflate,bzip2,lzma}] [-0]
                [input_file]

Write zip file contents to a new zip file, re- or de-compressing its contents. This can be
used to convert a compressed zip file to one whose contents are stored uncompressed, and
vice versa.

positional arguments:
  input_file            input zip file (default stdin): If an explicit input file is named
                        and no explicit output file is set, the named zip file will be
                        rewritten IN PLACE.

options:
  -h, --help            show this help message and exit
  --output-file OUTPUT_FILE, -O OUTPUT_FILE
                        output file name (default stdout)
  --compression-method {store,deflate,bzip2,lzma}, -Z {store,deflate,bzip2,lzma}
                        set compression method (default: 'deflate')
  -0, -1, -2, -3, -4, -5, -6, -7, -8, -9
                        set compression level

For example, zipmanip --compression-method=store will read a zip archive from stdin and write a zip archive to stdout with the same contents, all stored uncompressed.

The "inverse" operation (not an exact inverse; see below) is plain zipmanip, which compresses the contents using the default settings (or zipmanip -9 for maximum deflate compression).
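The kind of rewrite described above can be sketched with Python's standard zipfile module. This is only an illustration, not zipmanip's actual implementation; the function name rewrite_zip and its parameters are hypothetical. It copies entries in their original order and reuses each entry's ZipInfo, so names and timestamps carry over while the compression method changes.

```python
import io
import zipfile


def rewrite_zip(data, compression=zipfile.ZIP_DEFLATED, compresslevel=None):
    """Return a copy of the zip archive in `data`, re-compressed with `compression`.

    Hypothetical helper for illustration; not part of zipmanip.
    """
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(data)) as zin, \
            zipfile.ZipFile(out, "w") as zout:
        zout.comment = zin.comment
        # infolist() returns entries in archive order, so ordering is kept.
        for info in zin.infolist():
            payload = zin.read(info)
            # Passing the original ZipInfo keeps the name, timestamp and
            # attributes; compress_type/compresslevel override the method.
            zout.writestr(info, payload, compress_type=compression,
                          compresslevel=compresslevel)
    return out.getvalue()
```

Calling rewrite_zip(data, zipfile.ZIP_STORED) corresponds roughly to zipmanip --compression-method=store, and rewrite_zip(data, zipfile.ZIP_DEFLATED, 9) to zipmanip -9.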

Usage with Git

Zipmanip can be used as a clean/smudge filter with git so that zip archives are stored uncompressed in the git index.

(The motivation is that if the zip contents are not compressed, git should be able to more efficiently pack the deltas between revisions.)
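As a rough illustration of that motivation, the sketch below changes one byte of a document and measures how much of the old and new versions still lines up, first raw and then after deflate compression. The shared_fraction helper is hypothetical and is only a crude proxy: it does not model git's actual delta algorithm.

```python
import zlib


def shared_fraction(a, b):
    """Fraction of the longer input covered by the common prefix plus suffix."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    limit = min(len(a), len(b))
    prefix = 0
    while prefix < limit and a[prefix] == b[prefix]:
        prefix += 1
    suffix = 0
    # Stop before the suffix overlaps the prefix already counted.
    while suffix < limit - prefix and a[-1 - suffix] == b[-1 - suffix]:
        suffix += 1
    return (prefix + suffix) / longest


old = b"".join(b"line %d of a document\n" % i for i in range(2000))
middle = len(old) // 2
new = old[:middle] + b"X" + old[middle + 1:]  # change a single byte

raw = shared_fraction(old, new)
packed = shared_fraction(zlib.compress(old), zlib.compress(new))
print(f"uncompressed: {raw:.4f} shared, compressed: {packed:.4f} shared")
```

The uncompressed versions differ in exactly one byte, while the compressed streams generally share much less, which is what makes them harder to delta against each other.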

To set this up:

git config filter.zipmanip.clean "zipmanip --compression-method=store"
git config filter.zipmanip.smudge "zipmanip -9"
# optionally, for diff formatting
git config diff.unzip.textconv "unzip -c -a"

Then edit .gitattributes to set filter=zipmanip (and, optionally, diff=unzip) on any zip files that you want to store uncompressed. E.g.

*.FCStd binary filter=zipmanip diff=unzip
*.3mf binary filter=zipmanip diff=unzip
*.amf binary filter=zipmanip diff=unzip

Bugs

On Round-trip Idempotency

Currently, if a zip archive is round-tripped (converted to uncompressed, then re-compressed), the result will not be byte-wise identical to the original. This is due to (at least) a couple of issues.

Differing compression algorithm and parameters

The re-compressed archive uses whatever compression method and level zipmanip is given, which need not match what produced the original. It may be possible to improve this situation, at least partially, by storing information on the original compression type in the uncompressed archives.

Note that data on compression level may be available from bits 1, 2 and possibly 4 of ZipInfo.flag_bits. (See section 4.4.4 of the PKZIP Application Note.)
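As a sketch of what reading those bits could look like, the helper below decodes bits 1 and 2 of the general purpose bit flag for deflated entries, following the table in section 4.4.4 of the Application Note. The name deflate_option is hypothetical, bit 4 is ignored here, and nothing guarantees that the program which wrote an archive set these bits meaningfully (Python's own zipfile writer does not appear to set them).

```python
import zipfile

# Bits 2 and 1 of the general purpose bit flag, for deflated entries.
DEFLATE_OPTIONS = {
    0b00: "normal",
    0b01: "maximum",
    0b10: "fast",
    0b11: "super fast",
}


def deflate_option(info):
    """Return the deflate option recorded in `info.flag_bits`, or None.

    Hypothetical helper; returns None for entries that are not deflated.
    """
    if info.compress_type != zipfile.ZIP_DEFLATED:
        return None
    # Shift bit 1 down to position 0, then keep two bits (bits 1 and 2).
    return DEFLATE_OPTIONS[(info.flag_bits >> 1) & 0b11]
```

It could be applied to each entry of ZipFile.infolist() to get a hint about the level to use when re-compressing.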

Differing use of "zip64" extended header

Also, the use of the "extended local header" is not preserved. (This manifests in .3mf files written by PrusaSlicer, which always writes extended headers.) This, I think, could be fixed with appropriate use of the force_zip64 parameter to ZipFile.open.
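A minimal sketch of that idea, assuming the "extended" header in question is the zip64 extra field (header ID 0x0001) in the local file header: writing an entry through ZipFile.open with force_zip64=True makes the standard library emit that field even for small files. The helper names write_one and first_local_extra_ids are hypothetical, and this is not code from zipmanip.

```python
import io
import struct
import zipfile

ZIP64_EXTRA_ID = 0x0001


def write_one(name, payload, force_zip64):
    """Write a single-entry archive to memory and return its bytes."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        with zf.open(zipfile.ZipInfo(name), "w", force_zip64=force_zip64) as dest:
            dest.write(payload)
    return buf.getvalue()


def first_local_extra_ids(archive):
    """Return the extra-field header IDs in the first local file header."""
    fields = struct.unpack("<4s5H3I2H", archive[:30])
    signature, name_len, extra_len = fields[0], fields[9], fields[10]
    if signature != b"PK\x03\x04":
        raise ValueError("archive does not start with a local file header")
    extra = archive[30 + name_len:30 + name_len + extra_len]
    ids = []
    pos = 0
    # Each extra field is a 2-byte ID, a 2-byte size, then `size` data bytes.
    while pos + 4 <= len(extra):
        header_id, size = struct.unpack_from("<HH", extra, pos)
        ids.append(header_id)
        pos += 4 + size
    return ids
```

To preserve the original layout, a rewriter would first need to detect whether each entry's local header carried the zip64 field (for example with a check like first_local_extra_ids) and then pass force_zip64 accordingly.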

Author

Jeff Dairiki dairiki@dairiki.org
