Rewrite zip files, modifying compression type and parameters.
zipmanip
This is a command line utility that rewrites a zip file. It attempts to leave the metadata and ordering of the archive contents as unchanged as possible, but re-compresses or decompresses the contents.
This was written for use as a git smudge/clean filter when keeping a history of zip files in git. (See below for more on that.) Note that various programs store their native project files as zip files:
- FreeCAD's .FCStd files are zip archives, as are some .amf and .3mf files. (Being able to reasonably version-control FreeCAD drawings was my motivation for writing this.)
- Many formats written by Open/LibreOffice and Microsoft Office programs are zip-based. These include .odt, .ods, .odp, .odg, .docx, .xlsx, and .pptx files. (It's still untested how well they work with this, but they may well do.)
- Java's .jar and .war files are zip archives. (Though these may be digitally signed. At present, I'm unsure how those signatures may interact with the techniques used here.)
- The .epub e-book format is zip-based.
This program may also be useful for purposes unrelated to git.
Installation
The recommended method of installation is to install the distribution from PyPI using, e.g. pipx.
- Install pipx.
- Run pipx install zipmanip.
Any standard Python installation method (e.g. installing to a
virtual environment using pip) should also work.
Quick and Dirty method
At present, there are no external dependencies and the code is all
contained in a single file, so you could just copy the
zipmanip.py
file to some location in your PATH, and make it executable.
Usage
$ zipmanip -h
usage: zipmanip [-h] [--output-file OUTPUT_FILE]
                [--compression-method {store,deflate,bzip2,lzma}] [-0]
                [input_file]

Write zip file contents to a new zip file, re- or de-compressing its contents. This can be
used to convert a compressed zip file to one whose contents are stored uncompressed, and
vice versa.

positional arguments:
  input_file            input zip file (default stdin): If an explicit input file is named
                        and no explicit output file is set, the named zip file will be
                        rewritten IN PLACE.

options:
  -h, --help            show this help message and exit
  --output-file OUTPUT_FILE, -O OUTPUT_FILE
                        output file name (default stdout)
  --compression-method {store,deflate,bzip2,lzma}, -Z {store,deflate,bzip2,lzma}
                        set compression method (default: 'deflate')
  -0, -1, -2, -3, -4, -5, -6, -7, -8, -9
                        set compression level
For example, zipmanip --compression-method=store will read a zip
archive from stdin and write to stdout a zip archive with the same
contents, all stored uncompressed.
The "inverse" operation (not an exact inverse; see
below) would be plain zipmanip, which
compresses the contents using the default settings (or zipmanip -9 to
set deflate compression to its maximum level).
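To make the idea concrete, here is a minimal sketch of how such a rewrite can be done with Python's standard zipfile module. It is not zipmanip's actual code; the function name rewrite_zip and its parameters are assumptions made for illustration. It copies each member in its original order, reusing the original ZipInfo so that the name, timestamp and attributes carry over, and changes only the compression settings.

```python
import io
import zipfile


def rewrite_zip(src, dst, compress_type=zipfile.ZIP_DEFLATED, compresslevel=None):
    """Copy every member of src to dst in order, changing only the compression.

    Hypothetical helper for illustration; src and dst may be paths or
    binary file objects.
    """
    with zipfile.ZipFile(src) as zin, zipfile.ZipFile(dst, "w") as zout:
        zout.comment = zin.comment
        for info in zin.infolist():  # infolist() keeps the archive order
            data = zin.read(info)
            # Reuse the original ZipInfo so name, timestamp and
            # attributes are written back unchanged.
            info.compress_type = compress_type
            zout.writestr(info, data, compresslevel=compresslevel)
```

Calling rewrite_zip(src, dst, zipfile.ZIP_STORED) roughly corresponds to --compression-method=store, and rewrite_zip(src, dst, compresslevel=9) to -9.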
Usage with Git
Zipmanip can be used as a clean/smudge filter with git so that zip
archives are stored uncompressed in the git index.
(The motivation is that if the zip contents are not compressed, git should be able to more efficiently pack the deltas between revisions.)
To set this up:
git config filter.zipmanip.clean "zipmanip --compression-method=store"
git config filter.zipmanip.smudge "zipmanip -9"
# optionally, for diff formatting
git config diff.unzip.textconv "unzip -c -a"
Then edit .gitattributes to set
filter=zipmanip (and, optionally, diff=unzip) on any zip files that
you want to store uncompressed. E.g.
*.FCStd binary filter=zipmanip diff=unzip
*.3mf binary filter=zipmanip diff=unzip
*.amf binary filter=zipmanip diff=unzip
Bugs
On Round-trip Idempotency
Currently, if a zip archive is round-tripped (converted to uncompressed, then re-compressed), the result will not be byte-wise identical to the original. This is due to (at least) a couple of issues.
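Even when two archives differ byte for byte, one can still check that they hold the same members with the same data. The following is a small sketch of such a check using the standard zipfile module; the helper names member_summary and same_contents are made up for this example and are not part of zipmanip.

```python
import io
import zipfile


def member_summary(archive):
    """Return (name, CRC-32, uncompressed size) for each member, in archive order."""
    with zipfile.ZipFile(archive) as zf:
        return [(i.filename, i.CRC, i.file_size) for i in zf.infolist()]


def same_contents(a, b):
    """True if both archives list the same members with the same data checksums."""
    return member_summary(a) == member_summary(b)
```

This compares only names, order, sizes and checksums. It deliberately ignores compression method, header layout and timestamps, which are the things that can differ after a round trip.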
Differing compression algorithm and parameters
It may be possible to improve this situation, at least partially, by storing information on the original compression type in the uncompressed archives.
Note that data on compression level may be available from bits 1, 2
and possibly 4 of ZipInfo.flag_bits (the general purpose bit flag).
(See section 4.4.4 of the PKZIP Application Note.)
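As a rough illustration, the sketch below decodes bits 1 and 2 of the general purpose bit flag for deflated members, using the meanings given in section 4.4.4 of the Application Note. Python's zipfile exposes the flag as ZipInfo.flag_bits. The names DEFLATE_OPTIONS and deflate_option are hypothetical, and bit 4 is left out. Note that these bits are only a hint: a writer is free to leave them clear, so a result of "normal" does not prove which level was actually used.

```python
import zipfile

# Meaning of general purpose flag bits 2 and 1 (in that order) for
# deflated members, per section 4.4.4 of the PKZIP Application Note.
DEFLATE_OPTIONS = {
    0b00: "normal",
    0b01: "maximum",
    0b10: "fast",
    0b11: "super fast",
}


def deflate_option(info):
    """Return the deflate option recorded in a ZipInfo, or None if not deflated."""
    if info.compress_type != zipfile.ZIP_DEFLATED:
        return None
    # Shift away bit 0, then keep bits 1 and 2.
    return DEFLATE_OPTIONS[(info.flag_bits >> 1) & 0b11]
```

For example, calling deflate_option on each entry of ZipFile.infolist() gives a per-member view of what the original writer recorded.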
Differing use of "zip64" extended header
Also, the use of the "extended local header" is not preserved. This
manifests in .3mf files written by PrusaSlicer, which always
writes extended headers. (This, I think, could be fixed with
appropriate use of the force_zip64 parameter to ZipFile.open.)
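For reference, this is roughly what using force_zip64 looks like. ZipFile.open accepts force_zip64=True in write mode, which makes zipfile write the member's local header in zip64 format regardless of its size. The helper name write_member_zip64 is an assumption for this sketch, and deciding when to force zip64 (for example, by inspecting the original archive) is not shown.

```python
import io
import zipfile


def write_member_zip64(zf, name, data):
    """Write one member to an open ZipFile, forcing a zip64 local header."""
    with zf.open(name, mode="w", force_zip64=True) as dest:
        dest.write(data)
```

The member reads back like any other; only the header layout differs.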
Author
Jeff Dairiki dairiki@dairiki.org