Produce canonical zips and hashes.

canonzip

Produce canonical zips and hashes from directory contents.

A canonical zip produces the exact same file for the same inputs, regardless of when it was made or what machine made it.

A canonical hash produces the exact same hash for the same inputs, regardless of when it was made or what machine made it.

This is particularly useful when zipping things like code for AWS Lambda Functions, where you want to upload a new zip if and only if the code has truly changed.
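The "upload if and only if the code has changed" pattern can be sketched as follows. This is a minimal illustration, not part of canonzip: `upload_if_changed`, `state_file`, `hash_fn`, and `upload` are hypothetical names, and the digest function is passed in so the sketch does not depend on any particular hashing implementation.

```python
from pathlib import Path
from typing import Callable


def upload_if_changed(
    target: str,
    state_file: Path,
    hash_fn: Callable[[str], str],
    upload: Callable[[], None],
) -> bool:
    """Call upload() only when the digest of target differs from the stored one."""
    digest = hash_fn(target)
    previous = state_file.read_text().strip() if state_file.exists() else None
    if digest == previous:
        return False  # nothing changed, so skip the upload
    upload()
    # Record the digest only after the upload has succeeded.
    state_file.write_text(digest)
    return True
```

With canonzip installed, `hash_fn` would typically be `canonzip.hash`, and `upload` would wrap whatever deployment step you use.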

canonzip supports two usage modes: as a CLI or as an API.

See examples/terraform-aws-lambda for an example use case.

Command Line Interface (CLI)

canonzip hash [OPTIONS] TARGET

Print a canonical SHA-1 hash of TARGET to stdout.

$ canonzip hash path/to/target
4959e4b9a1812e511570eee14fe65b90098a0db6

canonzip zip [OPTIONS] OUTPUT_PATH TARGET

Write a canonical zip archive of TARGET to OUTPUT_PATH.

$ canonzip zip path/to/output.zip path/to/target

NOTE: The output of hash is NOT the same as the SHA-1 hash of the archive written by zip. hash is designed to avoid the overhead of writing a zip file while serving a similar use case: detecting changes in the files.
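To illustrate how a directory can be hashed without building an archive, here is a rough sketch that feeds sorted relative paths and file contents straight into SHA-1. It is an assumption-laden illustration only: `directory_digest` and `make_tree` are hypothetical helpers, and the scheme is not canonzip's actual algorithm, so its digests will not match the output of canonzip hash.

```python
import hashlib
import tempfile
from pathlib import Path


def directory_digest(target: Path) -> str:
    """SHA-1 over sorted relative paths and file contents (illustrative only)."""
    hasher = hashlib.sha1()
    # Sort by POSIX-style relative path so traversal order never matters.
    files = sorted(
        (path.relative_to(target).as_posix(), path)
        for path in target.rglob("*")
        if path.is_file() and not path.is_symlink()  # symlinked files are skipped here
    )
    for relative_name, path in files:
        name = relative_name.encode("utf-8")
        data = path.read_bytes()
        # Length prefixes keep ("ab", "c") distinct from ("a", "bc").
        hasher.update(len(name).to_bytes(8, "big") + name)
        hasher.update(len(data).to_bytes(8, "big") + data)
    return hasher.hexdigest()


def make_tree(root: Path, files: dict[str, bytes]) -> Path:
    """Write the given files under root and return root."""
    for name, data in files.items():
        path = root / name
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(data)
    return root


with tempfile.TemporaryDirectory() as a, tempfile.TemporaryDirectory() as b:
    first = directory_digest(
        make_tree(Path(a), {"app.py": b"print(1)", "lib/util.py": b"x = 1"})
    )
    # Same content, written in a different order and in a different location.
    second = directory_digest(
        make_tree(Path(b), {"lib/util.py": b"x = 1", "app.py": b"print(1)"})
    )
    # Editing a file changes the digest.
    (Path(b) / "app.py").write_bytes(b"print(2)")
    changed = directory_digest(Path(b))
```

Because modification times, absolute paths, and creation order never enter the hash, `first` and `second` are equal, while `changed` differs.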

CLI options

Both commands accept:

| Option | Description |
| --- | --- |
| --exclude TEXT, -e TEXT | Glob pattern to exclude (repeatable) |
| --gitignore | Exclude files based on .gitignore rules from the target's git repository |
| --follow-symlinks | Follow symbolic links; otherwise symlinks are ignored |
| --verbose, -v | Print included file paths (relative to target) to stderr |
| --json | Output result as JSON (e.g. {"hash": "..."}) |

If you specify both exclude and gitignore, a file is excluded if it matches at least one rule (logical OR).
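When calling the CLI from a script, the --json flag makes the result easier to consume. The sketch below is one possible wrapper, assuming canonzip is on PATH and that the JSON object carries the digest under the "hash" key, as in the example from the options table; `parse_hash_output` and `canonzip_hash` are hypothetical helper names.

```python
import json
import subprocess


def parse_hash_output(stdout: str) -> str:
    """Extract the digest from the JSON printed by `canonzip hash --json`."""
    return json.loads(stdout)["hash"]


def canonzip_hash(target: str, exclude: tuple[str, ...] = ()) -> str:
    """Run the CLI and return the digest; requires canonzip on PATH."""
    command = ["canonzip", "hash", "--json"]
    for pattern in exclude:
        # --exclude is repeatable, so pass it once per pattern.
        command += ["--exclude", pattern]
    command.append(target)
    result = subprocess.run(command, check=True, capture_output=True, text=True)
    return parse_hash_output(result.stdout)
```

`check=True` turns a non-zero exit status into a `subprocess.CalledProcessError`, so a failed run is not mistaken for a valid digest.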

NOTE: In exclude patterns, double-star globs (**) match one or more path segments, unlike gitignore syntax, where they match zero or more.
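The difference between the two double-star rules is easiest to see on a concrete path. The following is a simplified model of the rule stated in the note, written only to show the effect of the minimum segment count; `matches` and `min_segments` are hypothetical names, and this is neither canonzip's matcher nor git's.

```python
from fnmatch import fnmatchcase


def matches(pattern: str, path: str, *, min_segments: int) -> bool:
    """Match path against pattern, where '**' spans at least min_segments segments."""
    pattern_parts = pattern.split("/")
    path_parts = path.split("/")

    def match_from(i: int, j: int) -> bool:
        if i == len(pattern_parts):
            return j == len(path_parts)
        if pattern_parts[i] == "**":
            remaining = len(path_parts) - j
            # Try every allowed number of segments for '**' to consume.
            return any(
                match_from(i + 1, j + k)
                for k in range(min_segments, remaining + 1)
            )
        return (
            j < len(path_parts)
            and fnmatchcase(path_parts[j], pattern_parts[i])
            and match_from(i + 1, j + 1)
        )

    return match_from(0, 0)


for path in ("src/main.py", "src/pkg/main.py"):
    one_or_more = matches("src/**/main.py", path, min_segments=1)
    zero_or_more = matches("src/**/main.py", path, min_segments=0)
    print(path, one_or_more, zero_or_more)
```

Under this model, src/**/main.py matches src/pkg/main.py in both modes, but matches src/main.py only when ** may span zero segments. With the one-or-more rule, excluding both paths would take two patterns.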

Programmatic Interface (API)

canonzip.hash(target, *, exclude, gitignore, follow_symlinks) -> str

Compute a canonical SHA-1 hash of a directory.

import canonzip

digest = canonzip.hash("path/to/target")
#> "4959e4b9a1812e511570eee14fe65b90098a0db6"

canonzip.zip(output_path, target, *, exclude, gitignore, follow_symlinks) -> None

Create a canonical zip archive of a directory.

canonzip.zip("path/to/output.zip", "path/to/target")
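For background on what makes an archive reproducible, the sketch below writes a zip with the standard library while pinning the usual sources of variation: entry order, timestamps, permissions, and the recorded host platform. It is an illustration under those assumptions, not canonzip's implementation; `write_deterministic_zip` and `FIXED_DATE_TIME` are hypothetical names, and the bytes it produces will differ from canonzip's output.

```python
import os
import tempfile
import zipfile
from pathlib import Path

# Earliest timestamp the zip format can represent.
FIXED_DATE_TIME = (1980, 1, 1, 0, 0, 0)


def write_deterministic_zip(output_path: Path, target: Path) -> None:
    """Write target's files to output_path with all variable metadata pinned."""
    # Collect and sort entries before opening the archive for writing.
    files = sorted(
        (path.relative_to(target).as_posix(), path)
        for path in target.rglob("*")
        if path.is_file() and not path.is_symlink()
    )
    with zipfile.ZipFile(output_path, "w") as archive:
        for relative_name, path in files:
            info = zipfile.ZipInfo(relative_name, date_time=FIXED_DATE_TIME)
            info.compress_type = zipfile.ZIP_STORED  # no compressor, no compressor drift
            info.create_system = 3  # pin to "Unix" instead of the host platform
            info.external_attr = 0o100644 << 16  # fixed regular-file permissions
            archive.writestr(info, path.read_bytes())


with tempfile.TemporaryDirectory() as src, tempfile.TemporaryDirectory() as out:
    root = Path(src)
    (root / "b.txt").write_bytes(b"second")
    (root / "a.txt").write_bytes(b"first")
    first_zip = Path(out) / "one.zip"
    second_zip = Path(out) / "two.zip"
    write_deterministic_zip(first_zip, root)
    os.utime(root / "a.txt", (0, 0))  # change a file's mtime between runs
    write_deterministic_zip(second_zip, root)
    identical = first_zip.read_bytes() == second_zip.read_bytes()
    with zipfile.ZipFile(first_zip) as archive:
        names = archive.namelist()
```

The two archives come out byte-identical even though a file's modification time changed between runs, and the entries appear in sorted order regardless of creation order.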

Shared options

Both functions accept:

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| exclude | list[str] \| None | None | Glob patterns to exclude |
| gitignore | bool | False | Exclude files based on .gitignore rules from the target's git repository |
| follow_symlinks | bool | False | Follow symbolic links; if False, symlinks are ignored |

If you specify both exclude and gitignore, a file is excluded if it matches at least one rule (logical OR).

NOTE: In exclude patterns, double-star globs (**) match one or more path segments, unlike gitignore syntax, where they match zero or more.

Exceptions

canonzip raises standard exceptions, typically subclasses of OSError, if it cannot read or write files.

Additionally, the following special cases raise exceptions that inherit from canonzip.CanonzipError:

| Exception | Raised when |
| --- | --- |
| OutputPathError | output_path is inside target |
| GitRepositoryError | gitignore=True but target is not in a git repo |
| BrokenSymlinkError | A broken symlink is encountered with follow_symlinks=True |
| SymlinkCycleError | A symlink cycle is detected with follow_symlinks=True |
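The OutputPathError condition presumably exists because an archive written inside the directory being archived would become part of its own input. If you want to check for that situation before calling canonzip, a standard-library sketch might look like this; `is_inside` is a hypothetical helper, and canonzip's own check may differ in detail.

```python
from pathlib import Path


def is_inside(output_path: str, target: str) -> bool:
    """Return True if output_path resolves to a location within target."""
    # resolve() normalizes ".." components and symlinks before comparing.
    resolved_output = Path(output_path).resolve()
    resolved_target = Path(target).resolve()
    # Path.is_relative_to requires Python 3.9 or newer.
    return resolved_output.is_relative_to(resolved_target)
```

For example, `is_inside("proj/build/out.zip", "proj")` is true, while `is_inside("proj/../out.zip", "proj")` is false because the path resolves to a location outside proj.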

Advanced: build manifests explicitly

If you need direct access to the list of files that would be included in the canonical hash or zip, you can use build_manifest to read the target directory and return a Manifest object containing the list of files. To save yourself from having to generate the manifest twice, you can then pass it directly to hash_from_manifest or zip_from_manifest to complete the operation.

from canonzip import build_manifest, hash_from_manifest, zip_from_manifest

manifest = build_manifest("path/to/target", exclude=[".venv"])

# Do something interesting with the manifest...
print(manifest.target.as_posix())

for entry in manifest.entries:
    print(entry.path.as_posix())

# Then compute the hash or zip
digest = hash_from_manifest(manifest)
zip_from_manifest("path/to/output.zip", manifest)
