canonzip

Produce canonical zips and hashes from directory contents.

A canonical zip is byte-for-byte identical for the same inputs, regardless of when it was made or what machine made it.

A canonical hash is likewise identical for the same inputs, regardless of when it was made or what machine made it.

This is particularly useful when zipping things like code for AWS Lambda Functions, where you want to upload a new zip if and only if the code has truly changed.
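To make the idea concrete, here is a minimal sketch of what makes an archive reproducible, using only the Python standard library. It is not canonzip's implementation: the function name `deterministic_zip_bytes`, the fixed 1980 timestamp, the sorted entry order, the uniform permissions, and the uncompressed storage are all assumptions chosen for illustration.

```python
import io
import zipfile
from pathlib import Path

# Fixed timestamp written into every entry (the earliest the zip format allows).
FIXED_DATE_TIME = (1980, 1, 1, 0, 0, 0)


def deterministic_zip_bytes(target: Path) -> bytes:
    """Zip the regular files under `target` so that equal inputs give equal bytes."""
    # Sort by POSIX-style relative path so filesystem listing order does not matter.
    files = sorted(
        (p for p in target.rglob("*") if p.is_file() and not p.is_symlink()),
        key=lambda p: p.relative_to(target).as_posix(),
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for path in files:
            info = zipfile.ZipInfo(
                filename=path.relative_to(target).as_posix(),
                date_time=FIXED_DATE_TIME,  # ignore the real modification time
            )
            info.external_attr = 0o644 << 16  # same permissions for every entry
            info.create_system = 3  # pin the platform field instead of using the host's
            # ZipInfo defaults to ZIP_STORED, so no compressor version can alter the bytes.
            archive.writestr(info, path.read_bytes())
    return buffer.getvalue()
```

Calling this twice on the same tree returns identical bytes even if file modification times change in between, which is the property an upload-only-on-change workflow relies on.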

canonzip supports two usage modes: as a CLI or as an API.

Check out examples/terraform-aws-lambda for an example use-case.

Command Line Interface (CLI)

canonzip hash [OPTIONS] TARGET

Print a canonical SHA-1 hash of TARGET to stdout.

$ canonzip hash path/to/target
4959e4b9a1812e511570eee14fe65b90098a0db6

canonzip zip [OPTIONS] OUTPUT_PATH TARGET

Write a canonical zip archive of TARGET to OUTPUT_PATH.

$ canonzip zip path/to/output.zip path/to/target

NOTE: the output of hash is NOT the same as the SHA-1 hash of the output from zip. hash is specifically designed to avoid the overhead of writing a zip file while serving a similar use case: detecting changes in the files.
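One way to use the hash for change detection is to store the last digest and only rebuild the zip when it differs. The sketch below assumes the `canonzip.hash` and `canonzip.zip` functions described in the API section; the helper name `rebuild_if_changed` and the `state_file` convention are hypothetical. Both the state file and the output zip should live outside the target directory, otherwise writing them would itself change the hash.

```python
from pathlib import Path


def rebuild_if_changed(target: str, output_path: str, state_file: str) -> bool:
    """Rewrite the zip only when the canonical hash differs from the stored one.

    Returns True if a new zip was written, False if nothing changed.
    """
    # Imported lazily so this sketch can be loaded without canonzip installed.
    import canonzip

    digest = canonzip.hash(target)
    state = Path(state_file)
    previous = state.read_text().strip() if state.exists() else None
    if digest == previous:
        return False  # contents unchanged; skip writing the zip

    canonzip.zip(output_path, target)
    state.write_text(digest + "\n")  # remember the digest for the next run
    return True
```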

CLI options

Both commands accept:

| Option | Description |
| --- | --- |
| --exclude TEXT, -e TEXT | Glob pattern to exclude (repeatable) |
| --gitignore | Exclude files based on .gitignore rules from the target's git repository |
| --follow-symlinks | Follow symbolic links; otherwise symlinks are ignored |
| --verbose, -v | Print included file paths (relative to target) to stderr |
| --json | Output result as JSON (e.g. {"hash": "..."}) |

If you specify both exclude and gitignore, a file is excluded if it matches at least one rule (logical OR).
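When driving the CLI from a script, it can help to assemble the argument list and parse the JSON output in one place. The following sketch uses only the flags listed in the table; the helper names `build_hash_command` and `parse_hash_output` are hypothetical, and the JSON shape is assumed to be the `{"hash": "..."}` form shown for the hash command.

```python
import json
from typing import List, Sequence


def build_hash_command(
    target: str,
    *,
    exclude: Sequence[str] = (),
    gitignore: bool = False,
    follow_symlinks: bool = False,
) -> List[str]:
    """Assemble an argv list for `canonzip hash` with JSON output."""
    argv = ["canonzip", "hash", "--json"]
    for pattern in exclude:
        argv += ["--exclude", pattern]  # the flag is repeatable, one per pattern
    if gitignore:
        argv.append("--gitignore")
    if follow_symlinks:
        argv.append("--follow-symlinks")
    argv.append(target)
    return argv


def parse_hash_output(stdout: str) -> str:
    """Extract the digest from output shaped like {"hash": "..."}."""
    return json.loads(stdout)["hash"]
```

The list returned by `build_hash_command` can be passed to `subprocess.run(..., capture_output=True, text=True, check=True)`, and the resulting `stdout` handed to `parse_hash_output`.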

NOTE: in exclude patterns, double-star globs (**) match one or more path segments, unlike gitignore syntax, where they match zero or more.
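The difference shows up for patterns such as a/**/b. The sketch below is a toy matcher written for illustration, not canonzip's matcher and not a full gitignore implementation: it handles only `**` as a whole middle segment and `*` within a segment. The names `compile_double_star` and `matches` are hypothetical.

```python
import re


def compile_double_star(pattern: str, *, zero_or_more: bool) -> re.Pattern:
    """Translate a '/'-separated glob into a regex over 'segment/' units.

    A `**` segment matches zero or more directories when `zero_or_more` is
    True (gitignore style) and one or more otherwise.
    """
    quantifier = "*" if zero_or_more else "+"
    parts = []
    for segment in pattern.split("/"):
        if segment == "**":
            parts.append(f"(?:[^/]+/){quantifier}")
        else:
            # Within a segment, `*` matches any run of non-separator characters.
            parts.append(re.escape(segment).replace(r"\*", "[^/]*") + "/")
    return re.compile("".join(parts))


def matches(pattern: str, path: str, *, zero_or_more: bool) -> bool:
    # A trailing "/" is appended so every segment, including the last, ends the same way.
    regex = compile_double_star(pattern, zero_or_more=zero_or_more)
    return regex.fullmatch(path + "/") is not None
```

Under this model, a/**/b matches a/x/b with either setting, but matches a/b only when `zero_or_more=True`. If the note above is read the same way, an exclude pattern meant to cover both cases would need a/b listed separately.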

Programmatic Interface (API)

canonzip.hash(target, *, exclude, gitignore, follow_symlinks) -> str

Compute a canonical SHA-1 hash of a directory.

import canonzip

digest = canonzip.hash("path/to/target")
#> "4959e4b9a1812e511570eee14fe65b90098a0db6"

canonzip.zip(output_path, target, *, exclude, gitignore, follow_symlinks) -> None

Create a canonical zip archive of a directory.

canonzip.zip("path/to/output.zip", "path/to/target")

Shared options

Both functions accept:

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| exclude | list[str] \| None | None | Glob patterns to exclude |
| gitignore | bool | False | Exclude files based on .gitignore rules from the target's git repository |
| follow_symlinks | bool | False | Follow symbolic links; if False, symlinks are ignored |

If you specify both exclude and gitignore, a file is excluded if it matches at least one rule (logical OR).

NOTE: in exclude patterns, double-star globs (**) match one or more path segments, unlike gitignore syntax, where they match zero or more.

Exceptions

If canonzip cannot read or write files, it raises standard exceptions, typically subclasses of OSError.

Additionally, some special cases raise exceptions that inherit from canonzip.CanonzipError:

| Exception | Raised when |
| --- | --- |
| OutputPathError | output_path is inside target |
| GitRepositoryError | gitignore=True but target is not in a git repo |
| BrokenSymlinkError | A broken symlink is encountered with follow_symlinks=True |
| SymlinkCycleError | A symlink cycle is detected with follow_symlinks=True |
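A caller that wants to separate canonzip-specific failures from ordinary filesystem errors can catch the shared base class first. This is a minimal sketch: the helper name `try_zip` and its message format are hypothetical, and it relies only on `canonzip.CanonzipError` and `canonzip.zip` as described above.

```python
from typing import Optional


def try_zip(output_path: str, target: str) -> Optional[str]:
    """Return None on success, or a short error message on failure."""
    # Imported lazily so this sketch can be loaded without canonzip installed.
    import canonzip

    try:
        canonzip.zip(output_path, target, gitignore=True)
    except canonzip.CanonzipError as exc:
        # Covers the special cases in the table, e.g. output_path inside target.
        return f"canonzip error: {type(exc).__name__}: {exc}"
    except OSError as exc:
        # Covers ordinary read/write failures such as missing paths or permissions.
        return f"filesystem error: {exc}"
    return None
```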

Advanced: build manifests explicitly

If you need direct access to the list of files that would be included in the canonical hash or zip, you can use build_manifest to read the target directory and return a Manifest object containing the list of files. To save yourself from having to generate the manifest twice, you can then pass it directly to hash_from_manifest or zip_from_manifest to complete the operation.

from canonzip import build_manifest, hash_from_manifest, zip_from_manifest

manifest = build_manifest("path/to/target", exclude=[".venv"])

# Do something interesting with the manifest...
print(manifest.target.as_posix())

for entry in manifest.entries:
    print(entry.path.as_posix())

# Then compute the hash or zip
digest = hash_from_manifest(manifest)
zip_from_manifest("path/to/output.zip", manifest)
