Extend `zipfile` with `remove`-related functionalities

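This package extends the standard-library zipfile module with remove-related functionality: removing members, repacking the archive to reclaim space, and copying members.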

API

  • ZipFile.remove(zinfo_or_arcname)

    Removes a member from the archive. zinfo_or_arcname may be the full path of the member or a ZipInfo instance.

    If multiple members share the same full path, only one is removed when a path is provided.

    This does not physically remove the local file entry from the archive. Call ZipFile.repack afterwards to reclaim space.

    The archive must be opened with mode 'w', 'x' or 'a'.

    Returns the removed ZipInfo instance.

    Calling remove on a closed ZipFile will raise a ValueError.

  • ZipFile.repack(removed=None, *, strict_descriptor=False[, chunk_size])

    Rewrites the archive to remove stale local file entries, shrinking its file size.

    If removed is provided, it must be a sequence of ZipInfo objects representing removed entries; only their corresponding local file entries will be removed.

    If removed is not provided, the archive is scanned to identify and remove local file entries that are no longer referenced in the central directory. The algorithm assumes that local file entries (and the central directory, which is mostly treated as the "last entry") are stored consecutively:

    1. Data before the first referenced entry is removed only when it appears to be a sequence of consecutive entries with no extra following bytes; extra preceding bytes are preserved.
    2. Data between referenced entries is removed only when it appears to be a sequence of consecutive entries with no extra preceding bytes; extra following bytes are preserved.
    3. Entries must not overlap. If any entry's data overlaps with another, a BadZipFile error is raised and no changes are made.

    When scanning, setting strict_descriptor=True disables detection of any entry using an unsigned data descriptor (deprecated in the ZIP specification since version 6.3.0, released on 2006-09-29, and used only by some legacy tools). This improves performance, but may cause some stale entries to be preserved.

    chunk_size may be specified to control the buffer size when moving entry data (default is 1 MiB).

    The archive must be opened with mode 'a'.

    Calling repack on a closed ZipFile will raise a ValueError.

  • ZipFile.copy(zinfo_or_arcname, new_arcname[, chunk_size])

    Copies a member zinfo_or_arcname to new_arcname in the archive. zinfo_or_arcname may be the full path of the member or a ZipInfo instance.

    chunk_size may be specified to control the buffer size when copying entry data (default is 1 MiB).

    The archive must be opened with mode 'w', 'x' or 'a', and the underlying stream must be seekable.

    Returns the ZipInfo instance of the original (source) member, so the result can be passed directly to ZipFile.remove.

    Calling copy on a closed ZipFile will raise a ValueError.
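To make the role of chunk_size more concrete, here is a minimal sketch of moving a block of bytes toward the start of a file in fixed-size chunks, which is roughly the kind of work a repack has to do once stale data is dropped. It uses only the standard library; the helper name move_data_backward is hypothetical and this is not the package's implementation, which may differ in detail.

```python
import io

def move_data_backward(fp, src, dst, length, chunk_size=1 << 20):
    """Move `length` bytes from offset `src` to an earlier offset `dst`.

    Hypothetical helper for illustration only. Copying front to back is safe
    when dst <= src, because each write lands on bytes that were already read.
    """
    if dst > src:
        raise ValueError('this sketch only moves data toward the start')
    moved = 0
    while moved < length:
        size = min(chunk_size, length - moved)
        fp.seek(src + moved)
        chunk = fp.read(size)
        fp.seek(dst + moved)
        fp.write(chunk)
        moved += size

# 5 stale bytes followed by 15 bytes that should be kept
buf = io.BytesIO(b'STALE' + b'live entry data')
move_data_backward(buf, src=5, dst=0, length=15, chunk_size=4)
buf.truncate(15)       # cut off the now-unused tail
print(buf.getvalue())  # b'live entry data'
```

A smaller chunk_size lowers peak memory use at the cost of more read/write calls; a larger one does the opposite.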

Examples

Remove entries and reclaim space

Call repack after removing entries to reclaim the space they occupied:

import os
import zipremove as zipfile

with zipfile.ZipFile('archive.zip', 'w') as zh:
    zh.writestr('file1', 'content1')
    zh.writestr('file2', 'content2')
    zh.writestr('file3', 'content3')
    zh.writestr('file4', 'content4')

print(os.path.getsize('archive.zip'))  # 398

with zipfile.ZipFile('archive.zip', 'a') as zh:
    zh.remove('file1')
    zh.remove('file2')
    zh.remove('file3')
    zh.repack()

print(os.path.getsize('archive.zip'))  # 116 (would be 245 without `repack`)

Alternatively, pass the ZipInfo objects of the removed entries, for better performance and error-proofing:

import os
import zipremove as zipfile

with zipfile.ZipFile('archive.zip', 'w') as zh:
    zh.writestr('file1', 'content1')
    zh.writestr('file2', 'content2')
    zh.writestr('file3', 'content3')
    zh.writestr('file4', 'content4')

print(os.path.getsize('archive.zip'))  # 398

with zipfile.ZipFile('archive.zip', 'a') as zh:
    zinfos = []
    zinfos.append(zh.remove('file1'))
    zinfos.append(zh.remove('file2'))
    zinfos.append(zh.remove('file3'))
    zh.repack(zinfos)

print(os.path.getsize('archive.zip'))  # 116 (would be 245 without `repack`)
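When repack is called without removed, it has to find stale local file entries by itself. The following standard-library sketch illustrates the basic idea only: it lists the byte ranges that the central directory still references and reports whatever lies between them. It assumes a small archive without ZIP64 records, data descriptors, or an archive comment, and all helper names are hypothetical. The scan described above for repack handles more cases (data descriptors, overlap checks, extra bytes), so this is not the package's algorithm.

```python
import io
import struct
import zipfile

LOCAL_SIG = b'PK\x03\x04'  # local file header signature
EOCD_SIG = b'PK\x05\x06'   # end of central directory signature

def central_directory_offset(data):
    # The central directory offset is stored 16 bytes into the EOCD record.
    eocd = data.rfind(EOCD_SIG)
    return struct.unpack_from('<L', data, eocd + 16)[0]

def referenced_spans(data):
    """Byte ranges of the local entries listed in the central directory."""
    spans = []
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        for zinfo in zf.infolist():
            start = zinfo.header_offset
            # Name and extra-field lengths sit at offset 26 of the local header.
            name_len, extra_len = struct.unpack_from('<2H', data, start + 26)
            end = start + 30 + name_len + extra_len + zinfo.compress_size
            spans.append((start, end))
    return sorted(spans)

def unreferenced_ranges(data):
    """Byte ranges before the central directory that no entry refers to."""
    gaps, pos = [], 0
    cd_start = central_directory_offset(data)
    for start, end in referenced_spans(data) + [(cd_start, cd_start)]:
        if start > pos:
            gaps.append((pos, start))
        pos = max(pos, end)
    return gaps

def build_archive_with_stale_entry():
    # Simulate a removed member: keep only the local entry of 'file1'
    # (no central directory), then append 'file2' in mode 'a'.
    first = io.BytesIO()
    with zipfile.ZipFile(first, 'w') as zf:
        zf.writestr('file1', 'content1')
    raw = first.getvalue()
    buf = io.BytesIO(raw[:central_directory_offset(raw)])
    with zipfile.ZipFile(buf, 'a') as zf:
        zf.writestr('file2', 'content2')
    return buf.getvalue()

data = build_archive_with_stale_entry()
with zipfile.ZipFile(io.BytesIO(data)) as zf:
    print(zf.namelist())      # ['file2']
gaps = unreferenced_ranges(data)
print(gaps)                   # [(0, 43)]
print(data[:4] == LOCAL_SIG)  # True: the gap starts with a stale local entry
```

The 43 stale bytes match the sizes in the examples above: each removed entry leaves behind a 43-byte local file entry (30-byte header, 5-byte name, 8 bytes of data), which is why the archive is 245 bytes instead of 116 when repack is skipped.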

Move entries under a folder and reclaim space

A ZIP file has no in-place rename: every local file entry stores its own filename, so moving an entry means rewriting that entry. A move is therefore done as a combination of copy, remove, and optionally repack:

import os
import zipremove as zipfile

with zipfile.ZipFile('archive.zip', 'w') as zh:
    zh.writestr('file0', 'content0')
    zh.writestr('folder1/file1', 'content1')
    zh.writestr('folder1/file2', 'content2')
    zh.writestr('folder1/file3', 'content3')

print(os.path.getsize('archive.zip'))  # 446

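with zipfile.ZipFile('archive.zip', 'a') as zh:
    # namelist() returns a new list, so the archive can be modified while looping
    for n in zh.namelist():
        if n.startswith('folder1/'):
            n2 = 'folder2/' + n[len('folder1/'):]
            zh.copy(n, n2)  # write a copy under the new name
            zh.remove(n)    # drop the original member
    zh.repack()  # reclaim the space of the removed originals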

print(os.path.getsize('archive.zip'))  # 446 (would be 599 without `repack`)

Similarly, pass the ZipInfo objects of the copied/removed entries for better performance and error-proofing:

import os
import zipremove as zipfile

with zipfile.ZipFile('archive.zip', 'w') as zh:
    zh.writestr('file0', 'content0')
    zh.writestr('folder1/file1', 'content1')
    zh.writestr('folder1/file2', 'content2')
    zh.writestr('folder1/file3', 'content3')

print(os.path.getsize('archive.zip'))  # 446

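with zipfile.ZipFile('archive.zip', 'a') as zh:
    zinfos = []
    for n in zh.namelist():
        if n.startswith('folder1/'):
            n2 = 'folder2/' + n[len('folder1/'):]
            # copy() returns the source ZipInfo; remove() accepts it and returns it
            zinfos.append(zh.remove(zh.copy(n, n2)))
    zh.repack(zinfos)  # only the listed entries are removed, no scan needed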

print(os.path.getsize('archive.zip'))  # 446 (would be 599 without `repack`)
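As background for why a move cannot be a simple rename, the following standard-library sketch shows that a member's name is stored twice: once in its local file header and once in the central directory. It reads the name straight out of the local header using the fixed offsets of the ZIP format (name length at offset 26, name at offset 30). The variable names are only illustrative.

```python
import io
import struct
import zipfile

buf = io.BytesIO()
with zipfile.ZipFile(buf, 'w') as zf:
    zf.writestr('folder1/file1', 'content1')
data = buf.getvalue()

with zipfile.ZipFile(io.BytesIO(data)) as zf:
    zinfo = zf.getinfo('folder1/file1')

# Read the filename directly from the local file header.
start = zinfo.header_offset
name_len, extra_len = struct.unpack_from('<2H', data, start + 26)
local_name = data[start + 30:start + 30 + name_len]

print(local_name)                    # b'folder1/file1'
print(data.count(b'folder1/file1'))  # 2: local header and central directory
```

Because the name is embedded in the local entry, changing it shifts every byte that follows, so writing a new entry and removing the old one is the practical way to move a member.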
