An extension to tarfile that allows adding files to a tarfile without first writing them to disk. It can also compress data as it is added, which suits large files or data generated on the fly. The output file object must support "seek()", so the output must be an uncompressed tar file. Currently, only GZip is supported for compressing the added data.

Project description

Version: 1.1

Summary

This module extends the standard tarfile.TarFile class with the ability to add files that are compressed on the fly. It only works on an uncompressed output tarfile because, once a file's data has been written, it goes back and overwrites that file's header with the correct size.
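The header-rewriting approach described above can be sketched with the standard library alone. This is a minimal illustration under stated assumptions, not targzstream's actual implementation: the helper names write_gz_member and finish_archive are hypothetical, and the sketch writes the tar blocks directly to a seekable binary file object.

```python
import gzip
import io
import tarfile


def write_gz_member(out, name, chunks):
    """Stream gzip-compressed chunks into a tar member (hypothetical helper).

    `out` must be a seekable binary file object positioned on a tar block
    boundary. The member size is unknown up front, so a placeholder header
    is written first and overwritten once the data has been written.
    """
    info = tarfile.TarInfo(name)
    header_pos = out.tell()
    # Placeholder header: size is still 0 at this point.
    out.write(info.tobuf(format=tarfile.GNU_FORMAT))
    data_pos = out.tell()

    # Compress on the fly, straight into the archive.
    with gzip.GzipFile(fileobj=out, mode="wb") as gz:
        for chunk in chunks:
            gz.write(chunk)

    # Now the compressed size is known; pad the data to a full tar block.
    info.size = out.tell() - data_pos
    out.write(b"\0" * (-info.size % tarfile.BLOCKSIZE))
    end_pos = out.tell()

    # Seek back and overwrite the header with the correct size.
    out.seek(header_pos)
    out.write(info.tobuf(format=tarfile.GNU_FORMAT))
    out.seek(end_pos)
    return info


def finish_archive(out):
    """Write the two zero blocks that mark the end of a tar archive."""
    out.write(b"\0" * (2 * tarfile.BLOCKSIZE))


# Demo: build an archive in memory, then read it back with plain tarfile.
buf = io.BytesIO()
write_gz_member(buf, "hello.txt.gz", [b"hello ", b"world\n"])
finish_archive(buf)
buf.seek(0)
with tarfile.open(fileobj=buf, mode="r:") as tar:
    member = tar.getmember("hello.txt.gz")
    data = gzip.decompress(tar.extractfile(member).read())
```

The seek back to header_pos is the step that rules out sockets and compressed output streams, since neither can rewind to rewrite bytes already sent.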

Limitations

  • The object to which the tarfile is being written must support “seek()”, so this cannot work over a socket, nor presumably with a compressed tarfile. Note: compressing the tarfile itself would gain little, since its contents are already compressed.

  • Calling “close” on the file stream calls the “close_gz_file” method. Note: close_gz_file() and close_file() are interchangeable.

  • The constructor does not support reading; use the open() class method or the base class tarfile.TarFile constructor instead.
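For reading such an archive back, the following sketch uses only the standard tarfile and gzip modules rather than targzstream itself. It assumes the members were stored gzip-compressed with names ending in ".gz", as in the usage example; the helper name read_gz_members is hypothetical, and the sample archive is built in memory purely so the example is self-contained.

```python
import gzip
import io
import tarfile


def read_gz_members(archive):
    """Yield (name, decompressed bytes) for each '.gz' member (hypothetical helper)."""
    # "r:" opens the archive as an uncompressed tar only.
    with tarfile.open(fileobj=archive, mode="r:") as tar:
        for member in tar:
            if member.isfile() and member.name.endswith(".gz"):
                with tar.extractfile(member) as raw:
                    with gzip.GzipFile(fileobj=raw, mode="rb") as gz:
                        yield member.name, gz.read()


# Build a small sample archive containing one gzip-compressed member.
payload = gzip.compress(b"int main() { return 0; }\n")
archive = io.BytesIO()
with tarfile.open(fileobj=archive, mode="w") as tar:
    info = tarfile.TarInfo("main.cpp.gz")
    info.size = len(payload)
    tar.addfile(info, io.BytesIO(payload))

archive.seek(0)
contents = dict(read_gz_members(archive))
```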

Example Usage

#!/usr/bin/env python3
import os, sys, shutil

import targzstream

# USAGE:  ./foo.py TARFILE INPUT [ INPUT2 ... ]
#  Eg: ./foo.py myoutput.tar *.cpp *.h

with targzstream.TarFile(sys.argv[1], mode='w') as tarball:
    for fname in sys.argv[2:]:
        st = os.stat(fname)
        with tarball.add_gz_file(name=fname + '.gz', mtime=st.st_mtime,
                                 uid=st.st_uid, gid=st.st_gid, mode=st.st_mode) as fout:
            # Copy the data.
            with open(fname, 'rb') as fin:
                shutil.copyfileobj(fin, fout)
# The end.

TODO

  • Have add_gz_file handle the result of an os.stat. Eg:

    with tarball.gz_file(name=fname + '.gz', stat=os.stat(fname)) as obj:
        with open(fname, 'rb') as fin:
            shutil.copyfileobj(fin, obj)
  • Wrap add_gz_file and close_gz_file as a context manager.

    Done.

  • Allow streaming uncompressed files, too.

    Done.

Download files

Source Distribution

targzstream-1.1.tar.gz (3.8 kB)
