Tar (and compress) files in s3

s3-tar

Create a tar, tar.gz, or tar.bz2 archive from many S3 files and stream it back into S3.
Note: directory structure is currently not preserved; all files are placed in the root directory of the archive.
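
Because every file ends up in the archive root, two keys that share a base name (for example `2020/report.json` and `2021/report.json`) would map to the same member name. Below is a minimal sketch for spotting such clashes before starting a job. It assumes the member name is simply the last path segment of the key; the helper `find_name_collisions` and the sample keys are hypothetical and not part of s3-tar.

```python
import posixpath
from collections import defaultdict


def find_name_collisions(keys):
    """Group S3 keys by base name and return the groups with more than one key.

    Assumption: a flattened archive uses the last path segment of each key
    as the member name. This helper is illustrative, not part of s3-tar.
    """
    by_name = defaultdict(list)
    for key in keys:
        by_name[posixpath.basename(key)].append(key)
    # Keep only base names that are claimed by two or more keys
    return {name: grouped for name, grouped in by_name.items() if len(grouped) > 1}


# Hypothetical keys, used only to demonstrate the check
keys = ["2020/report.json", "2021/report.json", "logs/app.log"]
collisions = find_name_collisions(keys)
print(collisions)
```

How s3-tar itself treats duplicate names is not stated here, so it seems safest to resolve any reported clashes before adding the files.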

Install

pip install s3-tar

Usage

Command Line

$ s3-tar -h

Import

from s3_tar import S3Tar

bucket = 'YOUR_BUCKET_NAME'
path_to_tar = 'PATH_TO_FILES_TO_CONCAT'
tared_file = 'FILE_TO_SAVE_TO.tar'  # use `tar.gz` or `tar.bz2` to enable compression
# Setting this to a size will always add a part number at the end of the file name
min_file_size = '50MB'  # ex: FILE_TO_SAVE_TO-1.tar, FILE_TO_SAVE_TO-2.tar, ...
# Setting this to None will create a single tar with all the files
# min_file_size = None

# Init the job
job = S3Tar(bucket, tared_file,
            min_file_size=min_file_size,
            # target_bucket=None,  # Default: source bucket. Can be used to save the archive into a different bucket
            # cache_size=5,  # Default 5, Number of files to hold in memory to be processed
            # session=boto3.session.Session(),  # For custom aws session
)
# Add files; can be called multiple times to add files from other directories
job.add_files(path_to_tar)
# Add a single file at a time
job.add_file('some/file_key.json')
# Start the tar'ing job after files have been added
job.tar()
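
Once a job has finished, it can be useful to confirm which members ended up in the archive. Below is a minimal sketch that uses only the standard library `tarfile` module. It works on any seekable file-like object, so the bytes could come from a downloaded copy of the archive. The helper `list_members` and the in-memory demo archive are illustrative assumptions, not part of s3-tar.

```python
import io
import tarfile


def list_members(fileobj):
    """Return the names of regular files in a tar archive.

    Mode "r:*" lets tarfile detect gzip/bz2 compression automatically.
    """
    with tarfile.open(fileobj=fileobj, mode="r:*") as archive:
        return [member.name for member in archive.getmembers() if member.isfile()]


# Build a small gzip-compressed archive in memory to stand in for a downloaded file
buffer = io.BytesIO()
with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
    payload = b'{"ok": true}'
    info = tarfile.TarInfo(name="file_key.json")
    info.size = len(payload)
    archive.addfile(info, io.BytesIO(payload))
buffer.seek(0)

members = list_members(buffer)
print(members)
```

Since the archive is described as flat, every name returned should be a bare file name with no directory prefix.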
