Check consistency of files stored on S3 against local files

Project description

aws s3 sync is great!

But if you are truly paranoid about the safety of your precious files, it’s better to double-check what was uploaded to S3 before deleting them from your local file system.

This tool does exactly that:

  1. Recursively list your local files

  2. Ask S3 for the size and ETag of each corresponding object

  3. Compare these against the local sizes and locally computed ETags
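The three steps above can be sketched roughly as follows. This is an illustrative outline, not the tool's actual code: the function names (`list_local_files`, `remote_size_and_etag`, `compare`) and the `local_etag` callback are hypothetical, and `s3_client` is assumed to be any object exposing `head_object` the way a boto3 S3 client does.

```python
import os


def list_local_files(root):
    """Step 1: yield (relative_path, size_in_bytes) for every file under root."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, root)
            # S3 keys use forward slashes regardless of the local platform.
            yield rel.replace(os.sep, "/"), os.path.getsize(full)


def remote_size_and_etag(s3_client, bucket, key):
    """Step 2: ask S3 for an object's size and ETag (returned without quotes)."""
    resp = s3_client.head_object(Bucket=bucket, Key=key)
    return resp["ContentLength"], resp["ETag"].strip('"')


def compare(root, s3_client, bucket, prefix, local_etag):
    """Step 3: return a list of (relative_path, reason) for every mismatch.

    local_etag is a hypothetical callback: path -> expected ETag string.
    A missing object is not handled here; the client's exception propagates.
    """
    errors = []
    for rel, size in list_local_files(root):
        key = f"{prefix.rstrip('/')}/{rel}" if prefix else rel
        remote_size, remote_etag = remote_size_and_etag(s3_client, bucket, key)
        if size != remote_size:
            errors.append((rel, "size mismatch"))
        elif local_etag(os.path.join(root, rel)) != remote_etag:
            errors.append((rel, "etag mismatch"))
    return errors
```

With boto3 installed, the client could be created with `boto3.client("s3")` and passed as `s3_client`. Checking the size first avoids hashing a file whose length already differs.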

Usage

$ time s3-consistency-checker /data/foo s3://bucketname/foo
2017/10/14 16:30:58.729 INFO Comparing 2093 files from /data/foo with s3://bucketname/foo
2017/10/14 17:43:18.222 INFO success=2093 errors=0 files=2093 bytes=1.3TiB

real    72m20.097s
user    55m6.975s
sys     778m45.112s

$ time s3-consistency-checker /data/bar s3://bucketname/baz
2017/10/14 18:47:08.620 INFO Comparing 26531 files from /data/bar with s3://bucketname/baz
2017/10/14 19:21:48.425 INFO success=26531 errors=0 files=26531 bytes=220.1GiB

real    34m42.023s
user    40m22.292s
sys     33m57.729s

$ time s3-consistency-checker /data/foobar s3://bucketname/foobar
2017/10/15 02:11:00.904 INFO Comparing 11224 files from /data/foobar with s3://bucketname/foobar
2017/10/15 02:25:18.397 INFO success=11224 errors=0 files=11224 bytes=84.8GiB

real    14m18.873s
user    17m3.899s
sys     10m33.841s

Internals

This tool is designed to process many large files as quickly as possible. It uses the split command to break large files into chunks, stores the chunks in /dev/shm, then computes their checksums using the md5sum command. The work runs in parallel across separate processes, with threads driving them.
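To show what a locally computed ETag looks like, here is a minimal sketch that hashes chunks in-process with `hashlib` instead of shelling out to split and md5sum, and runs sequentially rather than in parallel. It relies on the commonly observed S3 behaviour that a multipart upload's ETag is the MD5 of the concatenated per-part MD5 digests followed by `-<part count>`. The function name `s3_etag` and the 8 MiB default chunk size are assumptions; the chunk size must match the part size used for the upload, and objects encrypted with some server-side encryption modes may not have MD5-based ETags at all.

```python
import hashlib


def s3_etag(path, chunk_size=8 * 1024 * 1024):
    """Compute the ETag S3 would likely report for a file (hypothetical helper).

    Assumes files that fit in one chunk were uploaded in a single part and
    larger files were uploaded as multipart with parts of chunk_size bytes.
    """
    digests = []
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            # One raw MD5 digest per chunk, like running md5sum on each piece.
            digests.append(hashlib.md5(chunk).digest())

    if not digests:
        # Empty file: plain MD5 of zero bytes.
        return hashlib.md5(b"").hexdigest()
    if len(digests) == 1:
        # Single part: the ETag is the plain MD5 of the content.
        return digests[0].hex()
    # Multipart: MD5 of the concatenated part digests, plus "-<part count>".
    combined = hashlib.md5(b"".join(digests)).hexdigest()
    return f"{combined}-{len(digests)}"
```

Reading one chunk at a time keeps memory use bounded by `chunk_size`, which serves the same purpose as splitting files into pieces before hashing them.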

Installation

$ pip install s3-consistency-checker

Requirements

  • Python 3.x

  • boto3

  • coreutils: md5sum, split


Source Distribution

s3-consistency-checker-1.1.1.tar.gz (6.7 kB)

File hashes

Hashes for s3-consistency-checker-1.1.1.tar.gz
Algorithm      Hash digest
SHA256         ff5830381aa7d32b730e8e93079d70e299315beeca8b2cd574239539048ecbd0
MD5            3b56115dc76f406512aebfc3828fffba
BLAKE2b-256    6cfc750f3f8a6d669586ea586fc3b974d4fab4d542a54ad42ecd1ccc9f555b4f
