
Helper to simplify concurrent access to object scanning in AWS S3 buckets.

Project description

s3workers



Features

S3workers provides faster list and delete operations on S3 buckets by opening simultaneous connections, each issuing a distinct shared-prefix query. Effectively, this splits the query space into 36 independent queries (26 alphabetic and 10 numeric prefixes). For example, a request to list all objects under the myfancy/ prefix results in concurrent list queries to S3 for everything from myfancy/a... through myfancy/z... and everything from myfancy/0... through myfancy/9..., all at the same time, with the results reported and collated locally.
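The fan-out described above can be sketched roughly as follows. The helper names here are hypothetical (s3workers' actual internals may differ), and list_one_prefix stands in for a real S3 listing call such as boto's bucket.list(prefix=...):

```python
# Sketch of splitting a key prefix into 36 shards and listing them
# concurrently. Illustrative only; not s3workers' actual code.
import string
from concurrent.futures import ThreadPoolExecutor

def shard_prefixes(prefix):
    """Split a key prefix into 36 shards: a-z plus 0-9."""
    return [prefix + c for c in string.ascii_lowercase + string.digits]

def concurrent_list(prefix, list_one_prefix, max_workers=36):
    """Fan one listing out over the 36 shard prefixes and collate results.

    `list_one_prefix` is a callable taking a shard prefix and returning
    the keys found under it (in real use, an S3 list request).
    """
    results = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for keys in pool.map(list_one_prefix, shard_prefixes(prefix)):
            results.extend(keys)
    return results
```

Because each shard query is independent, the workers need no coordination beyond collating their results.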

Selection

By default, s3workers simply lists (or deletes) all objects found at the requested prefix. Often, however, it is useful to restrict the output to objects matching certain criteria. The --select option evaluates a match expression, using any normal Python operators or builtins, against one or more of the following variables provided to the selector for each object found:

  • name: The full S3 key name, everything except the bucket name (string).

  • size: The number of bytes as used by the S3 object (integer).

  • md5: The MD5 hash of the S3 object (string).

  • last_modified: The timestamp indicating the last time the S3 object was changed (string).
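One way such an expression could be evaluated, shown purely as an illustration (not necessarily s3workers' exact mechanism), is to run it as a Python expression with the variables above in scope:

```python
# Illustrative sketch: evaluate a --select expression as a normal Python
# expression against the per-object variables listed above.
def matches(expression, name, size, md5, last_modified):
    """Return True if the object described by the variables satisfies
    the selection expression."""
    context = {'name': name, 'size': size,
               'md5': md5, 'last_modified': last_modified}
    return bool(eval(expression, {}, context))
```

With a helper like this, a selector such as "size > 1024 and name.endswith('.log')" would be tested once per object found.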

Reduction

In cases where some kind of aggregation is desired, s3workers can execute reduction logic against an accumulator value, for example, to sum the sizes of all selected S3 objects, or to group sizes by MD5 value. See the usage output for examples. In all cases, the same variables provided during selection are also available when reducing.
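As an illustration of the accumulator idea (the helpers below are hypothetical, not s3workers' API), a reduction folds each selected object into an accumulator:

```python
# Illustrative sketch of reduction: fold each selected object into an
# accumulator value. Objects are represented as plain dicts here.
def reduce_objects(objects, step, initial):
    """Apply `step(acc, obj)` to every object, threading the accumulator."""
    acc = initial
    for obj in objects:
        acc = step(acc, obj)
    return acc

def group_size_by_md5(acc, obj):
    """Example step function: accumulate total size per MD5 value."""
    acc[obj['md5']] = acc.get(obj['md5'], 0) + obj['size']
    return acc

# Summing all sizes is the simplest case:
#   total = reduce_objects(objs, lambda acc, o: acc + o['size'], 0)
```

Grouping by MD5 in this way makes it easy to spot duplicate objects and how much space they consume.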

Credits

This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.

History

0.3.0 (2017-01-01)

  • Refactor/reorg code for usability/readability; add docs; add tests.

0.2.0 (2016-12-30)

  • Minor fixes, adding docs, using common logging options.

0.1.0 (2016-12-28)

  • First release on PyPI.

Download files

Download the file for your platform.

Source Distribution

s3workers-0.3.0.tar.gz (17.9 kB)

Uploaded: Source

Built Distribution


s3workers-0.3.0-py2.py3-none-any.whl (12.0 kB)

Uploaded: Python 2, Python 3

File details

Details for the file s3workers-0.3.0.tar.gz.

File metadata

  • Download URL: s3workers-0.3.0.tar.gz
  • Upload date:
  • Size: 17.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No

File hashes

Hashes for s3workers-0.3.0.tar.gz
Algorithm Hash digest
SHA256 7c53c0849986fdcc3893f01c64d4d07763ef54850e905853088915b8c3afaa5c
MD5 9a461f94ed5ca5a5397d208824af5029
BLAKE2b-256 e15070a7920c74c00782f6c68e62d56089015e833d8b836eae22fa58d0d99f0b


File details

Details for the file s3workers-0.3.0-py2.py3-none-any.whl.

File metadata

File hashes

Hashes for s3workers-0.3.0-py2.py3-none-any.whl
Algorithm Hash digest
SHA256 8714c5fc43139a6decbd1d1b0d6bb7e2052d300033a7bc7923c5d55a95d1d5ff
MD5 71542483786b4f8608563509cc8d79e4
BLAKE2b-256 c5cfed0927020592895ae9cfe26e0bed0b1d00ae01024b1f5f0ad83dd27bbec4

