A library to calculate, match, and return details about S3 Object ETags.

S3ID

pip3 install s3id

S3 object ETags are calculated differently depending on how the object was uploaded, and they come in more than one format. If an S3 object is downloaded and subsequently uploaded to a new path, its ETag value is not guaranteed to remain the same unless its upload strategy remains the same.

S3ID is a simple library that calculates an ETag from a local file over a series of common S3 partition sizes to determine if a match exists with a corresponding S3 object ETag.

If a match is found, S3ID returns information about how to upload the equivalent local file to an alternate S3 location so that the ETag value stays consistent.
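The calculation described above can be sketched with the standard library alone. This is not S3ID's own code: `sketch_etag` is a hypothetical helper, and it assumes the commonly observed S3 behaviour for objects whose ETag is MD5-based (a single-part ETag is the MD5 hex digest of the content, and a multi-part ETag is the MD5 of the concatenated per-part MD5 digests followed by the part count). Objects encrypted with some server-side encryption modes may not follow this scheme.

```python
import hashlib
from pathlib import Path
from typing import Optional


def sketch_etag(path: Path, partition_in_bytes: Optional[int] = None) -> str:
    """Hypothetical helper: compute an S3-style ETag for a local file."""
    if partition_in_bytes is None:
        # Single-part: MD5 hex digest of the whole file, read in 1 MiB blocks.
        md5 = hashlib.md5()
        with path.open("rb") as handle:
            for block in iter(lambda: handle.read(1024 * 1024), b""):
                md5.update(block)
        return md5.hexdigest()

    # Multi-part: MD5 each part, then MD5 the concatenated raw digests
    # and append "-<number_of_parts>".
    part_digests = []
    with path.open("rb") as handle:
        for part in iter(lambda: handle.read(partition_in_bytes), b""):
            part_digests.append(hashlib.md5(part).digest())
    composite = hashlib.md5(b"".join(part_digests)).hexdigest()
    return f"{composite}-{len(part_digests)}"
```

Calling `sketch_etag(path)` gives the single-part form, while `sketch_etag(path, 8388608)` gives the composite form for 8 MB partitions.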

Usage

The following examples compare the ETag of a remote S3 object (1st parameter) against ETags calculated from the data at a local file path (2nd parameter).

Each match displays various information about how that S3 object was previously uploaded.

Single-Part Match

>>> from s3id import S3ID
>>> from pathlib import Path
>>> S3ID.unpack("f1c9645dbc14efddc7d8a322685f26eb", Path("/tmp/test_10mb.txt"))
{
	'match': True, 
	'signature': '"f1c9645dbc14efddc7d8a322685f26eb"',
	'upload_strategy': 'single_part'
}

The corresponding remote S3 object was uploaded as a single object.

Multi-Part Match

>>> from s3id import S3ID
>>> from pathlib import Path
>>> S3ID.unpack("669fdad9e309b552f1e9cf7b489c1f73-2", Path("/tmp/test_10mb.txt"))
{
	'match': True, 
	'signature': '"669fdad9e309b552f1e9cf7b489c1f73-2"',
	'partition_in_bytes': 8388608,
	'upload_strategy': 'multi_part'
}

The corresponding remote S3 object was uploaded as a multi-part object in 2 parts, using a partition size of 8388608 bytes (8MB).
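The part count in the ETag suffix follows from the file size and the partition size. As a rough illustration, assuming `/tmp/test_10mb.txt` is exactly 10 MiB (the size is not stated here) and using a hypothetical helper `part_layout`:

```python
import math
from typing import Tuple


def part_layout(file_size: int, partition_in_bytes: int) -> Tuple[int, int]:
    """Return (number_of_parts, size_of_last_part); assumes file_size > 0."""
    parts = math.ceil(file_size / partition_in_bytes)
    # Every part is full-sized except possibly the last one.
    last_part = file_size - (parts - 1) * partition_in_bytes
    return parts, last_part


# Assumption: the example file is exactly 10 MiB (10485760 bytes).
print(part_layout(10 * 1024 * 1024, 8388608))  # (2, 2097152)
```

Under that assumption the object consists of one 8 MB part and one 2 MB part, which is why the ETag ends in -2.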

Mismatch

>>> from s3id import S3ID
>>> from pathlib import Path
>>> S3ID.unpack("669fdad9e309b552f1e9cf7b489c1f73-3", Path("/tmp/test_10mb.txt"))
{
	'match': False
}

No ETag calculated from the local file matched the ETag provided.

Parameters:

etag

  • Type: str
  • Required: True
  • Description: An ETag string for an S3 Object in various formats:
    • Single-part ETag (without a hyphen)
    • Multi-part ETag (with a hyphen), a composite ETag calculated from N chunks, in the format <hash>-<number_of_chunks>

path

  • Type: Path
  • Required: True
  • Description: A pathlib.Path object pointing to a local file (i.e. Path(<path/to/local/file>)) to compare against the etag parameter

threshold_in_bytes

  • Type: int
  • Required: False
  • Default: 5242880 (5MB)
  • Description: When the DEFAULT strategy is used, this value is the threshold that determines whether a SINGLE_PART ETag or a MULTI_PART composite ETag is created.

partition_set_in_bytes

  • Type: Set[int]
  • Required: False
  • Default: {5242880, 8388608, 15728640, 16777216} (5MB, 8MB, 15MB, 16MB)
  • Description: A set of chunk sizes (in bytes) to iterate over (i.e. {1*1024*1024, 2*1024*1024, ...}). For each value provided, an ETag is calculated from path and checked against the etag parameter.
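The iteration over partition sizes can be pictured with the sketch below. It is an illustration, not the library's implementation: `sketch_match` and `multipart_etag` are hypothetical names, the MD5-based ETag scheme is an assumption, `threshold_in_bytes` is ignored, and the whole file is read into memory for brevity. The returned dictionaries mirror the shape shown in the usage examples.

```python
import hashlib
from pathlib import Path
from typing import Dict, Iterable, Union

# Default candidate sizes as documented above: 5MB, 8MB, 15MB, 16MB.
DEFAULT_PARTITIONS = frozenset({5242880, 8388608, 15728640, 16777216})


def multipart_etag(data: bytes, partition_in_bytes: int) -> str:
    """Composite ETag: MD5 of the concatenated per-part MD5 digests."""
    digests = [
        hashlib.md5(data[i:i + partition_in_bytes]).digest()
        for i in range(0, len(data), partition_in_bytes)
    ]
    return f"{hashlib.md5(b''.join(digests)).hexdigest()}-{len(digests)}"


def sketch_match(
    etag: str,
    path: Path,
    partition_set_in_bytes: Iterable[int] = DEFAULT_PARTITIONS,
) -> Dict[str, Union[int, str]]:
    """Hypothetical stand-in for the matching step."""
    data = path.read_bytes()  # fine for a sketch; large files should be streamed
    etag = etag.strip('"')
    if "-" not in etag:
        # No hyphen: compare against the plain MD5 of the whole file.
        if hashlib.md5(data).hexdigest() == etag:
            return {"match": True, "signature": f'"{etag}"',
                    "upload_strategy": "single_part"}
        return {"match": False}
    # Hyphen present: try each candidate partition size in turn.
    for size in sorted(partition_set_in_bytes):
        if multipart_etag(data, size) == etag:
            return {"match": True, "signature": f'"{etag}"',
                    "partition_in_bytes": size,
                    "upload_strategy": "multi_part"}
    return {"match": False}
```

If the remote object was uploaded with a partition size outside the candidate set, no candidate will match, so passing a custom set is the way to widen the search.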

Return Value:

  • Type: Dict[str, Union[int, str]]
