A library to calculate, match, and return details about S3 Object ETags.
S3ID
pip3 install s3id
S3 Object ETags are calculated in different ways and appear in different formats, depending on how the object was uploaded. If an S3 object is downloaded and subsequently uploaded to a new path, its ETag value is not guaranteed to remain consistent unless its upload strategy remains consistent.
S3ID is a simple library that calculates an ETag from a local file over a series of common S3 partition sizes to determine if a match exists with a corresponding S3 object ETag.
If a match is found, S3ID returns information about how to upload the equivalent local file to an alternate S3 location so that the ETag value stays consistent.
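As a rough illustration of the calculation described above, the sketch below computes both ETag formats with the standard library. It follows the commonly observed S3 scheme (a plain MD5 for single-part uploads; the MD5 of the concatenated per-part MD5 digests plus a part count for multi-part uploads), which generally applies to objects that are not encrypted with SSE-KMS or SSE-C. The function names are hypothetical and this is not necessarily how S3ID implements it.

```python
import hashlib
from pathlib import Path


def single_part_etag(path: Path) -> str:
    """MD5 hex digest of the whole file (assumed single-part ETag format)."""
    md5 = hashlib.md5()
    with path.open("rb") as fh:
        # Read in 1 MiB blocks so large files are not loaded at once.
        for block in iter(lambda: fh.read(1024 * 1024), b""):
            md5.update(block)
    return md5.hexdigest()


def multi_part_etag(path: Path, partition_in_bytes: int) -> str:
    """Composite ETag in the form <hash>-<number_of_chunks>."""
    digests = []
    with path.open("rb") as fh:
        # Each part is hashed on its own; the last part may be shorter.
        for part in iter(lambda: fh.read(partition_in_bytes), b""):
            digests.append(hashlib.md5(part).digest())
    # The composite hash is the MD5 of the concatenated binary digests.
    combined = hashlib.md5(b"".join(digests)).hexdigest()
    return f"{combined}-{len(digests)}"
```

Comparing the output of these two functions against a remote ETag, once per candidate partition size, is the basic idea behind finding a match.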
Usage
The following examples compare a remote S3 object ETag (1st parameter) against the data at a local file path (2nd parameter) and report whether a corresponding ETag match was found.
Each match displays various information about how that S3 object was previously uploaded.
Single-Part Match
>>> from s3id import S3ID
>>> from pathlib import Path
>>> S3ID.unpack("f1c9645dbc14efddc7d8a322685f26eb", Path("/tmp/test_10mb.txt"))
{
'match': True,
'signature': '"f1c9645dbc14efddc7d8a322685f26eb"',
'upload_strategy': 'single_part'
}
The corresponding remote S3 object was uploaded as a single object.
Multi-Part Match
>>> from s3id import S3ID
>>> from pathlib import Path
>>> S3ID.unpack("669fdad9e309b552f1e9cf7b489c1f73-2", Path("/tmp/test_10mb.txt"))
{
'match': True,
'signature': '"669fdad9e309b552f1e9cf7b489c1f73-2"',
'partition_in_bytes': 8388608,
'upload_strategy': 'multi_part'
}
The corresponding remote S3 object was uploaded as a multi-part object in 2 parts, using a partition size of 8388608 bytes (8MB).
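One way a result like this might be turned into upload settings is sketched below. The helper transfer_settings is hypothetical and not part of S3ID. The keys it returns, multipart_threshold and multipart_chunksize, are the names of boto3's TransferConfig parameters, so the dictionary could be passed on as TransferConfig(**settings). Whether the re-uploaded object ends up with the same ETag also depends on factors outside this sketch, such as the bucket's encryption settings.

```python
from typing import Dict, Union


def transfer_settings(result: Dict[str, Union[int, str]], file_size: int) -> Dict[str, int]:
    """Map a match result to multipart settings (hypothetical helper)."""
    if not result.get("match"):
        raise ValueError("no ETag match, so the upload strategy is unknown")

    if result["upload_strategy"] == "multi_part":
        chunk = int(result["partition_in_bytes"])
        # Keep the threshold at or below the file size so that a
        # multi-part upload is actually triggered for this file.
        return {
            "multipart_threshold": min(chunk, file_size),
            "multipart_chunksize": chunk,
        }

    # Single-part: put the threshold just above the file size so the
    # file is sent in one request.
    return {"multipart_threshold": file_size + 1}


# Example using the multi-part result shown above; the 10 MiB file size
# is an assumption about /tmp/test_10mb.txt.
settings = transfer_settings(
    {
        "match": True,
        "signature": '"669fdad9e309b552f1e9cf7b489c1f73-2"',
        "partition_in_bytes": 8388608,
        "upload_strategy": "multi_part",
    },
    file_size=10 * 1024 * 1024,
)
print(settings)
```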
Mismatch
>>> from s3id import S3ID
>>> from pathlib import Path
>>> S3ID.unpack("669fdad9e309b552f1e9cf7b489c1f73-3", Path("/tmp/test_10mb.txt"))
{
'match': False
}
No corresponding ETag could be calculated from the file at the local path provided.
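The part count in the ETag suffix already hints at why this lookup cannot succeed. The arithmetic below assumes the test file is exactly 10 MiB (the source does not state its size) and uses the default partition sizes listed under Parameters. Whether S3ID performs a check like this internally is not documented; this is only a sketch.

```python
import math

FILE_SIZE = 10 * 1024 * 1024  # assumed size of /tmp/test_10mb.txt
DEFAULT_PARTITIONS = {5242880, 8388608, 15728640, 16777216}


def part_count(file_size: int, partition_in_bytes: int) -> int:
    """Number of parts a file splits into for a given partition size."""
    return max(1, math.ceil(file_size / partition_in_bytes))


counts = {size: part_count(FILE_SIZE, size) for size in sorted(DEFAULT_PARTITIONS)}
print(counts)
# {5242880: 2, 8388608: 2, 15728640: 1, 16777216: 1}
```

None of the default partition sizes splits a 10 MiB file into 3 parts, so an ETag ending in -3 cannot match under these assumptions.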
Parameters:
etag
- Type: str
- Required: True
- Description: An ETag string for an S3 Object, in one of two formats:
  - Single-part ETag (without a hyphen)
  - Multi-part ETag (with a hyphen), a composite ETag calculated over N chunks, in the format <hash>-<number_of_chunks>
path
- Type: Path
- Required: True
- Description: A pathlib.Path object for a local file (i.e. Path(<path/to/local/file>)) to compare against the etag parameter
threshold_in_bytes
- Type: int
- Required: False
- Default: 5242880 (5MB)
- Description: When choosing the DEFAULT strategy, this value determines when to create a SINGLE_PART ETag or a MULTI_PART composite ETag.
partition_set_in_bytes
- Type: Set[int]
- Required: False
- Default: {5242880, 8388608, 15728640, 16777216} (5MB, 8MB, 15MB, 16MB)
- Description: A set of chunk sizes (in bytes) to iterate over (i.e. {1*1024*1024, 2*1024*1024, ...}). For each value provided, an ETag is calculated against path and checked against the etag parameter.
Return Value:
- Type: Dict[str, Union[int, str]]
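To make the roles of etag, path, and partition_set_in_bytes concrete, here is a minimal sketch of the matching loop these parameters describe. The function match_etag is hypothetical and is not the S3ID implementation. It leaves out threshold_in_bytes, reads the whole file into memory for brevity, and returns dictionaries shaped like the example outputs above.

```python
import hashlib
from pathlib import Path
from typing import Dict, FrozenSet, Union

DEFAULT_PARTITIONS = frozenset({5242880, 8388608, 15728640, 16777216})


def match_etag(
    etag: str,
    path: Path,
    partition_set_in_bytes: FrozenSet[int] = DEFAULT_PARTITIONS,
) -> Dict[str, Union[int, str]]:
    """Try to reproduce `etag` from the file at `path` (illustrative only)."""
    etag = etag.strip('"')  # S3 often returns ETags wrapped in quotes
    data = path.read_bytes()

    if "-" not in etag:
        # Single-part format: a plain MD5 of the whole file.
        if hashlib.md5(data).hexdigest() == etag:
            return {"match": True, "signature": f'"{etag}"',
                    "upload_strategy": "single_part"}
        return {"match": False}

    # Multi-part format: try each candidate partition size in turn.
    for size in sorted(partition_set_in_bytes):
        parts = [data[i:i + size] for i in range(0, len(data), size)]
        joined = b"".join(hashlib.md5(p).digest() for p in parts)
        candidate = f"{hashlib.md5(joined).hexdigest()}-{len(parts)}"
        if candidate == etag:
            return {"match": True, "signature": f'"{etag}"',
                    "partition_in_bytes": size,
                    "upload_strategy": "multi_part"}
    return {"match": False}
```

Passing a custom set, for example {1*1024*1024, 2*1024*1024}, widens or narrows the search in the same way the partition_set_in_bytes parameter is described as doing.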