Amazon DynamoDB Parallel Scan Paginator for boto3.
aws-dynamodb-parallel-scan
Installation
Install from PyPI with pip:
pip install aws-dynamodb-parallel-scan
or with the package manager of your choice.
Usage
The library is a drop-in replacement for the boto3 DynamoDB Scan paginator. Example:
import aws_dynamodb_parallel_scan
import boto3
# Create DynamoDB client to use for scan operations
client = boto3.resource("dynamodb").meta.client
# Create the parallel scan paginator with the client
paginator = aws_dynamodb_parallel_scan.get_paginator(client)
# Scan "mytable" in five segments. Each segment is scanned in parallel.
for page in paginator.paginate(TableName="mytable", TotalSegments=5):
    items = page.get("Items", [])
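Since each page is a regular Scan response, the items from all segments can be gathered with a small helper. The following is a minimal sketch: `collect_items` is a hypothetical name, not part of this library, and it accepts any iterable of Scan-style response dicts.

```python
from typing import Any, Dict, Iterable, List


def collect_items(pages: Iterable[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Flatten the "Items" lists of Scan-style response pages into one list.

    Hypothetical helper for illustration; not part of aws-dynamodb-parallel-scan.
    """
    items: List[Dict[str, Any]] = []
    for page in pages:
        # A page may lack "Items" (for example with Select="COUNT"),
        # so fall back to an empty list.
        items.extend(page.get("Items", []))
    return items
```

With the paginator from the example above, this could be called as `collect_items(paginator.paginate(TableName="mytable", TotalSegments=5))`. It keeps every item in memory, so it is best suited to small tables.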
Notes:
- paginate() accepts the same arguments as the boto3 DynamoDB.Client.scan() method. Arguments are passed to DynamoDB.Client.scan() as-is.
- paginate() uses the value of the TotalSegments argument as the parallelism level. Each segment is scanned in parallel in a separate thread.
- paginate() yields DynamoDB Scan API responses in the same format as the boto3 DynamoDB.Client.scan() method.
See boto3 DynamoDB.Client.scan() documentation for details on supported arguments and the response format.
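To illustrate the idea of scanning one segment per thread, here is a rough sketch of the general technique. It is not this library's implementation: `scan_segment` and `parallel_scan` are hypothetical names, and unlike `paginate()` the sketch buffers all pages before returning them. It relies only on the `Segment`, `TotalSegments`, `ExclusiveStartKey` and `LastEvaluatedKey` fields of the Scan API.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Dict, List

ScanFn = Callable[..., Dict[str, Any]]


def scan_segment(scan: ScanFn, segment: int, total_segments: int,
                 **scan_kwargs: Any) -> List[Dict[str, Any]]:
    """Return all pages of one segment by following LastEvaluatedKey."""
    kwargs = dict(scan_kwargs, Segment=segment, TotalSegments=total_segments)
    pages: List[Dict[str, Any]] = []
    while True:
        page = scan(**kwargs)
        pages.append(page)
        last_key = page.get("LastEvaluatedKey")
        if last_key is None:
            # No more pages in this segment.
            return pages
        # Continue this segment from where the previous page stopped.
        kwargs["ExclusiveStartKey"] = last_key


def parallel_scan(scan: ScanFn, total_segments: int,
                  **scan_kwargs: Any) -> List[Dict[str, Any]]:
    """Scan every segment in its own thread and return the pages in segment order."""
    with ThreadPoolExecutor(max_workers=total_segments) as pool:
        futures = [
            pool.submit(scan_segment, scan, segment, total_segments, **scan_kwargs)
            for segment in range(total_segments)
        ]
        return [page for future in futures for page in future.result()]
```

Assuming a boto3 DynamoDB client named `client`, this could be tried as `parallel_scan(client.scan, 5, TableName="mytable")`. Passing the scan function in as an argument keeps the sketch testable without AWS access.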
CLI
This package also provides a CLI tool (aws-dynamodb-parallel-scan) to scan a DynamoDB table with parallel scan. The tool supports all non-deprecated arguments of the DynamoDB Scan API. Execute aws-dynamodb-parallel-scan -h for details.
Here are some examples:
# Scan "mytable" sequentially
$ aws-dynamodb-parallel-scan --table-name mytable
{"Items": [...], "Count": 10256, "ScannedCount": 10256, "ResponseMetadata": {}}
{"Items": [...], "Count": 12, "ScannedCount": 12, "ResponseMetadata": {}}
# Scan "mytable" in parallel (5 parallel segments)
$ aws-dynamodb-parallel-scan --table-name mytable --total-segments 5
{"Items": [...], "Count":32, "ScannedCount":32, "ResponseMetadata": {}}
{"Items": [...], "Count":47, "ScannedCount":47, "ResponseMetadata": {}}
{"Items": [...], "Count":52, "ScannedCount":52, "ResponseMetadata": {}}
{"Items": [...], "Count":34, "ScannedCount":34, "ResponseMetadata": {}}
{"Items": [...], "Count":40, "ScannedCount":40, "ResponseMetadata": {}}
# Scan "mytable" in parallel and return items, not Scan API responses (--output-items flag)
$ aws-dynamodb-parallel-scan --table-name mytable --total-segments 5 \
--output-items
{"pk": {"S": "item1"}, "quantity": {"N": "99"}}
{"pk": {"S": "item24"}, "quantity": {"N": "25"}}
...
# Scan "mytable" in parallel, return items with native types, not DynamoDB types (--use-document-client flag)
$ aws-dynamodb-parallel-scan --table-name mytable --total-segments 5 \
--output-items --use-document-client
{"pk": "item1", "quantity": 99}
{"pk": "item24", "quantity": 25}
...
# Scan "mytable" with a filter expression, return items
$ aws-dynamodb-parallel-scan --table-name mytable --total-segments 5 \
--filter-expression "quantity < :value" \
--expression-attribute-values '{":value": {"N": "5"}}' \
--output-items
{"pk": {"S": "item142"}, "quantity": {"N": "4"}}
{"pk": {"S": "item874"}, "quantity": {"N": "1"}}
# Scan "mytable" with a filter expression using native types, return items
$ aws-dynamodb-parallel-scan --table-name mytable --total-segments 5 \
--filter-expression "quantity < :value" \
--expression-attribute-values '{":value": 5}' \
--use-document-client --output-items
{"pk": "item142", "quantity": 4}
{"pk": "item874", "quantity": 1}
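In the examples above, each output line appears to be a standalone JSON document, so the output can be processed line by line by another program. A minimal sketch, assuming the tool was run with --output-items --use-document-client so that every line is one item with native types; `read_json_lines` is a hypothetical helper, not part of this package.

```python
import json
from typing import Any, Dict, Iterable, Iterator


def read_json_lines(lines: Iterable[str]) -> Iterator[Dict[str, Any]]:
    """Parse one JSON document per non-empty line.

    Hypothetical helper for illustration; assumes line-delimited JSON input.
    """
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)
```

For example, piping the CLI output into a script that evaluates `sum(item.get("quantity", 0) for item in read_json_lines(sys.stdin))` would total the `quantity` attribute of the scanned items.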
Development
Requires Python 3 and uv. Useful commands:
# Run tests (integration test requires rights to create, delete and use DynamoDB tables)
make test
# Run linters
make -k lint
# Format code
make format
License
MIT
Credits
File details
Details for the file aws_dynamodb_parallel_scan-1.1.0.tar.gz.
File metadata
- Download URL: aws_dynamodb_parallel_scan-1.1.0.tar.gz
- Upload date:
- Size: 69.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.4.24
File hashes
Algorithm | Hash digest
---|---
SHA256 | 5113179aeb2bb476864ec98cd716f32950085d6d00c29a0608f8d3318106102e
MD5 | c3699e9c6a6a3dcb94c2fdd70b00abe0
BLAKE2b-256 | d708d6c59e0811afe382b6c0d4ec6c139d9ce909cbbc694ac3d8f438da8270bf
File details
Details for the file aws_dynamodb_parallel_scan-1.1.0-py3-none-any.whl.
File metadata
- Download URL: aws_dynamodb_parallel_scan-1.1.0-py3-none-any.whl
- Upload date:
- Size: 6.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.4.24
File hashes
Algorithm | Hash digest
---|---
SHA256 | ae5e08c84b76ab7822bbc05beabbdb0f56f5269fc9cd182863299c4b4b49919a
MD5 | b5df03f51a363346383b9e0f80ea660d
BLAKE2b-256 | 78523baa49cfe1fbb603fcf5e3d3ff69a48d33b5ab81c45dfa37c57a835e1458