aws-dynamodb-parallel-scan

Amazon DynamoDB parallel scan paginator for boto3.

Installation

Install from PyPI with pip

pip install aws-dynamodb-parallel-scan

or with your package manager of choice.

Usage

The library is a drop-in replacement for the boto3 DynamoDB Scan paginator. Example:

import aws_dynamodb_parallel_scan
import boto3

# Create DynamoDB client to use for scan operations
client = boto3.resource("dynamodb").meta.client

# Create the parallel scan paginator with the client
paginator = aws_dynamodb_parallel_scan.get_paginator(client)

# Scan "mytable" in five segments. Each segment is scanned in parallel.
for page in paginator.paginate(TableName="mytable", TotalSegments=5):
    items = page.get("Items", [])
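The loop above handles one page at a time. When all items are wanted in a single list, the pages can be flattened with a small helper. This is a minimal sketch: iter_items and collect_items are hypothetical names, not part of this library, and the paginator argument is assumed to be any object with a paginate() method that yields Scan API responses.

```python
from typing import Any, Dict, Iterable, Iterator, List


def iter_items(pages: Iterable[Dict[str, Any]]) -> Iterator[Dict[str, Any]]:
    """Yield every item from an iterable of Scan API response pages."""
    for page in pages:
        # A page may lack "Items" (for example with Select="COUNT"),
        # so fall back to an empty list.
        yield from page.get("Items", [])


def collect_items(paginator: Any, **scan_kwargs: Any) -> List[Dict[str, Any]]:
    """Call paginate() with the given Scan arguments and return all items.

    Hypothetical helper: it holds every item in memory, so it only suits
    tables that fit comfortably in RAM.
    """
    return list(iter_items(paginator.paginate(**scan_kwargs)))
```

With the paginator created above, collect_items(paginator, TableName="mytable", TotalSegments=5) would return the items of all five segments. The order of items across segments should not be relied upon.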

Notes:

  • paginate() accepts the same arguments as the boto3 DynamoDB.Client.scan() method. Arguments are passed to DynamoDB.Client.scan() as-is.

  • paginate() uses the value of the TotalSegments argument as the parallelism level. Each segment is scanned in parallel in a separate thread.

  • paginate() yields DynamoDB Scan API responses in the same format as the boto3 DynamoDB.Client.scan() method.

See boto3 DynamoDB.Client.scan() documentation for details on supported arguments and the response format.
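To illustrate what scanning each segment in a separate thread involves, here is a rough sketch built on the Scan API's Segment, TotalSegments, ExclusiveStartKey and LastEvaluatedKey fields. It is not this library's implementation: scan_segment and parallel_scan are hypothetical names, and unlike paginate() the sketch collects all pages in memory instead of yielding them as they arrive. The client argument is assumed to be a boto3 DynamoDB client, or anything with a compatible scan() method, that is safe to share between threads.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Dict, List


def scan_segment(
    client: Any, segment: int, total_segments: int, **scan_kwargs: Any
) -> List[Dict[str, Any]]:
    """Scan a single segment to completion and return its pages."""
    kwargs = dict(scan_kwargs, Segment=segment, TotalSegments=total_segments)
    pages = []
    while True:
        page = client.scan(**kwargs)
        pages.append(page)
        # DynamoDB omits LastEvaluatedKey once the segment is exhausted.
        last_key = page.get("LastEvaluatedKey")
        if last_key is None:
            return pages
        kwargs["ExclusiveStartKey"] = last_key


def parallel_scan(
    client: Any, total_segments: int, **scan_kwargs: Any
) -> List[Dict[str, Any]]:
    """Scan all segments concurrently, one worker thread per segment."""
    with ThreadPoolExecutor(max_workers=total_segments) as pool:
        futures = [
            pool.submit(scan_segment, client, segment, total_segments, **scan_kwargs)
            for segment in range(total_segments)
        ]
        pages: List[Dict[str, Any]] = []
        for future in futures:
            # result() re-raises any exception raised in the worker thread.
            pages.extend(future.result())
    return pages
```

Each worker follows its own LastEvaluatedKey chain, so segments progress independently and a slow segment does not block the others.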

CLI

This package also provides a CLI tool (aws-dynamodb-parallel-scan) to scan a DynamoDB table with parallel scan. The tool supports all non-deprecated arguments of the DynamoDB Scan API. Execute aws-dynamodb-parallel-scan -h for details.

Here are some examples:

# Scan "mytable" sequentially
$ aws-dynamodb-parallel-scan --table-name mytable
{"Items": [...], "Count": 10256, "ScannedCount": 10256, "ResponseMetadata": {}}
{"Items": [...], "Count": 12, "ScannedCount": 12, "ResponseMetadata": {}}

# Scan "mytable" in parallel (5 parallel segments)
$ aws-dynamodb-parallel-scan --table-name mytable --total-segments 5
{"Items": [...], "Count": 32, "ScannedCount": 32, "ResponseMetadata": {}}
{"Items": [...], "Count": 47, "ScannedCount": 47, "ResponseMetadata": {}}
{"Items": [...], "Count": 52, "ScannedCount": 52, "ResponseMetadata": {}}
{"Items": [...], "Count": 34, "ScannedCount": 34, "ResponseMetadata": {}}
{"Items": [...], "Count": 40, "ScannedCount": 40, "ResponseMetadata": {}}

# Scan "mytable" in parallel and return items, not Scan API responses (--output-items flag)
$ aws-dynamodb-parallel-scan --table-name mytable --total-segments 5 \
    --output-items
{"pk": {"S": "item1"}, "quantity": {"N": "99"}}
{"pk": {"S": "item24"}, "quantity": {"N": "25"}}
...

# Scan "mytable" in parallel, return items with native types, not DynamoDB types (--use-document-client flag)
$ aws-dynamodb-parallel-scan --table-name mytable --total-segments 5 \
    --output-items --use-document-client
{"pk": "item1", "quantity": 99}
{"pk": "item24", "quantity": 25}
...

# Scan "mytable" with a filter expression, return items
$ aws-dynamodb-parallel-scan --table-name mytable --total-segments 5 \
    --filter-expression "quantity < :value" \
    --expression-attribute-values '{":value": {"N": "5"}}' \
    --output-items
{"pk": {"S": "item142"}, "quantity": {"N": "4"}}
{"pk": {"S": "item874"}, "quantity": {"N": "1"}}

# Scan "mytable" with a filter expression using native types, return items
$ aws-dynamodb-parallel-scan --table-name mytable --total-segments 5 \
    --filter-expression "quantity < :value" \
    --expression-attribute-values '{":value": 5}' \
    --use-document-client --output-items
{"pk": "item142", "quantity": 4}
{"pk": "item874", "quantity": 1}
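The --output-items examples above print items in DynamoDB's typed JSON format when --use-document-client is not given. If that output is consumed from Python, it could be converted to native values along these lines. This is a simplified sketch that assumes the tool writes one JSON object per line, as the sample output suggests. from_dynamodb and parse_item_lines are hypothetical helpers, and only the S, N, BOOL, NULL, L and M type tags are handled. With boto3 installed, boto3.dynamodb.types.TypeDeserializer covers the full set of types.

```python
import json
from decimal import Decimal
from typing import Any, Dict, Iterable, Iterator


def from_dynamodb(value: Dict[str, Any]) -> Any:
    """Convert one DynamoDB-typed value such as {"N": "99"} to a Python value."""
    # A typed value is a single-entry mapping of type tag to raw value.
    ((type_tag, raw),) = value.items()
    if type_tag == "S":
        return raw
    if type_tag == "N":
        # Numbers arrive as strings; Decimal keeps their exact precision.
        return Decimal(raw)
    if type_tag == "BOOL":
        return raw
    if type_tag == "NULL":
        return None
    if type_tag == "L":
        return [from_dynamodb(element) for element in raw]
    if type_tag == "M":
        return {key: from_dynamodb(element) for key, element in raw.items()}
    raise ValueError(f"unsupported DynamoDB type tag: {type_tag}")


def parse_item_lines(lines: Iterable[str]) -> Iterator[Dict[str, Any]]:
    """Parse JSON-lines item output into dicts with native Python values."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        item = json.loads(line)
        yield {name: from_dynamodb(value) for name, value in item.items()}
```

For example, the CLI output could be piped into a script that calls parse_item_lines(sys.stdin). Numbers come back as Decimal here, whereas the --use-document-client examples show them as plain JSON numbers.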

Development

Requires Python 3 and uv. Useful commands:

# Run tests (integration test requires rights to create, delete and use DynamoDB tables)
make test

# Run linters
make -k lint

# Format code
make format

License

MIT

Credits
