Remove garbage items from DynamoDB tables
DynamoDB Garbage Collector
The DynamoDB Garbage Collector is a Python library that allows you to delete garbage items in DynamoDB tables.
Installation
To install the DynamoDB Garbage Collector, use pip:
$ pip install dynamodb-garbage-collector
Usage
The DynamoDB Garbage Collector currently provides a single function, purge_orphan_items, which deletes orphan items: items in a child table that reference a non-existent item in a parent table. If the optional timestamp attributes are provided, only orphan items older than a specified maximum time (by default, one hour ago) are deleted.
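To make the notion of an orphan item concrete, here is a minimal in-memory sketch. It does not call DynamoDB and is not the library's implementation; the helper name find_orphans and the sample items are hypothetical, and the attribute names are borrowed from the usage example in this README.

```python
def find_orphans(parent_items, child_items, key_attribute, child_reference_attribute):
    """Return child items whose reference matches no parent key (hypothetical helper)."""
    # Collect every key present in the parent table.
    parent_keys = {item[key_attribute] for item in parent_items}
    # A child is an orphan when its reference is missing or points to no parent.
    return [
        child for child in child_items
        if child.get(child_reference_attribute) not in parent_keys
    ]


parents = [{'id': 'p1'}, {'id': 'p2'}]
children = [
    {'id': 'c1', 'parentId': 'p1'},
    {'id': 'c2', 'parentId': 'p9'},  # 'p9' does not exist in parents
    {'id': 'c3', 'parentId': 'p2'},
]

orphans = find_orphans(parents, children, 'id', 'parentId')
print([item['id'] for item in orphans])
```

In this sketch a child with no reference attribute at all is also treated as an orphan; whether the library does the same is not stated here.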
To use purge_orphan_items, you need to provide the following parameters:
logger: a logger object to log messages during the execution of the function.
region: the AWS region where the parent and child tables are located.
parent_table: the name of the parent table.
child_table: the name of the child table.
key_attribute: the name of the key attribute for both tables.
child_reference_attribute: the name of the reference attribute in the child table.
max_workers (optional): the maximum number of workers to use for concurrent operations. If not provided, a default value of 100 will be used.
timestamp_attribute (optional): the name of the attribute that contains the timestamp of the records in the child table. If not provided, timestamp will not be taken into account when deleting items.
timestamp_format (optional): the format of the timestamp attribute. If not provided, timestamp will not be taken into account when deleting items.
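The timestamp parameters can be illustrated with a short sketch of the age check. It assumes that timestamp_format is a strptime-style format string (the format used in this README's example suggests so) and that the cutoff is one hour before the current time. The helper is_older_than and the sample values are hypothetical, not part of the library.

```python
from datetime import datetime, timedelta

# Format taken from the usage example in this README.
timestamp_format = '%Y-%m-%dT%H:%M:%S.%fZ'


def is_older_than(value, fmt, max_time):
    """Return True when the timestamp string parses to a time before max_time."""
    return datetime.strptime(value, fmt) < max_time


# A fixed "now" keeps the example deterministic; real code would use the current time.
now = datetime(2024, 5, 1, 12, 0, 0)
max_time = now - timedelta(hours=1)  # default cutoff: one hour ago

old_enough = is_older_than('2024-05-01T10:59:59.000Z', timestamp_format, max_time)
too_recent = is_older_than('2024-05-01T11:30:00.000Z', timestamp_format, max_time)
print(old_enough, too_recent)
```

Under these assumptions only the first item would be eligible for deletion. The second one is less than an hour old and would be kept even if it were an orphan.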
Here is an example of how to use the purge_orphan_items function:
import logging
from dynamodb_garbage_collector import purge_orphan_items
# Set up the logger
logging.basicConfig()
logger = logging.getLogger()
logger.setLevel(logging.INFO)
# Set the AWS region where the parent and child tables are located
region = 'eu-west-1'
# Set the names of the parent and child tables, and the key and reference attributes
parent_table = 'ParentTable'
child_table = 'ChildTable'
key_attribute = 'id'
child_reference_attribute = 'parentId'
# Set the maximum number of workers
max_workers = 50
# Set the name of the timestamp attribute and the timestamp format
timestamp_attribute = 'createdAt'
timestamp_format = '%Y-%m-%dT%H:%M:%S.%fZ'
# Call the function
purge_orphan_items(logger, region, parent_table, child_table, key_attribute, child_reference_attribute, max_workers, timestamp_attribute, timestamp_format)
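The effect of max_workers can be pictured with a bounded worker pool. This README does not describe the library's internals, so the sketch below is only an assumption about how concurrent deletes might be limited, using the standard library's ThreadPoolExecutor. The functions delete_item and delete_all are hypothetical stand-ins and make no AWS calls.

```python
from concurrent.futures import ThreadPoolExecutor


def delete_item(key):
    # Stand-in for a real delete request; it just returns the key it was given.
    return key


def delete_all(keys, max_workers=100):
    """Run delete_item over keys with at most max_workers threads at a time."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() yields results in the same order as the input keys.
        return list(pool.map(delete_item, keys))


deleted = delete_all(['c2', 'c5', 'c7'], max_workers=2)
print(deleted)
```

A lower max_workers value reduces the number of simultaneous requests, which may help stay within a table's provisioned throughput. A higher value can shorten the run when many items are deleted.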
Contributing
We welcome contributions to the DynamoDB Garbage Collector. To contribute, please fork the repository and create a pull request with your changes.
License
The DynamoDB Garbage Collector is released under the MIT License.
Hashes for dynamodb_garbage_collector-1.1.0.tar.gz
Algorithm | Hash digest
---|---
SHA256 | a3ce20022ad010bff493ea9d4c03624970bf1a4d5c986a7a8bac9aef7deeb680
MD5 | b0491f1c5045aa5c9e4e64c41f6b6d51
BLAKE2b-256 | 32393b0bb89e95f8bcc046a4c24225b3d8de442f6b65bf2ceb84fa319d14ea69
Hashes for dynamodb_garbage_collector-1.1.0-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 86ff869cdc9cefb3d0ab223b5f502fde876bacedf07b33b8746b944ad1080fba
MD5 | 16cfd177d230ce000b1730ca087f1200
BLAKE2b-256 | 9aad165eb2aaef943e418218a5aa8487363cc7aa30e7e6ce102e3e0e94aef61d