Remove garbage items from DynamoDB tables

DynamoDB Garbage Collector

The DynamoDB Garbage Collector is a Python library that allows you to delete garbage items in DynamoDB tables.

Installation

To install the DynamoDB Garbage Collector, use pip:

$ pip install dynamodb-garbage-collector

Usage

The DynamoDB Garbage Collector currently provides a single function, purge_orphan_items, which deletes orphan items: items in a child table that reference a non-existent item in a parent table. If the optional timestamp parameters are provided, only orphan items with a timestamp earlier than the maximum time (by default, one hour ago) are deleted.
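To make the notion of an orphan item concrete, the rule described above can be sketched on plain in-memory lists of dictionaries. This is an illustration only, not the library's implementation: the helper find_orphans and its max_time argument are hypothetical names, and the sketch assumes timestamps are stored as strings that datetime.strptime can parse.

```python
from datetime import datetime, timedelta, timezone


def find_orphans(parent_items, child_items, key_attribute,
                 child_reference_attribute, timestamp_attribute=None,
                 timestamp_format=None, max_time=None):
    """Return the child items whose reference matches no parent key.

    Hypothetical helper for illustration; it works on lists of dicts
    and never talks to DynamoDB.
    """
    parent_keys = {item[key_attribute] for item in parent_items}

    # Timestamps are only considered when both optional values are given.
    use_timestamp = timestamp_attribute is not None and timestamp_format is not None
    if use_timestamp and max_time is None:
        # Assumed default: one hour ago, as a naive UTC datetime.
        max_time = datetime.now(timezone.utc).replace(tzinfo=None) - timedelta(hours=1)

    orphans = []
    for child in child_items:
        if child[child_reference_attribute] in parent_keys:
            continue  # the referenced parent exists, so this is not an orphan
        if use_timestamp:
            created = datetime.strptime(child[timestamp_attribute], timestamp_format)
            if created >= max_time:
                continue  # too recent: the parent may not have been written yet
        orphans.append(child)
    return orphans
```

Skipping recent items is a guard against deleting a child whose parent is still being written; whether the library applies exactly this comparison is an assumption.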

To use purge_orphan_items, you need to provide the following parameters:

  • logger: a logger object to log messages during the execution of the function.
  • region: the AWS region where the parent and child tables are located.
  • parent_table: the name of the parent table.
  • child_table: the name of the child table.
  • key_attribute: the name of the key attribute for both tables.
  • child_reference_attribute: the name of the reference attribute in the child table.
  • max_workers (optional): the maximum number of workers to use for concurrent operations. If not provided, a default value of 100 will be used.
  • timestamp_attribute (optional): the name of the attribute that contains the timestamp of each item in the child table. If not provided, timestamps are not taken into account when deleting items.
  • timestamp_format (optional): the format of the timestamp attribute. If not provided, timestamps are not taken into account when deleting items.
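The value passed as timestamp_format looks like a Python strptime format string. Assuming that is how the library interprets it (the source does not say so explicitly), a format can be checked against a sample timestamp before running a purge. The sample value and the helper matches_format below are hypothetical.

```python
from datetime import datetime

# A format with a literal trailing "Z"; parsing yields a naive datetime.
timestamp_format = '%Y-%m-%dT%H:%M:%S.%fZ'
sample = '2023-03-15T10:30:00.123456Z'  # hypothetical stored value

parsed = datetime.strptime(sample, timestamp_format)


def matches_format(value, fmt):
    """Return True if value can be parsed with fmt (hypothetical helper)."""
    try:
        datetime.strptime(value, fmt)
    except ValueError:
        return False
    return True
```

Running matches_format over a handful of real timestamp values from the child table is a cheap way to catch a mismatched format before any item is deleted.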

Here is an example of how to use the purge_orphan_items function:

import logging
from dynamodb_garbage_collector import purge_orphan_items

# Set up the logger
logging.basicConfig()
logger = logging.getLogger()
logger.setLevel(logging.INFO)

# Set the AWS region where the parent and child tables are located
region = 'eu-west-1'

# Set the names of the parent and child tables, and the key and reference attributes
parent_table = 'ParentTable'
child_table = 'ChildTable'
key_attribute = 'id'
child_reference_attribute = 'parentId'

# Set the maximum number of workers
max_workers = 50

# Set the name of the timestamp attribute and the timestamp format
timestamp_attribute = 'createdAt'
timestamp_format = '%Y-%m-%dT%H:%M:%S.%fZ'

# Call the function (arguments are positional, in the order listed above)
purge_orphan_items(
    logger,
    region,
    parent_table,
    child_table,
    key_attribute,
    child_reference_attribute,
    max_workers,
    timestamp_attribute,
    timestamp_format,
)

Contributing

We welcome contributions to the DynamoDB Garbage Collector. To contribute, please fork the repository and create a pull request with your changes.

License

The DynamoDB Garbage Collector is released under the MIT License.
