Skip to main content

Remove garbage items from DynamoDB tables

Project description

DynamoDB Garbage Collector

Version

The DynamoDB Garbage Collector is a Python library that allows you to delete garbage items in DynamoDB tables.

Table of Contents

Installation

To install the DynamoDB Garbage Collector, use pip:

$ pip install dynamodb-garbage-collector

Usage

The DynamoDB Garbage Collector currently provides a single function called purge_orphan_items, which allows you to delete orphan items in a child table that reference a non-existent item in a parent table. If optional timestamp attributes are provided only will be delete orphan items earlier than a specified maximum time (by default, one hour ago).

To use purge_orphan_items, you need to provide the following parameters:

  • logger: a logger object to log messages during the execution of the function.
  • region: the AWS region where the parent and child tables are located.
  • parent_table: the name of the parent table.
  • child_table: the name of the child table.
  • key_attribute: the name of the key attribute for both tables.
  • child_reference_attribute: the name of the reference attribute in the child table.
  • max_workers (optional): the maximum number of workers to use for concurrent operations. If not provided, a default value of 100 will be used.
  • timestamp_attribute (optional): the name of the attribute that contains the timestamp of the records in the child table. If not provided, timestamp will not be taken into account when deleting items.
  • timestamp_format (optional): the format of the timestamp attribute. If not provided, timestamp will not be taken into account when deleting items.

Here is an example of how to use the purge_orphan_items function:

import logging
from dynamodb_garbage_collector import purge_orphan_items

# Set up the logger
logging.basicConfig()
logger = logging.getLogger()
logger.setLevel(logging.INFO)

# Set the AWS region where the parent and child tables are located
region = 'eu-west-1'

# Set the names of the parent and child tables, and the key and reference attributes
parent_table = 'ParentTable'
child_table = 'ChildTable'
key_attribute = 'id'
child_reference_attribute = 'parentId'

# Set the maximum number of workers
max_workers = 50

# Set the name of the timestamp attribute and the timestamp format
timestamp_attribute = 'createdAt'
timestamp_format = '%Y-%m-%dT%H:%M:%S.%fZ'

# Call the function
purge_orphan_items(logger, region, parent_table, child_table, key_attribute, child_reference_attribute, max_workers, timestamp_attribute, timestamp_format)

Contributing

We welcome contributions to the DynamoDB Garbage Collector. To contribute, please fork the repository and create a pull request with your changes.

License

The DynamoDB Garbage Collector is released under the MIT License.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

dynamodb_garbage_collector-1.1.0.tar.gz (5.3 kB view hashes)

Uploaded Source

Built Distribution

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page