Backfill footprints and images for HiTIDE collections

Project description

hitide-backfill-tool

Tool to backfill thumbnail images and footprints for POCLOUD datasets

Some granules have been ingested without footprints or thumbnail images. This tool triggers part of the Cumulus workflow to generate footprints and images for the granules that need them.

What it does in a nutshell

  • You specify search parameters on the command line (collection, start_date, end_date, footprint, image, etc.)
  • Backfill-Tool searches CMR for matching granules
  • Backfill-Tool determines whether each granule needs a footprint or image
  • If footprint or image generation is needed, Backfill-Tool creates a Cumulus message and sends it to an AWS SNS topic
  • From there, another service triggers Forge/TIG and updates CMR with new images/footprints as needed
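
The decision step in the list above can be sketched roughly as follows. This is only an illustration, not the tool's actual code: the function name `plan_backfill` and the simplified granule records (with `has_footprint` and `has_image` keys) are hypothetical stand-ins for whatever the tool derives from the CMR search results.

```python
def plan_backfill(granule, want_footprint=True, want_image=True):
    """Return the list of products that still need to be generated.

    `granule` is a hypothetical, simplified record; the real tool works
    from CMR metadata, which has a different shape.
    """
    actions = []
    if want_footprint and not granule.get("has_footprint", False):
        actions.append("footprint")
    if want_image and not granule.get("has_image", False):
        actions.append("image")
    return actions


# Made-up granules purely for demonstration.
granules = [
    {"id": "G1", "has_footprint": True, "has_image": True},
    {"id": "G2", "has_footprint": False, "has_image": True},
    {"id": "G3"},
]

# Only granules with a non-empty plan would be turned into a Cumulus
# message and sent on to the SNS topic.
to_send = {g["id"]: plan_backfill(g) for g in granules if plan_backfill(g)}
```

In this sketch, a granule that already has both products yields an empty plan and is skipped, so no message would be sent for it.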

Prerequisites

  • Python > 3.10
  • poetry

failed_workflow.py

  • Script that scans failed workflow executions and reports the unique errors
  • Takes three arguments
    • workflow_arn: ARN of the AWS workflow
    • profile_name: name of the AWS credential profile to use
    • limit: number of most recent executions to scan; if not specified, all failed executions are scanned
  • ex: python failed_workflow.py --workflow_arn arn:aws:states:us-west-2:123456:stateMachine:podaac-services-ops-hitide-backfill-forge --profile_name service_ops --limit 1000
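
As a rough sketch of how such a script could be laid out, the block below parses the three arguments listed above and counts distinct error messages. It is not the script's real code: the helper names, the choice of which arguments are required, and the shape of the failure records (a dict with an `error` key) are assumptions, and the AWS calls that would fetch the failed executions are left out.

```python
import argparse
from collections import Counter


def build_parser():
    """Argument parser mirroring the three documented options."""
    parser = argparse.ArgumentParser(description="Scan failed workflow executions")
    # Marking these two as required is an assumption made for this sketch.
    parser.add_argument("--workflow_arn", required=True, help="ARN of the AWS workflow")
    parser.add_argument("--profile_name", required=True, help="AWS credential profile to use")
    # No limit (None) is taken to mean "scan all failed executions".
    parser.add_argument("--limit", type=int, default=None,
                        help="number of most recent executions to scan")
    return parser


def unique_errors(failures):
    """Count how often each distinct error message occurs."""
    return Counter(failure["error"] for failure in failures)


args = build_parser().parse_args([
    "--workflow_arn", "arn:aws:states:us-west-2:123456:stateMachine:example",
    "--profile_name", "service_ops",
    "--limit", "1000",
])

# Hypothetical failure records standing in for real execution results.
summary = unique_errors([{"error": "Timeout"}, {"error": "KeyError"}, {"error": "Timeout"}])
```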

replay.py

  • Script that moves messages from the dead letter queue back into the regular queue
  • Takes one argument
    • config: configuration file containing aws_profile, dlq_url, and sqs_url
  • ex: replay --config config.cfg
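
The replay idea can be illustrated with the sketch below. It assumes an INI-style config file with a `[replay]` section (the section name and the example URLs are hypothetical) and uses a small in-memory stand-in for the queue client so the example runs without AWS access. The stand-in mimics the `receive_message`, `send_message`, and `delete_message` calls of a boto3 SQS client; the actual script may be organized differently.

```python
import configparser

SAMPLE_CONFIG = """
[replay]
aws_profile = service_ops
dlq_url = https://sqs.us-west-2.amazonaws.com/123456/example-dlq
sqs_url = https://sqs.us-west-2.amazonaws.com/123456/example-queue
"""


def load_config(text):
    """Read the three documented settings from a hypothetical [replay] section."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    section = parser["replay"]
    return {key: section[key] for key in ("aws_profile", "dlq_url", "sqs_url")}


def replay(sqs, dlq_url, sqs_url):
    """Move messages from the DLQ to the main queue; return how many moved.

    Stops at the first empty receive. With a real queue an empty response
    does not guarantee the queue is empty, so a real script may retry.
    """
    moved = 0
    while True:
        response = sqs.receive_message(QueueUrl=dlq_url, MaxNumberOfMessages=10)
        messages = response.get("Messages", [])
        if not messages:
            return moved
        for message in messages:
            sqs.send_message(QueueUrl=sqs_url, MessageBody=message["Body"])
            sqs.delete_message(QueueUrl=dlq_url, ReceiptHandle=message["ReceiptHandle"])
            moved += 1


class InMemorySQS:
    """Demonstration-only stand-in for an SQS client."""

    def __init__(self, dlq_bodies):
        self.dlq = list(dlq_bodies)
        self.sent = []

    def receive_message(self, QueueUrl, MaxNumberOfMessages):
        batch = self.dlq[:MaxNumberOfMessages]
        if not batch:
            return {}
        return {"Messages": [{"Body": b, "ReceiptHandle": b} for b in batch]}

    def send_message(self, QueueUrl, MessageBody):
        self.sent.append(MessageBody)

    def delete_message(self, QueueUrl, ReceiptHandle):
        self.dlq.remove(ReceiptHandle)


config = load_config(SAMPLE_CONFIG)
client = InMemorySQS([f"msg-{i}" for i in range(12)])
moved = replay(client, config["dlq_url"], config["sqs_url"])
```

Deleting each message from the dead letter queue only after it has been re-sent means a failure partway through leaves the message in the DLQ rather than losing it.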

regression.py

  • Script that runs the backfill tool command on every collection that has a forge-tig configuration file
  • The script can be modified to exclude or test specific collections

memory_profiler.py

  • Script that profiles the memory use of lambdas; currently only TIG is profiled
  • Lambdas need to be modified to include the lambda request id in CloudWatch logs
  • Modify the script to name the lambda whose CloudWatch logs should be profiled
  • Modify the script to set the start and end of the time range in which the CloudWatch events were logged
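
One way to get per-request memory figures out of CloudWatch log text is sketched below. It assumes the profiler reads the standard Lambda `REPORT` lines, which normally include a request id and a `Max Memory Used` value. Whether memory_profiler.py actually works this way is not stated here, and the sample log lines are made up.

```python
import re

# Matches the request id and peak memory in a Lambda REPORT log line.
REPORT_PATTERN = re.compile(
    r"REPORT RequestId: (?P<request_id>\S+).*?Max Memory Used: (?P<used_mb>\d+) MB"
)


def max_memory_by_request(log_lines):
    """Map each request id to its reported peak memory in MB."""
    usage = {}
    for line in log_lines:
        match = REPORT_PATTERN.search(line)
        if match:
            usage[match["request_id"]] = int(match["used_mb"])
    return usage


# Made-up log lines in the usual Lambda format.
sample = [
    "START RequestId: abc-123 Version: $LATEST",
    "REPORT RequestId: abc-123\tDuration: 1520.31 ms\tBilled Duration: 1521 ms"
    "\tMemory Size: 512 MB\tMax Memory Used: 210 MB",
    "REPORT RequestId: def-456\tDuration: 980.02 ms\tBilled Duration: 981 ms"
    "\tMemory Size: 512 MB\tMax Memory Used: 305 MB",
]

usage = max_memory_by_request(sample)
peak = max(usage.values())
```

Keying the result by request id is what would let the figures be joined with application log lines that carry the same id.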

ECS facility

  • ECS template to start docker: ecs_cluster_instance_autoscaling_cf_template.yml.tmpl
  • ECS script to execute task: task-reaper.sh
  • All ECS related resources are specified in ecs_cluster.tf
  • The ECS cluster is made up of EC2 instances. Each EC2 instance is created with a key pair, whose name is set by the key_name variable in variables.tf. Currently, the following key pairs are specified for each environment:
    • backfill-tool-sit-cluster-keypair (SIT)
    • backfill-tool-uat-cluster-keypair (UAT)
    • backfill-tool-ops-cluster-keypair (OPS)

Download files

Source Distribution

podaac_hitide_backfill_tool-0.9.0.tar.gz (26.1 kB)

Uploaded Source

Built Distribution

podaac_hitide_backfill_tool-0.9.0-py3-none-any.whl (32.4 kB)

Uploaded Python 3

File details

Details for the file podaac_hitide_backfill_tool-0.9.0.tar.gz.

File metadata

  • Download URL: podaac_hitide_backfill_tool-0.9.0.tar.gz
  • Upload date:
  • Size: 26.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.8.1 CPython/3.10.14 Linux/6.5.0-1025-azure

File hashes

Hashes for podaac_hitide_backfill_tool-0.9.0.tar.gz
Algorithm Hash digest
SHA256 eb91f676712334afc11da9b1b728ae6d68c9fe9c1bf3c2b2aafcc2fb3cf04251
MD5 4ffed24d54f124c44453fb2a9608f960
BLAKE2b-256 81a6cb1414e14a0beb7097df734fcd286f58e6943bf8faca518711e08ed9437d

File details

Details for the file podaac_hitide_backfill_tool-0.9.0-py3-none-any.whl.

File hashes

Hashes for podaac_hitide_backfill_tool-0.9.0-py3-none-any.whl
Algorithm Hash digest
SHA256 b62ea2c670cb2e97ed2b7861b1c370160f68716f8b74764ac9560e76e2ac8754
MD5 603d4b9c5916462fe7656414b0d803e0
BLAKE2b-256 40516507632b62dcd913fee6005d48a26869fcc67ffcdb38194773336145e9e7
