
LabCas Workflow

Run workflows for LabCas

Depending on your role, there are several ways to run a LabCas workflow:

  • Developers: local run, natively on your OS
  • Integrators: AWS Managed Workflows for Apache Airflow (MWAA), run locally
  • System Administrators: deployed and configured on AWS
  • End users: using the AWS deployment

Developers

The workflow tasks run independently of Airflow. TODO: integrate with the Airflow Python API.
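Because each task is a plain callable, it can be exercised with no Airflow scheduler at all. A minimal sketch of that pattern (the function and file names here are hypothetical, not the actual LabCas tasks):

```python
# Hypothetical sketch: a workflow task written as a plain function,
# so it runs the same with or without Airflow orchestration.
def process_product(name: str) -> str:
    """Pretend processing step: derive an output name from an input."""
    return name + ".processed"

def run_pipeline(names):
    # Apply the task to each input sequentially, as a scheduler would.
    return [process_product(n) for n in names]

print(run_pipeline(["a.dat", "b.dat"]))  # → ['a.dat.processed', 'b.dat.processed']
```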

Install

With Python 3.11, preferably in a virtual environment:

pip install -e '.[dev]'

Set AWS connection

./aws-login.darwin.amd64
export AWS_PROFILE=saml-pub

Run/Test the client

python src/labcas/workflow/manager/main.py

Deploy the package on PyPI

Upgrade the version in the file src/labcas/workflow/VERSION.txt.

Publish the package on PyPI:

pip install build
pip install twine
rm dist/*
python -m build
twine upload dist/*

Integrators

Build the Dask worker image

Update the labcas.workflow dependency version as needed in docker/Dockerfile, then:

docker build -f docker/Dockerfile . -t labcas/workflow

Create a managed Airflow Docker image to run locally

Clone the repository https://github.com/aws/aws-mwaa-local-runner, then:

./mwaa-local-env build-image

Then from your local labcas_workflow repository:

cd mwaa

As needed, update the requirements in the requirements directory and the DAGs in the dags directory.
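For instance, the requirements file can pin the labcas.workflow package published above (the exact version here is illustrative):

```
labcas.workflow==0.1.10
```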

Update the AWS credentials

aws-login.darwin.amd64
cp -r ~/.aws .

Launch the server

docker compose -f docker-compose-local.yml up

Test the server at http://localhost:8080 (login: admin / test)

Stop

Ctrl-C

Stop and re-initialize local volumes

docker compose -f ./docker/docker-compose-local.yml down -v

See the console on http://localhost:8080, admin/test

Test the requirements.txt files

./mwaa-local-env test-requirements

Debug the workflow import

docker container ls

Pick the container id of image "amazon/mwaa-local:2_10_3", for example '54706271b7fc':

Then open a bash interpreter in the docker container:

docker exec -it 54706271b7fc bash

And, in the bash prompt:

cd dags
python3 -c "import nebraska"

Start the scheduler:

docker network create dask
docker run --network dask -p 8787:8787 -p 8786:8786 labcas/workflow scheduler

Start one worker (no host port mapping is needed; the worker reaches the scheduler over the dask network, and host port 8786 is already taken by the scheduler):

docker run --network dask labcas/workflow worker

Start the client, as in the following section.
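The client (here, main.py) is a script that connects a Dask Client to the scheduler and submits work. A minimal sketch of that pattern, assuming dask.distributed is installed and the scheduler started above is reachable on localhost:8786 (the square function is illustrative, not a LabCas task):

```python
def square(x):
    # Trivial task used to verify the cluster round-trip.
    return x * x

def run(address="tcp://localhost:8786"):
    # Imported here so the module still loads where dask is absent.
    from dask.distributed import Client
    client = Client(address)
    try:
        # Fan four tasks out to the workers and gather the results.
        return client.gather(client.map(square, range(4)))
    finally:
        client.close()
```

Call run() once the scheduler and at least one worker are up; it should return [0, 1, 4, 9].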

With dask on ECS

Deploy the image created in the previous section on ECR

Have an S3 bucket labcas-infra for the Terraform state.
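A minimal backend block for that state bucket might look like this (the key and region values are illustrative, not the project's actual settings):

```
terraform {
  backend "s3" {
    bucket = "labcas-infra"
    key    = "labcas-workflow/terraform.tfstate"
    region = "us-west-2"
  }
}
```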

Other prerequisites are:

  • a VPC
  • subnets
  • a security group allowing incoming requests to ports 8786 and 8787 from where the client runs (at JPL, on EC2, or on Airflow)
  • a task role allowed to write to CloudWatch
  • a task execution role that can pull images from ECR, with the standard ECS task execution role policy "AmazonECSTaskExecutionRolePolicy"

Deploy the ECS cluster with the following terraform command:

cd terraform
terraform init
terraform apply \
    -var consortium="edrn" \
    -var venue="dev" \
    -var aws_fg_image=<uri of the docker image deployed on ECR> \
    -var aws_fg_subnets=<private subnets of the AWS account> \
    -var aws_fg_vpc=<vpc of the AWS account> \
    -var aws_fg_security_groups=<security group> \
    -var ecs_task_role=<arn of a task role> \
    -var ecs_task_execution_role=<arn of task execution role>

Run

Set your local AWS credentials to access the data

./aws-login.darwin.amd64
export AWS_PROFILE=saml-pub

Start the dask cluster

Run the processing

python ./src/labcas/workflow/manager/main.py

Publish the package on PyPI as described in the Developers section above.

