pypyr pipeline runner AWS plugin. Steps for ECS, S3, Beanstalk.

pypyr

pronounce how you like, but I generally say piper as in “piping down the valleys wild”

pypyr is a command line interface to run pipelines defined in yaml.

the pypyr aws plug-in

Run anything on aws. No really, anything. If the aws api supports it, the pypyr aws plug-in supports it.

It’s a pretty easy way of invoking the aws api as a step in a series of steps. Why use this when you could just use the aws-cli instead? The aws cli is all kinds of awesome, but more often than not it’s not just one or two ad hoc cli or api methods you have to execute. Especially when automating and scripting, you actually need to run a sequence of commands, where the output of a previous command influences what you pass to the next command.

Sure, you can bash it up, and I do that too, but running it as a pipeline via pypyr has actually made my life quite a bit easier in terms of not having to deal with conditionals, error traps and input validation.

1 Installation

1.1 pip

$ pip install --upgrade pypyraws

pypyraws depends on the pypyr cli. The above pip command will install it for you if you don’t have it already.

1.2 Python version

Tested against Python 3.6

2 Examples

If you prefer reading code to reading words, https://github.com/pypyr/pypyr-example

3 Steps

step                        | description                                                    | input context properties
--------------------------- | -------------------------------------------------------------- | ------------------------------------------------
pypyraws.steps.client       | Execute any low-level aws client method.                       | awsClientIn (dict)
pypyraws.steps.ecswaitprep  | Run me after an ecs task run or stop to prepare an ecs waiter. | awsClientOut (dict), awsEcsWaitPrepCluster (str)
pypyraws.steps.s3fetchjson  | Fetch a json file from s3 into the pypyr context.              | s3Fetch (dict)
pypyraws.steps.s3fetchyaml  | Fetch a yaml file from s3 into the pypyr context.              | s3Fetch (dict)
pypyraws.steps.wait         | Wait for an aws client method to complete.                     | awsWaitIn (dict)

3.1 pypyraws.steps.client

3.1.1 What can I do with pypyraws.steps.client?

This step provides an easy way of getting at the low-level AWS api from the pypyr pipeline runner. So in short, pretty much anything you can do with the AWS api, you got it, as the Big O might have said.

This step lets you specify the service name and the service method you want to execute dynamically. You can also control the service header arguments and the method arguments themselves.

The arguments you pass to the service and its methods are exactly as given by the AWS help documentation. So you do not have to learn yet another configuration based abstraction on top of the AWS api that might not even support all the methods you need.

You can actually pretty much just grab the json as written from the excellent AWS help docs, paste it into the yaml that pypyr consumes (yaml is a superset of json), and tadaaa!

3.1.2 Supported AWS services

Clients provide a low-level interface to AWS whose methods map close to 1:1 with the AWS REST service APIs. All service operations are supported by clients.

Run any method on any of the following aws low-level client services:

acm, apigateway, application-autoscaling, appstream, autoscaling, batch, budgets, clouddirectory, cloudformation, cloudfront, cloudhsm, cloudsearch, cloudsearchdomain, cloudtrail, cloudwatch, codebuild, codecommit, codedeploy, codepipeline, codestar, cognito-identity, cognito-idp, cognito-sync, config, cur, datapipeline, devicefarm, directconnect, discovery, dms, ds, dynamodb, dynamodbstreams, ec2, ecr, ecs, efs, elasticache, elasticbeanstalk, elastictranscoder, elb, elbv2, emr, es, events, firehose, gamelift, glacier, health, iam, importexport, inspector, iot, iot-data, kinesis, kinesisanalytics, kms, lambda, lex-models, lex-runtime, lightsail, logs, machinelearning, marketplace-entitlement, marketplacecommerceanalytics, meteringmarketplace, mturk, opsworks, opsworkscm, organizations, pinpoint, polly, rds, redshift, rekognition, resourcegroupstaggingapi, route53, route53domains, s3, sdb, servicecatalog, ses, shield, sms, snowball, sns, sqs, ssm, stepfunctions, storagegateway, sts, support, swf, waf, waf-regional, workdocs, workspaces, xray

You can find full details for the supported services and what methods you can run against them here: http://boto3.readthedocs.io/en/latest/reference/services/

With the speed of new features and services AWS introduces, it’s pretty unlikely I’ll get round to updating the list each and every time.

pypyr-aws automatically supports any new services AWS releases for the boto3 client, in case the list above gets out of date. So while this document might not update, the code will already dynamically pick up new features and services on the boto3 client.
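Conceptually, the dynamic dispatch boils down to something like this rough Python sketch (a simplified illustration, not the actual pypyraws source; run_client_method is a hypothetical helper):

import boto3

def run_client_method(service_name, method_name, client_args=None, method_args=None):
    # Create the low-level client, passing any optional constructor kwargs.
    client = boto3.client(service_name, **(client_args or {}))
    # Resolve the method by name at runtime - this is why new boto3
    # services and methods work without any pypyraws code changes.
    method = getattr(client, method_name)
    return method(**(method_args or {}))

# Roughly equivalent to awsClientIn with serviceName: s3, methodName: list_buckets
response = run_client_method('s3', 'list_buckets')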

3.1.3 pypyr context

Requires the following context items:

awsClientIn:
  serviceName: 'aws service name here'
  methodName: 'execute this method of the aws service'
  clientArgs: # optional
    arg1Name: arg1Value
    arg2Name: arg2Value
  methodArgs: # optional
    arg1Name: arg1Value
    arg2Name: arg2Value

The awsClientIn context supports text Substitutions.

3.1.4 Sample pipeline

Here is some sample yaml of what a pipeline using the pypyr-aws plug-in client step could look like:

context_parser: pypyr.parser.keyvaluepairs
steps:
  - name: pypyraws.steps.client
    description: upload a file to s3
    in:
      awsClientIn:
        serviceName: s3
        methodName: upload_file
        methodArgs:
          Filename: ./testfiles/arb.txt
          Bucket: '{bucket}'
          Key: arb.txt

If you saved this yaml as ./pipelines/go-go-s3.yaml, you can run the following from ./ to upload arb.txt to your specified bucket:

$ pypyr go-go-s3 --context "bucket=myuniquebucketname"

See a worked example for pypyr aws s3 in the pypyr-example repo.

3.2 pypyraws.steps.ecswaitprep

Run me after an ecs task run or stop to prepare an ecs waiter.

Prepares the awsWaitIn context key for pypyraws.steps.wait.

Available ecs waiters are:

  • ServicesInactive

  • ServicesStable

  • TasksRunning

  • TasksStopped

Full details here: http://boto3.readthedocs.io/en/latest/reference/services/ecs.html#waiters

Use this step after any of the following ecs client methods if you want to use one of the ecs waiters to wait for a specific state:

  • describe_services

  • describe_tasks

  • list_services - specify awsEcsWaitPrepCluster if you don’t want default

  • list_tasks - specify awsEcsWaitPrepCluster if you don’t want default

  • run_task

  • start_task

  • stop_task

  • update_service

You don’t have to use this step: you could always just construct the awsWaitIn dictionary in context yourself. This step just saves you some legwork.

Required context:

  • awsClientOut

    • dict. mandatory.

This is the context key that any ecs command executed by pypyraws.steps.client adds. Chances are pretty good you don’t want to construct this by hand yourself - the idea is to use the output as generated by one of the supported ecs methods.

  • awsEcsWaitPrepCluster

    • string. optional.

    • The short name or full arn of the cluster that hosts the task to describe. If you do not specify a cluster, the default cluster is assumed. For most of the ecs methods the code automatically deduces the cluster from awsClientOut, so don’t worry about it.

    • But, when following list_services and list_tasks, you have to specify this parameter.

    • Specifying this parameter will override any automatically deduced cluster arn.
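For example, after list_tasks you would specify the cluster yourself. A minimal sketch (my-cluster-name is a hypothetical placeholder; awsClientOut is already in context from the preceding client step):

steps:
  - name: pypyraws.steps.client
    description: list tasks on the cluster
    in:
      awsClientIn:
        serviceName: ecs
        methodName: list_tasks
        methodArgs:
          cluster: my-cluster-name
  - name: pypyraws.steps.ecswaitprep
    description: prep awsWaitIn from the list_tasks output
    in:
      awsEcsWaitPrepCluster: my-cluster-name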

See a worked example for pypyr aws ecs in the pypyr-example repo.

3.3 pypyraws.steps.s3fetchjson

Fetch a json file from s3 and put the json values into context.

Required input context is:

s3Fetch:
  clientArgs: # optional
    arg1Name: arg1Value
  methodArgs:
    Bucket: '{bucket}'
    Key: arb.json

Json parsed from the file will be merged into the pypyr context. This will overwrite existing values if the same keys are already in there.

I.e. if the file json has {'eggs': 'boiled'}, but context {'eggs': 'fried'} already exists, the returned context['eggs'] will be ‘boiled’.

The json should not be an array [] at the top level, but rather an object.

The s3Fetch context supports text Substitutions.
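For instance, a minimal pipeline sketch (config.json and the eggs key are hypothetical placeholders; this assumes the fetched json contains an eggs key):

context_parser: pypyr.parser.keyvaluepairs
steps:
  - name: pypyraws.steps.s3fetchjson
    description: fetch config json from s3 and merge it into context
    in:
      s3Fetch:
        methodArgs:
          Bucket: '{bucket}'
          Key: config.json
  - name: pypyr.steps.echo
    description: prove the merge worked by echoing a fetched value
    in:
      echoMe: 'eggs are {eggs}'

Run it like the earlier s3 example, e.g. $ pypyr fetch-eggs --context "bucket=myuniquebucketname", assuming you saved it as ./pipelines/fetch-eggs.yaml.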

See a worked example for pypyr aws s3fetch in the pypyr-example repo.

3.4 pypyraws.steps.s3fetchyaml

Fetch a yaml file from s3 and put the yaml structure into context.

Required input context is:

s3Fetch:
  clientArgs: # optional
    arg1Name: arg1Value
  methodArgs:
    Bucket: '{bucket}'
    Key: arb.yaml

The s3Fetch context supports text Substitutions.

Yaml parsed from the file will be merged into the pypyr context. This will overwrite existing values if the same keys are already in there.

I.e. if the file yaml has

eggs: boiled

but context {'eggs': 'fried'} already exists, the returned context['eggs'] will be ‘boiled’.

The yaml should not be a list at the top level, but rather a mapping.

So the top-level yaml should not look like this:

- eggs
- ham

but rather like this:

breakfastOfChampions:
  - eggs
  - ham

See a worked example for pypyr aws s3fetch in the pypyr-example repo.

3.5 pypyraws.steps.wait

Wait for things in AWS to complete before continuing the pipeline.

Run any low-level boto3 client wait() from get_waiter.

Waiters use a client’s service operations to poll the status of an AWS resource and suspend execution until the AWS resource reaches the state that the waiter is polling for or a failure occurs while polling.

http://boto3.readthedocs.io/en/latest/guide/clients.html#waiters

The input context requires:

awsWaitIn:
  serviceName: 'service name' # Available services here: http://boto3.readthedocs.io/en/latest/reference/services/
  waiterName: 'waiter name' # Check service docs for available waiters for each service
  waiterArgs:
    arg1Name: arg1Value # optional. Dict. kwargs for get_waiter
  waitArgs:
    arg1Name: arg1Value #optional. Dict. kwargs for wait

The awsWaitIn context supports text Substitutions.
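For example, a minimal sketch that blocks until an s3 bucket exists (bucket_exists is a real boto3 s3 waiter; my-unique-bucket is a hypothetical placeholder):

steps:
  - name: pypyraws.steps.wait
    description: block until the s3 bucket exists
    in:
      awsWaitIn:
        serviceName: s3
        waiterName: bucket_exists
        waitArgs:
          Bucket: my-unique-bucket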

4 Substitutions

You can use substitution tokens, aka string interpolation, where specified for context items. This substitutes anything between {curly braces} with the context value for that key. This also works where you have dictionaries/lists inside dictionaries/lists. For example, if your context looked like this:

bucketValue: the.bucket
keyValue: dont.kick
moreArbText: wild
awsClientIn:
  serviceName: s3
  methodName: get_object
  methodArgs:
    Bucket: '{bucketValue}'
    Key: '{keyValue}'

This will run s3 get_object to retrieve file dont.kick from the.bucket.

  • Bucket: ‘{bucketValue}’ becomes Bucket: the.bucket

  • Key: ‘{keyValue}’ becomes Key: dont.kick

In json & yaml, curlies need to be inside quotes to make sure they parse as strings.

Escape literal curly braces with doubles: {{ for {, }} for }
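For example, assuming the context from above (hypothetical Key value, showing one substituted and one escaped token):

methodArgs:
  Bucket: '{bucketValue}' # substituted: becomes the.bucket
  Key: '{{keyValue}}' # escaped: stays as the literal text {keyValue}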

See a worked example for substitutions in the pypyr-example repo.

5 aws authentication

5.1 Configuring credentials

pypyr-aws pretty much just uses the underlying boto3 authentication mechanisms. More info here: http://boto3.readthedocs.io/en/latest/guide/configuration.html

This means any of the following will work:

  • If you are running inside of AWS - on EC2 or inside an ECS container, it will automatically use IAM role credentials if it does not find credentials in any of the other places listed below.

  • In the pypyr context

    context['awsClientIn']['clientArgs'] = {
        'aws_access_key_id': ACCESS_KEY,
        'aws_secret_access_key': SECRET_KEY,
        'aws_session_token': SESSION_TOKEN,
    }
  • $ENV variables

    • AWS_ACCESS_KEY_ID

    • AWS_SECRET_ACCESS_KEY

    • AWS_SESSION_TOKEN

  • Credentials file at ~/.aws/credentials or ~/.aws/config

    • If you have the aws-cli installed, run aws configure to get these configured for you automatically.
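For reference, a shared credentials file typically looks like this (placeholder values, not real keys):

[default]
aws_access_key_id = your-access-key-id
aws_secret_access_key = your-secret-access-key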

Tip: On dev boxes I generally don’t bother with credentials, because chances are pretty good I already have the aws-cli installed, so pypyr just re-uses the aws shared configuration files that are already there.

5.2 Ensure secrets stay secret

Be safe! Don’t hard-code your aws credentials. Don’t check credentials into a public repo.

Tip: if you’re running pypyr inside of aws - e.g in an ec2 instance or an ecs container that is running under an IAM role, you don’t actually need to configure credentials for pypyr-aws explicitly.

Do remember not to fling your key & secret around as shell arguments - they could easily leak into logs or be exposed via ps that way. I generally use one of the pypyr built-in context parsers like pypyr.parser.jsonfile or pypyr.parser.yamlfile instead - see the pypyr documentation for details.

Do remember also that $ENV variables are not a particularly secure place to keep your secrets.
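A minimal sketch of the file-based approach mentioned above (./secrets.yaml and my-pipeline are hypothetical names; keep the secrets file out of source control):

# ./secrets.yaml
awsClientIn:
  serviceName: s3
  methodName: list_buckets
  clientArgs:
    aws_access_key_id: your-access-key-id
    aws_secret_access_key: your-secret-access-key

With context_parser: pypyr.parser.yamlfile set in my-pipeline, run:

$ pypyr my-pipeline --context ./secrets.yaml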

6 Testing

6.1 Testing without worrying about dependencies

Run tox to test the packaging cycle inside a virtual env, plus run all tests:

# just run tests
$ tox -e dev -- tests
# run tests, validate README.rst, run flake8 linter
$ tox -e stage -- tests

6.2 If tox is taking too long

The test framework is pytest. If you only want to run tests, first install the dev and test dependencies:

$ pip install -e .[dev,test]

6.3 Day-to-day testing

  • Tests live under /tests (surprising, eh?). Mirror the directory structure of the code being tested.

  • Prefix a test definition with test_ - so a unit test looks like

    def test_this_should_totally_work():
        assert 1 + 1 == 2  # replace with your actual assertions
  • To execute tests, from root directory:

    pytest tests
  • For a bit more info on running tests:

    pytest --verbose [path]
  • To execute a specific test module:

    pytest tests/unit/arb_test_file.py

7 Contribute

7.1 Bugs

Well, you know. No one’s perfect. Feel free to create an issue.

7.2 Contribute to the pypyr project

The usual jazz - create an issue, fork, code, test, PR. It might be an idea to discuss your idea via the Issues list first before you go off and write a huge amount of code - you never know, something might already be in the works, or maybe it’s not quite right for this plug-in (you’re still welcome to fork and go wild regardless, of course, it just mightn’t get merged back in here).

Get in touch anyway, would love to hear from you at https://www.345.systems/contact.
