pipeline runner command line to run pipelines defined in yaml
- pypyr
pronounce how you like, but I generally say piper as in “piping down the valleys wild”
pypyr is a command line interface to run pipelines defined in yaml. Think of pypyr as a simple task runner that lets you run sequential steps.
1 Installation
1.1 pip
$ pip install --upgrade pypyr
1.2 Python version
Tested against Python 3.6
2 Usage
2.1 Run your first pipeline
Run one of the built-in pipelines to get a feel for it:
$ pypyr echo --context "echoMe=Ceci n'est pas une pipe"
You can achieve the same thing by running a pipeline where the context is set in the pipeline yaml rather than as a --context argument:
$ pypyr magritte
See pypyr.steps.echo for the yaml that does this.
2.2 Run a pipeline
pypyr assumes a pipelines directory in your current working directory.
# run pipelines/mypipelinename.yaml with DEBUG logging level
$ pypyr mypipelinename --log 10
# run pipelines/mypipelinename.yaml with INFO logging level.
$ pypyr mypipelinename --log 20
# If you don't specify --log it defaults to 20 - INFO logging level.
$ pypyr mypipelinename
# run pipelines/mypipelinename.yaml with an input context. For this input to
# be available to your pipeline you need to specify a context_parser in your
# pipeline yaml.
$ pypyr mypipelinename --context 'mykey=value'
2.3 Get cli help
pypyr has a couple of arguments and switches you might find useful. See them all here:
$ pypyr -h
2.4 Examples
If you prefer reading code to reading words, see https://github.com/pypyr/pypyr-example
3 Anatomy of a pypyr pipeline
3.1 Pipeline yaml structure
A pipeline is a .yaml file. Save pipelines to a pipelines directory in your working directory.
# This is an example showing the anatomy of a pypyr pipeline
# A pipeline should be saved as {working dir}/pipelines/mypipelinename.yaml.
# Run the pipeline from {working dir} like this: pypyr mypipelinename
# optional
context_parser: my.custom.parser
# mandatory.
steps:
  - my.package.my.module # simple step pointing at a python module in a package
  - mymodule # simple step pointing at a python file
  - name: my.package.another.module # complex step. It contains a description and in parameters.
    description: Optional description is for humans. It's any text that makes your life easier.
    in: #optional. In parameters are added to the context so that this step and subsequent steps can use these key-value pairs.
      parameter1: value1
      parameter2: value2
# optional.
on_success:
  - my.first.success.step
  - my.second.success.step
# optional.
on_failure:
  - my.failure.handler.step
  - my.failure.handler.notifier
3.2 Built-in pipelines
pipeline | description | how to run
donothing | Does what it says. Nothing. | pypyr donothing
echo | Echoes context value echoMe to output. | pypyr echo --context "echoMe=text goes here"
pypyrversion | Prints the pypyr cli version number. | pypyr pypyrversion
magritte | Thoughts about pipes. | pypyr magritte
3.3 context_parser
Optional.
A context_parser parses the pypyr --context input argument. The chances are pretty good that it will take the --context argument and put it into the pypyr context.
The pypyr context is a dictionary that is in scope for the duration of the entire pipeline. The context_parser can initialize the context. Any step in the pipeline can add, edit or remove items from the context dictionary.
3.3.1 Built-in context parsers
context parser | description | example input
pypyr.parser.commas | Takes a comma delimited string and returns a dictionary where each element becomes a key with a value of True. Don’t have spaces between commas unless you really mean it. "k1, k2" will result in a context key name of ' k2' not 'k2'. | pypyr pipelinename --context "param1,param2,param3" This will create a context dictionary like this: {'param1': True, 'param2': True, 'param3': True}
pypyr.parser.json | Takes a json string and returns a dictionary. | pypyr pipelinename --context '{"key1":"value1","key2":"value2"}'
pypyr.parser.jsonfile | Opens a json file and returns a dictionary. | pypyr pipelinename --context './path/sample.json'
pypyr.parser.keyvaluepairs | Takes a comma delimited key=value pair string and returns a dictionary where each pair becomes a dictionary element. Don’t have spaces between commas unless you really mean it. "k1=v1, k2=v2" will result in a context key name of ' k2' not 'k2'. | pypyr pipelinename --context "param1=value1,param2=value2,param3=value3"
pypyr.parser.yamlfile | Opens a yaml file and writes the contents into the pypyr context dictionary. The top (or root) level yaml should describe a map (aka dictionary), not a sequence. | pypyr pipelinename --context './path/sample.yaml'
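To make the delimiter caveat in the table above concrete, here is a minimal sketch of the commas and keyvaluepairs splitting rules. The function names parse_commas and parse_keyvaluepairs are made up for illustration, and this is not pypyr's own implementation; it only mimics the documented results, including the leading-space behaviour.

```python
def parse_commas(context_arg):
    """Map each comma-delimited element to True (hypothetical helper)."""
    return {element: True for element in context_arg.split(',')}


def parse_keyvaluepairs(context_arg):
    """Split 'k1=v1,k2=v2' into a dictionary (hypothetical helper).

    Whitespace is deliberately not stripped, to mirror the documented caveat.
    """
    return dict(pair.split('=', 1) for pair in context_arg.split(','))


flags = parse_commas('param1,param2,param3')
# note the space after the comma: it ends up inside the key name
pairs = parse_keyvaluepairs('k1=v1, k2=v2')
```

Here pairs ends up with a key named ' k2' (with the leading space), which is why you should leave spaces out unless you really mean them.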
3.3.2 Roll your own context_parser
import logging

# getLogger will grab the parent logger context, so your loglevel and
# formatting will inherit correctly automatically from the pypyr core.
logger = logging.getLogger(__name__)


def get_parsed_context(context_arg):
    """This is the signature for a context parser.

    Input context is the string received from pypyr --context 'value here'
    """
    assert context_arg, ("pipeline must be invoked with --context set.")
    logger.debug("starting")

    # your clever code here. Chances are pretty good you'll be doing things
    # with the input context string to create a dictionary.

    # function signature returns a dictionary
    return {'key1': 'value1', 'key2': 'value2'}
3.4 steps
Mandatory.
steps is a list of steps to execute in sequence. A step is simply a bit of python that does stuff.
You can specify a step in the pipeline yaml in two ways:
Simple step
A simple step is just the name of the python module.
pypyr will look in your working directory for these modules or packages.
For a package, be sure to specify the full namespace (i.e. not just mymodule, but mypackage.mymodule).
steps:
  - my.package.my.module # points at a python module in a package.
  - mymodule # simple step pointing at a python file
Complex step
A complex step allows you to specify a few more details for your step, but at heart it’s the same thing as a simple step - it points at some python.
steps:
  - name: my.package.another.module
    description: Optional Description is for humans. It's any yaml-escaped text that makes your life easier.
    in: #optional. In parameters are added to the context so that this step and subsequent steps can use these key-value pairs.
      parameter1: value1
      parameter2: value2
You can freely mix and match simple and complex steps in the same pipeline.
Frankly, the only reason simple steps are there is because I’m lazy and I dislike redundant typing.
3.4.1 Built-in steps
step | description | input context properties
pypyr.steps.contextclear | Remove specified items from context. | contextClear (list)
pypyr.steps.contextclearall | Wipe the entire context. |
pypyr.steps.contextset | Sets context values from already existing context values. | contextSet (dict)
pypyr.steps.echo | Echo the context value echoMe to the output. | echoMe (string)
pypyr.steps.env | Get, set or unset $ENVs. | envGet (dict), envSet (dict), envUnset (list)
pypyr.steps.fetchjson | Loads json file into pypyr context. | fetchJsonPath (path-like)
pypyr.steps.fetchyaml | Loads yaml file into pypyr context. | fetchYamlPath (path-like)
pypyr.steps.fileformat | Parse file and substitute {tokens} from context. | fileFormatIn (path-like), fileFormatOut (path-like)
pypyr.steps.fileformatjson | Parse json file and substitute {tokens} from context. | fileFormatJsonIn (path-like), fileFormatJsonOut (path-like)
pypyr.steps.fileformatyaml | Parse yaml file and substitute {tokens} from context. | fileFormatYamlIn (path-like), fileFormatYamlOut (path-like)
pypyr.steps.filereplace | Parse input file and replace search strings. | fileReplaceIn (path-like), fileReplaceOut (path-like), fileReplacePairs (dict)
pypyr.steps.py | Executes the context value pycode as python code. | pycode (string)
pypyr.steps.pypyrversion | Writes installed pypyr version to output. |
pypyr.steps.safeshell | Runs the program and args specified in the context value cmd as a subprocess. | cmd (string)
pypyr.steps.shell | Runs the context value cmd in the default shell. Use for pipes, wildcards, $ENVs, ~ | cmd (string)
pypyr.steps.tar | Archive and/or extract tars with or without compression. Supports gzip, bzip2, lzma. | tarExtract (list of dict), tarArchive (list of dict)
3.4.1.1 pypyr.steps.contextclear
Remove the specified items from the context.
Will iterate contextClear and remove those keys from context.
For example, say input context is:
key1: value1
key2: value2
key3: value3
key4: value4
contextClear:
- key2
- key4
- contextClear
This will result in return context:
key1: value1
key3: value3
Notice how contextClear also cleared itself in this example.
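The contextClear behaviour described above can be sketched in a few lines of plain Python. The helper name clear_keys is hypothetical and this is only an illustration of the documented result, not the step's actual source. Skipping keys that are not in the context is my assumption.

```python
def clear_keys(context):
    """Remove every key listed in context['contextClear'] (hypothetical helper)."""
    # copy the list first, because contextClear may remove itself from context
    for key in list(context['contextClear']):
        # assumption: a key that is not in context is silently skipped
        context.pop(key, None)
    return context


context = {
    'key1': 'value1',
    'key2': 'value2',
    'key3': 'value3',
    'key4': 'value4',
    'contextClear': ['key2', 'key4', 'contextClear'],
}
clear_keys(context)
```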
3.4.1.2 pypyr.steps.contextclearall
Wipe the entire context. No input context arguments required.
You can always use contextclearall as a simple step. Sample pipeline yaml:
steps:
- my.arb.step
- pypyr.steps.contextclearall
- another.arb.step
3.4.1.3 pypyr.steps.contextset
Sets context values from already existing context values.
This is handy if you need to prepare certain keys in context where a next step might need a specific key. If you already have the value in context, you can create a new key (or update existing key) with that value.
So let’s say you already have context['currentKey'] = 'eggs'. If you run newKey: currentKey, you’ll end up with context['newKey'] == 'eggs'.
For example, say your context looks like this,
key1: value1
key2: value2
key3: value3
and your pipeline yaml looks like this:
steps:
  - name: pypyr.steps.contextset
    in:
      contextSet:
        key2: key1
        key4: key3
This will result in context like this:
key1: value1
key2: value1
key3: value3
key4: value3
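If it helps to see the contextSet example above as code, here is a rough sketch. The helper name set_from_existing is made up, and this mimics only the documented outcome rather than the real step.

```python
def set_from_existing(context):
    """Copy existing context values to new or existing keys (hypothetical helper)."""
    for new_key, existing_key in context['contextSet'].items():
        # the value of contextSet is the NAME of the key to copy from
        context[new_key] = context[existing_key]
    return context


context = {
    'key1': 'value1',
    'key2': 'value2',
    'key3': 'value3',
    'contextSet': {'key2': 'key1', 'key4': 'key3'},
}
set_from_existing(context)
```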
3.4.1.4 pypyr.steps.echo
Echo the context value echoMe to the output.
For example, if you had pipelines/mypipeline.yaml like this:
context_parser: pypyr.parser.keyvaluepairs
steps:
- name: pypyr.steps.echo
You can run:
pypyr mypipeline --context "echoMe=Ceci n'est pas une pipe"
Alternatively, if you had pipelines/look-ma-no-params.yaml like this:
steps:
  - name: pypyr.steps.echo
    description: Output echoMe
    in:
      echoMe: Ceci n'est pas une pipe
You can run:
$ pypyr look-ma-no-params
3.4.1.5 pypyr.steps.env
Get, set or unset environment variables.
At least one of these context keys must exist:
envGet
envSet
envUnset
This step will run whatever combination of Get, Set and Unset you specify. Regardless of combination, execution order is Get, Set, Unset.
See a worked example for environment variables here.
3.4.1.5.1 envGet
Get $ENVs into the pypyr context.
context['envGet'] must exist. It’s a dictionary.
Values are the names of the $ENVs to write to the pypyr context.
Keys are the pypyr context item to which to write the $ENV values.
For example, say input context is:
key1: value1
key2: value2
pypyrCurrentDir: value3
envGet:
  pypyrUser: USER
  pypyrCurrentDir: PWD
This will result in context:
key1: value1
key2: value2
pypyrCurrentDir: <<value of $PWD here, not value3>>
pypyrUser: <<value of $USER here>>
3.4.1.5.2 envSet
Set $ENVs from the pypyr context.
context['envSet'] must exist. It’s a dictionary.
Values are strings to write to $ENV. You can use {key} substitutions to format the string from context. Keys are the names of the $ENV values to which to write.
For example, say input context is:
key1: value1
key2: value2
key3: value3
envSet:
  MYVAR1: {key1}
  MYVAR2: before_{key3}_after
  MYVAR3: arbtexthere
This will result in the following $ENVs:
$MYVAR1 = value1
$MYVAR2 = before_value3_after
$MYVAR3 = arbtexthere
Note that the $ENVs are not persisted system-wide, they only exist for the pypyr sub-processes, and as such for the subsequent steps during this pypyr pipeline execution. If you set an $ENV here, don’t expect to see it in your system environment variables after the pipeline finishes running.
3.4.1.5.3 envUnset
Unset $ENVs.
Context is a dictionary or dictionary-like. context is mandatory.
context['envUnset'] must exist. It’s a list. List items are the names of the $ENV values to unset.
For example, say input context is:
key1: value1
key2: value2
key3: value3
envUnset:
  - MYVAR1
  - MYVAR2
This will result in the following $ENVs being unset:
$MYVAR1
$MYVAR2
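The Get, Set, Unset order described for pypyr.steps.env can be sketched with os.environ. This is a minimal illustration under my own assumptions, not the step's source: the function name run_env_sketch and the PYPYR_SKETCH_* variable names are hypothetical.

```python
import os


def run_env_sketch(context):
    """Apply envGet, envSet and envUnset, in that order (hypothetical helper)."""
    # Get: context key <- value of the named environment variable
    for context_key, env_name in context.get('envGet', {}).items():
        context[context_key] = os.environ[env_name]
    # Set: environment variable <- string formatted from context
    for env_name, template in context.get('envSet', {}).items():
        os.environ[env_name] = template.format(**context)
    # Unset: remove the named environment variables if they exist
    for env_name in context.get('envUnset', []):
        os.environ.pop(env_name, None)


os.environ['PYPYR_SKETCH_SOURCE'] = 'from-env'
os.environ['PYPYR_SKETCH_OLD'] = 'remove-me'
context = {
    'key1': 'value1',
    'key3': 'value3',
    'envGet': {'fromEnv': 'PYPYR_SKETCH_SOURCE'},
    'envSet': {'PYPYR_SKETCH_VAR': 'before_{key3}_after'},
    'envUnset': ['PYPYR_SKETCH_OLD'],
}
run_env_sketch(context)
```

Changes made through os.environ only affect the current process and its children, which matches the note above about $ENVs not persisting system-wide.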
3.4.1.6 pypyr.steps.fetchjson
Loads a json file into the pypyr context.
This step requires the following key in the pypyr context to succeed:
fetchJsonPath - path-like. Path to file on disk. Can be relative.
Json parsed from the file will be merged into the pypyr context. This will overwrite existing values if the same keys are already in there.
I.e. if the json file has {'eggs': 'boiled'}, but context {'eggs': 'fried'} already exists, returned context['eggs'] will be 'boiled'.
The json should not be an array [] at the top level, but rather an Object.
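The merge-and-overwrite behaviour described above looks roughly like this in plain Python. The helper name fetch_json_sketch is hypothetical, and raising TypeError for a top-level array is my assumption about how you might enforce the Object rule, not something pypyr documents.

```python
import json
import os
import tempfile


def fetch_json_sketch(context):
    """Merge the json file at context['fetchJsonPath'] into context (hypothetical helper)."""
    with open(context['fetchJsonPath']) as json_file:
        payload = json.load(json_file)
    # assumption: reject a top-level array, since only an Object can be merged
    if not isinstance(payload, dict):
        raise TypeError('top-level json must be an Object, not an array')
    # update overwrites existing keys with the values from the file
    context.update(payload)
    return context


with tempfile.TemporaryDirectory() as temp_dir:
    path = os.path.join(temp_dir, 'sample.json')
    with open(path, 'w') as json_file:
        json.dump({'eggs': 'boiled'}, json_file)
    context = fetch_json_sketch({'eggs': 'fried', 'fetchJsonPath': path})
```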
3.4.1.7 pypyr.steps.fetchyaml
Loads a yaml file into the pypyr context.
This step requires the following key in the pypyr context to succeed:
fetchYamlPath - path-like. Path to file on disk. Can be relative.
Yaml parsed from the file will be merged into the pypyr context. This will overwrite existing values if the same keys are already in there.
I.e. if the yaml file has
eggs: boiled
but context {'eggs': 'fried'} already exists, returned context['eggs'] will be 'boiled'.
The yaml should not be a list at the top level, but rather a mapping.
So the top-level yaml should not look like this:
- eggs
- ham
but rather like this:
breakfastOfChampions:
- eggs
- ham
3.4.1.8 pypyr.steps.fileformat
Parses input text file and substitutes {tokens} in the text of the file from the pypyr context.
The following context keys are expected:
fileFormatIn - Path to source file on disk.
fileFormatOut - Write output file to here. Will create directories in path if these do not exist already.
So if you had a text file like this:
{k1} sit thee down and write
In a book that all may {k2}
And your pypyr context were:
k1: pypyr
k2: read
You would end up with an output file like this:
pypyr sit thee down and write
In a book that all may read
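The {token} substitution shown above behaves much like Python's str.format. Here is a minimal sketch, assuming str.format semantics and a hypothetical helper named file_format_sketch; it is not the step's actual code.

```python
import os
import tempfile


def file_format_sketch(context):
    """Format fileFormatIn with context and write fileFormatOut (hypothetical helper)."""
    with open(context['fileFormatIn']) as in_file:
        formatted = in_file.read().format(**context)
    # create the output directories if they do not exist already
    out_dir = os.path.dirname(context['fileFormatOut'])
    if out_dir:
        os.makedirs(out_dir, exist_ok=True)
    with open(context['fileFormatOut'], 'w') as out_file:
        out_file.write(formatted)


with tempfile.TemporaryDirectory() as temp_dir:
    in_path = os.path.join(temp_dir, 'in.txt')
    out_path = os.path.join(temp_dir, 'new', 'dir', 'out.txt')
    with open(in_path, 'w') as in_file:
        in_file.write('{k1} sit thee down and write\n'
                      'In a book that all may {k2}\n')
    file_format_sketch({'k1': 'pypyr',
                        'k2': 'read',
                        'fileFormatIn': in_path,
                        'fileFormatOut': out_path})
    with open(out_path) as out_file:
        result = out_file.read()
```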
3.4.1.9 pypyr.steps.fileformatjson
Parses input json file and substitutes {tokens} from the pypyr context.
Pretty much does the same thing as pypyr.steps.fileformat, only it makes it easier to work with curly braces for substitutions without tripping over the json’s structural braces.
The following context keys are expected:
fileFormatJsonIn - Path to source file on disk.
fileFormatJsonOut - Write output file to here. Will create directories in path if these do not exist already.
Substitutions enabled for keys and values in the source json.
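One way to avoid tripping over json's structural braces is to parse the json first and only then format the strings inside it. The sketch below illustrates that idea with a hypothetical recursive helper, format_nested; I am assuming this approach for illustration and it may differ from what the step really does.

```python
import json


def format_nested(node, context):
    """Recursively apply {token} substitution to keys and string values (hypothetical helper)."""
    if isinstance(node, dict):
        return {format_nested(key, context): format_nested(value, context)
                for key, value in node.items()}
    if isinstance(node, list):
        return [format_nested(item, context) for item in node]
    if isinstance(node, str):
        return node.format(**context)
    # numbers, booleans and null pass through unchanged
    return node


source = json.loads('{"{k1}": "v1", "nested": {"list": ["{k2}", 3]}}')
formatted = format_nested(source, {'k1': 'key-one', 'k2': 'two'})
```

Because the structural braces are consumed by the json parser, only the braces inside string keys and values ever reach the formatting step.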
3.4.1.10 pypyr.steps.fileformatyaml
Parses input yaml file and substitutes {tokens} from the pypyr context.
Pretty much does the same thing as pypyr.steps.fileformat, only it makes it easier to work with curly braces for substitutions without tripping over the yaml’s structural braces. If your yaml doesn’t use curly braces that aren’t meant for {token} substitutions, you can happily use pypyr.steps.fileformat instead - it’s more memory efficient.
This step does not preserve comments. Use pypyr.steps.fileformat if you need to preserve comments on output.
The following context keys are expected:
fileFormatYamlIn - Path to source file on disk.
fileFormatYamlOut - Write output file to here. Will create directories in path if these do not exist already.
3.4.1.11 pypyr.steps.filereplace
Parses input text file and replaces a search string.
The other fileformat steps, by way of contradistinction, use string formatting expressions inside {braces} to format values against the pypyr context. This step, however, lets you specify any search string and replace it with any replace string. This is handy if you are in a file where curly braces aren’t helpful for a formatting expression - e.g. inside a .js file.
The following context keys are expected:
fileReplaceIn - Path to source file on disk.
fileReplaceOut - Write output file to here. Will create directories in path if these do not exist already.
fileReplacePairs - dictionary where format is 'find_string': 'replace_string'
Example input context:
fileReplaceIn: ./infile.txt
fileReplaceOut: ./outfile.txt
fileReplacePairs:
  findmestring: replacewithme
  findanotherstring: replacewithanotherstring
  alaststring: alastreplacement
This also does string substitutions from context on the fileReplacePairs. It does this before it searches and replaces in the fileReplaceIn file.
Be careful of order. The last string replacement expression could well replace a replacement that an earlier replacement made in the sequence.
If fileReplacePairs is not an ordered collection, replacements could evaluate in any given order. If you are creating your in parameters in the pipeline yaml, don’t worry about it, it will be an ordered dictionary already, so life is good.
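The ordering caveat above is easier to see with a small experiment. This is a sketch with a hypothetical helper, replace_pairs, and it assumes both the find and the replace strings are formatted from context; the real step may differ in detail.

```python
def replace_pairs(text, pairs, context):
    """Apply find/replace pairs in iteration order (hypothetical helper)."""
    for find, replace in pairs.items():
        # assumption: both sides get {token} substitution before the search
        text = text.replace(find.format(**context), replace.format(**context))
    return text


context = {'name': 'pypyr'}
# 'pipe' -> 'tube' runs first, then 'tube' -> 'pypyr' rewrites that replacement
ordered = replace_pairs('a pipe', {'pipe': 'tube', 'tube': '{name}'}, context)
# same pairs in the other order: 'tube' is not found yet, so only one swap lands
other_order = replace_pairs('a pipe', {'tube': '{name}', 'pipe': 'tube'}, context)
```

The same pairs give 'a pypyr' in one order and 'a tube' in the other, so it pays to think about the sequence.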
See a worked example here.
3.4.1.12 pypyr.steps.py
Executes the context value pycode as python code.
Will exec context['pycode'] as a dynamically interpreted python code block.
You can access and change the context dictionary in a py step. See a worked example here.
For example, this will invoke python print and print 2:
steps:
  - name: pypyr.steps.py
    description: Example of an arb python command. Will print 2.
    in:
      pycode: print(1+1)
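For a feel of what running pycode against the context involves, here is a minimal sketch built on Python's exec. The helper name run_pycode is hypothetical, and exposing the dictionary under the name context is my assumption based on the description above.

```python
def run_pycode(context):
    """Exec context['pycode'] with the context dictionary in scope (hypothetical helper)."""
    # assumption: the code block sees the dictionary under the name 'context'
    exec(context['pycode'], {}, {'context': context})


context = {'pycode': "context['answer'] = 1 + 1"}
run_pycode(context)
```

Since exec runs whatever string it is given, only put code you trust into pycode.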
3.4.1.13 pypyr.steps.pypyrversion
Outputs the same as:
pypyr --version
This is an actual pipeline, though, so unlike --version, it’ll use the standard pypyr logging format.
Example pipeline yaml:
steps:
- pypyr.steps.pypyrversion
3.4.1.14 pypyr.steps.safeshell
Runs the context value cmd as a sub-process.
In safeshell, you cannot use things like exit, return, shell pipes, filename wildcards, environment variable expansion, and expansion of ~ to a user’s home directory. Use pypyr.steps.shell for this instead. Safeshell runs a program, it does not invoke the shell.
You can use context variable substitutions with curly braces. See a worked example for substitutions here.
Escape literal curly braces with doubles: {{ for {, }} for }
Example pipeline yaml:
steps:
  - name: pypyr.steps.safeshell
    in:
      cmd: ls -a
See a worked example for shell power here.
3.4.1.15 pypyr.steps.shell
Runs the context value cmd in the default shell. On a sensible O/S, this is /bin/sh.
Do all the things you can’t do with safeshell.
Friendly reminder of the difference between separating your commands with ; or &&:
; will continue to the next statement even if the previous command errored. It won’t exit with an error code if it wasn’t the last statement.
&& stops and exits reporting error on first error.
You can use context variable substitutions with curly braces. See a worked example for substitutions here.
Escape literal curly braces with doubles: {{ for {, }} for }
Example pipeline yaml using a pipe:
steps:
  - name: pypyr.steps.shell
    in:
      cmd: ls | grep pipe; echo if you had something pipey it should show up;
See a worked example for shell power here.
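The difference between running a program directly and invoking the shell can be sketched with the standard subprocess module. The helper names run_safeshell_sketch and run_shell_sketch are hypothetical, the sketch assumes a POSIX system, and it is not how pypyr itself is written.

```python
import shlex
import subprocess
import sys


def run_safeshell_sketch(context):
    """Run cmd as a program with arguments, with no shell involved (hypothetical helper)."""
    args = shlex.split(context['cmd'].format(**context))
    completed = subprocess.run(args, check=True, stdout=subprocess.PIPE,
                               universal_newlines=True)
    return completed.stdout


def run_shell_sketch(context):
    """Run cmd through the default shell, so &&, pipes and friends work (hypothetical helper)."""
    completed = subprocess.run(context['cmd'].format(**context), shell=True,
                               check=True, stdout=subprocess.PIPE,
                               universal_newlines=True)
    return completed.stdout


python = shlex.quote(sys.executable)
safe_out = run_safeshell_sketch({'cmd': python + " -c \"print('{word}')\"",
                                 'word': 'pipe'})
# && is shell syntax, so this only works when the shell is invoked
shell_out = run_shell_sketch({'cmd': 'echo one && echo {word}', 'word': 'two'})
```

With check=True a non-zero exit code raises an exception, which is the sort of thing that would stop a pipeline.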
3.4.1.16 pypyr.steps.tar
Archive and/or extract tars with or without compression.
At least one of these context keys must exist:
tarExtract
tarArchive
Optionally, you can also specify the tar compression format with context['tarFormat']. If not specified, defaults to lzma/xz. Available options:
'' - no compression
gz (gzip)
bz2 (bzip2)
xz (lzma)
This step will run whatever combination of Extract and Archive you specify. Regardless of combination, execution order is Extract, Archive.
Never extract archives from untrusted sources without prior inspection. It is possible that files are created outside of path, e.g. members that have absolute filenames starting with “/” or filenames with two dots “..”.
See a worked example for tar here.
3.4.1.16.1 tarExtract
context['tarExtract'] must exist. It’s a list of dictionaries.
in is the path to the tar to extract.
out is the destination path.
You can use {key} substitutions to format the string from context.
key1: here
key2: tar.xz
tarExtract:
  - in: path/to/my.tar.xz
    out: /path/extract/{key1}
  - in: another/{key2}
    out: .
This will:
Extract path/to/my.tar.xz to /path/extract/here
Extract another/tar.xz to the current execution directory
This is the directory you’re running pypyr from, not the pypyr pipeline working directory you set with the --dir flag.
3.4.1.16.2 tarArchive
context['tarArchive'] must exist. It’s a list of dictionaries.
in is the path to archive.
out is the destination output path.
You can use {key} substitutions to format the string from context.
key1: destination.tar.xz
key2: value2
tarArchive:
  - in: path/{key2}/dir
    out: path/to/{key1}
  - in: another/my.file
    out: ./my.tar.xz
This will:
Archive directory path/value2/dir to path/to/destination.tar.xz,
Archive file another/my.file to ./my.tar.xz
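The in/out pairs and the tarFormat option map quite naturally onto Python's tarfile module. The following is a sketch under my own assumptions (hypothetical helper names, tarFormat appended to the tarfile mode string), not the step's real implementation.

```python
import os
import tarfile
import tempfile


def tar_archive_sketch(context):
    """Archive each 'in' path to its 'out' tar (hypothetical helper)."""
    # assumption: tarFormat maps straight onto the tarfile mode suffix
    mode = 'w:' + context.get('tarFormat', 'xz')
    for item in context['tarArchive']:
        source = item['in'].format(**context)
        with tarfile.open(item['out'].format(**context), mode) as archive:
            archive.add(source, arcname=os.path.basename(source))


def tar_extract_sketch(context):
    """Extract each 'in' tar to its 'out' directory (hypothetical helper)."""
    mode = 'r:' + context.get('tarFormat', 'xz')
    for item in context['tarExtract']:
        with tarfile.open(item['in'].format(**context), mode) as archive:
            # only ever do this with archives you trust
            archive.extractall(item['out'].format(**context))


with tempfile.TemporaryDirectory() as temp_dir:
    source_file = os.path.join(temp_dir, 'my.file')
    with open(source_file, 'w') as handle:
        handle.write('arb content')
    tar_path = os.path.join(temp_dir, 'my.tar.gz')
    extract_dir = os.path.join(temp_dir, 'extracted')
    tar_archive_sketch({'tarFormat': 'gz',
                        'tarArchive': [{'in': source_file, 'out': tar_path}]})
    tar_extract_sketch({'tarFormat': 'gz',
                        'tarExtract': [{'in': tar_path, 'out': extract_dir}]})
    with open(os.path.join(extract_dir, 'my.file')) as handle:
        extracted_content = handle.read()
```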
3.4.2 Roll your own step
import logging

# getLogger will grab the parent logger context, so your loglevel and
# formatting will inherit correctly automatically from the pypyr core.
logger = logging.getLogger(__name__)


def run_step(context):
    """Run code in here. This shows you how to code a custom pipeline step.

    :param context: dictionary-like type
    """
    logger.debug("started")
    # you probably want to do some asserts here to check that the input context
    # dictionary contains the keys and values you need for your code to work.
    assert 'mykeyvalue' in context, ("context['mykeyvalue'] must exist for my clever step.")

    # it's good form only to use .info and higher log levels when you must.
    # For .debug() being verbose is very much encouraged.
    logger.info("Your clever code goes here. . . ")

    # Add or edit context items. These are available to any pipeline steps
    # following this one.
    context['existingkey'] = 'new value overwrites old value'
    context['mynewcleverkey'] = 'new value'

    logger.debug("done")
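To show how a module with a run_step function plugs into a sequence of steps, here is a rough sketch of a loader. It assumes steps are resolved by module name with importlib, which is my guess at the general shape rather than pypyr's actual loader; run_steps_sketch and demo_step are hypothetical names.

```python
import importlib
import sys
import types


def run_steps_sketch(step_names, context):
    """Import each step module by name and call its run_step (hypothetical helper)."""
    for step_name in step_names:
        module = importlib.import_module(step_name)
        # every step exposes the same signature: run_step(context)
        module.run_step(context)
    return context


# register an in-memory stand-in module so the sketch runs without any files
demo_step = types.ModuleType('demo_step')


def _demo_run_step(context):
    context['mynewcleverkey'] = 'new value'


demo_step.run_step = _demo_run_step
sys.modules['demo_step'] = demo_step

result = run_steps_sketch(['demo_step'], {'mykeyvalue': 'arb'})
```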
3.5 on_success
on_success is a list of steps to execute in sequence. Runs when steps: completes successfully.
You can use built-in steps or code your own steps exactly like you would for steps - it uses the same function signature.
3.6 on_failure
on_failure is a list of steps to execute in sequence. Runs when any of the above hits an unhandled exception.
If on_failure encounters another exception while processing an exception, then both that exception and the original cause exception will be logged.
You can use built-in steps or code your own steps exactly like you would for steps - it uses the same function signature.
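The relationship between steps, on_success and on_failure can be sketched as ordinary try/except control flow. This is an illustration only: the function name run_pipeline_sketch is hypothetical, steps are plain callables here, and re-raising the original exception after the failure handlers run is my assumption.

```python
import logging

logger = logging.getLogger(__name__)


def run_pipeline_sketch(steps, on_success, on_failure, context):
    """Run steps, then on_success; run on_failure on any unhandled exception."""
    try:
        for step in steps:
            step(context)
        for step in on_success:
            step(context)
    except Exception:
        logger.exception('pipeline failed; running on_failure steps')
        try:
            for step in on_failure:
                step(context)
        except Exception:
            # log the handler's own failure as well as the original cause
            logger.exception('on_failure also failed')
        # assumption: the original exception still propagates to the caller
        raise


def failing_step(context):
    raise ValueError('arb failure')


def notify_step(context):
    context['notified'] = True


context = {}
try:
    run_pipeline_sketch([failing_step], [], [notify_step], context)
except ValueError:
    context['reraised'] = True
```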
4 Plug-Ins
The pypyr core is deliberately kept light so the dependencies are down to the minimum. I loathe installs where there're a raft of extra deps that I don't use clogging up the system.
Where other libraries are requisite, you can selectively choose to add this functionality by installing a pypyr plug-in.
boss pypyr plug-ins | description
pypyr-aws | Interact with the AWS sdk api. Supports all AWS Client functions, such as S3, EC2, ECS & co. via the AWS low-level Client API.
 | Send messages to Slack
5 Testing (for pypyr-cli developers)
5.1 Testing without worrying about dependencies
Run tox to test the packaging cycle inside a virtual env, plus run all tests:
# just run tests
$ tox -e dev -- tests
# run tests, validate README.rst, run flake8 linter
$ tox -e stage -- tests
5.2 If tox takes too long
The test framework is pytest. If you only want to run tests:
$ pip install -e .[dev,test]
5.3 Day-to-day testing
Tests live under /tests (surprising, eh?). Mirror the directory structure of the code being tested.
Prefix a test definition with test_ - so a unit test looks like
def test_this_should_totally_work():
To execute tests, from root directory:
pytest tests
For a bit more info on running tests:
pytest --verbose [path]
To execute a specific test module:
pytest tests/unit/arb_test_file.py
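As a small illustration of the test_ naming convention, here is what a pair of unit tests for a custom step might look like. The run_step body and the test names are made up for this sketch; in a real project the step would live in its own module and the tests under /tests.

```python
def run_step(context):
    """Stand-in custom step, defined inline so the sketch is self-contained."""
    assert 'mykeyvalue' in context, (
        "context['mykeyvalue'] must exist for my clever step.")
    context['mynewcleverkey'] = 'new value'


def test_run_step_adds_key():
    context = {'mykeyvalue': 'arb'}
    run_step(context)
    assert context['mynewcleverkey'] == 'new value'


def test_run_step_requires_mykeyvalue():
    try:
        run_step({})
    except AssertionError:
        return
    raise AssertionError('expected run_step to reject an empty context')
```

pytest picks these up by their test_ prefix when you run pytest tests from the root directory.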
6 Contribute
6.1 Bugs
Well, you know. No one’s perfect. Feel free to create an issue.
6.2 Contribute to the core cli
The usual jazz - create an issue, fork, code, test, PR. It might be an idea to discuss your idea via the Issues list first before you go off and write a huge amount of code - you never know, something might already be in the works, or maybe it’s not quite right for the core-cli (you’re still welcome to fork and go wild regardless, of course, it just mightn’t get merged back in here).
6.3 Roll your own plug-in
You’ve probably noticed by now that pypyr is built to be pretty extensible. You’ve probably also noticed that the core pypyr cli is deliberately kept light. The core cli is philosophically only a way of running a sequence of steps. Dependencies to external libraries should generally get their own package, so end-users can selectively install what they need rather than have a monolithic batteries-included application.
If you’ve got some custom context_parser or steps code that are useful, create a repo and bask in the glow of sharing with the open source community. Honor the pypyr Apache license please.
I generally name plug-ins pypyr-myplugin, where myplugin is likely some sort of dependency that you don’t want in the pypyr core cli. For example, pypyr-aws contains pypyr-steps for the AWS boto3 library. This is kept separate so that you don’t have to deal with yet another dependency you don’t need if your current project isn’t using AWS.
If you want your plug-in listed here for official cred, please get in touch via the Issues list. Get in touch anyway, would love to hear from you at https://www.345.systems/contact.