Skip to main content

GitHub Actions Workflow Differ

Project description

gawd - GitHub Actions Workflow Differ

Tests License: LGPL v3 PyPI Commits

gawd stands for GitHub Actions Workflow Differ. It is a Python library and command-line tool that computes all changes (i.e., diffs) between two GitHub Actions workflow files. Given a pair of workflow files as input, and taking the specific syntax of GitHub Actions into account, gawd reports on the items that were added and removed, as well on items that were moved, renamed or changed based on their similarity.

Installation

gawd can be used in two ways: as a command-line tool and as a reusable Python library.

You can use pip install gawd to install the latest available version from PyPI. Pre-releases are available from the main branch on GitHub and can be installed with pip install git+https://github.com/pooya-rostami/gawd.

Usage

As a command-line tool

After installation, the gawd command-line tool should be available in your shell. Otherwise, you can call it using python -m gawd.

usage: gawd [-h] [--threshold X] [--position-weight X] [--job-name-weight X]
            [--short] [--json] [--verbose]
            first second

gawd is an open source GitHub Actions Workflow Differencing tool that is aware
of the specific workflow syntax of GitHub Actions workflows. Given a pair of
workflow files as input, the tool reports on the items that were added and
removed, as well on items that were moved, renamed or changed based on their
similarity.

positional arguments:
  first                 path to first workflow (YAML) file
  second                path to second workflow (YAML) file

optional arguments:
  -h, --help            show this help message and exit
  --threshold X, -t X   ranged from 0 to 1, distance threshold to map items,
                        value closer to 1 favours "changed", value closer to 0
                        favours "added" and "removed" (default is 0.5)
  --position-weight X, -p X
                        ranged from 0 to 1, weight of item positions when
                        comparing sequences (default is 0.2)
  --job-name-weight X, -j X
                        ranged from 0 to 1, weight of job names when comparing
                        jobs (default is 0.2)
  --short, -s           limit the output of values to a few characters
  --json                output in json
  --verbose             output in more detail

Examples:

Let us take the changes made to the workflow file "main.yml" in this commit as an example.

--short Condensed output of gawd when applied to compute the changes:

>>> gawd old_main.yaml new_main.yaml --short
changed jobs.build-linux-clangpdb-gcc5.steps[1].run from "'sudo apt-get update (...) UB_PATH\n'" to "'sudo apt-get update (...) UB_PATH\n'"
renamed jobs.build-linux-clang38 to jobs.build-linux-clangdwarf
changed jobs.build-linux-clang38.name from "'Build Linux CLANG38'" to "'Build Linux CLANGDWARF'"
changed jobs.build-linux-clang38.env.TOOLCHAINS from "'CLANG38'" to "'CLANGDWARF'"
changed jobs.build-linux-clang38.steps[6].with.name from "'Linux CLANG38 Artifacts'" to "'Linux CLANGDWARF Artifacts'"

--json To output the results in JSON format:

>>> gawd old_main.yaml new_main.yaml --short --json
[{"type": "changed", "old": {"path": "jobs.build-linux-clangpdb-gcc5.steps[1].run", "value": "'sudo apt-get update (...) UB_PATH\\n'"}, "new": {"path": "jobs.build-linux-clangpdb-gcc5.steps[1].run", "value": "'sudo apt-get update (...) UB_PATH\\n'"}},
{"type": "renamed", "old": {"path": "jobs.build-linux-clang38", "value": "{'name': 'Build Linu (...) *.zip'}}]}"}, "new": {"path": "jobs.build-linux-clangdwarf", "value": "{'name': 'Build Linu (...) *.zip'}}]}"}},
{"type": "changed", "old": {"path": "jobs.build-linux-clang38.name", "value": "'Build Linux CLANG38'"}, "new": {"path": "jobs.build-linux-clangdwarf.name", "value": "'Build Linux CLANGDWARF'"}},
{"type": "changed", "old": {"path": "jobs.build-linux-clang38.env.TOOLCHAINS", "value": "'CLANG38'"}, "new": {"path": "jobs.build-linux-clangdwarf.env.TOOLCHAINS", "value": "'CLANGDWARF'"}},
{"type": "changed", "old": {"path": "jobs.build-linux-clang38.steps[6].with.name", "value": "'Linux CLANG38 Artifacts'"}, "new": {"path": "jobs.build-linux-clangdwarf.steps[6].with.name", "value": "'Linux CLANGDWARF Artifacts'"}}]

--threshold A value between 0 and 1 representing the sensitivity of gawd in identifying changes. A higher threshold results in more instances of 'changed' or 'renamed', while a lower threshold favours 'added' or 'removed':

>>> gawd old_main.yaml new_main.yaml --short --threshold 0.1
removed jobs.build-linux-clangpdb-gcc5.steps[1] with {'name': 'Install De (...) B_PATH\n'}
added jobs.build-linux-clangpdb-gcc5.steps[1] with {'name': 'Install De (...) B_PATH\n'}
removed jobs.build-linux-clang38 with {'name': 'Build Linu (...) *.zip'}}]}
added jobs.build-linux-clangdwarf with {'name': 'Build Linu (...) *.zip'}}]}

As an importable library

gawd comes with two functions to compare workflow files, namely diff_workflow_files and diff_workflows. The former accepts the paths to two workflow files, loads them and returns the output of diff_workflows. The latter accepts two workflows as Python dictionaries (e.g., loaded with ruamel.yaml) and returns a list of 5-tuples (kind, old_path, old_value, new_path, new_value).

The kind component is one of added, removed, changed, moved, renamed and indicates the kind of change. old_path and new_path correspond to a dotted notation indicating where the change occurred, while old_value and new_value correspond to the previous and new values, respectively. Notice that old_path and old_value are set to None in case of an "added" change, and new_path and old_path are None in case of a "removed" change.

Similarly to the extra parameters that can be provided to the CLI tool, the gawd module exposes THRESHOLD, POSITION_WEIGHT and JOB_NAME_WEIGHT.

The following example shows the output of the imported library version of gawd for the changes made to the workflow file main.yml in this commit.

>>> from gawd import diff_workflow_files
>>> old_workflow_path = ...
>>> new_workflow_path = ...
>>> diff_workflow_files(old_workflow_path, new_workflow_path)
[('changed', 'jobs.build-linux-clangpdb-gcc5.steps[1].run', 'sudo apt-get update\nsudo apt-get install nasm ...', 'jobs.build-linux-clangpdb-gcc5.steps[1].run', 'sudo apt-get update\nsudo apt-get install nasm ...'),
('renamed', 'jobs.build-linux-clang38', {'name': 'Build Linux CLANG38', ...}, 'jobs.build-linux-clangdwarf', {'name': 'Build Linux CLANGDWARF', ...}),
('changed', 'jobs.build-linux-clang38.name', 'Build Linux CLANG38', 'jobs.build-linux-clangdwarf.name', 'Build Linux CLANGDWARF'),
('changed', 'jobs.build-linux-clang38.env.TOOLCHAINS', 'CLANG38', 'jobs.build-linux-clangdwarf.env.TOOLCHAINS', 'CLANGDWARF'),
('changed', 'jobs.build-linux-clang38.steps[6].with.name', 'Linux CLANG38 Artifacts', 'jobs.build-linux-clangdwarf.steps[6].with.name', 'Linux CLANGDWARF Artifacts')]

Contributions

Contributions are very welcome! Feel free to report bugs or suggest new features using GitHub issues and/or pull requests.

License

This tool is distributed under GNU Lesser General Public License v3.

Citing

You can refer to the scientific publication describing this tool using

@inproceedings{gawd2024MSR,
  author = {{Rostami Mazrae}, Pooya and Decan, Alexandre and Mens, Tom},
  title = {gawd: A Differencing Tool for GitHub Actions Workflows},
  booktitle = {International Conference on Mining Software Repositories - Data and Tools Showcase Track},
  year = 2024
}

You can also refer to this tool directly using:

@software{gawd2023,
  author = {{Rostami Mazrae}, Pooya and Decan, Alexandre},
  title = {gawd: GitHub Actions Workflow Differ},
  url = {https://github.com/pooya-rostami/gawd},
  year = 2023,
  institute = {Software Engineering Lab at University of Mons, Belgium}
}

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

gawd-1.1.1.tar.gz (23.6 kB view details)

Uploaded Source

Built Distribution

gawd-1.1.1-py3-none-any.whl (11.5 kB view details)

Uploaded Python 3

File details

Details for the file gawd-1.1.1.tar.gz.

File metadata

  • Download URL: gawd-1.1.1.tar.gz
  • Upload date:
  • Size: 23.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.11.7

File hashes

Hashes for gawd-1.1.1.tar.gz
Algorithm Hash digest
SHA256 8508680c8e3b2c01670d48abbce6207ad39d347a077c5906c438aa2e04625ed7
MD5 d6a7619c93b0e630f8f0e8f83d006eb8
BLAKE2b-256 4bede8ed5b489edb4052527570585373548b77ad28617b55e6fbcff1f39dfe66

See more details on using hashes here.

File details

Details for the file gawd-1.1.1-py3-none-any.whl.

File metadata

  • Download URL: gawd-1.1.1-py3-none-any.whl
  • Upload date:
  • Size: 11.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.11.7

File hashes

Hashes for gawd-1.1.1-py3-none-any.whl
Algorithm Hash digest
SHA256 75827ec7393d42af81cee9379f3d17fdeb1e61056cbc56ff123c600cfc0af534
MD5 2d98d9ff269f7fbfdbd83acd2e28540c
BLAKE2b-256 6e4bdabd3e20e188e33c573b87bda33b6e714119a1a91022946859faa80091e6

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page