CI build harness embodying best practices for Python projects.

A command line CI pipeline build harness utility for Python 3 projects based on known best practices.

Many accessories are useful for establishing a high quality Python pipeline, and copy-pasting all the bits and pieces to initialize a new project is tedious and error prone. This utility aims to streamline the creation of a project with all the necessary development and pipeline dependencies and a ready to run pipeline.

Why not just use CookieCutter?

build_harness complements the use of CookieCutter nicely - you can use build_harness to establish and maintain your Python project pipeline with minimal effort and then focus on using CookieCutter to implement your business specific customization of build, test and analysis options.

build_harness also lends itself to being easily applied across multiple use cases, from the pipeline itself, to pre-commit hooks, to developers manually running specific components of the pipeline for test and debug.

1 Installation

Ensure that git is installed on the system where build_harness is going to be installed. For example:

sudo apt install -y git

The build_harness package is available from PyPI. Installing into a virtual environment is recommended.

# A first time installation creating a virtual environment inside the project
# directory using the flit package manager
mkdir my_project_repo; cd my_project_repo
python3 -m venv .venv; .venv/bin/pip install build_harness; .venv/bin/flit init

You should declare build_harness as a development dependency of your project so that it is automatically installed wherever it is needed, such as in pipelines. For the flit package manager this would look something like this:

[tool.flit.metadata.requires-extra]
dev = [
    "build_harness == <current release>",
]

Given the current spate of supply chain attacks against public package repositories, including PyPI, best practice is to always pin dependency packages to an exact release.
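As a minimal sketch of how exact pinning could be checked automatically, the following function flags requirement strings that do not use "==". The function name, the simplified pattern and the example version numbers are illustrative assumptions and are not part of build_harness.

```python
from __future__ import annotations

import re

# Matches "name == version" requirement strings (an exact pin).
# Simplified for illustration: extras, environment markers and wildcard
# versions such as "1.*" are deliberately not accepted as pinned.
_PINNED = re.compile(
    r"^\s*[A-Za-z0-9][A-Za-z0-9._-]*\s*==\s*[A-Za-z0-9][A-Za-z0-9.!+_-]*\s*$"
)


def unpinned_requirements(requirements: list[str]) -> list[str]:
    """Return the requirement strings that are not pinned with '=='."""
    return [item for item in requirements if not _PINNED.match(item)]


# Hypothetical development dependencies, as they might appear in pyproject.toml.
dev_requirements = ["build_harness == 0.1.1", "pytest >= 7", "black"]
print(unpinned_requirements(dev_requirements))
```

A check like this could run as an early pipeline step so that an unpinned dependency fails the build before anything is installed.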

Note that Ubuntu, for example, packages pip and venv separately from the main Python installation and does not install them by default. On a fresh Ubuntu install you will need something like this to acquire them before running the above commands.

sudo apt update && sudo apt install -y python3-pip python3-venv

Note also that the flit package manager is presently the only official choice for use with build_harness. Over time support for other package managers is expected to be added.

2 Getting started

Installation makes a command line utility build-harness available in the virtual environment. There are currently seven groups of sub-commands available.

acceptance

Run and manage Gherkin features and step files using the behave package.

formatting

Format source code to PEP-8 standard using the isort and black packages.

install

Install and manage project dependencies in the virtual environment. The install command will look for a virtual environment .venv in the project root directory and create it if needed. Then it installs and manages all the project dependencies there.

This command only installs packages when they are missing or out of date, so it makes efficient use of network capacity and can reduce installation time when only incremental changes are needed.
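The idea of installing only what is missing or out of date can be sketched as follows. This is not the build_harness implementation; the function names and the pinned-versions dictionary are assumptions made for illustration, and only the standard library importlib.metadata module is used to look up installed versions.

```python
from __future__ import annotations

from importlib import metadata
from typing import Callable, Optional


def installed_version(name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def packages_to_install(
    pinned: dict[str, str],
    lookup: Callable[[str], Optional[str]] = installed_version,
) -> list[str]:
    """List 'name==version' specs that are missing or differ from the pin."""
    return [
        f"{name}=={version}"
        for name, version in pinned.items()
        if lookup(name) != version
    ]
```

Only the specs returned by packages_to_install would need to be passed to pip, which is where the saving in network traffic and installation time would come from.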

package

Build wheel and sdist packages of the project.

publish

Publish project artifacts to publication repositories such as PyPI and readthedocs.

static-analysis

Run static analysis on source code using the pydocstyle, flake8 and mypy packages.

unit-test

Run unit tests of the project using pytest.

Further options for these commands can be explored using the --help argument.

build-harness --help
build-harness install --help

A quick summary of using each of the sub-commands.

# Install project dependencies into the virtual environment.
build-harness install
# Check if project dependencies are up to date in the virtual environment.
build-harness install --check
# Format code to PEP-8 standards using isort, black.
build-harness formatting
# Fail (exit non-zero) if formatting needs to be applied.
build-harness formatting --check
# Run pydocstyle, flake8 and mypy analysis on the project.
build-harness static-analysis
# Run pytest on unit tests in the project.
build-harness unit-test
# Test that coverage passes the specified threshold.
build-harness unit-test --check <int>
# Run Python behave on Gherkin based features.
build-harness acceptance tests
# Generate step file snippets for unimplemented features.
build-harness acceptance snippets
# Report where tags are used in feature files.
build-harness acceptance tags
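The sub-commands above can be chained into a single check, for example from a pre-commit hook or a local script. The sketch below is one possible arrangement, assuming that every sub-command exits non-zero on failure (the text above states this only for formatting --check). The function names and the order of the steps are illustrative choices.

```python
from __future__ import annotations

import subprocess
from typing import Callable, Sequence


def pipeline_commands(coverage_threshold: int) -> list[list[str]]:
    """Build the argument lists for one pass of checks, in order."""
    return [
        ["build-harness", "install"],
        ["build-harness", "formatting", "--check"],
        ["build-harness", "static-analysis"],
        ["build-harness", "unit-test", "--check", str(coverage_threshold)],
        ["build-harness", "acceptance", "tests"],
    ]


def run_pipeline(
    coverage_threshold: int,
    runner: Callable[[Sequence[str]], int] = subprocess.call,
) -> int:
    """Run each command in turn and return the first non-zero exit code."""
    for command in pipeline_commands(coverage_threshold):
        exit_code = runner(command)
        if exit_code != 0:
            return exit_code
    return 0
```

Calling run_pipeline(90) inside the project's virtual environment would run the steps in sequence and stop at the first failure. The runner parameter exists so the sequencing can be exercised without invoking the real tools.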

3 Concepts

For now, the sub-commands are limited to a specific set of tools (the ones I have found to be most useful).

Fine tuning configuration of the underlying tools is generally possible using configuration files such as sections added to pyproject.toml or setup.cfg or tool specific files in some cases.

3.1 Release Management

In essence, release management is the definition of release states before and after a formal “production” release, how the transitions between release states occur, how those transitions interact with repository branching strategies and how each release state is identified in project packaging (the release id), source control and other related artifacts for the purpose of traceability.

Python has myriad ways of managing releases for a project and almost all of them require some custom workflow from the user to make them work for automation, so it is difficult to support all of them. For this reason the default packaging option of build_harness using the package --release-id option does nothing relating to the release id and assumes that the user has done whatever is necessary for their workflow to correctly define the release id for packaging.

Having said that, the goal of the build_harness project is to have useful out-of-the-box functionality as much as possible, so described here are workflows that have been integrated into the project. Because release management preferences are so varied, a separate utility called release-flow is introduced for identifying releases and relating them to source control repository branches. See the Release identity section below for more details.

There’s a fairly useful survey of Python release management in the answers to this StackOverflow question. The setuptools_scm package also has some useful notes on different ways to control release id insertion to a package.

3.1.1 Release identity

Very closely related to release management is the concept of a release identity, how that identity changes between release states and how those changes are mapped to changes in source control repository branches and/or tags. Similar to release management there are myriad ways of identifying formal releases and pre-releases, constrained only by the PEP-440 definitions for Python projects.

The release-flow utility applies a relatively simple release identity and branching strategy that in my experience is useful for most projects:

  • Use semantic versions to identify formal releases

  • Apply a semantic version tag to commits in the default/main branch of the source control repository to identify a formal release to the pipeline

  • Non-releases are identified using the PEP-440 compliant release id <last semantic version>-post<commit offset from last semantic version>

Further to the above steps relating to the release-flow utility, these steps must be applied by the CI pipeline:

  • All artifacts are identified with the release id in the filename

  • Python packages have the release id applied to project metadata

Finally, the source control repository itself must have a semantic version tag applied to the first commit of the repository. It is recommended that the first commit tag is “0.0.0”.
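The release id rule described above can be sketched as a small function. This is not how release-flow is implemented; it is an illustration that assumes the input has the shape produced by "git describe --tags --long", namely <tag>-<commit offset>-g<abbreviated hash>, and that tags are plain MAJOR.MINOR.PATCH versions. Treating an offset of zero as a formal release is also an assumption drawn from the tagging rule above.

```python
import re

# Expected shape: "<major>.<minor>.<patch>-<offset>-g<hash>", e.g. "1.2.3-4-gabc1234".
_DESCRIBE = re.compile(r"^(?P<tag>\d+\.\d+\.\d+)-(?P<offset>\d+)-g[0-9a-f]+$")


def release_id_from_describe(describe_output: str) -> str:
    """Derive a release id from 'git describe --tags --long' style text."""
    match = _DESCRIBE.match(describe_output.strip())
    if match is None:
        raise ValueError(f"Unexpected describe output: {describe_output!r}")

    tag = match["tag"]
    offset = int(match["offset"])
    if offset == 0:
        # The commit itself carries the semantic version tag: a formal release.
        return tag

    # Any later commit is a non-release identified by its offset from the tag.
    return f"{tag}-post{offset}"
```

Because the first commit of the repository carries a tag such as “0.0.0”, every later commit has a tagged ancestor and so a release id can always be derived.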

3.1.2 VERSION file workflow

This is the workflow used by the build_harness project itself, so you can refer to the source code for an example of how to implement this workflow.

  • The package reads the content of a simple text file named VERSION in the top-level Python package of your project and applies it to the __version__ variable in the package.

  • If the file does not exist a default release ID is applied as defined within the project package.

  • Use the snippets below to set the Python __version__ variable for the project from the content of the VERSION file.

Some Internet discussions on this topic recommend that the VERSION file is not committed to source control. The problem I have historically experienced is that this complicates the local build, because the developer must remember to create a useful “benign” VERSION file for themselves, otherwise their build will fail. If it is created locally and every developer needs it, then why not just commit it to source control and avoid the “toil”? If the pipeline somehow fails to update the VERSION file correctly, then at least an invalid package is created with the benign release id that can be readily identified as an error to fix.

The committed file should contain a default value that is readily recognisable as having not been built by a pipeline. For example, if a developer builds the package locally it should be clear that the package they built is not an official release (which should only have been built by a pipeline).

A default value I have historically used is “0.0.0”. Within the limitations defined by PEP-440 another option could be “0.0.0+local”.

For manual release definition you have to ensure that the content of the VERSION file reflects the release id you are releasing. Doing this manually is error prone and easily acquires a number of deficiencies with respect to how organizations often want to organize their releases.

For automation the pipeline just needs to be able to update the content of the file with the release id defined for a release; this is easily achieved by defining semantic version tags on the repo (or some similar such rule that can be incorporated into the pipeline code) as a formal release and having the pipeline update the VERSION file with the tag text.

# top-level __init__.py
"""flit requires top-level docstring summary of project"""

from ._version import __version__  # noqa: F401

# _version.py
import pathlib

from ._default_values import DEFAULT_RELEASE_ID

def acquire_version() -> str:
    """
    Acquire PEP-440 compliant version from VERSION file.

    Returns:
        Acquired version text.
    Raises:
        RuntimeError: If version is not valid.
    """
    here = pathlib.Path(__file__).parent
    version_file_path = (here / "VERSION").absolute()

    if version_file_path.is_file():
        with version_file_path.open(mode="r") as version_file:
            version = version_file.read().strip()
    else:
        version = DEFAULT_RELEASE_ID

    if not version:
        raise RuntimeError("Unable to acquire version")

    return version

__version__ = acquire_version()

# _default_values.py
DEFAULT_RELEASE_ID = "0.0.0"
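For the automated case, the pipeline step that writes the tag text into the VERSION file might look like the sketch below. The function name and the restriction to plain MAJOR.MINOR.PATCH tags are assumptions for illustration; how the pipeline obtains the tag text depends on the CI tool and is left out.

```python
import pathlib
import re

# Accept only plain semantic version tags such as "1.2.3" (an assumption).
_SEMANTIC_VERSION = re.compile(r"^\d+\.\d+\.\d+$")


def update_version_file(package_dir: pathlib.Path, tag_text: str) -> pathlib.Path:
    """Write the release id from a tag into the package's VERSION file."""
    release_id = tag_text.strip()
    if not _SEMANTIC_VERSION.match(release_id):
        raise ValueError(f"Not a semantic version tag: {tag_text!r}")

    version_file = package_dir / "VERSION"
    # Overwrites the committed default value for this build only.
    version_file.write_text(release_id + "\n")
    return version_file
```

The trailing newline is harmless because acquire_version() strips whitespace when it reads the file.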

3.2 Pipelines

The build_harness project pipeline runs in GitLab CI, so you can use its .gitlab-ci.yml file as a template for your own project. Since the objective of build_harness is to reduce the difficulty of starting a pipeline from scratch, you should find that only minimal changes are needed for your GitLab project. You should also find it relatively easy to translate the workflow over to other CI pipeline tools such as GitHub Actions, Azure DevOps, CircleCI, Travis CI and Jenkins.
