Ml build utilities for containerized build pipelines.
ml-buildkit
Getting Started • Features • Documentation • Support • Contribution • Changelog • FAQ
ML-buildkit is a set of utilities designed to build, test, package, and release software. It enables you to implement your build and release pipeline with Python scripts once and run it either on your local machine, in a containerized environment via Act, or automated via Github Actions. It supports a monorepo or polyrepo setup and can be used with any programming language or technology. It also provides a full release pipeline for automated releases with changelog generation.
WIP: This project is still an alpha version and not ready for general usage.
Highlights
- 🐳 Implement once and run locally, containerized, or on Github Actions.
- 🧰 Build utilities for Python, Docker, React & MkDocs.
- 🔗 Predefined Github Action Workflows for CI & CD.
- 🛠 Integrated with devcontainer for containerized development.
Getting Started
Installation
Requirements: Python 3.6+.
pip install ml-buildkit
Usage
To use ml-buildkit in your project, create a build script named build.py in your project root. The example below is for a single yarn-based webapp component:
from ml_buildkit import build_utils

args = build_utils.parse_arguments()
version = args.get(build_utils.FLAG_VERSION)

if args.get(build_utils.FLAG_MAKE):
    build_utils.log("Build the component:")
    build_utils.run("yarn build")

if args.get(build_utils.FLAG_CHECK):
    build_utils.log("Run linters and style checks:")
    build_utils.run("yarn run lint:js")
    build_utils.run("yarn run lint:css")

if args.get(build_utils.FLAG_TEST):
    build_utils.log("Test the component:")
    build_utils.run("yarn test")

if args.get(build_utils.FLAG_RELEASE):
    build_utils.log("Release the component:")
    # TODO: release the component to npm with version
Next, copy the build-environment action from the actions folder into the .github/actions folder of your repository. In addition, copy the build- and release-pipeline workflows from the workflows folder into the .github/workflows folder of your repository. Your repository should now contain at least the following files:
your-repository
  - build.py
  - .github:
    - actions:
      - build-environment:
        - Dockerfile
        - actions.yaml
    - workflows:
      - release-pipeline.yml
      - build-pipeline.yml
Once you have pushed the build-environment action and the build- and release-pipelines, please look into the Automated Build Pipeline and Automated Release Pipeline sections for information on how to run your build- and release-pipelines.
You can find a more detailed example project with multiple components in the examples folder.
Support & Feedback
This project is maintained by Benjamin Räthlein, Lukas Masuch, and Jan Kalkan. Please understand that we won't be able to provide individual support via email. We also believe that help is much more valuable if it's shared publicly so that more people can benefit from it.
Type | Channel |
---|---|
🚨 Bug Reports | |
🎁 Feature Requests | |
👩💻 Usage Questions | |
🗯 General Discussion | |
❓ Other Requests |
Documentation
Build Script CLI • Default Flags • API Reference • Update Ml Build
Build Script CLI
Any build script that uses the build_utils.parse_arguments() method to parse the CLI arguments can be executed with the following options:
python build.py [OPTIONS]
Options:
These options correspond to the default flags documented in the next section.
- --make: Make/compile/package all artifacts.
- --test: Run unit and integration tests.
- --check: Run linting and style checks.
- --release: Release all artifacts (e.g. to registries like DockerHub or NPM).
- --run: Run the component in development mode (e.g. dev server).
- --version VERSION: Version of the build (MAJOR.MINOR.PATCH-TAG).
- --force: Ignore all enforcements and warnings.
- --skip-path SKIP_PATH: Skips the build phases for all (sub)paths provided here. This option can be used multiple times.
- --test-marker TEST_MARKER: Provide custom markers for testing. The default marker for slow tests is slow. This option can be used multiple times.
- -h, --help: Show the help message and exit.
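As a rough illustration of how a CLI with these options could be declared, here is a minimal sketch that uses only Python's standard argparse module. It is not ml-buildkit's actual parser: the function name make_parser and the resulting dictionary keys are assumptions made for this example.

```python
import argparse


def make_parser() -> argparse.ArgumentParser:
    """Build a stand-alone parser that mirrors the options listed above."""
    parser = argparse.ArgumentParser(prog="build.py")
    # Boolean phase switches: present on the command line means True.
    for flag in ("make", "test", "check", "release", "run", "force"):
        parser.add_argument(f"--{flag}", action="store_true")
    parser.add_argument("--version", default=None)
    # Options that may be given multiple times are collected into lists.
    parser.add_argument("--skip-path", action="append", default=[])
    parser.add_argument("--test-marker", action="append", default=[])
    return parser


# Parse an explicit argument list instead of sys.argv to keep the demo deterministic.
args = vars(
    make_parser().parse_args(
        ["--make", "--test", "--test-marker", "slow", "--skip-path", "docs"]
    )
)
print(args["make"], args["release"], args["test_marker"], args["skip_path"])
```

Repeatable options use action="append", so passing --test-marker or --skip-path several times yields one list entry per occurrence.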
Default Flags
At its core, ml-buildkit parses all arguments provided to the build script via build_utils.parse_arguments() and returns a sanitized and augmented list of arguments. Those arguments are the building blocks for your build script, and you can use them in whatever way you like. Here is an example of how to use those arguments in a build.py script:
from ml_buildkit import build_utils

args = build_utils.parse_arguments()
version = args.get(build_utils.FLAG_VERSION)

if args.get(build_utils.FLAG_MAKE):
    # Run all relevant build commands.
    pass

if args.get(build_utils.FLAG_TEST):
    # Run all relevant commands for testing.
    test_markers = args.get(build_utils.FLAG_TEST_MARKER)
    if "slow" in test_markers:
        # Run additional slow tests.
        pass
The following list contains all of the default flags currently supported by ml-buildkit:
Flag | Type | Description |
---|---|---|
FLAG_MAKE | bool | Build/compile/package all artifacts. |
FLAG_CHECK | bool | Run linting and style checks. |
FLAG_TEST | bool | Run unit and integration tests. |
FLAG_RELEASE | bool | Release all artifacts (e.g. to registries like DockerHub or NPM). |
FLAG_RUN | bool | Run the component in development mode (e.g. dev server). |
FLAG_FORCE | bool | Ignore all enforcements and warnings. |
FLAG_VERSION | str | Semantic version for the build. If not provided via CLI arguments, a valid dev version will be automatically calculated. |
FLAG_TEST_MARKER | List[str] | Custom markers for testing. Can be used to skip or execute certain tests. |
API Reference
In addition to argument parsing capabilities, ml-buildkit also contains a variety of utility functions to make building complex projects with different technologies easy. You can find all utilities in the Python API documentation here.
Update Ml Build
To update the ml-buildkit version of your project, look up the most recent version of build-environment on DockerHub and set this version in the .github/actions/build-environment/Dockerfile file of your repository:
FROM khulnasoft/build-environment:<UPDATED_VERSION>
If you also run your build outside of the build-environment (locally), make sure to also upgrade ml-buildkit on your local machine from PyPI:
pip install --upgrade ml-buildkit
Furthermore, you can also check if the build- and release-pipeline workflows have changed. In case of changes, update the workflows in the .github/workflows folder of your repository as well.
Features
Support for Nested Components • Automated Build Pipeline • Automated Release Pipeline • Containerized Development • Simplified Versioning • MkDocs Utilities • Python Utilities • Docker Utilities • Extensibility
Automated Build Pipeline (CI)
ML-buildkit enables you to run your build pipeline on your local machine, in a containerized environment via Act, or automated via Github Actions (= Continuous Integration).
Local machine via build script (not recommended):
Requirements: ml-buildkit and all the build requirements that your build script is using (e.g. yarn, pipenv, maven, ...) need to be installed on your machine.
Execute the following command in the root folder of any component with a valid build.py script:
python build.py --make --check --test
Executing the build-pipeline directly via the build scripts is not recommended.
Containerized environment via Act:
Requirements: Docker and Act are required to be installed on your machine.
Execute this command in the root folder of your repository:
act -b -s BUILD_ARGS="--check --make --test" -j build
Manually via Github Actions:
In the Github UI, go to Actions -> select build-pipeline -> select Run Workflow and provide the build arguments, e.g. --check --make --test.
Automated via Github Actions (CI):
With the default configuration, the build pipeline will run automatically via Github Actions on any push event to your repository. This automation can be referred to as continuous integration. You can also change the events that trigger the build-pipeline by modifying the on section in the .github/workflows/build-pipeline.yml file. You can find more information about Github Actions events here.
Automated Release Pipeline (CD)
To release a new version and publish all relevant artifacts to the respective registries (e.g. Docker image to DockerHub), you can trigger the release pipeline on your local machine, in a containerized environment via Act, or automated via Github Actions (= Continuous Delivery).
Local machine via build script (not recommended):
Requirements: ml-buildkit and all the build requirements that your build script is using (e.g. yarn, pipenv, maven, ...) need to be installed on your machine.
Execute the following command in the root folder of any component with a valid build.py script:
python build.py --make --check --test --release --version="<MAJOR.MINOR.PATCH>"
Executing the release step directly via the build scripts is not recommended.
Containerized environment via Act:
Requirements: Docker and Act are required to be installed on your machine.
Execute this command in the root folder of your repository:
act -b -s VERSION="<MAJOR.MINOR.PATCH>" -j release
In case you also want to automatically create a valid Github release, you also need to provide a valid GITHUB_TOKEN as a secret (-s GITHUB_TOKEN=<token>). Please refer to the next section for information on how to finish and publish the release.
On Github Actions (CD):
Make sure that all required secrets for your release pipeline are configured in your Github repository. More information here.
To trigger the release pipeline from the Github UI, you can either close a milestone that has a valid version name (vMAJOR.MINOR.PATCH) or execute the release pipeline manually via the workflow_dispatch UI in the Actions tab (Actions -> release-pipeline -> Run Workflow). The release pipeline will automatically run the build, check, test, and release steps, and create a pull request for the new version as well as a draft release on Github. This automation can be referred to as continuous delivery.
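To make the milestone naming rule concrete, the following sketch checks whether a milestone title has the vMAJOR.MINOR.PATCH shape and extracts the version from it. The helper name version_from_milestone and the exact pattern are assumptions for illustration; the validation performed by the release workflow itself is not shown in this document and may differ.

```python
import re
from typing import Optional

# Assumed pattern: a leading "v" followed by three dot-separated numbers.
MILESTONE_NAME = re.compile(r"^v(\d+)\.(\d+)\.(\d+)$")


def version_from_milestone(title: str) -> Optional[str]:
    """Return MAJOR.MINOR.PATCH if the title looks like vMAJOR.MINOR.PATCH, else None."""
    match = MILESTONE_NAME.match(title.strip())
    if match is None:
        return None
    return ".".join(match.groups())


print(version_from_milestone("v1.2.3"))
print(version_from_milestone("next release"))
```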
After successful execution of the release pipeline, the following steps are required to finish the release:
- Merge the release PR into main, preferably via merge commit to keep the version tag in the main branch. We suggest using the following message for the merge commit: Finalize release for version <VERSION> (#<PR>).
- Adapt the changelog of the draft release on Github (in the release section). Mention all other changes that are not covered by pull requests.
- Publish the release.
Resolve an unsuccessful release:
In case the release pipeline fails at any step, we suggest fixing the problem based on the release pipeline logs and creating a new release with an incremented patch version. To clean up the unsuccessful release, make sure to delete the following artifacts (if they exist): the release branch, the release PR, the version tag, the draft release, and any release artifact that was already published (e.g. on DockerHub, NPM, or PyPI).
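Incrementing the patch version for the retry can be sketched as follows. The helper next_patch_version is a hypothetical name used only for this example, and it assumes a plain MAJOR.MINOR.PATCH string without a pre-release tag.

```python
def next_patch_version(version: str) -> str:
    """Return the given MAJOR.MINOR.PATCH version with PATCH incremented by one."""
    parts = version.split(".")
    if len(parts) != 3:
        raise ValueError(f"expected MAJOR.MINOR.PATCH, got {version!r}")
    major, minor, patch = (int(part) for part in parts)
    return f"{major}.{minor}.{patch + 1}"


print(next_patch_version("0.6.18"))  # prints 0.6.19
```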
Support for Nested Components
You can find the implementation of this multi-nested example in the examples folder.
ML-buildkit has excellent support for repositories that contain multiple nested components (aka monorepo). The following examples repository has four components: docs, react-webapp, docker, and python-lib:
examples:
  - build.py
  - docs:
    - build.py
  - react-webapp:
    - build.py
  - docker:
    - build.py
  - python-lib:
    - build.py
Every component needs its own build.py script in the component root folder that implements all the logic to build, check, test, and release the given component. The build.py script in the repo root folder contains the build logic that orchestrates all component builds. ML-buildkit provides the build_utils.build() function, which lets you call the build script of a sub-component with the parsed arguments (find more info on the build function in the API documentation).
In between the build steps, you can execute any required operations, for example, duplicating build artifacts from one component to another. The following example shows the build.py script that would support the examples repository structure:
from ml_buildkit import build_utils
args = build_utils.parse_arguments()
build_utils.build("react-webapp", args)
build_utils.build("python-lib", args)
build_utils.duplicate_folder("./python-lib/docs/", "./docs/docs/api-docs/")
build_utils.build("docker", args)
build_utils.build("docs", args)
With this setup, you can execute the build pipeline for the full project or any individual component. In case you only apply changes to a single component, you only need to execute the build.py script of the given component. This is a major advantage since it can considerably shorten your development cycle.
To run the build pipeline on your local machine only for a specific component, navigate to the component and run the build.py script in the component root folder (you can find all CLI build arguments here):
cd "./docs" && python build.py [BUILD_ARGUMENTS]
Alternatively, you can also run the component build containerized via Act:
act -b -s BUILD_ARGS="[BUILD_ARGUMENTS]" -s WORKING_DIRECTORY="./docs" -j build
Or directly from the Github UI: Actions -> build-pipeline -> Run workflow. The Github UI will allow you to set the build arguments and working directory.
Simplified Versioning
Only semantic versioning is supported at the moment.
If you do not provide an explicit version via the build arguments (--version), ml-buildkit will automatically detect the latest version via Git tags and pass a dev version to your build scripts. The dev version will have the following format: <MAJOR>.<MINOR>.<PATCH>-dev.<BRANCH>. This should be sufficient for the majority of development builds. However, the release step still requires a valid semantic version to be provided via the arguments.
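The dev version format can be illustrated with a small stand-alone sketch. It only assembles the string from a tag and a branch name that you pass in. How ml-buildkit reads Git tags, whether it bumps any version part, and how it sanitizes branch names are not described here, so the helper dev_version and its branch clean-up rule are assumptions.

```python
import re

# Accepts tags such as "v0.6.18" or "0.6.18".
SEMVER_TAG = re.compile(r"^v?(\d+)\.(\d+)\.(\d+)$")


def dev_version(latest_tag: str, branch: str) -> str:
    """Return <MAJOR>.<MINOR>.<PATCH>-dev.<BRANCH> for the given tag and branch."""
    match = SEMVER_TAG.match(latest_tag)
    if match is None:
        raise ValueError(f"not a MAJOR.MINOR.PATCH tag: {latest_tag!r}")
    major, minor, patch = match.groups()
    # Assumption: characters that are not allowed in a semver pre-release
    # identifier (anything except letters, digits, and hyphens) become "-".
    safe_branch = re.sub(r"[^0-9A-Za-z-]", "-", branch)
    return f"{major}.{minor}.{patch}-dev.{safe_branch}"


print(dev_version("v0.6.18", "main"))             # prints 0.6.18-dev.main
print(dev_version("0.6.18", "feature/new-flag"))  # prints 0.6.18-dev.feature-new-flag
```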
Python Utilities
The build_python module of ml-buildkit provides a collection of utilities to simplify the process of building and releasing Python packages. Refer to the API documentation for full documentation on all Python utilities. An example of a build script for a Python package is shown below:
import os

from ml_buildkit import build_utils
from ml_buildkit.helpers import build_python

# Project specific configuration
MAIN_PACKAGE = "template_package"

# Directory that contains this build script
HERE = os.path.abspath(os.path.dirname(__file__))

args = build_python.parse_arguments()
version = args.get(build_utils.FLAG_VERSION)

# Update version in __version__.py
build_python.update_version(
    os.path.join(HERE, f"src/{MAIN_PACKAGE}/__version__.py"), str(version)
)

if args.get(build_utils.FLAG_MAKE):
    # Install pipenv dev requirements
    build_python.install_build_env()
    # Build distribution via setuptools
    build_python.build_distribution()

if args.get(build_utils.FLAG_CHECK):
    build_python.code_checks()

if args.get(build_utils.FLAG_TEST):
    build_utils.run('pipenv run pytest -m "not slow"')
    if "slow" in args.get(build_utils.FLAG_TEST_MARKER):
        build_python.test_with_py_version(python_version="3.6.12")

if args.get(build_utils.FLAG_RELEASE):
    # Publish distribution on pypi
    build_python.publish_pypi_distribution(
        pypi_token=args.get(build_python.FLAG_PYPI_TOKEN),
        pypi_repository=args.get(build_python.FLAG_PYPI_REPOSITORY),
    )
The build_python.parse_arguments() argument parser has the following additional flags:
Flag | Type | Description |
---|---|---|
FLAG_PYPI_TOKEN | str | Personal access token for PyPI account. |
FLAG_PYPI_REPOSITORY | str | PyPI repository for publishing artifacts. |
And the following additional CLI options:
- --pypi-token: Personal access token for PyPI account.
- --pypi-repository: PyPI repository for publishing artifacts.
Docker Utilities
The build_docker module of ml-buildkit provides a collection of utilities to simplify the process of building and releasing Docker images. Refer to the API documentation for full documentation on all Docker utilities. An example of a build script for a Docker image is shown below:
from ml_buildkit import build_utils
from ml_buildkit.helpers import build_docker

IMAGE_NAME = "build-environment"
DOCKER_IMAGE_PREFIX = "khulnasoft"

args = build_docker.parse_arguments()
version = args.get(build_utils.FLAG_VERSION)

if args.get(build_utils.FLAG_MAKE):
    build_docker.build_docker_image(IMAGE_NAME, version)

if args.get(build_utils.FLAG_CHECK):
    build_docker.lint_dockerfile()

if args.get(build_utils.FLAG_RELEASE):
    build_docker.release_docker_image(IMAGE_NAME, version, DOCKER_IMAGE_PREFIX)
The build_docker.parse_arguments() argument parser has the following additional flags:
Flag | Type | Description |
---|---|---|
FLAG_DOCKER_IMAGE_PREFIX | str | Docker image prefix. This should be used to define the container registry where the image should be pushed to. |
And the following additional CLI options:
- --docker-image-prefix: Docker image prefix. This should be used to define the container registry where the image should be pushed to.
MkDocs Utilities
The build_mkdocs module of ml-buildkit provides a collection of utilities to simplify the process of building and releasing MkDocs documentation. Refer to the API documentation for full documentation on all MkDocs utilities. An example of a build script for MkDocs documentation is shown below:
from ml_buildkit import build_utils
from ml_buildkit.helpers import build_mkdocs

args = build_utils.parse_arguments()

if args.get(build_utils.FLAG_MAKE):
    # Install pipenv dev requirements
    build_mkdocs.install_build_env()
    # Build mkdocs documentation
    build_mkdocs.build_mkdocs()

if args.get(build_utils.FLAG_CHECK):
    build_mkdocs.lint_markdown()

if args.get(build_utils.FLAG_RELEASE):
    # Deploy to Github pages
    build_mkdocs.deploy_gh_pages()
Extensibility
Extend your build-environment image with additional tools
Install the tools in the Dockerfile at .github/actions/build-environment/Dockerfile as demonstrated in this example:
FROM khulnasoft/build-environment:0.6.18
# Install Go Runtime
RUN apt-get update \
&& apt-get install -y golang-go
Extend the entrypoint of the build-environment
You can extend or overwrite the default entrypoint with your custom entrypoint script (e.g. extended-entrypoint.sh) as shown below:
FROM khulnasoft/build-environment:0.6.18
COPY extended-entrypoint.sh /extended-entrypoint.sh
RUN chmod +x /extended-entrypoint.sh
ENTRYPOINT ["/tini", "-g", "--", "/extended-entrypoint.sh"]
The following extended-entrypoint.sh example demonstrates how to extend and reuse the existing default entrypoint:
# Stops script execution if a command has an error
set -e
echo "Setup Phase"
# TODO: Do your custom setups here
# Call the default build-environment entrypoint.
# Disable the immediate script execution stop so that the cleanup phase can run in any case
set +e
# Thereby, you can reuse the existing implementation:
/bin/bash /entrypoint.sh "$@"
# Save the exit code of the previous command
exit_code=$?
echo "Cleanup Phase"
# TODO: Do additional cleanup
# Exit the script with the exit code of the actual entrypoint execution
exit $exit_code
Support additional build arguments
The following example demonstrates how you can support custom build arguments (CLI) in your build.py script:
import argparse
from ml_buildkit import build_utils
parser = argparse.ArgumentParser()
parser.add_argument("--deployment-token", help="Token to deploy component.", default="")
args = build_utils.parse_arguments(argument_parser=parser)
deployment_token = args.get("deployment_token")
Once it is implemented in your build script, you can provide the build argument via the CLI options: python build.py --deployment-token=my-token. If your custom argument is a string and has a default string value (e.g. default=""), you can also provide the build argument via environment variables: DEPLOYMENT_TOKEN=mytoken python build.py.
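The environment variable behavior described above can be pictured with the following stand-alone sketch. It is not ml-buildkit's implementation: the helper parse_with_env_fallback and its rule (a string argument that still holds its default is replaced by the upper-cased variable of the same name) are assumptions chosen to match the DEPLOYMENT_TOKEN example.

```python
import argparse
import os
from typing import Dict, Mapping, Optional, Sequence


def parse_with_env_fallback(
    parser: argparse.ArgumentParser,
    argv: Sequence[str],
    environ: Optional[Mapping[str, str]] = None,
) -> Dict[str, object]:
    """Parse argv and fill untouched string arguments from environment variables."""
    env = os.environ if environ is None else environ
    args = vars(parser.parse_args(list(argv)))
    for name, value in args.items():
        default = parser.get_default(name)
        # Only string arguments that still hold their default value are
        # looked up, e.g. deployment_token -> DEPLOYMENT_TOKEN.
        if isinstance(default, str) and value == default:
            args[name] = env.get(name.upper(), value)
    return args


parser = argparse.ArgumentParser()
parser.add_argument("--deployment-token", help="Token to deploy component.", default="")

from_env = parse_with_env_fallback(parser, [], {"DEPLOYMENT_TOKEN": "mytoken"})
from_cli = parse_with_env_fallback(
    parser, ["--deployment-token=my-token"], {"DEPLOYMENT_TOKEN": "mytoken"}
)
print(from_env["deployment_token"], from_cli["deployment_token"])
```

In this sketch a value given on the command line wins over the environment variable, which is a design choice of the example rather than a documented guarantee.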
To use your custom build arguments inside the release pipeline, you need to add the DEPLOYMENT_TOKEN as a secret to your Github repository (more info here) and adapt the .github/workflows/release-pipeline.yml file by adding the DEPLOYMENT_TOKEN as an environment variable (env) to the steps that need this build argument, for example:
- name: release-components
  uses: ./.github/actions/build-environment
  env:
    DEPLOYMENT_TOKEN: ${{ secrets.DEPLOYMENT_TOKEN }}
Use custom test markers to select tests for execution
You can provide any number of custom test markers via the --test-marker build argument. The following example shows how to react to custom test markers in your build script:
if args.get(build_utils.FLAG_TEST):
    # Run your default tests
    if "integration" in args.get(build_utils.FLAG_TEST_MARKER):
        # Run integration tests
        pass
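One way to turn the provided markers into a test invocation is sketched below, assuming the tests are run with pytest and that optional test groups are tagged with pytest markers named slow and integration. The helper pytest_command is hypothetical and not part of ml-buildkit. It deselects every optional group that was not requested.

```python
from typing import Iterable, List, Sequence


def pytest_command(
    test_markers: Iterable[str],
    optional_markers: Sequence[str] = ("slow", "integration"),
) -> List[str]:
    """Build a pytest command that skips optional groups unless requested."""
    requested = set(test_markers)
    excluded = [marker for marker in optional_markers if marker not in requested]
    command = ["pytest"]
    if excluded:
        # pytest's -m option accepts boolean marker expressions.
        command += ["-m", " and ".join(f"not {marker}" for marker in excluded)]
    return command


print(pytest_command([]))
print(pytest_command(["integration"]))
print(pytest_command(["slow", "integration"]))
```

The resulting list could then be joined into a string and handed to whatever command runner your build script uses.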
Containerized Development
The build-environment can also be used for development inside a container. It is fully compatible with the devcontainer standard that is used by VS Code and Github Codespaces. The big advantage of using the build-environment for containerized development is that you only have to define your project dependencies in one location, and use this for development, local builds, and automated CI / CD pipelines.
To use the build-environment for containerized development, define a .devcontainer/devcontainer.json configuration inside your repository and point its build.dockerfile setting to the Dockerfile of the build-environment action at .github/actions/build-environment/Dockerfile. A minimal devcontainer.json configuration could look like this:
{
  "name": "build-environment",
  "build": {
    "dockerfile": "../.github/actions/build-environment/Dockerfile"
  },
  "settings": {
    // Set default container specific vs code settings
    "terminal.integrated.shell.linux": "/bin/bash"
  },
  "extensions": [
    // Add required extensions
  ]
}
You can find a full example here.
FAQ & Known Issues
Act: Error response from daemon - volume is in use
Sometimes the act containers are not removed properly and are blocking any subsequent act executions of your workflow. As a workaround, you can just remove all act containers:
docker rm -f $(docker ps -a --filter="name=^act-" -q)
How to access the host from Docker Containers in GitHub Actions / Act or containers from the host
If you want to access the host (in act the pipeline container and on GitHub Actions the Linux VM) from within a container, you can set an environment variable in the workflow file with this step:
- name: set-host-ip
  run: echo "::set-env name=_HOST_IP::$(hostname -I | cut -d ' ' -f 1)"
  # new syntax which is not yet supported on act:
  # run: echo "_HOST_IP=$(hostname -I | cut -d ' ' -f 1)" >> "$GITHUB_ENV"
and then access the environment variable from within a container. This way you can, for example, access other containers with published ports or other host services.
If you want to access a container directly without going through the host, you can get the IP address for example in the following way:
container_id=<CONTAINER-ID-OR-NAME>
container_ip=$(docker inspect $container_id | jq -r '.[0].NetworkSettings.Networks.bridge.IPAddress')
Note that the tool jq has to be installed. If you run a Python script and use the Docker client, the command looks different, of course.
If you do not attach the started container to a custom network, it is reachable from the host (GitHub Actions & Act) as well as from other containers under this $container_ip address. However, it is not reachable from your local machine (e.g. your Mac). For that, you have to publish the port and use the $_HOST_IP address as explained above. The host port should be assigned randomly so that the setup is as host-independent as possible. You can look up the randomly assigned port via bash as follows:
container_id=<CONTAINER-ID-OR-NAME>
container_port=<INNER-CONTAINER-PORT>
container_host_port=$(docker inspect $container_id | jq -r '.[0].NetworkSettings.Ports["'$container_port'/tcp"][0].HostPort')
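The same lookup as the jq expression above can be sketched in Python with only the standard json module, for cases where you already have the docker inspect output as text. The helper host_port_from_inspect is a hypothetical name, and the sample document is a heavily trimmed, made-up stand-in for real docker inspect output.

```python
import json


def host_port_from_inspect(inspect_output: str, container_port: int) -> str:
    """Return the published host port for container_port/tcp from inspect JSON."""
    data = json.loads(inspect_output)
    bindings = data[0]["NetworkSettings"]["Ports"].get(f"{container_port}/tcp")
    if not bindings:
        # Missing key, null, or empty list: the port is not published.
        raise LookupError(f"port {container_port}/tcp is not published")
    return bindings[0]["HostPort"]


# Made-up sample that keeps only the fields used by the lookup.
sample_output = json.dumps(
    [{"NetworkSettings": {"Ports": {"8080/tcp": [{"HostIp": "0.0.0.0", "HostPort": "49153"}]}}}]
)
print(host_port_from_inspect(sample_output, 8080))  # prints 49153
```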
In your code, you should then check whether the $_HOST_IP variable is set and, if not, use localhost. This way, it will work on GitHub Actions, Act, and your local machine. Here is a Python example:
import os

import docker

client = docker.from_env()
container_name = "test-container"
container_port = 8080

# Publish the container port on a random host port (None) and run detached.
container = client.containers.run(
    "some-image:1.2.3",
    name=container_name,
    ports={f"{container_port}/tcp": None},
    detach=True,
)
# Refresh the container attributes so that the assigned host port is available.
container.reload()

# Fall back to localhost when _HOST_IP is not set (e.g. on your local machine).
ip_address = os.getenv("_HOST_IP", "localhost")
os.environ["CONTAINER_NAME"] = container_name
os.environ["CONTAINER_IP"] = ip_address
container_host_port = container.attrs["NetworkSettings"]["Ports"][f"{container_port}/tcp"][0]["HostPort"]
os.environ["CONTAINER_HOST_PORT"] = container_host_port
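A test could then assemble the address of the started service from these values. The following sketch assumes the container exposes an HTTP service. The helper service_url is hypothetical, and the environment is passed in as a mapping so the fallback can be tried without touching the real process environment.

```python
import os
from typing import Mapping, Optional


def service_url(host_port: str, environ: Optional[Mapping[str, str]] = None) -> str:
    """Build the service URL, using _HOST_IP if set and localhost otherwise."""
    env = os.environ if environ is None else environ
    # An unset or empty _HOST_IP falls back to localhost.
    host = env.get("_HOST_IP") or "localhost"
    return f"http://{host}:{host_port}"


print(service_url("49153", {}))                         # prints http://localhost:49153
print(service_url("49153", {"_HOST_IP": "172.17.0.1"}))  # prints http://172.17.0.1:49153
```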
Contribution
- Pull requests are encouraged and always welcome. Read our contribution guidelines and check out help-wanted issues.
- Submit Github issues for any feature request and enhancement, bugs, or documentation problems.
- By participating in this project, you agree to abide by its Code of Conduct.
- The development section below contains information on how to build and test the project after you have implemented some changes.
Development
Requirements: Docker and Act are required to be installed on your machine to execute the build process.
To simplify the process of building this project from scratch, we provide build-scripts that run all necessary steps (build, check, test, and release) within a containerized environment. To build and test your changes, execute the following command in the project root folder:
act -b -j build
Refer to our contribution guides for more detailed information on our build scripts and development process.
Licensed MIT. Created and maintained with ❤️ by developers from Berlin.