
General-purpose build system for projects spanning multiple git repos

Project description

OBSOLETE: you don’t want to use this. Use Bazel instead.

This was a project I worked on in 2013 at Cloudscaling. The goal was to have a build system for projects spanning multiple source repositories, with a rule language and operational semantics similar to Bazel’s BUILD files and an emphasis on correctness and repeatability. Bazel was not yet released to the public at that time, and I was not aware of any other build systems that were neatly integrated with a version control system, or which directly addressed cross-project source dependencies, so I started afresh.

Butcher makes various assumptions that make it impractical to use as-is:

  • Assumes that all your sources are in git repositories
  • Assumes all your git repositories share the same base URL, unless you map each of them to a URL with --map_repo
  • Assumes that all the projects you want to build have butcher-style BUILD files in their source repo

I was a relatively inexperienced programmer back then, so Butcher is also rife with design decisions that I would make very differently now, were I to start a project like this again. Nevertheless, I did learn a lot from this project, and Butcher was quite useful as an internal build tool at Cloudscaling for our various projects.

Butcher target addresses are very much like Bazel’s target labels with an additional syntax for specifying a git ref in a remote repository:

//repo_name[git_ref]/dir/in/project:target

If the [git_ref] part of an address is omitted, butcher uses the value given with --default_ref on the command line. The default is HEAD, which in a remote git repository usually resolves to master.
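
As an illustration of the address syntax above, here is a minimal sketch of how such an address could be split into its parts. It is not Butcher's actual parser: the names ADDRESS_RE, Address and parse_address are hypothetical, and the exact set of characters allowed in each part is an assumption.

```python
import re
from typing import NamedTuple

# Hypothetical pattern for //repo_name[git_ref]/dir/in/project:target.
ADDRESS_RE = re.compile(
    r"^//(?P<repo>[^/\[\]:]+)"    # repo name
    r"(?:\[(?P<ref>[^\]]+)\])?"   # optional [git_ref]
    r"(?:/(?P<path>[^:]*))?"      # optional directory inside the project
    r":(?P<target>[^:/]+)$"       # target name
)


class Address(NamedTuple):
    repo: str
    ref: str
    path: str
    target: str


def parse_address(address: str, default_ref: str = "HEAD") -> Address:
    """Split an address into parts, filling in default_ref when [git_ref] is omitted."""
    match = ADDRESS_RE.match(address)
    if match is None:
        raise ValueError(f"not a valid target address: {address!r}")
    return Address(
        repo=match.group("repo"),
        ref=match.group("ref") or default_ref,
        path=match.group("path") or "",
        target=match.group("target"),
    )
```

Here default_ref plays the role of the --default_ref flag: parse_address("//myrepo/src/lib:core") yields the ref "HEAD", while parse_address("//myrepo[v1.2]/src/lib:core") yields "v1.2".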

To associate any //repo_name with an actual git repository URL, use another command-line flag: --map_repo=<repo_name>:<git@url.here:etc/etc>.
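
The mapping described above could be resolved roughly as follows. This is a sketch under assumptions, not Butcher's implementation: parse_repo_map and resolve_repo_url are hypothetical names, and joining the shared base URL and the repo name with a slash is a guess at how the fallback might work.

```python
from typing import Dict, Iterable


def parse_repo_map(flag_values: Iterable[str]) -> Dict[str, str]:
    """Turn values like 'name:git@host:org/name.git' into {name: url}."""
    repo_map: Dict[str, str] = {}
    for value in flag_values:
        # Split on the first colon only, because the URL may contain colons.
        name, sep, url = value.partition(":")
        if not sep or not name or not url:
            raise ValueError(f"expected <repo_name>:<url>, got {value!r}")
        repo_map[name] = url
    return repo_map


def resolve_repo_url(repo_name: str, repo_map: Dict[str, str], base_url: str) -> str:
    """Prefer an explicit mapping; otherwise fall back to the shared base URL."""
    if repo_name in repo_map:
        return repo_map[repo_name]
    # Assumption: unmapped repos live directly under the base URL.
    return f"{base_url.rstrip('/')}/{repo_name}"
```

Splitting on the first colon only matters because SSH-style git URLs such as git@host:org/repo.git contain a colon of their own.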

Bazel has a much better and more thorough design for working with remote project repositories; see the Bazel documentation on working with external dependencies for more information.

There are a few types of build rules implemented in Butcher. Most of these have reasonably good documentation as docstrings in their implementation classes, and it should be fairly obvious how one might implement additional rule types after looking through a few of the existing rules.

  • genrule runs arbitrary shell commands as a build step that produces the stated outputs. This includes a Makefile-like sublanguage for command-line expansion when defining targets (see the source for details).
  • gendeb packages the output of other rules into Debian packages. This assumes that you have fpm installed locally, which is not ideal.
  • filegroup collects files (sources, or outputs of other rules) and gives them a collective name, which can then be used as an input to other rules.
  • pkgfilegroup is similar to filegroup, but it adds metadata that is specifically useful as inputs to rules like gendeb, and a map for setting file ownership and permissions of the files in the eventual output package.
  • pkg_symlink appears to be unfinished, but it should be a way of putting a symlink in a gendeb package.
  • virtual targets can be used to group a bunch of other targets together as a single buildable address.

Butcher’s original project README follows below:


Butcher is a software build system in the spirit of Pants, Buck, and Blaze.

Like other similar tools, Butcher encourages the creation of small reusable modules and focuses on improving efficiency and speed of build processes. What sets Butcher apart is its integration with distributed git repositories rather than relying on large unified codebases.

Butcher uses a build cache to speed up incremental builds and avoid repeated work. The cache stores objects based on a combined checksum (referred to in the code and documentation as a metahash) of all the inputs used to produce them, and dedupes objects by each file’s own checksum. This system makes extensive use of hardlinks, so it is beneficial for Butcher to have its various working directories (cache, git clients, build area) on the same filesystem.
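
The caching idea can be sketched as below. This only illustrates the technique (a combined checksum over all inputs, plus per-file deduplication through hardlinks); it is not Butcher's code. The function names, the choice of SHA-256, the rule_text parameter and the objects/ and by_metahash/ directory layout are all assumptions made for the example.

```python
import hashlib
import os
from pathlib import Path
from typing import Iterable


def file_checksum(path: Path) -> str:
    """Checksum of a single file's contents."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def metahash(input_paths: Iterable[Path], rule_text: str = "") -> str:
    """Combined checksum over a rule definition and all of its input files."""
    digest = hashlib.sha256(rule_text.encode("utf-8"))
    # Sort so the result does not depend on the order inputs are listed in.
    for path in sorted(input_paths):
        digest.update(path.name.encode("utf-8"))
        digest.update(file_checksum(path).encode("ascii"))
    return digest.hexdigest()


def store(cache_dir: Path, key: str, output: Path) -> Path:
    """Cache an output under a metahash; identical contents share one object."""
    obj = cache_dir / "objects" / file_checksum(output)
    obj.parent.mkdir(parents=True, exist_ok=True)
    if not obj.exists():
        os.link(output, obj)  # hardlink: needs the same filesystem
    entry = cache_dir / "by_metahash" / key / output.name
    entry.parent.mkdir(parents=True, exist_ok=True)
    if not entry.exists():
        os.link(obj, entry)
    return entry
```

Because os.link creates hardlinks, it fails across filesystem boundaries, which is why keeping the cache, git clients and build area on one filesystem is beneficial.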

Limitations

General

  • Builds are currently sequential, not parallel.
  • Builds use the same build area until ‘butcher clean’ is run, which can potentially mask or introduce bugs.

Cache

  • Caching is keyed by metahash, but retrieval is done per-file.
  • The cache does not keep checksums of individual files for verification. It should.
  • Cache is local-only, not networked at all.

Upcoming features

  • Everything


Download files

Filename                              Size     File type  Python version
butcher-0.2.13-py2-none-any.whl       43.6 kB  Wheel      py2
butcher-0.2.13-py2.py3-none-any.whl   43.6 kB  Wheel      py2.py3
butcher-0.2.13.tar.gz                 38.2 kB  Source     None
