Dynamic Make in Python

DynaMake 0.6.2 - Dynamic Make in Python

HOW

To install, run pip install --user dynamake (or sudo pip install dynamake if you have sudo privileges).

Then, create a DynaMake.py file. Here is a trivial one:

from dynamake import *

@step(output='foo')
async def copy_bar_to_foo() -> None:
    require('bar')
    await shell('cp bar foo')

Which is equivalent to the Makefile:

foo: bar
    cp bar foo

The Python version is more verbose, so if this were all there was to it, make would have been preferable. However, DynaMake allows one to write build scripts that are impossible to express in make, which justifies the additional syntax. See the tutorial for the details.

WHY

“What the world needs is another build tool”

—Way too many people

So, why yet another one?

DynaMake’s raisons d’être are:

  • First class support for fully dynamic build graphs.

  • Reproducible builds.

  • Python implementation.

DynaMake was created to address a concrete need for repeatable configurable processing in the context of scientific computation pipelines, but should be applicable in wider problem domains.

Dynamic Build Graphs

This is a fancy way of saying that the following are supported:

Dynamic inputs: The full set of inputs of a build step may depend on a subset of its inputs.

A classic example of dynamic inputs is compiling a C source file, which actually depends on all the included header files. Therefore, for a safe reproducible build, the build tool first needs to ensure the C source file is up-to-date (it might be generated by some tool such as YACC), then scan the file for included header file names, recursively ensure each of these is up-to-date and scan them for further included header files, and only once all this is done, we can tell the exact list of file dependencies for compiling the C source file into an object file.
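
The recursive scan described above can be sketched in plain Python. This is only an illustration of the idea, not DynaMake's implementation: the function name collect_dependencies, the regular expression, and the choice to follow only quoted includes that exist next to the including file are all assumptions made for this example.

```python
import re
import tempfile
from pathlib import Path
from typing import Optional, Set

# Assumption: only quoted includes (#include "x.h") name project files worth tracking.
INCLUDE_PATTERN = re.compile(r'^\s*#\s*include\s+"([^"]+)"', re.MULTILINE)


def collect_dependencies(source: Path, seen: Optional[Set[Path]] = None) -> Set[Path]:
    """Return the source file plus every header it includes, directly or indirectly."""
    if seen is None:
        seen = set()
    if source in seen:
        return seen  # Already scanned; this also stops include cycles.
    seen.add(source)
    # A real build tool would first bring `source` up to date here (it may be
    # generated by another step), and only then scan its contents.
    for name in INCLUDE_PATTERN.findall(source.read_text()):
        header = source.parent / name
        if header.exists():
            collect_dependencies(header, seen)
    return seen


# Hypothetical project: main.c includes a.h, and a.h and b.h include each other.
with tempfile.TemporaryDirectory() as directory:
    root = Path(directory)
    (root / 'main.c').write_text(
        '#include <stdio.h>\n#include "a.h"\nint main(void) { return 0; }\n')
    (root / 'a.h').write_text('#include "b.h"\n')
    (root / 'b.h').write_text('#include "a.h"\n')  # Deliberate cycle.
    (root / 'unused.h').write_text('')
    dependency_names = sorted(path.name for path in collect_dependencies(root / 'main.c'))

print(dependency_names)
```

The full dependency list (here main.c, a.h and b.h, but not unused.h) is only known after the files themselves have been read, which is exactly what makes the inputs dynamic.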

Dynamic inputs are supported in various ways by most build tools - some of these ways being more convoluted than others. DynaMake provides natural first-class support for such cases, inspired by the shake build system. This video is a nice short explanation of the approach.

Multiple outputs: A build step may create multiple output files. A classic example of multiple outputs is YACC which generates both a header file and a source file from a single parser grammar file.

Surprisingly, multiple outputs are not supported by the venerable make program. However, many more modern tools do support this feature.

Dynamic outputs: The set of outputs of a build step may depend on its inputs.

An example of dynamic outputs is running a clustering step on some large data, which may produce some number of clusters which is not known in advance. Each of these clusters needs to go through some further processing. In DynaMake, one can specify output='clusters/{*_id}.txt' to specify that the step will produce several output files with different identifiers.

Dynamic outputs are not supported by the vast majority of build tools. A notable exception is snakemake which provides support for some restricted cases.

Pattern Steps: A build step may apply to a parameterized set of outputs rather than to specific outputs.

Restricted forms of pattern steps have existed starting from the venerable make .c.o: and %.o: %.c pattern rules. However, in many tools (such as make) these pattern steps are restricted to having a single parameter (in this case, the base name of the output file). In the case of make, these rules are also very fragile (silently ignored when make can’t create a dependency, leading to broken builds).

DynaMake provides a more general form of pattern rules with an unlimited number of named parameters (e.g. output='results/{*method}/outputs/{*parameters}.data'), which allows the build step to naturally use the captured values of the parameters inside the step (e.g. await shell(f'{method} < {parameters} > {output()}')).
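
To make the idea of named captures concrete, here is a minimal sketch of how such a pattern could be matched against a concrete path. It is not DynaMake's own matching code: the helpers pattern_to_regex and match_output are hypothetical, and the rule that a captured value may not contain a / is an assumption made for this example.

```python
import re
from typing import Dict, Optional


def pattern_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate each {*name} placeholder into a named regular expression group."""
    parts = []
    position = 0
    for placeholder in re.finditer(r'\{\*(\w+)\}', pattern):
        # Literal text between placeholders must match exactly.
        parts.append(re.escape(pattern[position:placeholder.start()]))
        # Assumption: a captured value stays within one path component.
        parts.append(f'(?P<{placeholder.group(1)}>[^/]+)')
        position = placeholder.end()
    parts.append(re.escape(pattern[position:]))
    return re.compile(''.join(parts))


def match_output(pattern: str, path: str) -> Optional[Dict[str, str]]:
    """Return the captured parameter values, or None if the path does not match."""
    match = pattern_to_regex(pattern).fullmatch(path)
    return match.groupdict() if match else None


captured = match_output('results/{*method}/outputs/{*parameters}.data',
                        'results/kmeans/outputs/k5.data')
print(captured)
```

With two named parameters, the step body can refer to method and parameters separately, which a single-stem rule such as %.o: %.c cannot express.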

The downside of dynamic build graphs is that they make some build tool features very hard to implement. Retrofitting dynamic graphs into an existing build tool therefore tends to break those features, in the worst case leading to silently broken builds. Examples of such features are:

  • The ability to aggressively optimize the case when a build needs to do nothing at all, and in general reduce the build system overhead.

  • The ability to perform a dry run that accurately lists all the steps that will be needed to build an arbitrary target.

  • Having a purely declarative build language, which can be more easily learned than any programming language (even Python :-) and may be processed as pure data by additional tools.

Reproducible Builds

By definition, all build tools will correctly rebuild outputs if any of their dependencies change. However, most build tools will not rebuild the outputs if the actions to create them were changed (e.g., adding/removing compilation flags).

By default, DynaMake tracks the exact actions that were used in the past to generate every output and will rebuild the output if this has changed in any way. This requires DynaMake to maintain state between builds inside a sub-directory (by default, .dynamake, but you can override it using the DYNAMAKE_PERSISTENT_DIR environment variable).

There are good reasons to avoid any such additional persistent state. DynaMake allows disabling this feature. Specifying the --rebuild_changed_actions False command line flag will instruct DynaMake to rely only on the modification times of the input files. This of course results in less reliable rebuilds.
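
The difference between the two modes can be sketched as follows. This is an illustration under stated assumptions, not DynaMake's actual logic or state format: the functions needs_rebuild and record_action, the track_actions parameter, and the actions.json file are all hypothetical names invented for this example.

```python
import json
import os
import tempfile
from pathlib import Path
from typing import List


def needs_rebuild(output: Path, inputs: List[Path], action: str,
                  state_file: Path, track_actions: bool = True) -> bool:
    """Decide whether an output is stale, optionally comparing the recorded action."""
    if not output.exists():
        return True
    output_time = output.stat().st_mtime_ns
    if any(path.stat().st_mtime_ns > output_time for path in inputs):
        return True  # Some input is newer than the output.
    if not track_actions:
        return False  # Timestamps only: a changed command goes unnoticed.
    recorded = json.loads(state_file.read_text()) if state_file.exists() else {}
    return recorded.get(str(output)) != action


def record_action(output: Path, action: str, state_file: Path) -> None:
    """Remember the exact command that produced the output."""
    recorded = json.loads(state_file.read_text()) if state_file.exists() else {}
    recorded[str(output)] = action
    state_file.write_text(json.dumps(recorded))


with tempfile.TemporaryDirectory() as directory:
    root = Path(directory)
    bar, foo, state = root / 'bar', root / 'foo', root / 'actions.json'
    bar.write_text('data')
    foo.write_text('data')
    # Set explicit modification times so that foo is newer than bar.
    os.utime(bar, ns=(1_000_000_000, 1_000_000_000))
    os.utime(foo, ns=(2_000_000_000, 2_000_000_000))
    record_action(foo, 'cp bar foo', state)
    unchanged = needs_rebuild(foo, [bar], 'cp bar foo', state)
    changed = needs_rebuild(foo, [bar], 'cp -p bar foo', state)
    timestamps_only = needs_rebuild(foo, [bar], 'cp -p bar foo', state,
                                    track_actions=False)

print(unchanged, changed, timestamps_only)
```

In this sketch, changing the command from cp bar foo to cp -p bar foo triggers a rebuild only when the recorded action is consulted; with timestamps alone the stale output is kept.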

Python

DynaMake is heavily inspired by shake. However, shake is implemented in Haskell, which is unlikely to be pre-installed on a typical machine, and installing it isn’t trivial (especially when one has no sudo privileges). In addition, shake rules are written in Haskell, which is very different from most popular programming languages.

In contrast, Python is much more likely to already be installed on a typical machine, and installing DynaMake is trivial using pip install --user dynamake (or sudo pip install dynamake if you have sudo privileges). The build rules are written in Python, which many more people are familiar with, and is simpler to pick up.

WHY NOT

DynaMake’s unique blend of features comes at some costs:

  • It is a new, immature tool. As such, it lacks some features it could/should provide, is less efficient than it could be, and you may encounter the occasional bug. Hopefully this will improve with time. If you want DynaMake-like features with a proven track record, you should consider shake.

  • The provided goals, as described above, may be a poor fit for your use case.

    If your build graph and configuration are truly static, consider using Ninja which tries to maximize the benefits of such a static build pipeline. It is almost the opposite of DynaMake in this respect.

    If your build graph is only “mostly static” (e.g., it just needs a restricted form of dynamic inputs, such as included header files), then there are too many other options to list here. Using the classical make is a good default choice.

  • DynaMake is a low-level build tool, on par with make and ninja.

    If you are looking for a tool that comes with a lot of built-in rules for dealing with specific computer languages (say, C/C++), and will automatically deal with cross-platform issues, consider using CMake or XMake instead.

WHAT NOT (YET)

Since DynaMake is very new, there are many features that should be implemented, but haven’t been worked on yet:

  • Allow forcing rebuilding (some) targets.

  • Allow skipping generating intermediate files if otherwise no actions need to be done. This is very hard to do with a dynamic build graph - probably impossible in the general case, but common cases might be possible(?)

  • Generate a tree (actually a DAG) of step invocations. This can be collected from the persistent state files.

  • Generate a visualization of the timeline of action executions showing start and end times, with resource consumption. This would be similar to the profiling capabilities of shake.

  • Allow using checksums instead of timestamps to determine if actions can be skipped, either by default or on a per-file basis.

History

0.6.0

  • First release.

0.6.1

  • Patch travis-ci build and links in README.rst.

0.6.2

  • Improved project template.

  • Improved mypy configuration.
