Task crunching with Ninja or Make
goeieDAG
/ɣu.jə.ˈdɑx/: hello, good day (Dutch greeting used during daytime)
goeieDAG provides a unified Python API to Ninja and Make (TODO) build systems, aiming to make it extremely easy to benefit from parallel processing in any graph-like workflow.
Installation
pip install goeieDAG==0.0.2
Usage
from pathlib import Path

import goeiedag
from goeiedag import ALL_INPUTS, INPUT, OUTPUT

workdir = Path("output")

graph = goeiedag.Graph()

# Extract OS name from /etc/os-release
graph.add(["grep", "^NAME=", INPUT, ">", OUTPUT],
          inputs=["/etc/os-release"],
          outputs=["os-name.txt"])

# Get username
graph.add(["whoami", ">", OUTPUT],
          inputs=[],
          outputs=["username.txt"])

# Glue together to produce output
graph.add(["cat", ALL_INPUTS, ">", OUTPUT.result],
          inputs=["os-name.txt", "username.txt"],
          outputs=dict(result="result.txt"))  # can also use a dictionary and refer to inputs/outputs by name

goeiedag.build_all(graph, workdir)

# Print output
print((workdir / "result.txt").read_text())
Q&A
Why use the files and commands model rather than Python objects and functions?
- It is a tested and proven paradigm (make traces back to 1976!)
- It provides an obvious way of evaluating which products need rebuilding (subject to an accurate dependency graph)
- It naturally isolates and parallelizes individual build tasks
- It is agnostic as to how data objects are serialized (convenient for the library author...)
- Graph edges are implicitly defined by input/output file names
- A high-quality executor (Ninja) is available and installable via a Python package
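Two of the points above, implicit edges and rebuild detection, can be illustrated with a small standard-library sketch. It does not show how goeieDAG or Ninja work internally: the `tasks` table, `derive_edges` and `needs_rebuild` are hypothetical names made up for this example, and the timestamp comparison is the classic Make-style rule, used here as a simplifying assumption.

```python
import os
import tempfile
from graphlib import TopologicalSorter  # Python 3.9+
from pathlib import Path

# Hypothetical task table mirroring the usage example: name -> (inputs, outputs)
tasks = {
    "os_name": (["/etc/os-release"], ["os-name.txt"]),
    "username": ([], ["username.txt"]),
    "result": (["os-name.txt", "username.txt"], ["result.txt"]),
}


def derive_edges(tasks):
    """Map each task to the set of tasks that produce one of its inputs."""
    producer = {out: name for name, (_, outs) in tasks.items() for out in outs}
    return {
        name: {producer[i] for i in ins if i in producer}
        for name, (ins, _) in tasks.items()
    }


def needs_rebuild(inputs, outputs):
    """Make-style check: rebuild if an output is missing or older than any input."""
    outs = [Path(p) for p in outputs]
    if not all(p.exists() for p in outs):
        return True
    newest_input = max((Path(p).stat().st_mtime for p in inputs), default=float("-inf"))
    return any(p.stat().st_mtime < newest_input for p in outs)


# Edges fall out of matching file names; no explicit wiring is needed.
edges = derive_edges(tasks)
order = list(TopologicalSorter(edges).static_order())

# Demonstrate the staleness check on throwaway files with explicit timestamps.
with tempfile.TemporaryDirectory() as tmp:
    src, dst = Path(tmp, "in.txt"), Path(tmp, "out.txt")
    src.write_text("data")
    missing = needs_rebuild([src], [dst])  # output does not exist yet
    dst.write_text("built")
    os.utime(src, (100, 100))
    os.utime(dst, (200, 200))
    fresh = needs_rebuild([src], [dst])    # output is newer than its input
    os.utime(src, (300, 300))
    stale = needs_rebuild([src], [dst])    # input changed after the build
```

Because "result" consumes files that "os_name" and "username" produce, it depends on both, while those two tasks share no files and could run in parallel.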
How is this different from using the Ninja package directly?
- Simpler mental model & usage: no need to separately define build rules or think about implicit/explicit inputs and outputs
- API accepts Paths; no need to cast everything to str!
- Higher-level API in general (for example, the output directory is created automatically)
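To make the "separately define build rules" point concrete, the sketch below renders the rule-plus-build pair that a Ninja file would need for the first task of the usage example. It uses plain string formatting rather than the Ninja package; `ninja_snippet` and the rule name `extract_os_name` are hypothetical, and the text goeieDAG actually generates may look different.

```python
def ninja_snippet(rule_name, command, inputs, outputs):
    """Render one rule and one build statement in Ninja's file syntax."""
    rule = f"rule {rule_name}\n  command = {command}\n"
    # A build statement names the outputs, then the rule, then the inputs.
    build = " ".join([f"build {' '.join(outputs)}:", rule_name, *inputs]) + "\n"
    return rule + build


# $in and $out are Ninja's built-in variables for a build statement's
# inputs and outputs.
text = ninja_snippet(
    "extract_os_name",
    "grep ^NAME= $in > $out",
    inputs=["/etc/os-release"],
    outputs=["os-name.txt"],
)
print(text)
```

In raw Ninja, every distinct command shape needs its own named rule before any build statement can reference it; the single `graph.add(...)` call in the usage example covers both steps.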
Similar projects
- Ninja (Python package) -- provides a lower-level API, used by goeieDAG as back-end
- TaskGraph -- similar project, but centered around Python functions and in-process parallelism
- Snakemake -- similar goals, but a stand-alone tool rather than a library
- Dask -- different execution model; caching of intermediate results is left up to the user
- doit