Task crunching with Ninja or Make

goeieDAG

/ɣu.jə.ˈdɑx/: hello, good day (Dutch greeting used during daytime)

goeieDAG provides a backend-neutral Python API to the Ninja and Make (TODO) build systems. The aim is to make it easy to benefit from parallel processing in any graph-like workflow.

Usage

from pathlib import Path

import goeiedag
from goeiedag import ALL_INPUTS, INPUT, OUTPUT

workdir = Path("output")

graph = goeiedag.CommandGraph()

# Extract OS name from /etc/os-release
graph.add(["grep", "^NAME=", INPUT, ">", OUTPUT],
          inputs=["/etc/os-release"],
          outputs=["os-name.txt"])
# Get username
graph.add(["whoami", ">", OUTPUT],
          inputs=[],
          outputs=["username.txt"])
# Glue together to produce output
graph.add(["cat", ALL_INPUTS, ">", OUTPUT.result],
          inputs=["os-name.txt", "username.txt"],
          outputs=dict(result="result.txt"))  # can also use a dictionary and refer to inputs/outputs by name

goeiedag.build_all(graph, workdir)

# Print output
print((workdir / "result.txt").read_text())

Q&A

Why use the files and commands model rather than Python objects and functions?

  • It is a tested and proven paradigm (make traces back to 1976!)
  • It provides an obvious way of evaluating which products need rebuilding (subject to an accurate dependency graph)
  • It naturally isolates and parallelizes individual build tasks
  • It is agnostic as to how data objects are serialized (convenient for the library author...)
  • Graph edges are implicitly defined by input/output file names
  • A high-quality executor (Ninja) is available and installable via a Python package
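
Two of the points above, implicit graph edges and deciding what needs rebuilding, can be illustrated with a minimal standard-library sketch. This is not goeieDAG's implementation: the task tuples and the helper names derive_edges and is_stale are hypothetical, and the staleness test is a simple Make-style timestamp comparison rather than whatever the chosen executor actually does.

```python
from graphlib import TopologicalSorter  # Python 3.9+
from pathlib import Path

# Hypothetical task records: (name, inputs, outputs).
# This is an illustration format, not goeieDAG's internal representation.
TASKS = [
    ("os-name", ["/etc/os-release"], ["os-name.txt"]),
    ("username", [], ["username.txt"]),
    ("result", ["os-name.txt", "username.txt"], ["result.txt"]),
]


def derive_edges(tasks):
    """Map each task name to the set of tasks whose outputs it consumes.

    No edge is declared explicitly: a dependency exists whenever one
    task's input file name equals another task's output file name.
    """
    producer = {out: name for name, _, outputs in tasks for out in outputs}
    return {
        name: {producer[i] for i in inputs if i in producer}
        for name, inputs, _ in tasks
    }


def is_stale(inputs, outputs):
    """Make-style check: rebuild if any output is missing or older than an input.

    Assumes every input file exists; a missing input raises FileNotFoundError.
    """
    outs = [Path(o) for o in outputs]
    if not outs or not all(o.exists() for o in outs):
        return True
    oldest_output = min(o.stat().st_mtime for o in outs)
    return any(Path(i).stat().st_mtime > oldest_output for i in inputs)


edges = derive_edges(TASKS)
# Tasks with no path between them (here "os-name" and "username")
# are independent and could run in parallel.
order = list(TopologicalSorter(edges).static_order())
print(edges)
print(order)
```

Source files such as /etc/os-release have no producing task, so they add no edge; they only take part in the timestamp comparison.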

How is this different from using the Ninja package directly?

  • Simpler mental model & usage: no need to separately define build rules or think about implicit/explicit inputs and outputs
  • API accepts Paths; no need to cast everything to str!
  • Higher-level API in general (for example, the output directory is created automatically)
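
To make the first point concrete, here is a rough sketch of the bookkeeping a wrapper has to do when targeting Ninja: every command needs a named rule plus a build statement that ties outputs to inputs. The function to_ninja and the rule names r0, r1, ... are hypothetical, and this is only an assumption about the general approach, not a description of how goeieDAG generates its build files.

```python
from pathlib import Path


def escape_path(path):
    """Escape a path for a Ninja build line ('$', ' ' and ':' are special)."""
    return str(path).replace("$", "$$").replace(" ", "$ ").replace(":", "$:")


def to_ninja(tasks):
    """Emit Ninja syntax with one single-use rule per task.

    Each task is a (shell command, input paths, output paths) tuple.
    The caller never has to name or reuse a rule.
    """
    lines = []
    for index, (command, inputs, outputs) in enumerate(tasks):
        rule = f"r{index}"
        lines.append(f"rule {rule}")
        # '$' introduces a variable in Ninja, so literal ones are doubled.
        lines.append(f"  command = {command.replace('$', '$$')}")
        outs = " ".join(escape_path(o) for o in outputs)
        ins = " ".join(escape_path(i) for i in inputs)
        lines.append(f"build {outs}: {rule} {ins}".rstrip())
        lines.append("")
    return "\n".join(lines)


# Paths may be given as str or Path; both are converted when written out.
ninja_text = to_ninja([
    ("grep ^NAME= /etc/os-release > os-name.txt",
     [Path("/etc/os-release")], [Path("os-name.txt")]),
    ("whoami > username.txt", [], ["username.txt"]),
])
print(ninja_text)
```

With the lower-level package, the rule definitions, escaping, and Path-to-str conversion shown here would be the caller's responsibility.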

Similar projects

  • Ninja (Python package) -- provides a lower-level API, used by goeieDAG as back-end
  • TaskGraph -- similar project, but centered around Python functions and in-process parallelism
  • Snakemake -- similar goals, but a stand-alone tool rather than a library
  • Dask -- different execution model; caching of intermediate results is left up to the user
  • doit
