A Python-native build system with content-hash-based incremental builds

🧑‍🍳 Cook

A Python-native build system with content-hash-based incremental builds. Define tasks in plain Python, and Cook handles dependency ordering, parallel execution, and skipping unchanged work.

Warning: Cook v1.0.0 is a complete rewrite of the build system with a different syntax. Pin cook-build<1.0.0 to retain the old cook.create_task syntax.

Quick start

Install Cook by running pip install cook-build, or use your favorite Python package manager. Then create a recipe.py in your project root:

>>> from pathlib import Path
>>> from cook import sh, group

>>> sources = sorted(Path("example").glob("*.c"))
>>> objects = [src.with_suffix(".o") for src in sources]

>>> with group("compile"):
...     for src, obj in zip(sources, objects):
...         sh(
...             name=f"compile-{src.stem}",
...             cmd=f"gcc -c {src} -o {obj}",
...             inputs=[src], outputs=[obj],
...         )
ShellTask(name='compile-main', inputs=[PosixPath('example/main.c')], outputs=[PosixPath('example/main.o')], extra={}, cmd='gcc -c example/main.c -o example/main.o', env=None, cwd=None)
ShellTask(name='compile-util', inputs=[PosixPath('example/util.c')], outputs=[PosixPath('example/util.o')], extra={}, cmd='gcc -c example/util.c -o example/util.o', env=None, cwd=None)

>>> sh(
...     name="link",
...     cmd=f"gcc {' '.join(str(o) for o in objects)} -o build/app",
...     inputs=objects, outputs=["build/app"],
... )
ShellTask(name='link', inputs=[PosixPath('example/main.o'), PosixPath('example/util.o')], outputs=['build/app'], extra={}, cmd='gcc example/main.o example/util.o -o build/app', env=None, cwd=None)

Then run:

cook run "*"

On the second run, unchanged tasks are skipped automatically:

[1/3] Fresh   compile-main
[2/3] Fresh   compile-util
[3/3] Fresh   link

Build finished: 3 fresh in 0.0s

Tasks

Use sh() to create shell tasks. The inputs argument takes file paths, other tasks, or both: files are hashed for change detection, and tasks become dependencies. The outputs argument lists the files the task produces; Cook verifies that they exist after execution. Tasks with no outputs always run, which is useful for tests and linters.

If a file input matches another task's declared output, Cook adds the dependency automatically, so there is no need to pass the task object explicitly:

>>> _ = sh(name="cc-foo", cmd="gcc -c foo.c -o foo.o", inputs=["foo.c"], outputs=["foo.o"])
>>> _ = sh(name="link-foo", cmd="gcc foo.o -o app", inputs=["foo.o"], outputs=["app"])
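The matching rule described above can be sketched with the standard library alone. This is an illustration of the idea, not Cook's implementation: ToyTask and infer_dependencies are hypothetical names introduced here, and paths are compared as plain strings.

```python
from dataclasses import dataclass, field


@dataclass
class ToyTask:
    """Hypothetical stand-in for a task; this is not Cook's Task class."""
    name: str
    inputs: list[str] = field(default_factory=list)
    outputs: list[str] = field(default_factory=list)


def infer_dependencies(tasks: list[ToyTask]) -> dict[str, set[str]]:
    """Map each task name to the names of tasks that produce its file inputs."""
    # Index every declared output by the task that produces it.
    producer = {out: task.name for task in tasks for out in task.outputs}
    return {
        task.name: {producer[path] for path in task.inputs if path in producer}
        for task in tasks
    }


tasks = [
    ToyTask("cc-foo", inputs=["foo.c"], outputs=["foo.o"]),
    ToyTask("link-foo", inputs=["foo.o"], outputs=["app"]),
]
deps = infer_dependencies(tasks)
print(deps)  # {'cc-foo': set(), 'link-foo': {'cc-foo'}}
```

Here foo.c has no producer, so cc-foo has no dependencies, while foo.o is produced by cc-foo, so link-foo depends on it.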

Use group() to organize related tasks. A group is itself a task, so you can depend on it or run it by name:

>>> with group("data") as data:
...     _ = sh(name="gen-a", cmd="generate a", outputs=["a.csv"])
...     _ = sh(name="gen-b", cmd="generate b", outputs=["b.csv"])

>>> _ = sh(name="train", cmd="python train.py", inputs=[data])

cook run data       # runs gen-a and gen-b
cook run train      # runs data group first, then train
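To see why running train pulls in the data group first, it may help to model the example as a dependency graph and sort it topologically. The sketch below uses graphlib from the standard library and assumes a group simply depends on its member tasks; how Cook represents groups internally is not stated here.

```python
from graphlib import TopologicalSorter

# Each key maps to the set of tasks that must finish before it can start.
graph = {
    "gen-a": set(),
    "gen-b": set(),
    "data": {"gen-a", "gen-b"},  # assumption: a group depends on its members
    "train": {"data"},
}

# static_order() yields every task after all of its dependencies.
order = list(TopologicalSorter(graph).static_order())
print(order)
```

Any valid order places gen-a and gen-b before data, and data before train. The two generators do not depend on each other, so a parallel executor is free to run them at the same time.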

All relative paths resolve relative to the recipe file's directory, making recipes portable across machines.
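A minimal sketch of that resolution rule, using pathlib. The helper name resolve_against_recipe is hypothetical, and leaving absolute paths untouched is an assumption rather than documented behavior.

```python
from pathlib import Path


def resolve_against_recipe(recipe_file: str, path: str) -> Path:
    """Resolve `path` relative to the directory that contains `recipe_file`."""
    candidate = Path(path)
    if candidate.is_absolute():
        return candidate  # assumption: absolute paths are used as given
    return Path(recipe_file).parent / candidate


resolved = resolve_against_recipe("project/recipe.py", "build/app")
print(resolved)
```

Because the result depends only on where the recipe lives, the same recipe gives the same layout regardless of the directory from which the command is launched.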

CLI

cook run [pattern]               # run tasks matching glob pattern
cook run -n [pattern]            # show what would run (--dry-run)
cook run -k [pattern]            # keep going on failure (--keep-going)
cook run -j4 [pattern]           # run up to 4 tasks in parallel (--jobs)
cook run -s [pattern]            # stream task output to terminal (--stream)
cook run -x slurm [pattern]      # override executor backend (--executor)

cook build <output-pattern>      # run tasks that produce matching outputs

cook inspect [pattern]           # show dependency graph and staleness
cook inspect --json [pattern]    # JSON lines output

cook list [pattern]              # list task names
cook list -s [pattern]           # list only stale tasks (--stale)
cook list --json [pattern]       # JSON lines output

cook invalidate <pattern>        # force tasks to re-run next time

cook validate <pattern>          # mark tasks as up-to-date without running

cook ui [pattern]                # interactive DAG visualization

Patterns use glob syntax (fnmatch). Use -r for regex. Dependencies of matched tasks are always included.
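The selection rule can be illustrated as follows: match task names against the glob, then add everything the matches depend on. This is a sketch, not Cook's code. The select function and the graph are made up for illustration, and fnmatchcase is used so that matching is case-sensitive on every platform.

```python
from fnmatch import fnmatchcase


def select(pattern: str, deps: dict[str, set[str]]) -> set[str]:
    """Return task names matching `pattern` plus their transitive dependencies."""
    selected: set[str] = set()
    stack = [name for name in deps if fnmatchcase(name, pattern)]
    while stack:
        name = stack.pop()
        if name not in selected:
            selected.add(name)
            stack.extend(deps[name])  # pull in whatever this task depends on
    return selected


graph = {
    "compile-main": set(),
    "compile-util": set(),
    "link": {"compile-main", "compile-util"},
    "docs": set(),
}
print(sorted(select("compile-*", graph)))  # ['compile-main', 'compile-util']
print(sorted(select("link", graph)))  # ['compile-main', 'compile-util', 'link']
```

The second call matches only link by name, but both compile tasks are included because link depends on them.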

Global flags: -v (verbose), -q (quiet), --color=auto|always|never, -f (recipe file), -c (config file), --version.

Custom task types

Subclass Task as a dataclass and register a handler with the executor:

>>> from dataclasses import dataclass
>>> from cook import Task, get_context
>>> from cook.executor import LocalExecutor

>>> @dataclass
... class DownloadTask(Task):
...     url: str = "https://example.com"

>>> async def handle_download(executor, task):
...     ...  # your logic here

>>> _ = LocalExecutor.register_handler(handle_download, task_type=DownloadTask)

>>> ctx = get_context()
>>> _ = ctx.register(DownloadTask(name="fetch-data", url="https://example.com/data.csv", outputs=["data.csv"]))

Configuration

Optional cook.toml in your project root:

[cook]
recipe = "recipe.py"        # default recipe file
executor = "local"          # or "slurm"
default = "build-*"         # default pattern for bare `cook run`

[cook.local]
max_concurrent = 8          # parallel task limit (default: 1)

[cook.slurm]
max_concurrent = 64
poll_interval = 2.0
poll_timeout = 86400.0
poll_retries = 10

[cook.slurm.defaults]      # default sbatch flags for all slurm tasks
mem = "4G"
partition = "batch"

All settings have sensible defaults. CLI flags override config values.

Per-task slurm options override defaults:

>>> _ = sh(name="gpu-train", cmd="python train.py", slurm={"mem": "32G", "gres": "gpu:1"})
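One way to picture the override is a dictionary merge in which per-task keys replace matching keys from [cook.slurm.defaults]. The sbatch_options helper and the rendering of options as --key=value flags are assumptions for illustration; Cook may build its sbatch command differently.

```python
def sbatch_options(defaults: dict[str, str], task_options: dict[str, str]) -> dict[str, str]:
    """Return a new dict in which per-task options take precedence over defaults."""
    return {**defaults, **task_options}


defaults = {"mem": "4G", "partition": "batch"}
options = sbatch_options(defaults, {"mem": "32G", "gres": "gpu:1"})

# Assumed rendering as long-form sbatch flags.
flags = [f"--{key}={value}" for key, value in options.items()]
print(flags)  # ['--mem=32G', '--partition=batch', '--gres=gpu:1']
```

The task keeps the default partition, raises its memory request, and adds a GPU resource without touching the shared defaults.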

How staleness works

Cook computes a content hash for each task based on its fields, the contents of its input files, and the hashes of its dependencies. If the hash matches the last successful run and all outputs exist, the task is skipped. Changed inputs, changed commands, or missing outputs trigger re-execution.

Download files

Source Distribution

cook_build-1.2.2.tar.gz (221.5 kB)

Uploaded Source

Built Distribution

cook_build-1.2.2-py3-none-any.whl (191.5 kB)

Uploaded Python 3

File details

Details for the file cook_build-1.2.2.tar.gz.

File metadata

  • Download URL: cook_build-1.2.2.tar.gz
  • Size: 221.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for cook_build-1.2.2.tar.gz
Algorithm Hash digest
SHA256 d44c153aa99ea16961a48328ddf336e7ca892738a0e8023a52c4704300e40b86
MD5 fdf2374d54df7b97a65237315b991855
BLAKE2b-256 ff69b7262de68abf4214cc7ecb30ec9ba47d8869f8f97cb15aea6a7e15ccd109

Provenance

The following attestation bundles were made for cook_build-1.2.2.tar.gz:

Publisher: main.yaml on tillahoffmann/cook-build

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file cook_build-1.2.2-py3-none-any.whl.

File metadata

  • Download URL: cook_build-1.2.2-py3-none-any.whl
  • Size: 191.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for cook_build-1.2.2-py3-none-any.whl
Algorithm Hash digest
SHA256 0b01f381f3ccc057fb9e485e8152fa33c0eec7263ed294746bf48e1bad497ee8
MD5 275dfcc89a6bd4eca6d08b89ec2c865c
BLAKE2b-256 1c5b50f5d1e43d180b4e843fae426f12baa5e0ef27f44b24fb13ce8e6ac0c4be

Provenance

The following attestation bundles were made for cook_build-1.2.2-py3-none-any.whl:

Publisher: main.yaml on tillahoffmann/cook-build

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
