
The kwdagger module


  • Read the Docs: http://kwdagger.readthedocs.io/en/latest/

  • Gitlab (main): https://gitlab.kitware.com/computer-vision/kwdagger

  • Github (mirror): https://github.com/Kitware/kwdagger

  • Pypi: https://pypi.org/project/kwdagger

Overview

KWDagger is a lightweight framework for defining bash-centric DAGs and running large parameter sweeps. It builds on top of cmd_queue and scriptconfig to provide:

  • Reusable kwdagger.pipeline.Pipeline and kwdagger.pipeline.ProcessNode abstractions for wiring inputs / outputs together.

  • A scheduling CLI (kwdagger.schedule) that materializes pipeline definitions over a parameter grid and executes them via Slurm, tmux, or a serial backend.

  • An aggregation CLI (kwdagger.aggregate) that loads job outputs, computes metrics, and optionally plots parameter/metric relationships.

  • A self-contained demo pipeline in kwdagger.demo.demodata that is used in CI and serves as a reference implementation.
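To make the "bash-centric DAG" idea concrete, here is a minimal conceptual sketch that uses only the Python standard library. It is not kwdagger's API: the `nodes` table, the `needs` key, and the command strings are hypothetical, and only the node names `stage1_predict` and `stage1_evaluate` are borrowed from the demo pipeline. It shows how a set of shell commands with declared dependencies can be ordered so that each node runs after the nodes it consumes.

```python
from graphlib import TopologicalSorter

# Hypothetical node table: each node has a shell command template and a list
# of nodes whose outputs it consumes. This layout is an illustration only.
nodes = {
    "stage1_evaluate": {
        "command": "evaluate --pred pred.json --dst eval.json",
        "needs": ["stage1_predict"],
    },
    "stage1_predict": {
        "command": "predict --src input.txt --dst pred.json",
        "needs": [],
    },
}


def execution_order(nodes):
    """Return node names ordered so that dependencies come first."""
    # TopologicalSorter expects a mapping of node -> set of predecessors.
    graph = {name: set(spec["needs"]) for name, spec in nodes.items()}
    return list(TopologicalSorter(graph).static_order())


order = execution_order(nodes)
for name in order:
    print(f"{name}: {nodes[name]['command']}")
```

In kwdagger itself the wiring is expressed through Pipeline and ProcessNode objects and the graph is built with networkx; the sketch only mirrors the ordering concept.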

Repository layout

  • kwdagger/pipeline.py – core pipeline and process node definitions, networkx graph construction, and configuration utilities.

  • kwdagger/schedule.py – ScheduleEvaluationConfig CLI for expanding parameter grids into runnable jobs and dispatching them through cmd_queue backends.

  • kwdagger/aggregate.py – AggregateEvluationConfig CLI for loading job outputs, computing parameter hash IDs, and generating text/plot reports.

  • kwdagger/demo/demodata.py – end-to-end demo pipeline with prediction and evaluation stages plus CLI entry points for each node.

  • docs/ – Sphinx sources, including an example user module under docs/source/manual/tutorials/twostage_pipeline.

  • tests/ – unit and functional coverage for pipeline wiring, scheduler behavior, aggregation, and import sanity checks.

Quickstart

Run the demo pipeline locally to see the CLI workflow end-to-end:

TMP_DPATH=$(mktemp -d --suffix "-kwdagger-demo")
cd "$TMP_DPATH"
echo "demo" > input.txt

EVAL_DPATH=$PWD/pipeline_output
python -m kwdagger.schedule \
    --params="
        pipeline: 'kwdagger.demo.demodata.my_demo_pipeline()'
        matrix:
            stage1_predict.src_fpath:
                - input.txt
            stage1_predict.param1:
                - 123
            stage1_evaluate.workers: 2
    " \
    --root_dpath="${EVAL_DPATH}" \
    --backend=serial --skip_existing=1 --run=1

python -m kwdagger.aggregate \
    --pipeline='kwdagger.demo.demodata.my_demo_pipeline()' \
    --target "
        - $EVAL_DPATH
    " \
    --output_dpath="$EVAL_DPATH/full_aggregate" \
    --eval_nodes="
        - stage1_evaluate
    " \
    --stdout_report="
        top_k: 10
        concise: 1
    "

The scheduler will generate per-node job directories with invoke.sh and job_config.json metadata. The aggregator then consolidates results, computes parameter hash IDs, and prints a concise report.
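The two ideas in this step, expanding a parameter matrix into individual jobs and giving each parameter set a stable hash ID, can be sketched with the standard library. This is an illustration under stated assumptions, not kwdagger's implementation: `expand_matrix` and `param_hashid` are hypothetical helpers, the SHA256-over-JSON scheme is just one plausible way to derive an ID, and the second `param1` value (456) is added here so the grid has more than one point.

```python
import hashlib
import itertools
import json


def expand_matrix(matrix):
    """Yield one parameter dict per point in the Cartesian product."""
    keys = sorted(matrix)
    # Treat a scalar value as a one-element list of choices.
    value_lists = [
        matrix[k] if isinstance(matrix[k], list) else [matrix[k]] for k in keys
    ]
    for combo in itertools.product(*value_lists):
        yield dict(zip(keys, combo))


def param_hashid(params, length=8):
    """Hash a canonical JSON encoding so equal params give equal IDs."""
    text = json.dumps(params, sort_keys=True)
    return hashlib.sha256(text.encode("utf-8")).hexdigest()[:length]


matrix = {
    "stage1_predict.src_fpath": ["input.txt"],
    "stage1_predict.param1": [123, 456],  # 456 is an illustrative addition
    "stage1_evaluate.workers": 2,
}
grid = list(expand_matrix(matrix))
for params in grid:
    print(param_hashid(params), params)
```

Sorting the keys before hashing means the ID depends only on the parameter values, not on the order in which they were written, which is what lets an aggregation step group runs that share a configuration.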

A graph-based symlink structure makes it possible to navigate dependencies from within a node's output directory. The .succ folder holds symlinks to successors (i.e., results that depend on the current results), and the .pred folder holds symlinks to predecessors (i.e., results that the current folder depends on).
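As a rough sketch of how such links could be followed programmatically, the snippet below builds a toy directory layout and resolves its .pred and .succ symlinks with pathlib. The directory names (`job_a`, `job_b`), the helper `linked_dirs`, and the assumption that the links sit directly inside .pred and .succ are all hypothetical; the real layout produced by the scheduler may nest them differently.

```python
import os
import tempfile
from pathlib import Path


def linked_dirs(node_dpath, kind):
    """Resolve the symlinks found directly under node_dpath/<kind>."""
    link_root = Path(node_dpath) / kind
    if not link_root.is_dir():
        return []
    return sorted(p.resolve() for p in link_root.iterdir() if p.is_symlink())


# Build a toy layout (hypothetical names) to exercise the helper.
root = Path(tempfile.mkdtemp()).resolve()
pred_dpath = root / "stage1_predict" / "job_a"
eval_dpath = root / "stage1_evaluate" / "job_b"
(pred_dpath / ".succ").mkdir(parents=True)
(eval_dpath / ".pred").mkdir(parents=True)

# The prediction job points forward to its evaluation, and vice versa.
os.symlink(eval_dpath, pred_dpath / ".succ" / "job_b")
os.symlink(pred_dpath, eval_dpath / ".pred" / "job_a")

predecessors = linked_dirs(eval_dpath, ".pred")
successors = linked_dirs(pred_dpath, ".succ")
print("predecessors of evaluate:", predecessors)
print("successors of predict:", successors)
```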

For more in-depth information, see the tutorials in the documentation (for example, docs/source/manual/tutorials/twostage_pipeline).

Command line entry points

  • python -m kwdagger.schedule or kwdagger schedule – build and run a pipeline over a parameter matrix (see kwdagger.schedule.ScheduleEvaluationConfig).

  • python -m kwdagger.aggregate or kwdagger aggregate – load completed runs and generate tabular and plotted summaries (kwdagger.aggregate.AggregateEvluationConfig).

  • python -m kwdagger – modal CLI that exposes the schedule and aggregate commands via kwdagger.__main__.KWDaggerModal.
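For readers unfamiliar with the term, a "modal" CLI is a single entry point that dispatches to subcommands. The sketch below shows the pattern with argparse from the standard library. It is not how kwdagger implements KWDaggerModal; the program name `toy-modal` is hypothetical, and only the subcommand names and the `--backend` and `--output_dpath` flags are taken from the quickstart.

```python
import argparse


def build_parser():
    """Build a toy modal parser with two subcommands."""
    parser = argparse.ArgumentParser(prog="toy-modal")
    subparsers = parser.add_subparsers(dest="command", required=True)

    schedule = subparsers.add_parser("schedule", help="build and run a pipeline")
    schedule.add_argument("--backend", default="serial")

    aggregate = subparsers.add_parser("aggregate", help="summarize completed runs")
    aggregate.add_argument("--output_dpath", default=None)
    return parser


# Parse an explicit argument list instead of sys.argv for demonstration.
args = build_parser().parse_args(["schedule", "--backend=tmux"])
print(args.command, args.backend)
```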
