marca-workflow

Declarative, prefix-reusing workflow sweeps for MARCA. Problem-agnostic: it knows nothing about your algorithms -- only ports and variants.

You describe each step as a connector: named inputs and outputs, whether the step is optional, and a list of variants to sweep. marca-workflow wires the steps by port name, orders them topologically, and sweeps every combination, computing each shared prefix exactly once and reusing it for every child. Reuse is keyed on the chosen variant indices (a cheap token), never on a hash of the data, so caching large intermediate objects incurs no hashing cost.
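To make the prefix-keyed reuse concrete, here is a minimal sketch of the idea for a linear chain of steps. It is not marca-workflow's executor: the `sweep` function, the toy `scale` and `shift` steps, and the `calls` log are all hypothetical names introduced for illustration. The point is only that the cache key is the tuple of variant indices chosen so far, so nothing about the data is ever hashed.

```python
from itertools import product

def sweep(steps, seed):
    """Toy prefix-reusing sweep over a linear chain (illustration only).

    steps: list of (fn, variants); each fn(variant, value) -> value.
    """
    cache = {}    # tuple of variant indices -> memoized output
    calls = []    # every key that actually triggered a computation
    results = []
    for combo in product(*(range(len(variants)) for _, variants in steps)):
        value = seed
        for depth, (fn, variants) in enumerate(steps):
            key = combo[:depth + 1]   # cheap token: indices only, no data hash
            if key not in cache:
                cache[key] = fn(variants[combo[depth]], value)
                calls.append(key)
            value = cache[key]
        results.append((combo, value))
    return results, calls

def scale(variant, x):
    return x * variant

def shift(variant, x):
    return x + variant

results, calls = sweep([(scale, (2, 3)), (shift, (0, 1, 10))], seed=5)
print(len(results), len(calls))  # 6 8
```

Six leaves are evaluated with eight step calls in total: `scale` runs twice (once per variant) rather than once per leaf.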

Why

A grid sweep A × B × C is a prefix tree: consecutive configurations share a long prefix. Hand-written nested loops capture that reuse but are rigid — a new step means editing the loop. marca-workflow keeps the reuse but makes the structure declarative: add a step by appending one Step.
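The saving from treating the grid as a prefix tree can be counted directly. The sketch below compares the number of step executions when every leaf reruns all steps against one execution per prefix-tree node; the function names and the axis sizes (3, 4, 5) are made up for the example.

```python
from math import prod

def naive_calls(sizes):
    # Every leaf reruns every step from scratch.
    return len(sizes) * prod(sizes)

def prefix_calls(sizes):
    # One call per node of the prefix tree: |A| + |A||B| + |A||B||C| + ...
    return sum(prod(sizes[:depth]) for depth in range(1, len(sizes) + 1))

sizes = (3, 4, 5)           # hypothetical |A|, |B|, |C|
print(naive_calls(sizes))   # 180
print(prefix_calls(sizes))  # 75
```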

Example

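from marca_workflow import Step, Pipeline

# run_rank, run_prune, run_clf, the variant lists (rankers, pruners, classifiers),
# rules, im, and evaluate are user-supplied and not shown here.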

pipe = Pipeline([
    Step("rank",  run_rank,  consumes=("rules", "measures"),
         produces="ranked", variants=rankers),
    Step("prune", run_prune, consumes=("ranked",),
         produces="pruned", variants=pruners),
    Step("clf",   run_clf,   consumes=("pruned",),
         produces="model",  variants=classifiers),
])

results = pipe.run(
    seed={"rules": rules, "measures": im},
    sink=lambda ctx: evaluate(ctx["model"]),
)
# [("BordaRank+M1Prune+OrdinalClassifier", <metric>), ...]  one record per leaf

rank runs once per ranker, prune once per (ranker, pruner), clf once per leaf — automatically.
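The example leaves the step functions undefined. As a rough sketch of what they might look like, the block below follows the `fn(variant, *inputs) -> output` shape and assumes that inputs arrive positionally in `consumes` order; that ordering is an assumption, not something stated here. The `TopK` pruner, the key-function ranker, and the sample data are hypothetical.

```python
class TopK:
    """Hypothetical pruner variant: keep the first k ranked items."""
    def __init__(self, k):
        self.k = k

    def __call__(self, ranked):
        return ranked[:self.k]   # slicing builds a new list

def run_rank(ranker, rules, measures):
    # ranker is a plain key function in this sketch; sorted() returns a new list
    return sorted(rules, key=lambda r: ranker(measures[r]), reverse=True)

def run_prune(pruner, ranked):
    return pruner(ranked)

rules = ["r1", "r2", "r3"]
measures = {"r1": 0.2, "r2": 0.9, "r3": 0.5}

ranked = run_rank(lambda m: m, rules, measures)
pruned = run_prune(TopK(2), ranked)
print(ranked, pruned)  # ['r2', 'r3', 'r1'] ['r2', 'r3']
```

Both functions return fresh objects and leave their inputs untouched.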

The one rule: steps must be pure

A step function fn(variant, *inputs) -> output must not mutate its inputs, and must not mutate its output after returning it. The executor memoizes outputs and shares them across configurations, so mutating a shared object in place corrupts sibling configurations. Purity is also what makes per-step parallelism safe (see Step.parallel, reserved).
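A small illustration of why in-place mutation breaks sharing, using plain Python lists and no library code. `prune_pure` and `prune_impure` are hypothetical step functions, and `shared` stands in for a memoized upstream output that two sibling configurations both consume.

```python
def prune_impure(k, ranked):
    del ranked[k:]       # mutates the shared input in place
    return ranked

def prune_pure(k, ranked):
    return ranked[:k]    # new list; the input is left untouched

shared = ["r2", "r3", "r1"]   # stands in for a memoized upstream output

# Pure steps: each sibling sees the full upstream output.
a = prune_pure(2, shared)
b = prune_pure(1, shared)

# Impure steps: the first sibling truncates what the second one receives.
corrupted = list(shared)
x = prune_impure(1, corrupted)
y = prune_impure(2, corrupted)   # expected two items, gets one

print(a, b, len(shared))  # ['r2', 'r3'] ['r2'] 3
print(y)                  # ['r2']
```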

Concepts

  • Port: a named value in the run context. A step produces one port and consumes zero or more. Ports not produced by any step are seeds, supplied to run.
  • Variant: one choice on a step's sweep axis. variants=(None,) is a step with no algorithm choice (a fixed transform).
  • Optional step: with optional=True, when the chosen variant is None the step is skipped and its output is taken from fallback. Put None in variants to sweep "with and without" the step.
  • Wiring is implicit: consumes/produces names form the DAG. Declaration order does not matter; dependencies decide execution order.
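The concepts above can be sketched with the standard library alone. The first part derives an execution order from port names using `graphlib.TopologicalSorter`; the second shows one plausible reading of an optional step, assuming `fallback` names another port in the context. Neither is marca-workflow's actual code: the tuple layout for steps and the `apply_step` helper are hypothetical.

```python
from graphlib import TopologicalSorter

# (name, consumes, produces), deliberately declared out of order
steps = [
    ("clf",   ("pruned",),           "model"),
    ("prune", ("ranked",),           "pruned"),
    ("rank",  ("rules", "measures"), "ranked"),
]

producer = {produces: name for name, _, produces in steps}

# A step depends on whichever step produces each port it consumes.
# Ports with no producer ("rules", "measures") are seeds and add no edge.
graph = {
    name: {producer[port] for port in consumes if port in producer}
    for name, consumes, _ in steps
}
order = list(TopologicalSorter(graph).static_order())
print(order)  # ['rank', 'prune', 'clf']

def apply_step(ctx, fn, variant, consumes, produces,
               optional=False, fallback=None):
    """Run one step; skip it when it is optional and the variant is None.

    Assumes fallback is the name of a port already present in ctx.
    """
    if optional and variant is None:
        value = ctx[fallback]
    else:
        value = fn(variant, *(ctx[port] for port in consumes))
    return {**ctx, produces: value}   # new context; ctx itself is not mutated

ctx = {"ranked": ["r2", "r3", "r1"]}
keep = lambda k, ranked: ranked[:k]
with_prune = apply_step(ctx, keep, 2, ("ranked",), "pruned",
                        optional=True, fallback="ranked")
without_prune = apply_step(ctx, keep, None, ("ranked",), "pruned",
                           optional=True, fallback="ranked")
```

With the `None` variant, `pruned` is simply the `ranked` value passed through, which is what sweeping "with and without" the step amounts to in this sketch.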

Validation rejects duplicate writers, cycles, unreachable fallbacks, and missing seeds with clear errors.
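As an illustration of what such checks involve, the sketch below covers three of the four cases (duplicate writers, missing seeds, cycles) over a hypothetical `(name, consumes, produces)` tuple layout. It does not reproduce the library's error messages or its fallback check, and `validate` is a made-up name.

```python
from graphlib import TopologicalSorter, CycleError

def validate(steps, seed_names):
    """Check a list of (name, consumes, produces) and return an execution order."""
    producer = {}
    for name, _, produces in steps:
        if produces in producer:
            raise ValueError(
                f"port {produces!r} is written by both "
                f"{producer[produces]!r} and {name!r}"
            )
        producer[produces] = name

    # Any consumed port with no producer must be supplied as a seed.
    missing = sorted(
        {port for _, consumes, _ in steps for port in consumes
         if port not in producer and port not in seed_names}
    )
    if missing:
        raise ValueError(f"missing seeds: {missing}")

    graph = {
        name: {producer[port] for port in consumes if port in producer}
        for name, consumes, _ in steps
    }
    try:
        return list(TopologicalSorter(graph).static_order())
    except CycleError as exc:
        raise ValueError(f"cycle among steps: {exc.args[1]}") from exc

good = [
    ("rank",  ("rules", "measures"), "ranked"),
    ("prune", ("ranked",),           "pruned"),
    ("clf",   ("pruned",),           "model"),
]
print(validate(good, {"rules", "measures"}))  # ['rank', 'prune', 'clf']
```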
