marca-workflow
Declarative, prefix-reusing workflow sweeps for MARCA. Problem-agnostic: it knows nothing about your algorithms -- only ports and variants.
You describe each step as a connector: named inputs and outputs, an optional flag, and a list of variants to sweep. marca-workflow wires the steps by port name, orders them topologically, and sweeps every combination, computing each shared prefix exactly once and reusing it for every child. The reuse is keyed on the chosen variant indices (a cheap token), never on a hash of the data, so the cost of a cache lookup does not grow with the size of the intermediate objects.
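To make the keying idea concrete, here is a minimal sketch of a sweep over a linear chain of steps whose cache is keyed on tuples of variant indices. It is an illustration under simplifying assumptions (one input per step, no eviction), not marca-workflow's implementation; `sweep` and `tag` are hypothetical names.

```python
from itertools import product

def sweep(steps, seed):
    """Run every variant combination of a linear chain of steps.

    steps: list of (fn, variants) pairs, with fn(variant, value) -> value.
    Returns {index_tuple: leaf_value}. Hypothetical helper, not the library API.
    """
    cache = {(): seed}  # key: tuple of chosen variant indices (the prefix token)
    for choice in product(*(range(len(v)) for _, v in steps)):
        for depth, (fn, variants) in enumerate(steps, start=1):
            key = choice[:depth]
            if key not in cache:  # first time this prefix is reached
                parent = cache[key[:-1]]
                cache[key] = fn(variants[choice[depth - 1]], parent)
    n = len(steps)
    return {k: v for k, v in cache.items() if len(k) == n}

calls = []  # records which step ran, to make the reuse visible

def tag(name):
    def fn(variant, value):
        calls.append(name)
        return value + [f"{name}:{variant}"]  # new list; the input is untouched
    return fn

leaves = sweep([(tag("a"), ["x", "y"]), (tag("b"), [1, 2, 3])], seed=[])
print(calls.count("a"), calls.count("b"))  # 2 6
print(leaves[(1, 2)])                      # ['a:y', 'b:3']
```

The cache key never touches the cached value, so looking up a prefix costs the same whether the value is a small list or a multi-gigabyte model.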
Why
A grid sweep A × B × C is a prefix tree: consecutive configurations share a
long prefix. Hand-written nested loops capture that reuse but are rigid — a new
step means editing the loop. marca-workflow keeps the reuse but makes the structure
declarative: add a step by appending one Step.
Example
from marca_workflow import Step, Pipeline

# run_rank, run_prune, run_clf, evaluate, and the variant lists
# (rankers, pruners, classifiers) are your own code.
pipe = Pipeline([
    Step("rank", run_rank, consumes=("rules", "measures"),
         produces="ranked", variants=rankers),
    Step("prune", run_prune, consumes=("ranked",),
         produces="pruned", variants=pruners),
    Step("clf", run_clf, consumes=("pruned",),
         produces="model", variants=classifiers),
])

# "rules" and "measures" are seeds: ports that no step produces.
results = pipe.run(
    seed={"rules": rules, "measures": im},
    sink=lambda ctx: evaluate(ctx["model"]),
)
# [("BordaRank+M1Prune+OrdinalClassifier", <metric>), ...] one record per leaf
`rank` runs once per ranker, `prune` once per (ranker, pruner) pair, and `clf` once per
leaf, all automatically.
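The saving can be counted directly. The sketch below compares step invocations with and without prefix reuse for a linear chain in which each step depends on the previous one; the axis sizes are made-up numbers and both function names are hypothetical.

```python
from math import prod

def calls_without_reuse(sizes):
    # Every leaf reruns every step from scratch.
    return len(sizes) * prod(sizes)

def calls_with_prefix_reuse(sizes):
    # Step k runs once per distinct combination of the first k axes.
    return sum(prod(sizes[:k]) for k in range(1, len(sizes) + 1))

sizes = [3, 4, 5]  # hypothetical: 3 rankers, 4 pruners, 5 classifiers
print(calls_without_reuse(sizes))      # 180
print(calls_with_prefix_reuse(sizes))  # 75  (3 + 12 + 60)
```

The leaf step always runs once per leaf; the savings come from the earlier steps, which is where reuse matters most when the early steps are the expensive ones.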
The one rule: steps must be pure
A step function `fn(variant, *inputs) -> output` must not mutate its inputs, and
must not mutate its output after returning it. The executor memoizes outputs and
shares them across configurations, so mutating a shared output in place corrupts
sibling configurations. Purity is also what makes per-step parallelism safe (see
`Step.parallel`, reserved).
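A small, library-independent sketch of the hazard: two sibling variants read the same memoized upstream value. The step functions here are hypothetical and only follow the `fn(variant, *inputs)` shape described above.

```python
def impure_step(variant, rules):
    rules.append(variant)      # mutates the shared upstream output in place
    return rules

def pure_step(variant, rules):
    return rules + [variant]   # builds a new list; the input is untouched

shared = ["r1"]                # stands in for one memoized upstream output
bad = [list(impure_step(v, shared)) for v in ("A", "B")]
print(bad)    # [['r1', 'A'], ['r1', 'A', 'B']]  sibling B sees A's change

shared = ["r1"]
good = [pure_step(v, shared) for v in ("A", "B")]
print(good)   # [['r1', 'A'], ['r1', 'B']]
print(shared) # ['r1']
```

With the impure step, the result for variant `"B"` depends on whether `"A"` ran first, which is exactly the kind of order dependence that memoization and parallel execution cannot tolerate.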
Concepts
- Port: a named value in the run context. A step `produces` one port and `consumes` zero or more. Ports not produced by any step are seeds, supplied to `run`.
- Variant: one choice on a step's sweep axis. `variants=(None,)` is a step with no algorithm choice (a fixed transform).
- Optional step: with `optional=True`, when the chosen variant is `None` the step is skipped and its output is taken from `fallback`. Put `None` in `variants` to sweep "with and without" the step.
- Wiring is implicit: `consumes`/`produces` names form the DAG; declaration order does not matter, dependencies decide execution order.
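The optional-step rule can be sketched as follows. This is a simplified illustration, not the library's code: `run_step` and `keep_top` are hypothetical, and it assumes that `fallback` names a port in the run context.

```python
def run_step(fn, variant, inputs, ctx, *, optional=False, fallback=None):
    """Return the step's output, skipping the step when an optional variant is None.

    Hypothetical helper; `ctx` maps port names to values and `fallback` is
    assumed to be the name of a port already present in `ctx`.
    """
    if optional and variant is None:
        return ctx[fallback]  # pass the fallback port through unchanged
    return fn(variant, *(ctx[name] for name in inputs))

def keep_top(k, ranked):
    return ranked[:k]  # slicing returns a new list, so the input stays intact

ctx = {"ranked": ["r1", "r2", "r3"]}
outputs = [
    run_step(keep_top, v, ("ranked",), ctx, optional=True, fallback="ranked")
    for v in (None, 2)  # sweep "without" and "with" the step
]
print(outputs)  # [['r1', 'r2', 'r3'], ['r1', 'r2']]
```

Downstream steps consume the same port name in both cases, so they need no knowledge of whether the optional step actually ran.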
Validation rejects duplicate writers, cycles, unreachable fallbacks, and missing seeds with clear errors.
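As a rough illustration of how port names alone can determine execution order and surface these errors, the sketch below builds the dependency graph with the standard library's `graphlib`. It is not marca-workflow's code: `plan` and its input shape are hypothetical, and fallback checking is left out.

```python
from graphlib import CycleError, TopologicalSorter

def plan(steps, seeds):
    """Order steps by their port dependencies.

    steps: {step_name: (consumes_tuple, produces_port)}; seeds: set of port names.
    Hypothetical helper for illustration only.
    """
    producer = {}
    for name, (_, port) in steps.items():
        if port in producer:
            raise ValueError(
                f"port {port!r} is written by both {producer[port]!r} and {name!r}")
        producer[port] = name

    graph = {}
    for name, (consumes, _) in steps.items():
        deps = set()
        for port in consumes:
            if port in producer:
                deps.add(producer[port])
            elif port not in seeds:
                raise ValueError(
                    f"step {name!r} needs port {port!r}: no producer and no seed")
        graph[name] = deps

    try:
        return list(TopologicalSorter(graph).static_order())
    except CycleError as exc:
        raise ValueError(f"cycle between steps: {exc.args[1]}") from exc

# Declared out of order on purpose; dependencies decide the execution order.
steps = {
    "clf": (("pruned",), "model"),
    "rank": (("rules", "measures"), "ranked"),
    "prune": (("ranked",), "pruned"),
}
order = plan(steps, seeds={"rules", "measures"})
print(order)  # ['rank', 'prune', 'clf']
```

Because each port has exactly one writer, mapping a consumed port back to its producer is unambiguous, which is why duplicate writers have to be rejected before the graph is built.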