
Config-driven helpers for signac workflows with explicit dependencies and migrations


grubicy


grubicy is a small helper library + CLI that layers lightweight dependency management on top of signac.

It is named after Vittore Grubicy de Dragon, an influential promoter of Italian Divisionism. That movement “divided” light and color into strokes; grubicy does the same for workflows: it divides a signac project into stages, connects them with explicit parent -> child links, and keeps those links consistent even as your schema evolves.

With one TOML/YAML spec you can:

  • describe multi-action pipelines in a single file,
  • materialize signac jobs with parent pointers stored in state points,
  • record full parent state points in docs for traceability (deps_meta),
  • render row workflows, and
  • migrate existing workspaces, with dependency-pointer updates cascaded downstream automatically instead of rewritten by hand.

Why use it

Signac projects are naturally flat, but real computational work is often staged:

  • Prepare -> simulate -> analyze
  • Preprocess -> train -> evaluate
  • Extract -> transform -> aggregate

grubicy helps when you want those stages to be:

  • cached and reusable (shared intermediates across experiments),
  • explicitly wired (no hidden coupling via shared parameter keys),
  • reviewable and reproducible (the pipeline is a spec file),
  • maintainable over time (schema changes do not break downstream links).

What you get:

  • Explicit dependencies: parent job ids live in the child state point, so “same params but different parents” never collide.
  • One spec for everything: job creation, row workflow rendering, and parameter collection are driven by a single config file.
  • Safe migrations: plan/apply state point migrations and automatically cascade dependency-pointer rewrites downstream, with progress logging.
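The "same params but different parents" point can be sketched in a few lines. This is an illustration, not grubicy's API: it assumes (in the spirit of signac) that a job id is a hash of the canonicalized state point, so including the parent job id in the state point makes otherwise-identical jobs distinct.

```python
import hashlib
import json

def job_id(statepoint):
    """Illustrative job id: a hash of the canonicalized state point
    (signac derives job ids from state points in a similar spirit)."""
    return hashlib.md5(json.dumps(statepoint, sort_keys=True).encode()).hexdigest()

# Two s2 jobs with identical parameters but different parents:
sp_a = {"p2": 10, "test": True, "parent_action": "aaa111"}
sp_b = {"p2": 10, "test": True, "parent_action": "bbb222"}

assert job_id(sp_a) != job_id(sp_b)  # different parents -> different jobs
```

Without the parent pointer in the state point, both jobs would hash to the same id and silently collide.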

When to use it

  • Use grubicy if you have multi-step experiments, pass results downstream between stages, or want row-ready workflows without writing manual include filters.
  • If your project is truly single-stage, grubicy will feel like extra structure you do not need.

Quick start

  1. Install
pip install git+https://github.com/davide-grheco/grubicy

For local development:

uv sync --extra dev
  2. Describe your pipeline (pipeline.toml)
[workspace]
value_file = "signac_statepoint.json"

[[actions]]
name = "s1"
sp_keys = ["p1"]
outputs = ["s1/out.json"]

[[actions]]
name = "s2"
sp_keys = ["p2", "test"]
deps = { action = "s1", sp_key = "parent_action" }
outputs = ["s2/out.json"]

[[actions]]
name = "s3"
sp_keys = ["p3"]
deps = { action = "s2", sp_key = "parent_action" }
outputs = ["s3/out.json"]

[[experiment]]
  [experiment.s1]
  p1 = 1
  [experiment.s2]
  p2 = 10
  test = true
  [experiment.s3]
  p3 = 0.1

Notes:

  • Each [[actions]] block defines a stage.
  • sp_keys lists the parameters that define identity for that stage.
  • deps declares which upstream action this stage depends on. The library writes the upstream job id into the dependent job’s state point using sp_key.
  • Experiments use per-action subsections: parameters do not need to be shared across stages.

Defining multiple experiments:

  • Repeat the [[experiment]] block to create multiple experiment rows. See a complete multi-experiment spec in examples/library-example/pipeline.toml.
  3. Materialize jobs and render a row workflow
grubicy prepare pipeline.toml --output workflow.toml

This will:

  • create/open signac jobs in topological order,
  • write action and dependency pointers (parent job ids) into each state point,
  • store deps_meta in job docs (including full parent state points),
  • generate workflow.toml for row.
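Concretely, for the spec above, the s2 job produced by prepare would carry a state point along these lines (illustrative values; the parent-pointer field name comes from sp_key in the spec):

```json
{
  "action": "s2",
  "p2": 10,
  "test": true,
  "parent_action": "<job id of the s1 parent>"
}
```

and its job document would record the parent's full state point under deps_meta, roughly like this (the exact layout of deps_meta may differ from this sketch):

```json
{
  "deps_meta": {
    "s1": {
      "job_id": "<job id of the s1 parent>",
      "statepoint": {"action": "s1", "p1": 1}
    }
  }
}
```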
  4. Run jobs (only ready directories)
grubicy submit pipeline.toml

If you want to submit everything to row directly, you can still run row submit.

  5. Collect downstream-ready parameters
grubicy collect-params pipeline.toml s3 --format csv > results.csv

This flattens the parameter chain for the s3 stage (and optionally selected doc fields), so you can analyze results without manually walking parents.
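Conceptually, the flattening walks the parent pointers upward and merges each stage's parameters into one row. A minimal sketch of that idea (plain dicts standing in for signac jobs; not grubicy's API):

```python
# Illustrative parameter chain for the quick-start pipeline: each "job"
# is a state point dict, with parent_action holding the parent job id.
jobs = {
    "id1": {"action": "s1", "p1": 1},
    "id2": {"action": "s2", "p2": 10, "test": True, "parent_action": "id1"},
    "id3": {"action": "s3", "p3": 0.1, "parent_action": "id2"},
}

def flatten(job_id, jobs, pointer_key="parent_action"):
    """Merge parameters from a job and all its ancestors into one row."""
    row = {}
    while job_id is not None:
        sp = dict(jobs[job_id])
        job_id = sp.pop(pointer_key, None)   # follow the parent pointer
        sp.pop("action", None)               # drop bookkeeping keys
        for key, value in sp.items():
            row.setdefault(key, value)       # downstream values win on clash
    return row

assert flatten("id3", jobs) == {"p3": 0.1, "p2": 10, "test": True, "p1": 1}
```

The real command also lets you pull selected document fields into the row alongside the state point parameters.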

Core pieces

Spec

  • A spec file contains:
    • actions: list of stages with name, sp_keys, optional deps (parent action + sp_key used to store parent job id), optional outputs, optional runner.
    • experiment: list of experiments with per-action subsections.
    • optional workspace.value_file.
  • Supported formats: TOML and YAML.
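For reference, the quick-start TOML spec could equivalently be written in YAML. This is an illustrative transcription (same keys as the TOML example; the exact accepted layout is defined by the library):

```yaml
workspace:
  value_file: signac_statepoint.json

actions:
  - name: s1
    sp_keys: [p1]
    outputs: [s1/out.json]
  - name: s2
    sp_keys: [p2, test]
    deps: {action: s1, sp_key: parent_action}
    outputs: [s2/out.json]
  - name: s3
    sp_keys: [p3]
    deps: {action: s2, sp_key: parent_action}
    outputs: [s3/out.json]

experiment:
  - s1: {p1: 1}
    s2: {p2: 10, test: true}
    s3: {p3: 0.1}
```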

Materialization

  • Creates/opens jobs in topological order and wires dependencies by writing parent job ids into the child state point. Also writes deps_meta into child job docs so parent state points are recorded for traceability and repair.
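The ordering logic can be sketched with the standard library's graphlib. This sketch assumes hash-of-state-point job ids and plain dicts in place of signac jobs; it only illustrates why parents must be materialized before children:

```python
# Sketch: materialize stages in topological order, writing each parent's
# job id into the child state point. Illustrative, not grubicy's internals.
import hashlib
import json
from graphlib import TopologicalSorter

actions = {
    "s1": {"deps": None, "sp_key": None},
    "s2": {"deps": "s1", "sp_key": "parent_action"},
    "s3": {"deps": "s2", "sp_key": "parent_action"},
}
params = {"s1": {"p1": 1}, "s2": {"p2": 10, "test": True}, "s3": {"p3": 0.1}}

def job_id(sp):
    return hashlib.md5(json.dumps(sp, sort_keys=True).encode()).hexdigest()

# graphlib expects {node: set of predecessors}
graph = {name: {spec["deps"]} - {None} for name, spec in actions.items()}

created = {}
for name in TopologicalSorter(graph).static_order():
    sp = {"action": name, **params[name]}
    dep = actions[name]["deps"]
    if dep is not None:
        sp[actions[name]["sp_key"]] = created[dep]  # parent id is known by now
    created[name] = job_id(sp)
```

Because parents are created first, the child's state point (and hence its id) can include the parent's id, which is exactly what makes the dependency explicit.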

Row rendering

  • Builds workflow.toml with per-action include rules, using either your explicit runner or a default python actions/{name}.py {directory}.
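The default command shape can be sketched as a simple template fill. The entry layout below (name/command/include keys, and the sp.action filter) is hypothetical and only illustrates the idea of per-action include rules:

```python
# Sketch: one workflow entry per action, using the documented default
# runner command "python actions/{name}.py {directory}". The dict shape
# here is hypothetical, not row's actual workflow.toml schema.
actions = ["s1", "s2", "s3"]

entries = [
    {
        "name": name,
        "command": f"python actions/{name}.py {{directory}}",
        "include": [{"sp.action": name}],  # hypothetical include-filter shape
    }
    for name in actions
]
```

An explicit runner in the spec would replace the templated command for that action.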

Collection

  • collect-params flattens parameters (and optional document fields) across the dependency chain for a target stage.

Migration

  • Plan/apply state point migrations with collision detection, cascading parent-pointer rewrites downstream, and restartable progress logs under .pipeline_migrations/.
  • Useful when you add defaults (setdefault) or evolve the schema and need downstream pointers updated consistently.
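Why migrations must cascade can be shown in miniature. Under the assumption that a job id is a hash of its state point, editing a parent's state point changes its id, so every child pointing at the old id must be rewritten, which changes the child's id, and so on. A minimal sketch (plain dicts, not grubicy's migration engine, which additionally writes restartable progress logs):

```python
# Sketch of a cascading pointer rewrite after a state point migration.
import hashlib
import json

def job_id(sp):
    return hashlib.md5(json.dumps(sp, sort_keys=True).encode()).hexdigest()

def migrate(jobs, target_id, update, pointer_key="parent_action"):
    """Apply `update` to one job's state point, then cascade id rewrites."""
    sp = dict(jobs.pop(target_id))
    sp.update(update)
    new_id = job_id(sp)
    assert new_id not in jobs, "collision detected"
    jobs[new_id] = sp
    old_to_new = {target_id: new_id}
    changed = True
    while changed:  # keep rewriting children of any rewritten job
        changed = False
        for jid, child in list(jobs.items()):
            if child.get(pointer_key) in old_to_new:
                new_child = {**child, pointer_key: old_to_new[child[pointer_key]]}
                del jobs[jid]
                old_to_new[jid] = job_id(new_child)
                jobs[old_to_new[jid]] = new_child
                changed = True
    return old_to_new

# A two-stage chain: s2 points at s1.
s1 = {"action": "s1", "p1": 1}
s2 = {"action": "s2", "p2": 10, "parent_action": job_id(s1)}
jobs = {job_id(s1): s1, job_id(s2): s2}

# Adding a default to s1 changes its id; s2's pointer is rewritten to match.
mapping = migrate(jobs, job_id(s1), {"new_default": 0})
```

The collision check is the "plan" half of plan/apply: a migration that would merge two distinct jobs is refused before anything is rewritten.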

Examples

  • examples/sample-project: a plain signac setup with hand-wired parent pointers.
  • examples/library-example: the same pipeline expressed with grubicy (pipeline.toml, CLI materialization, row workflow, and helper-based actions).


Development

  • Install dev deps: uv sync --extra dev
  • Install hooks: uv run pre-commit install
  • Run hooks on all files: uv run pre-commit run --all-files
