
Config-driven helpers for signac workflows with explicit dependencies and migrations


grubicy


grubicy is a small helper library + CLI that layers lightweight dependency management on top of signac.

It is named after Vittore Grubicy de Dragon, an influential promoter of Italian Divisionism. That movement “divided” light and color into strokes; grubicy does the same for workflows: it divides a signac project into stages, connects them with explicit parent -> child links, and keeps those links consistent even as your schema evolves.

With one TOML/YAML spec you can:

  • describe multi-action pipelines in a single file,
  • materialize signac jobs with parent pointers stored in state points,
  • record full parent state points in docs for traceability (deps_meta),
  • render row workflows, and
  • migrate existing workspaces, with dependency pointers cascaded automatically instead of rewritten by hand.

Why use it

Signac projects are naturally flat, but real computational work is often staged:

  • Prepare -> simulate -> analyze
  • Preprocess -> train -> evaluate
  • Extract -> transform -> aggregate

grubicy helps when you want those stages to be:

  • cached and reusable (shared intermediates across experiments),
  • explicitly wired (no hidden coupling via shared parameter keys),
  • reviewable and reproducible (the pipeline is a spec file),
  • maintainable over time (schema changes do not break downstream links).

What you get:

  • Explicit dependencies: parent job ids live in the child state point, so “same params but different parents” never collide.
  • One spec for everything: job creation, row workflow rendering, and parameter collection are driven by a single config file.
  • Safe migrations: plan/apply state point migrations and automatically cascade dependency-pointer rewrites downstream, with progress logging.
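To see why keeping parent job ids in the child state point prevents collisions, here is a minimal sketch in plain Python. It uses a toy hash-based id in place of signac's real job-id derivation; the names are illustrative, not grubicy's API:

```python
import hashlib
import json

def job_id(statepoint: dict) -> str:
    # Toy deterministic id from the sorted state point; signac derives
    # real job ids differently, but the principle is the same.
    blob = json.dumps(statepoint, sort_keys=True).encode()
    return hashlib.sha1(blob).hexdigest()[:8]

parent_a = job_id({"action": "s1", "p1": 1})
parent_b = job_id({"action": "s1", "p1": 2})

# Same child parameters, different parents -> different child identities,
# because the parent id is part of the child's state point.
child_a = job_id({"action": "s2", "p2": 10, "parent_action": parent_a})
child_b = job_id({"action": "s2", "p2": 10, "parent_action": parent_b})
assert child_a != child_b
```

Without the parent pointer, both children would share the state point {"action": "s2", "p2": 10} and collapse into one job.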

When to use it

  • Use grubicy if you have multi-step experiments, pass results downstream between stages, or want row-ready workflows without writing manual include filters.
  • If your project is truly single-stage, grubicy will feel like extra structure you do not need.

Quick start

  1. Install
pip install git+https://github.com/davide-grheco/grubicy

For local development:

uv sync --extra dev
  2. Describe your pipeline (pipeline.toml)
[workspace]
value_file = "signac_statepoint.json"

[[actions]]
name = "s1"
sp_keys = ["p1"]
outputs = ["s1/out.json"]

[[actions]]
name = "s2"
sp_keys = ["p2", "test"]
deps = { action = "s1", sp_key = "parent_action" }
outputs = ["s2/out.json"]

[[actions]]
name = "s3"
sp_keys = ["p3"]
deps = { action = "s2", sp_key = "parent_action" }
outputs = ["s3/out.json"]

[[experiment]]
  [experiment.s1]
  p1 = 1
  [experiment.s2]
  p2 = 10
  test = true
  [experiment.s3]
  p3 = 0.1

Notes:

  • Each [[actions]] block defines a stage.
  • sp_keys lists the parameters that define identity for that stage.
  • deps declares which upstream action this stage depends on. The library writes the upstream job id into the dependent job’s state point using sp_key.
  • Experiments use per-action subsections: parameters do not need to be shared across stages.

Defining multiple experiments:

  • Repeat the [[experiment]] block to create multiple experiment rows. See a complete multi-experiment spec in examples/library-example/pipeline.toml.
  3. Materialize jobs and render a row workflow
grubicy prepare pipeline.toml --project . --output workflow.toml

This will:

  • create/open signac jobs in topological order,
  • write action and dependency pointers (parent job ids) into each state point,
  • store deps_meta in job docs (including full parent state points),
  • generate workflow.toml for row.
  4. Run jobs (only ready directories)
grubicy submit pipeline.toml --project .

If you want to submit everything to row directly, you can still run row submit.

  5. Collect downstream-ready parameters
grubicy collect-params pipeline.toml s3 --format csv > results.csv

This flattens the parameter chain for the s3 stage (and optionally selected doc fields), so you can analyze results without manually walking parents.
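Conceptually, collection walks the parent pointers upward from the target stage and merges each stage's parameters into one flat row. A toy model with an in-memory workspace (hypothetical job ids and a plain dict in place of a signac project; not grubicy's implementation):

```python
# Hypothetical "workspace": job id -> state point.
jobs = {
    "a1": {"action": "s1", "p1": 1},
    "b7": {"action": "s2", "p2": 10, "test": True, "parent_action": "a1"},
    "c3": {"action": "s3", "p3": 0.1, "parent_action": "b7"},
}

def flatten(job_id, parent_key="parent_action"):
    """Walk parent pointers upward, then merge parameters root-first
    so downstream values win when keys collide."""
    chain = []
    while job_id is not None:
        sp = dict(jobs[job_id])
        job_id = sp.pop(parent_key, None)  # follow the pointer, drop it
        chain.append(sp)
    row = {}
    for sp in reversed(chain):  # root stage first
        row.update(sp)
    return row

row = flatten("c3")
assert row == {"action": "s3", "p1": 1, "p2": 10, "test": True, "p3": 0.1}
```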

Core pieces

Spec

  • A spec file contains:
    • actions: list of stages with name, sp_keys, optional deps (parent action + sp_key used to store parent job id), optional outputs, optional runner.
    • experiment: list of experiments with per-action subsections.
    • optional workspace.value_file.
  • Supported formats: TOML and YAML.

Materialization

  • Creates/opens jobs in topological order and wires dependencies by writing parent job ids into the child state point. Also writes deps_meta into child job docs so parent state points are recorded for traceability and repair.

Row rendering

  • Builds workflow.toml with per-action include rules, using either your explicit runner or a default python actions/{name}.py {directory}.
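As an illustration only (the exact output is defined by row and grubicy, not reproduced here), a rendered entry for the s2 stage might look roughly like:

```toml
[[action]]
name = "s2"
command = "python actions/s2.py {directory}"

[action.group]
# Select only directories whose value file marks them as s2 jobs.
include = [["/action", "==", "s2"]]
```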

Collection

  • collect-params flattens parameters (and optional document fields) across the dependency chain for a target stage.

Migration

  • Plan/apply state point migrations with collision detection, cascading parent-pointer rewrites downstream, and restartable progress logs under .pipeline_migrations/.
  • Useful when you add defaults (setdefault) or evolve the schema and need downstream pointers updated consistently.
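The cascade can be pictured as a depth-first rewrite: changing a parent's state point changes its id, so every downstream pointer must be updated, which changes those jobs' ids in turn. A toy model with hash-based ids (an illustration of the idea, not grubicy's migration engine):

```python
import hashlib
import json

def job_id(sp: dict) -> str:
    # Toy deterministic id from the state point (illustration only).
    return hashlib.sha1(json.dumps(sp, sort_keys=True).encode()).hexdigest()[:8]

def migrate(workspace: dict, target_id: str, update: dict,
            parent_key: str = "parent_action") -> str:
    """Apply `update` to one job's state point, then cascade the
    resulting id change to every job that points at the old id."""
    sp = {**workspace.pop(target_id), **update}
    new_id = job_id(sp)
    if new_id in workspace:
        raise ValueError(f"collision: {new_id} already exists")
    workspace[new_id] = sp
    children = [j for j, s in workspace.items()
                if s.get(parent_key) == target_id]
    for child in children:
        migrate(workspace, child, {parent_key: new_id}, parent_key)
    return new_id

# A three-stage chain: s1 <- s2 <- s3.
s1 = {"action": "s1", "p1": 1}
id1 = job_id(s1)
s2 = {"action": "s2", "p2": 10, "parent_action": id1}
id2 = job_id(s2)
s3 = {"action": "s3", "p3": 0.1, "parent_action": id2}
workspace = {id1: s1, id2: s2, job_id(s3): s3}

# Add a new default key to s1; s2 and s3 pointers are rewritten in turn.
new_id1 = migrate(workspace, id1, {"seed": 0})
```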

Examples

  • examples/sample-project: a plain signac setup with hand-wired parent pointers.
  • examples/library-example: the same pipeline expressed with grubicy (pipeline.toml, CLI materialization, row workflow, and helper-based actions).


Development

  • Install dev deps: uv sync --extra dev
  • Install hooks: uv run pre-commit install
  • Run hooks on all files: uv run pre-commit run --all-files

