
Config-driven helpers for signac workflows with explicit dependencies and migrations

Project description

grubicy


grubicy is a small helper library + CLI that layers lightweight dependency management on top of signac.

It is named after Vittore Grubicy de Dragon, an influential promoter of Italian Divisionism. That movement “divided” light and color into strokes; grubicy does the same for workflows: it divides a signac project into stages, connects them with explicit parent -> child links, and keeps those links consistent even as your schema evolves.

With one TOML/YAML spec you can:

  • describe multi-action pipelines in a single file,
  • materialize signac jobs with parent pointers stored in state points,
  • record full parent state points in docs for traceability (deps_meta),
  • render row workflows, and
  • migrate existing workspaces, cascading dependency-pointer updates downstream instead of rewriting them by hand.

Why use it

Signac projects are naturally flat, but real computational work is often staged:

  • Prepare -> simulate -> analyze
  • Preprocess -> train -> evaluate
  • Extract -> transform -> aggregate

grubicy helps when you want those stages to be:

  • cached and reusable (shared intermediates across experiments),
  • explicitly wired (no hidden coupling via shared parameter keys),
  • reviewable and reproducible (the pipeline is a spec file),
  • maintainable over time (schema changes do not break downstream links).

What you get:

  • Explicit dependencies: parent job ids live in the child state point, so “same params but different parents” never collide.
  • One spec for everything: job creation, row workflow rendering, and parameter collection are driven by a single config file.
  • Safe migrations: plan/apply state point migrations and automatically cascade dependency-pointer rewrites downstream, with progress logging.
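
As a sketch (job ids abbreviated and hypothetical), two s2 jobs with identical parameters but different s1 parents get distinct state points, because the parent id is part of the identity:

```json
[
  { "action": "s2", "p2": 10, "test": true, "parent_action": "<id-of-s1-with-p1=1>" },
  { "action": "s2", "p2": 10, "test": true, "parent_action": "<id-of-s1-with-p1=2>" }
]
```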

When to use it

  • Use grubicy if you have multi-step experiments, pass results downstream between stages, or want row-ready workflows without writing manual include filters.
  • If your project is truly single-stage, grubicy will feel like extra structure you do not need.

Quick start

  1. Install
pip install git+https://github.com/davide-grheco/grubicy

For local development:

uv sync --extra dev
  2. Describe your pipeline (pipeline.toml)
[workspace]
value_file = "signac_statepoint.json"

[[actions]]
name = "s1"
sp_keys = ["p1"]
outputs = ["s1/out.json"]

[[actions]]
name = "s2"
sp_keys = ["p2", "test"]
deps = { action = "s1", sp_key = "parent_action" }
outputs = ["s2/out.json"]

[[actions]]
name = "s3"
sp_keys = ["p3"]
deps = { action = "s2", sp_key = "parent_action" }
outputs = ["s3/out.json"]

[[experiment]]
  [experiment.s1]
  p1 = 1
  [experiment.s2]
  p2 = 10
  test = true
  [experiment.s3]
  p3 = 0.1

Notes:

  • Each [[actions]] block defines a stage.
  • sp_keys lists the parameters that define identity for that stage.
  • deps declares which upstream action this stage depends on. The library writes the upstream job id into the dependent job's state point under the key named by sp_key.
  • Experiments use per-action subsections: parameters do not need to be shared across stages.

Defining multiple experiments:

  • Repeat the [[experiment]] block to create multiple experiment rows. See a complete multi-experiment spec in examples/library-example/pipeline.toml.
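The deps wiring above can be sketched in plain Python. This is illustrative only: real signac job ids are computed differently, and the `job_id` helper here is an assumption, not grubicy's implementation.

```python
# Sketch of why "same params, different parents" never collide:
# the parent's job id is part of the child's state point, so it is
# part of the child's identity.
import hashlib
import json

def job_id(statepoint: dict) -> str:
    """Content-address a state point (sketch; real signac ids differ)."""
    blob = json.dumps(statepoint, sort_keys=True).encode()
    return hashlib.sha1(blob).hexdigest()[:8]

# Two s1 parents with different p1 values.
parent_a = {"action": "s1", "p1": 1}
parent_b = {"action": "s1", "p1": 2}

# Same s2 parameters, but each child embeds its parent's id
# (spec: sp_key = "parent_action").
child_a = {"action": "s2", "p2": 10, "test": True, "parent_action": job_id(parent_a)}
child_b = {"action": "s2", "p2": 10, "test": True, "parent_action": job_id(parent_b)}

print(job_id(child_a) != job_id(child_b))  # True: distinct identities
```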
  3. Materialize jobs and render a row workflow
grubicy prepare pipeline.toml --output workflow.toml

This will:

  • create/open signac jobs in topological order,
  • write action and dependency pointers (parent job ids) into each state point,
  • store deps_meta in job docs (including full parent state points),
  • generate workflow.toml for row.
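
For orientation, the information recorded in a child job's doc might look like the following sketch. The exact layout of deps_meta is grubicy's; the key names inside it are hypothetical and shown only to indicate what gets stored:

```json
{
  "deps_meta": {
    "s1": {
      "job_id": "<parent-s1-job-id>",
      "statepoint": { "action": "s1", "p1": 1 }
    }
  }
}
```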
  4. Run jobs (only ready directories)
grubicy submit pipeline.toml

If you want to submit everything to row directly, you can still run row submit.

  5. Collect downstream-ready parameters
grubicy collect-params pipeline.toml s3 --format csv > results.csv

This flattens the parameter chain for the s3 stage (and optionally selected doc fields), so you can analyze results without manually walking parents.
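
Assuming the quick-start spec, the flattened output might resemble the sketch below. Column names and order are grubicy's choice; this only illustrates that parameters from s1, s2, and s3 end up in one row:

```
p1,p2,test,p3
1,10,true,0.1
```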

Core pieces

Spec

  • A spec file contains:
    • actions: list of stages with name, sp_keys, optional deps (parent action + sp_key used to store parent job id), optional outputs, optional runner.
    • experiment: list of experiments with per-action subsections.
    • optional workspace.value_file.
  • Supported formats: TOML and YAML.

Materialization

  • Creates/opens jobs in topological order and wires dependencies by writing parent job ids into the child state point. Also writes deps_meta into child job docs so parent state points are recorded for traceability and repair.
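
The ordering idea can be sketched with the standard library's topological sorter (a sketch of creation order only, not grubicy's internals), using the quick-start chain s1 -> s2 -> s3:

```python
# Resolve a valid creation order for a DAG of actions:
# each child maps to the set of parents it depends on.
from graphlib import TopologicalSorter

deps = {"s1": set(), "s2": {"s1"}, "s3": {"s2"}}  # child -> parents
order = list(TopologicalSorter(deps).static_order())
print(order)  # ['s1', 's2', 's3']: parents always come first
```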

Row rendering

  • Builds workflow.toml with per-action include rules, using either your explicit runner or a default python actions/{name}.py {directory}.
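
For the quick-start spec, one rendered entry might resemble the fragment below. This is illustrative only: the field names follow row's workflow.toml conventions as the author understands them, and the exact rendering is grubicy's:

```toml
# Hypothetical rendered action for the "s2" stage, using the default runner.
[[action]]
name = "s2"
command = "python actions/s2.py {directory}"

[action.group]
include = [["/action", "==", "s2"]]
```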

Collection

  • collect-params flattens parameters (and optional document fields) across the dependency chain for a target stage.

Migration

  • Plan/apply state point migrations with collision detection, cascading parent-pointer rewrites downstream, and restartable progress logs under .pipeline_migrations/.
  • Useful when you add defaults (setdefault) or evolve the schema and need downstream pointers updated consistently.
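
The cascade is the interesting part: changing a parent's state point changes its job id, so every child pointing at the old id must be rewritten, which changes the child's id in turn. A minimal sketch of that idea (the `job_id` and `cascade` helpers are assumptions for illustration, not grubicy's API):

```python
# Sketch of cascading dependency-pointer rewrites after a migration.
import hashlib
import json

def job_id(sp: dict) -> str:
    """Content-address a state point (sketch; real signac ids differ)."""
    return hashlib.sha1(json.dumps(sp, sort_keys=True).encode()).hexdigest()[:8]

def cascade(statepoints, old_id, new_id, sp_key="parent_action"):
    """Rewrite every pointer equal to old_id, then recurse: rewriting a
    child changes that child's own id, so its children need updating too."""
    renames = []
    for sp in statepoints:
        if sp.get(sp_key) == old_id:
            before = job_id(sp)
            sp[sp_key] = new_id
            renames.append((before, job_id(sp)))
            renames += cascade(statepoints, before, job_id(sp), sp_key)
    return renames

# Build the quick-start chain s1 -> s2 -> s3.
s1 = {"action": "s1", "p1": 1}
s2 = {"action": "s2", "p2": 10, "test": True, "parent_action": job_id(s1)}
s3 = {"action": "s3", "p3": 0.1, "parent_action": job_id(s2)}

# Migrate s1's state point; its id changes, so pointers must cascade.
old = job_id(s1)
s1["p1"] = 2
renames = cascade([s2, s3], old, job_id(s1))
print(len(renames))  # 2: s2 was rewritten, then s3
```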

Examples

  • examples/sample-project: a plain signac setup with hand-wired parent pointers.
  • examples/library-example: the same pipeline expressed with grubicy (pipeline.toml, CLI materialization, row workflow, and helper-based actions).


Development

  • Install dev deps: uv sync --extra dev
  • Install hooks: uv run pre-commit install
  • Run hooks on all files: uv run pre-commit run --all-files

Project details



File details

Details for the file grubicy-1.3.0.tar.gz.

File metadata

  • Download URL: grubicy-1.3.0.tar.gz
  • Upload date:
  • Size: 35.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for grubicy-1.3.0.tar.gz:

  • SHA256: 49720fd7fa9e240befdce5e444cd802a460e04e94914673fa8e4c41403894fc3
  • MD5: 7bf1701aa1b33e6c2a8f371a1709d875
  • BLAKE2b-256: f555aa957e3817038cb1e43ee31f03674e8eb56740dbc8231ce8727101945399


Provenance

The following attestation bundles were made for grubicy-1.3.0.tar.gz:

Publisher: publish.yml on davide-grheco/grubicy

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file grubicy-1.3.0-py3-none-any.whl.

File metadata

  • Download URL: grubicy-1.3.0-py3-none-any.whl
  • Upload date:
  • Size: 30.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for grubicy-1.3.0-py3-none-any.whl:

  • SHA256: fcf762d93ec0307c0c5f55584535adf492d4f4cd292c979310d9ab5fec0d8208
  • MD5: 4a6287f113bc94e25e437cbe3673782f
  • BLAKE2b-256: 28a1403145bdae03480bd6a4dc9ec605dab831abac973c8c1a9f8a3122735322


Provenance

The following attestation bundles were made for grubicy-1.3.0-py3-none-any.whl:

Publisher: publish.yml on davide-grheco/grubicy

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
