Tools for generating Stimela cab definitions from Python functions
hip-cargo
A guide to designing auto-documenting CLI interfaces using Typer + conversion utilities.
If you are creating a new package the instructions below will guide you on how to structure it.
The generate-function utility is available to assist in converting an existing package to the hip-cargo format, but there will be some manual steps involved.
The design goal is a lightweight version of the package that installs only what is needed to generate --help output from the CLI and the cab definitions, which can then be used with stimela.
The full package should be available as a container image that can be used with stimela.
The image should be tagged with the package version so that stimela will automatically pull the image that matches the cab configuration.
Installation
pip install hip-cargo
Or for development:
git clone https://github.com/landmanbester/hip-cargo.git
cd hip-cargo
uv sync
Quick Start
1. Decorate your Python CLI
Something like the following goes in src/mypackage/cli/process.py
import typer
from pathlib import Path
from typing import NewType
from typing_extensions import Annotated
from hip_cargo import stimela_cab, stimela_output
# custom types (stimela has e.g. File, URI, MS and Directory)
File = NewType("File", Path)
URI = NewType("URI", Path)
MS = NewType("MS", Path)
Directory = NewType("Directory", Path)
@stimela_cab(
    name="my_processor",
    info="Process data files",
)
@stimela_output(
    name="output_file",
    dtype="File",
    info="{input_file}.processed",
    required=True,
)
def process(
    input_ms: Annotated[MS, typer.Argument(parser=MS, help="Input MS to process")],  # note the parser=MS bit. This is required for non-standard types
    output_dir: Annotated[Directory, typer.Option(parser=Directory, help="Output Directory for results")] = Path("./output"),
    threshold: Annotated[float, typer.Option(help="Threshold value")] = 0.5,
):
    """
    Process a data file.
    """
    # All your manual parameter wrangling here
    from mypackage.core.process import process as process_core
    return process_core(*args, **kwargs)
Note that *args and **kwargs stand in for the function's parameters, which need to be passed explicitly.
Then register the command in src/mypackage/cli/__init__.py with something like the following:
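As a rough illustration of what explicit forwarding might look like, here is a minimal sketch. The Typer annotations are left out for brevity, and process_core is a stand-in with an assumed signature, not the real mypackage.core.process.process.

```python
from pathlib import Path


def process_core(input_ms: Path, output_dir: Path, threshold: float) -> dict:
    """Stand-in for the core implementation (hypothetical signature)."""
    return {"input_ms": input_ms, "output_dir": output_dir, "threshold": threshold}


def process(input_ms, output_dir=Path("./output"), threshold=0.5):
    """CLI-side wrapper: wrangle the parameters, then forward each one by name."""
    return process_core(
        input_ms=Path(input_ms),
        output_dir=Path(output_dir),
        threshold=float(threshold),
    )


result = process("obs.ms", threshold="0.25")
```

Forwarding by keyword keeps the wrapper and the core function in step even if the parameter order changes in one of them.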
"""Lightweight CLI for mypackage."""
import typer
app = typer.Typer(
    name="mypackage",
    help="Scientific computing package",
    no_args_is_help=True,
)
# Register commands
from mypackage.cli.process import process
app.command(name="process")(process)
__all__ = ["app"]
That's it. If you have something like the following
[project.scripts]
mypackage = "mypackage.cli:app"
in your pyproject.toml, you should now be able to run
mypackage --help
and
mypackage process --help
from the command line and have a beautifully formatted CLI for your package.
Note that you can register multiple commands under app.
2. Generate the Stimela cab definition
If you have the CLI definition you can convert it to a cab using e.g.
cargo generate-cab mypackage.cli.process src/mypackage/cabs/process.yaml
This should be automated using scripts/generate_cabs.py, but the above command is useful for testing.
3. Generate Python function from existing cab (reverse)
If you are converting an existing package to the hip-cargo format, a utility command is available:
cargo generate-function /path/to/existing_cab.yaml -o myfunction.py
Currently, this won't add things like rich_output_panel, but it should help to get you started.
The program should recognize custom types and add the
from pathlib import Path
from typing import NewType
MS = NewType("MS", Path)
bit for you. It should also add the parser=MS in the typer.Option() bit for you.
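One detail worth keeping in mind about these custom types: at runtime a NewType is an identity function, so it returns whatever it is given. The sketch below assumes the parser callable is handed the raw command-line string; as_path is a hypothetical helper, not part of hip-cargo.

```python
from pathlib import Path
from typing import NewType

MS = NewType("MS", Path)

raw = "data/obs.ms"   # a command line hands over plain strings
parsed = MS(raw)      # NewType returns its argument unchanged at runtime


def as_path(value) -> Path:
    """Hypothetical helper: coerce a parsed value to a real Path."""
    return Path(value)


ms_path = as_path(parsed)
```

Under that assumption the wrapped function receives a str, so converting to Path during parameter wrangling, before calling the core implementation, is a reasonable precaution.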
Project Structure for hip-cargo Packages
Packages following the hip-cargo pattern should be structured to enable both lightweight cab definitions and full execution environments:
my-scientific-package/
├── src/
│   └── mypackage/
│       ├── __init__.py
│       ├── utils/            # Utilities used by core algorithms
│       │   ├── __init__.py
│       │   └── operator.py
│       ├── core/             # Core implementations with standard python type hints (no Annotated or custom types)
│       │   ├── __init__.py
│       │   ├── process.py
│       │   └── analyze.py
│       ├── cli/              # Lightweight CLI layer
│       │   ├── __init__.py   # Main Typer app
│       │   ├── process.py    # Individual commands
│       │   └── analyze.py
│       └── cabs/             # Generated cab definitions (inside mypackage)
│           ├── __init__.py
│           ├── process.yaml
│           └── analyze.yaml
├── scripts/
│   └── generate_cabs.py      # Automation script
├── Dockerfile                # For containerization
├── pyproject.toml
└── README.md
Key Principles
- Separate CLI from implementation: Keep CLI modules lightweight with lazy imports. Keep them all in the src/mypackage/cli directory and define the CLI for each command in a separate file. Construct the main Typer app in src/mypackage/cli/__init__.py and register commands there.
- Separate cabs directory at same level as cli: Use hip-cargo to auto-generate cabs into the src/mypackage/cabs/ directory with the generate_cabs.py script. There should be a separate file for each cab.
- Single app, multiple commands: Use one Typer app that registers all commands. If you need a separate app you might as well create a separate repository for it.
- Lazy imports: Import heavy dependencies (NumPy, JAX, Dask) only when executing.
- Linked GitHub package with container image: Maintain an up-to-date Dockerfile that installs the full package and use Docker (or Podman) to upload the image to the GitHub Container Registry. Link this to your GitHub repository.
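The lazy import principle can be sketched with the standard library alone. Here statistics stands in for a heavy dependency such as NumPy, and both function names are made up for the illustration.

```python
import sys


def analyze(values):
    """Hypothetical command body: the 'heavy' import happens only on execution."""
    # `statistics` stands in for NumPy, JAX or Dask in this sketch.
    import statistics

    return statistics.mean(values)


def loaded(module_name: str) -> bool:
    """Report whether a module has already been imported in this process."""
    return module_name in sys.modules


result = analyze([1.0, 2.0, 3.0])
```

Importing the module that defines analyze costs nothing extra; the dependency is only loaded once the command actually runs, which is what keeps --help and cab generation fast in the lightweight install.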
Example Structure
src/mypackage/cli/__init__.py:
"""Lightweight CLI for mypackage."""
import typer
app = typer.Typer(
    name="mypackage",
    help="Scientific computing package",
    no_args_is_help=True,
)
# Register commands
from mypackage.cli.process import process
from mypackage.cli.analyze import analyze
app.command(name="process")(process)
app.command(name="analyze")(analyze)
__all__ = ["app"]
src/mypackage/cli/process.py:
"""Process command - lightweight wrapper."""
from pathlib import Path
from typing import NewType
from typing_extensions import Annotated
import typer
from hip_cargo import stimela_cab, stimela_output
MS = NewType("MS", Path)
@stimela_cab(name="mypackage_process", info="Process data")
@stimela_output(name="output", dtype="File", info="{input_file}.out")
def process(
    input_ms: Annotated[MS, typer.Argument(parser=MS, help="Input File")],
    param: Annotated[float, typer.Option(help="Parameter")] = 1.0,
):
    """Process data files."""
    # Lazy import - only loaded when executing
    from mypackage.core.process import process as process_core
    # Forward the parameters explicitly, using the names from the signature
    return process_core(input_ms, param)
pyproject.toml:
[project]
name = "mypackage"
dependencies = [
"typer>=0.12.0",
"hip-cargo>=0.1.0",
]
[project.optional-dependencies]
# Full scientific stack
full = [
"numpy>=1.24.0",
"jax>=0.4.0",
# ... heavy dependencies
]
[project.scripts]
mypackage = "mypackage.cli:app"
scripts/generate_cabs.py:
"""Generate all cab definitions."""
import subprocess
from pathlib import Path
CLI_MODULES = [
    "mypackage.cli.process",
    "mypackage.cli.analyze",
]
CABS_DIR = Path("src/mypackage/cabs")
CABS_DIR.mkdir(exist_ok=True)
for module in CLI_MODULES:
    cmd_name = module.split(".")[-1]
    output = CABS_DIR / f"{cmd_name}.yaml"
    print(f"Generating {output}...")
    subprocess.run([
        "cargo", "generate-cab",
        module,
        str(output)
    ], check=True)
print("✓ All cabs generated")
Installation Modes
Users can install your package in different ways:
# Lightweight (just CLI and cab definitions)
pip install mypackage
# Full (with all scientific dependencies)
pip install mypackage[full]
# Development
pip install -e "mypackage[full,dev]"
Integration with cult-cargo
For integration with Stimela's cult-cargo:
- Make cabs discoverable:
# src/mypackage/cabs/__init__.py
from pathlib import Path
CAB_DIR = Path(__file__).parent
# Cabs are generated with a .yaml extension (see scripts/generate_cabs.py)
AVAILABLE_CABS = [p.stem for p in CAB_DIR.glob("*.yaml")]
def get_cab_path(name: str) -> Path:
    """Get path to a cab definition."""
    return CAB_DIR / f"{name}.yaml"
- cult-cargo imports lightweight version:
We have to decide whether we want to add this kind of thing to cult-cargo:
# In cult-cargo's pyproject.toml
[tool.poetry.dependencies]
mypackage = "^1.0.0" # Not mypackage[full]
However, it should be possible to just
uv pip install mypackage==x.x.x
without any dependency conflicts. If not we have to think about ephemeral virtual environments.
- Users run with Stimela:
# Native: requires full installation
pip install mypackage[full]
stimela run recipe.yml
# Singularity: uses container (lightweight install sufficient)
pip install mypackage
stimela run recipe.yml -S
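The cab discovery helpers from the first step above can be tried out without installing anything. This is a minimal sketch that uses a temporary directory as a stand-in for src/mypackage/cabs/ and the .yaml extension from the project layout; the explicit cab_dir argument and the error handling are additions for the illustration.

```python
import tempfile
from pathlib import Path


def available_cabs(cab_dir: Path) -> list[str]:
    """List cab names (file stems) found in cab_dir."""
    return sorted(p.stem for p in cab_dir.glob("*.yaml"))


def get_cab_path(cab_dir: Path, name: str) -> Path:
    """Return the path to a named cab, failing loudly if it is missing."""
    path = cab_dir / f"{name}.yaml"
    if not path.is_file():
        raise FileNotFoundError(f"no cab named {name!r} in {cab_dir}")
    return path


# Stand-in for src/mypackage/cabs/: a temporary directory with two dummy cabs.
with tempfile.TemporaryDirectory() as tmp:
    cab_dir = Path(tmp)
    for cab in ("process", "analyze"):
        (cab_dir / f"{cab}.yaml").write_text("cabs: {}\n")
    names = available_cabs(cab_dir)
    process_cab = get_cab_path(cab_dir, "process").name
```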
Container Images and GitHub Actions
For Stimela to use your package in containerized environments, you should publish OCI container images to GitHub Container Registry (ghcr.io). This section shows how to automate this with GitHub Actions.
1. Create a Dockerfile
Add a Dockerfile at the root of your repository:
FROM python:3.11-slim
WORKDIR /app
# Install uv for fast package installation
COPY --from=ghcr.io/astral-sh/uv:latest /uv /usr/local/bin/uv
# Copy package files
COPY pyproject.toml README.md ./
COPY src/ src/
# Install package with full dependencies using uv (much faster than pip)
RUN uv pip install --system --no-cache ".[full]"
# Make CLI available
ENTRYPOINT ["mypackage"]
CMD ["--help"]
2. Set up GitHub Actions Workflow
Create .github/workflows/publish-container.yml:
name: Build and Publish Container
on:
  push:
    tags:
      - 'v*.*.*' # Trigger on version tags (e.g., v1.0.0)
  workflow_dispatch: # Allow manual triggering
env:
  REGISTRY: ghcr.io
  IMAGE_NAME: ${{ github.repository }}
jobs:
  build-and-push:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      packages: write
    steps:
      - name: Checkout repository
        uses: actions/checkout@v5
      - name: Log in to Container Registry
        uses: docker/login-action@v3
        with:
          registry: ${{ env.REGISTRY }}
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
      - name: Extract metadata (tags, labels)
        id: meta
        uses: docker/metadata-action@v5
        with:
          images: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}
          tags: |
            type=semver,pattern={{version}}
            type=semver,pattern={{major}}.{{minor}}
            type=semver,pattern={{major}}
            type=sha,prefix={{branch}}-
      - name: Build and push Docker image
        uses: docker/build-push-action@v5
        with:
          context: .
          push: true
          tags: ${{ steps.meta.outputs.tags }}
          labels: ${{ steps.meta.outputs.labels }}
3. Link Container to GitHub Package
To associate the container image with your repository:
- Automatic linking: If your workflow pushes to ghcr.io/username/repository-name, GitHub automatically creates a package linked to the repository.
- Manual linking (if needed):
  - Go to your repository on GitHub
  - Navigate to the "Packages" section
  - Click on your container package
  - Click "Connect repository" in the sidebar
  - Select your repository from the dropdown
- Set package visibility:
  - In the package settings, set visibility to "Public" for open-source projects
  - This allows Stimela to pull images without authentication
4. Version Tagging Best Practices
The workflow above creates multiple tags for each release:
# For release v1.2.3, creates:
ghcr.io/username/mypackage:1.2.3 # Full version
ghcr.io/username/mypackage:1.2 # Minor version
ghcr.io/username/mypackage:1 # Major version
ghcr.io/username/mypackage:main-sha123456 # Branch + commit SHA
This allows users to pin to specific versions or track latest minor/major releases.
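To see how the three semver patterns map a git tag onto image tags, here is a small sketch. It only mimics the tag names listed above for a plain vMAJOR.MINOR.PATCH tag; it is not how docker/metadata-action is implemented, and both helper names are hypothetical.

```python
import re


def semver_tags(git_tag: str) -> list[str]:
    """Mimic the {{version}}, {{major}}.{{minor}} and {{major}} patterns."""
    match = re.fullmatch(r"v?(\d+)\.(\d+)\.(\d+)", git_tag)
    if match is None:
        raise ValueError(f"not a plain semantic version tag: {git_tag!r}")
    major, minor, patch = match.groups()
    return [f"{major}.{minor}.{patch}", f"{major}.{minor}", major]


def image_refs(image: str, git_tag: str) -> list[str]:
    """Full image references for every tag derived from git_tag."""
    return [f"{image}:{tag}" for tag in semver_tags(git_tag)]


refs = image_refs("ghcr.io/username/mypackage", "v1.2.3")
```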
5. Triggering a Build
Automated (recommended):
# Create and push a version tag
git tag v1.0.0
git push origin v1.0.0
The GitHub Action will automatically build and publish the container.
Manual:
- Go to "Actions" tab in GitHub
- Select "Build and Publish Container"
- Click "Run workflow"
6. Using the Container with Stimela
Once published, users can reference your container in Stimela recipes:
cabs:
  - name: mypackage
    image: ghcr.io/username/mypackage:1.0.0
Stimela will automatically pull the matching version based on the cab configuration.
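One way to keep the image reference in step with the installed package version is to derive it from the package metadata. The sketch below is an assumption about how this could be done, not something hip-cargo or Stimela provides; image_reference and its fallback argument are hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version


def image_reference(package: str, image: str, fallback: str = "latest") -> str:
    """Build '<image>:<installed version>', or use the fallback tag if not installed."""
    try:
        tag = version(package)
    except PackageNotFoundError:
        tag = fallback
    return f"{image}:{tag}"


# A name that is not installed exercises the fallback branch.
ref = image_reference("mypackage-sketch-not-installed", "ghcr.io/username/mypackage")
```

With the package installed at version 1.0.0, the same call would yield an image reference ending in :1.0.0, which matches the full version tag created by the workflow.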
7. Local Testing
Test your container locally before pushing:
# Build
docker build -t mypackage:test .
# Run
docker run --rm mypackage:test --help
docker run --rm mypackage:test process --help
# Test with mounted data
docker run --rm -v $(pwd)/data:/data mypackage:test process /data/input.ms
Type Inference
hip-cargo automatically recognizes custom stimela types. The generate-function command should add
from pathlib import Path
from typing import NewType
MS = NewType("MS", Path)
Directory = NewType("Directory", Path)
URI = NewType("URI", Path)
File = NewType("File", Path)
to the preamble of functions generated from cabs that use these types.
It should also add the parser bit to the type hint Annotation e.g. for the custom MS dtype we need
def process(input_ms: Annotated[MS, typer.Option(parser=MS)]):
    pass
One quirk of this approach is that parameters which have None as the default need to be defined as e.g.
def process(input_ms: Annotated[MS | None, typer.Option(parser=MS)] = None):
    pass
Python then parses this as Optional[MS], which is just an alias for Union[MS, None]. This should be handled correctly, such that the generate-cab command places dtype: MS in the cab definition and the generate-function command generates the function signature above. Custom types are currently limited to at most two members in the Union (one of which may be None), and these should be specified using the newer dtype1 | dtype2 syntax in the function definition. All standard Python types should just work.
Decorators
@stimela_cab
Marks a function as a Stimela cab.
- name: Cab name
- info: Description
- policies: Optional dict of cab-level policies
@stimela_output
Defines a stimela output. When defining functions from cabs the generate-function command should check for the following parameter fields
- name: Output name (top level, one below cabs)
- dtype: Data type (File, Directory, MS, etc.)
- info: Help string
- required: Whether output is required (default: False)
- implicit: If implicit is True, the parameter should not be placed in the function definition. If implicit is False (the default), the parameter needs to be added to the function signature.
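The implicit rule can be pictured as a simple filter. In this sketch the outputs are plain dictionaries carrying the fields listed above; that representation and the signature_outputs helper are assumptions for the illustration, not hip-cargo's internal data model.

```python
def signature_outputs(outputs: list[dict]) -> list[str]:
    """Names of outputs that belong in the function signature (the non-implicit ones)."""
    return [out["name"] for out in outputs if not out.get("implicit", False)]


outputs = [
    {"name": "output_file", "dtype": "File", "required": True},
    {"name": "log_file", "dtype": "File", "implicit": True},
]
in_signature = signature_outputs(outputs)
```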
Features
- ✅ Automatic type inference from Python type hints
- ✅ Support for Typer Arguments (positional) and Options
- ✅ Multiple outputs automatically added to function signature if they are not implicit
- ✅ List types with automatic repeat: list policy
- ✅ Proper handling of default values and required parameters
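As an illustration of the list handling, a list type hint can be detected with typing.get_origin. The sketch below assumes the policy is represented as a small dictionary; list_policy is a hypothetical helper and not part of the hip-cargo API.

```python
from typing import List, get_origin


def list_policy(hint) -> dict:
    """Return {'repeat': 'list'} for list hints, otherwise an empty dict."""
    if hint is list or get_origin(hint) is list:
        return {"repeat": "list"}
    return {}


policies = list_policy(list[str])
```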
Development
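This project uses uv for dependency management, ruff for linting and formatting, pytest for testing, and pre-commit for commit hooks.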
Setting Up Development Environment
# Clone the repository
git clone https://github.com/landmanbester/hip-cargo.git
cd hip-cargo
# Install dependencies with development tools
uv sync --group dev
# Install pre-commit hooks (recommended)
uv run pre-commit install
Pre-commit Hooks
This project uses pre-commit to automatically check code quality before commits. The hooks run:
- ruff linting: Checks code style and catches common errors
- ruff formatting: Ensures consistent code formatting
- trailing whitespace: Removes trailing whitespace
- end-of-file-fixer: Ensures files end with a newline
- check-yaml: Validates YAML syntax
- check-toml: Validates TOML syntax
- check-merge-conflict: Prevents committing merge conflict markers
- check-added-large-files: Prevents accidentally committing large files
Installing Pre-commit Hooks
After cloning the repository, install the pre-commit hooks:
uv run pre-commit install
This will automatically run the hooks before each commit. If any checks fail, the commit will be blocked until you fix the issues.
Running Hooks Manually
You can run the hooks manually on all files:
# Run on all files
uv run pre-commit run --all-files
# Run on staged files only
uv run pre-commit run
Updating Hook Versions
To update hook versions to the latest:
uv run pre-commit autoupdate
Manual Code Quality Checks
If you prefer to run checks manually without pre-commit:
# Format code
uv run ruff format .
# Check and auto-fix linting issues
uv run ruff check . --fix
# Run tests
uv run pytest -v
# Run tests with coverage
uv run pytest --cov=hip_cargo --cov-report=term-missing
Contributing Workflow
- Create a feature branch:
  git checkout -b feature/your-feature-name
- Make your changes and ensure tests pass:
  uv run pytest -v
- Format and lint (automatically done by pre-commit):
  git add .
  git commit -m "feat: your feature description"
  # Pre-commit hooks run automatically
- Push and create a pull request:
  git push origin feature/your-feature-name
License
MIT License