Workflow compiler for generating DAG artifacts from workflow specifications
wt-compiler
Workflow compiler for generating DAG artifacts from workflow specifications.
Overview
wt-compiler is a key component of the wt (workflow toolkit) ecosystem. It compiles workflow specifications (YAML files) into complete, executable workflow packages including:
- DAG Python code (async, sequential, and Jupytext variants)
- Pydantic parameter models with JSON schemas
- CLI interfaces for workflow execution
- Pixi configuration for dependency management
- Dockerfiles for containerized deployment
- Test suites
Key Innovation: Environment-Isolated Task Discovery
Unlike legacy systems that require importing task libraries directly, wt-compiler uses subprocess-based task discovery:
- Creates ephemeral rattler/pixi environments with specified requirements
- Calls the wt-registry CLI in that environment
- Parses JSON output (validated against wt-contracts schemas)
- Compiles workflows without Python import dependencies on task libraries
This enables:
- ✅ Cross-environment compilation (Python 3.10 compiler can target Python 3.12 tasks)
- ✅ Isolation from task library dependency conflicts
- ✅ Type-safe contracts via the wt-contracts package
- ✅ No circular dependencies between packages
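The subprocess-based discovery flow described above can be illustrated with a minimal sketch. It does not call wt-registry or create a rattler/pixi environment: a child Python process stands in for the CLI, and the JSON payload shape (`tasks`, `name`, `module`) is an assumption made for this example, not the wt-contracts schema.

```python
import json
import subprocess
import sys

# Stand-in for the real registry CLI: a child Python process that prints JSON.
# The payload shape is hypothetical, not the wt-contracts schema.
FAKE_REGISTRY_CMD = [
    sys.executable,
    "-c",
    "import json; print(json.dumps({'tasks': [{'name': 'extract_data', 'module': 'my_lib.tasks'}]}))",
]


def discover_via_subprocess(cmd: list[str]) -> dict[str, str]:
    """Run a discovery command in a child process and parse its JSON output."""
    proc = subprocess.run(cmd, capture_output=True, text=True, check=True)
    payload = json.loads(proc.stdout)
    # Minimal structural check; a real implementation would validate the
    # payload against a shared schema instead.
    if not isinstance(payload, dict) or not isinstance(payload.get("tasks"), list):
        raise ValueError("discovery output is missing a 'tasks' list")
    return {task["name"]: task["module"] for task in payload["tasks"]}


tasks = discover_via_subprocess(FAKE_REGISTRY_CMD)
print(tasks)
```

Because the parent process only sees text on stdout, it never imports the task library, which is what allows the compiler and the tasks to run on different Python versions.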
Installation
# From source (development)
cd wt/wt-compiler
uv sync
# Once published to PyPI
uv add wt-compiler
Usage
Scaffold a new workflow project
# Interactive mode (default) — arrow-key prompts for all fields
wt-compiler scaffold init
# Write into a specific parent directory
wt-compiler scaffold init --output-dir /path/to/projects
# Overwrite an existing directory
wt-compiler scaffold init --clobber
# Batch mode — supply all required fields as flags (CI / scripting)
wt-compiler scaffold init --no-interactive \
  --workflow-id my_workflow \
  --workflow-name "My Workflow" \
  --author-name "Jane Smith"
# Batch mode with a conda requirement
wt-compiler scaffold init --no-interactive \
  --workflow-id my_workflow \
  --workflow-name "My Workflow" \
  --author-name "Jane Smith" \
  --requirements '{"name":"numpy","version":">=1.0","channel":"conda-forge"}'
# --requirements is repeatable for multiple packages
wt-compiler scaffold init --no-interactive ... \
  --requirements '{"name":"numpy","version":">=1.0"}' \
  --requirements '{"name":"mypkg","path":"/abs/path/to/mypkg"}'
init scaffolds a new project directory at <output-dir>/<workflow-id>/ containing a
spec.yaml, CI configuration, and packaging boilerplate. See
src/wt_compiler/wizard/README.md for details on
customising the wizard or adding custom templates.
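To show how a repeatable flag carrying JSON values, like --requirements above, can be handled, here is a small sketch using the standard library. It assumes an argparse-style parser purely for illustration; how wt-compiler actually parses its flags is not stated here, and `parse_requirement` is a hypothetical helper.

```python
import argparse
import json


def parse_requirement(raw: str) -> dict:
    """Decode one --requirements value and check that it names a package."""
    try:
        req = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise argparse.ArgumentTypeError(f"invalid JSON: {exc}") from exc
    if not isinstance(req, dict) or "name" not in req:
        raise argparse.ArgumentTypeError("a requirement needs a 'name' field")
    return req


# Hypothetical parser; action="append" collects every occurrence of the flag.
parser = argparse.ArgumentParser(prog="scaffold-init-sketch")
parser.add_argument("--requirements", action="append", type=parse_requirement, default=[])

args = parser.parse_args([
    "--requirements", '{"name":"numpy","version":">=1.0","channel":"conda-forge"}',
    "--requirements", '{"name":"mypkg","path":"/abs/path/to/mypkg"}',
])
print([req["name"] for req in args.requirements])
```

Single-quoting each JSON value in the shell, as the commands above do, keeps the inner double quotes intact.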
Use a custom wizard provider
Third-party packages can ship their own wizard providers by exposing a
wt_compiler.wizard_providers entry point (see
wizard README for packaging details).
Once the package is installed in the same environment as wt-compiler, it is
discovered automatically — no registration step required.
General use (pixi global):
pixi global add --environment wt-compiler my-wt-provider
Local development (uv):
uv pip install my-wt-provider
wt-compiler scaffold init will prompt you to choose a provider at startup, or you
can select one directly with --provider:
wt-compiler scaffold init --provider my-provider-name
Basic Compilation
from wt_compiler import compile_workflow, Spec
from rattler import MatchSpec
# Load a workflow specification
spec = Spec.parse_file("workflow/spec.yaml")
# Compile to artifacts
artifacts = compile_workflow(
    spec=spec,
    spec_relpath="workflow/spec.yaml",
)
# Write artifacts to disk
artifacts.dump(clobber=True)
Task Discovery
from wt_compiler.discovery import discover_tasks_from_requirements
from rattler import MatchSpec
# Discover tasks from requirements
requirements = [
    MatchSpec("my-task-library>=1.0.0"),
    MatchSpec("another-library>=2.0.0"),
]
tasks = discover_tasks_from_requirements(requirements)
# Returns: dict[task_name, dict[module_path, KnownTask]]
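The nested return shape (task name, then module path, then task) means one task name can map to several modules. The following sketch shows one way a caller might resolve a name and surface ambiguity. `KnownTaskSketch` and `resolve_task` are hypothetical stand-ins; the real KnownTask fields may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KnownTaskSketch:
    """Hypothetical stand-in for KnownTask; the real fields may differ."""

    importable_reference: str


def resolve_task(tasks: dict[str, dict[str, KnownTaskSketch]], name: str) -> KnownTaskSketch:
    """Return the single task registered under `name`, or raise if unclear."""
    candidates = tasks.get(name, {})
    if not candidates:
        raise KeyError(f"no task named {name!r} was discovered")
    if len(candidates) > 1:
        modules = ", ".join(sorted(candidates))
        raise ValueError(f"task {name!r} is ambiguous; found in: {modules}")
    return next(iter(candidates.values()))


# Same shape as the documented return value: dict[task_name, dict[module_path, task]].
discovered = {
    "extract_data": {"my_lib.tasks": KnownTaskSketch("my_lib.tasks.extract_data")},
    "transform_data": {
        "my_lib.tasks": KnownTaskSketch("my_lib.tasks.transform_data"),
        "other_lib.tasks": KnownTaskSketch("other_lib.tasks.transform_data"),
    },
}
task = resolve_task(discovered, "extract_data")
print(task.importable_reference)
```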
Workflow Specification Format
id: my-workflow
requirements:
  - name: my-task-library
    version: ">=1.0.0"
    channel: conda-forge
workflow:
  - id: task1
    task: extract_data
    partial:
      source: "s3://my-bucket/data.csv"
  - id: task2
    task: transform_data
    partial:
      input_data: "${{ workflow.task1.return }}"
    map:
      argnames: param
      argvalues: "${{ workflow.task1.return }}"
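The `${{ workflow.<id>.return }}` references in the spec above are what link tasks into a DAG. As a rough illustration, the sketch below extracts those references and derives an execution order with the standard library. The regular expression and helper names are assumptions for this example and are not taken from wt-compiler; the workflow section is written as a Python literal to avoid a YAML dependency.

```python
import re
from graphlib import TopologicalSorter

# Matches references such as "${{ workflow.task1.return }}".
REF_PATTERN = re.compile(r"\$\{\{\s*workflow\.(\w+)\.return\s*\}\}")

# The `workflow` section of the example spec, already loaded into Python objects.
workflow = [
    {"id": "task1", "task": "extract_data",
     "partial": {"source": "s3://my-bucket/data.csv"}},
    {"id": "task2", "task": "transform_data",
     "partial": {"input_data": "${{ workflow.task1.return }}"},
     "map": {"argnames": "param", "argvalues": "${{ workflow.task1.return }}"}},
]


def find_dependencies(value) -> set[str]:
    """Collect every task id referenced anywhere inside a nested value."""
    if isinstance(value, str):
        return set(REF_PATTERN.findall(value))
    if isinstance(value, dict):
        return set().union(*(find_dependencies(v) for v in value.values()))
    if isinstance(value, list):
        return set().union(*(find_dependencies(v) for v in value))
    return set()


# Map each task id to the ids it depends on, ignoring the id/task keys themselves.
graph = {
    step["id"]: find_dependencies({k: v for k, v in step.items() if k not in ("id", "task")})
    for step in workflow
}
order = list(TopologicalSorter(graph).static_order())
print(order)  # ['task1', 'task2']
```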
Architecture
Package Structure
wt-compiler/
├── src/wt_compiler/
│   ├── __init__.py        # Public exports
│   ├── spec.py            # Spec and TaskInstance models
│   ├── compiler.py        # DagCompiler class
│   ├── discovery.py       # Task discovery via rattler + CLI
│   ├── artifacts.py       # Artifact generation models
│   ├── jsonschema.py      # JSON schema utilities
│   ├── requirements.py    # Rattler channel/matchspec handling
│   ├── util.py            # Import validation utilities
│   ├── formatting.py      # Ruff formatting decorator
│   ├── _models.py         # Pydantic base classes
│   └── templates/         # Jinja2 templates
│       ├── pkg/
│       │   ├── dags/
│       │   │   ├── run_async.jinja2
│       │   │   ├── run_sequential.jinja2
│       │   │   └── jupytext.jinja2
│       │   ├── cli.jinja2
│       │   ├── dispatch.jinja2
│       │   └── ...
│       ├── tests/
│       ├── Dockerfile.jinja2
│       └── pixi.jinja2
└── tests/
    ├── test_spec.py
    ├── test_compiler.py
    ├── test_discovery.py
    └── ...
Dependencies
- wt-contracts (>=0.1.0): Shared type contracts (RegistryOutput, TaskProtocol, etc.)
- pydantic (>=2.0.0): Data validation and modeling
- jinja2: Template rendering
- ruamel.yaml: YAML parsing
- rattler (>=0.8.0): Conda environment management
- datamodel-code-generator: Generate Pydantic models from JSON schemas
- pydot: DAG visualization
Implementation Status
✅ Completed Components
- Package Structure - Full directory layout with src/ structure
- pyproject.toml - setuptools-scm configuration, dependencies, tool configs
- spec.py - Complete Spec, TaskInstance, and related models (~700 lines)
- discovery.py - Task discovery via rattler + wt-registry CLI
- artifacts.py - All artifact models (Dags, PixiToml, WorkflowArtifacts, etc.)
- requirements.py - Channel and MatchSpec handling
- jsonschema.py - JSON schema utilities with RJSF support
- util.py - Import reference validation
- formatting.py - Ruff formatting decorator
- _models.py - Pydantic base model classes
- templates/ - All Jinja2 templates copied from legacy codebase
- compiler.py - Core DagCompiler class structure
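As an illustration of what a Ruff formatting decorator such as the one in formatting.py might look like, here is a sketch that pipes a function's returned source through `ruff format` on stdin. It is not the package's implementation: the decorator name is hypothetical, and falling back to the unformatted source when ruff is not on PATH is an assumption made so the example runs anywhere.

```python
import functools
import shutil
import subprocess
from typing import Callable


def ruff_formatted(func: Callable[..., str]) -> Callable[..., str]:
    """Format the Python source string returned by `func` with `ruff format`."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs) -> str:
        source = func(*args, **kwargs)
        ruff = shutil.which("ruff")
        if ruff is None:
            # Assumption for this sketch: return the raw source when ruff
            # is not installed, rather than failing.
            return source
        proc = subprocess.run(
            [ruff, "format", "--stdin-filename", "generated.py", "-"],
            input=source,
            capture_output=True,
            text=True,
            check=True,
        )
        return proc.stdout

    return wrapper


@ruff_formatted
def render_assignment(name: str, value: int) -> str:
    """Toy renderer standing in for a Jinja2 template."""
    return f"{name}   =   {value}\n"


print(render_assignment("x", 1))
```

Wrapping the rendering functions this way keeps the templates free of formatting concerns.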
⚠️ Needs Expansion
The following areas are implemented as simplified stubs and need full implementation:
compiler.py TODOs
- get_params_jsonschema() - Currently returns empty schema
  - Needs: Extract schemas from discovered task metadata
  - Needs: Merge schemas for task groups
  - Needs: Apply omit_args logic
  - Needs: Generate proper UI schema
  - Needs: Apply RJSF overrides
- generate_params_model() - Stub implementation
  - Needs: Use datamodel-code-generator to create Pydantic model from JSON schema
  - Needs: Proper imports and type hints
- Graph visualization - Not implemented
  - Needs: Generate pydot graphs showing task dependencies
  - Needs: Export to PNG
- README generation - Not implemented
  - Needs: Generate README.md with fingerprint information
  - Needs: Include workflow diagram, parameter documentation
- Version management - Basic implementation only
  - Needs: Full VERSION.yaml bump logic
  - Needs: Lockfile carryover for updates
- get_per_taskinstance_params_notebook() - Empty stub
  - Needs: Generate parameter notebooks for Jupytext DAG
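To make the schema-merging and omit_args items above more concrete, here is one possible shape for that logic. It is a sketch under stated assumptions, not the planned implementation: `build_params_schema`, its arguments, and the choice to nest each task's schema under its task instance id are all hypothetical.

```python
import copy


def build_params_schema(task_schemas: dict[str, dict], omit_args: dict[str, set[str]]) -> dict:
    """Nest per-task JSON schemas under one object schema, dropping omitted args."""
    properties = {}
    for task_id, schema in task_schemas.items():
        trimmed = copy.deepcopy(schema)  # never mutate the discovered metadata
        for arg in omit_args.get(task_id, set()):
            trimmed.get("properties", {}).pop(arg, None)
            if arg in trimmed.get("required", []):
                trimmed["required"].remove(arg)
        properties[task_id] = trimmed
    return {
        "type": "object",
        "properties": properties,
        "required": list(properties),
        "additionalProperties": False,
    }


schemas = {
    "task1": {
        "type": "object",
        "properties": {"source": {"type": "string"}},
        "required": ["source"],
    },
    "task2": {
        "type": "object",
        "properties": {"input_data": {"type": "string"}, "threshold": {"type": "number"}},
        "required": ["input_data", "threshold"],
    },
}
# An argument wired from another task's return value is not a user parameter.
merged = build_params_schema(schemas, omit_args={"task2": {"input_data"}})
print(sorted(merged["properties"]["task2"]["properties"]))
```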
discovery.py TODOs
- rattler-py native API - Currently uses subprocess fallback
  - Needs: Update when rattler-py solve/install API is stable
  - Needs: Better error handling
- Schema validation - Basic validation only
  - Needs: Full wt-contracts schema validation
  - Needs: Better error messages for malformed CLI output
Testing
- Unit tests - Not yet written
  - Need tests for: spec parsing, validation, compilation
  - Need tests for: task discovery with mock environments
  - Need tests for: artifact generation
  - Need tests for: template rendering
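A test for task discovery with a mocked environment could follow the pattern sketched below, which patches subprocess.run so that no environment is created and no CLI is invoked. The `discover` function and the JSON payload shape are hypothetical placeholders for the code under test, not the API of discovery.py.

```python
import json
import subprocess
from unittest import mock


def discover(cmd: list[str]) -> list[str]:
    """Hypothetical function under test: run a CLI and return task names."""
    proc = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return [task["name"] for task in json.loads(proc.stdout)["tasks"]]


def test_discover_parses_cli_output() -> None:
    fake = subprocess.CompletedProcess(
        args=["wt-registry"],
        returncode=0,
        stdout=json.dumps({"tasks": [{"name": "extract_data"}]}),
        stderr="",
    )
    # Replace subprocess.run for the duration of the call under test.
    with mock.patch("subprocess.run", return_value=fake) as run:
        assert discover(["wt-registry"]) == ["extract_data"]
    run.assert_called_once()


# pytest would collect the test by name; calling it directly also works.
test_discover_parses_cli_output()
```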
Development
Setup
cd wt/wt-compiler
uv sync
Run Tests
uv run pytest
Type Checking
uv run mypy src/wt_compiler
Linting
uv run ruff check src/wt_compiler
uv run ruff format src/wt_compiler
Relationship to Other Packages
- wt-contracts: Direct dependency (provides type contracts)
- wt-registry: Called via subprocess (no Python dependency)
- wt-task: No dependency (generates code that uses it)
- wt-runner: No dependency (runner may depend on compiler in future)
- wt-invokers: No dependency
Migration from Legacy
This package replaces ecoscope_workflows_core.compiler. Key differences:
- No direct task imports - Uses CLI-based discovery instead
- wt-contracts integration - Type-safe schemas for all interfaces
- Modular dependencies - Among the wt packages, only depends on wt-contracts
- Simplified models - Spec models are now in spec.py instead of compiler.py
Future Work
- Complete all TODO areas in compiler.py
- Write comprehensive test suite
- Add CLI tool for standalone compilation
- Add workflow visualization tools
- Add workflow validation tools
- Performance optimization for large workflows
- Better error messages and debugging tools
Contributing
See main wt repository CONTRIBUTING.md for guidelines.
License
BSD-3-Clause