ETL Subtask management library written in Rust
subtask-manager
subtask-manager is a Rust-powered Python package for discovering, classifying, loading, and rendering ETL subtasks from a filesystem structure.
It is designed for ETL projects where task metadata is encoded in folder names (entity, stage, system) and task content lives in files (.sql, .py, .sh, etc.).
Features
- Fast core implementation in Rust (PyO3 extension module)
- Python-friendly API
- Recursive file scanning by supported extensions
- Automatic classification of tasks from folder structure
- Lazy loading of task contents
- Rich filtering (stage, entity, system_type, task_type, is_common)
- Parameter extraction and rendering with multiple placeholder styles
- Immutable parameter application (returns new objects)
Installation
From PyPI
pip install subtask-manager
From source (local dev)
# Build extension and install in editable/dev mode
maturin develop
Or build wheels:
maturin build --release
Supported task types (by extension)
- SQL: sql, psql, tsql, plpgsql
- Shell: sh
- PowerShell: ps1
- Python: py
- GraphQL: graphql, gql
- JSON: json, jsonl
- YAML: yaml, yml
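The extension list above can be read as a lookup table. The following is a minimal sketch of that idea in plain Python, not the package's Rust implementation: the function name `guess_task_type`, the case-insensitive matching, and the fallback to "Other" for unknown extensions are all assumptions made for illustration.

```python
from pathlib import Path

# Extension-to-type table copied from the list above.
# The type names mirror the TaskType members; the lookup logic is illustrative only.
TASK_TYPE_BY_EXTENSION = {
    "sql": "Sql", "psql": "Sql", "tsql": "Sql", "plpgsql": "Sql",
    "sh": "Shell",
    "ps1": "Powershell",
    "py": "Python",
    "graphql": "Graphql", "gql": "Graphql",
    "json": "Json", "jsonl": "Json",
    "yaml": "Yaml", "yml": "Yaml",
}


def guess_task_type(path):
    """Return the task type name for a path, or "Other" if the extension is unknown.

    Hypothetical helper: lowercasing the extension is an assumption.
    """
    extension = Path(path).suffix.lstrip(".").lower()
    return TASK_TYPE_BY_EXTENSION.get(extension, "Other")


print(guess_task_type("customers/01_extract/pg/extract_data.sql"))
print(guess_task_type("notes.txt"))
```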
Folder conventions
Classification is based on the file path relative to a base directory.
Expected relative folder depth: up to 3 components before the file.
Typical pattern:
<base>/<entity>/<stage>/<system>/<task_file>
Examples:
- customers/01_extract/pg/extract_data.sql
- orders/02_transform/duck/normalize.py
Common tasks
A file directly under <base> is treated as a common task:
<base>/shared.yaml
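To make the folder convention concrete, here is a rough sketch of how a path relative to the base directory could be split into its components. It is written in plain Python and is not the package's classifier: the function name `classify`, the dictionary layout, padding shallower paths with `None`, and rejecting paths deeper than three folders are assumptions.

```python
from pathlib import PurePosixPath


def classify(relative_path):
    """Split a path relative to <base> into entity, stage, system, and file name.

    Hypothetical helper for illustration only.
    """
    *folders, file_name = PurePosixPath(relative_path).parts
    if len(folders) > 3:
        # Assumption: the README only says "up to 3 components before the file".
        raise ValueError(f"more than 3 folders before the file: {relative_path}")
    # Pad missing levels with None so the unpacking always yields three values.
    entity, stage, system = (folders + [None] * 3)[:3]
    return {
        "name": file_name,
        "entity": entity,
        "stage": stage,
        "system": system,
        # A file directly under <base> has no folders and counts as a common task.
        "is_common": not folders,
    }


print(classify("customers/01_extract/pg/extract_data.sql"))
print(classify("shared.yaml"))
```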
Enums and aliases
EtlStage
- Setup
- Extract
- Transform
- Load
- Cleanup
- Postprocessing
- Other
Recognized aliases include names like:
01_extract, extract, e, 01, etc.
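Alias resolution of this kind can be pictured as a set lookup per stage. The sketch below is an assumption-heavy illustration rather than the package's actual table: only the Extract aliases come from this README, the Transform row is extended by analogy (02_transform appears in the folder examples), and the name `resolve_stage`, the case-insensitive match, and the fallback to "Other" are hypothetical.

```python
# Only the Extract aliases are taken from the README.
# The Transform row is a guess by analogy and may not match the library.
STAGE_ALIASES = {
    "Extract": {"01_extract", "extract", "e", "01"},
    "Transform": {"02_transform", "transform", "t", "02"},  # hypothetical
}


def resolve_stage(folder_name):
    """Map a stage folder name to a stage name, falling back to "Other"."""
    key = folder_name.strip().lower()
    for stage, aliases in STAGE_ALIASES.items():
        if key in aliases:
            return stage
    return "Other"


print(resolve_stage("01_extract"))
print(resolve_stage("misc"))
```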
SystemType
Includes:
PostgreSQL, Duckdb, Clickhouse, MySQL, OracleDB, SQLite, SqlServer, Vertica, Other
Example aliases:
pg, postgres, duck, duckdb, etc.
TaskType
Sql, Shell, Powershell, Python, Graphql, Json, Yaml, Other
Quick usage
from pathlib import Path
from subtask_manager import SubtaskManager, EtlStage, SystemType, ParamType
base = Path("tests/test_data/subtasks")
sm = SubtaskManager(base)
print(sm.base_path)
print(sm.num_files)
print(sm.file_paths[:3])
# Lazy-loaded subtasks
tasks = sm.subtasks
print(len(tasks))
# Get a single task
task = sm.get_task("extract_data.sql")
print(task.name, task.entity, task.stage, task.system_type)
# Filter tasks
extract_pg = sm.get_tasks(
    etl_stage=EtlStage.Extract,
    system_type=SystemType.PostgreSQL,
    include_common=False,
)
print(len(extract_pg))
# Inspect parameter names
params = task.get_params()
print(params)
# Apply parameters immutably
rendered = task.apply_parameters(
    {"date": "2025-01-01", "env": "prod"},
    styles=[ParamType.Curly, ParamType.DollarBrace],
    ignore_missing=True,
)
print(rendered.get_command())
Parameter styles
Supported placeholder styles:
- Curly: {name}
- Dollar: $name
- DollarBrace: ${name}
- DoubleCurly: {{name}}
- DoubleUnderscore: __name__
- Percent: %name%
- Angle: <name>
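One way to picture the placeholder styles is as one regular expression per style. The sketch below uses only Python's `re` module and is not how the Rust core is implemented: the patterns, the allowed characters in a name, and the helpers `find_params` and `render_text` are assumptions, loosely modelled on the `styles` and `ignore_missing` arguments shown in the usage example.

```python
import re

NAME = r"([A-Za-z_][A-Za-z0-9_]*)"

# One pattern per style; group 1 captures the parameter name.
# These patterns are illustrative guesses, not the library's matching rules.
STYLE_PATTERNS = {
    # Lookarounds keep {name} from matching inside ${name} or {{name}}.
    "Curly": re.compile(r"(?<![{$])\{" + NAME + r"\}(?!\})"),
    "Dollar": re.compile(r"\$" + NAME),
    "DollarBrace": re.compile(r"\$\{" + NAME + r"\}"),
    "DoubleCurly": re.compile(r"\{\{" + NAME + r"\}\}"),
    "DoubleUnderscore": re.compile(r"__([A-Za-z][A-Za-z0-9]*)__"),
    "Percent": re.compile(r"%" + NAME + r"%"),
    "Angle": re.compile(r"<" + NAME + r">"),
}


def find_params(text, styles=None):
    """Collect parameter names found in text for the given styles (all by default)."""
    styles = styles or list(STYLE_PATTERNS)
    return {m.group(1) for s in styles for m in STYLE_PATTERNS[s].finditer(text)}


def render_text(text, params, styles=None, ignore_missing=False):
    """Substitute params into text; unknown names raise unless ignore_missing is set."""
    def substitute(match):
        name = match.group(1)
        if name in params:
            return str(params[name])
        if ignore_missing:
            return match.group(0)  # leave the placeholder untouched
        raise KeyError(name)

    for style in styles or list(STYLE_PATTERNS):
        text = STYLE_PATTERNS[style].sub(substitute, text)
    return text


sql = "SELECT * FROM t WHERE d = '{date}' AND env = '${env}'"
print(find_params(sql, ["Curly", "DollarBrace"]))
print(render_text(sql, {"date": "2025-01-01", "env": "prod"}, ["Curly", "DollarBrace"]))
```

Because styles are applied one after another, a substituted value that itself looks like a placeholder could be replaced again by a later style; restricting `styles` to the ones a task actually uses avoids that, and also avoids false matches such as `%abc%` inside a SQL LIKE pattern.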
Useful methods:
- subtask.get_params(styles=None) -> set[str]
- subtask.apply_parameters(params, styles=None, ignore_missing=False) -> Subtask
- subtask.render_with_params(params, styles=None, ignore_missing=False) -> RenderedSubtask
- subtask.render() -> Subtask
- subtask.render_lightweight() -> RenderedSubtask
- subtask.get_stored_params() -> dict[str, str]
- subtask.get_command() -> str | None
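The "returns new objects" behaviour of `apply_parameters` can be illustrated with a frozen dataclass. This is a conceptual sketch only: the class `Task` below is hypothetical and stands in for the real `Subtask`, it handles just the Curly style, and how the library stores parameters internally is not known from this README.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Task:
    """Hypothetical stand-in for Subtask, used only to show the immutable pattern."""
    name: str
    command: str
    stored_params: tuple = ()

    def apply_parameters(self, params):
        """Return a new Task with {name} placeholders filled in; self is unchanged."""
        rendered = self.command
        for key, value in params.items():
            rendered = rendered.replace("{" + key + "}", str(value))
        # dataclasses.replace builds a fresh instance instead of mutating this one.
        return replace(
            self,
            command=rendered,
            stored_params=tuple(sorted(params.items())),
        )


original = Task("extract_data.sql", "SELECT * FROM t WHERE d = '{date}'")
applied = original.apply_parameters({"date": "2025-01-01"})
print(original.command)
print(applied.command)
```

Keeping the original untouched means the same task template can be rendered repeatedly with different parameter sets, for example once per run date.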
Public classes
- SubtaskManager
- Subtask
- RenderedSubtask
- FileScanner
- FileClassifier
- EtlStage
- SystemType
- TaskType
- ParamType
Development
Prerequisites
- Rust toolchain
- Python 3.12+
- uv (recommended) or pip
- maturin
Install dev dependencies
uv sync --dev
Run tests
cargo test
uv run -m pytest
or:
make test
Lint/format (Python)
uv run ruff check .
uv run ruff format .
Build and release
Cross-platform wheel publishing is automated with GitHub Actions.
See the full runbook:
It documents:
- TestPyPI dry runs
- PyPI production release flow
- Trusted Publishing setup
- version/tag conventions
Versioning notes
Keep versions aligned between:
- Cargo.toml ([package].version)
- pyproject.toml ([project].version)
Use Makefile version helpers (if present) to bump consistently.
License
MIT