TidyRun

A tool to orchestrate the compute and storage of Python DAGs

Features

Compute Orchestration

TidyRun provides first-class deferred compute primitives for DAG workflows:

  • Deferred Primitives: Model work with Job, ParametrizedJob, and nested DAG
  • Dependency-Aware Scheduling: Evaluate DAGs with topological execution and fail-fast behavior
  • Execution Modes: Choose subprocess (default), thread, or process
  • Parallel Evaluation: Run independent nodes with DAG.evaluate(max_workers=...)
  • Materialized Plans: Compile reproducible execution plans before running jobs
  • Resumable Runs: Re-run materialized plans with execute_materialized(skip_completed=True)
  • Pluggable Executors: Use local executors, SlurmExecutor, or AwsBatchExecutor
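
To make the scheduling bullets above concrete, here is a minimal standard-library sketch of dependency-aware, fail-fast, parallel evaluation. It is not TidyRun's implementation: `run_dag`, `jobs`, and `deps` are hypothetical names, and each job is assumed to be a callable that receives a dict of its upstream results.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait
from graphlib import TopologicalSorter


def run_dag(jobs, deps, max_workers=4):
    """Run jobs in dependency order and stop scheduling on the first failure.

    jobs: name -> callable taking a dict of upstream results (hypothetical shape)
    deps: name -> set of upstream job names
    """
    sorter = TopologicalSorter({name: deps.get(name, set()) for name in jobs})
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    results, pending = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while sorter.is_active():
            # Submit every node whose dependencies have all completed
            for name in sorter.get_ready():
                upstream = {d: results[d] for d in deps.get(name, ())}
                pending[pool.submit(jobs[name], upstream)] = name
            done, _ = wait(pending, return_when=FIRST_COMPLETED)
            for future in done:
                name = pending.pop(future)
                # result() re-raises the job's exception, so no downstream
                # node is ever submitted after a failure
                results[name] = future.result()
                sorter.done(name)
    return results


jobs = {
    "a": lambda up: 3,
    "b": lambda up: up["a"] ** 2,
    "c": lambda up: up["a"] + 1,
    "d": lambda up: up["b"] + up["c"],
}
deps = {"b": {"a"}, "c": {"a"}, "d": {"b", "c"}}
results = run_dag(jobs, deps)
print(results["d"])  # 13
```

In this sketch "b" and "c" can run concurrently once "a" finishes, while "d" waits for both. Jobs that were already submitted when a failure occurs are still allowed to finish; only new submissions stop.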

Serialization and Storage

TidyRun also includes a comprehensive serialization system for storing and retrieving Python objects:

  • Type-Aware Encoding: Automatically selects folder, Parquet, HDF5, JSON, or pickle based on value type
  • Lazy Evaluation: Directories deserialize into LazyDict objects that load values on demand
  • Recursive Concatenation: Aggregate DataFrames across nested structures with LazyDict.concat() (optionally parallel with max_workers)
  • Metadata Sidecars: Each output is tracked with .tidyrun metadata files for format versioning
  • Extensible Pipeline: Customize encoders or add support for custom types
  • Intelligent Fallback: Parquet → HDF5 → JSON → Pickle chain ensures robust serialization
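
The fallback idea in the last bullet can be sketched with the standard library alone. The example below is an illustration under stated assumptions, not TidyRun's encoder pipeline: it uses a shortened JSON → pickle chain, and `save_with_fallback`, `ENCODERS`, and the `.meta.json` sidecar name are all hypothetical.

```python
import json
import pickle
import tempfile
from pathlib import Path


def encode_json(value, path):
    # json.dumps raises TypeError (or ValueError) before anything is written
    text = json.dumps(value)
    path.with_name(path.name + ".json").write_text(text)


def encode_pickle(value, path):
    data = pickle.dumps(value)
    path.with_name(path.name + ".pkl").write_bytes(data)


# Ordered from most portable to most permissive (hypothetical chain)
ENCODERS = [("json", encode_json), ("pickle", encode_pickle)]


def save_with_fallback(value, path):
    """Try each encoder in order; return the name of the first that succeeds."""
    path = Path(path)
    for name, encode in ENCODERS:
        try:
            encode(value, path)
        except (TypeError, ValueError):
            continue  # this format cannot represent the value, try the next
        # Record the chosen format in a sidecar so a loader can dispatch on it
        sidecar = path.with_name(path.name + ".meta.json")
        sidecar.write_text(json.dumps({"format": name}))
        return name
    raise TypeError(f"no encoder accepted {type(value).__name__}")


with tempfile.TemporaryDirectory() as tmp:
    used_for_dict = save_with_fallback({"lr": 0.001}, Path(tmp) / "config")
    used_for_set = save_with_fallback({1, 2, 3}, Path(tmp) / "labels")

print(used_for_dict, used_for_set)  # json pickle
```

A set is not JSON-serializable, so the second call falls through to pickle. Ordering the chain from most portable to most permissive keeps pickle as a last resort.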

Quick Example:

Compute (DAG execution):

from tidyrun import DAG, Job


def square(x: int) -> int:
    return x * x


dag = DAG()
dag["a"] = Job(func=square, kwargs={"x": 3})

# Fast local execution without subprocess spawn overhead
outputs = dag.evaluate("./local_dag", execution_mode="thread", max_workers=4)
print(outputs["a"])  # 9

Serialization and lazy loading:

from tidyrun import serialize, deserialize
import pandas as pd

# Save nested data with smart format selection
serialize({
    "metrics": pd.DataFrame({"score": [9]}),
    "config": {"lr": 0.001},
}, "./results/exp_001")

# Load with lazy evaluation
results = deserialize("./results/exp_001")
df = results["metrics"]  # Loads on access

# Aggregate across nested structures
combined = results.concat(names=["run_id"])
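
For intuition about what "loads on access" means in the example above, here is a rough standard-library sketch of a lazily loaded directory mapping. It is not how LazyDict is implemented: `LazyJsonDir` and its `loads` counter are hypothetical, and it only handles `.json` files.

```python
import json
import tempfile
from collections.abc import Mapping
from pathlib import Path


class LazyJsonDir(Mapping):
    """Read-only mapping over a directory of .json files, read on first access."""

    def __init__(self, root):
        self._root = Path(root)
        self._cache = {}
        self.loads = 0  # number of files actually read, for illustration

    def __getitem__(self, key):
        if key not in self._cache:
            target = self._root / key
            if target.is_dir():
                # Nested directories become nested lazy mappings
                self._cache[key] = LazyJsonDir(target)
            else:
                file = self._root / f"{key}.json"
                if not file.is_file():
                    raise KeyError(key)
                self._cache[key] = json.loads(file.read_text())
                self.loads += 1
        return self._cache[key]

    def __iter__(self):
        # Listing keys only inspects file names; no file content is read
        for entry in sorted(self._root.iterdir()):
            if entry.is_dir():
                yield entry.name
            elif entry.suffix == ".json":
                yield entry.stem

    def __len__(self):
        return sum(1 for _ in self)


with tempfile.TemporaryDirectory() as tmp:
    run = Path(tmp) / "exp_001"
    run.mkdir()
    (run / "config.json").write_text(json.dumps({"lr": 0.001}))
    (run / "metrics.json").write_text(json.dumps({"score": 9}))

    lazy = LazyJsonDir(tmp)
    keys = list(lazy["exp_001"])          # ['config', 'metrics']
    loads_before = lazy["exp_001"].loads  # 0: nothing read yet
    score = lazy["exp_001"]["metrics"]["score"]
    loads_after = lazy["exp_001"].loads   # 1: only metrics.json was read
```

Caching each loaded value means repeated access costs one read, and untouched files are never opened.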

Learn More:

  • Quick Start — Local docs workflow and publishing notes
  • DAG Guide — Jobs, parametrized jobs, executors, and evaluation modes
  • Serialization Guide — Complete API reference, quick reference, and examples

Download files

Source Distribution

tidyrun-0.0.4.tar.gz (37.3 kB)

Uploaded Source

Built Distribution

tidyrun-0.0.4-py3-none-any.whl (47.1 kB)

Uploaded Python 3

File details

Details for the file tidyrun-0.0.4.tar.gz.

File metadata

  • Download URL: tidyrun-0.0.4.tar.gz
  • Upload date:
  • Size: 37.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for tidyrun-0.0.4.tar.gz
Algorithm Hash digest
SHA256 1fc9989422ae0d5d1ad51d2838171a9d17f010f87fe7dfecaf22eec1a149e974
MD5 f3b9441edd8eb55a2629264c25627620
BLAKE2b-256 bc2acce055066b5f7065fde2bc0246138f2da21fa13ea32d814a52050a59579e

Provenance

The following attestation bundles were made for tidyrun-0.0.4.tar.gz:

Publisher: publish-pypi.yml on mwouts/tidyrun

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file tidyrun-0.0.4-py3-none-any.whl.

File metadata

  • Download URL: tidyrun-0.0.4-py3-none-any.whl
  • Upload date:
  • Size: 47.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for tidyrun-0.0.4-py3-none-any.whl
Algorithm Hash digest
SHA256 3879bab2bb2df07d39efb3662e59d96a71bfb70259a53db0b20e64c447146545
MD5 1073f06134e164b6e3a12d2e751b6e3a
BLAKE2b-256 d24dbc5e4acab8d54f8305f1aed3c96bbfa8ff695c36dc6575bdfef6dadb8fa1

Provenance

The following attestation bundles were made for tidyrun-0.0.4-py3-none-any.whl:

Publisher: publish-pypi.yml on mwouts/tidyrun

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
