Skip to main content

A simple, opinionated task manager

Project description

modak

GitHub Release GitHub last commit GitHub License PyPI - Version

modak is a simple-to-use, opinionated task queue system with dependency management, resource allocation, and isolation control. Tasks are run respecting topological dependencies, resource limits, and optional isolation.

This library only has two classes, Tasks, which are an abstract class with a single method to override, run(self) -> None, and a TaskQueue which manages the execution order. Additionally, modak comes with a task monitor TUI which can be invoked with the modak shell command.

By default, modak scripts will create a state file and logs in a directory called .modak in the current working directory. This can be changed by setting the MODAK_PATH environment variable. The modak CLI also supports an optional -c/--cwd <PATH> argument to set the working directory containing the .modak file.

Features

  • Topological task scheduling
  • Persistent state and log files
  • Resource-aware execution
  • Isolated task handling
  • Skipping of previously completed tasks based on input/output hashes

Installation

pip install modak

Or with uv:

pip install modak

Examples

A simple chain of tasks

from modak import Task, TaskQueue

class PrintTask(Task):
    def run(self):
        self.log_info(f"Running {self.name}")

t1 = PrintTask(name="task1")
t2 = PrintTask(name="task2", inputs=[t1])
t3 = PrintTask(name="task3", inputs=[t2])

queue = TaskQueue()
queue.run([t3])

Fan-in, fan-out

from pathlib import Path
from modak import Task, TaskQueue

class DummyTask(Task):
    def run(self):
        self.log_info(f"Running {self.name}")
        for output in self.outputs:
            output.write_text(f"Output of {self.name}")

# Leaf tasks
a = DummyTask(name="A", outputs=[Path("a.out")])
b = DummyTask(name="B", outputs=[Path("b.out")])
c = DummyTask(name="C", outputs=[Path("c.out")])

# Fan-in: D depends on A, B, C
d = DummyTask(name="D", inputs=[a, b, c], outputs=[Path("d.out")])

# Fan-out: E and F both depend on D
e = DummyTask(name="E", inputs=[d], outputs=[Path("e.out")])
f = DummyTask(name="F", inputs=[d], outputs=[Path("f.out")])

queue = TaskQueue()
queue.run([e, f])

A complex workflow

from pathlib import Path
from modak import Task, TaskQueue

class SimTask(Task):
    def run(self):
        self.log_info(f"{self.name} starting with {self.requirements}")
        for out in self.outputs:
            out.write_text(f"Generated by {self.name}")

# Raw data preprocessing
pre_a = SimTask(name="PreA", outputs=[Path("a.pre")], requirements={"cpu": 1})
pre_b = SimTask(name="PreB", outputs=[Path("b.pre")], requirements={"cpu": 1})
pre_c = SimTask(name="PreC", outputs=[Path("c.pre")], requirements={"cpu": 1})

# Feature extraction (can run in parallel)
feat1 = SimTask(name="Feature1", inputs=[pre_a], outputs=[Path("a.feat")], requirements={"cpu": 2})
feat2 = SimTask(name="Feature2", inputs=[pre_b], outputs=[Path("b.feat")], requirements={"cpu": 2})
feat3 = SimTask(name="Feature3", inputs=[pre_c], outputs=[Path("c.feat")], requirements={"cpu": 2})

# Aggregation step
aggregate = SimTask(
    name="Aggregate",
    inputs=[feat1, feat2, feat3],
    outputs=[Path("agg.out")],
    requirements={"cpu": 3}
)

# Final model training (expensive, must be isolated)
train = SimTask(
    name="TrainModel",
    inputs=[aggregate],
    outputs=[Path("model.bin")],
    isolated=True,
    requirements={"cpu": 3, "gpu": 1}
)

# Side analysis and visualization can run independently
viz = SimTask(name="Visualization", inputs=[feat1, feat2], outputs=[Path("viz.png")], requirements={"cpu": 1})
stats = SimTask(name="Stats", inputs=[feat3], outputs=[Path("stats.txt")], requirements={"cpu": 1})

queue = TaskQueue(
    workers=4,
    resources={"cpu": 4, "gpu": 1}
)

queue.run([train, viz, stats])

Future Plans

I'll probably make small improvements to the TUI and add features as I find the need. Contributions are welcome, just open an issue or pull request on GitHub and I'll try to respond as soon as I can.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

modak-0.1.0.tar.gz (18.1 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

modak-0.1.0-py3-none-any.whl (17.7 kB view details)

Uploaded Python 3

File details

Details for the file modak-0.1.0.tar.gz.

File metadata

  • Download URL: modak-0.1.0.tar.gz
  • Upload date:
  • Size: 18.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for modak-0.1.0.tar.gz
Algorithm Hash digest
SHA256 78b7477dbe7f6c1fdd245f7d3efe6d63b2956c90d3b06c787e979f742f63dadf
MD5 46dd443a3bfdd7b522b38a8d31067bcf
BLAKE2b-256 c4cc3263fc951827d8c4b71e405a133db515e62f3cbf1f533a0725e66337c6f1

See more details on using hashes here.

Provenance

The following attestation bundles were made for modak-0.1.0.tar.gz:

Publisher: release.yml on denehoffman/modak

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file modak-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: modak-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 17.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for modak-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 28d4ee8fe048d97893aaf102727427b8047e5e10deb74c7743e3095ef8ae77c9
MD5 50c3ffb6074fe2753ec8ae1d62eae1e2
BLAKE2b-256 51703f532975ff6fed3fad607d9e35fd048184da215d624842a090e63b81c431

See more details on using hashes here.

Provenance

The following attestation bundles were made for modak-0.1.0-py3-none-any.whl:

Publisher: release.yml on denehoffman/modak

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page