Skip to main content

No project description provided

Project description

dml-util

PyPI - Version PyPI - Python Version


Overview

dml-util provides utilities and adapters for DaggerML, enabling seamless function wrapping, artifact storage, and execution in local and cloud environments. It is designed to work with the daggerml ecosystem.

Front-End vs Back-End

  • Front-End (User Interface):
    • The main entrypoints for most users are the funkify decorator, S3Store, and the included adapters/runners (e.g. conda, hatch, ssh, batch -- all via funkify).
    • These let you wrap Python functions as DAG nodes, store artifacts, and run code in a variety of environments with minimal setup.
    • Beginners should start by using these included adapters and runners, as shown in the examples.
  • Back-End (Advanced/Extensible):
    • The back-end consists of the adapters and runners themselves, which handle execution, state, and integration with cloud/local resources.
    • Advanced users can write their own adapters or runners by following the patterns in src/dml_util/adapters/ and src/dml_util/runners/.
    • See the source and docstrings for guidance.

Installation

pip install dml-util

Usage

Wrapping Functions as DAG Nodes

from daggerml import Dml
from dml_util import funkify

@funkify
def add_numbers(dag):
    """Add numbers together.

    Parameters
    ----------
    dag : DmlDag
        The DAG context provided by DaggerML.

    Returns
    -------
    int
        The sum of the input numbers.
    """
    dag.result = sum(dag.argv[1:].value())
    return dag.result

dml = Dml()
with dml.new("simple_addition") as dag:
    dag.add_fn = add_numbers
    dag.sum = dag.add_fn(1, 2, 3)
    print(dag.sum.value())  # Output: 6

S3 Storage

from dml_util import S3Store
s3 = S3Store()
uri = s3.put(b"my data", name="foo.txt").uri
print(uri)  # s3://<bucket>/<prefix>/data/foo.txt

Advanced: Docker, Batch, and ECR

from dml_util import dkr_build, funkify, S3Store

@funkify
def fn(dag):
    *args, denom = dag.argv[1:].value()
    return sum(args) / denom

s3 = S3Store()
tar = s3.tar(dml, ".")
img = dkr_build(tar, ["--platform", "linux/amd64", "-f", "Dockerfile"])
batch_fn = funkify(fn, data={"image": img.value()}, adapter=dag.batch.value())
result = batch_fn(1, 2, 3, 4)

Experimental Api

from dml_util.experimental import api

Example: A mostly complete working example from the tests

The only modification is that we added a api.load call for epistemic purposes.

from contextlib import redirect_stderr, redirect_stdout

from dml_util import S3Store
from tests.util import _root_

def get_tarball(dml):
    s3 = S3Store()
    excludes = [
        "tests/*.py",
        ".pytest_cache",
        ".ruff_cache",
        "**/__about__.py",
        "__pycache__",
        "examples",
        ".venv",
        "**/.venv",
    ]
    with redirect_stdout(None), redirect_stderr(None):
        return s3.tar(dml, str(_root_), excludes=excludes)

class A0(api.Dag):
    dkr_build: Any = funk.dkr_build
    dkr_flags: Any = api.field(default=docker_flags)
    tarball: Node = api.field(default_function=lambda dag: get_tarball(dag.dml))
    not_used0: Any = api.load("path:to:dag::OtherDag")  # loads <...OtherDag>.result
    not_used1: Any = api.load("path:to:dag::OtherDag", "a")  # loads <...OtherDag>.a

    @api.funk(prepop={"x": 1})
    def fn(self, *args):
        """Adds up *args and adds `x` to the result.

        Parameters
        ----------
        *args : ScalarNode
            The values to sum -- should be ints or floats under the hood.

        Other Parameters
        ----------------
        x : ScalarNode, optional
            A value to add to the sum of `*args` (default is 1).
            Keyword-only.

        Returns
        -------
        float
            The sum of the input values plus `x` (defaults to 1).
        """
        return sum([x.value() for x in args]) + self.x.value()

    def dockerify(self, tarball):
        img = self.dkr_build(
            tarball,
            ["--platform", "linux/amd64", "-f", "tests/assets/dkr-context/Dockerfile"],
            timeout=60_000,
            name="img",
        )
        fn = self.wrap(
            self.fn,
            {
                "uri": "docker",
                "data": {"image": img, "flags": self.dkr_flags},
                "adapter": "dml-util-local-adapter",
            },
        )
        return fn

    def run(self, *vals):
        fn = self.dockerify(self.tarball, name="dockerified-fn")
        baz = fn(*vals, name="baz")
        return baz

with A0() as dag:
    vals = [1, 2, 3, 4, 5]
    resp = dag.run(*vals)
    assert resp.value() == sum(vals) + 1

Then later, you want to look at the results (dig in and pull node values). You can load this dag (once completed) via:

from daggerml import Dml

dml = Dml()
dag = dml.load("path:to:dag:file::A0")
assert dag.result.load() == sum(vals) + 1

Example: Basic DAG with fields and steps

Notes:

  1. Fields are coerced to nodes, so you must use .value() to get their values.
  2. Steps can call other steps and the resulting dependency graph is computed automatically (so caching still makes sense).
class MyDag(api.Dag):
    a: int = api.field(default=1)
    b: int = api.field(default=2)

    def step0(self):
        return self.a.value() + self.b.value()

    def step1(self):
        return self.step0(name="s0") * 10

# Instantiate and use the DAG
with MyDag() as dag:
    v = dag.step1(name="my-v")
    print(v.value())  # Output: 30
    dag.commit(v)

from daggerml import Dml
dml = Dml()
dag = dml.load("path:to:dag:file::MyDag")
try:
    dag.result.load().a.value()
except Exception as e:
    print("`a` is not a node in `step1`")
dag.result.load().s0.load().a.value()  # Output: 1

Example: Using default_function for dynamic fields

from dml_util.experimental import api

class MyDag2(api.Dag):
    a: int = api.field(default=5)
    b: int = api.field(default_function=lambda dag: dag.a.value() * 2)

    def compute(self):
        return self.a.value() + self.b.value()

with MyDag2() as dag:
    dag.commit(dag.compute())

Example: Loading dag values

from dml_util.experimental import api

class MyDag3(api.Dag):
    a: int = api.field(default=1)
    b: int = api.load("path:to:dag:file::MyDag2", "b")

    def add(self):
        return self.a.value() + self.b.value()

Example: api.funk

This is the experimental equivalent of dml_util.funkify.

from dml_util.experimental import api

class MyDag4(api.Dag):
    @api.funk(uri=api.load("batch"), adapter="batch")
    def multiply(self, x, y):
        return x.value() * y.value()

Documentation

License

dml-util is distributed under the terms of the MIT license.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

dml_util-0.0.21.post4.tar.gz (82.1 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

dml_util-0.0.21.post4-py3-none-any.whl (52.1 kB view details)

Uploaded Python 3

File details

Details for the file dml_util-0.0.21.post4.tar.gz.

File metadata

  • Download URL: dml_util-0.0.21.post4.tar.gz
  • Upload date:
  • Size: 82.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: python-httpx/0.28.1

File hashes

Hashes for dml_util-0.0.21.post4.tar.gz
Algorithm Hash digest
SHA256 d2a7e715610a0bfce01262525e70fb605dde26dfef91b0593cbeb9ba80b0a57d
MD5 f53925074c39a017db6b6496cc1d7daa
BLAKE2b-256 6af6ea6628b6ff16d62ba3ccb9ba7fe364a918f73960a12a35ce222d4a7d06b2

See more details on using hashes here.

File details

Details for the file dml_util-0.0.21.post4-py3-none-any.whl.

File metadata

File hashes

Hashes for dml_util-0.0.21.post4-py3-none-any.whl
Algorithm Hash digest
SHA256 809046c4128bf5131fa62e4c13959a6b146e545211d36ee0d2b3b8038dda6c80
MD5 53f92178d3ccd6ee1e1ccfba4ea3acdb
BLAKE2b-256 75be51b1ffd949bf9dc1961b5dd63807532ebb59f923d4a65a828dab3384ae85

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page