Easily run experiment permutations with multi-processing and caching.

Labtech makes it easy to define multi-step experiment pipelines and run them with maximal parallelism and result caching:

  • Defining tasks is simple; write a class with a single run() method and parameters as dataclass-style attributes.
  • Flexible experiment configuration; simply create task objects for all of your parameter permutations.
  • Handles pipelines of tasks; any task parameter that is itself a task will be executed first and make its result available to its dependent task(s).
  • Implicit parallelism; Labtech resolves task dependencies and runs tasks in sub-processes with as much parallelism as possible.
  • Implicit caching and loading of task results; configurable and extensible options for how and where task results are cached.
  • Integration with mlflow; automatically log task runs to mlflow with all of their parameters.
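
The dependency and caching behaviour described in the list above can be illustrated with a small standard-library sketch. This is not Labtech's implementation: `Square`, `Scaled`, and `run_with_cache` are hypothetical names, execution is sequential rather than parallel, and the cache is an in-memory dict rather than on-disk storage. It only shows the idea that a task nested as a parameter runs before its dependent, and that an identical task runs once.

```python
from dataclasses import dataclass, fields, is_dataclass


@dataclass(frozen=True)  # frozen dataclasses are hashable, so they can key a dict
class Square:
    base: int

    def run(self, results: dict) -> int:
        return self.base ** 2


@dataclass(frozen=True)
class Scaled:
    source: Square
    multiplier: int

    def run(self, results: dict) -> int:
        # The nested task's result is already in `results` when this runs.
        return self.multiplier * results[self.source]


run_log = []  # records every task that was actually executed


def run_with_cache(task, results: dict):
    """Run `task` after its nested tasks, reusing any result already computed."""
    if task in results:
        return results[task]
    for field in fields(task):
        value = getattr(task, field.name)
        if is_dataclass(value):  # a parameter that is itself a task
            run_with_cache(value, results)
    run_log.append(task)
    results[task] = task.run(results)
    return results[task]


results = {}
shared = Square(base=3)
outputs = [
    run_with_cache(Scaled(source=shared, multiplier=m), results)
    for m in range(4)
]
print(outputs)  # [0, 9, 18, 27]
print(sum(isinstance(task, Square) for task in run_log))  # 1
```

The shared `Square` task is executed once even though four tasks depend on it, which is the property the caching bullet describes.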

Installation

pip install labtech

Usage

from time import sleep

import labtech

# Decorate your task class with @labtech.task:
@labtech.task
class Experiment:
    # Each Experiment task instance will take `base` and `power` parameters:
    base: int
    power: int

    def run(self) -> int:
        # Define the task's run() method to return the result of the experiment:
        labtech.logger.info(f'Raising {self.base} to the power of {self.power}')
        sleep(1)
        return self.base ** self.power

def main():
    # Configure Experiment parameter permutations
    experiments = [
        Experiment(
            base=base,
            power=power,
        )
        for base in range(5)
        for power in range(5)
    ]

    # Configure a Lab to run the experiments:
    lab = labtech.Lab(
        # Specify a directory to cache results in (running the experiments a second
        # time will just load results from the cache!):
        storage='demo_lab',
        # Control the degree of parallelism:
        max_workers=5,
    )

    # Run the experiments!
    results = lab.run_tasks(experiments)
    print([results[experiment] for experiment in experiments])

if __name__ == '__main__':
    main()
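
The nested comprehension above enumerates every (base, power) pair. As a sketch of the same grid using only the standard library, `itertools.product` yields the 25 pairs in the same order. The names `param_grid` and `expected` are illustrative and not part of Labtech; the values are simply what `base ** power` evaluates to in Python, so they indicate what the experiment results should contain.

```python
from itertools import product

# Same 5 x 5 grid as the nested `for base ... for power ...` comprehension.
param_grid = [
    {'base': base, 'power': power}
    for base, power in product(range(5), range(5))
]

# Each dict could be unpacked into a task, e.g. Experiment(**params).
expected = [params['base'] ** params['power'] for params in param_grid]

print(len(param_grid))  # 25
print(expected[:5])     # base=0: [1, 0, 0, 0, 0] (0 ** 0 == 1 in Python)
print(expected[-5:])    # base=4: [1, 4, 16, 64, 256]
```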

Animated GIF of labtech demo on the command-line

Labtech can also produce graphical progress bars in Jupyter when the Lab is initialized with notebook=True:

Animated GIF of labtech demo in Jupyter

Task parameters can be any of the following types:

  • Simple scalar types: str, bool, float, int, None
  • Collections of any of these types: list, tuple, dict, Enum
  • Task types: a parameter that is itself a task is a "nested task" that is executed before its parent, so that the parent can make use of the nested task's result.
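
As a rough illustration of the list above, the following standard-library sketch checks whether a value is built only from the listed scalar and collection types. `is_simple_param` and `Colour` are hypothetical; Labtech's own validation may differ (for example, in how it treats dict keys), and task-typed parameters are left out for brevity.

```python
from enum import Enum

# Scalar types from the list above; type(None) is the type of None.
SCALAR_TYPES = (str, bool, float, int, type(None))


class Colour(Enum):
    RED = 'red'


def is_simple_param(value) -> bool:
    """Return True if `value` uses only the scalar and collection types listed."""
    if isinstance(value, Enum) or isinstance(value, SCALAR_TYPES):
        return True
    if isinstance(value, (list, tuple)):
        return all(is_simple_param(item) for item in value)
    if isinstance(value, dict):
        # Assumption: both keys and values must themselves be simple.
        return all(
            is_simple_param(key) and is_simple_param(item)
            for key, item in value.items()
        )
    return False


print(is_simple_param({'layers': [64, 32], 'colour': Colour.RED, 'seed': None}))  # True
print(is_simple_param({1, 2, 3}))  # False: set is not in the list above
```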

Here's an example of defining a single long-running task to produce a result for a large number of dependent tasks:

from time import sleep

import labtech

@labtech.task
class SlowTask:
    base: int

    def run(self) -> int:
        sleep(5)
        return self.base ** 2

@labtech.task
class DependentTask:
    slow_task: SlowTask
    multiplier: int

    def run(self) -> int:
        return self.multiplier * self.slow_task.result

def main():
    some_slow_task = SlowTask(base=42)
    dependent_tasks = [
        DependentTask(
            slow_task=some_slow_task,
            multiplier=multiplier,
        )
        for multiplier in range(10)
    ]

    lab = labtech.Lab(storage='demo_lab')
    results = lab.run_tasks(dependent_tasks)
    print([results[task] for task in dependent_tasks])

if __name__ == '__main__':
    main()
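
For reference, the values this example should print follow directly from the two run() methods: SlowTask(base=42) returns 42 ** 2, and each DependentTask multiplies that by its multiplier. The plain-Python check below reproduces that arithmetic without Labtech; `slow_result` and `expected` are illustrative names only.

```python
slow_result = 42 ** 2  # what SlowTask(base=42).run() returns
expected = [multiplier * slow_result for multiplier in range(10)]

print(slow_result)  # 1764
print(expected)
# [0, 1764, 3528, 5292, 7056, 8820, 10584, 12348, 14112, 15876]
```

Because all ten dependent tasks share one SlowTask object, the five-second sleep should only need to be paid once rather than ten times.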

Labtech can even generate a Mermaid diagram to visualise your tasks:

from labtech.diagram import display_task_diagram

some_slow_task = SlowTask(base=42)
dependent_tasks = [
    DependentTask(
        slow_task=some_slow_task,
        multiplier=multiplier,
    )
    for multiplier in range(10)
]

display_task_diagram(dependent_tasks)

classDiagram
    direction BT

    class DependentTask
    DependentTask : SlowTask slow_task
    DependentTask : int multiplier
    DependentTask : run() int

    class SlowTask
    SlowTask : int base
    SlowTask : run() int


    DependentTask <-- SlowTask: slow_task
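
To give a feel for how such a diagram can be derived from task definitions, here is a minimal standard-library sketch that builds similar Mermaid text from plain dataclasses. It is not how `display_task_diagram` works internally: `mermaid_class_diagram` is a hypothetical helper, the two classes are plain dataclasses rather than Labtech tasks, and the `run()` lines of the diagram above are omitted.

```python
from dataclasses import dataclass, fields, is_dataclass


@dataclass(frozen=True)
class SlowTask:
    base: int


@dataclass(frozen=True)
class DependentTask:
    slow_task: SlowTask
    multiplier: int


def mermaid_class_diagram(task_types) -> str:
    """Build Mermaid classDiagram text from dataclass field annotations.

    Assumes annotations are real classes (no string annotations).
    """
    lines = ['classDiagram', '    direction BT']
    edges = []
    for task_type in task_types:
        name = task_type.__name__
        lines.append(f'    class {name}')
        for field in fields(task_type):
            lines.append(f'    {name} : {field.type.__name__} {field.name}')
            if is_dataclass(field.type):
                # A dataclass-typed field is drawn as a dependency edge.
                edges.append(f'    {name} <-- {field.type.__name__}: {field.name}')
    return '\n'.join(lines + edges)


diagram = mermaid_class_diagram([DependentTask, SlowTask])
print(diagram)
```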

To learn more, dive into the following resources:

Mypy Plugin

For mypy type-checking of classes decorated with labtech.task, simply enable the labtech mypy plugin in your mypy.ini file:

[mypy]
plugins = labtech.mypy_plugin

Contributing

  • Install Poetry dependencies with make deps
  • Run linting, mypy, and tests with make check
  • Documentation:
    • Run local server: make docs-serve
    • Build docs: make docs-build
    • Deploy docs to GitHub Pages: make docs-github
    • Docstring style follows the Google style guide

TODO

  • Add unit tests

Download files

Source Distribution

labtech-0.5.1.tar.gz (35.4 kB)

Built Distribution

labtech-0.5.1-py3-none-any.whl (38.0 kB)
