taskgraph·PyPI

Parallel task graph framework

Project description

TaskGraph is a library that was developed to help manage complicated computational software pipelines consisting of long-running individual tasks. Many of these tasks could be executed in parallel, almost all of them wrote results to disk, and results from one part of the pipeline could often be reused. TaskGraph manages all of this for you. With it you can schedule tasks with dependencies, avoid recomputing results that have already been computed, and allot multiple CPU cores to execute tasks in parallel if desired.

TaskGraph Dependencies

TaskGraph is written in pure Python, but if the psutil package is installed, the multiprocessing worker processes will be niced (run at a lower scheduling priority).
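As a rough illustration of what "niced" means here, the sketch below lowers the priority of the current process when psutil is available and does nothing otherwise. It is not TaskGraph's own code: the function name `lower_process_priority` and the niceness value of 10 are assumptions chosen for the example.

```python
import os


def lower_process_priority(target_niceness=10):
    """Try to lower this process's scheduling priority.

    Returns True if the priority was set, False if psutil is missing
    or the operating system refused the change.
    """
    try:
        import psutil  # optional dependency, mirrors the "if installed" case
    except ImportError:
        return False

    process = psutil.Process(os.getpid())
    try:
        if os.name == 'nt':
            # Windows uses priority classes rather than niceness values.
            process.nice(psutil.BELOW_NORMAL_PRIORITY_CLASS)
        else:
            # Never request a lower niceness than the current one, since
            # unprivileged processes are generally not allowed to do that.
            process.nice(max(process.nice(), target_niceness))
    except psutil.Error:
        return False
    return True


print('priority lowered:', lower_process_priority())
```

Because the import is wrapped in a try/except, the same script runs whether or not psutil is installed, which matches the optional nature of the dependency.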

Example Use

Install TaskGraph with

pip install taskgraph

Then

import os
import pickle
import logging

import taskgraph

logging.basicConfig(level=logging.DEBUG)

def _create_list_on_disk(value, length, target_path):
    """Pickle a list of `length` copies of `value` to `target_path`."""
    target_list = [value] * length
    with open(target_path, 'wb') as target_file:
        pickle.dump(target_list, target_file)


def _sum_lists_from_disk(list_a_path, list_b_path, target_path):
    """Read two pickled lists, add them elementwise, and save the result."""
    with open(list_a_path, 'rb') as list_a_file:
        list_a = pickle.load(list_a_file)
    with open(list_b_path, 'rb') as list_b_file:
        list_b = pickle.load(list_b_file)
    target_list = [a + b for a, b in zip(list_a, list_b)]
    with open(target_path, 'wb') as target_file:
        pickle.dump(target_list, target_file)

# create a taskgraph that uses 4 multiprocessing subprocesses when possible
if __name__ == '__main__':
    workspace_dir = 'workspace'
    task_graph = taskgraph.TaskGraph(workspace_dir, 4)
    target_a_path = os.path.join(workspace_dir, 'a.dat')
    target_b_path = os.path.join(workspace_dir, 'b.dat')
    result_path = os.path.join(workspace_dir, 'result.dat')
    value_a = 5
    value_b = 10
    list_len = 10
    task_a = task_graph.add_task(
        func=_create_list_on_disk,
        args=(value_a, list_len, target_a_path),
        target_path_list=[target_a_path])
    task_b = task_graph.add_task(
        func=_create_list_on_disk,
        args=(value_b, list_len, target_b_path),
        target_path_list=[target_b_path])
    sum_task = task_graph.add_task(
        func=_sum_lists_from_disk,
        args=(target_a_path, target_b_path, result_path),
        target_path_list=[result_path],
        dependent_task_list=[task_a, task_b])

    task_graph.close()
    task_graph.join()

    # expect that result is a list `list_len` long with `value_a+value_b` in it
    with open(result_path, 'rb') as result_file:
        result = pickle.load(result_file)
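To make the expected result concrete, the following sketch runs the same three steps sequentially without TaskGraph, in the order the dependencies require: both lists are written first, then summed. The helper names and the `run_sequentially` function are hypothetical and only mirror the example above. A temporary directory stands in for the workspace.

```python
import os
import pickle
import tempfile


def create_list_on_disk(value, length, target_path):
    """Pickle a list of `length` copies of `value` to `target_path`."""
    with open(target_path, 'wb') as target_file:
        pickle.dump([value] * length, target_file)


def sum_lists_from_disk(list_a_path, list_b_path, target_path):
    """Read two pickled lists, add them elementwise, and save the result."""
    with open(list_a_path, 'rb') as file_a, open(list_b_path, 'rb') as file_b:
        list_a = pickle.load(file_a)
        list_b = pickle.load(file_b)
    with open(target_path, 'wb') as target_file:
        pickle.dump([a + b for a, b in zip(list_a, list_b)], target_file)


def run_sequentially(value_a=5, value_b=10, list_len=10):
    """Run the two producer steps, then the dependent sum step."""
    with tempfile.TemporaryDirectory() as workspace_dir:
        a_path = os.path.join(workspace_dir, 'a.dat')
        b_path = os.path.join(workspace_dir, 'b.dat')
        result_path = os.path.join(workspace_dir, 'result.dat')
        # The two list-creation steps are independent of each other ...
        create_list_on_disk(value_a, list_len, a_path)
        create_list_on_disk(value_b, list_len, b_path)
        # ... but the sum step needs both of their outputs to exist.
        sum_lists_from_disk(a_path, b_path, result_path)
        with open(result_path, 'rb') as result_file:
            return pickle.load(result_file)


result = run_sequentially()
print(result)  # a list of ten 15s, i.e. [15] * 10
```

The two list-creation steps have no dependency on each other, which is what allows a scheduler to run them in parallel before the sum step.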

Caveats

Taskgraph’s default method of checking whether a file has changed (hash_algorithm='sizetimestamp') uses the filesystem’s modification timestamp, interpreted in integer nanoseconds. This check is only as accurate as the filesystem’s timestamp. For example:
- FAT and FAT32 have a 2 second timestamp resolution
- exFAT has a 10 millisecond timestamp resolution
- NTFS has a 100 nanosecond timestamp resolution
- HFS+ has a 1 second timestamp resolution
- APFS has a 1 nanosecond timestamp resolution
- ext3 has a 1 second timestamp resolution
- ext4 has a 1 nanosecond timestamp resolution
If you suspect timestamp resolution to be an issue on your filesystem, you may wish to store your files on a filesystem with more accurate timestamps or else consider using a different hash_algorithm.
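The sketch below illustrates the caveat using only the standard library. It is not TaskGraph's implementation: the helpers `size_timestamp_signature` and `content_signature` are hypothetical stand-ins for a size-plus-timestamp check and a content hash. The file is rewritten with different bytes of the same length, and its modification time is then restored with `os.utime` to simulate a write that falls within one tick of a coarse filesystem timestamp.

```python
import hashlib
import os
import tempfile


def size_timestamp_signature(path):
    """Return (size in bytes, modification time in integer nanoseconds)."""
    stats = os.stat(path)
    return (stats.st_size, stats.st_mtime_ns)


def content_signature(path, algorithm='sha256', block_size=2**20):
    """Return a hex digest of the file's contents, read in blocks."""
    hasher = hashlib.new(algorithm)
    with open(path, 'rb') as data_file:
        for block in iter(lambda: data_file.read(block_size), b''):
            hasher.update(block)
    return hasher.hexdigest()


def compare_signatures():
    """Return (cheap_unchanged, hash_changed) after a same-size rewrite."""
    with tempfile.TemporaryDirectory() as workspace_dir:
        path = os.path.join(workspace_dir, 'data.bin')
        with open(path, 'wb') as data_file:
            data_file.write(b'aaaa')
        original_stats = os.stat(path)
        cheap_before = size_timestamp_signature(path)
        hash_before = content_signature(path)

        # Overwrite with different bytes of the same length.
        with open(path, 'wb') as data_file:
            data_file.write(b'bbbb')
        # Simulate a coarse timestamp by restoring the old times.
        os.utime(path, ns=(
            original_stats.st_atime_ns, original_stats.st_mtime_ns))

        cheap_unchanged = size_timestamp_signature(path) == cheap_before
        hash_changed = content_signature(path) != hash_before
        return cheap_unchanged, hash_changed


cheap_unchanged, hash_changed = compare_signatures()
print('size/timestamp check sees no change:', cheap_unchanged)
print('content hash sees a change:', hash_changed)
```

A content hash costs a full read of the file, so it trades speed for robustness against timestamp resolution.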

Running Tests

TaskGraph includes a tox configuration for automating builds across multiple Python versions, both with and without psutil installed. To execute all tests in all configurations, run:

$ tox

Alternatively, if you’re only trying to run tests on a single configuration (say, Python 3.7 without psutil), you’d run:

$ tox -e py37

Or if you’d like to run the tests for the combination of Python 3.7 with psutil, you’d run:

$ tox -e py37-psutil

If you don’t have multiple Python installations already available on your system, an easy way to get them is to use tox-conda (https://github.com/tox-dev/tox-conda), which uses conda environments to manage the available Python versions:

$ pip install tox-conda
$ tox

Release history

- 0.11.2 (this version): May 22, 2025
- 0.11.1: Oct 27, 2023
- 0.11.0: Oct 13, 2021
- 0.10.3: Jan 29, 2021
- 0.10.2: Dec 11, 2020
- 0.10.1: Dec 11, 2020
- 0.10.0: Aug 25, 2020
- 0.9.1: Jun 4, 2020
- 0.9.0: Mar 6, 2020
- 0.8.5: Sep 11, 2019
- 0.8.3: Feb 28, 2019
- 0.8.2: Jan 31, 2019
- 0.8.1: Jan 10, 2019
- 0.7.2: Nov 21, 2018
- 0.7.1: Nov 19, 2018
- 0.7.0: Oct 22, 2018
- 0.6.1: Aug 14, 2018
- 0.6.0: Jul 24, 2018
- 0.5.2: Jul 17, 2018
- 0.5.1: Jun 20, 2018
- 0.5.0: May 4, 2018
- 0.4.0: Apr 19, 2018
- 0.3.0: Nov 29, 2017
- 0.2.6: Nov 9, 2017
- 0.2.5: Oct 11, 2017
- 0.2.4: Sep 19, 2017
- 0.2.3: Sep 19, 2017
- 0.2.2: Aug 16, 2017
- 0.2.0: Aug 3, 2017
- 0.1.2: Jul 31, 2017
- 0.1.1: Jul 31, 2017
- 0.1.0: Jul 30, 2017

Download files

Source Distribution

taskgraph-0.11.2.tar.gz (42.7 kB)

Uploaded May 22, 2025 Source

Built Distribution

taskgraph-0.11.2-py3-none-any.whl (23.1 kB)

Uploaded May 22, 2025 Python 3

File details

Details for the file taskgraph-0.11.2.tar.gz.

File metadata

Download URL: taskgraph-0.11.2.tar.gz
Upload date: May 22, 2025
Size: 42.7 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.12.10

File hashes

Hashes for taskgraph-0.11.2.tar.gz
Algorithm	Hash digest
SHA256	`1df18759418150e74c54dfd7c23d1272974e88e7f96257de3ec65156ef1de532`
MD5	`fff5986e9b055bf69352b3102eeaa6be`
BLAKE2b-256	`9f6124e3bd2f340f8270599d93ecf9dcd12061c2e561e1019c98b7aeaa144369`

File details

Details for the file taskgraph-0.11.2-py3-none-any.whl.

File metadata

Download URL: taskgraph-0.11.2-py3-none-any.whl
Upload date: May 22, 2025
Size: 23.1 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.12.10

File hashes

Hashes for taskgraph-0.11.2-py3-none-any.whl
Algorithm	Hash digest
SHA256	`dc0bdabb4489c779bba26a8e1b1420bd9b5f47eba78ffbc352bb0866000ad5cf`
MD5	`687460c3e0bc0d9dc0a703b3986d2835`
BLAKE2b-256	`e72093d337c1f82bcafe888d472e9b12651cfbb1c51338318d1d0fdcad54189f`
