pytask-parallel
Parallelize the execution of tasks with pytask-parallel, a plugin for pytask.
Installation
pytask-parallel is available on PyPI and Anaconda.org. Install it with:
$ pip install pytask-parallel
# or
$ conda install -c conda-forge pytask-parallel
By default, the plugin uses loky's robust implementation of the ProcessPoolExecutor.
It is also possible to select the ProcessPoolExecutor or ThreadPoolExecutor from the
concurrent.futures module as backends to execute tasks asynchronously.
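To illustrate why these backends can be swapped, here is a minimal sketch that uses only the standard library. Both executors implement the same concurrent.futures.Executor interface, so the calling code stays the same regardless of the backend. The names BACKENDS, square, and run_squares are made up for this example and are not part of pytask-parallel; loky is left out to keep the sketch free of third-party dependencies.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor

# Hypothetical mapping from a backend name to an executor class.
# This is an illustration, not the plugin's internal registry.
BACKENDS = {
    "processes": ProcessPoolExecutor,
    "threads": ThreadPoolExecutor,
}


def square(x):
    """A tiny stand-in for a task. It must be importable to work with processes."""
    return x * x


def run_squares(backend, values, n_workers=2):
    """Square all values with the chosen backend and keep the input order."""
    executor_class = BACKENDS[backend]
    with executor_class(max_workers=n_workers) as executor:
        futures = [executor.submit(square, value) for value in values]
        return [future.result() for future in futures]


if __name__ == "__main__":
    # The guard matters for process pools on platforms that spawn new interpreters.
    print(run_squares("threads", [1, 2, 3]))
    print(run_squares("processes", [1, 2, 3]))
```

Because only the executor class changes, switching between processes and threads does not require touching the code that submits the work.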
Usage
To parallelize your tasks across many workers, pass an integer greater than 1 or
'auto' to the -n (or --n-workers) option of the command-line interface.
$ pytask -n 2
$ pytask --n-workers 2
# Starts os.cpu_count() - 1 workers.
$ pytask -n auto
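The rule for 'auto' can be written down in a few lines. The sketch below is an assumption about how such a value could be resolved, not the plugin's actual code: resolve_n_workers is a hypothetical helper, and the fallback to one worker when os.cpu_count() returns None or 1 is a choice made for this example.

```python
import os


def resolve_n_workers(value):
    """Translate an -n style value into a worker count (hypothetical helper)."""
    if value == "auto":
        # os.cpu_count() may return None, so fall back to a single CPU.
        n_cpus = os.cpu_count() or 1
        # Leave one CPU free, but never go below one worker (assumption).
        return max(n_cpus - 1, 1)
    n_workers = int(value)
    if n_workers < 1:
        raise ValueError("The number of workers must be at least 1.")
    return n_workers


if __name__ == "__main__":
    print(resolve_n_workers("2"))
    print(resolve_n_workers("auto"))
```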
Using processes to parallelize the execution of tasks is useful for CPU-bound tasks such as numerical computations. (Here is an explanation of what CPU-bound and IO-bound mean.)
For IO-bound tasks, where the limiting factor is network responses or file access, you can parallelize via threads.
$ pytask --parallel-backend threads
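The following sketch shows why threads help with IO-bound work. It is independent of pytask: fake_download is a hypothetical stand-in that only sleeps, which mimics waiting for a network response or a disk, and run_with_threads is a helper invented for this example.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_download(name, delay=0.2):
    """Stand-in for an IO-bound task. Sleeping releases the GIL like real IO waits."""
    time.sleep(delay)
    return name


def run_with_threads(names, n_workers):
    """Run all fake downloads in a thread pool and report the elapsed time."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=n_workers) as executor:
        results = list(executor.map(fake_download, names))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    names = ["a", "b", "c", "d"]
    _, one_worker = run_with_threads(names, n_workers=1)
    _, four_workers = run_with_threads(names, n_workers=4)
    print(f"1 worker: {one_worker:.2f}s, 4 workers: {four_workers:.2f}s")
```

With four workers the waits overlap, so the total time should be close to a single delay instead of the sum of all delays. CPU-bound work would not benefit in the same way, because threads in standard CPython do not execute Python bytecode in parallel.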
You can also set the options in a pyproject.toml.
# This is the default configuration. Note that parallelization is turned off.
[tool.pytask.ini_options]
n_workers = 1
parallel_backend = "loky" # or processes or threads
Some implementation details
Parallelization and Debugging
It is not possible to combine parallelization with debugging. That is why --pdb or
--trace deactivate parallelization.
If you parallelize the execution of your tasks using two or more workers, do not use
breakpoint() or import pdb; pdb.set_trace() since both will cause exceptions.
Threads and warnings
Capturing warnings is not thread-safe. Therefore, warnings cannot be captured reliably
when tasks are parallelized with --parallel-backend threads.
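A small sketch can make the problem visible. On standard CPython builds, warnings.catch_warnings changes state that is shared by the whole process, so a capture opened in one thread also records warnings raised in another thread. The function noisy_worker is a hypothetical stand-in for a task; the example does not use pytask.

```python
import threading
import warnings


def noisy_worker():
    """Stand-in for a task that emits a warning from a worker thread."""
    warnings.warn("emitted by the worker thread", UserWarning)


# The main thread opens a capture. It does not emit any warning itself.
with warnings.catch_warnings(record=True) as captured:
    warnings.simplefilter("always")
    worker = threading.Thread(target=noisy_worker)
    worker.start()
    worker.join()

# The warning of the other thread shows up in the main thread's capture.
messages = [str(w.message) for w in captured]
print(messages)
```

When several threads open and close such captures at the same time, they overwrite each other's state, which is why warnings cannot be attributed reliably to the task that raised them.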
Changes
Consult the release notes to find out what is new.
Development
- pytask-parallel does not call the pytask_execute_task_protocol hook specification/entry-point because pytask_execute_task_setup and pytask_execute_task need to be separated from pytask_execute_task_teardown. Thus, plugins which change this hook specification may not interact well with the parallelization.
- There are two PRs for CPython which try to re-enable setting custom reducers, which should have been working but do not. Here are the references.