parallel-dag is a package that allows parallel execution of a set of functions written in Python
tawazi
Introduction
This library helps you execute a set of functions in a DAG dependency structure in parallel. It is intended for use in production environments; hence it satisfies (or will satisfy in the near future) the following:
- Stable, robust, well tested
- Lightweight
- Thread safety
- Low to no dependencies
- Support for legacy Python versions
- PyPy support
In the context of the DAG, these functions are called ExecNodes.
This library supports:
- Limiting the number of "Threads" to be used
- Priority choice of each ExecNode
- Per-ExecNode choice of parallelization (i.e. an ExecNode is allowed to run in parallel with another ExecNode or not)
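To make the first two features concrete, here is a minimal standard-library sketch of running a dependency graph on a bounded thread pool. It is not tawazi's implementation and does not use its API: `run_dag`, `funcs`, `deps` and `priority` are hypothetical names chosen for illustration, and `graphlib` requires Python 3.9 or newer. Priority here only decides the submission order among nodes that become ready at the same time.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait
from graphlib import TopologicalSorter


def run_dag(funcs, deps, max_workers=2, priority=None):
    """Run each function once all of its dependencies have finished.

    funcs:    node id -> callable
    deps:     node id -> list of node ids; their results are passed
              positionally to the callable, in the listed order
    priority: optional node id -> int; higher values are submitted first
    """
    priority = priority or {}
    sorter = TopologicalSorter({node: deps.get(node, []) for node in funcs})
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle

    results = {}
    running = {}  # future -> node id
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while sorter.is_active():
            # Submit every node whose dependencies are all done.
            ready = sorted(sorter.get_ready(), key=lambda n: -priority.get(n, 0))
            for node in ready:
                args = [results[d] for d in deps.get(node, [])]
                running[pool.submit(funcs[node], *args)] = node

            # Block until at least one running node finishes.
            done, _ = wait(list(running), return_when=FIRST_COMPLETED)
            for future in done:
                node = running.pop(future)
                results[node] = future.result()
                sorter.done(node)  # may unlock dependent nodes
    return results


def a():
    return "A"


def b():
    return "B"


def c(x, y):
    return f"{x} + {y} = C"


results = run_dag({"a": a, "b": b, "c": c}, {"c": ["a", "b"]}, max_workers=2)
print(results["c"])  # A + B = C
```

The thread limit comes entirely from `max_workers`: nodes that are ready but cannot get a worker simply wait in the pool's queue.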
Usage
from time import sleep
from tawazi import DAG, ExecNode

def a():
    print("Function 'a' is running", flush=True)
    sleep(1)
    return "A"

def b():
    print("Function 'b' is running", flush=True)
    sleep(1)
    return "B"

def c(a, b):
    print("Function 'c' is running", flush=True)
    print(f"Function 'c' received {a} from 'a' & {b} from 'b'", flush=True)
    return f"{a} + {b} = C"

if __name__ == "__main__":
    # Define dependencies
    # ExecNodes are defined using an id_: it has to be hashable (it can be the function itself)
    exec_nodes = [
        ExecNode(a, a, is_sequential=False),
        ExecNode(b, b, is_sequential=False),
        ExecNode(c, c, [a, b], is_sequential=False),
    ]
    g = DAG(exec_nodes, max_concurrency=2)
    g.build()
    g.execute()
    print(g.results_dict)
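In the example above, `a` and `b` do not depend on each other, so a concurrency limit of 2 leaves room for them to run at the same time, while a limit of 1 would force them to run one after the other. The following standard-library sketch shows a way to observe that difference without relying on timing. It does not use tawazi; `overlaps` and `task` are hypothetical helpers, and the 1-second barrier timeout is an arbitrary choice.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def overlaps(max_workers):
    """Return True if two independent tasks were running at the same time."""
    barrier = threading.Barrier(2)

    def task():
        try:
            # Passes only if the other task reaches the barrier as well,
            # which requires both tasks to be running concurrently.
            barrier.wait(timeout=1.0)
            return True
        except threading.BrokenBarrierError:
            return False

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(task) for _ in range(2)]
        return all(f.result() for f in futures)


print(overlaps(2))  # True
print(overlaps(1))  # False
```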
Reason for the name
The library's name is inspired by the Arabic word تَوَازٍ, which means parallel.
Future developments
This library is still in development. Breaking changes are expected.
A couple of features will be released soon:
- support multiprocessing
- simulation of the execution using a DAG stored ledger
- support more Python versions
- test the library on Windows machines
- Include python, mypy, black etc. in the README
- disallow parallel execution of some threads alongside certain other threads
  - maybe by making a group of threads that are CPU bound and a group of threads that are IO bound?
- remove the dependency on networkx?
- add line maximum columns
- decide whether to identify the ExecNode by a hashable ID or by its own Python ID. This is a breaking change and requires bumping the version to 0.2.1
- support multiple return values from a function? This is rather complicated: every returned value would have to be wrapped in an object, which would then be used to determine the dependencies
- change the name of the library to tawazi
- the goal of this library is to run the DAG nodes in parallel, and to run the same DAG in parallel in multiple threads or to run the same ops between different DAGs, with no side effects whatsoever
- run a subset of ExecNodes only
- clean the new DAG interface and document it
- document the dagster interface and correct the tests
- add documentation about the different cases where it is advantageous to use the library
  - in methods, not only in functions
  - in a gunicorn application
  - for getting information from multiple resources