A job queue with data dependencies
parallex
System Requirements
Python >= 3.8
install
pip install tx-parallex
Install from source
- Clone the repo
- Easy install instructions:
# Create a virtual environment called 'px'
conda create -n px python=3.8
# start-up the environment you just created
conda activate px
# install the rest of the tx-parallex pre-requirements
pip install -r requirements.txt
- Test
# run the tests, a number of test 'specs'
PYTHONPATH=src pytest -x -vv --full-trace -s --timeout 10
# deactivate the environment (if desired)
conda deactivate
set log level
Set the environment variable LOG_LEVEL to one of the level names accepted by setLevel in Python's logging library, such as DEBUG, INFO, WARNING, ERROR, or CRITICAL.
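As a rough illustration of what a level name means here, the sketch below resolves a LOG_LEVEL value to a numeric logging level using only the standard library. The helper resolve_level and its WARNING fallback are assumptions made for this example; how tx-parallex itself reads the variable is not shown in this document.

```python
import logging


def resolve_level(environ, default="WARNING"):
    """Map a LOG_LEVEL entry (e.g. "DEBUG") to a numeric logging level.

    Hypothetical helper for illustration; not part of tx-parallex.
    """
    name = environ.get("LOG_LEVEL", default).upper()
    level = logging.getLevelName(name)  # an int for known level names
    if not isinstance(level, int):
        raise ValueError(f"unknown log level: {name}")
    return level


# A plain dict stands in for os.environ so the example has no side effects.
level = resolve_level({"LOG_LEVEL": "debug"})
logging.getLogger("example").setLevel(level)
```

In a shell, the equivalent is to export LOG_LEVEL before starting the Python process.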
Introduction
A queue with dependencies
Usage
from tx.parallex import run_python
ret = run_python(number_of_workers = 4, pyf = "spec.py", dataf = "data.yml")
Spec
tx-parallex specs can be written in YAML or a Python-like DSL. The Python-like DSL is translated to YAML by tx-parallex. Each object in a spec specifies a task. When the task is executed, it is given a dict called data. The pipeline will return a dictionary.
YAML
The examples below assume a function sqr, defined in a module named math, that returns the square of its argument. (Python's standard math module does not provide sqr.)
def sqr(x):
    return x * x
let
The let task sets data for its subtask. It adds a new var-value pair to data within the scope of its subtask, then executes that subtask.
Syntax:
type: let
var: <var>
obj: <value>
sub: <subtask>
Example:
type: let
var: a
obj:
  data: 1
sub:
  type: python
  name: y
  mod: math
  func: sqr
  params:
    x:
      name: a
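The scoping rule can be pictured with a small stand-alone sketch. It is not tx-parallex code: resolve, run_let, and run_sub are hypothetical helpers, the spec is written as a Python dict instead of YAML, and the constant is 3 instead of 1 so the squaring is visible.

```python
def sqr(x):
    return x * x


def resolve(value, data):
    """Resolve a <value>: {"data": constant} or {"name": variable}."""
    if "data" in value:
        return value["data"]
    return data[value["name"]]


def run_let(task, data, run_sub):
    """Run the subtask with `var` bound only inside its scope."""
    scoped = dict(data)  # copy, so the caller's data is left untouched
    scoped[task["var"]] = resolve(task["obj"], data)
    return run_sub(task["sub"], scoped)


def run_sub(task, data):
    # Stand-in for a python task: always calls sqr with the resolved params.
    args = {p: resolve(v, data) for p, v in task["params"].items()}
    return {task["name"]: sqr(**args)}


spec = {
    "type": "let", "var": "a", "obj": {"data": 3},
    "sub": {"type": "python", "name": "y", "mod": "math", "func": "sqr",
            "params": {"x": {"name": "a"}}},
}
outer = {}
result = run_let(spec, outer, run_sub)
```

After the call, outer is still empty: a exists only in the copy handed to the subtask.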
map
The map task reads a list coll from data and applies a subtask to each member of the list. Each member is assigned to var in the data passed to that subtask.
Syntax:
type: map
coll: <value>
var: <variable name>
sub: <subtask>
<value> is an object of the form:
Reference an entry in data or the name of a task
"name": <variable name>
Constant
"data": <constant>
Example:
type: map
coll:
  data:
  - 1
  - 2
  - 3
var: a
sub:
  type: python
  name: y
  mod: math
  func: sqr
  params:
    x:
      name: a
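A minimal sketch of the same idea in plain Python, assuming a thread pool as a stand-in for the worker pool. The function run_map and its workers parameter are hypothetical names for this example and say nothing about how tx-parallex schedules work internally.

```python
from concurrent.futures import ThreadPoolExecutor


def sqr(x):
    return x * x


def run_map(coll, var, sub, data, workers=4):
    """Apply `sub` to each member of `coll`, binding the member to `var`."""
    def run_one(member):
        scoped = dict(data)   # each subtask gets its own copy of data
        scoped[var] = member
        return sub(scoped)

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(run_one, coll))  # results keep input order


results = run_map([1, 2, 3], "a", lambda d: {"y": sqr(d["a"])}, data={})
```

Because every member gets its own copy of data, the subtasks do not interfere with each other and can run concurrently.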
cond
The cond task reads a boolean value. If it is true, it executes the then task; otherwise, it executes the else task.
Syntax:
type: cond
on: <value>
then: <subtask>
else: <subtask>
Example:
type: cond
on:
  data:
    true
then:
  type: ret
  var: x
  obj:
    data: 1
else:
  type: ret
  var: x
  obj:
    data: 0
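The branching can be sketched as follows. The helpers resolve, run_ret, and run_cond are hypothetical, the spec is a Python dict mirroring the YAML above, and the condition is read from a variable named flag (an assumption for this example) so both branches can be exercised.

```python
def resolve(value, data):
    """Resolve a <value>: {"data": constant} or {"name": variable}."""
    return value["data"] if "data" in value else data[value["name"]]


def run_ret(task, data):
    return {task["var"]: resolve(task["obj"], data)}


def run_cond(task, data, run_sub):
    """Run exactly one branch, chosen by the truth value of `on`."""
    branch = "then" if resolve(task["on"], data) else "else"
    return run_sub(task[branch], data)


spec = {
    "type": "cond",
    "on": {"name": "flag"},
    "then": {"type": "ret", "var": "x", "obj": {"data": 1}},
    "else": {"type": "ret", "var": "x", "obj": {"data": 0}},
}
when_true = run_cond(spec, {"flag": True}, run_ret)
when_false = run_cond(spec, {"flag": False}, run_ret)
```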
python
You can use any Python module.
The python task runs a Python function. It reads parameters from data. The return value must be pickleable.
Syntax:
type: python
name: <name>
mod: <module>
func: <function>
params: <parameters>
<parameters> is an object of the form:
<param> : <value>
...
<param> : <value>
where <param> is either a parameter name or a position.
Example:
type: python
name: y
mod: math
func: sqr
params:
  x:
    data: 1
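To make the mod, func, and params fields concrete, here is a sketch that resolves them with the standard library. It assumes that positional parameters are written as integer keys, which this document does not confirm, and it uses the built-in int function instead of sqr so that it runs without a custom module. The name call_python_task is hypothetical.

```python
import importlib
import pickle


def resolve(value, data):
    """Resolve a <value>: {"data": constant} or {"name": variable}."""
    return value["data"] if "data" in value else data[value["name"]]


def call_python_task(task, data):
    """Import `mod`, look up `func`, and call it with the resolved params."""
    func = getattr(importlib.import_module(task["mod"]), task["func"])
    positional, named = {}, {}
    for param, value in task["params"].items():
        # Assumption: an int key is a position, a str key is a name.
        target = positional if isinstance(param, int) else named
        target[param] = resolve(value, data)
    args = [positional[i] for i in sorted(positional)]
    result = func(*args, **named)
    pickle.dumps(result)  # raises if the return value cannot be pickled
    return {task["name"]: result}


task = {
    "type": "python", "name": "y", "mod": "builtins", "func": "int",
    "params": {0: {"name": "digits"}, "base": {"data": 16}},
}
out = call_python_task(task, {"digits": "ff"})
```

The pickle.dumps call is only a check: it shows the kind of failure to expect when a function returns something that cannot be pickled, such as an open file handle.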
top
The top task topologically sorts its subtasks based on their dependencies and ensures that they are executed in parallel, in an order compatible with those dependencies.
Syntax:
type: top
sub: <subtasks>
It reads the name properties of subtasks that are not in data.
Example:
type: top
sub:
- type: python
  name: y
  mod: math
  func: sqr
  params:
    x:
      data: 1
- type: python
  name: z
  mod: math
  func: sqr
  params:
    x:
      name: y
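The ordering can be sketched with a simple wave-based topological sort. This is an illustration under assumptions, not the library's scheduler: run_top, dependencies, and the funcs lookup table are hypothetical, and the tasks in each wave are run one after another here even though they are independent and could run in parallel.

```python
def sqr(x):
    return x * x


def dependencies(task, task_names):
    """Names in params that refer to other subtasks rather than to data."""
    return {v["name"] for v in task["params"].values()
            if "name" in v and v["name"] in task_names}


def run_top(subtasks, data, funcs):
    pending = {t["name"]: t for t in subtasks}
    names = set(pending)
    results = dict(data)
    waves = []
    while pending:
        # A task is ready once all subtasks it depends on have results.
        ready = [n for n, t in pending.items()
                 if dependencies(t, names) <= set(results)]
        if not ready:
            raise ValueError("cyclic dependency among subtasks")
        for n in ready:
            t = pending.pop(n)
            args = {p: v["data"] if "data" in v else results[v["name"]]
                    for p, v in t["params"].items()}
            results[n] = funcs[t["func"]](**args)
        waves.append(ready)
    return results, waves


# z is listed first on purpose: the listing order does not matter.
subtasks = [
    {"type": "python", "name": "z", "mod": "math", "func": "sqr",
     "params": {"x": {"name": "y"}}},
    {"type": "python", "name": "y", "mod": "math", "func": "sqr",
     "params": {"x": {"data": 3}}},
]
results, waves = run_top(subtasks, {}, {"sqr": sqr})
```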
ret
ret specifies a name that maps to a value. The pipeline returns a dictionary containing these names. When a ret task appears under a map task, each name is prefixed with the index of the element in that collection, as follows:
<index>.<name>
For nested maps, the indices are chained together as follows:
<index>. ... .<index>.<name>
Syntax:
type: ret
var: <var>
obj: <value>
Example:
type: ret
var: x
obj:
  name: z
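The naming scheme for ret under map can be illustrated with a short sketch that flattens nested results into dotted keys. It assumes zero-based indices, which this document does not state, and the function flatten is a hypothetical helper, not part of tx-parallex.

```python
def flatten(results, prefix=()):
    """Build '<index>. ... .<index>.<name>' keys from nested map results."""
    out = {}
    for index, item in enumerate(results):  # assumption: indices start at 0
        path = prefix + (str(index),)
        if isinstance(item, list):          # results of a nested map
            out.update(flatten(item, path))
        else:                               # ret names for one element
            for name, value in item.items():
                out[".".join(path + (name,))] = value
    return out


single = flatten([{"b": 1}, {"b": 4}, {"b": 9}])
nested = flatten([[{"c": 10}, {"c": 20}], [{"c": 30}]])
```

Under these assumptions, single has the keys 0.b, 1.b, and 2.b, and nested has the keys 0.0.c, 0.1.c, and 1.0.c.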
Python
A dsl block contains a subset of Python.
- There is a semantic difference from Python: an assignment in a block is not visible outside of the block.
- Assignments within a block are unordered.
- return statement
Available syntax:
import
from <module> import *
from <module> import <func>, ..., <func>
These statements import names from a module.
<module> must be an absolute module name.
assignment
<var> = <const>
where
<const> = <integer> | <number> | <boolean> | <string> | <list> | <dict>
This translates to let.
Example:
a = 1
y = sqr(x=a)
return {
    "b": y
}
function application
<var> = [<module>.]<func>(<param>=<expr>, ...) | <expr>
This translates to python.
Here <var> is a name, and <expr> is
<expr> = <expr> if <expr> else <expr> | <expr> <binop> <expr> | <expr> <boolop> <expr> | <expr> <compare> <expr> | <unaryop> <expr> | <var> | <const>
<binop>, <boolop>, <compare>, and <unaryop> are Python's BinOp, BoolOp, Compare, and UnaryOp. <expr> is translated to a set of assignments, a name, or data, depending on its content.
Example:
y = math.sqr(1)
z = math.sqr(y)
return {
    "c": z
}
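BinOp, BoolOp, Compare, and UnaryOp are node classes from Python's standard ast module. The sketch below shows which node class each form of <expr> parses to. It only inspects the syntax tree; how tx-parallex walks these nodes is not covered, and the helper describe is a name chosen for this example.

```python
import ast


def describe(source):
    """Return the ast node class name for a single expression."""
    node = ast.parse(source, mode="eval").body
    return type(node).__name__


kinds = {src: describe(src) for src in [
    "a + b",          # <expr> <binop> <expr>
    "a and b",        # <expr> <boolop> <expr>
    "a < b",          # <expr> <compare> <expr>
    "-a",             # <unaryop> <expr>
    "a if c else b",  # conditional expression
    "a",              # <var>
    "1",              # <const>
    "math.sqr(y)",    # function application
]}
```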
parallel for
for <var> in <expr>:
    ...
This translates to map.
Example:
for a in [1, 2, 3]:
    y = math.sqr(a)
    return {
        "b": y
    }
if
if <expr>:
    ...
else:
    ...
This translates to cond.
Example:
if z:
    return {
        "x": 1
    }
else:
    return {
        "x": 0
    }
The semantics of if differ from Python: variables assigned inside an if are not visible outside it.
return
return <dict>
This translates to ret. Each key of the dict becomes the var of a ret task.
Example:
y = math.sqr(1)
return {
    "b": y
}
Data
Data can be arbitrary YAML.