"A Make-like tool with a syntax similar to Drake."
Faz is a data workflow tool heavily inspired by Drake (https://github.com/Factual/drake).
The intended use is combining data treatment scripts in bash, python, ruby (or anything else, with a little coding) into a single text file.
The name “faz” is Portuguese for “do” or “make”.
The various steps can be separated into tasks, with defined inputs and outputs. Dependencies between the tasks are determined from inputs and outputs of every task. The program executes all tasks in the appropriate order, checking for the existence of output and input files.
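As a rough illustration of how an execution order can be derived from inputs and outputs, the sketch below builds a dependency graph and sorts it topologically. It is not faz's actual code: the task names, the `tasks` table and the `execution_order` helper are hypothetical, and it relies on the standard library `graphlib` module (Python 3.9+).

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical task table: task name -> (outputs, inputs).
# This is an illustration only, not faz's internal data model.
tasks = {
    "make_file2": (["file2"], ["file1"]),
    "make_file1": (["file1"], []),
}


def execution_order(tasks):
    """Return task names ordered so that producers run before consumers."""
    # Map every output file to the task that produces it.
    producer = {
        out: name for name, (outputs, _) in tasks.items() for out in outputs
    }
    # A task depends on each task that produces one of its inputs.
    # Inputs that no task produces are treated as pre-existing files.
    graph = {
        name: {producer[i] for i in inputs if i in producer}
        for name, (_, inputs) in tasks.items()
    }
    return list(TopologicalSorter(graph).static_order())


order = execution_order(tasks)
print(order)  # ['make_file1', 'make_file2']
```

Even though "make_file2" is listed first, it is scheduled after "make_file1" because it consumes a file that the other task produces.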
- Because I like Drake but can’t stand the startup time of java.
- Because I can (actually to see if I can, but it turns out I can).
- simple but robust functionality
- easy to use and extend (the code, minus the tests, is around 300 lines of python)
- fast startup time (compared to Drake)
- Documentation: https://faz.readthedocs.org.
pip install faz
From the command line, just type
Without arguments, the program will read the tasks from a file called “fazfile”. If you want to use another filename, just give that as an argument to the program
To get a list of command line arguments type
Task file basics
The task file is a plain text file, with a syntax similar to Drake input files. The following is an example with two tasks
    # file1 <-
    touch file1

    # file2 <- file1
    cat file1 > file2
Lines starting with “#” and having the symbols “<-” signal a task. On the left of the “<-” is a (comma separated) list of the files produced by the task. On the right are the task dependencies, the files needed to run that task. In the above example the first task has no dependencies, and produces a file called “file1”. The second task has “file1” as a dependency, and has as output a file called “file2”.
The outputs and inputs of each task are used by the program to establish the order in which the tasks have to be run, and whether they need to be run at all. In the example above, if a file called “file1” were already present in the directory where the program was run, the first task would not be executed.
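The skip rule described above can be sketched as a simple existence check. This is a minimal illustration under the assumption that only file existence matters; whether faz also looks at timestamps or other criteria is not stated here, and the `should_run` helper is a hypothetical name, not part of faz.

```python
import os
import tempfile


def should_run(outputs, directory="."):
    """Return True when at least one declared output file is missing."""
    return not all(
        os.path.isfile(os.path.join(directory, name)) for name in outputs
    )


# Demonstration in a scratch directory, so no real files are touched.
with tempfile.TemporaryDirectory() as workdir:
    before = should_run(["file1"], workdir)  # "file1" does not exist yet
    open(os.path.join(workdir, "file1"), "w").close()  # same effect as `touch file1`
    after = should_run(["file1"], workdir)  # "file1" is now present

print(before, after)  # True False
```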
The code sections are all the lines in between two task lines. In these two tasks, they are just plain bash commands, but they could be, for example, Python code:
    # file1 <-
    touch file1

    # file2 <- file1 :python
    f1 = open("file1")
    text = f1.read()
    f2 = open("file2", "w")
    f2.write(text)
Note that, in the second task, there’s an extra option “:python”, which tells the program that the code in this task is Python code. Options are a (comma-separated) list of keywords following the “:”, and must be placed after the inputs.
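To make the task-line layout concrete, here is a small parsing sketch that splits a line into outputs, inputs and options. It is an assumption-laden illustration, not faz's parser: the regular expression and the `parse_task_line` and `split_names` helpers are hypothetical, and the sketch assumes that file names contain neither “<” nor “:”.

```python
import re

# "# outputs <- inputs :options", where inputs and options may be absent.
TASK_LINE = re.compile(
    r"^#\s*(?P<outputs>[^<]*?)\s*<-\s*(?P<inputs>[^:]*?)\s*(?::(?P<options>.*))?$"
)


def split_names(text):
    """Split a comma-separated list, dropping blanks and surrounding spaces."""
    return [name.strip() for name in (text or "").split(",") if name.strip()]


def parse_task_line(line):
    """Return the parts of a task line, or None if the line is not a task line."""
    match = TASK_LINE.match(line.strip())
    if match is None:
        return None
    return {
        "outputs": split_names(match.group("outputs")),
        "inputs": split_names(match.group("inputs")),
        "options": split_names(match.group("options")),
    }


print(parse_task_line("# file1 <-"))
print(parse_task_line("# file2 <- file1 :python"))
```

Lines without the “<-” symbol, such as ordinary code or comments, return `None` and would be treated as part of a code section.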
- First release.
- Bug fixes.
- Project name change.
- NetworkX dependency removed.
- Input and output names added to task environment.
- Bug fixes in variable expansion code.
- Added a mechanism to include other task files.
- Dependencies and outputs can now be in different directories.