Data workflow tool (a rough copy of drake for Python)
Project description
Snake is a tool for managing programming workflow dependencies. It's an attempt at a port of Factual's drake (https://github.com/Factual/drake) to Python.
-To get started with snake:
1) pip install python-snake
2) Create a file named Snakefile in the directory of the data workflow.
3) Run snake.py in the data workflow directory to execute the Snakefile.
-Creating a basic Snakefile
The Snakefile contains the information about the data dependencies. It contains a list of dependency rules and the bash commands they entail.
Example rule:
"out.txt" <- "in.txt"
echo "test"; cat "in.txt" > "out.txt"
That rule encodes the fact that "out.txt" depends on "in.txt". To generate "out.txt" from "in.txt", snake will run the bash command 'echo "test"; cat "in.txt" > "out.txt"'.
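To make the structure of a rule concrete, here is a minimal Python sketch that splits the example rule into its outputs, inputs, and command. The `Rule` class and `parse_rule` helper are hypothetical names made up for illustration. They are not part of snake and say nothing about how snake parses a Snakefile internally.

```python
import re
from dataclasses import dataclass


@dataclass
class Rule:
    outputs: list  # file names on the left of "<-"
    inputs: list   # file names on the right of "<-"
    command: str   # the bash command that follows the rule header


def parse_rule(header, command):
    """Hypothetical helper (not snake's API): split a rule header at "<-"."""
    left, _, right = header.partition("<-")
    quoted = r'"([^"]*)"'  # matches one double-quoted file name
    return Rule(re.findall(quoted, left), re.findall(quoted, right), command.strip())


rule = parse_rule('"out.txt" <- "in.txt"', 'echo "test"; cat "in.txt" > "out.txt"')
print(rule.outputs, rule.inputs)  # ['out.txt'] ['in.txt']
```

Read this way, a rule is a triple: the files it produces, the files it needs, and the command that turns one into the other.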
More advanced examples:
basic_cmd = """(echo "test"; cat $INPUT0) > $OUTPUT0"""
"v5.txt" <- "v1.txt", "v2.txt" [cmd:basic_cmd]
"v6.txt" <- "v3.txt", "v4.txt" [cmd:basic_cmd]
"v7.txt" <- "v5.txt", "v6.txt" [cmd:basic_cmd]
"v8.txt", "v9.txt" <- "v7.txt" [cmd:basic_cmd]
"v10.txt", "v11.txt" <- "v8.txt" [cmd:basic_cmd]
"v12.txt", "v13.txt" <- "v9.txt" [cmd:basic_cmd]
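The rules above share one command through `[cmd:basic_cmd]`, and that command refers to its files as `$INPUT0` and `$OUTPUT0`. The sketch below shows what those placeholders would stand for in the first rule, assuming they are numbered by position in the input and output lists. The numbering is inferred from the variable names, and `placeholder_values` is a hypothetical helper. Snake presumably hands these values to bash itself; plain text substitution is used here only to make them visible.

```python
from string import Template

basic_cmd = '(echo "test"; cat $INPUT0) > $OUTPUT0'


def placeholder_values(outputs, inputs):
    """Hypothetical helper: map INPUT0, INPUT1, ... and OUTPUT0, ... to file names."""
    values = {"INPUT{}".format(i): name for i, name in enumerate(inputs)}
    values.update({"OUTPUT{}".format(i): name for i, name in enumerate(outputs)})
    return values


# First rule above: "v5.txt" <- "v1.txt", "v2.txt" [cmd:basic_cmd]
values = placeholder_values(["v5.txt"], ["v1.txt", "v2.txt"])
expanded = Template(basic_cmd).substitute(values)
print(expanded)  # (echo "test"; cat v1.txt) > v5.txt
```

Note that basic_cmd only mentions `$INPUT0` and `$OUTPUT0`, so in these examples the command reads the first input and writes the first output of each rule.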
for i in range(1,6):
    next = i+1
    output = "n{next}.txt".format(**vars())
    input = "n{i}.txt".format(**vars())
    output <- input
    (echo "test"; cat $INPUT0) > $OUTPUT0
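The loop above defines one rule per iteration. As a plain-Python sketch of which (output, input) pairs it produces, the snippet below repeats the same arithmetic without any Snakefile syntax. The `pairs` list is only an illustration and is not something snake exposes.

```python
pairs = []
for i in range(1, 6):  # i takes the values 1 through 5
    output_name = "n{}.txt".format(i + 1)
    input_name = "n{}.txt".format(i)
    pairs.append((output_name, input_name))

for output_name, input_name in pairs:
    print('"{}" <- "{}"'.format(output_name, input_name))
# "n2.txt" <- "n1.txt"
# "n3.txt" <- "n2.txt"
# "n4.txt" <- "n3.txt"
# "n5.txt" <- "n4.txt"
# "n6.txt" <- "n5.txt"
```

The result is a chain of five rules, each building the next numbered file from the previous one, starting from "n1.txt".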