Simple orchestration for running a set of jobs, each of which consists of a set of tasks.
# Job Orchestration
What is the point of this package?
Whilst running long experiments for https://aclanthology.org/2021.eacl-main.219/, there were a number of pain points that felt unaddressed by the existing open source packages.
- I want to run a large number of experiments that differ in a small number of ways that can be controlled via config.
- I want to be able to set up a list of experiments that should be run and not have to wait for experiment X to finish before I kick off experiment Y.
- I want to be able to run these in parallel when appropriate.
- I don't want the failure of one experiment to affect another experiment.
- These experiments naturally broke down into a set of atomic tasks, and when the 4th task failed I didn't want to have to repeat the first 3 tasks.
- If a later result is surprising, I want to be able to go back to an earlier experiment and:
    - Look at the raw output, including logs, from that experiment.
    - Rerun it in exactly the same setup as before (including being able to revert my local version of the code base to that point in time).
    - Compare what has changed in the code base between the two runs.
    - If a model was trained as part of that experiment, still have it available.
- While the experiments are running I want an easy, centralised place where I can see progress and some estimate of how long is left.
- I want validation to run before the experiment starts. For example, I don't want to train a model for X hours only to find that the result wasn't saved due to a missing path.
This package is an attempt to provide a basic framework to start to address these pain points. It was also important that the package was lightweight and didn't increase the run time significantly. While it was accepted that the package would require some level of opinionation to achieve these goals, the aim was to under-index on this where possible for the main features of the package. Finally, we only considered experiments that can be run on a single machine; scaling out to multiple machines can be achieved by simply partitioning the config files.
Get Started:
Install the package by running the following command.
pip install job-orchestration
The first thing you will need to do is set the JOB_ORCHESTRATION_WORKSPACE environment variable to tell the module where to write logs / output / etc.
Interface:
To enable this module to run your code we require that you follow the convention of having a file called Tasks.py which contains a class that implements the TaskBase class. The class needs a constructor which takes in the config dictionary and a run method that does the actual work. Optionally, you can also provide a validate method (recommended).
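To make the expected shape concrete, here is a minimal sketch of what a Tasks.py could look like. The import path of TaskBase, the class name, and the exact validate contract are assumptions based on the description above rather than the package's actual API; the /demo folder has the authoritative example.

```python
# Tasks.py -- a minimal sketch of the expected interface.
# The import path of TaskBase below is an assumption; check the /demo folder
# in the repository for the real one.
from job_orchestration import TaskBase


class HelloWorldTask(TaskBase):
    def __init__(self, config: dict):
        # The constructor receives the config dictionary built from the yaml config file.
        self.config = config

    def validate(self):
        # Optional (but recommended): runs before the long-running work starts,
        # so problems such as a missing output path are caught up front.
        assert "outputDir" in self.config, "outputDir missing from config"

    def run(self):
        # The actual work of the task goes here.
        print(f"Hello from task {self.config.get('id', '<unknown>')}")
```

The class name (HelloWorldTask here) is what you reference in the method field of the config file described below.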
Config Files:
To control the execution of your program you will need to provide one or more yaml config files. As a minimum we require the following fields:
- githubRepository: URL of the git repo that your project is under.
- pathToTasks: The parent directory of the Tasks.py file.
- outputDir: Path to write the output to - this will be a sub folder of the Output directory.
- tasks: A list of the tasks to be performed by the library. Each task must contain at least:
    - id - must be unique within the set of tasks in this config file.
    - method - the name of the class in the Tasks.py file to execute.
These fields, plus any others that are present, will be passed through as a dictionary object to the task constructor and so can be used to control execution there.
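For illustration, a minimal config might look something like the following. Every value here is a placeholder rather than a real repository, path, or class name:

```yaml
# Hypothetical minimal config -- all values are placeholders.
githubRepository: https://github.com/someUser/someProject
pathToTasks: src                  # the directory containing Tasks.py
outputDir: myFirstExperiment      # written as a sub folder of the Output directory
tasks:
  - id: trainModel                # must be unique within this config file
    method: TrainModelTask        # name of the class in Tasks.py to execute
  - id: evaluateModel
    method: EvaluateModelTask
```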
Now the next question is: where do I put this config? You must put it in the "ConfigSources" sub-directory under the path given by the JOB_ORCHESTRATION_WORKSPACE environment variable. (Pro-tip: once you have set the environment variable, just run python -m job_orchestration to create all the subdirectories.)
Now we need to ready this config file. To do this, run
python -m job_orchestration -action readyConfigs
- this will validate and set up the output location for any configs in the ConfigSources directory before moving them to the ConfigsToRun directory.
Finally, we need to run a worker. The worker will pick up a config file from the "ConfigsToRun" directory, run the tasks in that yaml file one by one, and then repeat with the next config.
python -m job_orchestration -action readyConfigs
For more clarity, please see the example Tasks.py and config file in the /demo folder.