Job-Runner is a crontab-like tool with a web frontend for
administration and (live) monitoring of the current status.
- Schedule recurring jobs
- Chaining of jobs
- Load-balance workers by putting them in a pool
- Schedule jobs to run on all workers within a pool
- Live dashboard (with option to kill runs and ad-hoc scheduling)
- Multiple projects and per-project permission management
The whole project consists of three separate components (and repositories):
- Job-Runner: provides the REST interface, admin interface and (live)
dashboard. This component also provides a long-running process
(manage.py broadcast_queue) to broadcast messages (over ZeroMQ) to the
workers. See: https://github.com/spilgames/job-runner
- Job-Runner Worker: the process that is responsible for executing the job.
It subscribes to (ZeroMQ) messages coming from broadcast_queue, sends data
back over the REST interface and publishes events to the
Job-Runner WebSocket Server.
You can run as many workers as you like, as long as every worker has its own
API key (e.g. when you want to run jobs on multiple servers or under different
usernames on the same server). API keys can be created in the Job-Runner
- Job-Runner WebSocket Server: subscribes to Job-Runner Worker events
and re-broadcasts them to WebSocket connections coming from the Job-Runner
dashboard. This makes it possible to add realtime monitoring to the
- Added more logging (notifications) and fixed minor logging issues
- Add logging about scheduling, job kills and general failures
- Add support for filtering on title in the API.
- Add support for updating and creating jobs in the API.
- Add worker_version and concurrent_jobs fields to the Worker resource.
These fields are sent by the worker with every ping response and are
displayed in the admin.
- Handle the error response when the user no longer has access to the
worker data in the REST API. This can happen when a worker is removed from a
- Show the full job-chain on the jobs page.
- Add check to ensure that there is no recursion in the job-chain
- Make it possible to continue the chain of jobs when one of the jobs failed
(or when a worker failed when run on all workers is selected).
- Add management-command health_check which marks runs assigned to
unresponsive workers as failed and alerts when all workers within a pool
- Fix order_by in reschedule method.
- Redirect to the last selected project (failing that, to the first project
in the list).
- Fix: do not schedule disabled children (regression introduced by the new
way of scheduling in v3.1.0).
- Markdown support for job descriptions was added.
- Scheduled runs are now displayed in the job details.
- Improve performance by increasing the default API page-size to 200.
- Show the user a warning when there are no projects.
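The recursion check on the job-chain mentioned above could be sketched roughly as follows. This is only an illustration and not the project's actual implementation: the `parent_of` mapping (child job id to parent job id) and the function name `creates_recursion` are hypothetical.

```python
def creates_recursion(parent_of, job_id, new_parent_id):
    """Return True if making new_parent_id the parent of job_id closes a loop.

    parent_of is a hypothetical mapping of job id -> parent job id
    (jobs without a parent are simply absent from the mapping).
    """
    seen = set()
    current = new_parent_id
    while current is not None:
        if current == job_id:
            # Walking up the chain led back to the job itself.
            return True
        if current in seen:
            # The existing data already contains a loop; refuse as well.
            return True
        seen.add(current)
        current = parent_of.get(current)
    return False


# Chain: job 1 -> job 2 -> job 3 (1 is the root)
chain = {2: 1, 3: 2}
print(creates_recursion(chain, 1, 3))  # True: 1 would become a child of its own descendant
print(creates_recursion(chain, 4, 3))  # False: 4 is not part of the chain yet
```

Walking up from the proposed parent keeps the check linear in the length of the chain.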
Before deploying this version, make sure that there aren’t any
running jobs! One way to do this is to disable all projects in the
admin, let all runs complete, and then upgrade.
- Schedule the next run before broadcasting the current scheduled run.
- Remove “after complete dts” reschedule option.
- Make sure to roll back the transaction in case of an exception in the
broadcast_queue management command.
- Fix issue with “run on all workers” jobs, where it wasn’t detected that
all siblings finished.
- Apply “Save as new” button to all model admins.
- Fix validation of the reschedule interval field (should not accept 0).
- Add dropdown option to see the runs / jobs of all projects.
- Show assigned workers in last 100 completed runs.
- The broadcast_queue command now locks the selected runs. This is
needed to make sure that no duplicate runs are generated when there are
multiple broadcasters (in case of run on all workers).
- Fix re-schedule issue (duplicates) when manually scheduling a recurring job.
- Fix schedule-time when switching from or to daylight saving time. Previously
the time of a job would change when switching from or to DST. Now a job
is always re-scheduled at the same time (when increment schedule dts by
interval is selected).
- New dashboard layout to make it easier to view large sets of data.
- Optimization of the initial load of data (fewer API requests).
- Add support to run a job across all workers within a worker-pool.
- Optimize how the ModelAuthorization class tests whether the user
or worker has access to the object.
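The daylight saving-time fix in the list above can be illustrated with a small sketch. It is not the project's code: the function names and the `Europe/Amsterdam` time zone are assumptions chosen for the example. It contrasts adding the interval in local (wall-clock) time with adding it in UTC and converting back, which shifts the local time across a DST switch.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo


def reschedule_wall_clock(schedule_dts, interval):
    """Add the interval in local time; Python keeps the wall-clock time."""
    return schedule_dts + interval


def reschedule_via_utc(schedule_dts, interval):
    """Add the interval in UTC and convert back; the wall-clock time may shift."""
    shifted = schedule_dts.astimezone(timezone.utc) + interval
    return shifted.astimezone(schedule_dts.tzinfo)


tz = ZoneInfo("Europe/Amsterdam")  # assumed time zone, for illustration only
before_dst = datetime(2013, 3, 30, 9, 0, tzinfo=tz)  # the day before DST starts
one_day = timedelta(days=1)

print(reschedule_wall_clock(before_dst, one_day).strftime("%H:%M"))  # 09:00
print(reschedule_via_utc(before_dst, one_day).strftime("%H:%M"))     # 10:00
```

The first variant matches the behaviour described for the "increment schedule dts by interval" option: the job keeps running at the same local time.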
Restructure the relations between models. Workers are now project-independent
and are grouped by pools. Permissions are now managed on project level
(instead of on project and job-template level). By assigning a job to a pool
containing multiple workers, the job will be load-balanced
(by selecting a random worker).
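As a rough illustration of the load-balancing described above, the selection of a worker from a pool might look like this. The function name `select_workers`, the `run_on_all` flag and the plain list of worker names are hypothetical and only serve the example.

```python
import random


def select_workers(pool, run_on_all=False, rng=random):
    """Pick the workers that should receive a run.

    pool is a hypothetical list of worker names. By default one random
    worker is chosen; with run_on_all=True every worker in the pool is used.
    """
    if not pool:
        raise ValueError("pool has no workers")
    if run_on_all:
        return list(pool)
    return [rng.choice(pool)]


pool = ["worker-a", "worker-b", "worker-c"]
print(select_workers(pool))                   # one randomly chosen worker
print(select_workers(pool, run_on_all=True))  # all three workers
```

Random selection needs no shared state between schedulers, at the cost of not accounting for how busy each worker currently is.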
After upgrading, make sure to run manage.py migrate to migrate your
data to the new structure.
Before running manage.py migrate, make a backup of your
data! The new structure is not backwards compatible, so the
migration cannot be reversed.
- Fix duplicated enqueues when the worker is down or the enqueue is disabled
for the job. This happened for example when a parent-job tried to schedule
a child job which was disabled.
- Fix the issue where runs from projects other than the selected one were
rendered on the dashboard.
- Add caching of objects to improve the performance (frontend).
- Complete refactor of front-end code. The front-end is now based on AngularJS.
- Misc admin interface improvements (sorting, labels, etc…).
- Add whether the run was manual and/or killed to the error e-mail template.
- Broadcast ping requests to the worker (default: every 5 minutes) and show
last ping response in admin. This will make it easier to discover problems
- Spil specific settings removed.
- Fix related name of run_log, to make sure it shows up correctly in the
template when there is an error.
- Move logs to separate model and RESTful resource. Make sure that you update
the worker to >= v1.1.0.
- Show full path (project - template - worker …) in object title.
- Improve ordering of objects in the admin.
- Added getting started section to the docs.
- Misc documentation improvements.
- Fix kill button so that it is only visible when the user has permission to
kill a job-run (it would otherwise result in an HTTP error).
- Add option in dashboard to kill job-runs.
- Fix time-zone in failed-run e-mail template (will now use the time-zone
configured in the Django config).
- Disable the job when it failed more than x times (optional setting).
- Make it possible to disable the enqueue of a project, worker or job-template.
- Display parent - child relationships in job details.
- Fix another issue with the run broadcaster to make sure it doesn’t broadcast
multiple runs for the same job.
- Fix styling glitches by adding a boot.css which is used as long as the
.less files aren’t compiled yet.
- Fix run broadcaster so that it doesn’t send runs to the workers when there
is still another run for the same job active (in queue or started).
- Fix autoselect environment settings.
- Update hostnames in configuration.
- Add filters to limit the number of displayed jobs.
- Add status icon to display if there is a connection with the WebSocket server
- Add option for monthly re-scheduling. Note: monthly re-scheduling works
by incrementing the dts by the number of days in the month of the dts.
When incrementing by multiple months, the number of days is checked for each month.
- Add description fields to projects, workers, job-templates and jobs
- Add TransactionMiddleware
- Fix rescheduling when two runs are active of the same job
- Add MySQL to requirements.txt, since python setup.py install
creates a zipped .egg which doesn’t work when the user does not have
a homedir (or when the homedir is not executable).
- Show job details in a column instead of a modal
- Show last 100 runs of a job incl. duration graph
- Fix AUTHENTICATION_BACKENDS setting for staging and production
(without ModelBackend included, permissions are not working!)
- A job-title must now be unique per job-template
- Show re-schedule interval in job details
- Fix dependencies in setup.py (was not using the development version
- Run and job details are made deeplinkable
- Runs and jobs that are suspended are greyed-out
- Fix size (height) of run / job headers
- Fix order of run objects
- Fix escaping of HTML characters in job script and log output
- Enable timezone, all data is now presented (and expected to be) in the
- Order scheduled runs ascending (first to run on top)
- Add option to schedule children or not, when manually scheduling runs
- Internal related changes.
- Add job_runner.settings.env.production_longrun settings module for long
running processes to avoid “cached” results.
- Fix issue where filtering the groups would result in duplicated results.
- Remove WebKit browser notifications, since it was breaking the front-end in
- Check that runs received from the WebSocket server are within the current
- Add WebKit browser notifications.
- Major refactor: It is now possible to use AD assigned groups when
creating projects. Since basically all models are changed / renamed, it was
not possible to migrate old data to the new structure. Therefore you should
re-create the database!
- Add broadcast_queue management command to publish enqueueable runs to
- Add missing static-file and logging settings.
- Add MySQL package as a requirement + update requirements in docs.
- Make it possible to override production settings.
- Create standalone application.
- Add overview of jobs + scheduling.
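The note on monthly re-scheduling in the list above (incrementing the dts by the number of days in its month, checked again for each month) can be sketched as follows. This is a minimal illustration based on that description only; the function name `increment_months` is hypothetical.

```python
import calendar
from datetime import datetime, timedelta


def increment_months(dts, months):
    """Advance dts month by month, adding the number of days of the
    month that dts currently falls in at each step."""
    for _ in range(months):
        days_in_month = calendar.monthrange(dts.year, dts.month)[1]
        dts += timedelta(days=days_in_month)
    return dts


print(increment_months(datetime(2013, 1, 15, 9, 0), 1))  # 2013-02-15 09:00:00
print(increment_months(datetime(2013, 1, 15, 9, 0), 2))  # 2013-03-15 09:00:00
print(increment_months(datetime(2013, 1, 31, 9, 0), 1))  # 2013-03-03 09:00:00
```

The last call shows a consequence of this approach: a dts late in a long month can land in the month after next, because the following month has fewer days.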