Cron alert board
Crab is a dashboard system for monitoring cron jobs, or other scheduled tasks. The Crab server receives messages when tasks start or finish, and displays the status of all of the tasks via a web interface. It can also send notifications by email, for example to warn if a task fails, is missed or does not complete within its time-out period.
Tasks communicate with the Crab server by JSON messages sent by HTTP PUT requests. The finish message includes the status of the job, and any output from it. Further messages are used to import and export the client’s crontab, which the server uses to determine the intended schedule.
crontab (0.15 or newer): http://pypi.python.org/pypi/crontab/#downloads
Mako templates: http://pypi.python.org/pypi/Mako/#downloads
PyRSS2Gen: http://dalkescientific.com/Python/PyRSS2Gen.html (optional)
Font Awesome: http://fortawesome.github.com/Font-Awesome (optional)
- Crab server
Has been tested on Python 2.6, 2.7 and 3.2.
- Client library and utilities
Works with Python 2.4 in addition to the above versions (but may require the pytz and simplejson packages also to be installed).
The Crab server, clients and libraries can be installed as follows:
python setup.py install
If necessary, the --install-data option can be used to configure the location in which the templates (templ), resources (res) and example files (doc) should be installed.
To run Crab without installing it, and if any of the Python dependancies listed above can not be installed, they can be symlinked into the lib directory in the following locations:
lib/PyRSS2Gen.py lib/cherrypy lib/crontab lib/mako
To use Font Awesome icons, copy or symlink its font directory into Crab’s res directory, and also place its stylesheet inside that subdirectory, giving:
The Crab Server
A SQLite database file can be prepared for Crab using the schema provided:
% sqlite3 crab.db < doc/schema.txt
The Crab server is configured by a crabd.ini file which can be placed either in /etc/crab/ or ~/.crab/. Note that this is a CherryPy configuration file, which is read slightly differently to typical .ini files which use Python’s ConfigParser.
% cp doc/crabd.ini ~/.crab/
The example crabd.ini file should be edited to uncomment the [crab] and [store] sections. The home and file entries must point to the location of Crab’s data files and the database file just created. By default the data files are installed in share/crab relative to the Python system prefix (sys.prefix).
The Crab server is run as crabd. When the server is executed directly, it will stay in the foreground:
It can also be run in the background with the crabd-check script, which checks that it is not still running from a previous invocation of crabd-check. Therefore this is suitable for running from cron to keep the server running:
PYTHONPATH=/path/to/crab/lib PATH=/path/to/crab/scripts:/bin:/usr/bin 7-57/10 * * * * CRABIGNORE=yes crabd-check
With the server running, the Crab dashboard should be visible from a web browser, by default on port 8000. The Crab clients will use this same web service to communicate with the server.
Monitoring Cron Jobs
There are two Crab client commands: the crab utility, and the crabsh wrapper shell. Cron jobs can either be run under crabsh, or they can be updated to report their own status to the Crab server.
The Crab clients are configured by a crab.ini file which can be placed either in /etc/crab/ or ~/.crab/. The file specifies how to contact the Crab server, and the username and hostname which the client will use to report cron jobs.
% cp doc/crab.ini ~/.crab/
The configuration can be checked with the crab info command. This reports the settings, and indicates which configuration files were read. It is a useful way to check that everything is in order before importing a crontab.
The crabsh Wrapper
crabsh is a wrapper script designed to act like a shell. It can therefore be invoked by cron via the SHELL variable, for example:
PYTHONPATH=/path/to/crab/lib SHELL=/path/to/crab/scripts/crabsh 0 10 * * 1-5 CRABID=test echo "Test cron job"
Where the rules following the SHELL assignment will be run with the wrapper. The PYTHONPATH will need to be set if Crab is not installed where the system can find it. Cron requires the full path when specifying the SHELL. The CRABID parameter is used to give the cron job a convenient and unique name. This is optional, unless there are multiple jobs with the same command, in which case they would otherwise be indistinguishable. However if it specified, then it must be unique for a given host and user, as the Crab server will use it in preference to the command string to identify cron job reports.
crabsh will notify the server when the job starts, and when it finishes, assuming it succeeded if the exit status was zero.
Crab-aware Cron Jobs
Alternatively a cron job can report its own status to the Crab server. The most straightforward way to do this is to execute the crab utility. So a cron job written as a shell script could include commands such as:
% crab start -c "$0" % crab finish -c "$0" % crab fail -c "$0"
In this way you can also report an unknown status with crab unknown.
If the cron job is written in Python, it could import crab.client directly and make use of the CrabClient class.
A Perl module WWW::Crab::Client is also available.
- Other languages
Other language libraries could be written. They would need to make HTTP PUT requests with an appropriate JSON message.
Managing the Cron Job List
The Crab server needs to be given the schedule for each job so that it can detect when a job is late or missed. This is done by “importing” a user’s crontab file:
% crab import
The database entries can then be checked by “exporting” them, again using the crab utility:
% crab export > CRON_TZ=Pacific/Honolulu > 0 10 * * 1-5 CRABID=test echo "Test cron job"
The output is a set of crontab-style lines representing the entries from the database. The crontab can be retrieved exactly as last imported (from a separate database table containing the raw crontab) by giving the --raw option as follows:
% crab export --raw
This is useful as a backup in case a crontab is accidentally lost. However it will not contain any new jobs which have been added automatically by the Crab server since the last import.
Cron Job Parameters
In order to specify the Crab specific parameters of a cron job, Bourne-style shell variables at the start of a command are used. The syntax for each cron job is as follows:
<schedule> [CRABIGNORE=yes] [CRABID=<identifier>] <command string>
A command starting with CRABIGNORE set to a value other than 0/no/off/false will be ignored when importing a crontab, and crabsh will not report its status to the Crab server.
A CRABID specification will override any CRABID environment variable in effect, and is a better way of specifying the identifier as it can not apply to more than one cron job. There should not be multiple jobs with the same identifier for any user and host.
The Crab parameters can be placed in any order before the remainder of the command string, but they must precede any other variables.
If present and not set to 0/no/off/false then crabsh will print out the standard output and standard error it receives from the cron job. This allows the output to be sent by email via cron’s default behavior as well as being captured by the Crab system.
If present overrides the Crab server home directory, where the res and templ directories are to be found.
Specifies the Crab server to which clients should connect, overriding the setting in the configuration file.
Specifies the job identifier which crabsh will use to file reports if there is no CRABID= variable at the start of the cron command. This should be used with caution to avoid specifying the same identifier for multiple cron jobs.
Prevents Crab from acting on specific cron jobs. Jobs imported with this value present and not set to 0/no/off/false will not be entered into the database. Additionally if the crabsh wrapper script is used to run such a job, it will not report its status to the Crab server.
Gives the path to a PID file which crabsh should use to control the execution of a cron job. When this parameter is set, it will use the file to try not to run multiple copies of the job at the same time. Each job should have a separate PID file, so this parameter is most conveniently given at the start of a command string.
Specifies the port on the Crab server, overriding the setting in the configuration file.
The shell which crabsh will use to invoke the cron job command. Defaults to /bin/sh regardless of the user’s shell to replicate cron’s behavior.
The directory to be searched for system-level configuration files. If not set, then /etc/crab will be used.
Cron reads this variable to know in which timezone to interpret the crontab schedule. When the server receives a crontab, it will check for this timezone and use it to override the general timezone which the crab utility will send with the crontab (if it is able to determine it).
Configures the email address to which cron sends email. This is useful when CRABECHO is on, or if crabsh needs to report a failure to contact the Crab server.
Cron uses this variable to select the shell which will be used to execute the cron jobs. The full path must be specified. Crab does not use this variable itself.
This can be set to the system timezone, in which case crab import will use it as the default timezone for the crontab.
The Web Interface
The Crab dashboard allows the status of the jobs to be monitored. On this page, the job status column will change color to indicate the status, and it will flash while the job is running. Clicking on the status will lead to the most recent output recorded for the job.
The host and user columns contain links leading to a summary page of the cron jobs for a given user or host. From this page, the links below each table can be used to show deleted jobs, and to display the raw crontab as last imported.
Clicking on a job ID or command link leads to the job information page, giving a summary of the job’s parameters and a table of the most recent events. Clicking the status of any job finish event leads to the corresponding output.
Below the summary on the job information page, there is a link allowing the job’s configuration to be edited. Any parameter which is left blank here will use the default value.
If a job is deleted, then its configuration is considered to be orphaned. In this case, when configuring a job for which no configuration exists, the system will offer a list of orphaned configurations for re-linking. This should be used when the job is actually the continuation of a previous job. Note that notifications which are attached to specific jobs are linked via the configuration. Therefore re-linking the configuration will re-attach all associated notifications.
However this problem can generally be avoided by giving the jobs suitable names via the CRABID parameter. Crab will then be able to recognize jobs by name even if the command string changes.
Crab includes a configurable notifications system, which currently supports sending notification messages by email. Notifications can either be attached to a specific job, or configured by host name and/or by user name.
A link below the summary on the job information page allows notifications to be attached to that job. Check-boxes for each notification can be used to select which severity of events should be featured, and whether the job output should be included. The schedule box should contain a cron-style schedule specification (e.g. 0 12 * * *), and if left blank, will default to the value given in the crabd.ini file, allowing all notification schedules to be managed in one place. Notifications will only be sent if there are relevant events, so it is possible to request almost-immediate error warnings by including a schedule of * * * * * and selecting errors only.
The add and delete links can be used to add and remove notifications, but the changes are not saved until the Configure button is clicked.
The drop-down menu which appears when the mouse is positioned over the Crab heading at the top of each page includes a link to the main notifications page. This allows notifications to be configured by host name and/or by user name. Notifications will include any jobs where the host and user match the specified values, but if either is left blank, then it will match all entries.
Copyright (C) 2012 Science and Technology Facilities Council.
Crab is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with Crab. If not, see <http://www.gnu.org/licenses/>.
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.