Nagios / Icinga monitoring plugin to check systemd for failed units.

check_systemd

check_systemd is a Nagios / Icinga monitoring plugin to check systemd. This Python script reports a degraded system to your monitoring solution. It can also monitor individual systemd services (with the -u, --unit parameter) and timer units (with the -t, --dead-timers parameter). The plugin's only dependency is the Python library nagiosplugin.

Installation

pip install check_systemd

Packages

check_systemd on repology.org.

  • archlinux (package, source code): yaourt -S check_systemd

  • Ubuntu (package, source code): apt install monitoring-plugins-systemd

  • Debian (package, source code): apt install monitoring-plugins-systemd

  • NixOS (package, source code): nix-env -iA nixos.check_systemd

  • Fedora (package, source code): dnf install nagios-plugins-systemd

  • OracleLinux9 / RHEL9 (package, source code, binary): This package contains a single binary, compiled with the Python compiler Nuitka, that bundles all dependencies. It is built via GitLab CI as a nightly release and is considered experimental. Install it with:

    curl -L -o check_systemd-1.0-1.x86_64.rpm "https://gitlab.com/msfgitlab/check_systemd_build_rpm/-/jobs/artifacts/main/raw/output/check_systemd-1.0-1.x86_64.rpm?job=release_rpm" && sudo dnf install -y ./check_systemd-1.0-1.x86_64.rpm

Command line interface

usage: check_systemd [-h] [-v] [-d] [-V] [-i] [-I REGEXP] [-u UNIT_NAME]
                     [--include-type UNIT_TYPE [UNIT_TYPE ...]] [-e REGEXP]
                     [--exclude-unit UNIT_NAME [UNIT_NAME ...]]
                     [--exclude-type UNIT_TYPE]
                     [--state {active,reloading,inactive,failed,activating,deactivating}]
                     [-t] [-W SECONDS] [-C SECONDS] [-n] [-w SECONDS]
                     [-c SECONDS] [--dbus | --cli] [--user] [-P | -p]

Copyright (c) 2014-18 Andrea Briganti <kbytesys@gmail.com>
Copyright (c) 2019-24 Josef Friedrich <josef@friedrich.rocks>

Nagios / Icinga monitoring plugin to check systemd.

options:
  -h, --help            show this help message and exit
  -v, --verbose         Increase output verbosity (use up to 3 times).
  -d, --debug           Increase debug verbosity (use up to 2 times): -d: info
                        -dd: debug.
  -V, --version         show program's version number and exit

Options related to unit selection:
  By default all systemd units are checked. Use the option '-e' to exclude units
  by a regular expression. Use the option '-u' to check only one unit.

  -i, --ignore-inactive-state
                        Ignore an inactive state on a specific unit. Oneshot
                        services for example are only active while running and
                        not enabled. The rest of the time they are inactive.
                        This option only has an effect if it is used with the
                        option -u.
  -I REGEXP, --include REGEXP
                        Include systemd units in the checks. This option can be
                        applied multiple times, for example: -I mnt-data.mount
                        -I task.service. Regular expressions can be used to
                        include multiple units at once, for example: -I
                        'user@\d+\.service'. For more information see the
                        Python documentation about regular expressions
                        (https://docs.python.org/3/library/re.html).
  -u UNIT_NAME, --unit UNIT_NAME, --include-unit UNIT_NAME
                        Name of the systemd unit that is being tested.
  --include-type UNIT_TYPE [UNIT_TYPE ...]
                        One or more unit types (for example: 'service', 'timer')
  -e REGEXP, --exclude REGEXP
                        Exclude a systemd unit from the checks. This option can
                        be applied multiple times, for example: -e
                        mnt-data.mount -e task.service. Regular expressions can
                        be used to exclude multiple units at once, for example:
                        -e 'user@\d+\.service'. For more information see the
                        Python documentation about regular expressions
                        (https://docs.python.org/3/library/re.html).
  --exclude-unit UNIT_NAME [UNIT_NAME ...]
                        Name of the systemd unit that is excluded from the
                        checks.
  --exclude-type UNIT_TYPE
                        One or more unit types (for example: 'service', 'timer')
  --state {active,reloading,inactive,failed,activating,deactivating}, --required {active,reloading,inactive,failed,activating,deactivating}, --expected-state {active,reloading,inactive,failed,activating,deactivating}
                        Specify the active state that the systemd unit must have
                        (for example: active, inactive)

Timers related options:
  -t, --timers, --dead-timers
                        Detect dead / inactive timers. See the corresponding
                        options '-W, --dead-timers-warning' and
                        '-C, --dead-timers-critical'. Dead timers are detected
                        by parsing the output of 'systemctl list-timers'. A
                        timer counts as dead if its row displays 'n/a' in the
                        NEXT and LEFT columns and the time span in the PASSED
                        column exceeds the values specified with these options.
  -W SECONDS, --timers-warning SECONDS, --dead-timers-warning SECONDS
                        Time ago in seconds for dead / inactive timers to
                        trigger a warning state (by default 6 days).
  -C SECONDS, --timers-critical SECONDS, --dead-timers-critical SECONDS
                        Time ago in seconds for dead / inactive timers to
                        trigger a critical state (by default 7 days).

Startup time related options:
  -n, --no-startup-time
                        Don’t check the startup time. Using this option the
                        options '-w, --warning' and '-c, --critical' have no
                        effect. Performance data about the startup time is
                        collected, but no critical, warning etc. states are
                        triggered.
  -w SECONDS, --warning SECONDS
                        Startup time in seconds to result in a warning status.
                        The default is 60 seconds.
  -c SECONDS, --critical SECONDS
                        Startup time in seconds to result in a critical status.
                        The default is 120 seconds.

Monitoring data acquisition:
  --dbus                Use systemd’s D-Bus API instead of parsing the text
                        output of various systemd related command line
                        interfaces to monitor systemd. At the moment the D-Bus
                        backend of this plugin is only partially implemented.
  --cli                 Use the text output of several systemd command line
                        interface (cli) binaries to gather the required data for
                        the monitoring process.
  --user                Also show user (systemctl --user) units.

Performance data:
  By default performance data is attached.

  -P, --performance-data
                        Attach performance data to the plugin output.
  -p, --no-performance-data
                        Attach no performance data to the plugin output.

Performance data:
  - count_units
  - startup_time
  - units_activating
  - units_active
  - units_failed
  - units_inactive
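
The effect of excluding units by regular expression (option -e) can be illustrated with a few lines of Python. This is only a sketch, not the plugin's actual code: the helper name filter_units is hypothetical, and applying the patterns with re.match (anchored at the start of the unit name) is an assumption about the matching behaviour.

```python
import re
from typing import Iterable, List


def filter_units(units: Iterable[str], exclude: Iterable[str]) -> List[str]:
    """Return the unit names that match none of the exclude patterns.

    Assumption: patterns are applied with re.match; the real plugin
    may match differently.
    """
    patterns = [re.compile(p) for p in exclude]
    return [u for u in units if not any(p.match(u) for p in patterns)]


units = ["user@1000.service", "user@1001.service", "mnt-data.mount", "ssh.service"]

# Roughly what "-e 'user@\d+\.service' -e 'mnt-data\.mount'" asks for.
remaining = filter_units(units, [r"user@\d+\.service", r"mnt-data\.mount"])
print(remaining)  # ['ssh.service']
```

Note that the dot is escaped in the patterns; an unescaped dot would match any character.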

Behind the scenes

dbus

Command line interface (cli) parsing:

To detect failed units this monitoring script runs:

systemctl list-units --all
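
How failed units might be extracted from that text output can be sketched as follows. The sample text, the Unit tuple and the function parse_list_units are made up for illustration and are not taken from the plugin; the sketch assumes the usual column order UNIT, LOAD, ACTIVE, SUB, DESCRIPTION and that a blank line separates the table from the legend.

```python
from typing import List, NamedTuple

# Hypothetical, shortened sample of "systemctl list-units --all" output.
SAMPLE = """\
  UNIT                LOAD   ACTIVE   SUB     DESCRIPTION
  mnt-data.mount      loaded active   mounted /mnt/data
● nginx.service       loaded failed   failed  A high performance web server
  ssh.service         loaded active   running OpenBSD Secure Shell server
  task.service        loaded inactive dead    Nightly task

LOAD   = Reflects whether the unit definition was properly loaded.
4 loaded units listed.
"""


class Unit(NamedTuple):
    name: str
    load: str
    active: str
    sub: str


def parse_list_units(text: str) -> List[Unit]:
    units = []
    for line in text.splitlines()[1:]:  # skip the header row
        if not line.strip():
            break  # assumption: a blank line ends the table
        fields = line.split()
        # Unit names always contain a dot (name.type); a leading marker
        # glyph such as the bullet in front of failed units does not.
        if "." not in fields[0]:
            fields = fields[1:]
        if len(fields) >= 4:
            units.append(Unit(*fields[:4]))
    return units


failed = [u.name for u in parse_list_units(SAMPLE) if u.active == "failed"]
print(failed)  # ['nginx.service']
```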

To get the startup time it executes:

systemd-analyze
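
A minimal sketch of how the total startup time could be read from that output and compared with the default thresholds of 60 and 120 seconds. The function names are hypothetical, the sample line is illustrative, and treating a value strictly greater than the threshold as a violation is an assumption based on the usual Nagios threshold convention.

```python
import re

UNIT_SECONDS = {"h": 3600.0, "min": 60.0, "s": 1.0, "ms": 1e-3, "us": 1e-6}
SPAN = re.compile(r"(\d+(?:\.\d+)?)\s*(h|min|ms|us|s)\b")


def parse_startup_time(output: str) -> float:
    """Return the total after the '=' sign of the first line, in seconds."""
    first_line = output.splitlines()[0]
    total = first_line.rsplit("=", 1)[1]
    return sum(float(value) * UNIT_SECONDS[unit] for value, unit in SPAN.findall(total))


def startup_state(seconds: float, warning: float = 60.0, critical: float = 120.0) -> int:
    """Return a Nagios status code: 0 = OK, 1 = WARNING, 2 = CRITICAL."""
    if seconds > critical:
        return 2
    if seconds > warning:
        return 1
    return 0


sample = "Startup finished in 2.187s (kernel) + 1min 9.005s (userspace) = 1min 11.192s"
seconds = parse_startup_time(sample)
print(startup_state(seconds))  # 1, because the total lies between 60 s and 120 s
```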

To find dead timers this plugin launches:

systemctl list-timers --all
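
The dead-timer rule from the help text ('n/a' in the NEXT and LEFT columns, PASSED exceeding the thresholds) can be sketched like this. The function dead_timer_state is hypothetical and assumes that the PASSED column has already been converted to seconds; parsing the time span itself is left out.

```python
from datetime import timedelta

# Defaults from the help text: 6 days (warning) and 7 days (critical).
WARNING_SECONDS = int(timedelta(days=6).total_seconds())   # 518400
CRITICAL_SECONDS = int(timedelta(days=7).total_seconds())  # 604800


def dead_timer_state(next_col, left_col, passed_seconds,
                     warning=WARNING_SECONDS, critical=CRITICAL_SECONDS):
    """Return a Nagios status code: 0 = OK, 1 = WARNING, 2 = CRITICAL."""
    if next_col != "n/a" or left_col != "n/a":
        return 0  # a next run is scheduled, so the timer is not dead
    if passed_seconds > critical:
        return 2
    if passed_seconds > warning:
        return 1
    return 0


# A timer without a next run that last fired 8 days ago.
print(dead_timer_state("n/a", "n/a", 8 * 86400))  # 2
```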

To learn how systemd produces the text output on the command line, it is worthwhile to take a look at systemd’s source code. Files relevant for text output are: basic/time-util.c, analyze/analyze.c.

Testing

pyenv install 3.6.12
pyenv install 3.7.9
pyenv local 3.6.12 3.7.9
pip3 install tox
tox

Test a single test case:

tox -e py38 -- test/test_scope_timers.py:TestScopeTimers.test_all_n_a

Deploying

Edit the version number in check_systemd.py (without the leading v), then tag the release. Use the -s option of git tag to sign the tag (required for the Debian package).

git tag -s v2.0.11
git push --tags
