NRPE plugin for monitoring Docker containers and swarms

check_docker

Nagios/NRPE compatible plugins for checking Docker-based services. There are currently two Nagios checks:

  • check_docker which checks docker container health
  • check_swarm which checks health of swarm nodes and services

With check_docker you can check and alert on:

  • memory consumption in absolute units (bytes, kb, mb, gb) and as a percentage (0-100%) of the container limit.
  • CPU usage as a percentage (0-100%) of the container limit.
  • automatic restarts performed by the docker daemon
  • container status, i.e. is it running?
  • container health checks, i.e. are they passing?
  • uptime, i.e. is it able to stay running for a long enough time?
  • the presence of a container or containers matching specified names
  • image version (experimental!), does the running image match that in the remote registry?

With check_swarm you can check and alert on:

  • whether a node is joined to a docker swarm
  • whether a service is running in a swarm

These checks can communicate with a local docker daemon socket file (default) or with local or remote docker daemons using secure and non-secure TCP connections.
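
To illustrate the two kinds of connection targets described above, here is a minimal Python sketch that tells a socket file path apart from a host:port pair. The helper name classify_connection and the Target tuple are hypothetical and are not part of the plugins; the rule "an absolute path means a socket file" is an assumption made for illustration.

```python
from typing import NamedTuple, Optional


class Target(NamedTuple):
    kind: str            # "socket" or "tcp"
    path: Optional[str]  # socket file path, set when kind == "socket"
    host: Optional[str]  # host name or IP, set when kind == "tcp"
    port: Optional[int]  # TCP port, set when kind == "tcp"


def classify_connection(value: str) -> Target:
    """Hypothetical helper: split a --connection style value into its parts."""
    if value.startswith("/"):
        # Assumption: an absolute path is taken to be a Unix socket file.
        return Target("socket", value, None, None)
    host, sep, port = value.rpartition(":")
    if not sep or not host or not port.isdigit():
        raise ValueError("expected /path/to/socket or host:port, got %r" % value)
    return Target("tcp", None, host, int(port))
```

For example, classify_connection("/var/run/docker.sock") yields a socket target, while classify_connection("10.0.0.5:2376") yields a TCP target with port 2376.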

These plugins require Python 3. They are tested on 3.3 and greater but may work on older versions of Python 3.

check_docker Usage

usage: check_docker [-h]
                    [--connection [/<path to>/docker.socket|<ip/host address>:<port>]
                    | --secure-connection [<ip/host address>:<port>]]
                    [--timeout TIMEOUT]
                    [--containers CONTAINERS [CONTAINERS ...]] [--present]
                    [--cpu WARN:CRIT] [--memory WARN:CRIT:UNITS]
                    [--status STATUS] [--health] [--uptime WARN:CRIT]
                    [--version] [--restarts WARN:CRIT]

Check docker containers.

optional arguments:
  -h, --help            show this help message and exit
  --connection [/<path to>/docker.socket|<ip/host address>:<port>]
                        Where to find docker daemon socket. (default:
                        /var/run/docker.sock)
  --secure-connection [<ip/host address>:<port>]
                        Where to find TLS protected docker daemon socket.
  --timeout TIMEOUT     Connection timeout in seconds. (default: 10.0)
  --containers CONTAINERS [CONTAINERS ...]
                        One or more RegEx that match the names of the
                        container(s) to check. If omitted all containers are
                        checked. (default: ['all'])
  --present             Modifies --containers so that each RegEx must match at
                        least one container.
  --cpu WARN:CRIT       Check cpu usage percentage taking into account any
                        limits. Valid values are 0 - 100.
  --memory WARN:CRIT:UNITS
                        Check memory usage taking into account any limits.
                        Valid values for units are %,b,k,m,g.
  --status STATUS       Desired container status (running, exited, etc).
                        (default: None)
  --health              Check container's health check status
  --uptime WARN:CRIT    Minimum container uptime in seconds. Use when
                        infrequent crashes are tolerated.
  --version             Check if the running images are the same version as
                        those in the registry. Useful for finding stale
                        images. Only works with public registry.
  --restarts WARN:CRIT  Container restart thresholds.
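
The WARN:CRIT and WARN:CRIT:UNITS formats above can be read as in the following Python sketch. It is not the plugin's own code: the function names are hypothetical, the 1024-based unit factors are an assumption, and so is the choice to treat a value equal to a threshold as breaching it. The exit codes 0 to 3 follow the usual Nagios plugin convention.

```python
# Nagios plugin exit codes (standard convention).
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3

# Assumption: binary multiples; the plugin's own factors may differ.
UNIT_FACTORS = {"b": 1, "k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def parse_threshold(spec, with_units=False):
    """Split 'WARN:CRIT' or 'WARN:CRIT:UNITS' into (warn, crit, units)."""
    parts = spec.split(":")
    expected = 3 if with_units else 2
    if len(parts) != expected:
        raise ValueError("expected %d colon-separated fields in %r" % (expected, spec))
    warn, crit = float(parts[0]), float(parts[1])
    units = parts[2] if with_units else None
    if with_units and units != "%" and units not in UNIT_FACTORS:
        raise ValueError("unknown units %r" % units)
    return warn, crit, units


def evaluate(value, warn, crit):
    """Return a Nagios status for a 'higher is worse' measurement."""
    if value >= crit:
        return CRITICAL
    if value >= warn:
        return WARNING
    return OK


def memory_status(used_bytes, limit_bytes, spec):
    """Evaluate memory use against a WARN:CRIT:UNITS spec such as '80:90:%'."""
    warn, crit, units = parse_threshold(spec, with_units=True)
    if units == "%":
        value = 100.0 * used_bytes / limit_bytes
    else:
        value = used_bytes / UNIT_FACTORS[units]
    return evaluate(value, warn, crit)
```

Note that --uptime is described as a minimum, so a real check of that kind would compare in the opposite direction (lower is worse); the evaluate function here only covers the CPU, memory, and restart style of threshold.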

check_swarm Usage

usage: check_swarm [-h]
                   [--connection [/<path to>/docker.socket|<ip/host address>:<port>]
                   | --secure-connection [<ip/host address>:<port>]]
                   [--timeout TIMEOUT]
                   (--swarm | --service SERVICE [SERVICE ...])

Check docker swarm.

optional arguments:
  -h, --help            show this help message and exit
  --connection [/<path to>/docker.socket|<ip/host address>:<port>]
                        Where to find docker daemon socket. (default:
                        /var/run/docker.sock)
  --secure-connection [<ip/host address>:<port>]
                        Where to find TLS protected docker daemon socket.
  --timeout TIMEOUT     Connection timeout in seconds. (default: 10.0)
  --swarm               Check swarm status
  --service SERVICE [SERVICE ...]
                        One or more RegEx that match the names of the
                        services(s) to check.
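
As a rough sketch of how the options above fit together, the following Python builds a check_swarm argument list and maps the exit code to a Nagios state name. The helper names swarm_command and run_check are hypothetical, and the sketch assumes that check_swarm is on the PATH and exits with the conventional Nagios codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN).

```python
import subprocess

NAGIOS_STATES = {0: "OK", 1: "WARNING", 2: "CRITICAL", 3: "UNKNOWN"}


def swarm_command(services=None, connection=None, timeout=None):
    """Build a check_swarm argument list from the documented options."""
    argv = ["check_swarm"]
    if connection is not None:
        argv += ["--connection", connection]
    if timeout is not None:
        argv += ["--timeout", str(timeout)]
    # --swarm and --service are mutually exclusive and one of them is required.
    if services:
        argv += ["--service"] + list(services)
    else:
        argv.append("--swarm")
    return argv


def run_check(argv):
    """Run the plugin and return (state name, output text)."""
    result = subprocess.run(argv, stdout=subprocess.PIPE, universal_newlines=True)
    return NAGIOS_STATES.get(result.returncode, "UNKNOWN"), result.stdout.strip()
```

Calling run_check(swarm_command(services=["web"])) would then check a service whose name matches "web" against the default socket.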

Gotchas:

  • When using check_docker with older versions of docker (I have seen 1.4 and 1.5), --status only supports 'running', 'restarting', and 'paused'.
  • When using check_docker, if no container is specified, all containers are checked. Some containers may return critical status if the selected check(s) require a running container.
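
To make the container selection behaviour concrete, here is a small Python sketch of how --containers patterns and the --present rule could be applied to a list of container names. The function select_containers is hypothetical, and whole-name matching with re.fullmatch (Python 3.4+) is an assumption; the plugin may anchor its patterns differently.

```python
import re


def select_containers(patterns, names):
    """Return (matched names, patterns that matched nothing)."""
    if list(patterns) == ["all"]:
        # The documented default: every container is checked.
        return list(names), []
    matched, unmatched = [], []
    for pattern in patterns:
        hits = [n for n in names if re.fullmatch(pattern, n)]
        if not hits:
            # With --present, a pattern with no match would be an error.
            unmatched.append(pattern)
        for n in hits:
            if n not in matched:
                matched.append(n)
    return matched, unmatched
```

Narrowing the patterns this way is one way to avoid the second gotcha, since stopped containers that are irrelevant to a check are simply never selected.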

Release history

2.0.6, 2.0.5, 2.0.1, 2.0.0, 1.0.5 (this version), 1.0.4, 1.0.3, 1.0.1, 1.0.0

Download files

check_docker-1.0.5.tar.gz (21.5 kB), source distribution, uploaded Oct 29, 2017
