Skip to main content

A system monitoring and action triggering tool

Project description

Periodic Analyzer - prdanlz

Prdanlz can refresh data and evaluate rules periodically. It fetches data and saves to a variable. Derivatives are calculated from variables. Once all variables are refreshed, each incident rule is evaluated based on variables and derivatives.

An incident is a unit of monitoring. Each incident defines 3 levels: "error", "warn", and "info." An action is triggered when an incident goes into a new status. Each level defines a range of a value.

What can Prdanlz Help?

  1. Monitor a system and report an incident
  2. Monitor a system and take actions/remediation automatically

Examples

  1. Swap usage monitor
  2. CPU load monitor
  3. Battery monitor
  4. Last pid monitor

Aiming

  1. Help monitoring a small set of FreeBSD machines
  2. Obtaining system information and statistics with sysctl and commands
  3. Experimenting auto-turn FreeBSD system

Index

  1. Periodic Analyzer - prdanlz
  2. What can Prdanlz Help?
  3. Examples
  4. Index
  5. How to Install
    1. PyPI
    2. PKG + Ports
  6. How to Setup
  7. How to Run
  8. How does pradnlz Work
  9. Motivations
  10. JSON Format
    1. Top Level

    2. "constants", "variables", and "derivatives"

      1. "Syscmd" type
      2. "Sysctl" type
      3. Order of Evaluations among Variables
      4. Historical Values
        1. How to Specify How Many to Keep
        2. [How to Access] (./README.md#how-to-access)
          1. When a Value isn't yet Available
    3. "Incidents" and their "Levels"

      1. "Incident" Definition
      2. "Level" Definition
      3. "Level" Inheritance
    4. "Incident" / "Level" Variables

  11. Verify Input Configuration

How to Install

It runs on FreeBSD only.

PyPI

% python -m venv venv
% source venv/bin/activate.csh
% pip install prdanlz

PKG + Ports

py-prdanlz is not yet in FreeBSD's ports. However, ports' Makefile is prepared with the code base.

$ cd /usr/ports
$ fetch https://github.com/uyota/py-prdanlz/archive/refs/tags/v0.0.5.tar.gz
$ tar xf v0.0.5.tar.gz
$ cd py-prdanlz/ports
$ make fetch makesum install
  1. Check release page for the latest version.
  2. Ports "distinfo" file contains timestamp, checksum, and size of the release file. This file cannot be maintained well in the source repo itself due to chicken and egg problem. Run 'makesum' to generate it as a part of build process.

How to Setup

  1. Define 'constants' in JSON file
  2. Define 'variables' in JSON file
  3. Define 'derivatives' in JSON file
  4. Define 'incidents' in JSON file
  5. Invoke prdanlz program with interval and JSON file(s)

How to Run

% python -m prdanlz -c config.json -i 10 -l prdanlz.log

How does prdanlz Work?

  1. Fetch all of constants at startup
  2. After waiting an interval second, fetch all variables
  3. After all variables are fetched, calculate all of derivatives
  4. After all variables are fetched and derivatives are calculated, evaluate all of incidents
  5. If a value moves into a new level of an incident, trigger an action
  6. Wait for another interval period and repeat

Motivations

BSD's sysctl provides a lot of information about the running system.

Sysctl is a system call as well as a corresponding program name.

Base system provides number of other tools to capture system information one-time or a period of time such as "systat", "vmstat", "top", "swapctl", and etc.

Using sysctl program becomes very expensive as it starts a new process each time when gathering number of information. Calling sysctl system call saves process creation and decoding overhead; however, writing or adjusting C/C++ program requires a lot of programming hours.

prdanlz uses Python binding of sysctl system call via a library. This avoid process creation overhead. By taking advantage of Python interpreter, prdanlz can also provide and perform complex arithmetics at very low cost compare to shell scripts.

JSON Format

Top Level

Top level is a dictionary that may contain any of "constants", "variables", and/or "incidents".

Multiple JSON files may be supplied to prdanlz.

All of "constants" and "variables" names must be unique among them per prdanlz invocation. All of "incidents" names must be unique among incidents. An incident can access all of variables regardless of the order in JSON files.

All JSON configuration keywords are spelled in lower cases only.

"constants", "variables", and "derivatives"

All of "constants", "variables", "derivatives" are all dictionaries.

All of "constants", "variables", "derivatives" are variables but they have different properties. "constants" are only fetched once and thus resulting the same value all times during the monitoring. "variables" are re-fetched at each cycle. "derivatives" are calculated variables; they are derived from "constants" and variables".

Both of "constants" and "variables" are to capture system output and not convenient to manipulate their values on the fly at same time. "derivatives" can be used to calculate from "constants" and "variables" to run arithmetics on and store their results.

All of "constants" and "variables" are stored in their given names. In addition, "variables" and "derivatives" generate variables that start with "last_" allowing to refer to previous values.

A dictionary key specifies the name of a variable and dictionary value represents how to fetch data. There are 2 data types supported: "syscmd" and "sysctl"

For example,

"variables": {
    "time": {"type": "syscmd", "syscmd": "date '+%H:%M:%S'"}
}

creates "time" with "19:28:37" and "last_time" with "19:26:37" with 2 minutes interval.

"Syscmd" type

A dictionary key of "type" with "syscmd" indicates SYStem CoManD. A dictionary value of "syscmd" specifies unix command(s) to run. A sequence of commands with a pipe may be also specified.

"time": {"type": "syscmd", "syscmd": "date '+%H:%M:%S'"}
"var_messages": {"type": "syscmd", "syscmd": "wc -l /var/log/messages | awk '{print $1}'"

"Sysctl" type

A dictionary key of "type" with "sysctl" indicates capturing a sysctl value. Prdanlz uses C binding and does not invoke external 'sysctl' command.

"vm__kmem_size": {"type": "sysctl", "sysctl": "vm.kmem_size"},
"hw__ncpu": {"type": "sysctl", "sysctl": "hw.ncpu"}

Supported Structure Sysctl Types

Refer to Sysctl Types for each of struct sysctl format and its output.

Order of Evaluations among Variables

  1. All "constants" are fetched at start time and only once, first.
  2. Then, all "variables" are fetched at each cycle.
  3. Then, all "derivatives" are calculated.

Priorities are not defined among same type of variables. Therefore, they cannot depend on each other. However, different tiers of variables are always evaluated in the same order as above and thus safe to assume values are latest.

Historical Values

"Variables" and "derivatives" can hold multiple data.

How to Specify How Many to Keep

Specify "depth" and its level as argument. For example, '"depth": 10' keeps latest 10 values.

How to Access

A "Variable" or "derivative" will hold multiple values. Use Python array notation to access specify index. For example, "value[0]" will be the oldest value, "value[-1]" will be the latest value, "value[2]" will be the 3rd oldest value, and so on.

When a Value isn't yet Available

One may want to start evaluating after some values are accumulated with history. Accessing to not-yet-populated value raises an exception but caught silently in prdanlz. In other words, all rules are ignored until enough data points are generated. For example, if an evaluation is like "value[10] > 90", all evaluation up to and including 9th time are ignored.

"Incidents" and their "Levels"

"Incidents" is a dictionary that contains "indecent" definitions.

"Incident" Definition

An incident is a directory.

  1. Requires "description"
  2. Requires either one or more of "error", "warn", "info" level definition

Custom Levels

The 3 tier level of "error", "warn", and "info" is a preset value. Custom level can be specified with --levels option. The levels are ordered from highest to lowest severity.

% python --levels critical error warning info ...

"Level" Definition

A level is a directory.

  1. Requires "description"
  2. Requires "trigger"
  3. Requires "untrigger"
  4. Requires "escalation"

"description"

Useful description is helpful to keep track what it does, how it does, who wants what.

"trigger"

A trigger is an expression to indicate a variable becomes within threshold. When an expression is evaluated to true for the first time, its "escalation" is executed.

"untrigger"

An untrigger is an expression. When an expression is evaluated to true for being alerted, its "escalated" status is cleared.

Both true values from trigger and untrigger expressions indicates a value is within threshold. An untrigger is a low watermark and an trigger is a hight watermark.

"trigger"/"untrigger" Formats

"trigger" and "untrigger" take a string of expressions. These expressions can access constants and variables by surrounded variable name with curly braces.

Examples

Given

"trigger": "90 < {value}",
"untrigger": "{value} < 85"
  1. When value changes [ 10, 20, 30 ], nothing is triggered
  2. When value changes [ 0, 95 ], 95 is a trigger
  3. When value changes [ 0, 88 ], 88 is NOT a trigger
  4. When value changes [ 0, 95, 83 ], 95 is a trigger and 83 is an untrigger
  5. When value changes [ 0, 95, 93 ], 95 is a trigger but 93 is neither trigger nor untrigger
  6. When value changes [ 0, 95, 89, 91 ], 95 is an trigger but neither 89 nor 91

"escalation" and its Format

An escalation is an expression that is executed as a shell command. A sequence of commands can be specified. Variables surrounded by curly braces are replaced to actual values.

Examples

An useful way is send '{description} with 'logger' to write to syslog.

"escalation": "logger '{level} {description}'"

"Level" Inheritance

If any of "trigger", "untrigger", or "escalation" is not specified, lower levels assume they are same as a higher level entry.

Use "level" local variables to simplify and re-use same trigger, untrigger, and/or escalation at multiple levels.

Given

"error": { "trigger":..., "untrigger":..., "escalation": "logger '${level}'" },
"warn": { "trigger":..., "untrigger":...  },
"info": { "trigger":..., "untrigger":..., "escalation": "echo ${level}" }

"error" and "warn" are sent to syslog but "info" is only echo-ed.

"Incident" / "Level" Variables

Primitive values in each of "incident" and "level" dictionaries are accessible. However, collisions with other variables are not warned.

Given

{
"error": { "urgency": "!!!", "escalation": "echo 'Help{urgency}', ... }
"warn": { "urgency": "!", ... }
"info": { "urgency": "", ... }
}

"error" prints "Help!!!" and "warn" does "Help!" and "info" does "Help" accordingly.

Verify Input Configuration

Use --verify to check configuration files. It loads constants and variables, calculate derivatives, and evaluates triggers, untriggers, and escalation. The results are printed.

% python -m prdanlz --verify -c prdanlz.json

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

prdanlz-0.2.2.tar.gz (20.9 kB view hashes)

Uploaded Source

Built Distribution

prdanlz-0.2.2-py3-none-any.whl (23.0 kB view hashes)

Uploaded Python 3

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page