graphite-metrics·PyPI

Standalone graphite collectors for various stuff not (or poorly) handled by other monitoring daemons

graphite-metrics: standalone graphite collectors for various stuff not (or
poorly) handled by other monitoring daemons

The core of the project is a simple daemon (harvestd), which collects
metric values and sends them to graphite once per interval.

It consists of separate components ("collectors") that process:
* /proc/slabinfo for useful-to-watch values, not everything
(configurable).
* /proc/vmstat and /proc/meminfo in a consistent way.
* /proc/stat for irq, softirq, forks.
* /proc/buddyinfo and /proc/pagetypeinfo (memory fragmentation).
* /proc/interrupts and /proc/softirqs.
* Cron log to produce start/finish events and duration for each job
as separate metrics, adapting job names to metric names with regexes.
* Per-system-service accounting using [1]systemd and its cgroups.
* [2]sysstat data from sadc logs (use something like sadc -F -L -S
DISK -S XDISK -S POWER 60 to have more stuff logged there) via
the sadf binary and its JSON export (sadf -j, supported since
sysstat-10.0.something, iirc).
* iptables rule "hits" packet and byte counters, taken from
ip{,6}tables-save, mapped via separate "chain_name rule_no
metric_name" file, which should be generated along with firewall
rules (I use [3]this script to do that).
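To give a rough idea of what a collector for one of these sources does, here is a minimal sketch for the /proc/stat case (irq, softirq, forks). It is not the shipped collector: the function name and the metric names ("irq.total", "softirq.total", "forks") are made up for illustration, and it only relies on the documented /proc/stat layout, where the first number after "intr", "softirq" and "processes" is the total since boot.

```python
def parse_proc_stat(text):
    """Extract total irq, softirq and fork counters from /proc/stat contents."""
    # Hypothetical mapping from /proc/stat keywords to metric names.
    wanted = {"intr": "irq.total", "softirq": "softirq.total", "processes": "forks"}
    metrics = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] in wanted:
            # The first number after the keyword is the total since boot.
            metrics[wanted[fields[0]]] = int(fields[1])
    return metrics


# Shortened sample in the /proc/stat format, so the sketch runs anywhere.
SAMPLE = """\
cpu  4705 356 584 3699 23 23 0 0 0 0
intr 114930 0 9 0 0
ctxt 1990473
btime 1062191376
processes 2915
procs_running 1
softirq 183433 0 21755 12
"""

print(parse_proc_stat(SAMPLE))
```

On a Linux box the same function can be fed the real file, e.g. parse_proc_stat(open("/proc/stat").read()). These are ever-growing counters, so they are usually turned into per-interval deltas or rates before graphing.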

Additional metric collectors can be added via the setuptools
graphite_metrics.collectors entry point. Look at the shipped collectors
for API examples.
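A sketch of how a third-party package might register itself under that entry point group. Only the group name graphite_metrics.collectors comes from the text above; the package name, module name and collector name are hypothetical, and whether the entry should point at a module or at a specific object inside it is an assumption that should be checked against the shipped collectors.

```python
# Fragment for the setup.py of a hypothetical package "my_collectors".
ENTRY_POINTS = {
    "graphite_metrics.collectors": [
        # Standard setuptools syntax: "<name> = <module>[:<object>]".
        # Pointing at a module here is an assumption, not a documented rule.
        "my_stats = my_collectors.my_stats",
    ],
}


def run_setup():
    """Call setuptools with the entry points above (names are made up)."""
    from setuptools import setup

    setup(
        name="my-collectors",
        version="0.1",
        packages=["my_collectors"],
        entry_points=ENTRY_POINTS,
    )
```

In a real setup.py the setup() call would sit at module level; it is wrapped in a function here only so that the fragment can be imported without side effects.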

Running

% harvestd -h
usage: harvestd [-h] [-t host[:port]] [-i seconds] [-e collector]
                [-d collector] [-c path] [-n] [--debug]

Collect and dispatch various metrics to carbon daemon.

optional arguments:
  -h, --help            show this help message and exit
  -t host[:port], --carbon host[:port]
                        host[:port] (default port: 2003, can be overidden via
                        config file) of carbon tcp line-receiver destination.
  -i seconds, --interval seconds
                        Interval between collecting and sending the
                        datapoints.
  -e collector, --enable collector
                        Enable only the specified metric collectors, can be
                        specified multiple times.
  -d collector, --disable collector
                        Explicitly disable specified metric collectors, can be
                        specified multiple times. Overrides --enabled.
  -c path, --config path
                        Configuration files to process. Can be specified more
                        than once. Values from the latter ones override values
                        in the former. Available CLI options override the
                        values in any config.
  -n, --dry-run         Do not actually send data.
  --debug               Verbose operation mode.

See also: [4]default harvestd.yaml configuration file, which contains
configuration for all loaded collectors and can/should be overridden
using the -c option.

Note that you don't have to specify all the options in each
override-config, just the ones you need to update.
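The override behaviour described above amounts to a recursive merge of nested mappings. A minimal sketch of that idea, assuming the configs are already parsed into plain dicts; this is not harvestd's actual merging code, and the "loop"/"interval" keys and the "localhost" default are invented for the example (only port 2003 comes from the help text).

```python
def merge_config(base, override):
    """Return a copy of base with values from override applied recursively."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are mappings: merge them instead of replacing.
            merged[key] = merge_config(merged[key], value)
        else:
            # Scalars, lists and new keys simply replace the old value.
            merged[key] = value
    return merged


# Hypothetical defaults and a small override that touches one key only.
defaults = {"carbon": {"host": "localhost", "port": 2003}, "loop": {"interval": 60}}
override = {"carbon": {"host": "carbon.example.host"}}

config = merge_config(defaults, override)
print(config)
```

Several -c files would then be folded in order, each one merged on top of the result so far, with CLI options applied last.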

For example, a simple configuration file (say, /etc/harvestd.yaml)
that only sets the carbon host and the log line format (dropping the
timestamp, since output will be piped to syslog or systemd-journal
anyway) might look like this:
carbon:
  host: carbon.example.host

logging:
  formatters:
    basic:
      format: '%(levelname)s :: %(name)s: %(message)s'

And be started like this: harvestd -c /etc/harvestd.yaml
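The format string in that config is a regular Python logging format. A small sketch of what a line formatted with it looks like, using only the standard logging module; the logger name "harvestd" and the message are assumptions for illustration, not actual daemon output.

```python
import logging

# Same format string as in the config example above (no timestamp field).
FORMAT = "%(levelname)s :: %(name)s: %(message)s"

# Build a log record by hand so that nothing is actually emitted.
record = logging.LogRecord(
    name="harvestd",  # hypothetical logger name
    level=logging.INFO,
    pathname="example.py",
    lineno=1,
    msg="Sending %d datapoints",
    args=(42,),
    exc_info=None,
)

formatted = logging.Formatter(FORMAT).format(record)
print(formatted)
```

This prints "INFO :: harvestd: Sending 42 datapoints"; syslog or the journal then adds its own timestamp.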

Rationale

Most other tools can (in theory) collect this data, and I've used
[5]collectd for most of these, but it:
* Doesn't provide some of the most useful stuff: nfs stats, disk
utilization time percentage, etc.
* Fails to collect some other stats, producing bullshit like zeroes,
clearly-insane or negative values (for io, network, sensors, ...).
* Has general-purpose plugins like "tail" that add a lot of complexity,
turning configuration into a mess, while still lacking some basic
functionality which 10 lines of code can easily provide.
* Mangles names for metrics, as provided by /proc and referenced in
kernel docs and on the internets, no idea what the hell for,
"readability"?

Initially I tried to implement these as collectd plugins, but its
python plugin turned out to be leaking RAM and collectd itself
segfaults something like once a day, even in the latest releases
(although probably because of a bug in some plugin).

Plus, collectd data requires post-processing anyway - proper metric
namespaces, counters, etc.

Given that the alternative is to just get the data and echo it as
"name val timestamp" to a tcp socket, I just don't see why I would need
all the extra complexity and fail that collectd provides.
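For reference, the "name val timestamp" approach really is about this small. A minimal sketch of writing datapoints to the carbon tcp line receiver (default port 2003, as in the help text above); the function names, the metric name and the host are hypothetical, and error handling and reconnects are left out.

```python
import socket
import time


def format_datapoint(name, value, timestamp=None):
    """Format one datapoint as a carbon plaintext line: 'name val timestamp'."""
    if timestamp is None:
        timestamp = time.time()
    return "{} {} {}\n".format(name, value, int(timestamp))


def send_datapoints(lines, host, port=2003, timeout=10):
    """Send already-formatted lines to a carbon line receiver over tcp."""
    payload = "".join(lines).encode("ascii")
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(payload)


# Hypothetical metric name; fixed timestamp to keep the output stable.
line = format_datapoint("hosts.myhost.forks", 2915, 1334188800)
print(line, end="")

# Needs a reachable carbon daemon, so it is not called here:
# send_datapoints([line], "carbon.example.host")
```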

Other than collectd, I've experimented with [6]ganglia, but its
static schema is a no-go and most of the stuff there doesn't make sense
in a graphite context.

The daemon binary is (weirdly) called "harvestd" because the "metricsd"
name is already used to refer to [7]another graphite-related daemon
(also, [8]there is "metrics" w/o "d", probably others), and is too
generic to be used w/o extra confusion, I think. That, and I seem to
lack the creativity to come up with a saner name ("reaperd" sounds too
MassEffect'ish these days).

References

1. http://www.freedesktop.org/wiki/Software/systemd
2. http://sebastien.godard.pagesperso-orange.fr/
3. https://github.com/mk-fg/trilobite
4. https://github.com/mk-fg/graphite-metrics/blob/master/graphite_metrics/harvestd.yaml
5. http://collectd.org/
6. http://ganglia.sourceforge.net/
7. https://github.com/kpumuk/metricsd
8. https://github.com/codahale/metrics

This version: 12.04.20 (Apr 12, 2012)
