The original BigBrother design, which introduced the great plain-text protocol, the clear web interface and much more that is still in use, was implemented in shell script. This not only limited performance because of the constant forking, it also required writing a lot of data to disk in order to persist state. As a side effect, it was possible to grep through those on-disk files to create custom reports.
The design of Xymon is optimised for speed and scaling. One consequence is that disk I/O is avoided whenever possible, so most transient state is held in memory only. Extracting additional information by grep’ing through saved status-logs is therefore no longer possible[1].
Fortunately Xymon has very powerful tools to query states and the collected data via CLI.
Note: All commands below are executed on the Xymon server directly to avoid dealing with IP restrictions and other security settings.
Get the full status-log for a test with xymondlog
To resemble the behaviour of grep’ing through log files on ancient BB we use the xymon tool to send the xymondlog HOSTNAME.TESTNAME message, which returns the full status-log for a single test[2]:
root@bb:~# /usr/lib/xymon/client/bin/xymon 127.0.0.1 "xymondlog bb.local.cpu" | head -n10
bb.local|cpu|green||1425021929|1425025242|1425027042|0|0|127.0.0.1||||Y|
green Fri Feb 27 12:20:37 MSK 2015 up: 05:52, 1 users, 80 procs, load=0.01
System clock is 0 seconds off
top - 12:20:37 up 5:52, 1 user, load average: 0.00, 0.01, 0.05
Tasks: 80 total, 1 running, 79 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.1 us, 0.1 sy, 0.0 ni, 99.9 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem: 355296 total, 152692 used, 202604 free, 33092 buffers
KiB Swap: 735228 total, 0 used, 735228 free, 52592 cached
The pipe-delimited first line contains hostname, testname, color, testflags, lastchange, logtime, validtime, … for the test (all details can be found in the xymon(1) man page).
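The first line can be split into named fields with a few lines of shell. The following is only a sketch: the field order is assumed from the xymondlog description in xymon(1) and should be checked against the installed version, the function name label_status_header is made up for illustration, and the sample line is the header from the output above.

```shell
# Hypothetical helper: print "name=value" pairs for the first line of a
# xymondlog response read from stdin.
# ASSUMPTION: field order as described for xymondlog in xymon(1).
label_status_header() {
    awk -F'|' '
        BEGIN {
            n = split("hostname testname color testflags lastchange logtime validtime acktime disabletime sender cookie ackmsg dismsg client", name, " ")
        }
        NR == 1 {
            for (i = 1; i <= n; i++) printf "%s=%s\n", name[i], $i
            exit
        }'
}

# Header line copied from the example output above.
printf '%s\n' 'bb.local|cpu|green||1425021929|1425025242|1425027042|0|0|127.0.0.1||||Y|' \
    | label_status_header

# Against a live server (not run here):
# /usr/lib/xymon/client/bin/xymon 127.0.0.1 "xymondlog bb.local.cpu" | label_status_header
```

Only the first line is labelled; the rest of the status-log is ignored by the helper.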
If we are interested in more than a single test, or want to extract data from the first line, we can use the xymondboard message instead of looping over xymondlog. See the next section.
Query state-info with various criteria using xymondboard
The exact syntax[3] is xymondboard [CRITERIA] [fields=FIELDLIST].
Get the state of all checks for host bb.local using the default value for fields (more on fields below):
root@bb:~# /usr/lib/xymon/client/bin/xymon 127.0.0.1 "xymondboard host=bb.local"
bb.local|trends|green||0|0|0|0|0|||
bb.local|info|green||0|0|0|0|0|||
bb.local|xymond|green||1425021922|1425024629|1425026429|0|0|xymond||green
bb.local|sslcert|green||1425021930|1425024640|1425026440|0|0|127.0.0.1||green Fri Feb 27 12:10:34 2015
bb.local|ssh|green|OrdastLe|1425021930|1425024640|1425026440|0|0|127.0.0.1||green <!-- [flags:OrdastLe] --> Fri Feb 27 12:10:34 2015 ssh ok
bb.local|xymongen|green||1425021924|1425024634|1425026434|0|0|127.0.0.1||green Fri Feb 27 12:10:34 2015
bb.local|memory|green||1425021929|1425024640|1425026440|0|0|127.0.0.1||green Fri Feb 27 12:10:34 MSK 2015 - Memory OK
bb.local|files|clear||1425021929|1425024640|1425026440|0|0|127.0.0.1||clear Fri Feb 27 12:10:34 MSK 2015 - Files ok
bb.local|msgs|green||1425021929|1425024640|1425026440|0|0|127.0.0.1||green Fri Feb 27 12:10:34 MSK 2015 - System logs ok
bb.local|ports|clear||1425021929|1425024640|1425026440|0|0|127.0.0.1||clear Fri Feb 27 12:10:34 MSK 2015 - Ports ok
bb.local|procs|clear||1425021929|1425024640|1425026440|0|0|127.0.0.1||clear Fri Feb 27 12:10:34 MSK 2015 - Processes ok
bb.local|inode|green||1425021929|1425024640|1425026440|0|0|127.0.0.1||green Fri Feb 27 12:10:34 MSK 2015 - Filesystems ok
bb.local|disk|green||1425021929|1425024640|1425026440|0|0|127.0.0.1||green Fri Feb 27 12:10:34 MSK 2015 - Filesystems ok
bb.local|cpu|green||1425021929|1425024640|1425026440|0|0|127.0.0.1||green Fri Feb 27 12:10:34 MSK 2015 up: 05:42, 1 users, 80 procs, load=0.01
bb.local|xymonnet|green||1425021930|1425024640|1425026440|0|0|127.0.0.1||green Fri Feb 27 12:10:34 2015
bb.local|http|green||1425021930|1425024640|1425026440|0|0|127.0.0.1||green Fri Feb 27 12:10:34 2015: OK
bb.local|bbd|green|OrdastLe|1425021930|1425024640|1425026440|0|0|127.0.0.1||green <!-- [flags:OrdastLe] --> Fri Feb 27 12:10:34 2015 bbd ok
bb.local|conn|green|OrdAstLe|1425021930|1425024640|1425026440|0|0|127.0.0.1||green <!-- [flags:OrdAstLe] --> Fri Feb 27 12:10:34 2015 conn ok
If the host=... filter is omitted, all hosts and checks are returned.
The CRITERIA can also be a
page or a
Filtering by test/check is also possible with test=...:
root@bb:~# /usr/lib/xymon/client/bin/xymon 127.0.0.1 "xymondboard host=bb.local test=cpu"
bb.local|cpu|green||1425021929|1425024640|1425026440|0|0|127.0.0.1||green Fri Feb 27 12:10:34 MSK 2015 up: 05:42, 1 users, 80 procs, load=0.01
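Because every record of this output is a single pipe-delimited line, small reports can be built with awk. As a sketch, the following counts the statuses per color. The function name count_colors is made up for illustration, the color is assumed to be the third column as in the examples above, and the sample records are copied from the earlier output.

```shell
# Hypothetical helper: count xymondboard records per color.
# ASSUMPTION: the color is the third pipe-delimited column.
count_colors() {
    awk -F'|' '
        NF >= 3 { count[$3]++ }
        END     { for (c in count) printf "%s %d\n", c, count[c] }' | sort
}

# Three records copied from the example output above.
count_colors <<'EOF'
bb.local|trends|green||0|0|0|0|0|||
bb.local|files|clear||1425021929|1425024640|1425026440|0|0|127.0.0.1||clear Fri Feb 27 12:10:34 MSK 2015 - Files ok
bb.local|cpu|green||1425021929|1425024640|1425026440|0|0|127.0.0.1||green Fri Feb 27 12:10:34 MSK 2015 up: 05:42, 1 users, 80 procs, load=0.01
EOF

# Against a live server (not run here):
# /usr/lib/xymon/client/bin/xymon 127.0.0.1 "xymondboard" | count_colors
```

For the three sample records this prints one line for clear (1) and one for green (2).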
With fields it is also possible to get direct access to particular state information without string-parsing the pipe-delimited output.
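The examples above used the default setting for fields, which returns the following default set of information: hostname, testname, color, flags, lastchange, logtime, validtime, acktime, disabletime, sender, cookie and line1.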
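Requesting only the fields that are needed keeps the parsing short. The sketch below assumes that fields=hostname,testname,color,lastchange returns exactly these four pipe-separated columns and that lastchange is a Unix timestamp, as the examples above suggest. The function name status_age is hypothetical.

```shell
# Hypothetical helper: read "hostname|testname|color|lastchange" records and
# print for how many full minutes each status has kept its current color.
# $1 = reference time as Unix timestamp (defaults to the current time).
# Records with lastchange 0 (e.g. trends, info) are skipped.
status_age() {
    now=${1:-$(date +%s)}
    awk -F'|' -v now="$now" '
        $4 > 0 { printf "%s.%s %s for %d min\n", $1, $2, $3, (now - $4) / 60 }'
}

# Columns 1, 2, 3 and 5 of two records from the example output above. The
# reference time is the logtime of the cpu record, so the result is reproducible.
status_age 1425024640 <<'EOF'
bb.local|cpu|green|1425021929
bb.local|trends|green|0
EOF

# Against a live server (not run here):
# /usr/lib/xymon/client/bin/xymon 127.0.0.1 \
#     "xymondboard host=bb.local fields=hostname,testname,color,lastchange" | status_age
```

With the sample input this reports the cpu status as green for 45 minutes (2711 seconds, rounded down).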
There are other fields like msg (returns the full message), flapinfo (advanced information about flapping), acklist (with details on acknowledgements for a test, such as the timestamp and the user who acknowledged it, …) and more.
So by using fields=msg to return the full message we can run a query similar to the xymondlog example above to get the whole message (the returned data has line breaks encoded as \n, which we replace for better readability):
root@bb:~# /usr/lib/xymon/client/bin/xymon 127.0.0.1 "xymondboard host=bb.local test=cpu fields=msg" | sed -e 's;\\n;\n;g' | head -n10
status bb,local.cpu green Fri Feb 27 12:15:36 MSK 2015 up: 05:47, 1 users, 82 procs, load=0.01
System clock is 0 seconds off
top - 12:15:36 up 5:47, 1 user, load average: 0.00, 0.01, 0.05
Tasks: 82 total, 1 running, 81 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.1 us, 0.1 sy, 0.0 ni, 99.9 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem: 355296 total, 153376 used, 201920 free, 33004 buffers
KiB Swap: 735228 total, 0 used, 735228 free, 52572 cached
As you can see, this is a valid Xymon status message that could be fed back into xymon (whereas xymondlog does not return a status bb,local.cpu ... line but a pipe-separated list of status values instead).
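The sed expression used above relies on the replacement \n being turned into a line break, which GNU sed does but other sed implementations may not. A more portable sketch of the same step uses awk. The function name unescape_msg is made up for illustration, and only the \n sequence is handled here.

```shell
# Hypothetical helper: turn the literal two-character sequence \n in the
# fields=msg output into real line breaks.
unescape_msg() {
    awk '{ gsub(/\\n/, "\n"); print }'
}

# Shortened message taken from the example above, with line breaks still encoded.
printf '%s\n' 'status bb,local.cpu green Fri Feb 27 12:15:36 MSK 2015 up: 05:47\nSystem clock is 0 seconds off' \
    | unescape_msg

# Against a live server (not run here):
# /usr/lib/xymon/client/bin/xymon 127.0.0.1 "xymondboard host=bb.local test=cpu fields=msg" | unescape_msg
```

Since each status arrives as one long line, the first line after unescaping is always the status ... line of that message.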
root@bb:~# /usr/lib/xymon/server/bin/xymon 127.0.0.1 "xymondboard test=fw fields=msg" | sed -e 's;\\n;\n;g' | egrep '^status|^file_md5'
(In real life this would probably also have a
page=dc1/mailservers filter or
Just one additional grep -v <desired-file-md5> would print the non-conforming entries only.
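To print only the offending hosts instead of all status lines, the host from the preceding status line can be remembered. This is a rough sketch under several assumptions: the messages have already been unescaped as in the pipeline above, each message starts with a status line, and the checksum appears somewhere on a line starting with file_md5. The function name md5_mismatch and the sample input are invented for illustration and are not real output.

```shell
# Hypothetical helper: print "host.test" for every status whose file_md5 line
# does not contain the wanted checksum.
# $1 = desired checksum.
# ASSUMPTION: input lines start with "status " or "file_md5".
md5_mismatch() {
    awk -v want="$1" '
        /^status /  { id = $2; gsub(/,/, ".", id) }   # bb,local.cpu -> bb.local.cpu
        /^file_md5/ { if (index($0, want) == 0) print id }'
}

# Invented sample input, for demonstration only.
md5_mismatch 0123abcd <<'EOF'
status web1,example,com.fw green Fri Feb 27 12:15:36 MSK 2015
file_md5 /etc/fw.rules 0123abcd
status web2,example,com.fw green Fri Feb 27 12:15:36 MSK 2015
file_md5 /etc/fw.rules ffff0000
EOF
```

With the invented input only web2.example.com.fw is printed, because its checksum differs from the wanted one.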
Check the firmware of your HBAs if you use my raid-monitor extension
root@bb:~# /usr/lib/xymon/server/bin/xymon 127.0.0.1 "xymondboard test=raid fields=msg" | sed -e 's;\\n;\n;g' | egrep '^FW version:'
The FW version: string needs to be adjusted of course. A similar approach can
be used to check the disk-firmware, HBA cache settings or raid-set
This article showed how to query the status logs of Xymon with the built-in CLI. This makes it possible to create custom reports from the collected data, to integrate with other systems such as a system-overview webpage shared with a wider audience, and much more.
The next article will show how to access the raw data sent by the clients.
Update (2015-04-11): Part 2 is available.
[1] The included compatibility module xymond_filestore allows writing state to disk in the old BB style. This is disabled by default in tasks.cfg for performance reasons and is not required for the techniques used in this article.
[2] Yes, I’m a bit nostalgic about hostnames.
[3] All examples below use xymondboard, which returns the data as plain text without markup, whereas xymondxboard returns the data in XML format.