The original BigBrother design, which introduced the great plain-text protocol, the clear web interface and much more that is still in use, was implemented in shell script. This not only limited performance because of the constant forking, it also required writing a lot of data to disk in order to persist state. As a side effect, it was possible to grep through those on-disk files to create custom reports.

The design of Xymon is optimised for speed and scaling. One consequence is that disk I/O is avoided wherever possible, so most transient state is held in memory only. Extracting additional information by grep’ing through saved status-logs is therefore no longer possible [1].

Fortunately, Xymon has very powerful tools to query state and the collected data via the CLI.

Note: All commands below are executed on the Xymon server directly to avoid dealing with IP restrictions and other security settings.

Get the full status-log for a test with xymondlog

To mimic grep’ing through log files on ancient BB, we use the xymon tool to send the message xymondlog HOSTNAME.TESTNAME, which returns the full status-log for a single test [2]:

root@bb:~# /usr/lib/xymon/client/bin/xymon 127.0.0.1 "xymondlog bb.local.cpu"  | head -n10
bb.local|cpu|green||1425021929|1425025242|1425027042|0|0|127.0.0.1||||Y|
green Fri Feb 27 12:20:37 MSK 2015 up: 05:52, 1 users, 80 procs, load=0.01
System clock is 0 seconds off


top - 12:20:37 up  5:52,  1 user,  load average: 0.00, 0.01, 0.05
Tasks:  80 total,   1 running,  79 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.1 us,  0.1 sy,  0.0 ni, 99.9 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem:    355296 total,   152692 used,   202604 free,    33092 buffers
KiB Swap:   735228 total,        0 used,   735228 free,    52592 cached

The pipe-delimited first line contains hostname, testname, color, testflags, lastchange, logtime, validtime, … for the test (all details can be found in man 1 xymon).
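The header line can be split on the pipe character with standard tools. The following is only a minimal sketch: the function name xymondlog_header is made up for this example, and it assumes the field order documented in man 1 xymon, with lastchange as the fifth field.

```shell
# Hypothetical helper (not part of Xymon): print selected header fields
# of a xymondlog response. Reads the whole response on stdin and only
# looks at the first, pipe-delimited line.
xymondlog_header() {
    awk -F'|' 'NR == 1 {
        # Assumed field positions: 1=hostname 2=testname 3=color 5=lastchange
        printf "host=%s test=%s color=%s lastchange=%s\n", $1, $2, $3, $5
        exit
    }'
}

# Usage on the Xymon server:
# /usr/lib/xymon/client/bin/xymon 127.0.0.1 "xymondlog bb.local.cpu" | xymondlog_header
```

For the example output above this would report host bb.local, test cpu, color green and the lastchange timestamp 1425021929.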

If we are interested in more than a single test or want to extract data from the first line we can use the xymondboard-message instead of looping and parsing with xymondlog. See the next section.

Query state-info with various criteria using xymondboard

The exact syntax is xymondboard [CRITERIA] [fields=FIELDLIST] [3].

Get the state of all checks for host bb.local using the default value for fields (more on fields below):

root@bb:~# /usr/lib/xymon/client/bin/xymon 127.0.0.1 "xymondboard host=bb.local"
bb.local|trends|green||0|0|0|0|0|||
bb.local|info|green||0|0|0|0|0|||
bb.local|xymond|green||1425021922|1425024629|1425026429|0|0|xymond||green
bb.local|sslcert|green||1425021930|1425024640|1425026440|0|0|127.0.0.1||green Fri Feb 27 12:10:34 2015
bb.local|ssh|green|OrdastLe|1425021930|1425024640|1425026440|0|0|127.0.0.1||green <!-- [flags:OrdastLe] --> Fri Feb 27 12:10:34 2015 ssh ok
bb.local|xymongen|green||1425021924|1425024634|1425026434|0|0|127.0.0.1||green Fri Feb 27 12:10:34 2015
bb.local|memory|green||1425021929|1425024640|1425026440|0|0|127.0.0.1||green Fri Feb 27 12:10:34 MSK 2015 - Memory OK
bb.local|files|clear||1425021929|1425024640|1425026440|0|0|127.0.0.1||clear Fri Feb 27 12:10:34 MSK 2015 - Files ok
bb.local|msgs|green||1425021929|1425024640|1425026440|0|0|127.0.0.1||green Fri Feb 27 12:10:34 MSK 2015 - System logs ok
bb.local|ports|clear||1425021929|1425024640|1425026440|0|0|127.0.0.1||clear Fri Feb 27 12:10:34 MSK 2015 - Ports ok
bb.local|procs|clear||1425021929|1425024640|1425026440|0|0|127.0.0.1||clear Fri Feb 27 12:10:34 MSK 2015 - Processes ok
bb.local|inode|green||1425021929|1425024640|1425026440|0|0|127.0.0.1||green Fri Feb 27 12:10:34 MSK 2015 - Filesystems ok
bb.local|disk|green||1425021929|1425024640|1425026440|0|0|127.0.0.1||green Fri Feb 27 12:10:34 MSK 2015 - Filesystems ok
bb.local|cpu|green||1425021929|1425024640|1425026440|0|0|127.0.0.1||green Fri Feb 27 12:10:34 MSK 2015 up: 05:42, 1 users, 80 procs, load=0.01
bb.local|xymonnet|green||1425021930|1425024640|1425026440|0|0|127.0.0.1||green Fri Feb 27 12:10:34 2015
bb.local|http|green||1425021930|1425024640|1425026440|0|0|127.0.0.1||green Fri Feb 27 12:10:34 2015: OK
bb.local|bbd|green|OrdastLe|1425021930|1425024640|1425026440|0|0|127.0.0.1||green <!-- [flags:OrdastLe] --> Fri Feb 27 12:10:34 2015 bbd ok
bb.local|conn|green|OrdAstLe|1425021930|1425024640|1425026440|0|0|127.0.0.1||green <!-- [flags:OrdAstLe] --> Fri Feb 27 12:10:34 2015 conn ok

If the host=... filter is omitted, all hosts and checks are returned. The CRITERIA can also select by page or by color.
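Once the whole board is returned, a quick summary per color is often more useful than the raw list. A small sketch, assuming the default field list where the color is the third field; count_by_color is a made-up name for this example:

```shell
# Hypothetical helper (not part of Xymon): count statuses per color.
# Expects xymondboard output with the default fields on stdin.
count_by_color() {
    awk -F'|' 'NF >= 3 { n[$3]++ }
               END { for (c in n) print c, n[c] }' | sort
}

# Usage on the Xymon server:
# /usr/lib/xymon/client/bin/xymon 127.0.0.1 "xymondboard" | count_by_color
# /usr/lib/xymon/client/bin/xymon 127.0.0.1 "xymondboard color=red" | count_by_color
```

The second call restricts the query on the server side with the color criterion, so only the matching statuses are transferred and counted.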

Filtering by test/check is also possible with test=...:

root@bb:~# /usr/lib/xymon/client/bin/xymon 127.0.0.1 "xymondboard host=bb.local test=cpu"
bb.local|cpu|green||1425021929|1425024640|1425026440|0|0|127.0.0.1||green Fri Feb 27 12:10:34 MSK 2015 up: 05:42, 1 users, 80 procs, load=0.01

With fields it is also possible to get direct access to particular state information without string-parsing the full pipe-delimited output. The examples above used the default setting for fields, which returns the following set of information: hostname,testname,color,flags,lastchange,logtime,validtime,acktime,disabletime,sender,cookie,line1.
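Requesting only the fields that are needed keeps the parsing trivial. As a sketch, the following loop reads lines of the form hostname|testname|color and prints everything that is not green; the function name list_non_green is an assumption made for this example.

```shell
# Hypothetical helper (not part of Xymon): report every status that is
# not green. Expects "hostname|testname|color" lines on stdin, i.e. the
# output of a query with fields=hostname,testname,color.
list_non_green() {
    while IFS='|' read -r host test color; do
        [ -n "$host" ] || continue          # skip empty lines
        [ "$color" = "green" ] || printf '%s.%s is %s\n' "$host" "$test" "$color"
    done
}

# Usage on the Xymon server:
# /usr/lib/xymon/client/bin/xymon 127.0.0.1 "xymondboard fields=hostname,testname,color" | list_non_green
```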

There are other fields like msg (returns the full message), flapinfo (with advanced information about flapping), acklist (with details on acknowledgements for a test with timestamp, user who asked, …) and more.

So by using fields=msg to return the full message, we can run a query similar to the xymondlog example above and get the whole message (the returned data encodes line breaks as a literal \n, which we replace right away for better readability):

root@bb:~# /usr/lib/xymon/client/bin/xymon 127.0.0.1 "xymondboard host=bb.local test=cpu fields=msg" | sed -e 's;\\n;\n;g' | head -n10
status bb,local.cpu green Fri Feb 27 12:15:36 MSK 2015 up: 05:47, 1 users, 82 procs, load=0.01
System clock is 0 seconds off


top - 12:15:36 up  5:47,  1 user,  load average: 0.00, 0.01, 0.05
Tasks:  82 total,   1 running,  81 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.1 us,  0.1 sy,  0.0 ni, 99.9 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem:    355296 total,   153376 used,   201920 free,    33004 buffers
KiB Swap:   735228 total,        0 used,   735228 free,    52572 cached

As you can see, this is a valid Xymon status message that could be fed back into Xymon: it starts with a status bb,local.cpu ... line, whereas xymondlog returns a pipe-separated list of status values as its first line.
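The sed expression used above relies on GNU sed turning \n in the replacement into a real newline, which other sed implementations do not necessarily do. If that matters, the same replacement can be sketched with awk; the function name unescape_msg is made up for this example.

```shell
# Hypothetical helper (not part of Xymon): turn the literal two-character
# sequence backslash-n in fields=msg output into real line breaks.
unescape_msg() {
    awk '{ gsub(/\\n/, "\n"); print }'
}

# Usage on the Xymon server:
# /usr/lib/xymon/client/bin/xymon 127.0.0.1 "xymondboard host=bb.local test=cpu fields=msg" | unescape_msg | head -n10
```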

Practical Examples

Check if all systems have the same firewall ruleset installed if you happen to use Shorewall with my shorewall-monitor extension

root@bb:~# /usr/lib/xymon/server/bin/xymon 127.0.0.1 "xymondboard test=fw fields=msg" | sed -e  's;\\n;\n;g' | egrep '^status|^file_md5'

(In real life this would probably also have a page=dc1/mailservers filter or similar.)

Just one additional grep -v <desired-file-md5> would leave only the non-conforming file_md5 entries (next to the status lines).
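To see only the hosts that deviate, the status line can be remembered and printed together with a mismatching checksum. This is just a sketch under assumptions: it expects each message to contain a line starting with status, followed later by a line starting with file_md5 that contains the checksum. The helper name fw_mismatches is made up, and the real layout of the extension's output may differ.

```shell
# Hypothetical helper (not part of Xymon or the shorewall-monitor extension):
# print status/file_md5 pairs whose file_md5 line does not contain the
# expected checksum given as first argument.
# Expects the unescaped message text (real line breaks) on stdin.
fw_mismatches() {
    awk -v want="$1" '
        /^status/   { status = $0 }          # remember the current host/test
        /^file_md5/ { if (index($0, want) == 0) { print status; print $0 } }
    '
}

# Usage on the Xymon server (replace <desired-file-md5> with your checksum):
# /usr/lib/xymon/server/bin/xymon 127.0.0.1 "xymondboard test=fw fields=msg" | sed -e 's;\\n;\n;g' | fw_mismatches <desired-file-md5>
```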

Check the firmware of your HBAs if you use my raid-monitor extension

root@bb:~# /usr/lib/xymon/server/bin/xymon 127.0.0.1 "xymondboard test=raid fields=msg" | sed -e  's;\\n;\n;g' | egrep '^FW version:'

The FW version: string needs to be adjusted of course. A similar approach can be used to check the disk-firmware, HBA cache settings or raid-set configuration.
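For a quick overview of how many controllers run which firmware, the matching lines can be counted. A small sketch, assuming the unescaped message text on stdin and lines that start with FW version: as in the example above; fw_version_summary is a made-up name.

```shell
# Hypothetical helper (not part of Xymon or the raid-monitor extension):
# count how often each distinct "FW version:" line occurs, most frequent first.
fw_version_summary() {
    awk '/^FW version:/ { n[$0]++ }
         END { for (v in n) printf "%d %s\n", n[v], v }' | sort -rn
}

# Usage on the Xymon server:
# /usr/lib/xymon/server/bin/xymon 127.0.0.1 "xymondboard test=raid fields=msg" | sed -e 's;\\n;\n;g' | fw_version_summary
```

A version that shows up only once or twice in this list is usually the one worth a closer look.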

Conclusions

This article showed how to query the status logs of Xymon with the built-in CLI. This makes it possible to create custom reports from the collected data, to integrate with other systems such as a system-overview webpage shared with a wider audience, and much more.

The next article will show how to access the raw data sent by the clients.

Update (2015-04-11): Part 2 is available.


  1. The included compatibility module xymond_filestore can write state to disk in the old BB style. It is disabled by default in tasks.cfg for performance reasons and is not required for the techniques used in this article.

  2. Yes, I’m a bit nostalgic with hostnames.

  3. All examples below use xymondboard, which returns the data as plain text without markup, whereas xymondxboard returns the data in XML format.