Saturday, September 10, 2011

Using xymon to monitor status of cfengine


While deploying cfengine, one of my biggest concerns was how to make sure that it is working as expected. Obviously there are multiple elements in the engine itself that can alert on many issues or, even better, fix them.

The free version doesn't provide reporting or trends, so there is no visualization, but it exposes enough to build external analyzers and reports. External tests will also give you a bit more confidence that everything is OK (or that something is wrong).

As my main monitoring platform I'm using xymon, and its functionality can easily be extended.

Initially I'd like to ensure that all agents are alive and really talking to the server(s).
In my case I expect a connection to be established approximately every 5 min (the default behavior), so the "last seen" value for each agent should be less than 5 min + "splaytime".
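
To make the threshold rule concrete, here is a minimal sketch of the check. It is not the code of the extension itself: the function names, the host names and the 1 min splaytime are assumptions for illustration only, and the "last seen" ages are passed in as a plain dictionary rather than read from cfengine. The output follows the usual xymon status message layout ("status HOST.TEST COLOR text", with dots in the reporting host name replaced by commas).

```python
def find_stale_agents(last_seen_minutes, interval=5, splaytime=1):
    """Return (limit, stale) where stale maps host -> age for late agents.

    last_seen_minutes: dict of agent name -> minutes since last contact.
    interval and splaytime are in minutes; splaytime=1 is an assumed value.
    """
    limit = interval + splaytime
    stale = {}
    for host, age in last_seen_minutes.items():
        if age > limit:
            stale[host] = age
    return limit, stale


def build_status(hostname, testname, last_seen_minutes, interval=5, splaytime=1):
    """Build the text of a xymon status message for the cfengine check."""
    limit, stale = find_stale_agents(last_seen_minutes, interval, splaytime)
    color = "red" if stale else "green"
    # xymon expects dots in the reporting host name to be sent as commas
    header = "status %s.%s %s cfengine agents last seen (limit %g min)" % (
        hostname.replace(".", ","), testname, color, limit)
    lines = [header]
    for host in sorted(last_seen_minutes):
        marker = "&red" if host in stale else "&green"
        lines.append("%s %s last seen %.1f min ago"
                     % (marker, host, last_seen_minutes[host]))
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical agents and ages, for illustration only
    ages = {"web1.example.com": 2.0, "db1.example.com": 17.5}
    print(build_status("cfserver.example.com", "cfengine", ages))
```

The resulting text would then be handed to the xymon client for delivery to the xymon server. A single late agent turns the whole test red here; a softer policy (yellow first, red after several missed intervals) may fit better if your agents are often a little late.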

The code of the extension is available for download from its Google Code page.

Or check out the most current version from svn:
svn co https://abris.googlecode.com/svn/trunk/xymon-ext/cfengine

Requirements:
  • cfengine3
  • python 2.6+


Example of a healthy chart, as seen from the cfserver:


Significant spikes in the chart indicate that you need to check the status of the suspicious agent.

I'm planning to add more features to this test, so stay tuned.
