How to monitor a cluster without a virtual IP

I have two servers in a cluster without a cluster IP; we reach the service through a URL whose DNS name resolves to the active server. Each server has its own virtual IP, which becomes active only while that server is active.

Limitations I found: first, I cannot add a URL as a node interface. Second, service availability is reported as 50% if one of a node's two interfaces is down.

Please provide a solution to monitor service availability of this cluster.


OpenNMS version
asked May 24, 2017 by AR (150 points)

1 Answer

Is the URL you reach something you can monitor with the Page Sequence Monitor?

Just some food for thought:

  • Create a virtual node with a black-box test for the service your cluster provides. Check whether the Page Sequence Monitor (PSM) fits: it lets you use a dummy IP interface and resolve the hostname through DNS instead. See the hint in the PSM docs. This test tells you whether the service provided by the cluster is still working overall; a failure here is an outage and a critical issue.
  • Create a node for each cluster member and add tests that run against each member individually, so you can see service degradation.
  • Use the Business Service Monitoring (BSM) feature and create a Business Service as the root, representing the high-level service provided by your cluster.
  • Assign the PSM service on your virtual node as a child and set the "Identity To" to "Critical" as input.
  • Assign the two cluster members to the Business Service and set the "Identity To" to a lower severity, for example "Minor" or "Warning".
The result would be that as soon as the high-level black-box test goes down, the Business Service goes Critical.
If the black-box test is still functioning but one of your cluster members has issues, your Business Service goes to Warning.
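To make the intended behaviour concrete, here is a minimal sketch of the severity logic described above. It is plain Python, not OpenNMS code: the names Severity, edge_severity and business_service_severity are made up for illustration, and it assumes the Business Service simply takes the highest severity reported by any of its children.

```python
from enum import IntEnum


class Severity(IntEnum):
    # Illustrative ordering only; higher value means more severe.
    NORMAL = 0
    WARNING = 1
    MINOR = 2
    MAJOR = 3
    CRITICAL = 4


def edge_severity(is_down: bool, mapped_to: Severity) -> Severity:
    """Severity one child contributes: the mapped value when down, else NORMAL."""
    return mapped_to if is_down else Severity.NORMAL


def business_service_severity(
    black_box_down: bool,
    members_down: list[bool],
    member_severity: Severity = Severity.WARNING,
) -> Severity:
    """Combine the black-box test and the per-member tests.

    Assumption: the root Business Service reports the highest severity
    among its children.
    """
    # The black-box test on the virtual node is mapped to CRITICAL.
    edges = [edge_severity(black_box_down, Severity.CRITICAL)]
    # Each cluster member is mapped to a lower severity.
    edges += [edge_severity(down, member_severity) for down in members_down]
    return max(edges)


if __name__ == "__main__":
    # Cluster still serves requests, but one member is down.
    print(business_service_severity(False, [True, False]).name)
    # The black-box test fails: the service itself is unavailable.
    print(business_service_severity(True, [False, False]).name)
```

With these assumptions, a single failed member only raises the lower severity, while a failed black-box test always wins because it is mapped to the highest value.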
I would recommend upgrading to the latest version, 19.1.0, which includes a lot of bug fixes, because the BSM feature was newly introduced in 18.
Docs to the Business Service Monitoring features:
answered Jun 2, 2017 by indigo (11,480 points)