check_ilo2_health
Check the hardware health of HP ProLiant servers by querying the iLO2|3|4|5 management controller. No SNMP and no additional software installation on the server is needed.
Checks whether all sensors are OK. Returns WARNING on high temperatures and fan failures, and CRITICAL on overall health failure.
A Perl plugin using Nagios::Plugin, IO::Socket::SSL and XML::Simple.
The plugin makes use of the HP Lights-Out XML scripting interface.
HP provides some Perl scripting samples: http://h18013.www1.hp.com/support/files/lights-out/us/download/25057.html
Usage:
check_ilo2_health.pl -H host -u username -p password
Additional options:
-e: ignore "syntax error" messages in the XML output. This may help with older firmware versions.
-n: output without temperature listing.
-d: add PerfParse compatible temperature output.
-v: print out the full XML output from the BMC.
-3: support for iLO3|4.
-a: check fan redundancy (only some models).
-c: check drive bays (only some models).
-o: check power redundancy (only some models).
-b: temperature output with location.
-l: parse the iLO event log.
Howto:
First, check that you can reach the management controller with a web browser. The plugin only works if the HTTPS interface is reachable.
Install the Perl modules Nagios::Plugin, IO::Socket::SSL and XML::Simple. Copy the plugin to your Nagios plugin directory and make sure that the nagios user can execute it.
Put this in your Nagios config:
define command {
command_name check_ilo2_health.pl
command_line $USER1$/check_ilo2_health.pl -u $USER10$ -p $USER11$ -H $HOSTADDRESS$
}
This assumes that $USER1$ contains the path to the plugin directory, $USER10$ the username and $USER11$ the password for the management controller.
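As a rough sketch, the matching entries in the Nagios resource file (commonly resource.cfg) might look like the following. The path, username and password are placeholders, not values from the plugin documentation; adjust them to your installation.

```
# resource.cfg (all values below are placeholders)
# Directory that contains check_ilo2_health.pl
$USER1$=/usr/local/nagios/libexec
# iLO account used by the plugin (hypothetical name)
$USER10$=monitor
# Password for that iLO account (hypothetical value)
$USER11$=secret
```

Keeping the credentials in the resource file rather than in the command definition means they do not show up in the object configuration. The resource file should be readable only by the nagios user.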
Set up the appropriate services.
Hint: In my Nagios setup, every management controller has its own host definition. So every ProLiant server with host_name foo has a management controller with host_name foo-ilo2.
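A minimal sketch of such a host definition, together with a service that uses the command defined above, could look like this. The template names generic-host and generic-service, the address and the service description are assumptions for illustration; the templates are expected to supply the remaining required directives (check period, contacts and so on).

```
define host {
    use        generic-host            ; assumed template name
    host_name  foo-ilo2
    alias      Management controller of foo
    address    192.0.2.10              ; placeholder address of the iLO interface
}

define service {
    use                  generic-service   ; assumed template name
    host_name            foo-ilo2
    service_description  iLO Hardware Health
    check_command        check_ilo2_health.pl
}
```

Because the command line uses $HOSTADDRESS$, the address of the foo-ilo2 host must be the address of the management controller, not of the server itself.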