check_ilo2_health

Check hardware health of HP ProLiant servers by querying the iLO2|3|4 management controller. No need for SNMP or for installing software on the server.

Checks whether all sensors are OK. Returns WARNING on high temperatures and fan failures, and CRITICAL on overall health failure.
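
Nagios reads a plugin's result from its exit code. By the standard plugin convention, 0 means OK, 1 WARNING, 2 CRITICAL and 3 UNKNOWN. A minimal shell sketch for interpreting the result when running the plugin by hand; nagios_state is a hypothetical helper and not part of the plugin:

```shell
# Hypothetical helper: translate a Nagios plugin exit code into its state name.
nagios_state() {
  case "$1" in
    0) echo "OK" ;;
    1) echo "WARNING" ;;   # e.g. high temperature or fan failure
    2) echo "CRITICAL" ;;  # e.g. overall health failure
    *) echo "UNKNOWN" ;;   # 3 and anything unexpected
  esac
}

# Example (not executed here):
#   ./check_ilo2_health.pl -H foo-ilo2 -u username -p password
#   nagios_state $?
```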

A Perl plugin using Nagios::Plugin, IO::Socket::SSL and XML::Simple.

The plugin makes use of the HP Lights-Out XML scripting interface.
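
For orientation, a request to the XML scripting interface (RIBCL) is a small XML document that logs in and asks for data. The sketch below is modeled on HP's published RIBCL samples. Whether the plugin sends exactly this document is an assumption, and ribcl_health_request is a hypothetical helper:

```shell
# Hypothetical helper: print a RIBCL request that asks for the embedded health data.
# $1 = iLO user name, $2 = password.
# Assumption: neither value contains XML special characters (no escaping is done).
ribcl_health_request() {
  cat <<EOF
<?xml version="1.0"?>
<RIBCL VERSION="2.0">
  <LOGIN USER_LOGIN="$1" PASSWORD="$2">
    <SERVER_INFO MODE="read">
      <GET_EMBEDDED_HEALTH/>
    </SERVER_INFO>
  </LOGIN>
</RIBCL>
EOF
}

# Example (not executed here): ribcl_health_request username password
```

The function only prints the request. Sending it over the HTTPS interface is left to the plugin.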

HP provides some Perl scripting samples: http://h18013.www1.hp.com/support/files/lights-out/us/download/25057.html


Usage:

check_ilo2_health.pl -H host -u username -p password

Additional options:

-e: ignore "syntax error" messages in the XML output. This may help with older firmware versions.

-n: output without temperature listing.

-d: add PerfParse-compatible temperature output.

-v: print out the full XML output from the BMC.

-3: support for iLO3|4

-a: check fan redundancy (only some models)

-c: check drive bays (only some models)

-o: check power redundancy (only some models)

-b: temperature output with location

-l: parse iLO eventlog
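
The options above can be combined on one command line. A minimal sketch that adds -3 only for iLO3 and iLO4 controllers; ilo_check_args is a hypothetical helper, and the host name and credentials in the example are placeholders:

```shell
# Hypothetical helper: build the plugin arguments from the iLO generation.
# $1 = generation (2, 3 or 4), $2 = host, $3 = user name, $4 = password.
# Assumption: none of the values contain whitespace.
ilo_check_args() {
  gen=$1
  args="-H $2 -u $3 -p $4"
  case "$gen" in
    3|4) args="$args -3" ;;   # -3 enables iLO3|4 support
  esac
  printf '%s\n' "$args"
}

# Example (not executed here), also checking fan and power redundancy:
#   ./check_ilo2_health.pl $(ilo_check_args 4 foo-ilo2 username password) -a -o
```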


Howto:

First, check that you can reach the management controller with a web browser. The plugin only works if the HTTPS interface is reachable.
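
On a monitoring host without a browser, the same reachability test can be sketched with curl. ilo_https_reachable is a hypothetical helper, and the 10 second timeout is an arbitrary choice:

```shell
# Hypothetical helper: succeed if the controller answers on HTTPS within 10 seconds.
# $1 = host name or address of the management controller.
ilo_https_reachable() {
  # -k: accept the controller's certificate even if it is self-signed
  # -s: no progress output, -o /dev/null: discard the page body
  curl -k -s -o /dev/null --max-time 10 "https://$1/"
}

# Example (not executed here):
#   ilo_https_reachable foo-ilo2 && echo "HTTPS interface reachable"
```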

Install the Perl modules Nagios::Plugin, IO::Socket::SSL and XML::Simple. Copy the plugin to your Nagios plugin directory and make sure that the nagios user can execute it.
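
A quick way to confirm that the modules are installed is to let Perl try to load each one. perl_module_available is a hypothetical helper. The cpan call and the plugin directory in the comments are assumptions about your setup:

```shell
# Hypothetical helper: succeed if the given Perl module can be loaded.
perl_module_available() {
  perl -M"$1" -e 1 2>/dev/null
}

# Example (not executed here):
#   for m in Nagios::Plugin IO::Socket::SSL XML::Simple; do
#     perl_module_available "$m" || echo "missing: $m"
#   done
#   cpan Nagios::Plugin IO::Socket::SSL XML::Simple    # one way to install them
#   install -m 0755 check_ilo2_health.pl /usr/local/nagios/libexec/   # assumed plugin directory
```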

Put this in your Nagios config:

    define command {
        command_name check_ilo2_health.pl
        command_line $USER1$/check_ilo2_health.pl -u $USER10$ -p $USER11$ -H $HOSTADDRESS$
    }

This assumes that $USER1$ contains the path to the plugin, $USER10$ the username and $USER11$ the password for the management controller.

Set up the appropriate services.

Hint: In my Nagios setup, every management controller has its own host definition. So every ProLiant server with host_name foo has a management controller with host_name foo-ilo2.
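
With that naming scheme, the service definitions can be generated per server. A minimal sketch: ilo_service_definition is a hypothetical helper, the service description is a placeholder, and generic-service is assumed to be a service template that exists in your configuration (as in the Nagios sample config):

```shell
# Hypothetical helper: print a service definition for the iLO of server $1.
# Assumes the foo / foo-ilo2 naming from the hint and a service template
# named generic-service.
ilo_service_definition() {
  cat <<EOF
define service {
    use                     generic-service
    host_name               ${1}-ilo2
    service_description     Hardware Health
    check_command           check_ilo2_health.pl
}
EOF
}

# Example (not executed here): ilo_service_definition foo >> ilo_services.cfg
```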