check_nmon

CHECK_NMON.PL

This script reads a data file generated by NMON and returns the values for the key/subkey parameters passed to it. Option -t only reads the data file and returns the list of keys and subkeys it contains. This option is useful at the start, to determine exactly what can be monitored.
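
The key/subkey lookup described above can be illustrated with a minimal Python sketch. It is not the script's actual implementation. It assumes the usual NMON CSV layout, where the first row of a section names its columns (the subkeys) and later rows carry a snapshot tag such as T0001 followed by the values. The function name read_nmon_metrics is hypothetical.

```python
import re

SNAPSHOT_TAG = re.compile(r"^T\d+$")
# Sections that carry metadata or timestamps rather than metrics (assumption).
SKIPPED_PREFIXES = ("AAA", "BBB", "ZZZZ")


def read_nmon_metrics(lines):
    """Return (headers, latest) built from NMON CSV lines.

    headers maps each key to its list of subkey names.
    latest maps each key to {subkey: value of the last snapshot}.
    """
    headers = {}
    latest = {}
    for raw in lines:
        fields = raw.rstrip("\n").split(",")
        if len(fields) < 3 or fields[0].startswith(SKIPPED_PREFIXES):
            continue
        key, tag, values = fields[0], fields[1], fields[2:]
        if SNAPSHOT_TAG.match(tag):
            if key not in headers:
                continue  # data row without a known header: ignore it
            row = {}
            for name, value in zip(headers[key], values):
                try:
                    row[name] = float(value)
                except ValueError:
                    pass  # keep only numeric values
            latest[key] = row
        elif key not in headers:
            headers[key] = values  # first non-data row names the subkeys
    return headers, latest
```

Listing the contents of headers corresponds roughly to what test mode (-t) reports, and latest[key][subkey] to the value returned for a given -k/-s pair.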

It is possible to check thresholds and generate warning or critical alarms for:

  • % occupied CPU.
  • % occupied MEM.
  • % disk busy time.
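
As an illustration of the threshold checks listed above, here is a minimal Python sketch. It assumes the standard Nagios plugin exit codes (0 OK, 1 WARNING, 2 CRITICAL) and that a value equal to or above a threshold triggers the alarm. Whether check_nmon.pl compares with >= or > is an assumption, and check_threshold is a hypothetical name.

```python
# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL = 0, 1, 2


def check_threshold(value, warning, critical):
    """Map a percentage to a Nagios status (assumes higher is worse)."""
    if value >= critical:
        return CRITICAL
    if value >= warning:
        return WARNING
    return OK
```

For example, with -w 80 -c 90 a CPU occupation of 85 % would map to WARNING under these assumptions.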

Options:

-F: NMON data file.
-k: Comma separated metric keys to report.
-s: Comma separated metric subkeys to report.
-h: Displays help message.
-d: Debug mode on. Prints debug messages.
-t: Test mode. Lists the metrics and sub-metrics in the specified file.
-y: Check type. Possible values: CPU, MEM, DISKBUSY.
-w: Warning threshold.
-c: Critical threshold.

NMOND.PL

This script controls the NMON process. If NMON is already running, the script does nothing unless the date on which NMON was launched (obtained from the NMON output file) is not today's date; in that case it stops NMON. This is intended to rotate the NMON log files.
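
The rotation decision can be sketched in Python as follows. This is only an illustration. It assumes the launch date is stored in the output file in a line of the form AAA,date,DD-MON-YYYY (the format NMON commonly writes), and the function names are hypothetical.

```python
from datetime import date, datetime


def nmon_start_date(lines):
    """Return the launch date recorded in the NMON file, or None."""
    for line in lines:
        fields = line.strip().split(",")
        if len(fields) >= 3 and fields[:2] == ["AAA", "date"]:
            # Assumed format, e.g. 05-MAR-2021 (English month names).
            return datetime.strptime(fields[2], "%d-%b-%Y").date()
    return None


def should_stop(lines, today):
    """True when NMON was launched on a day other than today."""
    started = nmon_start_date(lines)
    return started is not None and started != today
```

A caller would pass date.today() as the second argument and stop NMON only when should_stop returns True.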

If NMON is not running, the script launches it. Before launching, if the NMON output file (hostname.nmon) exists in $nmon_dir (configured as a global variable), it is archived. One archive file is kept for each day of the month, so archive files for the same day of the month get overwritten.
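
The archiving scheme can be pictured with the following Python sketch. The archive file name (hostname.nmon.DD) is purely hypothetical, since the actual naming used by nmond.pl is not described here; only the "one file per day of the month, overwritten on reuse" behaviour is taken from the text above.

```python
from pathlib import Path


def archive_path(nmon_dir, hostname, day_of_month):
    """Hypothetical archive name: one slot per day of the month (01..31)."""
    return Path(nmon_dir) / f"{hostname}.nmon.{day_of_month:02d}"


def archive_output(nmon_dir, hostname, day_of_month):
    """Move hostname.nmon to its archive slot, overwriting an older archive."""
    current = Path(nmon_dir) / f"{hostname}.nmon"
    if not current.exists():
        return None  # nothing to archive
    target = archive_path(nmon_dir, hostname, day_of_month)
    current.replace(target)  # replaces any existing file for that day
    return target
```

With at most 31 slots, the archive directory never holds more than about a month of history.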

The NMON command is launched in recording mode with the following flags:

-F: File where NMON writes its output. It is $nmon_dir/hostname.nmon. $nmon_dir is configured as a global variable.
-t: Include top processes.
-S: Include WLM sections.
-P: Include paging space section.
-V: Include disk volume group section.
-s: Interval in seconds between 2 snapshots. It is set to take a snapshot every 60 seconds.
-c: Number of snapshots that must be taken. It is set to 1500. This covers 1500 minutes, which is 25 hours.
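
The flag list and the 25-hour figure can be checked with a small Python sketch. It only assembles the argument list from the flags described above and does the arithmetic; it does not launch NMON, and the function names are hypothetical.

```python
def build_nmon_command(nmon_dir, hostname, interval_s=60, snapshots=1500):
    """Assemble the NMON argument list from the flags described above."""
    output_file = f"{nmon_dir}/{hostname}.nmon"
    return [
        "nmon",
        "-F", output_file,      # output file
        "-t",                   # top processes
        "-S",                   # WLM sections
        "-P",                   # paging space section
        "-V",                   # disk volume group section
        "-s", str(interval_s),  # seconds between snapshots
        "-c", str(snapshots),   # number of snapshots
    ]


def coverage_hours(interval_s=60, snapshots=1500):
    """Recording time covered by one NMON run, in hours."""
    return interval_s * snapshots / 3600
```

With the default values, 1500 snapshots at 60 seconds each give 90000 seconds, that is 25 hours, so a single run covers a full day with one hour to spare before the daily rotation.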

This script is meant to be launched from cron, at a frequency that suits your needs, so that possible NMON outages are detected (every 5 minutes, every 10 minutes, ...).

PNP4NAGIOS

  • check_nmon_cpu.php: intended for check_nmon.pl with options -k CPU_ID -s User%,Sys%,Wait%,Idl.
  • check_nmon_disk.php: intended for check_nmon.pl with options -k DISKBUSY,DISKREAD,DISKWRITE,DISKXFER -s hdiskx.
  • check_nmon_hba.php: intended for check_nmon.pl with options -k IOADAPT -s "fcsx_read-KB/s,fcsx_write-KB/s,fcsx_xfer-tps".
  • check_nmon_mem.php: intended for check_nmon.pl with options -k MEM -s "Real Free %,Virtual free %".
  • check_nmon_net.php: intended for check_nmon.pl with options -k NET,NETERROR -s "en0-read-KB/s,en0-write-KB/s,en0-ierrs,en0-oerrs,en0-collisions".
  • check_nmon_pag.php: intended for check_nmon.pl with options -k PAGE -s pgin,pgout,pgsin,pgsout.

NOTE: the exact values to use for -k and -s should be checked with the test mode option -t.
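
PNP4Nagios templates graph the performance data that a plugin prints after the "|" character in its output. The Python sketch below shows how subkey values could be rendered in the standard Nagios perfdata syntax (label=value, with labels containing spaces wrapped in single quotes). The exact labels and ordering emitted by check_nmon.pl are an assumption here, and format_perfdata is a hypothetical name.

```python
def format_perfdata(metrics):
    """Render (label, value) pairs as a Nagios perfdata string.

    The order of the pairs is preserved, because templates usually
    refer to data sources by position.
    """
    parts = []
    for label, value in metrics:
        if " " in label:
            label = f"'{label}'"  # labels with spaces must be quoted
        parts.append(f"{label}={value}")
    return " ".join(parts)
```

Because the templates depend on which metrics arrive and in what order, it is worth keeping the -k and -s lists of each service in line with the template it uses.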