check_linux_stat.sh
check_linux_stat.sh is a bash script that checks CPU, load and memory using the /proc filesystem rather than binary tools (e.g. sar). My goal with this plugin was to create a script that could be distributed without modification to all of my Linux servers: load thresholds take the number of CPUs into account, and memory thresholds are triggered by percentages. Additionally, some of the values do not trigger warning or critical alerts but are kept for graphing purposes; I use pnp4nagios and have included my templates.
check_linux_stat.sh has been tested on Red Hat Enterprise Linux 4, 5 and 6. It requires bash, coreutils, grep and bc. On a RHEL 6 minimal install, 'bc' may need to be installed via yum.
-
check_linux_stat.sh -c: check CPU utilization (user, nice, sys, idle, io wait, hardirq, softirq) using /proc/stat, which stores its counters in jiffies (http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/5/html/Deployment_Guide/s2-proc-stat.html). It keeps the previous values and compares them to the current values, so you get average CPU utilization over the interval between checks instead of a point-in-time snapshot. Currently I do not alert (warning|critical) on CPU data, as I think load is a much better metric, but I collect CPU data for graphing.
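The previous-versus-current comparison can be sketched as below. This is a minimal illustration, not the plugin's actual code: the function name cpu_delta_pct is hypothetical, it uses bash integer arithmetic rather than bc, and it only looks at the first seven counters on the aggregate "cpu" line.

```bash
# Hypothetical helper (not taken from check_linux_stat.sh).
# Takes two samples of the aggregate "cpu" line from /proc/stat and prints
# the share of elapsed jiffies spent in each state between the two samples.
cpu_delta_pct() {
    local -a prev cur delta
    local names=(user nice sys idle iowait hardirq softirq)
    local i tenths total=0 out=""

    # Split each line on whitespace; index 0 is the "cpu" label.
    read -r -a prev <<< "$1"
    read -r -a cur <<< "$2"

    # Only the first seven counters are used; later fields are ignored.
    for i in 1 2 3 4 5 6 7; do
        delta[i]=$(( ${cur[i]:-0} - ${prev[i]:-0} ))
        total=$(( total + delta[i] ))
    done

    if (( total <= 0 )); then
        echo "no jiffies elapsed between samples" >&2
        return 1
    fi

    for i in 1 2 3 4 5 6 7; do
        # Scale by 1000 to get tenths of a percent without floating point.
        tenths=$(( delta[i] * 1000 / total ))
        out="$out${names[i-1]}=$(( tenths / 10 )).$(( tenths % 10 ))% "
    done
    echo "${out% }"
}

# Example use (assumes a Linux /proc filesystem):
#   prev=$(grep '^cpu ' /proc/stat); sleep 5; cur=$(grep '^cpu ' /proc/stat)
#   cpu_delta_pct "$prev" "$cur"
```

A plugin that runs once per check interval would store the previous sample in a state file instead of sleeping; where such a file lives is left open here.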
-
check_linux_stat.sh -l: check load (1, 5 & 15 minute) using /proc/loadavg. It also grabs the number of processors from /proc/cpuinfo and generates warning/critical thresholds by multiplying a load modifier by the number of CPUs. Because the thresholds are generated automatically, I can deploy the script across many boxes of varying size without modification.
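A rough sketch of how thresholds scaled by CPU count might be evaluated follows. The function names and the modifier values (1.5 and 2) are assumptions for illustration; the script's real modifiers are not stated here. The sketch avoids bc by comparing values in hundredths.

```bash
# Hypothetical helpers (not taken from check_linux_stat.sh).

# Convert a decimal such as "1.5" or "0.08" to an integer count of hundredths.
to_hundredths() {
    local whole=${1%%.*} frac=""
    case $1 in
        *.*) frac=${1#*.} ;;
    esac
    frac=${frac}00                      # pad so two digits are always present
    echo $(( 10#${whole:-0} * 100 + 10#${frac:0:2} ))
}

# check_load LOAD NCPU WARN_MODIFIER CRIT_MODIFIER
# Prints a status word and returns the matching Nagios exit code.
check_load() {
    local load=$1 ncpu=$2 warn_mod=$3 crit_mod=$4
    local load_h warn_h crit_h
    load_h=$(to_hundredths "$load")
    warn_h=$(( $(to_hundredths "$warn_mod") * ncpu ))   # threshold = modifier x cpus
    crit_h=$(( $(to_hundredths "$crit_mod") * ncpu ))

    if (( load_h >= crit_h )); then
        echo "CRITICAL"; return 2
    elif (( load_h >= warn_h )); then
        echo "WARNING"; return 1
    fi
    echo "OK"; return 0
}

# Example use (assumes a Linux /proc filesystem; modifiers are made up):
#   ncpu=$(grep -c '^processor' /proc/cpuinfo)
#   read -r load1 load5 load15 rest < /proc/loadavg
#   check_load "$load1" "$ncpu" 1.5 2
```

With these made-up modifiers, a load of 3.10 is OK on a 4-CPU box (warning at 6.00) but critical on a 1-CPU box (critical at 2.00), which is why the same script can be deployed unchanged on both.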
-
check_linux_stat.sh -m: check memory, swap and Committed_AS (http://www.redhat.com/advice/tips/meminfo.html) from /proc/meminfo. It adds cache and buffers to free memory to calculate total free memory and lists the values in MB and % (warning/critical are keyed off of percent used). Committed_AS is an interesting stat; I don't alert on that value but keep it for graphing.
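The free-memory calculation described above might look roughly like this. It is a sketch under stated assumptions: the function name mem_summary and its output format are hypothetical, swap and Committed_AS are left out for brevity, and percentages are truncated to whole numbers by bash integer arithmetic.

```bash
# Hypothetical helper (not taken from check_linux_stat.sh).
# mem_summary [FILE]
# Reads a meminfo-style file (default /proc/meminfo, values in kB) and prints
# used and free memory in MB plus percent used, counting buffers and cache
# as free.
mem_summary() {
    local file=${1:-/proc/meminfo}
    local key value unit
    local total=0 free=0 buffers=0 cached=0 avail used

    while read -r key value unit; do
        case $key in
            MemTotal:) total=$value ;;
            MemFree:)  free=$value ;;
            Buffers:)  buffers=$value ;;
            Cached:)   cached=$value ;;
        esac
    done < "$file"

    if (( total <= 0 )); then
        echo "MemTotal not found in $file" >&2
        return 3
    fi

    avail=$(( free + buffers + cached ))   # total free = free + buffers + cache
    used=$(( total - avail ))
    echo "used=$(( used / 1024 ))MB free=$(( avail / 1024 ))MB used_pct=$(( used * 100 / total ))%"
}

# Example use (assumes a Linux /proc filesystem):
#   mem_summary
```

The used_pct figure is the value that warning and critical percentages would be compared against.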