Check Windows Performance Monitor Counters
Check Win Perfmon
Plugin for Nagios/Icinga that checks a group of Windows performance counters specified in an XML file.
Checks value of performance counters based on thresholds specified in XML.
Returns exit code and performance data in Nagios/Icinga format.
Download Check Win Perfmon v2.0.
Release 2.0 introduces breaking changes relative to 1.4. In your XML files you must replace the 'auto' parameter with 'automemory', 'autodisk' or 'autonetwork'.
Please read the sections below before using it!
Preconfigured XML files
The downloaded zip package includes several preconfigured .xml files:
- PerfMonNetwork.xml: Performance Counters to check network load.
- PerfMonPhysicalDisk.xml: Performance Counters to check physical Disk load.
- PerfMonCPU.xml: Performance Counters to check CPU load.
- PerfMonMem.xml: Performance Counters to check Memory (RAM and virtual) load.
- PerfMonMSQL.xml: Performance Counters to check Microsoft SQL Server.
- PerfMonWebService.xml: Performance Counters to check Microsoft IIS Web Service.
- PerfMonPrinter.xml: Performance Counters to check Microsoft Print Server.
- PerfMonCB.xml: Performance Counters to check Microsoft Connection Broker Server and its WID.
- PerfMonHyperV.xml: Performance Counters to check Microsoft Hyper-V Server.
- PerfMonWID.xml: Performance Counters to check Microsoft Windows Internal Database of WSUS.
Usage
check_win_perfmon.exe [parameters]:
- -f, --xmlFile (Default: perfcounts.xml) XML file with performance counters configuration.
- -s, --maxSamples (Default: 3) Number of samples to take from perfmon.
- -t, --timeSamples (Default: 1000) Time between samples in ms.
- -n, --noalerts (Default: false) Always returns 0. Useful to get only performance data without alerts.
- -p, --xmlParams (Default: none) Array of params to change in xml file. Read below for examples.
- -v, --verbose Verbose output for debugging.
For example, to check the performance counters in PerfMonMem.xml, taking 10 samples at 2-second intervals:
check_win_perfmon.exe -f PerfMonMem.xml -s 10 -t 2000
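When the plugin is called from a script rather than from Icinga, the command line above can be assembled programmatically and the exit code mapped to a state name. The following is a minimal Python sketch, not part of the plugin. It assumes the plugin follows the standard Nagios exit-code convention (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN) and that -n is a plain switch; `build_command` and `run_check` are hypothetical helper names.

```python
import subprocess

# Standard Nagios plugin exit codes (assumed to apply to this plugin).
STATES = {0: "OK", 1: "WARNING", 2: "CRITICAL", 3: "UNKNOWN"}


def build_command(xml_file, samples=3, interval_ms=1000, no_alerts=False,
                  exe="check_win_perfmon.exe"):
    """Build the argument list from the options documented above."""
    cmd = [exe, "-f", xml_file, "-s", str(samples), "-t", str(interval_ms)]
    if no_alerts:
        cmd.append("-n")  # assumption: -n takes no value
    return cmd


def run_check(cmd):
    """Run the plugin and return (state name, first line of its output)."""
    result = subprocess.run(cmd, capture_output=True, text=True)
    state = STATES.get(result.returncode, "UNKNOWN")
    lines = result.stdout.splitlines()
    return state, lines[0] if lines else ""


# Same call as the example above: 10 samples, 2000 ms apart.
command = build_command("PerfMonMem.xml", samples=10, interval_ms=2000)
```

Calling `run_check(command)` on a Windows host where the executable is on the PATH would then return the state and the status line.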
Examples
Example CPU counters: check_win_perfmon.exe -f PerfMonCPU.xml
OK - All performance counters between range | 'ProcessorTime'=3%;95;100;0;100 'UserTime'=2%;85;95;0;100 'DPCTime'=0%;15;20;0;100 'InterruptTime'=0%;10;15;0;100 'ProcessorQueueLength'=0;4;8;;
Example Memory counters: check_win_perfmon.exe -f PerfMonMem.xml
OK - All performance counters between range | 'CommittedBytesInUse'=57%;80;90;0;100 'AvailableMBytes'=4083MB;1024;512;0;8192 'AvailableMBytesPercent'=50%;13;6;0;100 'FreeSystemPageTableEntries'=2867405056;5000;4000;; 'PagesSec'=0;5000;6000;;
Example Physical Disk counters: check_win_perfmon.exe -f PerfMonPhysicalDisk.xml
OK - All performance counters between range | 'AvgDiskSecTransfer'=0.0002s;0.04;0.05;; 'CurrentDiskQueueLength'=0;32;40;; 'AvgDiskSecWrite'=0.0002s;0.04;0.05;0; 'AvgDiskSecRead'=0s;0.04;0.05;0; 'IdleTime'=100%;20;15;0;100
Example Network counters: check_win_perfmon.exe -f PerfMonNetwork.xml
OK - All performance counters between range | 'BytesTotalSec'=1885.7051B;15728640;17825790;0;20971520 'BytesTotalSecPercent'=0%;75;85;0;100 'OutputQueueLength'=0;2;3;;
Example Microsoft SQL counters: check_win_perfmon.exe -f PerfMonMSQL.xml
OK - All performance counters between range | 'TotalServerMemory'=8381528KB;14680060;16252930;0;16777220 'TotalServerMemoryPercent'=50%;88;97;0;100 'TargetServerMemory'=8388608KB;14680060;16252930;0;16777220 'TargetServerMemoryPercent'=50%;88;97;0;100 'PageReadsSec'=0;90;100;; 'PageWritesSec'=0;90;100;; 'BufferCacheHitRatio'=100;95;90;0;100 'BufferCacheHitRatioPercent'=100%;95;90;0;100 'PageLifeExpectancy'=109982.6641;400;300;; 'LazyWritesSec'=0;15;20;; 'FreeListStallsSec'=0;1;2;; 'MemoryGrantsPending'=0;1;2;; 'BatchRequestsSec'=16.6571;1000;2000;; 'UserConnections'=115.3333;600;700;; 'LockWaitsSec'=0;1;2;; 'ProcessesBlocked'=0;1;2;;
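The output lines above follow the Nagios performance data layout: a status text, a pipe, then one `'label'=value[unit];warning;critical;min;max` entry per counter. As an illustration of how that layout can be consumed, here is a small Python sketch that splits such a line into its parts. It is an assumption-based helper for readers, not code from the plugin, and it does not handle values in scientific notation.

```python
import re

# 'label'=value[unit];warn;crit;min;max  (trailing fields may be empty)
PERF_RE = re.compile(r"'([^']+)'=(-?[\d.]+)([^;\s]*)((?:;[^;\s]*)*)")


def parse_plugin_output(line):
    """Split an output line into its status text and a dict of metrics."""
    status, _, perfdata = line.partition("|")
    metrics = {}
    for label, value, unit, rest in PERF_RE.findall(perfdata):
        fields = rest.split(";")[1:] if rest else []
        fields += [""] * (4 - len(fields))  # pad missing trailing fields
        warn, crit, minimum, maximum = fields[:4]
        metrics[label] = {
            "value": float(value), "unit": unit,
            "warn": warn, "crit": crit, "min": minimum, "max": maximum,
        }
    return status.strip(), metrics


# The network example from above.
line = ("OK - All performance counters between range | "
        "'BytesTotalSec'=1885.7051B;15728640;17825790;0;20971520 "
        "'BytesTotalSecPercent'=0%;75;85;0;100 'OutputQueueLength'=0;2;3;;")
status, metrics = parse_plugin_output(line)
```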
Creating new XML files to check your own performance counters
You can set up your own performance counters by adding them to the existing xml files or by creating new ones.
To list the performance counters available on a system, type in a PowerShell console:
# Get all counters
Get-Counter -ListSet * | Select-Object -ExpandProperty Counter
# Get the counters of a specific counter set
Get-Counter -ListSet *processor* | Select-Object -ExpandProperty Counter
You can also browse performance counters on a Windows system: Start Menu->Administrative Tools->Performance Monitor->Click on the plus symbol
XML Format
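The XML file must have the following format. For example:
<?xml version="1.0" encoding="UTF-8" ?>
<perfcounters>
    <perfcounter>
        <category>Processor</category>
        <name>% Processor Time</name>
        <instance>_Total</instance>
        <friendlyname>ProcessorTime</friendlyname>
        <units>%</units>
        <warning>95</warning>
        <critical>100</critical>
        <min>0</min>
        <max>100</max>
    </perfcounter>
    <perfcounter>
        <category>Memory</category>
        <name>Available MBytes</name>
        <instance>none</instance>
        <friendlyname>AvailableMBytes</friendlyname>
        <units>MB</units>
        <warning>1024</warning>
        <critical>512</critical>
        <min>0</min>
        <max>automemory</max>
    </perfcounter>
    <perfcounter>
        <category>Hyper-V Virtual Machine Health Summary</category>
        <name>Health Critical</name>
        <instance>none</instance>
        <friendlyname>VirtualMachineHealthCritical</friendlyname>
        <units>none</units>
        <warning>none</warning>
        <critical>gt1</critical>
        <min>none</min>
        <max>none</max>
    </perfcounter>
</perfcounters>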
Warning: Counter names must be in English.
In the example above, the program checks three counters. For each counter, we need to set:
- category: Category of the performance counter.
- name: Name of the performance counter.
- instance: Instance of the performance counter. Some performance counters do not have an instance; in that case the value must be: none. It also accepts:
  - autonetwork: detects the main network interface.
  - autodisk: detects the system disk.
- friendlyname: Name of the performance counter that the program returns in the performance output.
- units: Units that the program returns in the performance output. Check the valid units in the Nagios performance data docs.
- warning: Warning threshold for the performance counter.
- critical: Critical threshold for the performance counter.
- min: Minimum value of the performance counter. If you do not know the minimum value, it has to be: none.
- max: Maximum value of the performance counter. If you do not know the maximum value, it has to be: none. It also accepts:
  - autonetwork: detects the network interface speed in kb/s.
  - automemory: detects the system memory.
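To make the field list concrete, the following Python sketch reads a counter file into one dictionary per counter and checks that all nine fields are present. It is a reader-side illustration, not the plugin's own loader. The wrapper element names in the sample (`perfcounters`, `perfcounter`) are an assumption; the loader does not depend on them and simply treats every child of the root as one counter.

```python
import xml.etree.ElementTree as ET

# The nine fields described in the list above.
FIELDS = ("category", "name", "instance", "friendlyname", "units",
          "warning", "critical", "min", "max")

# Wrapper element names are assumed for illustration only.
SAMPLE = """
<perfcounters>
  <perfcounter>
    <category>Processor</category>
    <name>% Processor Time</name>
    <instance>_Total</instance>
    <friendlyname>ProcessorTime</friendlyname>
    <units>%</units>
    <warning>95</warning>
    <critical>100</critical>
    <min>0</min>
    <max>100</max>
  </perfcounter>
  <perfcounter>
    <category>Hyper-V Virtual Machine Health Summary</category>
    <name>Health Critical</name>
    <instance>none</instance>
    <friendlyname>VirtualMachineHealthCritical</friendlyname>
    <units>none</units>
    <warning>none</warning>
    <critical>gt1</critical>
    <min>none</min>
    <max>none</max>
  </perfcounter>
</perfcounters>
"""


def load_counters(xml_text):
    """Return one dict per counter; the literal value 'none' becomes None."""
    root = ET.fromstring(xml_text.strip())
    counters = []
    for element in root:  # every child of the root is treated as a counter
        counter = {}
        for field in FIELDS:
            text = element.findtext(field)
            if text is None:
                raise ValueError(f"missing field: {field}")
            text = text.strip()
            counter[field] = None if text == "none" else text
        counters.append(counter)
    return counters


counters = load_counters(SAMPLE)
```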
If both max and min are specified, the program returns one additional performance result with the calculated percent value. Max and min must have different values.
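The exact formula is not stated here. A plausible reading, shown as a Python sketch below, is a linear scaling of the value between min and max, which also explains why the two must differ (otherwise the division is undefined). This matches the memory example output earlier, where 4083 MB with min 0 and max 8192 is reported as 50%. Treat the formula as an assumption; `percent_of_range` is a hypothetical name.

```python
def percent_of_range(value, minimum, maximum):
    """Express value as a percentage of the [minimum, maximum] range."""
    if maximum == minimum:
        raise ValueError("min and max must have different values")
    return (value - minimum) / (maximum - minimum) * 100.0


# AvailableMBytes from the memory example: 4083 MB, min 0, max 8192.
available_percent = round(percent_of_range(4083, 0, 8192))
```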
Warning and critical can also be expressed as a percentage of the max value. For example:
<warning>80%</warning>
<critical>95%</critical>
<min>0</min>
<max>20480</max>
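With the values above, 80% and 95% of a max of 20480 correspond to absolute thresholds of 16384 and 19456. A short Python sketch of that conversion follows; `resolve_threshold` is a hypothetical helper, and it assumes the percentage is taken of max alone, as the text says.

```python
def resolve_threshold(text, maximum):
    """Turn '80%' into an absolute value relative to maximum.

    Plain numbers such as '1024' are returned unchanged as floats.
    """
    text = text.strip()
    if text.endswith("%"):
        return float(text[:-1]) * maximum / 100.0
    return float(text)


warning = resolve_threshold("80%", 20480)
critical = resolve_threshold("95%", 20480)
```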
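If you want to check only the warning or only the critical threshold, write the checked one with an lt or gt prefix and set the other one to none. For example, warning if the counter is less than or equal to 15:
<warning>lt15</warning>
<critical>none</critical>
Critical if the counter is greater than or equal to 90% of max:
<warning>none</warning>
<critical>gt90%</critical>
<min>0</min>
<max>20480</max>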
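The one-sided notation can be read as an operator plus a limit. The Python sketch below parses it and evaluates a value against it, using the inclusive comparisons described above (lt fires at or below the limit, gt at or above it). It illustrates the notation and is not the plugin's parser; the function names are hypothetical.

```python
import re

THRESHOLD_RE = re.compile(r"^(lt|gt)(\d+(?:\.\d+)?)(%?)$")


def parse_one_sided(text, maximum=None):
    """Parse 'lt15', 'gt90%' or 'none' into (operator, limit) or None."""
    text = text.strip()
    if text == "none":
        return None
    match = THRESHOLD_RE.match(text)
    if match is None:
        raise ValueError(f"not a one-sided threshold: {text!r}")
    op, number, percent = match.groups()
    limit = float(number)
    if percent:
        if maximum is None:
            raise ValueError("a percentage threshold needs a max value")
        limit = limit * maximum / 100.0
    return op, limit


def is_breached(value, threshold):
    """Return True when value crosses the parsed threshold (inclusive)."""
    if threshold is None:
        return False  # 'none' means this level is not checked
    op, limit = threshold
    return value <= limit if op == "lt" else value >= limit


warning = parse_one_sided("lt15")
critical = parse_one_sided("gt90%", maximum=20480)
```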
XML with parameters
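To avoid creating an xml file for each network interface, disk or SQL instance, we can create a generic xml file with parameters. For example:
<?xml version="1.0" encoding="UTF-8" ?>
<perfcounters>
    <perfcounter>
        <category>Network Adapter</category>
        <name>Output Queue Length</name>
        <instance>{0}</instance>
        <friendlyname>OutputQueueLength</friendlyname>
        <units></units>
        <warning>{1}</warning>
        <critical>{2}</critical>
        <min>none</min>
        <max>none</max>
    </perfcounter>
    <perfcounter>
        <category>Network Adapter</category>
        <name>Output Queue Length</name>
        <instance>{3}</instance>
        <friendlyname>OutputQueueLength</friendlyname>
        <units></units>
        <warning>{2}</warning>
        <critical>{4}</critical>
        <min>none</min>
        <max>none</max>
    </perfcounter>
</perfcounters>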
To pass parameters:
check_win_perfmon.exe -f PerfMonNetworkParams.xml -p "Interface 1" "1" "2" "Interface 2" "5"
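Each {n} placeholder is replaced by the parameter at position n, counting from zero in the order given on the command line.
The resulting XML would be:
<?xml version="1.0" encoding="UTF-8" ?>
<perfcounters>
    <perfcounter>
        <category>Network Adapter</category>
        <name>Output Queue Length</name>
        <instance>Interface 1</instance>
        <friendlyname>OutputQueueLength</friendlyname>
        <units></units>
        <warning>1</warning>
        <critical>2</critical>
        <min>none</min>
        <max>none</max>
    </perfcounter>
    <perfcounter>
        <category>Network Adapter</category>
        <name>Output Queue Length</name>
        <instance>Interface 2</instance>
        <friendlyname>OutputQueueLength</friendlyname>
        <units></units>
        <warning>2</warning>
        <critical>5</critical>
        <min>none</min>
        <max>none</max>
    </perfcounter>
</perfcounters>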
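The substitution itself is simple positional replacement. As an illustration only (the plugin's actual implementation is not shown in this document), the Python sketch below replaces every {n} in a template with the n-th parameter, counting from zero, and allows the same index to be used more than once. `apply_params` is a hypothetical name, and the values are inserted as-is without XML escaping.

```python
import re

PLACEHOLDER_RE = re.compile(r"\{(\d+)\}")


def apply_params(template, params):
    """Replace each {n} with params[n]; an index may appear several times."""
    def substitute(match):
        index = int(match.group(1))
        if index >= len(params):
            raise IndexError(f"no parameter supplied for {{{index}}}")
        return params[index]
    return PLACEHOLDER_RE.sub(substitute, template)


# The five parameters from the command line above.
params = ["Interface 1", "1", "2", "Interface 2", "5"]
first = apply_params("{0} warning={1} critical={2}", params)
second = apply_params("{3} warning={2} critical={4}", params)
```

A placeholder whose index has no matching parameter raises an error in this sketch, which makes a mistyped index easy to spot.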
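You can also use two parameters in the same field.
<?xml version="1.0" encoding="UTF-8" ?>
<perfcounters>
    <perfcounter>
        <category>PhysicalDisk</category>
        <name>Current Disk Queue Length</name>
        <instance>{0} {1}:</instance>
        <friendlyname>CurrentDiskQueueLength</friendlyname>
        <units></units>
        <warning>32</warning>
        <critical>40</critical>
        <min>none</min>
        <max>none</max>
    </perfcounter>
</perfcounters>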
To check it:
check_win_perfmon.exe -f PerfMonDiskParams.xml -p "1" "D"
check_win_perfmon.exe -f PerfMonDiskParams.xml -p "3" "F"
System Load
I tried to minimize system load during program execution, but checking performance counters always has an impact on system performance. Running the program uses about 5% of CPU on old systems and has a minimal impact on modern servers. The more performance counters you check at a time, the greater the system impact.
Icinga Agent Configuration
Command
object CheckCommand "check_win_perfmon" {
    import "plugin-check-command"
    command = [ "C:\\Program Files\\ICINGA2\\sbin\\check_win_perfmon.exe" ]
    arguments = {
        "-f" = {
            value = "$xml$"
            order = 1
            description = "XML file"
        }
        "-t" = {
            value = "$interval$"
            order = 2
            description = "Time between samples"
        }
        "-s" = {
            value = "$samples$"
            order = 3
            description = "Samples to take"
        }
        "-n" = {
            value = "$noalerts$"
            order = 4
            description = "Always return 0"
        }
        "-p" = {
            value = "$xmlParams$"
            order = 5
            description = "XML params"
        }
    }
}
Service
apply Service "CPU Load" {
    import "generic-service"
    check_command = "check_win_perfmon"
    vars.xml = "C:\\Program Files\\ICINGA2\\sbin\\PerfMonCPU.xml"
    command_endpoint = host.name
    assign where host.vars.os == "Windows"
}
apply Service "Network Load" {
    import "generic-service"
    check_command = "check_win_perfmon"
    vars.xml = "C:\\Program Files\\ICINGA2\\sbin\\PerfMonNetwork.xml"
    command_endpoint = host.name
    assign where host.vars.os == "Windows"
}
apply Service "Disk_0 Load" {
    import "generic-service"
    check_command = "check_win_perfmon"
    vars.xml = "C:\\Program Files\\ICINGA2\\sbin\\PerfMonPhysicalDisk.xml"
    command_endpoint = host.name
    assign where host.vars.os == "Windows"
}
apply Service "Memory Load" {
    import "generic-service"
    check_command = "check_win_perfmon"
    vars.xml = "C:\\Program Files\\ICINGA2\\sbin\\PerfMonMem.xml"
    command_endpoint = host.name
    assign where host.vars.os == "Windows"
}
apply Service "SQL Server Load" {
    import "generic-service"
    check_command = "check_win_perfmon"
    if ("msql2012" in host.vars.checks) {
        vars.xml = "C:\\ProgramData\\icinga2\\var\\lib\\icinga2\\api\\zones\\global-templates\\_etc\\scripts\\PerfMonMSQL2012.xml"
    }
    if ("msql2014" in host.vars.checks) {
        vars.xml = "C:\\ProgramData\\icinga2\\var\\lib\\icinga2\\api\\zones\\global-templates\\_etc\\scripts\\PerfMonMSQL2014.xml"
    }
    if ("msqlnamed1" in host.vars.checks) {
        vars.xml = "C:\\ProgramData\\icinga2\\var\\lib\\icinga2\\api\\zones\\global-templates\\_etc\\scripts\\PerfMonMSQL2014Params.xml"
        vars.xmlParams = "NAMED1"
    }
    if ("msqlsharepoint" in host.vars.checks) {
        vars.xml = "C:\\ProgramData\\icinga2\\var\\lib\\icinga2\\api\\zones\\global-templates\\_etc\\scripts\\PerfMonMSQL2012Params.xml"
        vars.xmlParams = "SHAREPOINT"
    }
    if ("msqlnamed2" in host.vars.checks) {
        vars.xml = "C:\\ProgramData\\icinga2\\var\\lib\\icinga2\\api\\zones\\global-templates\\_etc\\scripts\\PerfMonMSQL2014Params.xml"
        vars.xmlParams = "NAMED2"
    }
    vars.samples = "10"
    vars.noalerts = ""
    command_endpoint = host.name
    assign where (regex("^msql?[a-z0-9]+",host.vars.checks,MatchAny))
}
References
Values and counters are based on System Center Operations Manager checkins. You can check it out here.
Values and counters for Microsoft SQL are based on articles from SQL Shack and Database Journal.
Updated thresholds are based on the amazing tool PAL created by Clint Huffman from Microsoft.