(This document is modified from the NS_Client readme.html)
Shatter IT. - www.shatterit.com
NC_Net Home Page - www.shatterit.com/NC_Net
Original author - Tony Montibello - amontibello@shatterit.com
Purpose
License
Requirements
History
Features
Installation
Uninstall
Configuration
Plug-in Syntax
CPU Load
Disk Usage
Uptime
Service States
Process States
Client Version
Memory Usage
File Age Usage
Custom Counter
Instances
Technical Information
Problems
Known Issues
NC_Net has been developed as a drop-in replacement for the program NS_Client.
NS_Client was developed to get performance information from Windows servers and return it to Nagios using the Check_nt plug-in. In addition to the standard metrics, a generic COUNTER function is provided to return the value of any counter maintained by Windows.
NC_Net is released under the GNU General Public License.
NS_Client software is released under the GNU General Public License. A copy of the license is included here.
NC_Net was implemented in C# and uses the .NET Framework 1.1.4322.
It is intended to be run on Windows 2000, XP, and Windows 2XXX Servers.
Some of the Framework references used:
System.Management - Access to WMI for the USEDDISKSPACE check.
System.ServiceProcess - Access to services, for adding itself as a service and for running SERVICESTATE checks.
System.Diagnostics - Access to performance counters, processes, and the Event Viewer.
System.Configuration.Install - Access to the Installer class for installation as a service.
When developing NC_Net, make sure to add these references to your project prior to compilation.
12/13/04 | First release - more enhancements still to come. |
NC_Net is a drop-in replacement for NS_Client. NS_Client must be used with Check_nt, and the same is true of the current version of NC_Net. However, future versions of NC_Net will be able to send passive checks to the nsca daemon.
On the Windows machine
The installation will create new folders on the system:
Start Menu -->programs--> NC_Net
Program Files --> Shatter IT --> NC_Net
Please note: NC_Net has not been tested or configured to work with versions of Windows other than English (US).
On the Unix machine
Two parameters can currently be changed: the port and the password. More parameters are expected with future enhancements.
Parameters are currently changed through the Start parameters of the service. (Other mechanisms for changing parameters will be added in the future.)
To access the Start parameters, open Control Panel --> Administrative Tools --> Services.
Then right-click the NC_Net service and choose Properties.
The service must be stopped before the parameters can be changed, so choose Stop.
Enter the parameters into the Start parameters text box. (Make sure to put a space between parameters.)
Then start the service.
port <Port #>
- The default port is 1248. (This option still has some bugs.)
password <new Password>
- The default password is None.
- The password is case sensitive.
- Check_nt also uses the default password of None when the -s option is not used.
Syntax: check_nt -H <hostname> -v CPULOAD -l <minutes range>,<warning percent>,<critical percent>
Check_nt send Buffer: password&2&interval
- Only one interval is sent per TCP request.
Check_nt receive Buffer: CPU_LOAD%
- Only a single number is returned in the receive buffer.
NC_Net uses a custom stack class (cpustack.cs) to store and calculate CPU Load.
NC_Net saves the value of the CPU load every 5 seconds.
CPU load value comes from performance counters:
CounterName = "% Processor Time";CategoryName = "Processor";InstanceName = "_Total";
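The sampling scheme described above can be sketched as follows. This is only an illustration in Python, not the contents of cpustack.cs: the class name CpuHistory, its methods and the 24-hour cap on stored samples are assumptions made for the example.

```python
from collections import deque

SAMPLE_SECONDS = 5  # sampling period stated above
# Assumption for this sketch: keep at most 24 hours of samples.
MAX_SAMPLES = 24 * 60 * 60 // SAMPLE_SECONDS


class CpuHistory:
    """Fixed-size history of CPU load samples (hypothetical stand-in)."""

    def __init__(self, max_samples=MAX_SAMPLES):
        # A deque with maxlen silently drops the oldest sample when full.
        self._samples = deque(maxlen=max_samples)

    def push(self, percent):
        """Store one reading of the "% Processor Time" counter."""
        self._samples.append(float(percent))

    def average(self, minutes):
        """Average over the samples covering the last `minutes` minutes.

        Returns None when no samples are stored yet.
        """
        wanted = minutes * 60 // SAMPLE_SECONDS
        if wanted <= 0 or not self._samples:
            return None
        recent = list(self._samples)[-wanted:]
        return sum(recent) / len(recent)
```

If fewer samples are stored than the interval asks for, this sketch averages whatever is available.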
You can check several intervals in one shot. Check_nt sends a separate request for each interval when multiple intervals are chosen. The following command gets the averages for the last 10 minutes, 60 minutes and 24 hours.
Example: ./check_nt -H 192.168.1.1 -v CPULOAD -l 10,80,95,60,80,95,1440,80,95
Check_nt Result:CPU Load (10 min. 22%)|10min=22
Check_NT Return Codes: 0 - ok;1 - Warning;2 - Critical
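As a rough illustration of how a multi-interval -l argument maps onto separate requests, the Python sketch below splits the argument into (minutes, warning, critical) groups and builds one send buffer per interval in the password&2&interval layout. The function names are hypothetical and this is not the Check_nt source.

```python
def split_cpuload_args(arg):
    """Split '10,80,95,60,80,95' into [(10, 80, 95), (60, 80, 95)]."""
    numbers = [int(part) for part in arg.split(",")]
    if len(numbers) % 3 != 0:
        raise ValueError("expected groups of <minutes>,<warning>,<critical>")
    return [tuple(numbers[i:i + 3]) for i in range(0, len(numbers), 3)]


def cpuload_requests(arg, password="None"):
    """Build one send buffer per interval; thresholds stay on the plug-in side."""
    return ["%s&2&%d" % (password, minutes)
            for minutes, _warning, _critical in split_cpuload_args(arg)]


if __name__ == "__main__":
    for request in cpuload_requests("10,80,95,60,80,95,1440,80,95"):
        print(request)
```

Only the interval travels to NC_Net; the warning and critical percentages are compared against the returned number by the plug-in.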
Syntax: check_nt -H <hostname> -v USEDDISKSPACE -l <drive letter> [-w <warning percent>] [-c <critical percent>]
NC_Net uses the WMI database to retrieve the free space and disk size of the logical disk:
WMI scope: "root\\cimv2"; WMI Class: "win32_logicalDisk"; drive: the drive letter concatenated with a colon.
WMI Query: "SELECT FreeSpace,Size FROM "+WMI Class +" WHERE Name='"+drive+"'"
Check_nt send Buffer: password&4&driveLetter
-drive letter is only a single logical drive letter (not case sensitive)
Check_nt receive Buffer: freespace&size
- the result from NC_Net is only the numbers for freespace and disk size from the WMI query
NC_Net returns -1&-1 to Check_nt if it was unable to perform the query.
Example: ./check_nt -H 192.168.1.1 -v USEDDISKSPACE -l C -w 80 -c 90
Check_nt Result: C:\ - total: 17.10 Gb - used: 11.03 Gb (64%) - free 6.07 Gb (36%)|used=11838660608.00
Check_nt Result for -1&-1: Free disk space : Invalid drive
Check_NT Return Codes: 0 - ok; 1 - Warning; 2 - Critical; 127 - Unknown
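The freespace&size receive buffer can be turned into usage figures along the following lines. This Python sketch is an illustration only; the function name and the returned dictionary keys are assumptions, and the -1&-1 error reply is mapped to None.

```python
def parse_disk_reply(reply):
    """Turn 'freespace&size' into usage figures.

    Returns None for the -1&-1 reply that signals a failed query.
    """
    free_text, size_text = reply.strip().split("&")
    free, size = float(free_text), float(size_text)
    if free < 0 or size <= 0:
        return None
    used = size - free
    return {
        "free_bytes": free,
        "used_bytes": used,
        "used_percent": 100.0 * used / size,
    }
```

The used percentage is the value the -w and -c thresholds are meant to be compared with.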
Syntax: ./check_nt -H <hostname> -v UPTIME
This plug-in doesn't care about warning or critical values. Only the uptime of the machine is received.
Check_nt send Buffer: password&3
-all other arguments ignored
Check_nt receive Buffer: uptime_in_seconds
- the only return is the number for the uptime in seconds
The uptime value is read when the request is received and comes from the performance counter:
CounterName = "System Up Time";CategoryName = "System";InstanceName = null;
Example: ./check_nt -H 192.168.1.1 -v UPTIME
Check_nt result: System Uptime : 2 day(s) 20 hour(s) 51 minute(s)|uptime=247918
Check_NT Return Codes: 0 - ok; 1 - Warning; 2 - Critical; 127 - Unknown
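Since NC_Net only returns the number of seconds, the readable text is produced on the plug-in side. A minimal Python sketch of that conversion, with a hypothetical function name, reproduces the wording of the example result above:

```python
def format_uptime(seconds):
    """Render uptime in seconds as days, hours and minutes."""
    days, rest = divmod(int(seconds), 86400)   # 86400 seconds per day
    hours, rest = divmod(rest, 3600)           # 3600 seconds per hour
    minutes = rest // 60                       # leftover seconds are dropped
    return "System Uptime : %d day(s) %d hour(s) %d minute(s)" % (
        days, hours, minutes)


if __name__ == "__main__":
    print(format_uptime(247918))
```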
Syntax: check_nt -H <hostname> -v SERVICESTATE [-d SHOWALL] -l <service 1>[,<service 2>,<service 3>,...]
You can specify several services in one request. No blanks may appear in the list!
If not all services are running, you get the faulty one(s) and a critical state.
If any name in the list is not a service, you get a warning and that name is listed as unknown.
NC_Net uses ServiceProcess.ServiceController.GetServices to get the list of services on the system.
Check_nt send Buffer: password&5&SHOW[ALL|FAIL]&Service[&service2]
- The third argument is either showall or showfail. If it is anything else, it is ignored (even if it is a service) and showfail is used.
Check_nt receive Buffer: return code&Detailed result string
return codes from NC_Net are:
2 - critical - at least one service reported as not running
1 - warning - no services critical, but at least one service not found in the service list
0 - ok - all services found and running
Example: ./check_nt -H 192.168.1.1 -p 1248 -v SERVICESTATE -d SHOWALL -l LanmanServer,Schedule
Check_NT Result: Lanmanserver: Started - Schedule: Started
Check_NT Result: All services are running
Check_NT Return Codes: 0 - ok; 1 - Warning; 2 - Critical; 127 - Unknown
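The return-code rules above can be sketched as follows. This Python fragment is an illustration, not the NC_Net implementation: the function name, the status dictionary that stands in for the list of services, the case-insensitive name matching and the "Unknown" label are all assumptions.

```python
OK, WARNING, CRITICAL = 0, 1, 2


def service_reply(requested, statuses, show_all=False):
    """Build a 'return code&detail string' reply.

    requested: service names from the request.
    statuses:  hypothetical mapping of service name -> status text,
               standing in for the list of services on the system.
    """
    known = {name.lower(): status for name, status in statuses.items()}
    code, parts = OK, []
    for name in requested:
        status = known.get(name.lower())
        if status is None:
            # Not in the service list: warning unless something is critical.
            code = max(code, WARNING)
            parts.append("%s: Unknown" % name)
        elif status != "Started":
            code = CRITICAL
            parts.append("%s: %s" % (name, status))
        elif show_all:
            parts.append("%s: Started" % name)
    text = " - ".join(parts) if parts else "All services are running"
    return "%d&%s" % (code, text)
```

With show_all left off, only the failing or unknown entries are listed, which mirrors the showfail behaviour described above.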
Syntax: check_nt -H <hostname> -v PROCSTATE [-d SHOWALL] -l <process 1>[,<process 2>,<process 3>,...]
You can specify several processes in one request. No blanks may appear in the list!
If not all processes are running, you get the faulty one(s) and a critical state.
NC_Net uses System.Diagnostics.Process.GetProcesses to get the list of processes.
Check_nt send Buffer: password&6&SHOW[ALL|FAIL]&Process[&process2]
- The third argument is either showall or showfail. If it is anything else, it is ignored (even if it is a process) and showfail is used.
- Process names are not case sensitive but must match exactly, including the .exe extension.
- idle and system always return true and are not actually checked.
Check_nt receive Buffer: return code&Detailed result string
return codes from NC_Net are:
2 - critical - at least one process not found in the list of running processes
0 - ok - all processes found
Example: ./check_nt -H 192.168.1.1 -v PROCSTATE -l NC_Net,nc_net.exe,NC_Net.exe -d SHOWALL
Check_NT Result: NC_Net: not running - nc_net.exe: Running - NC_Net.exe: Running
Syntax: check_nt -H <hostname> -v CLIENTVERSION
Check_nt send Buffer: password&1
all other arguments are ignored
Check_nt receive buffer: Version_String
Return the NC_Net version.
Syntax: check_nt -H <hostname> -v MEMUSE [-w <warning percent>] [-c <critical percent>]
NC_Net uses performance counters to retrieve results:
CounterName = "Commit Limit";CategoryName = "Memory";InstanceName = null;
CounterName = "Committed Bytes";CategoryName = "Memory";InstanceName = null;
Check_nt send Buffer: password&7
- All other arguments are ignored.
Check_nt receive buffer: Commit Limit&Committed Bytes
- NC_Net returns the Commit Limit value and the Committed Bytes value back to Check_nt.
If there was a problem retrieving these values, NC_Net will return:
-1&Could not process memory usage check
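A reply in this layout can be reduced to a usage percentage as sketched below in Python. The function name is hypothetical, and treating a non-positive first field as the error case is an assumption based on the -1 error reply shown above.

```python
def parse_memuse_reply(reply):
    """Return committed memory as a percentage of the commit limit.

    Raises ValueError carrying NC_Net's message for an error reply
    such as '-1&Could not process memory usage check'.
    """
    first, second = reply.strip().split("&", 1)
    limit = float(first)
    if limit <= 0:
        raise ValueError(second)
    committed = float(second)
    return 100.0 * committed / limit
```

The resulting percentage is what the -w and -c options are expressed in.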
Example:
./check_nt -H 192.168.1.1 -p 1248 -v MEMUSE -w 80 -c 90
Check_NT result:
File Age Usage
Syntax: check_nt -H <hostname> -p <port> -v FILEAGE -l <filename>[,<date format>] [-w <warning>] [-c <critical>]
strftime(description, 50, "Date: %D %I:%M:%S %p", localtime(&rettime));
Change the "Date: %D %I:%M:%S %p" part to what you want as a default and then compile it.
Example:
./check_nt -H 192.168.1.1 -p 1248 -v FILEAGE -l "c:\\program files\\nsclient\\pnsclient.exe" -w 1440 -c 2880
./check_nt -H 192.168.1.1 -p 1248 -v FILEAGE -l "c:\\program files\\nsclient\\pnsclient.exe","Date: %d-%m-%Y %I:%M:%S %p" -w 1440 -c 2880
Syntax: check_nt -H <hostname> -v COUNTER -l <counter name>[,<counter description>] [-w <warning percent>] [-c <critical percent>]
Check_nt send Buffer: password&8&Counter_to_check&CounterDescription
- the counter description is used by Check_nt and is not expected or used by NC_Net
Check_nt receive Buffer: decimal_counter_value
- a decimal value is returned by NC_Net
- 5 decimal places are returned
error returns to Check_nt:
-1&No Counter to check (if less than three arguments sent to NC_Net)
-1&Could not process check counter (if the Performance counter check failed for any reason.)
Details of Check_nt results:
If warning equals critical and is less than the result, then OK
Else if warning equals critical and is greater than or equal to the result, then Critical
Else if the counter could not be found, the result is -1 and the plug-in returns OK (there is no Unknown state for this check)
Else if warning is less than critical and critical is less than or equal to the result, then Critical
Else if warning is less than critical and warning is less than or equal to the result, then Warning
Else if warning is less than critical and the result is less than warning, then OK
Else if critical is less than warning and the result is less than or equal to critical, then Critical
Else if critical is less than warning and the result is less than or equal to warning, then Warning
Else if critical is less than warning and the result is greater than warning, then OK
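The list above boils down to two cases: normal thresholds (warning below critical, high values are bad) and inverse thresholds (critical at or below warning, low values are bad). The Python sketch below follows that list; it is not the Check_nt source, and the function name is an assumption.

```python
OK, WARNING, CRITICAL = 0, 1, 2


def counter_state(result, warning, critical):
    """Map a counter value onto a state, following the rules listed above."""
    if warning < critical:
        # Normal thresholds: the higher the value, the worse.
        if result >= critical:
            return CRITICAL
        if result >= warning:
            return WARNING
        return OK
    # Inverse thresholds (critical <= warning): the lower the value, the worse.
    # When warning equals critical, only OK or CRITICAL is possible.
    if result <= critical:
        return CRITICAL
    if result <= warning:
        return WARNING
    return OK
```

Note that a result of -1 (counter not found) with normal thresholds falls below the warning level and so comes out as OK, as the list states.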
Example:
./check_nt -H 192.168.1.1 -p 1248 -v COUNTER -l "\\Paging File(_Total)\\%% Usage","Paging file usage is %.2f %%" -w 80 -c 90
./check_nt -H 192.168.1.1 -p 1248 -v COUNTER -l "\\Process(_Total)\\Thread Count","Thread Count: %.f" -w 600 -c 800
./check_nt -H 192.168.1.1 -p 1248 -v COUNTER -l "\\Server\\Server Sessions","Server Sessions: %.f" -w 20 -c 30
Syntax: check_nt -H <hostname> -v INSTANCES -l <Category object>[,<Category object2>]
Check_nt send Buffer: password&10&Category&Category
NC_Net can check multiple categories.
Categories are not case sensitive.
Check_nt receive Buffer: String of descriptive results.
Basic output is the category name, a colon, and a comma-separated list of instances.
If a category has no instances, the list is empty.
Category outputs are delimited with a dash (-) in the output.
If a category does not exist, it is also listed with an empty list.
String Format:
Cat1: inst1,inst2,inst3 - cat2: inst1,inst2
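A string in this format can be split back into categories and instances roughly as follows. This Python sketch uses a hypothetical function name and assumes that instance names contain neither " - " nor a colon; names that do would need more careful handling.

```python
def parse_instances(reply):
    """Split 'Cat1: a,b - cat2: c' into {'Cat1': ['a', 'b'], 'cat2': ['c']}."""
    result = {}
    for chunk in reply.split(" - "):
        name, colon, listing = chunk.partition(":")
        if not colon:
            continue  # not a 'category: list' chunk, skip it
        # Empty lists (no instances, or unknown category) yield [].
        result[name.strip()] = [item.strip()
                                for item in listing.split(",")
                                if item.strip()]
    return result
```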
Example
./check_nt -H 192.168.1.1 -p 1248 -v INSTANCES -l Process
Socket timeout after 10 seconds
return Code:2
Windows agent
NSClient has been developed using Borland Delphi 7. It is installed as a service.
Every five seconds, NSClient queries Windows to get the CPU load and stores this information in a circular buffer which keeps the measures for the last 24 hours. It also collects the uptime, memory and disk utilization metrics every 5 seconds and stores them in global variables. When requested, the client returns these results from the global variables.
Have a look at the source code in case you want to know more about it. It's well documented, so you shouldn't have too much pain figuring out how it works. Some functions and pieces of code were found on the internet, so I thank very much the programmers who released them as open source. When possible, credits are left in the code.
Unix plug-in
The Unix plug-in has been written in C, using a template by Ethan Galstad. It mostly uses common functions and should be compiled in the same directory as all other plug-ins. I hope that it will be included in the common distribution.
Let us know if any problems are encountered.
For the most part, NC_Net will report errors to the Application Log in Event Viewer.
More work is needed on the error reporting. An event should be logged in all cases; however, the event description is not always accurate and is sometimes not very meaningful. The event reporting will be updated gradually as new revisions are released, and those updates will mostly focus on fixing current problems.
The current version has some problems with changing ports.
File age is not yet implemented.
# NSClient basic return types
command[check_nt_disk]=$USER1$/check_nt -H $HOSTADDRESS$ -p 1248 -v USEDDISKSPACE -l $ARG1$ -w $ARG2$ -c $ARG3$
command[check_nt_cpuload]=$USER1$/check_nt -H $HOSTADDRESS$ -p 1248 -v CPULOAD -l $ARG1$
command[check_nt_uptime]=$USER1$/check_nt -H $HOSTADDRESS$ -p 1248 -v UPTIME
command[check_nt_clientversion]=$USER1$/check_nt -H $HOSTADDRESS$ -p 1248 -v CLIENTVERSION
command[check_nt_process]=$USER1$/check_nt -H $HOSTADDRESS$ -p 1248 -v PROCSTATE -l $ARG1$
command[check_nt_service]=$USER1$/check_nt -H $HOSTADDRESS$ -p 1248 -v SERVICESTATE -l $ARG1$
command[check_nt_memuse]=$USER1$/check_nt -H $HOSTADDRESS$ -p 1248 -v MEMUSE -w $ARG1$ -c $ARG2$
command[check_nt_fileage]=$USER1$/check_nt -H $HOSTADDRESS$ -p 1248 -v FILEAGE -l $ARG1$ -w $ARG2$ -c $ARG3$
# Custom counters (one per required counter, or define a generic one and use $ARG1$ to specify the requested counter).
command[check_nt_pagingfile]=$USER1$/check_nt -H $HOSTADDRESS$ -p 1248 -v COUNTER -l "\\Paging File(_Total)\\%% Usage","Paging File usage is %.2f %%" -w $ARG1$ -c $ARG2$