NC_Net Documentation

(This document is modified from the NS_Client readme.html.)

Shatter IT. - www.shatterit.com

NC_Net Home Page - www.shatterit.com/NC_Net

Original author - Tony Montibello amontibello@shatterit.com

Contents

Purpose
License
Requirements
History
Features
Installation
Uninstall
Configuration
Plug-in Syntax
        CPU Load
        Disk Usage
        Uptime
        Service States
        Process States
        Client Version
        Memory Usage
        File Age Usage
        Custom Counter
        Instances
Technical Information
Problems & Known Issues


Purpose

NC_Net has been developed as a drop-in replacement for the program NS_Client.

NS_Client was developed to collect performance information from Windows servers and return it to Nagios through the Check_nt plug-in. In addition to the standard metrics, a generic COUNTER function returns the value of any performance counter maintained by Windows.


License

NC_Net is released under the GNU General Public License.

NS_Client software is released under the GNU General Public License. A copy of the license is included here.


Requirements

NC_Net is implemented in C# and uses the .NET Framework 1.1.4322.

It is intended to run on Windows 2000, Windows XP, and Windows 2XXX Servers.

Some of the Framework references used:

System.Management - Access to WMI for the USEDDISKSPACE check.
System.ServiceProcess - Access to services, both for registering NC_Net itself and for running SERVICESTATE checks.
System.Diagnostics - Access to performance counters, processes, and the Event Viewer.
System.Configuration.Install - Access to the Installer class for installation as a service.

When developing NC_Net, add these references to your project before compiling.


History

12/13/04 First release - more enhancements still to come.


Features

Installation

NC_Net is a drop-in replacement for NS_Client. NS_Client must be used together with Check_nt, and the same is true of the current version of NC_Net. Future versions of NC_Net are planned to be able to send passive checks to the nsca daemon.

On the Windows machine

  1. Uninstall any older version of NC_Net (through Control Panel --> Add or Remove Programs).
  2. Download and run the install package NC_Net_setup.msi from www.shatterit.com/NC_Net
  3. Uninstall or disable NS_Client (if running).
    1. If you choose not to uninstall it, set its startup type to Manual.
  4. Start the NC_Net service.

The installation creates new folders on the system:

Start Menu --> Programs --> NC_Net

Program Files --> Shatter IT --> NC_Net

Please note: NC_Net has not been tested or configured to work with versions of Windows other than English (US).

On the Unix machine

  1. Use the Nagios plug-in Check_Nt.
  2. Follow installation instructions for the plug-in.


Uninstall

  1. Uninstall NC_Net from Control Panel --> Add or Remove Programs
  2. If the uninstall runs into problems, please review the file Manual Uninstall.


Configuration

Two parameters can currently be changed: the port and the password. More parameters are expected with future enhancements.

Parameters are currently changed through the Start parameters of the service. (Other mechanisms for changing parameters will be added in the future.)

To access the start parameters, open Control Panel --> Administrative Tools --> Services.

Then right-click the NC_Net service and choose Properties.

The service must be stopped before the parameters can be changed, so choose Stop.

Enter the parameters into the Start parameters text box. (Make sure to put a space between parameters.)

Then start the service.

port <Port #>

- The default port is 1248. (This option still has some bugs.)

password <new Password>

- The default password is None.

- The password is case sensitive.

- Check_NT also uses the default password of None when the -s option is not used.
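
As an illustration of the start parameter format described above, here is a minimal Python sketch that reads a string such as "port 1248 password Secret" into settings. The helper name parse_start_parameters and the case-insensitive keyword matching are assumptions made for this example; they are not taken from the NC_Net source.

```python
DEFAULT_PORT = 1248        # default port stated in this document
DEFAULT_PASSWORD = "None"  # default password stated in this document


def parse_start_parameters(text):
    """Parse 'port <n> password <pw>' style parameters (hypothetical helper)."""
    settings = {"port": DEFAULT_PORT, "password": DEFAULT_PASSWORD}
    tokens = text.split()  # parameters are separated by spaces
    i = 0
    while i < len(tokens):
        key = tokens[i].lower()  # assumption: keywords are matched without case
        if key in settings and i + 1 < len(tokens):
            value = tokens[i + 1]
            # The password is kept exactly as typed because it is case sensitive.
            settings[key] = int(value) if key == "port" else value
            i += 2
        else:
            i += 1  # skip anything unrecognised
    return settings
```

A non-numeric port raises ValueError in this sketch; how NC_Net itself reacts to bad input is not documented here.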

 

Check_Nt Plug-in Syntax

 

CPU Load

Syntax: check_nt -H <hostname>  -v CPULOAD -l <minutes range>,<warning percent>,<critical percent>

Check_nt send Buffer: password&2&interval

- Only one interval is sent per TCP request.

Check_nt receive Buffer: CPU_LOAD%

- Only a single number is returned in the receive buffer.

NC_Net uses a custom stack class (cpustack.cs) to store and calculate CPU load.

NC_Net saves the CPU load value every 5 seconds.

The CPU load value comes from the performance counter:

CounterName = "% Processor Time";CategoryName = "Processor";InstanceName = "_Total";

You can check several intervals in one shot. The following command gets the averages for the last 10 minutes, 60 minutes, and 24 hours.

Check_nt sends a separate request for each interval when multiple intervals are chosen.

Example: ./check_nt -H 192.168.1.1 -v CPULOAD -l 10,80,95,60,80,95,1440,80,95

Check_nt Result: CPU Load (10 min. 22%)|10min=22

Check_NT Return Codes: 0 - ok; 1 - Warning; 2 - Critical
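
To make the "one request per interval" behaviour concrete, the following Python sketch splits a -l argument into (minutes, warning, critical) groups and builds one password&2&interval send buffer per group. The function name cpuload_requests is hypothetical, and the sketch only mirrors the buffer format documented above; it is not the Check_nt source.

```python
def cpuload_requests(password, l_argument):
    """Build one (send_buffer, warning, critical) tuple per interval group."""
    values = [int(v) for v in l_argument.split(",")]
    if len(values) % 3 != 0:
        raise ValueError("expected groups of <minutes>,<warning>,<critical>")
    requests = []
    for i in range(0, len(values), 3):
        minutes, warning, critical = values[i:i + 3]
        # Command number 2 is CPULOAD; only the interval goes over the wire.
        requests.append((f"{password}&2&{minutes}", warning, critical))
    return requests
```

In this sketch the thresholds stay on the plug-in side, which matches the send buffer above carrying only the password, the command number, and the interval.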


Disk Usage

Syntax: check_nt -H <hostname>  -v USEDDISKSPACE -l <drive letter> [-w <warning percent> ] [-c <critical percent>]

NC_Net uses WMI to retrieve the free space and size of the logical disk:

WMI scope: "root\\cimv2"; WMI Class: "win32_logicalDisk"; drive: the drive letter concatenated with a colon.

WMI Query: "SELECT FreeSpace,Size FROM "+WMI Class +" WHERE Name='"+drive+"'"

Check_nt send Buffer: password&4&driveLetter

- The drive letter is a single logical drive letter (not case sensitive).

Check_nt receive Buffer: freespace&size

- The result from NC_Net contains only the free space and disk size numbers from the WMI query.

NC_Net returns -1&-1 to Check_nt if it was unable to perform the query.

 

Example: ./check_nt -H 192.168.1.1 -v USEDDISKSPACE -l C -w 80 -c 90

Check_nt Result: C:\ - total: 17.10 Gb - used: 11.03 Gb (64%) - free 6.07 Gb (36%)|used=11838660608.00

Check_nt Result for -1&-1: Free disk space : Invalid drive
Check_NT Return Codes: 0 - ok; 1 - Warning; 2 - Critical; 127 - Unknown
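
The sketch below shows, in Python, how a freespace&size reply could be turned into a used percentage and a state. The function name disk_state and the use of "greater than or equal" for the threshold comparison are assumptions for illustration; the state is returned as a name, not a numeric code.

```python
def disk_state(reply, warning=None, critical=None):
    """Return (state_name, used_percent) for a 'freespace&size' reply."""
    free_text, size_text = reply.split("&")
    free, size = float(free_text), float(size_text)
    if free < 0 or size <= 0:
        # NC_Net answers -1&-1 when the WMI query could not be performed.
        return "UNKNOWN", None
    used_percent = (size - free) / size * 100
    if critical is not None and used_percent >= critical:
        return "CRITICAL", used_percent
    if warning is not None and used_percent >= warning:
        return "WARNING", used_percent
    return "OK", used_percent
```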

Uptime

Syntax: ./check_nt -H <hostname>  -v UPTIME

This check ignores warning and critical values. Only the uptime of the machine is returned.

Check_nt send Buffer: password&3

- All other arguments are ignored.

Check_nt receive Buffer: uptime_in_seconds

- The only value returned is the uptime in seconds.

The uptime is read when the request is received and comes from the performance counter:

CounterName = "System Up Time";CategoryName = "System";InstanceName = null;

Example: ./check_nt -H 192.168.1.1 -v UPTIME

Check_nt result:  System Uptime : 2 day(s) 20 hour(s) 51 minute(s)|uptime=247918

Check_NT Return Codes: 0 - ok; 1 - Warning; 2 - Critical; 127 - Unknown
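
As a worked example of how the uptime_in_seconds value maps to the displayed text, this Python sketch converts seconds to days, hours, and minutes. The function name format_uptime is hypothetical; the output wording follows the sample result above.

```python
def format_uptime(seconds):
    """Format an uptime in seconds as 'D day(s) H hour(s) M minute(s)'."""
    days, rest = divmod(int(seconds), 86400)  # 86400 seconds per day
    hours, rest = divmod(rest, 3600)
    minutes = rest // 60                      # leftover seconds are dropped
    return f"{days} day(s) {hours} hour(s) {minutes} minute(s)"
```

For the sample value 247918 this gives 2 day(s) 20 hour(s) 51 minute(s), which agrees with the result line shown above.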

Service States

Syntax: check_nt -H <hostname>  -v SERVICESTATE [-d SHOWALL] -l <service 1>[,<service 2>,<service 3>,...]

You can specify several services in one request. The list must not contain blanks!

If not all services are running, you get the faulty one(s) and a critical state.

If any listed name is not a service on the system, you get a warning and that name is listed as unknown.

NC_Net uses ServiceProcess.ServiceController.GetServices to get the list of services on the system.

Check_nt send Buffer: password&5&SHOW[ALL|FAIL]&Service[&service2]

- The third argument is either showall or showfail. Anything else is ignored (even if it is a service name) and showfail is used.

Check_nt receive Buffer: return code&Detailed result string

return codes from NC_Net are:

2 - critical - at least one service reported as not running

1 - warning - no services critical, but at least one service not found in the service list

0 - ok - all services found and running

Example: ./check_nt -H 192.168.1.1 -p 1248 -v SERVICESTATE -d SHOWALL -l LanmanServer,Schedule

Check_NT Result: Lanmanserver: Started - Schedule: Started

Check_NT Result: All services are running
Check_NT Return Codes: 0 - ok; 1 - Warning; 2 - Critical; 127 - Unknown
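
The return code rules above can be sketched in Python as follows. The function name service_check, the dictionary input, and the words "Stopped" and "Unknown" in the detail string are assumptions for illustration; only "Started" and "All services are running" appear in the sample results above.

```python
def service_check(requested, running_by_name, show_all=False):
    """Return 'code&detail' following the documented 2/1/0 rules."""
    code = 0
    parts = []
    for name in requested:
        running = running_by_name.get(name)  # None means not in service list
        if running is None:
            code = max(code, 1)              # warning unless already critical
            parts.append(f"{name}: Unknown")
        elif not running:
            code = 2                         # any stopped service is critical
            parts.append(f"{name}: Stopped")
        elif show_all:
            parts.append(f"{name}: Started")
    detail = " - ".join(parts) if parts else "All services are running"
    return f"{code}&{detail}"
```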


Process States

Syntax: check_nt -H <hostname>  -v PROCSTATE [-d SHOWALL] -l <process 1>[,<process 2>,<process 3>,...]

You can specify several processes in one request. The list must not contain blanks!

If not all processes are running, you get the faulty one(s) and a critical state.

NC_Net uses System.Diagnostics.Process.GetProcesses to get the list of processes.

Check_nt send Buffer: password&6&SHOW[ALL|FAIL]&Process[&process2]

- The third argument is either showall or showfail. Anything else is ignored (even if it is a process name) and showfail is used.

- Process names are not case sensitive but must match in full, including the .exe extension.

- idle and system always return true and are not actually checked.

Check_nt receive Buffer: return code&Detailed result string

return codes from NC_Net are:

2 - critical - at least one process not found in the list of running processes

0 - ok - all processes found

Example: ./check_nt -H 192.168.1.1 -v PROCSTATE -l NC_Net,nc_net.exe,NC_Net.exe -d SHOWALL

Check_NT Result: NC_Net: not running - nc_net.exe: Running - NC_Net.exe: Running
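
The matching rules for process names can be illustrated with a small Python sketch. The function name process_states is hypothetical, and the list of running names is assumed to include the .exe extension, as the example above suggests.

```python
ALWAYS_RUNNING = {"idle", "system"}  # reported as running without a check


def process_states(requested, running):
    """Map each requested name to True (running) or False (not running)."""
    running_lower = {name.lower() for name in running}
    states = {}
    for name in requested:
        key = name.lower()  # comparison ignores case
        # The full name, including .exe, must match a running process.
        states[name] = key in ALWAYS_RUNNING or key in running_lower
    return states
```

With NC_Net.exe running, the request NC_Net,nc_net.exe,NC_Net.exe yields not running, running, running, the same pattern as the result line above.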

Client Version

Syntax: check_nt -H <hostname>  -v CLIENTVERSION

Check_nt send Buffer: password&1

All other arguments are ignored.

Check_nt receive buffer: Version_String

Returns the NC_Net version.


Memory Usage

Syntax: check_nt -H <hostname> -v MEMUSE [-w <warning percent> ] [-c <critical percent>]

NC_Net uses performance counters to retrieve results:

CounterName = "Commit Limit";CategoryName = "Memory";InstanceName = null;

CounterName = "Committed Bytes";CategoryName = "Memory";InstanceName = null;

Check_nt send Buffer: password&7

All other arguments are ignored.

Check_nt receive buffer: Commit Limit&Committed Bytes

- NC_Net returns the Commit Limit value and the Committed Bytes value to Check_nt.

If there was a problem retrieving these values, NC_Net returns:

-1&Could not process memory usage check

Example: 
./check_nt -H 192.168.1.1 -p 1248 -v MEMUSE -w 80 -c 90

Check_NT result: Memory usage: total:619.08 Mb - used: 315.64 Mb (51%) - free: 303.44 Mb (49%)|used=303.44
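
The following Python sketch shows one way to interpret the Commit Limit&Committed Bytes reply, including the -1 error form. The function name parse_memuse and the returned dictionary keys are assumptions made for this example.

```python
def parse_memuse(reply):
    """Split 'Commit Limit&Committed Bytes' into totals and a used percent."""
    limit_text, rest = reply.split("&", 1)
    limit = float(limit_text)
    if limit <= 0:
        # Error form: '-1&Could not process memory usage check'
        raise ValueError(rest)
    committed = float(rest)
    return {
        "total": limit,
        "used": committed,
        "free": limit - committed,
        "used_percent": committed / limit * 100,
    }
```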
 


File Age Usage
(Not yet implemented in NC_Net. The notes below come from the NS_Client documentation.)

Syntax: check_nt -H <hostname> -p <port> -v FILEAGE -l <filename>[,<date format>] [-w <warning> ] [-c <critical >]

The default date format is set by this line in the NS_Client source:

strftime(description, 50, "Date: %D %I:%M:%S %p", localtime(&rettime));

Change the "Date: %D %I:%M:%S %p" part to the default you want and then recompile.

Example: 
./check_nt -H 192.168.1.1 -p 1248 -v FILEAGE -l "c:\\program files\\nsclient\\pnsclient.exe" -w 1440 -c 2880
./check_nt -H 192.168.1.1 -p 1248 -v FILEAGE -l "c:\\program files\\nsclient\\pnsclient.exe","Date: %d-%m-%Y %I:%M:%S %p" -w 1440 -c 2880


Custom Counter

Syntax: check_nt -H <hostname>  -v COUNTER -l <counter name>[,<counter description>] [-w <warning percent> ] [-c <critical percent>]

Check_nt send Buffer: password&8&Counter_to_check&CounterDescription

- The counter description is used by Check_nt; NC_Net neither expects nor uses it.

Check_nt receive Buffer: decimal_counter_value

- A decimal value is returned by NC_Net.

- 5 decimal places are returned.

Error returns to Check_nt:

-1&No Counter to check  (if fewer than three arguments are sent to NC_Net)

-1&Could not process check counter  (if the performance counter check failed for any reason)

 

Details of Check_NT results:

If warning equals critical and is less than the result, then OK

Else if warning equals critical and is greater than or equal to the result, then Critical

Else if the counter could not be found, the result is -1 and the plug-in returns OK (there is no Unknown state for this check)

Else if warning is less than critical and critical is less than or equal to the result, then Critical

Else if warning is less than critical and warning is less than or equal to the result, then Warning

Else if warning is less than critical and the result is less than warning, then OK

Else if critical is less than warning and the result is less than or equal to critical, then Critical

Else if critical is less than warning and the result is less than or equal to warning, then Warning

Else if critical is less than warning and the result is greater than warning, then OK
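
The rule list above can be read as a small decision function. The Python sketch below transcribes the rules in the order given; the function name counter_state is hypothetical, and this is a reading of the list, not the Check_nt source.

```python
OK, WARNING, CRITICAL = 0, 1, 2


def counter_state(result, warning, critical):
    """Apply the documented COUNTER threshold rules in their listed order."""
    if warning == critical:
        # Equal thresholds: OK only when the result is above them.
        return OK if warning < result else CRITICAL
    if result == -1:
        return OK  # counter not found; there is no Unknown for this check
    if warning < critical:
        # Normal thresholds: higher values are worse.
        if result >= critical:
            return CRITICAL
        if result >= warning:
            return WARNING
        return OK
    # critical < warning: inverted thresholds, lower values are worse.
    if result <= critical:
        return CRITICAL
    if result <= warning:
        return WARNING
    return OK
```

For example, with -w 80 -c 90 a result of 85 is a Warning, while with -w 20 -c 10 a result of 5 is Critical.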

 

 

Example: 
./check_nt -H 192.168.1.1 -p 1248 -v COUNTER -l "\\Paging File(_Total)\\%% Usage","Paging file usage is %.2f %%" -w 80 -c 90
./check_nt -H 192.168.1.1 -p 1248 -v COUNTER -l "\\Process(_Total)\\Thread Count","Thread Count: %.f" -w 600 -c 800
./check_nt -H 192.168.1.1 -p 1248 -v COUNTER -l "\\Server\\Server Sessions","Server Sessions: %.f" -w 20 -c 30  

Instances

Syntax: check_nt -H <hostname>  -v INSTANCES -l <Category object>[,<Category object2>]

Check_nt send Buffer: password&10&Category&Category

NC_Net can check multiple categories

Categories are not case sensitive

Check_nt receive Buffer: String of descriptive results.

The basic output is the category name, a colon, and a comma-separated list of instances.

If there are no instances, the list is empty.

Category outputs are delimited with a dash (-) in the output.

If a category does not exist, it is also listed with an empty list.

String Format:

Cat1: inst1,inst2,inst3 - cat2: inst1,inst2

Example

./check_nt -H 192.168.1.1 -p 1248 -v INSTANCES -l Process
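
A reply in the string format above can be split back into categories and instances. The Python sketch below is one way to do that; the function name parse_instances is hypothetical, and it assumes that neither " - " nor ":" occurs inside a category or instance name, which real counter instances may not guarantee.

```python
def parse_instances(reply):
    """Turn 'Cat1: a,b - cat2: c' into {'Cat1': ['a', 'b'], 'cat2': ['c']}."""
    result = {}
    for chunk in reply.split(" - "):          # categories are dash-delimited
        category, _, instances = chunk.partition(":")
        names = [n.strip() for n in instances.split(",") if n.strip()]
        result[category.strip()] = names      # empty list if no instances
    return result
```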

 

Other Returns from Check NT:

Socket timeout after 10 seconds
return Code:2

Technical Information

Windows agent

NSClient has been developed using Borland Delphi 7. It is installed as a service.

Every five seconds, NSClient queries Windows for the CPU load and stores the value in a circular buffer that keeps the measurements for the last 24 hours. It also collects the uptime, memory, and disk utilization metrics every 5 seconds and stores them in global variables. When requested, the client returns these results from the global variables.
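
The circular buffer idea can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the Delphi or C# implementation: the class name CpuHistory is hypothetical, and a bounded deque stands in for the buffer.

```python
from collections import deque

SAMPLE_SECONDS = 5
SAMPLES_PER_DAY = 24 * 60 * 60 // SAMPLE_SECONDS  # 17280 samples in 24 hours


class CpuHistory:
    """Keep the last 24 hours of 5-second CPU load samples."""

    def __init__(self):
        # A bounded deque drops the oldest sample once it is full.
        self.samples = deque(maxlen=SAMPLES_PER_DAY)

    def record(self, percent):
        self.samples.append(percent)

    def average(self, minutes):
        count = minutes * 60 // SAMPLE_SECONDS
        if count <= 0:
            raise ValueError("minutes must be positive")
        recent = list(self.samples)[-count:]  # newest samples are at the end
        return sum(recent) / len(recent) if recent else 0.0
```

If fewer samples exist than the interval asks for, this sketch averages what is available; the document does not say how the real agents handle that case.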

Have a look at the source code if you want to know more. It is well documented, so it should not be too painful to figure out how it works. Some functions and pieces of code were found on the internet, and I thank the programmers who released them as open source. When possible, credits are left in the code.

Unix plug in

The Unix plug-in has been written in C, using a template by Ethan Galstad. It mostly uses common functions and should be compiled in the same directory as all other plug-ins. I hope that it will be included in the common distribution.


Problems & Known Issues

Let us know if you encounter any problems.

For the most part, NC_Net reports errors to the Application Log in Event Viewer.

More work is needed on error reporting. An event should be logged in all cases, but the event text is not always accurate and is sometimes not very meaningful. Event reporting will be improved gradually as new revisions are released, with the focus on fixing current problems.

- The current version has some problems with changing ports.

- File age is not yet implemented.

 

 

 

# NSClient basic return types

command[check_nt_disk]=$USER1$/check_nt -H $HOSTADDRESS$ -p 1248 -v USEDDISKSPACE -l $ARG1$ -w $ARG2$ -c $ARG3$
command[check_nt_cpuload]=$USER1$/check_nt -H $HOSTADDRESS$ -p 1248 -v CPULOAD -l $ARG1$
command[check_nt_uptime]=$USER1$/check_nt -H $HOSTADDRESS$ -p 1248 -v UPTIME
command[check_nt_clientversion]=$USER1$/check_nt -H $HOSTADDRESS$ -p 1248 -v CLIENTVERSION
command[check_nt_process]=$USER1$/check_nt -H $HOSTADDRESS$ -p 1248 -v PROCSTATE -l $ARG1$
command[check_nt_service]=$USER1$/check_nt -H $HOSTADDRESS$ -p 1248 -v SERVICESTATE -l $ARG1$
command[check_nt_memuse]=$USER1$/check_nt -H $HOSTADDRESS$ -p 1248 -v MEMUSE -w $ARG1$ -c $ARG2$
command[check_nt_fileage]=$USER1$/check_nt -H $HOSTADDRESS$ -p 1248 -v FILEAGE -l $ARG1$ -w $ARG2$ -c $ARG3$

# Custom counters (one per required counter, or define a generic one and use $ARG1$ to specify the requested counter).

command[check_nt_pagingfile]=$USER1$/check_nt -H $HOSTADDRESS$ -p 1248 -v COUNTER -l "\\Paging File(_Total)\\%% Usage","Paging File usage is %.2f %%" -w $ARG1$ -c $ARG2$