Release 0.12.5 latest
Windows PDH overhaul, expression functions, boot.ini paths
This release lands a long-overdue stabilisation pass on the Windows PDH subsystem (multiple long-standing crashes and
counter-availability issues), adds first-class functions in detail-syntax / warn / crit expressions, and moves
path-resolver overrides from settings into boot.ini to unblock future moves of config and certificate storage.
Highlights
- Windows PDH subsystem overhaul. Fixes #547 / #592 (service crash when PDH misbehaves on a particular machine), #634 (counters now retried when initially unavailable instead of staying broken until restart), and #652 / #906 (better English-counter fallback on non-English Windows).
- Functions in expressions and templates (#281). format_bytes, convert_bytes, scale, composable with and/or/not; usable in detail-syntax, top-syntax, warn, crit, and filter. Today exposed by check_pdh; rolling out elsewhere.
- check_network understands NIC teams (#625). New mode=adapter / mode=both reads Win32_PerfRawData_Tcpip_NetworkAdapter, which is the only source that reports the team aggregate.
- Aliases in CheckHelpers. A new alias section under [/settings/check helpers/alias] provides the historical CheckExternalScripts alias mechanism without dragging in the external-scripts machinery. Preferred place for new aliases.
- WEB: disable admin user option. Suppresses the built-in admin user entirely, for monitoring-only exposures where remote reconfiguration must be impossible even if credentials leak.
- Plugin prepare-shutdown hook. Modules get a clean teardown phase before unload: listening sockets and pollers stop accepting work cleanly. Wired up in the network/scheduler modules.
- Path overrides moved from settings to boot.ini. [/paths] in nsclient.ini is no longer consulted; a new [paths] section in boot.ini (and a --path KEY=VALUE CLI flag) takes its place. This is a breaking change for the small number of users who relied on [/paths]; see Upgrade notes below.
Detailed changes
Windows PDH — stability overhaul
Long-standing instability in the PDH-based Windows performance-counter subsystem, addressed in one pass:
- #547 / #592 — service crash when PDH misbehaves. Hardened the enumeration and lookup paths against the partial / inconsistent results PDH returns on certain machine states. PDH enumeration buffers were refactored to use smart buffers throughout, removing the manual sizing loops where the bug lived.
- #634 — counters now retried when initially unavailable. Previously a counter that wasn't ready at boot would stay broken until the service was restarted; the collector now re-attempts on the normal collection cadence.
- #652 / #906 — non-English Windows counter lookup. Improved the English-counter fallback path so checks that reference counters by English name keep working on localised installs.
- Resource leak in PDH counter lookup — handle leaked on the error path of counter-name → counter-path resolution.
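The retry behaviour described for #634 can be pictured with a small Python sketch. This is not the NSClient++ collector; the Collector class and the open_counter callable are hypothetical names used only to show "keep unavailable counters pending and re-attempt them on every collection pass".

```python
class Collector:
    """Toy collector: counters that fail to open stay pending and are retried."""

    def __init__(self, open_counter):
        # open_counter is a hypothetical callable: name -> handle,
        # raising OSError while the counter is not yet available.
        self.open_counter = open_counter
        self.handles = {}
        self.pending = set()

    def add(self, name):
        self.pending.add(name)

    def collect_once(self):
        # Iterate over a sorted copy so the pending set can be modified safely.
        for name in sorted(self.pending):
            try:
                self.handles[name] = self.open_counter(name)
            except OSError:
                continue  # still unavailable: try again on the next pass
            self.pending.discard(name)
        return sorted(self.handles)
```

Before the fix, the equivalent of `pending` was only walked once at start-up, which is why a counter that was slow to appear stayed broken until restart.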
CheckSystem — expression functions and counter scaling
#281. The expression language now supports function calls, usable in any context that takes an expression (filter,
warn, crit) or a template (detail-syntax, top-syntax, perf-syntax). Use the %(...) placeholder form —
the legacy ${...} form cannot capture nested parentheses and cannot call functions.
Built-ins exposed by check_pdh today:
| Function | Purpose |
|---|---|
| format_bytes(value) | Auto-scaled human-readable bytes, e.g. 4194304 → "4MB" (1024-based) |
| format_bytes(value, 'MB') | Fixed unit. B, K/KB, M/MB, G/GB, T/TB |
| convert_bytes(value, 'MB') | Numeric value in the named unit, for thresholds |
| scale(value, divisor) | Divide by an arbitrary divisor (e.g. 1 000 000 for Mbps) |
# Threshold in MB, display human-friendly
check_pdh counter=memory_bytes \
"warning=convert_bytes(value, 'MB') > 500" \
"detail-syntax=%(alias) = %(format_bytes(value))"
# Network rates as Mbps (decimal — use scale, not convert_bytes)
check_pdh counter=bytes_per_sec \
"detail-syntax=Speed = %(scale(value, 1000000)) Mbps"
check_pdh also exposes variable-style shortcuts (value_human, value_mb, value_gb, …) — syntactic sugar for the
corresponding format_bytes / convert_bytes calls. Reach for variables when one of the prebuilt units fits; reach for
functions when you need a custom unit, a custom divisor, or composition with other expressions.
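To make the arithmetic behind these functions concrete, here is a rough Python model of them. It is an illustration only, not the NSClient++ implementation: the 1024-based scaling and the unit names come from the table above, while the rounding and the exact output formatting are assumptions.

```python
# Exponent of 1024 for each unit name listed in the table above.
UNITS = {"B": 0, "K": 1, "KB": 1, "M": 2, "MB": 2, "G": 3, "GB": 3, "T": 4, "TB": 4}


def convert_bytes(value, unit):
    """Numeric value expressed in the named unit (1024-based)."""
    return value / (1024 ** UNITS[unit.upper()])


def format_bytes(value, unit=None):
    """Human-readable string; picks the largest fitting unit when none is given."""
    if unit is None:
        unit = "B"
        for candidate in ("KB", "MB", "GB", "TB"):
            if abs(value) < 1024 ** UNITS[candidate]:
                break
            unit = candidate
    # The ':g' formatting is an assumption; the real output format may differ.
    return f"{convert_bytes(value, unit):g}{unit}"


def scale(value, divisor):
    """Plain division, e.g. bytes/s to decimal Mbps-style figures."""
    return value / divisor
```

With this model, `convert_bytes(value, 'MB') > 500` compares a plain number, which is why convert_bytes suits thresholds and format_bytes suits display strings.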
CheckSystem — check_network NIC team support
#625. The default mode=interface reads Win32_PerfRawData_Tcpip_NetworkInterface (one row per physical adapter —
does not report team aggregates). New modes:
- mode=adapter: reads Win32_PerfRawData_Tcpip_NetworkAdapter, which includes the team aggregate as a virtual interface named after the team. The team aggregate is the row with no matching Win32_NetworkAdapter MAC entry, so it can be selected with filter=MAC = ''.
- mode=both: returns both sources, tagged with a new source keyword for filtering.
# Monitor a NIC team aggregate
check_network mode=adapter "warn=total > 100M" "crit=total > 500M"
# Alert only on the team adapter
check_network mode=adapter "filter=MAC = ''"
CheckHelpers — aliases
Aliases (a fixed command + fixed argument list exposed under a new name) have historically lived in
[/settings/external scripts/alias], requiring CheckExternalScripts to be loaded even when the alias only
invoked internal commands. A new section under [/settings/check helpers/alias] provides the same mechanism in
CheckHelpers, with no external-scripts dependency.
[/modules]
CheckHelpers = enabled
[/settings/check helpers/alias]
my_check_cpu = check_cpu warn=load>80 crit=load>90
my_check_process = check_process "process=$ARG1$" "crit=state != 'started'"
Both modules can coexist; each reads its own section. Last-loaded wins on name collisions — pick one as the home for new aliases so you don't have to remember which is which.
WEBServer — disable admin user (cccc14e4)
New boolean under [/settings/WEB/server] that suppresses the built-in admin user entirely: it is not seeded on first
boot, any pre-existing admin row in [/settings/WEB/server/users] is dropped at load time, and the "no users → re-add
admin" fallback is skipped. For monitoring-only WEB exposures where remote reconfiguration must be impossible even if
credentials leak.
[/settings/WEB/server]
disable admin user = true
[/settings/WEB/server/users/readonly]
password = ...
role = monitoring
Mirrored on the install command:
nscp web install --disable-admin
Mutually exclusive with --password (the install would create no user, so a password would have nowhere to go — the
command refuses explicitly).
Service — prepare_shutdown plugin hook
Plugins now receive a prepare_shutdown callback before unload, giving them a chance to flush state, stop accepting
new work, and tear down listening sockets cleanly rather than racing the unload. Wired up in NRPEServer, NSCAServer,
NSClientServer, CheckMKServer, WEBServer, and Scheduler. The callback is optional — custom plugins built against
the older API continue to work unchanged.
Service — path overrides via boot.ini and --path CLI (fbdfe257, d2075b99)
Path-resolver tokens (module-path, certificate-path, log-path, cache-path, scripts, web-path, …) used to be
overridden via [/paths] in nsclient.ini. That doesn't work for the upcoming move of writable state out of the
install directory: the path resolver is needed before the main INI is opened, so overriding where the INI lives must
happen earlier.
The override location is now boot.ini:
; boot.ini
[settings]
common = ini://${shared-path}/nsclient.ini
[paths]
module-path = C:\Program Files\NSClient++\modules
log-path = D:\nscp\logs
cache-path = D:\nscp\cache
A --path KEY=VALUE CLI flag layers on top of boot.ini and wins — useful for build tooling and CI:
nscp service --run \
--path module-path=C:\build\modules \
--path log-path=C:\build\log
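The layering order (built-in defaults, then [paths] in boot.ini, then --path on the command line) can be modelled in a few lines of Python. This is a sketch of the precedence only; resolve_paths and the default values are hypothetical, and the real resolver is part of the service itself.

```python
import configparser

BOOT_INI = r"""
[settings]
common = ini://${shared-path}/nsclient.ini

[paths]
module-path = C:\Program Files\NSClient++\modules
log-path = D:\nscp\logs
"""


def resolve_paths(defaults, boot_ini_text, cli_overrides):
    """Merge path tokens: defaults < boot.ini [paths] < --path KEY=VALUE."""
    parser = configparser.ConfigParser(interpolation=None)  # keep ${...} literal
    parser.read_string(boot_ini_text)
    paths = dict(defaults)
    if parser.has_section("paths"):
        paths.update(parser.items("paths"))
    for item in cli_overrides:
        key, _, value = item.partition("=")
        paths[key] = value  # the CLI flag wins over boot.ini
    return paths
```

Calling `resolve_paths(defaults, BOOT_INI, [r"log-path=C:\build\log"])` keeps module-path from boot.ini and takes log-path from the command line.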
IcingaClient — built-in alias and container test
Adds a built-in alias for the standard Icinga submission flow and a Docker-based end-to-end test so the integration is exercised on every build.
simpleini — NUL-termination fix for non-UTF-8 INI files
The INI loader passed an explicit length to mbstowcs, but per POSIX mbstowcs(NULL, src, n) ignores n and scans
until \0. On non-UTF-8 stores the size probe could walk past the buffer. The buffer now carries an explicit
terminator.
Upgrade notes
- [/paths] users: if you had a [/paths] section in your nsclient.ini, copy the entries into [paths] in boot.ini. The settings-side section is no longer consulted. The default install does not use [/paths] and is unaffected.
- Custom-plugin authors: the new prepare_shutdown callback is optional. If your module manages sockets or background threads, you should implement it; unload is now expected to be a last-resort teardown rather than the place where listeners get stopped.
- check_pdh configs using ${...} for function calls: there are none today (the feature is new), but if you adapt examples from third-party docs that use ${format_bytes(...)}, rewrite to %(format_bytes(...)). The ${...} form stops at the first } and cannot parse nested parentheses.
- Monitoring-only WEB deployments: flip disable admin user = true under [/settings/WEB/server] and define your own read-only users (or rely on allow anonymous access = true with a tightly scoped anonymous role). The built-in admin will not be seeded, even on first boot.
Full Changelog: https://github.com/mickem/nscp/compare/0.12.4...0.12.5
Release 0.12.4
0.12.4 — Regression fixes for Icinga and CheckSystem
This is a maintenance release focused on regressions introduced since 0.12.3. No new features; no breaking changes for configurations that don't hit the items below.
Highlights
- Icinga check_nscp_api works again. The query-string credential path was removed in 0.12.3 for security (commit 340b8db1). That hardening broke Icinga's bundled check_nscp_api plugin, which still passes the password as ?password=.... This release reinstates the legacy path behind a User-Agent allowlist (default: clients whose User-Agent matches Icinga/check_nscp_api); every other client keeps the strict post-340b8db1 rejection.
- Better "module not found" messages on Windows. When a configured module fails to load, the error now points at the WiX installer feature that ships the module (e.g. NRPEServer → "NRPE Support"), so operators can fix the cause (re-run the installer and tick that feature) without reading source.
- IcingaClient.dll is now in the installer. The DLL was being built but not packaged, so the corresponding Op5/Icinga client features were unusable on stock Windows installs.
- os_updates.status keyword renamed to update_status. The previous name clashed with the built-in status keyword every check exposes, which made filter / detail-syntax expressions ambiguous on check_os_updates. Any custom config that referenced os_updates.status must be updated; see Behaviour change below.
- check_wmi no longer crashes on warn/crit filters. A use-after-mutation in the WMI row iterator caused an access violation whenever a warn= or crit= filter touched a column value (e.g. check_wmi "query=Select Version from win32_OperatingSystem" "warn=Version not like '6.3'"). Affected every filter that exercised the post-iteration deferred-evaluation path.
Detailed changes
WebServer — legacy query-string authentication restored for specific clients (94b2057d)
The 0.12.3 hardening removed three paths because URL-borne credentials and tokens leak into browser history, proxy logs, and Referer headers:
- GET/POST /auth/token?password=...
- GET/POST /auth/logout?token=...
- ?TOKEN=... / ?__TOKEN=... as a session-token fallback on any endpoint
Removing them broke Icinga's bundled check_nscp_api plugin, which still ships with the query-string mechanism. To
unblock that integration without re-opening the vector to browsers and arbitrary scrapers, this release gates the legacy
paths on a User-Agent allowlist:
- New setting [/settings/WEB/server] legacy query auth user agents. Comma-separated list of User-Agent substrings (case-insensitive). A request whose User-Agent contains any pattern is allowed to use the legacy query-string mechanism; everything else still gets the 0.12.3 rejection (410 Gone on /auth/*, 403 on ?TOKEN=).
- Default: Icinga/check_nscp_api. This anchors on the specific plugin name, so unrelated tooling that merely mentions "Icinga" in its User-Agent doesn't slip through.
- Set to an empty string to disable the fallback entirely (matches the strict 0.12.3 behaviour).
- The 410 / 403 rejection log lines now mention this setting as the escape hatch so operators don't have to dig through source to find it.
Security posture, in short: this is not a defence against malicious clients — an attacker can spoof the User-Agent — but
it keeps the legacy vector off the default surface for browsers, scrapers, and anything else that isn't
check_nscp_api.
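The allowlist rule (comma-separated substrings, case-insensitive, empty string disables) is simple enough to state as code. The Python below is a sketch of that rule as described in these notes, not the WEBServer source; the function name is hypothetical and trimming whitespace around list entries is an assumption.

```python
def legacy_query_auth_allowed(user_agent, setting="Icinga/check_nscp_api"):
    """True if the User-Agent contains any configured substring (case-insensitive)."""
    # Assumption: entries are trimmed and empty entries are ignored,
    # so an empty setting yields no patterns and disables the fallback.
    patterns = [p.strip().lower() for p in setting.split(",") if p.strip()]
    ua = (user_agent or "").lower()
    return any(pattern in ua for pattern in patterns)
```

A request from "Icinga/check_nscp_api/2.14" passes with the default setting, while a browser whose User-Agent merely contains "Icinga" does not.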
Service — installer-feature hints in module-load errors (793c3ee1)
When a referenced module's DLL isn't on disk (typically because the operator didn't tick the relevant feature in the Windows installer), the error now ends with a hint:
Failed to load NRPEServer: (module 'NRPEServer' is part of the 'NRPE Support' installer feature; re-run the
NSClient++ installer and enable that feature, or see installers/installer-NSCP/Product.wxs for the full feature map)
Covers every module shipped by the MSI: CheckPlugins (the bulk of check_* modules), NRPE Support, Check MK Support,
NSCA / NSCA-NG, WEB Server, Lua / Python scripting, OP5 / Elastic / Icinga client, etc.
Hint is Windows-only — on Linux the package manager handles module installation and the hint would be misleading.
Installer — IcingaClient.dll added (3a9af3cf)
IcingaClient.dll is built by the CheckSystem solution but was missing from Product.wxs, so it was never shipped. The
Op5 → Icinga integration path was effectively broken on stock Windows installs. The DLL is now in the "Various client
plugins" feature alongside GraphiteClient, SMTPClient, SyslogClient, etc.
CheckSystem — check_os_updates keyword rename (cf3613e2)
The check_os_updates filter previously exposed a per-item field called status (overall update status: up_to_date /
pending / error).
Every check also exposes a built-in top-level status (OK / WARNING / CRITICAL / UNKNOWN), so filter and detail-syntax
expressions like status = 'pending' were ambiguous — a regression caught by users upgrading from 0.11.x. The per-item
field has been renamed to update_status.
The built-in status keyword (OK/WARNING/CRITICAL) is unaffected.
Upgrade notes
- Icinga users: check_nscp_api should start working again after the upgrade with no config changes. If you have a non-stock Icinga probe that uses a different binary name, set [/settings/WEB/server] legacy query auth user agents to a substring matching its User-Agent (or to plain Icinga to broaden the match beyond the default).
- Strict-deployment operators: if you want the strict 0.12.3 behaviour (no query-string credentials, no exceptions), set [/settings/WEB/server] legacy query auth user agents = (empty).
Full Changelog: https://github.com/mickem/nscp/compare/0.12.3...0.12.4
Release 0.12.3
Protocols, Security and bug fixes
This release rolls up everything since the last stable: five pre-releases (0.11.31, 0.11.32, 0.11.33, 0.12.1,
0.12.2) plus the latest in-development changes.
The headline themes are:
- New monitoring scenarios: first-class Checkmk and Icinga 2 integration, plus a real check_net family.
- A modern Web UI and REST API: events, metadata, settings DELETE, filterable lists, dedicated widgets for PDH counters and real-time filters.
- Hardened by default: 0.12.2 is a security release that closes listener defaults that used to be silently permissive (empty allowed hosts, plaintext check_nt, query-string tokens, etc.).
- Many long-standing check fixes: check_service, check_process, check_files, check_drivesize, check_uptime, CheckLogFile, and the shared filter/threshold engine all behave correctly now.
> Read the Breaking changes section before upgrading: several long-standing-but-incorrect behaviours have been corrected and a number of listener defaults are now fail-closed. If you have an existing configuration, plan to review it.
TL;DR for end users
- New scenario: Checkmk agent integration. Point a Checkmk site at port 6556 and you get a native-looking agent dump. See scenarios/check-mk.md.
- New scenario: Icinga 2 passive submission. A new IcingaClient module submits passive results to Icinga 2's REST API as an alternative to NSCA / NRDP.
- New scenario: NSCA-ng. A new hardened NSCAngClient with PSK and AEAD-first cipher selection.
- Native cross-platform network checks: check_tcp, check_dns, check_http, check_ntp_offset, check_connections.
- Native Windows registry checks: check_registry_key, check_registry_value.
- HTTP proxy support for every HTTP-based client (NRDP, Elastic, Op5, Icinga, the configuration loader, ...).
- Windows ROOT trust store auto-export: HTTPS-bound checks validate certificates against the system trust store automatically.
- A modern Web UI with filterable lists, settings diff, dashboard, and dedicated CheckSystem widgets.
- New REST endpoints: GET/DELETE /api/v2/events, GET /api/v2/metadata, DELETE /api/v2/settings/.... Covered in api/rest/.
- Linux real-time metrics: the same background CPU/memory/disk/network/load sampling that Windows has had for years.
- Many bug fixes in check_service, check_process, check_files, CheckLogFile, the filter/threshold engine and the HTTP stack.
Major new features
Checkmk agent integration
NSClient++ can now serve a Checkmk-compatible agent dump on TCP port 6556. A real Checkmk site can register the host
with tag_agent = cmk-agent, discover services, and run checks — no proxy, no NSCA gateway.
Enable it:
[/modules]
CheckMKServer = enabled
LUAScript = enabled
CheckSystem = enabled
CheckDisk = enabled
CheckHelpers = enabled
[/settings/check_mk/server]
port = 6556
allowed hosts = 127.0.0.1,
submission ttl = 60 ; seconds, default 60
mrpe channel = check_mk-mrpe
local channel = check_mk-local
Out-of-the-box sections (no extra config):
| Section | Contents |
|---|---|
| <<>> | Version, OS, hostname |
| <<>> | Unix epoch (Windows clock-skew check) |
| <<>> | Seconds since boot (read from internal metrics store) |
| <<>> | MemTotal:/MemFree:/SwapTotal:/SwapFree: (from metrics store) |
| <<>> | Per-volume size/used/free/mountpoint (Windows) |
| <<>> | name state/start_type display_name per Windows service |
| <<>> | (user,vsz_kb,rss_kb,cputime,pid) cmdline per process |
Expose any nscp check as a Checkmk service under <<>>:
[/settings/check_mk/server/local]
CPU Load = command=check_cpu warn=load>80 crit=load>95
Disk C = command=check_drivesize drive=C: "warn=free<20%" "crit=free<10%"
MRPE relay under <<>>:
[/settings/check_mk/server/mrpe]
Uptime = command=check_uptime warn=uptime<2d
Memory = command=check_memory type=committed warn=used>80% crit=used>90%
Documentation: https://nsclient.org/docs/scenarios/check-mk.md
IcingaClient — Icinga 2 REST API submission
A new client module submits passive check results directly to an Icinga 2 master/satellite via the
/v1/actions/process-check-result REST endpoint, as an alternative to NSCA or NRDP.
[/modules]
IcingaClient = enabled
[/settings/IcingaClient/targets/default]
address = https://icinga2.example.com:5665
username = nscp
password = secret
hostname = ${hostname}
nscp client --module IcingaClient \
--command submit_icinga \
--address https://icinga2.example.com:5665 \
--username nscp --password secret \
--command heartbeat \
--result 0 \
--message "Hello from NSClient++" \
--ensure-objects
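For orientation, the request body that Icinga 2's process-check-result action accepts can be sketched in Python. The field names (type, filter, exit_status, plugin_output) follow the public Icinga 2 API documentation; whether IcingaClient builds its request exactly this way is an assumption, and the host and service names are placeholders.

```python
import json


def build_process_check_result(host, service, exit_status, output):
    """Body for POST /v1/actions/process-check-result (Icinga 2 REST API)."""
    return {
        "type": "Service",
        # Icinga 2 filter expression selecting one service on one host.
        "filter": f'host.name=="{host}" && service.name=="{service}"',
        "exit_status": exit_status,  # 0=OK, 1=WARNING, 2=CRITICAL, 3=UNKNOWN
        "plugin_output": output,
    }


# Placeholder names; the body would be sent with HTTP basic auth and
# an "Accept: application/json" header.
body = json.dumps(build_process_check_result(
    "nscp-host", "heartbeat", 0, "Hello from NSClient++"))
```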
NSCA-ng client
A new NSCAngClient module ships a hardened NSCA-ng submission client with PSK support, AEAD-first cipher selection,
and connection retry logic.
Native support for Windows CA-store
On startup NSClient++ now exports the machine's ROOT certificate store as a single PEM bundle, so any check that does
TLS (check_http, IcingaClient, NRDP, ...) can validate certificates against the trust store the rest of Windows
already uses.
check_http url=https://www.ibm.com
OK: https://www.ibm.com -> 303 ok (0B in 33ms)
check_http url=https://self-signed.badssl.com/
CRITICAL: https://self-signed.badssl.com/ -> 0 error: Failed to connect ... certificate verify failed
CheckNet — five new (cross-platform) checks
CheckNet graduated from a placeholder into a full network-check module. All five commands work over NRPE as well as
locally:
- check_tcp: open a TCP socket to one or more host/port pairs, optionally send a payload and require an expected substring.
- check_dns: resolve a hostname and optionally assert which addresses come back.
- check_http: fetch one or more URLs, check status code, response time and body content; supports custom headers and user-agent.
- check_ntp_offset: query one or more NTP servers and alert on offset / stratum.
- check_connections: Windows TCP/UDP connection table inspection (counts per protocol/family/state).
check_tcp host=smtp.gmail.com port=25 send="EHLO nsclient.org" expect="250"
check_dns host=google.com expected-address=172.217.20.174
check_http url=https://nsclient.org/ expected-body="NSClient" \
"warn=time > 500 or code >= 400" \
"crit=time > 2000 or code >= 500 or result != 'ok'"
check_ntp_offset "servers=0.pool.ntp.org,1.pool.ntp.org" timeout=2000
check_connections "filter=protocol = 'tcp' and state = 'TIME_WAIT'" \
"warn=count > 200" "crit=count > 1000"
CheckSystem (Windows) — registry checks
Two new commands let you monitor the Windows registry directly from NSClient++ instead of relying on external scripts.
They support recursion, exclude lists, 32/64-bit (WoW64) views, custom filters and the usual warn=/crit= expression
syntax.
- check_registry_key: verify that a key exists, count sub-keys/values, watch its last-write time.
- check_registry_value: read a single value and assert its type, size or content.
check_registry_key "key=HKLM\Software\NSClient" \
"warn=age > 7d" "crit=age > 30d or not exists"
check_registry_key "key=HKLM\Software\Microsoft\Windows\CurrentVersion\Uninstall" \
recursive max-depth=1 exclude=KB5005463 exclude=KB5005539
check_registry_value "key=HKLM\System\CurrentControlSet\Services\W32Time\Config" \
value=MaxPollInterval "warn=int_value > 14" "crit=int_value > 17"
CheckSystem — check_os_updates (Windows)
A new check using the Windows Update Agent (WUA) reports pending OS updates. By default any pending update returns warning; thresholds let you alert only on security/critical:
check_os_updates "warning=important > 0" "critical=security > 0 or critical > 0"
CheckSystem (Linux) — real-time metrics
The Linux build of CheckSystem now ships with the same real-time metric collection that has been available on Windows
for a long time: CPU, memory, disk, network and load are sampled in the background and exposed both to
dashboards/metrics and to real-time filters (filter=... rules that fire when a threshold is crossed). Existing
real-time filter configuration just works on Linux now.
Real-time filter metrics
CheckSystem's real-time filters now publish per-filter match and error counts under
system.realtime..fired / system.realtime..errors. Visible via:
- The metrics REST endpoint (/api/v2/metrics + filter)
- Prometheus scrape
- The new Metrics() Lua API in default_check_mk.lua
Useful for spotting filters that never fire (typo in the where-clause) or filters that always error (broken expression).
CheckDisk — check_single_file
A focused variant of check_files for inspecting a single, known path. Compared to using check_files for the same
job:
- Only one required argument (file=).
- A clear error when the input is empty.
- UNKNOWN: File not found: when the file is missing, instead of the empty-set / "No files found" workflow.
- A useful default detail-syntax so a no-threshold run is informative on its own.
check_single_file file=C:/windows/WindowsUpdate.log "warn=age > 5m" "crit=age > 1h"
CRITICAL: WindowsUpdate.log (size=276, age=917)
CheckDisk — filesystem filtering for check_drivesize
check_drivesize can now filter drives by filesystem type — useful for excluding tmpfs, nfs, etc.
check_drivesize drive=* "filter=fs = 'NTFS'"
check_nscp_update
A new check command queries the GitHub releases API (with caching) and reports whether the running NSClient++ is up to date.
HTTP proxy support across every HTTP client
NSClient++ can now route HTTP and HTTPS traffic through a corporate proxy. The same surface is used by every component
built on the internal http::simple_client (NRDPClient, ElasticClient, Op5Client, IcingaClient, the remote boot.ini
loader, ...).
For HTTPS targets the client opens a CONNECT tunnel to the proxy, validates the proxy's response, and only then performs
the TLS handshake — so a single setting covers both http:// and https:// URLs.
Two new options on every HTTP client command and target:
| Option | Purpose |
|---|---|
| proxy | Proxy URL: scheme://[user:pass@]host[:port]/. Empty value disables the proxy. |
| no-proxy | Comma-separated list of hosts that bypass the proxy. A leading . is a suffix match. |
[/settings/NRDP/client/targets/nagios]
address = https://nagios.example.com/nrdp/
token = mytoken
proxy = http://proxy.corp.example:3128/
no proxy = localhost,127.0.0.1,.internal
Configuration loader (boot.ini):
[proxy]
url = http://proxy.corp.example:3128/
no_proxy = localhost,127.0.0.1,.internal
Notes / limits:
- Only the http:// proxy scheme is supported. socks5:// / https:// proxies are not.
- No automatic detection of system proxy settings (HTTP_PROXY env vars, WinINET / WPAD). The proxy must be configured explicitly.
- On 407 Proxy Authentication Required the proxy's response body is captured in the error message.
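The no-proxy matching rule from the options table (exact host names, plus suffix match for entries with a leading dot) can be sketched as follows. This is an illustrative Python model rather than the http::simple_client code; the function name is hypothetical, and case-insensitive comparison and whitespace trimming are assumptions.

```python
def bypasses_proxy(host, no_proxy):
    """True if host matches an entry of the comma-separated no-proxy list."""
    host = host.lower()
    for entry in (e.strip().lower() for e in no_proxy.split(",")):
        if not entry:
            continue
        if entry.startswith("."):
            # Leading dot: suffix match, e.g. ".internal" matches "db.internal".
            if host.endswith(entry):
                return True
        elif host == entry:
            return True
    return False
```

With the list from the example configuration, localhost and db.internal go direct while nagios.example.com is sent through the proxy.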
Web UI / REST API expansion
New web routes:
| Route | Method | Purpose |
|---|---|---|
| /api/v2/events | GET | List buffered real-time events |
| /api/v2/events | DELETE | Drain (returns + clears) the event buffer in one call |
| /api/v2/metadata | GET | Module/setting metadata index |
| /api/v2/metadata/counters | GET | List of available PDH counters |
| /api/v2/metadata/channels | GET | List of registered submission channels |
| /api/v2/settings/ | DELETE | Remove a settings key or path (staged delete; survives restart) |
The settings store gained staged deletion: a DELETE is recorded so that subsequent reads of the deleted key/path return "not present" until the change is saved. This stops a deleted-but-not-yet-saved key from being resurrected by a concurrent read.
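A minimal Python model of staged deletion, assuming a flat key/value store: the delete is only recorded, reads consult the record first, and save applies it. The class and method names are hypothetical and do not mirror the actual settings API.

```python
class StagedSettings:
    """Toy settings store with staged (not yet saved) deletions."""

    def __init__(self, saved):
        self.saved = dict(saved)     # what is persisted
        self.staged_deletes = set()  # keys deleted but not yet saved

    def get(self, key, default=None):
        # A staged delete hides the key even though it is still persisted.
        if key in self.staged_deletes:
            return default
        return self.saved.get(key, default)

    def delete(self, key):
        self.staged_deletes.add(key)

    def save(self):
        for key in self.staged_deletes:
            self.saved.pop(key, None)
        self.staged_deletes.clear()
```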
Web UI refresh
The bundled web interface has been heavily reworked:
- Modern theme with active-navigation highlighting and a redesigned login page.
- Filterable lists for Modules, Queries and Settings.
- Settings diff dialog — the "settings changed" widget can now show exactly which keys changed.
- CheckSystem settings UI got dedicated widgets for PDH counters and real-time filters: a counter picker that hits /api/v2/metadata/counters, "Add filter" / "Add counter" dialogs, and a live preview of metric values pulled from the metrics endpoint.
If you've been editing real-time filters in nsclient.ini by hand, the web UI is now a much faster way to do it.
SMTPClient rewrite
The SMTPClient module has been substantially rewritten with proper SMTP handling, integration tests, and a Python-based test harness.
Smaller features
- nscp settings --sort: produce stable, sorted output, useful for diffing exported settings between hosts.
- Performance threshold min/max bounds: perfdata threshold expressions can declare minimum and maximum bounds, propagated into emitted perfdata:
  check_pdh "counter=\\Processor(_Total)\\% Processor Time" \
    "perf-config=*(minimum:0;maximum:100)"
- Timezone-aware check_uptime and Scheduler: applies a timezone cache on both Windows and Unix, so absolute boot-time output and cron expressions agree with the host's local time.
- WEBServer cookie attribute support: Secure, HttpOnly, SameSite, Path, Domain, Expires, Max-Age.
- WEBServer password hashing with constant-time verification: removes the timing oracle on the previous plaintext equality check.
- WEBServer authentication rate limiter: per-source throttling of failed authentication attempts:
  [/settings/WEB/server]
  auth rate limit max failures = 10 ; 0 disables the limiter
  auth rate limit block seconds = 60
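One plausible reading of the two rate-limiter settings is sketched below in Python: after max failures failed attempts from one source, that source is blocked for block seconds. The exact semantics in WEBServer (for example whether a success resets the counter) are not stated in these notes, so treat the details as assumptions; the class name is hypothetical.

```python
import time


class AuthRateLimiter:
    """Toy per-source limiter for failed authentication attempts."""

    def __init__(self, max_failures=10, block_seconds=60, clock=time.monotonic):
        self.max_failures = max_failures  # 0 disables the limiter
        self.block_seconds = block_seconds
        self.clock = clock
        self.failures = {}       # source -> failures since last block/success
        self.blocked_until = {}  # source -> time when the block expires

    def is_blocked(self, source):
        if self.max_failures == 0:
            return False
        return self.clock() < self.blocked_until.get(source, float("-inf"))

    def record_failure(self, source):
        if self.max_failures == 0:
            return
        count = self.failures.get(source, 0) + 1
        if count >= self.max_failures:
            self.blocked_until[source] = self.clock() + self.block_seconds
            count = 0
        self.failures[source] = count

    def record_success(self, source):
        # Assumption: a successful login clears the failure count.
        self.failures.pop(source, None)
```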
Filter engine — stable summary thresholds
These changes touch the shared filter / threshold engine and therefore affect every modular check (check_files,
check_service, check_process, check_eventlog, ...).
Stable count / total / *_count in warn= / crit=
warn= / crit= were evaluated during iteration. Summary variables such as count therefore exposed their running
value instead of the final post-iteration value, so a mixed expression like
crit = state = 'hung' OR count < 5
mis-fired on the very first row (count == 1 < 5) regardless of how many rows ultimately matched. Per-row evaluation is
now deferred: matched rows are recorded during iteration, and the warn/crit/ok engines run once the summary state is
final.
Mixed warn= / crit= evaluated when no rows match
If a filter excluded every row, mixed expressions like crit = state = 'stopped' OR count = 0 were skipped entirely —
leaving the check OK in the empty case. They are now evaluated with object-bound variables defaulting to false and
summary variables at their final values, so the check correctly returns CRITICAL when the service is missing.
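The difference between evaluating during iteration and deferring until the summary is final can be reproduced with a toy model. The Python below is not the filter engine; it hard-codes the expression state = 'hung' OR count < 5 and treats the object-bound part as false when there is no row, as described above.

```python
rows = [{"state": "running"}] * 8  # eight healthy rows, none hung


def crit(row, count):
    # Object-bound part defaults to false when there is no row to bind to.
    hung = row is not None and row["state"] == "hung"
    return hung or count < 5


def evaluate_during_iteration(rows):
    """Old behaviour: count is the running value, so row 1 sees count == 1."""
    count = 0
    for row in rows:
        count += 1
        if crit(row, count):
            return "CRITICAL"
    return "OK"


def evaluate_deferred(rows):
    """New behaviour: record matches first, evaluate with the final count."""
    matched = list(rows)
    count = len(matched)
    candidates = matched or [None]  # still evaluate once when nothing matched
    return "CRITICAL" if any(crit(r, count) for r in candidates) else "OK"
```

With eight healthy rows the old path reports CRITICAL on the first row, the deferred path reports OK, and an empty result set is still evaluated and reports CRITICAL because count is 0.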
Quieter, more predictable expression evaluation
- Operators audited so is_unsure propagates consistently; invalid-type comparisons resolve to unsure-false instead of erroring.
- String variables on no-object cases now return an empty string with is_unsure=true and produce a warning in the log instead of an error per row; log volume on complex queries drops dramatically.
- Removed the misleading "most likely mutating" warnings.
- Substantial new test coverage.
check_service and check_process fixes (Windows)
- "Failed to enumerate service: 6f7" on busy hosts: enumeration is now properly looped until the SCM signals end-of-data.
- perf-syntax=none actually suppresses perfdata: check_service used to emit empty perfdata aliases (''=4;0;1 ''=4;0;1 ...), blowing past NRPE size limits.
- No more TODO leaking into ${desc}: check_service service=Spooler used to render as OK: Spooler: TODO. Now: OK: Spooler: Print Spooler.
- delayed only reported for SERVICE_AUTO_START: manual / boot / system / disabled services no longer randomly show up as delayed.
- check_process sees protected / cross-user processes as NETWORK SERVICE: a PROCESS_QUERY_LIMITED_INFORMATION fallback is now attempted, so winlogon.exe, csrss.exe etc. no longer report CRITICAL: =stopped when the agent runs unprivileged.
- Realtime check_process is now case-insensitive, matching the active path and Windows itself.
check_files fixes
- #730: max-depth=0 now scans the top directory only (was: bail out before scanning anything, returning "no files found").
- #598: Non-ASCII paths (accented letters, CJK, ...) are no longer silently mangled by mismatched codepage conversions.
- #613: Top-level paths that cannot be opened now produce UNKNOWN: Path was not found: instead of being hidden behind the configured empty-state.
- #605: NTFS junctions / symlinks / mount points are now skipped during recursion, preventing double-counts on self-referential trees.
- #717: The legacy CheckFiles shim now sets empty-state=ok when translating, restoring 0.4-era behaviour for legacy calls that find zero files.
Other check / module fixes
- CheckDisk resilience: an error on a single unavailable volume no longer aborts the entire check_drivesize run.
- #581: CheckLogFile honours the line-split argument (previously hard-coded to \n); multi-character delimiters such as \r\n are handled correctly. Real-time seek behaviour fixed; CRLF handling harmonised.
- #589: Time/duration arguments such as time=3000foobar or time=3000mfoobar are no longer silently accepted; malformed inputs are rejected with a clear error.
- #669: The literal U (Nagios "undefined") in performance data is preserved end-to-end instead of being coerced to 0. Only an exact U, u, U% or u% token matches.
- NSCA wire timestamps are now correctly built in UTC. Both server (IV packet) and client (data packet) used to derive seconds-since-epoch from second_clock::local_time(), which drifted by the host's TZ offset. A timezone setting on both ends allows legacy interop with agents that emit local-clock-as-Unix-time stamps.
- Metrics collection regression fixed (some metrics were silently dropped).
- Op5Client / ElasticClient unified on the new HTTP client; 401 path fixed; reponse → response typos corrected.
- Gracefully handle non-numeric NSClient command codes.
- TLS support fixes; better randomness for encryption; race condition fixes; boundary checks for various network payloads and reading certificates.
- NRDP integration tests added; new nrdp client alias.
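The NSCA timestamp bug in the list above is easy to reproduce outside the agent. The following Python sketch is an illustration only (the agent itself is C++); the function names and the UTC+2 host are hypothetical. It shows why reading the local wall clock and treating its fields as UTC shifts the epoch value by exactly the TZ offset.

```python
import calendar
from datetime import datetime, timedelta, timezone

def epoch_from_utc(now_utc):
    """Correct: seconds since the epoch taken from the UTC clock."""
    return calendar.timegm(now_utc.utctimetuple())

def epoch_from_local_wall_clock(now_utc, utc_offset):
    """Old behaviour (sketch): read the local wall clock, then treat its
    fields as if they were UTC. The result is shifted by the TZ offset."""
    local_wall = (now_utc + utc_offset).replace(tzinfo=None)
    return calendar.timegm(local_wall.timetuple())

now = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
offset = timedelta(hours=2)  # hypothetical host at UTC+2
skew = epoch_from_local_wall_clock(now, offset) - epoch_from_utc(now)
print(skew)  # 7200
```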
HTTP refactor
- HTTP request and response are now distinct types instead of one shared bag.
- Chunked transfer-encoding is decoded properly. check_http against servers using Transfer-Encoding: chunked (most modern reverse proxies, Icinga 2, Kubernetes ingress, ...) now returns the full body instead of a truncated/garbled one. The IcingaClient module relies on this.
- Header storage is normalised: case-insensitive lookup, no more duplicate-header surprises.
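For readers unfamiliar with the wire format, chunked decoding amounts to the loop sketched below. It is a minimal Python illustration of the HTTP/1.1 chunked framing, not the agent's C++ decoder; decode_chunked is a hypothetical name, and trailers after the final chunk are ignored.

```python
def decode_chunked(body: bytes) -> bytes:
    """Decode an HTTP/1.1 chunked message body (trailers are ignored)."""
    out = bytearray()
    pos = 0
    while True:
        line_end = body.find(b"\r\n", pos)
        if line_end < 0:
            raise ValueError("missing chunk-size line")
        # Chunk extensions after ';' are allowed on the size line; skip them.
        size_field = body[pos:line_end].split(b";", 1)[0].strip()
        size = int(size_field, 16)
        if size < 0:
            raise ValueError("negative chunk size")
        pos = line_end + 2
        if size == 0:
            return bytes(out)  # last-chunk marker reached
        chunk = body[pos:pos + size]
        if len(chunk) != size or body[pos + size:pos + size + 2] != b"\r\n":
            raise ValueError("truncated or malformed chunk")
        out += chunk
        pos += size + 2
```

A client that skips this step and returns the raw body sees the hexadecimal size lines interleaved with the payload, which matches the "truncated/garbled" symptom described above.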
Security hardening
The 0.12.2 release is a security-focused pass. The changes below do not alter documented behaviour for well-formed traffic,
but they close down attacker-controlled edge cases.
DoS / resource-exhaustion limits
- Authorization header capped at 8 KiB to mitigate amplification.
- Per-connection parser buffer cap to prevent memory pinning from oversized or never-completed requests.
- Session token cap with eviction to prevent unbounded memory growth.
- Payload lengths below the protocol minimum are rejected before allocation.
- Path expansion now detects cycles and refuses to recurse, preventing stack overflow on pathological configurations.
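The cycle check on path expansion can be sketched as follows. This Python fragment is an assumption-laden illustration: the ${name} reference syntax, the expand_path name and the example keys are hypothetical and are not taken from the agent's source.

```python
import re

_REF = re.compile(r"\$\{([^}]+)\}")

def expand_path(key, table, _active=None):
    """Expand ${name} references in table[key], refusing cyclic definitions."""
    active = _active if _active is not None else set()
    if key in active:
        raise ValueError(f"cycle detected while expanding {key!r}")
    active.add(key)
    try:
        return _REF.sub(
            lambda m: expand_path(m.group(1), table, active), table[key]
        )
    finally:
        active.discard(key)  # a key may legitimately be reused on another branch
```

Because a key is only "active" while it is being expanded, recursion depth is bounded by the number of distinct keys, so a pathological configuration produces an error instead of a stack overflow.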
NSCA hardening
- Packet version is checked.
- Timestamp validation tightened to mitigate replay attacks.
Log/output injection prevention
Control characters are stripped from values before they are written to external sinks, removing log/protocol-injection vectors:
- Log file entries (file names and messages)
- syslog messages (CR/LF/NUL stripped)
- Graphite metric paths and values
- HTTP response headers (header keys and values)
log_status is now JSON-serialised so attacker-controlled fields cannot inject extra structured fields.
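Both ideas in this subsection, stripping control characters and serialising structured output, fit in a few lines. The Python sketch below is illustrative only; strip_control and status_line are hypothetical helpers, and the exact character set the agent strips for each sink is not stated in these notes.

```python
import json

def strip_control(value: str) -> str:
    """Drop C0 control characters (including CR, LF, NUL, TAB) and DEL."""
    return "".join(ch for ch in value if ord(ch) >= 0x20 and ord(ch) != 0x7F)

def status_line(fields: dict) -> str:
    """Serialise as JSON so a value cannot add extra structured fields."""
    return json.dumps(fields, sort_keys=True)
```

With string concatenation, a value such as x", "admin": true could smuggle a second field into the record; after JSON serialisation the quotes are escaped and it stays a single string value.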
Filesystem / process safety
- PID file creation hardened against symlink attacks; exclusive access enforced.
- Archive extraction has a zip-slip guard that validates entry paths and refuses traversal sequences.
- Module and script names are validated to prevent path traversal at load time.
- Argument substitution in external scripts is isolated to prevent command injection through user-controlled tokens.
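A zip-slip guard of the kind described above usually reduces to one containment check. The Python sketch below shows the general technique under the assumption that entries are extracted beneath a single destination directory; safe_target is a hypothetical name, not the agent's function.

```python
import os

def safe_target(dest_dir, entry_name):
    """Return the extraction path for entry_name, or raise if it escapes dest_dir."""
    root = os.path.realpath(dest_dir)
    target = os.path.realpath(os.path.join(root, entry_name))
    # commonpath equals root only when target is root itself or lies below it.
    if os.path.commonpath([root, target]) != root:
        raise ValueError(f"refusing entry outside destination: {entry_name!r}")
    return target
```

Resolving both paths before comparing them means that ../ sequences, absolute entry names and links inside the destination are all judged by where the write would actually land.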
Cryptography / TLS
- HTTPS now logs explicitly when no certificate is present and warns on HTTP fallback in production.
- SSL connections enable hostname verification by default.
- Auto-generated passwords use OpenSSL RAND_bytes (cryptographically secure) instead of the previous predictable generator.
- Sensitive values are no longer logged at debug level.
- check_nt password compare is constant-time.
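The last two password-related items can be illustrated with Python's standard library. This is a sketch of the techniques rather than the agent's code: it uses the secrets module where the release uses OpenSSL RAND_bytes, and both function names are hypothetical.

```python
import hmac
import secrets

def passwords_match(supplied: str, expected: str) -> bool:
    """Compare without stopping at the first mismatching byte."""
    return hmac.compare_digest(supplied.encode("utf-8"), expected.encode("utf-8"))

def generate_password(n_bytes: int = 24) -> str:
    """Draw n_bytes from the OS CSPRNG and return them as URL-safe text."""
    return secrets.token_urlsafe(n_bytes)
```

An ordinary == comparison may return as soon as one character differs, which lets a remote client infer a correct prefix from response timing; a constant-time compare removes that signal.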
Breaking changes
> Read this section carefully. Some changes are listener defaults that are now fail-closed; some are corrections to long-standing buggy behaviour; some are internal API changes for out-of-tree modules.
Listeners default to safer behaviour
- Empty allowed hosts now rejects all connections. Previously treated as "allow any source". To genuinely expose the agent to any source, set it explicitly:
    allowed hosts = 0.0.0.0/0,::/0
- check_nt (NSClientServer) defaults to ssl = true. The legacy check_nt protocol carries the password in every request. The listener will not refuse to start if TLS is off, but it will log a warning. To keep the old plaintext behaviour for legacy clients, set ssl = false explicitly in [/settings/NSClient/server].
- check_nt: the literal password None no longer authenticates. Empty server passwords now reject all requests. Errors are also genericised (ERROR: Bad request.) to remove the online password-guessing oracle.
- WEBServer: /auth/token and /auth/logout are removed (HTTP 410). They accepted the password and session token as URL query parameters, leaking credentials into browser history and proxy logs. Migrate to:
  - POST /api/v2/login with Authorization: Basic to obtain a token
  - DELETE /api/v2/login with Authorization: Bearer to log out
- WEBServer: ?TOKEN= / ?__TOKEN= query-string token auth removed. Send the token in a header instead: Authorization: Bearer, TOKEN:, or X-Auth-Token:.
- WEBServer: anonymous access is now opt-in. A role named anonymous registered in settings is silently ignored unless the new allow_anonymous flag is enabled.
- WEBServer: existing admin user is no longer overwritten on restart. Deployments that relied on the password being reset to the default at boot must adapt.
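For client authors, the header-based flow described in the list above might look like the following Python sketch. It only builds the requests and does not send them. The host name and port in the usage note are placeholders, and the shape of the login response (where the token appears) is not stated in these notes, so it is deliberately left unparsed.

```python
import base64
import urllib.request

def login_request(base_url, user, password):
    """POST /api/v2/login with HTTP Basic credentials in a header."""
    creds = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        f"{base_url}/api/v2/login",
        method="POST",
        headers={"Authorization": f"Basic {creds}"},
    )

def logout_request(base_url, token):
    """DELETE /api/v2/login with the session token as a Bearer header."""
    return urllib.request.Request(
        f"{base_url}/api/v2/login",
        method="DELETE",
        headers={"Authorization": f"Bearer {token}"},
    )
```

A caller would pass something like login_request("https://agent.example:8443", "admin", password) to urllib.request.urlopen. Neither credential ever appears in the URL, which is the point of the migration.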
Scheduler — cron expressions evaluate in local time by default (#570)
The Scheduler module previously used UTC, so 40 15 * * * fired at 15:40 UTC regardless of host TZ. The default has
changed to local time, matching standard cron semantics. Hour and minute fields will shift accordingly on non-UTC hosts.
A new timezone setting under [/settings/scheduler] controls the reference clock:
[/settings/scheduler]
timezone = local ; default — standard cron semantics
; timezone = utc ; restore the pre-0.12 behaviour
; timezone = EST-05EDT,M3.2.0,M11.1.0 ; any POSIX TZ string is honoured
IANA names such as Europe/Stockholm are not supported — use the POSIX form. Unparseable values fall back to UTC
and surface as UTC? in any timezone label.
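To estimate how far an existing cron line moves after the upgrade, it can help to compute the firing instant under both reference clocks. The Python sketch below is a simplified model: fire_instant_utc is a hypothetical helper, and it uses a fixed UTC offset, so DST transitions are not modelled.

```python
from datetime import date, datetime, time, timedelta, timezone

def fire_instant_utc(day, hour, minute, utc_offset_hours):
    """UTC instant at which 'minute hour * * *' fires on the given day for a
    scheduler whose reference clock is UTC+utc_offset_hours.
    An offset of 0 models timezone = utc."""
    tz = timezone(timedelta(hours=utc_offset_hours))
    local = datetime.combine(day, time(hour, minute), tzinfo=tz)
    return local.astimezone(timezone.utc)

day = date(2024, 6, 1)
old = fire_instant_utc(day, 15, 40, 0)  # previous behaviour: 15:40 UTC
new = fire_instant_utc(day, 15, 40, 2)  # local time on a UTC+2 host: 13:40 UTC
```

On this example host the job "40 15 * * *" fires two hours earlier in absolute terms after the upgrade unless timezone = utc is set.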
Filter / threshold engine
- warn= / crit= no longer fire mid-iteration on running counts. Configurations "tuned" against the buggy early-fire will produce different results.
    crit = state = 'hung' OR count < 5
    # Old: CRITICAL on the very first row (count == 1).
    # New: CRITICAL only if any row is 'hung' OR final count < 5.
- Mixed warn= / crit= now evaluate when no rows match.
    crit = state = 'stopped' OR count = 0
    # Old: OK when nothing matched (count = 0).
    # New: CRITICAL when nothing matched (count = 0).
  If your old config implicitly treated "empty" as OK, add a count > 0 AND ... guard or move the empty-case logic into a dedicated check.
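The difference between the old and new evaluation order can be modelled in a few lines of Python. This is a toy model of the behaviour described above for crit = state = 'hung' OR count < N, not the filter engine itself; crit_old and crit_new are hypothetical names.

```python
def crit_old(rows, min_count):
    """Early-fire (old): the count test sees the running count on every row."""
    for seen, row in enumerate(rows, start=1):
        if row["state"] == "hung" or seen < min_count:
            return True
    return False  # nothing matched: the expression was never evaluated

def crit_new(rows, min_count):
    """New: per-row terms are checked per row, count once on the final total."""
    any_hung = any(row["state"] == "hung" for row in rows)
    return any_hung or len(rows) < min_count

healthy = [{"state": "running"}] * 6
```

With six healthy rows and a threshold of 5, crit_old reports CRITICAL on the first row while crit_new stays OK. With no rows at all the results swap, which is the second behaviour change listed above.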
Check-specific corrections
- check_service: delayed is no longer reported for non-auto services. Filters that matched start_type = 'delayed' on Manual / Boot / System / Disabled services will stop matching. To alert on "any non-running service that isn't disabled":
    filter = start_type IN ('auto','delayed','boot','system') AND state != 'running'
- Realtime check_process is now case-insensitive. A rule that intentionally matched only an exact casing will now match all variants (almost certainly the desired behaviour).
- check_service: ${desc} no longer returns the literal TODO. Use the real display name.
- check_service: perf-syntax=none actually suppresses perfdata. Backends that consumed the empty-aliased entries (highly unlikely) will see them disappear.
check_files — corner cases changed
- max-depth=0 now scans the top directory instead of returning empty (#730).
- Missing paths now return UNKNOWN instead of OK / empty (#613).
- NTFS junction loops are no longer double-counted (#605).
- Legacy CheckFiles calls that previously returned UNKNOWN on empty results will now return OK (#717).
Configuration / startup
- CheckExternalScripts: malformed alias commands are refused at startup. The fallback "split-on-space" parser has been removed. Aliases whose command line does not parse cleanly are refused with an error in the log instead of being silently registered with surprising tokenisation. Review your logs after upgrading.
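The contrast between a strict tokeniser and a split-on-space fallback can be shown with Python's shlex module. This is an analogy, not the agent's parser: shlex follows POSIX shell quoting rules, which may differ from NSClient++'s own, and parse_alias is a hypothetical name.

```python
import shlex

def parse_alias(command_line):
    """Tokenise an alias command; raise instead of guessing on bad quoting."""
    try:
        tokens = shlex.split(command_line)
    except ValueError as exc:
        raise ValueError(f"refusing alias {command_line!r}: {exc}") from None
    if not tokens:
        raise ValueError("refusing empty alias command")
    return tokens
```

A fallback such as command_line.split(" ") would accept an unterminated quote and register the alias with arguments broken at every space, which is the "surprising tokenisation" the change removes.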
Internal API (out-of-tree module authors)
- HTTP request/response API changed. Internal C++ types http::request / http::response are now distinct, headers are case-insensitive, and chunked decoding happens transparently. Out-of-tree modules linked against the old shared bag type need a small adjustment:
    // before
    http::packet pkt = client.send(...);
    auto body = pkt.body;
    // after
    http::response resp = client.send(http::request{...});
    auto body = resp.body(); // chunked decoding already applied
Documentation reorganisation
- The documentation tree was restructured (concepts/, checks-in-depth/, scenarios/, tutorial/, reference/ are now clearly separated). Bookmarks and external links may need updating.
Upgrade checklist
- Audit allowed hosts on every node: empty values now reject everything.
- check_nt (NSClientServer) now defaults to ssl = true. If your clients don't speak TLS, set ssl = false explicitly. Either way the listener will log a warning at startup if TLS is off or a password is configured, recommending a switch to REST or NRPE.
- Replace any client that calls /auth/token or /auth/logout with the /api/v2/login flow.
- Replace any client that passes ?TOKEN= / ?__TOKEN= in the query string with a header-based token.
- Scheduler cron expressions on non-UTC hosts will shift to local time. Either update them or set [/settings/scheduler] timezone = utc to restore the previous behaviour.
- Review check_service / check_process / check_files filters that may have relied on the corrected behaviours listed above.
- Restart the service and review the log for new "refused alias" or "rejected connection" warnings; these flag configurations that were previously silently accepted.
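The first checklist item lends itself to a small audit script. The Python sketch below is one possible approach, not an official tool: it parses the INI text with the standard configparser module, reports sections where allowed hosts is present but empty, and does not model any inheritance between sections. The function name is hypothetical.

```python
import configparser

def sections_with_empty_allowed_hosts(ini_text):
    """Return section names whose 'allowed hosts' key is present but empty."""
    parser = configparser.ConfigParser(
        delimiters=("=",),            # values such as ::/0 contain ':'
        comment_prefixes=(";", "#"),
        interpolation=None,
        strict=False,                 # tolerate repeated sections or keys
        allow_no_value=True,
    )
    parser.read_string(ini_text)
    return [
        name
        for name in parser.sections()
        if parser.has_option(name, "allowed hosts")
        and not (parser.get(name, "allowed hosts") or "").strip()
    ]
```

Feeding it the contents of nsclient.ini from each node gives a quick list of listeners that will start rejecting every connection after the upgrade.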
No configuration migration is required for the new HTTP proxy keys, the Checkmk server, the Icinga client, the NSCA-ng client, or the new checks — they are all opt-in.
Full Changelog: https://github.com/mickem/nscp/compare/0.12.2...0.11.30