check_pg_streaming_replication

Nagios plugin to check Postgres Streaming replication

This script can be used as a Nagios check plugin to verify the state of Postgres streaming replication.

This script:

  • checks whether Postgres is running (raises CRITICAL if not)
  • checks whether Postgres is in recovery mode (using pg_is_in_recovery())
  • if the expected mode needs to be auto-detected (default, see the -E parameter):
    • if Postgres is in recovery mode: hot-standby
    • if the recovery file is present and contains primary_conninfo: hot-standby
    • otherwise: master
  • if the expected mode is hot-standby:
    • checks that Postgres is in recovery mode (raises CRITICAL if not)
    • retrieves from Postgres the last xlog file received and the last xlog file replayed
    • retrieves the master connection information from the Postgres primary_conninfo configuration parameter (raises UNKNOWN on error). The default Postgres master TCP port is used if no port is specified.
    • retrieves the master sync mode from the synchronous_commit setting and assumes synchronous commit is enabled if synchronous_commit is set to on or remote_apply
    • retrieves the current state and sync state of the host from the Postgres master server by connecting to it (raises UNKNOWN on error)
    • checks that the current state of the host is streaming (raises CRITICAL if not)
    • checks that the current sync state of the host is the expected one (default: sync, see the -e parameter; raises CRITICAL if not)
    • if the check of the master's current xlog file is enabled:
      • retrieves the current xlog file from the Postgres master server by connecting to it (raises UNKNOWN on error)
      • checks that the master's current xlog file is the last received xlog file (raises CRITICAL if not)
    • checks whether the last received xlog file is the last replayed xlog file; if not, compares the current delay since the last replayed transaction against the replay_warn_delay and replay_crit_delay thresholds (see the -w and -c parameters) and raises the corresponding error if they are exceeded
    • if synchronous commit is enabled on the master, checks that the last xlog file sent by the Postgres master is the last one received by the slave. If not, retrieves the difference (in bytes) and raises a WARNING.
    • if synchronous commit is disabled on the master, checks that the last xlog file sent by the Postgres master is the last one written by the slave. If not, retrieves the difference (in bytes) and raises a WARNING.
    • otherwise, returns the OK state
  • if the expected mode is master:
    • checks that Postgres is not in recovery mode (raises CRITICAL if it is)
    • retrieves the current xlog file (raises UNKNOWN on error)
    • retrieves the sync mode from the synchronous_commit setting and assumes synchronous commit is enabled if synchronous_commit is set to on or remote_apply
    • lists the standby client(s) from the master and, for each of them:
      • if synchronous commit is enabled, checks that the last xlog file sent by the Postgres master is the current one on the master (raises WARNING if not)
      • if synchronous commit is disabled, checks that the last xlog file sent by the Postgres master is the last written one on the master (raises WARNING if not)
    • otherwise, returns the OK state with the list and count of standby client(s)
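
The auto-detection rule and the replay delay thresholds described above can be sketched in shell as follows. This is a minimal illustration, not the plugin's actual code: the function names detect_expected_mode and classify_replay_delay are hypothetical, the caller is assumed to pass in the result of SELECT pg_is_in_recovery() ("t" or "f") and a delay in seconds, and a threshold is assumed to be exceeded only when the delay is strictly greater than it.

```shell
#!/bin/sh
# Illustrative sketch only; names and details are assumptions, not the plugin's code.

# detect_expected_mode IN_RECOVERY RECOVERY_FILE
#   IN_RECOVERY:   "t" or "f", as printed by: psql -tAc 'SELECT pg_is_in_recovery()'
#   RECOVERY_FILE: recovery.conf (PG <= 11) or postgresql.auto.conf (PG >= 12)
detect_expected_mode() {
    if [ "$1" = "t" ]; then
        # Postgres reports recovery mode: this node is a standby.
        echo "hot-standby"
    elif [ -f "$2" ] && grep -q '^[[:space:]]*primary_conninfo' "$2"; then
        # A recovery file naming a primary also marks a standby.
        echo "hot-standby"
    else
        echo "master"
    fi
}

# classify_replay_delay DELAY WARN CRIT (all in seconds, decimals allowed)
#   Prints OK, WARNING or CRITICAL. "+ 0" forces numeric comparison in awk.
classify_replay_delay() {
    awk -v d="$1" -v w="$2" -v c="$3" 'BEGIN {
        if (d + 0 > c + 0)      print "CRITICAL"
        else if (d + 0 > w + 0) print "WARNING"
        else                    print "OK"
    }'
}
```

With the default thresholds (-w 3, -c 5), classify_replay_delay 4.2 3 5 prints WARNING.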

Note: This script was originally written for PostgreSQL 9.1 and has been tested on 9.1, 9.5, 9.6, 11, 13 and 15. Do not hesitate to report how this script works with other versions and to share fixes. All contributions are welcome!

Requirements

  • Some CLI tools: sudo, awk, sed, bc, psql and pg_lsclusters

  • On the master node: slaves must be able to connect with the user from recovery.conf / postgresql.auto.conf (or the user specified with -U) to the database with the same name (or another one specified with -D), either as trust or using a password specified in ~/.pgpass. This user must have the SUPERUSER privilege (needed to get replication details).

  • On the standby node: PG_USER must be able to connect locally to the database with the same name (or another one specified with -D), either as trust or using a password specified in ~/.pgpass.
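
As a quick pre-flight check for the CLI tools listed in the requirements above, something like the following could be run on each node. It is only a sketch: the missing_tools function is a hypothetical helper, and it looks tools up in PATH, whereas the plugin can be pointed at other psql and pg_lsclusters locations with -b and -B.

```shell
#!/bin/sh
# Illustrative pre-flight check; the function name is hypothetical.

# missing_tools TOOL...
#   Prints, one per line, each tool that cannot be found in PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Tool names taken from the requirements list above.
missing=$(missing_tools sudo awk sed bc psql pg_lsclusters)
if [ -n "$missing" ]; then
    # Unquoted on purpose so the names are joined on one line.
    echo "Missing tools:" $missing
else
    echo "All required tools found"
fi
```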

Installation

From Debian packages

echo "deb http://debian.zionetrix.net stable main" | sudo tee /etc/apt/sources.list.d/zionetrix.list
sudo apt -o Acquire::AllowInsecureRepositories=true -o Acquire::AllowDowngradeToInsecureRepositories=true update
sudo apt -o APT::Get::AllowUnauthenticated=true install --yes zionetrix-archive-keyring
sudo apt update
sudo apt install check-pg-streaming-replication

From sources

apt install sudo mawk sed bc postgresql-client
git clone https://gitea.zionetrix.net/bn8/check_pg_streaming_replication.git \
  /usr/local/src/check_pg_streaming_replication
mkdir -p /usr/local/lib/nagios/plugins
ln -s /usr/local/src/check_pg_streaming_replication/check_pg_streaming_replication \
  /usr/local/lib/nagios/plugins/check_pg_streaming_replication

Usage

Usage: ./check_pg_streaming_replication [-d] [-h] [options]
    -u pg_user              Specify local Postgres user (Default: try to auto-detect or
                            use postgres)
    -b psql_bin             Specify psql binary path (Default: /usr/bin/psql)
    -B pg_lsclusters_bin    Specify pg_lsclusters binary path (Default: /usr/bin/pg_lsclusters)
    -V pg_version           Specify Postgres version (Default: try to auto-detect or
                            use 9.1)
    -m pg_main              Specify Postgres main directory path (Default: try to auto-detect or
                            use /var/lib/postgresql//main)
    -r recovery_conf        Specify Postgres recovery configuration file path
                            (Default: [PG_MAIN]/recovery.conf on PG <= 11,
                            [PG_MAIN]/postgresql.auto.conf on PG >= 12)
    -U pg_master_user       Specify Postgres user to use on master (Default: user from recovery.conf
                            file)
    -p pg_port              Specify default Postgres master TCP port (Default: same as local
                            PostgreSQL port if detected or use 5432)
    -D dbname               Specify DB name on Postgres master/slave to connect on (Default:
                            PG_USER, must match with .pgpass one is used)
    -w replay_warn_delay    Specify the replay warning delay in second
                            (Default: 3)
    -c replay_crit_delay    Specify the replay critical delay in second
                            (Default: 5)
    -e expected_sync_state  The expected replication state ('sync' or 'async',
                            default: sync)
    -E expected_mode        The expected mode ('master', 'hot-standby' or 'auto',
                            default: 'auto')
    -d                      Debug mode
    -h                      Show this message
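
When running the plugin by hand, its exit status can be translated back into a Nagios state name. The sketch below assumes the plugin follows the usual Nagios plugin exit code convention (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). The nagios_state_name helper is hypothetical, and the example invocation in the comments only uses options from the usage text above.

```shell
#!/bin/sh
# Illustrative wrapper; nagios_state_name is a hypothetical helper.

# Map a plugin exit status to the conventional Nagios state name.
nagios_state_name() {
    case "$1" in
        0) echo "OK" ;;
        1) echo "WARNING" ;;
        2) echo "CRITICAL" ;;
        *) echo "UNKNOWN" ;;   # 3, and anything unexpected
    esac
}

# Example run on a standby that is expected to replicate asynchronously,
# with looser replay thresholds than the defaults (path from "From sources"):
#
#   /usr/local/lib/nagios/plugins/check_pg_streaming_replication \
#       -E hot-standby -e async -w 10 -c 30
#   echo "state: $(nagios_state_name $?)"
```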

Copyright

Copyright (c) 2014-2024 Benjamin Renard

License

This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program. If not, see .