Failover for Princeton University
Failover from the primary Nagios XI server to the secondary Nagios XI server is a disaster recovery measure. It provides an up-to-date secondary Nagios server that can take over monitoring and notification duties should the primary become unavailable. Once the primary is available again, the secondary ceases monitoring and returns to a passive state.
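The takeover and hand-back behavior described above can be sketched as a small decision function. This is only an illustration, not the contents of failover.sh: the function name next_action, the status values ("up"/"down"), the role values ("active"/"passive"), and the action names are all hypothetical, and how the primary's availability is actually detected is not covered here.

```shell
#!/bin/sh
# Hypothetical sketch of the failover decision made on the secondary.
# $1 = primary status as seen from the secondary: "up" or "down"
# $2 = the secondary's current role: "active" or "passive"
# Prints the action the secondary should take next.
next_action() {
    case "$1:$2" in
        down:passive) echo "start-monitoring" ;;  # primary lost: take over
        up:active)    echo "stop-monitoring" ;;   # primary back: return to passive
        down:active)  echo "no-change" ;;         # already covering for the primary
        up:passive)   echo "no-change" ;;         # normal operation
        *)
            echo "error: unexpected input '$1' '$2'" >&2
            return 2
            ;;
    esac
}
```

For example, `next_action down passive` prints `start-monitoring`, and `next_action up active` prints `stop-monitoring`. Keeping the decision separate from the commands that start or stop Nagios makes the logic easy to test without touching a live server.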
Prerequisites
- Nagios XI must be installed on both boxes with the same underlying directory configuration. If there are any differences in file locations or major configuration between the two boxes, the Nagios failover will have unpredictable results, including complete system failure.
- The sync process deletes files that it determines do not belong on the secondary, so all work must be performed on the primary. Any work performed on the secondary will be overwritten by the next synchronization.
- /home/nagios/bin exists and contains the files needed for this process. Note that the sync process copies these files from the primary to the secondary, so like all other files, they must be modified only on the primary.
  /home/nagios/bin/failover.sh
  /home/nagios/bin/nagios_start_stop.sh
  /home/nagios/bin/restore_xi.sh
  /home/nagios/bin/rsync_xi.sh
- The root user can SSH from the primary to the secondary without entering a passphrase. This is how the rsync and database copies are performed.
- The root user on the primary and the root user on the secondary each have crontab requirements, which are detailed separately.
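Some of the prerequisites above can be checked before relying on failover. The following is a minimal sketch and is not one of the scripts listed above: the function names check_scripts and check_ssh are hypothetical, and the secondary's hostname must be supplied by the caller. It verifies that the four scripts exist and are executable in a given directory, and that root can reach the secondary over SSH without being prompted.

```shell
#!/bin/sh
# Hypothetical preflight helpers; not part of the failover scripts themselves.

# check_scripts DIR
# Reports each of the four failover scripts that is missing or not
# executable in DIR. Returns 0 if all are present, 1 otherwise.
check_scripts() {
    dir="$1"
    missing=0
    for name in failover.sh nagios_start_stop.sh restore_xi.sh rsync_xi.sh; do
        if [ ! -x "$dir/$name" ]; then
            echo "missing or not executable: $dir/$name"
            missing=1
        fi
    done
    return "$missing"
}

# check_ssh HOST
# Succeeds only if root can log in to HOST without any prompt.
# BatchMode=yes makes ssh fail instead of asking for a password or passphrase.
check_ssh() {
    ssh -o BatchMode=yes -o ConnectTimeout=5 "root@$1" true
}
```

Run as root on the primary, a check might look like `check_scripts /home/nagios/bin && check_ssh SECONDARY_HOST`, where SECONDARY_HOST stands in for the secondary's actual hostname.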