4.6. Configuring a Watchdog
4.6.1. Adding a Watchdog Card to a Virtual Machine
You can add a watchdog card to a virtual machine to monitor the operating system’s responsiveness.
Adding Watchdog Cards to Virtual Machines
- Click and select a virtual machine.
- Click .
- Click the High Availability tab.
- Select the watchdog model to use from the Watchdog Model drop-down list.
- Select an action from the Watchdog Action drop-down list. This is the action that the virtual machine takes when the watchdog is triggered.
- Click .
4.6.2. Installing a Watchdog
To activate a watchdog card attached to a virtual machine, you must install the watchdog package on that virtual machine and start the watchdog service.
Installing Watchdogs
- Log in to the virtual machine on which the watchdog card is attached.
- Install the watchdog package and dependencies:
  # yum install watchdog
- Edit the /etc/watchdog.conf file and uncomment the following line:
  watchdog-device = /dev/watchdog
- Save the changes.
- Start the watchdog service and ensure this service starts on boot:
  Red Hat Enterprise Linux 6:
  # service watchdog start
  # chkconfig watchdog on
  Red Hat Enterprise Linux 7:
  # systemctl start watchdog.service
  # systemctl enable watchdog.service
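The configuration edit in the procedure above can also be scripted. The following is a minimal sketch, assuming GNU sed (for the -i option); enable_watchdog_device is a hypothetical helper name and is not part of the watchdog package. The demonstration runs against a scratch file rather than /etc/watchdog.conf.

```shell
#!/bin/sh
# Hypothetical helper: uncomment the watchdog-device line in a
# watchdog.conf-style file. Defaults to /etc/watchdog.conf when no
# file is given. Assumes GNU sed for the -i (in-place) option.
enable_watchdog_device() {
    conf="${1:-/etc/watchdog.conf}"
    # Remove a leading "#" (and surrounding spaces) from the
    # watchdog-device line only; all other lines are left as they are.
    sed -i 's|^[[:space:]]*#[[:space:]]*\(watchdog-device[[:space:]]*=.*\)$|\1|' "$conf"
}

# Demonstration on a scratch file, so the real configuration is untouched.
demo_conf="$(mktemp)"
printf '%s\n' '# scratch copy for illustration' \
              '#watchdog-device = /dev/watchdog' > "$demo_conf"
enable_watchdog_device "$demo_conf"
grep '^watchdog-device' "$demo_conf"
```

On the virtual machine itself, the helper would be called as root without an argument, followed by the service commands shown in the procedure.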
4.6.3. Confirming Watchdog Functionality
Confirm that a watchdog card has been attached to a virtual machine and that the watchdog service is active.
This procedure is provided for testing the functionality of watchdogs only and must not be run on production machines.
Confirming Watchdog Functionality
- Log in to the virtual machine on which the watchdog card is attached.
- Confirm that the watchdog card has been identified by the virtual machine:
  # lspci | grep -i watchdog
- Run one of the following commands to confirm that the watchdog is active:
  - Trigger a kernel panic:
    # echo c > /proc/sysrq-trigger
  - Terminate the watchdog service:
    # kill -9 $(pgrep watchdog)
The watchdog timer can no longer be reset, so the watchdog counter reaches zero after a short period of time. When the watchdog counter reaches zero, the action specified in the Watchdog Action drop-down menu for that virtual machine is performed.
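Before running either of the destructive tests above, it can help to check that the prerequisites are in place. The following is a non-destructive sketch under stated assumptions: the helper names are hypothetical, the device path is the /dev/watchdog value used in the configuration file, and the daemon's process is assumed to be named watchdog.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when the given path (default
# /dev/watchdog) exists and is a character device.
watchdog_device_present() {
    [ -c "${1:-/dev/watchdog}" ]
}

# Hypothetical helper: succeeds when a process named exactly
# "watchdog" is running. The -x option requires an exact name match.
watchdog_daemon_running() {
    pgrep -x watchdog >/dev/null 2>&1
}

if watchdog_device_present; then
    echo "watchdog device: present"
else
    echo "watchdog device: missing"
fi

if watchdog_daemon_running; then
    echo "watchdog daemon: running"
else
    echo "watchdog daemon: not running"
fi
```

Neither check triggers the watchdog, so this sketch does not replace the tests in the procedure. It only reports whether those tests are likely to be meaningful.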
4.6.4. Parameters for Watchdogs in watchdog.conf
The following is a list of options for configuring the watchdog service available in the /etc/watchdog.conf file. To configure an option, you must uncomment that option and restart the watchdog service after saving the changes.
For a more detailed explanation of options for configuring the watchdog service and using the watchdog command, see the watchdog man page.
| Variable name | Default Value | Remarks |
|---|---|---|
| | N/A | An IP address that the watchdog attempts to ping to verify whether that address is reachable. You can specify multiple IP addresses by adding additional |
| | N/A | A network interface that the watchdog will monitor to verify the presence of network traffic. You can specify multiple network interfaces by adding additional |
| | | A file on the local system that the watchdog will monitor for changes. You can specify multiple files by adding additional |
| | | The number of watchdog intervals after which the watchdog checks for changes to files. A |
| | | The maximum average load that the virtual machine can sustain over a one-minute period. If this average is exceeded, then the watchdog is triggered. A value of |
| | | The maximum average load that the virtual machine can sustain over a five-minute period. If this average is exceeded, then the watchdog is triggered. A value of |
| | | The maximum average load that the virtual machine can sustain over a fifteen-minute period. If this average is exceeded, then the watchdog is triggered. A value of |
| | | The minimum amount of virtual memory that must remain free on the virtual machine. This value is measured in pages. A value of |
| | | The path and file name of a binary file on the local system that will be run when the watchdog is triggered. If the specified file resolves the issues preventing the watchdog from resetting the watchdog counter, then the watchdog action is not triggered. |
| | N/A | The path and file name of a binary file on the local system that the watchdog will attempt to run during each interval. A test binary allows you to specify a file for running user-defined tests. |
| | N/A | The time limit, in seconds, for which user-defined tests can run. A value of |
| | N/A | The path to and name of a device for checking the temperature of the machine on which the |
| | | The maximum allowed temperature for the machine on which the |
| | | The email address to which email notifications are sent. |
| | | The interval, in seconds, between updates to the watchdog device. The watchdog device expects an update at least once every minute, and if there are no updates over a one-minute period, then the watchdog is triggered. This one-minute period is hard-coded into the drivers for the watchdog device, and cannot be configured. |
| | | When verbose logging is enabled for the |
| | | Specifies whether the watchdog is locked in memory. A value of |
| | | The schedule priority when the value of |
| | | The path and file name of a PID file that the watchdog monitors to see if the corresponding process is still active. If the corresponding process is not active, then the watchdog is triggered. |
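Uncommenting an option and giving it a value in /etc/watchdog.conf can be sketched as follows. This is an illustration under stated assumptions, not a supported tool: set_watchdog_option is a hypothetical helper, GNU sed and grep are assumed, and the option names used in the demonstration (interval, max-load-1, max-load-15) are assumed from the watchdog.conf man page and should be checked against the man page installed on the virtual machine. The values in the scratch file are samples only.

```shell
#!/bin/sh
# set_watchdog_option FILE KEY VALUE
# Hypothetical helper (not shipped with the watchdog package): uncomments
# KEY if it is commented out and sets it to VALUE, or appends the option
# if FILE does not mention it. Intended for single-valued options; VALUE
# must not contain "|" or "&", which are special in the sed expression.
set_watchdog_option() {
    conf="$1"; key="$2"; value="$3"
    pattern="^[[:space:]]*#?[[:space:]]*${key}[[:space:]]*="
    if grep -Eq "$pattern" "$conf"; then
        sed -i -E "s|${pattern}.*|${key} = ${value}|" "$conf"
    else
        printf '%s = %s\n' "$key" "$value" >> "$conf"
    fi
}

# Demonstration on a scratch file, so /etc/watchdog.conf is not modified.
# The option names and values below are sample content for illustration.
scratch_conf="$(mktemp)"
printf '%s\n' '#interval = 1' '#max-load-1 = 24' '#max-load-15 = 12' > "$scratch_conf"
set_watchdog_option "$scratch_conf" interval 5
set_watchdog_option "$scratch_conf" max-load-1 20
cat "$scratch_conf"

# On the virtual machine, the same call would target /etc/watchdog.conf
# as root, followed by a service restart, for example:
#   service watchdog restart            (Red Hat Enterprise Linux 6)
#   systemctl restart watchdog.service  (Red Hat Enterprise Linux 7)
```

The pattern requires the option name to be followed directly by optional spaces and an equals sign, so setting max-load-1 leaves a max-load-15 line unchanged.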