Chapter 10. Keeping kernel panic parameters disabled in virtualized environments
Do not enable the softlockup_panic and nmi_watchdog kernel parameters when configuring virtual machines in Red Hat Enterprise Linux 10. These parameters can cause spurious soft lockups that result in a kernel panic.
The reasons behind this advice are as follows.
10.1. What is a soft lockup
A soft lockup usually indicates a bug. The affected task runs in kernel space on one CPU without rescheduling. The task also does not allow any other task to execute on that particular CPU.
As a result, a warning is displayed to a user through the system console. This problem is also referred to as the soft lockup firing.
10.2. Parameters controlling kernel panic
You can control how the system behaves during soft lockups by configuring certain kernel parameters. These parameters enable or disable the detection mechanisms, adjust the detection thresholds, and determine whether the kernel panics when a lockup is detected.
softlockup_panic
Controls whether the kernel panics when a soft lockup is detected.

Table 10.1. Valid values for softlockup_panic

| Type    | Value | Effect                               |
|---------|-------|--------------------------------------|
| Integer | 0     | kernel does not panic on soft lockup |
| Integer | 1     | kernel panics on soft lockup         |

By default, on RHEL 10, this value is 0.
The system needs to detect a hard lockup first to be able to panic. The detection is controlled by the nmi_watchdog parameter.

nmi_watchdog
Controls whether lockup detection mechanisms (watchdogs) are active or not. This parameter is of integer type.

Table 10.2. Valid values for nmi_watchdog

| Value | Effect                   |
|-------|--------------------------|
| 0     | disables lockup detector |
| 1     | enables lockup detector  |

The hard lockup detector monitors each CPU for its ability to respond to interrupts.

watchdog_thresh
Controls the frequency of the watchdog hrtimer, NMI events, and soft or hard lockup thresholds.

Table 10.3. Relationship between default threshold and soft lockup threshold for watchdog_thresh

| Default threshold | Soft lockup threshold |
|-------------------|-----------------------|
| 10 seconds        | 2 * watchdog_thresh   |

Setting this parameter to zero disables lockup detection altogether.
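As a rough illustration of how these parameters relate, the following Python sketch reads them from /proc/sys/kernel, where the kernel exposes them as sysctl files, and derives the soft lockup threshold from watchdog_thresh. The helper names read_param and summarize, and the optional base argument, are assumptions made for this example; they are not part of any Red Hat tool.

```python
from pathlib import Path

# Kernel parameters discussed in this section. The files exist only when
# the running kernel was built with lockup detector support.
PARAMS = ("softlockup_panic", "nmi_watchdog", "watchdog_thresh")


def read_param(name, base="/proc/sys/kernel"):
    """Return the integer value of a kernel parameter, or None if unavailable."""
    path = Path(base) / name
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


def summarize(base="/proc/sys/kernel"):
    """Collect the lockup-related parameters and the derived soft lockup threshold."""
    values = {name: read_param(name, base) for name in PARAMS}
    thresh = values["watchdog_thresh"]
    # The soft lockup threshold is 2 * watchdog_thresh (20 s with the default of 10).
    values["soft_lockup_threshold"] = None if thresh is None else 2 * thresh
    values["panics_on_soft_lockup"] = values["softlockup_panic"] == 1
    return values


if __name__ == "__main__":
    for key, value in summarize().items():
        print(f"{key}: {value}")
```

On a virtual machine that follows the advice in this chapter, panics_on_soft_lockup is expected to be False. The base argument exists only so that the functions can be pointed at a directory of sample files instead of the live system.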
10.3. Spurious soft lockups in virtualized environments
The soft lockup firing on physical hosts usually represents a kernel or a hardware bug. The same phenomenon happening on guest operating systems in virtualized environments might represent a false warning.
A heavy workload on a host or high contention over a specific resource, such as memory, can cause a spurious soft lockup firing, because the host might schedule out the guest CPU for longer than 20 seconds, which is the soft lockup threshold with the default watchdog_thresh value of 10. When the guest CPU is scheduled to run on the host again, it experiences a time jump that triggers all timers that became due in the meantime. These timers include the hrtimer watchdog, which can then report a soft lockup on the guest CPU.
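This sequence can be sketched with a toy model. The code below is not how the kernel implements the watchdog; it only mimics the idea that a per-CPU timestamp is refreshed while the guest CPU runs, and that a check on resume fires when the timestamp is older than the soft lockup threshold. The names GuestCpu, run, and pause are hypothetical, and the 20-second threshold assumes the default watchdog_thresh of 10.

```python
class GuestCpu:
    """Toy guest CPU whose watchdog timestamp is compared against elapsed time."""

    def __init__(self, watchdog_thresh=10):
        self.threshold = 2 * watchdog_thresh  # soft lockup threshold in seconds
        self.now = 0.0         # current time as seen by the guest
        self.last_touch = 0.0  # last time the watchdog timestamp was refreshed

    def run(self, seconds):
        """The guest CPU is scheduled: time passes and the timestamp stays fresh."""
        self.now += seconds
        self.last_touch = self.now

    def pause(self, seconds):
        """The host schedules the guest CPU out, then back in.

        Returns True if the watchdog check on resume would report a soft lockup.
        """
        self.now += seconds  # time jump seen by the guest on resume
        stalled_for = self.now - self.last_touch
        fired = stalled_for > self.threshold
        self.last_touch = self.now  # refreshed once the guest runs again
        return fired


cpu = GuestCpu()
cpu.run(5)
print(cpu.pause(8))   # False: 8 s is below the 20 s threshold
print(cpu.pause(25))  # True: 25 s exceeds the 20 s threshold
```

In the second case no guest task actually monopolized the CPU; the guest was simply not running. With softlockup_panic set to 1, such a report would panic an otherwise healthy guest.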
A soft lockup reported in a virtualized environment can therefore be a false positive. Do not enable the kernel parameters that trigger a system panic when a soft lockup is reported on a guest CPU.
To understand soft lockups in guests, it is essential to know that the host schedules the guest as a task, and the guest then schedules its own tasks.