Chapter 9. Keeping kernel panic parameters disabled in virtualized environments
When configuring a virtual machine in RHEL 9, do not enable the softlockup_panic and nmi_watchdog kernel parameters. A virtual machine might experience a spurious soft lockup, which should not cause a kernel panic.
Find the reasons behind this advice in the following sections.
9.1. What is a soft lockup
A soft lockup is a situation usually caused by a bug, when a task is executing in kernel space on a CPU without rescheduling. The task also does not allow any other task to execute on that particular CPU. As a result, a warning is displayed to a user through the system console. This problem is also referred to as the soft lockup firing.
9.2. Parameters controlling kernel panic
The following kernel parameters can be set to control a system’s behavior when a soft lockup is detected.
softlockup_panic
Controls whether or not the kernel will panic when a soft lockup is detected.
Type     Value  Effect
Integer  0      Kernel does not panic on soft lockup
Integer  1      Kernel panics on soft lockup
By default, on RHEL 8, this value is 0.
The system needs to detect a hard lockup first to be able to panic. The detection is controlled by the nmi_watchdog parameter.
nmi_watchdog
Controls whether lockup detection mechanisms (watchdogs) are active or not. This parameter is of integer type.
Value  Effect
0      Disables lockup detector
1      Enables lockup detector
The hard lockup detector monitors each CPU for its ability to respond to interrupts.
watchdog_thresh
Controls frequency of watchdog hrtimer, NMI events, and soft or hard lockup thresholds.
Default threshold  Soft lockup threshold
10 seconds         2 * watchdog_thresh
Setting this parameter to zero disables lockup detection altogether.
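As an illustration of the parameters described above, the following minimal Python sketch reads their current values and derives the soft lockup threshold as 2 * watchdog_thresh. It assumes the parameters are exposed as files under /proc/sys/kernel, which is the usual sysctl layout on Linux. The helper names read_param, soft_lockup_threshold, and summarize are hypothetical and not part of any RHEL tool.

```python
from pathlib import Path

# Assumption: the parameters are exposed as files under /proc/sys/kernel.
SYSCTL_DIR = Path("/proc/sys/kernel")


def read_param(name, sysctl_dir=SYSCTL_DIR):
    """Return the integer value of a kernel parameter, or None if unreadable."""
    try:
        return int((Path(sysctl_dir) / name).read_text().strip())
    except (OSError, ValueError):
        return None


def soft_lockup_threshold(watchdog_thresh):
    """Soft lockup threshold in seconds, computed as 2 * watchdog_thresh."""
    return 2 * watchdog_thresh


def summarize(sysctl_dir=SYSCTL_DIR):
    """Collect the three parameters from this section into one dictionary."""
    panic = read_param("softlockup_panic", sysctl_dir)
    nmi = read_param("nmi_watchdog", sysctl_dir)
    thresh = read_param("watchdog_thresh", sysctl_dir)
    return {
        "softlockup_panic": panic,
        "nmi_watchdog": nmi,
        "watchdog_thresh": thresh,
        # None when watchdog_thresh could not be read
        "soft_lockup_threshold_s": (
            None if thresh is None else soft_lockup_threshold(thresh)
        ),
        # True only when the kernel is set to panic on a soft lockup
        "panics_on_soft_lockup": panic == 1,
    }


if __name__ == "__main__":
    for key, value in summarize().items():
        print(f"{key}: {value}")
```

With the default watchdog_thresh of 10, the derived soft lockup threshold is 20 seconds. On a guest that follows the advice in this chapter, panics_on_soft_lockup is expected to be False.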
9.3. Spurious soft lockups in virtualized environments
The soft lockup firing on physical hosts usually represents a kernel or a hardware bug. The same phenomenon happening on guest operating systems in virtualized environments might represent a false warning.
Heavy workload on a host or high contention over some specific resource, such as memory, can cause a spurious soft lockup firing because the host might schedule out the guest CPU for a period longer than 20 seconds, which is the default soft lockup threshold. When the guest CPU is again scheduled to run on the host, it experiences a time jump that triggers the due timers. The timers also include the hrtimer watchdog that can report a soft lockup on the guest CPU.
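The time jump described above can be sketched with a deliberately simplified model. This is not the kernel's implementation: it only compares the age of a watchdog timestamp with the soft lockup threshold of 2 * watchdog_thresh. The function names and the pause durations are hypothetical values chosen for illustration.

```python
def soft_lockup_reported(last_touch_s, now_s, watchdog_thresh=10):
    """Return True when the watchdog timestamp is older than the threshold."""
    threshold_s = 2 * watchdog_thresh
    return (now_s - last_touch_s) > threshold_s


def guest_view_after_pause(pause_s, watchdog_thresh=10):
    """Model a guest CPU that the host scheduled out for pause_s seconds.

    The guest did nothing wrong during the pause. It simply did not run,
    so it could not refresh the watchdog timestamp.
    """
    last_touch_s = 0.0               # timestamp refreshed just before the pause
    now_s = last_touch_s + pause_s   # guest clock jumps forward on resume
    return soft_lockup_reported(last_touch_s, now_s, watchdog_thresh)


if __name__ == "__main__":
    print(guest_view_after_pause(5))    # False: pause shorter than 20 seconds
    print(guest_view_after_pause(25))   # True: spurious soft lockup report
```

In this model, no guest task held the CPU for 25 seconds, yet the check still reports a lockup, which is why such a report should not be allowed to cause a panic in a guest.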
Because a soft lockup in a virtualized environment can be a false positive, do not enable the kernel parameters that trigger a system panic when a soft lockup is reported on a guest CPU.
To understand soft lockups in guests, it is essential to know that the host schedules the guest as a task, and the guest then schedules its own tasks.
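One possible way to keep these parameters disabled persistently is a sysctl drop-in file. The following Python sketch only renders the text of such a file and does not write anything to disk. The keys kernel.softlockup_panic and kernel.nmi_watchdog are the sysctl names of the parameters discussed in this chapter. The function name and the example file name in the comment are assumptions made for illustration.

```python
def render_sysctl_dropin(settings):
    """Render key/value pairs as the text of a sysctl drop-in file."""
    lines = ["# Keep soft lockup panics disabled in virtual machines"]
    for key, value in settings.items():
        lines.append(f"{key} = {value}")
    return "\n".join(lines) + "\n"


# Values that follow the advice in this chapter: both parameters stay disabled.
GUEST_SETTINGS = {
    "kernel.softlockup_panic": 0,
    "kernel.nmi_watchdog": 0,
}

if __name__ == "__main__":
    # Hypothetical target: /etc/sysctl.d/99-softlockup.conf
    print(render_sysctl_dropin(GUEST_SETTINGS), end="")
```

Review the rendered text before placing it on a guest, and confirm that no kernel command-line option enables the same parameters.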