Chapter 3. Improving the network latency
CPU power management features can cause unwanted delays in time-sensitive application processing. You can disable some or all of these power management features to improve the network latency.
For example, if the latency is higher when the server is idle than under heavy load, CPU power management settings could influence the latency.
Disabling CPU power management features can increase power consumption and heat output.
3.1. How the CPU power states influence the network latency
The consumption states (C-states) of CPUs optimize and reduce the power consumption of computers. The C-states are numbered, starting at C0. In C0, the processor is fully powered and executing. In C1, the processor is fully powered but not executing. The higher the number of the C-state, the more components the CPU turns off.
Whenever a CPU core is idle, the built-in power saving logic steps in and attempts to move the core from the current C-state to a higher one by turning off various processor components. If the CPU core must process data, Red Hat Enterprise Linux (RHEL) sends an interrupt to the processor to wake up the core and set its C-state back to C0.
Moving out of deep C-states back to C0 takes time due to turning power back on to various components of the processor. On multi-core systems, it can also happen that many of the cores are simultaneously idle and, therefore, in deeper C-states. If RHEL tries to wake them up at the same time, the kernel can generate a large number of Inter-Processor Interrupts (IPIs) while all cores return from deep C-states. Due to locking that is required while processing interrupts, the system can then stall for some time while handling all the interrupts. This can result in large delays in the application response to events.
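To see how expensive leaving each idle state is on a particular host, you can read the exit latencies that the kernel exposes through sysfs. The following is a minimal sketch, assuming the standard cpuidle sysfs layout with name and latency files under cpuidle/state<N>/. The list_cstate_latency function name and its optional directory argument are illustrative and not part of RHEL:

```shell
#!/usr/bin/env bash
# Print the name and exit latency (in microseconds) of every idle state
# that the kernel exposes for one CPU.
# $1: cpuidle directory of a CPU (default: CPU 0 on the running system).
list_cstate_latency() {
    local cpuidle_dir="${1:-/sys/devices/system/cpu/cpu0/cpuidle}"
    local i=0
    # Idle states are numbered consecutively: state0, state1, ...
    while [ -r "$cpuidle_dir/state$i/latency" ]; do
        printf '%s %s us\n' \
            "$(cat "$cpuidle_dir/state$i/name")" \
            "$(cat "$cpuidle_dir/state$i/latency")"
        i=$((i + 1))
    done
}
```

Calling list_cstate_latency without arguments inspects CPU 0. On a host without cpuidle support, the function prints nothing.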
Example 3.1. Displaying times in C-state per core
The Idle Stats page in the PowerTOP application displays how much time the CPU cores spend in each C-state:
Pkg(HW)           | Core(HW)          |           CPU(OS) 0        CPU(OS) 4
                  |                   | C0 active   2.5%             2.2%
                  |                   | POLL        0.0%   0.0 ms    0.0%   0.1 ms
                  |                   | C1          0.1%   0.2 ms    0.0%   0.1 ms
C2 (pc2)    63.7% |                   |
C3 (pc3)     0.0% | C3 (cc3)     0.1% | C3          0.1%   0.1 ms    0.1%   0.1 ms
C6 (pc6)     0.0% | C6 (cc6)     8.3% | C6          5.2%   0.6 ms    6.0%   0.6 ms
C7 (pc7)     0.0% | C7 (cc7)    76.6% | C7s         0.0%   0.0 ms    0.0%   0.0 ms
C8 (pc8)     0.0% |                   | C8          6.3%   0.9 ms    5.8%   0.8 ms
C9 (pc9)     0.0% |                   | C9          0.4%   3.7 ms    2.2%   2.2 ms
C10 (pc10)   0.0% |                   |
                  |                   | C10        80.8%   3.7 ms   79.4%   4.4 ms
                  |                   | C1E         0.1%   0.1 ms    0.1%   0.1 ms
...
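If PowerTOP is not available, the raw counters behind such statistics can be read from sysfs. This is a rough sketch, assuming the standard cpuidle time (microseconds) and usage (number of entries) files. The cstate_residency function name is hypothetical. Unlike the percentages that PowerTOP measures over an interval, these counters are cumulative:

```shell
#!/usr/bin/env bash
# Print, for each idle state of one CPU, how often the state was entered
# and the cumulative time spent in it.
# $1: cpuidle directory of a CPU (default: CPU 0 on the running system).
cstate_residency() {
    local cpuidle_dir="${1:-/sys/devices/system/cpu/cpu0/cpuidle}"
    local i=0 name time_us usage
    while [ -r "$cpuidle_dir/state$i/time" ]; do
        name=$(cat "$cpuidle_dir/state$i/name")
        time_us=$(cat "$cpuidle_dir/state$i/time")    # cumulative, in microseconds
        usage=$(cat "$cpuidle_dir/state$i/usage")     # number of times entered
        # Integer division: the total is truncated to whole milliseconds.
        printf '%s entered=%s total_ms=%s\n' "$name" "$usage" "$((time_us / 1000))"
        i=$((i + 1))
    done
}
```

To approximate an interval measurement, run the function twice and compare the two outputs.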
3.2. C-state settings in the EFI firmware
In most systems with an EFI firmware, you can enable and disable the individual consumption states (C-states). However, on Red Hat Enterprise Linux (RHEL), the idle driver determines whether the kernel uses the settings from the firmware:
- intel_idle: This is the default driver on hosts with an Intel CPU and ignores the C-state settings from the EFI firmware.
- acpi_idle: RHEL uses this driver on hosts with CPUs from vendors other than Intel and if intel_idle is disabled. By default, the acpi_idle driver uses the C-state settings from the EFI firmware.
For further details, see the /usr/share/doc/kernel-doc-<version>/Documentation/admin-guide/pm/cpuidle.rst file provided by the kernel-doc package.
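The driver-dependent behavior described in this section can be summarized in a short helper. This is only an illustrative sketch: the describe_idle_driver function name and its messages are made up for this example, and it covers just the two drivers named above:

```shell
#!/usr/bin/env bash
# Report how the active idle driver treats the C-state settings
# from the EFI firmware, according to the rules in this section.
# $1: file that names the active driver (default: the sysfs entry).
describe_idle_driver() {
    local driver_file="${1:-/sys/devices/system/cpu/cpuidle/current_driver}"
    local driver
    # Fall back to "none" if the file is missing or unreadable.
    driver=$(cat "$driver_file" 2>/dev/null) || driver="none"
    case "$driver" in
        intel_idle) echo "intel_idle: ignores the C-state settings from the EFI firmware" ;;
        acpi_idle)  echo "acpi_idle: uses the C-state settings from the EFI firmware by default" ;;
        *)          echo "$driver: not covered in this section" ;;
    esac
}
```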
3.3. Disabling C-states by using a custom TuneD profile
The TuneD service uses the Power Management Quality of Service (PMQOS) interface of the kernel to lock consumption states (C-states). The kernel idle driver can communicate with this interface to dynamically limit the C-states. As a result, administrators do not have to hard code a maximum C-state value by using kernel command line parameters.
Prerequisites
- The tuned package is installed.
- The tuned service is enabled and running.
Procedure
Display the active profile:
# tuned-adm active
Current active profile: network-latency
Create a directory for the custom TuneD profile:
# mkdir /etc/tuned/network-latency-custom/
Create the /etc/tuned/network-latency-custom/tuned.conf file with the following content:
[main]
include=network-latency
[cpu]
force_latency=cstate.id:1|2
This custom profile inherits all settings from the network-latency profile. The force_latency TuneD parameter specifies the latency in microseconds (µs). If the C-state latency is higher than the specified value, the idle driver in Red Hat Enterprise Linux prevents the CPU from moving to a higher C-state. With force_latency=cstate.id:1|2, TuneD first checks if the /sys/devices/system/cpu/cpu<number>/cpuidle/state<cstate.id>/ directory exists. If it exists, TuneD reads the latency value from the latency file in this directory. If the directory does not exist, TuneD uses 2 microseconds as a fallback value.
Activate the network-latency-custom profile:
# tuned-adm profile network-latency-custom
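The lookup that force_latency=cstate.id:1|2 triggers can be sketched in shell. This mirrors only the logic described above and is not TuneD's implementation. The resolve_force_latency function name is hypothetical, and inspecting CPU 0 by default is an assumption made for this example:

```shell
#!/usr/bin/env bash
# Resolve a force_latency value of the form "cstate.id:<id>|<fallback>"
# to a latency in microseconds.
# $1: the value, for example "cstate.id:1|2"
# $2: cpuidle directory of a CPU (default: CPU 0, an assumption)
resolve_force_latency() {
    local value="$1"
    local cpuidle_dir="${2:-/sys/devices/system/cpu/cpu0/cpuidle}"
    local spec="${value#cstate.id:}"    # "1|2"
    local id="${spec%%|*}"              # C-state ID before the "|"
    local fallback="${spec#*|}"         # fallback latency after the "|"
    local latency_file="$cpuidle_dir/state$id/latency"
    if [ -r "$latency_file" ]; then
        # The state exists: use the latency that the kernel reports for it.
        cat "$latency_file"
    else
        # The state does not exist: use the fallback value.
        echo "$fallback"
    fi
}
```

For example, resolve_force_latency "cstate.id:1|2" prints the exit latency of state1 if that state exists on the host, and 2 otherwise.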
3.4. Disabling C-states by using a kernel command line option
The processor.max_cstate and intel_idle.max_cstate kernel command line parameters configure the maximum consumption state (C-state) that CPU cores can use. For example, setting the parameters to 1 ensures that the CPU never requests a C-state deeper than C1.
Use this method to test whether the latency of applications on a host is being affected by C-states. To avoid hard coding a specific state, consider using a more dynamic solution. See Disabling C-states by using a custom TuneD profile.
Prerequisites
- The tuned service is not running or is configured to not update C-state settings.
Procedure
Display the idle driver the system uses:
# cat /sys/devices/system/cpu/cpuidle/current_driver
intel_idle
For details about the driver, see the /usr/share/doc/kernel-doc-<version>/Documentation/admin-guide/pm/cpuidle.rst file provided by the kernel-doc package.
If the host uses the intel_idle driver, set the intel_idle.max_cstate kernel parameter to define the highest C-state that CPU cores should be able to use:
# grubby --update-kernel=ALL --args="intel_idle.max_cstate=0"
Setting intel_idle.max_cstate=0 disables the intel_idle driver. Consequently, the kernel uses the acpi_idle driver, which uses the C-state values set in the EFI firmware. For this reason, also set processor.max_cstate to override these C-state settings.
On every host, independent of the CPU vendor, set the highest C-state that CPU cores should be able to use:
# grubby --update-kernel=ALL --args="processor.max_cstate=0"
Important: If you set processor.max_cstate=0 in addition to intel_idle.max_cstate=0, the acpi_idle driver overrides the value of processor.max_cstate and sets it to 1. As a result, with processor.max_cstate=0 intel_idle.max_cstate=0, the highest C-state the kernel will use is C1, not C0.
Restart the host for the changes to take effect:
# reboot
Verification
Display the maximum C-state:
# cat /sys/module/processor/parameters/max_cstate
1
If the host uses the intel_idle driver, display the maximum C-state:
# cat /sys/module/intel_idle/parameters/max_cstate
0
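In addition to the module parameters, you might want to confirm that the options actually reached the kernel command line of the running system. The following is a small sketch that parses /proc/cmdline. The cmdline_param function name is illustrative. If a parameter appears more than once, the sketch reports the last occurrence, which is an assumption about how you want duplicates handled:

```shell
#!/usr/bin/env bash
# Print the value of a kernel command line parameter, or nothing if it is unset.
# $1: parameter name, for example "intel_idle.max_cstate"
# $2: file that holds the command line (default: /proc/cmdline)
cmdline_param() {
    local name="$1" cmdline_file="${2:-/proc/cmdline}"
    local -a args
    local arg value=""
    # Split the single-line command line into whitespace-separated words.
    read -r -a args < "$cmdline_file"
    for arg in "${args[@]}"; do
        case "$arg" in
            "$name"=*) value="${arg#*=}" ;;   # keep the last occurrence
        esac
    done
    [ -n "$value" ] && echo "$value"
    return 0
}
```

After the reboot, cmdline_param processor.max_cstate should print the value that you passed to grubby. Note that the effective limit reported under /sys/module/ can differ, as described in the procedure.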