Chapter 3. Improving the network latency
CPU power management features can cause unwanted delays in time-sensitive application processing. You can disable some or all of these power management features to improve the network latency.
For example, if the latency is higher when the server is idle than under heavy load, CPU power management settings could influence the latency.
Disabling CPU power management features can increase power consumption and heat output.
3.1. How the CPU power states influence the network latency
The consumption states (C-states) of CPUs optimize and reduce the power consumption of computers.
The C-states are numbered, starting at C0. In C0, the processor is fully powered and executing. In C1, the processor is fully powered but not executing. The higher the number of the C-state, the more components the CPU turns off.
Whenever a CPU core is idle, the built-in power saving logic steps in and attempts to move the core from the current C-state to a higher one by turning off various processor components. If the CPU core must process data, Red Hat Enterprise Linux (RHEL) sends an interrupt to the processor to wake up the core and set its C-state back to C0.
Moving out of deep C-states back to C0 takes time due to turning power back on to various components of the processor. On multi-core systems, it can also happen that many of the cores are simultaneously idle and, therefore, in deeper C-states. If RHEL tries to wake them up at the same time, the kernel can generate a large number of Inter-Processor Interrupts (IPIs) while all cores return from deep C-states. Due to locking that is required while processing interrupts, the system can then stall for some time while handling all the interrupts. This can result in large delays in the application response to events.
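To see how costly a wake-up from each state is on a given host, you can read the exit latency that the kernel exposes for every idle state. The following is a minimal sketch, assuming the standard cpuidle sysfs layout (`cpuidle/state<N>/name` and `cpuidle/state<N>/latency`, with the latency in microseconds). The `read_cstates` helper is a hypothetical name used only for illustration.

```python
from pathlib import Path


def read_cstates(cpu_dir):
    """Return (name, exit_latency_us) for each idle state of one CPU.

    cpu_dir is a directory such as /sys/devices/system/cpu/cpu0.
    """
    cpuidle = Path(cpu_dir) / "cpuidle"
    # Sort by the numeric suffix so that state10 comes after state9.
    state_dirs = sorted(
        cpuidle.glob("state[0-9]*"),
        key=lambda p: int(p.name[len("state"):]),
    )
    states = []
    for state in state_dirs:
        name = (state / "name").read_text().strip()
        latency_us = int((state / "latency").read_text())
        states.append((name, latency_us))
    return states
```

Calling `read_cstates("/sys/devices/system/cpu/cpu0")` on a host with an active idle driver returns one entry per state. Deeper states typically report larger exit latencies.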
Example 3.1. Displaying times in C-state per core
The Idle Stats page in the PowerTOP application displays how much time the CPU cores spend in each C-state:
Pkg(HW) | Core(HW) | CPU(OS) 0 CPU(OS) 4
| | C0 active 2.5% 2.2%
| | POLL 0.0% 0.0 ms 0.0% 0.1 ms
| | C1 0.1% 0.2 ms 0.0% 0.1 ms
C2 (pc2) 63.7% | |
C3 (pc3) 0.0% | C3 (cc3) 0.1% | C3 0.1% 0.1 ms 0.1% 0.1 ms
C6 (pc6) 0.0% | C6 (cc6) 8.3% | C6 5.2% 0.6 ms 6.0% 0.6 ms
C7 (pc7) 0.0% | C7 (cc7) 76.6% | C7s 0.0% 0.0 ms 0.0% 0.0 ms
C8 (pc8) 0.0% | | C8 6.3% 0.9 ms 5.8% 0.8 ms
C9 (pc9) 0.0% | | C9 0.4% 3.7 ms 2.2% 2.2 ms
C10 (pc10) 0.0% | |
| | C10 80.8% 3.7 ms 79.4% 4.4 ms
| | C1E 0.1% 0.1 ms 0.1% 0.1 ms
...
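If PowerTOP is not available, comparable percentages can be approximated from the cumulative `time` counter in each `cpuidle/state<N>/` directory, which the kernel reports in microseconds. This is a rough sketch and not a description of how PowerTOP computes its values. The `residency_percent` function name is an assumption for illustration.

```python
def residency_percent(time_before_us, time_after_us, interval_us):
    """Share of a sampling interval that a CPU spent in one idle state.

    time_before_us and time_after_us are two readings of the cumulative
    cpuidle/state<N>/time counter, taken interval_us microseconds apart.
    """
    if interval_us <= 0:
        raise ValueError("interval_us must be positive")
    return 100.0 * (time_after_us - time_before_us) / interval_us
```

For example, a counter that advances by 808000 µs during a one-second sample corresponds to a residency of about 80.8%.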
3.2. C-state settings in the EFI firmware
In most systems with an EFI firmware, you can enable and disable the individual consumption states (C-states). However, on RHEL, the idle driver determines whether the kernel uses the settings from the firmware.
These drivers are available:
- intel_idle: This is the default driver on hosts with an Intel CPU and ignores the C-state settings from the EFI firmware.
- acpi_idle: RHEL uses this driver on hosts with CPUs from vendors other than Intel and if intel_idle is disabled. By default, the acpi_idle driver uses the C-state settings from the EFI firmware.
For further details, see the /usr/share/doc/kernel-doc-<version>/Documentation/admin-guide/pm/cpuidle.rst file provided by the kernel-doc package.
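The driver distinction above can be checked programmatically. The sketch below reads the active idle driver from `/sys/devices/system/cpu/cpuidle/current_driver` and maps it to the behavior described in this section. The `firmware_cstates_honored` helper and the lookup table are illustrative assumptions, and any driver not listed here is reported as unknown.

```python
from pathlib import Path

# Behavior as described in this section; other drivers map to None (unknown).
FIRMWARE_SETTINGS_USED = {"intel_idle": False, "acpi_idle": True}


def firmware_cstates_honored(cpu_root="/sys/devices/system/cpu"):
    """Return (driver, honored), where honored is True, False, or None."""
    driver_file = Path(cpu_root) / "cpuidle" / "current_driver"
    driver = driver_file.read_text().strip()
    return driver, FIRMWARE_SETTINGS_USED.get(driver)
```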
3.3. Disabling C-states by using a custom TuneD profile
With the TuneD service, administrators do not need to hard code a maximum C-state value in kernel command line parameters.
The TuneD service uses the Power Management Quality of Service (PMQOS) interface of the kernel to lock consumption states (C-states). The kernel idle driver can communicate with this interface to dynamically limit the C-states.
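To illustrate how a PM QoS latency request works in general, the following sketch uses the kernel's `/dev/cpu_dma_latency` device: a process opens the device, writes the tolerated latency as a signed 32-bit integer, and the request stays in effect only while the file descriptor remains open. That TuneD uses exactly this device node is an assumption here, and `request_max_latency` is a hypothetical helper, not part of TuneD.

```python
import os
import struct


def request_max_latency(max_latency_us, device="/dev/cpu_dma_latency"):
    """Ask the kernel to keep CPU wake-up latency at or below max_latency_us.

    Returns an open file descriptor. The request is active only while the
    descriptor stays open, so the caller must keep it and close it later.
    """
    fd = os.open(device, os.O_WRONLY)
    try:
        # The device accepts the target as a native signed 32-bit integer.
        os.write(fd, struct.pack("i", max_latency_us))
    except OSError:
        os.close(fd)
        raise
    return fd
```

Opening the real device requires root privileges. Closing the descriptor with `os.close(fd)` withdraws the request, which is why a long-running service is a natural owner for it.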
Prerequisites
- The tuned package is installed.
- The tuned service is enabled and running.
Procedure
Display the active profile:
    # tuned-adm active
    Current active profile: network-latency
Create a directory for the custom TuneD profile:
    # mkdir /etc/tuned/network-latency-custom/
Create the /etc/tuned/network-latency-custom/tuned.conf file with the following content:
    [main]
    include=network-latency
    [cpu]
    force_latency=cstate.id:1|2
This custom profile inherits all settings from the network-latency profile. The force_latency TuneD parameter specifies the latency in microseconds (µs). If the C-state latency is higher than the specified value, the idle driver in Red Hat Enterprise Linux prevents the CPU from moving to a higher C-state. With force_latency=cstate.id:1|2, TuneD first checks if the /sys/devices/system/cpu/cpu<number>/cpuidle/state<cstate.id>/ directory exists. If it does, TuneD reads the latency value from the latency file in this directory. If the directory does not exist, TuneD uses 2 microseconds as a fallback value.
Activate the network-latency-custom profile:
    # tuned-adm profile network-latency-custom
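The lookup that force_latency=cstate.id:1|2 triggers can be sketched as follows. This is an illustration of the behavior described in this procedure, not TuneD's actual code: `resolve_force_latency` is a hypothetical function, and it handles only the `cstate.id:<id>|<fallback>` form and plain numeric values.

```python
from pathlib import Path


def resolve_force_latency(value, cpu_dir="/sys/devices/system/cpu/cpu0"):
    """Resolve a force_latency value such as 'cstate.id:1|2' to microseconds."""
    prefix = "cstate.id:"
    if not value.startswith(prefix):
        # A plain number is already a latency in microseconds.
        return int(value)
    state_id, _, fallback = value[len(prefix):].partition("|")
    latency_file = (
        Path(cpu_dir) / "cpuidle" / f"state{int(state_id)}" / "latency"
    )
    if latency_file.is_file():
        # The state exists, so use the latency the kernel reports for it.
        return int(latency_file.read_text())
    # The state does not exist, so use the value after the '|' separator.
    return int(fallback)
```

On a host where `state1` exists, the function returns that state's reported latency. Otherwise it returns 2 for the value used in this procedure.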
3.4. Disabling C-states by using a kernel command line option
To test whether C-states affect the latency of applications on a host, temporarily disable consumption states (C-states) by using a kernel command line option.
The processor.max_cstate and intel_idle.max_cstate kernel command line parameters configure the maximum C-state that CPU cores can use. For example, setting the parameters to 1 ensures that the CPU never requests a C-state deeper than C1.
To avoid hard coding a specific state, consider using a more dynamic solution. See Disabling C-states by using a custom TuneD profile.
Prerequisites
- The tuned service is not running or is configured to not update C-state settings.
Procedure
Display the idle driver the system uses:
    # cat /sys/devices/system/cpu/cpuidle/current_driver
    intel_idle
For details about the driver, see the /usr/share/doc/kernel-doc-<version>/Documentation/admin-guide/pm/cpuidle.rst file provided by the kernel-doc package.
If the host uses the intel_idle driver, set the intel_idle.max_cstate kernel parameter to define the highest C-state that CPU cores should be able to use:
    # grubby --update-kernel=ALL --args="intel_idle.max_cstate=0"
Setting intel_idle.max_cstate=0 disables the intel_idle driver. Consequently, the kernel uses the acpi_idle driver that uses the C-state values set in the EFI firmware. For this reason, also set processor.max_cstate to override these C-state settings.
On every host, independent from the CPU vendor, set the highest C-state that CPU cores should be able to use:
    # grubby --update-kernel=ALL --args="processor.max_cstate=0"
Important: If you set processor.max_cstate=0 in addition to intel_idle.max_cstate=0, the acpi_idle driver overrides the value of processor.max_cstate and sets it to 1. As a result, with processor.max_cstate=0 intel_idle.max_cstate=0, the highest C-state the kernel will use is C1, not C0.
Restart the host for the changes to take effect:
    # reboot
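After the reboot, you can confirm that the boot loader passed the parameters by inspecting the kernel command line in `/proc/cmdline`. The sketch below extracts the two parameters from such a string. `max_cstate_args` is a hypothetical helper, and the parsing is deliberately simple (whitespace-separated `key=value` tokens).

```python
def max_cstate_args(cmdline):
    """Extract the two max_cstate parameters from a kernel command line."""
    wanted = ("processor.max_cstate", "intel_idle.max_cstate")
    found = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if sep and key in wanted:
            # If a parameter appears twice, this keeps the last occurrence.
            found[key] = int(value)
    return found
```

A typical use is `max_cstate_args(open("/proc/cmdline").read())`. An empty result means that neither parameter was set at boot.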
Verification
Display the maximum C-state:
    # cat /sys/module/processor/parameters/max_cstate
    1
If the host uses the intel_idle driver, display the maximum C-state:
    # cat /sys/module/intel_idle/parameters/max_cstate
    0
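The same verification can be scripted, for example to check several hosts. This is a minimal sketch that reads the parameter files shown above. `read_max_cstate` is a hypothetical helper, and it returns None when the module does not expose the parameter.

```python
from pathlib import Path


def read_max_cstate(module, module_root="/sys/module"):
    """Return the max_cstate parameter of a kernel module, or None if absent."""
    param = Path(module_root) / module / "parameters" / "max_cstate"
    if not param.is_file():
        return None
    return int(param.read_text())
```

With the settings from this procedure, `read_max_cstate("processor")` is expected to return 1 and `read_max_cstate("intel_idle")` is expected to return 0, matching the command output above.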