Chapter 7. Running and interpreting hardware and firmware latency tests
With the hwlatdetect program, you can test and verify whether a potential hardware platform is suitable for real-time operations.
7.1. Running hardware and firmware latency tests
You can use the hwlatdetect program to test for latencies introduced by the hardware architecture or firmware.
You do not need to run any load on the system while running the hwlatdetect program, because the test looks for latencies introduced by the hardware architecture or firmware. By default, hwlatdetect polls for 0.5 seconds out of each second and reports any gap greater than 10 microseconds between consecutive calls to fetch the time.
hwlatdetect returns the best maximum latency possible on the system. Therefore, if you have an application that requires maximum latency values of less than 10us and hwlatdetect reports one of the gaps as 20us, then the system can only guarantee a latency of 20us.
If hwlatdetect shows that the system cannot meet the latency requirements of the application, try changing the firmware settings or working with the system vendor to get new firmware that meets the latency requirements of the application.
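The comparison described above can be sketched in a few lines of Python. This is only an illustration of the reasoning, not part of hwlatdetect: the function names `worst_gap_us`, `meets_requirement`, and `sampling_fraction` are hypothetical, and the gap values are assumed to be ones you have already read from a hwlatdetect report.

```python
def worst_gap_us(observed_gaps_us, threshold_us=10):
    """Return the largest reported gap above the threshold, or None if there is none."""
    over = [gap for gap in observed_gaps_us if gap > threshold_us]
    return max(over) if over else None


def meets_requirement(required_us, observed_gaps_us, threshold_us=10):
    """Decide whether the observed gaps are compatible with the application's limit."""
    worst = worst_gap_us(observed_gaps_us, threshold_us)
    if worst is None:
        # Nothing was reported, so the test only vouches for the threshold itself.
        return required_us >= threshold_us
    return worst <= required_us


def sampling_fraction(width_us=500_000, window_us=1_000_000):
    """Fraction of each window spent polling (0.5 with the default values)."""
    return width_us / window_us


# The case from the text: a 10us requirement and one reported gap of 20us.
print(meets_requirement(10, [20]))
print(worst_gap_us([20]))
print(sampling_fraction())
```

A report with no gaps above the threshold says nothing about limits tighter than the threshold, which is why the sketch treats that case separately.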
Prerequisites
- Ensure that the RHEL-RT (RHEL for Real Time) and realtime-tests packages are installed.
- Check the vendor documentation for any tuning steps required for low latency operation.
The vendor documentation can provide instructions to reduce or remove any System Management Interrupts (SMIs) that would move the system into System Management Mode (SMM). While a system is in SMM, it runs firmware and not operating system code. This means that any timers that expire while in SMM wait until the system returns to normal operation. This can cause unexplained latencies, because Linux cannot block SMIs, and the only indication that an SMI occurred can be found in vendor-specific performance counter registers.
Warning: Red Hat strongly recommends that you do not completely disable SMIs, because doing so can result in catastrophic hardware failure.
Procedure
- Run hwlatdetect, specifying the test duration in seconds.

  hwlatdetect looks for hardware and firmware-induced latencies by polling the clock-source and looking for unexplained gaps.

      # hwlatdetect --duration=60s
      hwlatdetect: test duration 60 seconds
      detector: tracer
      parameters:
      Latency threshold: 10us
      Sample window: 1000000us
      Sample width: 500000us
      Non-sampling period: 500000us
      Output File: None
      Starting test
      test finished
      Max Latency: Below threshold
      Samples recorded: 0
      Samples exceeding threshold: 0

Tip: For more information about hwlatdetect, see the hwlatdetect man page on your system.
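If you want to run the test from a script, the following Python sketch shows one way to invoke hwlatdetect and read the summary fields. It assumes that hwlatdetect is on the PATH, that the script runs as root, and that the report uses the same field labels as the output shown in this section. The helper names `run_hwlatdetect` and `parse_summary` are hypothetical.

```python
import re
import subprocess


def run_hwlatdetect(duration_s):
    """Run hwlatdetect for duration_s seconds and return its text report.

    Assumes root privileges and an installed realtime-tests package.
    """
    result = subprocess.run(
        ["hwlatdetect", f"--duration={duration_s}s"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def parse_summary(report):
    """Extract the summary fields from a hwlatdetect report."""
    fields = {}
    for line in report.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    # "Max Latency" is either a value such as "18us" or the text "Below threshold".
    match = re.fullmatch(r"(\d+)us", fields.get("Max Latency", ""))
    return {
        "max_latency_us": int(match.group(1)) if match else None,
        "samples_recorded": int(fields["Samples recorded"]),
        "samples_exceeding": int(fields["Samples exceeding threshold"]),
    }


# Report text copied from the example in this section.
CLEAN_REPORT = """\
hwlatdetect: test duration 60 seconds
detector: tracer
parameters:
Latency threshold: 10us
Sample window: 1000000us
Sample width: 500000us
Non-sampling period: 500000us
Output File: None
Starting test
test finished
Max Latency: Below threshold
Samples recorded: 0
Samples exceeding threshold: 0
"""

print(parse_summary(CLEAN_REPORT))
```

On a real system you would pass the return value of `run_hwlatdetect(60)` to `parse_summary` instead of the canned text.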
7.2. Interpreting hardware and firmware latency test results
The hardware latency detector (hwlatdetect) uses the tracer mechanism to detect latencies introduced by the hardware architecture or firmware. By checking the latencies measured by hwlatdetect, you can determine whether a potential hardware platform is suitable to support the RHEL for Real Time kernel.
Example 7.1. Examples
The example result represents a system tuned to minimize system interruptions from firmware. In this situation, the output of hwlatdetect looks like this:

    # hwlatdetect --duration=60s
    hwlatdetect: test duration 60 seconds
    detector: tracer
    parameters:
    Latency threshold: 10us
    Sample window: 1000000us
    Sample width: 500000us
    Non-sampling period: 500000us
    Output File: None
    Starting test
    test finished
    Max Latency: Below threshold
    Samples recorded: 0
    Samples exceeding threshold: 0

The example result represents a system that could not be tuned to minimize system interruptions from firmware. In this situation, the output of hwlatdetect looks like this:

    # hwlatdetect --duration=10s
    hwlatdetect: test duration 10 seconds
    detector: tracer
    parameters:
    Latency threshold: 10us
    Sample window: 1000000us
    Sample width: 500000us
    Non-sampling period: 500000us
    Output File: None
    Starting test
    test finished
    Max Latency: 18us
    Samples recorded: 10
    Samples exceeding threshold: 10
    SMIs during run: 0
    ts: 1519674281.220664736, inner:17, outer:15
    ts: 1519674282.721666674, inner:18, outer:17
    ts: 1519674283.722667966, inner:16, outer:17
    ts: 1519674284.723669259, inner:17, outer:18
    ts: 1519674285.724670551, inner:16, outer:17
    ts: 1519674286.725671843, inner:17, outer:17
    ts: 1519674287.726673136, inner:17, outer:16
    ts: 1519674288.727674428, inner:16, outer:18
    ts: 1519674289.728675721, inner:17, outer:17
    ts: 1519674290.729677013, inner:18, outer:17

The output shows that during the consecutive reads of the system clocksource, there were 10 delays that showed up in the 15-18 us range.
Note: Previous versions of hwlatdetect used a kernel module rather than the ftrace tracer.
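To check a report like the second example programmatically, you can extract the per-sample lines and compare them with the reported maximum. The following Python sketch assumes the `ts: ..., inner:N, outer:N` line format shown above; the regular expression and the `parse_samples` helper are illustrative and not part of hwlatdetect.

```python
import re

# Matches lines such as "ts: 1519674281.220664736, inner:17, outer:15".
SAMPLE_RE = re.compile(r"ts:\s*(\d+\.\d+),\s*inner:(\d+),\s*outer:(\d+)")


def parse_samples(report):
    """Return one dict per sample line; timestamps stay strings to keep full precision."""
    return [
        {"ts": ts, "inner": int(inner), "outer": int(outer)}
        for ts, inner, outer in SAMPLE_RE.findall(report)
    ]


# Sample lines copied from the example output in this section.
REPORT = """\
ts: 1519674281.220664736, inner:17, outer:15
ts: 1519674282.721666674, inner:18, outer:17
ts: 1519674283.722667966, inner:16, outer:17
ts: 1519674284.723669259, inner:17, outer:18
ts: 1519674285.724670551, inner:16, outer:17
ts: 1519674286.725671843, inner:17, outer:17
ts: 1519674287.726673136, inner:17, outer:16
ts: 1519674288.727674428, inner:16, outer:18
ts: 1519674289.728675721, inner:17, outer:17
ts: 1519674290.729677013, inner:18, outer:17
"""

samples = parse_samples(REPORT)
values = [v for s in samples for v in (s["inner"], s["outer"])]
print(len(samples), min(values), max(values))  # prints "10 15 18"
```

The largest value across both columns matches the Max Latency line of the report, and the smallest and largest values give the 15-18 us range quoted above.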
7.2.1. Understanding the results
The table for testing method, parameters, and results lists the parameters and the latency values detected by the hwlatdetect utility, and helps you understand what each of them means.
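| Parameter | Value | Description |
|---|---|---|
| test duration | 10 seconds | The duration of the test in seconds |
| detector | tracer | The utility that runs the detector thread |
| parameters | | |
| Latency threshold | 10us | The maximum allowable latency |
| Sample window | 1000000us | 1 second |
| Sample width | 500000us | 0.5 seconds |
| Non-sampling period | 500000us | 0.5 seconds |
| Output File | None | The file to which the output is saved. |
| Results | | |
| Max Latency | 18us | The highest latency during the test that exceeded the Latency threshold. If no sample exceeded the Latency threshold, the report shows Below threshold. |
| Samples recorded | 10 | The number of samples recorded by the test. |
| Samples exceeding threshold | 10 | The number of samples recorded by the test where the latency exceeded the Latency threshold. |
| SMIs during run | 0 | The number of System Management Interrupts (SMIs) that occurred during the test run. |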
The values printed by the hwlatdetect utility for inner and outer are the maximum latency values. They are deltas between consecutive reads of the current system clocksource (usually the TSC register, but potentially the HPET or ACPI power management clock), and they capture any delays between consecutive reads introduced by the hardware-firmware combination.
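The idea of measuring deltas between consecutive clock reads can be illustrated with a user-space analogy in Python. This is only a sketch under stated assumptions: it takes the inner value to be the gap between two back-to-back reads in one loop pass and the outer value to be the gap between the last read of one pass and the first read of the next, as the kernel hwlat tracer documentation describes them. The function `sample_gaps` is hypothetical, and because it runs in an interpreter with interrupts enabled, its numbers reflect interpreter and scheduler noise, not hardware or firmware latency.

```python
import time


def sample_gaps(iterations=1000):
    """Return (max_inner_ns, max_outer_ns) over back-to-back clock reads."""
    max_inner = 0
    max_outer = 0
    last_t2 = None
    for _ in range(iterations):
        t1 = time.perf_counter_ns()  # first read of this pass
        t2 = time.perf_counter_ns()  # second read, immediately after
        if last_t2 is not None:
            # Outer gap: end of the previous pass to the start of this one.
            max_outer = max(max_outer, t1 - last_t2)
        # Inner gap: the two consecutive reads within this pass.
        max_inner = max(max_inner, t2 - t1)
        last_t2 = t2
    return max_inner, max_outer


inner_ns, outer_ns = sample_gaps()
print(f"max inner gap: {inner_ns} ns, max outer gap: {outer_ns} ns")
```

Tracking both gaps means that an interruption is visible whether it lands between the two reads of a pass or between one pass and the next.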
After finding a suitable hardware-firmware combination, the next step is to test the real-time performance of the system while under a load.