Chapter 32. Stress testing real-time systems with stress-ng
The stress-ng tool measures how well a system maintains an acceptable level of efficiency under unfavorable conditions. It is a stress workload generator that loads and stresses all kernel interfaces, and it includes a wide range of stress mechanisms known as stressors. Stress testing makes a machine work hard and trips hardware issues, such as thermal overruns, and operating system bugs that occur when a system is overworked.
There are over 270 different tests. These include CPU-specific tests that exercise floating point, integer, bit manipulation, and control flow, as well as virtual memory tests.
Use the stress-ng tool with caution, because some of the tests can push the system to its thermal zone trip points on poorly designed hardware. This can impact system performance and cause excessive system thrashing, which can be difficult to stop.
32.1. Testing CPU floating point units and processor data cache
A floating point unit is the functional part of the processor that performs floating point arithmetic operations, that is, calculations on numbers with a fractional part.
Using the --matrix option, you can stress test the CPU floating point operations and the processor data cache.
Prerequisites
- You have root permissions on the system.
Procedure
To test the floating point unit on one CPU for 60 seconds, use the --matrix option:
# stress-ng --matrix 1 -t 1m
To run one matrix stressor on each available CPU for 60 seconds and print a summary of the CPU time used, specify 0 stressors and add the --times option:
# stress-ng --matrix 0 -t 1m --times
stress-ng: info: [16783] dispatching hogs: 4 matrix
stress-ng: info: [16783] successful run completed in 60.00s (1 min, 0.00 secs)
stress-ng: info: [16783] for a 60.00s run time:
stress-ng: info: [16783] 240.00s available CPU time
stress-ng: info: [16783] 205.21s user time ( 85.50%)
stress-ng: info: [16783] 0.32s system time ( 0.13%)
stress-ng: info: [16783] 205.53s total time ( 85.64%)
stress-ng: info: [16783] load average: 3.20 1.25 1.40
Specifying 0 stressors is a special mode in which stress-ng queries the number of available CPUs and runs one stressor on each, removing the need to specify the CPU count.
The total available CPU time is 4 x 60 seconds (240 seconds), of which 85.50% is user time and 0.13% is kernel (system) time. In total, stress-ng uses 85.64% of the available CPU time.
To test message passing between processes using a POSIX message queue, use the --mq option:
# stress-ng --mq 0 -t 30s --times --perf
The --mq option starts the specified number of processes, which force context switches by using the POSIX message queue. This stress test aims for low data cache misses.
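The percentages in the --times report can be checked by hand. The following is a minimal sketch that recomputes them from the figures in the example output. It is not part of stress-ng, and the function name cpu_time_percent is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: percentage of the available CPU time that was used.
# $1 = CPU seconds consumed, $2 = number of CPUs, $3 = run time in seconds
cpu_time_percent() {
    awk -v used="$1" -v cpus="$2" -v secs="$3" \
        'BEGIN { printf "%.2f\n", 100 * used / (cpus * secs) }'
}

# Figures from the example run: 4 CPUs for 60 seconds = 240 s available.
cpu_time_percent 205.21 4 60   # user time
cpu_time_percent 0.32 4 60     # system time
cpu_time_percent 205.53 4 60   # total time
```

With the example figures, the three calls print 85.50, 0.13, and 85.64, which matches the report.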
32.2. Testing CPU with multiple stress mechanisms
The stress-ng tool can run multiple stress tests. In the default mode, it runs the specified stressor mechanisms in parallel.
Prerequisites
- You have root privileges on the system.
Procedure
To run multiple instances of several stressors, specify each stressor with its instance count:
# stress-ng --cpu 2 --matrix 1 --mq 3 -t 5m
In this example, stress-ng runs two instances of the CPU stressor, one instance of the matrix stressor, and three instances of the message queue stressor for five minutes.
To run all stress tests in parallel, use the --all option:
# stress-ng --all 2
In this example, stress-ng runs two instances of all stress tests in parallel.
To run the stressors one after another, use the --seq option:
# stress-ng --seq 4 -t 20
In this example, stress-ng runs all the stressors one by one, with four instances of each stressor and a timeout of 20 seconds for each. As with other stressors, an instance count of 0 makes stress-ng use the number of available CPUs.
To exclude specific stressors from a test run, use the -x option:
# stress-ng --seq 1 -x numa,matrix,hdd
In this example, stress-ng runs all stressors, one instance of each, excluding the numa, matrix, and hdd stressors.
32.3. Measuring CPU heat generation
To measure CPU heat generation, the specified stressors generate high temperatures for a short time to test the system's cooling reliability and stability under maximum heat generation. Using the --matrix-size option together with the --tz option, you can generate heat and report the thermal zone temperatures in degrees Celsius over a short time duration.
Prerequisites
- You have root privileges on the system.
Procedure
To test the CPU behavior at high temperatures for a specified time duration, run the following command:
# stress-ng --matrix 0 --matrix-size 64 --tz -t 60
stress-ng: info: [18351] dispatching hogs: 4 matrix
stress-ng: info: [18351] successful run completed in 60.00s (1 min, 0.00 secs)
stress-ng: info: [18351] matrix:
stress-ng: info: [18351] x86_pkg_temp 88.00 °C
stress-ng: info: [18351] acpitz 87.00 °C
In this example, stress-ng reports that the processor package thermal zone (x86_pkg_temp) reached 88 degrees Celsius during the 60-second run.
Optional: To print a report of the thermal zone temperatures at the end of a run, use the --tz option:
# stress-ng --cpu 0 --tz -t 60
stress-ng: info: [18065] dispatching hogs: 4 cpu
stress-ng: info: [18065] successful run completed in 60.07s (1 min, 0.07 secs)
stress-ng: info: [18065] cpu:
stress-ng: info: [18065] x86_pkg_temp 88.75 °C
stress-ng: info: [18065] acpitz 88.38 °C
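Thermal zone readings of this kind are also exposed by the Linux kernel under /sys/class/thermal, where each thermal_zone*/temp file holds the temperature in millidegrees Celsius. As a minimal sketch, assuming this sysfs layout is present on your system (it is not on every platform), the following lists each zone in the same units as the --tz report. The helper name millic_to_celsius is illustrative only.

```shell
#!/bin/sh
# Hypothetical helper: convert millidegrees Celsius to degrees Celsius.
millic_to_celsius() {
    awk -v m="$1" 'BEGIN { printf "%.2f\n", m / 1000 }'
}

# Print the type and current temperature of every thermal zone, if any exist.
for zone in /sys/class/thermal/thermal_zone*; do
    # Skip the unexpanded glob or zones without a readable temperature.
    [ -r "$zone/temp" ] || continue
    printf '%s %s °C\n' "$(cat "$zone/type")" \
        "$(millic_to_celsius "$(cat "$zone/temp")")"
done
```

Running the loop in a second terminal while a stress test is active gives a rough view of how quickly the zones heat up.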
32.4. Measuring test outcomes with bogo operations
The stress-ng tool can measure the throughput of a stress test in bogo operations per second. The size of a bogo operation depends on the stressor being run. The test outcomes are not precise, but they provide a rough estimate of the performance.
You must not use this measurement as an accurate benchmark metric. These estimates help you to understand how system performance changes between different kernel versions or between different compiler versions used to build stress-ng. Use the --metrics-brief option to display the total bogo operations and the matrix stressor performance on your machine.
Prerequisites
- You have root privileges on the system.
Procedure
To measure test outcomes with bogo operations, use the --metrics-brief option:
# stress-ng --matrix 0 -t 60s --metrics-brief
stress-ng: info: [17579] dispatching hogs: 4 matrix
stress-ng: info: [17579] successful run completed in 60.01s (1 min, 0.01 secs)
stress-ng: info: [17579] stressor   bogo ops  real time  usr time  sys time   bogo ops/s     bogo ops/s
stress-ng: info: [17579]                         (secs)    (secs)    (secs)  (real time) (usr+sys time)
stress-ng: info: [17579] matrix       349322      60.00    203.23      0.19      5822.03        1717.25
The --metrics-brief option displays the test outcomes and the total real-time bogo operations run by the matrix stressor for 60 seconds.
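The two rate columns in the report follow directly from the other columns: the bogo operation count divided by the real time, and the same count divided by the sum of user and system time. The following sketch reproduces both values from the example output. The function name bogo_rate is hypothetical and is not a stress-ng feature.

```shell
#!/bin/sh
# Hypothetical helper: bogo operations per second.
# $1 = bogo ops, $2 = first time in seconds, $3 = optional second time to add
bogo_rate() {
    awk -v ops="$1" -v a="$2" -v b="${3:-0}" \
        'BEGIN { printf "%.2f\n", ops / (a + b) }'
}

# Values taken from the matrix line of the example report.
bogo_rate 349322 60.00          # rate over real time
bogo_rate 349322 203.23 0.19    # rate over usr+sys time
```

The calls print 5822.03 and 1717.25. The real-time rate is higher because four stressor instances accumulate bogo operations in parallel during the same 60 seconds of wall-clock time.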
32.5. Generating virtual memory pressure
When under memory pressure, the kernel starts writing pages out to swap. You can stress the virtual memory by using the --page-in option, which touches allocated pages that are not in core and forces them to be paged back in. This causes the virtual memory subsystem to be heavily exercised. You can enable this mode for the bigheap, mmap, and virtual memory (vm) stressors.
Prerequisites
- You have root privileges on the system.
Procedure
To stress test the virtual memory, use the --page-in option:
# stress-ng --vm 2 --vm-bytes 2G --mmap 2 --mmap-bytes 2G --page-in
In this example, stress-ng tests memory pressure on a system with 4GB of memory. This is less than the allocated buffer sizes, which are 2 x 2GB for the vm stressor and 2 x 2GB for the mmap stressor, with --page-in enabled.
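Whether a given command actually creates memory pressure depends on how the requested buffer sizes compare with the installed memory. The following sketch adds up the request from the example command and compares it with MemTotal from /proc/meminfo. It assumes a Linux system, treats each G as one GiB, and uses a made-up function name, requested_gib.

```shell
#!/bin/sh
# Hypothetical helper: total GiB requested, given pairs of
# "instances size_in_gib" (for example: 2 2 2 2).
requested_gib() {
    total=0
    while [ "$#" -ge 2 ]; do
        total=$((total + $1 * $2))
        shift 2
    done
    echo "$total"
}

# The example command: 2 vm instances x 2G plus 2 mmap instances x 2G.
need=$(requested_gib 2 2 2 2)

# MemTotal in /proc/meminfo is reported in kB; convert to whole GiB.
have=$(awk '/^MemTotal:/ { print int($2 / 1048576) }' /proc/meminfo 2>/dev/null)

echo "requested ${need} GiB, installed about ${have:-unknown} GiB"
```

For the example command the request is 8 GiB, so on a machine with 4GB of memory the stressors cannot keep all pages resident and the kernel has to swap.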
32.6. Testing large interrupt loads on a device
Running timers at high frequency can generate a large interrupt load. The --timer stressor with an appropriately selected timer frequency can force many interrupts per second.
Prerequisites
- You have root permissions on the system.
Procedure
To generate an interrupt load, use the --timer option:
# stress-ng --timer 32 --timer-freq 1000000
In this example, stress-ng runs 32 instances of the timer stressor, each with a timer frequency of 1 MHz.
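One way to observe the resulting load is to sample /proc/interrupts while the test runs. The sketch below sums the per-CPU counters on one line of that file and reports the difference over one second. It assumes the x86 label LOC (local timer interrupts), which might differ on other architectures, and the function name sum_irq_line is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: sum the per-CPU counters of one /proc/interrupts line
# read from standard input. Fields after the label are added up until the
# first non-numeric field, which starts the description.
sum_irq_line() {
    awk '{
        total = 0
        for (i = 2; i <= NF; i++) {
            if ($i !~ /^[0-9]+$/) break
            total += $i
        }
        printf "%.0f\n", total
    }'
}

# Sample the local timer interrupt count twice, one second apart.
before=$(grep 'LOC:' /proc/interrupts 2>/dev/null | sum_irq_line)
sleep 1
after=$(grep 'LOC:' /proc/interrupts 2>/dev/null | sum_irq_line)
echo "local timer interrupts in the last second: $(( ${after:-0} - ${before:-0} ))"
```

Comparing the figure from an idle system with the figure taken during the timer test gives a rough indication of the additional interrupt load.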
32.7. Generating major page faults in a program
With stress-ng, you can test and analyze the page fault rate by generating major page faults on pages that are not loaded in memory. On new kernel versions, the userfaultfd mechanism notifies the fault-handling threads about page faults in the virtual memory layout of a process.
Prerequisites
- You have root permissions on the system.
Procedure
To generate major page faults on early kernel versions, use:
# stress-ng --fault 0 --perf -t 1m
To generate major page faults on new kernel versions, use:
# stress-ng --userfaultfd 0 --perf -t 1m
32.8. Viewing CPU stress test mechanisms
The CPU stress test contains a number of methods to exercise a CPU. To print all of the methods, specify which as the value of the --cpu-method option.
If you do not specify a test method, the stressor cycles through all of the methods in a round-robin fashion, so that the CPU is tested with each one.
Prerequisites
- You have root permissions on the system.
Procedure
To print all available CPU stress methods, specify which as the method:
# stress-ng --cpu-method which
cpu-method must be one of: all ackermann bitops callfunc cdouble cfloat clongdouble correlate crc16 decimal32 decimal64 decimal128 dither djb2a double euler explog fft fibonacci float fnv1a gamma gcd gray hamming hanoi hyperbolic idct int128 int64 int32
To run a specific CPU stress method, use the --cpu-method option:
# stress-ng --cpu 1 --cpu-method fft -t 1m
32.9. Using the verify mode
The verify mode validates the results while a test is running. It sanity checks the memory contents from a test run and reports any unexpected failures.
Not all stressors have a verify mode, and enabling it reduces the bogo operation statistics because of the extra verification step that runs in this mode.
Prerequisites
- You have root permissions on the system.
Procedure
To validate the results of a stress test, use the --verify option:
# stress-ng --vm 1 --vm-bytes 2G --verify -v
In this example, stress-ng runs an exhaustive memory check on virtually mapped memory by using the vm stressor in --verify mode and prints verbose output. It sanity checks the results of the reads from and writes to the memory.