5.2. Identifying performance bottlenecks
Identifying bottlenecks in VDO performance is crucial for optimizing system efficiency. One of the primary steps you can take is to determine whether the bottleneck lies in the CPU, memory, or the speed of the backing storage.
After pinpointing the slowest component, you can develop strategies for enhancing performance. To ensure that the root cause of the low performance is not a hardware issue, run tests with and without VDO in the storage stack.
The journalQ thread in VDO is a natural bottleneck, especially when the VDO volume is handling write operations. If you notice that another thread type has higher utilization than the journalQ thread, you can remediate this by adding more threads of that type.
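One way to make this comparison is to sum CPU usage per VDO thread type from a batch-mode top snapshot. The following is a minimal sketch, not an official tool. The function name vdo_thread_usage is hypothetical, and the sketch assumes the default top column layout (%CPU in column 9, thread name in column 12) and thread names that contain "vdo", an optional index, and a colon before the type, such as the journalQ and vdo:bioQ names mentioned in this section. Adjust the pattern if your system names the threads differently.

```shell
# Sum %CPU per VDO thread type from `top -H -b -n 1` output read on stdin.
# Assumptions: default top columns (field 9 = %CPU, field 12 = COMMAND) and
# thread names that contain "vdo<N>:<type>", for example kvdo0:journalQ.
vdo_thread_usage() {
    awk '
        $12 ~ /vdo[0-9]*:/ {
            type = $12
            sub(/^.*:/, "", type)     # drop everything up to the colon
            sub(/[0-9]+$/, "", type)  # drop a trailing thread index, if any
            cpu[type] += $9
        }
        END {
            for (t in cpu) printf "%s %.1f\n", t, cpu[t]
        }
    ' | sort -k2,2nr                  # busiest thread type first
}

# Usage on a live system:
#   top -H -b -n 1 | vdo_thread_usage
```

If a type other than journalQ appears at the top of the list across several snapshots, that type is a candidate for additional threads. A single snapshot can be noisy, so repeat the measurement under a steady workload.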
5.2.1. Analyzing VDO performance with top
You can examine the performance of VDO threads by using the top utility.
Tools such as top cannot differentiate between productive CPU cycles and cycles stalled due to cache or memory delays. These tools interpret cache contention and slow memory access as actual work. As a result, moving threads between NUMA nodes can appear to reduce CPU utilization while increasing operations per second.
Procedure
- Display the individual threads:
    $ top -H
- Press the f key to display the fields manager.
- Use the (↓) key to navigate to the P = Last Used Cpu (SMP) field.
- Press the spacebar to select the P = Last Used Cpu (SMP) field.
- Press the q key to close the fields manager. The top utility now displays the CPU load for individual cores and indicates which CPU each process or thread recently used. You can switch to per-CPU statistics by pressing 1.
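For scripted checks, a non-interactive view of similar information can be convenient. The sketch below is an alternative rather than part of the documented procedure. It relies on the psr output field of the ps utility (the processor a thread last ran on) instead of the top fields manager. The helper name vdo_threads_by_cpu and the "vdo" name pattern are assumptions made for illustration.

```shell
# List VDO threads grouped by the CPU each one last ran on.
# Input (stdin): lines of "TID PSR %CPU COMM", as printed by
#   ps -eLo tid,psr,pcpu,comm --no-headers
# Assumption: VDO thread names contain "vdo<N>:".
# Note: the pcpu value from ps is averaged over the thread lifetime,
# so it is not the same as the instantaneous %CPU shown by top.
vdo_threads_by_cpu() {
    awk '$4 ~ /vdo[0-9]*:/ { print $2, $1, $3, $4 }' |
        sort -k1,1n -k2,2n            # order by CPU number, then thread ID
}

# Usage on a live system:
#   ps -eLo tid,psr,pcpu,comm --no-headers | vdo_threads_by_cpu
```

Each output line has the form "CPU TID %CPU NAME", so several busy threads that share one CPU number stand out immediately.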
5.2.2. Interpretation of the top results
Interpreting top utility results helps analyze VDO thread performance by understanding CPU usage thresholds and applying appropriate troubleshooting recommendations.
While analyzing the performance of VDO threads, use the following table to interpret results of the top utility.
| Values | Description | Suggestions |
|---|---|---|
| Thread or CPU usage surpasses 70%. | The thread or CPU is overloaded. High usage can result from a VDO thread scheduled on a CPU with no actual work. This may happen due to excessive hardware interrupts, memory conflicts, or resource competition. | Increase the number of threads of the type running on this core. |
| Low | The core is actively handling tasks. | No action required. |
| Low | The core is performing standard processing work. | Add more cores to improve the performance. Avoid NUMA conflicts. |
| | The core is over-committed. | Reassign kernel threads and device interrupt handling to different cores. |
| | VDO is consistently keeping the storage system busy with I/O requests. This is good if the storage system can handle multiple requests or if request processing is efficient. | Reduce the number of I/O submission threads if the CPU utilization is very low. |
| | VDO has more | Reduce the number of |
| High CPU utilization per I/O request. | CPU utilization per I/O request increases with more threads. | Check for CPU, memory, or lock contention. |
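The 70% threshold from the first table row can be checked mechanically. The following is a minimal sketch under stated assumptions: the function name overloaded_threads is hypothetical, and the input is expected to be batch-mode top output with the default column layout (%CPU in column 9, thread name in column 12).

```shell
# Print threads whose %CPU exceeds a limit (default: 70, from the table above).
# Input (stdin): output of `top -H -b -n 1` with the default columns.
# Summary and header lines are skipped because their first field is not a PID.
overloaded_threads() {
    awk -v limit="${1:-70}" '
        $1 ~ /^[0-9]+$/ && $9 + 0 > limit + 0 { print $12, $9 }
    '
}

# Usage on a live system:
#   top -H -b -n 1 | overloaded_threads        # default limit of 70
#   top -H -b -n 1 | overloaded_threads 50     # stricter limit
```

The output only tells you which threads cross the threshold. As the table notes, high usage can also come from interrupts or memory contention, so treat a match as a starting point for investigation.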
5.2.3. Analyzing VDO performance with perf
You can check the CPU performance of VDO by using the perf utility.
Prerequisites
- The perf package is installed.
Procedure
- Display the performance profile:
    # perf top
- Analyze the CPU performance by interpreting perf results:

Table 5.2. Interpreting perf results

| Values | Description | Suggestions |
|---|---|---|
| vdo:bioQ threads spend excessive cycles acquiring spin locks | Too much contention might be occurring in the device driver below VDO | Reduce the number of vdo:bioQ threads |
| High CPU usage | Contention between NUMA nodes. Check counters such as stalled-cycles-backend, cache-misses, and node-load-misses if they are supported by your processor. High miss rates might cause stalls, resembling high CPU usage in other tools, indicating possible contention. | Implement CPU affinity for the VDO kernel threads or IRQ affinity for interrupt handlers to restrict processing work to a single node. |
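The counter check and the CPU affinity suggestion can be sketched as follows. This is an illustration under stated assumptions, not a prescribed procedure. The helper name print_pin_commands is hypothetical, the "vdo" thread-name pattern is an assumption, and the helper only prints taskset commands for review instead of running them.

```shell
# Count the stall and miss events named above, system-wide, for 10 seconds.
# Not every processor supports every event; perf marks unsupported ones.
#   perf stat -a -e stalled-cycles-backend,cache-misses,node-load-misses sleep 10

# Print (do not run) taskset commands that would pin VDO threads to a CPU list.
# Argument 1: CPU list, for example 0-7. Choosing the cores of a single NUMA
#             node (see `lscpu`) is left to the administrator.
# Input (stdin): lines of "TID COMM", for example from
#   ps -eLo tid,comm --no-headers
# Assumption: VDO thread names contain "vdo<N>:".
print_pin_commands() {
    awk -v cpus="$1" '
        $2 ~ /vdo[0-9]*:/ { printf "taskset -cp %s %s\n", cpus, $1 }
    '
}

# Usage on a live system:
#   ps -eLo tid,comm --no-headers | print_pin_commands 0-7
```

Printing the commands first makes it possible to inspect the thread list before changing any affinity. Affinity set this way does not persist across a reboot or a restart of the VDO volume.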
5.2.4. Analyzing VDO performance with sar
You can create periodic reports on VDO performance by using the sar utility.
Not all block device drivers can provide the data needed by the sar utility. For example, devices such as MD RAID do not report the %util value.
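Once sysstat is installed, the %util column can be extracted for a quick comparison between the VDO device and its backing storage. The following is a minimal sketch. The function name sar_util is hypothetical, and the sketch locates the DEV and %util columns by their header names because the column positions differ between sysstat versions.

```shell
# Print "DEVICE %util" pairs from `sar -d` output read on stdin.
# Columns are found by header name (DEV, %util) rather than by position.
# The trailing "Average:" summary lines are skipped.
sar_util() {
    awk '
        {
            header = 0
            for (i = 1; i <= NF; i++) {
                if ($i == "DEV")   { dev = i; header = 1 }
                if ($i == "%util") { util = i }
            }
            if (header) next          # header line: remember columns only
        }
        dev && util && NF >= util && $1 != "Average:" { print $dev, $util }
    '
}

# Usage on a live system (five 1-second samples):
#   sar -d 1 5 | sar_util
```

For devices whose driver does not report utilization, the value in this column is not meaningful, so compare only devices that are known to report it.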
Prerequisites
- Install the sysstat utility:
    # dnf install sysstat
Procedure
- Display the disk I/O statistics at 1-second intervals:
    $ sar -d 1
- Analyze the VDO performance by interpreting sar results:

Table 5.3. Interpreting sar results

| Values | Description | Suggestions |
|---|---|---|
| The %util value for the underlying storage device is well under 100%. VDO is busy at 100%. bioQ threads are using a lot of CPU time. | VDO has too few bioQ threads for a fast device. | Add more bioQ threads. Note that certain storage drivers might slow down when you add bioQ threads due to spin lock contention. |
-
The