5.2. Identifying performance bottlenecks


Identifying bottlenecks in VDO performance is crucial for optimizing system efficiency. One of the primary steps you can take is to determine whether the bottleneck lies in the CPU, memory, or the speed of the backing storage.

After pinpointing the slowest component, you can develop strategies for enhancing performance. To ensure that the root cause of the low performance is not a hardware issue, run tests with and without VDO in the storage stack.

The journalQ thread in VDO is a natural bottleneck, especially when the VDO volume is handling write operations. If you notice that another thread type has higher utilization than the journalQ thread, you can remediate this by adding more threads of that type.
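To compare thread types against the journalQ thread, it can help to total CPU usage per type. The following is a minimal sketch, not an official VDO tool. It assumes the default top field order (%CPU in column 9, COMMAND in column 12) and thread names of the form `<device>:<type><index>`, such as the illustrative `kvdo0:bioQ3`; the function name `vdo_cpu_by_type` is hypothetical.

```shell
# Sum %CPU per VDO thread type from `top -H -b -n 1` output read on stdin.
# Assumptions: default top columns (%CPU = field 9, COMMAND = field 12) and
# thread names shaped like <device>:<type><index>, for example kvdo0:bioQ3.
vdo_cpu_by_type() {
    awk '$12 ~ /vdo.*:/ {
        type = $12
        sub(/^.*:/, "", type)      # drop the device prefix
        sub(/[0-9]+$/, "", type)   # drop the per-thread index
        cpu[type] += $9
    }
    END { for (t in cpu) printf "%s %.1f\n", t, cpu[t] }' | sort -k2,2nr
}

# Usage on a live system (a single sample is only a snapshot, so repeat it
# while the workload is running):
#   top -H -b -n 1 | vdo_cpu_by_type
```

A thread type that appears above journalQ in the output is a candidate for additional threads.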

5.2.1. Analyzing VDO performance with top

You can examine the performance of VDO threads by using the top utility.

Note

Tools such as top cannot differentiate between productive CPU cycles and cycles stalled due to cache or memory delays. These tools interpret cache contention and slow memory access as actual work. As a result, moving threads between NUMA nodes can show up as reduced CPU utilization even as the number of operations per second increases.

Procedure

  1. Display the individual threads:

    $ top -H
  2. Press the f key to display the fields manager.
  3. Use the (↓) key to navigate to the P = Last Used Cpu (SMP) field.
  4. Press the spacebar to select the P = Last Used Cpu (SMP) field.
  5. Press the q key to close the fields manager. The top utility now displays the CPU load for individual cores and indicates which CPU each process or thread recently used. You can switch to per-CPU statistics by pressing 1.
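For scripts or logs, a non-interactive view of the same information can be obtained from ps, whose psr field reports the processor a thread was last assigned to. The sketch below groups VDO threads by that CPU. The function name `vdo_threads_by_cpu` is hypothetical, and matching thread names on the substring `vdo` is an assumption about how the threads are named on your system.

```shell
# Group VDO threads by the CPU they last ran on.
# Reads `ps -eLo tid,psr,comm` output on stdin (header line is skipped).
# Assumption: VDO thread names contain the substring "vdo".
vdo_threads_by_cpu() {
    awk 'NR > 1 && $3 ~ /vdo/ { names[$2] = names[$2] " " $3 }
         END { for (c in names) printf "%s%s\n", c, names[c] }' | sort -n
}

# Usage on a live system:
#   ps -eLo tid,psr,comm | vdo_threads_by_cpu
```

Each output line starts with a CPU number followed by the VDO threads last seen on it, which makes it easier to spot several busy threads sharing one core.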

5.2.2. Interpretation of the top results

When you analyze the performance of VDO threads, use the following table to interpret the output of the top utility. Compare the observed CPU usage and thread states against the listed values and apply the matching suggestion.

Table 5.1. Interpreting top results

| Values | Description | Suggestions |
|---|---|---|
| Thread or CPU usage surpasses 70%. | The thread or CPU is overloaded. High usage can also result from a VDO thread that is scheduled on a CPU but does no actual work, which may happen due to excessive hardware interrupts, memory conflicts, or resource competition. | Increase the number of threads of the type running on this core. |
| Low %id and %wa values | The core is actively handling tasks. | No action required. |
| Low %hi value | The core is performing standard processing work. | Add more cores to improve the performance. Avoid NUMA conflicts. |
| High (more than a few percent) %hi value; only one thread is assigned to the core; %id is zero; %wa is zero | The core is over-committed. | Reassign kernel threads and device interrupt handling to different cores. |
| vdo:bioQ threads frequently in D state | VDO is consistently keeping the storage system busy with I/O requests. This is good if the storage system can handle multiple requests or if request processing is efficient. | Reduce the number of I/O submission threads if the CPU utilization is very low. |
| vdo:bioQ threads frequently in S state | VDO has more vdo:bioQ threads than it needs. | Reduce the number of vdo:bioQ threads. |
| High CPU utilization per I/O request | CPU utilization per I/O request increases with more threads. | Check for CPU, memory, or lock contention. |
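Whether bioQ threads are "frequently" in the D or S state is easier to judge from several samples than from a single glance. The following sketch tallies the first letter of the ps state code for bioQ threads across repeated samples. The function name `bioq_state_tally` and the five one-second samples in the usage line are illustrative choices, not values from the VDO documentation.

```shell
# Tally how often bioQ threads are seen in each process state.
# Reads one or more `ps -eLo stat,comm` samples on stdin.
# Only the first letter of the state is counted (for example "D" or "S").
bioq_state_tally() {
    awk '$2 ~ /bioQ/ { n[substr($1, 1, 1)]++ }
         END { for (s in n) printf "%s %d\n", s, n[s] }' | sort
}

# Usage on a live system (five samples, one second apart):
#   for i in 1 2 3 4 5; do ps -eLo stat,comm; sleep 1; done | bioq_state_tally
```

A tally dominated by D suggests that the threads are mostly waiting on storage, while a tally dominated by S suggests that they are mostly idle.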

5.2.3. Analyzing VDO performance with perf

You can check the CPU performance of VDO by using the perf utility.

Prerequisites

  • The perf package is installed.

Procedure

  1. Display the performance profile:

    # perf top
  2. Analyze the CPU performance by interpreting perf results:

    Table 5.2. Interpreting perf results

    | Values | Description | Suggestions |
    |---|---|---|
    | vdo:bioQ threads spend excessive cycles acquiring spin locks | Too much contention might be occurring in the device driver below VDO. | Reduce the number of vdo:bioQ threads. |
    | High CPU usage | Contention between NUMA nodes. Check counters such as stalled-cycles-backend, cache-misses, and node-load-misses if they are supported by your processor. High miss rates might cause stalls, resembling high CPU usage in other tools, indicating possible contention. | Implement CPU affinity for the VDO kernel threads or IRQ affinity for interrupt handlers to restrict processing work to a single node. |
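As a hedged illustration of the last suggestion, the sketch below prints, without running them, taskset commands that would pin each VDO thread to a given CPU list. The function name `vdo_pin_commands` is hypothetical, matching threads on the substring `vdo` is an assumption, and the CPU list must come from your own topology. Review the printed commands before executing any of them.

```shell
# Counters named in the table can be sampled system-wide, if the processor
# supports them, with for example:
#   perf stat -a -e stalled-cycles-backend,cache-misses,node-load-misses -- sleep 10

# Print (do not run) taskset commands that pin VDO threads to a CPU list.
# $1    : CPU list, for example the contents of
#         /sys/devices/system/node/node0/cpulist
# stdin : output of `ps -eLo tid,comm` (header line is skipped)
# Assumption: VDO thread names contain the substring "vdo".
vdo_pin_commands() {
    cpus=$1
    awk -v cpus="$cpus" \
        'NR > 1 && $2 ~ /vdo/ { printf "taskset -cp %s %s\n", cpus, $1 }'
}

# Usage on a live system:
#   ps -eLo tid,comm | vdo_pin_commands "$(cat /sys/devices/system/node/node0/cpulist)"
```

Printing the commands instead of applying them keeps the sketch safe to run on a production system and leaves the decision about which threads to pin to you.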

5.2.4. Analyzing VDO performance with sar

You can create periodic reports on VDO performance by using the sar utility.

Note

Not all block device drivers can provide the data needed by the sar utility. For example, devices such as MD RAID do not report the %util value.

Prerequisites

  • Install the sysstat package, which provides the sar utility:

    # dnf install sysstat

Procedure

  1. Display the disk I/O statistics at 1-second intervals:

    $ sar -d 1
  2. Analyze the VDO performance by interpreting sar results:

    Table 5.3. Interpreting sar results

    | Values | Description | Suggestions |
    |---|---|---|
    | The %util value for the underlying storage device is well under 100%; VDO is busy at 100%; bioQ threads are using a lot of CPU time. | VDO has too few bioQ threads for a fast device. | Add more bioQ threads. Note that certain storage drivers might slow down when you add bioQ threads due to spin lock contention. |
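To compare the %util value of the VDO volume with that of its backing device, you can reduce the sar output to device and %util pairs. This is a minimal sketch that assumes %util is the last column of the report and locates the DEV column from the header line; the function name `sar_util` is hypothetical, and column layouts can differ between sysstat versions.

```shell
# Print "<device> <%util>" for every interval line of `sar -d` output on stdin.
# Assumptions: the header line contains both DEV and %util, and %util is the
# last column. "Average:" summary lines are skipped.
sar_util() {
    awk '
        /DEV/ && /%util/ {
            for (i = 1; i <= NF; i++) if ($i == "DEV") dev = i
            next
        }
        dev && NF > dev && $1 != "Average:" { print $dev, $NF }'
}

# Usage on a live system (-p prints device names such as sda or dm-3):
#   sar -d -p 1 5 | sar_util
```

If the VDO device stays near 100% while the backing device remains well below it, the condition in the table applies.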
