9.3. libvirt NUMA Tuning
Generally, optimal performance on NUMA systems is achieved by limiting guest size to the amount of resources on a single NUMA node. Avoid unnecessarily splitting resources across NUMA nodes.
Use the numastat tool to view per-NUMA-node memory statistics for processes and the operating system.
In the following example, the numastat tool shows four virtual machines with suboptimal memory alignment across NUMA nodes:
You can run numad to align the guests' CPUs and memory resources automatically. However, it is highly recommended to configure guest resource alignment using libvirt instead:
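As an illustration only, this alignment can be expressed in the guest's domain XML with the numatune element. The nodeset value below is an assumption for a host where NUMA node 0 holds the guest's resources; adjust it to match the numastat output:

    <numatune>
      <!-- Allocate all guest memory strictly from host NUMA node 0 -->
      <memory mode='strict' nodeset='0'/>
    </numatune>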
To verify that the memory has been aligned, run numastat -c qemu-kvm again. The following output shows successful resource alignment:
Note
Running numastat with the -c option provides compact output; adding the -m option adds system-wide memory information on a per-node basis to the output. Refer to the numastat man page for more information.
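For example, the two forms can be combined as follows (shown here as an illustration):

    # Compact per-node memory usage for all qemu-kvm processes
    numastat -c qemu-kvm

    # The same, with system-wide per-node memory information added
    numastat -c -m qemu-kvm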
For optimal performance results, memory pinning should be used in combination with pinning of vCPU threads as well as other hypervisor threads.
9.3.1. NUMA vCPU Pinning
vCPU pinning provides similar advantages to task pinning on bare-metal systems. Since vCPUs run as user-space tasks on the host operating system, pinning increases cache efficiency. One example of this is an environment where all vCPU threads run on the same physical socket and therefore share an L3 cache domain.
Combining vCPU pinning with numatune can avoid NUMA misses. The performance impact of NUMA misses is significant, generally starting at a 10% performance hit or higher. vCPU pinning and numatune should be configured together.
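As a sketch, assuming a four-vCPU guest placed on host NUMA node 0 (CPU and node numbers must be adjusted to the actual host topology), the two settings can be combined in the domain XML as follows:

    <vcpu placement='static'>4</vcpu>
    <cputune>
      <!-- Pin each vCPU thread to a physical CPU on the same socket/node -->
      <vcpupin vcpu='0' cpuset='0'/>
      <vcpupin vcpu='1' cpuset='1'/>
      <vcpupin vcpu='2' cpuset='2'/>
      <vcpupin vcpu='3' cpuset='3'/>
      <!-- Keep the remaining QEMU (emulator) threads on the same CPUs -->
      <emulatorpin cpuset='0-3'/>
    </cputune>
    <numatune>
      <!-- Allocate guest memory from the same node as the pinned CPUs -->
      <memory mode='strict' nodeset='0'/>
    </numatune>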
If the virtual machine is performing storage or network I/O tasks, it can be beneficial to pin all vCPUs and memory to the same physical socket that is physically connected to the I/O adapter.
Note
The lstopo tool can be used to visualize NUMA topology. It can also help verify that vCPUs are bound to cores on the same physical socket. Refer to the following Knowledgebase article for more information on lstopo: https://access.redhat.com/site/solutions/62879.
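For example, on a host without a graphical display, the text-mode variant of lstopo (provided by the hwloc package) can be used:

    # Print the host's socket, core, cache, and NUMA node layout as text
    lstopo-no-graphics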
Important
Pinning causes increased complexity when there are many more vCPUs than physical cores.
The following example XML configuration has a domain process pinned to physical CPUs 0-7. Each vCPU thread is pinned to its own cpuset. For example, vCPU0 is pinned to physical CPU 0, vCPU1 is pinned to physical CPU 1, and so on:
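A domain XML fragment consistent with that description (shown here as a sketch; the guest is assumed to have eight vCPUs) is:

    <vcpu placement='static' cpuset='0-7'>8</vcpu>
    <cputune>
      <vcpupin vcpu='0' cpuset='0'/>
      <vcpupin vcpu='1' cpuset='1'/>
      <vcpupin vcpu='2' cpuset='2'/>
      <vcpupin vcpu='3' cpuset='3'/>
      <vcpupin vcpu='4' cpuset='4'/>
      <vcpupin vcpu='5' cpuset='5'/>
      <vcpupin vcpu='6' cpuset='6'/>
      <vcpupin vcpu='7' cpuset='7'/>
    </cputune>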
There is a direct relationship between the vcpu and vcpupin tags. If a vcpupin option is not specified, the value is automatically determined and inherited from the parent vcpu tag. The following configuration omits <vcpupin> for vCPU 5. As a result, vCPU 5 would be pinned to physical CPUs 0-7, as specified in the parent <vcpu> tag:
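Again as a sketch of the same eight-vCPU guest, note the missing <vcpupin> entry for vCPU 5:

    <vcpu placement='static' cpuset='0-7'>8</vcpu>
    <cputune>
      <vcpupin vcpu='0' cpuset='0'/>
      <vcpupin vcpu='1' cpuset='1'/>
      <vcpupin vcpu='2' cpuset='2'/>
      <vcpupin vcpu='3' cpuset='3'/>
      <vcpupin vcpu='4' cpuset='4'/>
      <vcpupin vcpu='6' cpuset='6'/>
      <vcpupin vcpu='7' cpuset='7'/>
    </cputune>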