Chapter 18. Optimizing virtual machine performance

18.1. What influences virtual machine performance
Copiar o link

VMs are run as user-space processes on the host. The hypervisor therefore needs to convert the host’s system resources so that the VMs can use them. As a consequence, a portion of the resources is consumed by the conversion, and the VM therefore cannot achieve the same performance efficiency as the host.

The impact of virtualization on system performance

More specific reasons for VM performance loss include:

Virtual CPUs (vCPUs) are implemented as threads on the host, handled by the Linux scheduler.
VMs do not automatically inherit optimization features, such as NUMA or huge pages, from the host kernel.
Disk and network I/O settings of the host might have a significant performance impact on the VM.
Network traffic typically travels to a VM through a software-based bridge.
Depending on the host devices and their models, there might be significant overhead due to emulation of particular hardware.

The severity of the virtualization impact on the VM performance is influenced by a variety factors, which include:

The number of concurrently running VMs.
The amount of virtual devices used by each VM.
The device types used by the VMs.

Reducing VM performance loss

RHEL 9 provides a number of features you can use to reduce the negative performance effects of virtualization. Notably:

The TuneD service can automatically optimize the resource distribution and performance of your VMs.
Block I/O tuning can improve the performances of the VM’s block devices, such as disks.
NUMA tuning can increase vCPU performance.
Virtual networking can be optimized in various ways.

Important

Tuning VM performance can have negative effects on other virtualization functions. For example, it can make migrating the modified VM more difficult.

18.2. Optimizing virtual machine performance by using TuneD
Copiar o link

The TuneD utility is a tuning profile delivery mechanism that adapts RHEL for certain workload characteristics, such as requirements for CPU-intensive tasks or storage-network throughput responsiveness. It provides a number of tuning profiles that are pre-configured to enhance performance and reduce power consumption in a number of specific use cases. You can edit these profiles or create new profiles to create performance solutions tailored to your environment, including virtualized environments.

To optimize RHEL 9 for virtualization, use the following profiles:

For RHEL 9 virtual machines, use the virtual-guest profile. It is based on the generally applicable throughput-performance profile, but also decreases the swappiness of virtual memory.
For RHEL 9 virtualization hosts, use the virtual-host profile. This enables more aggressive writeback of dirty memory pages, which benefits the host performance.

Prerequisites

The TuneD service is installed and enabled.

Procedure

To enable a specific TuneD profile:

List the available TuneD profiles.

# tuned-adm list

Available profiles:
- balanced             - General non-specialized TuneD profile
- desktop              - Optimize for the desktop use-case
[...]
- virtual-guest        - Optimize for running inside a virtual guest
- virtual-host         - Optimize for running KVM guests
Current active profile: balanced

Optional: Create a new TuneD profile or edit an existing TuneD profile.
For more information, see Customizing TuneD profiles.
Activate a TuneD profile.
```
# tuned-adm profile selected-profile
```
- To optimize a virtualization host, use the virtual-host profile.
  # tuned-adm profile virtual-host
- On a RHEL guest operating system, use the virtual-guest profile.
  # tuned-adm profile virtual-guest

Verification

Display the active profile for TuneD.

# tuned-adm active
Current active profile: virtual-host

Ensure that the TuneD profile settings have been applied on your system.

# tuned-adm verify
Verification succeeded, current system settings match the preset profile. See tuned log file ('/var/log/tuned/tuned.log') for details.

18.3. Virtual machine performance optimization for specific workloads
Copiar o link

Virtual machines (VMs) are frequently dedicated to perform a specific workload. You can improve the performance of your VMs by optimizing their configuration for the intended workload.

Expand

Table 18.1. Recommended VM configurations for specific use cases
Use case	IOThread	vCPU pinning	vNUMA pinning	huge pages	multi-queue
Database	For database disks	Yes^*	Yes^*	Yes^*	Yes, see: multi-queue virtio-blk, virtio-scsi
Virtualized Network Function (VNF)	No	Yes	Yes	Yes	Yes, see: multi-queue virtio-net
High Performance Computing (HPC)	No	Yes	Yes	Yes	No
Backup Server	For backup disks	No	No	No	Yes, see: multi-queue virtio-blk, virtio-scsi
VM with many CPUs (Usually more than 32)	No	Yes^*	Yes^*	No	No
VM with large RAM (Usually more than 128 GB)	No	No	Yes^*	Yes	No

* If the VM has enough CPUs and RAM to use more than one NUMA node.

Note

A VM can fit in more than one category of use cases. In this situation, you should apply all of the recommended configurations.

18.4. Optimizing libvirt daemons
Copiar o link

The libvirt virtualization suite works as a management layer for the RHEL hypervisor, and your libvirt configuration significantly impacts your virtualization host. Notably, RHEL 9 contains two different types of libvirt daemons, monolithic or modular, and which type of daemons you use affects how granularly you can configure individual virtualization drivers.

18.4.1. Types of libvirt daemons
Copiar o link

RHEL 9 supports the following libvirt daemon types:

Monolithic libvirt

The traditional libvirt daemon, libvirtd, controls a wide variety of virtualization drivers, by using a single configuration file - /etc/libvirt/libvirtd.conf.

As such, libvirtd allows for centralized hypervisor configuration, but may use system resources inefficiently. Therefore, libvirtd will become unsupported in a future major release of RHEL.

However, if you updated to RHEL 9 from RHEL 8, your host still uses libvirtd by default.

Modular libvirt

Newly introduced in RHEL 9, modular libvirt provides a specific daemon for each virtualization driver. These include the following:

virtqemud - A primary daemon for hypervisor management
virtinterfaced - A secondary daemon for host NIC management
virtnetworkd - A secondary daemon for virtual network management
virtnodedevd - A secondary daemon for host physical device management
virtnwfilterd - A secondary daemon for host firewall management
virtsecretd - A secondary daemon for host secret management
virtstoraged - A secondary daemon for storage management

Each of the daemons has a separate configuration file - for example /etc/libvirt/virtqemud.conf. As such, modular libvirt daemons provide better options for fine-tuning libvirt resource management.

If you performed a fresh install of RHEL 9, modular libvirt is configured by default.

Next steps

If your RHEL 9 uses libvirtd, Red Hat recommends switching to modular daemons. For instructions, see Enabling modular libvirt daemons.

18.4.2. Enabling modular libvirt daemons
Copiar o link

In RHEL 9, the libvirt library uses modular daemons that handle individual virtualization driver sets on your host. For example, the virtqemud daemon handles QEMU drivers.

If you performed a fresh install of a RHEL 9 host, your hypervisor uses modular libvirt daemons by default. However, if you upgraded your host from RHEL 8 to RHEL 9, your hypervisor uses the monolithic libvirtd daemon, which is the default in RHEL 8.

If that is the case, Red Hat recommends enabling the modular libvirt daemons instead, because they provide better options for fine-tuning libvirt resource management. In addition, libvirtd will become unsupported in a future major release of RHEL.

Prerequisites

Your hypervisor is using the monolithic libvirtd service.
```
# systemctl is-active libvirtd.service
active
```
If this command displays active, you are using libvirtd.
Your virtual machines are shut down.

Procedure

Stop libvirtd and its sockets.

$ systemctl stop libvirtd.service
$ systemctl stop libvirtd{,-ro,-admin,-tcp,-tls}.socket

Disable libvirtd to prevent it from starting on boot.

$ systemctl disable libvirtd.service
$ systemctl disable libvirtd{,-ro,-admin,-tcp,-tls}.socket

Enable the modular libvirt daemons.

# for drv in qemu interface network nodedev nwfilter secret storage; do systemctl unmask virt${drv}d.service; systemctl unmask virt${drv}d{,-ro,-admin}.socket; systemctl enable virt${drv}d.service; systemctl enable virt${drv}d{,-ro,-admin}.socket; done

Start the sockets for the modular daemons.

# for drv in qemu interface network nodedev nwfilter secret storage; do systemctl start virt${drv}d{,-ro,-admin}.socket; done

Optional: If you require connecting to your host from remote hosts, enable and start the virtualization proxy daemon.

Check whether the libvirtd-tls.socket service is enabled on your system.
```
# grep listen_tls /etc/libvirt/libvirtd.conf

listen_tls = 0
```

If libvirtd-tls.socket is not enabled (listen_tls = 0), activate virtproxyd as follows:

# systemctl unmask virtproxyd.service
# systemctl unmask virtproxyd{,-ro,-admin}.socket
# systemctl enable virtproxyd.service
# systemctl enable virtproxyd{,-ro,-admin}.socket
# systemctl start virtproxyd{,-ro,-admin}.socket

If libvirtd-tls.socket is enabled (listen_tls = 1), activate virtproxyd as follows:

# systemctl unmask virtproxyd.service
# systemctl unmask virtproxyd{,-ro,-admin,-tls}.socket
# systemctl enable virtproxyd.service
# systemctl enable virtproxyd{,-ro,-admin,-tls}.socket
# systemctl start virtproxyd{,-ro,-admin,-tls}.socket

To enable the TLS socket of virtproxyd, your host must have TLS certificates configured to work with libvirt. For more information, see the Upstream libvirt documentation.

Verification

Activate the enabled virtualization daemons.
```
# virsh uri
qemu:///system
```
Verify that your host is using the virtqemud modular daemon.
```
# systemctl is-active virtqemud.service
active
```
If the status is active, you have successfully enabled modular libvirt daemons.

18.5. Configuring virtual machine memory
Copiar o link

To improve the performance of a virtual machine (VM), you can assign additional host RAM to the VM. Similarly, you can decrease the amount of memory allocated to a VM so the host memory can be allocated to other VMs or tasks.

18.5.1. Memory overcommitment
Copiar o link

Virtual machines (VMs) running on a KVM hypervisor do not have dedicated blocks of physical RAM assigned to them. Instead, each VM functions as a Linux process where the host’s Linux kernel allocates memory only when requested. In addition, the host’s memory manager can move the VM’s memory between its own physical memory and swap space. If memory overcommitment is enabled, the kernel can decide to allocate less physical memory than is requested by a VM, because often the requested amount of memory is not fully used by the VM’s process.

By default, memory overcommitment is enabled in the Linux kernel and the kernel estimates the safe amount of memory overcommitment for VM’s requests. However, the system can still become unstable with frequent overcommitment for memory-intensive workloads.

Memory overcommitment requires you to allocate sufficient swap space on the host physical machine to accommodate all VMs as well as enough memory for the host physical machine’s processes. For instructions on the basic recommended swap space size, see What is the recommended swap size for Red Hat platforms? (Red Hat Knowledgebase).

Recommended methods to deal with memory shortages on the host:

Allocate less memory per VM.
Add more physical memory to the host.
Use larger swap space.

Important

A VM will run slower if it is swapped frequently. In addition, overcommitting can cause the system to run out of memory (OOM), which may lead to the Linux kernel shutting down important system processes.

Memory overcommit is not supported with device assignment. This is because when device assignment is in use, all virtual machine memory must be statically pre-allocated to enable direct memory access (DMA) with the assigned device.

18.5.2. Configuring virtual machines to use huge pages
Copiar o link

In certain use cases, you can improve memory allocation for your virtual machines (VMs) by using huge pages instead of the default 4 KiB memory pages. For example, huge pages can improve performance for VMs with high memory utilization, such as database servers.

Prerequisites

The host is configured to use huge pages in memory allocation. For instructions, see: Configuring HugeTLB at boot time

Procedure

Shut down the selected VM if it is running.
To configure a VM to use 1 GiB huge pages, open the XML definition of a VM for editing. For example, to edit a testguest VM, run the following command:
```
# virsh edit testguest
```

Add the following lines to the <memoryBacking> section in the XML definition:

<memoryBacking>
  <hugepages>
    <page size='1' unit='GiB'/>
  </hugepages>
</memoryBacking>

Verification

Start the VM.
Confirm that the host has successfully allocated huge pages for the running VM. On the host, run the following command:
```
# cat /proc/meminfo | grep Huge

HugePages_Total:    4
HugePages_Free:     2
HugePages_Rsvd:     1
Hugepagesize:       1024000 kB
```
When you add together the number of free and reserved huge pages (HugePages_Free + HugePages_Rsvd), the result should be less than the total number of huge pages (HugePages_Total). The difference is the number of huge pages that is used by the running VM.

18.5.3. Adding and removing virtual machine memory by using virtio-mem
Copiar o link

RHEL 9 provides the virtio-mem paravirtualized memory device. This device makes it possible to dynamically add or remove host memory in virtual machines (VMs). For example, you can use virtio-mem to move memory resources between running VMs or to resize VM memory in cloud setups based on your current requirements.

18.5.3.1. Overview of virtio-mem
Copiar o link

virtio-mem is a paravirtualized memory device that can be used to dynamically add or remove host memory in virtual machines (VMs). For example, you can use this device to move memory resources between running VMs or to resize VM memory in cloud setups based on your current requirements.

By using virtio-mem, you can increase the memory of a VM beyond its initial size, and shrink it back to its original size, in units that can have the size of 4 to several hundred mebibytes (MiBs). Note, however, that virtio-mem also relies on a specific guest operating system configuration, especially to reliably unplug memory.

virtio-mem feature limitations

virtio-mem is currently not compatible with the following features:

Using memory locking for real-time applications on the host
Using encrypted virtualization on the host
Combining virtio-mem with memballoon inflation and deflation on the host
Unloading or reloading the virtio_mem driver in a VM
Using vhost-user devices, with the exception of virtiofs

18.5.3.2. Configuring memory onlining in virtual machines
Copiar o link

Before using virtio-mem to attach memory to a running virtual machine (also known as memory hot-plugging), you must configure the virtual machine (VM) operating system to automatically set the hot-plugged memory to an online state. Otherwise, the guest operating system is not able to use the additional memory. You can choose from one of the following configurations for memory onlining:

online_movable
online_kernel
auto-movable

To learn about differences between these configurations, see: Comparison of memory onlining configurations

Memory onlining is configured with udev rules by default in RHEL. However, when using virtio-mem, it is recommended to configure memory onlining directly in the kernel.

Prerequisites

The host uses the Intel 64, AMD64, ARM 64, or IBM Z CPU architecture.
The host uses one of the following operating system versions:
- On Intel 64 and AMD64 hosts: RHEL 9.4 or later
- On ARM 64 hosts: RHEL 9.6 or later
- On IBM Z hosts: RHEL 9.7 or later
VMs running on the host use one of the following operating system versions:
- On Intel 64 and AMD64 hosts: RHEL 8.10, RHEL 9.4 or later, or RHEL 10.0 or later
  Important
  Unplugging memory from a running VM is disabled by default in RHEL 8.10 VMs.
- On ARM 64 hosts: RHEL 9.6 or later or RHEL 10.0 or later
- On IBM Z hosts: RHEL 9.7 or later or RHEL 10.1 or later

Procedure

To set memory onlining to use the online_movable configuration in the VM:
1. Set the memhp_default_state kernel command line parameter to online_movable:
  # grubby --update-kernel=ALL --remove-args=memhp_default_state --args=memhp_default_state=online_movable
2. Reboot the VM.
To set memory onlining to use the online_kernel configuration in the VM:
1. Set the memhp_default_state kernel command line parameter to online_kernel:
  # grubby --update-kernel=ALL --remove-args=memhp_default_state --args=memhp_default_state=online_kernel
2. Reboot the VM.
To use the auto-movable memory onlining policy in the VM:
1. Set the memhp_default_state kernel command line parameter to online:
  # grubby --update-kernel=ALL --remove-args=memhp_default_state --args=memhp_default_state=online
2. Set the memory_hotplug.online_policy kernel command line parameter to auto-movable:
  # grubby --update-kernel=ALL --remove-args="memory_hotplug.online_policy" --args=memory_hotplug.online_policy=auto-movable
3. Optional: To further tune the auto-movable onlining policy, change the memory_hotplug.auto_movable_ratio and memory_hotplug.auto_movable_numa_aware parameters:
  # grubby --update-kernel=ALL --remove-args="memory_hotplug.auto_movable_ratio" --args=memory_hotplug.auto_movable_ratio=<percentage> # grubby --update-kernel=ALL --remove-args="memory_hotplug.memory_auto_movable_numa_aware" --args=memory_hotplug.auto_movable_numa_aware=<y/n>
  - The memory_hotplug.auto_movable_ratio parameter sets the maximum ratio of memory only available for movable allocations compared to memory available for any allocations. The ratio is expressed in percents and the default value is: 301 (%), which is a 3:1 ratio.
  - The memory_hotplug.auto_movable_numa_aware parameter controls whether the memory_hotplug.auto_movable_ratio parameter applies to memory across all available NUMA nodes or only for memory within a single NUMA node. The default value is: y (yes)
    For example, if the maximum ratio is set to 301% and the memory_hotplug.auto_movable_numa_aware is set to y (yes), than the 3:1 ratio is applied even within the NUMA node with the attached virtio-mem device. If the parameter is set to n (no), the maximum 3:1 ratio is applied only for all the NUMA nodes as a whole.
    Additionally, if the ratio is not exceeded, the newly hot-plugged memory will be available only for movable allocations. Otherwise, the newly hot-plugged memory will be available for both movable and unmovable allocations.
4. Reboot the VM.

Verification

To see if the online_movable configuration has been set correctly, check the current value of the memhp_default_state kernel parameter:
```
# cat /sys/devices/system/memory/auto_online_blocks

online_movable
```
To see if the online_kernel configuration has been set correctly, check the current value of the memhp_default_state kernel parameter:
```
# cat /sys/devices/system/memory/auto_online_blocks

online_kernel
```

To see if the auto-movable configuration has been set correctly, check the following kernel parameters:

memhp_default_state:

# cat /sys/devices/system/memory/auto_online_blocks

online

memory_hotplug.online_policy:

# cat /sys/module/memory_hotplug/parameters/online_policy

auto-movable

memory_hotplug.auto_movable_ratio:

# cat /sys/module/memory_hotplug/parameters/auto_movable_ratio

301

memory_hotplug.auto_movable_numa_aware:

# cat /sys/module/memory_hotplug/parameters/auto_movable_numa_aware

y

18.5.3.3. Attaching a virtio-mem device to virtual machines
Copiar o link

To attach additional memory to a running virtual machine (also known as memory hot-plugging) and afterwards be able to resize the hot-plugged memory, you can use a virtio-mem device. Specifically, you can use libvirt XML configuration files and virsh commands to define and attach virtio-mem devices to virtual machines (VMs).

Prerequisites

The host uses the Intel 64, AMD64, ARM 64, or IBM Z CPU architecture.
The host uses one of the following operating system versions:
- On Intel 64 and AMD64 hosts: RHEL 9.4 or later
- On ARM 64 hosts: RHEL 9.6 or later
- On IBM Z hosts: RHEL 9.7 or later
VMs running on the host use one of the following operating system versions:
- On Intel 64 and AMD64 hosts: RHEL 8.10, RHEL 9.4 or later, or RHEL 10.0 or later
  Important
  Unplugging memory from a running VM is disabled by default in RHEL 8.10 VMs.
- On ARM 64 hosts: RHEL 9.6 or later or RHEL 10.0 or later
- On IBM Z hosts: RHEL 9.7 or later or RHEL 10.1 or later
The VM has memory onlining configured. For instructions, see: Configuring memory onlining in virtual machines

Procedure

Ensure the XML configuration of the target VM includes the maxMemory parameter:
```
# virsh edit testguest1

<domain type='kvm'>
  <name>testguest1</name>
  ...
  <maxMemory unit='GiB'>128</maxMemory>
  ...
</domain>
```
In this example, the XML configuration of the testguest1 VM defines a maxMemory parameter with a 128 gibibyte (GiB) size. The maxMemory size specifies the maximum memory the VM can use, which includes both initial and hot-plugged memory.
Create and open an XML file to define virtio-mem devices on the host, for example:
```
# vim virtio-mem-device.xml
```
Add XML definitions of virtio-mem devices to the file and save it:
```
<memory model='virtio-mem'>
        <target>
                <size unit='GiB'>48</size>
                <node>0</node>
                <block unit='MiB'>2</block>
                <requested unit='GiB'>16</requested>
                <current unit='GiB'>16</current>
        </target>
        <alias name='ua-virtiomem0'/>
        <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
</memory>
<memory model='virtio-mem'>
        <target>
                <size unit='GiB'>48</size>
                <node>1</node>
                <block unit='MiB'>2</block>
                <requested unit='GiB'>0</requested>
                <current unit='GiB'>0</current>
        </target>
        <alias name='ua-virtiomem1'/>
        <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
</memory>
```
In this example, two virtio-mem devices are defined with the following parameters:
- size: This is the maximum size of the device. In the example, it is 48 GiB. The size must be a multiple of the block size.
- node: This is the assigned vNUMA node for the virtio-mem device.
- block: This is the block size of the device. It must be at least the size of the Transparent Huge Page (THP), which is 2 MiB on Intel 64 and AMD64 CPU architecture. On ARM64 architecture, the size of THP can be 2 MiB or 512 MiB depending on the base page size. The 2 MiB block size on Intel 64 or AMD64 architecture is usually a good default choice. When using virtio-mem with Virtual Function I/O (VFIO) or mediated devices (mdev), the total number of blocks across all virtio-mem devices must not be larger than 32768, otherwise the plugging of RAM might fail.
- requested: This is the amount of memory you attach to the VM with the virtio-mem device. However, it is just a request towards the VM and it might not be resolved successfully, for example if the VM is not properly configured. The requested size must be a multiple of the block size and cannot exceed the maximum defined size.
- current: This represents the current size the virtio-mem device provides to the VM. The current size can differ from the requested size, for example when requests cannot be completed or when rebooting the VM.
- alias: This is an optional user-defined alias that you can use to specify the intended virtio-mem device, for example when editing the device with libvirt commands. All user-defined aliases in libvirt must start with the "ua-" prefix.
  Apart from these specific parameters, libvirt handles the virtio-mem device like any other PCI device. For more information on managing PCI devices attached to VMs, see: Managing virtual devices
Use the XML file to attach the defined virtio-mem devices to a VM. For example, to permanently attach the two devices defined in the virtio-mem-device.xml to the running testguest1 VM:
```
# virsh attach-device testguest1 virtio-mem-device.xml --live --config
```
The --live option attaches the device to a running VM only, without persistence between boots. The --config option makes the configuration changes persistent. You can also attach the device to a shutdown VM without the --live option.
Optional: To dynamically change the requested size of a virtio-mem device attached to a running VM, use the virsh update-memory-device command:
```
# virsh update-memory-device testguest1 --alias ua-virtiomem0 --requested-size 4GiB
```
In this example:
- testguest1 is the VM you want to update.
- --alias ua-virtiomem0 is the virtio-mem device specified by a previously defined alias.
- --requested-size 4GiB changes the requested size of the virtio-mem device to 4 GiB.
  Warning
  Unplugging memory from a running VM by reducing the requested size might be unreliable. Whether this process succeeds depends on various factors, such as the memory onlining policy that is used.
  In some cases, the guest operating system cannot complete the request successfully, because changing the amount of hot-plugged memory is not possible at that time.
  Additionally, unplugging memory from a running VM is disabled by default in RHEL 8.10 VMs.
Optional: To unplug a virtio-mem device from a shut-down VM, use the virsh detach-device command:
```
# virsh detach-device testguest1 virtio-mem-device.xml
```
Optional: To unplug a virtio-mem device from a running VM:
1. Change the requested size of the virtio-mem device to 0, otherwise the attempt to unplug a virtio-mem device from a running VM will fail.
  # virsh update-memory-device testguest1 --alias ua-virtiomem0 --requested-size 0
2. Unplug a virtio-mem device from the running VM:
  # virsh detach-device testguest1 virtio-mem-device.xml --config

Verification

In the VM, check the available RAM and see if the total amount now includes the hot-plugged memory:

# free -h

        total    used    free   shared  buff/cache   available
Mem:    31Gi     5.5Gi   14Gi   1.3Gi   11Gi         23Gi
Swap:   8.0Gi    0B      8.0Gi

# numactl -H

available: 1 nodes (0)
node 0 cpus: 0 1 2 3 4 5 6 7
node 0 size: 29564 MB
node 0 free: 13351 MB
node distances:
node   0
  0:  10

The current amount of plugged-in RAM can be also viewed on the host by displaying the XML configuration of the running VM:

# virsh dumpxml testguest1

<domain type='kvm'>
  <name>testguest1</name>
  ...
  <currentMemory unit='GiB'>31</currentMemory>
  ...
  <memory model='virtio-mem'>
      <target>
        <size unit='GiB'>48</size>
        <node>0</node>
        <block unit='MiB'>2</block>
        <requested unit='GiB'>16</requested>
        <current unit='GiB'>16</current>
      </target>
      <alias name='ua-virtiomem0'/>
      <address type='pci' domain='0x0000' bus='0x08' slot='0x00' function='0x0'/>
  ...
</domain>

In this example:

<currentMemory unit='GiB'>31</currentMemory> represents the total RAM available in the VM from all sources.
<current unit='GiB'>16</current> represents the current size of the plugged-in RAM provided by the virtio-mem device.

18.5.3.4. Comparison of memory onlining configurations
Copiar o link

When attaching memory to a running RHEL virtual machine (also known as memory hot-plugging), you must set the hot-plugged memory to an online state in the virtual machine (VM) operating system. Otherwise, the system will not be able to use the memory.

The following table summarizes the main considerations when choosing between the available memory onlining configurations.

Expand

Table 18.2. Comparison of memory onlining configurations
Configuration name	Unplugging memory from a VM	A risk of creating a memory zone imbalance	A potential use case	Memory requirements of the intended workload
`online_movable`	Hot-plugged memory can be reliably unplugged.	Yes	Hot-plugging a comparatively small amount of memory	Mostly user-space memory
`auto-movable`	Movable portions of hot-plugged memory can be reliably unplugged.	Minimal	Hot-plugging a large amount of memory	Mostly user-space memory
`online_kernel`	Hot-plugged memory cannot be reliably unplugged.	No	Unreliable memory unplugging is acceptable.	User-space or kernel-space memory

A zone imbalance is a lack of available memory pages in one of the Linux memory zones. A zone imbalance can negatively impact the system performance. For example, the kernel might crash if it runs out of free memory for unmovable allocations. Usually, movable allocations contain mostly user-space memory pages and unmovable allocations contain mostly kernel-space memory pages.

18.6. Optimizing virtual machine I/O performance
Copiar o link

The input and output (I/O) capabilities of a virtual machine (VM) can significantly limit the VM’s overall efficiency. To address this, you can optimize a VM’s I/O by configuring block I/O parameters.

18.6.1. Tuning block I/O in virtual machines
Copiar o link

When multiple block devices are being used by one or more VMs, it might be important to adjust the I/O priority of specific virtual devices by modifying their I/O weights.

Increasing the I/O weight of a device increases its priority for I/O bandwidth, and therefore provides it with more host resources. Similarly, reducing a device’s weight makes it consume less host resources.

Note

Each device’s weight value must be within the 100 to 1000 range. Alternatively, the value can be 0, which removes that device from per-device listings.

Procedure

To display and set a VM’s block I/O parameters:

Display the current <blkio> parameters for a VM:

# virsh dumpxml VM-name

<domain>
  [...]
  <blkiotune>
    <weight>800</weight>
    <device>
      <path>/dev/sda</path>
      <weight>1000</weight>
    </device>
    <device>
      <path>/dev/sdb</path>
      <weight>500</weight>
    </device>
  </blkiotune>
  [...]
</domain>

Edit the I/O weight of a specified device:
```
# virsh blkiotune VM-name --device-weights device, I/O-weight
```
For example, the following changes the weight of the /dev/sda device in the testguest1 VM to 500.
```
# virsh blkiotune testguest1 --device-weights /dev/sda, 500
```

Verification

Check that the VM’s block I/O parameters have been configured correctly.
```
# virsh blkiotune testguest1

Block I/O tuning parameters for domain testguest1:

    weight                        : 800
    device_weight                  : [
                                      {"sda": 500},
                                     ]
...
```
Important
Certain kernels do not support setting I/O weights for specific devices. If the previous step does not display the weights as expected, it is likely that this feature is not compatible with your host kernel.

18.6.2. Disk I/O throttling in virtual machines
Copiar o link

When several VMs are running simultaneously, they can interfere with system performance by using excessive disk I/O. Disk I/O throttling in KVM virtualization provides the ability to set a limit on disk I/O requests sent from the VMs to the host machine. This can prevent a VM from over-utilizing shared resources and impacting the performance of other VMs.

To enable disk I/O throttling, set a limit on disk I/O requests sent from each block device attached to VMs to the host machine.

Procedure

Use the virsh domblklist command to list the names of all the disk devices on a specified VM.

# virsh domblklist rollin-coal
Target     Source
------------------------------------------------
vda        /var/lib/libvirt/images/rollin-coal.qcow2
sda        -
sdb        /home/horridly-demanding-processes.iso

Find the host block device where the virtual disk that you want to throttle is mounted.

For example, if you want to throttle the sdb virtual disk from the previous step, the following output shows that the disk is mounted on the /dev/nvme0n1p3 partition.

$ lsblk
NAME                                          MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
zram0                                         252:0    0     4G  0 disk  [SWAP]
nvme0n1                                       259:0    0 238.5G  0 disk
├─nvme0n1p1                                   259:1    0   600M  0 part  /boot/efi
├─nvme0n1p2                                   259:2    0     1G  0 part  /boot
└─nvme0n1p3                                   259:3    0 236.9G  0 part
  └─luks-a1123911-6f37-463c-b4eb-fxzy1ac12fea 253:0    0 236.9G  0 crypt /home

Set I/O limits for the block device by using the virsh blkiotune command.

# virsh blkiotune VM-name --parameter device,limit

The following example throttles the sdb disk on the rollin-coal VM to 1000 read and write I/O operations per second and to 50 MB per second read and write throughput.

# virsh blkiotune rollin-coal --device-read-iops-sec /dev/nvme0n1p3,1000 --device-write-iops-sec /dev/nvme0n1p3,1000 --device-write-bytes-sec /dev/nvme0n1p3,52428800 --device-read-bytes-sec /dev/nvme0n1p3,52428800

Additional resources

Disk I/O throttling can be useful in various situations, for example when VMs belonging to different customers are running on the same host, or when quality of service guarantees are given for different VMs. Disk I/O throttling can also be used to simulate slower disks.
I/O throttling can be applied independently to each block device attached to a VM and supports limits on throughput and I/O operations.
Red Hat does not support using the virsh blkdeviotune command to configure I/O throttling in VMs. For more information about unsupported features when using RHEL 9 as a VM host, see Unsupported features in RHEL 9 virtualization.

18.6.3. Enabling multi-queue on storage devices
Copiar o link

When using virtio-blk or virtio-scsi storage devices in your virtual machines (VMs), the multi-queue feature provides improved storage performance and scalability. It enables each virtual CPU (vCPU) to have a separate queue and interrupt to use without affecting other vCPUs.

The multi-queue feature is enabled by default for the Q35 machine type, however you must enable it manually on the i440FX machine type. You can tune the number of queues to be optimal for your workload, however the optimal number differs for each type of workload and you must test which number of queues works best in your case.

Procedure

To enable multi-queue on a storage device, edit the XML configuration of the VM.
```
# virsh edit <example_vm>
```

In the XML configuration, find the intended storage device and change the queues parameter to use multiple I/O queues. Replace N with the number of vCPUs in the VM, up to 16.

A virtio-blk example:

<disk type='block' device='disk'>
  <driver name='qemu' type='raw' queues='N'/>
  <source dev='/dev/sda'/>
  <target dev='vda' bus='virtio'/>
  <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
</disk>

A virtio-scsi example:

<controller type='scsi' index='0' model='virtio-scsi'>
   <driver queues='N' />
</controller>

Restart the VM for the changes to take effect.

18.6.4. Configuring dedicated IOThreads
Copiar o link

To improve the Input/Output (IO) performance of a disk on your virtual machine (VM), you can configure a dedicated IOThread that is used to manage the IO operations of the VM’s disk.

Normally, the IO operations of a disk are a part of the main QEMU thread, which can decrease the responsiveness of the VM as a whole during intensive IO workloads. By separating the IO operations to a dedicated IOThread, you can significantly increase the responsiveness and performance of your VM.

Procedure

Shut down the selected VM if it is running.
On the host, add or edit the <iothreads> tag in the XML configuration of the VM. For example, to create a single IOThread for a testguest1 VM:
```
# virsh edit <testguest1>

<domain type='kvm'>
  <name>testguest1</name>
  ...
  <vcpu placement='static'>8</vcpu>
  <iothreads>1</iothreads>
  ...
</domain>
```
Note
For optimal results, use only 1-2 IOThreads per CPU on the host.

Assign a dedicated IOThread` to a VM disk. For example, to assign an IOThread with ID of 1 to a disk on the testguest1 VM:

# virsh edit <testguest1>

<domain type='kvm'>
  <name>testguest1</name>
  ...
  <devices>
    <disk type='file' device='disk'>
      <driver name='qemu' type='raw' cache='none' io='native' iothread='1'/>
      <source file='/var/lib/libvirt/images/test-disk.raw'/>
      <target dev='vda' bus='virtio'/>
      <address type='pci' domain='0x0000' bus='0x04' slot='0x00' function='0x0'/>
    </disk>
    ...
  </devices>
  ...
</domain>

Note

IOThread IDs start from 1 and you must dedicate only a single IOThread to a disk.

Usually, a one dedicated IOThread per VM is sufficient for optimal performance.

When using virtio-scsi storage devices, assign a dedicated IOThread` to the virtio-scsi controller. For example, to assign an IOThread with ID of 1 to a controller on the testguest1 VM:

# virsh edit <testguest1>

<domain type='kvm'>
  <name>testguest1</name>
  ...
  <devices>
    <controller type='scsi' index='0' model='virtio-scsi'>
      <driver iothread='1'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x0b' function='0x0'/>
    </controller>
    ...
  </devices>
  ...
</domain>

Verification

Evaluate the impact of your changes on your VM performance. For details, see: Virtual machine performance monitoring tools

18.6.5. Configuring virtual disk caching
Copiar o link

KVM provides several virtual disk caching modes. For intensive Input/Output (IO) workloads, selecting the optimal caching mode can significantly increase the virtual machine (VM) performance.

+

Virtual disk cache modes overview

writethrough: Host page cache is used for reading only. Writes are reported as completed only when the data has been committed to the storage device. The sustained IO performance is decreased but this mode has good write guarantees.
writeback: Host page cache is used for both reading and writing. Writes are reported as complete when data reaches the host’s memory cache, not physical storage. This mode has faster IO performance than writethrough but it is possible to lose data on host failure.
none: Host page cache is bypassed entirely. This mode relies directly on the write queue of the physical disk, so it has a predictable sustained IO performance and offers good write guarantees on a stable guest. It is also a safe cache mode for VM live migration.

Procedure

Shut down the selected VM if it is running.
Edit the XML configuration of the selected VM.
```
# virsh edit <vm_name>
```

Find the disk device and edit the cache option in the driver tag.

<domain type='kvm'>
  <name>testguest1</name>
  ...
  <devices>
    <disk type='file' device='disk'>
      <driver name='qemu' type='raw' cache='none' io='native' iothread='1'/>
      <source file='/var/lib/libvirt/images/test-disk.raw'/>
      <target dev='vda' bus='virtio'/>
      <address type='pci' domain='0x0000' bus='0x04' slot='0x00' function='0x0'/>
    </disk>
    ...
  </devices>
  ...
</domain>

18.7. Optimizing virtual machine CPU performance
Copiar o link

Much like physical CPUs in host machines, vCPUs are critical to virtual machine (VM) performance. As a result, optimizing vCPUs can have a significant impact on the resource efficiency of your VMs. To optimize your vCPU:

Adjust how many host CPUs are assigned to the VM. You can do this using the CLI or the web console.
Ensure that the vCPU model is aligned with the CPU model of the host. For example, to set the testguest1 VM to use the CPU model of the host:
```
# virt-xml testguest1 --edit --cpu host-model
```
On an ARM 64 system, use --cpu host-passthrough.
Manage kernel same-page merging (KSM).
If your host machine uses Non-Uniform Memory Access (NUMA), you can also configure NUMA for its VMs. This maps the host’s CPU and memory processes onto the CPU and memory processes of the VM as closely as possible. In effect, NUMA tuning provides the vCPU with a more streamlined access to the system memory allocated to the VM, which can improve the vCPU processing effectiveness.
For details, see Configuring NUMA in a virtual machine and Virtual machine performance optimization for specific workloads.

18.7.1. vCPU overcommitment
Copiar o link

vCPU overcommitment allows you to have a setup where the sum of all vCPUs in virtual machines (VMs) running on a host exceeds the number of physical CPUs on the host. However, you might experience performance deterioration when simultaneously running more cores in your VMs than are physically available on the host.

For best performance, assign VMs with only as many vCPUs as are required to run the intended workloads in each VM.

vCPU overcommitment recommendations:

Assign the minimum amount of vCPUs required by by the VM’s workloads for best performance.
Avoid overcommitting vCPUs in production without extensive testing.
If overcomitting vCPUs, the safe ratio is typically 5 vCPUs to 1 physical CPU for loads under 100%.
It is not recommended to have more than 10 total allocated vCPUs per physical processor core.
Monitor CPU usage to prevent performance degradation under heavy loads.

Important

Applications that use 100% of memory or processing resources may become unstable in overcommitted environments. Do not overcommit memory or CPUs in a production environment without extensive testing, as the CPU overcommit ratio is workload-dependent.

18.7.2. Adding and removing virtual CPUs by using the command line
Copiar o link

To increase or optimize the CPU performance of a virtual machine (VM), you can add or remove virtual CPUs (vCPUs) assigned to the VM.

When performed on a running VM, this is also referred to as vCPU hot plugging and hot unplugging. However, note that vCPU hot unplug is not supported in RHEL 9, and Red Hat highly discourages its use.

Prerequisites

Optional: View the current state of the vCPUs in the targeted VM. For example, to display the number of vCPUs on the testguest VM:
```
# virsh vcpucount testguest
maximum      config         4
maximum      live           2
current      config         2
current      live           1
```
This output indicates that testguest is currently using 1 vCPU, and 1 more vCPu can be hot plugged to it to increase the VM’s performance. However, after reboot, the number of vCPUs testguest uses will change to 2, and it will be possible to hot plug 2 more vCPUs.

Procedure

Adjust the maximum number of vCPUs that can be attached to a VM, which takes effect on the VM’s next boot.
For example, to increase the maximum vCPU count for the testguest VM to 8:
```
# virsh setvcpus testguest 8 --maximum --config
```
Note that the maximum may be limited by the CPU topology, host hardware, the hypervisor, and other factors.
Adjust the current number of vCPUs attached to a VM, up to the maximum configured in the previous step. For example:
- To increase the number of vCPUs attached to the running testguest VM to 4:
  # virsh setvcpus testguest 4 --live
  This increases the VM’s performance and host load footprint of testguest until the VM’s next boot.
- To permanently decrease the number of vCPUs attached to the testguest VM to 1:
  # virsh setvcpus testguest 1 --config
  This decreases the VM’s performance and host load footprint of testguest after the VM’s next boot. However, if needed, additional vCPUs can be hot plugged to the VM to temporarily increase its performance.

Verification

Confirm that the current state of vCPU for the VM reflects your changes.

# virsh vcpucount testguest
maximum      config         8
maximum      live           4
current      config         1
current      live           4

18.7.3. Managing virtual CPUs by using the web console
Copiar o link

By using the RHEL 9 web console, you can review and configure virtual CPUs used by virtual machines (VMs) to which the web console is connected.

Prerequisites

You have installed the RHEL 9 web console.
You have enabled the cockpit service.
Your user account is allowed to log in to the web console.
For instructions, see Installing and enabling the web console.
The web console VM plug-in is installed on your system.

Procedure

Log in to the RHEL 9 web console.
For details, see Logging in to the web console.
In the Virtual Machines interface, click the VM whose information you want to see.
A new page opens with an Overview section with basic information about the selected VM and a Console section to access the VM’s graphical interface.
Click edit next to the number of vCPUs in the Overview pane.
The vCPU details dialog appears.

Configure the virtual CPUs for the selected VM.
- vCPU Count - The number of vCPUs currently in use.
  Note
  The vCPU count cannot be greater than the vCPU Maximum.
- vCPU Maximum - The maximum number of virtual CPUs that can be configured for the VM. If this value is higher than the vCPU Count, additional vCPUs can be attached to the VM.
- Sockets - The number of sockets to expose to the VM.
- Cores per socket - The number of cores for each socket to expose to the VM.
- Threads per core - The number of threads for each core to expose to the VM.
  Note that the Sockets, Cores per socket, and Threads per core options adjust the CPU topology of the VM. This may be beneficial for vCPU performance and may impact the functionality of certain software in the guest OS. If a different setting is not required by your deployment, keep the default values.
Click Apply.
The virtual CPUs for the VM are configured.
Note
Changes to virtual CPU settings only take effect after the VM is restarted.

18.7.4. Configuring NUMA in a virtual machine
Copiar o link

The following methods can be used to configure Non-Uniform Memory Access (NUMA) settings of a virtual machine (VM) on a RHEL 9 host.

For ease of use, you can set up a VM’s NUMA configuration by using automated utilities and services. However, manual NUMA setup is more likely to yield a significant performance improvement.

Prerequisites

The host is a NUMA-compatible machine. To detect whether this is the case, use the virsh nodeinfo command and see the NUMA cell(s) line:

# virsh nodeinfo
CPU model:           x86_64
CPU(s):              48
CPU frequency:       1200 MHz
CPU socket(s):       1
Core(s) per socket:  12
Thread(s) per core:  2
NUMA cell(s):        2
Memory size:         67012964 KiB

If the value of the line is 2 or greater, the host is NUMA-compatible.

Optional: You have the numactl package installed on the host.
```
# dnf install numactl
```

Procedure

Automatic methods

Set the VM’s NUMA policy to Preferred. For example, to configure the testguest5 VM:

# virt-xml testguest5 --edit --vcpus placement=auto
# virt-xml testguest5 --edit --numatune mode=preferred

Use the numad service to automatically align the VM CPU with memory resources.
```
# echo 1 > /proc/sys/kernel/numa_balancing
```
Start the numad service to automatically align the VM CPU with memory resources.
```
# systemctl start numad
```

Manual methods

To manually tune NUMA settings, you can specify which host NUMA nodes will be assigned specifically to a certain VM. This can improve the host memory usage by the VM’s vCPU.

Optional: Use the numactl command to view the NUMA topology on the host:

# numactl --hardware

available: 2 nodes (0-1)
node 0 size: 18156 MB
node 0 free: 9053 MB
node 1 size: 18180 MB
node 1 free: 6853 MB
node distances:
node   0   1
  0:  10  20
  1:  20  10

Edit the XML configuration of a VM to assign CPU and memory resources to specific NUMA nodes. For example, the following configuration sets testguest6 to use vCPUs 0-7 on NUMA node 0 and vCPUS 8-15 on NUMA node 1. Both nodes are also assigned 16 GiB of VM’s memory.

# virsh edit <testguest6>

<domain type='kvm'>
  <name>testguest6</name>
  ...
  <vcpu placement='static'>16</vcpu>
  ...
  <cpu ...>
    <numa>
      <cell id='0' cpus='0-7' memory='16' unit='GiB'/>
      <cell id='1' cpus='8-15' memory='16' unit='GiB'/>
    </numa>
  ...
</domain>

If the VM is running, restart it to apply the configuration.

Note

For best performance results, it is recommended to respect the maximum memory size for each NUMA node on the host.

Known issues

NUMA tuning currently cannot be performed on IBM Z hosts.

18.7.5. Configuring virtual CPU pinning
Copiar o link

To improve the CPU performance of a virtual machine (VM), you can pin a virtual CPU (vCPU) to a specific physical CPU thread on the host. This ensures that the vCPU will have its own dedicated physical CPU thread, which can significantly improve the vCPU performance.

To further optimize the CPU performance, you can also pin QEMU process threads associated with a specified VM to a specific host CPU.

Procedure

Check the CPU topology on the host:
```
# lscpu -p=node,cpu

Node,CPU
0,0
0,1
0,2
0,3
0,4
0,5
0,6
0,7
1,0
1,1
1,2
1,3
1,4
1,5
1,6
1,7
```
In this example, the output contains NUMA nodes and the available physical CPU threads on the host.
Check the number of vCPU threads inside the VM:
```
# lscpu -p=node,cpu

Node,CPU
0,0
0,1
0,2
0,3
```
In this example, the output contains NUMA nodes and the available vCPU threads inside the VM.
Pin specific vCPU threads from a VM to a specific host CPU or range of CPUs. This is recommended as a safe method of vCPU performance improvement.
For example, the following commands pin vCPU threads 0 to 3 of the testguest6 VM to host CPUs 1, 3, 5, 7, respectively:
```
# virsh vcpupin testguest6 0 1
# virsh vcpupin testguest6 1 3
# virsh vcpupin testguest6 2 5
# virsh vcpupin testguest6 3 7
```

Optional: Verify whether the vCPU threads are successfully pinned to CPUs.

# virsh vcpupin testguest6
VCPU   CPU Affinity
----------------------
0      1
1      3
2      5
3      7

After pinning vCPU threads, you can also pin QEMU process threads associated with a specified VM to a specific host CPU or range of CPUs. This can further help the QEMU process to run more efficiently on the physical CPU.
For example, the following commands pin the QEMU process thread of testguest6 to CPUs 2 and 4, and verify this was successful:
```
# virsh emulatorpin testguest6 2,4
# virsh emulatorpin testguest6
emulator: CPU Affinity
----------------------------------
       *: 2,4
```

18.7.6. Configuring virtual CPU capping
Copiar o link

You can use virtual CPU (vCPU) capping to limit the amount of CPU resources a virtual machine (VM) can use. vCPU capping can improve the overall performance by preventing excessive use of host’s CPU resources by a single VM and by making it easier for the hypervisor to manage CPU scheduling.

Procedure

View the current vCPU scheduling configuration on the host.

# virsh schedinfo <vm_name>

Scheduler      : posix
cpu_shares     : 0
vcpu_period : 0
vcpu_quota : 0
emulator_period: 0
emulator_quota : 0
global_period  : 0
global_quota   : 0
iothread_period: 0
iothread_quota : 0

To configure an absolute vCPU cap for a VM, set the vcpu_period and vcpu_quota parameters. Both parameters use a numerical value that represents a time duration in microseconds.
1. Set the vcpu_period parameter by using the virsh schedinfo command. For example:
  # virsh schedinfo <vm_name> --set vcpu_period=100000
  In this example, the vcpu_period is set to 100,000 microseconds, which means the scheduler enforces vCPU capping during this time interval.
  You can also use the --live --config options to configure a running VM without restarting it.
2. Set the vcpu_quota parameter by using the virsh schedinfo command. For example:
  # virsh schedinfo <vm_name> --set vcpu_quota=50000
  In this example, the vcpu_quota is set to 50,000 microseconds, which specifies the maximum amount of CPU time that the VM can use during the vcpu_period time interval. In this case, vcpu_quota is set as the half of vcpu_period, so the VM can use up to 50% of the CPU time during that interval.
  You can also use the --live --config options to configure a running VM without restarting it.

Verification

Check that the vCPU scheduling parameters have the correct values.

# virsh schedinfo <vm_name>

Scheduler      : posix
cpu_shares     : 2048
vcpu_period    : 100000
vcpu_quota     : 50000
...

18.7.7. Tuning CPU weights
Copiar o link

The CPU weight (or CPU shares) setting controls how much CPU time a virtual machine (VM) receives compared to other running VMs. By increasing the CPU weight of a specific VM, you can ensure that this VM gets more CPU time relative to other VMs. To prioritize CPU time allocation between multiple VMs, set the cpu_shares parameter

The possible CPU weight values range from 0 to 262144 and the default value for a new KVM VM is 1024.

Procedure

Check the current CPU weight of a VM.

# virsh schedinfo <vm_name>

Scheduler      : posix
cpu_shares : 1024
vcpu_period    : 0
vcpu_quota     : 0
emulator_period: 0
emulator_quota : 0
global_period  : 0
global_quota   : 0
iothread_period: 0
iothread_quota : 0

Adjust the CPU weight to a preferred value.
```
# virsh schedinfo <vm_name> --set cpu_shares=2048

Scheduler      : posix
cpu_shares : 2048
vcpu_period    : 0
vcpu_quota     : 0
emulator_period: 0
emulator_quota : 0
global_period  : 0
global_quota   : 0
iothread_period: 0
iothread_quota : 0
```
In this example, cpu_shares is set to 2048. This means that if all other VMs have the value set to 1024, this VM gets approximately twice the amount of CPU time.
You can also use the --live --config options to configure a running VM without restarting it.

18.7.8. Enabling and disabling kernel same-page merging
Copiar o link

Kernel Same-Page Merging (KSM) improves memory density by sharing identical memory pages between virtual machines (VMs). Therefore, enabling KSM might improve memory efficiency of your VM deployment.

However, enabling KSM also increases CPU utilization, and might negatively affect overall performance depending on the workload.

In RHEL 9 and later, KSM is disabled by default. To enable KSM and test its impact on your VM performance, see the following instructions.

Prerequisites

Root access to your host system.

Procedure

Enable KSM:

Warning

Enabling KSM increases CPU utilization and affects overall CPU performance.

Install the ksmtuned service:

# {PackageManagerCommand} install ksmtuned

Start the service:

To enable KSM for a single session, use the systemctl utility to start the ksm and ksmtuned services.
```
# systemctl start ksm
# systemctl start ksmtuned
```

To enable KSM persistently, use the systemctl utility to enable the ksm and ksmtuned services.

# systemctl enable ksm
Created symlink /etc/systemd/system/multi-user.target.wants/ksm.service  /usr/lib/systemd/system/ksm.service

# systemctl enable ksmtuned
Created symlink /etc/systemd/system/multi-user.target.wants/ksmtuned.service  /usr/lib/systemd/system/ksmtuned.service

Monitor the performance and resource consumption of VMs on your host to evaluate the benefits of activating KSM. Specifically, ensure that the additional CPU usage by KSM does not offset the memory improvements and does not cause additional performance issues. In latency-sensitive workloads, also pay attention to cross-NUMA page merges.
Optional: If KSM has not improved your VM performance, disable it:
- To disable KSM for a single session, use the systemctl utility to stop ksm and ksmtuned services.
  # systemctl stop ksm # systemctl stop ksmtuned
- To disable KSM persistently, use the systemctl utility to disable ksm and ksmtuned services.
  # systemctl disable ksm Removed /etc/systemd/system/multi-user.target.wants/ksm.service. # systemctl disable ksmtuned Removed /etc/systemd/system/multi-user.target.wants/ksmtuned.service.

Note

Memory pages shared between VMs before deactivating KSM will remain shared. To stop sharing, delete all the PageKSM pages in the system by using the following command:

# echo 2 > /sys/kernel/mm/ksm/run

However, this command increases memory usage, and might cause performance problems on your host or your VMs.

18.8. Optimizing virtual machine network performance
Copiar o link

Due to the virtual nature of a VM’s network interface controller (NIC), the VM loses a portion of its allocated host network bandwidth, which can reduce the overall workload efficiency of the VM. The following tips can minimize the negative impact of virtualization on the virtual NIC (vNIC) throughput.

Procedure

Use any of the following methods and observe if it has a beneficial effect on your VM network performance:

Enable the vhost_net module

On the host, ensure the vhost_net kernel feature is enabled:

# lsmod | grep vhost
vhost_net              32768  1
vhost                  53248  1 vhost_net
tap                    24576  1 vhost_net
tun                    57344  6 vhost_net

If the output of this command is blank, enable the vhost_net kernel module:

# modprobe vhost_net

Set up multi-queue virtio-net

To set up the multi-queue virtio-net feature for a VM, use the virsh edit command to edit to the XML configuration of the VM. In the XML, add the following to the <devices> section, and replace N with the number of vCPUs in the VM, up to 16:

<interface type='network'>
      <source network='default'/>
      <model type='virtio'/>
      <driver name='vhost' queues='N'/>
</interface>

If the VM is running, restart it for the changes to take effect.

Batching network packets

In Linux VM configurations with a long transmission path, batching packets before submitting them to the kernel may improve cache utilization. To set up packet batching, use the following command on the host, and replace tap0 with the name of the network interface that the VMs use:

# ethtool -C tap0 rx-frames 64

SR-IOV

If your host NIC supports SR-IOV, use SR-IOV device assignment for your vNICs. For more information, see Managing SR-IOV devices.

18.9. Virtual machine performance monitoring tools
Copiar o link

To identify what consumes the most VM resources and which aspect of VM performance needs optimization, performance diagnostic tools, both general and VM-specific, can be used.

Default OS performance monitoring tools

For standard performance evaluation, you can use the utilities provided by default by your host and guest operating systems:

On your RHEL 9 host, as root, use the top utility or the system monitor application, and look for qemu and virt in the output. This shows how much host system resources your VMs are consuming.
- If the monitoring tool displays that any of the qemu or virt processes consume a large portion of the host CPU or memory capacity, use the perf utility to investigate. For details, see below.
- In addition, if a vhost_net thread process, named for example vhost_net-1234, is displayed as consuming an excessive amount of host CPU capacity, consider using virtual network optimization features, such as multi-queue virtio-net.
On the guest operating system, use performance utilities and applications available on the system to evaluate which processes consume the most system resources.
- On Linux systems, you can use the top utility.
- On Windows systems, you can use the Task Manager application.

perf kvm

You can use the perf utility to collect and analyze virtualization-specific statistics about the performance of your RHEL 9 host. To do so:

On the host, install the perf package:
```
# dnf install perf
```
Use one of the perf kvm stat commands to display perf statistics for your virtualization host:
- For real-time monitoring of your hypervisor, use the perf kvm stat live command.
- To log the perf data of your hypervisor over a period of time, activate the logging by using the perf kvm stat record command. After the command is canceled or interrupted, the data is saved in the perf.data.guest file, which can be analyzed by using the perf kvm stat report command.

Analyze the perf output for types of VM-EXIT events and their distribution. For example, the PAUSE_INSTRUCTION events should be infrequent, but in the following output, the high occurrence of this event suggests that the host CPUs are not handling the running vCPUs well. In such a scenario, consider shutting down some of your active VMs, removing vCPUs from these VMs, or tuning the performance of the vCPUs.

# perf kvm stat report

Analyze events for all VMs, all VCPUs:


             VM-EXIT    Samples  Samples%     Time%    Min Time    Max Time         Avg time

  EXTERNAL_INTERRUPT     365634    31.59%    18.04%      0.42us  58780.59us    204.08us ( +-   0.99% )
           MSR_WRITE     293428    25.35%     0.13%      0.59us  17873.02us      1.80us ( +-   4.63% )
    PREEMPTION_TIMER     276162    23.86%     0.23%      0.51us  21396.03us      3.38us ( +-   5.19% )
   PAUSE_INSTRUCTION     189375    16.36%    11.75%      0.72us  29655.25us    256.77us ( +-   0.70% )
                 HLT      20440     1.77%    69.83%      0.62us  79319.41us  14134.56us ( +-   0.79% )
              VMCALL      12426     1.07%     0.03%      1.02us   5416.25us      8.77us ( +-   7.36% )
       EXCEPTION_NMI         27     0.00%     0.00%      0.69us      1.34us      0.98us ( +-   3.50% )
       EPT_MISCONFIG          5     0.00%     0.00%      5.15us     10.85us      7.88us ( +-  11.67% )

Total Samples:1157497, Total events handled time:413728274.66us.

Other event types that can signal problems in the output of perf kvm stat include:

INSN_EMULATION - suggests suboptimal VM I/O configuration.

For more information about using perf to monitor virtualization performance, see the perf-kvm man page on your system.

numastat

To see the current NUMA configuration of your system, you can use the numastat utility, which is provided by installing the numactl package.

The following shows a host with 4 running VMs, each obtaining memory from multiple NUMA nodes. This is not optimal for vCPU performance, and warrants adjusting:

# numastat -c qemu-kvm

Per-node process memory usage (in MBs)
PID              Node 0 Node 1 Node 2 Node 3 Node 4 Node 5 Node 6 Node 7 Total
---------------  ------ ------ ------ ------ ------ ------ ------ ------ -----
51722 (qemu-kvm)     68     16    357   6936      2      3    147    598  8128
51747 (qemu-kvm)    245     11      5     18   5172   2532      1     92  8076
53736 (qemu-kvm)     62    432   1661    506   4851    136     22    445  8116
53773 (qemu-kvm)   1393      3      1      2     12      0      0   6702  8114
---------------  ------ ------ ------ ------ ------ ------ ------ ------ -----
Total              1769    463   2024   7462  10037   2672    169   7837 32434

In contrast, the following shows memory being provided to each VM by a single node, which is significantly more efficient.

# numastat -c qemu-kvm

Per-node process memory usage (in MBs)
PID              Node 0 Node 1 Node 2 Node 3 Node 4 Node 5 Node 6 Node 7 Total
---------------  ------ ------ ------ ------ ------ ------ ------ ------ -----
51747 (qemu-kvm)      0      0      7      0   8072      0      1      0  8080
53736 (qemu-kvm)      0      0      7      0      0      0   8113      0  8120
53773 (qemu-kvm)      0      0      7      0      0      0      1   8110  8118
59065 (qemu-kvm)      0      0   8050      0      0      0      0      0  8051
---------------  ------ ------ ------ ------ ------ ------ ------ ------ -----
Total                 0      0   8072      0   8072      0   8114   8110 32368

Este conteúdo não está disponível no idioma selecionado.

18.1. What influences virtual machine performanceCopiar o linkLink copiado para a área de transferência!

The impact of virtualization on system performance

Reducing VM performance loss

18.2. Optimizing virtual machine performance by using TuneDCopiar o linkLink copiado para a área de transferência!

18.3. Virtual machine performance optimization for specific workloadsCopiar o linkLink copiado para a área de transferência!

18.4. Optimizing libvirt daemonsCopiar o linkLink copiado para a área de transferência!

18.4.1. Types of libvirt daemonsCopiar o linkLink copiado para a área de transferência!

18.4.2. Enabling modular libvirt daemonsCopiar o linkLink copiado para a área de transferência!

18.5. Configuring virtual machine memoryCopiar o linkLink copiado para a área de transferência!

18.5.1. Memory overcommitmentCopiar o linkLink copiado para a área de transferência!

18.5.2. Configuring virtual machines to use huge pagesCopiar o linkLink copiado para a área de transferência!

18.5.3. Adding and removing virtual machine memory by using virtio-memCopiar o linkLink copiado para a área de transferência!

18.5.3.1. Overview of virtio-memCopiar o linkLink copiado para a área de transferência!

18.5.3.2. Configuring memory onlining in virtual machinesCopiar o linkLink copiado para a área de transferência!

18.5.3.3. Attaching a virtio-mem device to virtual machinesCopiar o linkLink copiado para a área de transferência!

18.5.3.4. Comparison of memory onlining configurationsCopiar o linkLink copiado para a área de transferência!

18.6. Optimizing virtual machine I/O performanceCopiar o linkLink copiado para a área de transferência!

18.6.1. Tuning block I/O in virtual machinesCopiar o linkLink copiado para a área de transferência!

18.6.2. Disk I/O throttling in virtual machinesCopiar o linkLink copiado para a área de transferência!

18.6.3. Enabling multi-queue on storage devicesCopiar o linkLink copiado para a área de transferência!

18.6.4. Configuring dedicated IOThreadsCopiar o linkLink copiado para a área de transferência!

18.6.5. Configuring virtual disk cachingCopiar o linkLink copiado para a área de transferência!

18.7. Optimizing virtual machine CPU performanceCopiar o linkLink copiado para a área de transferência!

18.7.1. vCPU overcommitmentCopiar o linkLink copiado para a área de transferência!

18.7.2. Adding and removing virtual CPUs by using the command lineCopiar o linkLink copiado para a área de transferência!

18.7.3. Managing virtual CPUs by using the web consoleCopiar o linkLink copiado para a área de transferência!

18.7.4. Configuring NUMA in a virtual machineCopiar o linkLink copiado para a área de transferência!

18.7.5. Configuring virtual CPU pinningCopiar o linkLink copiado para a área de transferência!

18.7.6. Configuring virtual CPU cappingCopiar o linkLink copiado para a área de transferência!

18.7.7. Tuning CPU weightsCopiar o linkLink copiado para a área de transferência!

18.7.8. Enabling and disabling kernel same-page mergingCopiar o linkLink copiado para a área de transferência!

18.8. Optimizing virtual machine network performanceCopiar o linkLink copiado para a área de transferência!

18.9. Virtual machine performance monitoring toolsCopiar o linkLink copiado para a área de transferência!

Aprender

Experimente, compre e venda

Comunidades

Sobre a documentação da Red Hat

Tornando o open source mais inclusivo

Sobre a Red Hat

Theme

Red Hat legal and privacy links

Red Hat legal and privacy links

18.1. What influences virtual machine performance
Copiar o link

18.2. Optimizing virtual machine performance by using TuneD
Copiar o link

18.3. Virtual machine performance optimization for specific workloads
Copiar o link

18.4. Optimizing libvirt daemons
Copiar o link

18.4.1. Types of libvirt daemons
Copiar o link

18.4.2. Enabling modular libvirt daemons
Copiar o link

18.5. Configuring virtual machine memory
Copiar o link

18.5.1. Memory overcommitment
Copiar o link

18.5.2. Configuring virtual machines to use huge pages
Copiar o link

18.5.3. Adding and removing virtual machine memory by using virtio-mem
Copiar o link

18.5.3.1. Overview of virtio-mem
Copiar o link

18.5.3.2. Configuring memory onlining in virtual machines
Copiar o link

18.5.3.3. Attaching a virtio-mem device to virtual machines
Copiar o link

18.5.3.4. Comparison of memory onlining configurations
Copiar o link

18.6. Optimizing virtual machine I/O performance
Copiar o link

18.6.1. Tuning block I/O in virtual machines
Copiar o link

18.6.2. Disk I/O throttling in virtual machines
Copiar o link

18.6.3. Enabling multi-queue on storage devices
Copiar o link

18.6.4. Configuring dedicated IOThreads
Copiar o link

18.6.5. Configuring virtual disk caching
Copiar o link

18.7. Optimizing virtual machine CPU performance
Copiar o link

18.7.1. vCPU overcommitment
Copiar o link

18.7.2. Adding and removing virtual CPUs by using the command line
Copiar o link

18.7.3. Managing virtual CPUs by using the web console
Copiar o link

18.7.4. Configuring NUMA in a virtual machine
Copiar o link

18.7.5. Configuring virtual CPU pinning
Copiar o link

18.7.6. Configuring virtual CPU capping
Copiar o link

18.7.7. Tuning CPU weights
Copiar o link

18.7.8. Enabling and disabling kernel same-page merging
Copiar o link

18.8. Optimizing virtual machine network performance
Copiar o link

18.9. Virtual machine performance monitoring tools
Copiar o link