Chapter 4. Setting up real-time virtual machines


To set up a virtual machine (VM) with a RHEL 9 for Real Time guest operating system, you must create a VM, configure its guest, and optimize and test the VM’s performance.

4.1. Optimizing vCPU pinning for real-time virtual machines

To correctly set up a RHEL real-time (RT) virtual machine (VM), you must first have a plan for optimal pinning of the VM’s virtual CPUs (vCPUs) to the physical CPUs of the host.

Prerequisites

Procedure

  1. View the CPU topology of your host system:

    # lstopo-no-graphics
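
    The lstopo-no-graphics utility is provided by the hwloc package. If the command is not available on your host, you can install the package, for example:

    # dnf install hwloc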

    The following example output shows a system with 32 physical cores and hyperthreading enabled, divided into 2 sockets ("packages") with 4 CPU dies each. The system also has 250 GB of RAM split across 2 NUMA nodes.

    Note that the following examples in this procedure are based on this topology.

    Machine (250GB total)
      Package L#0
        NUMANode L#0 (P#0 124GB)
        Die L#0 + L3 L#0 (16MB)
          L2 L#0 (1024KB) + L1d L#0 (32KB) + L1i L#0 (32KB) + Core L#0
            PU L#0 (P#0)
            PU L#1 (P#32)
          L2 L#1 (1024KB) + L1d L#1 (32KB) + L1i L#1 (32KB) + Core L#1
            PU L#2 (P#1)
            PU L#3 (P#33)
          L2 L#2 (1024KB) + L1d L#2 (32KB) + L1i L#2 (32KB) + Core L#2
            PU L#4 (P#2)
            PU L#5 (P#34)
          L2 L#3 (1024KB) + L1d L#3 (32KB) + L1i L#3 (32KB) + Core L#3
            PU L#6 (P#3)
            PU L#7 (P#35)
        Die L#1 + L3 L#1 (16MB)
          L2 L#4 (1024KB) + L1d L#4 (32KB) + L1i L#4 (32KB) + Core L#4
            PU L#8 (P#4)
            PU L#9 (P#36)
          L2 L#5 (1024KB) + L1d L#5 (32KB) + L1i L#5 (32KB) + Core L#5
            PU L#10 (P#5)
            PU L#11 (P#37)
          L2 L#6 (1024KB) + L1d L#6 (32KB) + L1i L#6 (32KB) + Core L#6
            PU L#12 (P#6)
            PU L#13 (P#38)
          L2 L#7 (1024KB) + L1d L#7 (32KB) + L1i L#7 (32KB) + Core L#7
            PU L#14 (P#7)
            PU L#15 (P#39)
        Die L#2 + L3 L#2 (16MB)
          L2 L#8 (1024KB) + L1d L#8 (32KB) + L1i L#8 (32KB) + Core L#8
            PU L#16 (P#8)
            PU L#17 (P#40)
          L2 L#9 (1024KB) + L1d L#9 (32KB) + L1i L#9 (32KB) + Core L#9
            PU L#18 (P#9)
            PU L#19 (P#41)
          L2 L#10 (1024KB) + L1d L#10 (32KB) + L1i L#10 (32KB) + Core L#10
            PU L#20 (P#10)
            PU L#21 (P#42)
          L2 L#11 (1024KB) + L1d L#11 (32KB) + L1i L#11 (32KB) + Core L#11
            PU L#22 (P#11)
            PU L#23 (P#43)
        Die L#3 + L3 L#3 (16MB)
          L2 L#12 (1024KB) + L1d L#12 (32KB) + L1i L#12 (32KB) + Core L#12
            PU L#24 (P#12)
            PU L#25 (P#44)
          L2 L#13 (1024KB) + L1d L#13 (32KB) + L1i L#13 (32KB) + Core L#13
            PU L#26 (P#13)
            PU L#27 (P#45)
          L2 L#14 (1024KB) + L1d L#14 (32KB) + L1i L#14 (32KB) + Core L#14
            PU L#28 (P#14)
            PU L#29 (P#46)
          L2 L#15 (1024KB) + L1d L#15 (32KB) + L1i L#15 (32KB) + Core L#15
            PU L#30 (P#15)
            PU L#31 (P#47)
      Package L#1
        NUMANode L#1 (P#1 126GB)
        Die L#4 + L3 L#4 (16MB)
          L2 L#16 (1024KB) + L1d L#16 (32KB) + L1i L#16 (32KB) + Core L#16
            PU L#32 (P#16)
            PU L#33 (P#48)
          L2 L#17 (1024KB) + L1d L#17 (32KB) + L1i L#17 (32KB) + Core L#17
            PU L#34 (P#17)
            PU L#35 (P#49)
          L2 L#18 (1024KB) + L1d L#18 (32KB) + L1i L#18 (32KB) + Core L#18
            PU L#36 (P#18)
            PU L#37 (P#50)
          L2 L#19 (1024KB) + L1d L#19 (32KB) + L1i L#19 (32KB) + Core L#19
            PU L#38 (P#19)
            PU L#39 (P#51)
        Die L#5 + L3 L#5 (16MB)
          L2 L#20 (1024KB) + L1d L#20 (32KB) + L1i L#20 (32KB) + Core L#20
            PU L#40 (P#20)
            PU L#41 (P#52)
          L2 L#21 (1024KB) + L1d L#21 (32KB) + L1i L#21 (32KB) + Core L#21
            PU L#42 (P#21)
            PU L#43 (P#53)
          L2 L#22 (1024KB) + L1d L#22 (32KB) + L1i L#22 (32KB) + Core L#22
            PU L#44 (P#22)
            PU L#45 (P#54)
          L2 L#23 (1024KB) + L1d L#23 (32KB) + L1i L#23 (32KB) + Core L#23
            PU L#46 (P#23)
            PU L#47 (P#55)
        Die L#6 + L3 L#6 (16MB)
          L2 L#24 (1024KB) + L1d L#24 (32KB) + L1i L#24 (32KB) + Core L#24
            PU L#48 (P#24)
            PU L#49 (P#56)
          L2 L#25 (1024KB) + L1d L#25 (32KB) + L1i L#25 (32KB) + Core L#25
            PU L#50 (P#25)
            PU L#51 (P#57)
          L2 L#26 (1024KB) + L1d L#26 (32KB) + L1i L#26 (32KB) + Core L#26
            PU L#52 (P#26)
            PU L#53 (P#58)
          L2 L#27 (1024KB) + L1d L#27 (32KB) + L1i L#27 (32KB) + Core L#27
            PU L#54 (P#27)
            PU L#55 (P#59)
        Die L#7 + L3 L#7 (16MB)
          L2 L#28 (1024KB) + L1d L#28 (32KB) + L1i L#28 (32KB) + Core L#28
            PU L#56 (P#28)
            PU L#57 (P#60)
          L2 L#29 (1024KB) + L1d L#29 (32KB) + L1i L#29 (32KB) + Core L#29
            PU L#58 (P#29)
            PU L#59 (P#61)
          L2 L#30 (1024KB) + L1d L#30 (32KB) + L1i L#30 (32KB) + Core L#30
            PU L#60 (P#30)
            PU L#61 (P#62)
          L2 L#31 (1024KB) + L1d L#31 (32KB) + L1i L#31 (32KB) + Core L#31
            PU L#62 (P#31)
            PU L#63 (P#63)
  2. Based on the output of lstopo-no-graphics and your required real-time VM setup, determine how to pin your vCPUs to physical CPUs. The following items show <cputune> XML configurations that are effective for the example host topology above and a real-time VM with 4 vCPUs:

    • The following pinning placement uses an exclusive core for each vCPU. For such a pinning configuration to be effective, the assigned physical CPUs must be isolated on the host and must not run any other processes. You can check host CPU isolation as shown after these examples.

      <cputune>
        <vcpupin vcpu='0' cpuset='4'/>
        <vcpupin vcpu='1' cpuset='5'/>
        <vcpupin vcpu='2' cpuset='6'/>
        <vcpupin vcpu='3' cpuset='7'/>
      [...]
    • The following pinning placement uses an exclusive L3 cache for each vCPU:

      <cputune>
        <vcpupin vcpu='0' cpuset='16'/>
        <vcpupin vcpu='1' cpuset='20'/>
        <vcpupin vcpu='2' cpuset='24'/>
        <vcpupin vcpu='3' cpuset='28'/>
      [...]
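
    For the exclusive-core placement to be effective, the assigned physical CPUs must already be isolated on the host. As a minimal check, assuming the CPUs are isolated through the isolcpus kernel parameter or an isolating TuneD profile, list the isolated host CPUs:

    # cat /sys/devices/system/cpu/isolated

    For the first example above, the output should include CPUs 4 - 7. An empty output means that no CPUs are isolated through this mechanism.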

Verification

4.2. Installing a RHEL real-time guest operating system

To prepare a virtual machine (VM) environment for real-time workloads, create a new VM and adjust its configuration for low-latency performance.

Prerequisites

Procedure

  1. Use the virt-install utility to create a RHEL 9 VM with the following properties:

    • The VM has 2 or more assigned vCPUs.
    • The VM uses huge pages for memory backing.

    The following example command creates a VM named RHEL9-RT that meets these requirements:

    # virt-install -n RHEL9-RT \
        --os-variant=rhel9.6 --memory=3072,hugepages=yes \
        --memorybacking hugepages=yes,size=1,unit=G,locked=yes \
        --vcpus=4 --numatune=1 --disk path=./rhel9-rt.img,bus=virtio,cache=none,format=raw,io=threads,size=30 \
        --graphics none --console pty,target_type=serial \
        -l downloads/rhel9.iso \
        --extra-args 'console=ttyS0,115200n8 serial'
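
    Memory backing with 1 GiB huge pages requires that such pages are already reserved on the host. Assuming your host supports 1 GiB pages, a minimal check is:

    # grep -i hugepages /proc/meminfo
    # cat /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages

    A nonzero value in the nr_hugepages file indicates that 1 GiB huge pages are reserved.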
  2. After the installation finishes, shut down the VM.

    # virsh shutdown <RHEL9-RT>
  3. Open the XML configuration of the VM.

    # virsh edit <RHEL9-RT>
  4. Adjust the CPU configuration as follows:

    <cpu mode='host-model' check='partial'>
        <feature policy='require' name='tsc-deadline'/>
    </cpu>
  5. Remove non-essential virtual hardware from the VM to improve its performance.

    1. Delete the section for the virtio RNG device.

        <rng model='virtio'>
            <backend model='random'>/dev/urandom</backend>
            <address type='pci' domain='0x0000' bus='0x07' slot='0x00' function='0x0'/>
        </rng>
    2. Remove USB devices, such as the following:

      <hostdev mode='subsystem' type='usb' managed='yes'>
        <source>
          <vendor id='0x1234'/>
          <product id='0xabcd'/>
        </source>
      </hostdev>
    3. Remove serial devices, such as the following:

      <serial type='dev'>
        <source path='/dev/ttyS0'/>
        <target port='0'/>
      </serial>
    4. Remove the QXL device.

      <video>
        <model type='qxl' ram='65536' vram='65536' vgamem='16384' heads='1'/>
      </video>
    5. Disable the graphical display.

      <graphics type='vnc' port='-1' autoport='yes' listen='127.0.0.1'>
        <listen type='address' address='127.0.0.1'/>
      </graphics>
    6. In the USB controller setting, change the model to none to disable it.

      <controller type='usb' index='0' model='none'/>
    7. Remove the Trusted Platform Module (TPM) configuration, so that it does not interfere with RT operations.

        <tpm model='tpm-crb'>
            <backend type='emulator' version='2.0'/>
        </tpm>
    8. Disable the memballoon function.

        <memballoon model='none'/>
    9. In the <features> section of the configuration, ensure that the PMU and vmport features are disabled, to avoid the latency they might cause.

        <features>
           [...]
           <pmu state='off'/>
           <vmport state='off'/>
        </features>
  6. Edit the <numatune> section to set up the NUMA nodes.

      <numatune>
        <memory mode='strict' nodeset='1'/>
      </numatune>
  7. Edit the <cputune> section of the configuration to set up vCPU pinning as planned in Optimizing vCPU pinning for real-time virtual machines.

    The following example configures a VM with 4 vCPUs and these parameters:

    • Isolated host CPU 15 from NUMA node 0 is pinned to the non-real-time vCPU (vCPU 0).
    • Host CPUs 16, 47, and 48 are pinned to the real-time vCPUs (vCPUs 1 - 3).
    • The configuration pins all the QEMU I/O threads to the host housekeeping CPUs (0 and 32).
    <cputune>
      <vcpupin vcpu='0' cpuset='15'/>
      <vcpupin vcpu='1' cpuset='47'/>
      <vcpupin vcpu='2' cpuset='16'/>
      <vcpupin vcpu='3' cpuset='48'/>
      <emulatorpin cpuset='0,32'/>
      <emulatorsched scheduler='fifo' priority='1'/>
      <vcpusched vcpus='0' scheduler='fifo' priority='1'/>
      <vcpusched vcpus='1' scheduler='fifo' priority='1'/>
      <vcpusched vcpus='2' scheduler='fifo' priority='1'/>
      <vcpusched vcpus='3' scheduler='fifo' priority='1'/>
    </cputune>
    Note

    If your host uses hardware with enabled hyperthreading, also ensure that your <cputune> configuration meets the following requirements:

    • Assign the siblings of a physical core to perform either real-time or housekeeping tasks.
    • Use both siblings of a physical core in the same VM.
    • Assign vCPUs that are pinned to the siblings of the same physical core to the same type of task (real-time or housekeeping).

    Note that the example configuration above meets these requirements.
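
    To identify the hyperthread siblings of a given host CPU, you can query sysfs. For example, for host CPU 15 in the example topology from the previous section, the siblings are CPUs 15 and 47:

    # cat /sys/devices/system/cpu/cpu15/topology/thread_siblings_list
    15,47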

  8. Save and exit the XML configuration.

Verification

  • On the host, view the configuration of the VM and verify that it has the required parameters:

    # virsh dumpxml <RHEL9-RT>

4.3. Configuring the RHEL guest operating system for real time

To optimize a RHEL 9 virtual machine (VM) environment for real-time workloads, configure the guest operating system for low-latency performance.

Prerequisites

Procedure

  1. Start the VM.
  2. Install real-time packages in the guest operating system.

    # dnf install -y kernel-rt tuned tuned-profiles-realtime tuned-profiles-nfv realtime-tests
  3. Adjust the variables of the realtime-virtual-guest TuneD profile. To do so, edit the /etc/tuned/realtime-virtual-guest-variables.conf file and add the following lines:

    isolated_cores=<isolated-core-nrs>
    isolate_managed_irq=Y

    Replace <isolated-core-nrs> with the numbers of the guest cores (vCPUs) that you want to isolate for real-time workloads.
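
    For example, for the 4-vCPU guest described earlier, where vCPU 0 handles housekeeping and vCPUs 1 - 3 run real-time workloads, the file could contain the following values (the core numbers are an assumption based on that layout):

    isolated_cores=1-3
    isolate_managed_irq=Y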

  4. Ensure that irqbalance is disabled in the guest operating system.

    # rpm -q irqbalance && systemctl stop irqbalance && systemctl disable irqbalance
  5. Activate the realtime-virtual-guest profile for tuned.

    # tuned-adm profile realtime-virtual-guest
  6. Ensure that the real-time kernel is used by the guest operating system by default.

    # grubby --set-default vmlinuz-5.14.0-XXX.el9.x86_64+rt
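
    You can confirm the default kernel afterwards, for example:

    # grubby --default-kernel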
  7. Configure huge pages for the guest operating system in the same way as in the host. For instructions, see Configuring huge pages for real-time virtualization hosts.

Verification

Troubleshooting

If the results of the stress test exceed the required latency, do the following:

  1. Perform the stress tests on the host again. If the latency results are suboptimal, adjust the host configuration of TuneD and huge pages, and re-test. For instructions, see Configuring TuneD for the real-time virtualization host and Configuring huge pages for real-time virtualization hosts.
  2. If the stress test results on the host show sufficiently low latency but on the guest they do not, use the trace-cmd utility to generate a detailed test report. For instructions, see Troubleshooting latency issues for RHEL real-time guests.

4.4. Setting up cache protection for real-time virtual machines

Eviction of cache lines might cause performance issues in real-time virtual machines (VMs). Optionally, to avoid this problem, use the User Interface for Resource Control (resctrl) feature to manage your caches and cache partitions:

  • Divide the main memory cache of the host system into partitions
  • Assign separate tasks to each partition
  • Assign vCPUs that run real-time applications to one cache partition
  • Assign vCPUs and host CPUs that run housekeeping workloads to a different cache partition

Prerequisites

  • You have created a real-time virtual machine on your host. For instructions, see Installing a RHEL real-time guest operating system.
  • Your host is using an Intel processor that supports L2 or L3 cache partitioning. To verify that this is the case:

    1. Install the intel-cmt-cat utility.

      # dnf install intel-cmt-cat
    2. Use the pqos utility to display your core cache details.

      # pqos -d
      
      [...]
         Allocation
            Cache Allocation Technology (CAT)
              L3 CAT

      This output indicates that your CPU supports L3 cache partitioning.

Procedure

The following steps assume the following pinning assignment of vCPUs to host CPUs:

<cputune>
    <vcpupin vcpu='0' cpuset='16'/>
    <vcpupin vcpu='1' cpuset='17'/>
    <vcpupin vcpu='2' cpuset='18'/>
    <vcpupin vcpu='3' cpuset='19'/>
  1. Mount the resctrl file system. This makes it possible to use the resource control capabilities of the processor.

    # mount -t resctrl resctrl /sys/fs/resctrl

    Depending on whether your system supports L2 or L3 cache partitioning, a subdirectory named L2 or L3 is created in the /sys/fs/resctrl/info directory. The following steps assume that your system supports L3 cache partitioning.
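
    To confirm that the resctrl file system is mounted, you can run, for example:

    # mount | grep resctrl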

  2. Move into the cache directory and list its content.

    # cd /sys/fs/resctrl/info/L3/; ls
    
    bit_usage  cbm_mask  min_cbm_bits  num_closids  shareable_bits  sparse_masks
  3. View the value of the cbm_mask file.

    # cat cbm_mask
    
    ffff

    This value is the cache bitmask in hexadecimal notation. ffff means that the workload can use all 16 portions of the cache that the bitmask controls.

  4. View the value of the shareable_bits file.

    # cat shareable_bits
    
    0

    This value represents the portions of the L3 cache that are shared with other agents on the system, such as I/O, and that therefore should not be used in exclusive cache partitions. 0 means that all portions of the L3 cache are available.

  5. View the schemata file to see the global cache allocation.

    # cat /sys/fs/resctrl/schemata
    
    L3:0=ffff;2=ffff;4=ffff;6=ffff;8=ffff;10=ffff;12=ffff;14=ffff

    This output indicates that L3 cache domains 0, 2, 4, 6, 8, 10, 12, and 14 are fully allocated (ffff) to the default control group. In this example, host CPUs 16 - 19, which are pinned to vCPUs 0 - 3, belong to L3 cache domain 8.

  6. Determine the cache distribution that you want to set for real-time applications. Each bit of the 16-bit mask represents 1/16 of the L3 cache, so with a 16 MB cache each bit corresponds to 1 MB. For example, for an even split of 8 MB each for real-time applications and housekeeping applications:

    • The cache bitmask for real-time applications is ff00
    • The cache bitmask for housekeeping applications is 00ff
  7. Adjust the default schemata file with your required cache allocation for housekeeping processes. For example, to assign 8 MB of L3 cache domain 8 to housekeeping processes, do the following:

    # echo "L3:0=ffff;2=ffff;4=ffff;6=ffff;8=00ff;10=ffff;12=ffff;14=ffff" > /sys/fs/resctrl/schemata
  8. Create a specific control group for real-time processes, for example part1.

    # mkdir /sys/fs/resctrl/part1
  9. Set the schemata file of the part1 control group to a cache allocation that does not conflict with the housekeeping cache allocation.

    # echo "L3:0=ffff;2=ffff;4=ffff;6=ffff;8=ff00;10=ffff;12=ffff;14=ffff" > /sys/fs/resctrl/part1/schemata

    With this setting, L3 cache domain 8 dedicates half of its capacity to real-time processes and the other half to housekeeping processes. All other L3 cache domains remain freely usable by both real-time and housekeeping processes.

  10. Assign the CPUs pinned to real-time vCPUs (in this case 17, 18, and 19) to this control group.

    # echo 17,18,19 > /sys/fs/resctrl/part1/cpus_list
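
    You can verify the CPU assignment of the control group, for example:

    # cat /sys/fs/resctrl/part1/cpus_list
    17-19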

Verification

  • If you previously tested the latency of the VM, run the cyclictest utility again. For instructions, see Stress testing the real-time virtualization system.

    If the maximum latency is lower than previously, you have set up cache protection correctly.

4.5. Troubleshooting RHEL real-time guest installation

While installing a RHEL 9 virtual machine (VM) on your real-time host, you might encounter one of the following errors. Use the following recommendations to fix or work around these issues.

  • Error: Host doesn’t support any virtualization options

    • Ensure that virtualization is enabled in the host BIOS
    • Check that the CPU flags on your host contain vmx for Intel, or svm for AMD.

      $ cat /proc/cpuinfo | grep vmx
    • Check that the lsmod command detects the kvm and kvm_intel or kvm_amd modules.

      $ lsmod | grep kvm
    • Make sure the kernel-rt-kvm package is installed.

      $ dnf info kernel-rt-kvm
    • Check that the /dev/kvm device exists (see the example after this list).
    • Run the virt-host-validate utility to detect any further issues.
    • Run kvm-unit-tests.
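
      For example, to check for the /dev/kvm device and validate the host virtualization setup:

      $ ls -l /dev/kvm
      $ virt-host-validate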
  • Permission-related issues while accessing the disk image

    • In the /etc/libvirt/qemu.conf file, uncomment the group = and user = lines.
    • Restart the virtqemud service.

      $ service virtqemud restart

4.6. Troubleshooting latency issues for RHEL real-time guests

To troubleshoot the latency of a RHEL real-time guest operating system, generate kernel trace files for the guest by running a trace-cmd test. Afterwards, you can use the output of this test to request support from Red Hat.

Prerequisites

Procedure

  1. Configure the VM that you want to test to use a VSOCK interface:

    1. On the host, use the virsh edit command to open the XML configuration of the VM.
    2. Add the following lines to the <devices> section of the configuration.

      <vsock model="virtio">
        <cid auto="no" address="3"/>
      </vsock>

      If performing the test on multiple VMs, set a different address value for each one.

    3. Save and exit the configuration.
  2. To ensure that tracing works properly, run the following short test on the host. This helps prevent the guest from becoming unresponsive later because of a vCPU running at SCHED_FIFO priority.

    1. Use the following command on the host:

      # trace-cmd record -m 1000 -e kvm_exit
    2. After roughly 10 seconds, press Ctrl-C to abort the test.
    3. Make sure that the test created a trace.dat file in the same directory, and that the file is not empty.
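
      For example:

      # ls -lh trace.dat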
  3. Start the VM.
  4. On the host, configure password-less SSH access to the VM:

    1. Obtain the IP address of the VM.

      # virsh domifaddr <vm-name>
      
       Name       MAC address          Protocol     Address
      -------------------------------------------------------------------
       vnet0      52:54:00:6b:29:9f    ipv4         192.168.122.145/24
    2. Copy your SSH public key to the VM.

      # ssh-copy-id root@<vm-ip>
    3. Optional: Test password-less login to the VM:

      # ssh root@<vm-ip>

      If this logs you in to the guest as root without prompting for a password, you have configured password-less SSH successfully.

  5. In the guest operating system, start the trace-cmd agent in the background.

    # trace-cmd agent &

    The agent will run continuously on the guest and wait for commands from the host.

  6. On the host, run the trace-cmd record command, and adjust the parameters to match the vCPU pinning configuration of the VM:

    # trace-cmd record --poll -m 1000 \
      -e <host-trace-points> \
      -f "cpu==<host_cpu_num> || cpu==<host_cpu_num> || cpu==<host_cpu_num>" \
      -A <guest-vsock-port> \
      -e <guest-trace-points> \
      -f "cpu==<guest_cpu_num> || cpu==<guest_cpu_num> || cpu==<guest_cpu_num>" \
      ssh root@<vm-ip> "<latency-test-command>"

    Replace the values in the command as follows:

    • <host-trace-points> with trace events you want to trace on the host, such as sched_switch or reschedule_entry.
    • <host_cpu_num> with each host CPU that has vCPU pins configured.
    • <guest-vsock-port> with the VSOCK port that you configured in the previous step.
    • <guest-trace-points> with trace events you want to trace on the guest. Optimally, these are the same as <host-trace-points>.
    • <vm-ip> with the IP address of the VM.
    • <latency-test-command> with a specific stress test command that you want to perform, as detailed in Stress testing the real-time virtualization system. When using the cyclictest procedure, also add a --tracemark parameter.

    If you want to test multiple VMs at the same time, adjust the command as follows:

    # trace-cmd record -m 1000 \
      -e <host-trace-points> \
      -A <guest1-vsock-port> -e <guest-trace-points> \
      -A <guest2-vsock-port> -e <guest-trace-points> \
      bash -c "ssh root@<vm1-ip> \
      \"<latency-test-command>\" \
      & ssh root@<vm2-ip> \
      \"<latency-test-command>\""

    For example:

    # trace-cmd record --poll -m 1000 \
      -e sched_switch -e hrtimer_start -e hrtimer_expire_entry -e hrtimer_expire_exit -e irq_handler_entry -e local_timer_entry -e local_timer_exit -e reschedule_entry -e reschedule_exit -e call_function_entry -e call_function_exit -e call_function_single_entry -e call_function_single_exit -e irq_work_entry -e irq_work_exit -e tick_stop -e ipi_send_cpumask -e kvm_exit -e kvm_entry -e ipi_send_cpu -e csd_queue_cpu -e csd_function_entry -e csd_function_exit \
      -f "cpu==3 || cpu==5 || cpu==7 || cpu==9 || cpu==11 || cpu==13 || cpu==15 || cpu==17" \
      -A 3 \
      -e sched_switch -e hrtimer_start -e hrtimer_expire_entry -e hrtimer_expire_exit -e irq_handler_entry -e local_timer_entry -e local_timer_exit -e reschedule_entry -e reschedule_exit -e call_function_entry -e call_function_exit -e call_function_single_entry -e call_function_single_exit -e irq_work_entry -e irq_work_exit -e tick_stop -e ipi_send_cpumask -e kvm_exit -e kvm_entry -e ipi_send_cpu -e csd_queue_cpu -e csd_function_entry -e csd_function_exit \
      -f "cpu==2 || cpu==3 || cpu==4 || cpu==5 || cpu==6 || cpu==7 || cpu==8 || cpu==9" \
      ssh root@192.168.122.10 "cyclictest -m -q -p95 --policy=fifo -D 10min -h60 -t 8 -a 2,3,4,5,6,7,8,9 -i 200 -b 50 --mainaffinity 0,1 --tracemark"

    This command does the following:

    • Tests a single VM with IP 192.168.122.10 and VSOCK port 3.
    • Uses the typical trace points relevant for real-time deployments.
    • Traces host CPUs 3, 5, 7, 9, 11, 13, 15, and 17, to which vCPUs 2 - 9 are NUMA-pinned.
    • Triggers a cyclictest for 10 minutes.

    When the test finishes, the trace data is saved on the host as a trace.dat file and a trace-<guest-vsock-port>.dat file.

  7. Generate a human-readable report from the trace-cmd data. For example, if the VSOCK port of the VM is 3:

    # trace-cmd report -i trace.dat -i trace-3.dat > kvm_rt_latency.log

    If you performed the test on multiple VMs, create a report for each one:

    # trace-cmd report -i trace.dat -i trace-3.dat > kvm_rt_latency_1.log
    # trace-cmd report -i trace.dat -i trace-4.dat > kvm_rt_latency_2.log
  8. After you finish testing, stop the trace-cmd agent in the guest operating system:

    # pkill trace-cmd
  9. Open a support case with Red Hat and attach the trace logs obtained in the previous steps.

Troubleshooting

If tracing fails, do the following and try again:

  • Verify that vsock is properly configured in the guest (see the example checks after this list).
  • Ensure the trace-cmd agent is running in the guest.
  • Ensure that the host has sufficient disk space for trace files. Optimally, have at least 2 GB of free disk space for each minute you run the trace.
  • Verify that the vCPU pinning configuration on the VM matches the parameters you used for trace-cmd. To view the relevant part of the VM configuration, use virsh vcpupin <vm-name>.
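
For example, a minimal check in the guest for the first two items (the exact module names can vary depending on the vsock transport in use):

  # lsmod | grep vsock
  # pgrep -a trace-cmd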