Chapter 12. Enabling RT-KVM for NFV Workloads
To facilitate installing and configuring Red Hat Enterprise Linux Real Time KVM (RT-KVM), Red Hat OpenStack Platform provides the following features:
- A real-time Compute node role that provisions Red Hat Enterprise Linux for real-time.
- The additional RT-KVM kernel module.
- Automatic configuration of the Compute node.
12.1. Planning for your RT-KVM Compute nodes
When planning for RT-KVM Compute nodes, ensure that the following tasks are completed:
- Use Red Hat certified servers for your RT-KVM Compute nodes. For more information, see Red Hat Enterprise Linux for Real Time certified servers.
- Register your undercloud and attach a valid Red Hat OpenStack Platform subscription. For more information, see Registering the undercloud and attaching subscriptions in Installing and managing Red Hat OpenStack Platform with director.
- Enable the repositories that are required for the undercloud, such as the rhel-9-for-x86_64-nfv-rpms repository for RT-KVM, and update the system packages to the latest versions.
Note: You need a separate subscription to a Red Hat OpenStack Platform for Real Time SKU before you can access this repository. For more information, see Enabling repositories for the undercloud in Installing and managing Red Hat OpenStack Platform with director.
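For example, on a registered undercloud host you might enable the RT-KVM repository and update the system packages as follows. This is a minimal sketch; it assumes the host is already registered and that the rest of the required undercloud repositories are already enabled for your subscription:
$ sudo subscription-manager repos --enable=rhel-9-for-x86_64-nfv-rpms
$ sudo dnf -y update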
Building the real-time image
Install the libguestfs-tools package on the undercloud to get the virt-customize tool:
(undercloud) [stack@undercloud-0 ~]$ sudo dnf install libguestfs-tools
Important: If you install the libguestfs-tools package on the undercloud, disable iscsid.socket to avoid port conflicts with the tripleo_iscsid service on the undercloud:
$ sudo systemctl disable --now iscsid.socket
Extract the images:
(undercloud) [stack@undercloud-0 ~]$ tar -xf /usr/share/rhosp-director-images/overcloud-hardened-uefi-full-17.1.x86_64.tar
(undercloud) [stack@undercloud-0 ~]$ tar -xf /usr/share/rhosp-director-images/ironic-python-agent-17.1.x86_64.tar
Copy the default image:
(undercloud) [stack@undercloud-0 ~]$ cp overcloud-hardened-uefi-full.qcow2 overcloud-realtime-compute.qcow2
Register your image to enable Red Hat repositories relevant to your customizations. Replace [username] and [password] with valid credentials in the following example:
virt-customize -a overcloud-realtime-compute.qcow2 --run-command \
  'subscription-manager register --username=[username] --password=[password]' \
  --run-command 'subscription-manager release --set 9.0'
Note: For security, you can remove credentials from the history file if they are used on the command prompt. You can delete individual lines in history using the history -d command followed by the line number; see the short example after the following commands.
Find a list of pool IDs from your account's subscriptions, and attach the appropriate pool ID to your image:
sudo subscription-manager list --all --available | less
...
virt-customize -a overcloud-realtime-compute.qcow2 --run-command \
  'subscription-manager attach --pool [pool-ID]'
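For example, to remove the entry that contains your credentials, locate its line number in the history and delete it; the line number differs on your system:
$ history | grep 'subscription-manager register'
$ history -d <line_number>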
Add the repositories necessary for Red Hat OpenStack Platform with NFV.
virt-customize -a overcloud-realtime-compute.qcow2 --run-command \
  'sudo subscription-manager repos \
  --enable=rhel-9-for-x86_64-baseos-eus-rpms \
  --enable=rhel-9-for-x86_64-appstream-eus-rpms \
  --enable=rhel-9-for-x86_64-highavailability-eus-rpms \
  --enable=ansible-2.9-for-rhel-9-x86_64-rpms \
  --enable=rhel-9-for-x86_64-nfv-rpms \
  --enable=fast-datapath-for-rhel-9-x86_64-rpms'
Create a script to configure real-time capabilities on the image.
(undercloud) [stack@undercloud-0 ~]$ cat <<'EOF' > rt.sh
#!/bin/bash

set -eux

dnf -v -y --setopt=protected_packages= erase kernel.$(uname -m)
dnf -v -y install kernel-rt kernel-rt-kvm tuned-profiles-nfv-host
grubby --set-default /boot/vmlinuz*rt*
EOF
Run the script to configure the real-time image:
(undercloud) [stack@undercloud-0 ~]$ virt-customize -a overcloud-realtime-compute.qcow2 -v --run rt.sh 2>&1 | tee virt-customize.log
Note: If you see the line "grubby fatal error: unable to find a suitable template" in the rt.sh script output, you can ignore this error.
Examine the virt-customize.log file that resulted from the previous command to check that the packages installed correctly using the rt.sh script:
(undercloud) [stack@undercloud-0 ~]$ cat virt-customize.log | grep Verifying
  Verifying  : kernel-3.10.0-957.el7.x86_64                       1/1
  Verifying  : 10:qemu-kvm-tools-rhev-2.12.0-18.el7_6.1.x86_64    1/8
  Verifying  : tuned-profiles-realtime-2.10.0-6.el7_6.3.noarch    2/8
  Verifying  : linux-firmware-20180911-69.git85c5d90.el7.noarch   3/8
  Verifying  : tuned-profiles-nfv-host-2.10.0-6.el7_6.3.noarch    4/8
  Verifying  : kernel-rt-kvm-3.10.0-957.10.1.rt56.921.el7.x86_64  5/8
  Verifying  : tuna-0.13-6.el7.noarch                             6/8
  Verifying  : kernel-rt-3.10.0-957.10.1.rt56.921.el7.x86_64      7/8
  Verifying  : rt-setup-2.0-6.el7.x86_64                          8/8
Relabel SELinux:
(undercloud) [stack@undercloud-0 ~]$ virt-customize -a overcloud-realtime-compute.qcow2 --selinux-relabel
Extract vmlinuz and initrd:
(undercloud) [stack@undercloud-0 ~]$ mkdir image
(undercloud) [stack@undercloud-0 ~]$ guestmount -a overcloud-realtime-compute.qcow2 -i --ro image
(undercloud) [stack@undercloud-0 ~]$ cp image/boot/vmlinuz-3.10.0-862.rt56.804.el7.x86_64 ./overcloud-realtime-compute.vmlinuz
(undercloud) [stack@undercloud-0 ~]$ cp image/boot/initramfs-3.10.0-862.rt56.804.el7.x86_64.img ./overcloud-realtime-compute.initrd
(undercloud) [stack@undercloud-0 ~]$ guestunmount image
Note: The software version in the vmlinuz and initramfs filenames varies with the kernel version.
Upload the image:
(undercloud) [stack@undercloud-0 ~]$ openstack overcloud image upload --update-existing --os-image-name overcloud-realtime-compute.qcow2
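Optionally, confirm that the real-time image is now available in the undercloud image store; the exact image names can vary by release:
(undercloud) [stack@undercloud-0 ~]$ openstack image list | grep realtime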
You now have a real-time image you can use with the ComputeOvsDpdkRT
composable role on your selected Compute nodes.
Modifying BIOS settings on RT-KVM Compute nodes
To reduce latency on your RT-KVM Compute nodes, disable all options for the following parameters in your Compute node BIOS settings:
- Power Management
- Hyper-Threading
- CPU sleep states
- Logical processors
12.2. Configuring OVS-DPDK with RT-KVM
12.2.1. Designating nodes for Real-time Compute
To designate nodes for Real-time Compute, create a new role file to configure the Real-time Compute role, and configure the bare-metal nodes with a Real-time Compute resource class to tag the Compute nodes for real-time.
The following procedure applies to new overcloud nodes that you have not yet provisioned. To assign a resource class to an existing overcloud node that has already been provisioned, scale down the overcloud to unprovision the node, then scale up the overcloud to reprovision the node with the new resource class assignment. For more information, see Scaling overcloud nodes in Installing and managing Red Hat OpenStack Platform with director.
Procedure
- Log in to the undercloud host as the stack user.
- Source the stackrc undercloud credentials file:
[stack@director ~]$ source ~/stackrc
- Based on the /usr/share/openstack-tripleo-heat-templates/environments/compute-real-time-example.yaml file, create a compute-real-time.yaml environment file that sets the parameters for the ComputeRealTime role.
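For example, you can copy the example environment file into your templates directory, then edit the role-specific parameters, such as IsolCpusList, NovaComputeCpuDedicatedSet, NovaComputeCpuSharedSet, and KernelArgs, to match your hardware. The destination path shown here matches the path used in the deployment command later in this procedure:
(undercloud) [stack@undercloud-0 ~]$ cp /usr/share/openstack-tripleo-heat-templates/environments/compute-real-time-example.yaml \
  /home/stack/templates/compute-real-time.yaml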
Generate a new roles data file named roles_data_rt.yaml that includes the ComputeRealTime role, along with any other roles that you need for the overcloud. The following example generates the roles data file roles_data_rt.yaml, which includes the roles Controller, Compute, and ComputeRealTime:
(undercloud)$ openstack overcloud roles generate \
  -o /home/stack/templates/roles_data_rt.yaml \
  ComputeRealTime Compute Controller
Update the roles_data_rt.yaml file for the ComputeRealTime role:
###################################################
# Role: ComputeRealTime                           #
###################################################
- name: ComputeRealTime
  description: |
    Real Time Compute Node role
  CountDefault: 1
  # Create external Neutron bridge
  tags:
    - compute
    - external_bridge
  networks:
    InternalApi:
      subnet: internal_api_subnet
    Tenant:
      subnet: tenant_subnet
    Storage:
      subnet: storage_subnet
  HostnameFormatDefault: '%stackname%-computert-%index%'
  deprecated_nic_config_name: compute-rt.yaml
Register the ComputeRealTime nodes for the overcloud by adding them to your node definition template: node.json or node.yaml.
For more information, see Registering nodes for the overcloud in Installing and managing Red Hat OpenStack Platform with director.
Inspect the node hardware:
(undercloud)$ openstack overcloud node introspect --all-manageable --provide
For more information, see Creating an inventory of the bare-metal node hardware in Installing and managing Red Hat OpenStack Platform with director.
Tag each bare-metal node that you want to designate for ComputeRealTime with a custom ComputeRealTime resource class:
(undercloud)$ openstack baremetal node set \
  --resource-class baremetal.RTCOMPUTE <node>
Replace <node> with the name or UUID of the bare-metal node.
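Optionally, verify the assignment; the command returns the resource class that you set for the node:
(undercloud)$ openstack baremetal node show <node> -f value -c resource_class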
Add the ComputeRealTime role to your node definition file, overcloud-baremetal-deploy.yaml, and define any predictive node placements, resource classes, network topologies, or other attributes that you want to assign to your nodes:
- name: Controller
  count: 3
  ...
- name: Compute
  count: 3
  ...
- name: ComputeRealTime
  count: 1
  defaults:
    resource_class: baremetal.RTCOMPUTE
    network_config:
      template: /home/stack/templates/nic-config/<role_topology_file>
Replace <role_topology_file> with the name of the topology file to use for the ComputeRealTime role, for example, myRoleTopology.j2. You can reuse an existing network topology or create a new custom network interface template for the role.
For more information, see Defining custom network interface templates in Installing and managing Red Hat OpenStack Platform with director. To use the default network definition settings, do not include network_config in the role definition.
For more information about the properties you can use to configure node attributes in your node definition file, see Bare-metal node provisioning attributes in Installing and managing Red Hat OpenStack Platform with director.
For an example node definition file, see Example node definition file in Installing and managing Red Hat OpenStack Platform with director.
Create the following Ansible playbook to configure the kernel during the node provisioning, and save the playbook as /home/stack/templates/fix_rt_kernel.yaml:
# RealTime KVM fix until BZ #2122949 is closed
- name: Fix RT Kernel
  hosts: allovercloud
  any_errors_fatal: true
  gather_facts: false
  vars:
    reboot_wait_timeout: 900
  pre_tasks:
    - name: Wait for provisioned nodes to boot
      wait_for_connection:
        timeout: 600
        delay: 10
  tasks:
    - name: Fix bootloader entry
      become: true
      shell: |-
        set -eux
        new_entry=$(grep saved_entry= /boot/grub2/grubenv | sed -e s/saved_entry=//)
        source /etc/default/grub
        sed -i "s/options.*/options root=$GRUB_DEVICE ro $GRUB_CMDLINE_LINUX $GRUB_CMDLINE_LINUX_DEFAULT/" /boot/loader/entries/$(</etc/machine-id)$new_entry.conf
        cp -f /boot/grub2/grubenv /boot/efi/EFI/redhat/grubenv
  post_tasks:
    - name: Configure reboot after new kernel
      become: true
      reboot:
        reboot_timeout: "{{ reboot_wait_timeout }}"
      when: reboot_wait_timeout is defined
Include /home/stack/templates/fix_rt_kernel.yaml as a playbook in the ComputeOvsDpdkSriovRT role definition in your node provisioning file:
- name: ComputeOvsDpdkSriovRT
  ...
  ansible_playbooks:
    - playbook: /usr/share/ansible/tripleo-playbooks/cli-overcloud-node-kernelargs.yaml
      extra_vars:
        kernel_args: "default_hugepagesz=1GB hugepagesz=1G hugepages=64 iommu=pt intel_iommu=on tsx=off isolcpus=2-19,22-39"
        reboot_wait_timeout: 900
        tuned_profile: "cpu-partitioning"
        tuned_isolated_cores: "2-19,22-39"
        defer_reboot: true
    - playbook: /home/stack/templates/fix_rt_kernel.yaml
      extra_vars:
        reboot_wait_timeout: 1800
Provision the new nodes for your role:
(undercloud)$ openstack overcloud node provision \
  [--stack <stack>] \
  [--network-config] \
  --output <deployment_file> \
  /home/stack/templates/overcloud-baremetal-deploy.yaml
- Optional: Replace <stack> with the name of the stack for which the bare-metal nodes are provisioned. The default is overcloud.
- Optional: Include the --network-config optional argument to provide the network definitions to the cli-overcloud-node-network-config.yaml Ansible playbook. If you do not define the network definitions by using the network_config property, then the default network definitions are used.
- Replace <deployment_file> with the name of the heat environment file to generate for inclusion in the deployment command, for example /home/stack/templates/overcloud-baremetal-deployed.yaml.
Monitor the provisioning progress in a separate terminal. When provisioning is successful, the node state changes from available to active:
(undercloud)$ watch openstack baremetal node list
If you ran the provisioning command without the --network-config option, then configure the <Role>NetworkConfigTemplate parameters in your network-environment.yaml file to point to your NIC template files:
parameter_defaults:
  ComputeNetworkConfigTemplate: /home/stack/templates/nic-configs/compute.j2
  ComputeRealTimeNetworkConfigTemplate: /home/stack/templates/nic-configs/<rt_compute>.j2
  ControllerNetworkConfigTemplate: /home/stack/templates/nic-configs/controller.j2
Replace <rt_compute> with the name of the file that contains the network topology of the ComputeRealTime role, for example, computert.yaml to use the default network topology.
Add your environment file to the stack with your other environment files and deploy the overcloud:
(undercloud)$ openstack overcloud deploy --templates \
  -r /home/stack/templates/roles_data_rt.yaml \
  -e /home/stack/templates/overcloud-baremetal-deployed.yaml \
  -e /home/stack/templates/node-info.yaml \
  -e [your environment files] \
  -e /home/stack/templates/compute-real-time.yaml
12.2.2. Configuring OVS-DPDK parameters
Under parameter_defaults, set the tunnel type to vxlan, and the network type to vxlan,vlan:
NeutronTunnelTypes: 'vxlan'
NeutronNetworkType: 'vxlan,vlan'
Under parameter_defaults, set the bridge mapping:
# The OVS logical->physical bridge mappings to use.
NeutronBridgeMappings:
  - dpdk-mgmt:br-link0
Under parameter_defaults, set the role-specific parameters for the ComputeOvsDpdkSriov role:
##########################
# OVS DPDK configuration #
##########################
ComputeOvsDpdkSriovParameters:
  KernelArgs: "default_hugepagesz=1GB hugepagesz=1G hugepages=32 iommu=pt intel_iommu=on isolcpus=2-19,22-39"
  TunedProfileName: "cpu-partitioning"
  IsolCpusList: "2-19,22-39"
  NovaComputeCpuDedicatedSet: ['4-19,24-39']
  NovaReservedHostMemory: 4096
  OvsDpdkSocketMemory: "3072,1024"
  OvsDpdkMemoryChannels: "4"
  OvsPmdCoreList: "2,22,3,23"
  NovaComputeCpuSharedSet: [0,20,1,21]
  NovaLibvirtRxQueueSize: 1024
  NovaLibvirtTxQueueSize: 1024
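As a quick sanity check of the hugepage budget in this example, following the note below about OvsDpdkSocketMemory: the kernel arguments reserve 32 one-GB huge pages at boot, and OvsDpdkSocketMemory="3072,1024" dedicates 3 GB plus 1 GB of that memory to OVS-DPDK, leaving 28 one-GB pages for instances. The values here are only the ones used in this example:
# 1 GB pages left for instances = boot-time hugepages - OVS-DPDK socket memory (MB converted to GB)
$ echo $(( 32 - (3072 + 1024) / 1024 ))
28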
Note: To prevent failures during guest creation, assign at least one CPU with sibling thread on each NUMA node. In the example, the values for the OvsPmdCoreList parameter denote cores 2 and 22 from NUMA 0, and cores 3 and 23 from NUMA 1.
Note: These huge pages are consumed by the virtual machines, and also by OVS-DPDK using the OvsDpdkSocketMemory parameter as shown in this procedure. The number of huge pages available for the virtual machines is the boot parameter minus the OvsDpdkSocketMemory. You must also add hw:mem_page_size=1GB to the flavor you associate with the DPDK instance.
Note: OvsDpdkMemoryChannels is a required setting for this procedure. For optimum operation, ensure you deploy DPDK with appropriate parameters and values.
Configure the role-specific parameters for SR-IOV:
NovaPCIPassthrough:
  - vendor_id: "8086"
    product_id: "1528"
    address: "0000:06:00.0"
    trusted: "true"
    physical_network: "sriov-1"
  - vendor_id: "8086"
    product_id: "1528"
    address: "0000:06:00.1"
    trusted: "true"
    physical_network: "sriov-2"
12.3. Launching an RT-KVM instance
Perform the following steps to launch an RT-KVM instance on a real-time enabled Compute node:
Create an RT-KVM flavor on the overcloud:
$ openstack flavor create r1.small --id 99 --ram 4096 --disk 20 --vcpus 4
$ openstack flavor set --property hw:cpu_policy=dedicated 99
$ openstack flavor set --property hw:cpu_realtime=yes 99
$ openstack flavor set --property hw:mem_page_size=1GB 99
$ openstack flavor set --property hw:cpu_realtime_mask="^0-1" 99
$ openstack flavor set --property hw:cpu_emulator_threads=isolate 99
Launch an RT-KVM instance:
$ openstack server create --image <rhel> --flavor r1.small --nic net-id=<dpdk-net> test-rt
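To locate the Compute node that hosts the instance and the libvirt instance name for the verification step that follows, you can query the extended server attributes as an admin user, for example:
$ openstack server show test-rt -f value -c OS-EXT-SRV-ATTR:host -c OS-EXT-SRV-ATTR:instance_name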
To verify that the instance uses the assigned emulator threads, run the following command:
$ virsh dumpxml <instance-id> | grep vcpu -A1
<vcpu placement='static'>4</vcpu>
<cputune>
  <vcpupin vcpu='0' cpuset='1'/>
  <vcpupin vcpu='1' cpuset='3'/>
  <vcpupin vcpu='2' cpuset='5'/>
  <vcpupin vcpu='3' cpuset='7'/>
  <emulatorpin cpuset='0-1'/>
  <vcpusched vcpus='2-3' scheduler='fifo' priority='1'/>
</cputune>