Chapter 5. Configuring SR-IOV and DPDK interfaces on the same compute node
This section describes how to deploy SR-IOV and DPDK interfaces on the same Compute node.
This guide provides examples for CPU assignments, memory allocation, and NIC configurations that might differ from your topology and use case. See the Network Functions Virtualization Product Guide and the Network Functions Virtualization Planning Guide to understand the hardware and configuration options.
The process to create and deploy SR-IOV and DPDK interfaces on the same Compute node includes:
- Set the parameters for the SR-IOV role and OVS-DPDK in the network-environment.yaml file.
- Configure the compute.yaml file with an SR-IOV interface and a DPDK interface.
- Deploy the overcloud with this updated set of roles.
- Create the appropriate OpenStack flavor, networks, and ports to support these interface types.
We recommend the following network settings:
- Use floating IP addresses for the guest instances.
- Create a router and attach it to the DPDK VXLAN network (the management network).
- Use SR-IOV for the provider network.
- Boot the guest instance with two ports attached. We recommend that you use cloud-init for the guest instance to set the default route for the management network (a minimal sketch follows below).
- Add the floating IP address to the booted guest instance.
If needed, use SR-IOV bonding for the guest instance and ensure both SR-IOV interfaces exist on the same NUMA node for optimum performance.
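For example, a minimal cloud-init user-data script (a sketch only; the gateway address 192.168.100.1 and interface eth1 are placeholders for your management network) can set the default route when the instance boots:

#!/bin/bash
# Hypothetical user-data script: replace the gateway and interface with the
# values for your DPDK VXLAN (management) network.
ip route replace default via 192.168.100.1 dev eth1

Pass the script to the instance with the --user-data option of the openstack server create command.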
You must install and configure the undercloud before you can deploy the compute node in the overcloud. See the Director Installation and Usage Guide for details.
Ensure that you create an OpenStack flavor that matches this custom role.
5.1. Modifying the first-boot.yaml file
Modify the first-boot.yaml file to set up OVS and DPDK parameters and to configure tuned for CPU affinity.
If you have included the following lines in the first-boot.yaml file in a previous deployment, remove these lines for Red Hat OpenStack Platform 10 with Open vSwitch 2.9.
ovs_service_path="/usr/lib/systemd/system/ovs-vswitchd.service"
grep -q "RuntimeDirectoryMode=.*" $ovs_service_path
if [ "$?" -eq 0 ]; then
    sed -i 's/RuntimeDirectoryMode=.*/RuntimeDirectoryMode=0775/' $ovs_service_path
else
    echo "RuntimeDirectoryMode=0775" >> $ovs_service_path
fi
grep -Fxq "Group=qemu" $ovs_service_path
if [ ! "$?" -eq 0 ]; then
    echo "Group=qemu" >> $ovs_service_path
fi
grep -Fxq "UMask=0002" $ovs_service_path
if [ ! "$?" -eq 0 ]; then
    echo "UMask=0002" >> $ovs_service_path
fi
ovs_ctl_path='/usr/share/openvswitch/scripts/ovs-ctl'
grep -q "umask 0002 \&\& start_daemon \"\$OVS_VSWITCHD_PRIORITY\"" $ovs_ctl_path
if [ ! "$?" -eq 0 ]; then
    sed -i 's/start_daemon \"\$OVS_VSWITCHD_PRIORITY.*/umask 0002 \&\& start_daemon \"$OVS_VSWITCHD_PRIORITY\" \"$OVS_VSWITCHD_WRAPPER\" \"$@\"/' $ovs_ctl_path
fi
Add additional resources.
resources:
  userdata:
    type: OS::Heat::MultipartMime
    properties:
      parts:
      - config: {get_resource: set_dpdk_params}
      - config: {get_resource: install_tuned}
      - config: {get_resource: compute_kernel_args}
Set the DPDK parameters.
set_dpdk_params:
  type: OS::Heat::SoftwareConfig
  properties:
    config:
      str_replace:
        template: |
          #!/bin/bash
          set -x
          get_mask()
          {
            local list=$1
            local mask=0
            declare -a bm
            max_idx=0
            for core in $(echo $list | sed 's/,/ /g')
            do
              index=$(($core/32))
              bm[$index]=0
              if [ $max_idx -lt $index ]; then
                max_idx=$(($index))
              fi
            done
            for ((i=$max_idx;i>=0;i--));
            do
              bm[$i]=0
            done
            for core in $(echo $list | sed 's/,/ /g')
            do
              index=$(($core/32))
              temp=$((1<<$(($core % 32))))
              bm[$index]=$((${bm[$index]} | $temp))
            done
            printf -v mask "%x" "${bm[$max_idx]}"
            for ((i=$max_idx-1;i>=0;i--));
            do
              printf -v hex "%08x" "${bm[$i]}"
              mask+=$hex
            done
            printf "%s" "$mask"
          }
          FORMAT=$COMPUTE_HOSTNAME_FORMAT
          if [[ -z $FORMAT ]] ; then
            FORMAT="compute" ;
          else
            # Assumption: only %index% and %stackname% are the variables in Host name format
            FORMAT=$(echo $FORMAT | sed 's/\%index\%//g' | sed 's/\%stackname\%//g') ;
          fi
          if [[ $(hostname) == *$FORMAT* ]] ; then
            # 42477 is the kolla hugetlbfs gid value.
            getent group hugetlbfs >/dev/null || \
              groupadd hugetlbfs -g 42477 && groupmod -g 42477 hugetlbfs
            pmd_cpu_mask=$( get_mask $PMD_CORES )
            host_cpu_mask=$( get_mask $LCORE_LIST )
            socket_mem=$(echo $SOCKET_MEMORY | sed s/\'//g )
            ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-init=true
            ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-socket-mem=$socket_mem
            ovs-vsctl --no-wait set Open_vSwitch . other_config:pmd-cpu-mask=$pmd_cpu_mask
            ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-lcore-mask=$host_cpu_mask
          fi
        params:
          $COMPUTE_HOSTNAME_FORMAT: {get_param: ComputeHostnameFormat}
          $LCORE_LIST: {get_param: HostCpusList}
          $PMD_CORES: {get_param: NeutronDpdkCoreList}
          $SOCKET_MEMORY: {get_param: NeutronDpdkSocketMemory}
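As a worked example of get_mask, a NeutronDpdkCoreList of 2,22,3,23 sets bits 2, 3, 22, and 23, so the resulting pmd-cpu-mask is c0000c. After the node has booted, one way to confirm that these values were applied is to query the OVS database on the Compute node:

# Suggested check (not part of the template): list the DPDK settings written by set_dpdk_params.
ovs-vsctl get Open_vSwitch . other_config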
Set the tuned configuration to provide CPU affinity.

install_tuned:
  type: OS::Heat::SoftwareConfig
  properties:
    config:
      str_replace:
        template: |
          #!/bin/bash
          FORMAT=$COMPUTE_HOSTNAME_FORMAT
          if [[ -z $FORMAT ]] ; then
            FORMAT="compute" ;
          else
            # Assumption: only %index% and %stackname% are the variables in Host name format
            FORMAT=$(echo $FORMAT | sed 's/\%index\%//g' | sed 's/\%stackname\%//g') ;
          fi
          if [[ $(hostname) == *$FORMAT* ]] ; then
            # Install the tuned package
            yum install -y tuned-profiles-cpu-partitioning
            tuned_conf_path="/etc/tuned/cpu-partitioning-variables.conf"
            if [ -n "$TUNED_CORES" ]; then
              grep -q "^isolated_cores" $tuned_conf_path
              if [ "$?" -eq 0 ]; then
                sed -i 's/^isolated_cores=.*/isolated_cores=$TUNED_CORES/' $tuned_conf_path
              else
                echo "isolated_cores=$TUNED_CORES" >> $tuned_conf_path
              fi
              tuned-adm profile cpu-partitioning
            fi
          fi
        params:
          $COMPUTE_HOSTNAME_FORMAT: {get_param: ComputeHostnameFormat}
          $TUNED_CORES: {get_param: HostIsolatedCoreList}
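After first boot, a quick check (not part of the template) confirms that the profile is active and the isolated cores were written:

# Suggested check on the Compute node.
tuned-adm active
grep isolated_cores /etc/tuned/cpu-partitioning-variables.conf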
Set the kernel arguments.
compute_kernel_args:
  type: OS::Heat::SoftwareConfig
  properties:
    config:
      str_replace:
        template: |
          #!/bin/bash
          FORMAT=$COMPUTE_HOSTNAME_FORMAT
          if [[ -z $FORMAT ]] ; then
            FORMAT="compute" ;
          else
            # Assumption: only %index% and %stackname% are the variables in Host name format
            FORMAT=$(echo $FORMAT | sed 's/\%index\%//g' | sed 's/\%stackname\%//g') ;
          fi
          if [[ $(hostname) == *$FORMAT* ]] ; then
            sed 's/^\(GRUB_CMDLINE_LINUX=".*\)"/\1 $KERNEL_ARGS isolcpus=$TUNED_CORES"/g' -i /etc/default/grub ;
            grub2-mkconfig -o /etc/grub2.cfg
            reboot
          fi
        params:
          $KERNEL_ARGS: {get_param: ComputeKernelArgs}
          $COMPUTE_HOSTNAME_FORMAT: {get_param: ComputeHostnameFormat}
          $TUNED_CORES: {get_param: HostIsolatedCoreList}
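After the node reboots, you can confirm that the kernel arguments and the 1 GB huge pages took effect, for example:

# Suggested check on the Compute node.
cat /proc/cmdline
cat /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages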
5.2. Configuring openvswitch for security groups (Technology Preview)
Dataplane interfaces require a high degree of performance from a stateful firewall. To protect these interfaces, consider deploying a telco-grade firewall as a virtual network function (VNF).
Control plane interfaces can be protected by setting the NeutronOVSFirewallDriver parameter to openvswitch. This configures OpenStack Networking to use the flow-based OVS firewall driver. Set this parameter in the network-environment.yaml file under parameter_defaults.
Example:
parameter_defaults:
  NeutronOVSFirewallDriver: openvswitch
The openvswitch firewall driver is a Technology Preview and should be used only in testing environments. The only supported value for the NeutronOVSFirewallDriver parameter is noop.
When the OVS firewall driver is used, you must disable it for dataplane interfaces. You can do this with the openstack port set command.
Example:
openstack port set --no-security-group --disable-port-security ${PORT}
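You can then confirm that port security is disabled on the dataplane port; in the output, port_security_enabled should be False and no security groups should be listed:

openstack port show ${PORT}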
5.3. Defining the SR-IOV and OVS-DPDK parameters
Modify the network-environment.yaml file to configure SR-IOV and OVS-DPDK role-specific parameters:
Add the resource mapping for the OVS-DPDK and SR-IOV services to the network-environment.yaml file, along with the network configuration for these nodes:

resource_registry:
  # Specify the relative/absolute path to the config files you want to use to override the default.
  OS::TripleO::Compute::Net::SoftwareConfig: nic-configs/compute.yaml
  OS::TripleO::Controller::Net::SoftwareConfig: nic-configs/controller.yaml
  OS::TripleO::NodeUserData: first-boot.yaml
Define the flavors:
OvercloudControlFlavor: controller
OvercloudComputeFlavor: compute
Define the tunnel type:
# The tunnel type for the tenant network (vxlan or gre). Set to '' to disable tunneling.
NeutronTunnelTypes: 'vxlan'
# The tenant network type for Neutron (vlan or vxlan).
NeutronNetworkType: 'vlan'
Configure the parameters for SR-IOV. You can obtain the PCI vendor and device values, as seen in the NeutronSupportedPCIVendorDevs parameter, by running lspci -nv (an example command follows this block).

Note: The openvswitch firewall driver, as seen in the following example, is a Technology Preview and should be used for control plane interfaces only. The only supported value for the NeutronOVSFirewallDriver parameter is noop. See Configuring openvswitch for security groups for details.

NeutronSupportedPCIVendorDevs: ['8086:154d', '8086:10ed']

NovaPCIPassthrough:
  - devname: "p5p2"
    physical_network: "tenant"

NeutronPhysicalDevMappings: "tenant:p5p2"
NeutronSriovNumVFs: "p5p2:5"

# Global MTU.
NeutronGlobalPhysnetMtu: 9000

# Configure the classname of the firewall driver to use for implementing security groups.
NeutronOVSFirewallDriver: openvswitch
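For example, one way to list the PCI vendor and device IDs of the candidate NICs is:

lspci -nn | grep -i ethernet

The vendor and device IDs appear in square brackets, for example [8086:154d].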
Configure the parameters for OVS-DPDK:
########################
# OVS DPDK configuration
## NeutronDpdkCoreList and NeutronDpdkMemoryChannels are REQUIRED settings.
## Attempting to deploy DPDK without appropriate values will cause deployment to fail or lead to unstable deployments.
# List of cores to be used for DPDK Poll Mode Driver
NeutronDpdkCoreList: "'2,22,3,23'"
# Number of memory channels to be used for DPDK
NeutronDpdkMemoryChannels: "4"
# NeutronDpdkSocketMemory
NeutronDpdkSocketMemory: "'3072,1024'"
# NeutronDpdkDriverType
NeutronDpdkDriverType: "vfio-pci"
# The vhost-user socket directory for OVS
NeutronVhostuserSocketDir: "/var/lib/vhost_sockets"

########################
# Additional settings
########################
# Reserved RAM for host processes
NovaReservedHostMemory: 4096
# A list or range of physical CPU cores to reserve for virtual machine processes.
# Example: NovaVcpuPinSet: ['4-12','^8'] will reserve cores from 4-12 excluding 8
NovaVcpuPinSet: "4-19,24-39"
# An array of filters used by Nova to filter a node. These filters will be applied in the order they are listed,
# so place your most restrictive filters first to make the filtering process more efficient.
NovaSchedulerDefaultFilters:
- "RetryFilter"
- "AvailabilityZoneFilter"
- "RamFilter"
- "ComputeFilter"
- "ComputeCapabilitiesFilter"
- "ImagePropertiesFilter"
- "ServerGroupAntiAffinityFilter"
- "ServerGroupAffinityFilter"
- "PciPassthroughFilter"
- "NUMATopologyFilter"
- "AggregateInstanceExtraSpecsFilter"
# Kernel arguments for Compute node
ComputeKernelArgs: "default_hugepagesz=1GB hugepagesz=1G hugepages=32 iommu=pt intel_iommu=on"
# A list or range of physical CPU cores to be tuned.
# The given args will be appended to the tuned cpu-partitioning profile.
HostIsolatedCoreList: "2-19,22-39"
# List of logical cores to be used by ovs-dpdk processes (dpdk-lcore-mask)
HostCpusList: "'2-19,22-39'"
Note: You must assign at least one CPU (with its sibling thread) on each NUMA node to the DPDK PMD, whether or not DPDK NICs are present on that node, to avoid failures when creating guest instances.
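To choose suitable cores, you can inspect the CPU-to-NUMA-node mapping and the sibling threads on the Compute node, for example:

# Show each logical CPU with its NUMA node, socket, and core.
lscpu --extended=CPU,NODE,SOCKET,CORE
# Show the sibling (hyper-thread) of a given CPU, for example CPU 2.
cat /sys/devices/system/cpu/cpu2/topology/thread_siblings_list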
Configure the remainder of the network-environment.yaml file to override the default parameters from the neutron-ovs-dpdk-agent.yaml and neutron-sriov-agent.yaml files as needed for your OpenStack deployment.
See the Network Functions Virtualization Planning Guide for details on how to determine the best values for the OVS-DPDK parameters that you set in the network-environment.yaml file to optimize your OpenStack network for OVS-DPDK.
5.4. Configuring the Compute node for SR-IOV and DPDK interfaces
This example uses the sample compute.yaml file to support SR-IOV and DPDK interfaces.
Create the control plane Linux bond for an isolated network:
- type: linux_bond
  name: bond_api
  bonding_options: "mode=active-backup"
  use_dhcp: false
  dns_servers: {get_param: DnsServers}
  members:
  - type: interface
    name: nic3
    primary: true
  - type: interface
    name: nic4
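After deployment, you can confirm the bond state on the Compute node (assuming the bond name bond_api from this example):

cat /proc/net/bonding/bond_api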
Assign VLANs to this Linux bond:
- type: vlan
  vlan_id: {get_param: InternalApiNetworkVlanID}
  device: bond_api
  addresses:
  - ip_netmask: {get_param: InternalApiIpSubnet}
- type: vlan
  vlan_id: {get_param: StorageNetworkVlanID}
  device: bond_api
  addresses:
  - ip_netmask: {get_param: StorageIpSubnet}
Set a bridge with a DPDK port to link to the controller:
- type: ovs_user_bridge
  name: br-link0
  ovs_extra:
  - str_replace:
      template: set port br-link0 tag=_VLAN_TAG_
      params:
        _VLAN_TAG_: {get_param: TenantNetworkVlanID}
  addresses:
  - ip_netmask: {get_param: TenantIpSubnet}
  use_dhcp: false
  members:
  - type: ovs_dpdk_port
    name: dpdk0
    mtu: 9000
    ovs_extra:
    - set interface $DEVICE mtu_request=$MTU
    - set interface $DEVICE options:n_rxq=2
    members:
    - type: interface
      name: nic7
      primary: true
Note: To include multiple DPDK devices, repeat the type code section for each DPDK device you want to add.

Note: When using OVS-DPDK, all bridges on the same Compute node should be of type ovs_user_bridge. The director may accept the configuration, but Red Hat OpenStack Platform does not support mixing ovs_bridge and ovs_user_bridge on the same node.

Create the SR-IOV interface to the Controller:
- type: interface
  name: p7p2
  mtu: 9000
  use_dhcp: false
  defroute: false
  nm_controlled: true
  hotplug: true
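After deployment, you can confirm that the requested number of VFs was created on the SR-IOV device (p5p2 in the network-environment.yaml example):

cat /sys/class/net/p5p2/device/sriov_numvfs
ip link show p5p2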
5.5. Deploying the overcloud
The following example defines the overcloud_deploy.sh Bash script that deploys both OVS-DPDK and SR-IOV:
#!/bin/bash

openstack overcloud deploy \
  --templates \
  -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/neutron-ovs-dpdk.yaml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/neutron-sriov.yaml \
  -e /home/stack/ospd-10-vxlan-vlan-dpdk-sriov-ctlplane-bonding/network-environment.yaml
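A typical invocation (assuming the default undercloud credentials file) runs the script as the stack user on the undercloud:

source ~/stackrc
bash overcloud_deploy.sh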
5.6. Creating a flavor and deploying an instance with SR-IOV and DPDK interfaces
With a successful deployment, you can now begin populating the overcloud. Start by sourcing the newly created overcloudrc file in the /home/stack directory. Then, create a flavor and deploy an instance.
Create a flavor:
# source overcloudrc
# openstack flavor create --vcpus 6 --ram 4096 --disk 40 compute
Where:
- compute is the flavor name.
- 4096 is the memory size in MB.
- 40 is the disk size in GB (default 0 GB).
- 6 is the number of vCPUs.
Set the flavor for large pages:
# openstack flavor set compute --property hw:mem_page_size=1GB
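You can confirm that the property was applied, for example:

# openstack flavor show compute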
Create the external network:
# openstack network create --share --external \
  --provider-physical-network <net-mgmt-physnet> \
  --provider-network-type <flat|vlan> external
Create the networks for SR-IOV and DPDK:
# openstack network create net-dpdk
# openstack network create net-sriov
# openstack subnet create --subnet-range <cidr/prefix> --network net-dpdk net-dpdk-subnet
# openstack subnet create --subnet-range <cidr/prefix> --network net-sriov net-sriov-subnet
Create the SR-IOV port.
Use vnic-type direct to create an SR-IOV VF port:

# openstack port create --network net-sriov --vnic-type direct sriov_port
Use vnic-type direct-physical to create an SR-IOV PF port:

# openstack port create --network net-sriov --vnic-type direct-physical sriov_port
Create a router and attach it to the DPDK VXLAN network:

# openstack router create router1
# openstack router add subnet router1 net-dpdk-subnet
Create a floating IP address to associate with the guest instance port (the association step is shown after the instance is deployed below):
# openstack floating ip create --floating-ip-address FLOATING-IP external
Deploy an instance:
# openstack server create --flavor compute --image rhel_7.3 --nic port-id=sriov_port --nic net-id=NET_DPDK_ID vm1
Where:
- compute is the flavor name or ID.
- rhel_7.3 is the image (name or ID) used to create the instance.
- sriov_port is the name of the port created in the previous step.
- NET_DPDK_ID is the DPDK network ID.
- vm1 is the name of the instance.
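To attach the floating IP address created earlier, one approach is to associate it with the instance (this assumes the vm1 instance and the FLOATING-IP address from the previous steps):

# openstack server add floating ip vm1 FLOATING-IP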
You have now deployed an instance that uses an SR-IOV interface and a DPDK interface on the same Compute node.
For instances with more interfaces, you can use cloud-init. See Table 3.1 in Create an Instance for details.