
Chapter 5. Configuring SR-IOV and DPDK interfaces on the same compute node


This section describes how to deploy SR-IOV and DPDK interfaces on the same Compute node.

Note

This guide provides examples for CPU assignments, memory allocation, and NIC configurations that might differ from your topology and use case. See the Network Functions Virtualization Product Guide and the Network Functions Virtualization Planning Guide to understand the hardware and configuration options.

The process to create and deploy SR-IOV and DPDK interfaces on the same Compute node includes:

  • Setting the parameters for the SR-IOV role and OVS-DPDK in the network-environment.yaml file.
  • Configuring the compute.yaml file with an SR-IOV interface and a DPDK interface.
  • Deploying the overcloud with this updated set of roles.
  • Creating the appropriate OpenStack flavor, networks, and ports to support these interface types.

We recommend the following network settings:

  • Use floating IP addresses for the guest instances.
  • Create a router and attach it to the DPDK VXLAN network (the management network).
  • Use SR-IOV for the provider network.
  • Boot the guest instance with two ports attached. We recommend you use cloud-init for the guest instance to set the default route for the management network.
  • Add the floating IP address to the booted guest instance.
Note

If needed, use SR-IOV bonding for the guest instance and ensure both SR-IOV interfaces exist on the same NUMA node for optimum performance.
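
You can check which NUMA node a candidate SR-IOV NIC belongs to by reading its PCI device attributes on the Compute node. This is a minimal sketch; p5p1 and p5p2 are example device names that depend on your hardware.

# Print the NUMA node for each candidate SR-IOV interface.
# A value of 0 or 1 identifies the NUMA node; -1 means the information is not exposed.
cat /sys/class/net/p5p1/device/numa_node
cat /sys/class/net/p5p2/device/numa_node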

You must install and configure the undercloud before you can deploy the compute node in the overcloud. See the Director Installation and Usage Guide for details.

Note

Ensure that you create an OpenStack flavor that matches this custom role.

5.1. Modifying the first-boot.yaml file

Modify the first-boot.yaml file to set up OVS and DPDK parameters and to configure tuned for CPU affinity.

Note

If you have included the following lines in the first-boot.yaml file in a previous deployment, remove these lines for Red Hat OpenStack Platform 10 with Open vSwitch 2.9.

ovs_service_path="/usr/lib/systemd/system/ovs-vswitchd.service"
grep -q "RuntimeDirectoryMode=.*" $ovs_service_path

if [ "$?" -eq 0 ]; then
    sed -i 's/RuntimeDirectoryMode=.*/RuntimeDirectoryMode=0775/' $ovs_service_path
else
    echo "RuntimeDirectoryMode=0775" >> $ovs_service_path
fi

grep -Fxq "Group=qemu" $ovs_service_path

if [ ! "$?" -eq 0 ]; then
    echo "Group=qemu" >> $ovs_service_path
fi

grep -Fxq "UMask=0002" $ovs_service_path

if [ ! "$?" -eq 0 ]; then
    echo "UMask=0002" >> $ovs_service_path
fi

ovs_ctl_path='/usr/share/openvswitch/scripts/ovs-ctl'
grep -q "umask 0002 \&\& start_daemon \"\$OVS_VSWITCHD_PRIORITY\"" $ovs_ctl_path

if [ ! "$?" -eq 0 ]; then
    sed -i 's/start_daemon \"\$OVS_VSWITCHD_PRIORITY.*/umask 0002 \&\& start_daemon \"$OVS_VSWITCHD_PRIORITY\" \"$OVS_VSWITCHD_WRAPPER\" \"$@\"/' $ovs_ctl_path
fi
  1. Add additional resources.

      resources:
        userdata:
          type: OS::Heat::MultipartMime
          properties:
            parts:
            - config: {get_resource: set_dpdk_params}
            - config: {get_resource: install_tuned}
            - config: {get_resource: compute_kernel_args}
  2. Set the DPDK parameters.

      set_dpdk_params:
        type: OS::Heat::SoftwareConfig
        properties:
          config:
            str_replace:
              template: |
                #!/bin/bash
                set -x
                get_mask()
                {
                  local list=$1
                  local mask=0
                  declare -a bm
                  max_idx=0
                  for core in $(echo $list | sed 's/,/ /g')
                  do
                      index=$(($core/32))
                      bm[$index]=0
                      if [ $max_idx -lt $index ]; then
                         max_idx=$(($index))
                      fi
                  done
                  for ((i=$max_idx;i>=0;i--));
                  do
                      bm[$i]=0
                  done
                  for core in $(echo $list | sed 's/,/ /g')
                  do
                      index=$(($core/32))
                      temp=$((1<<$(($core % 32))))
                      bm[$index]=$((${bm[$index]} | $temp))
                  done
    
                  printf -v mask "%x" "${bm[$max_idx]}"
                  for ((i=$max_idx-1;i>=0;i--));
                  do
                      printf -v hex "%08x" "${bm[$i]}"
                      mask+=$hex
                  done
                  printf "%s" "$mask"
                }
    
                FORMAT=$COMPUTE_HOSTNAME_FORMAT
                if [[ -z $FORMAT ]] ; then
                  FORMAT="compute" ;
                else
                  # Assumption: only %index% and %stackname% are the variables in Host name format
                  FORMAT=$(echo $FORMAT | sed  's/\%index\%//g' | sed 's/\%stackname\%//g') ;
                fi
                if [[ $(hostname) == *$FORMAT* ]] ; then
                  # 42477 is the kolla hugetlbfs gid value.
                  getent group hugetlbfs >/dev/null || \
                    groupadd hugetlbfs -g 42477 && groupmod -g 42477 hugetlbfs
    
                  pmd_cpu_mask=$( get_mask $PMD_CORES )
                  host_cpu_mask=$( get_mask $LCORE_LIST )
                  socket_mem=$(echo $SOCKET_MEMORY | sed s/\'//g )
                  ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-init=true
                  ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-socket-mem=$socket_mem
                  ovs-vsctl --no-wait set Open_vSwitch . other_config:pmd-cpu-mask=$pmd_cpu_mask
                  ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-lcore-mask=$host_cpu_mask
                fi
              params:
                $COMPUTE_HOSTNAME_FORMAT: {get_param: ComputeHostnameFormat}
                $LCORE_LIST: {get_param: HostCpusList}
                $PMD_CORES: {get_param: NeutronDpdkCoreList}
                $SOCKET_MEMORY: {get_param: NeutronDpdkSocketMemory}
  3. Set the tuned configuration to provide CPU affinity.

      install_tuned:
        type: OS::Heat::SoftwareConfig
        properties:
          config:
            str_replace:
              template: |
                #!/bin/bash
                FORMAT=$COMPUTE_HOSTNAME_FORMAT
                if [[ -z $FORMAT ]] ; then
                  FORMAT="compute" ;
                else
                  # Assumption: only %index% and %stackname% are the variables in Host name format
                  FORMAT=$(echo $FORMAT | sed  's/\%index\%//g' | sed 's/\%stackname\%//g') ;
                fi
                if [[ $(hostname) == *$FORMAT* ]] ; then
                  # Install the tuned package
                  yum install -y tuned-profiles-cpu-partitioning
    
                  tuned_conf_path="/etc/tuned/cpu-partitioning-variables.conf"
                  if [ -n "$TUNED_CORES" ]; then
                    grep -q "^isolated_cores" $tuned_conf_path
                    if [ "$?" -eq 0 ]; then
                      sed -i 's/^isolated_cores=.*/isolated_cores=$TUNED_CORES/' $tuned_conf_path
                    else
                      echo "isolated_cores=$TUNED_CORES" >> $tuned_conf_path
                    fi
                    tuned-adm profile cpu-partitioning
                  fi
                fi
              params:
                $COMPUTE_HOSTNAME_FORMAT: {get_param: ComputeHostnameFormat}
                $TUNED_CORES: {get_param: HostIsolatedCoreList}
  4. Set the kernel arguments.

      compute_kernel_args:
        type: OS::Heat::SoftwareConfig
        properties:
          config:
            str_replace:
              template: |
                #!/bin/bash
                FORMAT=$COMPUTE_HOSTNAME_FORMAT
                if [[ -z $FORMAT ]] ; then
                  FORMAT="compute" ;
                else
                  # Assumption: only %index% and %stackname% are the variables in Host name format
                  FORMAT=$(echo $FORMAT | sed  's/\%index\%//g' | sed 's/\%stackname\%//g') ;
                fi
                if [[ $(hostname) == *$FORMAT* ]] ; then
                  sed 's/^\(GRUB_CMDLINE_LINUX=".*\)"/\1 $KERNEL_ARGS isolcpus=$TUNED_CORES"/g' -i /etc/default/grub ;
                  grub2-mkconfig -o /etc/grub2.cfg
                  reboot
                fi
              params:
                $KERNEL_ARGS: {get_param: ComputeKernelArgs}
                $COMPUTE_HOSTNAME_FORMAT: {get_param: ComputeHostnameFormat}
                $TUNED_CORES: {get_param: HostIsolatedCoreList}
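
After the Compute node boots with these first-boot scripts applied, you can spot-check the results on the node. For example, with the NeutronDpdkCoreList value of 2,22,3,23 used later in this chapter, the get_mask function sets bits 2, 3, 22, and 23, so the resulting pmd-cpu-mask is c0000c. The following commands are a minimal verification sketch, not part of the template:

# Confirm the OVS DPDK settings written by set_dpdk_params.
ovs-vsctl get Open_vSwitch . other_config

# Confirm the tuned profile and isolated cores configured by install_tuned.
tuned-adm active
grep isolated_cores /etc/tuned/cpu-partitioning-variables.conf

# Confirm the kernel arguments applied by compute_kernel_args after the reboot.
cat /proc/cmdline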

5.2. Configuring openvswitch for security groups (Technology Preview)

Data plane interfaces require high performance in a stateful firewall. To protect these interfaces, consider deploying a telco-grade firewall as a virtual network function (VNF).

To configure control plane interfaces, set the NeutronOVSFirewallDriver parameter to openvswitch. This configures OpenStack Networking to use the flow-based OVS firewall driver. Set this parameter in the network-environment.yaml file under parameter_defaults.

Example:

parameter_defaults:
  NeutronOVSFirewallDriver: openvswitch
Note

The openvswitch firewall driver is a Technology Preview and should be used only in testing environments. The only supported value for the NeutronOVSFirewallDriver parameter is noop.

When you use the OVS firewall driver, disable it for data plane interfaces by using the openstack port set command, as shown in the following example. A verification sketch follows the example.

Example:

openstack port set --no-security-group  --disable-port-security ${PORT}
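
To confirm that the firewall driver no longer applies to a data plane port, check the port security attributes. This is a minimal sketch; ${PORT} is the same port name or ID used in the previous command.

# The port_security_enabled field should be False and no security groups should be listed.
openstack port show ${PORT}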

5.3. Defining the SR-IOV and OVS-DPDK parameters

Modify the network-environment.yaml file to configure SR-IOV and OVS-DPDK role-specific parameters:

  1. Add the resource mapping for the OVS-DPDK and SR-IOV services to the network-environment.yaml file along with the network configuration for these nodes:

      resource_registry:
        # Specify the relative/absolute path to the config files that you want to use to override the defaults.
        OS::TripleO::Compute::Net::SoftwareConfig: nic-configs/compute.yaml
        OS::TripleO::Controller::Net::SoftwareConfig: nic-configs/controller.yaml
        OS::TripleO::NodeUserData: first-boot.yaml
  2. Define the flavors:

      OvercloudControlFlavor: controller
      OvercloudComputeFlavor: compute
  3. Define the tunnel type:

      # The tunnel type for the tenant network (vxlan or gre). Set to '' to disable tunneling.
      NeutronTunnelTypes: 'vxlan'
      # The tenant network type for Neutron (vlan or vxlan).
      NeutronNetworkType: 'vlan'
  4. Configure the parameters for SR-IOV. You can obtain the PCI vendor and device values, as used in the NeutronSupportedPCIVendorDevs parameter, by running lspci -nv. An example is shown after this procedure.

    Note

    The OpenvSwitch firewall driver, as seen in the following example, is a Technology Preview and should be used for control plane interfaces only. The only supported value for the NeutronOVSFirewallDriver parameter is noop. See Configuring openvswitch for security groups for details.

      NeutronSupportedPCIVendorDevs: ['8086:154d', '8086:10ed']
      NovaPCIPassthrough:
        - devname: "p5p2"
          physical_network: "tenant"
    
      NeutronPhysicalDevMappings: "tenant:p5p2"
      NeutronSriovNumVFs: "p5p2:5"
      # Global MTU.
      NeutronGlobalPhysnetMtu: 9000
      # Configure the classname of the firewall driver to use for implementing security groups.
      NeutronOVSFirewallDriver: openvswitch
  5. Configure the parameters for OVS-DPDK:

      ########################
      # OVS DPDK configuration
      ## NeutronDpdkCoreList and NeutronDpdkMemoryChannels are REQUIRED settings.
      ## Attempting to deploy DPDK without appropriate values will cause deployment to fail or lead to unstable deployments.
      # List of cores to be used for DPDK Poll Mode Driver
      NeutronDpdkCoreList: "'2,22,3,23'"
      # Number of memory channels to be used for DPDK
      NeutronDpdkMemoryChannels: "4"
      # NeutronDpdkSocketMemory
      NeutronDpdkSocketMemory: "'3072,1024'"
      # NeutronDpdkDriverType
      NeutronDpdkDriverType: "vfio-pci"
      # The vhost-user socket directory for OVS
      NeutronVhostuserSocketDir: "/var/lib/vhost_sockets"
    
      ########################
      # Additional settings
      ########################
      # Reserved RAM for host processes
      NovaReservedHostMemory: 4096
      # A list or range of physical CPU cores to reserve for virtual machine processes.
      # Example: NovaVcpuPinSet: ['4-12','^8'] will reserve cores from 4-12 excluding 8
      NovaVcpuPinSet: "4-19,24-39"
      # An array of filters used by Nova to filter a node. These filters are applied in the order they are listed,
      # so place your most restrictive filters first to make the filtering process more efficient.
      NovaSchedulerDefaultFilters:
        - "RetryFilter"
        - "AvailabilityZoneFilter"
        - "RamFilter"
        - "ComputeFilter"
        - "ComputeCapabilitiesFilter"
        - "ImagePropertiesFilter"
        - "ServerGroupAntiAffinityFilter"
        - "ServerGroupAffinityFilter"
        - "PciPassthroughFilter"
        - "NUMATopologyFilter"
        - "AggregateInstanceExtraSpecsFilter"
      # Kernel arguments for Compute node
      ComputeKernelArgs: "default_hugepagesz=1GB hugepagesz=1G hugepages=32 iommu=pt intel_iommu=on"
      # A list or range of physical CPU cores to be tuned.
      # The given args will be appended to the tuned cpu-partitioning profile.
      HostIsolatedCoreList: "2-19,22-39"
      # List of logical cores to be used by ovs-dpdk processes (dpdk-lcore-mask)
      HostCpusList: "'2-19,22-39'"
    Note

    You must assign at least one CPU, with its sibling thread, on each NUMA node to the DPDK PMD, regardless of whether DPDK NICs are present on that node, to avoid failures when creating guest instances.

  6. Configure the remainder of the network-environment.yaml file to override the default parameters from the neutron-ovs-dpdk-agent.yaml and neutron-sriov-agent.yaml files as needed for your OpenStack deployment.

See the Network Functions Virtualization Planning Guide for details on how to determine the best values for the OVS-DPDK parameters that you set in the network-environment.yaml file to optimize your OpenStack network for OVS-DPDK.
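
One way to find the PCI vendor and device IDs used in the NeutronSupportedPCIVendorDevs parameter (step 4) is to list the NICs with their numeric IDs. This is a minimal sketch; the grep pattern depends on how your NICs are classified.

# The values in square brackets, for example [8086:154d], are the vendor:device IDs.
lspci -nn | grep -i ethernet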

5.4. Configuring the Compute node for SR-IOV and DPDK interfaces

This example uses the sample compute.yaml file to support SR-IOV and DPDK interfaces.

  1. Create the control plane Linux bond for an isolated network:

      type: linux_bond
      name: bond_api
      bonding_options: "mode=active-backup"
      use_dhcp: false
      dns_servers: {get_param: DnsServers}
      members:
        -
          type: interface
          name: nic3
          primary: true
        -
          type: interface
          name: nic4
  2. Assign VLANs to this Linux bond:

      -
        type: vlan
        vlan_id: {get_param: InternalApiNetworkVlanID}
        device: bond_api
        addresses:
          -
            ip_netmask: {get_param: InternalApiIpSubnet}
      -
        type: vlan
        vlan_id: {get_param: StorageNetworkVlanID}
        device: bond_api
        addresses:
          -
            ip_netmask: {get_param: StorageIpSubnet}
  3. Set a bridge with a DPDK port to link to the controller:

      type: ovs_user_bridge
      name: br-link0
      ovs_extra:
        -
          str_replace:
            template: set port br-link0 tag=_VLAN_TAG_
            params:
              _VLAN_TAG_: {get_param: TenantNetworkVlanID}
      addresses:
        -
          ip_netmask: {get_param: TenantIpSubnet}
      use_dhcp: false
      members:
        -
          type: ovs_dpdk_port
          name: dpdk0
          mtu: 9000
          ovs_extra:
          - set interface $DEVICE mtu_request=$MTU
          - set interface $DEVICE options:n_rxq=2
          members:
            -
              type: interface
              name: nic7
              primary: true
    Note

    To include multiple DPDK devices, repeat the type code section for each DPDK device you want to add.

    Note

    When using OVS-DPDK, all bridges on the same Compute node should be of type ovs_user_bridge. The director may accept the configuration, but Red Hat OpenStack Platform does not support mixing ovs_bridge and ovs_user_bridge on the same node.

  4. Create the SR-IOV interface to the Controller:

      - type: interface
        name: p7p2
        mtu: 9000
        use_dhcp: false
        defroute: false
        nm_controlled: true
        hotplug: true
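
After the node configuration is applied, you can verify the DPDK bridge and the SR-IOV interface from the Compute node. This is a minimal verification sketch; br-link0, dpdk0, and p7p2 match the example names used above.

# The br-link0 bridge should contain the dpdk0 port, and the interface type should be dpdk.
ovs-vsctl show
ovs-vsctl get Interface dpdk0 type

# The SR-IOV interface should be up with the configured MTU.
ip link show p7p2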

5.5. Deploying the overcloud

The following example defines the overcloud_deploy.sh Bash script that deploys both OVS-DPDK and SR-IOV:

#!/bin/bash

openstack overcloud deploy \
--templates \
-e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/neutron-ovs-dpdk.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/neutron-sriov.yaml \
-e /home/stack/ospd-10-vxlan-vlan-dpdk-sriov-ctlplane-bonding/network-environment.yaml
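
Run the script from the undercloud as the stack user with the undercloud credentials sourced. A minimal usage sketch, assuming the script is saved as overcloud_deploy.sh in the stack user's home directory:

source ~/stackrc
bash ~/overcloud_deploy.sh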

5.6. Creating a flavor and deploying an instance with SR-IOV and DPDK interfaces

With a successful deployment, you can now begin populating the overcloud. Start by sourcing the newly created overcloudrc file in the /home/stack directory. Then, create a flavor and deploy an instance.

  1. Create a flavor:

    # source overcloudrc
    # openstack flavor create --vcpus 6 --ram 4096 --disk 40 compute

    Where:

    • compute is the flavor name.
    • 4096 is the memory size in MB.
    • 40 is the disk size in GB (default 0G).
    • 6 is the number of vCPUs.
  2. Set the flavor for large pages:

    # openstack flavor set compute --property hw:mem_page_size=1GB
  3. Create the external network:

    # openstack network create --share --external \
    --provider-physical-network <net-mgmt-physnet> \
    --provider-network-type <flat|vlan> external
  4. Create the networks for SR-IOV and DPDK:

    # openstack network create net-dpdk
    # openstack network create net-sriov
    # openstack subnet create --subnet-range <cidr/prefix> --network net-dpdk  net-dpdk-subnet
    # openstack subnet create --subnet-range <cidr/prefix> --network net-sriov  net-sriov-subnet
  5. Create the SR-IOV port.

    1. Use vnic-type direct to create an SR-IOV VF port:

      # openstack port create --network net-sriov --vnic-type direct sriov_port
    2. Use vnic-type direct-physical to create an SR-IOV PF port:

      # openstack port create --network net-sriov --vnic-type direct-physical sriov_port
  6. Create a router and attach to the DPDK VXLAN network:

    # openstack router create router1
    # openstack router add subnet router1 net-dpdk-subnet
  7. Create a floating IP address to associate with the guest instance port. The association command is shown after this procedure:

    # openstack floating ip create --floating-ip-address FLOATING-IP external
  8. Deploy an instance:

    # openstack server create --flavor compute --image rhel_7.3 --nic port-id=sriov_port --nic net-id=NET_DPDK_ID vm1

Where:

  • compute is the flavor name or ID.
  • rhel_7.3 is the image (name or ID) used to create an instance.
  • sriov_port is the name of the port created in the previous step.
  • NET_DPDK_ID is the DPDK network ID.
  • vm1 is the name of the instance.
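
The floating IP address created in step 7 still needs to be associated with the instance. A minimal sketch, assuming the instance name vm1 and the FLOATING-IP address from that step:

# openstack server add floating ip vm1 FLOATING-IP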

You have now deployed an instance that uses an SR-IOV interface and a DPDK interface on the same Compute node.

Note

For instances with more interfaces, you can use cloud-init. See Table 3.1 in Create an Instance for details.
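
As recommended earlier, you can use cloud-init to set the default route for the management (DPDK) network in the guest. The following is a minimal sketch, assuming a hypothetical gateway address of 192.168.10.1 on the DPDK subnet; pass the script with the --user-data option when you create the instance:

# cat > user-data.sh << 'EOF'
#!/bin/bash
# Hypothetical gateway on the DPDK (management) network subnet.
ip route replace default via 192.168.10.1
EOF
# openstack server create --flavor compute --image rhel_7.3 --user-data user-data.sh --nic port-id=sriov_port --nic net-id=NET_DPDK_ID vm1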
