Network Functions Virtualization Planning and Configuration Guide
Planning and Configuring the Network Functions Virtualization (NFV) OpenStack Deployment
Abstract
Preface
Red Hat OpenStack Platform provides the foundation to build a private or public cloud on top of Red Hat Enterprise Linux. It offers a massively scalable, fault-tolerant platform for the development of cloud-enabled workloads.
This guide describes the steps to plan and configure single root input/output virtualization (SR-IOV) and Open vSwitch with Data Plane Development Kit (OVS-DPDK) using the Red Hat OpenStack Platform director for NFV deployments.
Chapter 1. Overview of NFV
Network Functions Virtualization (NFV) is a software solution that virtualizes a network function, such as a network switch, on general purpose, cloud-based infrastructure. NFV allows the Communication Service Provider to move away from traditional or proprietary hardware.
For a high-level overview of NFV concepts, see the Network Functions Virtualization Product Guide.
OVS-DPDK and SR-IOV configuration depends on your hardware and topology. This guide provides examples for CPU assignments, memory allocation, and NIC configurations that might vary from your topology and use case.
Use Red Hat OpenStack Platform director to isolate specific network types, for example, external, project, internal API, and so on. You can deploy a network on a single network interface, or distribute it over multiple network interfaces. With Open vSwitch you can create bonds by assigning multiple interfaces to a single bridge. Configure network isolation in a Red Hat OpenStack Platform installation with template files. If you do not provide template files, the service networks deploy on the provisioning network. There are two types of template configuration files:
- network-environment.yaml - Contains network details, such as subnets and IP address ranges, for the overcloud nodes. This file also contains the settings that override default parameter values for various scenarios.
- Host network templates, for example, compute.yaml and controller.yaml - Define the network interface configuration for the overcloud nodes. The network details are provided by the network-environment.yaml file.
These heat template files are located at /usr/share/openstack-tripleo-heat-templates/ on the undercloud node.
The Hardware requirements and Software requirements sections provide more details on how to plan and configure the heat template files for NFV using the Red Hat OpenStack Platform director.
You can edit YAML files to configure NFV. For an introduction to the YAML file format, see: YAML in a Nutshell.
Chapter 2. Hardware requirements
This section describes the hardware requirements for NFV.
You can use the Red Hat Technologies Ecosystem to check for a list of certified hardware, software, cloud providers, and components. Choose the category and select the product version.
For a complete list of the certified hardware for Red Hat OpenStack Platform, see Red Hat OpenStack Platform certified hardware.
2.1. Tested NICs
For a list of tested NICs for NFV, see Network Adapter Support.
If you configure OVS-DPDK on Mellanox ConnectX-4 or ConnectX-5 network interfaces, you must set the corresponding kernel driver in the compute-ovs-dpdk.yaml file.
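For example, the DPDK port definition can carry the driver setting as follows (a sketch; the port and interface names are illustrative):

```yaml
members:
  - type: ovs_dpdk_port
    name: dpdk0
    driver: mlx5_core    # kernel driver for Mellanox ConnectX-4/ConnectX-5
    members:
      - type: interface
        name: enp3s0f0   # illustrative interface name
```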
2.2. Discovering your NUMA node topology
When you plan your deployment, you must understand the NUMA topology of your Compute node to partition the CPU and memory resources for optimum performance. To determine the NUMA information, perform one of the following tasks:
- Enable hardware introspection to retrieve this information from bare-metal nodes.
- Log on to each bare-metal node to manually collect the information.
You must install and configure the undercloud before you can retrieve NUMA information through hardware introspection. For more information about undercloud configuration, see: Director Installation and Usage Guide.
Retrieving hardware introspection details
The Bare Metal service hardware-inspection-extras feature is enabled by default, and you can use it to retrieve hardware details for overcloud configuration. For more information about the inspection_extras parameter in the undercloud.conf file, see Configuring the Director.
For example, the numa_topology collector is part of the hardware-inspection extras and includes the following information for each NUMA node:
- RAM (in kilobytes)
- Physical CPU cores and their sibling threads
- NICs associated with the NUMA node
To retrieve the information listed above, substitute <UUID> with the UUID of the bare-metal node in the following command:

# openstack baremetal introspection data save <UUID> | jq .numa_topology
The following example shows the retrieved NUMA information for a bare-metal node:
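An abbreviated, illustrative excerpt follows; the CPU, RAM, and NIC values here are invented for illustration and will differ on real hardware:

```json
{
  "cpus": [
    { "cpu": 1, "thread_siblings": [1, 17], "numa_node": 0 },
    { "cpu": 2, "thread_siblings": [10, 26], "numa_node": 1 }
  ],
  "ram": [
    { "size_kb": 66980172, "numa_node": 0 },
    { "size_kb": 67108864, "numa_node": 1 }
  ],
  "nics": [
    { "name": "ens3f1", "numa_node": 0 },
    { "name": "ens3f2", "numa_node": 1 }
  ]
}
```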
2.3. BIOS Settings
The following table describes the required BIOS settings for NFV:
| Parameter | Setting |
|---|---|
| C3 Power State | Disabled. |
| C6 Power State | Disabled. |
| MLC Streamer | Enabled. |
| MLC Spatial Prefetcher | Enabled. |
| DCU Data Prefetcher | Enabled. |
| DCA | Enabled. |
| CPU Power and Performance | Performance. |
| Memory RAS and Performance Config → NUMA Optimized | Enabled. |
| Turbo Boost | Disabled. |
| VT-d | Enabled for Intel cards if VFIO functionality is needed. |
Chapter 3. Software requirements
This section describes the supported configurations and drivers, and subscription details necessary for NFV.
3.1. Registering and enabling repositories
To install Red Hat OpenStack Platform, you must register Red Hat OpenStack Platform director using the Red Hat Subscription Manager, and subscribe to the required channels. See Registering your system for details.
Procedure
Register your system with the Content Delivery Network, entering your Customer Portal user name and password when prompted.
[stack@director ~]$ sudo subscription-manager register

Determine the entitlement pool ID for Red Hat OpenStack Platform director, for example {Pool ID}, from the output of the subscription-manager list command.
Include the Pool ID value in the following command to attach the Red Hat OpenStack Platform 15 entitlement.

[stack@director ~]$ sudo subscription-manager attach --pool={Pool-ID}-123456

Disable the default repositories.
[stack@director ~]$ sudo subscription-manager repos --disable=*

Enable the required repositories for Red Hat OpenStack Platform with NFV.
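The repository names depend on the release; an illustrative set for Red Hat OpenStack Platform 15 on RHEL 8 (verify the exact names against the release documentation) might be enabled as follows:

```shell
# Illustrative repository set; names vary by release.
sudo subscription-manager repos \
  --enable=rhel-8-for-x86_64-baseos-rpms \
  --enable=rhel-8-for-x86_64-appstream-rpms \
  --enable=rhel-8-for-x86_64-highavailability-rpms \
  --enable=ansible-2.8-for-rhel-8-x86_64-rpms \
  --enable=openstack-15-for-rhel-8-x86_64-rpms \
  --enable=fast-datapath-for-rhel-8-x86_64-rpms
```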
Update your system so you have the latest base system packages.

[stack@director ~]$ sudo dnf update -y
[stack@director ~]$ sudo reboot
You need a separate subscription to a Red Hat OpenStack Platform for Real Time SKU before you can access the rhel-8-server-nfv-rpms repository for Real Time KVM.
To register your overcloud nodes, see Ansible Based Registration.
3.2. Supported configurations for NFV deployments
Red Hat OpenStack Platform (RHOSP) supports the following NFV deployments using director:
- Single root I/O virtualization (SR-IOV)
- Open vSwitch with Data Plane Development Kit (OVS-DPDK)
Additionally, you can deploy RHOSP with any of the following features:
- Composable roles
- Hyper-converged infrastructure (limited support)
- Real-time KVM
- OVS hardware offload (Technology preview)
Red Hat’s embedded OpenDaylight SDN solution was deprecated in RHOSP 14. Red Hat support, including bug fixes, for OpenDaylight ends with the RHOSP 13 lifecycle, planned for June 27, 2021.
RHOSP NFV deployments with Open Virtual Network (OVN) as the default Software Defined Networking (SDN) solution are unsupported. Use the following steps to deploy RHOSP with the OVS mechanism driver:
Procedure
Modify the containers-prepare-parameter.yaml file so that the neutron_driver parameter is set to null.

Include the neutron-ovs.yaml environment file from the /usr/share/openstack-tripleo-heat-templates/environments/services directory with your deployment script.
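A minimal sketch of the neutron_driver change (the surrounding ContainerImagePrepare structure is an assumption; keep your existing image-set values):

```yaml
parameter_defaults:
  ContainerImagePrepare:
  - push_destination: true
    set:
      neutron_driver: null
      # keep the rest of your existing "set" values unchanged
```

The environment file can then be passed with -e, for example: openstack overcloud deploy --templates -e /usr/share/openstack-tripleo-heat-templates/environments/services/neutron-ovs.yaml, together with your other environment files.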
3.3. Supported drivers
For a complete list of supported drivers, see Component, Plug-In, and Driver Support in Red Hat OpenStack Platform.
For a list of NICs tested for Red Hat OpenStack Platform deployments with NFV, see Tested NICs.
3.4. Compatibility with third-party software
For a complete list of products and services tested, supported, and certified to perform with Red Hat OpenStack Platform, see Third Party Software compatible with Red Hat OpenStack Platform. You can filter the list by product version and software category.
For a complete list of products and services tested, supported, and certified to perform with Red Hat Enterprise Linux, see Third Party Software compatible with Red Hat Enterprise Linux. You can filter the list by product version and software category.
Chapter 4. Network considerations
The undercloud host requires at least the following networks:
- Provisioning network - Provides DHCP and PXE-boot functions to help discover bare-metal systems for use in the overcloud.
- External network - A separate network for remote connectivity to all nodes. The interface connecting to this network requires a routable IP address, either defined statically, or generated dynamically from an external DHCP service.
The minimal overcloud network configuration includes the following NIC configurations:
- Single NIC configuration - One NIC for the provisioning network on the native VLAN and tagged VLANs that use subnets for the different overcloud network types.
- Dual NIC configuration - One NIC for the provisioning network and the other NIC for the external network.
- Dual NIC configuration - One NIC for the provisioning network on the native VLAN, and the other NIC for tagged VLANs that use subnets for different overcloud network types.
- Multiple NIC configuration - Each NIC uses a subnet for a different overcloud network type.
For more information on the networking requirements, see Networking requirements.
Chapter 5. Planning an SR-IOV deployment
Optimize single root I/O virtualization (SR-IOV) deployments for NFV by setting individual parameters based on your Compute node hardware.
See Discovering your NUMA node topology to evaluate your hardware impact on the SR-IOV parameters.
5.1. Hardware partitioning for an SR-IOV deployment
To achieve high performance with SR-IOV, partition the resources between the host and the guest.
A typical topology includes 14 cores per NUMA node on dual-socket Compute nodes. Both hyper-threading (HT) and non-HT cores are supported. Each HT core has two sibling threads. One core on each NUMA node is dedicated to the host, and all interrupt requests (IRQs) are routed to the host cores. The remaining cores are dedicated to the VNFs, which provides isolation from other VNFs and from the host. The virtual network function (VNF) handles the SR-IOV interface bonding. Each VNF must use resources on a single NUMA node, and the SR-IOV NICs used by the VNF must be associated with that same NUMA node. This topology does not have a virtualization overhead. The host, OpenStack Networking (neutron), and Compute (nova) configuration parameters are exposed in a single file for ease and consistency, and to avoid incoherence that is fatal to proper isolation and causes preemption and packet loss. Host and virtual machine isolation depend on a tuned profile, which defines the boot parameters and any Red Hat OpenStack Platform modifications based on the list of isolated CPUs.
5.2. Topology of an NFV SR-IOV deployment
The following image has two VNFs, each with a management interface, represented by mgt, and data plane interfaces. The management interface manages SSH access and similar functions. The VNFs bond the data plane interfaces by using the DPDK library to ensure high availability. The image also has two provider networks for redundancy. The Compute node has two regular NICs bonded together and shared between the VNF management and the Red Hat OpenStack Platform API management.
The image shows a VNF that uses DPDK at an application level, and has access to SR-IOV virtual functions (VFs) and physical functions (PFs), for better availability or performance, depending on the fabric configuration. DPDK improves performance, while the VF/PF DPDK bonds provide support for failover, and high availability. The VNF vendor must ensure that the DPDK poll mode driver (PMD) supports the SR-IOV card that is being exposed as a VF/PF. The management network uses OVS, therefore the VNF sees a mgmt network device using the standard virtIO drivers. You can use that device to initially connect to the VNF, and ensure that the DPDK application bonds the two VF/PFs.
5.2.1. Topology for NFV SR-IOV without HCI
Observe the topology for SR-IOV without hyper-converged infrastructure (HCI) for NFV in the image below. It consists of compute and controller nodes with 1 Gbps NICs, and the director node.
Chapter 6. Deploying SR-IOV technologies
In your Red Hat OpenStack Platform NFV deployment, you can achieve higher performance with single root I/O virtualization (SR-IOV), when you configure direct access from your instances to a shared PCIe resource through virtual resources.
6.1. Prerequisites
- For details on how to install and configure the undercloud before deploying the overcloud, see the Director Installation and Usage Guide.
Do not manually edit any values in /etc/tuned/cpu-partitioning-variables.conf that director heat templates modify.
6.2. Configuring SR-IOV
The following CPU assignments, memory allocation, and NIC configurations are examples, and might be different from your use case.
Generate the built-in ComputeSriov role to define nodes in the OpenStack cluster that run NeutronSriovAgent, NeutronSriovHostConfig, and default compute services.

# openstack overcloud roles generate \
-o /home/stack/templates/roles_data.yaml \
Controller ComputeSriov

To prepare the SR-IOV containers, include the neutron-sriov.yaml and roles_data.yaml files when you generate the overcloud_images.yaml file.

sudo openstack tripleo container image prepare \
--roles-file ~/templates/roles_data.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/services/neutron-sriov.yaml \
-e ~/containers-prepare-parameter.yaml \
--output-env-file=/home/stack/templates/overcloud_images.yaml

For more information on container image preparation, see Director Installation and Usage.
Configure the parameters for the SR-IOV nodes under parameter_defaults appropriately for your cluster and your hardware configuration. Typically, you add these settings to the network-environment.yaml file.

NeutronNetworkType: 'vlan'
NeutronNetworkVLANRanges:
- tenant:22:22
- tenant:25:25
NeutronTunnelTypes: ''

In the same file, configure role-specific parameters for SR-IOV compute nodes.
Note: The numvfs parameter replaces the NeutronSriovNumVFs parameter in the network configuration templates. Red Hat does not support modification of the NeutronSriovNumVFs parameter or the numvfs parameter after deployment. If you modify either parameter after deployment, it might cause a disruption for the running instances that have an SR-IOV port on that physical function (PF). In this case, you must hard reboot these instances to make the SR-IOV PCI device available again. The NovaVcpuPinSet parameter is now deprecated, and is replaced by NovaComputeCpuDedicatedSet for dedicated, pinned workflows.

Configure the SR-IOV enabled interfaces in the compute.yaml network configuration template. To create SR-IOV virtual functions (VFs), configure the interfaces as standalone NICs.

Ensure that the list of default filters includes the value AggregateInstanceExtraSpecsFilter.

NovaSchedulerDefaultFilters: ['AvailabilityZoneFilter','ComputeFilter','ComputeCapabilitiesFilter','ImagePropertiesFilter','ServerGroupAntiAffinityFilter','ServerGroupAffinityFilter','PciPassthroughFilter','AggregateInstanceExtraSpecsFilter']

- Run the overcloud_deploy.sh script.
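For reference, the role-specific parameters for SR-IOV compute nodes configured in the steps above might look like the following sketch; all CPU lists, huge-page counts, device names, and memory sizes are placeholders that must be derived from your own NUMA topology:

```yaml
ComputeSriovParameters:
  IsolCpusList: "1-19,21-39"            # illustrative isolated CPU list
  KernelArgs: "default_hugepagesz=1GB hugepagesz=1G hugepages=32 iommu=pt intel_iommu=on isolcpus=1-19,21-39"
  TunedProfileName: "cpu-partitioning"
  NeutronBridgeMappings:
    - tenant:br-link0                   # illustrative bridge mapping
  NeutronPhysicalDevMappings:
    - tenant:p7p1                       # illustrative device mapping
  NovaComputeCpuDedicatedSet: '1-19,21-39'
  NovaReservedHostMemory: 4096
```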
6.3. NIC partitioning
This feature is available in this release as a Technology Preview, and therefore is not fully supported by Red Hat. It should only be used for testing, and should not be deployed in a production environment. For more information about Technology Preview features, see Scope of Coverage Details.
You can configure single root I/O virtualization (SR-IOV) so that a Red Hat OpenStack Platform host can use virtual functions (VFs).
When you partition a single, high-speed NIC into multiple VFs, you can use the NIC for both control and data plane traffic. You can then apply a QoS (Quality of Service) priority value to VF interfaces as desired.
Procedure
Ensure that you complete the following steps when creating the templates for an overcloud deployment:
Use the interface type sriov_pf in an os-net-config role file to configure a physical function that the host can use.

- type: sriov_pf
  name: <interface name>
  use_dhcp: false
  numvfs: <number of vfs>
  promisc: <true/false> # optional (defaults to true)

Note: The numvfs parameter replaces the NeutronSriovNumVFs parameter in the network configuration templates. Red Hat does not support modification of the NeutronSriovNumVFs parameter or the numvfs parameter after deployment. If you modify either parameter after deployment, it might cause a disruption for the running instances that have an SR-IOV port on that physical function (PF). In this case, you must hard reboot these instances to make the SR-IOV PCI device available again.

Use the interface type sriov_vf to configure virtual functions in a bond that the host can use.

The VLAN tag must be unique across all VFs that belong to a common PF device. You must assign VLAN tags to an interface type:
- linux_bond
- ovs_bridge
- ovs_dpdk_port
- The applicable VF ID range starts at zero, and ends at the maximum number of VFs minus one.
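The sriov_vf bond described above can be sketched as follows; the device names, VF IDs, and heat parameter names for the VLAN are illustrative:

```yaml
- type: linux_bond
  name: internal_bond          # illustrative bond name
  bonding_options: mode=active-backup
  use_dhcp: false
  members:
    - type: sriov_vf
      device: nic7             # illustrative PF device
      vfid: 1
    - type: sriov_vf
      device: nic8
      vfid: 1

- type: vlan
  vlan_id:
    get_param: InternalApiNetworkVlanID   # assumed heat parameter
  device: internal_bond
  addresses:
    - ip_netmask:
        get_param: InternalApiIpSubnet    # assumed heat parameter
```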
To reserve virtual functions for VMs, use the NovaPCIPassthrough parameter. You must assign a regex value to the address parameter to identify the VFs that you want to pass through to Nova, to be used by virtual instances and not by the host.

You can obtain these values from lspci, so, if necessary, boot a compute node into a Linux environment to obtain this information. The lspci command returns the address of each device in the format <bus>:<device>:<slot>. Enter these address values in the NovaPCIPassthrough parameter in the following format:

NovaPCIPassthrough:
  - physical_network: "sriovnet2"
    address: {"domain": ".*", "bus": "06", "slot": "11", "function": "[5-7]"}
  - physical_network: "sriovnet2"
    address: {"domain": ".*", "bus": "06", "slot": "10", "function": "[5]"}

Ensure that IOMMU is enabled on all nodes that require NIC partitioning. For example, if you want NIC partitioning for compute nodes, enable IOMMU using the KernelArgs parameter for that role:

parameter_defaults:
  ComputeParameters:
    KernelArgs: "intel_iommu=on iommu=pt"
Validation
Check the number of VFs.
[root@overcloud-compute-0 heat-admin]# cat /sys/class/net/p4p1/device/sriov_numvfs
10
[root@overcloud-compute-0 heat-admin]# cat /sys/class/net/p4p2/device/sriov_numvfs
10

Check Linux bonds.

List OVS bonds.

Show OVS connections.
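Commands along the following lines can serve for the last three checks (a sketch; the bond name is illustrative):

```shell
# Check Linux bonds (bond name is illustrative)
cat /proc/net/bonding/internal_bond

# List OVS bonds
ovs-appctl bond/show

# Show OVS connections
ovs-vsctl show
```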
If you used NovaPCIPassthrough to pass VFs to instances, test by deploying an SR-IOV instance.
The following bond modes are supported:
- balance-slb
- active-backup
6.4. Configuring OVS hardware offload
This feature is available in this release as a Technology Preview, and therefore is not fully supported by Red Hat. It should only be used for testing, and should not be deployed in a production environment. For more information about Technology Preview features, see Scope of Coverage Details.
The procedure for OVS hardware offload configuration shares many of the same steps as configuring SR-IOV.
Procedure
Generate the ComputeSriov role:

openstack overcloud roles generate -o roles_data.yaml Controller ComputeSriov

Configure the physical_network parameter to match your environment.

- For VLAN, set the physical_network parameter to the name of the network you create in neutron after deployment. This value should also be in NeutronBridgeMappings.
- For VXLAN, set the physical_network parameter to the string value null.

Ensure that the OvsHwOffload parameter under role-specific parameters has a value of true.
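An illustrative role-specific fragment follows; the grouping under ComputeSriovParameters and every value shown are assumptions to be adapted to your hardware:

```yaml
parameter_defaults:
  ComputeSriovParameters:
    OvsHwOffload: true
    KernelArgs: "default_hugepagesz=1GB hugepagesz=1G hugepages=16 intel_iommu=on iommu=pt"
    NeutronBridgeMappings:
      - tenant:br-tenant         # illustrative bridge mapping
    NovaPCIPassthrough:
      - devname: "ens3f0"        # illustrative device name
        physical_network: "tenant"
```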
Ensure that the list of default filters includes NUMATopologyFilter:

NovaSchedulerDefaultFilters: ['RetryFilter','AvailabilityZoneFilter','ComputeFilter','ComputeCapabilitiesFilter','ImagePropertiesFilter','ServerGroupAntiAffinityFilter','ServerGroupAffinityFilter','PciPassthroughFilter','NUMATopologyFilter']

Configure one or more network interfaces intended for hardware offload in the compute-sriov.yaml configuration file.

Note: Do not use the NeutronSriovNumVFs parameter when configuring Open vSwitch hardware offload. The numvfs parameter specifies the number of VFs in a network configuration file used by os-net-config.

Note: Do not configure Mellanox network interfaces as the nic-config interface type ovs-vlan, because this prevents tunnel endpoints such as VXLAN from passing traffic due to driver limitations.

Include the ovs-hw-offload.yaml file in the overcloud deploy command.
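The deploy command from the last step might look like the following sketch; the roles file path and the additional environment files are assumptions:

```shell
openstack overcloud deploy --templates \
  -r /home/stack/templates/roles_data.yaml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/ovs-hw-offload.yaml \
  -e /home/stack/templates/network-environment.yaml
```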
6.4.1. Verifying OVS hardware offload
Confirm that a PCI device is in switchdev mode:

# devlink dev eswitch show pci/0000:03:00.0
pci/0000:03:00.0: mode switchdev inline-mode none encap enable

Verify that offload is enabled in OVS:

# ovs-vsctl get Open_vSwitch . other_config:hw-offload
"true"
6.5. Deploying an instance for SR-IOV
Use host aggregates to separate high performance compute hosts. For information on creating host aggregates and associated flavors for scheduling, see Creating host aggregates.
Pinned CPU instances can be located on the same Compute node as unpinned instances. For more information, see Configuring CPU pinning on the Compute node in the Instances and Images Guide.
Deploy an instance for single root I/O virtualization (SR-IOV) by performing the following steps:
Create a flavor.
# openstack flavor create <flavor> --ram <MB> --disk <GB> --vcpus <#>

Tip: You can specify the NUMA affinity policy for PCI passthrough devices and SR-IOV interfaces by adding the extra spec hw:pci_numa_affinity_policy to your flavor. For more information, see Update flavor metadata in the Instance and Images Guide.

Create the network.
# openstack network create net1 --provider-physical-network tenant --provider-network-type vlan --provider-segment <VLAN-ID>
# openstack subnet create subnet1 --network net1 --subnet-range 192.0.2.0/24 --dhcp

Create the port.
Use vnic-type direct to create an SR-IOV virtual function (VF) port.

# openstack port create --network net1 --vnic-type direct sriov_port

Use the following command to create a virtual function with hardware offload.

# openstack port create --network net1 --vnic-type direct --binding-profile '{"capabilities": ["switchdev"]}' sriov_hwoffload_port

Use vnic-type direct-physical to create an SR-IOV PF port.

# openstack port create --network net1 --vnic-type direct-physical sriov_port
Deploy an instance.
# openstack server create --flavor <flavor> --image <image> --nic port-id=<id> <instance name>
6.6. Creating host aggregates
For better performance, deploy guests that have CPU pinning and huge pages. You can schedule high performance instances on a subset of hosts by matching aggregate metadata with flavor metadata.
Procedure
Ensure that the AggregateInstanceExtraSpecsFilter value is included in the scheduler_default_filters parameter in the nova.conf file. This configuration can be set through the heat parameter NovaSchedulerDefaultFilters under role-specific parameters before deployment.

ComputeOvsDpdkSriovParameters:
  NovaSchedulerDefaultFilters: ['AggregateInstanceExtraSpecsFilter','RetryFilter','AvailabilityZoneFilter','ComputeFilter','ComputeCapabilitiesFilter','ImagePropertiesFilter','ServerGroupAntiAffinityFilter','ServerGroupAffinityFilter','PciPassthroughFilter','NUMATopologyFilter']

Note: To add this parameter to the configuration of an existing cluster, you can add it to the heat templates and run the original deployment script again.
Create an aggregate group for SR-IOV, and add relevant hosts. Define metadata, for example,
sriov=true, that matches defined flavor metadata.

# openstack aggregate create sriov_group
# openstack aggregate add host sriov_group compute-sriov-0.localdomain
# openstack aggregate set --property sriov=true sriov_group

Create a flavor.
# openstack flavor create <flavor> --ram <MB> --disk <GB> --vcpus <#>

Set additional flavor properties. Note that the defined metadata, sriov=true, matches the defined metadata on the SR-IOV aggregate.

# openstack flavor set --property aggregate_instance_extra_specs:sriov=true --property hw:cpu_policy=dedicated --property hw:mem_page_size=1GB <flavor>
Chapter 7. Planning your OVS-DPDK deployment
To optimize your OVS-DPDK deployment, you should understand how to configure the OVS-DPDK parameters relative to your Compute node hardware. For more information about CPUs and NUMA topology, see: NFV performance considerations.
7.1. OVS-DPDK with CPU partitioning and NUMA topology
OVS-DPDK partitions the hardware resources for host, guests, and itself. The OVS-DPDK Poll Mode Drivers (PMDs) run DPDK active loops, which require dedicated CPU cores. Therefore you must allocate some CPUs, and huge pages, to OVS-DPDK.
A sample partitioning includes 16 cores per NUMA node on dual-socket Compute nodes. The traffic requires additional NICs because you cannot share NICs between the host and OVS-DPDK.
You must reserve DPDK PMD threads on both NUMA nodes, even if a NUMA node does not have an associated DPDK NIC.
For optimum OVS-DPDK performance, reserve a block of memory local to the NUMA node. Choose NICs associated with the same NUMA node that you use for memory and CPU pinning. Ensure that both bonded interfaces are from NICs on the same NUMA node.
7.2. Workflows and derived parameters
This feature is available in this release as a Technology Preview, and therefore is not fully supported by Red Hat. It should only be used for testing, and should not be deployed in a production environment. For more information about Technology Preview features, see Scope of Coverage Details.
You can use the OpenStack Workflow (mistral) service to derive parameters based on the capabilities of your available bare-metal nodes. Workflows use a YAML file to define a set of tasks and actions to perform. You can use a pre-defined workbook, derive_params.yaml, in the directory tripleo-common/workbooks/. This workbook provides workflows to derive each supported parameter from the results of Bare Metal introspection. The derive_params.yaml workflows use the formulas from tripleo-common/workbooks/derive_params_formulas.yaml to calculate the derived parameters.
You can modify derive_params_formulas.yaml to suit your environment.
The derive_params.yaml workbook assumes all nodes for a particular composable role have the same hardware specifications. The workflow considers the flavor-profile association and nova placement scheduler to match nodes associated with a role, then uses the introspection data from the first node that matches the role.
For more information about Red Hat OpenStack Platform workflows, see Troubleshooting workflows and executions.
You can use the -p or --plan-environment-file option to add a custom plan_environment.yaml file, containing a list of workbooks and any input values, to the openstack overcloud deploy command. The resultant workflows merge the derived parameters back into the custom plan_environment.yaml, where they are available for the overcloud deployment.
For details on how to use the --plan-environment-file option in your deployment, see Plan Environment Metadata.
7.3. Derived OVS-DPDK parameters
The workflows in derive_params.yaml derive the DPDK parameters associated with the role that uses the ComputeNeutronOvsDpdk service.
The workflows can automatically derive the following parameters for OVS-DPDK. The NovaVcpuPinSet parameter is now deprecated, and is replaced by NovaComputeCpuDedicatedSet for dedicated, pinned workloads:
- IsolCpusList
- KernelArgs
- NovaReservedHostMemory
- NovaComputeCpuDedicatedSet
- OvsDpdkCoreList
- OvsDpdkSocketMemory
- OvsPmdCoreList
The OvsDpdkMemoryChannels parameter cannot be derived from the introspection memory bank data because the format of memory slot names is inconsistent across different hardware environments.
In most cases, the default number of OvsDpdkMemoryChannels is four. Consult your hardware manual to determine the number of memory channels per socket, and update the default number with this value.
For more information about workflow parameters, see Section 8.1, “Deriving DPDK parameters with workflows”.
7.4. Calculating OVS-DPDK parameters manually
This section describes how OVS-DPDK uses parameters within the director network_environment.yaml heat templates to configure the CPU and memory for optimum performance. Use this information to evaluate the hardware support on your Compute nodes and how to partition the hardware to optimize your OVS-DPDK deployment.
For more information about how to generate these values with the derive_params.yaml workflows instead, see Section 7.2, “Workflows and derived parameters”.
Always pair CPU sibling threads, or logical CPUs, together in the physical core when allocating CPU cores.
For details on how to determine the CPU and NUMA nodes on your Compute nodes, see Discovering your NUMA node topology. Use this information to map CPU and other parameters to support the host, guest instance, and OVS-DPDK process needs.
7.4.1. CPU parameters
OVS-DPDK uses the following parameters for CPU partitioning:
- OvsPmdCoreList
Provides the CPU cores that are used for the DPDK poll mode drivers (PMD). Choose CPU cores that are associated with the local NUMA nodes of the DPDK interfaces. Use OvsPmdCoreList for the pmd-cpu-mask value in OVS. Observe the following recommendations for OvsPmdCoreList:
- Pair the sibling threads together.
- Exclude all cores from the OvsDpdkCoreList.
- Avoid allocating the logical CPUs, that is, both sibling threads, of the first physical core on either NUMA node, because these are used for the OvsDpdkCoreList parameter.
- Performance depends on the number of physical cores allocated for this PMD core list. Allocate the required cores on the NUMA node that is associated with the DPDK NIC.
- For NUMA nodes with a DPDK NIC, determine the number of physical cores required based on the performance requirement, and include all the sibling threads or logical CPUs for each physical core.
- For NUMA nodes without DPDK NICs, allocate the sibling threads or logical CPUs of any physical core except the first physical core of the NUMA node. You need a minimal DPDK poll mode driver on the NUMA node without DPDK NICs present to properly create guest instances.
You must reserve DPDK PMD threads on both NUMA nodes, even if a NUMA node does not have an associated DPDK NIC.
- NovaComputeCpuDedicatedSet
A comma-separated list or range of physical host CPU numbers to which processes for pinned instance CPUs can be scheduled. For example, NovaComputeCpuDedicatedSet: [4-12,^8,15] reserves cores from 4-12 and 15, excluding 8. Observe the following recommendations for NovaComputeCpuDedicatedSet:
- Exclude all cores from the OvsPmdCoreList and the OvsDpdkCoreList.
- Include all remaining cores.
- Pair the sibling threads together.
- NovaComputeCpuSharedSet
A comma-separated list or range of physical host CPU numbers used to determine the host CPUs for instance emulator threads. The recommended value for this parameter matches the value set for OvsDpdkCoreList.
- IsolCpusList
A set of CPU cores isolated from the host processes.
IsolCpusList is the isolated_cores value in the cpu-partitioning-variables.conf file for the tuned-profiles-cpu-partitioning component. Observe the following recommendations for IsolCpusList:
- Match the list of cores in OvsPmdCoreList and NovaComputeCpuDedicatedSet.
- Pair the sibling threads together.
- OvsDpdkCoreList
Provides CPU cores for non-datapath OVS-DPDK processes, such as handler and revalidator threads. This parameter has no impact on overall data path performance on multi-NUMA node hardware. OvsDpdkCoreList is the dpdk-lcore-mask value in OVS, and these cores are shared with the host. Observe the following recommendations for OvsDpdkCoreList:
- Allocate the first physical core, and its sibling thread, from each NUMA node, even if the NUMA node has no associated DPDK NIC.
- These cores must be mutually exclusive from the list of cores in OvsPmdCoreList and NovaComputeCpuDedicatedSet.
- DerivePciWhitelistEnabled
To reserve virtual functions (VFs) for VMs, use the NovaPCIPassthrough parameter to create a list of VFs passed through to Nova. VFs excluded from the list remain available for the host.
Red Hat recommends that you change the DerivePciWhitelistEnabled value to false from the default of true, and then manually configure the list in the NovaPCIPassthrough parameter.
For each VF in the list, populate the address parameter with a regular expression that resolves to the address value.
For example, if NIC partitioning is enabled in a device named eno2, list the PCI addresses of its VFs. If VFs 0, 4, and 6 are used by eno2 for NIC partitioning, manually configure NovaPCIPassthrough to include VFs 1-3, 5, and 7, and consequently exclude VFs 0, 4, and 6.
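As an illustration of the manual list-creation process described above, the following is a sketch of a heat environment fragment. The physical_network name and the bus and slot values are hypothetical; substitute the values reported for your own VFs.

```yaml
parameter_defaults:
  NovaPCIPassthrough:
    # Hypothetical example: pass through VFs 1-3, 5, and 7 of a NIC at
    # bus 18, slot 06, excluding VFs 0, 4, and 6 used for NIC partitioning.
    - physical_network: "tenant_net"
      address: {"domain": ".*", "bus": "18", "slot": "06", "function": "[1-3]"}
    - physical_network: "tenant_net"
      address: {"domain": ".*", "bus": "18", "slot": "06", "function": "[5,7]"}
```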
7.4.2. Memory parameters
OVS-DPDK uses the following memory parameters:
- OvsDpdkMemoryChannels
Maps memory channels in the CPU per NUMA node. OvsDpdkMemoryChannels is the other_config:dpdk-extra="-n <value>" value in OVS. Observe the following recommendations for OvsDpdkMemoryChannels:
- Use dmidecode -t memory or your hardware manual to determine the number of memory channels available.
- Use ls /sys/devices/system/node/node* -d to determine the number of NUMA nodes.
- Divide the number of memory channels available by the number of NUMA nodes.
- NovaReservedHostMemory
Reserves memory in MB for tasks on the host. NovaReservedHostMemory is the reserved_host_memory_mb value for the Compute node in nova.conf. Observe the following recommendation for NovaReservedHostMemory:
- Use the static recommended value of 4096 MB.
- OvsDpdkSocketMemory
Specifies the amount of memory in MB to pre-allocate from the hugepage pool, per NUMA node.
OvsDpdkSocketMemory is the other_config:dpdk-socket-mem value in OVS. Observe the following recommendations for OvsDpdkSocketMemory:
- Provide as a comma-separated list.
- For a NUMA node without a DPDK NIC, use the static recommendation of 1024 MB (1 GB).
- Calculate the OvsDpdkSocketMemory value from the MTU value of each NIC on the NUMA node.
The following equation approximates the value for OvsDpdkSocketMemory:
MEMORY_REQD_PER_MTU = (ROUNDUP_PER_MTU + 800) * (4096 * 64) Bytes
- 800 is the overhead value.
- 4096 * 64 is the number of packets in the mempool.
- Add the MEMORY_REQD_PER_MTU for each of the MTU values set on the NUMA node and add another 512 MB as buffer. Round the value up to a multiple of 1024.
Sample Calculation - MTU 2000 and MTU 9000
DPDK NICs dpdk0 and dpdk1 are on the same NUMA node 0, and configured with MTUs 9000 and 2000 respectively. The sample calculation to derive the memory required is as follows:
Round off the MTU values to the nearest multiple of 1024 bytes.
The MTU value of 9000 becomes 9216 bytes. The MTU value of 2000 becomes 2048 bytes.
Calculate the required memory for each MTU value based on these rounded byte values.
Memory required for 9000 MTU = (9216 + 800) * (4096*64) = 2625634304 Memory required for 2000 MTU = (2048 + 800) * (4096*64) = 746586112
Calculate the combined total memory required, in bytes.
2625634304 + 746586112 + 536870912 = 3909091328 bytes.
This calculation represents (Memory required for MTU of 9000) + (Memory required for MTU of 2000) + (512 MB buffer).
Convert the total memory required into MB.
3909091328 / (1024*1024) = 3728 MB.
Round this value up to the nearest multiple of 1024.
3728 MB rounds up to 4096 MB.
Use this value to set OvsDpdkSocketMemory:
OvsDpdkSocketMemory: “4096,1024”
Sample Calculation - MTU 2000
DPDK NICs dpdk0 and dpdk1 are on the same NUMA node 0, and each are configured with MTUs of 2000. The sample calculation to derive the memory required is as follows:
Round off the MTU values to the nearest multiple of 1024 bytes.
The MTU value of 2000 becomes 2048 bytes.
Calculate the required memory for each MTU value based on these rounded byte values.
Memory required for 2000 MTU = (2048 + 800) * (4096*64) = 746586112
Calculate the combined total memory required, in bytes.
746586112 + 536870912 = 1283457024 bytes.
This calculation represents (Memory required for MTU of 2000) + (512 MB buffer).
Convert the total memory required into MB.
1283457024 / (1024*1024) = 1224 MB.
Round this value up to the nearest multiple of 1024.
1224 MB rounds up to 2048 MB.
Use this value to set OvsDpdkSocketMemory:
OvsDpdkSocketMemory: “2048,1024”
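The two sample calculations above can be reproduced with a short script. This is a sketch; the function name is illustrative and not part of any OpenStack tooling.

```python
# Sketch: reproduce the OvsDpdkSocketMemory calculation from this section.

def socket_memory_mb(mtus, buffer_bytes=512 * 1024 * 1024):
    """Return the per-NUMA-node OvsDpdkSocketMemory value in MB."""
    total = buffer_bytes
    for mtu in mtus:
        # Round the MTU up to the nearest multiple of 1024 bytes.
        rounded = -(-mtu // 1024) * 1024
        # (ROUNDUP_PER_MTU + 800 bytes overhead) * (4096 * 64 packets in the mempool)
        total += (rounded + 800) * (4096 * 64)
    mb = -(-total // (1024 * 1024))   # convert bytes to MB, rounding up
    return -(-mb // 1024) * 1024      # round up to a multiple of 1024 MB

print(socket_memory_mb([9000, 2000]))  # 4096, the MTU 9000 + MTU 2000 example
print(socket_memory_mb([2000]))        # 2048, the MTU 2000 example
```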
7.4.3. Networking parameters
- NeutronDpdkDriverType
Sets the driver type used by DPDK. Use the default value of vfio-pci.
- NeutronDatapathType
Datapath type for OVS bridges. DPDK uses the default value of netdev.
- NeutronVhostuserSocketDir
Sets the vhost-user socket directory for OVS. Use /var/lib/vhost_sockets for vhost client mode.
7.4.4. Other parameters
- NovaSchedulerDefaultFilters
- Provides an ordered list of filters that the Compute node uses to find a matching Compute node for a requested guest instance.
- VhostuserSocketGroup
Sets the vhost-user socket directory group. The default value is qemu. Set VhostuserSocketGroup to hugetlbfs so that the ovs-vswitchd and qemu processes can access the shared huge pages and unix socket that configures the virtio-net device. This value is role-specific and should be applied to any role leveraging OVS-DPDK.
Provides multiple kernel arguments to /etc/default/grub for the Compute node at boot time. Add the following values based on your configuration:
- hugepagesz: Sets the size of the huge pages on a CPU. This value can vary depending on the CPU hardware. Set to 1G for OVS-DPDK deployments (default_hugepagesz=1GB hugepagesz=1G). Use the following command to check for the pdpe1gb CPU flag that confirms your CPU supports 1G pages:
lshw -class processor | grep pdpe1gb
- hugepages count: Sets the number of huge pages available based on available host memory. Use most of your available memory, except NovaReservedHostMemory. You must also configure the huge pages count value within the flavor of your Compute nodes.
- iommu: For Intel CPUs, add intel_iommu=on iommu=pt.
- isolcpus: Sets the CPU cores for tuning. This value matches IsolCpusList.
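Putting the values above together, a complete KernelArgs setting might look like the following sketch. The hugepages count and the isolcpus range are illustrative assumptions; derive them from your own host memory and IsolCpusList.

```yaml
parameter_defaults:
  ComputeOvsDpdkParameters:
    # Illustrative values: 32 x 1G huge pages, IOMMU enabled for an Intel CPU,
    # isolcpus matching a hypothetical IsolCpusList.
    KernelArgs: "default_hugepagesz=1GB hugepagesz=1G hugepages=32 iommu=pt intel_iommu=on isolcpus=2-7,10-15"
```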
7.4.5. Instance extra specifications
Before deploying instances in an NFV environment, create a flavor that utilizes CPU pinning, huge pages, and emulator thread pinning.
- hw:cpu_policy
When this parameter is set to dedicated, the guest uses pinned CPUs. Instances created from a flavor with this parameter set have an effective overcommit ratio of 1:1. The default value is shared.
Set this parameter to a valid string of a specific value with a standard suffix (for example, 4KB, 2MB, or 1GB). Use 1GB to match the hugepagesz boot parameter. Calculate the number of huge pages available for the virtual machines by subtracting OvsDpdkSocketMemory from the boot parameter. The following values are also valid:
- small (default) - The smallest page size is used.
- large - Only use large page sizes: 2MB or 1GB on x86 architectures.
- any - The compute driver can attempt to use large pages, but defaults to small if none are available.
- hw:emulator_threads_policy
Set the value of this parameter to share so that emulator threads are locked to CPUs that you have identified in the heat parameter, NovaComputeCpuSharedSet. If an emulator thread is running on a vCPU with the poll mode driver (PMD) or real-time processing, you can experience negative effects, such as packet loss.
7.5. Two NUMA node example OVS-DPDK deployment
The Compute node in the following example includes two NUMA nodes:
- NUMA 0 has cores 0-7. The sibling thread pairs are (0,1), (2,3), (4,5), and (6,7).
- NUMA 1 has cores 8-15. The sibling thread pairs are (8,9), (10,11), (12,13), and (14,15).
- Each NUMA node connects to a physical NIC, namely NIC1 on NUMA 0, and NIC2 on NUMA 1.
Reserve the first physical core, and its sibling thread, on each NUMA node (CPUs 0,1 and 8,9) for non-datapath OVS-DPDK processes by assigning them to OvsDpdkCoreList.
This example also assumes a 1500 MTU configuration, so the OvsDpdkSocketMemory is the same for all use cases:
OvsDpdkSocketMemory: “1024,1024”
NIC 1 for DPDK, with one physical core for PMD
In this use case, you allocate one physical core on NUMA 0 for PMD. You must also allocate one physical core on NUMA 1, even though DPDK is not enabled on the NIC for that NUMA node. The remaining cores, not reserved for OvsDpdkCoreList, are allocated for guest instances. The resulting parameter settings are:
OvsPmdCoreList: “2,3,10,11”
NovaComputeCpuDedicatedSet: “4,5,6,7,12,13,14,15”
NIC 1 for DPDK, with two physical cores for PMD
In this use case, you allocate two physical cores on NUMA 0 for PMD. You must also allocate one physical core on NUMA 1, even though DPDK is not enabled on the NIC for that NUMA node. The remaining cores, not reserved for OvsDpdkCoreList, are allocated for guest instances. The resulting parameter settings are:
OvsPmdCoreList: “2,3,4,5,10,11”
NovaComputeCpuDedicatedSet: “6,7,12,13,14,15”
NIC 2 for DPDK, with one physical core for PMD
In this use case, you allocate one physical core on NUMA 1 for PMD. You must also allocate one physical core on NUMA 0, even though DPDK is not enabled on the NIC for that NUMA node. The remaining cores, not reserved for OvsDpdkCoreList, are allocated for guest instances. The resulting parameter settings are:
OvsPmdCoreList: “2,3,10,11”
NovaComputeCpuDedicatedSet: “4,5,6,7,12,13,14,15”
NIC 2 for DPDK, with two physical cores for PMD
In this use case, you allocate two physical cores on NUMA 1 for PMD. You must also allocate one physical core on NUMA 0, even though DPDK is not enabled on the NIC for that NUMA node. The remaining cores, not reserved for OvsDpdkCoreList, are allocated for guest instances. The resulting parameter settings are:
OvsPmdCoreList: “2,3,10,11,12,13”
NovaComputeCpuDedicatedSet: “4,5,6,7,14,15”
NIC 1 and NIC2 for DPDK, with two physical cores for PMD
In this use case, you allocate two physical cores on each NUMA node for PMD. The remaining cores, not reserved for OvsDpdkCoreList, are allocated for guest instances. The resulting parameter settings are:
OvsPmdCoreList: “2,3,4,5,10,11,12,13”
NovaComputeCpuDedicatedSet: “6,7,14,15”
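The use cases above all follow the same partitioning rule, which can be sketched in a few lines of Python. The topology dictionary and helper name are illustrative, assuming the 2-NUMA-node example from this section.

```python
# Sketch of the core partitioning used in the examples above, assuming the
# 2-NUMA-node topology from this section (helper names are illustrative).

SIBLINGS = {
    0: [(0, 1), (2, 3), (4, 5), (6, 7)],        # NUMA 0 sibling thread pairs
    1: [(8, 9), (10, 11), (12, 13), (14, 15)],  # NUMA 1 sibling thread pairs
}

def partition(pmd_cores_per_node):
    """pmd_cores_per_node: physical PMD cores to reserve per NUMA node."""
    lcore, pmd, dedicated = [], [], []
    for node, pairs in SIBLINGS.items():
        # The first physical core (and its sibling) goes to OvsDpdkCoreList.
        lcore += pairs[0]
        n = pmd_cores_per_node[node]
        for pair in pairs[1:1 + n]:
            pmd += pair                 # OvsPmdCoreList
        for pair in pairs[1 + n:]:
            dedicated += pair           # NovaComputeCpuDedicatedSet
    return lcore, pmd, dedicated

# NIC 1 for DPDK, one physical core for PMD (plus the mandatory core on NUMA 1):
lcore, pmd, dedicated = partition({0: 1, 1: 1})
print(pmd)        # [2, 3, 10, 11]
print(dedicated)  # [4, 5, 6, 7, 12, 13, 14, 15]
```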
7.6. Topology of an NFV OVS-DPDK deployment
This example deployment shows an OVS-DPDK configuration and consists of two virtual network functions (VNFs) with two interfaces each:
- The management interface, represented by mgt.
- The data plane interface.
In the OVS-DPDK deployment, the VNFs operate with built-in DPDK support for the physical interface. OVS-DPDK enables bonding at the vSwitch level. For improved performance in your OVS-DPDK deployment, separate kernel NICs from OVS-DPDK NICs. To separate the management (mgt) network, which connects to the base provider network for the virtual machine, ensure that you have additional NICs. The Compute node consists of two regular NICs for the Red Hat OpenStack Platform API management, which can be reused by the Ceph API but cannot be shared with any OpenStack project.
NFV OVS-DPDK topology
The following image shows the topology for OVS-DPDK for NFV. It consists of Compute and Controller nodes with 1 or 10 Gbps NICs, and the director node.
Chapter 8. Configuring an OVS-DPDK deployment
This section describes how to deploy OVS-DPDK within the Red Hat OpenStack Platform environment. The overcloud usually consists of nodes in predefined roles such as Controller nodes, Compute nodes, and different storage node types. Each of these default roles contains a set of services defined in the core heat templates on the director node.
You must install and configure the undercloud before you can deploy the overcloud. See the Director Installation and Usage Guide for details.
You must determine the best values for the OVS-DPDK parameters found in the network-environment.yaml file to optimize your OpenStack network for OVS-DPDK.
Do not manually edit or change isolated_cores or other values in /etc/tuned/cpu-partitioning-variables.conf that the director heat templates modify.
8.1. Deriving DPDK parameters with workflows
This feature is available in this release as a Technology Preview, and therefore is not fully supported by Red Hat. It should only be used for testing, and should not be deployed in a production environment. For more information about Technology Preview features, see Scope of Coverage Details.
See Section 7.2, “Workflows and derived parameters” for an overview of the Mistral workflow for DPDK.
Prerequisites
You must have bare metal introspection, including hardware inspection extras (inspection_extras), enabled to provide the data retrieved by this workflow. Hardware inspection extras are enabled by default. For more information about the hardware of nodes, see Inspecting the hardware of nodes.
Define the Workflows and Input Parameters for DPDK
The following list outlines the input parameters you can provide to the OVS-DPDK workflows:
- num_phy_cores_per_numa_node_for_pmd
- This input parameter specifies the required minimum number of cores for the NUMA node associated with the DPDK NIC. One physical core is assigned for the other NUMA nodes not associated with DPDK NIC. Ensure that this parameter is set to 1.
- huge_page_allocation_percentage
This input parameter specifies the required percentage of total memory, excluding NovaReservedHostMemory, that can be configured as huge pages. The KernelArgs parameter is derived using the calculated huge pages, based on the huge_page_allocation_percentage specified. Ensure that this parameter is set to 50.
The workflows calculate appropriate DPDK parameter values from these input parameters and the bare-metal introspection details.
To define the workflows and input parameters for DPDK:
Copy the /usr/share/openstack-tripleo-heat-templates/plan-samples/plan-environment-derived-params.yaml file to a local directory and set the input parameters to suit your environment.
Run the openstack overcloud deploy command and include the following information:
- The update-plan-only option
- The role file and all environment files specific to your environment
- The plan-environment-derived-params.yaml file with the --plan-environment-file optional argument
$ openstack overcloud deploy --templates --update-plan-only \
  -r /home/stack/roles_data.yaml \
  -e /home/stack/<environment-file> \
  ... # repeat as necessary ...
  -p /home/stack/plan-environment-derived-params.yaml
The output of this command shows the derived results, which are also merged into the plan-environment.yaml file.
The OvsDpdkMemoryChannels parameter cannot be derived from introspection details. In most cases, this value should be 4.
Deploy the overcloud with the derived parameters
To deploy the overcloud with these derived parameters:
Copy the derived parameters from the deploy command output to the network-environment.yaml file.
Note: You must assign at least one CPU with its sibling thread on each NUMA node, with or without DPDK NICs present, for the DPDK PMD to avoid failures in creating guest instances.
Note: These parameters apply to the specific role, ComputeOvsDpdk. You can apply these parameters globally, but role-specific parameters overwrite any global parameters.
- Deploy the overcloud using the role file and all environment files specific to your environment.
openstack overcloud deploy --templates \
-r /home/stack/roles_data.yaml \
-e /home/stack/<environment-file> \
... #repeat as necessary ...
In a cluster with Compute, ComputeOvsDpdk, and ComputeSriov roles, the workflow applies the formula only to the ComputeOvsDpdk role, not to Compute or ComputeSriov.
8.2. OVS-DPDK topology
With Red Hat OpenStack Platform, you can create custom deployment roles, using the composable roles feature to add or remove services from each role. For more information on Composable Roles, see Composable Services and Custom Roles in Advanced Overcloud Customization.
This image shows an example OVS-DPDK topology with two bonded ports for the control plane and data plane:
To configure OVS-DPDK, perform the following tasks:
- If you use composable roles, copy and modify the roles_data.yaml file to add the custom role for OVS-DPDK.
- Update the appropriate network-environment.yaml file to include parameters for kernel arguments and DPDK arguments.
- Update the compute.yaml file to include the bridge for DPDK interface parameters.
- Update the controller.yaml file to include the same bridge details for DPDK interface parameters.
- Run the overcloud_deploy.sh script to deploy the overcloud with the DPDK parameters.
This guide provides examples for CPU assignments, memory allocation, and NIC configurations that can vary from your topology and use case. For more information on hardware and configuration options, see: Network Functions Virtualization Product Guide and Chapter 2, Hardware requirements .
Prerequisites
- OVS 2.10
- DPDK 17
- A supported NIC. To view the list of supported NICs for NFV, see Section 2.1, “Tested NICs”.
The Red Hat OpenStack Platform operates in OVS client mode for OVS-DPDK deployments.
8.3. Setting the MTU value for OVS-DPDK interfaces
Red Hat OpenStack Platform supports jumbo frames for OVS-DPDK. To set the maximum transmission unit (MTU) value for jumbo frames you must:
- Set the global MTU value for networking in the network-environment.yaml file.
- Set the physical DPDK port MTU value in the compute.yaml file. This value is also used by the vhost user interface.
- Set the MTU value within any guest instances on the Compute node to ensure that you have a comparable MTU value from end to end in your configuration.
VXLAN packets include an extra 50 bytes in the header. Calculate your MTU requirements based on these additional header bytes. For example, an MTU value of 9000 means the VXLAN tunnel MTU value is 8950 to account for these extra bytes.
You do not need any special configuration for the physical NIC because the NIC is controlled by the DPDK PMD, and has the same MTU value set by the compute.yaml file. You cannot set an MTU value larger than the maximum value supported by the physical NIC.
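The 50-byte header adjustment described above can be sketched as a one-liner; the helper name is illustrative.

```python
# Sketch: subtract the 50-byte VXLAN header overhead when sizing tunnel MTUs.
def vxlan_inner_mtu(physical_mtu, header_bytes=50):
    """Largest MTU usable inside a VXLAN tunnel over a link with physical_mtu."""
    return physical_mtu - header_bytes

print(vxlan_inner_mtu(9000))  # 8950, matching the example above
```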
To set the MTU value for OVS-DPDK interfaces:
Set the NeutronGlobalPhysnetMtu parameter in the network-environment.yaml file.
parameter_defaults:
  # MTU global configuration
  NeutronGlobalPhysnetMtu: 9000
Note: Ensure that the OvsDpdkSocketMemory value in the network-environment.yaml file is large enough to support jumbo frames. For details, see Section 7.4.2, “Memory parameters”.
Set the MTU value on the bridge to the Compute node in the controller.yaml file.
Set the MTU values for an OVS-DPDK bond in the compute.yaml file.
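A sketch of the corresponding os-net-config fragment in compute.yaml follows; the bridge, bond, and NIC names are hypothetical and must match your own template.

```yaml
- type: ovs_user_bridge
  name: br-link0          # hypothetical bridge name
  use_dhcp: false
  members:
    - type: ovs_dpdk_bond
      name: dpdkbond0
      mtu: 9000           # jumbo frame MTU on the bond and both ports
      members:
        - type: ovs_dpdk_port
          name: dpdk0
          mtu: 9000
          members:
            - type: interface
              name: nic3
        - type: ovs_dpdk_port
          name: dpdk1
          mtu: 9000
          members:
            - type: interface
              name: nic4
```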
8.4. Configuring a firewall for security groups
Dataplane interfaces require high performance in a stateful firewall. To protect these interfaces, consider deploying a telco-grade firewall as a virtual network function (VNF).
To configure control plane interfaces, set the NeutronOVSFirewallDriver parameter to openvswitch. To use the flow-based OVS firewall driver, modify the network-environment.yaml file under parameter_defaults.
Example:
parameter_defaults:
NeutronOVSFirewallDriver: openvswitch
Use the openstack port set command to disable the OVS firewall driver for dataplane interfaces.
Example:
openstack port set --no-security-group --disable-port-security ${PORT}
8.5. Setting multiqueue for OVS-DPDK interfaces
Multiqueue is experimental and unsupported.
To set the same number of queues for interfaces in OVS-DPDK on the Compute node, modify the compute.yaml file:
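A sketch of what such a compute.yaml fragment could look like follows, assuming the os-net-config rx_queue attribute and hypothetical port names; verify the attribute against your os-net-config version.

```yaml
- type: ovs_user_bridge
  name: br-link0          # hypothetical bridge name
  members:
    - type: ovs_dpdk_port
      name: dpdk0
      rx_queue: 2         # same queue count on every OVS-DPDK interface
      members:
        - type: interface
          name: nic3
```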
8.6. Known limitations
Observe the following limitations when configuring OVS-DPDK with Red Hat OpenStack Platform for NFV:
- Use Linux bonds for control plane networks. Ensure that both the PCI devices used in the bond are on the same NUMA node for optimum performance. Neutron Linux bridge configuration is not supported by Red Hat.
- You require huge pages for every instance running on the hosts with OVS-DPDK. If huge pages are not present in the guest, the interface appears but does not function.
- With OVS-DPDK, there is a performance degradation of services that use tap devices, such as Distributed Virtual Routing (DVR). The resulting performance is not suitable for a production environment.
- When using OVS-DPDK, all bridges on the same Compute node must be of type ovs_user_bridge. The director may accept the configuration, but Red Hat OpenStack Platform does not support mixing ovs_bridge and ovs_user_bridge on the same node.
8.7. Creating a flavor and deploying an instance for OVS-DPDK
After you configure OVS-DPDK for your Red Hat OpenStack Platform deployment with NFV, you can create a flavor and deploy an instance with the following steps:
Create an aggregate group, and add relevant hosts for OVS-DPDK. Define metadata, for example
dpdk=true, that matches defined flavor metadata.
# openstack aggregate create dpdk_group
# openstack aggregate add host dpdk_group [compute-host]
# openstack aggregate set --property dpdk=true dpdk_group
Note: Pinned CPU instances can be located on the same Compute node as unpinned instances. For more information, see Configuring CPU pinning on the Compute node in the Instances and Images Guide.
Create a flavor.
openstack flavor create <flavor> --ram <MB> --disk <GB> --vcpus <#>
Set flavor properties. Note that the defined metadata,
dpdk=true, matches the defined metadata in the DPDK aggregate.
# openstack flavor set <flavor> --property dpdk=true --property hw:cpu_policy=dedicated --property hw:mem_page_size=1GB --property hw:emulator_threads_policy=isolate
For details on the emulator threads policy for performance improvements, see Configure Emulator Threads to run on a Dedicated Physical CPU.
Create the network.
# openstack network create net1 --provider-physical-network tenant --provider-network-type vlan --provider-segment <VLAN-ID>
# openstack subnet create subnet1 --network net1 --subnet-range 192.0.2.0/24 --dhcp
Optional: If you use multiqueue with OVS-DPDK, set the hw_vif_multiqueue_enabled property on the image that you want to use to create an instance:
# openstack image set --property hw_vif_multiqueue_enabled=true <image>
Deploy an instance.
# openstack server create --flavor <flavor> --image <glance image> --nic net-id=<network ID> <server_name>
8.8. Troubleshooting the configuration
This section describes the steps to troubleshoot the OVS-DPDK configuration.
Review the bridge configuration, and confirm that the bridge has datapath_type=netdev.
Confirm that the docker container neutron_ovs_agent is configured to start automatically.
# docker inspect neutron_ovs_agent | grep -A1 RestartPolicy
    "RestartPolicy": {
        "Name": "always",
Optionally, you can view logs for errors, such as if the container fails to start.
# less /var/log/containers/neutron/openvswitch-agent.log
Confirm that the Poll Mode Driver CPU mask of the ovs-dpdk is pinned to the CPUs. In case of hyper-threading, use sibling CPUs.
For example, to check the sibling of CPU4, run the following command:
# cat /sys/devices/system/cpu/cpu4/topology/thread_siblings_list
4,20
The sibling of CPU4 is CPU20, therefore proceed with the following command:
# ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=0x100010
Display the status:
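As a cross-check, the pmd-cpu-mask value used in this procedure can be derived from the CPU IDs: each pinned CPU contributes bit (1 << id) to the mask. A minimal shell sketch, using CPUs 4 and 20 from the example:

```shell
# Sketch: derive the PMD CPU mask for CPUs 4 and 20 (the sibling pair used
# in the example). Each CPU ID contributes one bit to the mask.
mask=0
for cpu in 4 20; do
    mask=$(( mask | (1 << cpu) ))
done
printf '0x%x\n' "$mask"    # prints 0x100010
```

The result matches the 0x100010 mask passed to ovs-vsctl.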
9.1. Pinning emulator threads
Emulator threads handle interrupt requests and non-blocking processes for virtual machine hardware emulation. These threads float across the CPUs that the guest uses for processing. If threads used for the poll mode driver (PMD) or real-time processing run on these guest CPUs, you can experience packet loss or missed deadlines.
You can separate emulator threads from VM processing tasks by pinning the threads to their own guest CPUs, increasing performance as a result.
9.1.1. Configuring CPUs to host emulator threads
To improve performance, reserve a subset of host CPUs identified in the OvsDpdkCoreList parameter for hosting emulator threads.
Procedure
Deploy an overcloud with NovaComputeCpuSharedSet defined for a given role. The value of NovaComputeCpuSharedSet applies to the cpu_shared_set parameter in the nova.conf file for hosts within that role.
parameter_defaults:
  ComputeOvsDpdkParameters:
    OvsDpdkCoreList: "0-1,16-17"
    NovaComputeCpuSharedSet: "0-1,16-17"
    NovaComputeCpuDedicatedSet: "2-15,18-31"
Create a flavor to build instances with emulator threads separated into a shared pool.
openstack flavor create --ram <size_mb> --disk <size_gb> --vcpus <vcpus> <flavor>
Add the hw:emulator_threads_policy extra specification, and set the value to share. Instances created with this flavor will use the instance CPUs defined in the cpu_shared_set parameter in the nova.conf file.
openstack flavor set <flavor> --property hw:emulator_threads_policy=share
You must set the cpu_shared_set parameter in the nova.conf file to enable the share policy for this extra specification. Preferably, use heat to set this parameter, because manual edits to nova.conf might not persist across redeployments.
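As a quick sanity check on values such as those in the example above, you can confirm that the shared and dedicated CPU sets do not overlap. This is a minimal shell sketch; the expand helper is illustrative, not part of any product tooling:

```shell
# Sketch: expand "a-b,c-d" CPU-list notation and verify that the example
# NovaComputeCpuSharedSet (0-1,16-17) and NovaComputeCpuDedicatedSet
# (2-15,18-31) are disjoint.
expand() {
    for part in $(echo "$1" | tr ',' ' '); do
        case "$part" in
            *-*) seq "${part%-*}" "${part#*-}" ;;   # expand a range
            *)   echo "$part" ;;                    # single CPU ID
        esac
    done
}

dup=$( { expand "0-1,16-17"; expand "2-15,18-31"; } | sort -n | uniq -d )
if [ -z "$dup" ]; then
    echo "sets are disjoint"
else
    echo "overlapping CPUs: $dup"
fi
```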
9.1.2. Verifying the emulator thread pinning
Procedure
Identify the host and name for a given instance.
openstack server show <instance_id>
Use SSH to log on to the identified host as heat-admin.
ssh heat-admin@compute-1
[compute-1]$ sudo virsh dumpxml instance-00001 | grep 'emulatorpin cpuset'
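The grep in the previous step isolates the emulatorpin element from the libvirt domain XML. A minimal sketch against a hypothetical XML fragment (the cpuset value is an assumed example, not output from a real host):

```shell
# Sketch: extract the emulator thread pinning from a sample libvirt XML
# fragment. The cpuset value here is hypothetical.
xml='<cputune>
  <emulatorpin cpuset="0-1,16-17"/>
</cputune>'

echo "$xml" | grep -o 'emulatorpin cpuset="[^"]*"'    # prints: emulatorpin cpuset="0-1,16-17"
```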
9.2. Enabling RT-KVM for NFV Workloads
To facilitate installing and configuring Red Hat Enterprise Linux 8.0 Real Time KVM (RT-KVM), Red Hat OpenStack Platform provides the following features:
- A real-time Compute node role that provisions Red Hat Enterprise Linux for real-time.
- The additional RT-KVM kernel module.
- Automatic configuration of the Compute node.
9.2.1. Planning for your RT-KVM Compute nodes
You must use Red Hat certified servers for your RT-KVM Compute nodes. For more information, see: Red Hat Enterprise Linux for Real Time 7 certified servers.
For details on how to enable the rhel-8-server-nfv-rpms repository for RT-KVM, and to ensure that your system is up to date, see Registering and updating your undercloud.
You need a separate subscription to a Red Hat OpenStack Platform for Real Time SKU before you can access this repository.
Building the real-time image
Install the libguestfs-tools package on the undercloud to get the virt-customize tool:
(undercloud) [stack@undercloud-0 ~]$ sudo dnf install libguestfs-tools
Important: If you install the libguestfs-tools package on the undercloud, disable iscsid.socket to avoid port conflicts with the tripleo_iscsid service on the undercloud:
$ sudo systemctl disable --now iscsid.socket
Extract the images:
(undercloud) [stack@undercloud-0 ~]$ tar -xf /usr/share/rhosp-director-images/overcloud-full.tar
(undercloud) [stack@undercloud-0 ~]$ tar -xf /usr/share/rhosp-director-images/ironic-python-agent.tar
Copy the default image:
(undercloud) [stack@undercloud-0 ~]$ cp overcloud-full.qcow2 overcloud-realtime-compute.qcow2
Register your image to enable Red Hat repositories relevant to your customizations. Replace [username] and [password] with valid credentials in the following example.
virt-customize -a overcloud-realtime-compute.qcow2 --run-command \
'subscription-manager register --username=[username] --password=[password]' \
--run-command 'subscription-manager release --set 8.1'
Note: For security, you can remove credentials from the history file if they are used on the command prompt. You can delete individual lines in history using the history -d command followed by the line number.
Find a list of pool IDs from your account's subscriptions, and attach the appropriate pool ID to your image.
sudo subscription-manager list --all --available | less
...
virt-customize -a overcloud-realtime-compute.qcow2 --run-command \
'subscription-manager attach --pool [pool-ID]'
Add the repositories necessary for Red Hat OpenStack Platform with NFV.
Create a script to configure real-time capabilities on the image.
Run the script to configure the real-time image:
(undercloud) [stack@undercloud-0 ~]$ virt-customize -a overcloud-realtime-compute.qcow2 -v --run rt.sh 2>&1 | tee virt-customize.log
Note: If you see the line "grubby fatal error: unable to find a suitable template" in the rt.sh script output, you can ignore this error.
Examine the virt-customize.log file that resulted from the previous command, to check that the packages installed correctly using the rt.sh script.
Relabel SELinux:
(undercloud) [stack@undercloud-0 ~]$ virt-customize -a overcloud-realtime-compute.qcow2 --selinux-relabel
Extract vmlinuz and initrd:
(undercloud) [stack@undercloud-0 ~]$ mkdir image
(undercloud) [stack@undercloud-0 ~]$ guestmount -a overcloud-realtime-compute.qcow2 -i --ro image
(undercloud) [stack@undercloud-0 ~]$ cp image/boot/vmlinuz-3.10.0-862.rt56.804.el7.x86_64 ./overcloud-realtime-compute.vmlinuz
(undercloud) [stack@undercloud-0 ~]$ cp image/boot/initramfs-3.10.0-862.rt56.804.el7.x86_64.img ./overcloud-realtime-compute.initrd
(undercloud) [stack@undercloud-0 ~]$ guestunmount image
Note: The software versions in the vmlinuz and initramfs filenames vary with the kernel version.
Upload the image:
(undercloud) [stack@undercloud-0 ~]$ openstack overcloud image upload --update-existing --os-image-name overcloud-realtime-compute.qcow2
You now have a real-time image you can use with the ComputeOvsDpdkRT composable role on your selected Compute nodes.
Modifying BIOS settings on RT-KVM Compute nodes
To reduce latency on your RT-KVM Compute nodes, disable all options for the following parameters in your Compute node BIOS settings:
- Power Management
- Hyper-Threading
- CPU sleep states
- Logical processors
For descriptions of these settings and the impact of disabling them, see: Setting BIOS parameters. See your hardware manufacturer documentation for complete details on how to change BIOS settings.
9.2.2. Configuring OVS-DPDK with RT-KVM
You must determine the best values for the OVS-DPDK parameters that you set in the network-environment.yaml file to optimize your OpenStack network for OVS-DPDK. For more details, see Section 8.1, “Deriving DPDK parameters with workflows”.
9.2.2.1. Generating the ComputeOvsDpdk composable role
Use the ComputeOvsDpdkRT role to specify Compute nodes for the real-time compute image.
Generate roles_data.yaml for the ComputeOvsDpdkRT role.
(undercloud) [stack@undercloud-0 ~]$ openstack overcloud roles generate -o roles_data.yaml Controller ComputeOvsDpdkRT
9.2.2.2. Configuring the OVS-DPDK parameters
Determine the best values for the OVS-DPDK parameters in the network-environment.yaml file to optimize your deployment. For more information, see Section 8.1, “Deriving DPDK parameters with workflows”.
Add the NIC configuration for the OVS-DPDK role you use under resource_registry:
resource_registry:
  # Specify the relative/absolute path to the config files that you want to use to override the default.
  OS::TripleO::ComputeOvsDpdkRT::Net::SoftwareConfig: nic-configs/compute-ovs-dpdk.yaml
  OS::TripleO::Controller::Net::SoftwareConfig: nic-configs/controller.yaml
Under parameter_defaults, set the OVS-DPDK and RT-KVM parameters:
9.2.2.3. Deploying the overcloud
Deploy the overcloud for ML2-OVS:
9.2.3. Launching an RT-KVM instance
Perform the following steps to launch an RT-KVM instance on a real-time enabled Compute node:
Create an RT-KVM flavor on the overcloud:
Launch an RT-KVM instance:
# openstack server create --image <rhel> --flavor r1.small --nic net-id=<dpdk-net> test-rt
To verify that the instance uses the assigned emulator threads, run the following command:
9.3. Trusted Virtual Functions
You can configure trust between physical functions (PFs) and virtual functions (VFs), so that VFs can perform privileged actions, such as enabling promiscuous mode, or modifying a hardware address.
Prerequisites
- An operational installation of Red Hat OpenStack Platform including director
Procedure
Complete the following steps to configure and deploy the overcloud with trust between physical and virtual functions:
Add the NeutronPhysicalDevMappings parameter in the parameter_defaults section to link the logical network name and the physical interface.
parameter_defaults:
  NeutronPhysicalDevMappings:
    - sriov2:p5p2
Add the new property, trusted, to the SR-IOV parameters.
Note: You must include double quotation marks around the value "true".
Important: Complete the following step only in trusted environments, as it allows trusted port binding by non-administrative accounts.
Modify permissions to allow users to create and update port bindings.
9.3.2. Utilizing trusted VF networks
Create a network of type vlan.
openstack network create trusted_vf_network --provider-network-type vlan \
  --provider-segment 111 --provider-physical-network sriov2 \
  --external --disable-port-security
Create a subnet.
openstack subnet create --network trusted_vf_network \
  --ip-version 4 --subnet-range 192.168.111.0/24 --no-dhcp \
  subnet-trusted_vf_network
Create a port. Set the vnic-type option to direct, and the binding-profile option to trusted=true.
openstack port create --network trusted_vf_network \
  --vnic-type direct --binding-profile trusted=true \
  trusted_vf_network_port_trusted
Create an instance, and bind it to the previously-created trusted port.
openstack server create --image rhel --flavor dpdk --network internal --port trusted_vf_network_port_trusted --config-drive True --wait rhel-dpdk-sriov_trusted
Verify the trusted VF configuration on the hypervisor
- On the compute node on which you created the instance, run the following command:
# ip link
7: p5p2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether b4:96:91:1c:40:fa brd ff:ff:ff:ff:ff:ff
    vf 6     MAC fa:16:3e:b8:91:c2, vlan 111, spoof checking off, link-state auto, trust on, query_rss off
    vf 7     MAC fa:16:3e:84:cf:c8, vlan 111, spoof checking off, link-state auto, trust off, query_rss off
- Verify that the trust status of the VF is trust on. The example output contains details of an environment that contains two ports. Note that vf 6 contains the text trust on.
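To list only the trusted VFs in output like the listing above, filter on the trust flag. A minimal sketch that reuses the sample VF lines from the example:

```shell
# Sketch: filter sample `ip link` VF lines (from the example above) for
# VFs with trust enabled. Only the "vf 6" line matches.
sample='vf 6 MAC fa:16:3e:b8:91:c2, vlan 111, spoof checking off, link-state auto, trust on, query_rss off
vf 7 MAC fa:16:3e:84:cf:c8, vlan 111, spoof checking off, link-state auto, trust off, query_rss off'

echo "$sample" | grep 'trust on'
```

On a live Compute node, you would pipe the output of `ip link show p5p2` into the same grep.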
9.4. Configuring RX/TX queue size
You can experience packet loss at high packet rates above 3.5 million packets per second (mpps) for many reasons, such as:
- a network interrupt
- an SMI (System Management Interrupt)
- packet processing latency in the Virtual Network Function
To prevent packet loss, increase the queue size from the default of 512 to a maximum of 1024.
Prerequisites
- To configure RX, ensure that you have libvirt v2.3 or later and QEMU v2.7 or later.
- To configure TX, ensure that you have libvirt v3.7 or later and QEMU v2.10 or later.
Procedure
To increase the RX and TX queue size, include the following lines in the parameter_defaults: section of a relevant director role. Here is an example with the ComputeOvsDpdk role:
parameter_defaults:
  ComputeOvsDpdkParameters:
    NovaLibvirtRxQueueSize: 1024
    NovaLibvirtTxQueueSize: 1024
Testing
You can observe the values for RX queue size and TX queue size in the nova.conf file:
[libvirt]
rx_queue_size=1024
tx_queue_size=1024
Copy to Clipboard Copied! Toggle word wrap Toggle overflow To verify the values for RX queue size and TX queue size, use the following command on a KVM host:
virsh dumpxml <vm name> | grep queue_size
$ virsh dumpxml <vm name> | grep queue_sizeCopy to Clipboard Copied! Toggle word wrap Toggle overflow - You can check for improved performance, such as 3.8 mpps/core at 0 frame loss.
9.5. Configuring a NUMA-aware vSwitch
This feature is available in this release as a Technology Preview, and therefore is not fully supported by Red Hat. It should only be used for testing, and should not be deployed in a production environment. For more information about Technology Preview features, see Scope of Coverage Details.
Before you implement a NUMA-aware vSwitch, examine the following components of your hardware configuration:
- The number of physical networks.
- The placement of PCI cards.
- The physical architecture of the servers.
Memory-mapped I/O (MMIO) devices, such as PCIe NICs, are associated with specific NUMA nodes. When a VM and the NIC are on different NUMA nodes, there is a significant decrease in performance. To increase performance, align PCIe NIC placement and instance processing on the same NUMA node.
Use this feature to ensure that instances that share a physical network are located on the same NUMA node. To optimize datacenter hardware, you can leverage load-sharing VMs by using multiple networks, different network types, or bonding.
To architect NUMA-node load sharing and network access correctly, you must understand the mapping of the PCIe slot and the NUMA node. For detailed information on your specific hardware, refer to your vendor’s documentation.
To prevent a cross-NUMA configuration, place the VM on the correct NUMA node, by providing the location of the NIC to Nova.
Prerequisites
- You have enabled the filter NUMATopologyFilter.
Procedure
- Set the NeutronPhysnetNUMANodesMapping parameter to map the physical network to the NUMA node that you associate with the physical network.
- If you use tunnels, such as VXLAN or GRE, you must also set the NeutronTunnelNUMANodes parameter.
parameter_defaults:
  NeutronPhysnetNUMANodesMapping: {<physnet_name>: [<NUMA_NODE>]}
  NeutronTunnelNUMANodes: <NUMA_NODE>,<NUMA_NODE>
Here is an example with two physical networks, with tunneling to NUMA node 0:
- one project network (tenant) associated with NUMA node 1
- one management network (mgmt) without any affinity
parameter_defaults:
  NeutronBridgeMappings:
    - tenant:br-link0
  NeutronPhysnetNUMANodesMapping: {tenant: [1], mgmt: [0,1]}
  NeutronTunnelNUMANodes: 0
Testing
Observe the configuration in the file /var/lib/config-data/puppet-generated/nova_libvirt/etc/nova/nova.conf
[neutron_physnet_tenant]
numa_nodes=1
[neutron_tunnel]
numa_nodes=1
Confirm the new configuration with the lscpu command:
$ lscpu
- Launch a VM, with the NIC attached to the appropriate network.
For details on configuring QoS, see Configuring Quality-of-Service (QoS) policies. Support is limited to the QoS rule type bandwidth-limit on SR-IOV and OVS-DPDK egress interfaces.
This section describes how to deploy Compute nodes with both OVS-DPDK and SR-IOV interfaces. The cluster includes ML2/OVS and VXLAN tunnelling.
You must determine the best values for the OVS-DPDK parameters that you set in the network-environment.yaml file to optimize your OpenStack network for OVS-DPDK. For details, see: Deriving DPDK parameters with workflows.
10.1. Configuring roles data
Red Hat OpenStack Platform provides a set of default roles in the roles_data.yaml file. You can create your own roles_data.yaml file to support the roles you require.
For the purposes of this example, the ComputeOvsDpdkSriov role is created. For information on creating roles in Red Hat OpenStack Platform, see Advanced Overcloud Customization. For details on the specific role used for this example, see roles_data.yaml.
10.2. Configuring OVS-DPDK parameters
You must determine the best values for the OVS-DPDK parameters that you set in the network-environment.yaml file to optimize your OpenStack network for OVS-DPDK. For details, see Deriving DPDK parameters with workflows.
Add the custom resources for OVS-DPDK under resource_registry:
resource_registry:
  # Specify the relative/absolute path to the config files that you want to use to override the default.
  OS::TripleO::ComputeOvsDpdkSriov::Net::SoftwareConfig: nic-configs/computeovsdpdksriov.yaml
  OS::TripleO::Controller::Net::SoftwareConfig: nic-configs/controller.yaml
Under parameter_defaults, set the tunnel type to vxlan, and the network type to vxlan,vlan:
NeutronTunnelTypes: 'vxlan'
NeutronNetworkType: 'vxlan,vlan'
Under parameter_defaults, set the bridge mapping:
# The OVS logical->physical bridge mappings to use.
NeutronBridgeMappings:
  - dpdk-mgmt:br-link0
Under parameter_defaults, set the role-specific parameters for the ComputeOvsDpdkSriov role:
Note: To prevent failures during guest creation, assign at least one CPU with a sibling thread on each NUMA node. In the example, the values for the OvsPmdCoreList parameter denote cores 2 and 22 from NUMA 0, and cores 3 and 23 from NUMA 1.
Note: These huge pages are consumed by the virtual machines, and also by OVS-DPDK using the OvsDpdkSocketMemory parameter as shown in this procedure. The number of huge pages available for the virtual machines is the boot parameter minus the OvsDpdkSocketMemory. You must also add hw:mem_page_size=1GB to the flavor that you associate with the DPDK instance.
Note: OvsDpdkCoreList and OvsDpdkMemoryChannels are the required settings for this procedure. For optimum operation, ensure that you deploy DPDK with appropriate parameters and values.
Configure the role-specific parameters for SR-IOV:
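The huge page accounting described in the note above can be illustrated with a small calculation. The values below are assumed examples (32 huge pages of 1 GB reserved at boot on one NUMA node, with OVS-DPDK taking 3072 MB of it through OvsDpdkSocketMemory); they are not taken from this document's templates:

```shell
# Sketch: huge pages left for instances on one NUMA node, using assumed
# example values.
boot_pages=32          # e.g. default_hugepagesz=1GB hugepages=32 on this node
socket_mem_mb=3072     # MB consumed by OVS-DPDK on this node (OvsDpdkSocketMemory)
page_size_mb=1024      # 1 GB huge pages

vm_pages=$(( boot_pages - socket_mem_mb / page_size_mb ))
echo "$vm_pages pages available for instances"    # prints: 29 pages available for instances
```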
10.3. Configuring the controller node
Create the control-plane Linux bond for an isolated network.
Assign VLANs to this Linux bond.
Create the OVS bridge to access the neutron-dhcp-agent and neutron-metadata-agent services.
10.4. Configuring the Compute node for DPDK and SR-IOV
Create the computeovsdpdksriov.yaml file from the default compute.yaml file, and make the following changes:
Create the control-plane Linux bond for an isolated network.
Assign VLANs to this Linux bond.
Set a bridge with a DPDK port to link to the controller.
Note: To include multiple DPDK devices, repeat the type code section for each DPDK device that you want to add.
Note: When using OVS-DPDK, all bridges on the same Compute node must be of type ovs_user_bridge. Red Hat OpenStack Platform does not support both ovs_bridge and ovs_user_bridge located on the same node.
10.5. Deploying the overcloud
- Run the overcloud_deploy.sh script:
For more information about upgrading Red Hat OpenStack Platform (RHOSP) with OVS-DPDK configured, see Preparing network functions virtualization (NFV) in the Framework for Upgrades (13 to 16.1) Guide.
Chapter 12. NFV Performance
Red Hat OpenStack Platform director configures the Compute nodes to enforce resource partitioning and fine tuning to achieve line rate performance for the guest virtual network functions (VNFs). The key performance factors in the NFV use case are throughput, latency, and jitter.
You can enable high-performance packet switching between physical NICs and virtual machines using data plane development kit (DPDK) accelerated virtual machines. OVS 2.10 embeds support for DPDK 17 and includes support for vhost-user multiqueue, allowing scalable performance. OVS-DPDK provides line-rate performance for guest VNFs.
Single root I/O virtualization (SR-IOV) networking provides enhanced performance, including improved throughput for specific networks and virtual machines.
Other important features for performance tuning include huge pages, NUMA alignment, host isolation, and CPU pinning. VNF flavors require huge pages and emulator thread isolation for better performance. Host isolation and CPU pinning improve NFV performance and prevent spurious packet loss.
For a high-level introduction to CPUs and NUMA topology, see: NFV Performance Considerations and Configure Emulator Threads to run on a Dedicated Physical CPU.
Chapter 13. Finding more information
The following table includes additional Red Hat documentation for reference:
The Red Hat OpenStack Platform documentation suite can be found here: Red Hat OpenStack Platform Documentation Suite
| Component | Reference |
|---|---|
| Red Hat Enterprise Linux | Red Hat OpenStack Platform is supported on Red Hat Enterprise Linux 8.0. For information on installing Red Hat Enterprise Linux, see the corresponding installation guide at: Red Hat Enterprise Linux Documentation Suite. |
| Red Hat OpenStack Platform | To install OpenStack components and their dependencies, use the Red Hat OpenStack Platform director. The director uses a basic OpenStack installation as the undercloud to install, configure, and manage the OpenStack nodes in the final overcloud. You need one extra host machine for the installation of the undercloud, in addition to the environment necessary for the deployed overcloud. For detailed instructions, see Red Hat OpenStack Platform Director Installation and Usage. For information on configuring advanced features for a Red Hat OpenStack Platform enterprise environment using the Red Hat OpenStack Platform director such as network isolation, storage configuration, SSL communication, and general configuration method, see Advanced Overcloud Customization. |
| NFV Documentation | For a high level overview of the NFV concepts, see the Network Functions Virtualization Product Guide. |
Appendix A. Sample DPDK SRIOV YAML files
This section provides sample yaml files as a reference to add single root I/O virtualization (SR-IOV) and Data Plane Development Kit (DPDK) interfaces on the same compute node.
These templates are from a fully-configured environment, and include parameters unrelated to NFV that might not apply to your deployment.
A.1. Sample VXLAN DPDK SRIOV YAML files
A.1.1. roles_data.yaml
- Run the openstack overcloud roles generate command to generate the roles_data.yaml file. Include role names in the command according to the roles that you want to deploy in your environment, such as Controller, ComputeSriov, ComputeOvsDpdkRT, ComputeOvsDpdkSriov, or other roles. For example, to generate a roles_data.yaml file that contains the roles Controller and ComputeOvsDpdkSriov, run the following command:
$ openstack overcloud roles generate -o roles_data.yaml Controller ComputeOvsDpdkSriov