Chapter 2. Validating an OVS-DPDK Deployment
This chapter describes the validation steps to take following a deployment.
2.1. Confirming OpenStack
Use the following commands to confirm OpenStack and OVS-DPDK configuration.
2.1.1. Show the Network Agents
Ensure that the value for Alive is True and State is UP for each agent. If there are any issues, view the logs in /var/log/neutron and /var/log/openvswitch/ovs-vswitchd.log to determine the issue.
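For example, you can list the agents and their Alive and State values with the OpenStack client, assuming it is available in your environment:
$ openstack network agent list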
2.1.2. Show the Hosts in the Compute Service
Ensure that the value for Status is enabled and State is up for each host. If there are any issues, see the logs in /var/log/nova to determine the issue.
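For example, you can list the compute service hosts and their Status and State values with the OpenStack client, assuming it is available in your environment:
$ openstack compute service list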
2.2. Confirming Compute Node OVS Configuration
To verify the configuration and health of network adapters and OpenvSwitch, complete the following steps.
To verify the DPDK network devices on the compute node, install the dpdk-tools package. This RPM is found in the rhel-7-server-extras-rpms repository.
$ yum install dpdk-tools
Show the network devices managed by DPDK and those used for networking:
$ dpdk-devbind --status
The devices using a DPDK driver are the interfaces of type ovs_dpdk_bond or ovs_dpdk_port in the TripleO compute role templates.
To confirm that DPDK is enabled, run the following command:
$ sudo ovs-vsctl get Open_vSwitch . iface_types
[dpdk, dpdkr, dpdkvhostuser, dpdkvhostuserclient, geneve, gre, internal, lisp, patch, stt, system, tap, vxlan]
Inspect the bridge, port, and interface layout. The PCI devices using DPDK-compatible drivers, for example, 0000:04:00.1 and 0000:05:00.0, appear as type: dpdk with no errors.
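One way to show that layout, assuming the standard OVS command-line tools:
$ sudo ovs-vsctl show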
Port "dpdkbond0" Interface "dpdk1" type: dpdk options: {dpdk-devargs="0000:04:00.1", n_rxq="2"} error: "Error attaching device '0000:04:00.1' to DPDK"Port "dpdkbond0" Interface "dpdk1" type: dpdk options: {dpdk-devargs="0000:04:00.1", n_rxq="2"} error: "Error attaching device '0000:04:00.1' to DPDK"Copy to Clipboard Copied! Toggle word wrap Toggle overflow To show details about interfaces, run the following command:
$ sudo ovs-vsctl list interface dpdk1 | egrep "name|mtu|options|status"
Check the bond configuration and note that LACP is not enabled.
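One way to check this, assuming the ovs-appctl utility and the bond port dpdkbond0 shown in the previous output:
$ sudo ovs-appctl bond/show dpdkbond0
The lacp_status field in the output should report off.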
Check that all OVS bridges on compute nodes use the netdev datapath type for fast data path (user space) networking, as shown in the check below.
Note: Mixing system (kernel) and netdev (user space) datapath types is not supported.
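One way to verify the datapath type of every bridge, assuming the ovs-vsctl bridge listing:
$ sudo ovs-vsctl list bridge | grep -E '^name|^datapath_type'
Each bridge should report netdev as its datapath_type.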
Run the following command to check for persistent Open vSwitch errors:
$ grep ERROR /var/log/openvswitch/ovs-vswitchd.log
2.3. Confirming OVS for Instance Configuration
To ensure that vhostuser DMA works, configure instances with OVS-DPDK ports to have dedicated CPUs and huge pages enabled using flavors. For more information, see Step 3 in: Creating a flavor and deploying an instance for OVS-DPDK.
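For example, dedicated CPUs and huge pages can be requested through flavor properties similar to the following (the flavor name is illustrative):
$ openstack flavor set m1.medium_nfv --property hw:cpu_policy=dedicated --property hw:mem_page_size=large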
To confirm the instance configuration, complete the following steps:
Confirm the instance has pinned CPUs. Dedicated CPUs can be identified with virsh:
$ sudo virsh vcpupin 2
Confirm that the emulator threads used for the instance are not running on the same vCPUs assigned to that instance:
$ sudo virsh emulatorpin 2
Note: Beginning with Red Hat OpenStack Platform 12, you can set the emulator thread pinning by flavor. See Configuring emulator threads policy with Red Hat OpenStack Platform 12.
For older versions, emulator thread pinning has to be done manually when the instance is powered on. See About the impact of using virsh emulatorpin in virtual environments with NFV, with and without isolcpus, and about optimal emulator thread pinning.
Confirm the instance is using huge pages, which is required for optimal performance.
$ sudo virsh numatune 1
Confirm that the receive queues for the instance are being serviced by a poll mode driver (PMD).
The ports and queues should be equally balanced across the PMDs. Optimally, ports will be serviced by a CPU in the same NUMA node as the network adapter.
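One way to see the port and queue assignments, assuming the ovs-appctl utility that ships with Open vSwitch:
$ sudo ovs-appctl dpif-netdev/pmd-rxq-show
Each pmd thread entry lists its numa_id, core_id, and the ports and queues assigned to it.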
Show statistics for the PMDs. This helps to determine how well receive queues are balanced across PMDs. For more information, see PMD Threads in the Open vSwitch documentation.
Note: The pmd-rxq-rebalance option was added in OVS 2.9.0. This command performs new PMD queue assignments in order to balance them equally across PMDs, based on the latest rxq processing cycle information.
The pmd-stats-show command shows the full history since the PMDs started running, or since the statistics were last cleared. If the statistics are not cleared, they include data from before the ports were set up and traffic was flowing, which makes them useless for judging the current load on the datapath. It is best to put the system into a steady state, clear the statistics, wait a few seconds, and then show them. This provides an accurate picture of the datapath.
Use the following command to show statistics for the PMDs:
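This mirrors the pmd-stats-clear command shown below, assuming the same dpif-netdev/ command path:
$ sudo ovs-appctl dpif-netdev/pmd-stats-show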
Reset the PMD statistics. The pmd-stats-show command shows the PMD statistics accumulated since the last pmd-stats-clear command. If no previous pmd-stats-clear was issued, the output contains data accumulated since the PMD began running. If you are examining a system under load, it is useful to clear the PMD statistics and then show them. Otherwise, the statistics may also include data from an earlier time, when the system was not under load (before traffic was flowing).
Use the following command to reset the PMD statistics:
$ sudo ovs-appctl dpif-netdev/pmd-stats-clear
2.4. Other Helpful Commands
Use these commands to perform additional validation checks.
Find the OVS-DPDK Port & Physical NIC Mapping Configured by os-net-config
cat /var/lib/os-net-config/dpdk_mapping.yaml
Find the DPDK port for an instance with the Nova instance $ID
sudo ovs-vsctl find interface external_ids:vm-uuid="$ID" | grep ^name
Find the Nova ID for an instance using a DPDK port
sudo ovs-vsctl get interface vhu24e6c032-db external_ids:vm-uuid
Perform a tcpdump on a dpdk port
sudo ovs-tcpdump -i vhu94ccc316-ea
ovs-tcpdump is from the openvswitch-test RPM located in the rhel-7-server-openstack-10-devtools-rpms repo.
Due to performance concerns, ovs-tcpdump is not recommended for production environments. For more information, see: How to use ovs-tcpdump on vhost-user interfaces in Red Hat OpenStack Platform?
2.5. Simple Compute Node CPU Partitioning and Memory Checks
Prerequisites
Run this command on a deployed compute node and note how the cpu masks map to TripleO Heat Template values:
$ sudo ovs-vsctl get Open_vSwitch . other_config
{dpdk-init="true", dpdk-lcore-mask="300003", dpdk-socket-mem="3072,1024", pmd-cpu-mask="c0000c"}
Note the following:
- dpdk-lcore-mask maps to HostCpusList in TripleO Heat Templates.
- dpdk-socket-mem maps to NeutronDpdkSocketMemory in TripleO Heat Templates.
- pmd-cpu-mask maps to NeutronDpdkCoreList in TripleO Heat Templates.
To convert these CPU masks to decimal values that can be reconciled back to TripleO Heat Templates and actual system values, see: How to convert a hexadecimal CPU mask into a bit mask and identify the masked CPUs?
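For example, with the values shown above, pmd-cpu-mask=c0000c has bits 2, 3, 22, and 23 set, and dpdk-lcore-mask=300003 has bits 0, 1, 20, and 21 set. A quick way to expand a mask, assuming Python is available on the node:
$ python -c 'print([i for i in range(64) if (0xc0000c >> i) & 1])'
[2, 3, 22, 23]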
2.5.1. Detecting CPUs
To detect the CPUs that pid 1 is allowed to run on, use the following command. No PMDs or Nova vCPUs should be running on these cores:
$ taskset -c -p 1
pid 1's current affinity list: 0,1,20,21
2.5.2. Detecting PMD Threads
To see the PMD threads, use the following command. The output should reflect the values of the TripleO parameter NeutronDpdkCoreList. There should be no overlap with the values of the TripleO parameters HostCpusList or HostIsolatedCoreList:
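One way to list the PMD threads and their CPU affinity, assuming the ovs-vswitchd PMD thread names begin with pmd:
$ ps -T -o spid,comm -p $(pidof ovs-vswitchd) | grep pmd | while read spid name; do taskset -c -p $spid; done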
2.5.3. Detecting NUMA node
For optimal performance ensure that physical network adapters, PMD threads, and pinned CPUs for instances are all on the same NUMA node. For more information, see: CPUs and NUMA nodes.
The following is a simple exercise for examining NUMA assignments.
Examine the vhu port for an instance on a compute node:
$ sudo virsh domiflist 1
Interface       Type        Source   Model    MAC
-------------------------------------------------------
vhu24e6c032-db  vhostuser   -        virtio   fa:16:3e:e3:c4:c2
Examine the PMD thread that is servicing that port and note the NUMA node:
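One way to find the PMD thread that polls this port, assuming the ovs-appctl utility: list the receive queue assignments and locate the vhu24e6c032-db entry, noting the numa_id and core_id of the pmd thread that owns it.
$ sudo ovs-appctl dpif-netdev/pmd-rxq-show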
Find the physical pinned CPUs for the instance. For example, the PMD servicing the port for this instance is on CPU 2, and the instance is serviced by CPUs 34 and 6.
$ sudo virsh dumpxml 1 | grep cpuset
  <vcpupin vcpu='0' cpuset='34'/>
  <emulatorpin cpuset='6'/>
Examine the cores for each NUMA node. Note that the CPUs servicing the instance (34,6) are on the same NUMA node (0).
$ lscpu | grep ^NUMA
NUMA node(s):          2
NUMA node0 CPU(s):     0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30,32,34,36,38
NUMA node1 CPU(s):     1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31,33,35,37,39
Additionally, network adapters that are not managed by OVS DPDK will have an entry here that indicates what NUMA node they belong to:
$ sudo cat /sys/class/net/<device name>/device/numa_node
Alternatively, you can see the NUMA node for a network adapter by querying the PCI address, even for those managed by OVS DPDK:
$ sudo lspci -v -s 05:00.1 | grep -i numa
Flags: bus master, fast devsel, latency 0, IRQ 203, NUMA node 0
These exercises demonstrate that the PMD, instance, and network adapter are all on NUMA 0, which is optimal for performance. For an indication of cross NUMA polling from the openvswitch logs (located in /var/log/openvswitch), look for a log entry similar to this:
dpif_netdev|WARN|There's no available (non-isolated) pmd thread on numa node 0. Queue 0 on port 'dpdk0' will be assigned to the pmd on core 7 (numa node 1). Expect reduced performance.
2.5.4. Detecting Isolated CPUs
Use the following command to show isolated CPUs. The output should be the same as the value of the TripleO parameter HostIsolatedCoreList.
$ cat /etc/tuned/cpu-partitioning-variables.conf | grep -v ^#
isolated_cores=2-19,22-39
2.5.5. Detecting CPUs Dedicated to Nova Instances
Use the following command to show the CPUs dedicated to Nova instances. This output should be the same as the value of the isolcpus parameter, excluding the poll mode driver (PMD) CPUs:
$ grep ^vcpu_pin_set /etc/nova/nova.conf
vcpu_pin_set=4-19,24-39
2.5.6. Confirming Huge Pages Configuration
Check for huge pages configuration on the compute node.
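One simple check, assuming the standard procfs interface; HugePages_Total shows the configured pages and HugePages_Free shows how many remain available:
$ grep -i huge /proc/meminfo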
If hugepages are not configured or are exhausted, see ComputeKernelArgs.
2.6. Causes for Packet Drops
Packets are dropped when a queue is full, usually because the queue is not drained fast enough. The bottleneck is the entity that is supposed to drain the queue. In most instances, a drop counter is used to track dropped packets, but sometimes a bug in the hardware or software design can cause packets to skip the drop counter.
The Data Plane Development Kit (DPDK) includes the testpmd application for forwarding packets. In the scenarios shown in this chapter, testpmd is installed on a VM and polls ports with its assigned logical cores (lcores) to forward packets from one port to another. testpmd is ordinarily used with a traffic generator to test, in this case, throughput across a physical-virtual-physical (PVP) path.
2.6.1. OVS-DPDK Too Slow to Drain Physical NICs
This example shows that a PMD thread is responsible for polling the receive (RX) queue of the physical network adapter (dpdk0). When the PMD thread cannot keep up with the packet volume, or is interrupted, packets might be dropped.
Figure 2.1. Polling the physical adapter RX queue
The following command shows statistics from the dpdk0 interface. If packets are being dropped because ovs-dpdk is not draining the physical adapter fast enough, you will see the value of rx_dropped increasing rapidly.
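A sketch of that check, using the same ovs-vsctl statistics query shown below for the vhost-user ports (dpdk0 is the physical port name from this example):
$ ovs-vsctl --column statistics list interface dpdk0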
There should be no more than one physical CPU core per NUMA node for PMDs.
2.6.2. VM Too Slow to Drain vhost-user
This example is similar to the example in Figure 2.1, in that you might experience packet loss if the lcore thread is overwhelmed by the packet volume sent to the instance receive (RX) queue.
For more information, see the following articles:
Figure 2.2. Polling the virtual adapter RX queue
To check if the tx_dropped value of the host corresponds to the rx_dropped value of the VM, run the following command:
ovs-vsctl --column statistics list interface vhud8ada965-ce
statistics : {"rx_1024_to_1522_packets"=0, "rx_128_to_255_packets"=0, "rx_1523_to_max_packets"=0,
"rx_1_to_64_packets"=0, "rx_256_to_511_packets"=0, "rx_512_to_1023_packets"=0, "rx_65_to_127_packets"=0, rx_bytes=0,
rx_dropped=0, rx_errors=0, rx_packets=0, tx_bytes=0, tx_dropped=0, tx_packets=0}
2.6.3. OVS-DPDK Too Slow to Drain vhost-user
In this example, a PMD thread polls the virtio TX queue, which is the receive queue from the host perspective. If the PMD thread is overwhelmed by the packet volume, or is interrupted, packets might be dropped.
Figure 2.3. Polling the virtual adapter TX queue
To trace the return path of the packets from the VM and obtain values from the drop counters on both the host (tx_dropped) and VM (rx_dropped) sides, run the following command:
ovs-vsctl --column statistics list interface vhue5146cdf-aa
statistics : {"rx_1024_to_1522_packets"=0, "rx_128_to_255_packets"=0, "rx_1523_to_max_packets"=0,
"rx_1_to_64_packets"=0, "rx_256_to_511_packets"=0, "rx_512_to_1023_packets"=0, "rx_65_to_127_packets"=0,
rx_bytes=0, rx_dropped=0, rx_errors=0, rx_packets=0, tx_bytes=0, tx_dropped=0, tx_packets=0}
2.6.4. Packet Loss on Egress Physical Interface
A slow transfer rate between the PCIe and RAM can result in the physical adapter dropping packets from the TX queue. While this is infrequent, it’s important to know how to identify and resolve this issue.
Figure 2.4. Polling the physical adapter TX queue
The following command shows statistics from the dpdk1 interface. If tx_dropped is greater than zero and growing rapidly, open a support case with Red Hat.
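A sketch of that check, using the same ovs-vsctl statistics query as the vhost-user examples above (dpdk1 is the physical port name from this example):
$ ovs-vsctl --column statistics list interface dpdk1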
If you see these types of packet losses, consider reconfiguring the memory channels.
- To calculate memory channels, see: Memory parameters in the Network Functions Virtualization Planning and Configuration Guide.
- To determine the number of memory channels, see: How to determine the number of memory channels for NeutronDpdkMemoryChannels or OvsDpdkMemoryChannels in Red Hat OpenStack Platform.