9.2. Avoiding TX queue lock contention: Transmit packet steering
For hosts with a multi-queue network interface controller (NIC), Transmit Packet Steering (XPS) distributes outgoing packet processing across multiple queues. This enables multiple CPUs to handle traffic, preventing transmit queue lock contention and packet drops.
Certain drivers, such as ixgbe, i40e, and mlx5, automatically configure XPS. Consult your NIC driver's documentation to determine whether the driver supports this capability. If the driver does not support XPS auto-tuning, you can manually assign CPU cores to the transmit queues.
Red Hat Enterprise Linux does not provide an option to permanently assign transmit queues to CPU cores. To apply the settings persistently, use the commands in a NetworkManager dispatcher script that runs when the interface is activated. For details, see the Red Hat Knowledgebase solution How to write a NetworkManager dispatcher script to apply commands on interface start.
For further details about scaling in the Linux networking stack, see the /usr/share/doc/kernel-doc-<version>/Documentation/networking/scaling.rst file provided by the kernel-doc package.
Prerequisites
- The NIC supports multiple queues.
- The numactl package is installed.
Procedure
Display the count of available queues:
```
# ethtool -l enp1s0
Channel parameters for enp1s0:
Pre-set maximums:
RX:        0
TX:        0
Other:     0
Combined:  4
Current hardware settings:
RX:        0
TX:        0
Other:     0
Combined:  1
```

The Pre-set maximums section shows the total number of queues, and Current hardware settings shows the number of queues that are currently assigned to the receive, transmit, other, or combined channels.

Optional: If you require queues on specific channels, assign them accordingly. For example, to assign the 4 queues to the Combined channel, enter:

```
# ethtool -L enp1s0 combined 4
```

Display to which Non-Uniform Memory Access (NUMA) node the NIC is assigned:
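If you need the pre-set maximum Combined count in a script, you can extract it from the ethtool output. The following sketch parses the sample output shown above; the awk expression is illustrative, not part of ethtool:

```shell
# Illustrative sketch: extract the pre-set maximum "Combined" channel count.
# The sample text below is the `ethtool -l` output shown above; on a real
# host, pipe `ethtool -l enp1s0` into the same awk command.
sample='Channel parameters for enp1s0:
Pre-set maximums:
RX:        0
TX:        0
Other:     0
Combined:  4
Current hardware settings:
RX:        0
TX:        0
Other:     0
Combined:  1'
printf '%s\n' "$sample" | awk '/Pre-set/{p=1} /Current/{p=0} p && /Combined:/{print $2}'
# prints: 4
```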
```
# cat /sys/class/net/enp1s0/device/numa_node
0
```

If the file is not found or the command returns -1, the host is not a NUMA system.

If the host is a NUMA system, display which CPUs are assigned to which NUMA node:
```
# lscpu | grep NUMA
NUMA node(s):       2
NUMA node0 CPU(s):  0-3
NUMA node1 CPU(s):  4-7
```

In this example, the NIC has 4 queues and is assigned to NUMA node 0. This node uses CPU cores 0-3. Consequently, map each transmit queue to one of the CPU cores from 0-3:
```
# echo 1 > /sys/class/net/enp1s0/queues/tx-0/xps_cpus
# echo 2 > /sys/class/net/enp1s0/queues/tx-1/xps_cpus
# echo 4 > /sys/class/net/enp1s0/queues/tx-2/xps_cpus
# echo 8 > /sys/class/net/enp1s0/queues/tx-3/xps_cpus
```

If the number of CPU cores and transmit (TX) queues is the same, use a 1:1 mapping to avoid contention on the TX queues. If you map multiple CPUs to the same TX queue, transmit operations on different CPUs cause TX queue lock contention and reduce the transmit throughput.
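The four echo commands above can be generated in a loop. The following sketch only prints the 1:1 queue-to-mask mapping for a hypothetical 4-queue NIC; to apply it, redirect each mask into the corresponding xps_cpus file instead of printing it:

```shell
# Sketch: compute the 1:1 queue-to-core bitmasks for a hypothetical 4-queue NIC.
# This only prints the mapping; to apply it, write each mask to
# /sys/class/net/$dev/queues/tx-$q/xps_cpus (requires root).
dev=enp1s0
for q in 0 1 2 3; do
    printf 'tx-%d -> mask %x\n' "$q" $((1 << q))
done
# prints:
# tx-0 -> mask 1
# tx-1 -> mask 2
# tx-2 -> mask 4
# tx-3 -> mask 8
```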
Note that the value you write to an xps_cpus file is a hexadecimal bitmask of CPU core numbers, not a core number itself. Use the following command to calculate the bitmask for a single core:

```
# printf %x $((1 << <core_number> ))
```
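To build a mask that covers several cores, OR the per-core bits together. A minimal sketch, assuming you want a queue to use cores 1 and 3:

```shell
# Sketch: build a hex bitmask covering CPU cores 1 and 3 by ORing per-core bits.
mask=0
for core in 1 3; do
    mask=$(( mask | (1 << core) ))
done
printf '%x\n' "$mask"
# prints: a   (binary 1010 = cores 3 and 1)
```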
Verification
Identify the process IDs (PIDs) of services that send traffic:
```
# pidof <process_name>
12345 98765
```

Pin the PIDs to cores that use XPS:
```
# taskset --cpu-list --pid 0-3 12345
# taskset --cpu-list --pid 0-3 98765
```

Monitor the requeues counter while the processes send traffic:

```
# tc -s qdisc
qdisc fq_codel 0: dev enp10s0u1 root refcnt 2 limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64
 Sent 125728849 bytes 1067587 pkt (dropped 0, overlimits 0 requeues 30)
 backlog 0b 0p requeues 30
...
```

If the requeues counter no longer increases at a significant rate, TX queue lock contention no longer happens.
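In a script, you can extract the requeues counter and compare readings taken a few seconds apart. The following sketch parses the sample tc output shown above; the awk expression is illustrative:

```shell
# Illustrative sketch: pull the "requeues" value from the backlog line of
# `tc -s qdisc` output. The sample text is the output shown above; on a real
# host, pipe the output of `tc -s qdisc` into the same awk command and
# compare two readings taken a few seconds apart.
sample='qdisc fq_codel 0: dev enp10s0u1 root refcnt 2 limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64
 Sent 125728849 bytes 1067587 pkt (dropped 0, overlimits 0 requeues 30)
 backlog 0b 0p requeues 30'
printf '%s\n' "$sample" | awk '/backlog/{print $NF}'
# prints: 30
```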