Chapter 9. Avoiding listen queue lock contention
Queue lock contention can cause packet drops and higher CPU usage and, consequently, higher latency. You can avoid queue lock contention on the receive (RX) and transmit (TX) queues by tuning your application and by using transmit packet steering.
9.1. Avoiding RX queue lock contention: The SO_REUSEPORT and SO_REUSEPORT_BPF socket options
To improve the performance of multi-threaded network server applications on multi-core systems, open the port with the SO_REUSEPORT or SO_REUSEPORT_BPF socket option. Without these options, all threads must share a single socket to receive the incoming traffic.
Using a single socket causes:
- Significant contention on the receive buffer
- A significant increase in CPU usage
- Possible packet drops
With the SO_REUSEPORT or SO_REUSEPORT_BPF socket option, multiple sockets on one host can bind to the same port.
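The following is a minimal sketch of this pattern; it is not the code example from the kernel sources, and the helper name open_reuseport_listener is illustrative. Each worker thread or process opens its own listening socket on the shared port, and the kernel distributes incoming connections across the sockets:

#include <stdint.h>
#include <string.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Illustrative helper: every worker calls this to create its own
 * listening socket on the shared port. Error handling is omitted
 * for brevity. */
static int open_reuseport_listener(uint16_t port)
{
    int one = 1;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    /* SO_REUSEPORT must be set on each socket before bind(); otherwise
     * bind() fails with EADDRINUSE for every socket after the first. */
    setsockopt(fd, SOL_SOCKET, SO_REUSEPORT, &one, sizeof(one));

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);

    bind(fd, (struct sockaddr *)&addr, sizeof(addr));
    listen(fd, SOMAXCONN);
    return fd;
}

Because each thread receives connections from its own socket, the threads no longer contend on a single receive buffer.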
For further details, see the socket(7) man page.
Red Hat Enterprise Linux provides a code example of how to use the SO_REUSEPORT socket option in the kernel sources. To access the code example:
1. Enable the rhel-10-for-x86_64-baseos-debug-rpms repository:

# subscription-manager repos --enable rhel-10-for-x86_64-baseos-debug-rpms

2. Install the kernel-debuginfo-common-x86_64 package:

# dnf install kernel-debuginfo-common-x86_64

3. The code example is now available in the /usr/src/debug/kernel-<version>/linux-<version>/tools/testing/selftests/net/reuseport_bpf_cpu.c file.
9.2. Avoiding TX queue lock contention: Transmit packet steering
For hosts with a multi-queue network interface controller (NIC), Transmit Packet Steering (XPS) distributes outgoing packet processing across multiple queues. This enables multiple CPUs to handle traffic, preventing transmit queue lock contention and packet drops.
Certain drivers, such as ixgbe, i40e, and mlx5, automatically configure XPS. To identify whether the driver supports this capability, consult the documentation of your NIC driver. If the driver does not support XPS auto-tuning, you can manually assign CPU cores to the transmit queues.
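To check whether XPS is already configured, you can read the xps_cpus file of each transmit queue; the interface name enp1s0 is an example. A zero or empty mask means that no CPUs are assigned to that queue:

# grep . /sys/class/net/enp1s0/queues/tx-*/xps_cpus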
Red Hat Enterprise Linux does not provide an option to permanently assign transmit queues to CPU cores. To persist the settings, use the commands in a NetworkManager dispatcher script that is executed when the interface is activated. For details, see the Red Hat Knowledgebase solution How to write a NetworkManager dispatcher script to apply commands on interface start.
For further details about scaling in the Linux networking stack, see the /usr/share/doc/kernel-doc-<version>/Documentation/networking/scaling.rst file provided by the kernel-doc package.
Prerequisites
- The NIC supports multiple queues.
- The numactl package is installed.
Procedure
1. Display the number of available queues:

# ethtool -l enp1s0
Channel parameters for enp1s0:
Pre-set maximums:
RX:             0
TX:             0
Other:          0
Combined:       4
Current hardware settings:
RX:             0
TX:             0
Other:          0
Combined:       1

The Pre-set maximums section shows the total number of queues, and Current hardware settings shows the number of queues that are currently assigned to the receive, transmit, other, or combined channels.

2. Optional: If you require queues on specific channels, assign them accordingly. For example, to assign the 4 queues to the Combined channel, enter:

# ethtool -L enp1s0 combined 4

3. Display to which Non-Uniform Memory Access (NUMA) node the NIC is assigned:

# cat /sys/class/net/enp1s0/device/numa_node
0

If the file is not found or the command returns -1, the host is not a NUMA system.

4. If the host is a NUMA system, display which CPUs are assigned to which NUMA node:

# lscpu | grep NUMA
NUMA node(s):        2
NUMA node0 CPU(s):   0-3
NUMA node1 CPU(s):   4-7

5. In the example above, the NIC has 4 queues and the NIC is assigned to NUMA node 0. This node uses the CPU cores 0-3. Consequently, map each transmit queue to one of the CPU cores from 0-3:

# echo 1 > /sys/class/net/enp1s0/queues/tx-0/xps_cpus
# echo 2 > /sys/class/net/enp1s0/queues/tx-1/xps_cpus
# echo 4 > /sys/class/net/enp1s0/queues/tx-2/xps_cpus
# echo 8 > /sys/class/net/enp1s0/queues/tx-3/xps_cpus

If the number of CPU cores and transmit (TX) queues is the same, use a 1 to 1 mapping to avoid any kind of contention on the TX queue. Otherwise, if you map multiple CPUs to the same TX queue, transmit operations on different CPUs cause TX queue lock contention and negatively impact the transmit throughput.

Note that you must pass the bitmap, containing the CPU core numbers, to the queues. Use the following command to calculate the bitmap:

# printf %x $((1 << <core_number> ))
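For example, the bitmap that maps a queue to CPU core 2 is 1 shifted left by two bits, which is binary 100:

# printf %x $((1 << 2))
4

This matches the value written to the tx-2 queue in the example above.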
Verification
1. Identify the process IDs (PIDs) of services that send traffic:

# pidof <process_name>
12345 98765

2. Pin the PIDs to cores that use XPS. Note that numactl -C sets the CPU affinity only for a command that it starts; to change the affinity of PIDs that are already running, use taskset:

# taskset -cp 0-3 12345
# taskset -cp 0-3 98765

Alternatively, start the service with its affinity already set:

# numactl -C 0-3 <command>

3. Monitor the requeues counter while the processes send traffic:

# tc -s qdisc
qdisc fq_codel 0: dev enp10s0u1 root refcnt 2 limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64
 Sent 125728849 bytes 1067587 pkt (dropped 0, overlimits 0 requeues 30)
 backlog 0b 0p requeues 30
...

If the requeues counter no longer increases at a significant rate, TX queue lock contention no longer happens.
9.3. Disabling the Generic Receive Offload feature on servers with high UDP traffic
Applications that use high-speed UDP bulk transfer should enable and use UDP Generic Receive Offload (GRO) on the UDP socket. However, if an application cannot use GRO, you can disable the feature to increase the throughput.
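The following is a minimal sketch of the application-side enablement, assuming a kernel that supports the UDP_GRO socket option; the helper name enable_udp_gro is illustrative and not part of any RHEL example:

#include <netinet/in.h>
#include <sys/socket.h>

#ifndef UDP_GRO
#define UDP_GRO 104  /* value from <linux/udp.h>, if the libc headers lack it */
#endif

/* Illustrative helper: enable UDP GRO on a receiving UDP socket. With
 * GRO enabled, the kernel may coalesce consecutive datagrams of the
 * same flow into one larger buffer and reports the original segment
 * size in a UDP_GRO control message on recvmsg(). */
int enable_udp_gro(int udp_fd)
{
    int one = 1;
    return setsockopt(udp_fd, IPPROTO_UDP, UDP_GRO, &one, sizeof(one));
}

If setsockopt() fails, typically with ENOPROTOOPT, the kernel does not support UDP GRO and the application must process plain datagrams. If the application cannot be changed in this way, disabling GRO on the host, as described below, is the alternative.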
Disable GRO if the following conditions apply:
- The application does not support GRO and the feature cannot be added.
- TCP throughput is not relevant.
Warning: Disabling GRO significantly reduces the receive throughput of TCP traffic. Therefore, do not disable GRO on hosts where TCP performance is relevant.
Prerequisites
- The host mainly processes UDP traffic.
- The application does not use GRO.
- The host does not use UDP tunnel protocols, such as VXLAN.
- The host does not run virtual machines (VMs) or containers.
Procedure
1. Optional: Display the NetworkManager connection profiles:

# nmcli connection show
NAME     UUID                                   TYPE      DEVICE
example  f2f33f29-bb5c-3a07-9069-be72eaec3ecf   ethernet  enp1s0

2. Disable GRO support in the connection profile:

# nmcli connection modify example ethtool.feature-gro off

3. Reactivate the connection profile:

# nmcli connection up example
Verification
- Verify that GRO is disabled:

# ethtool -k enp1s0 | grep generic-receive-offload
generic-receive-offload: off

- Monitor the throughput on the server. Re-enable GRO in the NetworkManager profile if the setting has negative side effects on other applications on the host.