Chapter 24. Network determinism tips
TCP can have a large effect on latency. TCP adds latency to obtain efficiency, control congestion, and ensure reliable delivery. When tuning, consider the following points:
- Do you need ordered delivery?
- Do you need to guard against packet loss? Transmitting packets more than once can cause delays.
- Do you need to use TCP? Consider disabling the Nagle buffering algorithm by using TCP_NODELAY on your socket. The Nagle algorithm collects small outgoing packets to send all at once, and can have a detrimental effect on latency.
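As an illustration of the last point, the following minimal sketch creates a TCP socket with the Nagle algorithm disabled. It uses Python's standard socket module. The function name make_low_latency_socket is hypothetical and chosen only for this example.

```python
import socket


def make_low_latency_socket() -> socket.socket:
    """Create an IPv4 TCP socket with Nagle buffering disabled."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # TCP_NODELAY=1 asks the kernel to send small segments immediately
    # instead of collecting them into larger packets.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock
```

The option is set per socket, so an application can disable Nagle buffering only on its latency-sensitive connections and leave bulk-transfer connections unchanged.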
24.1. Optimizing RHEL for latency or throughput-sensitive services
Coalesce tuning aims to minimize interrupts for a given workload. In high-throughput situations, the goal is to use as few interrupts as possible while maintaining a high data rate. In low-latency situations, more interrupts can be used to handle traffic quickly.
You can adjust the settings on your network card to increase or decrease the number of packets that are combined into a single interrupt. As a result, you can achieve improved throughput or latency for your traffic.
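The trade-off can be estimated with simple arithmetic. The sketch below assumes a simplified model that is not taken from this chapter: a single receive queue, a steady packet rate, and an interrupt that fires only when the coalescing timer expires (no frame-count threshold). The function name coalesce_estimate and the key names in its result are hypothetical.

```python
def coalesce_estimate(packets_per_second: float, rx_usecs: int) -> dict:
    """Estimate interrupt rate and added delay for a timer-based coalescing delay.

    Simplified model (assumption): one queue, steady traffic, timer-only coalescing.
    """
    if rx_usecs <= 0:
        # No coalescing delay: every packet raises its own interrupt.
        return {
            "interrupts_per_second": float(packets_per_second),
            "packets_per_interrupt": 1.0,
            "max_added_latency_us": 0,
        }
    # The timer allows at most one interrupt every rx_usecs microseconds.
    timer_rate = 1_000_000 / rx_usecs
    interrupts = min(float(packets_per_second), timer_rate)
    per_interrupt = packets_per_second / interrupts if interrupts else 0.0
    return {
        "interrupts_per_second": interrupts,
        "packets_per_interrupt": per_interrupt,
        # A packet arriving right after an interrupt waits up to one full timer period.
        "max_added_latency_us": rx_usecs,
    }
```

Under this model, one million packets per second with a 100-microsecond delay yields at most 10,000 interrupts per second, about 100 packets per interrupt, and up to 100 microseconds of added latency per packet. Real drivers can behave differently.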
Procedure
Identify the network interface that is experiencing the bottleneck:
Identify the packet counters containing drop, discard, or error in their name. These particular statistics measure the actual packet loss at the network interface card (NIC) packet buffer, which can be caused by NIC coalescence.
Monitor the values of the packet counters you identified in the previous step.
Compare them to the expected values for your network to determine whether any particular interface experiences a bottleneck. Some common signs of a network bottleneck include, but are not limited to:
- Many errors on a network interface
- High packet loss
- Heavy usage of the network interface
Note: Other important factors when identifying a network bottleneck include, for example, CPU usage, memory usage, and disk I/O.
Check the current interrupt coalescence settings:
- The usecs values refer to the number of microseconds that the receiver or transmitter waits before generating an interrupt.
- The frames values refer to the number of frames that the receiver or transmitter waits before generating an interrupt.
- The irq values are used to configure the interrupt moderation when the network interface is already handling an interrupt.
- The Adaptive RX/TX value represents the adaptive interrupt coalescence mechanism, which adjusts the interrupt coalescence settings dynamically. Based on the packet conditions, the NIC driver auto-calculates coalesce values when Adaptive RX/TX are enabled (the algorithm differs for every NIC driver).
Note: Not all network interface cards support reporting and changing all values from the example output.
Modify the coalescence settings as needed. For example:
- While ethtool.coalesce-adaptive-rx is disabled, configure ethtool.coalesce-rx-usecs to set the delay before generating an interrupt to 100 microseconds for the RX packets:
# nmcli connection modify enp1s0 ethtool.coalesce-rx-usecs 100
- Enable ethtool.coalesce-adaptive-rx while ethtool.coalesce-rx-usecs is set to its default value:
# nmcli connection modify enp1s0 ethtool.coalesce-adaptive-rx on
Modify the Adaptive-RX setting as follows:
- Users concerned with low latency (sub-50 µs) should not enable Adaptive-RX.
- Users concerned with throughput can probably enable Adaptive-RX with no harm. If they do not want to use the adaptive interrupt coalescence mechanism, they can try setting ethtool.coalesce-rx-usecs to a large value such as 100 µs or 250 µs.
- Users unsure about their needs should not modify this setting until an issue occurs.
Re-activate the connection:
# nmcli connection up enp1s0
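The coalescence values described in this procedure can also be checked by a script. The following is a minimal sketch, not part of the original procedure. It parses text laid out like typical ethtool -c output and flags settings that conflict with the low-latency guidance above. The sample text, the function names parse_coalesce and latency_warnings, and the rule that compares rx-usecs against a 50-microsecond budget are assumptions made for illustration. Real output varies by NIC and driver.

```python
import re

# Hypothetical sample modeled on typical `ethtool -c` output (assumption).
SAMPLE_COALESCE_OUTPUT = """\
Coalesce parameters for enp1s0:
Adaptive RX: off  TX: off
rx-usecs: 100
rx-frames: 0
rx-usecs-irq: 0
rx-frames-irq: 0
tx-usecs: 50
tx-frames: n/a
"""


def parse_coalesce(text: str) -> dict:
    """Return numeric coalesce values plus the adaptive RX/TX flags."""
    settings = {}
    for line in text.splitlines():
        adaptive = re.match(r"\s*Adaptive RX:\s*(\S+)\s+TX:\s*(\S+)", line)
        if adaptive:
            settings["adaptive-rx"] = adaptive.group(1)
            settings["adaptive-tx"] = adaptive.group(2)
            continue
        key, sep, value = line.partition(":")
        value = value.strip()
        # Skip headers and values the NIC does not report (for example "n/a").
        if sep and value.isdigit():
            settings[key.strip()] = int(value)
    return settings


def latency_warnings(settings: dict, budget_us: int = 50) -> list:
    """List settings that work against a latency budget given in microseconds."""
    warnings = []
    if settings.get("adaptive-rx") == "on":
        warnings.append("adaptive-rx is enabled")
    if settings.get("rx-usecs", 0) > budget_us:
        warnings.append("rx-usecs exceeds the latency budget")
    return warnings
```

With the sample text, parse_coalesce reports rx-usecs as 100 and latency_warnings returns one warning, because a 100-microsecond receive delay alone exceeds a 50-microsecond budget.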
Verification
Monitor the network performance and check for dropped packets:
- The value of the rx_errors, rx_dropped, tx_errors, and tx_dropped fields should be 0 or close to it (up to a few hundred, depending on the network traffic and system resources). A high value in these fields indicates a network problem. Your counters can have different names. Closely monitor packet counters containing "drop", "discard", or "error" in their name.
- The value of the rx_packets, tx_packets, rx_bytes, and tx_bytes fields should increase over time. If the values do not increase, there might be a network problem. The packet counters can have different names, depending on your NIC driver.
Important: The ethtool command output can vary depending on the NIC and driver in use.
Users with a focus on extremely low latency can use application-level metrics or the kernel packet time-stamping API for their monitoring purposes.
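The two checks above can be automated by comparing two snapshots of the counters taken some time apart. The following is a minimal sketch under stated assumptions: the input is text with one "name: value" pair per line, as ethtool -S typically prints, and the sample snapshots are invented for illustration. The names parse_counters and compare_snapshots are hypothetical.

```python
LOSS_KEYWORDS = ("drop", "discard", "error")
TRAFFIC_COUNTERS = ("rx_packets", "tx_packets", "rx_bytes", "tx_bytes")

# Invented sample snapshots (assumption); real counter names depend on the driver.
SAMPLE_BEFORE = """\
NIC statistics:
     rx_packets: 1000
     tx_packets: 900
     rx_bytes: 64000
     tx_bytes: 57600
     rx_dropped: 2
     rx_errors: 0
"""
SAMPLE_AFTER = """\
NIC statistics:
     rx_packets: 5000
     tx_packets: 900
     rx_bytes: 320000
     tx_bytes: 57600
     rx_dropped: 40
     rx_errors: 0
"""


def parse_counters(text: str) -> dict:
    """Parse 'name: value' lines into a dict, ignoring non-numeric lines."""
    counters = {}
    for line in text.splitlines():
        name, sep, value = line.rpartition(":")
        value = value.strip()
        if sep and value.isdigit():
            counters[name.strip()] = int(value)
    return counters


def compare_snapshots(before: dict, after: dict):
    """Return (loss counters that grew, traffic counters that did not grow)."""
    growing_loss = {
        name: after[name] - before.get(name, 0)
        for name in after
        if any(word in name.lower() for word in LOSS_KEYWORDS)
        and after[name] > before.get(name, 0)
    }
    stalled = [
        name for name in TRAFFIC_COUNTERS
        if name in after and after[name] <= before.get(name, 0)
    ]
    return growing_loss, stalled
```

For the sample snapshots, the comparison reports that rx_dropped grew by 38 and that tx_packets and tx_bytes did not increase. Whether a given increase is acceptable still depends on the traffic volume and the system, as noted above.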
24.2. Flow control for Ethernet networks
On an Ethernet link, continuous data transmission can fill buffers and cause network congestion. If the sender’s rate exceeds the receiver’s processing capacity, packet loss can occur due to the switch port’s lower data processing capacity.
The flow control mechanism manages data transmission across the Ethernet link where each sender and receiver has different sending and receiving capacities. To avoid packet loss, the Ethernet flow control mechanism temporarily suspends the packet transmission to manage a higher transmission rate from a switch port. Note that switches do not forward pause frames beyond a switch port.
When receive (RX) buffers become full, a receiver sends pause frames to the transmitter. The transmitter then stops data transmission for a short sub-second time frame, while continuing to buffer incoming data during this pause period. This duration provides enough time for the receiver to empty its interface buffers and prevent buffer overflow.
Either end of the Ethernet link can send pause frames to another end. If the receive buffers of a network interface are full, the network interface will send pause frames to the switch port. Similarly, when the receive buffers of a switch port are full, the switch port sends pause frames to the network interface.
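To give a sense of how short a pause can be, the following sketch converts a pause request into a duration. It relies on a detail that this section does not state and that is taken from the IEEE 802.3 definition of pause frames: the pause time is a 16-bit count of quanta, and one quantum is the time needed to transmit 512 bits at the link speed. The function name pause_duration_seconds is hypothetical.

```python
PAUSE_QUANTUM_BITS = 512   # one pause quantum = 512 bit times (IEEE 802.3)
MAX_PAUSE_QUANTA = 0xFFFF  # the pause time field is 16 bits wide


def pause_duration_seconds(quanta: int, link_speed_bps: float) -> float:
    """Return how long a pause request of `quanta` lasts on a link of the given speed."""
    if not 0 <= quanta <= MAX_PAUSE_QUANTA:
        raise ValueError("pause quanta must fit in 16 bits")
    if link_speed_bps <= 0:
        raise ValueError("link speed must be positive")
    return quanta * PAUSE_QUANTUM_BITS / link_speed_bps
```

Under this definition, the longest single pause on a 10 Gbit/s link is about 3.4 milliseconds, and on a 1 Gbit/s link about 33.6 milliseconds. A receiver that needs more time sends further pause frames.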
By default, most of the network drivers in Red Hat Enterprise Linux have pause frame support enabled. To display the current settings of a network interface, enter:
Check with your switch vendor whether your switch supports pause frames.