Chapter 18. Network determinism tips

TCP can have a large effect on latency. It adds latency to improve efficiency, control congestion, and ensure reliable delivery. When tuning, consider the following points:

Do you need ordered delivery?
Do you need to guard against packet loss?
Transmitting packets more than once can cause delays.
Do you need to use TCP?
Consider disabling the Nagle buffering algorithm by using TCP_NODELAY on your socket. The Nagle algorithm collects small outgoing packets to send all at once, and can have a detrimental effect on latency.
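
As a minimal sketch of the last point, the following Python snippet creates a TCP socket with the Nagle algorithm disabled. The function name `make_low_latency_socket` is only illustrative; the socket option is the standard `TCP_NODELAY`.

```python
import socket


def make_low_latency_socket() -> socket.socket:
    """Create a TCP socket with the Nagle algorithm disabled."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # TCP_NODELAY=1 sends small writes immediately instead of collecting them.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock


if __name__ == "__main__":
    sock = make_low_latency_socket()
    # A non-zero value confirms that Nagle buffering is off for this socket.
    print(sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY))
    sock.close()
```

Set the option before you connect or send data, so that the first small writes are not delayed.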

18.1. Optimizing RHEL for latency or throughput-sensitive services

The goal of coalesce tuning is to minimize the number of interrupts required for a given workload. In high-throughput situations, the goal is to have as few interrupts as possible while maintaining a high data rate. In low-latency situations, more interrupts can be used to handle traffic quickly.

You can adjust the settings on your network card to increase or decrease the number of packets that are combined into a single interrupt. As a result, you can achieve improved throughput or latency for your traffic.

Procedure

Identify the network interface that is experiencing the bottleneck:
```
# ethtool -S enp1s0
NIC statistics:
     rx_packets: 1234
     tx_packets: 5678
     rx_bytes: 12345678
     tx_bytes: 87654321
     rx_errors: 0
     tx_errors: 0
     rx_missed: 0
     tx_dropped: 0
     coalesced_pkts: 0
     coalesced_events: 0
     coalesced_aborts: 0
```
Identify the packet counters containing "drop", "discard", or "error" in their name. These particular statistics measure the actual packet loss at the network interface card (NIC) packet buffer, which can be caused by NIC coalescence.
Monitor values of packet counters you identified in the previous step.
Compare them to the expected values for your network to determine whether any particular interface experiences a bottleneck. Some common signs of a network bottleneck include, but are not limited to:
- Many errors on a network interface
- High packet loss
- Heavy usage of the network interface
  Note
  When identifying a network bottleneck, also consider other factors such as CPU usage, memory usage, and disk I/O.
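
The counter selection described above can be sketched in Python. The helper names `parse_counters` and `loss_counters` are hypothetical, and the sample text is the example `ethtool -S` output from this procedure; your driver can report different counter names.

```python
import re

# Example output copied from the procedure; real output depends on the driver.
SAMPLE_OUTPUT = """\
NIC statistics:
     rx_packets: 1234
     tx_packets: 5678
     rx_bytes: 12345678
     tx_bytes: 87654321
     rx_errors: 0
     tx_errors: 0
     rx_missed: 0
     tx_dropped: 0
     coalesced_pkts: 0
     coalesced_events: 0
     coalesced_aborts: 0
"""

LOSS_PATTERN = re.compile(r"drop|discard|error", re.IGNORECASE)


def parse_counters(text):
    """Turn 'name: value' lines into a dict, skipping non-numeric lines."""
    counters = {}
    for line in text.splitlines():
        name, sep, value = line.strip().rpartition(":")
        value = value.strip()
        if sep and name and value.isdigit():
            counters[name.strip()] = int(value)
    return counters


def loss_counters(counters):
    """Keep only counters whose name contains drop, discard, or error."""
    return {n: v for n, v in counters.items() if LOSS_PATTERN.search(n)}


if __name__ == "__main__":
    print(loss_counters(parse_counters(SAMPLE_OUTPUT)))
```

With the sample output, this selects `rx_errors`, `tx_errors`, and `tx_dropped`. A counter such as `rx_missed` does not match the three keywords, so extend the pattern if your driver uses other names for lost packets.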

View the current link settings of the network interface:

```
# ethtool enp1s0
Settings for enp1s0:
        Supported ports: [ TP ]
        Supported link modes:   10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
                                1000baseT/Full
        Supported pause frame use: No
        Supports auto-negotiation: Yes
        Advertised link modes:  10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
                                1000baseT/Full
        Advertised pause frame use: No
        Advertised auto-negotiation: Yes
        Speed: 1000Mb/s
        Duplex: Full
        Port: Twisted Pair
        PHYAD: 0
        Transceiver: internal
        Auto-negotiation: on
        MDI-X: Unknown
        Supports Wake-on: g
        Wake-on: g
        Current message level: 0x00000033 (51)
                               drv probe link
        Link detected: yes
```

In this output, monitor the Speed and Duplex fields. These fields display information about the network interface operation and whether it is running at its expected values.
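
If you want to check these two fields from a script, a sketch such as the following can help. The function names and the expected values are assumptions for illustration; the sample text is an excerpt of the example output above.

```python
import re

# Excerpt of the example `ethtool enp1s0` output shown in this procedure.
SAMPLE_OUTPUT = """\
Settings for enp1s0:
        Speed: 1000Mb/s
        Duplex: Full
        Link detected: yes
"""


def link_status(text):
    """Return (speed in Mb/s or None, duplex string or None)."""
    speed = re.search(r"^\s*Speed:\s*(\d+)Mb/s", text, re.MULTILINE)
    duplex = re.search(r"^\s*Duplex:\s*(\S+)", text, re.MULTILINE)
    return (
        int(speed.group(1)) if speed else None,
        duplex.group(1) if duplex else None,
    )


def link_is_as_expected(text, expected_speed_mbps, expected_duplex="Full"):
    """Compare the reported speed and duplex with the values you expect."""
    speed, duplex = link_status(text)
    return speed == expected_speed_mbps and duplex == expected_duplex


if __name__ == "__main__":
    print(link_status(SAMPLE_OUTPUT))
    print(link_is_as_expected(SAMPLE_OUTPUT, 1000))
```

A speed that cannot be parsed as a number is returned as `None`, so the comparison fails instead of raising an error.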

Check the current interrupt coalescence settings:
```
# ethtool -c enp1s0
Coalesce parameters for enp1s0:
        Adaptive RX: off
        Adaptive TX: off
        RX usecs: 100
        RX frames: 8
        RX usecs irq: 100
        RX frames irq: 8
        TX usecs: 100
        TX frames: 8
        TX usecs irq: 100
        TX frames irq: 8
```
- The usecs values refer to the number of microseconds that the receiver or transmitter waits before generating an interrupt.
- The frames values refer to the number of frames that the receiver or transmitter waits before generating an interrupt.
- The irq values are used to configure the interrupt moderation when the network interface is already handling an interrupt.
- The Adaptive RX/TX value represents the adaptive interrupt coalescence mechanism, which adjusts the interrupt coalescence settings dynamically. When Adaptive RX/TX is enabled, the NIC driver calculates the coalesce values automatically based on the packet conditions. The algorithm differs between NIC drivers.
  Note
  Not all network interface cards support reporting and changing all values from the example output.
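
To get a feeling for how the usecs and frames values interact, consider the following rough model. It is a sketch under strong assumptions, not a description of any particular driver: packets arrive evenly spaced, an interrupt is raised as soon as either threshold is reached, and both thresholds are positive. The function name is hypothetical.

```python
def estimate_rx_interrupts_per_second(packets_per_second, rx_usecs, rx_frames):
    """Rough interrupt rate for evenly spaced packets (simplified model)."""
    if rx_usecs <= 0 or rx_frames <= 0:
        raise ValueError("this sketch only models positive thresholds")
    if packets_per_second <= 0:
        return 0.0
    # Packets that arrive during one rx-usecs window.
    per_window = packets_per_second * rx_usecs / 1_000_000
    # The batch ends at whichever threshold is reached first,
    # and at least one packet is needed to raise an interrupt.
    batch = max(1.0, min(per_window, float(rx_frames)))
    return packets_per_second / batch


if __name__ == "__main__":
    for pps in (1_000, 50_000, 1_000_000):
        print(pps, estimate_rx_interrupts_per_second(pps, 100, 8))
```

With the example values of 100 microseconds and 8 frames, the model gives about 125,000 interrupts per second at 1,000,000 packets per second, because the frame threshold is reached first. At 50,000 packets per second, the timer is reached first, which gives about 10,000 interrupts per second, and a packet can wait up to roughly 100 microseconds before it is processed.
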
Modify the coalescence settings as needed. For example:
- While ethtool.coalesce-adaptive-rx is disabled, configure ethtool.coalesce-rx-usecs to set the delay before generating an interrupt to 100 microseconds for the RX packets:
```
# nmcli connection modify enp1s0 ethtool.coalesce-rx-usecs 100
```
- Enable ethtool.coalesce-adaptive-rx while ethtool.coalesce-rx-usecs is set to its default value:
```
# nmcli connection modify enp1s0 ethtool.coalesce-adaptive-rx on
```
  Red Hat recommends modifying the Adaptive-RX setting as follows:
  - Users concerned with low latency (sub-50us) should not enable Adaptive-RX.
  - Users concerned with throughput can probably enable Adaptive-RX with no harm. If they do not want to use the adaptive interrupt coalescence mechanism, they can try setting ethtool.coalesce-rx-usecs to a large value, such as 100us or 250us.
  - Users unsure about their needs should not modify this setting until an issue occurs.
Re-activate the connection:
```
# nmcli connection up enp1s0
```

Verification steps

Monitor the network performance and check for dropped packets:
```
# ethtool -S enp1s0
NIC statistics:
     rx_packets: 1234
     tx_packets: 5678
     rx_bytes: 12345678
     tx_bytes: 87654321
     rx_errors: 0
     tx_errors: 0
     rx_missed: 0
     tx_dropped: 0
     coalesced_pkts: 12
     coalesced_events: 34
     coalesced_aborts: 56
...
```
The values of the rx_errors, rx_dropped, tx_errors, and tx_dropped fields should be 0 or close to it (up to a few hundred, depending on the network traffic and system resources). A high value in these fields indicates a network problem. Your counters can have different names. Closely monitor packet counters containing "drop", "discard", or "error" in their name.
The values of the rx_packets, tx_packets, rx_bytes, and tx_bytes fields should increase over time. If the values do not increase, there might be a network problem. The packet counters can have different names, depending on your NIC driver.
Important
The ethtool command output can vary depending on the NIC and driver in use.
Users who focus on extremely low latency can use application-level metrics or the kernel packet time-stamping API for monitoring.
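
Because the counters are cumulative, comparing two snapshots taken some time apart is more informative than reading a single one. The following sketch shows one way to do that. The helper names are hypothetical, and the two snapshots contain made-up values for illustration only.

```python
import re

LOSS_PATTERN = re.compile(r"drop|discard|error", re.IGNORECASE)

# Made-up snapshots in the `ethtool -S` format, taken some time apart.
BEFORE = "NIC statistics:\n     rx_packets: 1234\n     rx_errors: 0\n     tx_dropped: 0\n"
AFTER = "NIC statistics:\n     rx_packets: 2234\n     rx_errors: 0\n     tx_dropped: 3\n"


def parse_counters(text):
    """Turn 'name: value' lines into a dict, skipping non-numeric lines."""
    counters = {}
    for line in text.splitlines():
        name, sep, value = line.strip().rpartition(":")
        value = value.strip()
        if sep and name and value.isdigit():
            counters[name.strip()] = int(value)
    return counters


def summarize(before_text, after_text):
    """Return (loss counters that grew, whether traffic counters grew)."""
    before = parse_counters(before_text)
    after = parse_counters(after_text)
    deltas = {n: after[n] - before[n] for n in after if n in before}
    growing_loss = {
        n: d for n, d in deltas.items() if LOSS_PATTERN.search(n) and d > 0
    }
    traffic_moving = any(
        d > 0 for n, d in deltas.items() if n.endswith(("_packets", "_bytes"))
    )
    return growing_loss, traffic_moving


if __name__ == "__main__":
    print(summarize(BEFORE, AFTER))
```

In this made-up example, traffic counters are increasing as expected, and `tx_dropped` grew by 3 between the snapshots, which is the kind of change to watch after modifying the coalescence settings.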

18.2. Flow control for Ethernet networks

On an Ethernet link, continuous data transmission between a network interface and a switch port can fill the buffers on either end, which results in network congestion. When the sender transmits data faster than the receiver on the other end of the link, for example a switch port, can process it, packet loss can occur.

The flow control mechanism manages data transmission across an Ethernet link on which the sender and the receiver have different sending and receiving capacities. To avoid packet loss, the Ethernet flow control mechanism temporarily suspends the packet transmission to manage a higher transmission rate from a switch port. Note that routers do not forward pause frames beyond a switch port.

When receive (RX) buffers become full, a receiver sends pause frames to the transmitter. The transmitter then stops data transmission for a short, sub-second period while continuing to buffer incoming data. This pause gives the receiver enough time to empty its interface buffers and prevents buffer overflow.
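
To illustrate how short such a pause is, the following sketch computes the pause duration from the pause time carried in a pause frame. It relies on the IEEE 802.3 definition of the pause time as a 16-bit count of quanta, where one quantum is 512 bit times. The function name is hypothetical.

```python
PAUSE_QUANTUM_BITS = 512   # one pause quantum is 512 bit times
MAX_PAUSE_QUANTA = 0xFFFF  # the pause time is a 16-bit field


def pause_duration_seconds(quanta, link_bits_per_second):
    """Convert a pause time in quanta into seconds for a given link speed."""
    if not 0 <= quanta <= MAX_PAUSE_QUANTA:
        raise ValueError("pause time must fit into 16 bits")
    return quanta * PAUSE_QUANTUM_BITS / link_bits_per_second


if __name__ == "__main__":
    for speed in (1_000_000_000, 10_000_000_000):
        # Longest pause that a single pause frame can request.
        print(speed, pause_duration_seconds(MAX_PAUSE_QUANTA, speed))
```

On a 1 Gb/s link, the longest pause that a single frame can request is therefore about 33.6 milliseconds, and on a 10 Gb/s link about 3.4 milliseconds. A receiver that needs more time sends further pause frames.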

Note

Either end of the Ethernet link can send pause frames to the other end. If the receive buffers of a network interface are full, the network interface sends pause frames to the switch port. Similarly, when the receive buffers of a switch port are full, the switch port sends pause frames to the network interface.

By default, most of the network drivers in Red Hat Enterprise Linux have pause frame support enabled. To display the current settings of a network interface, enter:

```
# ethtool --show-pause enp1s0
Pause parameters for enp1s0:
...
RX:     on
TX:     on
...
```

Check with your switch vendor whether your switch supports pause frames.
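
If you collect the pause settings of several interfaces from a script, you can parse the command output as in the following sketch. The function name `parse_pause` is hypothetical, and the sample text is the abbreviated output shown above, including its elided lines.

```python
# Abbreviated example output copied from this section.
SAMPLE_OUTPUT = """\
Pause parameters for enp1s0:
...
RX:     on
TX:     on
...
"""


def parse_pause(text):
    """Map each 'name: on|off' line to True or False; ignore other lines."""
    settings = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        value = value.strip()
        if sep and value in ("on", "off"):
            settings[key.strip()] = value == "on"
    return settings


if __name__ == "__main__":
    print(parse_pause(SAMPLE_OUTPUT))
```

Lines that do not end in `on` or `off`, such as the header and the elided parts, are skipped, so the sketch returns only the RX and TX settings for the sample.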

Additional resources

ethtool(8) man page
What is network link flow control and how does it work in Red Hat Enterprise Linux?

18.3. Additional resources

ethtool(8) man page
netstat(8) man page
