Chapter 18. Network determinism tips

TCP can have a large effect on latency. TCP adds latency to obtain efficiency, control congestion, and ensure reliable delivery. When tuning, consider the following points:

  • Do you need ordered delivery?
  • Do you need to guard against packet loss?

    Transmitting packets more than once can cause delays.

  • Do you need to use TCP?

    Consider disabling the Nagle buffering algorithm by using TCP_NODELAY on your socket. The Nagle algorithm collects small outgoing packets to send all at once, and can have a detrimental effect on latency.
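As an illustration of the last point, the following minimal sketch disables the Nagle algorithm on a TCP socket by using Python's standard socket module. The helper names make_low_latency_socket and nagle_disabled are hypothetical and chosen only for this example; an application would apply the same option to its own sockets.

```python
import socket


def make_low_latency_socket() -> socket.socket:
    """Create an IPv4 TCP socket with the Nagle algorithm disabled."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # TCP_NODELAY=1 asks the kernel to send small segments immediately
    # instead of collecting them into larger ones.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock


def nagle_disabled(sock: socket.socket) -> bool:
    """Return True if TCP_NODELAY is set on the given socket."""
    return sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0


if __name__ == "__main__":
    s = make_low_latency_socket()
    print("Nagle disabled:", nagle_disabled(s))
    s.close()
```

Disabling Nagle buffering trades efficiency for latency: small writes leave the host sooner, but each one can occupy its own packet.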

18.1. Optimizing RHEL for latency or throughput-sensitive services

The goal of coalesce tuning is to minimize the number of interrupts required for a given workload. In high-throughput situations, the goal is to have as few interrupts as possible while maintaining a high data rate. In low-latency situations, more interrupts can be used to handle traffic quickly.

You can adjust the settings on your network card to increase or decrease the number of packets that are combined into a single interrupt. As a result, you can achieve improved throughput or latency for your traffic.
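To make the trade-off concrete, the following rough sketch estimates the interrupt rate for a steady packet stream. It assumes a simplified model that is not taken from any particular driver: an interrupt fires when either rx_frames packets have accumulated or rx_usecs microseconds have elapsed, whichever comes first, and never more often than once per packet. The function name estimate_interrupt_rate is hypothetical, and real NIC drivers can behave differently.

```python
def estimate_interrupt_rate(pps: float, rx_usecs: int, rx_frames: int) -> float:
    """Estimate interrupts per second under a simplified coalescing model.

    pps       -- steady incoming packet rate (packets per second)
    rx_usecs  -- coalescing timer in microseconds (0 = timer not used)
    rx_frames -- frame-count threshold (0 = threshold not used)
    """
    if pps <= 0:
        return 0.0
    intervals = []
    if rx_frames > 0:
        intervals.append(rx_frames / pps)        # time to collect rx_frames packets
    if rx_usecs > 0:
        intervals.append(rx_usecs / 1_000_000)   # timer expiry
    if not intervals:
        return float(pps)                        # no coalescing: one interrupt per packet
    # An interrupt cannot fire more often than packets arrive.
    interval = max(min(intervals), 1.0 / pps)
    return 1.0 / interval


if __name__ == "__main__":
    for rate in (1_000, 100_000, 1_000_000):
        irqs = estimate_interrupt_rate(rate, rx_usecs=100, rx_frames=8)
        print(f"{rate:>9} pps -> about {irqs:,.0f} interrupts/s")
```

Under this model, larger rx_usecs and rx_frames values lower the interrupt rate at high packet rates, which favors throughput, while smaller values let packets reach the host sooner, which favors latency.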

Procedure

  1. Identify the network interface that is experiencing the bottleneck:

    # ethtool -S enp1s0
    NIC statistics:
         rx_packets: 1234
         tx_packets: 5678
         rx_bytes: 12345678
         tx_bytes: 87654321
         rx_errors: 0
         tx_errors: 0
         rx_missed: 0
         tx_dropped: 0
         coalesced_pkts: 0
         coalesced_events: 0
         coalesced_aborts: 0

    Identify the packet counters containing "drop", "discard", or "error" in their name. These particular statistics measure the actual packet loss at the network interface card (NIC) packet buffer, which can be caused by NIC coalescence.

  2. Monitor values of packet counters you identified in the previous step.

    Compare them to the expected values for your network to determine whether any particular interface experiences a bottleneck. Some common signs of a network bottleneck include, but are not limited to:

    • Many errors on a network interface
    • High packet loss
    • Heavy usage of the network interface

      Note

      When identifying a network bottleneck, also consider other factors, such as CPU usage, memory usage, and disk I/O.

  3. View the current link settings of the network interface:

    # ethtool enp1s0
    Settings for enp1s0:
            Supported ports: [ TP ]
            Supported link modes:   10baseT/Half 10baseT/Full
                                    100baseT/Half 100baseT/Full
                                    1000baseT/Full
            Supported pause frame use: No
            Supports auto-negotiation: Yes
            Advertised link modes:  10baseT/Half 10baseT/Full
                                    100baseT/Half 100baseT/Full
                                    1000baseT/Full
            Advertised pause frame use: No
            Advertised auto-negotiation: Yes
            Speed: 1000Mb/s
            Duplex: Full
            Port: Twisted Pair
            PHYAD: 0
            Transceiver: internal
            Auto-negotiation: on
            MDI-X: Unknown
            Supports Wake-on: g
            Wake-on: g
            Current message level: 0x00000033 (51)
                                   drv probe link
            Link detected: yes

    In this output, monitor the Speed and Duplex fields. These fields display information about the network interface operation and whether it is running at its expected values.

  4. Check the current interrupt coalescence settings:

    # ethtool -c enp1s0
    Coalesce parameters for enp1s0:
            Adaptive RX: off
            Adaptive TX: off
            RX usecs: 100
            RX frames: 8
            RX usecs irq: 100
            RX frames irq: 8
            TX usecs: 100
            TX frames: 8
            TX usecs irq: 100
            TX frames irq: 8
    • The usecs values refer to the number of microseconds that the receiver or transmitter waits before generating an interrupt.
    • The frames values refer to the number of frames that the receiver or transmitter waits before generating an interrupt.
    • The irq values are used to configure the interrupt moderation when the network interface is already handling an interrupt.

      Note

      Not all network interface cards support reporting and changing all values from the example output.

    • The Adaptive RX/TX value represents the adaptive interrupt coalescence mechanism, which adjusts the interrupt coalescence settings dynamically. Based on the packet conditions, the NIC driver auto-calculates coalesce values when Adaptive RX/TX are enabled (the algorithm differs for every NIC driver).
  5. Modify the coalescence settings as needed. For example:

    • While ethtool.coalesce-adaptive-rx is disabled, configure ethtool.coalesce-rx-usecs to set the delay before generating an interrupt to 100 microseconds for the RX packets:

      # nmcli connection modify enp1s0 ethtool.coalesce-rx-usecs 100
    • Enable ethtool.coalesce-adaptive-rx while ethtool.coalesce-rx-usecs is set to its default value:

      # nmcli connection modify enp1s0 ethtool.coalesce-adaptive-rx on

      Red Hat recommends modifying the Adaptive-RX setting as follows:

      • Users concerned with low latency (sub-50us) should not enable Adaptive-RX.
      • Users concerned with throughput can probably enable Adaptive-RX with no harm. If they do not want to use the adaptive interrupt coalescence mechanism, they can try setting ethtool.coalesce-rx-usecs to a large value, such as 100us or 250us.
      • Users unsure about their needs should not modify this setting until an issue occurs.
  6. Re-activate the connection:

    # nmcli connection up enp1s0
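To confirm that a change took effect, you can compare the ethtool -c output before and after re-activating the connection. The following minimal sketch parses captured output into a dictionary. It assumes the one-setting-per-line layout shown in the example above; the function name parse_coalesce is hypothetical, and some ethtool versions or drivers print a different layout or report values such as n/a.

```python
def parse_coalesce(output: str) -> dict:
    """Parse 'ethtool -c' text into {setting name: value}.

    Numeric values become int; anything else (on, off, n/a) stays a string.
    """
    settings = {}
    for line in output.splitlines():
        key, sep, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if not sep or not value:
            # Skip blank lines and the "Coalesce parameters for ...:" header.
            continue
        settings[key] = int(value) if value.isdigit() else value
    return settings


if __name__ == "__main__":
    sample = """Coalesce parameters for enp1s0:
            Adaptive RX: off
            Adaptive TX: off
            RX usecs: 100
            RX frames: 8
    """
    current = parse_coalesce(sample)
    print(current["Adaptive RX"], current["RX usecs"])
```

In practice the text would come from running ethtool -c on the interface, for example through subprocess.run with capture_output=True and text=True.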

Verification steps

  • Monitor the network performance and check for dropped packets:

    # ethtool -S enp1s0
    NIC statistics:
         rx_packets: 1234
         tx_packets: 5678
         rx_bytes: 12345678
         tx_bytes: 87654321
         rx_errors: 0
         tx_errors: 0
         rx_missed: 0
         tx_dropped: 0
         coalesced_pkts: 12
         coalesced_events: 34
         coalesced_aborts: 56
    ...

    The values of the rx_errors, rx_dropped, tx_errors, and tx_dropped fields should be 0 or close to it (up to a few hundred, depending on the network traffic and system resources). A high value in these fields indicates a network problem. Your counters can have different names. Closely monitor packet counters containing "drop", "discard", or "error" in their name.

    The values of the rx_packets, tx_packets, rx_bytes, and tx_bytes fields should increase over time. If the values do not increase, there might be a network problem. The packet counters can have different names, depending on your NIC driver.

    Important

    The ethtool command output can vary depending on the NIC and driver in use.

    Users who focus on extremely low latency can use application-level metrics or the kernel packet time-stamping API for their monitoring purposes.
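Because the counters are cumulative, the change between two readings is usually more informative than a single value. The following minimal sketch compares two captured ethtool -S outputs and reports the change in every counter whose name contains "drop", "discard", or "error". The function names are hypothetical, and the counter names depend on your NIC driver.

```python
import re

# Counter names that indicate packet loss, following the naming rule above.
LOSS_PATTERN = re.compile(r"drop|discard|error", re.IGNORECASE)


def parse_nic_stats(output: str) -> dict:
    """Parse 'ethtool -S' text into {counter name: integer value}."""
    stats = {}
    for line in output.splitlines():
        name, sep, value = line.partition(":")
        value = value.strip()
        # Skip the "NIC statistics:" header, blank lines, and non-numeric lines.
        if sep and value.isdigit():
            stats[name.strip()] = int(value)
    return stats


def loss_counter_deltas(before: str, after: str) -> dict:
    """Return the change in each loss-related counter between two readings."""
    old, new = parse_nic_stats(before), parse_nic_stats(after)
    return {
        name: new[name] - old.get(name, 0)
        for name in new
        if LOSS_PATTERN.search(name)
    }


if __name__ == "__main__":
    first = "NIC statistics:\n     rx_packets: 1234\n     rx_errors: 0\n     tx_dropped: 0\n"
    second = "NIC statistics:\n     rx_packets: 5000\n     rx_errors: 3\n     tx_dropped: 0\n"
    print(loss_counter_deltas(first, second))
```

A delta that keeps growing between readings points to ongoing loss, whereas a non-zero but constant counter reflects a past event.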

18.2. Flow control for Ethernet networks

On an Ethernet link, continuous data transmission between a network interface and a switch port can fill the buffers on the receiving end. Full buffers result in network congestion. In this case, when the sender transmits data at a higher rate than the receiver on the other end of the link, such as a switch port, can process it, packet loss can occur.

The flow control mechanism manages data transmission across an Ethernet link where the sender and the receiver have different sending and receiving capacities. To avoid packet loss, the Ethernet flow control mechanism temporarily suspends the packet transmission to manage a higher transmission rate from a switch port. Note that routers do not forward pause frames beyond a switch port.

When receive (RX) buffers become full, a receiver sends pause frames to the transmitter. The transmitter then stops data transmission for a short sub-second time frame, while continuing to buffer incoming data during this pause period. This duration provides enough time for the receiver to empty its interface buffers and prevent buffer overflow.
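To give a sense of how short this pause is, the following sketch computes the longest pause that a single pause frame can request. It relies on the IEEE 802.3 definition, in which the pause time is a 16-bit count of pause quanta and one quantum equals 512 bit times. The function name max_pause_seconds is hypothetical.

```python
PAUSE_QUANTUM_BITS = 512     # one pause quantum = 512 bit times (IEEE 802.3)
MAX_PAUSE_QUANTA = 0xFFFF    # pause time is a 16-bit field


def max_pause_seconds(link_speed_bps: int) -> float:
    """Longest pause one pause frame can request at the given link speed."""
    return MAX_PAUSE_QUANTA * PAUSE_QUANTUM_BITS / link_speed_bps


if __name__ == "__main__":
    for label, speed in (("1 Gb/s", 10**9), ("10 Gb/s", 10**10)):
        print(f"{label}: {max_pause_seconds(speed) * 1000:.3f} ms")
```

At 1 Gb/s, the upper bound is about 33.6 ms per pause frame. A receiver that needs more time sends further pause frames, and it can end the pause early by sending a pause frame with a pause time of zero.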

Note

Either end of the Ethernet link can send pause frames to the other end. If the receive buffers of a network interface are full, the network interface sends pause frames to the switch port. Similarly, when the receive buffers of a switch port are full, the switch port sends pause frames to the network interface.

By default, most of the network drivers in Red Hat Enterprise Linux have pause frame support enabled. To display the current settings of a network interface, enter:

# ethtool --show-pause enp1s0
Pause parameters for enp1s0:
...
RX:     on
TX:     on
...

Check with your switch vendor to confirm whether your switch supports pause frames.

18.3. Additional resources

  • ethtool(8) man page
  • netstat(8) man page