Chapter 3. Troubleshooting Networking Issues

This chapter lists basic troubleshooting procedures for networking and the Network Time Protocol (NTP).

3.1. Basic Networking Troubleshooting

Red Hat Ceph Storage depends heavily on a reliable network connection because its nodes communicate with each other over the network. Networking issues can cause many problems with Ceph OSDs, such as flapping or being incorrectly reported as down. Networking issues can also cause clock skew errors on the Ceph Monitors. In addition, packet loss, high latency, or limited bandwidth can degrade cluster performance and stability.

Procedure: Basic Networking Troubleshooting

  1. Install the net-tools and telnet packages, which provide tools that help when troubleshooting network issues in a Ceph storage cluster:


    [root@mon ~]# yum install net-tools
    [root@mon ~]# yum install telnet

  2. Verify that the cluster_network and public_network parameters in the Ceph configuration file include the correct values:


    [root@mon ~]# grep net /etc/ceph/ceph.conf
    cluster_network =
    public_network =
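The check in step 2 can also be scripted. The following is a minimal sketch, not part of the original procedure: the function names get_ceph_option and check_network_options are hypothetical, and the sketch assumes the options are written with underscores, one per line, as in the output above. A cluster that intentionally has no separate cluster network will also trigger the warning.

```shell
# Print the value of one option from a Ceph configuration file.
# Assumes "name = value" lines with the underscore spelling shown above.
get_ceph_option() {
    local option=$1 conf_file=$2
    awk -F'=' -v key="$option" '
        {
            name = $1; value = $2
            # Trim leading and trailing whitespace from both sides of "=".
            gsub(/^[ \t]+|[ \t]+$/, "", name)
            gsub(/^[ \t]+|[ \t]+$/, "", value)
        }
        name == key { print value; exit }
    ' "$conf_file"
}

# Report both network options and return non-zero if either is missing or empty.
check_network_options() {
    local conf_file=$1 option value status=0
    for option in public_network cluster_network; do
        value=$(get_ceph_option "$option" "$conf_file")
        if [ -z "$value" ]; then
            echo "WARNING: $option is missing or empty in $conf_file"
            status=1
        else
            echo "$option = $value"
        fi
    done
    return "$status"
}

# Example call (run manually on a Ceph node):
#   check_network_options /etc/ceph/ceph.conf
```

The script only confirms that the options are set. Whether the configured subnets are the correct ones for your environment still needs a manual comparison.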

  3. Verify that the network interfaces are up:


    [root@mon ~]# ip link list
    1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
        link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    2: enp22s0f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
        link/ether 40:f2:e9:b8:a0:48 brd ff:ff:ff:ff:ff:ff

  4. Verify that the Ceph nodes are able to reach each other using their short host names. Verify this on each node in the storage cluster:




    [root@mon ~]# ping osd01
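Because step 4 must be repeated on every node, a small loop can save typing. This is a sketch under stated assumptions: check_reachable and the PING_CMD override are hypothetical names introduced here, and the host list passed to the function is whatever short host names your cluster uses.

```shell
# Ping every host given as an argument and report which ones do not answer.
# PING_CMD is a hypothetical override used for dry runs; by default the
# function sends three echo requests and waits up to two seconds per reply.
check_reachable() {
    local host status=0
    for host in "$@"; do
        # PING_CMD is intentionally unquoted so that its options are split into words.
        if ${PING_CMD:-ping -c 3 -W 2} "$host" >/dev/null 2>&1; then
            echo "OK      $host"
        else
            echo "FAILED  $host"
            status=1
        fi
    done
    return "$status"
}

# Example call (run manually on each node, with your own host names):
#   check_reachable osd01
```

The function returns a non-zero status if any host fails, so it can be chained in a larger check script.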

  5. If you use a firewall, ensure that Ceph nodes are able to reach each other on their appropriate ports. The firewall-cmd tool shows the port status of a zone, and the telnet tool verifies that a specific port is reachable:


    firewall-cmd --info-zone=ZONE
    telnet IP_ADDRESS PORT


    [root@mon ~]# firewall-cmd --info-zone=public
    public (active)
      target: default
      icmp-block-inversion: no
      interfaces: enp1s0
      services: ceph ceph-mon cockpit dhcpv6-client ssh
      ports: 9100/tcp 8443/tcp 9283/tcp 3000/tcp 9092/tcp 9093/tcp 9094/tcp 9094/udp
      masquerade: no
      rich rules:
    [root@mon ~]# telnet IP_ADDRESS 9100
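To check for a specific port without reading the zone output by eye, the ports line can be filtered. The following sketch assumes the output layout shown above. The function name zone_has_port is hypothetical. It only matches entries listed literally on the ports line, so ports opened through a service (for example ceph or ceph-mon) or through a port range are not detected.

```shell
# Read "firewall-cmd --info-zone" output on stdin and succeed only if the
# given PORT/PROTOCOL entry appears literally on the "ports:" line.
zone_has_port() {
    local wanted=$1
    awk -v wanted="$wanted" '
        $1 == "ports:" {
            for (i = 2; i <= NF; i++) if ($i == wanted) found = 1
        }
        END { exit found ? 0 : 1 }
    '
}

# Example call (run manually on a Ceph node):
#   firewall-cmd --info-zone=public | zone_has_port 9100/tcp && echo "9100/tcp is listed"
```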

  6. Verify that there are no errors on the interface counters, that the network connectivity between nodes has the expected latency, and that there is no packet loss.

    1. Using the ethtool command:


      ethtool -S INTERFACE


      [root@mon ~]# ethtool -S enp22s0f0 | grep errors
      NIC statistics:
           rx_fcs_errors: 0
           rx_align_errors: 0
           rx_frame_too_long_errors: 0
           rx_in_length_errors: 0
           rx_out_length_errors: 0
           tx_mac_errors: 0
           tx_carrier_sense_errors: 0
           tx_errors: 0
           rx_errors: 0

    2. Using the ifconfig command:


      [root@mon ~]# ifconfig
      enp22s0f0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
      inet  netmask  broadcast
      inet6 2620:52:0:8de:42f2:e9ff:feb8:a048  prefixlen 64  scopeid 0x0<global>
      inet6 fe80::42f2:e9ff:feb8:a048  prefixlen 64  scopeid 0x20<link>
      ether 40:f2:e9:b8:a0:48  txqueuelen 1000  (Ethernet)
      RX packets 4219130  bytes 2704255777 (2.5 GiB)
      RX errors 0  dropped 0  overruns 0  frame 0
      TX packets 1418329  bytes 738664259 (704.4 MiB)
      TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
      device interrupt 16

    3. Using the netstat command:


      [root@mon ~]# netstat -ai
      Kernel Interface table
      Iface          MTU   RX-OK RX-ERR RX-DRP RX-OVR    TX-OK TX-ERR TX-DRP TX-OVR Flg
      docker0       1500       0      0      0 0           0      0      0      0 BMU
      eno2          1500       0      0      0 0           0      0      0      0 BMU
      eno3          1500       0      0      0 0           0      0      0      0 BMU
      eno4          1500       0      0      0 0           0      0      0      0 BMU
      enp0s20u13u5  1500  253277      0      0 0           0      0      0      0 BMRU
      enp22s0f0     9000  234160      0      0 0      432326      0      0      0 BMRU
      lo           65536   10366      0      0 0       10366      0      0      0 LRU
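On nodes with many interfaces, the counters in step 6 are easier to review when only the problem rows are printed. This is a minimal sketch, assuming the 11-column netstat layout (interface, MTU, four RX counters, four TX counters, flags) that matches the rows above. Older net-tools releases that print an extra Met column would need different field numbers. The function name report_interface_errors is hypothetical.

```shell
# Read "netstat -i" output on stdin and print only the interfaces whose
# error, drop, or overrun counters are non-zero.
# Assumed columns: Iface MTU RX-OK RX-ERR RX-DRP RX-OVR TX-OK TX-ERR TX-DRP TX-OVR Flg
report_interface_errors() {
    awk '
        # Data rows have 11 fields and a numeric MTU; title and header rows are skipped.
        NF == 11 && $2 ~ /^[0-9]+$/ {
            if ($4 + $5 + $6 + $8 + $9 + $10 > 0) {
                printf "%s rx_err=%s rx_drop=%s rx_ovr=%s tx_err=%s tx_drop=%s tx_ovr=%s\n",
                       $1, $4, $5, $6, $8, $9, $10
                bad = 1
            }
        }
        END { exit bad ? 1 : 0 }
    '
}

# Example call (run manually on a Ceph node):
#   netstat -i | report_interface_errors || echo "Interface errors found"
```

No output and a zero exit status mean that every parsed interface reported clean counters.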

  7. For performance issues, in addition to the latency checks, use the iperf3 tool to verify the network bandwidth between all nodes of the storage cluster. The iperf3 tool runs a simple point-to-point network bandwidth test between a server and a client.

    1. Install the iperf3 package on the Red Hat Ceph Storage nodes whose bandwidth you want to check:


      [root@mon ~]# yum install iperf3

    2. On a Red Hat Ceph Storage node, start the iperf3 server:


      [root@mon ~]# iperf3 -s
      Server listening on 5201


      The default port is 5201, but you can set a different port with the -p command argument.

    3. On a different Red Hat Ceph Storage node, start the iperf3 client:


      [root@osd ~]# iperf3 -c mon
      Connecting to host mon, port 5201
      [  4] local xx.x.xxx.xx port 52270 connected to xx.x.xxx.xx port 5201
      [ ID] Interval           Transfer     Bandwidth       Retr  Cwnd
      [  4]   0.00-1.00   sec   114 MBytes   954 Mbits/sec    0    409 KBytes
      [  4]   1.00-2.00   sec   113 MBytes   945 Mbits/sec    0    409 KBytes
      [  4]   2.00-3.00   sec   112 MBytes   943 Mbits/sec    0    454 KBytes
      [  4]   3.00-4.00   sec   112 MBytes   941 Mbits/sec    0    471 KBytes
      [  4]   4.00-5.00   sec   112 MBytes   940 Mbits/sec    0    471 KBytes
      [  4]   5.00-6.00   sec   113 MBytes   945 Mbits/sec    0    471 KBytes
      [  4]   6.00-7.00   sec   112 MBytes   937 Mbits/sec    0    488 KBytes
      [  4]   7.00-8.00   sec   113 MBytes   947 Mbits/sec    0    520 KBytes
      [  4]   8.00-9.00   sec   112 MBytes   939 Mbits/sec    0    520 KBytes
      [  4]   9.00-10.00  sec   112 MBytes   939 Mbits/sec    0    520 KBytes
      - - - - - - - - - - - - - - - - - - - - - - - - -
      [ ID] Interval           Transfer     Bandwidth       Retr
      [  4]   0.00-10.00  sec  1.10 GBytes   943 Mbits/sec    0             sender
      [  4]   0.00-10.00  sec  1.10 GBytes   941 Mbits/sec                  receiver
      iperf Done.

      This output shows a network bandwidth of approximately 940 Mbits/second between the Red Hat Ceph Storage nodes, with 1.10 GBytes transferred in 10 seconds and no retransmissions (Retr) during the test.

      Red Hat recommends you validate the network bandwidth between all the nodes in the storage cluster.
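When you repeat the iperf3 test between many node pairs, the summary line can be parsed and compared against a minimum rate. The following is a sketch under stated assumptions: it handles a single-stream test like the one above, it reads the first line ending in "sender", and the function names and the 900 Mbits/sec threshold in the example call are hypothetical.

```shell
# Read iperf3 client output on stdin and print the sender rate in Mbits/sec.
iperf3_sender_mbits() {
    awk '
        $NF == "sender" {
            for (i = 2; i <= NF; i++) {
                if ($i ~ /bits\/sec$/) {
                    rate = $(i - 1)
                    # Normalize other units to Mbits/sec.
                    if ($i ~ /^G/) rate *= 1000
                    else if ($i ~ /^K/) rate /= 1000
                    else if ($i !~ /^M/) rate /= 1000000
                    print rate
                    exit
                }
            }
        }
    '
}

# Compare the measured sender rate with a minimum given in Mbits/sec.
check_bandwidth() {
    local minimum=$1 measured
    measured=$(iperf3_sender_mbits)
    if [ -z "$measured" ]; then
        echo "No sender summary line found"
        return 2
    fi
    # awk performs the floating-point comparison that [ ] cannot do.
    if awk -v m="$measured" -v min="$minimum" 'BEGIN { exit !(m + 0 >= min + 0) }'; then
        echo "OK: $measured Mbits/sec"
    else
        echo "LOW: $measured Mbits/sec (expected at least $minimum)"
        return 1
    fi
}

# Example call (run manually, with the iperf3 server already started on mon):
#   iperf3 -c mon | check_bandwidth 900
```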

  8. Ensure that all nodes have the same network interconnect speed. Nodes attached over slower links might slow down the nodes with faster connections. Also, ensure that the inter-switch links can handle the aggregated bandwidth of the attached nodes:


    ethtool INTERFACE


    [root@mon ~]# ethtool enp22s0f0
    Settings for enp22s0f0:
    Supported ports: [ TP ]
    Supported link modes:   10baseT/Half 10baseT/Full
                            100baseT/Half 100baseT/Full
                            1000baseT/Half 1000baseT/Full
    Supported pause frame use: No
    Supports auto-negotiation: Yes
    Supported FEC modes: Not reported
    Advertised link modes:  10baseT/Half 10baseT/Full
                            100baseT/Half 100baseT/Full
                            1000baseT/Half 1000baseT/Full
    Advertised pause frame use: Symmetric
    Advertised auto-negotiation: Yes
    Advertised FEC modes: Not reported
    Link partner advertised link modes:  10baseT/Half 10baseT/Full
                                         100baseT/Half 100baseT/Full
    Link partner advertised pause frame use: Symmetric
    Link partner advertised auto-negotiation: Yes
    Link partner advertised FEC modes: Not reported
    Speed: 1000Mb/s
    Duplex: Full
    Port: Twisted Pair
    PHYAD: 1
    Transceiver: internal
    Auto-negotiation: on
    MDI-X: off
    Supports Wake-on: g
    Wake-on: d
    Current message level: 0x000000ff (255)
           drv probe link timer ifdown ifup rx_err tx_err
    Link detected: yes
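To compare link settings across nodes, it helps to reduce the ethtool output to the two values that matter in step 8. This sketch assumes the "Speed:" and "Duplex:" lines appear as in the output above. The function name link_speed_duplex is hypothetical.

```shell
# Read "ethtool INTERFACE" output on stdin and print the negotiated speed
# and duplex mode on one line, for example "1000Mb/s Full".
link_speed_duplex() {
    awk -F': *' '
        { sub(/^[ \t]+/, "", $1) }          # drop the leading indentation
        $1 == "Speed"  { speed = $2 }
        $1 == "Duplex" { duplex = $2 }
        END { print speed, duplex }
    '
}

# Example call (run manually on each node, with your own interface name):
#   ethtool enp22s0f0 | link_speed_duplex
```

Running the same command on every node and comparing the one-line results makes a mismatched link speed or a half-duplex link easy to spot.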

3.2. Basic NTP Troubleshooting

This section includes basic NTP troubleshooting steps.

Procedure: Basic NTP Troubleshooting

  1. Verify that the ntpd daemon is running on the Monitor hosts:


    # systemctl status ntpd

  2. If ntpd is not running, enable and start it:


    # systemctl enable ntpd
    # systemctl start ntpd

  3. Ensure that ntpd is synchronizing the clocks correctly:


    $ ntpq -p

  4. See the How to troubleshoot NTP issues solution on the Red Hat Customer Portal for advanced NTP troubleshooting steps.
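The ntpq -p check in step 3 can be reduced to the one peer that ntpd is currently synchronized to. This is a minimal sketch, assuming the standard ntpq -p column layout, in which the selected peer is marked with a leading asterisk and the ninth column is the offset in milliseconds. The function name ntp_sync_offset is hypothetical.

```shell
# Read "ntpq -p" output on stdin. Print the selected peer (the row marked
# with "*") and its offset in milliseconds, or fail if no peer is selected.
ntp_sync_offset() {
    awk '
        /^\*/ { print substr($1, 2), $9; found = 1; exit }
        END { exit found ? 0 : 1 }
    '
}

# Example call (run manually on a Monitor host):
#   ntpq -p | ntp_sync_offset || echo "ntpd has not selected a time source"
```

A non-zero exit status means that no row starts with an asterisk, which usually indicates that ntpd has not yet selected a time source.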
