Chapter 3. Troubleshooting networking issues
This chapter provides basic troubleshooting procedures for issues with networking and the Network Time Protocol (NTP).
3.1. Prerequisites
- A running Red Hat Ceph Storage cluster.
3.2. Basic networking troubleshooting
Red Hat Ceph Storage depends heavily on a reliable network connection. Red Hat Ceph Storage nodes use the network to communicate with each other. Networking issues can cause many problems with Ceph OSDs, such as OSDs flapping or being incorrectly reported as down. Networking issues can also cause the Ceph Monitor's clock skew errors. In addition, packet loss, high latency, or limited bandwidth can impact cluster performance and stability.
Prerequisites
- Root-level access to the node.
Procedure
Installing the net-tools and telnet packages can help when troubleshooting network issues that can occur in a Ceph storage cluster:
Red Hat Enterprise Linux 7
[root@mon ~]# yum install net-tools
[root@mon ~]# yum install telnet
Red Hat Enterprise Linux 8
[root@mon ~]# dnf install net-tools
[root@mon ~]# dnf install telnet
Verify that the cluster_network and public_network parameters in the Ceph configuration file include the correct values:
Example
[root@mon ~]# cat /etc/ceph/ceph.conf | grep net
cluster_network = 192.168.1.0/24
public_network = 192.168.0.0/24
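To confirm that the node actually holds an address in each configured subnet, you can list the IPv4 addresses per interface. This is a minimal sketch; the awk field positions assume the default one-line ip -o output format:
[root@mon ~]# ip -4 -o addr show | awk '{print $2, $4}'
Compare the printed addresses against the cluster_network and public_network values from the configuration file.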
Verify that the network interfaces are up:
Example
[root@mon ~]# ip link list
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: enp22s0f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 40:f2:e9:b8:a0:48 brd ff:ff:ff:ff:ff:ff
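If an interface is listed with state DOWN, you can try bringing it up manually. A minimal sketch, assuming enp22s0f0 is the affected interface; substitute your own device name:
[root@mon ~]# ip link set enp22s0f0 up
If the link does not come up, check the cabling and the switch port configuration before continuing.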
Verify that the Ceph nodes are able to reach each other using their short host names. Check this on each node in the storage cluster:
Syntax
ping SHORT_HOST_NAME
Example
[root@mon ~]# ping osd01
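Inconsistent name resolution can make a node unreachable by its short host name even when the network itself is healthy. To check how a short name resolves, a minimal sketch in which osd01 is a placeholder host name:
[root@mon ~]# getent hosts osd01
Verify that the name resolves to the same address on every node, whether it comes from /etc/hosts or from DNS.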
If you use a firewall, ensure that Ceph nodes are able to reach each other on the appropriate ports. The firewall-cmd and telnet tools can validate the port status and whether the port is open, respectively:
Syntax
firewall-cmd --info-zone=ZONE
telnet IP_ADDRESS PORT
Example
[root@mon ~]# firewall-cmd --info-zone=public
public (active)
  target: default
  icmp-block-inversion: no
  interfaces: enp1s0
  sources: 192.168.0.0/24
  services: ceph ceph-mon cockpit dhcpv6-client ssh
  ports: 9100/tcp 8443/tcp 9283/tcp 3000/tcp 9092/tcp 9093/tcp 9094/tcp 9094/udp
  protocols:
  masquerade: no
  forward-ports:
  source-ports:
  icmp-blocks:
  rich rules:
[root@mon ~]# telnet 192.168.0.22 9100
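If a required service is missing from the zone, you can add it with firewall-cmd. A minimal sketch, assuming the public zone is the one bound to the Ceph network interfaces:
[root@mon ~]# firewall-cmd --zone=public --add-service=ceph-mon --permanent
[root@mon ~]# firewall-cmd --zone=public --add-service=ceph --permanent
[root@mon ~]# firewall-cmd --reload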
Verify that there are no errors on the interface counters, that the network connectivity between nodes has the expected latency, and that there is no packet loss. See the sketch after the tool examples below for a quick combined check.
Using the ethtool command:
Syntax
ethtool -S INTERFACE
Example
[root@mon ~]# ethtool -S enp22s0f0 | grep errors
NIC statistics:
    rx_fcs_errors: 0
    rx_align_errors: 0
    rx_frame_too_long_errors: 0
    rx_in_length_errors: 0
    rx_out_length_errors: 0
    tx_mac_errors: 0
    tx_carrier_sense_errors: 0
    tx_errors: 0
    rx_errors: 0
Using the ifconfig command:
Example
[root@mon ~]# ifconfig
enp22s0f0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.8.222.13  netmask 255.255.254.0  broadcast 10.8.223.255
        inet6 2620:52:0:8de:42f2:e9ff:feb8:a048  prefixlen 64  scopeid 0x0<global>
        inet6 fe80::42f2:e9ff:feb8:a048  prefixlen 64  scopeid 0x20<link>
        ether 40:f2:e9:b8:a0:48  txqueuelen 1000  (Ethernet)
        RX packets 4219130  bytes 2704255777 (2.5 GiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 1418329  bytes 738664259 (704.4 MiB)
        TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0
        device interrupt 16
Using the netstat command:
Example
[root@mon ~]# netstat -ai
Kernel Interface table
Iface            MTU    RX-OK RX-ERR RX-DRP RX-OVR    TX-OK TX-ERR TX-DRP TX-OVR Flg
docker0         1500        0      0      0 0             0      0      0      0 BMU
eno2            1500        0      0      0 0             0      0      0      0 BMU
eno3            1500        0      0      0 0             0      0      0      0 BMU
eno4            1500        0      0      0 0             0      0      0      0 BMU
enp0s20u13u5    1500   253277      0      0 0             0      0      0      0 BMRU
enp22s0f0       9000   234160      0      0 0        432326      0      0      0 BMRU
lo             65536    10366      0      0 0         10366      0      0      0 LRU
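A quick combined check for latency, packet loss, interface error counters, and MTU consistency; a minimal sketch in which the host name osd01 and the interface name enp22s0f0 are placeholders:
[root@mon ~]# ping -c 10 osd01
[root@mon ~]# ethtool -S enp22s0f0 | grep -E 'error|drop' | grep -v ': 0$'
[root@mon ~]# ip -o link | awk '{print $2, $4, $5}'
The ping summary reports the packet loss percentage and round-trip times, the ethtool pipeline prints only counters that are not zero, and the ip pipeline lists each interface with its MTU so that mismatches (for example, 1500 on one node and 9000 on another) stand out.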
For performance issues, in addition to the latency checks, use the iperf3 tool to verify the network bandwidth between all nodes of the storage cluster. The iperf3 tool does a simple point-to-point network bandwidth test between a server and a client.
Install the iperf3 package on the Red Hat Ceph Storage nodes whose bandwidth you want to check:
Red Hat Enterprise Linux 7
[root@mon ~]# yum install iperf3
Red Hat Enterprise Linux 8
[root@mon ~]# dnf install iperf3
On a Red Hat Ceph Storage node, start the iperf3 server:
Example
[root@mon ~]# iperf3 -s
-----------------------------------------------------------
Server listening on 5201
-----------------------------------------------------------
Note: The default port is 5201, but it can be set using the -p command argument.
On a different Red Hat Ceph Storage node, start the iperf3 client:
Example
[root@osd ~]# iperf3 -c mon
Connecting to host mon, port 5201
[  4] local xx.x.xxx.xx port 52270 connected to xx.x.xxx.xx port 5201
[ ID] Interval           Transfer     Bandwidth       Retr  Cwnd
[  4]   0.00-1.00   sec   114 MBytes   954 Mbits/sec    0    409 KBytes
[  4]   1.00-2.00   sec   113 MBytes   945 Mbits/sec    0    409 KBytes
[  4]   2.00-3.00   sec   112 MBytes   943 Mbits/sec    0    454 KBytes
[  4]   3.00-4.00   sec   112 MBytes   941 Mbits/sec    0    471 KBytes
[  4]   4.00-5.00   sec   112 MBytes   940 Mbits/sec    0    471 KBytes
[  4]   5.00-6.00   sec   113 MBytes   945 Mbits/sec    0    471 KBytes
[  4]   6.00-7.00   sec   112 MBytes   937 Mbits/sec    0    488 KBytes
[  4]   7.00-8.00   sec   113 MBytes   947 Mbits/sec    0    520 KBytes
[  4]   8.00-9.00   sec   112 MBytes   939 Mbits/sec    0    520 KBytes
[  4]   9.00-10.00  sec   112 MBytes   939 Mbits/sec    0    520 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bandwidth       Retr
[  4]   0.00-10.00  sec  1.10 GBytes   943 Mbits/sec    0             sender
[  4]   0.00-10.00  sec  1.10 GBytes   941 Mbits/sec                  receiver

iperf Done.
This output shows a network bandwidth of approximately 941 Mbits/second between the Red Hat Ceph Storage nodes, with no retransmissions (Retr) during the test.
Red Hat recommends that you validate the network bandwidth between all the nodes in the storage cluster.
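A minimal sketch for testing each peer in turn, assuming an iperf3 server is already running on every node and that mon01, osd01, and osd02 are placeholder host names:
[root@mon ~]# for host in mon01 osd01 osd02; do echo "== $host =="; iperf3 -c "$host" -t 5; done
The -t 5 argument shortens each test to five seconds so that iterating over a large cluster stays manageable.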
Ensure that all nodes have the same network interconnect speed. Slower attached nodes might slow down the faster connected ones. Also, ensure that the inter-switch links can handle the aggregated bandwidth of the attached nodes:
Syntax
ethtool INTERFACE
Example
[root@mon ~]# ethtool enp22s0f0
Settings for enp22s0f0:
    Supported ports: [ TP ]
    Supported link modes:   10baseT/Half 10baseT/Full
                            100baseT/Half 100baseT/Full
                            1000baseT/Half 1000baseT/Full
    Supported pause frame use: No
    Supports auto-negotiation: Yes
    Supported FEC modes: Not reported
    Advertised link modes:  10baseT/Half 10baseT/Full
                            100baseT/Half 100baseT/Full
                            1000baseT/Half 1000baseT/Full
    Advertised pause frame use: Symmetric
    Advertised auto-negotiation: Yes
    Advertised FEC modes: Not reported
    Link partner advertised link modes:  10baseT/Half 10baseT/Full
                                         100baseT/Half 100baseT/Full
                                         1000baseT/Full
    Link partner advertised pause frame use: Symmetric
    Link partner advertised auto-negotiation: Yes
    Link partner advertised FEC modes: Not reported
    Speed: 1000Mb/s
    Duplex: Full
    Port: Twisted Pair
    PHYAD: 1
    Transceiver: internal
    Auto-negotiation: on
    MDI-X: off
    Supports Wake-on: g
    Wake-on: d
    Current message level: 0x000000ff (255)
                           drv probe link timer ifdown ifup rx_err tx_err
    Link detected: yes
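To compare link speed and duplex settings across the cluster quickly, a minimal sketch, assuming passwordless SSH access and that the host names and the interface name enp22s0f0 are placeholders:
[root@mon ~]# for host in mon01 osd01 osd02; do echo "== $host =="; ssh "$host" "ethtool enp22s0f0 | grep -E 'Speed|Duplex'"; done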
Additional Resources
- See the Basic Network troubleshooting solution on the Customer Portal for details.
- See the Verifying and configuring the MTU value section in the Red Hat Ceph Storage Configuration Guide.
- See the Configuring Firewall section in the Red Hat Ceph Storage Installation Guide.
- See the What is the "ethtool" command and how can I use it to obtain information about my network devices and interfaces solution on the Customer Portal for details.
- See the RHEL network interface dropping packets solutions on the Customer Portal for details.
- For details, see the What are the performance benchmarking tools available for Red Hat Ceph Storage? solution on the Customer Portal.
- See the Networking Guide for Red Hat Enterprise Linux 7.
- For more information, see Knowledgebase articles and solutions related to troubleshooting networking issues on the Customer Portal.
3.3. Basic chrony NTP troubleshooting
This section includes basic chrony troubleshooting steps.
Prerequisites
- A running Red Hat Ceph Storage cluster.
- Root-level access to the Ceph Monitor node.
Procedure
Verify that the chronyd daemon is running on the Ceph Monitor hosts:
Example
[root@mon ~]# systemctl status chronyd
If chronyd is not running, enable and start it:
Example
[root@mon ~]# systemctl enable chronyd
[root@mon ~]# systemctl start chronyd
Ensure that chronyd is synchronizing the clocks correctly:
Example
[root@mon ~]# chronyc sources
[root@mon ~]# chronyc sourcestats
[root@mon ~]# chronyc tracking
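Clock skew warnings arise when the Monitor clocks disagree, so it can help to compare the chrony state on every Monitor at once. A minimal sketch, assuming passwordless SSH access and that mon01, mon02, and mon03 are placeholder host names:
[root@mon ~]# for host in mon01 mon02 mon03; do echo "== $host =="; ssh "$host" "chronyc tracking | grep -E 'Reference ID|System time|Leap status'"; done
A Leap status of Normal and a small System time offset on every Monitor indicate healthy synchronization.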
Additional Resources
- See the How to troubleshoot chrony issues solution on the Red Hat Customer Portal for advanced chrony NTP troubleshooting steps.
- See the Clock skew section in the Red Hat Ceph Storage Troubleshooting Guide for further details.
- See the Checking if chrony is synchronized section for further details.
3.4. Basic NTP troubleshooting
This section includes basic NTP troubleshooting steps.
Prerequisites
- A running Red Hat Ceph Storage cluster.
- Root-level access to the Ceph Monitor node.
Procedure
Verify that the ntpd daemon is running on the Ceph Monitor hosts:
Example
[root@mon ~]# systemctl status ntpd
If ntpd is not running, enable and start it:
Example
[root@mon ~]# systemctl enable ntpd
[root@mon ~]# systemctl start ntpd
Ensure that ntpd is synchronizing the clocks correctly:
Example
[root@mon ~]# ntpq -p
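In the ntpq -p output, the peer currently selected for synchronization is prefixed with an asterisk (*). To confirm that every Monitor has a selected peer, a minimal sketch, assuming passwordless SSH access and that mon01, mon02, and mon03 are placeholder host names:
[root@mon ~]# for host in mon01 mon02 mon03; do echo "== $host =="; ssh "$host" "ntpq -p | grep '^\*'"; done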
Additional Resources
- See the How to troubleshoot NTP issues solution on the Red Hat Customer Portal for advanced NTP troubleshooting steps.
- See the Clock skew section in the Red Hat Ceph Storage Troubleshooting Guide for further details.