Troubleshooting Guide
Troubleshooting Red Hat Ceph Storage
Chapter 1. Initial Troubleshooting
This chapter includes information on:
- How to start troubleshooting Ceph errors (Section 1.1, “Identifying Problems”)
- Most common ceph health error messages (Section 1.2, “Understanding the Output of the ceph health Command”)
- Most common Ceph log error messages (Section 1.3, “Understanding Ceph Logs”)
1.1. Identifying Problems

To determine possible causes of the error that you encounter with Red Hat Ceph Storage, answer the following question:
- Certain problems can arise when using unsupported configurations. Ensure that your configuration is supported. See the Red Hat Ceph Storage: Supported configurations article for details.
Do you know what Ceph component causes the problem?
- No. Follow Section 1.1.1, “Diagnosing the Health of a Ceph Storage Cluster”.
- Monitors. See Chapter 4, Troubleshooting Monitors.
- OSDs. See Chapter 5, Troubleshooting OSDs.
- Placement groups. See Chapter 7, Troubleshooting Placement Groups.
1.1.1. Diagnosing the Health of a Ceph Storage Cluster
This procedure lists basic steps to diagnose the health of a Ceph Storage Cluster.
Check the overall status of the cluster:

# ceph health detail

If the command returns HEALTH_WARN or HEALTH_ERR, see Section 1.2, “Understanding the Output of the ceph health Command” for details.

- Check the Ceph logs for any error messages listed in Section 1.3, “Understanding Ceph Logs”. The logs are located by default in the /var/log/ceph/ directory.
- If the logs do not include a sufficient amount of information, increase the debugging level and try to reproduce the action that failed. See Chapter 2, Configuring Logging for details.
- Use the ceph-medic utility to diagnose the storage cluster. See the Using ceph-medic to diagnose a Ceph storage cluster section in the Red Hat Ceph Storage 3 Administration Guide for more details.
1.2. Understanding the Output of the ceph health Command

The ceph health command returns information about the status of the Ceph Storage Cluster:

- HEALTH_OK indicates that the cluster is healthy.
- HEALTH_WARN indicates a warning. In some cases, the Ceph status returns to HEALTH_OK automatically, for example when Ceph finishes the rebalancing process. However, consider further troubleshooting if a cluster stays in the HEALTH_WARN state for a longer time.
- HEALTH_ERR indicates a more serious problem that requires your immediate attention.

Use the ceph health detail and ceph -s commands to get a more detailed output.
The most common HEALTH_ERR and HEALTH_WARN error messages related to Monitors, OSDs, and placement groups, together with links to the sections that explain the errors and point to specific procedures to fix the problems, are listed in the chapters for the individual components: see Section 4.1 for Monitors, Section 5.1 for OSDs, and Chapter 7, Troubleshooting Placement Groups for placement groups.
1.3. Understanding Ceph Logs

By default, Ceph stores its logs in the /var/log/ceph/ directory.

The <cluster-name>.log file is the main cluster log file that includes the global cluster events. By default, this log is named ceph.log. Only the Monitor hosts include the main cluster log.

Each OSD and Monitor has its own log file, named <cluster-name>-osd.<number>.log and <cluster-name>-mon.<hostname>.log.
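For example, on a Monitor host that also runs two OSDs, the log directory might contain the following files (the host and OSD names here are only illustrative):

# ls /var/log/ceph/
ceph.log  ceph-mon.host1.log  ceph-osd.0.log  ceph-osd.1.log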
When you increase the debugging level for Ceph subsystems, Ceph generates new log files for those subsystems as well. For details about logging, see Chapter 2, Configuring Logging.
The most common Ceph log error messages related to Monitors and OSDs, together with links to the sections that explain them and point to specific procedures to fix them, are listed in Section 4.1, “The Most Common Error Messages Related to Monitors” and Section 5.1, “The Most Common Error Messages Related to OSDs”.
Chapter 2. Configuring Logging
This chapter describes how to configure logging for various Ceph subsystems.
Logging is resource intensive. Also, verbose logging can generate a huge amount of data in a relatively short time. If you are encountering problems in a specific subsystem of the cluster, enable logging only for that subsystem. See Section 2.1, “Ceph Subsystems” for more information.
In addition, consider setting up a rotation of log files. See Section 2.4, “Accelerating Log Rotation” for details.
Once you fix any problems you encounter, change the subsystems log and memory levels back to their default values. See Appendix A, Subsystems Default Logging Levels Values for a list of all Ceph subsystems and their default values.
You can configure Ceph logging by:

- Using the ceph command at runtime. This is the most common approach. See Section 2.2, “Configuring Logging at Runtime” for details.
- Updating the Ceph configuration file. Use this approach if you are encountering problems when starting the cluster. See Section 2.3, “Configuring Logging in the Ceph Configuration File” for details.
2.1. Ceph Subsystems
This section contains information about Ceph subsystems and their logging levels.
Understanding Ceph Subsystems and Their Logging Levels
Ceph consists of several subsystems. Each subsystem has a logging level for its:

- Output logs that are stored by default in the /var/log/ceph/ directory (log level)
- Logs that are stored in a memory cache (memory level)
In general, Ceph does not send logs stored in memory to the output logs unless:
- A fatal signal is raised
- An assert in source code is triggered
- You request it
You can set different values for each of these subsystems. Ceph logging levels operate on a scale of 1 to 20, where 1 is terse and 20 is verbose.

Use a single value for the log level and memory level to set them both to the same value. For example, debug_osd = 5 sets the debug level for the ceph-osd daemon to 5.
To use different values for the output log level and the memory level, separate the values with a forward slash (/). For example, debug_mon = 1/5 sets the debug log level for the ceph-mon daemon to 1 and its memory log level to 5.
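Both forms can be combined in the Ceph configuration file. The following minimal sketch, with subsystems and levels chosen only for illustration, sets the OSD subsystem to log level and memory level 5 and the Monitor subsystem to log level 1 and memory level 5:

[global]
        debug_osd = 5
        debug_mon = 1/5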
The Most Used Ceph Subsystems and Their Default Values
Subsystem | Log Level | Memory Level | Description |
---|---|---|---|
asok | 1 | 5 | The administration socket |
auth | 1 | 5 | Authentication |
client | 0 | 5 | Any application or library that uses librados |
filestore | 1 | 5 | The FileStore OSD back end |
journal | 1 | 5 | The OSD journal |
mds | 1 | 5 | The Metadata Servers |
monc | 0 | 5 | The Monitor client handles communication between most Ceph daemons and Monitors |
mon | 1 | 5 | Monitors |
ms | 0 | 5 | The messaging system between Ceph components |
osd | 0 | 5 | The OSD Daemons |
paxos | 0 | 5 | The algorithm that Monitors use to establish a consensus |
rados | 0 | 5 | Reliable Autonomic Distributed Object Store, a core component of Ceph |
rbd | 0 | 5 | The Ceph Block Devices |
rgw | 1 | 5 | The Ceph Object Gateway |
Example Log Outputs
The following examples show the type of messages in the logs when you increase the verbosity for the Monitors and OSDs.
Monitor Debug Settings
debug_ms = 5
debug_mon = 20
debug_paxos = 20
debug_auth = 20
Example Log Output of Monitor Debug Settings
OSD Debug Settings
debug_ms = 5
debug_osd = 20
debug_filestore = 20
debug_journal = 20
Example Log Output of OSD Debug Settings
2.2. Configuring Logging at Runtime

To activate the Ceph debugging output, dout(), at runtime:

ceph tell <type>.<id> injectargs --debug-<subsystem> <value> [--<name> <value>]
Replace:

- <type> with the type of Ceph daemons (osd, mon, or mds)
- <id> with a specific ID of the Ceph daemon. Alternatively, use * to apply the runtime setting to all daemons of a particular type.
- <subsystem> with a specific subsystem. See Section 2.1, “Ceph Subsystems” for details.
- <value> with a number from 1 to 20, where 1 is terse and 20 is verbose
For example, to set the log level for the OSD subsystem on the OSD named osd.0 to 0 and the memory level to 5:

# ceph tell osd.0 injectargs --debug-osd 0/5
To see the configuration settings at runtime:

Log in to the host with a running Ceph daemon, for example ceph-osd or ceph-mon.

Display the configuration:

ceph daemon <name> config show | less

Specify the name of the Ceph daemon, for example:

# ceph daemon osd.0 config show | less
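Because the config show output is long, it can help to filter it for the logging-related settings only, for example:

# ceph daemon osd.0 config show | grep debug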
See Also
- Section 2.3, “Configuring Logging in the Ceph Configuration File”
- The Logging Configuration Reference chapter in the Configuration Guide for Red Hat Ceph Storage 3
2.3. Configuring Logging in the Ceph Configuration File

To activate Ceph debugging output, dout(), at boot time, add the debugging settings to the Ceph configuration file.

- For subsystems common to each daemon, add the settings under the [global] section.
- For subsystems for particular daemons, add the settings under a daemon section, such as [mon], [osd], or [mds].
For example, a configuration that increases Monitor and OSD debugging might look like the following sketch; the exact subsystems and levels depend on the problem you are troubleshooting:
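[global]
        debug_ms = 1/5
[mon]
        debug_mon = 20
        debug_paxos = 1/5
        debug_auth = 2
[osd]
        debug_osd = 1/5
        debug_filestore = 1/5
        debug_journal = 1
[mds]
        debug_mds = 1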
See Also
- Section 2.1, “Ceph Subsystems”
- Section 2.2, “Configuring Logging at Runtime”
- The Logging Configuration Reference chapter in the Configuration Guide for Red Hat Ceph Storage 3
2.4. Accelerating Log Rotation

Increasing the debugging level for Ceph components might generate a huge amount of data. If you have almost full disks, you can accelerate log rotation by modifying the Ceph log rotation file at /etc/logrotate.d/ceph. The Cron job scheduler uses this file to schedule log rotation.
Procedure: Accelerating Log Rotation
Add the size setting after the rotation frequency to the log rotation file:

rotate 7
weekly
size <size>
compress
sharedscripts

For example, to rotate a log file when it reaches 500 MB, use size 500M.

Open the crontab editor:

$ crontab -e

Add an entry to check the /etc/logrotate.d/ceph file. For example, to instruct Cron to check /etc/logrotate.d/ceph every 30 minutes:

30 * * * * /usr/sbin/logrotate /etc/logrotate.d/ceph >/dev/null 2>&1
See Also
- The Scheduling a Recurring Job Using Cron section in the System Administrator’s Guide for Red Hat Enterprise Linux 7.
Chapter 3. Troubleshooting Networking Issues
This chapter lists basic troubleshooting procedures connected with networking and Network Time Protocol (NTP).
3.1. Basic Networking Troubleshooting

Red Hat Ceph Storage depends heavily on a reliable network connection. Red Hat Ceph Storage nodes use the network for communicating with each other. Networking issues can cause many problems with Ceph OSDs, such as flapping OSDs, or OSDs being incorrectly reported as down. Networking issues can also cause Ceph Monitor clock skew errors. In addition, packet loss, high latency, or limited bandwidth can impact the cluster performance and stability.
Procedure: Basic Networking Troubleshooting
Installing the net-tools package can help when troubleshooting network issues that can occur in a Ceph storage cluster:

Example

[root@mon ~]# yum install net-tools
[root@mon ~]# yum install telnet
Copy to Clipboard Copied! Toggle word wrap Toggle overflow Verify that the
cluster_network
andpublic_network
parameters in the Ceph configuration file include the correct values:Example
cat /etc/ceph/ceph.conf | grep net cluster_network = 192.168.1.0/24 public_network = 192.168.0.0/24
[root@mon ~]# cat /etc/ceph/ceph.conf | grep net cluster_network = 192.168.1.0/24 public_network = 192.168.0.0/24
Copy to Clipboard Copied! Toggle word wrap Toggle overflow Verify that the network interfaces are up:
Example
ip link list
[root@mon ~]# ip link list 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000 link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 2: enp22s0f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000 link/ether 40:f2:e9:b8:a0:48 brd ff:ff:ff:ff:ff:ff
Copy to Clipboard Copied! Toggle word wrap Toggle overflow Verify that the Ceph nodes are able to reach each other using their short host names. Verify this on each node in the storage cluster:
Syntax
ping SHORT_HOST_NAME
ping SHORT_HOST_NAME
Copy to Clipboard Copied! Toggle word wrap Toggle overflow Example
ping osd01
[root@mon ~]# ping osd01
Copy to Clipboard Copied! Toggle word wrap Toggle overflow If you use a firewall, ensure that Ceph nodes are able to reach other on their appropriate ports. The
firewall-cmd
andtelnet
tools can validate the port status, and if the port is open respectively:Syntax
firewall-cmd --info-zone=ZONE telnet IP_ADDRESS PORT
firewall-cmd --info-zone=ZONE telnet IP_ADDRESS PORT
Copy to Clipboard Copied! Toggle word wrap Toggle overflow Example
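For example, to check the public zone and then test whether a Monitor at 192.168.0.11 accepts connections on its default port 6789 (the zone name and address here are only illustrative):

[root@mon ~]# firewall-cmd --info-zone=public
[root@mon ~]# telnet 192.168.0.11 6789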
Verify that there are no errors on the interface counters. Verify that the network connectivity between nodes has the expected latency, and that there is no packet loss.

Using the ethtool command:

Syntax

ethtool -S INTERFACE

Using the ifconfig command, check the error and drop counters in its output.

Using the netstat command, check the per-interface statistics (netstat -i).
For performance issues, in addition to the latency checks, use the iperf3 tool to verify the network bandwidth between all nodes of the storage cluster. The iperf3 tool does a simple point-to-point network bandwidth test between a server and a client.

Install the iperf3 package on the Red Hat Ceph Storage nodes whose bandwidth you want to check:

Example

[root@mon ~]# yum install iperf3
On a Red Hat Ceph Storage node, start the iperf3 server:

Example

[root@mon ~]# iperf3 -s
-----------------------------------------------------------
Server listening on 5201
-----------------------------------------------------------

Note: The default port is 5201, but it can be set using the -p command argument.

On a different Red Hat Ceph Storage node, start the iperf3 client:
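The full client output is not reproduced here; a trimmed, illustrative run against a server node named mon (the host name and figures are hypothetical and chosen to match the summary in the next paragraph) looks like this:

[root@osd ~]# iperf3 -c mon
[ ID] Interval           Transfer     Bandwidth       Retr
[  4]   0.00-10.00  sec  1.28 GBytes  1.10 Gbits/sec    0   sender
[  4]   0.00-10.00  sec  1.28 GBytes  1.10 Gbits/sec        receiver

iperf Done.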
This output shows a network bandwidth of 1.1 Gbits/second between the Red Hat Ceph Storage nodes, along with no retransmissions (Retr) during the test.

Red Hat recommends you validate the network bandwidth between all the nodes in the storage cluster.
Ensure that all nodes have the same network interconnect speed. Slower attached nodes might slow down the faster connected ones. Also, ensure that the inter-switch links can handle the aggregated bandwidth of the attached nodes:

Syntax

ethtool INTERFACE
See Also
- The Networking Guide for Red Hat Enterprise Linux 7.
- See the Verifying and configuring the MTU value section in the Red Hat Ceph Storage Configuration Guide.
- Knowledgebase articles and solutions related to troubleshooting networking issues on the Customer Portal.
3.2. Basic NTP Troubleshooting
This section includes basic NTP troubleshooting steps.
Procedure: Basic NTP Troubleshooting
Verify that the ntpd daemon is running on the Monitor hosts:

# systemctl status ntpd

If ntpd is not running, enable and start it:

# systemctl enable ntpd
# systemctl start ntpd

Ensure that ntpd is synchronizing the clocks correctly:

$ ntpq -p

- See the How to troubleshoot NTP issues solution on the Red Hat Customer Portal for advanced NTP troubleshooting steps.
Chapter 4. Troubleshooting Monitors
This chapter contains information on how to fix the most common errors related to the Ceph Monitors.
Before You Start
- Verify your network connection. See Chapter 3, Troubleshooting Networking Issues for details.
4.1. The Most Common Error Messages Related to Monitors

The most common error messages that are returned by the ceph health detail command, or included in the Ceph logs, are covered in the following sections, which explain the errors and point to specific procedures to fix the problems.
4.1.1. A Monitor Is Out of Quorum

One or more Monitors are marked as down but the other Monitors are still able to form a quorum. In addition, the ceph health detail command returns an error message similar to the following one:

HEALTH_WARN 1 mons down, quorum 1,2 mon.b,mon.c
mon.a (rank 0) addr 127.0.0.1:6789/0 is down (out of quorum)
What This Means
Ceph marks a Monitor as down due to various reasons.

If the ceph-mon daemon is not running, it might have a corrupted store, or some other error is preventing the daemon from starting. Also, the /var/ partition might be full. As a consequence, ceph-mon is not able to perform any operations on the store located by default at /var/lib/ceph/mon-<short-host-name>/store.db and terminates.

If the ceph-mon daemon is running but the Monitor is out of quorum and marked as down, the cause of the problem depends on the Monitor state:

- If the Monitor is in the probing state longer than expected, it cannot find the other Monitors. This problem can be caused by networking issues, or the Monitor can have an outdated Monitor map (monmap) and be trying to reach the other Monitors on incorrect IP addresses. Alternatively, if the monmap is up-to-date, the Monitor's clock might not be synchronized.
- If the Monitor is in the electing state longer than expected, the Monitor's clock might not be synchronized.
- If the Monitor changes its state from synchronizing to electing and back, the cluster state is advancing. This means that it is generating new maps faster than the synchronization process can handle.
- If the Monitor marks itself as the leader or a peon, then it believes that it is in a quorum, while the remaining cluster is sure that it is not. This problem can be caused by failed clock synchronization.
To Troubleshoot This Problem
Verify that the ceph-mon daemon is running. If not, start it:

systemctl status ceph-mon@<host-name>
systemctl start ceph-mon@<host-name>

Replace <host-name> with the short name of the host where the daemon is running. Use the hostname -s command when unsure.

- If you are not able to start ceph-mon, follow the steps in The ceph-mon Daemon Cannot Start.
- If you are able to start the ceph-mon daemon but it is marked as down, follow the steps in The ceph-mon Daemon Is Running, but Still Marked as down.

The ceph-mon Daemon Cannot Start

Check the corresponding Monitor log, by default located at /var/log/ceph/ceph-mon.<host-name>.log.

If the log contains error messages similar to the following ones, the Monitor might have a corrupted store:

Corruption: error in middle of record
Corruption: 1 missing files; e.g.: /var/lib/ceph/mon/mon.0/store.db/1234567.ldb

To fix this problem, replace the Monitor. See Section 4.4, “Replacing a Failed Monitor”.

If the log contains an error message similar to the following one, the /var/ partition might be full. Delete any unnecessary data from /var/.

Caught signal (Bus error)

Important: Do not delete any data from the Monitor directory manually. Instead, use the ceph-monstore-tool to compact it. See Section 4.5, “Compacting the Monitor Store” for details.

- If you see any other error messages, open a support ticket. See Chapter 9, Contacting Red Hat Support Service for details.

The ceph-mon Daemon Is Running, but Still Marked as down

From the Monitor host that is out of the quorum, use the mon_status command to check its state:

ceph daemon <id> mon_status

Replace <id> with the ID of the Monitor, for example:

# ceph daemon mon.a mon_status

If the status is probing, verify the locations of the other Monitors in the mon_status output.

- If the addresses are incorrect, the Monitor has an incorrect Monitor map (monmap). To fix this problem, see Section 4.2, “Injecting a Monitor Map”.
- If the addresses are correct, verify that the Monitor clocks are synchronized. See Section 4.1.2, “Clock Skew” for details. In addition, troubleshoot any networking issues. See Chapter 3, Troubleshooting Networking Issues.

- If the status is electing, verify that the Monitor clocks are synchronized. See Section 4.1.2, “Clock Skew”.
- If the status changes from electing to synchronizing, open a support ticket. See Chapter 9, Contacting Red Hat Support Service for details.
- If the Monitor is the leader or a peon, verify that the Monitor clocks are synchronized. See Section 4.1.2, “Clock Skew”. Open a support ticket if synchronizing the clocks does not solve the problem. See Chapter 9, Contacting Red Hat Support Service for details.
See Also
- Section 4.1.4, “Understanding Monitor Status”
- The Starting, Stopping, Restarting a Daemon by Instances section in the Administration Guide for Red Hat Ceph Storage 3
- The Using the Administration Socket section in the Administration Guide for Red Hat Ceph Storage 3
4.1.2. Clock Skew

A Ceph Monitor is out of quorum, and the ceph health detail command output contains error messages similar to these:

mon.a (rank 0) addr 127.0.0.1:6789/0 is down (out of quorum)
mon.a addr 127.0.0.1:6789/0 clock skew 0.08235s > max 0.05s (latency 0.0045s)

In addition, Ceph logs contain error messages similar to these:

2015-06-04 07:28:32.035795 7f806062e700 0 log [WRN] : mon.a 127.0.0.1:6789/0 clock skew 0.14s > max 0.05s
2015-06-04 04:31:25.773235 7f4997663700 0 log [WRN] : message from mon.1 was stamped 0.186257s in the future, clocks not synchronized
What This Means
The clock skew error message indicates that Monitors' clocks are not synchronized. Clock synchronization is important because Monitors depend on time precision and behave unpredictably if their clocks are not synchronized.

The mon_clock_drift_allowed parameter determines what disparity between the clocks is tolerated. By default, this parameter is set to 0.05 seconds.

Do not change the default value of mon_clock_drift_allowed without previous testing. Changing this value might affect the stability of the Monitors and the Ceph Storage Cluster in general.
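To check the value currently in effect on a Monitor, you can query the daemon through the administration socket; the Monitor name in this example is only illustrative:

# ceph daemon mon.host1 config get mon_clock_drift_allowed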
Possible causes of the clock skew error include network problems or problems with Network Time Protocol (NTP) synchronization if that is configured. In addition, time synchronization does not work properly on Monitors deployed on virtual machines.
To Troubleshoot This Problem
- Verify that your network works correctly. For details, see Chapter 3, Troubleshooting Networking Issues. In particular, troubleshoot any problems with NTP clients if you use NTP. See Section 3.2, “Basic NTP Troubleshooting” for more information.
- If you use a remote NTP server, consider deploying your own NTP server on your network. For details, see the Configuring NTP Using ntpd chapter in the System Administrator’s Guide for Red Hat Enterprise Linux 7.
- If you do not use an NTP client, set one up. For details, see the Configuring the Network Time Protocol for Red Hat Ceph Storage section in the Red Hat Ceph Storage 3 Installation Guide for Red Hat Enterprise Linux or Ubuntu.
- If you use virtual machines for hosting the Monitors, move them to bare metal hosts. Using virtual machines for hosting Monitors is not supported. For details, see the Red Hat Ceph Storage: Supported configurations article on the Red Hat Customer Portal.
Ceph evaluates time synchronization only every five minutes, so there is a delay between fixing the problem and clearing the clock skew messages.
4.1.3. The Monitor Store is Getting Too Big

The ceph health command returns an error message similar to the following one:

mon.ceph1 store is getting too big! 48031 MB >= 15360 MB -- 62% avail
What This Means
The Ceph Monitor store is a LevelDB database that stores entries as key–value pairs. The database includes a cluster map and is located by default at /var/lib/ceph/mon/<cluster-name>-<short-host-name>/store.db.
Querying a large Monitor store can take time. As a consequence, the Monitor can be delayed in responding to client queries.
In addition, if the /var/ partition is full, the Monitor cannot perform any write operations to the store and terminates. See Section 4.1.1, “A Monitor Is Out of Quorum” for details on troubleshooting this issue.
To Troubleshoot This Problem
Check the size of the database:

du -sch /var/lib/ceph/mon/<cluster-name>-<short-host-name>/store.db

Specify the name of the cluster and the short host name of the host where the ceph-mon daemon is running, for example:

# du -sch /var/lib/ceph/mon/ceph-host1/store.db
47G /var/lib/ceph/mon/ceph-host1/store.db/
47G total

- Compact the Monitor store. For details, see Section 4.5, “Compacting the Monitor Store”.
4.1.4. Understanding Monitor Status

The mon_status command returns information about a Monitor, such as:

- State
- Rank
- Elections epoch
- Monitor map (monmap)

If Monitors are able to form a quorum, use mon_status with the ceph command-line utility.

If Monitors are not able to form a quorum, but the ceph-mon daemon is running, use the administration socket to execute mon_status. For details, see the Using the Administration Socket section in the Administration Guide for Red Hat Ceph Storage 3.

An example output of mon_status:
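The following trimmed output is an illustration with hypothetical values: the quorum contains ranks 1, 2, and 3, so the Monitor with rank 1 (mon.1) is the leader, as referenced in the Monitor States descriptions below:

{
    "name": "1",
    "rank": 1,
    "state": "leader",
    "election_epoch": 82,
    "quorum": [
        1,
        2,
        3
    ],
    "outside_quorum": [],
    "extra_probe_peers": [],
    "sync_provider": [],
    "monmap": {
        "epoch": 6,
        "fsid": "2d9aebdf-7a01-4de7-8c11-8e969c2c0b4a",
        "modified": "2017-11-09 13:06:27.089701",
        "created": "2017-06-19 13:07:39.400379",
        "mons": [
            {
                "rank": 0,
                "name": "0",
                "addr": "127.0.0.1:6789/0"
            },
            {
                "rank": 1,
                "name": "1",
                "addr": "127.0.0.1:6790/0"
            },
            {
                "rank": 2,
                "name": "2",
                "addr": "127.0.0.1:6791/0"
            },
            {
                "rank": 3,
                "name": "3",
                "addr": "127.0.0.1:6792/0"
            }
        ]
    }
}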
Monitor States
- Leader: During the electing phase, Monitors elect a leader. The leader is the Monitor with the highest rank, that is, the rank with the lowest value. In the example above, the leader is mon.1.
- Peon: Peons are the Monitors in the quorum that are not leaders. If the leader fails, the peon with the highest rank becomes a new leader.
- Probing: A Monitor is in the probing state if it is looking for other Monitors. For example, after you start the Monitors, they are probing until they find enough Monitors specified in the Monitor map (monmap) to form a quorum.
- Electing: A Monitor is in the electing state if it is in the process of electing the leader. Usually, this status changes quickly.
- Synchronizing: A Monitor is in the synchronizing state if it is synchronizing with the other Monitors to join the quorum. The smaller the Monitor store is, the faster the synchronization process completes. Therefore, if you have a large store, synchronization takes longer.
4.2. Injecting a Monitor Map

If a Monitor has an outdated or corrupted Monitor map (monmap), it cannot join a quorum because it is trying to reach the other Monitors on incorrect IP addresses.

The safest way to fix this problem is to obtain and inject the actual Monitor map from other Monitors. Note that this action overwrites the existing Monitor map kept by the Monitor.

This procedure shows how to inject the Monitor map when the other Monitors are able to form a quorum, or when at least one Monitor has a correct Monitor map. If all Monitors have a corrupted store and therefore also a corrupted Monitor map, see Section 4.3, “Recovering the Monitor Store”.
Procedure: Injecting a Monitor Map
If the remaining Monitors are able to form a quorum, get the Monitor map by using the ceph mon getmap command:

# ceph mon getmap -o /tmp/monmap

If the remaining Monitors are not able to form the quorum and you have at least one Monitor with a correct Monitor map, copy it from that Monitor:

Stop the Monitor which you want to copy the Monitor map from:

systemctl stop ceph-mon@<host-name>

For example, to stop the Monitor running on a host with the host1 short host name:

# systemctl stop ceph-mon@host1

Copy the Monitor map:

ceph-mon -i <id> --extract-monmap /tmp/monmap

Replace <id> with the ID of the Monitor which you want to copy the Monitor map from, for example:

# ceph-mon -i mon.a --extract-monmap /tmp/monmap

Stop the Monitor with the corrupted or outdated Monitor map:

systemctl stop ceph-mon@<host-name>

For example, to stop a Monitor running on a host with the host2 short host name:

# systemctl stop ceph-mon@host2

Inject the Monitor map:

ceph-mon -i <id> --inject-monmap /tmp/monmap

Replace <id> with the ID of the Monitor with the corrupted or outdated Monitor map, for example:

# ceph-mon -i mon.c --inject-monmap /tmp/monmap

Start the Monitor, for example:

# systemctl start ceph-mon@host2

If you copied the Monitor map from another Monitor, start that Monitor, too, for example:

# systemctl start ceph-mon@host1
4.3. Recovering the Monitor Store

Ceph Monitors store the cluster map in a key–value store such as LevelDB. If the store is corrupted on a Monitor, the Monitor terminates unexpectedly and fails to start again. The Ceph logs might include the following errors:

Corruption: error in middle of record
Corruption: 1 missing files; e.g.: /var/lib/ceph/mon/mon.0/store.db/1234567.ldb

Production clusters must use at least three Monitors so that if one fails, it can be replaced with another one. However, under certain circumstances, all Monitors can have corrupted stores. For example, when the Monitor nodes have incorrectly configured disk or file system settings, a power outage can corrupt the underlying file system.

If the store is corrupted on all Monitors, you can recover it with information stored on the OSD nodes by using the ceph-monstore-tool and ceph-objectstore-tool utilities.
This procedure cannot recover the following information:
- Metadata Server (MDS) keyrings and maps
- Placement Group settings:
  - full ratio set by using the ceph pg set_full_ratio command
  - nearfull ratio set by using the ceph pg set_nearfull_ratio command

Never restore the Monitor store from an old backup. Rebuild the Monitor store from the current cluster state using the following steps and restore from that.
Before You Start

- Ensure that you have the rsync utility and the ceph-test package installed.
Procedure: Recovering the Monitor Store
Use the following commands from the Monitor node with the corrupted store.
Collect the cluster map from all OSD nodes. Replace <directory> with a temporary directory to store the collected cluster map, for example /tmp/mon-store.
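A sketch of this step, assuming the temporary directory /tmp/mon-store and the default OSD data paths under /var/lib/ceph/osd/; run it on every OSD node while its OSD daemons are stopped, and then copy the result back to the Monitor node, for example with rsync:

ms=/tmp/mon-store
mkdir -p $ms
# Gather the cluster map information from every OSD on this node.
for osd in /var/lib/ceph/osd/ceph-*; do
    ceph-objectstore-tool --data-path $osd --op update-mon-db --mon-store-path $ms
done
# Copy the collected store back to the Monitor node (host name is illustrative).
rsync -avz $ms root@<mon-host>:$ms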
Set appropriate capabilities:
ceph-authtool <keyring> -n mon. --cap mon 'allow *'
ceph-authtool <keyring> -n client.admin --cap mon 'allow *' --cap osd 'allow *' --cap mds 'allow *'

Replace <keyring> with the path to the client administration keyring, for example:

$ ceph-authtool /etc/ceph/ceph.client.admin.keyring -n mon. --cap mon 'allow *'
$ ceph-authtool /etc/ceph/ceph.client.admin.keyring -n client.admin --cap mon 'allow *' --cap osd 'allow *' --cap mds 'allow *'

Rebuild the Monitor store from the collected map:

ceph-monstore-tool <directory> rebuild -- --keyring <keyring>

Replace <directory> with the temporary directory from the first step and <keyring> with the path to the client administration keyring, for example:

$ ceph-monstore-tool /tmp/mon-store rebuild -- --keyring /etc/ceph/ceph.client.admin.keyring

Note: If you do not use the cephx authentication, omit the --keyring option:

$ ceph-monstore-tool /tmp/mon-store rebuild

Back up the corrupted store:

mv /var/lib/ceph/mon/<mon-ID>/store.db \
   /var/lib/ceph/mon/<mon-ID>/store.db.corrupted

Replace <mon-ID> with the Monitor ID, for example <mon.0>:

# mv /var/lib/ceph/mon/mon.0/store.db \
     /var/lib/ceph/mon/mon.0/store.db.corrupted

Replace the corrupted store:

mv /tmp/mon-store/store.db /var/lib/ceph/mon/<mon-ID>/store.db

Replace <mon-ID> with the Monitor ID, for example <mon.0>:

# mv /tmp/mon-store/store.db /var/lib/ceph/mon/mon.0/store.db

Repeat this step for all Monitors with a corrupted store.

Change the owner of the new store:

chown -R ceph:ceph /var/lib/ceph/mon/<mon-ID>/store.db

Replace <mon-ID> with the Monitor ID, for example <mon.0>:

# chown -R ceph:ceph /var/lib/ceph/mon/mon.0/store.db

Repeat this step for all Monitors with a corrupted store.
4.4. Replacing a Failed Monitor
When a Monitor has a corrupted store, the recommended way to fix this problem is to replace the Monitor by using the Ansible automation application.
Before You Start
- Before removing a Monitor, ensure that the other Monitors are running and able to form a quorum.
Procedure: Replacing a Failed Monitor
From the Monitor host, remove the Monitor store, by default located at /var/lib/ceph/mon/<cluster-name>-<short-host-name>:

rm -rf /var/lib/ceph/mon/<cluster-name>-<short-host-name>

Specify the short host name of the Monitor host and the cluster name. For example, to remove the Monitor store of a Monitor running on host1 from a cluster called remote:

# rm -rf /var/lib/ceph/mon/remote-host1

Remove the Monitor from the Monitor map (monmap):

ceph mon remove <short-host-name> --cluster <cluster-name>

Specify the short host name of the Monitor host and the cluster name. For example, to remove the Monitor running on host1 from a cluster called remote:

# ceph mon remove host1 --cluster remote

Troubleshoot and fix any problems related to the underlying file system or hardware of the Monitor host.

From the Ansible administration node, redeploy the Monitor by running the ceph-ansible playbook:

$ cd /usr/share/ceph-ansible
$ ansible-playbook site.yml
See Also
- Section 4.1.1, “A Monitor Is Out of Quorum”
- The Managing Cluster Size chapter in the Administration Guide for Red Hat Ceph Storage 3
- The Deploying Red Hat Ceph Storage chapter in the Red Hat Ceph Storage 3 Installation Guide for Red Hat Enterprise Linux
4.5. Compacting the Monitor Store

When the Monitor store has grown too big, you can compact it:

- Dynamically, by using the ceph tell command. See the Compacting the Monitor Store Dynamically procedure for details.
- Upon the start of the ceph-mon daemon. See the Compacting the Monitor Store at Startup procedure for details.
- By using the ceph-monstore-tool when the ceph-mon daemon is not running. Use this method when the previously mentioned methods fail to compact the Monitor store or when the Monitor is out of quorum and its log contains the Caught signal (Bus error) error message. See the Compacting the Monitor Store with ceph-monstore-tool procedure for details.

The Monitor store size changes when the cluster is not in the active+clean state or during the rebalancing process. For this reason, compact the Monitor store when rebalancing is completed. Also, ensure that the placement groups are in the active+clean state.
Procedure: Compacting the Monitor Store Dynamically
To compact the Monitor store when the ceph-mon daemon is running:

ceph tell mon.<host-name> compact

Replace <host-name> with the short host name of the host where the ceph-mon daemon is running. Use the hostname -s command when unsure.

# ceph tell mon.host1 compact
Procedure: Compacting the Monitor Store at Startup
Add the following parameter to the Ceph configuration file under the [mon] section:

[mon]
mon_compact_on_start = true

Restart the ceph-mon daemon:

systemctl restart ceph-mon@<host-name>

Replace <host-name> with the short name of the host where the daemon is running. Use the hostname -s command when unsure.

# systemctl restart ceph-mon@host1

Ensure that Monitors have formed a quorum:

# ceph mon stat

- Repeat these steps on other Monitors if needed.
Procedure: Compacting Monitor Store with ceph-monstore-tool
Before you start, ensure that you have the ceph-test package installed.

Verify that the ceph-mon daemon with the large store is not running. Stop the daemon if needed:

systemctl status ceph-mon@<host-name>
systemctl stop ceph-mon@<host-name>

Replace <host-name> with the short name of the host where the daemon is running. Use the hostname -s command when unsure.

# systemctl status ceph-mon@host1
# systemctl stop ceph-mon@host1

Compact the Monitor store:

ceph-monstore-tool /var/lib/ceph/mon/mon.<host-name> compact

Replace <host-name> with a short host name of the Monitor host.

# ceph-monstore-tool /var/lib/ceph/mon/mon.node1 compact

Start ceph-mon again:

systemctl start ceph-mon@<host-name>

For example:

# systemctl start ceph-mon@host1
4.6. Opening Ports for Ceph Manager

The ceph-mgr daemons receive placement group information from OSDs on the same range of ports as the ceph-osd daemons. If these ports are not open, a cluster will devolve from HEALTH_OK to HEALTH_WARN and will indicate that PGs are unknown with a percentage count of the PGs unknown.

To resolve this situation, for each host running ceph-mgr daemons, open ports 6800:7300. For example:
[root@ceph-mgr] # firewall-cmd --add-port 6800:7300/tcp
[root@ceph-mgr] # firewall-cmd --add-port 6800:7300/tcp --permanent
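To verify that the ports are open, you can list the ports in the active zone:

[root@ceph-mgr] # firewall-cmd --list-ports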
Then, restart the ceph-mgr daemons.
Chapter 5. Troubleshooting OSDs
This chapter contains information on how to fix the most common errors related to Ceph OSDs.
Before You Start

- Verify your network connection. See Chapter 3, Troubleshooting Networking Issues for details.
- Verify that Monitors have a quorum by using the ceph health command. If the command returns a health status (HEALTH_OK, HEALTH_WARN, or HEALTH_ERR), the Monitors are able to form a quorum. If not, address any Monitor problems first. See Chapter 4, Troubleshooting Monitors for details. For details about ceph health, see Section 1.2, “Understanding the Output of the ceph health Command”.
- Optionally, stop the rebalancing process to save time and resources. See Section 5.2, “Stopping and Starting Rebalancing” for details.
5.1. The Most Common Error Messages Related to OSDs

The most common error messages that are returned by the ceph health detail command, or included in the Ceph logs, are covered in the following sections, which explain the errors and point to specific procedures to fix the problems.
5.1.1. Full OSDs

The ceph health detail command returns an error message similar to the following one:

HEALTH_ERR 1 full osds
osd.3 is full at 95%
What This Means
Ceph prevents clients from performing I/O operations on full OSD nodes to avoid losing data. It returns the HEALTH_ERR full osds message when the cluster reaches the capacity set by the mon_osd_full_ratio parameter. By default, this parameter is set to 0.95 which means 95% of the cluster capacity.
To Troubleshoot This Problem
Determine what percentage of raw storage (%RAW USED) is used:
# ceph df
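The GLOBAL section of the output contains the %RAW USED column; the following trimmed output is an illustration with hypothetical numbers:

GLOBAL:
    SIZE     AVAIL     RAW USED     %RAW USED
    90G      26G       64G          71.11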
If %RAW USED is above 70-75%, you can:
- Delete unnecessary data. This is a short-term solution to avoid production downtime. See Section 5.6, “Deleting Data from a Full Cluster” for details.
- Scale the cluster by adding a new OSD node. This is a long-term solution recommended by Red Hat. For details, see the Adding and Removing OSD Nodes chapter in the Administration Guide for Red Hat Ceph Storage 3.
5.1.2. Nearfull OSDs

The ceph health detail command returns an error message similar to the following one:

HEALTH_WARN 1 nearfull osds
osd.2 is near full at 85%
What This Means
Ceph returns the nearfull osds message when the cluster reaches the capacity set by the mon_osd_nearfull_ratio parameter. By default, this parameter is set to 0.85 which means 85% of the cluster capacity.
Ceph distributes data based on the CRUSH hierarchy in the best possible way but it cannot guarantee equal distribution. The main causes of the uneven data distribution and the nearfull osds messages are:
- The OSDs are not balanced among the OSD nodes in the cluster. That is, some OSD nodes host significantly more OSDs than others, or the weight of some OSDs in the CRUSH map is not adequate to their capacity.
- The Placement Group (PG) count is not appropriate for the number of OSDs, the use case, the target PGs per OSD, and the OSD utilization.
- The cluster uses inappropriate CRUSH tunables.
- The back-end storage for OSDs is almost full.
To Troubleshoot This Problem:
- Verify that the PG count is sufficient and increase it if needed. See Section 7.5, “Increasing the PG Count” for details.
- Verify that you use CRUSH tunables optimal to the cluster version and adjust them if not. For details, see the CRUSH Tunables section in the Storage Strategies guide for Red Hat Ceph Storage 3 and the How can I test the impact CRUSH map tunable modifications will have on my PG distribution across OSDs in Red Hat Ceph Storage? solution on the Red Hat Customer Portal.
- Change the weight of OSDs by utilization. See the Set an OSD’s Weight by Utilization section in the Storage Strategies guide for Red Hat Ceph Storage 3.
Determine how much space is left on the disks used by OSDs.

To view how much space OSDs use in general:

# ceph osd df

To view how much space OSDs use on particular nodes, use the following command from the node containing nearfull OSDs:

$ df

- If needed, add a new OSD node. See the Adding and Removing OSD Nodes chapter in the Administration Guide for Red Hat Ceph Storage 3.
5.1.3. One or More OSDs Are Down

The ceph health command returns an error similar to the following one:

HEALTH_WARN 1/3 in osds are down
What This Means
One of the ceph-osd processes is unavailable due to a possible service failure or problems with communication with other OSDs. As a consequence, the surviving ceph-osd daemons reported this failure to the Monitors.

If the ceph-osd daemon is not running, the underlying OSD drive or file system is either corrupted, or some other error, such as a missing keyring, is preventing the daemon from starting.

In most cases, networking issues cause the situation when the ceph-osd daemon is running but still marked as down.
To Troubleshoot This Problem
Determine which OSD is down:

# ceph health detail
HEALTH_WARN 1/3 in osds are down
osd.0 is down since epoch 23, last address 192.168.106.220:6800/11080

Try to restart the ceph-osd daemon:

systemctl restart ceph-osd@<OSD-number>

Replace <OSD-number> with the ID of the OSD that is down, for example:

# systemctl restart ceph-osd@0

- If you are not able to start ceph-osd, follow the steps in The ceph-osd daemon cannot start.
- If you are able to start the ceph-osd daemon but it is marked as down, follow the steps in The ceph-osd daemon is running but still marked as down.

The ceph-osd daemon cannot start
- If you have a node containing a number of OSDs (generally, more than twelve), verify that the default maximum number of threads (PID count) is sufficient. See Section 5.5, “Increasing the PID count” for details.
Verify that the OSD data and journal partitions are mounted properly:
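One way to check, assuming the ceph-disk utility that this release uses for OSD provisioning, is to list the devices and their partition state:

# ceph-disk list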
A partition is mounted if ceph-disk marks it as active. If a partition is prepared, mount it. See Section 5.3, “Mounting the OSD Data Partition” for details. If a partition is unprepared, you must prepare it first before mounting. See the Preparing the OSD Data and Journal Drives section in the Administration Guide for Red Hat Ceph Storage 3.

- If you got the ERROR: missing keyring, cannot use cephx for authentication error message, the OSD is missing a keyring. See the Keyring Management section in the Administration Guide for Red Hat Ceph Storage 3.
- If you got the ERROR: unable to open OSD superblock on /var/lib/ceph/osd/ceph-1 error message, the ceph-osd daemon cannot read the underlying file system. See the following steps for instructions on how to troubleshoot and fix this error.

Note: If this error message is returned during boot time of the OSD host, open a support ticket as this might indicate a known issue tracked in the Red Hat Bugzilla 1439210. See Chapter 9, Contacting Red Hat Support Service for details.
Check the corresponding log file to determine the cause of the failure. By default, Ceph stores log files in the /var/log/ceph/ directory.

An EIO error message similar to the following one indicates a failure of the underlying disk:

FAILED assert(!m_filestore_fail_eio || r != -5)

To fix this problem, replace the underlying OSD disk. See Section 5.4, “Replacing an OSD Drive” for details.

If the log includes any other FAILED assert errors, such as the following one, open a support ticket. See Chapter 9, Contacting Red Hat Support Service for details.

FAILED assert(0 == "hit suicide timeout")
Check the dmesg output for errors with the underlying file system or disk:

$ dmesg

An error -5 error message similar to the following one indicates corruption of the underlying XFS file system. For details on how to fix this problem, see the What is the meaning of "xfs_log_force: error -5 returned"? solution on the Red Hat Customer Portal.

xfs_log_force: error -5 returned

- If the dmesg output includes any SCSI error error messages, see the SCSI Error Codes Solution Finder solution on the Red Hat Customer Portal to determine the best way to fix the problem.
- Alternatively, if you are unable to fix the underlying file system, replace the OSD drive. See Section 5.4, “Replacing an OSD Drive” for details.
If the OSD failed with a segmentation fault, such as the following one, gather the required information and open a support ticket. See Chapter 9, Contacting Red Hat Support Service for details.
Caught signal (Segmentation fault)
The ceph-osd daemon is running but still marked as down
Check the corresponding log file to determine the cause of the failure. By default, Ceph stores log files in the /var/log/ceph/ directory.

If the log includes error messages similar to the following ones, see Section 5.1.4, “Flapping OSDs”.

wrongly marked me down
heartbeat_check: no reply from osd.2 since back

- If you see any other errors, open a support ticket. See Chapter 9, Contacting Red Hat Support Service for details.
See Also
- Section 5.1.4, “Flapping OSDs”
- Section 7.1.1, “Stale Placement Groups”
- The Starting, Stopping, Restarting a Daemon by Instances section in the Administration Guide for Red Hat Ceph Storage 3
5.1.4. Flapping OSDs

The ceph -w | grep osds command shows OSDs repeatedly as down and then up again within a short period of time.

In addition, the Ceph log contains error messages similar to the following ones:

2016-07-25 03:44:06.510583 osd.50 127.0.0.1:6801/149046 18992 : cluster [WRN] map e600547 wrongly marked me down

2016-07-25 19:00:08.906864 7fa2a0033700 -1 osd.254 609110 heartbeat_check: no reply from osd.2 since back 2016-07-25 19:00:07.444113 front 2016-07-25 18:59:48.311935 (cutoff 2016-07-25 18:59:48.906862)
What This Means
The main causes of flapping OSDs are:

- Certain cluster operations, such as scrubbing or recovery, take an abnormal amount of time. For example, if you perform these operations on objects with a large index or large placement groups. Usually, after these operations finish, the flapping OSDs problem is solved.
- Problems with the underlying physical hardware. In this case, the ceph health detail command also returns the slow requests error message. For details, see Section 5.1.5, “Slow Requests, and Requests are Blocked”.
- Problems with the network.
OSDs cannot handle the situation well when the cluster (back-end) network fails or develops significant latency while the public (front-end) network operates optimally.

OSDs use the cluster network for sending heartbeat packets to each other to indicate that they are up and in. If the cluster network does not work properly, OSDs are unable to send and receive the heartbeat packets. As a consequence, they report each other as being down to the Monitors, while marking themselves as up.
The following parameters in the Ceph configuration file influence this behavior:
Parameter | Description | Default value
---|---|---
osd_heartbeat_grace | How long OSDs wait for the heartbeat packets to return before reporting an OSD as down to the Monitors. | 20 seconds
mon_osd_min_down_reporters | How many OSDs must report another OSD as down before the Monitors mark it as down. | 2
This table shows that, in the default configuration, the Ceph Monitors mark an OSD as down after at least two other OSDs report it as unreachable within the heartbeat grace period. In some cases, if a single host encounters network issues, the entire cluster can experience flapping OSDs, because the OSDs that reside on that host report other OSDs in the cluster as down.
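To check the values that are currently in effect, you can query a running daemon through its administration socket. A minimal sketch, assuming an OSD with ID 0 runs on the local node:
# ceph daemon osd.0 config get osd_heartbeat_grace
# ceph daemon osd.0 config get mon_osd_min_down_reporters
The parameter names shown here are the usual configuration-file names with spaces replaced by underscores.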
The flapping OSDs scenario does not include the situation when the OSD processes are started and then immediately killed.
To Troubleshoot This Problem
Check the output of the
ceph health detail
command again. If it includes theslow requests
error message, see Section 5.1.5, “Slow Requests, and Requests are Blocked” for details on how to troubleshoot this issue.
- Determine which OSDs are marked as down and on what nodes they reside:
# ceph osd tree | grep down
- On the nodes containing the flapping OSDs, troubleshoot and fix any networking problems. For details, see Chapter 3, Troubleshooting Networking Issues.
Alternatively, you can temporarily force Monitors to stop marking the OSDs as down and up by setting the noup and nodown flags:
# ceph osd set noup
# ceph osd set nodown
Important
Using the noup and nodown flags does not fix the root cause of the problem but only prevents OSDs from flapping. Open a support ticket if you are unable to troubleshoot and fix the error yourself. See Chapter 9, Contacting Red Hat Support Service for details.
Additionally, flapping OSDs can be fixed by setting
osd heartbeat min size = 100
in the Ceph configuration file and then restarting the OSDs. This resolves network issue due to MTU misconfiguration.
Additional Resources
- The Verifying the Network Configuration for Red Hat Ceph Storage section in the Red Hat Ceph Storage 3 Installation Guide for Red Hat Enterprise Linux or Installation Guide for Ubuntu
- The Heartbeating section in the Architecture Guide for Red Hat Ceph Storage 3
5.1.5. Slow Requests, and Requests are Blocked Copy linkLink copied to clipboard!
The ceph-osd
daemon is slow to respond to a request and the ceph health detail
command returns an error message similar to the following one:
In addition, the Ceph logs include an error message similar to the following ones:
2015-08-24 13:18:10.024659 osd.1 127.0.0.1:6812/3032 9 : cluster [WRN] 6 slow requests, 6 included below; oldest blocked for > 61.758455 secs
2016-07-25 03:44:06.510583 osd.50 [WRN] slow request 30.005692 seconds old, received at {date-time}: osd_op(client.4240.0:8 benchmark_data_ceph-1_39426_object7 [write 0~4194304] 0.69848840) v4 currently waiting for subops from [610]
What This Means
An OSD with slow requests is any OSD that is not able to service the I/O operations per second (IOPS) in its queue within the time defined by the osd_op_complaint_time parameter. By default, this parameter is set to 30 seconds.
The main causes of OSDs having slow requests are:
- Problems with the underlying hardware, such as disk drives, hosts, racks, or network switches
- Problems with the network. These problems are usually connected with flapping OSDs. See Section 5.1.4, “Flapping OSDs” for details.
- System load
The following table shows the types of slow requests. Use the dump_historic_ops
administration socket command to determine the type of a slow request. For details about the administration socket, see the Using the Administration Socket section in the Administration Guide for Red Hat Ceph Storage 3.
Slow request type | Description
---|---
waiting for rw locks | The OSD is waiting to acquire a lock on a placement group for the operation.
waiting for subops | The OSD is waiting for replica OSDs to apply the operation to the journal.
no flag points reached | The OSD did not reach any major operation milestone.
waiting for degraded object | The OSDs have not replicated an object the specified number of times yet.
To Troubleshoot This Problem
- Determine if the OSDs with slow or blocked requests share a common piece of hardware, for example a disk drive, host, rack, or network switch.
If the OSDs share a disk:
Use the
smartmontools
utility to check the health of the disk or the logs to determine any errors on the disk.
Note
The smartmontools utility is included in the smartmontools package.
- Use the iostat utility to get the I/O wait report (%iowait) on the OSD disk to determine if the disk is under heavy load. See also the sketch after this list.
Note
The iostat utility is included in the sysstat package.
If the OSDs share a host:
- Check the RAM and CPU utilization
-
Use the
netstat
utility to see the network statistics on the Network Interface Controllers (NICs) and troubleshoot any networking issues. See also Chapter 3, Troubleshooting Networking Issues for further information.
- If the OSDs share a rack, check the network switch for the rack. For example, if you use jumbo frames, verify that the NIC in the path has jumbo frames set.
- If you are unable to determine a common piece of hardware shared by OSDs with slow requests, or to troubleshoot and fix hardware and networking problems, open a support ticket. See Chapter 9, Contacting Red Hat Support Service for details.
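As a sketch, the following commands support the hardware checks above. The device name /dev/sdd is an example; smartctl is part of the smartmontools package and iostat is part of the sysstat package, as noted above:
# smartctl -H /dev/sdd
# smartctl -a /dev/sdd | grep -iE 'reallocated|pending|uncorrectable'
# iostat -x 5 3
The first command reports the overall SMART health status, the second searches for counters that typically indicate a failing disk, and the third prints three extended I/O reports at 5-second intervals so that you can read the %iowait and %util columns.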
See Also
- The Using the Administration Socket section in the Administration Guide for Red Hat Ceph Storage 3
5.2. Stopping and Starting Rebalancing Copy linkLink copied to clipboard!
When an OSD fails or you stop it, the CRUSH algorithm automatically starts the rebalancing process to redistribute data across the remaining OSDs.
Rebalancing can take time and resources; therefore, consider stopping rebalancing while troubleshooting or maintaining OSDs. To do so, set the noout
flag before stopping the OSD:
ceph osd set noout
# ceph osd set noout
When you finish troubleshooting or maintenance, unset the noout
flag to start rebalancing:
ceph osd unset noout
# ceph osd unset noout
Placement groups within the stopped OSDs become degraded
during troubleshooting and maintenance.
See Also
- The Rebalancing and Recovery section in the Architecture Guide for Red Hat Ceph Storage 3
5.3. Mounting the OSD Data Partition Copy linkLink copied to clipboard!
If the OSD data partition is not mounted correctly, the ceph-osd
daemon cannot start. If you discover that the partition is not mounted as expected, follow the steps in this section to mount it.
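If you are not sure which partition holds the OSD data, utilities such as lsblk and blkid can help you identify it. This is only a sketch; the output depends on your disk layout:
# lsblk
# blkid | grep -i ceph
On FileStore OSDs created by ceph-disk, the data partition typically carries a ceph data partition label.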
Procedure: Mounting the OSD Data Partition
Mount the partition:
mount -o noatime <partition> /var/lib/ceph/osd/<cluster-name>-<osd-number>
# mount -o noatime <partition> /var/lib/ceph/osd/<cluster-name>-<osd-number>
Copy to Clipboard Copied! Toggle word wrap Toggle overflow Replace
<partition>
with the path to the partition on the OSD drive dedicated to OSD data. Specify the cluster name and the OSD number, for example:mount -o noatime /dev/sdd1 /var/lib/ceph/osd/ceph-0
# mount -o noatime /dev/sdd1 /var/lib/ceph/osd/ceph-0
Copy to Clipboard Copied! Toggle word wrap Toggle overflow Try to start the failed
ceph-osd
daemon:systemctl start ceph-osd@<OSD-number>
# systemctl start ceph-osd@<OSD-number>
Copy to Clipboard Copied! Toggle word wrap Toggle overflow Replace the
<OSD-number>
with the ID of the OSD, for example:systemctl start ceph-osd@0
# systemctl start ceph-osd@0
Copy to Clipboard Copied! Toggle word wrap Toggle overflow
See Also
5.4. Replacing an OSD Drive Copy linkLink copied to clipboard!
Ceph is designed for fault tolerance, which means that it can operate in a degraded
state without losing data. Consequently, Ceph can operate even if a data storage drive fails. In the context of a failed drive, the degraded state means that the copies of the data stored on other OSDs are automatically backfilled to other OSDs in the cluster. However, when this occurs, replace the failed OSD drive and recreate the OSD manually.
When a drive fails, Ceph reports the OSD as down
:
HEALTH_WARN 1/3 in osds are down
osd.0 is down since epoch 23, last address 192.168.106.220:6800/11080
Ceph can mark an OSD as down
also as a consequence of networking or permissions problems. See Section 5.1.3, “One or More OSDs Are Down” for details.
Modern servers typically deploy with hot-swappable drives so you can pull a failed drive and replace it with a new one without bringing down the node. The whole procedure includes these steps:
- Remove the OSD from the Ceph cluster. For details, see the Removing an OSD from the Ceph Cluster procedure.
- Replace the drive. For details see, the Replacing the Physical Drive section.
- Add the OSD to the cluster. For details, see the Adding an OSD to the Ceph Cluster procedure.
Before You Start
Determine which OSD is
down
:ceph osd tree | grep -i down
# ceph osd tree | grep -i down ID WEIGHT TYPE NAME UP/DOWN REWEIGHT PRIMARY-AFFINITY 0 0.00999 osd.0 down 1.00000 1.00000
Copy to Clipboard Copied! Toggle word wrap Toggle overflow Ensure that the OSD process is stopped. Use the following command from the OSD node:
systemctl status ceph-osd@<OSD-number>
# systemctl status ceph-osd@<OSD-number>
Copy to Clipboard Copied! Toggle word wrap Toggle overflow Replace
<OSD-number>
with the ID of the OSD marked asdown
, for example:systemctl status ceph-osd@osd.0
systemctl status ceph-osd@0
# systemctl status ceph-osd@0 ... Active: inactive (dead)
ceph-osd
daemon is running. See Section 5.1.3, “One or More OSDs Are Down” for more details about troubleshooting OSDs that are marked asdown
but their correspondingceph-osd
daemon is running.
Procedure: Removing an OSD from the Ceph Cluster
Mark the OSD as
out
:ceph osd out osd.<OSD-number>
# ceph osd out osd.<OSD-number>
Copy to Clipboard Copied! Toggle word wrap Toggle overflow Replace
<OSD-number>
with the ID of the OSD that is marked asdown
, for example:ceph osd out osd.0
# ceph osd out osd.0 marked out osd.0.
Note
If the OSD is
down
, Ceph marks it asout
automatically after 600 seconds when it does not receive any heartbeat packet from the OSD. When this happens, other OSDs with copies of the failed OSD data begin backfilling to ensure that the required number of copies exists within the cluster. While the cluster is backfilling, the cluster will be in adegraded
state.
Ensure that the failed OSD is backfilling. The output will include information similar to the following one:
Remove the OSD from the CRUSH map:
ceph osd crush remove osd.<OSD-number>
# ceph osd crush remove osd.<OSD-number>
Copy to Clipboard Copied! Toggle word wrap Toggle overflow Replace
<OSD-number>
with the ID of the OSD that is marked asdown
, for example:ceph osd crush remove osd.0
# ceph osd crush remove osd.0 removed item id 0 name 'osd.0' from crush map
Copy to Clipboard Copied! Toggle word wrap Toggle overflow Remove authentication keys related to the OSD:
ceph auth del osd.<OSD-number>
# ceph auth del osd.<OSD-number>
Copy to Clipboard Copied! Toggle word wrap Toggle overflow Replace
<OSD-number>
with the ID of the OSD that is marked asdown
, for example:ceph auth del osd.0
# ceph auth del osd.0 updated
Copy to Clipboard Copied! Toggle word wrap Toggle overflow Remove the OSD from the Ceph Storage Cluster:
ceph osd rm osd.<OSD-number>
# ceph osd rm osd.<OSD-number>
Copy to Clipboard Copied! Toggle word wrap Toggle overflow Replace
<OSD-number>
with the ID of the OSD that is marked asdown
, for example:ceph osd rm osd.0
# ceph osd rm osd.0 removed osd.0
Copy to Clipboard Copied! Toggle word wrap Toggle overflow If you have removed the OSD successfully, it is not present in the output of the following command:
ceph osd tree
# ceph osd tree
Copy to Clipboard Copied! Toggle word wrap Toggle overflow Unmount the failed drive:
umount /var/lib/ceph/osd/<cluster-name>-<OSD-number>
# umount /var/lib/ceph/osd/<cluster-name>-<OSD-number>
Copy to Clipboard Copied! Toggle word wrap Toggle overflow Specify the name of the cluster and the ID of the OSD, for example:
umount /var/lib/ceph/osd/ceph-0/
# umount /var/lib/ceph/osd/ceph-0/
Copy to Clipboard Copied! Toggle word wrap Toggle overflow If you have unmounted the drive successfully, it is not present in the output of the following command:
df -h
# df -h
Copy to Clipboard Copied! Toggle word wrap Toggle overflow
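While the cluster rebalances after the removal, you can watch the recovery progress before you continue. A minimal sketch:
# ceph -s
# ceph -w | grep -E 'backfill|recover'
Press Ctrl+C to stop watching the cluster log.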
Procedure: Replacing the Physical Drive
See the documentation for the hardware node for details on replacing the physical drive.
- If the drive is hot-swappable, replace the failed drive with a new one.
- If the drive is not hot-swappable and the node contains multiple OSDs, you might have to shut down the whole node and replace the physical drive. Consider preventing the cluster from backfilling. See Section 5.2, “Stopping and Starting Rebalancing” for details.
-
When the drive appears under the
/dev/
directory, make a note of the drive path.
- If you want to add the OSD manually, find the OSD drive and format the disk, for example as in the sketch below.
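A minimal sketch of cleaning the replacement disk before reusing it, assuming the new drive appeared as /dev/sdd and the ceph-volume utility is available on the node:
# ceph-volume lvm zap /dev/sdd
If ceph-volume is not available, wiping the file system signatures with wipefs -a /dev/sdd achieves a similar result. Both commands destroy any data on the disk, so double-check the device path first.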
Procedure: Adding an OSD to the Ceph Cluster
Add the OSD again.
If you used Ansible to deploy the cluster, run the
ceph-ansible
playbook again from the Ceph administration server:
ansible-playbook /usr/share/ceph-ansible/site.yml
# ansible-playbook /usr/share/ceph-ansible/site.yml
- If you added the OSD manually, see the Adding an OSD with the Command-line Interface section in the Administration Guide for Red Hat Ceph Storage 3.
Ensure that the CRUSH hierarchy is accurate:
ceph osd tree
# ceph osd tree
Copy to Clipboard Copied! Toggle word wrap Toggle overflow If you are not satisfied with the location of the OSD in the CRUSH hierarchy, move the OSD to a desired location:
ceph osd crush move <bucket-to-move> <bucket-type>=<parent-bucket>
ceph osd crush move <bucket-to-move> <bucket-type>=<parent-bucket>
Copy to Clipboard Copied! Toggle word wrap Toggle overflow For example, to move the bucket located at
sdd:row1
to the root bucket:ceph osd crush move ssd:row1 root=ssd:root
# ceph osd crush move ssd:row1 root=ssd:root
Copy to Clipboard Copied! Toggle word wrap Toggle overflow
See Also
- Section 5.1.3, “One or More OSDs Are Down”
- The Managing the Cluster Size chapter in the Administration Guide for Red Hat Ceph Storage 3
- The Red Hat Ceph Storage 3 Installation Guide for Red Hat Enterprise Linux or the Installation Guide for Ubuntu
5.5. Increasing the PID count Copy linkLink copied to clipboard!
If you have a node containing more than 12 Ceph OSDs, the default maximum number of threads (PID count) can be insufficient, especially during recovery. As a consequence, some ceph-osd
daemons can terminate and fail to start again. If this happens, increase the maximum possible number of threads allowed.
To temporarily increase the number:
sysctl -w kernel.pid_max=4194303
# sysctl -w kernel.pid_max=4194303
To permanently increase the number, update the /etc/sysctl.conf
file as follows:
kernel.pid_max = 4194303
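A minimal sketch of verifying the current value and persisting the change:
# sysctl kernel.pid_max
# echo 'kernel.pid_max = 4194303' >> /etc/sysctl.conf
# sysctl -p
Appending with echo is only one way to edit the file; any editor works as long as the resulting line matches the one shown above.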
5.6. Deleting Data from a Full Cluster Copy linkLink copied to clipboard!
Ceph automatically prevents any I/O operations on OSDs that reached the capacity specified by the mon_osd_full_ratio
parameter and returns the full osds
error message.
This procedure shows how to delete unnecessary data to fix this error.
The mon_osd_full_ratio
parameter sets the value of the full_ratio
parameter when creating a cluster. You cannot change the value of mon_osd_full_ratio
afterwards. To temporarily increase the full_ratio
value, use the ceph osd set-full-ratio command instead.
Procedure: Deleting Data from a Full Cluster
Determine the current value of
full_ratio
, by default it is set to0.95
:ceph osd dump | grep -i full
# ceph osd dump | grep -i full full_ratio 0.95
Copy to Clipboard Copied! Toggle word wrap Toggle overflow Temporarily increase the value by setting
set-full-ratio
to0.97
:ceph osd set-full-ratio 0.97
# ceph osd set-full-ratio 0.97
Copy to Clipboard Copied! Toggle word wrap Toggle overflow ImportantRed Hat strongly recommends to not set the
set-full-ratio
to a value higher than 0.97. Setting this parameter to a higher value makes the recovery process harder. As a consequence, you might not be able to recover full OSDs at all.Verify that you successfully set the parameter to
0.97
:ceph osd dump | grep -i full
# ceph osd dump | grep -i full full_ratio 0.97
Copy to Clipboard Copied! Toggle word wrap Toggle overflow Monitor the cluster state:
ceph -w
# ceph -w
Copy to Clipboard Copied! Toggle word wrap Toggle overflow As soon as the cluster changes its state from
full
tonearfull
, delete any unnecessary data, for example as shown in the sketch after this procedure.
Set the value of
full_ratio
back to0.95
:ceph osd set-full-ratio 0.95
# ceph osd set-full-ratio 0.95
Copy to Clipboard Copied! Toggle word wrap Toggle overflow Verify that you successfully set the parameter to
0.95
:ceph osd dump | grep -i full
# ceph osd dump | grep -i full full_ratio 0.95
Copy to Clipboard Copied! Toggle word wrap Toggle overflow
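How you delete data depends on what the cluster stores. As a sketch only, assuming a pool named data and an object named unneeded-object that you know is safe to remove, you can list and delete objects directly with the rados utility:
# rados -p data ls | head
# rados -p data rm unneeded-object
For block and object workloads, removing unneeded RBD images, snapshots, or Ceph Object Gateway buckets through their own interfaces is usually the safer approach.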
See Also
Chapter 6. Troubleshooting a multisite Ceph Object Gateway Copy linkLink copied to clipboard!
This chapter contains information on how to fix the most common errors related to multisite Ceph Object Gateway configuration and operational conditions.
6.1. Prerequisites Copy linkLink copied to clipboard!
- A running Red Hat Ceph Storage 3 environment.
- A running Ceph Object Gateway.
6.2. Error code definitions for the Ceph Object Gateway Copy linkLink copied to clipboard!
The Ceph Object Gateway logs contain error and warning messages to assist in troubleshooting conditions in your environment. Some common ones are listed below with suggested resolutions. Contact Red Hat Support for any additional assistance.
Common error messages
data_sync: ERROR: a sync operation returned error
- This is the high-level data sync process complaining that a lower-level bucket sync process returned an error. This message is redundant; the bucket sync error appears above it in the log.
data sync: ERROR: failed to sync object: <bucket name>:<object name>
- Either the process failed to fetch the required object over HTTP from a remote gateway or the process failed to write that object to RADOS and it will be tried again.
data sync: ERROR: failure in sync, backing out (sync_status=-2)
- A low-level message reflecting one of the above conditions, specifically that the data was deleted before it could sync, thus showing a -2 ENOENT status.
data sync: ERROR: failure in sync, backing out (sync_status=-5)
- A low-level message reflecting one of the above conditions, specifically that Ceph failed to write the object to RADOS, thus showing a -5 EIO status.
ERROR: failed to fetch remote data log info: ret=11
- This is the EAGAIN generic error code from libcurl reflecting an error condition from another gateway. It will try again by default.
meta sync: ERROR: failed to read mdlog info with (2) No such file or directory
- The shard of the mdlog was never created so there is nothing to sync.
Syncing error messages
failed to sync object
- Either the process failed to fetch this object over HTTP from a remote gateway or it failed to write that object to RADOS and it will be tried again.
failed to sync bucket instance: (11) Resource temporarily unavailable
- A connection issue between primary and secondary zones.
failed to sync bucket instance: (125) Operation canceled
- A racing condition exists between writes to the same RADOS object.
6.3. Syncing a multisite Ceph Object Gateway Copy linkLink copied to clipboard!
A multisite sync reads the change log from other zones. To get a high-level view of the sync progress from the metadata and the data logs, you can use the following command:
radosgw-admin sync status
radosgw-admin sync status
This command lists which log shards, if any, are behind their source zone.
If the sync status output above reports that log shards are behind, run the following command, substituting the shard ID for X.
radosgw-admin data sync status --shard-id=X
radosgw-admin data sync status --shard-id=X
- Replace…
- X with the ID number of the shard.
Example
The output lists which buckets are next to sync and which buckets, if any, are going to be retried due to previous errors.
Inspect the status of individual buckets with the following command, substituting the bucket id for X.
radosgw-admin bucket sync status --bucket=X
radosgw-admin bucket sync status --bucket=X
- Replace…
- X with the ID number of the bucket.
The result shows which bucket index log shards are behind their source zone.
A common error in sync is EBUSY
, which means the sync is already in progress, often on another gateway. Errors are written to the sync error log, which can be read with the following command:
radosgw-admin sync error list
radosgw-admin sync error list
The syncing process will try again until it is successful. Errors can still occur that can require intervention.
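After you resolve the underlying cause, you can optionally trim old entries from the sync error log so that only new errors are reported, if your version provides the trim subcommand. A sketch, where the shard ID 31 is an example:
radosgw-admin sync error trim --shard-id=31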
6.3.1. Performance counters for multi-site Ceph Object Gateway data sync Copy linkLink copied to clipboard!
The following performance counters are available for multi-site configurations of the Ceph Object Gateway to measure data sync:
-
poll_latency
measures the latency of requests for remote replication logs. -
fetch_bytes
measures the number of objects and bytes fetched by data sync.
Use the ceph daemon .. perf dump
command to view the current metric data for the performance counters:
ceph daemon /var/run/ceph/{rgw}.asok perf dump
# ceph daemon /var/run/ceph/{rgw}.asok perf dump
Example output:
You must run the ceph daemon
command from the node running the daemon.
Additional Resources
- For more information about performance counters, see the Performance Counters section in the Administration Guide for Red Hat Ceph Storage 3
Chapter 7. Troubleshooting Placement Groups Copy linkLink copied to clipboard!
This section contains information about fixing the most common errors related to the Ceph Placement Groups (PGs).
Before You Start
- Verify your network connection. See Chapter 3, Troubleshooting Networking Issues for details.
- Ensure that Monitors are able to form a quorum. See Chapter 4, Troubleshooting Monitors for details about troubleshooting the most common errors related to Monitors.
-
Ensure that all healthy OSDs are
up
andin
, and the backfilling and recovery processes are finished. See Chapter 5, Troubleshooting OSDs for details about troubleshooting the most common errors related to OSDs.
7.2. Listing Placement Groups in stale, inactive, or unclean State Copy linkLink copied to clipboard!
After a failure, placement groups enter states like degraded
or peering
. These states indicate normal progression through the failure recovery process.
However, if a placement group stays in one of these states for longer than expected, it can be an indication of a larger problem. The Monitors report when placement groups get stuck in a state that is not optimal.
The following table lists these states together with a short explanation.
State | What it means | Most common causes | See
---|---|---|---
inactive | The PG has not been able to service read/write requests. | |
unclean | The PG contains objects that are not replicated the desired number of times. Something is preventing the PG from recovering. | |
stale | The status of the PG has not been updated by a ceph-osd daemon. | |
The mon_pg_stuck_threshold
parameter in the Ceph configuration file determines the number of seconds after which placement groups are considered inactive
, unclean
, or stale
.
List the stuck PGs:
ceph pg dump_stuck inactive ceph pg dump_stuck unclean ceph pg dump_stuck stale
# ceph pg dump_stuck inactive
# ceph pg dump_stuck unclean
# ceph pg dump_stuck stale
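Once you know which placement group is stuck, you can query it directly for more detail about why it is not healthy. A sketch, using the placement group ID 0.5 as an example:
# ceph pg 0.5 query
# ceph pg map 0.5
The query output shows the acting and up sets, the recovery state, and any blocking peers; the map output shows which OSDs currently serve the placement group.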
See Also
- The Monitoring Placement Group States section in the Administration Guide for Red Hat Ceph Storage 3
7.3. Listing Inconsistencies Copy linkLink copied to clipboard!
Use the rados
utility to list inconsistencies in various replicas of an objects. Use the --format=json-pretty
option to list a more detailed output.
You can list:
Listing Inconsistent Placement Groups in a Pool
rados list-inconsistent-pg <pool> --format=json-pretty
rados list-inconsistent-pg <pool> --format=json-pretty
For example, list all inconsistent placement groups in a pool named data
:
rados list-inconsistent-pg data --format=json-pretty
# rados list-inconsistent-pg data --format=json-pretty
[0.6]
Listing Inconsistent Objects in a Placement Group
rados list-inconsistent-obj <placement-group-id>
rados list-inconsistent-obj <placement-group-id>
For example, list inconsistent objects in a placement group with ID 0.6
:
The following fields are important to determine what causes the inconsistency:
-
name
: The name of the object with inconsistent replicas. -
nspace
: The namespace that is a logical separation of a pool. It’s empty by default. -
locator
: The key that is used as the alternative of the object name for placement. -
snap
: The snapshot ID of the object. The only writable version of the object is calledhead
. If an object is a clone, this field includes its sequential ID. -
version
: The version ID of the object with inconsistent replicas. Each write operation to an object increments it. errors
: A list of errors that indicate inconsistencies between shards without determining which shard or shards are incorrect. See theshard
array to further investigate the errors.-
data_digest_mismatch
: The digest of the replica read from one OSD is different from the other OSDs. -
size_mismatch
: The size of a clone or thehead
object does not match the expectation. -
read_error
: This error indicates inconsistencies caused most likely by disk errors.
-
union_shard_error
: The union of all errors specific to shards. These errors are connected to a faulty shard. The errors that end withoi
indicate that you have to compare the information from a faulty object to information from selected objects. See the shard array to further investigate the errors.
In the above example, the object replica stored on
osd.2
has different digest than the replicas stored onosd.0
andosd.1
. Specifically, the digest of the replica is not0xffffffff
as calculated from the shard read fromosd.2
, but0xe978e67f
. In addition, the size of the replica read fromosd.2
is 0, while the size reported byosd.0
andosd.1
is 968.
Listing Inconsistent Snapshot Sets in a Placement Group
rados list-inconsistent-snapset <placement-group-id>
rados list-inconsistent-snapset <placement-group-id>
For example, list inconsistent sets of snapshots (snapsets
) in a placement group with ID 0.23
:
The command returns the following errors:
-
ss_attr_missing
: One or more attributes are missing. Attributes are information about snapshots encoded into a snapshot set as a list of key-value pairs. -
ss_attr_corrupted
: One or more attributes fail to decode. -
clone_missing
: A clone is missing. -
snapset_mismatch
: The snapshot set is inconsistent by itself. -
head_mismatch
: The snapshot set indicates thathead
exists or not, but the scrub results report otherwise. -
headless
: Thehead
of the snapshot set is missing. -
size_mismatch
: The size of a clone or thehead
object does not match the expectation.
See Also
7.4. Repairing Inconsistent Placement Groups Copy linkLink copied to clipboard!
Due to an error during deep scrubbing, some placement groups can include inconsistencies. Ceph reports such placement groups as inconsistent
:
HEALTH_ERR 1 pgs inconsistent; 2 scrub errors
pg 0.6 is active+clean+inconsistent, acting [0,1,2]
2 scrub errors
You can repair only certain inconsistencies. Do not repair the placement groups if the Ceph logs include the following errors:
<pg.id> shard <osd>: soid <object> digest <digest> != known digest <digest>
<pg.id> shard <osd>: soid <object> omap_digest <digest> != known omap_digest <digest>
Open a support ticket instead. See Chapter 9, Contacting Red Hat Support Service for details.
Repair the inconsistent
placement groups:
ceph pg repair <id>
ceph pg repair <id>
Replace <id>
with the ID of the inconsistent
placement group.
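A minimal sketch of the whole sequence, locating the inconsistent placement group first and then repairing it, where the ID 0.6 matches the example above:
# ceph health detail | grep inconsistent
# ceph pg repair 0.6
# ceph health detail
Repairing triggers another scrub of the placement group, so the health status might not clear immediately.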
See Also
7.5. Increasing the PG Count Copy linkLink copied to clipboard!
Insufficient Placement Group (PG) count impacts the performance of the Ceph cluster and data distribution. It is one of the main causes of the nearfull osds
error messages.
The recommended ratio is between 100 and 300 PGs per OSD. This ratio can decrease when you add more OSDs to the cluster.
The pg_num
and pgp_num
parameters determine the PG count. These parameters are configured per pool, and therefore, you must adjust each pool with a low PG count separately.
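As a rough worked example only: with 40 OSDs and a replicated pool with size 3, the commonly used formula (100 × number of OSDs) / replica count gives (100 × 40) / 3 ≈ 1333, which is usually rounded up to the next power of two, 2048. That results in roughly 2048 × 3 / 40 ≈ 154 PGs per OSD, within the recommended range. Use the calculator referenced in the procedure below to confirm the value for your own pools.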
Increasing the PG count is the most intensive process that you can perform on a Ceph cluster. This process might have serious performance impact if not done in a slow and methodical way. Once you increase pgp_num
, you will not be able to stop or reverse the process and you must complete it.
Consider increasing the PG count outside of business critical processing time allocation, and alert all clients about the potential performance impact.
Do not change the PG count if the cluster is in the HEALTH_ERR
state.
Procedure: Increasing the PG Count
Reduce the impact of data redistribution and recovery on individual OSDs and OSD hosts:
Lower the value of the
osd max backfills
,osd_recovery_max_active
, andosd_recovery_op_priority
parameters:ceph tell osd.* injectargs '--osd_max_backfills 1 --osd_recovery_max_active 1 --osd_recovery_op_priority 1'
# ceph tell osd.* injectargs '--osd_max_backfills 1 --osd_recovery_max_active 1 --osd_recovery_op_priority 1'
Copy to Clipboard Copied! Toggle word wrap Toggle overflow Disable the shallow and deep scrubbing:
ceph osd set noscrub ceph osd set nodeep-scrub
# ceph osd set noscrub # ceph osd set nodeep-scrub
Copy to Clipboard Copied! Toggle word wrap Toggle overflow
-
Use the Ceph Placement Groups (PGs) per Pool Calculator to calculate the optimal value of the
pg_num
andpgp_num
parameters. Increase the
pg_num
value in small increments until you reach the desired value.- Determine the starting increment value. Use a very low value that is a power of two, and increase it when you determine the impact on the cluster. The optimal value depends on the pool size, OSD count, and client I/O load.
Increment the
pg_num
value:ceph osd pool set <pool> pg_num <value>
ceph osd pool set <pool> pg_num <value>
Copy to Clipboard Copied! Toggle word wrap Toggle overflow Specify the pool name and the new value, for example:
ceph osd pool set data pg_num 4
# ceph osd pool set data pg_num 4
Copy to Clipboard Copied! Toggle word wrap Toggle overflow Monitor the status of the cluster:
ceph -s
# ceph -s
Copy to Clipboard Copied! Toggle word wrap Toggle overflow The PGs state will change from
creating
toactive+clean
. Wait until all PGs are in theactive+clean
state.
Increase the
pgp_num
value in small increments until you reach the desired value:- Determine the starting increment value. Use a very low value that is a power of two, and increase it when you determine the impact on the cluster. The optimal value depends on the pool size, OSD count, and client I/O load.
Increment the
pgp_num
value:ceph osd pool set <pool> pgp_num <value>
ceph osd pool set <pool> pgp_num <value>
Copy to Clipboard Copied! Toggle word wrap Toggle overflow Specify the pool name and the new value, for example:
ceph osd pool set data pgp_num 4
# ceph osd pool set data pgp_num 4
Copy to Clipboard Copied! Toggle word wrap Toggle overflow Monitor the status of the cluster:
ceph -s
# ceph -s
Copy to Clipboard Copied! Toggle word wrap Toggle overflow The PGs state will change through
peering
,wait_backfill
,backfilling
,recover
, and others. Wait until all PGs are in theactive+clean
state.
- Repeat the previous steps for all pools with insufficient PG count.
Set
osd max backfills
,osd_recovery_max_active
, andosd_recovery_op_priority
to their default values:ceph tell osd.* injectargs '--osd_max_backfills 1 --osd_recovery_max_active 3 --osd_recovery_op_priority 3'
# ceph tell osd.* injectargs '--osd_max_backfills 1 --osd_recovery_max_active 3 --osd_recovery_op_priority 3'
Copy to Clipboard Copied! Toggle word wrap Toggle overflow Enable the shallow and deep scrubbing:
ceph osd unset noscrub ceph osd unset nodeep-scrub
# ceph osd unset noscrub # ceph osd unset nodeep-scrub
Copy to Clipboard Copied! Toggle word wrap Toggle overflow
See also
- Section 5.1.2, “Nearfull OSDs”
- The Monitoring Placement Group States section in the Administration Guide for Red Hat Ceph Storage 3
Chapter 8. Troubleshooting objects Copy linkLink copied to clipboard!
As a storage administrator, you can use the ceph-objectstore-tool
utility to perform high-level or low-level object operations. The ceph-objectstore-tool
utility can help you troubleshoot problems related to objects within a particular OSD or placement group.
Manipulating objects can cause unrecoverable data loss. Contact Red Hat support before using the ceph-objectstore-tool
utility.
8.1. Prerequisites Copy linkLink copied to clipboard!
- Verify there are no network related issues.
8.2. Troubleshooting high-level object operations Copy linkLink copied to clipboard!
As a storage administrator, you can use the ceph-objectstore-tool
utility to perform high-level object operations. The ceph-objectstore-tool
utility supports the following high-level object operations:
- List objects
- List lost objects
- Fix lost objects
Manipulating objects can cause unrecoverable data loss. Contact Red Hat support before using the ceph-objectstore-tool
utility.
8.2.1. Prerequisites Copy linkLink copied to clipboard!
-
Having
root
access to the Ceph OSD nodes.
8.2.2. Listing objects Copy linkLink copied to clipboard!
The OSD can contain zero to many placement groups, and zero to many objects within a placement group (PG). The ceph-objectstore-tool
utility allows you to list objects stored within an OSD.
Prerequisites
-
Having
root
access to the Ceph OSD node. -
Stopping the
ceph-osd
daemon.
Procedure
Verify the appropriate OSD is down:
Syntax
systemctl status ceph-osd@$OSD_NUMBER
systemctl status ceph-osd@$OSD_NUMBER
Copy to Clipboard Copied! Toggle word wrap Toggle overflow Example
systemctl status ceph-osd@1
[root@osd ~]# systemctl status ceph-osd@1
Copy to Clipboard Copied! Toggle word wrap Toggle overflow Identify all the objects within an OSD, regardless of their placement group:
Syntax
ceph-objectstore-tool --data-path $PATH_TO_OSD --op list
ceph-objectstore-tool --data-path $PATH_TO_OSD --op list
Copy to Clipboard Copied! Toggle word wrap Toggle overflow Example
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-0 --op list
[root@osd ~]# ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-0 --op list
Copy to Clipboard Copied! Toggle word wrap Toggle overflow Identify all the objects within a placement group:
Syntax
ceph-objectstore-tool --data-path $PATH_TO_OSD --pgid $PG_ID --op list
ceph-objectstore-tool --data-path $PATH_TO_OSD --pgid $PG_ID --op list
Copy to Clipboard Copied! Toggle word wrap Toggle overflow Example
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-0 --pgid 0.1c --op list
[root@osd ~]# ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-0 --pgid 0.1c --op list
Copy to Clipboard Copied! Toggle word wrap Toggle overflow Identify the PG an object belongs to:
Syntax
ceph-objectstore-tool --data-path $PATH_TO_OSD --op list $OBJECT_ID
ceph-objectstore-tool --data-path $PATH_TO_OSD --op list $OBJECT_ID
Copy to Clipboard Copied! Toggle word wrap Toggle overflow Example
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-0 --op list default.region
[root@osd ~]# ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-0 --op list default.region
Copy to Clipboard Copied! Toggle word wrap Toggle overflow
Additional Resources
- For more information on stopping an OSD, see the Starting, Stopping, and Restarting a Ceph Daemons by Instance section in the Red Hat Ceph Storage Administration Guide.
8.2.3. Fixing lost objects Copy linkLink copied to clipboard!
You can use the ceph-objectstore-tool
utility to list and fix lost and unfound objects stored within a Ceph OSD. This procedure applies only to legacy objects.
Prerequisites
-
Having
root
access to the Ceph OSD node. -
Stopping the
ceph-osd
daemon.
Procedure
Verify the appropriate OSD is down:
Syntax
systemctl status ceph-osd@$OSD_NUMBER
systemctl status ceph-osd@$OSD_NUMBER
Copy to Clipboard Copied! Toggle word wrap Toggle overflow Example
systemctl status ceph-osd@1
[root@osd ~]# systemctl status ceph-osd@1
Copy to Clipboard Copied! Toggle word wrap Toggle overflow To list all the lost legacy objects:
Syntax
ceph-objectstore-tool --data-path $PATH_TO_OSD --op fix-lost --dry-run
ceph-objectstore-tool --data-path $PATH_TO_OSD --op fix-lost --dry-run
Copy to Clipboard Copied! Toggle word wrap Toggle overflow Example
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-0 --op fix-lost --dry-run
[root@osd ~]# ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-0 --op fix-lost --dry-run
Copy to Clipboard Copied! Toggle word wrap Toggle overflow Use the
ceph-objectstore-tool
utility to fix lost and unfound objects. Select the appropriate circumstance:To fix all lost objects:
Syntax
ceph-objectstore-tool --data-path $PATH_TO_OSD --op fix-lost
ceph-objectstore-tool --data-path $PATH_TO_OSD --op fix-lost
Copy to Clipboard Copied! Toggle word wrap Toggle overflow Example
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-0 --op fix-lost
[root@osd ~]# ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-0 --op fix-lost
Copy to Clipboard Copied! Toggle word wrap Toggle overflow To fix all the lost objects within a placement group:
Syntax
ceph-objectstore-tool --data-path $PATH_TO_OSD --pgid $PG_ID --op fix-lost
ceph-objectstore-tool --data-path $PATH_TO_OSD --pgid $PG_ID --op fix-lost
Copy to Clipboard Copied! Toggle word wrap Toggle overflow Example
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-0 --pgid 0.1c --op fix-lost
[root@osd ~]# ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-0 --pgid 0.1c --op fix-lost
Copy to Clipboard Copied! Toggle word wrap Toggle overflow To fix a lost object by its identifier:
Syntax
ceph-objectstore-tool --data-path $PATH_TO_OSD --op fix-lost $OBJECT_ID
ceph-objectstore-tool --data-path $PATH_TO_OSD --op fix-lost $OBJECT_ID
Copy to Clipboard Copied! Toggle word wrap Toggle overflow Example
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-0 --op fix-lost default.region
[root@osd ~]# ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-0 --op fix-lost default.region
Copy to Clipboard Copied! Toggle word wrap Toggle overflow
Additional Resources
- For more information on stopping an OSD, see the Starting, Stopping, and Restarting a Ceph Daemons by Instance section in the Red Hat Ceph Storage Administration Guide.
8.3. Troubleshooting low-level object operations Copy linkLink copied to clipboard!
As a storage administrator, you can use the ceph-objectstore-tool
utility to perform low-level object operations. The ceph-objectstore-tool
utility supports the following low-level object operations:
- Manipulate the object’s content
- Remove an object
- List the object map (OMAP)
- Manipulate the OMAP header
- Manipulate the OMAP key
- List the object’s attributes
- Manipulate the object’s attribute key
Manipulating objects can cause unrecoverable data loss. Contact Red Hat support before using the ceph-objectstore-tool
utility.
8.3.1. Prerequisites Copy linkLink copied to clipboard!
-
Having
root
access to the Ceph OSD nodes.
8.3.2. Manipulating the object’s content Copy linkLink copied to clipboard!
With the ceph-objectstore-tool
utility, you can get or set bytes on an object.
Setting the bytes on an object can cause unrecoverable data loss. To prevent data loss, make a backup copy of the object.
Prerequisites
-
Having
root
access to the Ceph OSD node. -
Stopping the
ceph-osd
daemon.
Procedure
Verify the appropriate OSD is down:
Syntax
systemctl status ceph-osd@$OSD_NUMBER
systemctl status ceph-osd@$OSD_NUMBER
Copy to Clipboard Copied! Toggle word wrap Toggle overflow Example
systemctl status ceph-osd@1
[root@osd ~]# systemctl status ceph-osd@1
Copy to Clipboard Copied! Toggle word wrap Toggle overflow - Find the object by listing the objects of the OSD or placement group (PG).
Before setting the bytes on an object, make a backup and a working copy of the object:
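The copies can be created with the get-bytes operation, which writes the current object contents to a file. This is only a sketch; the working copy file name matches the set-bytes example in the next step, and the backup file name is arbitrary:
[root@osd ~]# ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-0 --pgid 0.1c \ '{"oid":"zone_info.default","key":"","snapid":-2,"hash":235010478,"max":0,"pool":11,"namespace":""}' \ get-bytes > zone_info.default.backup
[root@osd ~]# ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-0 --pgid 0.1c \ '{"oid":"zone_info.default","key":"","snapid":-2,"hash":235010478,"max":0,"pool":11,"namespace":""}' \ get-bytes > zone_info.default.working-copy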
- Edit the working copy object file and modify the object contents accordingly.
Set the bytes of the object:
Syntax
ceph-objectstore-tool --data-path $PATH_TO_OSD --pgid $PG_ID \ $OBJECT \ set-bytes < $OBJECT_FILE_NAME
ceph-objectstore-tool --data-path $PATH_TO_OSD --pgid $PG_ID \ $OBJECT \ set-bytes < $OBJECT_FILE_NAME
Copy to Clipboard Copied! Toggle word wrap Toggle overflow Example
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-0 --pgid 0.1c \ '{"oid":"zone_info.default","key":"","snapid":-2,"hash":235010478,"max":0,"pool":11,"namespace":""}' \ set-bytes < zone_info.default.working-copy
[root@osd ~]# ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-0 --pgid 0.1c \ '{"oid":"zone_info.default","key":"","snapid":-2,"hash":235010478,"max":0,"pool":11,"namespace":""}' \ set-bytes < zone_info.default.working-copy
Copy to Clipboard Copied! Toggle word wrap Toggle overflow
Additional Resources
- For more information on stopping an OSD, see the Starting, Stopping, and Restarting a Ceph Daemons by Instance section in the Red Hat Ceph Storage Administration Guide.
8.3.3. Removing an object Copy linkLink copied to clipboard!
Use the ceph-objectstore-tool
utility to remove an object. By removing an object, its contents and references are removed from the placement group (PG).
You cannot recreate an object once it is removed.
Prerequisites
-
Having
root
access to the Ceph OSD node. -
Stopping the
ceph-osd
daemon.
Procedure
Remove an object:
Syntax
ceph-objectstore-tool --data-path $PATH_TO_OSD --pgid $PG_ID \ $OBJECT \ remove
ceph-objectstore-tool --data-path $PATH_TO_OSD --pgid $PG_ID \ $OBJECT \ remove
Copy to Clipboard Copied! Toggle word wrap Toggle overflow Example
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-0 --pgid 0.1c \ '{"oid":"zone_info.default","key":"","snapid":-2,"hash":235010478,"max":0,"pool":11,"namespace":""}' \ remove
[root@osd ~]# ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-0 --pgid 0.1c \ '{"oid":"zone_info.default","key":"","snapid":-2,"hash":235010478,"max":0,"pool":11,"namespace":""}' \ remove
Copy to Clipboard Copied! Toggle word wrap Toggle overflow
Additional Resources
- For more information on stopping an OSD, see the Starting, Stopping, and Restarting a Ceph Daemons by Instance section in the Red Hat Ceph Storage Administration Guide.
8.3.4. Listing the object map Copy linkLink copied to clipboard!
Use the ceph-objectstore-tool
utility to list the contents of the object map (OMAP). The output provides you a list of keys.
Prerequisites
-
Having
root
access to the Ceph OSD node. -
Stopping the
ceph-osd
daemon.
Procedure
Verify the appropriate OSD is down:
Syntax
systemctl status ceph-osd@$OSD_NUMBER
systemctl status ceph-osd@$OSD_NUMBER
Copy to Clipboard Copied! Toggle word wrap Toggle overflow Example
systemctl status ceph-osd@1
[root@osd ~]# systemctl status ceph-osd@1
Copy to Clipboard Copied! Toggle word wrap Toggle overflow List the object map:
Syntax
ceph-objectstore-tool --data-path $PATH_TO_OSD --pgid $PG_ID \ $OBJECT \ list-omap
ceph-objectstore-tool --data-path $PATH_TO_OSD --pgid $PG_ID \ $OBJECT \ list-omap
Copy to Clipboard Copied! Toggle word wrap Toggle overflow Example
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-0 --pgid 0.1c \ '{"oid":"zone_info.default","key":"","snapid":-2,"hash":235010478,"max":0,"pool":11,"namespace":""}' \ list-omap
[root@osd ~]# ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-0 --pgid 0.1c \ '{"oid":"zone_info.default","key":"","snapid":-2,"hash":235010478,"max":0,"pool":11,"namespace":""}' \ list-omap
Copy to Clipboard Copied! Toggle word wrap Toggle overflow
Additional Resources
- For more information on stopping an OSD, see the Starting, Stopping, and Restarting a Ceph Daemons by Instance section in the Red Hat Ceph Storage Administration Guide.
8.3.5. Manipulating the object map header Copy linkLink copied to clipboard!
The ceph-objectstore-tool
utility will output the object map (OMAP) header with the values associated with the object’s keys.
If using FileStore as the OSD backend object store, then add the --journal-path $PATH_TO_JOURNAL
argument when getting or setting the object map header. Where the $PATH_TO_JOURNAL
variable is the absolute path to the OSD journal, for example /var/lib/ceph/osd/ceph-0/journal
.
Prerequisites
-
Having
root
access to the Ceph OSD node. -
Stopping the
ceph-osd
daemon.
Procedure
Verify the appropriate OSD is down:
Syntax
systemctl status ceph-osd@$OSD_NUMBER
systemctl status ceph-osd@$OSD_NUMBER
Copy to Clipboard Copied! Toggle word wrap Toggle overflow Example
systemctl status ceph-osd@1
[root@osd ~]# systemctl status ceph-osd@1
Copy to Clipboard Copied! Toggle word wrap Toggle overflow Get the object map header:
Syntax
ceph-objectstore-tool --data-path $PATH_TO_OSD \ --pgid $PG_ID $OBJECT \ get-omaphdr > $OBJECT_MAP_FILE_NAME
ceph-objectstore-tool --data-path $PATH_TO_OSD \ --pgid $PG_ID $OBJECT \ get-omaphdr > $OBJECT_MAP_FILE_NAME
Copy to Clipboard Copied! Toggle word wrap Toggle overflow Example
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-0 \ --pgid 0.1c '{"oid":"zone_info.default","key":"","snapid":-2,"hash":235010478,"max":0,"pool":11,"namespace":""}' \ get-omaphdr > zone_info.default.omaphdr.txt
[root@osd ~]# ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-0 \ --pgid 0.1c '{"oid":"zone_info.default","key":"","snapid":-2,"hash":235010478,"max":0,"pool":11,"namespace":""}' \ get-omaphdr > zone_info.default.omaphdr.txt
Copy to Clipboard Copied! Toggle word wrap Toggle overflow Set the object map header:
Syntax
ceph-objectstore-tool --data-path $PATH_TO_OSD \ --pgid $PG_ID $OBJECT \ set-omaphdr < $OBJECT_MAP_FILE_NAME
ceph-objectstore-tool --data-path $PATH_TO_OSD \ --pgid $PG_ID $OBJECT \ set-omaphdr < $OBJECT_MAP_FILE_NAME
Copy to Clipboard Copied! Toggle word wrap Toggle overflow Example
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-0 \ --pgid 0.1c '{"oid":"zone_info.default","key":"","snapid":-2,"hash":235010478,"max":0,"pool":11,"namespace":""}' \ set-omaphdr < zone_info.default.omaphdr.txt
[root@osd ~]# ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-0 \ --pgid 0.1c '{"oid":"zone_info.default","key":"","snapid":-2,"hash":235010478,"max":0,"pool":11,"namespace":""}' \ set-omaphdr < zone_info.default.omaphdr.txt
Copy to Clipboard Copied! Toggle word wrap Toggle overflow
Additional Resources
- For more information on stopping an OSD, see the Starting, Stopping, and Restarting a Ceph Daemons by Instance section in the Red Hat Ceph Storage Administration Guide.
8.3.6. Manipulating the object map key Copy linkLink copied to clipboard!
Use the ceph-objectstore-tool
utility to change the object map (OMAP) key. You need to provide the data path, the placement group identifier (PG ID), the object, and the key in the OMAP.
If using FileStore as the OSD backend object store, then add the --journal-path $PATH_TO_JOURNAL
argument when getting, setting or removing the object map key. Where the $PATH_TO_JOURNAL
variable is the absolute path to the OSD journal, for example /var/lib/ceph/osd/ceph-0/journal
.
Prerequisites
-
Having
root
access to the Ceph OSD node. -
Stopping the
ceph-osd
daemon.
Procedure
Get the object map key:
Syntax
ceph-objectstore-tool --data-path $PATH_TO_OSD \ --pgid $PG_ID $OBJECT \ get-omap $KEY > $OBJECT_MAP_FILE_NAME
ceph-objectstore-tool --data-path $PATH_TO_OSD \ --pgid $PG_ID $OBJECT \ get-omap $KEY > $OBJECT_MAP_FILE_NAME
Copy to Clipboard Copied! Toggle word wrap Toggle overflow Example
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-0 \ --pgid 0.1c '{"oid":"zone_info.default","key":"","snapid":-2,"hash":235010478,"max":0,"pool":11,"namespace":""}' \ get-omap "" > zone_info.default.omap.txt
[root@osd ~]# ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-0 \ --pgid 0.1c '{"oid":"zone_info.default","key":"","snapid":-2,"hash":235010478,"max":0,"pool":11,"namespace":""}' \ get-omap "" > zone_info.default.omap.txt
Copy to Clipboard Copied! Toggle word wrap Toggle overflow Set the object map key:
Syntax
ceph-objectstore-tool --data-path $PATH_TO_OSD \ --pgid $PG_ID $OBJECT \ set-omap $KEY < $OBJECT_MAP_FILE_NAME
ceph-objectstore-tool --data-path $PATH_TO_OSD \ --pgid $PG_ID $OBJECT \ set-omap $KEY < $OBJECT_MAP_FILE_NAME
Copy to Clipboard Copied! Toggle word wrap Toggle overflow Example
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-0 \ --pgid 0.1c '{"oid":"zone_info.default","key":"","snapid":-2,"hash":235010478,"max":0,"pool":11,"namespace":""}' \ set-omap "" < zone_info.default.omap.txt
[root@osd ~]# ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-0 \ --pgid 0.1c '{"oid":"zone_info.default","key":"","snapid":-2,"hash":235010478,"max":0,"pool":11,"namespace":""}' \ set-omap "" < zone_info.default.omap.txt
Copy to Clipboard Copied! Toggle word wrap Toggle overflow Remove the object map key:
Syntax
ceph-objectstore-tool --data-path $PATH_TO_OSD \ --pgid $PG_ID $OBJECT \ rm-omap $KEY
ceph-objectstore-tool --data-path $PATH_TO_OSD \ --pgid $PG_ID $OBJECT \ rm-omap $KEY
Copy to Clipboard Copied! Toggle word wrap Toggle overflow Example
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-0 \ --pgid 0.1c '{"oid":"zone_info.default","key":"","snapid":-2,"hash":235010478,"max":0,"pool":11,"namespace":""}' \ rm-omap ""
[root@osd ~]# ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-0 \ --pgid 0.1c '{"oid":"zone_info.default","key":"","snapid":-2,"hash":235010478,"max":0,"pool":11,"namespace":""}' \ rm-omap ""
Copy to Clipboard Copied! Toggle word wrap Toggle overflow
Additional Resources
- For more information on stopping an OSD, see the Starting, Stopping, and Restarting a Ceph Daemons by Instance section in the Red Hat Ceph Storage Administration Guide.
8.3.7. Listing the object’s attributes Copy linkLink copied to clipboard!
Use the ceph-objectstore-tool
utility to list an object’s attributes. The output provides you with the object’s keys and values.
If using FileStore as the OSD backend object store, then add the --journal-path $PATH_TO_JOURNAL
argument when listing an object’s attributes. Where the $PATH_TO_JOURNAL
variable is the absolute path to the OSD journal, for example /var/lib/ceph/osd/ceph-0/journal
.
Prerequisites
-
Having
root
access to the Ceph OSD node. -
Stopping the
ceph-osd
daemon.
Procedure
Verify the appropriate OSD is down:
Syntax
systemctl status ceph-osd@$OSD_NUMBER
systemctl status ceph-osd@$OSD_NUMBER
Copy to Clipboard Copied! Toggle word wrap Toggle overflow Example
systemctl status ceph-osd@1
[root@osd ~]# systemctl status ceph-osd@1
Copy to Clipboard Copied! Toggle word wrap Toggle overflow List the object’s attributes:
Syntax
ceph-objectstore-tool --data-path $PATH_TO_OSD \ --pgid $PG_ID $OBJECT \ list-attrs
ceph-objectstore-tool --data-path $PATH_TO_OSD \ --pgid $PG_ID $OBJECT \ list-attrs
Copy to Clipboard Copied! Toggle word wrap Toggle overflow Example
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-0 \ --pgid 0.1c '{"oid":"zone_info.default","key":"","snapid":-2,"hash":235010478,"max":0,"pool":11,"namespace":""}' \ list-attrs
[root@osd ~]# ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-0 \ --pgid 0.1c '{"oid":"zone_info.default","key":"","snapid":-2,"hash":235010478,"max":0,"pool":11,"namespace":""}' \ list-attrs
Copy to Clipboard Copied! Toggle word wrap Toggle overflow
Additional Resources
- For more information on stopping an OSD, see the Starting, Stopping, and Restarting a Ceph Daemons by Instance section in the Red Hat Ceph Storage Administration Guide.
8.3.8. Manipulating the object attribute key Copy linkLink copied to clipboard!
Use the ceph-objectstore-tool
utility to change an object’s attributes. To manipulate the object’s attributes you need the data and journal paths, the placement group identifier (PG ID), the object, and the key in the object’s attribute.
If using FileStore as the OSD backend object store, then add the --journal-path $PATH_TO_JOURNAL
argument when getting, setting or removing the object’s attributes. Where the $PATH_TO_JOURNAL
variable is the absolute path to the OSD journal, for example /var/lib/ceph/osd/ceph-0/journal
.
Prerequisites
-
Having
root
access to the Ceph OSD node. -
Stopping the
ceph-osd
daemon.
Procedure
Verify the appropriate OSD is down:
Syntax
systemctl status ceph-osd@$OSD_NUMBER
systemctl status ceph-osd@$OSD_NUMBER
Copy to Clipboard Copied! Toggle word wrap Toggle overflow Example
systemctl status ceph-osd@1
[root@osd ~]# systemctl status ceph-osd@1
Copy to Clipboard Copied! Toggle word wrap Toggle overflow Get the object’s attributes:
Syntax
ceph-objectstore-tool --data-path $PATH_TO_OSD \ --pgid $PG_ID $OBJECT \ get-attr $KEY > $OBJECT_ATTRS_FILE_NAME
ceph-objectstore-tool --data-path $PATH_TO_OSD \ --pgid $PG_ID $OBJECT \ get-attr $KEY > $OBJECT_ATTRS_FILE_NAME
Example
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-0 \ --pgid 0.1c '{"oid":"zone_info.default","key":"","snapid":-2,"hash":235010478,"max":0,"pool":11,"namespace":""}' \ get-attr "oid" > zone_info.default.attr.txt
[root@osd ~]# ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-0 \ --pgid 0.1c '{"oid":"zone_info.default","key":"","snapid":-2,"hash":235010478,"max":0,"pool":11,"namespace":""}' \ get-attr "oid" > zone_info.default.attr.txt
Copy to Clipboard Copied! Toggle word wrap Toggle overflow Set an object’s attributes:
Syntax
ceph-objectstore-tool --data-path $PATH_TO_OSD \ --pgid $PG_ID $OBJECT \ set-attr $KEY < $OBJECT_ATTRS_FILE_NAME
ceph-objectstore-tool --data-path $PATH_TO_OSD \ --pgid $PG_ID $OBJECT \ set-attr $KEY < $OBJECT_ATTRS_FILE_NAME
Example
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-0 \ --pgid 0.1c '{"oid":"zone_info.default","key":"","snapid":-2,"hash":235010478,"max":0,"pool":11,"namespace":""}' \ set-attr "oid" < zone_info.default.attr.txt
[root@osd ~]# ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-0 \ --pgid 0.1c '{"oid":"zone_info.default","key":"","snapid":-2,"hash":235010478,"max":0,"pool":11,"namespace":""}' \ set-attr "oid" < zone_info.default.attr.txt
Copy to Clipboard Copied! Toggle word wrap Toggle overflow Remove an object’s attributes:
Syntax
ceph-objectstore-tool --data-path $PATH_TO_OSD \ --pgid $PG_ID $OBJECT \ rm-attr $KEY
ceph-objectstore-tool --data-path $PATH_TO_OSD \ --pgid $PG_ID $OBJECT \ rm-attr $KEY
Example
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-0 \ --pgid 0.1c '{"oid":"zone_info.default","key":"","snapid":-2,"hash":235010478,"max":0,"pool":11,"namespace":""}' \ rm-attr "oid"
[root@osd ~]# ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-0 \ --pgid 0.1c '{"oid":"zone_info.default","key":"","snapid":-2,"hash":235010478,"max":0,"pool":11,"namespace":""}' \ rm-attr "oid"
Copy to Clipboard Copied! Toggle word wrap Toggle overflow
Additional Resources
- For more information on stopping an OSD, see the Starting, Stopping, and Restarting a Ceph Daemons by Instance section in the Red Hat Ceph Storage Administration Guide.
8.4. Additional Resources Copy linkLink copied to clipboard!
- For Red Hat Ceph Storage support, see the Red Hat Customer Portal.
Chapter 9. Contacting Red Hat Support Service
If the information in this guide did not help you to solve the problem, this chapter explains how to contact the Red Hat Support Service.
9.1. Providing Information to Red Hat Support Engineers
If you are unable to fix problems related to Red Hat Ceph Storage by yourself, contact the Red Hat Support Service and provide a sufficient amount of information so that the support engineers can troubleshoot the problem you encounter more quickly.
Procedure: Providing Information to Red Hat Support Engineers
- Open a support ticket on the Red Hat Customer Portal.
- Ideally, attach an sosreport to the ticket. See the What is a sosreport and how to create one in Red Hat Enterprise Linux 4.6 and later? solution for details; a minimal command sketch follows this list.
- If the Ceph daemons failed with a segmentation fault, consider generating a human-readable core dump file. See Section 9.2, “Generating readable core dump files” for details.
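For reference, generating an sosreport on a Red Hat Enterprise Linux 7 node typically takes only two commands. This is a minimal sketch; the case number shown is a hypothetical placeholder for your actual support case.
# Install the sos package if it is not already present.
yum install sos

# Generate the report; the archive path is printed at the end of the run.
# --case-id tags the archive with your case number if the installed sos version supports it;
# otherwise run plain `sosreport`.
sosreport --case-id 01234567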
9.2. Generating readable core dump files
When a Ceph daemon terminates unexpectedly with a segmentation fault, gather information about its failure and provide it to the Red Hat Support Engineers.
Such information speeds up the initial investigation. Also, the Support Engineers can compare the information from the core dump files with known Red Hat Ceph Storage cluster issues.
9.2.1. Prerequisites
Install the ceph-debuginfo package if it is not installed already:
Enable the repository containing the ceph-debuginfo package:
subscription-manager repos --enable=rhel-7-server-rhceph-3-DAEMON-debug-rpms
Replace DAEMON with osd or mon depending on the type of the node.
Install the ceph-debuginfo package:
[root@mon ~]# yum install ceph-debuginfo
Ensure that the gdb package is installed and, if it is not, install it:
[root@mon ~]# yum install gdb
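As a quick sanity check, not part of the official prerequisites, you can confirm that the installed debuginfo package version matches the running Ceph packages, since a version mismatch prevents gdb from resolving symbols:
# The ceph and ceph-debuginfo package versions should match.
rpm -q ceph-common ceph-debuginfo

# The running daemon version, for comparison.
ceph --version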
Continue with the procedure based on the type of your deployment:
9.2.2. Generating readable core dump files on bare-metal deployments
Follow this procedure to generate a core dump file if you use Red Hat Ceph Storage on bare-metal.
Procedure
Enable generating core dump files for Ceph.
Set the proper ulimits for the core dump files by adding the following parameter to the /etc/systemd/system.conf file:
DefaultLimitCORE=infinity
Comment out the PrivateTmp=true parameter in the Ceph daemon service file, by default located at /lib/systemd/system/CLUSTER_NAME-DAEMON@.service:
# PrivateTmp=true
Set the suid_dumpable flag to 2 to allow the Ceph daemons to generate core dump files:
[root@mon ~]# sysctl fs.suid_dumpable=2
Adjust the core dump files location:
[root@mon ~]# sysctl kernel.core_pattern=/tmp/core
Reload the systemd service for the changes to take effect:
[root@mon ~]# systemctl daemon-reload
Restart the Ceph daemon for the changes to take effect:
systemctl restart ceph-DAEMON@ID
Specify the daemon type (osd or mon) and its ID (numbers for OSDs, or short host names for Monitors), for example:
[root@mon ~]# systemctl restart ceph-osd@1
- Reproduce the failure, for example, try to start the daemon again.
Use the GNU Debugger (GDB) to generate a readable backtrace from an application core dump file:
gdb /usr/bin/ceph-DAEMON /tmp/core.PID
Specify the daemon type and the PID of the failed process, for example:
$ gdb /usr/bin/ceph-osd /tmp/core.123456
In the GDB command prompt, disable paging and enable logging to a file by entering the commands set pag off and set log on:
(gdb) set pag off
(gdb) set log on
Apply the backtrace command to all threads of the process by entering thr a a bt full:
(gdb) thr a a bt full
After the backtrace is generated, turn off logging by entering set log off:
(gdb) set log off
- Transfer the log file gdb.txt to the system you access the Red Hat Customer Portal from and attach it to a support ticket.
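The interactive GDB session above can also be scripted. The following is a minimal sketch, assuming the binary and core file paths from the example; it writes the full backtrace of all threads straight into gdb.txt without opening an interactive prompt.
# Non-interactive equivalent of the GDB steps above: produce gdb.txt in one command.
# Adjust the binary and core file paths to match your failed daemon.
gdb -batch \
    -ex 'thread apply all bt full' \
    /usr/bin/ceph-osd /tmp/core.123456 > gdb.txt 2>&1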
9.2.3. Generating readable core dump files in containerized deployments
Follow this procedure to generate a core dump file if you use Red Hat Ceph Storage in containers. The procedure covers two scenarios for capturing the core dump file:
- When a Ceph process terminates unexpectedly due to the SIGILL, SIGTRAP, SIGABRT, or SIGSEGV error.
or
- Manually, for example when debugging issues such as Ceph processes consuming high CPU cycles or not responding.
Prerequisites
- Root-level access to the container node running the Ceph containers.
- Installation of the appropriate debugging packages.
- Installation of the GNU Project Debugger (gdb) package.
Procedure
If a Ceph process terminates unexpectedly due to the SIGILL, SIGTRAP, SIGABRT, or SIGSEGV error:
Set the core pattern to the systemd-coredump service on the node where the container with the failed Ceph process is running, for example:
[root@mon]# echo "| /usr/lib/systemd/systemd-coredump %P %u %g %s %t %e" > /proc/sys/kernel/core_pattern
Watch for the next container failure due to a Ceph process and search for a core dump file in the /var/lib/systemd/coredump/ directory, for example:
[root@mon]# ls -ltr /var/lib/systemd/coredump
total 8232
-rw-r-----. 1 root root 8427548 Jan 22 19:24 core.ceph-osd.167.5ede29340b6c4fe4845147f847514c12.15622.1584573794000000.xz
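Note that a value written directly to /proc/sys/kernel/core_pattern does not survive a reboot. If you expect to wait some time for the next failure, a sysctl drop-in keeps the setting persistent. This is a minimal sketch outside the official procedure; the drop-in file name is an arbitrary choice.
# Persist the systemd-coredump core pattern across reboots (file name is arbitrary).
cat > /etc/sysctl.d/90-ceph-coredump.conf <<'EOF'
kernel.core_pattern=|/usr/lib/systemd/systemd-coredump %P %u %g %s %t %e
EOF

# Apply the drop-in immediately.
sysctl -p /etc/sysctl.d/90-ceph-coredump.conf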
To manually capture a core dump file for the Ceph Monitors and Ceph Managers:
Get the ceph-mon package details of the Ceph daemon from the container:
[root@mon]# docker exec -it NAME /bin/bash
[root@mon]# rpm -qa | grep ceph
Replace NAME with the name of the Ceph container.
Make a backup copy of the ceph-mon@.service file and open the original for editing:
[root@mon]# cp /etc/systemd/system/ceph-mon@.service /etc/systemd/system/ceph-mon@.service.orig
In the ceph-mon@.service file, add these three options to the [Service] section, each on a separate line:
--pid=host \
--ipc=host \
--cap-add=SYS_PTRACE \
Restart the Ceph Monitor daemon:
Syntax
systemctl restart ceph-mon@MONITOR_ID
Replace MONITOR_ID with the ID number of the Ceph Monitor.
Example
[root@mon]# systemctl restart ceph-mon@1
Install the gdb package inside the Ceph Monitor container:
[root@mon]# docker exec -it ceph-mon-MONITOR_ID /bin/bash
sh$ yum install gdb
Replace MONITOR_ID with the ID number of the Ceph Monitor.
Find the process ID:
Syntax
ps -aef | grep PROCESS | grep -v run
Replace PROCESS with the name of the failed process, for example ceph-mon.
Example
[root@mon]# ps -aef | grep ceph-mon | grep -v run
ceph 15390 15266 0 18:54 ?        00:00:29 /usr/bin/ceph-mon --cluster ceph --setuser ceph --setgroup ceph -d -i 5
ceph 18110 17985 1 19:40 ?        00:00:08 /usr/bin/ceph-mon --cluster ceph --setuser ceph --setgroup ceph -d -i 2
Generate the core dump file:
Syntax
gcore ID
Replace ID with the ID of the failed process that you got from the previous step, for example 18110:
Example
[root@mon]# gcore 18110
warning: target file /proc/18110/cmdline contained unexpected null characters
Saved corefile core.18110
Verify that the core dump file has been generated correctly.
Example
[root@mon]# ls -ltr
total 709772
-rw-r--r--. 1 root root 726799544 Mar 18 19:46 core.18110
Copy the core dump file outside of the Ceph Monitor container:
[root@mon]# docker cp ceph-mon-MONITOR_ID:/tmp/mon.core.MONITOR_PID /tmp
Replace MONITOR_ID with the ID number of the Ceph Monitor and replace MONITOR_PID with the process ID number.
Restore the backup copy of the ceph-mon@.service file:
[root@mon]# cp /etc/systemd/system/ceph-mon@.service.orig /etc/systemd/system/ceph-mon@.service
Restart the Ceph Monitor daemon:
Syntax
systemctl restart ceph-mon@MONITOR_ID
Replace MONITOR_ID with the ID number of the Ceph Monitor.
Example
[root@mon]# systemctl restart ceph-mon@1
- Upload the core dump file for analysis by Red Hat support, see step 4.
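Once the service file has been adjusted and gdb is installed in the container, the manual Monitor steps above lend themselves to a small wrapper. This is a minimal sketch, not part of the official procedure; the container name, the use of pgrep inside the container, and the output path are assumptions.
#!/bin/bash
# Hypothetical wrapper: find the ceph-mon PID inside the container, dump its core
# with gcore, and copy the result to the host. Assumes gdb is installed in the
# container, pgrep is available in the image, and the container is named ceph-mon-<id>.
set -euo pipefail

MON_ID=1
CONTAINER="ceph-mon-${MON_ID}"

# PID of the ceph-mon process as seen inside the container.
PID=$(docker exec "$CONTAINER" pgrep -f '/usr/bin/ceph-mon' | head -n 1)

# Generate /tmp/core.<PID> inside the container, then copy it out to the host.
docker exec "$CONTAINER" bash -c "cd /tmp && gcore $PID"
docker cp "$CONTAINER:/tmp/core.$PID" "/tmp/core.ceph-mon.$PID"

echo "Core dump saved to /tmp/core.ceph-mon.$PID"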
To manually capture a core dump file for Ceph OSDs:
Get the ceph-osd package details of the Ceph daemon from the container:
[root@osd]# docker exec -it NAME /bin/bash
[root@osd]# rpm -qa | grep ceph
Replace NAME with the name of the Ceph container.
Install the Ceph package for the same version of the ceph-osd package on the node where the Ceph containers are running:
[root@osd]# yum install ceph-osd
If needed, enable the appropriate repository first. See the Enabling the Red Hat Ceph Storage repositories section in the Installation Guide for details.
Find the ID of the process that has failed:
ps -aef | grep PROCESS | grep -v run
Replace PROCESS with the name of the failed process, for example ceph-osd.
[root@osd]# ps -aef | grep ceph-osd | grep -v run
ceph 15390 15266 0 18:54 ?        00:00:29 /usr/bin/ceph-osd --cluster ceph --setuser ceph --setgroup ceph -d -i 5
ceph 18110 17985 1 19:40 ?        00:00:08 /usr/bin/ceph-osd --cluster ceph --setuser ceph --setgroup ceph -d -i 2
Generate the core dump file:
gcore ID
Replace ID with the ID of the failed process that you got from the previous step, for example 18110:
[root@osd]# gcore 18110
warning: target file /proc/18110/cmdline contained unexpected null characters
Saved corefile core.18110
Verify that the core dump file has been generated correctly.
[root@osd]# ls -ltr
total 709772
-rw-r--r--. 1 root root 726799544 Mar 18 19:46 core.18110
- Upload the core dump file for analysis by Red Hat support, see the next step.
- Upload the core dump file for analysis to a Red Hat support case. See Providing information to Red Hat Support engineers for details.
9.2.4. Additional Resources
- The How to use gdb to generate a readable backtrace from an application core solution on the Red Hat Customer Portal
- The How to enable core file dumps when an application crashes or segmentation faults solution on the Red Hat Customer Portal
Appendix A. Subsystems Default Logging Levels Values
Subsystem | Log Level | Memory Level |
---|---|---|
| 1 | 5 |
| 1 | 5 |
| 0 | 0 |
| 0 | 5 |
| 0 | 5 |
| 1 | 5 |
| 0 | 5 |
| 0 | 5 |
| 1 | 5 |
| 1 | 5 |
| 1 | 5 |
| 1 | 5 |
| 0 | 5 |
| 1 | 5 |
| 0 | 5 |
| 1 | 5 |
| 1 | 5 |
| 1 | 5 |
| 1 | 5 |
| 1 | 5 |
| 1 | 5 |
| 0 | 5 |
| 1 | 5 |
| 0 | 5 |
| 0 | 5 |
| 0 | 5 |
| 0 | 0 |
| 0 | 5 |
| 0 | 5 |
| 0 | 5 |
| 1 | 5 |
| 0 | 5 |
| 0 | 5 |
| 1 | 5 |
| 1 | 5 |
| 0 | 5 |
| 0 | 5 |