Chapter 1. Initial Troubleshooting
This chapter includes information on:
- How to start troubleshooting Ceph errors (Identifying problems)
- Most common ceph health error messages (Understanding Ceph health)
- Most common Ceph log error messages (Understanding Ceph logs)
1.1. Prerequisites
- A running Red Hat Ceph Storage cluster.
1.2. Identifying problems
To determine possible causes of errors with the Red Hat Ceph Storage cluster, answer the questions in the Procedure section.
Prerequisites
- A running Red Hat Ceph Storage cluster.
Procedure
- Certain problems can arise when using unsupported configurations. Ensure that your configuration is supported.
Do you know what Ceph component causes the problem?
- No. Follow the Diagnosing the health of a storage cluster procedure in the Red Hat Ceph Storage Troubleshooting Guide.
- Ceph Monitors. See Troubleshooting Ceph Monitors section in the Red Hat Ceph Storage Troubleshooting Guide.
- Ceph OSDs. See Troubleshooting Ceph OSDs section in the Red Hat Ceph Storage Troubleshooting Guide.
- Ceph placement groups. See Troubleshooting Ceph placement groups section in the Red Hat Ceph Storage Troubleshooting Guide.
- Multi-site Ceph Object Gateway. See Troubleshooting a multi-site Ceph Object Gateway section in the Red Hat Ceph Storage Troubleshooting Guide.
Additional Resources
- See the Red Hat Ceph Storage: Supported configurations article for details.
1.2.1. Diagnosing the health of a storage cluster
This procedure lists basic steps to diagnose the health of a Red Hat Ceph Storage cluster.
Prerequisites
- A running Red Hat Ceph Storage cluster.
Procedure
- Check the overall status of the storage cluster:
[root@mon ~]# ceph health detail
If the command returns HEALTH_WARN or HEALTH_ERR, see Understanding Ceph health for details.
- Check the Ceph logs for any error messages listed in Understanding Ceph logs. By default, the logs are located in the /var/log/ceph/ directory (see the example after this list).
- If the logs do not include a sufficient amount of information, increase the debugging level and try to reproduce the action that failed. See Configuring logging for details.
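For example, a quick way to scan the main cluster log for recent warning and error entries (a minimal sketch; it assumes the default cluster name, so the main log file is ceph.log, and a non-containerized deployment that logs to files):
[root@mon ~]# grep -E 'WRN|ERR' /var/log/ceph/ceph.log | tail -n 20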
1.3. Understanding Ceph health
The ceph health command returns information about the status of the Red Hat Ceph Storage cluster:
- HEALTH_OK indicates that the cluster is healthy.
- HEALTH_WARN indicates a warning. In some cases, the Ceph status returns to HEALTH_OK automatically, for example, when the Red Hat Ceph Storage cluster finishes the rebalancing process. However, consider further troubleshooting if a cluster is in the HEALTH_WARN state for a longer time.
- HEALTH_ERR indicates a more serious problem that requires your immediate attention.
Use the ceph health detail and ceph -s commands to get more detailed output.
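Example (a minimal sketch; run the commands on a Ceph Monitor node or on any node with an admin keyring):
[root@mon ~]# ceph health detail
[root@mon ~]# ceph -s
While reproducing a problem, the ceph -w command additionally streams cluster log events live until you interrupt it.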
Additional Resources
- See the Ceph Monitor error messages table in the Red Hat Ceph Storage Troubleshooting Guide.
- See the Ceph OSD error messages table in the Red Hat Ceph Storage Troubleshooting Guide.
- See the Placement group error messages table in the Red Hat Ceph Storage Troubleshooting Guide.
1.4. Understanding Ceph logs
1.4.1. Non-containerized deployment
By default, Ceph stores its logs in the /var/log/ceph/ directory.
The CLUSTER_NAME.log is the main storage cluster log file that includes global events. By default, the log file name is ceph.log. Only the Ceph Monitor nodes include the main storage cluster log.
Each Ceph OSD and Monitor has its own log file, named CLUSTER_NAME-osd.NUMBER.log and CLUSTER_NAME-mon.HOSTNAME.log.
When you increase the debugging level for Ceph subsystems, Ceph generates new log files for those subsystems as well.
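For example, on a node that runs both a Ceph Monitor and an OSD in a cluster with the default name ceph, a listing might look like this (a minimal sketch; the host name host01 and the OSD number 0 are assumptions):
[root@mon ~]# ls /var/log/ceph/
ceph.log  ceph-mon.host01.log  ceph-osd.0.log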
1.4.2. Container-based deployment
In container-based deployments, Ceph logs to journald by default, and the logs are accessible by using the journalctl command. However, you can configure Ceph to log to files in /var/log/ceph by changing the configuration settings.
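For example, to follow the logs of a containerized Ceph Monitor daemon (a minimal sketch; the unit name ceph-mon@host01 is an assumption, because actual unit names vary by deployment):
[root@host01 ~]# journalctl -f -u ceph-mon@host01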
- To enable logging for Ceph Monitors, Ceph Manager, Ceph Object Gateway, and any other daemons, set log_to_file to true under the [global] settings.
Example
[ceph: root@host01 ~]# ceph config set global log_to_file true
- To enable logging for the Ceph Monitor cluster and audit logs, set mon_cluster_log_to_file to true.
Example
[ceph: root@host01 ~]# ceph config set mon mon_cluster_log_to_file true
If you choose to log to files, it is recommended to disable logging to journald; otherwise, everything is logged twice. Run the following commands to disable logging to journald:
# ceph config set global log_to_journald false
# ceph config set global mon_cluster_log_to_journald false
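To verify the effective settings afterwards, you can query the configuration database with the ceph config get command (a minimal sketch):
[ceph: root@host01 ~]# ceph config get mon log_to_file
[ceph: root@host01 ~]# ceph config get mon mon_cluster_log_to_file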
Additional Resources
- For details about logging, see Configuring logging in the Red Hat Ceph Storage Troubleshooting Guide.
- See the Common Ceph Monitor error messages in the Ceph logs table in the Red Hat Ceph Storage Troubleshooting Guide.
- See the Common Ceph OSD error messages in the Ceph logs table in the Red Hat Ceph Storage Troubleshooting Guide.
1.5. Gathering logs from multiple hosts in a Ceph cluster using Ansible
Starting with Red Hat Ceph Storage 4.2, you can use ceph-ansible to gather logs from multiple hosts in a Ceph cluster. It captures the /etc/ceph and /var/log/ceph directories from the Ceph nodes. This playbook can be used to collect logs for both bare-metal and containerized storage clusters.
Prerequisites
- A running Red Hat Ceph Storage cluster.
- Root-level access to the nodes.
- The ceph-ansible package is installed on the node.
Procedure
- Log in to the Ansible administration node as the Ansible user.
Note: Ensure the node has adequate space to collect the logs from the hosts.
- Navigate to the /usr/share/ceph-ansible directory:
Example
[ansible@admin ~]$ cd /usr/share/ceph-ansible
- Run the Ansible playbook to gather the logs:
Example
[ansible@admin ceph-ansible]$ ansible-playbook infrastructure-playbooks/gather-ceph-logs.yml -i hosts
The logs are stored in the /tmp directory of the Ansible node.
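For reference, a minimal hosts inventory for the playbook might look like this (a sketch; the host names are assumptions, and a real ceph-ansible inventory can define additional groups):
[mons]
host01
host02
host03

[osds]
host04
host05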