Chapter 10. Contacting Red Hat support for service
If the information in this guide did not help you to solve the problem, this chapter explains how to contact the Red Hat support service.
Prerequisites
- Red Hat support account.
10.1. Providing information to Red Hat Support engineers
If you are unable to fix problems related to Red Hat Ceph Storage, contact the Red Hat Support Service and provide enough information to help the support engineers troubleshoot the problem faster.
Prerequisites
- Root-level access to the node.
- Red Hat support account.
Procedure
- Open a support ticket on the Red Hat Customer Portal.
- Ideally, attach an sosreport to the ticket. See the What is a sosreport and how to create one in Red Hat Enterprise Linux? solution for details. A minimal session is sketched after this list.
- If the Ceph daemons fail with a segmentation fault, consider generating a human-readable core dump file. See Generating readable core dump files for details.
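If the sos package is not already installed, a minimal session might look like the following. This is a sketch assuming a RHEL 9 host, where the tool is invoked as sos report; on older releases the command is sosreport.
Example
[root@host01 ~]# dnf install sos
[root@host01 ~]# sos report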
10.2. Generating readable core dump files
When a Ceph daemon terminates unexpectedly with a segmentation fault, gather information about the failure and provide it to the Red Hat Support Engineers.
Such information speeds up the initial investigation. Also, the Support Engineers can compare the information from the core dump files with known Red Hat Ceph Storage issues.
Prerequisites
Install the debuginfo packages if they are not already installed. To install the required debuginfo packages, enable the following repositories:
Example
[root@host01 ~]# subscription-manager repos --enable=rhceph-6-tools-for-rhel-9-x86_64-rpms
[root@host01 ~]# subscription-manager repos --enable=rhceph-6-tools-for-rhel-9-x86_64-debug-rpms
Once the repositories are enabled, you can install the debuginfo packages that you need from this list of supported packages:
ceph-base-debuginfo
ceph-common-debuginfo
ceph-debugsource
ceph-fuse-debuginfo
ceph-immutable-object-cache-debuginfo
ceph-mds-debuginfo
ceph-mgr-debuginfo
ceph-mon-debuginfo
ceph-osd-debuginfo
ceph-radosgw-debuginfo
cephfs-mirror-debuginfo
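For example, to debug a crashed Ceph OSD daemon, you might install its matching debuginfo package from the list above; the prompt is illustrative:
Example
[root@host01 ~]# dnf install ceph-osd-debuginfo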
Ensure that the gdb package is installed and, if it is not, install it:
Example
[root@host01 ~]# dnf install gdb
10.2.1. Generating readable core dump files in containerized deployments
For Red Hat Ceph Storage 6, you can capture a core dump file in two scenarios:
- When a Ceph process terminates unexpectedly due to a SIGILL, SIGTRAP, SIGABRT, or SIGSEGV signal.
- Manually, for example to debug issues such as Ceph processes consuming high CPU cycles or not responding.
Prerequisites
- Root-level access to the container node running the Ceph containers.
- Installation of the appropriate debugging packages.
- Installation of the GNU Project Debugger (gdb) package.
- Ensure that the host has at least 8 GB of RAM; if there are multiple daemons on the host, Red Hat recommends more RAM. A quick check is sketched after this list.
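To check that the host meets the memory requirement, you can inspect the total and available RAM; this is a minimal sketch using standard tooling, with an illustrative prompt:
Example
[root@host01 ~]# free -h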
Procedure
If a Ceph process terminates unexpectedly due to a SIGILL, SIGTRAP, SIGABRT, or SIGSEGV signal:
Set the core pattern to the systemd-coredump service on the node where the container with the failed Ceph process is running:
Example
[root@mon]# echo "| /usr/lib/systemd/systemd-coredump %P %u %g %s %t %c %h %e" > /proc/sys/kernel/core_pattern
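You can read the pattern back to confirm that it took effect; the output shown assumes nothing else has rewritten the pattern since:
Example
[root@mon]# cat /proc/sys/kernel/core_pattern
| /usr/lib/systemd/systemd-coredump %P %u %g %s %t %c %h %e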
Watch for the next container failure due to a Ceph process and search for the core dump file in the /var/lib/systemd/coredump/ directory:
Example
[root@mon]# ls -ltr /var/lib/systemd/coredump
total 8232
-rw-r-----. 1 root root 8427548 Jan 22 19:24 core.ceph-osd.167.5ede29340b6c4fe4845147f847514c12.15622.1584573794000000.xz
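Because the dumps are handled by systemd-coredump, you can also list and inspect them with coredumpctl, assuming the matching journal entries are still available:
Example
[root@mon]# coredumpctl list
[root@mon]# coredumpctl info ceph-osd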
To manually capture a core dump file for the Ceph Monitors and Ceph OSDs:
Get the MONITOR_ID or the OSD_ID and enter the container:
Syntax
podman ps
podman exec -it MONITOR_ID_OR_OSD_ID bash
Example
[root@host01 ~]# podman ps
[root@host01 ~]# podman exec -it ceph-1ca9f6a8-d036-11ec-8263-fa163ee967ad-osd-2 bash
Install the procps-ng and gdb packages inside the container:
Example
[root@host01 ~]# dnf install procps-ng gdb
Find the process ID:
Syntax
ps -aef | grep PROCESS | grep -v run
Replace PROCESS with the name of the running process, for example ceph-mon or ceph-osd.
Example
[root@host01 ~]# ps -aef | grep ceph-mon | grep -v run
ceph     15390 15266  0 18:54 ?        00:00:29 /usr/bin/ceph-mon --cluster ceph --setuser ceph --setgroup ceph -d -i 5
ceph     18110 17985  1 19:40 ?        00:00:08 /usr/bin/ceph-mon --cluster ceph --setuser ceph --setgroup ceph -d -i 2
Generate the core dump file:
Syntax
gcore ID
Replace ID with the ID of the process that you got from the previous step, for example 18110:
Example
[root@host01 ~]# gcore 18110
warning: target file /proc/18110/cmdline contained unexpected null characters
Saved corefile core.18110
Verify that the core dump file has been generated correctly.
Example
[root@host01 ~]# ls -ltr
total 709772
-rw-r--r--. 1 root root 726799544 Mar 18 19:46 core.18110
Copy the core dump file outside of the Ceph Monitor container:
Syntax
podman cp ceph-mon-MONITOR_ID:/tmp/mon.core.MONITOR_PID /tmp
Replace MONITOR_ID with the ID number of the Ceph Monitor and replace MONITOR_PID with the process ID number.
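For example, if the Ceph Monitor container is named ceph-mon-host01 and the process ID was 18110, the command might look like the following; both values are hypothetical, and this assumes the core file was written to /tmp/mon.core.18110 inside the container (for example, by running gcore -o /tmp/mon.core 18110 instead of plain gcore):
Example
[root@host01 ~]# podman cp ceph-mon-host01:/tmp/mon.core.18110 /tmp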
To manually capture a core dump file for other Ceph daemons:
Log in to the cephadm shell:
Example
[root@host03 ~]# cephadm shell
Enable ptrace for the daemons:
Example
[ceph: root@host01 /]# ceph config set mgr mgr/cephadm/allow_ptrace true
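You can confirm the setting before redeploying; ceph config get should report the value you just set:
Example
[ceph: root@host01 /]# ceph config get mgr mgr/cephadm/allow_ptrace
true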
Redeploy the daemon service:
Syntax
ceph orch redeploy SERVICE_ID
Example
[ceph: root@host01 /]# ceph orch redeploy mgr
[ceph: root@host01 /]# ceph orch redeploy rgw.rgw.1
Exit the cephadm shell and log in to the host where the daemons are deployed:
Example
[ceph: root@host01 /]# exit
[root@host01 ~]# ssh root@10.0.0.11
Get the DAEMON_ID and enter the container:
Example
[root@host04 ~]# podman ps
[root@host04 ~]# podman exec -it ceph-1ca9f6a8-d036-11ec-8263-fa163ee967ad-rgw-rgw-1-host04 bash
Install the procps-ng and gdb packages:
Example
[root@host04 /]# dnf install procps-ng gdb
Get the PID of the process:
Example
[root@host04 /]# ps aux | grep rados
ceph           6  0.3  2.8 5334140 109052 ?   Sl   May10   5:25 /usr/bin/radosgw -n client.rgw.rgw.1.host04 -f --setuser ceph --setgroup ceph --default-log-to-file=false --default-log-to-stderr=true --default-log-stderr-prefix=debug
Gather the core dump:
Syntax
gcore PID
Example
[root@host04 /]# gcore 6
Verify that the core dump file has been generated correctly.
Example
[root@host04 /]# ls -ltr
total 108798
-rw-r--r--. 1 root root 726799544 Mar 18 19:46 core.6
Copy the core dump file outside the container:
Syntax
podman cp ceph-mon-DAEMON_ID:/tmp/mon.core.PID /tmp
Replace DAEMON_ID with the ID number of the Ceph daemon and replace PID with the process ID number.
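For example, for the RADOS Gateway container entered earlier with PID 6, the command might look like the following; the container name is hypothetical, and this assumes the core file was written to /tmp inside the container (for example, with gcore -o /tmp/core 6):
Example
[root@host04 ~]# podman cp ceph-1ca9f6a8-d036-11ec-8263-fa163ee967ad-rgw-rgw-1-host04:/tmp/core.6 /tmp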
To allow systemd-coredump to successfully store the core dump for a crashed Ceph daemon:
Set DefaultLimitCORE to infinity in /etc/systemd/system.conf to allow core dump collection for a crashed process:
Syntax
# cat /etc/systemd/system.conf
DefaultLimitCORE=infinity
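One way to apply the setting is to append it to the file and confirm that it is present; appending works because a later assignment of the same key overrides an earlier one, and the grep output may also show a commented default entry if one exists. This is a minimal sketch:
Example
# echo "DefaultLimitCORE=infinity" >> /etc/systemd/system.conf
# grep DefaultLimitCORE /etc/systemd/system.conf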
Restart systemd, or reboot the node, to apply the updated systemd settings:
Syntax
# systemctl daemon-reexec
Verify the core dump files associated with previous daemon crashes:
Syntax
# ls -ltr /var/lib/systemd/coredump/
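Alternatively, list the recorded dumps with coredumpctl, assuming the corresponding journal entries have not been rotated out:
Example
# coredumpctl list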
- Upload the core dump file for analysis to a Red Hat support case. See Providing information to Red Hat Support engineers for details.
Additional Resources
- The How to use gdb to generate a readable backtrace from an application core solution on the Red Hat Customer Portal
- The How to enable core file dumps when an application crashes or segmentation faults solution on the Red Hat Customer Portal