Chapter 9. Troubleshooting clusters in stretch mode
You can replace and remove a failed tiebreaker monitor. If needed, you can also force the cluster into recovery or healthy mode.
9.1. Replacing the tiebreaker with a monitor in quorum
If your tiebreaker monitor fails, you can replace it with one of the existing monitors in quorum and remove the failed monitor from the cluster.
Prerequisites
- A running Red Hat Ceph Storage cluster
- Stretch mode is enabled on a cluster
Procedure
- Disable automated monitor deployment:

  Example

  [ceph: root@host01 /]# ceph orch apply mon --unmanaged
  Scheduled mon update…
- View the monitors in quorum:

  Example

  [ceph: root@host01 /]# ceph -s
  mon: 5 daemons, quorum host01, host02, host04, host05 (age 30s), out of quorum: host07
- Set the monitor in quorum as the new tiebreaker:

  Syntax

  ceph mon set_new_tiebreaker NEW_HOST

  Example

  [ceph: root@host01 /]# ceph mon set_new_tiebreaker host02

  Important: You get an error message if the monitor is in the same location as existing non-tiebreaker monitors:

  Example

  [ceph: root@host01 /]# ceph mon set_new_tiebreaker host02
  Error EINVAL: mon.host02 has location DC1, which matches mons host02 on the datacenter dividing bucket for stretch mode.

  If that happens, change the location of the monitor:

  Syntax

  ceph mon set_location HOST datacenter=DATACENTER

  Example

  [ceph: root@host01 /]# ceph mon set_location host02 datacenter=DC3
- Remove the failed tiebreaker monitor:

  Syntax

  ceph orch daemon rm FAILED_TIEBREAKER_MONITOR --force

  Example

  [ceph: root@host01 /]# ceph orch daemon rm mon.host07 --force
  Removed mon.host07 from host 'host07'
- Once the monitor is removed from the host, redeploy the monitor:

  Syntax

  ceph mon add HOST IP_ADDRESS datacenter=DATACENTER
  ceph orch daemon add mon HOST

  Example

  [ceph: root@host01 /]# ceph mon add host07 213.222.226.50 datacenter=DC1
  [ceph: root@host01 /]# ceph orch daemon add mon host07
- Ensure there are five monitors in quorum:

  Example

  [ceph: root@host01 /]# ceph -s
  mon: 5 daemons, quorum host01, host02, host04, host05, host07 (age 15s)
- Verify that everything is configured properly, as shown in the sketch below.
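  One way to perform this verification is to dump the monitor map; when stretch mode is enabled, the dump reports the tiebreaker monitor and the location of each monitor. The following is a minimal sketch, assuming the host names and data centers used in the examples above; the output is abbreviated and illustrative, and the exact fields can vary by release:

  Example

  [ceph: root@host01 /]# ceph mon dump
  epoch 19
  ...
  election_strategy: 3
  stretch_mode_enabled 1
  tiebreaker_mon host02
  ...
  0: [v2:...] mon.host01; crush_location {datacenter=DC1}
  1: [v2:...] mon.host02; crush_location {datacenter=DC3}
  ...

  Confirm that the tiebreaker field names the monitor you set in the earlier step and that each monitor reports the expected data center.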
- Redeploy the monitors:

  Syntax

  ceph orch apply mon --placement="HOST_1, HOST_2, HOST_3, HOST_4, HOST_5"

  Example

  [ceph: root@host01 /]# ceph orch apply mon --placement="host01, host02, host04, host05, host07"
  Scheduled mon update...
9.2. Replacing the tiebreaker with a new monitor
If your tiebreaker monitor fails, you can replace it with a new monitor and remove the failed monitor from the cluster.
Prerequisites
- A running Red Hat Ceph Storage cluster
- Stretch mode is enabled on a cluster
Procedure
- Add a new monitor to the cluster, manually adding the crush_location to the new monitor:

  Syntax

  ceph mon add NEW_HOST IP_ADDRESS datacenter=DATACENTER

  Example

  [ceph: root@host01 /]# ceph mon add host06 213.222.226.50 datacenter=DC3
  adding mon.host06 at [v2:213.222.226.50:3300/0,v1:213.222.226.50:6789/0]

  Note: The new monitor has to be in a different location than the existing non-tiebreaker monitors. One way to review the current monitor locations is shown in the sketch after this step.
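  If you are not sure which data centers the existing monitors occupy, you can list their locations before choosing one for the new monitor. A minimal sketch that filters the monitor map dump; the host names and locations follow the examples in this chapter, and the output is abbreviated and illustrative:

  Example

  [ceph: root@host01 /]# ceph mon dump | grep crush_location
  0: [v2:...] mon.host01; crush_location {datacenter=DC1}
  1: [v2:...] mon.host02; crush_location {datacenter=DC1}
  2: [v2:...] mon.host04; crush_location {datacenter=DC2}
  3: [v2:...] mon.host05; crush_location {datacenter=DC2}

  In this sketch, DC3 holds no non-tiebreaker monitors, so it is a valid location for the new tiebreaker.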
- Disable automated monitor deployment:

  Example

  [ceph: root@host01 /]# ceph orch apply mon --unmanaged
  Scheduled mon update…
- Deploy the new monitor:

  Syntax

  ceph orch daemon add mon NEW_HOST

  Example

  [ceph: root@host01 /]# ceph orch daemon add mon host06
 
- Ensure that there are six monitors, of which five are in quorum:

  Example

  [ceph: root@host01 /]# ceph -s
  mon: 6 daemons, quorum host01, host02, host04, host05, host06 (age 30s), out of quorum: host07
- Set the new monitor as the new tiebreaker:

  Syntax

  ceph mon set_new_tiebreaker NEW_HOST

  Example

  [ceph: root@host01 /]# ceph mon set_new_tiebreaker host06
- Remove the failed tiebreaker monitor:

  Syntax

  ceph orch daemon rm FAILED_TIEBREAKER_MONITOR --force

  Example

  [ceph: root@host01 /]# ceph orch daemon rm mon.host07 --force
  Removed mon.host07 from host 'host07'
- Verify that everything is configured properly, as shown in the sketch below.
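  As in the previous procedure, one way to verify the configuration is to dump the monitor map and confirm the tiebreaker assignment. A minimal sketch, assuming host06 is the new tiebreaker; the output is abbreviated and illustrative, and field names can vary by release:

  Example

  [ceph: root@host01 /]# ceph mon dump
  ...
  stretch_mode_enabled 1
  tiebreaker_mon host06
  ...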
- Redeploy the monitors:

  Syntax

  ceph orch apply mon --placement="HOST_1, HOST_2, HOST_3, HOST_4, HOST_5"

  Example

  [ceph: root@host01 /]# ceph orch apply mon --placement="host01, host02, host04, host05, host06"
  Scheduled mon update…
9.3. Forcing stretch cluster into recovery or healthy mode
When the cluster is in degraded stretch mode, it goes into recovery mode automatically after the disconnected data center comes back online. If that does not happen, or if you want to enter recovery mode early, you can force the stretch cluster into recovery mode.
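Before forcing a mode change, it can help to confirm the cluster's current state. A minimal sketch using standard status commands; the exact health messages vary by release, so treat these checks as a starting point:

  Example

  [ceph: root@host01 /]# ceph -s
  [ceph: root@host01 /]# ceph health detail

The health output should indicate whether the cluster is still in degraded stretch mode or has already begun recovery.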
Prerequisites
- A running Red Hat Ceph Storage cluster
- Stretch mode is enabled on a cluster
Procedure
- Force the stretch cluster into the recovery mode:

  Example

  [ceph: root@host01 /]# ceph osd force_recovery_stretch_mode --yes-i-really-mean-it

  Note: The recovery state puts the cluster in the HEALTH_WARN state.
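  While the cluster is in the recovery state, you can observe this warning directly. A minimal sketch; the detail message is elided because its exact wording depends on the release:

  Example

  [ceph: root@host01 /]# ceph health
  HEALTH_WARN ...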
- When in recovery mode, the cluster should go back into normal stretch mode after the placement groups are healthy. If that does not happen, you can force the stretch cluster into the healthy mode:

  Example

  [ceph: root@host01 /]# ceph osd force_healthy_stretch_mode --yes-i-really-mean-it

  Note: You can also run this command if you want to force cross-data-center peering early and you are willing to risk data downtime, or if you have verified separately that all the placement groups can peer, even if they are not fully recovered. One way to check peering is shown in the sketch after this step. You might also wish to invoke the healthy mode to remove the HEALTH_WARN state that the recovery state generates.

  Note: The force_recovery_stretch_mode and force_healthy_stretch_mode commands should not be necessary in normal operation; they are included to handle unanticipated situations.
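  To confirm that all placement groups can peer before forcing healthy mode, the standard placement-group queries are one way to check. A minimal sketch; the counts are illustrative, and ceph pg dump_stuck inactive should print ok when no placement groups are stuck inactive:

  Example

  [ceph: root@host01 /]# ceph pg stat
  192 pgs: 192 active+clean; ...
  [ceph: root@host01 /]# ceph pg dump_stuck inactive
  ok

  All placement groups reporting an active state means they have peered, even if some are still recovering.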