Chapter 4. Management of monitors using the Ceph Orchestrator
As a storage administrator, you can deploy additional monitors using a placement specification, add monitors using a service specification, add monitors to a subnet configuration, and add monitors to specific hosts. You can also remove monitors using the Ceph Orchestrator.
By default, a typical Red Hat Ceph Storage cluster has three or five monitor daemons deployed on different hosts.
Red Hat recommends deploying five monitors if there are five or more nodes in a cluster.
Red Hat recommends deploying three monitors when Ceph is deployed with the Red Hat OpenStack Platform (OSP) director.
Ceph deploys monitor daemons automatically as the cluster grows, and scales back monitor daemons automatically as the cluster shrinks. The smooth execution of this automatic growing and shrinking depends upon proper subnet configuration.
If your monitor nodes or your entire cluster are located on a single subnet, then Cephadm automatically adds up to five monitor daemons as you add new hosts to the cluster. Cephadm automatically configures the monitor daemons on the new hosts, provided that the new hosts reside on the same subnet as the bootstrapped host in the storage cluster.
Cephadm can also deploy and scale monitors to correspond to changes in the size of the storage cluster.
4.1. Ceph Monitors
Ceph Monitors are lightweight processes that maintain a master copy of the storage cluster map. All Ceph clients contact a Ceph monitor and retrieve the current copy of the storage cluster map, enabling clients to bind to a pool and read and write data.
Ceph Monitors use a variation of the Paxos protocol to establish consensus about maps and other critical information across the storage cluster. Due to the nature of Paxos, Ceph requires a majority of monitors running to establish a quorum, thus establishing consensus.
Red Hat requires at least three monitors on separate hosts to receive support for a production cluster.
Red Hat recommends deploying an odd number of monitors. An odd number of Ceph Monitors has higher resiliency to failures than an even number of monitors. For example, to maintain a quorum on a two-monitor deployment, Ceph cannot tolerate any failures; with three monitors, it can tolerate one failure; with four monitors, one failure; with five monitors, two failures. This is why an odd number is advisable. In summary, Ceph needs a majority of the monitors to be running and able to communicate with each other: two out of three, three out of four, and so on.
For an initial deployment of a multi-node Ceph storage cluster, Red Hat requires three monitors, increasing the number two at a time if a valid need for more than three monitors exists.
Since Ceph Monitors are lightweight, it is possible to run them on the same host as OpenStack nodes. However, Red Hat recommends running monitors on separate hosts.
Red Hat ONLY supports collocating Ceph services in containerized environments.
When you remove monitors from a storage cluster, consider that Ceph Monitors use the Paxos protocol to establish a consensus about the master storage cluster map. You must have a sufficient number of Ceph Monitors to establish a quorum.
Additional Resources
- See the Red Hat Ceph Storage Supported configurations Knowledgebase article for all the supported Ceph configurations.
4.2. Configuring monitor election strategy
The monitor election strategy identifies net splits and handles failures. You can configure the monitor election strategy in three different modes:
- classic - This is the default mode, in which the lowest ranked monitor is elected, based on the elector module between the two sites.
- disallow - This mode lets you mark monitors as disallowed, in which case they participate in the quorum and serve clients, but cannot be the elected leader. This lets you add monitors to a list of disallowed leaders. If a monitor is in the disallowed list, it always defers to another monitor.
- connectivity - This mode is mainly used to resolve network discrepancies. It evaluates the connection scores, based on pings that check liveness, that each monitor provides for its peers, and elects the most connected and reliable monitor to be the leader. This mode is designed to handle net splits, which can happen if your cluster is stretched across multiple data centers or is otherwise susceptible. This mode incorporates connection score ratings and elects the monitor with the best score. If you want a specific monitor to be the leader, configure the election strategy so that that monitor is the first monitor in the list, with a rank of 0.
Red Hat recommends that you stay in the classic mode unless you require the features of the other modes.
Before constructing the cluster, change the election_strategy to classic, disallow, or connectivity by using the following command:
Syntax
ceph mon set election_strategy {classic|disallow|connectivity}
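For example, to switch the cluster to the connectivity mode, an illustration derived directly from the syntax above and using the host prompt conventions of this guide:
Example
[ceph: root@host01 /]# ceph mon set election_strategy connectivity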
4.3. Deploying the Ceph monitor daemons using the command line interface
The Ceph Orchestrator deploys one monitor daemon by default. You can deploy additional monitor daemons by using the placement specification in the command line interface. To deploy a different number of monitor daemons, specify a different number. If you do not specify the hosts where the monitor daemons should be deployed, the Ceph Orchestrator randomly selects the hosts and deploys the monitor daemons to them.
If you are using a cluster in stretched mode, before adding the Ceph Monitor, add the crush_location to the monitor manually:
Syntax
ceph mon add HOST IP_ADDRESS datacenter=DATACENTER
Example
[ceph: root@host01 /]# ceph mon add host01 213.222.226.50 datacenter=DC1
adding mon.host01 at [v2:213.222.226.50:3300/0,v1:213.222.226.50:6789/0]
In this example, datacenter=DC1 is the crush_location.
Prerequisites
- A running Red Hat Ceph Storage cluster.
- Hosts are added to the cluster.
Procedure
Log into the Cephadm shell:
Example
[root@host01 ~]# cephadm shell
There are four different ways of deploying Ceph monitor daemons:
Method 1
Use placement specification to deploy monitors on hosts:
Note: Red Hat recommends that you use the --placement option to deploy on specific hosts.
Syntax
ceph orch apply mon --placement="HOST_NAME_1 HOST_NAME_2 HOST_NAME_3"
Example
[ceph: root@host01 /]# ceph orch apply mon --placement="host01 host02 host03"
Note: Be sure to include the bootstrap node as the first node in the command.
Important: Do not add the monitors individually, because each ceph orch apply mon command supersedes the previous one and will not add the monitors to all the hosts. For example, if you run the following commands, the first command creates a monitor on host01. The second command supersedes the monitor on host01 and creates a monitor on host02. The third command supersedes the monitor on host02 and creates a monitor on host03. Eventually, there is a monitor only on the third host.
# ceph orch apply mon host01
# ceph orch apply mon host02
# ceph orch apply mon host03
Method 2
Use the placement specification to deploy monitor daemons on specific hosts with labels:
Add the labels to the hosts:
Syntax
ceph orch host label add HOSTNAME_1 LABEL
Example
[ceph: root@host01 /]# ceph orch host label add host01 mon
Deploy the daemons:
Syntax
ceph orch apply mon --placement="HOST_NAME_1:mon HOST_NAME_2:mon HOST_NAME_3:mon"
Example
[ceph: root@host01 /]# ceph orch apply mon --placement="host01:mon host02:mon host03:mon"
Method 3
Use the placement specification to deploy a specific number of monitor daemons on specific hosts:
Syntax
ceph orch apply mon --placement="NUMBER_OF_DAEMONS HOST_NAME_1 HOST_NAME_2 HOST_NAME_3"
Example
[ceph: root@host01 /]# ceph orch apply mon --placement="3 host01 host02 host03"
Method 4
Deploy monitor daemons randomly on the hosts in the storage cluster:
Syntax
ceph orch apply mon NUMBER_OF_DAEMONS
Example
[ceph: root@host01 /]# ceph orch apply mon 3
Verification
List the service:
Example
[ceph: root@host01 /]# ceph orch ls
List the hosts, daemons, and processes:
Syntax
ceph orch ps --daemon_type=DAEMON_NAME
Example
[ceph: root@host01 /]# ceph orch ps --daemon_type=mon
4.4. Deploying the Ceph monitor daemons using the service specification
The Ceph Orchestrator deploys one monitor daemon by default. You can deploy additional monitor daemons by using a service specification, such as a YAML format file.
Prerequisites
- A running Red Hat Ceph Storage cluster.
- Hosts are added to the cluster.
Procedure
Create the mon.yaml file:
Example
[root@host01 ~]# touch mon.yaml
Edit the mon.yaml file to include the following details:
Syntax
service_type: mon
placement:
  hosts:
    - HOST_NAME_1
    - HOST_NAME_2
Example
service_type: mon
placement:
  hosts:
    - host01
    - host02
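The placement section can also be expressed in other ways. The following is a minimal sketch, not a required form, that assumes the hosts carry the mon label described earlier in this chapter and that you want three monitor daemons pinned to those labeled hosts:
Example
service_type: mon
placement:
  count: 3
  label: mon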
Mount the YAML file under a directory in the container:
Example
[root@host01 ~]# cephadm shell --mount mon.yaml:/var/lib/ceph/mon/mon.yaml
Navigate to the directory:
Example
[ceph: root@host01 /]# cd /var/lib/ceph/mon/
Deploy the monitor daemons:
Syntax
ceph orch apply -i FILE_NAME.yaml
Example
[ceph: root@host01 mon]# ceph orch apply -i mon.yaml
Verification
List the service:
Example
[ceph: root@host01 /]# ceph orch ls
List the hosts, daemons, and processes:
Syntax
ceph orch ps --daemon_type=DAEMON_NAME
Example
[ceph: root@host01 /]# ceph orch ps --daemon_type=mon
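Optionally, confirm the service specification that the Ceph Orchestrator stored. This is a quick check that uses the export option of ceph orch ls; the mon argument filters the output to the monitor service:
Example
[ceph: root@host01 /]# ceph orch ls mon --export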
4.5. Deploying the monitor daemons on specific network using the Ceph Orchestrator
The Ceph Orchestrator deploys one monitor daemon by default. You can explicitly specify the IP address or CIDR network for each monitor and control where each monitor is placed.
Prerequisites
- A running Red Hat Ceph Storage cluster.
- Hosts are added to the cluster.
Procedure
Log into the Cephadm shell:
Example
[root@host01 ~]# cephadm shell
Disable automated monitor deployment:
Example
[ceph: root@host01 /]# ceph orch apply mon --unmanaged
Deploy monitors on hosts on a specific network:
Syntax
ceph orch daemon add mon HOST_NAME_1:IP_OR_NETWORK
Example
[ceph: root@host01 /]# ceph orch daemon add mon host03:10.1.2.123
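You can also pass a CIDR network instead of a single IP address, so that the monitor binds to an address in that network. The following is an illustration only; host02 and the 10.1.2.0/24 network are placeholders for values from your environment:
Example
[ceph: root@host01 /]# ceph orch daemon add mon host02:10.1.2.0/24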
Verification
List the service:
Example
[ceph: root@host01 /]# ceph orch ls
List the hosts, daemons, and processes:
Syntax
ceph orch ps --daemon_type=DAEMON_NAME
Example
[ceph: root@host01 /]# ceph orch ps --daemon_type=mon
4.6. Removing the monitor daemons using the Ceph Orchestrator
To remove the monitor daemons from a host, redeploy the monitor daemons on the other hosts, omitting the host you want to remove.
Prerequisites
- A running Red Hat Ceph Storage cluster.
- Hosts are added to the cluster.
- At least one monitor daemon deployed on the hosts.
Procedure
Log into the Cephadm shell:
Example
[root@host01 ~]# cephadm shell
Run the ceph orch apply command to deploy the required monitor daemons:
Syntax
ceph orch apply mon "NUMBER_OF_DAEMONS HOST_NAME_1 HOST_NAME_3"
If you want to remove monitor daemons from host02, redeploy the monitors on the other hosts.
Example
[ceph: root@host01 /]# ceph orch apply mon "2 host01 host03"
Verification
List the hosts, daemons, and processes:
Syntax
ceph orch ps --daemon_type=DAEMON_NAME
Example
[ceph: root@host01 /]# ceph orch ps --daemon_type=mon
Additional Resources
- See Deploying the Ceph monitor daemons using the command line interface section in the Red Hat Ceph Storage Operations Guide for more information.
- See Deploying the Ceph monitor daemons using the service specification section in the Red Hat Ceph Storage Operations Guide for more information.
4.7. Removing a Ceph Monitor from an unhealthy storage cluster
You can remove a ceph-mon daemon from an unhealthy storage cluster. An unhealthy storage cluster is one that has placement groups persistently not in the active + clean state.
Prerequisites
- A running Red Hat Ceph Storage cluster.
- Root-level access to the Ceph Monitor node.
- At least one running Ceph Monitor node.
Procedure
Identify a surviving monitor and log into the host:
Syntax
ssh root@MONITOR_ID
Example
[root@admin ~]# ssh root@host00
Log in to each Ceph Monitor host and stop all the Ceph Monitors:
Syntax
cephadm unit --name DAEMON_NAME.HOSTNAME stop
Example
[root@host00 ~]# cephadm unit --name mon.host00 stop
Set up an environment suitable for extended daemon maintenance and run the daemon interactively:
Syntax
cephadm shell --name DAEMON_NAME.HOSTNAME
Example
[root@host00 ~]# cephadm shell --name mon.host00
Extract a copy of the monmap file:
Syntax
ceph-mon -i HOSTNAME --extract-monmap TEMP_PATH
Example
[ceph: root@host00 /]# ceph-mon -i host00 --extract-monmap /tmp/monmap
2022-01-05T11:13:24.440+0000 7f7603bd1700 -1 wrote monmap to /tmp/monmap
Remove the non-surviving Ceph Monitor(s):
Syntax
monmaptool TEMP_PATH --rm HOSTNAME
Example
[ceph: root@host00 /]# monmaptool /tmp/monmap --rm host01
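Optionally, you can verify the contents of the modified map before injecting it. This is a quick check using the print option of monmaptool, with the same temporary path as in the example above:
Example
[ceph: root@host00 /]# monmaptool /tmp/monmap --print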
Inject the monitor map, with the non-surviving monitor(s) removed, into the surviving Ceph Monitor:
Syntax
ceph-mon -i HOSTNAME --inject-monmap TEMP_PATH
Example
[ceph: root@host00 /]# ceph-mon -i host00 --inject-monmap /tmp/monmap
Start only the surviving monitors:
Syntax
cephadm unit --name DAEMON_NAME.HOSTNAME start
Example
[root@host00 ~]# cephadm unit --name mon.host00 start
Verify the monitors form a quorum:
Example
[ceph: root@host00 /]# ceph -s
Optional: Archive the removed Ceph Monitor’s data directory in the /var/lib/ceph/CLUSTER_FSID/mon.HOSTNAME directory.
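For example, a minimal sketch of archiving the directory by renaming it, where CLUSTER_FSID is your cluster’s FSID and host01 stands in for the removed monitor’s host name:
Example
[root@host01 ~]# mv /var/lib/ceph/CLUSTER_FSID/mon.host01 /var/lib/ceph/CLUSTER_FSID/mon.host01.archived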