Chapter 2. Understanding process management for Ceph

As a storage administrator, you can manipulate the various Ceph daemons by type or instance in a Red Hat Ceph Storage cluster. Manipulating these daemons allows you to start, stop and restart all of the Ceph services as needed.

2.1. Ceph process management

In Red Hat Ceph Storage, all process management is done through the systemd service. Each time you want to start, restart, or stop the Ceph daemons, you must specify the daemon type or the daemon instance.

Additional Resources

For more information on using systemd, see Managing system services with systemctl.

2.2. Starting, stopping, and restarting all Ceph daemons using `systemctl` command

You can start, stop, and restart all Ceph daemons as the root user from the host where the daemons run.

Prerequisites

A running Red Hat Ceph Storage cluster.
Having root access to the node.

Procedure

On the host where you want to start, stop, and restart the daemons, run the systemctl command to get the SERVICE_ID of the service.
Example
```
[root@host01 ~]# systemctl --type=service
ceph-499829b4-832f-11eb-8d6d-001a4a000635@mon.host01.service
```

Starting all Ceph daemons:

Syntax

systemctl start SERVICE_ID

Example

[root@host01 ~]# systemctl start ceph-499829b4-832f-11eb-8d6d-001a4a000635@mon.host01.service

Stopping all Ceph daemons:

Syntax

systemctl stop SERVICE_ID

Example

[root@host01 ~]# systemctl stop ceph-499829b4-832f-11eb-8d6d-001a4a000635@mon.host01.service

Restarting all Ceph daemons:

Syntax

systemctl restart SERVICE_ID

Example

[root@host01 ~]# systemctl restart ceph-499829b4-832f-11eb-8d6d-001a4a000635@mon.host01.service
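
The SERVICE_ID values in the examples above all have the shape `ceph-FSID@DAEMON.service`. The following is a minimal sketch, assuming that pattern holds on your hosts, of a helper that composes the unit name from a cluster FSID and a daemon name. The function name `ceph_unit_name` is hypothetical and not part of any Ceph tooling. Always confirm the real name with `systemctl --type=service` first.

```shell
# Hypothetical helper: compose a systemd unit name from a cluster FSID ($1)
# and a daemon name ($2), following the pattern seen in the examples above.
# The pattern is inferred from those examples, not queried from systemd.
ceph_unit_name() {
    printf 'ceph-%s@%s.service\n' "$1" "$2"
}

# Usage (as root, on the host where the daemon runs):
#   systemctl restart "$(ceph_unit_name 499829b4-832f-11eb-8d6d-001a4a000635 mon.host01)"
```

Building the name in one place keeps the FSID from being retyped for every start, stop, or restart command.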

2.3. Starting, stopping, and restarting all Ceph services

Ceph services are logical groups of Ceph daemons of the same type, configured to run in the same Red Hat Ceph Storage cluster. The orchestration layer in Ceph allows the user to manage these services in a centralized way, making it easy to execute operations that affect all the Ceph daemons that belong to the same logical service. The Ceph daemons running on each host are managed through the systemd service. You can start, stop, and restart all Ceph services from the host where you want to manage the Ceph services.

Important

If you want to start, stop, or restart a specific Ceph daemon on a specific host, you need to use the systemd service. To obtain a list of the systemd services running on a specific host, connect to the host, and run the following command:

Example

[root@host01 ~]# systemctl list-units "ceph*"

The output gives you a list of the service names that you can use to manage each Ceph daemon.
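
The unit listing also contains targets and status columns. The following is a rough sketch of a filter that keeps only the Ceph service unit names. The function name `ceph_service_units` is hypothetical, and the sketch assumes the unit name is the first whitespace-separated column, which is the case when `systemctl` is run with the `--plain` and `--no-legend` options.

```shell
# Hypothetical filter: read `systemctl list-units` output on stdin and print
# only the first column of lines whose unit name looks like ceph-*.service.
ceph_service_units() {
    awk '$1 ~ /^ceph-.*\.service$/ { print $1 }'
}

# Usage (as root, on the host you want to inspect):
#   systemctl list-units --plain --no-legend "ceph*" | ceph_service_units
```

Each name printed can then be passed to `systemctl start`, `systemctl stop`, or `systemctl restart`.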

Prerequisites

A running Red Hat Ceph Storage cluster.
Having root access to the node.

Procedure

Log into the Cephadm shell:
Example
```
[root@host01 ~]# cephadm shell
```

Run the ceph orch ls command to get a list of Ceph services configured in the Red Hat Ceph Storage cluster and to get the specific service ID.

Example

[ceph: root@host01 /]# ceph orch ls
NAME                       RUNNING  REFRESHED  AGE  PLACEMENT  IMAGE NAME                                                       IMAGE ID
alertmanager                   1/1  4m ago     4M   count:1    registry.redhat.io/openshift4/ose-prometheus-alertmanager:v4.5   b7bae610cd46
crash                          3/3  4m ago     4M   *          registry.redhat.io/rhceph-alpha/rhceph-6-rhel9:latest            c88a5d60f510
grafana                        1/1  4m ago     4M   count:1    registry.redhat.io/rhceph-alpha/rhceph-6-dashboard-rhel9:latest  bd3d7748747b
mgr                            2/2  4m ago     4M   count:2    registry.redhat.io/rhceph-alpha/rhceph-6-rhel9:latest            c88a5d60f510
mon                            2/2  4m ago     10w  count:2    registry.redhat.io/rhceph-alpha/rhceph-6-rhel9:latest            c88a5d60f510
nfs.foo                        0/1  -          -    count:1    <unknown>                                                        <unknown>
node-exporter                  1/3  4m ago     4M   *          registry.redhat.io/openshift4/ose-prometheus-node-exporter:v4.5  mix
osd.all-available-devices      5/5  4m ago     3M   *          registry.redhat.io/rhceph-alpha/rhceph-6-rhel9:latest            c88a5d60f510
prometheus                     1/1  4m ago     4M   count:1    registry.redhat.io/openshift4/ose-prometheus:v4.6                bebb0ddef7f0
rgw.test_realm.test_zone       2/2  4m ago     3M   count:2    registry.redhat.io/rhceph-alpha/rhceph-6-rhel9:latest            c88a5d60f510

To start a specific service, run the following command:

Syntax

ceph orch start SERVICE_ID

Example

[ceph: root@host01 /]# ceph orch start node-exporter

To stop a specific service, run the following command:
Important
Running the ceph orch stop SERVICE_ID command on the MON or MGR service makes the Red Hat Ceph Storage cluster inaccessible. To stop a specific daemon on a host, use the systemctl stop SERVICE_ID command instead.
Syntax
```
ceph orch stop SERVICE_ID
```
Example
```
[ceph: root@host01 /]# ceph orch stop node-exporter
```
In the example, the ceph orch stop node-exporter command stops all the daemons of the node-exporter service.

To restart a specific service, run the following command:

Syntax

ceph orch restart SERVICE_ID

Example

[ceph: root@host01 /]# ceph orch restart node-exporter

2.4. Viewing log files of Ceph daemons that run in containers

Use the journald daemon from the container host to view a log file of a Ceph daemon from a container.

Prerequisites

Installation of the Red Hat Ceph Storage software.
Root-level access to the node.

Procedure

To view the entire Ceph log file, run a journalctl command as root composed in the following format:
Syntax
```
journalctl -u SERVICE_ID
```
Example
```
[root@host01 ~]# journalctl -u ceph-499829b4-832f-11eb-8d6d-001a4a000635@osd.8.service
```
In the above example, you can view the entire log for the OSD with ID osd.8.

To show only the most recent journal entries and keep printing new entries as they arrive, use the -f option.

Syntax

journalctl -fu SERVICE_ID

Example

[root@host01 ~]# journalctl -fu ceph-499829b4-832f-11eb-8d6d-001a4a000635@osd.8.service

Note

You can also use the sosreport utility to view the journald logs. For more details about SOS reports, see the What is an sosreport and how to create one in Red Hat Enterprise Linux? solution on the Red Hat Customer Portal.

Additional Resources

The journalctl manual page.

2.5. Powering down and rebooting Red Hat Ceph Storage cluster

You can power down and reboot the Red Hat Ceph Storage cluster using two different approaches: systemctl commands and the Ceph Orchestrator. You can choose either approach to power down and reboot the cluster.

2.5.1. Powering down and rebooting the cluster using the `systemctl` commands

You can use the systemctl commands approach to power down and reboot the Red Hat Ceph Storage cluster. This approach follows the Linux way of stopping the services.

Prerequisites

A running Red Hat Ceph Storage cluster.
Root-level access.

Procedure

Powering down the Red Hat Ceph Storage cluster

Stop the clients from using the Block Device images and the Ceph Object Gateway (RADOS Gateway) on this cluster, and stop any other clients.
Log into the Cephadm shell:
Example
```
[root@host01 ~]# cephadm shell
```
The cluster must be in a healthy state (HEALTH_OK and all PGs active+clean) before proceeding. Run ceph status on the host with the client keyrings, for example, the Ceph Monitor or OpenStack controller nodes, to ensure the cluster is healthy.
Example
```
[ceph: root@host01 /]# ceph -s
```

If you use the Ceph File System (CephFS), bring down the CephFS cluster:

Syntax

ceph fs set FS_NAME max_mds 1
ceph fs fail FS_NAME
ceph status
ceph fs set FS_NAME joinable false

Example

[ceph: root@host01 /]# ceph fs set cephfs max_mds 1
[ceph: root@host01 /]# ceph fs fail cephfs
[ceph: root@host01 /]# ceph status
[ceph: root@host01 /]# ceph fs set cephfs joinable false

Set the noout, norecover, norebalance, nobackfill, nodown, and pause flags. Run the following on a node with the client keyrings, for example, the Ceph Monitor or OpenStack controller node:
Example
```
[ceph: root@host01 /]# ceph osd set noout
[ceph: root@host01 /]# ceph osd set norecover
[ceph: root@host01 /]# ceph osd set norebalance
[ceph: root@host01 /]# ceph osd set nobackfill
[ceph: root@host01 /]# ceph osd set nodown
[ceph: root@host01 /]# ceph osd set pause
```
Important
The example above applies only to stopping the service and each OSD on an OSD node, and it must be repeated on each OSD node.
If the MDS and Ceph Object Gateway nodes are on their own dedicated nodes, power them off.

Get the systemd target of the daemons:

Example

[root@host01 ~]# systemctl list-units --type target | grep ceph
ceph-0b007564-ec48-11ee-b736-525400fd02f8.target loaded active active Ceph cluster 0b007564-ec48-11ee-b736-525400fd02f8
ceph.target                                      loaded active active All Ceph clusters and services

Disable the target that includes the cluster FSID:

Example

[root@host01 ~]# systemctl disable ceph-0b007564-ec48-11ee-b736-525400fd02f8.target

Removed "/etc/systemd/system/multi-user.target.wants/ceph-0b007564-ec48-11ee-b736-525400fd02f8.target".
Removed "/etc/systemd/system/ceph.target.wants/ceph-0b007564-ec48-11ee-b736-525400fd02f8.target".

Stop the target:
Example
```
[root@host01 ~]# systemctl stop ceph-0b007564-ec48-11ee-b736-525400fd02f8.target
```
This stops all the Ceph daemons on the host.

Shut down the node:

Example

[root@host01 ~]# shutdown
Shutdown scheduled for Wed 2024-03-27 11:47:19 EDT, use 'shutdown -c' to cancel.

Repeat the above steps for all the nodes of the cluster.

Rebooting the Red Hat Ceph Storage cluster

If network equipment was involved, ensure it is powered ON and stable prior to powering ON any Ceph hosts or nodes.
Power ON the administration node.

Enable the systemd target to get all the daemons running:

Example

[root@host01 ~]# systemctl enable ceph-0b007564-ec48-11ee-b736-525400fd02f8.target
Created symlink /etc/systemd/system/multi-user.target.wants/ceph-0b007564-ec48-11ee-b736-525400fd02f8.target → /etc/systemd/system/ceph-0b007564-ec48-11ee-b736-525400fd02f8.target.
Created symlink /etc/systemd/system/ceph.target.wants/ceph-0b007564-ec48-11ee-b736-525400fd02f8.target → /etc/systemd/system/ceph-0b007564-ec48-11ee-b736-525400fd02f8.target.

Start the systemd target:

Example

[root@host01 ~]# systemctl start ceph-0b007564-ec48-11ee-b736-525400fd02f8.target

Wait for all the nodes to come up. Verify all the services are up and there are no connectivity issues between the nodes.

Unset the noout, norecover, norebalance, nobackfill, nodown and pause flags. Run the following on a node with the client keyrings, for example, the Ceph Monitor or OpenStack controller node:

Example

[ceph: root@host01 /]# ceph osd unset noout
[ceph: root@host01 /]# ceph osd unset norecover
[ceph: root@host01 /]# ceph osd unset norebalance
[ceph: root@host01 /]# ceph osd unset nobackfill
[ceph: root@host01 /]# ceph osd unset nodown
[ceph: root@host01 /]# ceph osd unset pause

If you use the Ceph File System (CephFS), bring the CephFS cluster back up by setting the joinable flag to true:
Syntax
```
ceph fs set FS_NAME joinable true
```
Example
```
[ceph: root@host01 /]# ceph fs set cephfs joinable true
```
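
The power-down and reboot procedures run the same `ceph osd set` or `ceph osd unset` command once per flag. The following is a minimal sketch that generates those commands from a single flag list, so the set and unset steps cannot drift apart. The names `CEPH_MAINT_FLAGS` and `ceph_flag_commands` are hypothetical. The helper only prints the commands and does not contact the cluster.

```shell
# Flags copied from the procedure above, in the same order.
CEPH_MAINT_FLAGS="noout norecover norebalance nobackfill nodown pause"

# Hypothetical helper: print one `ceph osd set` or `ceph osd unset` command
# per flag. Nothing is executed against the cluster.
ceph_flag_commands() {
    action="$1"   # expected value: set or unset
    case "$action" in
        set|unset) ;;
        *) echo "usage: ceph_flag_commands set|unset" >&2; return 1 ;;
    esac
    for flag in $CEPH_MAINT_FLAGS; do
        printf 'ceph osd %s %s\n' "$action" "$flag"
    done
}

# Usage: review the printed commands, then run them in the Cephadm shell.
#   ceph_flag_commands set
#   ceph_flag_commands unset
```

Printing instead of executing is a deliberate choice here, because setting the pause flag stops client I/O and should be a reviewed action.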

Verification

Verify the cluster is in a healthy state (HEALTH_OK and all PGs active+clean). Run ceph status on a node with the client keyrings, for example, the Ceph Monitor or OpenStack controller nodes, to ensure the cluster is healthy.

Example

[ceph: root@host01 /]# ceph -s
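
When this verification is part of a script, the health string can be tested instead of read by eye. The following is a minimal sketch, assuming the one-line output of the `ceph health` command is passed in as an argument. The function name `ceph_health_is_ok` is hypothetical. It checks only the overall health status, so the PG states still need to be confirmed from the `ceph -s` output.

```shell
# Hypothetical check: succeed only if the given health string starts with
# HEALTH_OK. Any other status, or an empty string, counts as not healthy.
ceph_health_is_ok() {
    case "$1" in
        HEALTH_OK*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage (inside the Cephadm shell):
#   ceph_health_is_ok "$(ceph health)" && echo "cluster reports HEALTH_OK"
```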

Additional Resources

For more information on installing Ceph, see the Red Hat Ceph Storage Installation Guide.

2.5.2. Powering down and rebooting the cluster using the Ceph Orchestrator

You can also use the capabilities of the Ceph Orchestrator to power down and reboot the Red Hat Ceph Storage cluster. In most cases, a single system login is enough to power off the cluster.

The Ceph Orchestrator supports several operations, such as start, stop, and restart. In some cases, you can use these commands together with systemctl to power down or reboot the cluster.

Prerequisites

A running Red Hat Ceph Storage cluster.
Root-level access to the node.

Procedure

Powering down the Red Hat Ceph Storage cluster

Stop the clients from using the Block Device images and the Ceph Object Gateway on this cluster, and stop any other clients.
Log into the Cephadm shell:
Example
```
[root@host01 ~]# cephadm shell
```
The cluster must be in a healthy state (HEALTH_OK and all PGs active+clean) before proceeding. Run ceph status on the host with the client keyrings, for example, the Ceph Monitor or OpenStack controller nodes, to ensure the cluster is healthy.
Example
```
[ceph: root@host01 /]# ceph -s
```

If you use the Ceph File System (CephFS), bring down the CephFS cluster:

Syntax

ceph fs set FS_NAME max_mds 1
ceph fs fail FS_NAME
ceph status
ceph fs set FS_NAME joinable false
ceph mds fail FS_NAME:N

Example

[ceph: root@host01 /]# ceph fs set cephfs max_mds 1
[ceph: root@host01 /]# ceph fs fail cephfs
[ceph: root@host01 /]# ceph status
[ceph: root@host01 /]# ceph fs set cephfs joinable false
[ceph: root@host01 /]# ceph mds fail cephfs:1

Set the noout, norecover, norebalance, nobackfill, nodown, and pause flags. Run the following on a node with the client keyrings, for example, the Ceph Monitor or OpenStack controller node:

Example

[ceph: root@host01 /]# ceph osd set noout
[ceph: root@host01 /]# ceph osd set norecover
[ceph: root@host01 /]# ceph osd set norebalance
[ceph: root@host01 /]# ceph osd set nobackfill
[ceph: root@host01 /]# ceph osd set nodown
[ceph: root@host01 /]# ceph osd set pause

Stop the MDS service.
1. Fetch the MDS service name:
  Example
```
[ceph: root@host01 /]# ceph orch ls --service-type mds
```
2. Stop the MDS service using the name fetched in the previous step:
  Syntax
```
ceph orch stop SERVICE-NAME
```
Stop the Ceph Object Gateway services. Repeat for each deployed service.
1. Fetch the Ceph Object Gateway service names:
  Example
```
[ceph: root@host01 /]# ceph orch ls --service-type rgw
```
2. Stop the Ceph Object Gateway service using the fetched name:
  Syntax
```
ceph orch stop SERVICE-NAME
```

Stop the Alertmanager service:

Example

[ceph: root@host01 /]# ceph orch stop alertmanager

Stop the node-exporter service, which is a part of the monitoring stack:
Example
```
[ceph: root@host01 /]# ceph orch stop node-exporter
```

Stop the Prometheus service:

Example

[ceph: root@host01 /]# ceph orch stop prometheus

Stop the Grafana dashboard service:

Example

[ceph: root@host01 /]# ceph orch stop grafana

Stop the crash service:

Example

[ceph: root@host01 /]# ceph orch stop crash

Shut down the OSD nodes from the cephadm node, one by one. Repeat this step for all the OSDs in the cluster.
1. Fetch the OSD ID:
  Example
```
[ceph: root@host01 /]# ceph orch ps --daemon-type=osd
```
2. Shut down the OSD node using the OSD ID you fetched:
  Example
```
[ceph: root@host01 /]# ceph orch daemon stop osd.1
Scheduled to stop osd.1 on host 'host02'
```
Stop the monitors one by one.
1. Identify the hosts hosting the monitors:
  Example
```
[ceph: root@host01 /]# ceph orch ps --daemon-type mon
```
2. On each host, stop the monitor.
  1. Identify the systemctl unit name:
    Example
    
    [ceph: root@host01 /]# systemctl list-units ceph-* | grep mon
  2. Stop the service:
    Syntax
    
    systemctl stop SERVICE-NAME
Shut down all the hosts.
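
The monitoring and crash services in the power-down steps above are each stopped with the same `ceph orch stop` command. The following is a rough sketch that prints those commands in the order the steps list them. The names `SERVICES_TO_STOP` and `orch_stop_commands` are hypothetical. The MDS, Ceph Object Gateway, OSD, and monitor steps are left out because their service names must be fetched from the cluster first.

```shell
# Service names copied from the steps above, in the order they appear.
SERVICES_TO_STOP="alertmanager node-exporter prometheus grafana crash"

# Hypothetical helper: print one `ceph orch stop` command per service.
# It only prints; review the list before running anything.
orch_stop_commands() {
    for svc in $SERVICES_TO_STOP; do
        printf 'ceph orch stop %s\n' "$svc"
    done
}

# Usage: print the commands, then run them one by one in the Cephadm shell.
#   orch_stop_commands
```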

Rebooting the Red Hat Ceph Storage cluster

If network equipment was involved, ensure it is powered ON and stable prior to powering ON any Ceph hosts or nodes.
Power ON all the Ceph hosts.
Log into the Cephadm shell from the administration node:
Example
```
[root@host01 ~]# cephadm shell
```
Verify all the services are in running state:
Example
```
[ceph: root@host01 /]# ceph orch ls
```
Ensure the cluster health status is `HEALTH_OK`:
Example
```
[ceph: root@host01 /]# ceph -s
```

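Unset the noout, norecover, norebalance, nobackfill, nodown and pause flags. Run the following on a node with the client keyrings, for example, the Ceph Monitor or OpenStack controller node:

Example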

[ceph: root@host01 /]# ceph osd unset noout
[ceph: root@host01 /]# ceph osd unset norecover
[ceph: root@host01 /]# ceph osd unset norebalance
[ceph: root@host01 /]# ceph osd unset nobackfill
[ceph: root@host01 /]# ceph osd unset nodown
[ceph: root@host01 /]# ceph osd unset pause

If you use the Ceph File System (CephFS), bring the CephFS cluster back up by setting the joinable flag to true:
Syntax
```
ceph fs set FS_NAME joinable true
```
Example
```
[ceph: root@host01 /]# ceph fs set cephfs joinable true
```

Verification

Verify the cluster is in a healthy state (HEALTH_OK and all PGs active+clean). Run ceph status on a node with the client keyrings, for example, the Ceph Monitor or OpenStack controller nodes, to ensure the cluster is healthy.

Example

[ceph: root@host01 /]# ceph -s

Additional Resources

For more information on installing Ceph, see the Red Hat Ceph Storage Installation Guide.
