Chapter 2. Understanding process management for Ceph


As a storage administrator, you can manipulate the various Ceph daemons by type or instance in a Red Hat Ceph Storage cluster. Manipulating these daemons allows you to start, stop, and restart all of the Ceph services as needed.

2.1. Ceph process management

In Red Hat Ceph Storage, all process management is done through the systemd service. Each time you want to start, stop, or restart a Ceph daemon, you must specify the daemon type or the daemon instance.
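
For example, a daemon instance is addressed through a systemd unit that follows the pattern ceph-FSID@DAEMON_TYPE.ID.service. The following is a minimal sketch that reuses the cluster FSID and OSD ID appearing in the examples later in this chapter; substitute the values from your own cluster:

Example

[root@host01 ~]# systemctl restart ceph-499829b4-832f-11eb-8d6d-001a4a000635@osd.8.service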

2.2. Starting, stopping, and restarting all Ceph daemons

You can start, stop, and restart all Ceph daemons as the root user from the host where you want to manage the daemons.

Prerequisites

  • A running Red Hat Ceph Storage cluster.
  • Root-level access to the node.

Procedure

  1. On the host where you want to start, stop, and restart the daemons, run the systemctl command to get the SERVICE_ID of the service.

    Example

    [root@host01 ~]# systemctl --type=service
    ceph-499829b4-832f-11eb-8d6d-001a4a000635@mon.host01.service
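
    Because systemctl --type=service lists every service on the host, you can narrow the output to Ceph units with a shell-style pattern. A minimal sketch:

    Example

    [root@host01 ~]# systemctl --type=service "ceph*"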

  2. Starting all Ceph daemons:

    Syntax

    systemctl start SERVICE_ID

    Example

    [root@host01 ~]# systemctl start ceph-499829b4-832f-11eb-8d6d-001a4a000635@mon.host01.service

  3. Stopping all Ceph daemons:

    Syntax

    systemctl stop SERVICE_ID

    Example

    [root@host01 ~]# systemctl stop ceph-499829b4-832f-11eb-8d6d-001a4a000635@mon.host01.service

  4. Restarting all Ceph daemons:

    Syntax

    systemctl restart SERVICE_ID

    Example

    [root@host01 ~]# systemctl restart ceph-499829b4-832f-11eb-8d6d-001a4a000635@mon.host01.service

2.3. Starting, stopping, and restarting all Ceph services

Ceph services are logical groups of Ceph daemons of the same type, configured to run in the same Red Hat Ceph Storage cluster. The orchestration layer in Ceph allows the user to manage these services in a centralized way, making it easy to execute operations that affect all the Ceph daemons that belong to the same logical service. The Ceph daemons running on each host are managed through the systemd service. You can start, stop, and restart all Ceph services from the host where you want to manage the Ceph services.
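
As a quick illustration of the two levels of control, the following sketch restarts a whole logical service cluster-wide through the orchestrator, and then restarts a single daemon of a service on one host through systemd. The service name, FSID, and daemon ID are illustrative, taken from the examples in this chapter:

Example

[ceph: root@host01 /]# ceph orch restart osd.all-available-devices
[root@host01 ~]# systemctl restart ceph-499829b4-832f-11eb-8d6d-001a4a000635@osd.8.service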

Important

If you want to start, stop, or restart a specific Ceph daemon on a specific host, use the systemd service. To obtain a list of the systemd services running on a specific host, connect to the host, and run the following command:

Example

[root@host01 ~]# systemctl list-units "ceph*"

The output gives you a list of the service names that you can use to manage each Ceph daemon.

Prerequisites

  • A running Red Hat Ceph Storage cluster.
  • Root-level access to the node.

Procedure

  1. Log into the Cephadm shell:

    Example

    [root@host01 ~]# cephadm shell

  2. Run the ceph orch ls command to get a list of Ceph services configured in the Red Hat Ceph Storage cluster and to get the specific service ID.

    Example

    [ceph: root@host01 /]# ceph orch ls
    NAME                       RUNNING  REFRESHED  AGE  PLACEMENT  IMAGE NAME                                                       IMAGE ID
    alertmanager                   1/1  4m ago     4M   count:1    registry.redhat.io/openshift4/ose-prometheus-alertmanager:v4.5   b7bae610cd46
    crash                          3/3  4m ago     4M   *          registry.redhat.io/rhceph-alpha/rhceph-6-rhel9:latest            c88a5d60f510
    grafana                        1/1  4m ago     4M   count:1    registry.redhat.io/rhceph-alpha/rhceph-6-dashboard-rhel9:latest  bd3d7748747b
    mgr                            2/2  4m ago     4M   count:2    registry.redhat.io/rhceph-alpha/rhceph-6-rhel9:latest            c88a5d60f510
    mon                            2/2  4m ago     10w  count:2    registry.redhat.io/rhceph-alpha/rhceph-6-rhel9:latest            c88a5d60f510
    nfs.foo                        0/1  -          -    count:1    <unknown>                                                        <unknown>
    node-exporter                  1/3  4m ago     4M   *          registry.redhat.io/openshift4/ose-prometheus-node-exporter:v4.5  mix
    osd.all-available-devices      5/5  4m ago     3M   *          registry.redhat.io/rhceph-alpha/rhceph-6-rhel9:latest            c88a5d60f510
    prometheus                     1/1  4m ago     4M   count:1    registry.redhat.io/openshift4/ose-prometheus:v4.6                bebb0ddef7f0
    rgw.test_realm.test_zone       2/2  4m ago     3M   count:2    registry.redhat.io/rhceph-alpha/rhceph-6-rhel9:latest            c88a5d60f510
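
    To narrow the listing to a single service type, you can filter the output. A minimal sketch using the --service-type option, which also appears later in this chapter:

    Example

    [ceph: root@host01 /]# ceph orch ls --service-type mon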

  3. To start a specific service, run the following command:

    Syntax

    ceph orch start SERVICE_ID

    Example

    [ceph: root@host01 /]# ceph orch start node-exporter

  4. To stop a specific service, run the following command:

    Important

    Running the ceph orch stop SERVICE_ID command on the MON and MGR services makes the Red Hat Ceph Storage cluster inaccessible. To stop a specific daemon on a host, use the systemctl stop SERVICE_ID command instead.

    Syntax

    ceph orch stop SERVICE_ID

    Example

    [ceph: root@host01 /]# ceph orch stop node-exporter

    In the example, the ceph orch stop node-exporter command stops all the daemons of the node-exporter service.
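
    Following the Important note above, to stop only a single daemon of a service, use systemd from the host that runs the daemon. A minimal sketch; the FSID and daemon name are illustrative:

    Example

    [root@host01 ~]# systemctl stop ceph-499829b4-832f-11eb-8d6d-001a4a000635@mon.host01.service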

  5. To restart a specific service, run the following command:

    Syntax

    ceph orch restart SERVICE_ID

    Example

    [ceph: root@host01 /]# ceph orch restart node-exporter

2.4. Viewing log files of Ceph daemons that run in containers

Use the journald daemon from the container host to view the log files of a Ceph daemon that runs in a container.

Prerequisites

  • Installation of the Red Hat Ceph Storage software.
  • Root-level access to the node.

Procedure

  1. To view the entire Ceph log file, run the journalctl command as root in the following format:

    Syntax

    journalctl -u SERVICE_ID

    Example

    [root@host01 ~]# journalctl -u ceph-499829b4-832f-11eb-8d6d-001a4a000635@osd.8.service

    In the above example, you can view the entire log for the OSD with ID osd.8.

  2. To show the recent journal entries and follow new entries as they are written, use the -f option.

    Syntax

    journalctl -fu SERVICE_ID

    Example

    [root@host01 ~]# journalctl -fu ceph-499829b4-832f-11eb-8d6d-001a4a000635@osd.8.service
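
    You can also restrict the output to a time window with the --since and --until options of journalctl. A minimal sketch, assuming the same OSD service as above:

    Example

    [root@host01 ~]# journalctl -u ceph-499829b4-832f-11eb-8d6d-001a4a000635@osd.8.service --since "1 hour ago"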

Note

You can also use the sosreport utility to view the journald logs. For more details about SOS reports, see the What is an sosreport and how to create one in Red Hat Enterprise Linux? solution on the Red Hat Customer Portal.

Additional Resources

  • The journalctl manual page.

2.5. Powering down and rebooting the Red Hat Ceph Storage cluster

You can power down and reboot the Red Hat Ceph Storage cluster using two different approaches: systemctl commands and the Ceph Orchestrator. You can choose either approach to power down and reboot the cluster.

Note

When powering down or rebooting a Red Hat Ceph Storage cluster that uses Ceph Object Gateway multi-site, ensure that no I/O is in progress. Also, power the sites off and on one at a time.

2.5.1. Powering down and rebooting the cluster using the systemctl commands

You can use the systemctl commands approach to power down and reboot the Red Hat Ceph Storage cluster. This approach follows the Linux way of stopping the services.

Prerequisites

  • A running Red Hat Ceph Storage cluster.
  • Root-level access.

Procedure

Powering down the Red Hat Ceph Storage cluster

  1. Stop the clients from using the Block Device images and the RADOS Gateway (Ceph Object Gateway) on this cluster and any other clients.
  2. Log into the Cephadm shell:

    Example

    [root@host01 ~]# cephadm shell

  3. The cluster must be in a healthy state (HEALTH_OK and all PGs active+clean) before proceeding. Run ceph status on a host with the client keyrings, for example, the Ceph Monitor or OpenStack controller nodes, to ensure the cluster is healthy.

    Example

    [ceph: root@host01 /]# ceph -s

  4. If you use the Ceph File System (CephFS), bring down the CephFS cluster:

    Syntax

    ceph fs set FS_NAME max_mds 1
    ceph fs fail FS_NAME
    ceph status
    ceph fs set FS_NAME joinable false

    Example

    [ceph: root@host01 /]# ceph fs set cephfs max_mds 1
    [ceph: root@host01 /]# ceph fs fail cephfs
    [ceph: root@host01 /]# ceph status
    [ceph: root@host01 /]# ceph fs set cephfs joinable false
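
    To confirm that the file system is down before proceeding, you can check its status. A minimal check, assuming the file system is named cephfs:

    Example

    [ceph: root@host01 /]# ceph fs status cephfs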

  5. Set the noout, norecover, norebalance, nobackfill, nodown, and pause flags. Run the following on a node with the client keyrings, for example, the Ceph Monitor or OpenStack controller node:

    Example

    [ceph: root@host01 /]# ceph osd set noout
    [ceph: root@host01 /]# ceph osd set norecover
    [ceph: root@host01 /]# ceph osd set norebalance
    [ceph: root@host01 /]# ceph osd set nobackfill
    [ceph: root@host01 /]# ceph osd set nodown
    [ceph: root@host01 /]# ceph osd set pause

    Important

    The above example stops only the services and each OSD on a single OSD node; repeat these steps on each OSD node.
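
    To verify that all six flags are set before you continue, you can inspect the flags line of the OSD map. A minimal check:

    Example

    [ceph: root@host01 /]# ceph osd dump | grep flags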

  6. If the MDS and Ceph Object Gateway nodes are on their own dedicated nodes, power them off.
  7. Get the systemd target of the daemons:

    Example

    [root@host01 ~]# systemctl list-units --type target | grep ceph
    ceph-0b007564-ec48-11ee-b736-525400fd02f8.target loaded active active Ceph cluster 0b007564-ec48-11ee-b736-525400fd02f8
    ceph.target                                      loaded active active All Ceph clusters and services

  8. Disable the target that includes the cluster FSID:

    Example

    [root@host01 ~]# systemctl disable ceph-0b007564-ec48-11ee-b736-525400fd02f8.target
    
    Removed "/etc/systemd/system/multi-user.target.wants/ceph-0b007564-ec48-11ee-b736-525400fd02f8.target".
    Removed "/etc/systemd/system/ceph.target.wants/ceph-0b007564-ec48-11ee-b736-525400fd02f8.target".

  9. Stop the target:

    Example

    [root@host01 ~]# systemctl stop ceph-0b007564-ec48-11ee-b736-525400fd02f8.target

    This stops all the daemons on the host that need to be stopped.
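
    To confirm that no Ceph daemons remain active on the host, you can list the remaining units. A minimal check:

    Example

    [root@host01 ~]# systemctl list-units "ceph*"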

  10. Shut down the node:

    Example

    [root@host01 ~]# shutdown
    Shutdown scheduled for Wed 2024-03-27 11:47:19 EDT, use 'shutdown -c' to cancel.

  11. Repeat the above steps for all the nodes of the cluster.

Rebooting the Red Hat Ceph Storage cluster

  1. If network equipment was involved, ensure it is powered ON and stable prior to powering ON any Ceph hosts or nodes.
  2. Power ON the administration node.
  3. Enable the systemd target to get all the daemons running:

    Example

    [root@host01 ~]# systemctl enable ceph-0b007564-ec48-11ee-b736-525400fd02f8.target
    Created symlink /etc/systemd/system/multi-user.target.wants/ceph-0b007564-ec48-11ee-b736-525400fd02f8.target → /etc/systemd/system/ceph-0b007564-ec48-11ee-b736-525400fd02f8.target.
    Created symlink /etc/systemd/system/ceph.target.wants/ceph-0b007564-ec48-11ee-b736-525400fd02f8.target → /etc/systemd/system/ceph-0b007564-ec48-11ee-b736-525400fd02f8.target.

  4. Start the systemd target:

    Example

    [root@host01 ~]# systemctl start ceph-0b007564-ec48-11ee-b736-525400fd02f8.target

  5. Wait for all the nodes to come up. Verify that all the services are up and that there are no connectivity issues between the nodes, as shown in the check below.
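
    One way to verify the services from the Cephadm shell, once the nodes are reachable again, is to list the daemons and their status. A minimal check:

    Example

    [ceph: root@host01 /]# ceph orch ps
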
  6. Unset the noout, norecover, norebalance, nobackfill, nodown and pause flags. Run the following on a node with the client keyrings, for example, the Ceph Monitor or OpenStack controller node:

    Example

    [ceph: root@host01 /]# ceph osd unset noout
    [ceph: root@host01 /]# ceph osd unset norecover
    [ceph: root@host01 /]# ceph osd unset norebalance
    [ceph: root@host01 /]# ceph osd unset nobackfill
    [ceph: root@host01 /]# ceph osd unset nodown
    [ceph: root@host01 /]# ceph osd unset pause

  7. If you use the Ceph File System (CephFS), bring the CephFS cluster back up by setting the joinable flag to true:

    Syntax

    ceph fs set FS_NAME joinable true

    Example

    [ceph: root@host01 /]# ceph fs set cephfs joinable true

Verification

  • Verify the cluster is in a healthy state (HEALTH_OK and all PGs active+clean). Run ceph status on a node with the client keyrings, for example, the Ceph Monitor or OpenStack controller nodes, to ensure the cluster is healthy.

Example

[ceph: root@host01 /]# ceph -s

2.5.2. Powering down and rebooting the cluster using the Ceph Orchestrator

You can also use the capabilities of the Ceph Orchestrator to power down and reboot the Red Hat Ceph Storage cluster. In most cases, the cluster can be powered down from a single system login.

The Ceph Orchestrator supports several operations, such as start, stop, and restart. In some cases, you use these commands together with systemctl commands when powering down or rebooting the cluster.

Prerequisites

  • A running Red Hat Ceph Storage cluster.
  • Root-level access to the node.

Procedure

Powering down the Red Hat Ceph Storage cluster

  1. Stop the clients from using the Block Device images and the Ceph Object Gateway on this cluster and any other clients.
  2. Log into the Cephadm shell:

    Example

    [root@host01 ~]# cephadm shell

  3. The cluster must be in a healthy state (HEALTH_OK and all PGs active+clean) before proceeding. Run ceph status on a host with the client keyrings, for example, the Ceph Monitor or OpenStack controller nodes, to ensure the cluster is healthy.

    Example

    [ceph: root@host01 /]# ceph -s

  4. If you use the Ceph File System (CephFS), bring down the CephFS cluster:

    Syntax

    ceph fs set FS_NAME max_mds 1
    ceph fs fail FS_NAME
    ceph status
    ceph fs set FS_NAME joinable false
    ceph mds fail FS_NAME:N

    Example

    [ceph: root@host01 /]# ceph fs set cephfs max_mds 1
    [ceph: root@host01 /]# ceph fs fail cephfs
    [ceph: root@host01 /]# ceph status
    [ceph: root@host01 /]# ceph fs set cephfs joinable false
    [ceph: root@host01 /]# ceph mds fail cephfs:1

  5. Set the noout, norecover, norebalance, nobackfill, nodown, and pause flags. Run the following on a node with the client keyrings, for example, the Ceph Monitor or OpenStack controller node:

    Example

    [ceph: root@host01 /]# ceph osd set noout
    [ceph: root@host01 /]# ceph osd set norecover
    [ceph: root@host01 /]# ceph osd set norebalance
    [ceph: root@host01 /]# ceph osd set nobackfill
    [ceph: root@host01 /]# ceph osd set nodown
    [ceph: root@host01 /]# ceph osd set pause

  6. Stop the MDS service.

    1. Fetch the MDS service name:

      Example

      [ceph: root@host01 /]# ceph orch ls --service-type mds

    2. Stop the MDS service using the fetched name in the previous step:

      Syntax

      ceph orch stop SERVICE_NAME

  7. Stop the Ceph Object Gateway services. Repeat for each deployed service.

    1. Fetch the Ceph Object Gateway service names:

      Example

      [ceph: root@host01 /]# ceph orch ls --service-type rgw

    2. Stop the Ceph Object Gateway service using the fetched name:

      Syntax

      ceph orch stop SERVICE_NAME

  8. Stop the Alertmanager service:

    Example

    [ceph: root@host01 /]# ceph orch stop alertmanager

  9. Stop the node-exporter service, which is part of the monitoring stack:

    Example

    [ceph: root@host01 /]# ceph orch stop node-exporter

  10. Stop the Prometheus service:

    Example

    [ceph: root@host01 /]# ceph orch stop prometheus

  11. Stop the Grafana dashboard service:

    Example

    [ceph: root@host01 /]# ceph orch stop grafana

  12. Stop the crash service:

    Example

    [ceph: root@host01 /]# ceph orch stop crash

  13. Stop the OSD daemons one by one from the Cephadm shell. Repeat this step for all the OSDs in the cluster; a scripted alternative is shown after this step.

    1. Fetch the OSD ID:

      Example

      [ceph: root@host01 /]# ceph orch ps --daemon-type=osd

    2. Stop the OSD daemon using the OSD ID you fetched:

      Example

      [ceph: root@host01 /]# ceph orch daemon stop osd.1
      Scheduled to stop osd.1 on host 'host02'
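
    If the cluster has many OSDs, you can script this step. A minimal sketch, assuming that every OSD ID reported by ceph osd ls is to be stopped:

    Example

    [ceph: root@host01 /]# for id in $(ceph osd ls); do ceph orch daemon stop osd.$id; done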

  14. Stop the monitors one by one.

    1. Identify the hosts hosting the monitors:

      Example

      [ceph: root@host01 /]# ceph orch ps --daemon-type mon

    2. On each host, stop the monitor.

      1. Identify the systemd unit name:

        Example

        [ceph: root@host01 /]# systemctl list-units "ceph-*" | grep mon

      2. Stop the service:

        Syntax

        systemctl stop SERVICE_NAME

  15. Shut down all the hosts.

Rebooting the Red Hat Ceph Storage cluster

  1. If network equipment was involved, ensure it is powered ON and stable prior to powering ON any Ceph hosts or nodes.
  2. Power ON all the Ceph hosts.
  3. On the administration node, log into the Cephadm shell:

    Example

    [root@host01 ~]# cephadm shell

  4. Verify that all the services are in a running state:

    Example

    [ceph: root@host01 /]# ceph orch ls

  5. Ensure the cluster health status is HEALTH_OK:

    Example

    [ceph: root@host01 /]# ceph -s

  6. Unset the noout, norecover, norebalance, nobackfill, nodown and pause flags. Run the following on a node with the client keyrings, for example, the Ceph Monitor or OpenStack controller node:

    Example

    [ceph: root@host01 /]# ceph osd unset noout
    [ceph: root@host01 /]# ceph osd unset norecover
    [ceph: root@host01 /]# ceph osd unset norebalance
    [ceph: root@host01 /]# ceph osd unset nobackfill
    [ceph: root@host01 /]# ceph osd unset nodown
    [ceph: root@host01 /]# ceph osd unset pause

  7. If you use the Ceph File System (CephFS), bring the CephFS cluster back up by setting the joinable flag to true:

    Syntax

    ceph fs set FS_NAME joinable true

    Example

    [ceph: root@host01 /]# ceph fs set cephfs joinable true

Verification

  • Verify the cluster is in a healthy state (HEALTH_OK and all PGs active+clean). Run ceph status on a node with the client keyrings, for example, the Ceph Monitor or OpenStack controller nodes, to ensure the cluster is healthy.

Example

[ceph: root@host01 /]# ceph -s
