Chapter 7. Migrating the Red Hat Ceph Storage cluster
In the context of data plane adoption, where the Red Hat OpenStack Platform (RHOSP) services are redeployed in Red Hat OpenShift Container Platform (RHOCP), you migrate a director-deployed Red Hat Ceph Storage cluster by using a process called “externalizing” the Red Hat Ceph Storage cluster.
There are two deployment topologies that include an internal Red Hat Ceph Storage cluster:
- RHOSP includes dedicated Red Hat Ceph Storage nodes to host object storage daemons (OSDs)
- Hyperconverged Infrastructure (HCI), where Compute and Storage services are colocated on hyperconverged nodes
In either scenario, there are some Red Hat Ceph Storage processes that are deployed on RHOSP Controller nodes: Red Hat Ceph Storage monitors, Ceph Object Gateway (RGW), RADOS Block Device (RBD), Ceph Metadata Server (MDS), Ceph Dashboard, and NFS Ganesha. To migrate your Red Hat Ceph Storage cluster, you must decommission the Controller nodes and move the Red Hat Ceph Storage daemons to a set of target nodes that are already part of the Red Hat Ceph Storage cluster.
7.1. Prerequisites
- Before you begin the migration, complete the tasks in your Red Hat OpenStack Platform 17.1 environment. For more information, see "Red Hat Ceph Storage prerequisites" in the Adoption overview chapter.
7.2. Red Hat Ceph Storage migration for Distributed Compute Node deployments
Before you adopt your Distributed Compute Node (DCN) deployments that host Red Hat Ceph Storage clusters on Compute nodes at edge sites so that your architecture runs on Red Hat OpenStack Services on OpenShift (RHOSO), be aware of important considerations.
- Supported edge storage topologies
DCN deployments support the following storage topologies at edge sites:
- Hyperconverged Infrastructure (HCI): Red Hat Ceph Storage daemons run on Compute nodes at each edge site.
- director-deployed dedicated storage: Red Hat Ceph Storage runs on separate storage nodes deployed by director.
- External Red Hat Ceph Storage cluster: Edge sites connect to pre-existing Red Hat Ceph Storage clusters not managed by director.
- Central site Red Hat Ceph Storage migration
- For the central site, migrate Red Hat Ceph Storage daemons from the RHOSP Controller nodes by using the same process as a non-DCN deployment. For more information, see Red Hat Ceph Storage daemon cardinality.
- Edge site Red Hat Ceph Storage migration
For edge sites that use HCI or director-deployed dedicated storage, the Red Hat Ceph Storage daemons can continue to run on their current nodes without migration. The Compute nodes or dedicated storage nodes at edge sites are not decommissioned during adoption, so the Red Hat Ceph Storage daemons remain operational.
For edge sites that use external Red Hat Ceph Storage clusters, no migration is required because the Red Hat Ceph Storage cluster is not managed by director.
- Red Hat Ceph Storage back-end configuration and key distribution
In a DCN deployment, each site has its own Red Hat Ceph Storage cluster with its own configuration file and Red Hat Ceph Storage keyring. These must be stored in Kubernetes secrets and mounted into the appropriate Red Hat OpenStack Services on OpenShift (RHOSO) service pods.
Rather than storing all Red Hat Ceph Storage keys in a single secret accessible to every pod, the recommended approach is to create one secret per site containing only the keys that site actually needs. This limits the security impact if a site is compromised: a pod at an edge site can authenticate only to its local Red Hat Ceph Storage cluster and the central cluster, not to the Red Hat Ceph Storage keyrings of other edge sites.
The key distribution rule for N sites is:
- The central site (site 0) receives the Red Hat Ceph Storage keys and configuration for all clusters, because central services such as the Image service (glance) must be able to copy images to and from any site.
- Each edge site (site 1 through N) receives only the keys for the central cluster and its own local cluster.
For example, in a three-site deployment with a central site and two edge sites:
```
ceph-conf-central -> central.conf + central.keyring
                     dcn1.conf    + dcn1.keyring
                     dcn2.conf    + dcn2.keyring
ceph-conf-dcn1    -> central.conf + central.keyring
                     dcn1.conf    + dcn1.keyring
ceph-conf-dcn2    -> central.conf + central.keyring
                     dcn2.conf    + dcn2.keyring
```

The per-site secrets are created and then mounted into the appropriate pods by using `extraMounts` propagation labels. The procedure in Configuring a Red Hat Ceph Storage back end covers both creating the secrets and applying the propagation labels so that each pod receives only its site-specific keys.
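For example, the `ceph-conf-dcn1` secret from the mapping above could be created as follows. This is a minimal sketch: the local file names and the `openstack` namespace are assumptions, and the authoritative steps are in Configuring a Red Hat Ceph Storage back end.

```bash
# Sketch: bundle the central and dcn1 configuration files and keyrings
# into the per-site secret for the first edge site.
# File names and the target namespace are assumptions for illustration.
oc create secret generic ceph-conf-dcn1 \
  --namespace openstack \
  --from-file=central.conf=./central.conf \
  --from-file=central.keyring=./central.keyring \
  --from-file=dcn1.conf=./dcn1.conf \
  --from-file=dcn1.keyring=./dcn1.keyring
```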
7.3. Red Hat Ceph Storage daemon cardinality
Red Hat Ceph Storage 7 and later applies strict constraints on the way daemons can be colocated within the same node. For more information, see the Red Hat Knowledgebase article Red Hat Ceph Storage: Supported configurations. Your topology depends on the available hardware and the number of Red Hat Ceph Storage services on the Controller nodes that you retire. The number of services that you can migrate depends on the number of available nodes in the cluster. The following diagrams show the distribution of Red Hat Ceph Storage daemons on Red Hat Ceph Storage nodes; at least 3 nodes are required.
The following scenario includes only RGW and RBD, without the Red Hat Ceph Storage dashboard:
| | | |
|-----|---------------------|-------------|
| osd | mon/mgr/crash | rgw/ingress |
| osd | mon/mgr/crash | rgw/ingress |
| osd | mon/mgr/crash | rgw/ingress |

With the Red Hat Ceph Storage dashboard, but without the Shared File Systems service (manila), at least 4 nodes are required. The Red Hat Ceph Storage dashboard has no failover:

| | | |
|-----|---------------------|-------------------|
| osd | mon/mgr/crash | rgw/ingress |
| osd | mon/mgr/crash | rgw/ingress |
| osd | mon/mgr/crash | dashboard/grafana |
| osd | rgw/ingress | (free) |

With the Red Hat Ceph Storage dashboard and the Shared File Systems service, a minimum of 5 nodes are required, and the Red Hat Ceph Storage dashboard has no failover:

| | | |
|-----|---------------------|-------------------------|
| osd | mon/mgr/crash | rgw/ingress |
| osd | mon/mgr/crash | rgw/ingress |
| osd | mon/mgr/crash | mds/ganesha/ingress |
| osd | rgw/ingress | mds/ganesha/ingress |
| osd | mds/ganesha/ingress | dashboard/grafana |
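To compare the actual daemon layout of a node against these diagrams, you can list the daemons that the orchestrator schedules on a single host; a quick sketch, where `cephstorage-0` is a placeholder hostname:

```bash
# Sketch: show every daemon placed on one host to verify colocation.
sudo cephadm shell -- ceph orch ps cephstorage-0
```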
7.4. Migrating the monitoring stack component to new nodes within an existing Red Hat Ceph Storage cluster
The Red Hat Ceph Storage Dashboard module adds web-based monitoring and administration to the Ceph Manager. With director-deployed Red Hat Ceph Storage, the Red Hat Ceph Storage Dashboard is enabled as part of the overcloud deploy and is composed of the following components:
- Ceph Manager module
- Grafana
- Prometheus
- Alertmanager
- Node exporter
The Red Hat Ceph Storage Dashboard containers are included through `tripleo-container-image-prepare` parameters, and high availability (HA) relies on HAProxy and Pacemaker being deployed in the Red Hat OpenStack Platform (RHOSP) environment. For an external Red Hat Ceph Storage cluster, HA is not supported.
7.4.1. Prerequisites
You migrate and relocate the Ceph Monitoring components to free Controller nodes. Before you begin the migration, complete the tasks in your Red Hat OpenStack Platform 17.1 environment. For more information, see "Red Hat Ceph Storage prerequisites" in the "Adoption overview" chapter.
7.4.2. Migrating the monitoring stack to the target nodes
To migrate the monitoring stack to the target nodes, you add the monitoring label to your existing nodes and update the configuration of each daemon. You do not need to migrate node exporters. These daemons are deployed across the nodes that are part of the Red Hat Ceph Storage cluster (the placement is `*`).

Depending on the target nodes and the number of deployed or active daemons, you can either relocate the existing containers to the target nodes, or select a subset of nodes that host the monitoring stack daemons. High availability (HA) is not supported. Reducing the placement with `count: 1` allows you to migrate the existing daemons in a Hyperconverged Infrastructure, or hardware-limited, scenario without impacting other services.
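For example, a Grafana spec reduced to a single instance might look like the following sketch, which mirrors the spec format used later in this procedure (the network is a placeholder):

```
service_type: grafana
service_name: grafana
placement:
  count: 1
networks:
- 172.17.3.0/24
spec:
  port: 3100
```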
7.4.2.1. Migrating the existing daemons to the target nodes
The following procedure is an example of an environment with 3 Red Hat Ceph Storage nodes or ComputeHCI nodes. This scenario extends the monitoring labels to all the Red Hat Ceph Storage or ComputeHCI nodes that are part of the cluster. This means that you keep 3 placements for the target nodes.
Prerequisites
- Confirm that the firewall rules are in place and the ports are open for a given monitoring stack service.
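As a quick check on each target node, you can confirm that INPUT rules exist for the monitoring ports; a sketch, assuming the ports used later in this procedure (Grafana 3100, Prometheus 9092, Alertmanager 9093/9094):

```bash
# Sketch: list firewall rules that mention the monitoring stack ports.
sudo iptables -L INPUT -n | grep -E '3100|9092|9093|9094'
```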
Procedure
Add the monitoring label to all the Red Hat Ceph Storage or ComputeHCI nodes in the cluster:
```
for item in $(sudo cephadm shell -- ceph orch host ls --format json | jq -r '.[].hostname'); do
    sudo cephadm shell -- ceph orch host label add $item monitoring;
done
```

Verify that all the hosts on the target nodes have the monitoring label:
```
[tripleo-admin@controller-0 ~]$ sudo cephadm shell -- ceph orch host ls

HOST                        ADDR           LABELS
cephstorage-0.redhat.local  192.168.24.11  osd monitoring
cephstorage-1.redhat.local  192.168.24.12  osd monitoring
cephstorage-2.redhat.local  192.168.24.47  osd monitoring
controller-0.redhat.local   192.168.24.35  _admin mon mgr monitoring
controller-1.redhat.local   192.168.24.53  mon _admin mgr monitoring
controller-2.redhat.local   192.168.24.10  mon _admin mgr monitoring
```

Remove the labels from the Controller nodes:
```
$ for i in 0 1 2; do sudo cephadm shell -- ceph orch host label rm "controller-$i.redhat.local" monitoring; done

Removed label monitoring from host controller-0.redhat.local
Removed label monitoring from host controller-1.redhat.local
Removed label monitoring from host controller-2.redhat.local
```

Dump the current monitoring stack spec:
```
function export_spec {
    local component="$1"
    local target_dir="$2"
    sudo cephadm shell -- ceph orch ls --export "$component" > "$target_dir/$component"
}

SPEC_DIR=${SPEC_DIR:-"$PWD/ceph_specs"}
mkdir -p ${SPEC_DIR}
for m in grafana prometheus alertmanager; do
    export_spec "$m" "$SPEC_DIR"
done
```

For each daemon, edit the current spec and replace the `placement.hosts:` section with the `placement.label:` section, for example:

```
service_type: grafana
service_name: grafana
placement:
  label: monitoring
networks:
- 172.17.3.0/24
spec:
  port: 3100
```

This step also applies to the Prometheus and Alertmanager specs.
Apply the new monitoring spec to relocate the monitoring stack daemons:
```
SPEC_DIR=${SPEC_DIR:-"$PWD/ceph_specs"}

function migrate_daemon {
    local component="$1"
    local target_dir="$2"
    sudo cephadm shell -m "$target_dir" -- ceph orch apply -i /mnt/ceph_specs/$component
}

for m in grafana prometheus alertmanager; do
    migrate_daemon "$m" "$SPEC_DIR"
done
```

Verify that the daemons are deployed on the expected nodes:
```
[ceph: root@controller-0 /]# ceph orch ps | grep -iE "(prome|alert|grafa)"
alertmanager.cephstorage-2  cephstorage-2.redhat.local  172.17.3.144:9093,9094
grafana.cephstorage-0       cephstorage-0.redhat.local  172.17.3.83:3100
prometheus.cephstorage-1    cephstorage-1.redhat.local  172.17.3.53:9092
```

Note: After you migrate the monitoring stack, you lose high availability. The monitoring stack daemons no longer have a virtual IP address and HAProxy. Node exporters are still running on all the nodes.
Review the Red Hat Ceph Storage configuration to ensure that it aligns with the configuration on the target nodes. In particular, focus on the following configuration entries:
```
[ceph: root@controller-0 /]# ceph config dump
...
mgr  advanced  mgr/dashboard/ALERTMANAGER_API_HOST  http://172.17.3.83:9093
mgr  advanced  mgr/dashboard/GRAFANA_API_URL        https://172.17.3.144:3100
mgr  advanced  mgr/dashboard/PROMETHEUS_API_HOST    http://172.17.3.83:9092
mgr  advanced  mgr/dashboard/controller-0.ycokob/server_addr  172.17.3.33
mgr  advanced  mgr/dashboard/controller-1.lmzpuc/server_addr  172.17.3.147
mgr  advanced  mgr/dashboard/controller-2.xpdgfl/server_addr  172.17.3.138
```

Verify that the `API_HOST/URL` of the `grafana`, `alertmanager` and `prometheus` services points to the IP addresses on the storage network of the node where each daemon is relocated:

```
[ceph: root@controller-0 /]# ceph orch ps | grep -iE "(prome|alert|grafa)"
alertmanager.cephstorage-0  cephstorage-0.redhat.local  172.17.3.83:9093,9094
alertmanager.cephstorage-1  cephstorage-1.redhat.local  172.17.3.53:9093,9094
alertmanager.cephstorage-2  cephstorage-2.redhat.local  172.17.3.144:9093,9094
grafana.cephstorage-0       cephstorage-0.redhat.local  172.17.3.83:3100
grafana.cephstorage-1       cephstorage-1.redhat.local  172.17.3.53:3100
grafana.cephstorage-2       cephstorage-2.redhat.local  172.17.3.144:3100
prometheus.cephstorage-0    cephstorage-0.redhat.local  172.17.3.83:9092
prometheus.cephstorage-1    cephstorage-1.redhat.local  172.17.3.53:9092
prometheus.cephstorage-2    cephstorage-2.redhat.local  172.17.3.144:9092
```

```
[ceph: root@controller-0 /]# ceph config dump
...
...
mgr  advanced  mgr/dashboard/ALERTMANAGER_API_HOST  http://172.17.3.83:9093
mgr  advanced  mgr/dashboard/PROMETHEUS_API_HOST    http://172.17.3.83:9092
mgr  advanced  mgr/dashboard/GRAFANA_API_URL        https://172.17.3.144:3100
```

Note: The Ceph Dashboard, as the service provided by the Ceph `mgr`, is not impacted by the relocation. You might experience an impact when the active `mgr` daemon is migrated or is force-failed. However, you can define 3 replicas in the Ceph Manager configuration to redirect requests to a different instance.
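A minimal sketch of keeping three Ceph Manager replicas with a count-based placement (note that the RBD migration sections later in this chapter manage the `mgr` placement with labels instead, so adapt this to your placement strategy):

```bash
# Sketch: ask the orchestrator to maintain 3 mgr instances.
sudo cephadm shell -- ceph orch apply mgr --placement=3
```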
7.5. Migrating Red Hat Ceph Storage MDS to new nodes within the existing cluster
You can migrate the MDS daemon when the Shared File Systems service (manila), deployed with either a cephfs-native or ceph-nfs back end, is part of the overcloud deployment. The MDS migration is performed by cephadm, and you move the daemon placement from a hosts-based approach to a label-based approach. This ensures that you can visualize the status of the cluster and where daemons are placed by using the `ceph orch host ls` command. You can also have a general view of how the daemons are colocated within a given host, as described in the Red Hat Knowledgebase article Red Hat Ceph Storage: Supported configurations.
Prerequisites
- Complete the tasks in your Red Hat OpenStack Platform 17.1 environment. For more information, see Red Hat Ceph Storage prerequisites.
Procedure
Verify that the Red Hat Ceph Storage cluster is healthy and check the MDS status:
```
$ sudo cephadm shell -- ceph fs ls
name: cephfs, metadata pool: manila_metadata, data pools: [manila_data ]

$ sudo cephadm shell -- ceph mds stat
cephfs:1 {0=mds.controller-2.oebubl=up:active} 2 up:standby

$ sudo cephadm shell -- ceph fs status cephfs
cephfs - 0 clients
======
RANK  STATE   MDS                      ACTIVITY     DNS    INOS   DIRS   CAPS
 0    active  mds.controller-2.oebubl  Reqs:    0 /s   696    196    173      0
      POOL         TYPE      USED  AVAIL
manila_metadata  metadata    152M   141G
  manila_data      data     3072M   141G
STANDBY MDS
mds.controller-0.anwiwd
mds.controller-1.cwzhog
```

Retrieve more detailed information on the Ceph File System (CephFS) MDS status:
```
$ sudo cephadm shell -- ceph fs dump
e8
enable_multiple, ever_enabled_multiple: 1,1
default compat: compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds uses versioned encoding,6=dirfrag is stored in omap,8=no anchor table,9=file layout v2,10=snaprealm v2}
legacy client fscid: 1

Filesystem 'cephfs' (1)
fs_name cephfs
epoch   5
flags   12 joinable allow_snaps allow_multimds_snaps
created 2024-01-18T19:04:01.633820+0000
modified        2024-01-18T19:04:05.393046+0000
tableserver     0
root    0
session_timeout 60
session_autoclose       300
max_file_size   1099511627776
required_client_features        {}
last_failure    0
last_failure_osd_epoch  0
compat  compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds uses versioned encoding,6=dirfrag is stored in omap,7=mds uses inline data,8=no anchor table,9=file layout v2,10=snaprealm v2}
max_mds 1
in      0
up      {0=24553}
failed
damaged
stopped
data_pools      [7]
metadata_pool   9
inline_data     disabled
balancer
standby_count_wanted    1
[mds.mds.controller-2.oebubl{0:24553} state up:active seq 2 addr [v2:172.17.3.114:6800/680266012,v1:172.17.3.114:6801/680266012] compat {c=[1],r=[1],i=[7ff]}]

Standby daemons:
[mds.mds.controller-0.anwiwd{-1:14715} state up:standby seq 1 addr [v2:172.17.3.20:6802/3969145800,v1:172.17.3.20:6803/3969145800] compat {c=[1],r=[1],i=[7ff]}]
[mds.mds.controller-1.cwzhog{-1:24566} state up:standby seq 1 addr [v2:172.17.3.43:6800/2227381308,v1:172.17.3.43:6801/2227381308] compat {c=[1],r=[1],i=[7ff]}]
dumped fsmap epoch 8
```

Check the OSD blocklist and clean up the client list:
```
$ sudo cephadm shell -- ceph osd blocklist ls
$ for item in $(sudo cephadm shell -- ceph osd blocklist ls | awk '{print $1}'); do
>    sudo cephadm shell -- ceph osd blocklist rm $item;
> done
```

Note: When a file system client is unresponsive or misbehaving, the access to the file system might be forcibly terminated. This process is called eviction. Evicting a CephFS client prevents it from communicating further with MDS daemons and OSD daemons.
Ordinarily, a blocklisted client cannot reconnect to the servers; you must unmount and then remount the client. However, permitting a client that was evicted to attempt to reconnect can be useful. Because CephFS uses the RADOS OSD blocklist to control client eviction, you can permit CephFS clients to reconnect by removing them from the blocklist.
Get the hosts that are currently part of the Red Hat Ceph Storage cluster:
```
[ceph: root@controller-0 /]# ceph orch host ls
HOST                        ADDR           LABELS          STATUS
cephstorage-0.redhat.local  192.168.24.25  osd
cephstorage-1.redhat.local  192.168.24.50  osd
cephstorage-2.redhat.local  192.168.24.47  osd
controller-0.redhat.local   192.168.24.24  _admin mgr mon
controller-1.redhat.local   192.168.24.42  mgr _admin mon
controller-2.redhat.local   192.168.24.37  mgr _admin mon

6 hosts in cluster
```

Apply the MDS labels to the target nodes:
```
for item in $(sudo cephadm shell -- ceph orch host ls --format json | jq -r '.[].hostname'); do
    sudo cephadm shell -- ceph orch host label add $item mds;
done
```

Verify that all the hosts have the MDS label:
```
$ sudo cephadm shell -- ceph orch host ls
HOST                        ADDR           LABELS
cephstorage-0.redhat.local  192.168.24.11  osd mds
cephstorage-1.redhat.local  192.168.24.12  osd mds
cephstorage-2.redhat.local  192.168.24.47  osd mds
controller-0.redhat.local   192.168.24.35  _admin mon mgr mds
controller-1.redhat.local   192.168.24.53  mon _admin mgr mds
controller-2.redhat.local   192.168.24.10  mon _admin mgr mds
```

Dump the current MDS spec:
```
$ SPEC_DIR=${SPEC_DIR:-"$PWD/ceph_specs"}
$ mkdir -p ${SPEC_DIR}
$ sudo cephadm shell -- ceph orch ls --export mds > ${SPEC_DIR}/mds
```

Edit the retrieved spec and replace the `placement.hosts` section with `placement.label`:

```
service_type: mds
service_id: mds
service_name: mds.mds
placement:
  label: mds
```

Use the Ceph orchestrator to apply the new MDS spec:

```
$ SPEC_DIR=${SPEC_DIR:-"$PWD/ceph_specs"}
$ sudo cephadm shell -m ${SPEC_DIR}/mds -- ceph orch apply -i /mnt/mds
Scheduling new mds deployment ...
```

This results in an increased number of MDS daemons.
Check the new standby daemons that are temporarily added to the CephFS:
```
$ sudo cephadm shell -- ceph fs dump
Active

standby_count_wanted    1
[mds.mds.controller-0.awzplm{0:463158} state up:active seq 307 join_fscid=1 addr [v2:172.17.3.20:6802/51565420,v1:172.17.3.20:6803/51565420] compat {c=[1],r=[1],i=[7ff]}]

Standby daemons:
[mds.mds.cephstorage-1.jkvomp{-1:463800} state up:standby seq 1 join_fscid=1 addr [v2:172.17.3.135:6820/2075903648,v1:172.17.3.135:6821/2075903648] compat {c=[1],r=[1],i=[7ff]}]
[mds.mds.controller-2.gfrqvc{-1:475945} state up:standby seq 1 addr [v2:172.17.3.114:6800/2452517189,v1:172.17.3.114:6801/2452517189] compat {c=[1],r=[1],i=[7ff]}]
[mds.mds.cephstorage-0.fqcshx{-1:476503} state up:standby seq 1 join_fscid=1 addr [v2:172.17.3.92:6820/4120523799,v1:172.17.3.92:6821/4120523799] compat {c=[1],r=[1],i=[7ff]}]
[mds.mds.cephstorage-2.gnfhfe{-1:499067} state up:standby seq 1 addr [v2:172.17.3.79:6820/2448613348,v1:172.17.3.79:6821/2448613348] compat {c=[1],r=[1],i=[7ff]}]
[mds.mds.controller-1.tyiziq{-1:499136} state up:standby seq 1 addr [v2:172.17.3.43:6800/3615018301,v1:172.17.3.43:6801/3615018301] compat {c=[1],r=[1],i=[7ff]}]
```

To migrate MDS to the target nodes, set the MDS affinity that manages the MDS failover:
Note: It is possible to elect a dedicated MDS as "active" for a particular file system. To configure this preference, CephFS provides a configuration option for MDS called `mds_join_fs`, which enforces this affinity. When failing over MDS daemons, cluster monitors prefer standby daemons with `mds_join_fs` equal to the file system name with the failed rank. If no standby exists with `mds_join_fs` equal to the file system name, it chooses an unqualified standby as a replacement.

```
$ sudo cephadm shell -- ceph config set mds.mds.cephstorage-0.fqcshx mds_join_fs cephfs
```

- Replace `mds.mds.cephstorage-0.fqcshx` with the daemon deployed on `cephstorage-0` that was retrieved from the previous step.
Remove the labels from the Controller nodes and force the MDS failover to the target node:
```
$ for i in 0 1 2; do sudo cephadm shell -- ceph orch host label rm "controller-$i.redhat.local" mds; done

Removed label mds from host controller-0.redhat.local
Removed label mds from host controller-1.redhat.local
Removed label mds from host controller-2.redhat.local
```

The switch to the target node happens in the background. The new active MDS is the one that you set by using the `mds_join_fs` configuration option.

Check the result of the failover and the newly deployed daemons:
```
$ sudo cephadm shell -- ceph fs dump
…
…
standby_count_wanted    1
[mds.mds.cephstorage-0.fqcshx{0:476503} state up:active seq 168 join_fscid=1 addr [v2:172.17.3.92:6820/4120523799,v1:172.17.3.92:6821/4120523799] compat {c=[1],r=[1],i=[7ff]}]

Standby daemons:
[mds.mds.cephstorage-2.gnfhfe{-1:499067} state up:standby seq 1 addr [v2:172.17.3.79:6820/2448613348,v1:172.17.3.79:6821/2448613348] compat {c=[1],r=[1],i=[7ff]}]
[mds.mds.cephstorage-1.jkvomp{-1:499760} state up:standby seq 1 join_fscid=1 addr [v2:172.17.3.135:6820/452139733,v1:172.17.3.135:6821/452139733] compat {c=[1],r=[1],i=[7ff]}]

$ sudo cephadm shell -- ceph orch ls
NAME     PORTS  RUNNING  REFRESHED  AGE  PLACEMENT
crash           6/6      10m ago    10d  *
mds.mds         3/3      10m ago    32m  label:mds

$ sudo cephadm shell -- ceph orch ps | grep mds
mds.mds.cephstorage-0.fqcshx  cephstorage-0.redhat.local  running (79m)  3m ago  79m  27.2M  -  17.2.6-100.el9cp  1af7b794f353  2a2dc5ba6d57
mds.mds.cephstorage-1.jkvomp  cephstorage-1.redhat.local  running (79m)  3m ago  79m  21.5M  -  17.2.6-100.el9cp  1af7b794f353  7198b87104c8
mds.mds.cephstorage-2.gnfhfe  cephstorage-2.redhat.local  running (79m)  3m ago  79m  24.2M  -  17.2.6-100.el9cp  1af7b794f353  f3cb859e2a15
```
7.6. Migrating Red Hat Ceph Storage RGW to external RHEL nodes
For Hyperconverged Infrastructure (HCI) or dedicated Storage nodes, you must migrate the Ceph Object Gateway (RGW) daemons that are included in the Red Hat OpenStack Platform Controller nodes into the existing external Red Hat Enterprise Linux (RHEL) nodes. The external RHEL nodes typically include the Compute nodes for an HCI environment or Red Hat Ceph Storage nodes. Your environment must have Red Hat Ceph Storage 7 or later and be managed by cephadm or Ceph Orchestrator.
7.6.1. Prerequisites
Before you begin the migration, complete the tasks in your Red Hat OpenStack Platform 17.1 environment. For more information, see "Red Hat Ceph Storage prerequisites" in the "Adoption overview" chapter.
7.6.2. Migrating the Red Hat Ceph Storage RGW back ends
You must migrate your Ceph Object Gateway (RGW) back ends from your Controller nodes to your Red Hat Ceph Storage nodes. To ensure that you distribute the correct amount of services to your available nodes, you use cephadm labels to refer to a group of nodes where a given daemon type is deployed. For more information about the cardinality diagram, see Red Hat Ceph Storage daemon cardinality. The following procedure assumes that you have three target nodes, cephstorage-0, cephstorage-1, cephstorage-2.
Procedure
Add the RGW label to the Red Hat Ceph Storage nodes that you want to migrate your RGW back ends to:
```
$ sudo cephadm shell -- ceph orch host label add cephstorage-0 rgw;
$ sudo cephadm shell -- ceph orch host label add cephstorage-1 rgw;
$ sudo cephadm shell -- ceph orch host label add cephstorage-2 rgw;

Added label rgw to host cephstorage-0
Added label rgw to host cephstorage-1
Added label rgw to host cephstorage-2

$ sudo cephadm shell -- ceph orch host ls
HOST           ADDR           LABELS          STATUS
cephstorage-0  192.168.24.54  osd rgw
cephstorage-1  192.168.24.44  osd rgw
cephstorage-2  192.168.24.30  osd rgw
controller-0   192.168.24.45  _admin mon mgr
controller-1   192.168.24.11  _admin mon mgr
controller-2   192.168.24.38  _admin mon mgr

6 hosts in cluster
```

Locate the RGW spec and dump it in the spec directory:
```
$ SPEC_DIR=${SPEC_DIR:-"$PWD/ceph_specs"}
$ mkdir -p ${SPEC_DIR}
$ sudo cephadm shell -- ceph orch ls --export rgw > ${SPEC_DIR}/rgw
$ cat ${SPEC_DIR}/rgw

networks:
- 172.17.3.0/24
placement:
  hosts:
  - controller-0
  - controller-1
  - controller-2
service_id: rgw
service_name: rgw.rgw
service_type: rgw
spec:
  rgw_frontend_port: 8080
  rgw_realm: default
  rgw_zone: default
```

This example assumes that `172.17.3.0/24` is the storage network.

In the `placement` section, ensure that the `label` and `rgw_frontend_port` values are set:

```
---
networks:
- 172.17.3.0/24
placement:
  label: rgw
service_id: rgw
service_name: rgw.rgw
service_type: rgw
spec:
  rgw_frontend_port: 8090
  rgw_realm: default
  rgw_zone: default
  rgw_frontend_ssl_certificate: ...
  ssl: true
```

- `networks` defines the storage network where the RGW back ends are deployed.
- `placement.label: rgw` replaces the Controller nodes with the `rgw` label.
- `spec.rgw_frontend_port` specifies the value as `8090` to avoid conflicts with the Ceph ingress daemon.
- `spec.rgw_frontend_ssl_certificate` defines the SSL certificate and key concatenation if TLS is enabled, as described in Configuring RGW with TLS for an external Red Hat Ceph Storage cluster in Configuring persistent storage.
Apply the new RGW spec by using the orchestrator CLI:
```
$ SPEC_DIR=${SPEC_DIR:-"$PWD/ceph_specs"}
$ sudo cephadm shell -m ${SPEC_DIR}/rgw -- ceph orch apply -i /mnt/rgw
```

This command triggers the redeploy, for example:
```
...
osd.9                          cephstorage-2
rgw.rgw.cephstorage-0.wsjlgx   cephstorage-0  172.17.3.23:8090   starting
rgw.rgw.cephstorage-1.qynkan   cephstorage-1  172.17.3.26:8090   starting
rgw.rgw.cephstorage-2.krycit   cephstorage-2  172.17.3.81:8090   starting
rgw.rgw.controller-1.eyvrzw    controller-1   172.17.3.146:8080  running (5h)
rgw.rgw.controller-2.navbxa    controller-2   172.17.3.66:8080   running (5h)
...
osd.9                          cephstorage-2
rgw.rgw.cephstorage-0.wsjlgx   cephstorage-0  172.17.3.23:8090   running (19s)
rgw.rgw.cephstorage-1.qynkan   cephstorage-1  172.17.3.26:8090   running (16s)
rgw.rgw.cephstorage-2.krycit   cephstorage-2  172.17.3.81:8090   running (13s)
```

Ensure that the new RGW back ends are reachable on the new ports, so that you can enable an ingress daemon on port 8080 later. Log in to each Red Hat Ceph Storage node that includes RGW and add the `iptables` rules to allow connections to both the 8080 and 8090 ports:

```
$ iptables -I INPUT -p tcp -m tcp --dport 8080 -m conntrack --ctstate NEW -m comment --comment "ceph rgw ingress" -j ACCEPT
$ iptables -I INPUT -p tcp -m tcp --dport 8090 -m conntrack --ctstate NEW -m comment --comment "ceph rgw backends" -j ACCEPT
$ sudo iptables-save
$ sudo systemctl restart iptables
```

If `nftables` is used in the existing deployment, edit `/etc/nftables/tripleo-rules.nft` and add the following content:

```
# 100 ceph_rgw {'dport': ['8080','8090']}
add rule inet filter TRIPLEO_INPUT tcp dport { 8080,8090 } ct state new counter accept comment "100 ceph_rgw"
```

- Save the file.
Restart the `nftables` service:

```
$ sudo systemctl restart nftables
```

Verify that the rules are applied:

```
$ sudo nft list ruleset | grep ceph_rgw
```

From a Controller node, such as `controller-0`, try to reach the RGW back ends:

```
$ curl http://cephstorage-0.storage:8090
```

You should observe the following output:

```
<?xml version="1.0" encoding="UTF-8"?><ListAllMyBucketsResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/"><Owner><ID>anonymous</ID><DisplayName></DisplayName></Owner><Buckets></Buckets></ListAllMyBucketsResult>
```

Repeat the verification for each node where an RGW daemon is deployed.
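You can script this check across the nodes; a sketch, assuming the three example node names from this procedure and the `.storage` DNS suffix used above:

```bash
# Sketch: probe every RGW back end on the new 8090 port.
for host in cephstorage-0 cephstorage-1 cephstorage-2; do
  echo "== ${host} =="
  curl -s "http://${host}.storage:8090"
  echo
done
```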
If you migrated RGW back ends to the Red Hat Ceph Storage nodes, there is no `internalAPI` network, except in the case of HCI nodes. You must reconfigure the RGW keystone endpoint to point to the external network that you propagated:

```
[ceph: root@controller-0 /]# ceph config dump | grep keystone
global   basic     rgw_keystone_url  http://172.16.1.111:5000

[ceph: root@controller-0 /]# ceph config set global rgw_keystone_url http://<keystone_endpoint>:5000
```

- Replace `<keystone_endpoint>` with the Identity service (keystone) internal endpoint of the service that is deployed in the `OpenStackControlPlane` CR when you adopt the Identity service. For more information, see Adopting the Identity service.
7.6.3. Deploying a Red Hat Ceph Storage ingress daemon
To deploy the Ceph ingress daemon, you perform the following actions:

- Remove the existing `ceph_rgw` configuration.
- Clean up the configuration created by director.
- Redeploy the Object Storage service (swift).
When you deploy the ingress daemon, two new containers are created:
- HAProxy, which you use to reach the back ends.
- Keepalived, which you use to own the virtual IP address.
You use the `rgw` label to distribute the ingress daemon only to the nodes that host Ceph Object Gateway (RGW) daemons. For more information about distributing daemons among your nodes, see Red Hat Ceph Storage daemon cardinality.
After you complete this procedure, you can reach the RGW back end from the ingress daemon and use RGW through the Object Storage service CLI.
Procedure
Log in to each Controller node and remove the following configuration from the `/var/lib/config-data/puppet-generated/haproxy/etc/haproxy/haproxy.cfg` file:

```
listen ceph_rgw
  bind 10.0.0.103:8080 transparent
  mode http
  balance leastconn
  http-request set-header X-Forwarded-Proto https if { ssl_fc }
  http-request set-header X-Forwarded-Proto http if !{ ssl_fc }
  http-request set-header X-Forwarded-Port %[dst_port]
  option httpchk GET /swift/healthcheck
  option httplog
  option forwardfor
  server controller-0.storage.redhat.local 172.17.3.73:8080 check fall 5 inter 2000 rise 2
  server controller-1.storage.redhat.local 172.17.3.146:8080 check fall 5 inter 2000 rise 2
  server controller-2.storage.redhat.local 172.17.3.156:8080 check fall 5 inter 2000 rise 2
```

Restart `haproxy-bundle` and confirm that it is started:

```
[root@controller-0 ~]# sudo pcs resource restart haproxy-bundle
haproxy-bundle successfully restarted

[root@controller-0 ~]# sudo pcs status | grep haproxy
* Container bundle set: haproxy-bundle [undercloud-0.ctlplane.redhat.local:8787/rh-osbs/rhosp17-openstack-haproxy:pcmklatest]:
  * haproxy-bundle-podman-0   (ocf:heartbeat:podman):  Started controller-0
  * haproxy-bundle-podman-1   (ocf:heartbeat:podman):  Started controller-1
  * haproxy-bundle-podman-2   (ocf:heartbeat:podman):  Started controller-2
```

Confirm that no process is connected to port 8080:

```
[root@controller-0 ~]# ss -antop | grep 8080
[root@controller-0 ~]#
```

You can expect the Object Storage service (swift) CLI to fail to establish the connection:

```
(overcloud) [root@cephstorage-0 ~]# swift list
HTTPConnectionPool(host='10.0.0.103', port=8080): Max retries exceeded with url: /swift/v1/AUTH_852f24425bb54fa896476af48cbe35d3?format=json (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fc41beb0430>: Failed to establish a new connection: [Errno 111] Connection refused'))
```

Set the required images for both HAProxy and Keepalived:

```
[ceph: root@controller-0 /]# ceph config set mgr mgr/cephadm/container_image_haproxy registry.redhat.io/rhceph/rhceph-haproxy-rhel9:latest
[ceph: root@controller-0 /]# ceph config set mgr mgr/cephadm/container_image_keepalived registry.redhat.io/rhceph/keepalived-rhel9:latest
```

Create a file called `rgw_ingress` in `controller-0`:

```
$ SPEC_DIR=${SPEC_DIR:-"$PWD/ceph_specs"}
$ vim ${SPEC_DIR}/rgw_ingress
```

Paste the following content into the `rgw_ingress` file:

```
---
service_type: ingress
service_id: rgw.rgw
placement:
  label: rgw
spec:
  backend_service: rgw.rgw
  virtual_ip: 10.0.0.89/24
  frontend_port: 8080
  monitor_port: 8898
  virtual_interface_networks:
  - <external_network>
  ssl_cert: ...
```
- Replace `<external_network>` with your external network, for example, `10.0.0.0/24`. For more information, see Completing prerequisites for migrating Red Hat Ceph Storage RGW.
- If TLS is enabled, add the SSL certificate and key concatenation as described in Configuring RGW with TLS for an external Red Hat Ceph Storage cluster in Configuring persistent storage.

Apply the `rgw_ingress` spec by using the Ceph orchestrator CLI:

```
$ SPEC_DIR=${SPEC_DIR:-"$PWD/ceph_specs"}
$ cephadm shell -m ${SPEC_DIR}/rgw_ingress -- ceph orch apply -i /mnt/rgw_ingress
```

Wait until the ingress is deployed and query the resulting endpoint:

```
$ sudo cephadm shell -- ceph orch ls
NAME                     PORTS                RUNNING  REFRESHED  AGE  PLACEMENT
crash                                         6/6      6m ago     3d   *
ingress.rgw.rgw          10.0.0.89:8080,8898  6/6      37s ago    60s  label:rgw
mds.mds                                       3/3      6m ago     3d   controller-0;controller-1;controller-2
mgr                                           3/3      6m ago     3d   controller-0;controller-1;controller-2
mon                                           3/3      6m ago     3d   controller-0;controller-1;controller-2
osd.default_drive_group                       15       37s ago    3d   cephstorage-0;cephstorage-1;cephstorage-2
rgw.rgw                  ?:8090               3/3      37s ago    4m   label:rgw
```

```
$ curl 10.0.0.89:8080

<?xml version="1.0" encoding="UTF-8"?><ListAllMyBucketsResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/"><Owner><ID>anonymous</ID><DisplayName></DisplayName></Owner><Buckets></Buckets></ListAllMyBucketsResult>
```
7.6.4. Create or update the Object Storage service endpoints
You must create or update the Object Storage service (swift) endpoints to configure the new virtual IP address (VIP) that you reserved on the same network that you used to deploy RGW ingress.
Procedure
List the current swift endpoints and service:
```
$ oc rsh openstackclient openstack endpoint list | grep "swift.*object"
$ oc rsh openstackclient openstack service list | grep "swift.*object"
```

If the service and endpoints do not exist, create the missing swift resources:
```
$ oc rsh openstackclient openstack service create --name swift --description 'OpenStack Object Storage' object-store
$ oc rsh openstackclient openstack role add --user swift --project service member
$ oc rsh openstackclient openstack role add --user swift --project service admin
$ for i in public internal; do
>   oc rsh openstackclient openstack endpoint create --region regionOne object-store $i http://<RGW_VIP>:8080/swift/v1/AUTH_%\(tenant_id\)s
> done
$ oc rsh openstackclient openstack role add --project admin --user admin swiftoperator
```

- Replace `<RGW_VIP>` with the Ceph RGW ingress VIP.
If the endpoints exist, update the endpoints to point to the right RGW ingress VIP:
```
$ oc rsh openstackclient openstack endpoint set --url http://<RGW_VIP>:8080/swift/v1/AUTH_%\(tenant_id\)s <swift_public_endpoint_uuid>
$ oc rsh openstackclient openstack endpoint set --url http://<RGW_VIP>:8080/swift/v1/AUTH_%\(tenant_id\)s <swift_internal_endpoint_uuid>

$ oc rsh openstackclient openstack endpoint list | grep object
| 0d682ad71b564cf386f974f90f80de0d | regionOne | swift | object-store | True | public   | http://172.18.0.100:8080/swift/v1/AUTH_%(tenant_id)s |
| b311349c305346f39d005feefe464fb1 | regionOne | swift | object-store | True | internal | http://172.18.0.100:8080/swift/v1/AUTH_%(tenant_id)s |
```

- Replace `<swift_public_endpoint_uuid>` with the UUID of the swift public endpoint.
- Replace `<swift_internal_endpoint_uuid>` with the UUID of the swift internal endpoint.
Test the migrated service:
```
$ oc rsh openstackclient openstack container list --debug
...
...
...
REQ: curl -g -i -X GET http://keystone-public-openstack.apps.ocp.openstack.lab -H "Accept: application/json" -H "User-Agent: openstacksdk/1.0.2 keystoneauth1/5.1.3 python-requests/2.25.1 CPython/3.9.23"
Starting new HTTP connection (1): keystone-public-openstack.apps.ocp.openstack.lab:80
http://keystone-public-openstack.apps.ocp.openstack.lab:80 "GET / HTTP/1.1" 300 298
RESP: [300] content-length: 298 content-type: application/json date: Mon, 14 Jul 2025 17:41:29 GMT location: http://keystone-public-openstack.apps.ocp.openstack.lab/v3/ server: Apache set-cookie: b5697f82cf3c19ece8be533395142512=d5c6a9ee2267c4b63e9f656ad7565270; path=/; HttpOnly vary: X-Auth-Token x-openstack-request-id: req-452e42c5-e60f-440f-a6e8-fe1b9ea89055
RESP BODY: {"versions": {"values": [{"id": "v3.14", "status": "stable", "updated": "2020-04-07T00:00:00Z", "links": [{"rel": "self", "href": "http://keystone-public-openstack.apps.ocp.openstack.lab/v3/"}], "media-types": [{"base": "application/json", "type": "application/vnd.openstack.identity-v3+json"}]}]}}
GET call to http://keystone-public-openstack.apps.ocp.openstack.lab/ used request id req-452e42c5-e60f-440f-a6e8-fe1b9ea89055
...
REQ: curl -g -i -X GET "http://172.18.0.100:8080/swift/v1/AUTH_44477474b0dc4b5b8911ceec23a22246?format=json" -H "User-Agent: openstacksdk/1.0.2 keystoneauth1/5.1.3 python-requests/2.25.1 CPython/3.9.23" -H "X-Auth-Token: {SHA256}ec5deca0be37bd8bfe659f132b9cdf396b8f409db5dc16972d50cbf3f28474d4"
Starting new HTTP connection (1): 172.18.0.100:8080
http://172.18.0.100:8080 "GET /swift/v1/AUTH_44477474b0dc4b5b8911ceec23a22246?format=json HTTP/1.1" 200 2
RESP: [200] accept-ranges: bytes content-length: 2 content-type: application/json; charset=utf-8 date: Mon, 14 Jul 2025 17:41:31 GMT x-account-bytes-used: 0 x-account-bytes-used-actual: 0 x-account-container-count: 0 x-account-object-count: 0 x-account-storage-policy-default-placement-bytes-used: 0 x-account-storage-policy-default-placement-bytes-used-actual: 0 x-account-storage-policy-default-placement-container-count: 0 x-account-storage-policy-default-placement-object-count: 0 x-openstack-request-id: tx000001e95361131ccf694-006875414a-7753-default x-timestamp: 1752514891.25991 x-trans-id: tx000001e95361131ccf694-006875414a-7753-default
RESP BODY: []
GET call to http://172.18.0.100:8080/swift/v1/AUTH_44477474b0dc4b5b8911ceec23a22246?format=json used request id tx000001e95361131ccf694-006875414a-7753-default
clean_up ListContainer:
END return value: 0
```
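As a simpler smoke test than reading the debug trace, you can create, list, and delete a container; a sketch, assuming the admin user was granted the `swiftoperator` role in the earlier step:

```bash
# Sketch: end-to-end object round trip through the migrated endpoint.
oc rsh openstackclient openstack container create test-adoption
oc rsh openstackclient openstack container list
oc rsh openstackclient openstack container delete test-adoption
```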
7.7. Migrating Red Hat Ceph Storage RBD to external RHEL nodes
For Hyperconverged Infrastructure (HCI) or dedicated Storage nodes that are running Red Hat Ceph Storage 7 or later, you must migrate the daemons that are included in the Red Hat OpenStack Platform control plane into the existing external Red Hat Enterprise Linux (RHEL) nodes. The external RHEL nodes typically include the Compute nodes for an HCI environment or dedicated storage nodes.
7.7.1. Prerequisites
Before you begin the migration, complete the tasks in your Red Hat OpenStack Platform 17.1 environment. For more information, see "Red Hat Ceph Storage prerequisites" in the "Adoption overview" chapter.
7.7.2. Migrating Ceph Manager daemons to Red Hat Ceph Storage nodes
You must migrate your Ceph Manager daemons from the Red Hat OpenStack Platform (RHOSP) Controller nodes to a set of target nodes. Target nodes are either existing Red Hat Ceph Storage nodes, or RHOSP Compute nodes if Red Hat Ceph Storage is deployed by director with a Hyperconverged Infrastructure (HCI) topology.
The following procedure uses cephadm and the Ceph Orchestrator to drive the Ceph Manager migration, and the Ceph spec to modify the placement and reschedule the Ceph Manager daemons. Ceph Manager runs in an active/passive model. It also provides many modules, including the Ceph Orchestrator. Every potential module, such as the Ceph Dashboard, that is provided by ceph-mgr is implicitly migrated with Ceph Manager.
Procedure
SSH into the target node and enable the firewall rules that are required to reach a Ceph Manager service:
```
dports="6800:7300"
ssh heat-admin@<target_node> sudo iptables -I INPUT \
    -p tcp --match multiport --dports $dports -j ACCEPT;
```

Replace `<target_node>` with the hostname of the hosts that are listed in the Red Hat Ceph Storage environment. Run `ceph orch host ls` to see the list of the hosts.

Repeat this step for each target node.
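To avoid repeating this manually, you can loop over every host that the orchestrator reports; a sketch, assuming `heat-admin` SSH access to each node (Controller nodes are included, where the extra rule is harmless):

```bash
# Sketch: open the Ceph Manager port range on all cluster hosts.
dports="6800:7300"
for node in $(sudo cephadm shell -- ceph orch host ls --format json | jq -r '.[].hostname'); do
  ssh heat-admin@"$node" sudo iptables -I INPUT \
    -p tcp --match multiport --dports "$dports" -j ACCEPT
done
```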
Check that the rules are properly applied to the target node and persist them:
```
$ sudo iptables-save
$ sudo systemctl restart iptables
```

Note: The default dashboard port for `ceph-mgr` in a greenfield deployment is 8443. With director-deployed Red Hat Ceph Storage, the default port is 8444 because the service ran on the Controller node, and it was necessary to use this port to avoid a conflict. For adoption, update the dashboard port to 8443 in the `ceph-mgr` configuration and firewall rules.

Log in to `controller-0` and update the dashboard port in the `ceph-mgr` configuration to 8443:

```
$ sudo cephadm shell
$ ceph config set mgr mgr/dashboard/server_port 8443
$ ceph config set mgr mgr/dashboard/ssl_server_port 8443
$ ceph mgr module disable dashboard
$ ceph mgr module enable dashboard
```

If `nftables` is used in the existing deployment, edit `/etc/nftables/tripleo-rules.nft` and add the following content:

```
# 113 ceph_mgr {'dport': ['6800-7300', 8443]}
add rule inet filter TRIPLEO_INPUT tcp dport { 6800-7300,8443 } ct state new counter accept comment "113 ceph_mgr"
```

- Save the file.

Restart the `nftables` service:

```
$ sudo systemctl restart nftables
```

Verify that the rules are applied:

```
$ sudo nft list ruleset | grep ceph_mgr
```

Prepare the target node to host the new Ceph Manager daemon, and add the `mgr` label to the target node:

```
$ sudo cephadm shell -- ceph orch host label add <target_node> mgr
```

- Repeat steps 1-7 for each target node that hosts a Ceph Manager daemon.
Get the Ceph Manager spec:
```
$ SPEC_DIR=${SPEC_DIR:-"$PWD/ceph_specs"}
$ mkdir -p ${SPEC_DIR}
$ sudo cephadm shell -- ceph orch ls --export mgr > ${SPEC_DIR}/mgr
```

Edit the retrieved spec and add the `label: mgr` section to the `placement` section:

```
service_type: mgr
service_id: mgr
placement:
  label: mgr
```

- Save the spec.

Apply the spec with `cephadm` by using the Ceph Orchestrator:

```
$ SPEC_DIR=${SPEC_DIR:-"$PWD/ceph_specs"}
$ sudo cephadm shell -m ${SPEC_DIR}/mgr -- ceph orch apply -i /mnt/mgr
```
Verification
Verify that the new Ceph Manager daemons are created in the target nodes:
```
$ sudo cephadm shell -- ceph orch ps | grep -i mgr
$ sudo cephadm shell -- ceph -s
```

The Ceph Manager daemon count should match the number of hosts where the `mgr` label is added.

Note: The migration does not shrink the Ceph Manager daemons. The count grows by the number of target nodes, and migrating Ceph Monitor daemons to Red Hat Ceph Storage nodes decommissions the stand-by Ceph Manager instances. For more information, see Migrating Ceph Monitor daemons to Red Hat Ceph Storage nodes.
7.7.3. Migrating Ceph Monitor daemons to Red Hat Ceph Storage nodes
You must move Ceph Monitor daemons from the Red Hat OpenStack Platform (RHOSP) Controller nodes to a set of target nodes. Target nodes are either existing Red Hat Ceph Storage nodes, or RHOSP Compute nodes if Red Hat Ceph Storage is deployed by director with a Hyperconverged Infrastructure (HCI) topology. Additional Ceph Monitors are deployed to the target nodes, and they are promoted as _admin nodes that you can use to manage the Red Hat Ceph Storage cluster and perform day 2 operations.
To migrate the Ceph Monitor daemons, you must perform the following high-level steps:
- Configure the target nodes for Ceph Monitor migration.
- Drain the source node.
- Migrate your Ceph Monitor IP addresses to the target nodes.
- Redeploy the Ceph Monitor on the target node.
- Verify that the Red Hat Ceph Storage cluster is healthy.
Repeat these steps for any additional Controller node that hosts a Ceph Monitor until you migrate all the Ceph Monitor daemons to the target nodes.
7.7.3.1. Configuring target nodes for Ceph Monitor migration
Prepare the target Red Hat Ceph Storage nodes for the Ceph Monitor migration by performing the following actions:
- Enable firewall rules in a target node and persist them.
- Create a spec that is based on labels and apply it by using `cephadm`.
- Ensure that the Ceph Monitor quorum is maintained during the migration process.
Procedure
SSH into the target node and enable the firewall rules that are required to reach a Ceph Monitor service:
```
$ for port in 3300 6789; {
    ssh heat-admin@<target_node> sudo iptables -I INPUT \
    -p tcp -m tcp --dport $port -m conntrack --ctstate NEW \
    -j ACCEPT;
}
```

- Replace `<target_node>` with the hostname of the node that hosts the new Ceph Monitor.
Check that the rules are properly applied to the target node and persist them:
```
$ sudo iptables-save
$ sudo systemctl restart iptables
```

If `nftables` is used in the existing deployment, edit `/etc/nftables/tripleo-rules.nft` and add the following content:

```
# 110 ceph_mon {'dport': [6789, 3300, '9100']}
add rule inet filter TRIPLEO_INPUT tcp dport { 6789,3300,9100 } ct state new counter accept comment "110 ceph_mon"
```

- Save the file.

Restart the `nftables` service:

```
$ sudo systemctl restart nftables
```

Verify that the rules are applied:

```
$ sudo nft list ruleset | grep ceph_mon
```

To migrate the existing Ceph Monitors to the target Red Hat Ceph Storage nodes, retrieve the Red Hat Ceph Storage mon spec from the first Ceph Monitor, or the first Controller node:
```
$ SPEC_DIR=${SPEC_DIR:-"$PWD/ceph_specs"}
$ mkdir -p ${SPEC_DIR}
$ sudo cephadm shell -- ceph orch ls --export mon > ${SPEC_DIR}/mon
```

Add the `label:mon` section to the `placement` section:

```
service_type: mon
service_id: mon
placement:
  label: mon
```

- Save the spec.

Apply the spec with `cephadm` by using the Ceph Orchestrator:

```
$ SPEC_DIR=${SPEC_DIR:-"$PWD/ceph_specs"}
$ sudo cephadm shell -m ${SPEC_DIR}/mon -- ceph orch apply -i /mnt/mon
```

Extend the `mon` label to the remaining Red Hat Ceph Storage target nodes to ensure that quorum is maintained during the migration process:

```
for item in $(sudo cephadm shell -- ceph orch host ls --format json | jq -r '.[].hostname'); do
    sudo cephadm shell -- ceph orch host label add $item mon;
    sudo cephadm shell -- ceph orch host label add $item _admin;
done
```

Note: Applying the `mon` spec allows the existing strategy to use labels instead of hosts. As a result, any node with the `mon` label can host a Ceph Monitor daemon. Perform this step only once to avoid multiple iterations when multiple Ceph Monitors are migrated.

Check the status of the Red Hat Ceph Storage and the Ceph Orchestrator daemons list. Ensure that Ceph Monitors are in a quorum and listed by the `ceph orch` command:

```
$ sudo cephadm shell -- ceph -s
  cluster:
    id:     f6ec3ebe-26f7-56c8-985d-eb974e8e08e3
    health: HEALTH_OK

  services:
    mon: 6 daemons, quorum controller-0,controller-1,controller-2,ceph-0,ceph-1,ceph-2 (age 19m)
    mgr: controller-0.xzgtvo(active, since 32m), standbys: controller-1.mtxohd, controller-2.ahrgsk
    osd: 8 osds: 8 up (since 12m), 8 in (since 18m); 1 remapped pgs

  data:
    pools:   1 pools, 1 pgs
    objects: 0 objects, 0 B
    usage:   43 MiB used, 400 GiB / 400 GiB avail
    pgs:     1 active+clean
```

```
$ sudo cephadm shell -- ceph orch host ls
HOST          ADDR           LABELS               STATUS
ceph-0        192.168.24.14  osd mon mgr _admin
ceph-1        192.168.24.7   osd mon mgr _admin
ceph-2        192.168.24.8   osd mon mgr _admin
controller-0  192.168.24.15  _admin mgr mon
controller-1  192.168.24.23  _admin mgr mon
controller-2  192.168.24.13  _admin mgr mon
```

Set up a Ceph client on the first Controller node that is used during the rest of the procedure to interact with Red Hat Ceph Storage. Set up an additional IP address on the storage network that is used to interact with Red Hat Ceph Storage when the first Controller node is decommissioned:
Back up the content of `/etc/ceph` in the `ceph_client_backup` directory:

```
$ mkdir -p $HOME/ceph_client_backup
$ sudo cp -R /etc/ceph/* $HOME/ceph_client_backup
```

- Edit `/etc/os-net-config/config.yaml` and add `- ip_netmask: 172.17.3.200` after the IP address on the VLAN that belongs to the storage network. Replace `172.17.3.200` with any other available IP address on the storage network that can be statically assigned to `controller-0`.

Save the file and refresh the `controller-0` network configuration:

```
$ sudo os-net-config -c /etc/os-net-config/config.yaml
```

Verify that the IP address is present in the Controller node:

```
$ ip -o a | grep 172.17.3.200
```

Ping the IP address and confirm that it is reachable:

```
$ ping -c 3 172.17.3.200
```

Verify that you can interact with the Red Hat Ceph Storage cluster:

```
$ sudo cephadm shell -c $HOME/ceph_client_backup/ceph.conf -k $HOME/ceph_client_backup/ceph.client.admin.keyring -- ceph -s
```
Next steps
Proceed to the next step Draining the source node.
7.7.3.2. Draining the source node
Drain the source node and remove the source node host from the Red Hat Ceph Storage cluster.
Procedure
On the source node, back up the `/etc/ceph/` directory to run `cephadm` and get a shell for the Red Hat Ceph Storage cluster from the source node:

```
$ mkdir -p $HOME/ceph_client_backup
$ sudo cp -R /etc/ceph $HOME/ceph_client_backup
```

Identify the active `ceph-mgr` instance:

```
$ sudo cephadm shell -- ceph mgr stat
```

Fail the `ceph-mgr` if it is active on the source node:

```
$ sudo cephadm shell -- ceph mgr fail <mgr_instance>
```

- Replace `<mgr_instance>` with the Ceph Manager daemon to fail.

From the `cephadm` shell, remove the labels on the source node:

```
$ for label in mon mgr _admin; do
    sudo cephadm shell -- ceph orch host label rm <source_node> $label;
done
```

- Replace `<source_node>` with the hostname of the source node.
Optional: Ensure that you remove the Ceph Monitor daemon from the source node if it is still running:
```
$ sudo cephadm shell -- ceph orch daemon rm mon.<source_node> --force
```

Drain the source node to remove any leftover daemons:

```
$ sudo cephadm shell -- ceph orch host drain <source_node>
```

Remove the source node host from the Red Hat Ceph Storage cluster:

```
$ sudo cephadm shell -- ceph orch host rm <source_node> --force
```

Note: The source node is no longer part of the cluster and should not appear in the Red Hat Ceph Storage host list when you run `sudo cephadm shell -- ceph orch host ls`. However, if you run `sudo podman ps` on the source node, the list might show that both Ceph Monitors and Ceph Managers are still running:

```
[root@controller-1 ~]# sudo podman ps
CONTAINER ID  IMAGE                                                                                                   COMMAND           CREATED         STATUS             PORTS  NAMES
5c1ad36472bc  registry.redhat.io/ceph/rhceph@sha256:320c364dcc8fc8120e2a42f54eb39ecdba12401a2546763b7bef15b02ce93bc4  -n mon.contro...  35 minutes ago  Up 35 minutes ago         ceph-f6ec3ebe-26f7-56c8-985d-eb974e8e08e3-mon-controller-1
3b14cc7bf4dd  registry.redhat.io/ceph/rhceph@sha256:320c364dcc8fc8120e2a42f54eb39ecdba12401a2546763b7bef15b02ce93bc4  -n mgr.contro...  35 minutes ago  Up 35 minutes ago         ceph-f6ec3ebe-26f7-56c8-985d-eb974e8e08e3-mgr-controller-1-mtxohd
```

To clean up the existing containers and remove the `cephadm` data from the source node, contact Red Hat Support.

Confirm that the Ceph Monitors are still in quorum:

```
$ sudo cephadm shell -- ceph -s
$ sudo cephadm shell -- ceph orch ps | grep -i mon
```
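For a more targeted check than scanning the full status output, you can read the quorum membership directly; a sketch, using `jq` to extract the monitor names:

```bash
# Sketch: print the name of every monitor currently in quorum.
sudo cephadm shell -- ceph quorum_status | jq -r '.quorum_names[]'
```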
Next steps
Proceed to the next step Migrating the Ceph Monitor IP address.
7.7.3.3. Migrating the Ceph Monitor IP address
You must migrate your Ceph Monitor IP addresses to the target Red Hat Ceph Storage nodes. The IP address migration assumes that the target nodes are originally deployed by director and that the network configuration is managed by os-net-config.
Procedure
Get the original Ceph Monitor IP addresses from the `$HOME/ceph_client_backup/ceph.conf` file on the `mon_host` line, for example:

```
mon_host = [v2:172.17.3.60:3300/0,v1:172.17.3.60:6789/0] [v2:172.17.3.29:3300/0,v1:172.17.3.29:6789/0] [v2:172.17.3.53:3300/0,v1:172.17.3.53:6789/0]
```

Match the IP address retrieved in the previous step with the storage network IP addresses on the source node, and find the Ceph Monitor IP address:

```
[tripleo-admin@controller-0 ~]$ ip -o -4 a | grep 172.17.3
9: vlan30    inet 172.17.3.60/24 brd 172.17.3.255 scope global vlan30\       valid_lft forever preferred_lft forever
9: vlan30    inet 172.17.3.13/32 brd 172.17.3.255 scope global vlan30\       valid_lft forever preferred_lft forever
```

Confirm that the Ceph Monitor IP address is present in the `os-net-config` configuration that is located in the `/etc/os-net-config` directory on the source node:

```
[tripleo-admin@controller-0 ~]$ grep "172.17.3.60" /etc/os-net-config/config.yaml
- ip_netmask: 172.17.3.60/24
```

- Edit the `/etc/os-net-config/config.yaml` file and remove the `ip_netmask` line.

Save the file and refresh the node network configuration:

```
$ sudo os-net-config -c /etc/os-net-config/config.yaml
```

Verify that the IP address is not present in the source node anymore, for example:

```
[controller-0]$ ip -o a | grep 172.17.3.60
```

- SSH into the target node, for example `cephstorage-0`, and add the IP address for the new Ceph Monitor.
- On the target node, edit `/etc/os-net-config/config.yaml` and add the `- ip_netmask: 172.17.3.60` line that you removed in the source node.

Save the file and refresh the node network configuration:

```
$ sudo os-net-config -c /etc/os-net-config/config.yaml
```

Verify that the IP address is present in the target node:

```
$ ip -o a | grep 172.17.3.60
```

From the Ceph client node, `controller-0`, ping the IP address that is migrated to the target node and confirm that it is still reachable:

```
[controller-0]$ ping -c 3 172.17.3.60
```
Next steps
Proceed to the next step Redeploying the Ceph Monitor on the target node.
7.7.3.4. Redeploying a Ceph Monitor on the target node
You use the IP address that you migrated to the target node to redeploy the Ceph Monitor on the target node.
Procedure
From the Ceph client node, for example `controller-0`, get the Ceph mon spec:

```
$ SPEC_DIR=${SPEC_DIR:-"$PWD/ceph_specs"}
$ sudo cephadm shell -- ceph orch ls --export mon > ${SPEC_DIR}/mon
```

Edit the retrieved spec and add the `unmanaged: true` keyword:

```
service_type: mon
service_id: mon
placement:
  label: mon
unmanaged: true
```

- Save the spec.

Apply the spec with `cephadm` by using the Ceph Orchestrator:

```
$ SPEC_DIR=${SPEC_DIR:-"$PWD/ceph_specs"}
$ sudo cephadm shell -m ${SPEC_DIR}/mon -- ceph orch apply -i /mnt/mon
```

The Ceph Monitor daemons are marked as `unmanaged`, and you can now redeploy the existing daemon and bind it to the migrated IP address.

Delete the existing Ceph Monitor on the target node:

```
$ sudo cephadm shell -- ceph orch daemon rm mon.<target_node> --force
```

- Replace `<target_node>` with the hostname of the target node that is included in the Red Hat Ceph Storage cluster.

Redeploy the new Ceph Monitor on the target node by using the migrated IP address:

```
$ sudo cephadm shell -- ceph orch daemon add mon <target_node>:<ip_address>
```

- Replace `<ip_address>` with the migrated IP address.
Get the Ceph Monitor spec:
```
$ SPEC_DIR=${SPEC_DIR:-"$PWD/ceph_specs"}
$ sudo cephadm shell -- ceph orch ls --export mon > ${SPEC_DIR}/mon
```

Edit the retrieved spec and set the `unmanaged` keyword to `false`:

```
service_type: mon
service_id: mon
placement:
  label: mon
unmanaged: false
```

- Save the spec.

Apply the spec with `cephadm` by using the Ceph Orchestrator:

```
$ SPEC_DIR=${SPEC_DIR:-"$PWD/ceph_specs"}
$ sudo cephadm shell -m ${SPEC_DIR}/mon -- ceph orch apply -i /mnt/mon
```

The new Ceph Monitor runs on the target node with the original IP address.

Identify the running `mgr`:

```
$ sudo cephadm shell -- ceph mgr stat
```

Refresh the Ceph Manager information by force-failing it:

```
$ sudo cephadm shell -- ceph mgr fail
```

Refresh the `OSD` information:

```
$ sudo cephadm shell -- ceph orch reconfig osd.default_drive_group
```
Next steps
Repeat the procedure starting from step Draining the source node for each node that you want to decommission. Proceed to the next step Verifying the Red Hat Ceph Storage cluster after Ceph Monitor migration.
7.7.3.5. Verifying the Red Hat Ceph Storage cluster after Ceph Monitor migration
After you finish migrating your Ceph Monitor daemons to the target nodes, verify that the Red Hat Ceph Storage cluster is healthy.
Procedure
Verify that the Red Hat Ceph Storage cluster is healthy:
```
$ ceph -s
  cluster:
    id:     f6ec3ebe-26f7-56c8-985d-eb974e8e08e3
    health: HEALTH_OK
...
...
```

Verify that the Red Hat Ceph Storage mons are running with the old IP addresses. SSH into the target nodes and verify that the Ceph Monitor daemons are bound to the expected IP and port:

```
$ netstat -tulpn | grep 3300
```
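You can also confirm the monitor addresses from the cluster map itself, without logging in to each node; a short sketch:

```bash
# Sketch: print the address of every monitor from the monmap.
sudo cephadm shell -- ceph mon dump
```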
7.8. Updating the Red Hat Ceph Storage cluster Ceph Dashboard configuration
If the Ceph Dashboard is part of the enabled Ceph Manager modules, you need to reconfigure the failover settings.
Procedure
Regenerate the following Red Hat Ceph Storage configuration keys to point to the right `mgr` container:

```
mgr  advanced  mgr/dashboard/controller-0.ycokob/server_addr  172.17.3.33
mgr  advanced  mgr/dashboard/controller-1.lmzpuc/server_addr  172.17.3.147
mgr  advanced  mgr/dashboard/controller-2.xpdgfl/server_addr  172.17.3.138
```

List the currently running `mgr` daemons:

```
$ sudo cephadm shell
$ ceph orch ps | awk '/mgr./ {print $1}'
```

For each retrieved `mgr` daemon, update the corresponding entry in the Red Hat Ceph Storage configuration:

```
$ ceph config set mgr mgr/dashboard/<mgr_daemon>/server_addr <ip_address>
```

- Replace `<mgr_daemon>` with the name of the `mgr` daemon, and `<ip_address>` with the storage network IP address of the node where that daemon now runs.
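A loop form of this update, as a sketch: it assumes the `daemon_id` reported by the orchestrator matches the config key segment, and that each hostname resolves to its storage-network IP through `getent` (substitute your own host-to-IP mapping otherwise):

```bash
# Sketch: set mgr/dashboard/<daemon_id>/server_addr for every mgr daemon.
sudo cephadm shell -- ceph orch ps --daemon_type mgr --format json \
  | jq -r '.[] | "\(.daemon_id) \(.hostname)"' \
  | while read -r daemon host; do
      ip=$(getent hosts "$host" | awk '{print $1}')  # assumes DNS returns the storage-network IP
      sudo cephadm shell -- ceph config set mgr "mgr/dashboard/$daemon/server_addr" "$ip"
    done
```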