

Chapter 7. Migrating Red Hat Ceph Storage RGW to external RHEL nodes

For hyperconverged infrastructure (HCI) or dedicated Storage nodes that are running Red Hat Ceph Storage version 6 or later, you must migrate the RGW daemons that run on the Red Hat OpenStack Platform Controller nodes to the existing external Red Hat Enterprise Linux (RHEL) nodes. The existing external RHEL nodes are typically the Compute nodes in an HCI environment, or the Red Hat Ceph Storage nodes.

To migrate Ceph Object Gateway (RGW), your environment must meet the following requirements:

  • Red Hat Ceph Storage is running version 6 or later and is managed by cephadm/orchestrator.
  • An undercloud is still available, and the nodes and networks are managed by director.
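
The second half of the first requirement can be checked from the command line. The following is a minimal sketch, not a supported tool: the function name orch_is_cephadm is hypothetical, and it assumes that `ceph orch status` prints `Backend:` and `Available:` lines in its plain-text output. It does not verify the Red Hat Ceph Storage version.

```shell
# Hypothetical helper: read `ceph orch status` output on stdin and succeed only
# if the orchestrator backend is cephadm and it reports as available.
# Assumption: the output contains "Backend: <name>" and "Available: Yes" lines
# ("True" is also accepted in case the release prints a boolean).
orch_is_cephadm() {
    awk '
        $1 == "Backend:"   && $2 == "cephadm"               { backend = 1 }
        $1 == "Available:" && ($2 == "Yes" || $2 == "True") { avail = 1 }
        END { exit !(backend && avail) }
    '
}

# Example usage on a node that has the Ceph admin keyring:
#   cephadm shell -- ceph orch status | orch_is_cephadm \
#       && echo "cluster is managed by cephadm"
```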

7.1. Red Hat Ceph Storage daemon cardinality

Red Hat Ceph Storage 6 and later applies strict constraints on how daemons can be colocated on the same node. For more information, see Red Hat Ceph Storage: Supported configurations.

The resulting topology depends on the available hardware and on the number of Red Hat Ceph Storage services that run on the Controller nodes that are going to be retired. As a general rule, the number of services that you can migrate depends on the number of available nodes in the cluster. For more information about the procedure that is required to migrate the RGW component and keep a high availability (HA) model by using the Ceph ingress daemon, see High availability for the Ceph Object Gateway in Object Gateway Guide.

The following diagrams show the distribution of the Red Hat Ceph Storage daemons on the Red Hat Ceph Storage nodes. At least three nodes are required in a scenario that includes only RGW and RBD, without the Dashboard service (horizon):

|     |               |             |
| osd | mon/mgr/crash | rgw/ingress |
| osd | mon/mgr/crash | rgw/ingress |
| osd | mon/mgr/crash | rgw/ingress |

With the Dashboard service, and without the Shared File Systems service (manila), at least four nodes are required. The Dashboard service has no failover:

|     |               |                   |
| osd | mon/mgr/crash | rgw/ingress       |
| osd | mon/mgr/crash | rgw/ingress       |
| osd | mon/mgr/crash | dashboard/grafana |
| osd | rgw/ingress   | (free)            |

With the Dashboard service and the Shared File Systems service, at least five nodes are required. The Dashboard service has no failover:

|     |                     |                         |
| osd | mon/mgr/crash       | rgw/ingress             |
| osd | mon/mgr/crash       | rgw/ingress             |
| osd | mon/mgr/crash       | mds/ganesha/ingress     |
| osd | rgw/ingress         | mds/ganesha/ingress     |
| osd | mds/ganesha/ingress | dashboard/grafana       |
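
The three layouts above can be summarized as a small lookup. The following shell sketch is illustrative only: the function name min_ceph_nodes and its yes/no arguments are assumptions, and it covers only the three combinations that are described in this section.

```shell
# Hypothetical helper: print the minimum number of Red Hat Ceph Storage nodes
# for the layouts described in this section.
#   $1 = "yes" if the Dashboard service is migrated, otherwise "no"
#   $2 = "yes" if the Shared File Systems service (manila) is migrated, otherwise "no"
min_ceph_nodes() {
    case "$1:$2" in
        no:no)   echo 3 ;;   # RGW and RBD only
        yes:no)  echo 4 ;;   # Dashboard, no Shared File Systems service
        yes:yes) echo 5 ;;   # Dashboard and Shared File Systems service
        *)
            echo "combination not covered in this section" >&2
            return 1
            ;;
    esac
}

# Example usage:
#   min_ceph_nodes yes no
```

Any other combination returns a non-zero exit status, because this section does not state a node count for it.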

7.2. Completing prerequisites for migrating Red Hat Ceph Storage RGW

You must complete the following prerequisites before you begin the Red Hat Ceph Storage RGW migration.


  1. Check the current status of the Red Hat Ceph Storage nodes:

    (undercloud) [stack@undercloud-0 ~]$ metalsmith list
        +------------------------+    +----------------+
        | IP Addresses           |    |  Hostname      |
        +------------------------+    +----------------+
        | ctlplane= |    | cephstorage-0  |
        | ctlplane= |    | cephstorage-1  |
        | ctlplane= |    | cephstorage-2  |
        | ctlplane= |    | compute-0      |
        | ctlplane= |    | compute-1      |
        | ctlplane= |    | controller-0   |
        | ctlplane=  |    | controller-1   |
        | ctlplane= |    | controller-2   |
        +------------------------+    +----------------+
  2. Log in to controller-0 and check the pacemaker status to help you identify the information that you need before you start the RGW migration.

    Full List of Resources:
      * ip-	(ocf:heartbeat:IPaddr2):     	Started controller-0
      * ip-   	(ocf:heartbeat:IPaddr2):     	Started controller-1
      * ip- 	(ocf:heartbeat:IPaddr2):     	Started controller-2
      * ip-  	(ocf:heartbeat:IPaddr2):     	Started controller-0
      * ip-  	(ocf:heartbeat:IPaddr2):     	Started controller-1
      * Container bundle set: haproxy-bundle
        * haproxy-bundle-podman-0   (ocf:heartbeat:podman):  Started controller-2
        * haproxy-bundle-podman-1   (ocf:heartbeat:podman):  Started controller-0
        * haproxy-bundle-podman-2   (ocf:heartbeat:podman):  Started controller-1
  3. Use the ip command to identify the ranges of the storage networks.

    [heat-admin@controller-0 ~]$ ip -o -4 a
    1: lo	inet scope host lo\   	valid_lft forever preferred_lft forever
    2: enp1s0	inet brd scope global enp1s0\   	valid_lft forever preferred_lft forever
    2: enp1s0	inet brd scope global enp1s0\   	valid_lft forever preferred_lft forever
    7: br-ex	inet brd scope global br-ex\   	valid_lft forever preferred_lft forever
    8: vlan70	inet brd scope global vlan70\   	valid_lft forever preferred_lft forever
    8: vlan70	inet brd scope global vlan70\   	valid_lft forever preferred_lft forever
    9: vlan50	inet brd scope global vlan50\   	valid_lft forever preferred_lft forever
    10: vlan30	inet brd scope global vlan30\   	valid_lft forever preferred_lft forever
    10: vlan30	inet brd scope global vlan30\   	valid_lft forever preferred_lft forever
    11: vlan20	inet brd scope global vlan20\   	valid_lft forever preferred_lft forever
    12: vlan40	inet brd scope global vlan40\   	valid_lft forever preferred_lft forever
    • vlan30 represents the Storage Network, where the new RGW instances must be started on the Red Hat Ceph Storage nodes.
    • br-ex represents the External Network, where HAProxy has the frontend Virtual IP (VIP) assigned in the current environment.
  4. Identify the network that you previously had in haproxy and propagate it through director to the Red Hat Ceph Storage nodes. This network is used to reserve a new VIP that is owned by Red Hat Ceph Storage and used as the entry point for the RGW service.

    1. Log in to controller-0 and search the current HAProxy configuration for the ceph_rgw section:

      $ less /var/lib/config-data/puppet-generated/haproxy/etc/haproxy/haproxy.cfg
      listen ceph_rgw
        bind transparent
        bind transparent
        mode http
        balance leastconn
        http-request set-header X-Forwarded-Proto https if { ssl_fc }
        http-request set-header X-Forwarded-Proto http if !{ ssl_fc }
        http-request set-header X-Forwarded-Port %[dst_port]
        option httpchk GET /swift/healthcheck
        option httplog
        option forwardfor
        server controller-0.storage.redhat.local check fall 5 inter 2000 rise 2
        server controller-1.storage.redhat.local check fall 5 inter 2000 rise 2
        server controller-2.storage.redhat.local check fall 5 inter 2000 rise 2
    2. Confirm that the network is used as an HAProxy frontend:

      [controller-0]$ ip -o -4 a
      7: br-ex	inet brd scope global br-ex\   	valid_lft forever preferred_lft forever

      This example shows that controller-0 is exposing the services by using the external network, which is not present in the Red Hat Ceph Storage nodes, and you need to propagate it through director.

  5. Propagate the HAProxy frontend network to Red Hat Ceph Storage nodes.

    1. Change the NIC template used to define the ceph-storage network interfaces and add the new config section:

      - type: interface
        name: nic1
        use_dhcp: false
        dns_servers: {{ ctlplane_dns_nameservers }}
        addresses:
        - ip_netmask: {{ ctlplane_ip }}/{{ ctlplane_cidr }}
        routes: {{ ctlplane_host_routes }}
      - type: vlan
        vlan_id: {{ storage_mgmt_vlan_id }}
        device: nic1
        addresses:
        - ip_netmask: {{ storage_mgmt_ip }}/{{ storage_mgmt_cidr }}
        routes: {{ storage_mgmt_host_routes }}
      - type: interface
        name: nic2
        use_dhcp: false
        defroute: false
      - type: vlan
        vlan_id: {{ storage_vlan_id }}
        device: nic2
        addresses:
        - ip_netmask: {{ storage_ip }}/{{ storage_cidr }}
        routes: {{ storage_host_routes }}
      - type: ovs_bridge
        name: {{ neutron_physical_bridge_name }}
        dns_servers: {{ ctlplane_dns_nameservers }}
        domain: {{ dns_search_domains }}
        use_dhcp: false
        addresses:
        - ip_netmask: {{ external_ip }}/{{ external_cidr }}
        routes: {{ external_host_routes }}
        members:
        - type: interface
          name: nic3
          primary: true
    2. In addition, add the External Network to the baremetal.yaml file used by metalsmith:

      - name: CephStorage
        count: 3
        hostname_format: cephstorage-%index%
        instances:
        - hostname: cephstorage-0
          name: ceph-0
        - hostname: cephstorage-1
          name: ceph-1
        - hostname: cephstorage-2
          name: ceph-2
        defaults:
          profile: ceph-storage
          network_config:
            template: /home/stack/composable_roles/network/nic-configs/ceph-storage.j2
          networks:
          - network: ctlplane
            vif: true
          - network: storage
          - network: storage_mgmt
          - network: external
    3. Run the overcloud node provision command passing the --network-config option:

      (undercloud) [stack@undercloud-0]$ openstack overcloud node provision \
         -o overcloud-baremetal-deployed-0.yaml \
         --stack overcloud \
         --network-config -y
    4. Check the new network on the Red Hat Ceph Storage nodes:

      [root@cephstorage-0 ~]# ip -o -4 a
      1: lo	inet scope host lo\   	valid_lft forever preferred_lft forever
      2: enp1s0	inet brd scope global enp1s0\   	valid_lft forever preferred_lft forever
      11: vlan40	inet brd scope global vlan40\   	valid_lft forever preferred_lft forever
      12: vlan30	inet brd scope global vlan30\   	valid_lft forever preferred_lft forever
      14: br-ex	inet brd scope global br-ex\   	valid_lft forever preferred_lft forever
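
The final check in this procedure can be scripted so that it is easy to repeat on every Red Hat Ceph Storage node. The following is a minimal sketch: the function name missing_ifaces is hypothetical, and it assumes the one-line-per-address format that `ip -o -4 a` prints, where the second field is the interface name and the third field is `inet`.

```shell
# Hypothetical helper: read `ip -o -4 a` output on stdin and print every
# interface named in the arguments that has no IPv4 address assigned.
# Returns non-zero if at least one of the named interfaces is missing.
missing_ifaces() {
    # Collect the names of all interfaces that carry an IPv4 address.
    seen=$(awk '$3 == "inet" { print $2 }')
    rc=0
    for dev in "$@"; do
        if ! printf '%s\n' "$seen" | grep -qx -- "$dev"; then
            echo "$dev"
            rc=1
        fi
    done
    return "$rc"
}

# Example usage from the undercloud (user and host names as used in this chapter):
#   ssh heat-admin@cephstorage-0 ip -o -4 a | missing_ifaces vlan30 br-ex \
#       || echo "network propagation is incomplete on cephstorage-0"
```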

7.3. Migrating the Red Hat Ceph Storage RGW backends

To match the cardinality diagram, you use cephadm labels to refer to a group of nodes where a given daemon type should be deployed. For more information about the cardinality diagram, see Red Hat Ceph Storage daemon cardinality.


  1. Add the RGW label to the Red Hat Ceph Storage nodes:

    [ceph: root@controller-0 /]# for i in 0 1 2; do
        ceph orch host label add cephstorage-$i rgw
    done
    Added label rgw to host cephstorage-0
    Added label rgw to host cephstorage-1
    Added label rgw to host cephstorage-2
    [ceph: root@controller-0 /]# ceph orch host ls
    HOST       	ADDR       	LABELS      	STATUS
    cephstorage-0  osd rgw
    cephstorage-1  osd rgw
    cephstorage-2  osd rgw
    controller-0  _admin mon mgr
    controller-1  _admin mon mgr
    controller-2  _admin mon mgr
    6 hosts in cluster
  2. During the overcloud deployment, RGW is applied at step 2 (external_deployment_steps), and director generates a cephadm-compatible spec in /home/ceph-admin/specs/rgw. Find the RGW spec:

    [root@controller-0 heat-admin]# cat rgw
    placement:
      hosts:
      - controller-0
      - controller-1
      - controller-2
    service_id: rgw
    service_name: rgw.rgw
    service_type: rgw
    spec:
      rgw_frontend_port: 8080
      rgw_realm: default
      rgw_zone: default
  3. Update the RGW spec:

    • In the placement section, replace the Controller node hosts with label: rgw.
    • In the spec section, change the rgw_frontend_port value to 8090 to avoid conflicts with the Ceph ingress daemon.

      placement:
        label: rgw
      service_id: rgw
      service_name: rgw.rgw
      service_type: rgw
      spec:
        rgw_frontend_port: 8090
        rgw_realm: default
        rgw_zone: default
  4. Apply the new RGW spec by using the orchestrator CLI:

    $ cephadm shell -m /home/ceph-admin/specs/rgw
    $ cephadm shell -- ceph orch apply -i /mnt/rgw

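    This command triggers the redeploy. The following output shows the daemons while the new RGW instances are starting, and again after they are running: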

    osd.9                     	cephstorage-2
    rgw.rgw.cephstorage-0.wsjlgx  cephstorage-0   starting
    rgw.rgw.cephstorage-1.qynkan  cephstorage-1   starting
    rgw.rgw.cephstorage-2.krycit  cephstorage-2   starting
    rgw.rgw.controller-1.eyvrzw   controller-1  running (5h)
    rgw.rgw.controller-2.navbxa   controller-2   running (5h)
    osd.9                     	cephstorage-2
    rgw.rgw.cephstorage-0.wsjlgx  cephstorage-0  running (19s)
    rgw.rgw.cephstorage-1.qynkan  cephstorage-1  running (16s)
    rgw.rgw.cephstorage-2.krycit  cephstorage-2  running (13s)
  5. Ensure that the new RGW backends are reachable on the new port. Because you enable an ingress daemon on port 8080 later in the process, open that port at the same time. Log in to each RGW node (the Red Hat Ceph Storage nodes) and add iptables rules that allow connections to both ports 8080 and 8090:

    iptables -I INPUT -p tcp -m tcp --dport 8080 -m conntrack --ctstate NEW -m comment --comment "ceph rgw ingress" -j ACCEPT
    iptables -I INPUT -p tcp -m tcp --dport 8090 -m conntrack --ctstate NEW -m comment --comment "ceph rgw backends" -j ACCEPT

    for port in 8080 8090; do
        for i in 25 10 32; do
           ssh heat-admin@192.168.24.$i sudo iptables -I INPUT \
           -p tcp -m tcp --dport $port -m conntrack --ctstate NEW \
           -j ACCEPT
        done
    done
  6. From a Controller node, for example controller-0, use curl to try to reach the RGW backends:

    for i in 26 23 81; do
        echo "---"
        curl 172.17.3.$i:8090
        echo "---"
    done

    You should observe the following output:

    <?xml version="1.0" encoding="UTF-8"?><ListAllMyBucketsResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/"><Owner><ID>anonymous</ID><DisplayName></DisplayName></Owner><Buckets></Buckets></ListAllMyBucketsResult>
    <?xml version="1.0" encoding="UTF-8"?><ListAllMyBucketsResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/"><Owner><ID>anonymous</ID><DisplayName></DisplayName></Owner><Buckets></Buckets></ListAllMyBucketsResult>
    <?xml version="1.0" encoding="UTF-8"?><ListAllMyBucketsResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/"><Owner><ID>anonymous</ID><DisplayName></DisplayName></Owner><Buckets></Buckets></ListAllMyBucketsResult>
  7. If the RGW backends are migrated to the Red Hat Ceph Storage nodes, the "internalAPI" network is not available on those nodes (this does not apply to HCI environments). Reconfigure the RGW keystone endpoint to point to the external network that you propagated. For more information about propagating the external network, see Completing prerequisites for migrating Red Hat Ceph Storage RGW.

    [ceph: root@controller-0 /]# ceph config dump | grep keystone
    global   basic rgw_keystone_url
    [ceph: root@controller-0 /]# ceph config set global rgw_keystone_url
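
The reachability check in this procedure can be wrapped in a function that reports a result for each backend instead of relying on visual inspection of the XML. The following is a minimal sketch: the function names is_rgw_response and check_rgw_backends are hypothetical, and the check assumes that an anonymous request to RGW returns a ListAllMyBucketsResult document, as in the expected output shown in this procedure.

```shell
# Hypothetical helper: read an HTTP response body on stdin and succeed only if
# it looks like the anonymous bucket listing that RGW returns.
is_rgw_response() {
    grep -q '<ListAllMyBucketsResult'
}

# Hypothetical helper: probe each backend on the given port and report the result.
#   $1 = port, remaining arguments = backend IP addresses or host names
# Returns non-zero if at least one backend does not answer as expected.
check_rgw_backends() {
    port=$1
    shift
    failed=0
    for host in "$@"; do
        if curl -s --max-time 5 "http://$host:$port" | is_rgw_response; then
            echo "$host:$port OK"
        else
            echo "$host:$port FAILED"
            failed=1
        fi
    done
    return "$failed"
}

# Example usage with the backend addresses from this procedure:
#   check_rgw_backends 8090 172.17.3.26 172.17.3.23 172.17.3.81
```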

7.4. Deploying a Red Hat Ceph Storage ingress daemon

To match the cardinality diagram, you use cephadm labels to refer to a group of nodes where a given daemon type should be deployed. For more information about the cardinality diagram, see Red Hat Ceph Storage daemon cardinality.

HAProxy is managed by director through Pacemaker. At this point, the three running HAProxy instances still point to the old RGW backends, which results in a broken configuration. Because you are going to deploy the Ceph ingress daemon, first remove the existing ceph_rgw configuration, clean up the configuration that director created, and restart the service to make sure that other services are not affected by this change.

After you complete this procedure, you can reach the RGW backend from the ingress daemon and use RGW through the Object Storage service command line interface (CLI).


  1. Log in to each Controller node and remove the following configuration from the /var/lib/config-data/puppet-generated/haproxy/etc/haproxy/haproxy.cfg file:

    listen ceph_rgw
      bind transparent
      mode http
      balance leastconn
      http-request set-header X-Forwarded-Proto https if { ssl_fc }
      http-request set-header X-Forwarded-Proto http if !{ ssl_fc }
      http-request set-header X-Forwarded-Port %[dst_port]
      option httpchk GET /swift/healthcheck
      option httplog
      option forwardfor
      server controller-0.storage.redhat.local check fall 5 inter 2000 rise 2
      server controller-1.storage.redhat.local check fall 5 inter 2000 rise 2
      server controller-2.storage.redhat.local check fall 5 inter 2000 rise 2
  2. Restart haproxy-bundle and ensure it is started:

    [root@controller-0 ~]# sudo pcs resource restart haproxy-bundle
    haproxy-bundle successfully restarted
    [root@controller-0 ~]# sudo pcs status | grep haproxy
      * Container bundle set: haproxy-bundle [undercloud-0.ctlplane.redhat.local:8787/rh-osbs/rhosp17-openstack-haproxy:pcmklatest]:
        * haproxy-bundle-podman-0   (ocf:heartbeat:podman):  Started controller-0
        * haproxy-bundle-podman-1   (ocf:heartbeat:podman):  Started controller-1
        * haproxy-bundle-podman-2   (ocf:heartbeat:podman):  Started controller-2
  3. Confirm that no process is bound to 8080:

    [root@controller-0 ~]# ss -antop | grep 8080
    [root@controller-0 ~]#

    The Object Storage service (swift) CLI fails at this point:

    (overcloud) [root@cephstorage-0 ~]# swift list
    HTTPConnectionPool(host='', port=8080): Max retries exceeded with url: /swift/v1/AUTH_852f24425bb54fa896476af48cbe35d3?format=json (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fc41beb0430>: Failed to establish a new connection: [Errno 111] Connection refused'))
  4. Set the required images for both HAProxy and Keepalived:

    [ceph: root@controller-0 /]# ceph config set mgr mgr/cephadm/container_image_haproxy registry.redhat.io/rhceph/rhceph-haproxy-rhel9:latest
    [ceph: root@controller-0 /]# ceph config set mgr mgr/cephadm/container_image_keepalived registry.redhat.io/rhceph/keepalived-rhel9:latest
  5. Create a file called rgw_ingress in the /home/ceph-admin/specs/ directory on controller-0:

    $ sudo vim /home/ceph-admin/specs/rgw_ingress
  6. Paste the following content into the rgw_ingress file:

    service_type: ingress
    service_id: rgw.rgw
    placement:
      label: rgw
    spec:
      backend_service: rgw.rgw
      frontend_port: 8080
      monitor_port: 8898
      virtual_interface_networks:
        - <external_network>
  7. Apply the rgw_ingress spec by using the Ceph orchestrator CLI:

    $ cephadm shell -m /home/ceph-admin/specs/rgw_ingress
    $ cephadm shell -- ceph orch apply -i /mnt/rgw_ingress
  8. Wait until the ingress is deployed and query the resulting endpoint:

    [ceph: root@controller-0 /]# ceph orch ls
    NAME                 	PORTS            	RUNNING  REFRESHED  AGE  PLACEMENT
    crash                                         	6/6  6m ago 	3d   *
    ingress.rgw.rgw,8898  	6/6  37s ago	60s  label:rgw
    mds.mds                   3/3  6m ago 	3d   controller-0;controller-1;controller-2
    mgr                       3/3  6m ago 	3d   controller-0;controller-1;controller-2
    mon                       3/3  6m ago 	3d   controller-0;controller-1;controller-2
    osd.default_drive_group   15  37s ago	3d   cephstorage-0;cephstorage-1;cephstorage-2
    rgw.rgw   ?:8090          3/3  37s ago	4m   label:rgw
    [ceph: root@controller-0 /]# curl
    <?xml version="1.0" encoding="UTF-8"?><ListAllMyBucketsResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/"><Owner><ID>anonymous</ID><DisplayName></DisplayName></Owner><Buckets></Buckets></ListAllMyBucketsResult>[ceph: root@controller-0 /]#
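
Waiting for the ingress service can be automated by parsing the `ceph orch ls` output. The following is a minimal sketch: the function name service_ready is hypothetical, and it assumes the plain-text column layout shown in this procedure, where the RUNNING column has the form n/m.

```shell
# Hypothetical helper: read `ceph orch ls` output on stdin and succeed only if
# the named service reports all of its daemons as running (for example 6/6).
#   $1 = service name, such as ingress.rgw.rgw
service_ready() {
    awk -v svc="$1" '
        $1 == svc {
            # The PORTS column can be empty, so search for the first n/m field.
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^[0-9]+\/[0-9]+$/) {
                    split($i, c, "/")
                    if (c[1] == c[2] && c[2] > 0) ok = 1
                    break
                }
            }
        }
        END { exit !ok }
    '
}

# Example usage: poll every 10 seconds until the ingress service is fully running.
#   until cephadm shell -- ceph orch ls | service_ready ingress.rgw.rgw; do
#       sleep 10
#   done
```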

7.5. Updating the object-store endpoints

The object-store endpoints still point to the original virtual IP address (VIP) that is owned by Pacemaker. Because other services still use the original VIP, you reserved a new VIP on the same network for the ingress daemon, and you must update the object-store endpoints to point to the new VIP.


  1. List the current endpoints:

    (overcloud) [stack@undercloud-0 ~]$ openstack endpoint list | grep object
    | 1326241fb6b6494282a86768311f48d1 | regionOne | swift    	| object-store   | True	| internal  | |
    | 8a34817a9d3443e2af55e108d63bb02b | regionOne | swift    	| object-store   | True	| public	|  |
    | fa72f8b8b24e448a8d4d1caaeaa7ac58 | regionOne | swift    	| object-store   | True	| admin 	| |
  2. Update the endpoints so that they point to the Ingress VIP:

    (overcloud) [stack@undercloud-0 ~]$ openstack endpoint set --url "" 95596a2d92c74c15b83325a11a4f07a3
    (overcloud) [stack@undercloud-0 ~]$ openstack endpoint list | grep object-store
    | 6c7244cc8928448d88ebfad864fdd5ca | regionOne | swift    	| object-store   | True	| internal  | |
    | 95596a2d92c74c15b83325a11a4f07a3 | regionOne | swift    	| object-store   | True	| public	|   |
    | e6d0599c5bf24a0fb1ddf6ecac00de2d | regionOne | swift    	| object-store   | True	| admin 	| |

    Repeat this step for both internal and admin endpoints.

  3. Test the migrated service:

    (overcloud) [stack@undercloud-0 ~]$ swift list --debug
    DEBUG:swiftclient:Versionless auth_url - using as endpoint
    DEBUG:keystoneclient.auth.identity.v3.base:Making authentication request to
    DEBUG:urllib3.connectionpool:Starting new HTTP connection (1):
    DEBUG:urllib3.connectionpool: "POST /v3/auth/tokens HTTP/1.1" 201 7795
    DEBUG:keystoneclient.auth.identity.v3.base:{"token": {"methods": ["password"], "user": {"domain": {"id": "default", "name": "Default"}, "id": "6f87c7ffdddf463bbc633980cfd02bb3", "name": "admin", "password_expires_at": null},
    DEBUG:swiftclient:REQ: curl -i -X GET -H "X-Auth-Token: gAAAAABj7KHdjZ95syP4c8v5a2zfXckPwxFQZYg0pgWR42JnUs83CcKhYGY6PFNF5Cg5g2WuiYwMIXHm8xftyWf08zwTycJLLMeEwoxLkcByXPZr7kT92ApT-36wTfpi-zbYXd1tI5R00xtAzDjO3RH1kmeLXDgIQEVp0jMRAxoVH4zb-DVHUos" -H "Accept-Encoding: gzip"
    DEBUG:swiftclient:RESP STATUS: 200 OK
    DEBUG:swiftclient:RESP HEADERS: {'content-length': '2', 'x-timestamp': '1676452317.72866', 'x-account-container-count': '0', 'x-account-object-count': '0', 'x-account-bytes-used': '0', 'x-account-bytes-used-actual': '0', 'x-account-storage-policy-default-placement-container-count': '0', 'x-account-storage-policy-default-placement-object-count': '0', 'x-account-storage-policy-default-placement-bytes-used': '0', 'x-account-storage-policy-default-placement-bytes-used-actual': '0', 'x-trans-id': 'tx00000765c4b04f1130018-0063eca1dd-1dcba-default', 'x-openstack-request-id': 'tx00000765c4b04f1130018-0063eca1dd-1dcba-default', 'accept-ranges': 'bytes', 'content-type': 'application/json; charset=utf-8', 'date': 'Wed, 15 Feb 2023 09:11:57 GMT'}
    DEBUG:swiftclient:RESP BODY: b'[]'
  4. Run tempest tests against object-storage:

    (overcloud) [stack@undercloud-0 tempest-dir]$ tempest run --regex tempest.api.object_storage
    Ran: 141 tests in 606.5579 sec.
     - Passed: 128
     - Skipped: 13
     - Expected Fail: 0
     - Unexpected Success: 0
     - Failed: 0
    Sum of execute time for each test: 657.5183 sec.
    Worker Balance
     - Worker 0 (1 tests) => 0:10:03.400561
     - Worker 1 (2 tests) => 0:00:24.531916
     - Worker 2 (4 tests) => 0:00:10.249889
     - Worker 3 (30 tests) => 0:00:32.730095
     - Worker 4 (51 tests) => 0:00:26.246044
     - Worker 5 (6 tests) => 0:00:20.114803
     - Worker 6 (20 tests) => 0:00:16.290323
     - Worker 7 (27 tests) => 0:00:17.103827
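
The endpoint update in this procedure must be repeated for the public, internal, and admin endpoints, so it can be convenient to loop over them. The following is a minimal sketch: the function name update_object_store_endpoints is hypothetical, and the caller supplies the full endpoint URL, which this sketch does not derive. It relies on the `--service` filter and the `-f value -c ID` output options of the openstack client.

```shell
# Hypothetical helper: point every object-store endpoint at a new URL.
#   $1 = full endpoint URL that uses the ingress VIP (supplied by the caller)
update_object_store_endpoints() {
    new_url=$1
    if [ -z "$new_url" ]; then
        echo "usage: update_object_store_endpoints <url>" >&2
        return 1
    fi
    # Collect the IDs of all object-store endpoints, one per interface.
    ids=$(openstack endpoint list --service object-store -f value -c ID) || return 1
    for endpoint_id in $ids; do
        echo "Updating endpoint $endpoint_id"
        openstack endpoint set --url "$new_url" "$endpoint_id" || return 1
    done
}

# Example usage, after sourcing the overcloud credentials:
#   update_object_store_endpoints "<new_endpoint_url>"
```

The same URL is applied to all three interfaces. If your internal and admin endpoints need a different URL from the public one, update them individually instead.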
© 2024 Red Hat, Inc.