
Deploying a Distributed Compute Node (DCN) architecture


Red Hat OpenStack Services on OpenShift 18.0

Edge and storage configuration for Red Hat OpenStack Services on Openshift

OpenStack Documentation Team

Abstract

You can deploy Red Hat OpenStack Services on OpenShift (RHOSO) with a distributed compute node (DCN) architecture for edge sites at remote locations. Each site can have its own Red Hat Ceph Storage back end for Image service (glance).

Providing feedback on Red Hat documentation

We appreciate your feedback. Tell us how we can improve the documentation.

To provide documentation feedback for Red Hat OpenStack Services on OpenShift (RHOSO), create a Jira issue in the OSPRH Jira project.

Procedure

  1. Log in to the Red Hat Atlassian Jira.
  2. Click the following link to open a Create Issue page: Create issue
  3. Complete the Summary and Description fields. In the Description field, include the documentation URL, chapter or section number, and a detailed description of the issue.
  4. Click Create.
  5. Review the details of the bug you created.

Chapter 1. Understanding DCN

Note

An upgrade from Red Hat OpenStack Platform (RHOSP) 17.1 to Red Hat OpenStack Services on OpenShift (RHOSO) 18.0.3 is not currently supported for Distributed Compute Node (DCN) deployments.

Distributed compute node (DCN) architecture is for edge use cases that require Compute and storage nodes to be deployed remotely while sharing a centralized control plane. With DCN architecture, you can position workloads strategically closer to your operational needs for higher performance.

The central location consists of, at a minimum, the RHOSO control plane installed on a Red Hat OpenShift Container Platform (RHOCP) cluster. Compute nodes can also be deployed at the central location. The edge locations consist of Compute and optional storage nodes.

DCN architecture consists of multiple availability zones (AZs) to ensure isolation and per-site scheduling of OpenStack resources.

You configure each site with a unique AZ. In this guide, the central site uses az0, the first edge location uses az1, and so on. You can use any naming convention, provided that the AZ names are unique per site.
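For example, after you deploy Compute nodes for an edge site, you can create the corresponding Compute AZ by placing those hosts in a host aggregate that has an AZ set. The following is a minimal sketch; the aggregate name az1 and the host name compute-az1-0.example.com are illustrative:

$ openstack aggregate create --zone az1 az1
$ openstack aggregate add host az1 compute-az1-0.example.com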

Figure 1.1. Basic distributed compute node architecture with storage

Illustration showing the central site connected to three remote sites with storage.

DCN architecture is a hub-and-spoke routed network deployment. DCN is comparable to a spine-leaf deployment for routed provisioning and control plane networking with RHOSO.

  • The hub is the central site with core routers and a datacenter gateway (DC-GW). The hub hosts the control plane which manages the geographically dispersed sites.
  • The spokes are the remote edge sites. Each site is defined by using an OpenStackDataPlaneNodeSet custom resource. Red Hat Ceph Storage (RHCS) is used as the storage back end. You can deploy RHCS in either a hyperconverged configuration, or as a standalone storage back end.

When you launch an instance at an edge site, the required image is copied to the local Image service (glance) store automatically. You can copy images from the central Image store to edge sites by using glance multistore to save time during instance launch.
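For example, you can pre-copy an image that already exists in the central store to an edge store by using the copy-image import method. The following is a minimal sketch; the store name az1 is an assumed Image service back-end name, and <image_id> is a placeholder:

$ glance image-import <image_id> --import-method copy-image --stores az1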

1.1. Required software for DCN architecture

Verify that your environment meets the minimum software version requirements for distributed compute node (DCN) architecture. Meeting these requirements ensures compatibility and support for your distributed deployment.

Table 1.1. Required software versions for DCN architecture

Platform                             | Version | Optional
-------------------------------------|---------|---------
Red Hat OpenShift Container Platform | 4.16    | No
Red Hat Enterprise Linux             | 9.2     | No
Red Hat OpenStack Platform           | 18.0.3  | No
Red Hat Ceph Storage                 | 7 or 8  | Yes

1.2. DCN storage

Choose from three storage deployment configurations for your central and edge locations. Each configuration option provides different trade-offs between performance, capacity, and operational complexity.

  • Without storage.
  • Using hyperconverged Ceph storage.
  • Using Red Hat Ceph Storage (RHCS) as a standalone storage backend.

The storage you deploy is dedicated to the site you deploy it on. DCN architecture uses an Image service (glance) pod and a Block Storage service (cinder) pod for each site, deployed at the central location on the Red Hat OpenShift Container Platform (RHOCP) cluster.

For edge sites deployed without storage, you can use the aggregate cache command to store images in the Compute service (nova) cache. Caching Image service images in the Compute service provides faster boot times for instances by avoiding the process of downloading images across a WAN link.

Example:

$ openstack aggregate cache image <dcn0> <myimage>
  • Replace <dcn0> with the name of your availability zone.
  • Replace <myimage> with the name of your image.
Note

Red Hat OpenStack Services on OpenShift (RHOSO) supports external deployments of Red Hat Ceph Storage 7 and 8. Configuration examples that reference Red Hat Ceph Storage use Red Hat Ceph Storage 7 information. If you are using a later version of Red Hat Ceph Storage, adjust the configuration examples accordingly.

Chapter 2. Planning a DCN deployment

Plan your distributed compute node (DCN) deployment to verify that required technologies are available and supported. This planning helps to ensure a successful deployment and to avoid compatibility issues.

2.1. Storage considerations for DCN architecture

Understand storage specific requirements and supported features for distributed compute node (DCN) deployments. These considerations help you plan storage configurations that work within DCN architectural constraints.

The following features are not currently supported for DCN architectures:

  • Copying a volume between edge sites. You can work around this by creating an image from the volume and using the Image service (glance) to copy the image to the destination site. After the image is copied, you can create a volume from it. See the example after this list.
  • Ceph Rados Gateway (RGW) at the edge sites.
  • CephFS at the edge sites.
  • Instance high availability (HA) at the edge sites.
  • RBD mirroring between edge sites.
  • Instance migration, live or cold, either between edge sites, or from the central location to edge sites. You can only migrate instances within a site boundary. To move an image between sites, you must snapshot the image, and use glance image-import.
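The following sketch illustrates the volume workaround described in this list, assuming the source volume is at az1 and the destination site is az2; all names and the store name az2 are placeholders:

$ openstack image create --volume <source_volume> <image_name>
$ glance image-import <image_id> --import-method copy-image --stores az2
$ openstack volume create --image <image_id> --size <size_gb> --availability-zone az2 <new_volume_name>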

Additionally, you must consider the following:

  • You must upload images to the central location before copying them to edge sites. A copy of each image must exist in the Image service at the central location.
  • You must use the RBD storage driver for the Image, Compute, and Block Storage services.
  • For each site, including the central location, assign a unique availability zone.
  • You can migrate an offline volume from an edge site to the central location, or vice versa. You cannot migrate volumes directly between edge sites.

2.2. Networking considerations for DCN architecture

Understand architectural limitations and requirements for DCN architecture deployments. These considerations help you successfully deploy distributed edge locations and maintain performance.

The following features are not currently supported for DCN architectures:

  • DHCP on DPDK nodes
  • TC Flower Hardware Offload

The following ML2/OVN networking technologies are fully supported:

  • Routed provider networks
  • OVN gateway (Networker node) with Networking service (neutron) AZs

Additionally, you must consider the following:

  • Network latency: Balance the latency, as measured in round-trip time (RTT), against the expected number of concurrent API operations to maintain acceptable performance. Maximum TCP/IP throughput is inversely proportional to RTT. You can mitigate some issues with high-latency, high-bandwidth connections by tuning kernel TCP parameters; see the example after this list. Contact Red Hat Support if cross-site latency exceeds 100 ms.
  • Network drop outs: If the edge site temporarily loses connection to the central site, then no control plane API or CLI operations can be executed at the impacted edge site for the duration of the outage. For example, Compute nodes at the edge site are consequently unable to create a snapshot of an instance, issue an auth token, or delete an image. General control plane API and CLI operations remain functional during this outage, and can continue to serve any other edge sites that have a working connection.
  • Image type: You must use raw images when deploying a DCN architecture with Ceph storage.
  • Image sizing:

    • Compute images: Compute images are downloaded from the central location. These images are potentially large files that are transferred across all necessary networks from the central site to the edge site during provisioning.
    • Instance images: If there is no block storage at the edge, then the Image service images traverse the WAN during first use. The images are copied or cached locally to the target edge nodes for all subsequent use. There is no size limit for images. Transfer times vary with available bandwidth and network latency.
  • Provider networks are the most common approach for DCN deployments. Note that the Networking service (neutron) does not validate where you can attach available networks. For example, if you use a provider network called "site-a" only in edge site A, the Networking service does not validate and prevent you from attaching "site-a" to an instance at site B, which does not work.
  • Site-specific networks: A limitation in DCN networking arises if you use networks that are specific to a certain site: When you deploy centralized neutron controllers with Compute nodes, there are no triggers in the Networking service to identify a certain Compute node as a remote node. Consequently, the Compute nodes receive a list of other Compute nodes and automatically form tunnels between each other. The tunnels are formed from edge to edge through the central site. If you use VXLAN or GENEVE, every Compute node at every site forms a tunnel with every other Compute node, whether they are local or remote. This is not an issue if you are using the same networks everywhere. When you use VLANs, the Networking service expects that all Compute nodes have the same bridge mappings, and that all VLANs are available at every site.
  • If edge servers are not pre-provisioned, you must configure DHCP relay for introspection and provisioning on routed segments.
  • Routing must be configured either on the cloud or within the networking infrastructure that connects each edge site to the hub. You should implement a networking design that allocates an L3 subnet for each RHOSO cluster network (external, internal API, and so on), unique to each site. If you are using BGP, you must configure BGP on the routers in these locations to learn the routes advertised by the RHOSO control plane and data plane nodes.
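The following is a minimal, illustrative sketch of kernel TCP tuning for high-latency, high-bandwidth WAN links; the parameter values are examples only and are not recommendations for any specific deployment:

# /etc/sysctl.d/90-wan-tuning.conf (example values only)
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216

$ sysctl -p /etc/sysctl.d/90-wan-tuning.conf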

2.3. IP address pool sizing for the internalapi network

Size your internal API network address pool based on the number of distributed compute node (DCN) sites in your deployment. Each site requires dedicated IP addresses for Image service (glance) endpoints and load balancers.

The Image service operator creates an endpoint for each Image service pod with its own DNS name, such as glance-az0-internal.openstack.svc:9292. Each Compute service and Block Storage service in each availability zone uses the Image service API server in the same availability zone. For example, when you update the cinderVolumes field in the OpenStackControlPlane custom resource (CR), add a field called glance_api_servers under customServiceConfig:

      cinderVolumes:
        az0:
          customServiceConfig: |
            [DEFAULT]
            enabled_backends = az0
            glance_api_servers = https://glance-az0-internal.openstack.svc:9292

The Image service endpoint DNS name maps to a load balancer IP address in the internalapi address pool as indicated by the internal metadata annotations:

            [glance_store]
            default_backend = ceph
            [ceph]
            rbd_store_ceph_conf = /etc/ceph/ceph.conf
            store_description = "ceph RBD backend"
            rbd_store_pool = images
            rbd_store_user = openstack
            rbd_thin_provisioning = True
          networkAttachments:
          - storage
          override:
            service:
              internal:
                metadata:
                  annotations:
                    metallb.universe.tf/address-pool: internalapi
                    metallb.universe.tf/allow-shared-ip: internalapi
                    metallb.universe.tf/loadBalancerIPs: 172.17.0.80

The range of addresses in this address pool should be sized according to the number of DCN sites. For example, the following output shows a range of only 11 addresses in the internalapi network.

$ oc get ipaddresspool -n metallb-system
NAME          AUTO ASSIGN   AVOID BUGGY IPS   ADDRESSES
ctlplane      true          false             ["192.168.122.80-192.168.122.90"]
internalapi   true          false             ["172.17.0.80-172.17.0.90"]
storage       true          false             ["172.18.0.80-172.18.0.90"]
tenant        true          false             ["172.19.0.80-172.19.0.90"]
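If you plan to add more DCN sites, you can expand the range in the internalapi IPAddressPool resource. The following is a minimal sketch of a MetalLB IPAddressPool with a larger range; the range shown is an example only:

apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: internalapi
  namespace: metallb-system
spec:
  addresses:
  - 172.17.0.80-172.17.0.120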

After you update the glance section of the OpenStackControlPlane CR, use commands like the following to confirm that the Glance Operator has created the service endpoints and routes.

$ oc get svc | grep glance
glance-az0-internal             LoadBalancer   172.30.217.178   172.17.0.80      9292:32134/TCP                                   24h
glance-az0-public               ClusterIP      172.30.78.47     <none>           9292/TCP                                         24h
glance-az1-internal             LoadBalancer   172.30.52.123    172.17.0.81      9292:31679/TCP                                   23h
glance-c1ca8-az0-external-api   ClusterIP      None             <none>           9292/TCP                                         24h
glance-c1ca8-az0-internal-api   ClusterIP      None             <none>           9292/TCP                                         24h
glance-c1ca8-az1-edge-api       ClusterIP      None             <none>           9292/TCP                                         23h

$ oc get route | grep glance
glance-az0-public              glance-az0-public-openstack.apps.ocp.openstack.lab                     glance-az0-public              glance-az0-public              reencrypt/Redirect   None
glance-default-public          glance-default-public-openstack.apps.ocp.openstack.lab                 glance-default-public          glance-default-public          reencrypt/Redirect   None
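As an optional check, you can confirm which internalapi pool address was assigned to a given Image service endpoint. The following sketch assumes the service name glance-az0-internal from the example output above:

$ oc get svc glance-az0-internal -n openstack -o jsonpath='{.status.loadBalancer.ingress[0].ip}'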

Chapter 3. Installing and preparing the OpenStack Operator

You install the Red Hat OpenStack Services on OpenShift (RHOSO) OpenStack Operator (openstack-operator) and create the RHOSO control plane on an operational Red Hat OpenShift Container Platform (RHOCP) cluster. You install the OpenStack Operator by using the RHOCP OperatorHub. You perform the control plane installation tasks and all data plane creation tasks on a workstation that has access to the RHOCP cluster.

For information about mapping RHOSO versions to OpenStack Operators and OpenStackVersion Custom Resources (CRs), see the Red Hat Knowledgebase article How RHOSO versions map to OpenStack Operators and OpenStackVersion CRs.

3.2. Installing the OpenStack Operator by using the web console

You can use the Red Hat OpenShift Container Platform (RHOCP) web console to install the OpenStack Operator (openstack-operator) on your RHOCP cluster from the OperatorHub. After you install the Operator, you configure a single instance of the OpenStack Operator initialization resource, OpenStack, to start the OpenStack Operator on your cluster.

Procedure

  1. Log in to the RHOCP web console as a user with cluster-admin permissions.
  2. Select Operators → OperatorHub.
  3. In the Filter by keyword field, type OpenStack.
  4. Click the OpenStack Operator tile with the Red Hat source label.
  5. Read the information about the Operator and click Install.
  6. On the Install Operator page, select "Operator recommended Namespace: openstack-operators" from the Installed Namespace list.
  7. On the Install Operator page, select "Manual" from the Update approval list. For information about how to manually approve a pending Operator update, see Manually approving a pending Operator update in the RHOCP Operators guide.
  8. Click Install to make the Operator available to the openstack-operators namespace. The OpenStack Operator is installed when the Status is Succeeded.
  9. Click Create OpenStack to open the Create OpenStack page.
  10. On the Create OpenStack page, click Create to create an instance of the OpenStack Operator initialization resource. The OpenStack Operator is ready to use when the Status of the openstack instance is Conditions: Ready.

3.3. Installing the OpenStack Operator by using the CLI

You can use the Red Hat OpenShift Container Platform (RHOCP) CLI (oc) to install the OpenStack Operator (openstack-operator) on your RHOCP cluster from the OperatorHub.

To install the OpenStack Operator by using the CLI, you create the openstack-operators namespace for the Red Hat OpenStack Platform (RHOSP) service Operators. You then create the OperatorGroup and Subscription custom resources (CRs) within the namespace. After you install the Operator, you configure a single instance of the OpenStack Operator initialization resource, OpenStack, to start the OpenStack Operator on your cluster.

Procedure

  1. Create the openstack-operators namespace for the RHOSP operators:

    $ cat << EOF | oc apply -f -
    apiVersion: v1
    kind: Namespace
    metadata:
      name: openstack-operators
    spec:
      finalizers:
      - kubernetes
    EOF
  2. Create the OperatorGroup CR in the openstack-operators namespace:

    $ cat << EOF | oc apply -f -
    apiVersion: operators.coreos.com/v1
    kind: OperatorGroup
    metadata:
      name: openstack
      namespace: openstack-operators
    EOF
  3. Create the Subscription CR that subscribes to openstack-operator:

    $ cat << EOF| oc apply -f -
    apiVersion: operators.coreos.com/v1alpha1
    kind: Subscription
    metadata:
      name: openstack-operator
      namespace: openstack-operators
    spec:
      name: openstack-operator
      channel: stable-v1.0
      source: redhat-operators
      sourceNamespace: openshift-marketplace
      installPlanApproval: Manual
    EOF
  4. Wait for the install plan to be created:

    $ oc get installplan -n openstack-operators -o json | jq -r '.items[] | select(.spec.approval=="Manual" and .spec.approved==false) | .metadata.name' | head -n1
  5. Approve the install plan:

    $ oc patch installplan <install_plan_name> -n openstack-operators --type merge -p '{"spec":{"approved":true}}'
  6. Verify that the OpenStack Operator is installed:

    $ oc wait csv -n openstack-operators \
     -l operators.coreos.com/openstack-operator.openstack-operators="" \
     --for jsonpath='{.status.phase}'=Succeeded
  7. Create an instance of the openstack-operator:

    $ cat << EOF | oc apply -f -
    apiVersion: operator.openstack.org/v1beta1
    kind: OpenStack
    metadata:
      name: openstack
      namespace: openstack-operators
    EOF
  8. Confirm that the OpenStack Operator is deployed:

    $ oc wait openstack/openstack -n openstack-operators --for condition=Ready --timeout=500s
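As an optional check that is not part of the documented procedure, you can list the pods in the openstack-operators namespace to see the Operator workloads that were created:

$ oc get pods -n openstack-operators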

Chapter 4. Deploying the DCN control plane

Deploy the control plane on a Red Hat OpenShift Container Platform (RHOCP) cluster and configure the required networks. DCN deployments require careful network configuration to manage multiple subnets across central and edge locations.

Configure RHOCP networks before installing the control plane. The subnets that you use are specific to your environment. This document uses the following configuration in each of its examples.

Table 4.1. Example network configuration for DCN deployment

Network            | Central location (AZ-0) | AZ-1             | AZ-2
-------------------|--------------------------|------------------|------------------
Control plane      | 192.168.122.0/24         | 192.168.133.0/24 | 192.168.144.0/24
External           | 10.0.0.0/24              | 10.0.10.0/24     | 10.0.20.0/24
Internal           | 172.17.0.0/24            | 172.17.10.0/24   | 172.17.20.0/24
Storage            | 172.18.0.0/24            | 172.18.10.0/24   | 172.18.20.0/24
Tenant             | 172.19.0.0/24            | 172.19.10.0/24   | 172.19.20.0/24
Storage Management | 172.20.0.0/24            | 172.20.10.0/24   | 172.20.20.0/24

4.1. Spine-leaf network topology for DCN

Configure the routed spine-leaf network topology to interconnect geographically distributed nodes in your distributed compute node (DCN) deployment. This network topology is required for edge deployments.

You must configure the following CRs:

NodeNetworkConfigurationPolicy
Use the NodeNetworkConfigurationPolicy CR to configure the interfaces for each isolated network on each worker node in the RHOCP cluster.
NetworkAttachmentDefinition
Use the NetworkAttachmentDefinition CR to attach service pods to the isolated networks, where needed.
L2Advertisement
Use the L2Advertisement resource to define how the Virtual IPs (VIPs) are announced.
IPAddressPool
Use the IPAddressPool resource to configure which IPs can be used as VIPs.
NetConfig
Use the NetConfig CR to specify the subnets for the data plane networks.
OpenStackControlPlane
Use the OpenStackControlPlane to define and configure OpenStack services on OpenShift.

4.2. Preparing DCN networking

Configure networking for your distributed compute node (DCN) deployment by setting up network interfaces, routes, and IP address pools. Proper network configuration ensures reliable communication between the central control plane and distributed edge locations.

Prerequisites

  • The OpenStack Operator is installed

Procedure

  1. Create a NodeNetworkConfigurationPolicy (nncp) CR definition file on your workstation for each worker node in the RHOCP cluster that hosts OpenStack services.
  2. In each nncp CR file, configure the interfaces for each isolated network. Each service interface must have its own unique address:

    apiVersion: nmstate.io/v1
    kind: NodeNetworkConfigurationPolicy
    metadata:
      labels:
        osp/nncm-config-type: standard
      name: worker-0
      namespace: openstack
    spec:
      desiredState:
        dns-resolver:
          config:
            search: []
            server:
            - 192.168.122.1
        interfaces:
        - description: internalapi vlan interface
          ipv4:
            address:
            - ip: 172.17.0.10
              prefix-length: "24"
            dhcp: false
            enabled: true
          ipv6:
            enabled: false
          mtu: 1496
          name: internalapi
          state: up
          type: vlan
          vlan:
            base-iface: enp7s0
            id: "20"
        - description: storage vlan interface
          ipv4:
            address:
            - ip: 172.18.0.10
              prefix-length: "24"
            dhcp: false
            enabled: true
          ipv6:
            enabled: false
          mtu: 1496
          name: storage
          state: up
          type: vlan
          vlan:
            base-iface: enp7s0
            id: "21"
        - description: tenant vlan interface
          ipv4:
            address:
            - ip: 172.19.0.10
              prefix-length: "24"
            dhcp: false
            enabled: true
          ipv6:
            enabled: false
          mtu: 1496
          name: tenant
          state: up
          type: vlan
          vlan:
            base-iface: enp7s0
            id: "22"
        - description: ctlplane interface
          mtu: 1500
          name: enp7s0
          state: up
          type: ethernet
        - bridge:
            options:
              stp:
                enabled: false
            port:
            - name: enp7s0
              vlan: {}
          description: linux-bridge over ctlplane interface
          ipv4:
            address:
            - ip: 192.168.122.10
              prefix-length: "24"
            dhcp: false
            enabled: true
          ipv6:
            enabled: false
          mtu: 1500
          name: ospbr
          state: up
          type: linux-bridge
  3. Add the route-rules attribute and the route configuration to networks in each remote location to each nncp CR file:

        route-rules:
          config: []
        routes:
          config:
          - destination: 192.168.133.0/24
            next-hop-address: 192.168.122.1
            next-hop-interface: ospbr
            table-id: 254
          - destination: 192.168.144.0/24
            next-hop-address: 192.168.122.1
            next-hop-interface: ospbr
            table-id: 254
          - destination: 172.17.10.0/24
            next-hop-address: 172.17.0.1
            next-hop-interface: internalapi
            table-id: 254
          - destination: 172.18.10.0/24
            next-hop-address: 172.18.0.1
            next-hop-interface: storage
            table-id: 254
          - destination: 172.19.10.0/24
            next-hop-address: 172.19.0.1
            next-hop-interface: tenant
            table-id: 254
          - destination: 172.17.20.0/24
            next-hop-address: 172.17.0.1
            next-hop-interface: internalapi
            table-id: 254
          - destination: 172.18.20.0/24
            next-hop-address: 172.18.0.1
            next-hop-interface: storage
            table-id: 254
          - destination: 172.19.20.0/24
            next-hop-address: 172.19.0.1
            next-hop-interface: tenant
            table-id: 254
      nodeSelector:
        kubernetes.io/hostname: worker-0
        node-role.kubernetes.io/worker: ""
    Note

    Each service network routes to the same network at each remote location. For example, the internalapi network (172.17.0.0/24) has a route to the internalapi network at each remote location (172.17.10.0/24 and 172.17.20.0/24) through a local router at 172.17.0.1.

  4. Create the nncp CRs in the cluster:

    $ oc create -f worker0-nncp.yaml
    $ oc create -f worker1-nncp.yaml
    $ oc create -f worker2-nncp.yaml
  5. Create a NetworkAttachmentDefinition CR definition file for each network. Include routes to each remote location to the networks of the same function. For example, the internalapi NetworkAttachmentDefinition specifies its own subnet range as well as routes to the internalapi networks at remote sites.

    1. Create a NetworkAttachmentDefinition CR definition file for the internalapi network:

      apiVersion: k8s.cni.cncf.io/v1
      kind: NetworkAttachmentDefinition
      metadata:
        labels:
          osp/net: internalapi
          osp/net-attach-def-type: standard
        name: internalapi
        namespace: openstack
      spec:
        config: |
          {
            "cniVersion": "0.3.1",
            "name": "internalapi",
            "type": "macvlan",
            "master": "internalapi",
            "ipam": {
              "type": "whereabouts",
              "range": "172.17.0.0/24",
              "range_start": "172.17.0.30",
              "range_end": "172.17.0.70",
              "routes": [
                  { "dst": "172.17.10.0/24", "gw": "172.17.0.1" },
                  { "dst": "172.17.20.0/24", "gw": "172.17.0.1" }
                ]
            }
          }
    2. Create a NetworkAttachmentDefinition CR definition file for the control network:

      apiVersion: k8s.cni.cncf.io/v1
      kind: NetworkAttachmentDefinition
      metadata:
        labels:
          osp/net: ctlplane
          osp/net-attach-def-type: standard
        name: ctlplane
        namespace: openstack
      spec:
        config: |
          {
            "cniVersion": "0.3.1",
            "name": "ctlplane",
            "type": "macvlan",
            "master": "ospbr",
            "ipam": {
              "type": "whereabouts",
              "range": "192.168.122.0/24",
              "range_start": "192.168.122.30",
              "range_end": "192.168.122.70",
              "routes": [
                  { "dst": "192.168.133.0/24", "gw": "192.168.122.1" },
                  { "dst": "192.168.144.0/24", "gw": "192.168.122.1" }
                ]
            }
          }
    3. Create a NetworkAttachmentDefinition CR definition file for the storage network:

      apiVersion: k8s.cni.cncf.io/v1
      kind: NetworkAttachmentDefinition
      metadata:
        labels:
          osp/net: storage
          osp/net-attach-def-type: standard
        name: storage
        namespace: openstack
      spec:
        config: |
          {
            "cniVersion": "0.3.1",
            "name": "storage",
            "type": "macvlan",
            "master": "storage",
            "ipam": {
              "type": "whereabouts",
              "range": "172.18.0.0/24",
              "range_start": "172.18.0.30",
              "range_end": "172.18.0.70",
              "routes": [
                  { "dst": "172.18.10.0/24", "gw": "172.18.0.1" },
                  { "dst": "172.18.20.0/24", "gw": "172.18.0.1" }
                ]
            }
          }
    4. Create a NetworkAttachmentDefinition CR definition file for the tenant network:

      apiVersion: k8s.cni.cncf.io/v1
      kind: NetworkAttachmentDefinition
      metadata:
        labels:
          osp/net: tenant
          osp/net-attach-def-type: standard
        name: tenant
        namespace: openstack
      spec:
        config: |
          {
            "cniVersion": "0.3.1",
            "name": "tenant",
            "type": "macvlan",
            "master": "tenant",
            "ipam": {
              "type": "whereabouts",
              "range": "172.19.0.0/24",
              "range_start": "172.19.0.30",
              "range_end": "172.19.0.70",
              "routes": [
                  { "dst": "172.19.10.0/24", "gw": "172.19.0.1" },
                  { "dst": "172.19.20.0/24", "gw": "172.19.0.1" }
                ]
            }
          }
  6. Create the NetworkAttachmentDefinition CRs:

    $ oc create -f internalapi-net-attach-def.yaml
    $ oc create -f control-net-attach-def.yaml
    $ oc create -f storage-net-attach-def.yaml
    $ oc create -f tenant-net-attach-def.yaml
  7. Create a NetConfig CR definition file to define the subnets for the data plane networks. Each network is defined in the networks list with a dnsDomain field and allocationRanges for each geographic region. These ranges cannot overlap with the whereabouts IPAM ranges.

    1. Create the file with the added allocation ranges for the control plane networking similar to the following:

      apiVersion: network.openstack.org/v1beta1
      kind: NetConfig
      metadata:
        name: netconfig
        namespace: openstack
      spec:
        networks:
        - dnsDomain: ctlplane.example.com
          mtu: 1500
          name: ctlplane
          subnets:
          - allocationRanges:
            - end: 192.168.122.120
              start: 192.168.122.100
            - end: 192.168.122.170
              start: 192.168.122.150
            cidr: 192.168.122.0/24
            gateway: 192.168.122.1
            name: subnet1
            routes:
            - destination: 192.168.133.0/24
              nexthop: 192.168.122.1
            - destination: 192.168.144.0/24
              nexthop: 192.168.122.1
          - allocationRanges:
            - end: 192.168.133.120
              start: 192.168.133.100
            - end: 192.168.133.170
              start: 192.168.133.150
            cidr: 192.168.133.0/24
            gateway: 192.168.133.1
            name: subnet2
            routes:
            - destination: 192.168.122.0/24
              nexthop: 192.168.133.1
            - destination: 192.168.144.0/24
              nexthop: 192.168.133.1
          - allocationRanges:
            - end: 192.168.144.120
              start: 192.168.144.100
            - end: 192.168.144.170
              start: 192.168.144.150
            cidr: 192.168.144.0/24
            gateway: 192.168.144.1
            name: subnet3
            routes:
            - destination: 192.168.122.0/24
              nexthop: 192.168.144.1
            - destination: 192.168.133.0/24
              nexthop: 192.168.144.1
    2. Add an allocation range for the internalapi network:

        - dnsDomain: internalapi.example.com
          mtu: 1496
          name: internalapi
          subnets:
          - allocationRanges:
            - end: 172.17.0.250
              start: 172.17.0.100
            cidr: 172.17.0.0/24
            name: subnet1
            routes:
            - destination: 172.17.10.0/24
              nexthop: 172.17.0.1
            - destination: 172.17.20.0/24
              nexthop: 172.17.0.1
            vlan: 20
          - allocationRanges:
            - end: 172.17.10.250
              start: 172.17.10.100
            cidr: 172.17.10.0/24
            name: subnet2
            routes:
            - destination: 172.17.0.0/24
              nexthop: 172.17.10.1
            - destination: 172.17.20.0/24
              nexthop: 172.17.10.1
            vlan: 30
          - allocationRanges:
            - end: 172.17.20.250
              start: 172.17.20.100
            cidr: 172.17.20.0/24
            name: subnet3
            routes:
            - destination: 172.17.0.0/24
              nexthop: 172.17.20.1
            - destination: 172.17.10.0/24
              nexthop: 172.17.20.1
            vlan: 40
    3. Add allocation ranges for the external and storage networks:

        - dnsDomain: external.example.com
          mtu: 1500
          name: external
          subnets:
          - allocationRanges:
            - end: 10.0.0.250
              start: 10.0.0.100
            cidr: 10.0.0.0/24
            name: subnet1
            vlan: 22
          - allocationRanges:
            - end: 10.0.10.250
              start: 10.0.10.100
            cidr: 10.0.10.0/24
            name: subnet2
            vlan: 22
          - allocationRanges:
            - end: 10.0.20.250
              start: 10.0.20.100
            cidr: 10.0.20.0/24
            name: subnet3
            vlan: 22
        - dnsDomain: storage.example.com
          mtu: 1496
          name: storage
          subnets:
          - allocationRanges:
            - end: 172.18.0.250
              start: 172.18.0.100
            cidr: 172.18.0.0/24
            name: subnet1
            routes:
            - destination: 172.18.10.0/24
              nexthop: 172.18.0.1
            - destination: 172.18.20.0/24
              nexthop: 172.18.0.1
            vlan: 21
          - allocationRanges:
            - end: 172.18.10.250
              start: 172.18.10.100
            cidr: 172.18.10.0/24
            name: subnet2
            routes:
            - destination: 172.18.0.0/24
              nexthop: 172.18.10.1
            - destination: 172.18.20.0/24
              nexthop: 172.18.10.1
            vlan: 31
          - allocationRanges:
            - end: 172.18.20.250
              start: 172.18.20.100
            cidr: 172.18.20.0/24
            name: subnet3
            routes:
            - destination: 172.18.0.0/24
              nexthop: 172.18.20.1
            - destination: 172.18.10.0/24
              nexthop: 172.18.20.1
            vlan: 41
    4. Add an allocation range for the tenant network:

        - dnsDomain: tenant.example.com
          mtu: 1496
          name: tenant
          subnets:
          - allocationRanges:
            - end: 172.19.0.250
              start: 172.19.0.100
            cidr: 172.19.0.0/24
            name: subnet1
            routes:
            - destination: 172.19.10.0/24
              nexthop: 172.19.0.1
            - destination: 172.19.20.0/24
              nexthop: 172.19.0.1
            vlan: 22
          - allocationRanges:
            - end: 172.19.10.250
              start: 172.19.10.100
            cidr: 172.19.10.0/24
            name: subnet2
            routes:
            - destination: 172.19.0.0/24
              nexthop: 172.19.10.1
            - destination: 172.19.20.0/24
              nexthop: 172.19.10.1
            vlan: 32
          - allocationRanges:
            - end: 172.19.20.250
              start: 172.19.20.100
            cidr: 172.19.20.0/24
            name: subnet3
            routes:
            - destination: 172.19.0.0/24
              nexthop: 172.19.20.1
            - destination: 172.19.10.0/24
              nexthop: 172.19.20.1
            vlan: 42
    5. Add an allocation range for the storagemgmt network:

        - dnsDomain: storagemgmt.example.com
          mtu: 1500
          name: storagemgmt
          subnets:
          - allocationRanges:
            - end: 172.20.0.250
              start: 172.20.0.100
            cidr: 172.20.0.0/24
            name: subnet1
            routes:
            - destination: 172.20.10.0/24
              nexthop: 172.20.0.1
            - destination: 172.20.20.0/24
              nexthop: 172.20.0.1
            vlan: 23
          - allocationRanges:
            - end: 172.20.10.250
              start: 172.20.10.100
            cidr: 172.20.10.0/24
            name: subnet2
            routes:
            - destination: 172.20.0.0/24
              nexthop: 172.20.10.1
            - destination: 172.20.20.0/24
              nexthop: 172.20.10.1
            vlan: 33
          - allocationRanges:
            - end: 172.20.20.250
              start: 172.20.20.100
            cidr: 172.20.20.0/24
            name: subnet3
            routes:
            - destination: 172.20.0.0/24
              nexthop: 172.20.20.1
            - destination: 172.20.10.0/24
              nexthop: 172.20.20.1
            vlan: 43
  8. Create the NetConfig CR:

    $ oc create -f netconfig.yaml

4.3. Creating the DCN control plane

Create the control plane that manages your distributed cloud infrastructure. The control plane centrally orchestrates workloads across central and edge node sets.

Prerequisites

  • The OpenStack Operator (openstack-operator) is installed.
  • The RHOCP cluster is prepared for RHOSO networks.
  • The RHOCP cluster is not configured with any network policies that prevent communication between the openstack-operators namespace and the control plane namespace (default openstack). Use the following command to check the existing network policies on the cluster:

    $ oc get networkpolicy -n openstack
  • You are logged on to a workstation that has access to the RHOCP cluster, as a user with cluster-admin privileges.

Procedure

  1. Create a file on your workstation named openstack_control_plane.yaml to define the OpenStackControlPlane CR:

    apiVersion: core.openstack.org/v1beta1
    kind: OpenStackControlPlane
    metadata:
      name: openstack-control-plane
      namespace: openstack
  2. Use the spec field to specify the Secret CR you create to provide secure access to your pod, and the storageClass you create for your Red Hat OpenShift Container Platform (RHOCP) cluster storage back end:

    apiVersion: core.openstack.org/v1beta1
    kind: OpenStackControlPlane
    metadata:
      name: openstack-control-plane
      namespace: openstack
    spec:
      secret: osp-secret
      storageClass: <RHOCP_storage_class>
    • Replace <RHOCP_storage_class> with the storage class you created for your RHOCP cluster storage back end.
  3. Add service configurations for all required services:

    • Block Storage service (cinder):

        cinder:
          uniquePodNames: false
          apiOverride:
            route: {}
          template:
            customServiceConfig: |
              [DEFAULT]
              storage_availability_zone = az0
            databaseInstance: openstack
            secret: osp-secret
            cinderAPI:
              replicas: 3
              override:
                service:
                  internal:
                    metadata:
                      annotations:
                        metallb.universe.tf/address-pool: internalapi
                        metallb.universe.tf/allow-shared-ip: internalapi
                        metallb.universe.tf/loadBalancerIPs: 172.17.0.80
                    spec:
                      type: LoadBalancer
            cinderScheduler:
              replicas: 1
            cinderVolumes:
              az0:
                networkAttachments:
                - storage
                replicas: 0
      Note

      In RHOSO 18.0.3, you must set the uniquePodNames field to a value of false to allow for the propagation of Secrets. For more information, see OSPRH-11240.

      Note
      • Set the replicas field to a value of 0. The replica count is changed and additional cinderVolume services are added after storage is configured.
      • Set the storage_availability_zone field in the template section to az0. All Block Storage service (cinder) pods, such as cinderBackup and cinderVolume, inherit this value. You can override this AZ for a cinderVolume service by specifying the backend_availability_zone parameter.
    • Compute service (nova):

        nova:
          apiOverride:
            route: {}
          template:
            apiServiceTemplate:
              replicas: 3
              override:
                service:
                  internal:
                    metadata:
                      annotations:
                        metallb.universe.tf/address-pool: internalapi
                        metallb.universe.tf/allow-shared-ip: internalapi
                        metallb.universe.tf/loadBalancerIPs: 172.17.0.80
                    spec:
                      type: LoadBalancer
            metadataServiceTemplate:
              replicas: 3
              override:
                service:
                  metadata:
                    annotations:
                      metallb.universe.tf/address-pool: internalapi
                      metallb.universe.tf/allow-shared-ip: internalapi
                      metallb.universe.tf/loadBalancerIPs: 172.17.0.80
                  spec:
                    type: LoadBalancer
            schedulerServiceTemplate:
              replicas: 3
              override:
                service:
                  metadata:
                    annotations:
                      metallb.universe.tf/address-pool: internalapi
                      metallb.universe.tf/allow-shared-ip: internalapi
                      metallb.universe.tf/loadBalancerIPs: 172.17.0.80
                  spec:
                    type: LoadBalancer
            cellTemplates:
              cell0:
                cellDatabaseAccount: nova-cell0
                cellDatabaseInstance: openstack
                cellMessageBusInstance: rabbitmq
                hasAPIAccess: true
              cell1:
                cellDatabaseAccount: nova-cell1
                cellDatabaseInstance: openstack-cell1
                cellMessageBusInstance: rabbitmq-cell1
                noVNCProxyServiceTemplate:
                  enabled: true
                  networkAttachments:
                  - ctlplane
                hasAPIAccess: true
            secret: osp-secret
    • DNS service for the data plane:

        dns:
          template:
            options:
            - key: server
              values:
              - <IP address for DNS server reachable from dnsmasq pod>
            override:
              service:
                metadata:
                  annotations:
                    metallb.universe.tf/address-pool: ctlplane
                    metallb.universe.tf/allow-shared-ip: ctlplane
                    metallb.universe.tf/loadBalancerIPs: 192.168.122.80
                spec:
                  type: LoadBalancer
            replicas: 2
      • options: Defines the dnsmasq instances required for each DNS server by using key-value pairs. In this example, there is one key-value pair defined because there is only one DNS server configured to forward requests to.
      • key: Specifies the dnsmasq parameter to customize for the deployed dnsmasq instance. Set to one of the following valid values:

        • server
        • rev-server
        • srv-host
        • txt-record
        • ptr-record
        • rebind-domain-ok
        • naptr-record
        • cname
        • host-record
        • caa-record
        • dns-rr
        • auth-zone
        • synth-domain
        • no-negcache
        • local
      • values: Specifies the value for the DNS server reachable from the dnsmasq pod on the RHOCP cluster network. You can specify a generic DNS server as the value, for example, 1.1.1.1, or a DNS server for a specific domain, for example, /google.com/8.8.8.8.

        Note

        This DNS service, dnsmasq, provides DNS services for nodes on the RHOSO data plane. dnsmasq is different from the RHOSO DNS service (designate) that provides DNS as a service for cloud tenants.

    • Galera

        galera:
          templates:
            openstack:
              storageRequest: 5000M
              secret: osp-secret
              replicas: 3
            openstack-cell1:
              storageRequest: 5000M
              secret: osp-secret
              replicas: 3
    • Identity service (keystone)

        keystone:
          apiOverride:
            route: {}
          template:
            override:
              service:
                internal:
                  metadata:
                    annotations:
                      metallb.universe.tf/address-pool: internalapi
                      metallb.universe.tf/allow-shared-ip: internalapi
                      metallb.universe.tf/loadBalancerIPs: 172.17.0.80
                  spec:
                    type: LoadBalancer
            databaseInstance: openstack
            secret: osp-secret
            replicas: 3
    • Image service (glance):

        glance:
          apiOverrides:
            default:
              route: {}
          template:
            databaseInstance: openstack
            storage:
              storageRequest: 10G
            secret: osp-secret
            keystoneEndpoint: default
            glanceAPIs:
              default:
                replicas: 0
                override:
                  service:
                    internal:
                      metadata:
                        annotations:
                          metallb.universe.tf/address-pool: internalapi
                          metallb.universe.tf/allow-shared-ip: internalapi
                          metallb.universe.tf/loadBalancerIPs: 172.17.0.80
                      spec:
                        type: LoadBalancer
                networkAttachments:
                - storage
      Note

      You must initially set the replicas field to a value of 0. The replica count is changed and additional glanceAPI services are added after storage is configured.

    • Key Management service (barbican):

        barbican:
          apiOverride:
            route: {}
          template:
            databaseInstance: openstack
            secret: osp-secret
            barbicanAPI:
              replicas: 3
              override:
                service:
                  internal:
                    metadata:
                      annotations:
                        metallb.universe.tf/address-pool: internalapi
                        metallb.universe.tf/allow-shared-ip: internalapi
                        metallb.universe.tf/loadBalancerIPs: 172.17.0.80
                    spec:
                      type: LoadBalancer
            barbicanWorker:
              replicas: 3
            barbicanKeystoneListener:
              replicas: 1
    • Memcached

        memcached:
          templates:
            memcached:
               replicas: 3
    • Networking service (neutron):

        neutron:
          apiOverride:
            route: {}
          template:
            customServiceConfig: |
              [DEFAULT]
              network_scheduler_driver = neutron.scheduler.dhcp_agent_scheduler.AZAwareWeightScheduler
              default_availability_zones = az0
              [ml2_type_vlan]
              network_vlan_ranges = datacentre:1:1000
              [neutron]
              physnets = datacentre
            replicas: 3
            override:
              service:
                internal:
                  metadata:
                    annotations:
                      metallb.universe.tf/address-pool: internalapi
                      metallb.universe.tf/allow-shared-ip: internalapi
                      metallb.universe.tf/loadBalancerIPs: 172.17.0.80
                  spec:
                    type: LoadBalancer
            databaseInstance: openstack
            secret: osp-secret
            networkAttachments:
            - internalapi
    • Set the network_scheduler_driver to a value of neutron.scheduler.dhcp_agent_scheduler.AZAwareWeightScheduler if a DHCP agent is deployed.
    • OVN

        ovn:
          template:
            ovnController:
              external-ids:
                availability-zones:
                - az0
                enable-chassis-as-gateway: true
                ovn-bridge: br-int
                ovn-encap-type: geneve
                system-id: random
              networkAttachment: tenant
              nicMappings:
                datacentre: ospbr
            ovnDBCluster:
              ovndbcluster-nb:
                replicas: 3
                dbType: NB
                storageRequest: 10G
                networkAttachment: internalapi
              ovndbcluster-sb:
                replicas: 3
                dbType: SB
                storageRequest: 10G
                networkAttachment: internalapi
            ovnNorthd:
              networkAttachment: internalapi
    • Placement service (placement)

        placement:
          apiOverride:
            route: {}
          template:
            override:
              service:
                internal:
                  metadata:
                    annotations:
                      metallb.universe.tf/address-pool: internalapi
                      metallb.universe.tf/allow-shared-ip: internalapi
                      metallb.universe.tf/loadBalancerIPs: 172.17.0.80
                  spec:
                    type: LoadBalancer
            databaseInstance: openstack
            replicas: 3
            secret: osp-secret
    • RabbitMQ

        rabbitmq:
          templates:
            rabbitmq:
              replicas: 3
              override:
                service:
                  metadata:
                    annotations:
                      metallb.universe.tf/address-pool: internalapi
                      metallb.universe.tf/loadBalancerIPs: 172.17.0.85
                  spec:
                    type: LoadBalancer
            rabbitmq-cell1:
              replicas: 3
              override:
                service:
                  metadata:
                    annotations:
                      metallb.universe.tf/address-pool: internalapi
                      metallb.universe.tf/loadBalancerIPs: 172.17.0.86
                  spec:
                    type: LoadBalancer
  4. Create the control plane:

    $ oc create -f openstack_control_plane.yaml -n openstack

4.4. Distributing Ceph secret keys across sites

When you deploy a distributed compute node (DCN) environment with multiple Red Hat Ceph Storage backends, you create a Ceph authentication key for each backend. Limiting key distribution to only necessary credentials minimizes security exposure at each location.

  • Add the key for each Ceph back end to the secret for the default location.
  • Add the key for the default Ceph back end, as well as the key for the local Ceph back end, to the secret for each additional location.

For three locations, az0, az1, and az2, you must have three secrets. Locations az1 and az2 each have keys for the local backend as well as the keys for az0. Location az0 contains all Ceph back end keys.

You create the required secrets after Ceph has been deployed at each edge location, and the keyring and configuration file for each has been collected. Alternatively, you can deploy each Ceph backend as needed, and update secrets with each edge deployment.
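For example, you might collect and rename the configuration file and keyring from each Ceph cluster before creating the secrets. The following is an illustrative sketch only; the host name, user, and file paths are assumptions that depend on how each Ceph cluster was deployed and where the client.openstack keyring was exported:

$ scp ceph-admin@az1-ceph-node:/etc/ceph/ceph.conf az1.conf
$ scp ceph-admin@az1-ceph-node:/etc/ceph/ceph.client.openstack.keyring az1.client.openstack.keyring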

Procedure

  1. Create a secret for location az0.

    1. If you have already deployed Red Hat Ceph Storage (RHCS) at all edge sites which require storage, create a secret for az0 which contains all keyrings and conf files:

      oc create secret generic ceph-conf-az-0 \
      --from-file=az0.client.openstack.keyring \
      --from-file=az0.conf \
      --from-file=az1.client.openstack.keyring \
      --from-file=az1.conf \
      --from-file=az2.client.openstack.keyring \
      --from-file=az2.conf -n openstack
    2. If you have not deployed RHCS at all edge sites, create a secret for az0 which contains the keyring and conf file for az0:

      oc create secret generic ceph-conf-az-0 \
      --from-file=az0.client.openstack.keyring \
      --from-file=az0.conf -n openstack
  2. When you deploy RHCS at the edge location at availability zone 1 (az1), create a secret for location az1 which contains keyrings and conf files for the local backend, and the default backend:

    oc create secret generic ceph-conf-az-1 \
    --from-file=az0.client.openstack.keyring \
    --from-file=az0.conf \
    --from-file=az1.client.openstack.keyring \
    --from-file=az1.conf -n openstack
  3. If needed, update the secret for the central location:

    oc delete secret ceph-conf-az-0 -n openstack
    
    oc create secret generic ceph-conf-az-0 \
    --from-file=az0.client.openstack.keyring \
    --from-file=az0.conf \
    --from-file=az1.client.openstack.keyring \
    --from-file=az1.conf -n openstack
  4. When you deploy RHCS at the edge location at availability zone 2 (az2), create a secret for location az2 which contains keyrings and conf files for the local backend, and the default backend:

    oc create secret generic ceph-conf-az-2 \
    --from-file=az0.client.openstack.keyring \
    --from-file=az0.conf \
    --from-file=az2.client.openstack.keyring \
    --from-file=az2.conf -n openstack
  5. If needed, update the secret for the central location:

    oc delete secret ceph-conf-az-0 -n openstack
    
    oc create secret generic ceph-conf-az-0 \
    --from-file=az0.client.openstack.keyring \
    --from-file=az0.conf \
    --from-file=az1.client.openstack.keyring \
    --from-file=az1.conf \
    --from-file=az2.client.openstack.keyring \
    --from-file=az2.conf -n openstack
  6. [Optional] When you have finished creating the necessary keys, you can verify that they show up in the openstack namespace:

    oc get secret -n openstack -o name | grep ceph-conf

    Example output:

    secret/ceph-conf-az-0
    secret/ceph-conf-az-1
    secret/ceph-conf-az-2
  7. When you create an OpenStackDataPlaneNodeSet, use the appropriate key under the extraMounts field:

    apiVersion: dataplane.openstack.org/v1beta1
    kind: OpenStackDataPlaneNodeSet
    metadata:
      name: openstack-edpm-dcn-0
      namespace: openstack
    spec:
      ...
      nodeTemplate:
        extraMounts:
        - extraVolType: Ceph
          volumes:
          - name: ceph
            secret:
              secretName: ceph-conf-az-0
          mounts:
          - name: ceph
            mountPath: "/etc/ceph"
            readOnly: true
  8. When you create a data plane NodeSet, you must also update the OpenStackControlPlane custom resource (CR) with the secret name:

    apiVersion: core.openstack.org/v1beta1
    kind: OpenStackControlPlane
    spec:
      extraMounts:
        - name: v1
          region: r1
          extraVol:
            - propagation:
              - az0
              - CinderBackup
              extraVolType: Ceph
              volumes:
              - name: ceph
                secret:
                  name: ceph-conf-az-0
              mounts:
              - name: ceph
                mountPath: "/etc/ceph"
                readOnly: true
            - propagation:
              - az1
              extraVolType: Ceph
              volumes:
              - name: ceph
                secret:
                  name: ceph-conf-az-1
              mounts:
              - name: ceph
                mountPath: "/etc/ceph"
                readOnly: true
            ...
    Note

    If the CinderBackup service is a part of the deployment, then you must include it in the propagation list because it does not have the availability zone in its pod name.

  9. When you update the glanceAPIs field in the OpenStackControlPlane CR, ensure that the Image service (glance) pod names match the extraMounts propagation instances:

         glanceAPIs:
           az0:
             customServiceConfig: |
             ...
           az1:
             customServiceConfig: |
             ...
  10. When you update the cinderVolumes field in the OpenStackControlPlane CR, the Block Storage service (cinder) pod names must also match the extraMounts propagation instances:

    kind: OpenStackControlPlane
    spec:
      <...>
      cinder:
        <...>
        cinderVolumes:
          az0:
            <...>
          az1:
            <...>

Chapter 5. Deploying a DCN node set

Deploy node sets at central and remote edge locations using the same procedures, and use a single control plane to manage your geographically distributed workloads.

Each edge location requires a separate availability zone to ensure proper isolation and resource scheduling. For example, deploy the central location node set at az0, deploy the first edge site at az1, and so on.

5.1. Configuring the data plane node networks

Configure data plane node networks to meet Red Hat Ceph Storage networking requirements. Proper network configuration ensures optimal storage performance and reliable communication between compute and storage services.

Prerequisites

  • Control plane deployment is complete but has not yet been modified to use Ceph Storage.
  • The data plane nodes have been pre-provisioned with an operating system.
  • The data plane nodes are accessible through an SSH key that Ansible can use.
  • If you are using HCI, then the data plane nodes have disks available to be used as Ceph OSDs.
  • There are a minimum of three available data plane nodes. Ceph Storage clusters must have a minimum of three nodes to ensure redundancy.

Procedure

  1. Create a file on your workstation named dcn-data-plane-networks.yaml to define the OpenStackDataPlaneNodeSet CR that configures the data plane node networks:

    apiVersion: dataplane.openstack.org/v1beta1
    kind: OpenStackDataPlaneNodeSet
    metadata:
      name: dcn-data-plane-networks
      namespace: openstack
    spec:
      env:
        - name: ANSIBLE_FORCE_COLOR
          value: "True"
  2. Specify the services to apply to the nodes:

    spec:
      ...
      services:
        - bootstrap
        - configure-network
        - validate-network
        - install-os
        - ceph-hci-pre
        - configure-os
        - ssh-known-hosts
        - run-os
        - reboot-os
  3. Set the edpm_enable_chassis_gw and edpm_ovn_availability_zones fields on the data plane:

    spec:
      env:
      - name: ANSIBLE_FORCE_COLOR
        value: "True"
      networkAttachments:
      - ctlplane
      nodeTemplate:
        ansible:
          ansiblePort: 22
          ansibleUser: cloud-admin
          ansibleVars:
            edpm_enable_chassis_gw: true
            edpm_ovn_availability_zones:
              - az0
  4. Optional: The ceph-hci-pre service prepares data plane nodes to host Red Hat Ceph Storage services after network configuration by using the edpm_ceph_hci_pre edpm-ansible role. By default, the edpm_ceph_hci_pre_enabled_services parameter of this role contains only RBD, RGW, and NFS services. DCN supports only RBD services at DCN sites. If you are deploying HCI, disable the RGW and NFS services by adding the edpm_ceph_hci_pre_enabled_services parameter and listing only the Ceph services that RBD requires:

    apiVersion: dataplane.openstack.org/v1beta1
    kind: OpenStackDataPlaneNodeSet
    metadata:
      name: openstack-edpm
      namespace: openstack
    spec:
      env:
      - name: ANSIBLE_FORCE_COLOR
        value: "True"
      networkAttachments:
      - ctlplane
      nodeTemplate:
        ansible:
          ansiblePort: 22
          ansibleUser: cloud-admin
          ansibleVars:
            edpm_ceph_hci_pre_enabled_services:
            - ceph_mon
            - ceph_mgr
            - ceph_osd
    ...
    Note

    If other services, such as the Dashboard, are deployed with HCI nodes, they must be added to the edpm_ceph_hci_pre_enabled_services parameter list. For more information about this role, see edpm_ceph_hci_pre role.

  5. Configure the Red Hat Ceph Storage cluster network for storage management.

    The following example has 3 nodes. It assumes that the storage management network is on VLAN 23:

    apiVersion: dataplane.openstack.org/v1beta1
    kind: OpenStackDataPlaneNodeSet
    metadata:
      name: openstack-edpm
      namespace: openstack
    spec:
      env:
      - name: ANSIBLE_FORCE_COLOR
        value: "True"
      networkAttachments:
      - ctlplane
      nodeTemplate:
        ansible:
          ansiblePort: 22
          ansibleUser: cloud-admin
          ansibleVars:
            edpm_ceph_hci_pre_enabled_services:
            - ceph_mon
            - ceph_mgr
            - ceph_osd
            edpm_fips_mode: check
            edpm_iscsid_image: {{ registry_url }}/openstack-iscsid:{{ image_tag }}
            edpm_logrotate_crond_image: {{ registry_url }}/openstack-cron:{{ image_tag }}
            edpm_network_config_hide_sensitive_logs: false
            edpm_network_config_os_net_config_mappings:
              edpm-compute-0:
                nic1: 52:54:00:1e:af:6b
                nic2: 52:54:00:d9:cb:f4
              edpm-compute-1:
                nic1: 52:54:00:f2:bc:af
                nic2: 52:54:00:f1:c7:dd
              edpm-compute-2:
                nic1: 52:54:00:dd:33:14
                nic2: 52:54:00:50:fb:c3
            edpm_network_config_template: |
              ---
              {% set mtu_list = [ctlplane_mtu] %}
              {% for network in nodeset_networks %}
              {{ mtu_list.append(lookup('vars', networks_lower[network] ~ '_mtu')) }}
              {%- endfor %}
              {% set min_viable_mtu = mtu_list | max %}
              network_config:
              - type: ovs_bridge
                name: {{ neutron_physical_bridge_name }}
                mtu: {{ min_viable_mtu }}
                use_dhcp: false
                dns_servers: {{ ctlplane_dns_nameservers }}
                domain: {{ dns_search_domains }}
                addresses:
                - ip_netmask: {{ ctlplane_ip }}/{{ ctlplane_cidr }}
                routes: {{ ctlplane_host_routes }}
                members:
                - type: interface
                  name: nic2
                  mtu: {{ min_viable_mtu }}
                  # force the MAC address of the bridge to this interface
                  primary: true
              {% for network in nodeset_networks %}
                - type: vlan
                  mtu: {{ lookup('vars', networks_lower[network] ~ '_mtu') }}
                  vlan_id: {{ lookup('vars', networks_lower[network] ~ '_vlan_id') }}
                  addresses:
                  - ip_netmask:
                      {{ lookup('vars', networks_lower[network] ~ '_ip') }}/{{ lookup('vars', networks_lower[network] ~ '_cidr') }}
                  routes: {{ lookup('vars', networks_lower[network] ~ '_host_routes') }}
              {% endfor %}
            edpm_neutron_metadata_agent_image: {{ registry_url }}/openstack-neutron-metadata-agent-ovn:{{ image_tag }}
            edpm_nodes_validation_validate_controllers_icmp: false
            edpm_nodes_validation_validate_gateway_icmp: false
            edpm_selinux_mode: enforcing
            edpm_sshd_allowed_ranges:
            - 192.168.111.0/24
            - 192.168.122.0/24
            - 192.168.133.0/24
            - 192.168.144.0/24
            edpm_sshd_configure_firewall: true
            enable_debug: false
            gather_facts: false
            image_tag: current-podified
            neutron_physical_bridge_name: br-ex
            neutron_public_interface_name: eth0
            service_net_map:
              nova_api_network: internalapi
              nova_libvirt_network: internalapi
            storage_mgmt_cidr: "24"
            storage_mgmt_host_routes: []
            storage_mgmt_mtu: 9000
            storage_mgmt_vlan_id: 23
            storage_mtu: 9000
            timesync_ntp_servers:
            - hostname: pool.ntp.org
        ansibleSSHPrivateKeySecret: dataplane-ansible-ssh-private-key-secret
        managementNetwork: ctlplane
        networks:
        - defaultRoute: true
          name: ctlplane
          subnetName: subnet1
        - name: internalapi
          subnetName: subnet1
        - name: storage
          subnetName: subnet1
        - name: tenant
          subnetName: subnet1
      nodes:
        edpm-compute-0:
          ansible:
            host: 192.168.122.100
          hostName: compute-0
          networks:
          - defaultRoute: true
            fixedIP: 192.168.122.100
            name: ctlplane
            subnetName: subnet1
          - name: internalapi
            subnetName: subnet1
          - name: storage
            subnetName: subnet1
          - name: storagemgmt
            subnetName: subnet1
          - name: tenant
            subnetName: subnet1
        edpm-compute-1:
          ansible:
            host: 192.168.122.101
          hostName: compute-1
          networks:
          - defaultRoute: true
            fixedIP: 192.168.122.101
            name: ctlplane
            subnetName: subnet1
          - name: internalapi
            subnetName: subnet1
          - name: storage
            subnetName: subnet1
          - name: storagemgmt
            subnetName: subnet1
          - name: tenant
            subnetName: subnet1
        edpm-compute-2:
          ansible:
            host: 192.168.122.102
          hostName: compute-2
          networks:
          - defaultRoute: true
            fixedIP: 192.168.122.102
            name: ctlplane
            subnetName: subnet1
          - name: internalapi
            subnetName: subnet1
          - name: storage
            subnetName: subnet1
          - name: storagemgmt
            subnetName: subnet1
          - name: tenant
            subnetName: subnet1
      preProvisioned: true
      services:
      - bootstrap
      - configure-network
      - validate-network
      - install-os
      - ceph-hci-pre
      - configure-os
      - ssh-known-hosts
      - run-os
      - reboot-os
  6. Apply the CR:

    $ oc apply -f <dataplane_cr_file>
    • Replace <dataplane_cr_file> with the name of your file.

      Note

      Ansible does not configure or validate the networks until the OpenStackDataPlaneDeployment CRD is created.

  7. Create an OpenStackDataPlaneDeployment CR that references the OpenStackDataPlaneNodeSet CR defined above so that Ansible configures the services on the data plane nodes. For more information, see Creating the data plane in the Deploying Red Hat OpenStack Services on OpenShift guide.
  8. To confirm the network is configured, complete the following steps:

    1. SSH into a data plane node.
    2. Use the ip a command to display configured networks.
    3. Confirm the storage networks are in the list of configured networks.
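    A minimal sketch of this check, assuming the example ctlplane address 192.168.122.100 and the cloud-admin user from this guide, and that os-net-config names the VLAN interfaces vlan<ID>:

      $ ssh cloud-admin@192.168.122.100
      $ ip a | grep vlan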

5.2. Configuring hyperconverged Ceph Storage

Configure and deploy hyperconverged Red Hat Ceph Storage on your data plane nodes. This configuration enables compute and storage services to run on the same hardware for optimal resource utilization at edge sites.

Note

The following steps are specifically for a hyperconverged configuration of Red Hat Ceph Storage (RHCS), and are not required if you have deployed an external RHCS cluster.

Procedure

  1. Edit the Red Hat Ceph Storage configuration file.
  2. Add the Storage and Storage Management network ranges. Red Hat Ceph Storage uses the Storage network as the Red Hat Ceph Storage public_network and the Storage Management network as the cluster_network.

    The following example is for a configuration file entry where the Storage network range is 172.18.0.0/24 and the Storage Management network range is 172.20.0.0/24:

    [global]
    public_network = 172.18.0.0/24
    cluster_network = 172.20.0.0/24
  3. Add collocation boundaries between the Compute service and Ceph OSD services. Boundaries should be set between collocated Compute service and Ceph OSD services to reduce CPU and memory contention.

    The following is an example for a Ceph configuration file entry with these boundaries set:

    [osd]
    osd_memory_target_autotune = true
    osd_numa_auto_affinity = true
    [mgr]
    mgr/cephadm/autotune_memory_target_ratio = 0.2

    In this example, the osd_memory_target_autotune parameter is set to true so that the OSD daemons adjust memory consumption based on the osd_memory_target option. The autotune_memory_target_ratio defaults to 0.7. This means 70 percent of the total RAM in the system is the starting point from which any memory consumed by non-autotuned Ceph daemons is subtracted. The remaining memory is divided between the OSDs, assuming that all OSDs have osd_memory_target_autotune set to true. For HCI deployments, you can set mgr/cephadm/autotune_memory_target_ratio to 0.2 so that more memory is available for the Compute service.

    For additional information about service collocation, see Collocating services in a HCI environment for NUMA nodes.

    Note

    If these values need to be adjusted after the deployment, use the ceph config set osd <key> <value> command.
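    For example, to apply the example values above on a running cluster (note that the autotune ratio is set on the mgr rather than on an individual OSD):

      $ ceph config set osd osd_memory_target_autotune true
      $ ceph config set osd osd_numa_auto_affinity true
      $ ceph config set mgr mgr/cephadm/autotune_memory_target_ratio 0.2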

  4. Deploy Ceph Storage with the edited configuration file on a data plane node:

    $ cephadm bootstrap --config <config_file> --mon-ip <data_plane_node_ip> --skip-monitoring-stack

    • Replace <config_file> with the name of your Ceph configuration file.
    • Replace <data_plane_node_ip> with the Storage network IP address of the data plane node on which Red Hat Ceph Storage will be installed.

      Note

      The --skip-monitoring-stack option is used in the cephadm bootstrap command to skip the deployment of monitoring services. This ensures the Red Hat Ceph Storage deployment completes successfully if monitoring services have been previously deployed as part of any other preceding process.

      If monitoring services have not been deployed, see the Red Hat Ceph Storage documentation for information and procedures on enabling monitoring services.

  5. After the Red Hat Ceph Storage cluster is bootstrapped on the first EDPM node, see "Red Hat Ceph Storage installation" in the Red Hat Ceph Storage Installation Guide to add the other EDPM nodes to the Ceph cluster.
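    The exact commands are in the Red Hat Ceph Storage documentation; as a sketch, adding a node typically involves distributing the cluster public key and then adding the host by its Storage network IP address (the host name and address below are placeholders):

      # ceph cephadm get-pub-key > ~/ceph.pub
      # ssh-copy-id -f -i ~/ceph.pub root@<edpm_node>
      # ceph orch host add <edpm_node_hostname> <edpm_node_storage_ip>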

5.3. Configuring the DCN data plane

Configure the data plane to integrate with your Red Hat Ceph Storage back end. This configuration enables data plane nodes to access Ceph for persistent storage operations.

Prerequisites

Procedure

  1. Edit the OpenStackDataPlaneNodeSet CR.
  2. To make the cephx key and configuration file available for the Compute service (nova), use the extraMounts parameter.

    The following is an example of using the extraMounts parameter for this purpose:

    apiVersion: dataplane.openstack.org/v1beta1
    kind: OpenStackDataPlaneNodeSet
    spec:
      ...
      nodeTemplate:
        extraMounts:
        - extraVolType: Ceph
          volumes:
          - name: ceph
            secret:
              secretName: ceph-conf-files
          mounts:
          - name: ceph
            mountPath: "/etc/ceph"
            readOnly: true
  3. Create a ConfigMap to add required configuration details to the Compute service (nova). Create a file called ceph-nova-az0.yaml and add contents similar to the following. You must add the Image service (glance) endpoint for the local availability zone, as well as set the cross_az_attach parameter to false:

    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: ceph-nova-az0
      namespace: openstack
    data:
      03-ceph-nova.conf: |
        [libvirt]
        images_type = rbd
        images_rbd_pool = vms
        images_rbd_ceph_conf = /etc/ceph/az0.conf
        images_rbd_glance_store_name = az0
        images_rbd_glance_copy_poll_interval = 15
        images_rbd_glance_copy_timeout = 600
        rbd_user = openstack
        rbd_secret_uuid = 9cfb3a03-3f91-516a-881e-a675f67c30ea
        hw_disk_discard = unmap
        volume_use_multipath = False
        [glance]
        endpoint_override = http://glance-az0-internal.openstack.svc:9292
        valid_interfaces = internal
        [cinder]
        cross_az_attach = False
        catalog_info = volumev3:cinderv3:internalURL
  4. Create the ConfigMap:

    oc create -f ceph-nova-az0.yaml
  5. Create a custom Compute (nova) service to use the ConfigMap. Create a file called nova-custom-ceph-az0.yaml and add contents similar to the following. You must add the name of the ConfigMap that you just created under the dataSources field:

    apiVersion: dataplane.openstack.org/v1beta1
    kind: OpenStackDataPlaneService
    metadata:
      name: nova-custom-ceph-az0
    spec:
      addCertMounts: false
      caCerts: combined-ca-bundle
      dataSources:
      - configMapRef:
          name: ceph-nova-az0
      - secretRef:
          name: nova-cell1-compute-config
      - secretRef:
          name: nova-migration-ssh-key
      edpmServiceType: nova
      playbook: osp.edpm.nova
      tlsCerts:
        default:
          contents:
          - dnsnames
          - ips
          edpmRoleServiceName: nova
          issuer: osp-rootca-issuer-internal
          networks:
          - ctlplane
  6. Create the custom service:

    oc create -f nova-custom-ceph-az0.yaml
    Note

    You must create a unique ConfigMap and custom Compute service for each availability zone. Append the availability zone to the end of these file names as shown in the previous steps.
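    For example, for the next availability zone you might create az1 equivalents of the two files above and apply them in the same way (the file names are illustrative):

      oc create -f ceph-nova-az1.yaml
      oc create -f nova-custom-ceph-az1.yaml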

  7. Locate the services list in the CR.
  8. Edit the services list to restore all of the services removed in Configuring the data plane node networks. Restoring the full services list allows the remaining jobs to be run that complete the configuration of the HCI environment.

    The following is an example of a full services list that includes the additional services:

    apiVersion: dataplane.openstack.org/v1beta1
    kind: OpenStackDataPlaneNodeSet
    spec:
      ...
      services:
        - bootstrap
        - configure-network
        - validate-network
        - install-os
        - ceph-hci-pre
        - configure-os
        - ssh-known-hosts
        - run-os
        - reboot-os
        - install-certs
        - ceph-client
        - ovn
        - neutron-metadata
        - libvirt
        - nova-custom-ceph-az0
    Note

    In addition to restoring the default service list, the ceph-client service is added after the run-os service. The ceph-client service configures EDPM nodes as clients of a Red Hat Ceph Storage server. This service distributes the files necessary for the clients to connect to the Red Hat Ceph Storage server. The ceph-hci-pre service is only needed when you deploy HCI.

  9. Optional: You can assign Compute nodes to Compute service (nova) cells in the same way as in any other environment. Replace the nova service in your OpenStackDataPlaneNodeSet CR with your custom nova service:

    apiVersion: dataplane.openstack.org/v1beta1
    kind: OpenStackDataPlaneNodeSet
    metadata:
      name: openstack-cell2
    spec:
      services:
        - download-cache
        - bootstrap
        - configure-network
        - validate-network
        - install-os
        - configure-os
        - ssh-known-hosts
        - run-os
        - ovn
        - libvirt
        - nova-cell-custom

    For more information, see Connecting an OpenStackDataPlaneNodeSet CR to a Compute cell.

    Note

    If you are using cells, then the neutron-metadata service is unique per cell and defined separately. For example, neutron-metadata-cell1:

    apiVersion: dataplane.openstack.org/v1beta1
    kind: OpenStackDataPlaneService
    metadata:
      labels:
        app.kubernetes.io/instance: neutron-metadata-cell1
        app.kubernetes.io/name: openstackdataplaneservice
        app.kubernetes.io/part-of: openstack-operator
      name: neutron-metadata-cell1
      ...

    The nova-custom-ceph service is unique for each availability zone and defined separately. For example, nova-custom-ceph-az0:

    apiVersion: dataplane.openstack.org/v1beta1
    kind: OpenStackDataPlaneService
    metadata:
      labels:
        app.kubernetes.io/instance: nova-custom-ceph-az0
        app.kubernetes.io/name: openstackdataplaneservice
        app.kubernetes.io/part-of: openstack-operator
      name: nova-custom-ceph-az0
      namespace: openstack
  10. Optional: If you are deploying Red Hat Ceph Storage (RHCS) as a hyperconverged solution, complete the following steps:

    1. Create a ConfigMap to set the reserved_host_memory_mb parameter to a value appropriate for your configuration:

      The following is an example of a ConfigMap used for this purpose:

      apiVersion: v1
      kind: ConfigMap
      metadata:
        name: reserved-memory-nova
      data:
        04-reserved-memory-nova.conf: |
          [DEFAULT]
          reserved_host_memory_mb=75000
      Note

      Set the value of the reserved_host_memory_mb parameter so that the Compute service scheduler does not allocate memory to a virtual machine that a Ceph OSD on the same server needs. The example reserves 5 GB per OSD for 10 OSDs per host, in addition to the default reserved memory for the hypervisor. In an IOPS-optimized cluster, you can improve performance by reserving more memory for each OSD. The 5 GB value is a starting point that you can tune further if necessary.

    2. Edit the OpenStackDataPlaneService CR named nova-custom-ceph-az0 that you created earlier. Add a configMapRef for reserved-memory-nova to the dataSources list:

      kind: OpenStackDataPlaneService
      <...>
      spec:
        dataSources:
        - configMapRef:
            name: ceph-nova-az0
        - configMapRef:
            name: reserved-memory-nova
        - secretRef:
            name: nova-cell1-compute-config
        - secretRef:
            name: nova-migration-ssh-key
  11. Apply the CR changes.

    $ oc apply -f <dataplane_cr_file>
    • Replace <dataplane_cr_file> with the name of your file.

      Note

      Ansible does not configure or validate the networks until the OpenStackDataPlaneDeployment CRD is created.

  12. Create an OpenStackDataPlaneDeployment CR that references the OpenStackDataPlaneNodeSet CR defined above so that Ansible configures the services on the data plane nodes. For more information, see Creating the data plane in the Deploying Red Hat OpenStack Services on OpenShift guide.

5.4. Example node set resource

Review a configuration example for a node set resource to understand the structure and required fields. This example demonstrates a three-node deployment with storage management networking configured.

apiVersion: dataplane.openstack.org/v1beta1
kind: OpenStackDataPlaneNodeSet
metadata:
  name: openstack-edpm
  namespace: openstack
spec:
  env:
  - name: ANSIBLE_FORCE_COLOR
    value: "True"
  networkAttachments:
  - ctlplane
  nodeTemplate:
    ansible:
      ansiblePort: 22
      ansibleUser: cloud-admin
      ansibleVars:
        edpm_ceph_hci_pre_enabled_services:
        - ceph_mon
        - ceph_mgr
        - ceph_osd
        edpm_fips_mode: check
        edpm_iscsid_image: {{ registry_url }}/openstack-iscsid:{{ image_tag }}
        edpm_logrotate_crond_image: {{ registry_url }}/openstack-cron:{{ image_tag }}
        edpm_network_config_hide_sensitive_logs: false
        edpm_network_config_os_net_config_mappings:
          edpm-compute-0:
            nic1: 52:54:00:1e:af:6b
            nic2: 52:54:00:d9:cb:f4
          edpm-compute-1:
            nic1: 52:54:00:f2:bc:af
            nic2: 52:54:00:f1:c7:dd
          edpm-compute-2:
            nic1: 52:54:00:dd:33:14
            nic2: 52:54:00:50:fb:c3
        edpm_network_config_template: |
          ---
          {% set mtu_list = [ctlplane_mtu] %}
          {% for network in nodeset_networks %}
          {{ mtu_list.append(lookup('vars', networks_lower[network] ~ '_mtu')) }}
          {%- endfor %}
          {% set min_viable_mtu = mtu_list | max %}
          network_config:
          - type: ovs_bridge
            name: {{ neutron_physical_bridge_name }}
            mtu: {{ min_viable_mtu }}
            use_dhcp: false
            dns_servers: {{ ctlplane_dns_nameservers }}
            domain: {{ dns_search_domains }}
            addresses:
            - ip_netmask: {{ ctlplane_ip }}/{{ ctlplane_cidr }}
            routes: {{ ctlplane_host_routes }}
            members:
            - type: interface
              name: nic2
              mtu: {{ min_viable_mtu }}
              # force the MAC address of the bridge to this interface
              primary: true
          {% for network in nodeset_networks %}
            - type: vlan
              mtu: {{ lookup('vars', networks_lower[network] ~ '_mtu') }}
              vlan_id: {{ lookup('vars', networks_lower[network] ~ '_vlan_id') }}
              addresses:
              - ip_netmask:
                  {{ lookup('vars', networks_lower[network] ~ '_ip') }}/{{ lookup('vars', networks_lower[network] ~ '_cidr') }}
              routes: {{ lookup('vars', networks_lower[network] ~ '_host_routes') }}
          {% endfor %}
        edpm_neutron_metadata_agent_image: {{ registry_url }}/openstack-neutron-metadata-agent-ovn:{{ image_tag }}
        edpm_nodes_validation_validate_controllers_icmp: false
        edpm_nodes_validation_validate_gateway_icmp: false
        edpm_selinux_mode: enforcing
        edpm_sshd_allowed_ranges:
        - 192.168.111.0/24
        - 192.168.122.0/24
        - 192.168.133.0/24
        - 192.168.144.0/24
        edpm_sshd_configure_firewall: true
        enable_debug: false
        gather_facts: false
        image_tag: current-podified
        neutron_physical_bridge_name: br-ex
        neutron_public_interface_name: eth0
        service_net_map:
          nova_api_network: internalapi
          nova_libvirt_network: internalapi
        storage_mgmt_cidr: "24"
        storage_mgmt_host_routes: []
        storage_mgmt_mtu: 9000
        storage_mgmt_vlan_id: 23
        storage_mtu: 9000
        timesync_ntp_servers:
        - hostname: pool.ntp.org
    ansibleSSHPrivateKeySecret: dataplane-ansible-ssh-private-key-secret
    managementNetwork: ctlplane
    networks:
    - defaultRoute: true
      name: ctlplane
      subnetName: subnet1
    - name: internalapi
      subnetName: subnet1
    - name: storage
      subnetName: subnet1
    - name: tenant
      subnetName: subnet1
    - name: external
      subnetName: external1
  nodes:
    edpm-compute-0:
      ansible:
        host: 192.168.122.100
      hostName: compute-0
      networks:
      - defaultRoute: true
        fixedIP: 192.168.122.100
        name: ctlplane
        subnetName: subnet1
      - name: internalapi
        subnetName: subnet1
      - name: storage
        subnetName: subnet1
      - name: storagemgmt
        subnetName: subnet1
      - name: tenant
        subnetName: subnet1
      - name: external
        subnetName: external1
    edpm-compute-1:
      ansible:
        host: 192.168.122.101
      hostName: compute-1
      networks:
      - defaultRoute: true
        fixedIP: 192.168.122.101
        name: ctlplane
        subnetName: subnet1
      - name: internalapi
        subnetName: subnet1
      - name: storage
        subnetName: subnet1
      - name: storagemgmt
        subnetName: subnet1
      - name: tenant
        subnetName: subnet1
      - name: external
        subnetName: external1
    edpm-compute-2:
      ansible:
        host: 192.168.122.102
      hostName: compute-2
      networks:
      - defaultRoute: true
        fixedIP: 192.168.122.102
        name: ctlplane
        subnetName: subnet1
      - name: internalapi
        subnetName: subnet1
      - name: storage
        subnetName: subnet1
      - name: storagemgmt
        subnetName: subnet1
      - name: tenant
        subnetName: subnet1
      - name: external
        subnetName: external1
  preProvisioned: true
  services:
  - bootstrap
  - configure-network
  - validate-network
  - install-os
  - ceph-hci-pre
  - configure-os
  - ssh-known-hosts
  - run-os
  - reboot-os
Note

It is not necessary to add the storage management network to the networkAttachments key.

5.5. Updating the control plane

After you have deployed the data plane at the central location, you must update the control plane to integrate the newly deployed data plane.

Prerequisites

  • You have deployed a node set at the central location by using Red Hat OpenStack Services on OpenShift.
  • You have deployed Red Hat Ceph Storage (RHCS).

Procedure

  1. Optional: Configure the Block Storage backup service in your openstack_control_plane.yaml file:

          cinderBackup:
            customServiceConfig: |
              [DEFAULT]
              backup_driver = cinder.backup.drivers.ceph.CephBackupDriver
              backup_ceph_pool = backups
              backup_ceph_user = openstack

    For more information about configuring the Block Storage backup service, see Configuring the Block Storage backup service.

  2. Update the Block Storage cinder volume service in your openstack_control_plane.yaml file:

          cinderVolumes:
            az0:
              customServiceConfig: |
                [DEFAULT]
                enabled_backends = ceph
                glance_api_servers = https://glance-az0-internal.openstack.svc:9292
                [ceph]
                volume_backend_name = ceph
                volume_driver = cinder.volume.drivers.rbd.RBDDriver
                rbd_ceph_conf = /etc/ceph/az0.conf
                rbd_user = openstack
                rbd_pool = volumes
                rbd_flatten_volume_from_snapshot = False
                rbd_secret_uuid = 795dcbca-e715-5ac3-9b7e-a3f5c64eb89f
                rbd_cluster_name = az0
                backend_availability_zone = az0

    For more information about configuring the Block Storage volume service, see Configuring the Block Storage volume service component.

  3. Add the extraMounts field to your openstack_control_plane.yaml file to define the services that require access to the Red Hat Ceph Storage secret:

      extraMounts:
      - extraVol:
        - extraVolType: Ceph
          mounts:
          - mountPath: /etc/ceph
            name: ceph
            readOnly: true
          propagation:
          - az0
          - CinderBackup
          volumes:
          - name: ceph
            projected:
              sources:
              - secret:
                  name: ceph-conf-az-0
  4. Update the Image service (glance) in your openstack_control_plane.yaml file to configure Red Hat Ceph Storage RBD as the back end:

          glanceAPIs:
            az0:
              customServiceConfig: |
                [DEFAULT]
                enabled_import_methods = [web-download,copy-image,glance-direct]
                enabled_backends = az0:rbd
                [glance_store]
                default_backend = az0
                [az0]
                rbd_store_ceph_conf = /etc/ceph/az0.conf
                store_description = "az0 RBD backend"
                rbd_store_pool = images
                rbd_store_user = openstack
                rbd_thin_provisioning = True
  5. Apply the changes made to the OpenStackControlPlane CR:

    oc apply -f openstack_control_plane.yaml
  6. Add the AZ to a host aggregate. This allows OpenStack administrators to schedule workloads to a geographical location by specifying an availability zone.

    1. Open a terminal to the openstackclient pod:

      # oc rsh openstackclient
    2. Create a new OpenStack aggregate:

      $ openstack aggregate create <aggregate_name>
    3. Label the OpenStack aggregate with the name of the AZ:

      $ openstack aggregate set --zone <availability_zone> <aggregate_name>
    4. Add each host in the AZ to the aggregate:

      $ openstack aggregate add host <aggregate_name> <compute_node_1>
      $ openstack aggregate add host <aggregate_name> <compute_node_2>
      ...
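    For example, for the central location in this guide, the sequence might look like the following; the aggregate name and host names are illustrative:

      $ openstack aggregate create agg-az0
      $ openstack aggregate set --zone az0 agg-az0
      $ openstack aggregate add host agg-az0 compute-0
      $ openstack aggregate add host agg-az0 compute-1
      $ openstack aggregate add host agg-az0 compute-2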

5.6. Updating the control plane for new edge sites

Update the control plane configuration to integrate newly deployed edge locations. This enables the control plane to manage storage and compute services at the new edge sites.

You can create additional node sets by using the OpenStackDataPlaneNodeSet custom resource (CR). Use a unique availability zone, and the VLANs, NIC mappings, and IP addresses specific to the site you are deploying. For more information about deploying an OpenStackDataPlaneNodeSet CR, see Creating the data plane.

When you deploy a DCN node set with storage, you must update the following fields of the OpenStackControlPlane CR at the central location:

  • cinderVolumes
  • glanceAPIs
  • neutron
  • ovn
Note

If you are using cells, you must also configure cells for the new DCN site.

Prerequisites

  • You have deployed the central location.
  • You have deployed an additional OpenStackDataPlane node set.

Procedure

  1. In the neutron service configuration, update the customServiceConfig field to add the new availability zone and network leaf:

            customServiceConfig: |
              [DEFAULT]
              router_scheduler_driver = neutron.scheduler.l3_agent_scheduler.AZLeastRoutersScheduler
              network_scheduler_driver = neutron.scheduler.dhcp_agent_scheduler.AZAwareWeightScheduler
              default_availability_zones = az0,az1
              [ml2_type_vlan]
              network_vlan_ranges = datacentre:1:1000,leaf1:1:1000
              [neutron]
              physnets = datacentre,leaf1
  2. In the OVN service configuration, update the availability zones:

          ovnController:
            external-ids:
              availability-zones:
              - az0
              - az1
              enable-chassis-as-gateway: true
              ovn-bridge: br-int
              ovn-encap-type: geneve
              system-id: random
            networkAttachment: tenant
            nicMappings:
              datacentre: ospbr
  3. Update the cinderVolumes field in the OpenStackControlPlane CR to add the availability zones definitions of the remote location. Each cinder volume service in each availability zone uses the Glance API server for its availability zone. For example, glance_api_servers = https://glance-az1-internal.openstack.svc:9292:

          cinderVolumes:
            az0:
              customServiceConfig: |
                [DEFAULT]
                ....
            az1:
              customServiceConfig: |
                [DEFAULT]
                enabled_backends = ceph
                glance_api_servers = https://glance-az1-internal.openstack.svc:9292
                [ceph]
                volume_backend_name = az1
                volume_driver = cinder.volume.drivers.rbd.RBDDriver
                rbd_ceph_conf = /etc/ceph/az1.conf
                rbd_user = openstack
                rbd_pool = volumes
                rbd_flatten_volume_from_snapshot = False
                rbd_secret_uuid = 19ccdd60-79a0-5f0f-aece-ece700e514f8
                rbd_cluster_name = az1
                backend_availability_zone = az1
  4. Register an Image service (glance) pod to the Identity service (keystone) catalog:

    In DCN, an Image service pod is deployed for each node set. A single Image service pod is registered to the Identity service catalog at any one time. For this reason, the keystoneEndpoint parameter is defined and exposed in the top-level Glance custom resource (CR). Unless only a single instance is deployed, you can choose which instance to register before you apply the main OpenStackControlPlane CR. Because the default endpoint in this example is the az0 Image service API, keystoneEndpoint is set to az0:

    spec:
      <...>
      glance:
        enabled: true
        keystoneEndpoint: az0
        glanceAPIs:
          az0:
            apiTimeout: 60
  5. Update the glanceAPIs field:

    For the node sets at az0, the glanceAPIs field configures the Image service pods for the central location. When you add an additional node set in AZ1, update the OpenStackControlPlane CR so that the glanceAPIs field contains the Image service (glance) pod definitions for both AZ0 and AZ1. Additionally, the Image service pod for AZ1 defines the Ceph back end for the central location, and the AZ0 Image service pod for the central location is updated so that it has the Ceph back end definition for AZ1.

          glanceAPIs:
            az1:
              customServiceConfig: |
                [DEFAULT]
                enabled_import_methods = [web-download,copy-image,glance-direct]
                enabled_backends = az0:rbd,az1:rbd
                [glance_store]
                default_backend = az1
                [az1]
                rbd_store_ceph_conf = /etc/ceph/az1.conf
                store_description = "az1 RBD backend"
                rbd_store_pool = images
                rbd_store_user = openstack
                rbd_thin_provisioning = True
                [az0]
                rbd_store_ceph_conf = /etc/ceph/az0.conf
                store_description = "az0 RBD backend"
                rbd_store_pool = images
                rbd_store_user = openstack
                rbd_thin_provisioning = True
              networkAttachments:
              - storage
              override:
                service:
                  internal:
                    metadata:
                      annotations:
                        metallb.universe.tf/address-pool: internalapi
                        metallb.universe.tf/allow-shared-ip: internalapi
                        metallb.universe.tf/loadBalancerIPs: 172.17.0.81
                    spec:
                      type: LoadBalancer
              replicas: 2
              type: edge
            az0:
              customServiceConfig: |
                [DEFAULT]
                enabled_import_methods = [web-download,copy-image,glance-direct]
                enabled_backends = az0:rbd,az1:rbd
                [glance_store]
                default_backend = az0
                [az0]
                rbd_store_ceph_conf = /etc/ceph/az0.conf
                store_description = "az0 RBD backend"
                rbd_store_pool = images
                rbd_store_user = openstack
                rbd_thin_provisioning = True
                [az1]
                rbd_store_ceph_conf = /etc/ceph/az1.conf
                store_description = "az1 RBD backend"
                rbd_store_pool = images
                rbd_store_user = openstack
                rbd_thin_provisioning = True
              networkAttachments:
              - storage
              override:
                service:
                  internal:
                    metadata:
                      annotations:
                        metallb.universe.tf/address-pool: internalapi
                        metallb.universe.tf/allow-shared-ip: internalapi
                        metallb.universe.tf/loadBalancerIPs: 172.17.0.80
                    spec:
                      type: LoadBalancer
              replicas: 3
              type: split
    Note

    Availability zone az0 is of type split, and all other availability zones are of type edge.

    The split type is for cloud users to use when they upload images. The edge type exists so that when the Block Storage service (cinder) or the Compute service (nova) interacts with the Image service (glance), it can be configured to use the Image service instance that is local to it. Use at least 3 replicas for the default split Image service pods and 2 replicas for the edge Image service pods, and increase the replicas proportionally to the workload.

  6. Optional: Update the Cell configuration.

    By default, compute nodes across all availability zones (AZs) are placed in a common cell called cell1. You can increase the performance of large deployments by partitioning compute nodes into separate cells. For DCN deployments, place each availability zone into its own cell. For more information, see Adding Compute cells to the control plane.
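    As a sketch, placing a new availability zone in its own cell means adding a cell entry under cellTemplates in the OpenStackControlPlane CR; the cell, database, and message bus instance names below are illustrative:

        cellTemplates:
          cell1:
            hasAPIAccess: true
            cellDatabaseAccount: nova-cell1
            cellDatabaseInstance: openstack-cell1
            cellMessageBusInstance: rabbitmq-cell1
          cell2:
            hasAPIAccess: true
            cellDatabaseAccount: nova-cell2
            cellDatabaseInstance: openstack-cell2
            cellMessageBusInstance: rabbitmq-cell2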

  7. Apply the changes made to the OpenStackControlPlane CR:

    oc apply -f openstack_control_plane.yaml
  8. Continue to update the control plane for each additional edge site that is added. Add the Red Hat Ceph Storage (RHCS) configuration for each new site to your OpenShift secrets as needed.

    1. In the neutron service configuration, update the customServiceConfig field to add the new availability zone and network leaf:

              customServiceConfig: |
                [DEFAULT]
                router_scheduler_driver = neutron.scheduler.l3_agent_scheduler.AZLeastRoutersScheduler
                network_scheduler_driver = neutron.scheduler.dhcp_agent_scheduler.AZAwareWeightScheduler
                default_availability_zones = az0,az1,az2
                [ml2_type_vlan]
                network_vlan_ranges = datacentre:1:1000,leaf1:1:1000,leaf2:1:1000
                [neutron]
                physnets = datacentre,leaf1,leaf2
    2. In the OVN service configuration, update the availability zones:

            ovnController:
              external-ids:
                availability-zones:
                - az0
                - az1
                - az2
                enable-chassis-as-gateway: true
                ovn-bridge: br-int
                ovn-encap-type: geneve
                system-id: random
              networkAttachment: tenant
              nicMappings:
                datacentre: ospbr
    3. Add an additional availability zone to the cinderVolumes field:

           cinderVolumes:
              az0:
                customServiceConfig: |
                  [DEFAULT]
                  ...
              az1:
                customServiceConfig: |
                  [DEFAULT]
                  ...
              az2:
                customServiceConfig: |
                  [DEFAULT]
                  enabled_backends = ceph
                  glance_api_servers = https://glance-az2-internal.openstack.svc:9292
                  [ceph]
                  volume_backend_name = ceph
                  volume_driver = cinder.volume.drivers.rbd.RBDDriver
                  rbd_ceph_conf = /etc/ceph/az2.conf
                  rbd_user = openstack
                  rbd_pool = volumes
                  rbd_flatten_volume_from_snapshot = False
                  rbd_secret_uuid = 5c0c7a8e-55b1-5fa8-bc5c-9756b7862d2f
                  rbd_cluster_name = az2
                  backend_availability_zone = az2
    4. Add an additional availability zone to the glanceAPIs field:

      As you add additional AZs, you must ensure that each Image service pod definition contains the storage configuration of the central location (AZ0), and its own local ceph configuration. You must also ensure that the central location has the storage definition of all other sites. This creates a hub and spoke relationship between the central location Image service pod, and the Image service pods for geographically dispersed node sets:

            glanceAPIs:
              az0:
                customServiceConfig: |
                  [DEFAULT]
                  enabled_import_methods = [web-download,copy-image,glance-direct]
                  enabled_backends = az0:rbd,az1:rbd,az2:rbd
                  [glance_store]
                  default_backend = az0
                  [az0]
                  rbd_store_ceph_conf = /etc/ceph/az0.conf
                  store_description = "az0 RBD backend"
                  rbd_store_pool = images
                  rbd_store_user = openstack
                  rbd_thin_provisioning = True
                  [az1]
                  rbd_store_ceph_conf = /etc/ceph/az1.conf
                  store_description = "az1 RBD backend"
                  rbd_store_pool = images
                  rbd_store_user = openstack
                  rbd_thin_provisioning = True
                  [az2]
                  rbd_store_ceph_conf = /etc/ceph/az2.conf
                  store_description = "az2 RBD backend"
                  rbd_store_pool = images
                  rbd_store_user = openstack
                  rbd_thin_provisioning = True
                networkAttachments:
                - storage
                override:
                  service:
                    internal:
                      metadata:
                        annotations:
                          metallb.universe.tf/address-pool: internalapi
                          metallb.universe.tf/allow-shared-ip: internalapi
                          metallb.universe.tf/loadBalancerIPs: 172.17.0.80
                      spec:
                        type: LoadBalancer
                replicas: 3
                type: split
              az1:
                customServiceConfig: |
                  [DEFAULT]
                  enabled_import_methods = [web-download,copy-image,glance-direct]
                  enabled_backends = az0:rbd,az1:rbd
                  [glance_store]
                  default_backend = az1
                  [az1]
                  rbd_store_ceph_conf = /etc/ceph/az1.conf
                  store_description = "az1 RBD backend"
                  rbd_store_pool = images
                  rbd_store_user = openstack
                  rbd_thin_provisioning = True
                  [az0]
                  rbd_store_ceph_conf = /etc/ceph/az0.conf
                  store_description = "az0 RBD backend"
                  rbd_store_pool = images
                  rbd_store_user = openstack
                  rbd_thin_provisioning = True
                networkAttachments:
                - storage
                override:
                  service:
                    internal:
                      metadata:
                        annotations:
                          metallb.universe.tf/address-pool: internalapi
                          metallb.universe.tf/allow-shared-ip: internalapi
                          metallb.universe.tf/loadBalancerIPs: 172.17.0.81
                      spec:
                        type: LoadBalancer
                replicas: 2
                type: edge
              az2:
                customServiceConfig: |
                  [DEFAULT]
                  enabled_import_methods = [web-download,copy-image,glance-direct]
                  enabled_backends = az0:rbd,az2:rbd
                  [glance_store]
                  default_backend = az2
                  [az2]
                  rbd_store_ceph_conf = /etc/ceph/az2.conf
                  store_description = "az2 RBD backend"
                  rbd_store_pool = images
                  rbd_store_user = openstack
                  rbd_thin_provisioning = True
                  [az0]
                  rbd_store_ceph_conf = /etc/ceph/az0.conf
                  store_description = "az0 RBD backend"
                  rbd_store_pool = images
                  rbd_store_user = openstack
                  rbd_thin_provisioning = True
                networkAttachments:
                - storage
                override:
                  service:
                    internal:
                      metadata:
                        annotations:
                          metallb.universe.tf/address-pool: internalapi
                          metallb.universe.tf/allow-shared-ip: internalapi
                          metallb.universe.tf/loadBalancerIPs: 172.17.0.82
                      spec:
                        type: LoadBalancer
                replicas: 2
                type: edge
  9. Apply the changes made to the OpenStackControlPlane CR:

    oc apply -f openstack_control_plane.yaml
  10. Add the AZ to a host aggregate. This allows OpenStack administrators to schedule workloads to a geographical location by passing the --availability-zone argument:

    1. Open a terminal to the openstackclient pod:

      # oc rsh openstackclient
    2. Create a new OpenStack aggregate:

      $ openstack aggregate create <aggregate_name>
    3. Label the OpenStack aggregate with the name of the AZ:

      $ openstack aggregate set --zone <availability_zone> <aggregate_name>
    4. Add each host in the AZ to the aggregate:

      $ openstack aggregate add host <aggregate_name> <compute_node_1>
      $ openstack aggregate add host <aggregate_name> <compute_node_2>
      ...

5.7. Adding nodes to a DCN site

Scale out your data plane capacity at a distributed compute node (DCN) site by adding new compute nodes. This increases the available resources for running workloads at the edge location.

Prerequisites

Procedure

  1. Open the OpenStackDataPlaneNodeSet manifest file for the node set you want to update, for example, openstack_data_plane.yaml.
  2. Add the new node to the node set:

    apiVersion: dataplane.openstack.org/v1beta1
    kind: OpenStackDataPlaneNodeSet
    metadata:
      name: openstack-node-set
    spec:
      preProvisioned: True
      nodes:
      ...
        edpm-compute-3:
          hostName: edpm-compute-3
          ansible:
            ansibleHost: 192.168.122.103
          networks:
          - name: ctlplane
            subnetName: subnet1
            defaultRoute: true
            fixedIP: 192.168.122.103
          - name: internalapi
            subnetName: subnet1
          - name: storage
            subnetName: subnet1
          - name: tenant
            subnetName: subnet1
      ...
  3. In the OpenStackDataPlaneNodeSet manifest file, edit the edpm_network_config_os_net_config_mappings to add the mac address of the new host:

           ...
           edpm_network_config_os_net_config_mappings:
              edpm-compute-0:
                nic1: 52:54:00:1e:af:6b
                nic2: 52:54:00:d9:cb:f4
              edpm-compute-1:
                nic1: 52:54:00:f2:bc:af
                nic2: 52:54:00:f1:c7:dd
              edpm-compute-2:
                nic1: 52:54:00:dd:33:14
                nic2: 52:54:00:50:fb:c3
              edpm-compute-3:
                nic1: 52:54:12:8b:be:9a
                nic2: 52:54:12:8b:be:9b
  4. (Optional) If you are adding a node to a data plane deployment with HCI, then you must complete the following:

    1. The extraMounts parameter must be present to define the cephx key and configuration file for the Compute service (nova). Ensure that your extraMounts configuration is present in your OpenStackDataPlaneNodeSet manifest file:

      apiVersion: dataplane.openstack.org/v1beta1
      kind: OpenStackDataPlaneNodeSet
      spec:
        ...
        nodeTemplate:
          extraMounts:
          - extraVolType: Ceph
            volumes:
            - name: ceph
              secret:
                secretName: ceph-conf-files
            mounts:
            - name: ceph
              mountPath: "/etc/ceph"
              readOnly: true
    2. Specify the additional services to apply to the nodes:

      apiVersion: dataplane.openstack.org/v1beta1
      kind: OpenStackDataPlaneNodeSet
      spec:
        ...
        services:
          - bootstrap
          - configure-network
          - validate-network
          - install-os
          - ceph-hci-pre
          - configure-os
          - ssh-known-hosts
          - run-os
          - reboot-os
          - install-certs
          - ceph-client
          - ovn
          - neutron-metadata
          - libvirt
          - nova-custom-ceph-az0
      Note

      The nova-custom-ceph-az0 service is created when you configure the DCN data plane, and must be present during this step. For more information, see Configuring the DCN data plane.

  5. Save the OpenStackDataPlaneNodeSet manifest file.
  6. Apply the updated OpenStackDataPlaneNodeSet CR configuration:

    $ oc apply -f <data-plane-custom-resource-file>
    • Replace <data-plane-custom-resource-file> with the name of the manifest file you have edited.
  7. Verify that the data plane resource is updated by confirming that the status is SetupReady:

    $ oc wait openstackdataplanenodeset <node-set-name> --for condition=SetupReady --timeout=10m
    • Replace <node-set-name> with the name of the OpenStackDataPlaneNodeSet CR that you are adding the node to.

    When the status is SetupReady, the command returns a condition met message, otherwise it returns a timeout error. For information about the data plane conditions and states, see Data plane conditions and states in Deploying Red Hat OpenStack Services on OpenShift.

  8. Create a file on your workstation to define the OpenStackDataPlaneDeployment CR:

    apiVersion: dataplane.openstack.org/v1beta1
    kind: OpenStackDataPlaneDeployment
    metadata:
      name: <node_set_scaling_deployment_name>
    • Replace <node_set_scaling_deployment_name> with the name of the OpenStackDataPlaneDeployment CR. The name must be unique, and cannot match the previously created OpenStackDataPlaneDeployment. The name you choose must consist of lower case alphanumeric characters, - (hyphen) or . (period), and it must start and end with an alphanumeric character.
    Tip

    Give the definition file and the OpenStackDataPlaneDeployment CR unique and descriptive names that indicate the purpose of the modified node set.

  9. Add the OpenStackDataPlaneNodeSet CR that you modified:

    spec:
      nodeSets:
        - <nodeSet_name>
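    Putting these fields together, a complete OpenStackDataPlaneDeployment CR might look like the following sketch; the metadata name is illustrative, and the node set name matches the earlier scale-out example:

      apiVersion: dataplane.openstack.org/v1beta1
      kind: OpenStackDataPlaneDeployment
      metadata:
        name: openstack-node-set-scale-out
      spec:
        nodeSets:
          - openstack-node-set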
  10. Save the OpenStackDataPlaneDeployment CR deployment file.
  11. Deploy the modified OpenStackDataPlaneNodeSet CR:

    $ oc create -f openstack_data_plane_deploy.yaml -n openstack

    You can view the Ansible logs while the deployment executes:

    $ oc get pod -l app=openstackansibleee -w
    $ oc logs -l app=openstackansibleee -f --max-log-requests 10

    If the oc logs command returns an error similar to the following error, increase the --max-log-requests value:

    error: you are attempting to follow 19 log streams, but maximum allowed concurrency is 10, use --max-log-requests to increase the limit
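    For example, you might rerun the command with a higher limit:

    $ oc logs -l app=openstackansibleee -f --max-log-requests 20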
  12. Verify that the modified OpenStackDataPlaneNodeSet CR is deployed:

    $ oc get openstackdataplanedeployment -n openstack
    NAME                  NODESETS                  STATUS  MESSAGE
    openstack-data-plane  ["openstack-data-plane"]  True    Setup Complete
    
    $ oc get openstackdataplanenodeset -n openstack
    NAME                  STATUS  MESSAGE
    openstack-data-plane  True    NodeSet Ready

    For information about the meaning of the returned status, see Data plane conditions and states in Deploying Red Hat OpenStack Services on OpenShift.

    If the status indicates that the data plane has not been deployed, then troubleshoot the deployment. For more information about troubleshooting, see Troubleshooting data plane creation and deployment in Deploying Red Hat OpenStack Services on OpenShift.

  13. Optional: If you are adding a node to a data plane deployment with HCI, you must configure the node as a Ceph OSD node and configure it to use the collocated Red Hat Ceph Storage cluster. For more information, see "Adding a Ceph OSD node" in the Red Hat Ceph Storage Operations Guide.

  14. If the new nodes are Compute nodes, you must bring them online:

    1. Map the Compute nodes to the Compute cell that they are connected to:

      $ oc rsh nova-cell0-conductor-0 nova-manage cell_v2 discover_hosts --verbose

      If you did not create additional cells, this command maps the Compute nodes to cell1.

    2. Access the remote shell for the openstackclient pod and verify that the deployed Compute nodes are visible on the control plane:

      $ oc rsh -n openstack openstackclient
      $ openstack hypervisor list
      Note

      Use the Ceph Orchestrator with Cephadm in the back end to add hosts to an existing Red Hat Ceph Storage cluster. For more information, see Adding hosts using the Ceph Orchestrator.

  15. Add the new host to the availability zone (AZ) that corresponds to the node set or geographic location in which they reside:

    1. Open a terminal to the openstackclient pod:

      $ oc rsh openstackclient
    2. Add the host to the AZ aggregate:

      $ openstack aggregate add host <availability_zone> <compute_node_3>
      • Replace <availability_zone> with the name of the AZ that the new host is in.
      • Replace <compute_node_3> with the name of the host you are adding to the node set.

Chapter 6. Removing a DCN node set

Remove an unused edge location and its availability zone from the DCN configuration. This allows you to decommission sites that are no longer needed and reclaim resources.

6.1. Decommissioning a DCN edge site

Decommission edge sites that are no longer required to reclaim hardware and clean up resources. This process safely removes a site from your DCN deployment while preserving the integrity of remaining sites.

Prerequisites

  • All data and workloads on the DCN site are migrated off.

    Important

    Unsaved data and workloads that are not migrated are lost after completing this procedure.
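    As a sketch of how you might confirm that nothing remains at the site before you continue, list servers and volumes across all projects and check their availability zones; adjust the commands to your environment:

      $ oc rsh -n openstack openstackclient
      $ openstack server list --all-projects --long
      $ openstack volume list --all-projects --long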

Procedure

  1. Delete the host aggregate for the DCN site that you want to remove, for example, “az2”:

    1. Access the remote shell for the OpenStackClient pod from your workstation:

      $ oc rsh -n openstack openstackclient
    2. Optional: View the Compute nodes assigned to the host aggregate:

      # openstack aggregate show <aggregate_name>
    3. To remove the assigned Compute nodes from the host aggregate, enter the following command for each Compute node:

      # openstack aggregate remove host <aggregate_name> \
      <host_name>
      • Replace <aggregate_name> with the aggregate that corresponds to the DCN site to be removed, for example "az2".
      • Replace <host_name> with each host in the aggregate being removed, in turn.
    4. Remove the aggregate:

      # openstack aggregate delete <aggregate_name>
      • Replace <aggregate_name> with the aggregate that corresponds to the DCN site to be removed, for example "az2".
    5. Exit the openstackclient pod:

      $ exit
  2. Optional: Remove the cell:

    1. Open your OpenStackControlPlane CR file, openstack_control_plane.yaml, on your workstation.
    2. Remove the cell definition from the cellTemplates:

            cellTemplates:
              cell0:
                hasAPIAccess: true
                cellDatabaseAccount: nova-cell0
                cellDatabaseInstance: openstack
                cellMessageBusInstance: rabbitmq
              cell1:
                hasAPIAccess: true
                cellDatabaseAccount: nova-cell1
                cellDatabaseInstance: openstack-cell1
                cellMessageBusInstance: rabbitmq-cell1
              cell2:
                hasAPIAccess: true
                cellDatabaseAccount: nova-cell2
                cellDatabaseInstance: openstack-cell2
                cellMessageBusInstance: rabbitmq-cell2
      -       cell3:
      -         hasAPIAccess: true
      -         cellDatabaseAccount: nova-cell3
      -         cellDatabaseInstance: openstack-cell3
      -         cellMessageBusInstance: rabbitmq-cell3
    3. Delete the cell-specific RabbitMQ definition from the OpenStackControlPlane CR:

      spec:
      ...
        rabbitmq:
          templates:
            ...
            rabbitmq-<cellname>:
              ...
    4. Delete the cell-specific Galera definition from the OpenStackControlPlane CR file:

      spec:
      ...
        galera:
          templates:
            ...
            openstack-<cellname>:
              ...
    5. Update the control plane:

      $ oc apply -f openstack_control_plane.yaml -n openstack
  3. Remove the Block Storage (cinder) pods for the DCN site:

    1. Get a list of the Block Storage volume services:

      $ openstack volume service list

      Example

      +------------------+--------------------------+------+---------+-------+----------------------------+
      | Binary           | Host                     | Zone | Status  | State | Updated At                 |
      +------------------+--------------------------+------+---------+-------+----------------------------+
      | cinder-scheduler | cinder-e479e-scheduler-0 | nova | enabled | down  | 2024-11-10T16:29:40.000000 |
      | cinder-scheduler | cinder-scheduler-0       | nova | enabled | up    | 2024-11-12T19:11:08.000000 |
      | cinder-volume    | cinder-volume-az0-0@ceph | az0  | enabled | up    | 2024-11-12T19:11:09.000000 |
      | cinder-backup    | cinder-backup-0          | nova | enabled | up    | 2024-11-12T19:11:08.000000 |
      | cinder-backup    | cinder-backup-1          | nova | enabled | up    | 2024-11-12T19:11:13.000000 |
      | cinder-backup    | cinder-backup-2          | nova | enabled | up    | 2024-11-12T19:11:16.000000 |
      | cinder-volume    | cinder-volume-az1-0@ceph | az1  | enabled | up    | 2024-11-12T19:11:15.000000 |
      | cinder-volume    | cinder-volume-az2-0@ceph | az2  | enabled | up    | 2024-11-12T17:28:28.000000 |
      +------------------+--------------------------+------+---------+-------+----------------------------+
    2. Disable the Block Storage volume service for the availability zone (AZ) that is being removed:

      $ openstack volume service set \
      --disable cinder-volume-az2-0@ceph cinder-volume
    3. Verify that the Block Storage volume service is disabled:

      $ openstack volume service list

      Example

      +------------------+--------------------------+------+----------+-------+----------------------------+
      | Binary           | Host                     | Zone | Status   | State | Updated At                 |
      +------------------+--------------------------+------+----------+-------+----------------------------+
      | cinder-scheduler | cinder-e479e-scheduler-0 | nova | enabled  | down  | 2024-11-10T16:29:40.000000 |
      | cinder-scheduler | cinder-scheduler-0       | nova | enabled  | up    | 2024-11-12T19:23:38.000000 |
      | cinder-volume    | cinder-volume-az0-0@ceph | az0  | enabled  | up    | 2024-11-12T19:23:29.000000 |
      | cinder-backup    | cinder-backup-0          | nova | enabled  | up    | 2024-11-12T19:23:38.000000 |
      | cinder-backup    | cinder-backup-1          | nova | enabled  | up    | 2024-11-12T19:23:33.000000 |
      | cinder-backup    | cinder-backup-2          | nova | enabled  | up    | 2024-11-12T19:23:36.000000 |
      | cinder-volume    | cinder-volume-az1-0@ceph | az1  | enabled  | up    | 2024-11-12T19:23:35.000000 |
      | cinder-volume    | cinder-volume-az2-0@ceph | az2  | disabled | up    | 2024-11-12T19:23:24.000000 |
      +------------------+--------------------------+------+----------+-------+----------------------------+
    4. Open the OpenStackControlPlane manifest file, openstack_control_plane.yaml. Remove the CinderVolume pod for the site being removed, as shown in the following example:

           cinderVolumes:
              az2:
                customServiceConfig: |
                  [DEFAULT]
                  enabled_backends = ceph
                  glance_api_servers = https://glance-az2-internal.openstack.svc:9292
                  [ceph]
                  volume_backend_name = ceph
                  volume_driver = cinder.volume.drivers.rbd.RBDDriver
                  rbd_ceph_conf = /etc/ceph/az2.conf
                  rbd_user = openstack
                  rbd_pool = volumes
                  rbd_flatten_volume_from_snapshot = False
                  rbd_secret_uuid = 795dcbca-e715-5ac3-9b7e-a3f5c64eb89f
                  rbd_cluster_name = az2
                  backend_availability_zone = az2
  4. Remove the cinder volume service for the DCN site you are removing:

    1. Open a shell to the cinder scheduler pod:

      $ oc rsh cinder-scheduler-0
    2. Remove the cinder volume service:

      $ cinder-manage service remove cinder-volume cinder-volume-az2-0@ceph
    3. Exit the shell:

      $ exit
  5. Remove the GlanceAPI pod for the site being removed:

    1. In the openstack_control_plane.yaml custom resource (CR) file, remove the az2 field and all fields under it:

           glanceAPIs:
              az0:
                <...>
              az1:
                <...>
               az2:
                apiTimeout: 60
                customServiceConfig: |
                  [DEFAULT]
                  enabled_import_methods = [web-download,copy-image,glance-direct]
                  enabled_backends = az0:rbd,az2:rbd
                  [glance_store]
                  default_backend = az2
                  [az0]
                  rbd_store_ceph_conf = /etc/ceph/az0.conf
                  store_description = "az0 RBD backend"
                  rbd_store_pool = images
                  rbd_store_user = openstack
                  rbd_thin_provisioning = True
                  [az2]
                  rbd_store_ceph_conf = /etc/ceph/az2.conf
                  store_description = "az2 RBD backend"
                  rbd_store_pool = images
                  rbd_store_user = openstack
                  rbd_thin_provisioning = True
                imageCache:
                  cleanerScheduler: '*/30 * * * *'
                  prunerScheduler: 1 0 * * *
                  size: ""
                networkAttachments:
                - storage
                override:
                  service:
                    internal:
                      metadata:
                        annotations:
                          metallb.universe.tf/address-pool: internalapi
                          metallb.universe.tf/allow-shared-ip: internalapi
                          metallb.universe.tf/loadBalancerIPs: 172.17.0.82
                      spec:
                        type: LoadBalancer
                replicas: 1
                resources: {}
                storage: {}
                tls:
                  api:
                    internal: {}
                    public: {}
                type: edge
    2. Reapply the control plane:

      $ oc apply -f openstack_control_plane.yaml -n openstack
    3. Ensure that the Image service (glance) pods have been removed:

      $ oc get pods | grep glance | grep -v purge

      Example

      glance-e479e-az0-external-api-0           3/3     Running     0             2d
      glance-e479e-az0-external-api-1           3/3     Running     0             2d
      glance-e479e-az0-external-api-2           3/3     Running     0             2d
      glance-e479e-az0-internal-api-0           3/3     Running     0             2d
      glance-e479e-az0-internal-api-1           3/3     Running     0             2d
      glance-e479e-az0-internal-api-2           3/3     Running     0             2d
      glance-e479e-az1-edge-api-0
      Note

      The glance-e479e-az2-edge-api-0 pod no longer appears in this list.

    4. Ensure that the az2 Block Storage volume pods are removed:

      $ oc get pods | grep cinder-volume

      Example

      cinder-volume-az0-0       2/2     Running     0   2d
      cinder-volume-az1-0       2/2     Running     0   2d
  6. Remove the Ceph cluster from the DCN site:

    1. Shut down the Red Hat Ceph Storage cluster at the edge site, but do not power off the hosts. For more information, see "Powering down and rebooting the cluster using the Ceph Orchestrator" in the Red Hat Ceph Storage Administration Guide.

    2. Remove the secret that was used for accessing the removed Ceph cluster:

      $ oc delete secret ceph-conf-az-2 -n openstack
    3. Re-create the secret for the central site so that it does not contain the secret for the Red Hat Ceph Storage cluster at az2. For example, if you have three availability zones, az0, az1, and az2, and you are removing the edge location that corresponds to az2, run the following commands:

      $ oc delete secret ceph-conf-az-0 -n openstack
      $ oc create secret generic ceph-conf-az-0 \
      --from-file=az0.client.openstack.keyring \
      --from-file=az0.conf \
      --from-file=az1.client.openstack.keyring \
      --from-file=az1.conf -n openstack
    4. Edit the extraMounts field in the OpenStackControlPlane CR to remove the reference to the AZ being removed and to the removed secret. For example, for az2, remove the following list element:

              - propagation:
                - az2
                extraVolType: Ceph
                volumes:
                - name: ceph
                  secret:
                    name: ceph-conf-az-2
  7. Remove the node set. To remove the node set that corresponds to the az2 availability zone, complete the steps in Removing an OpenStackDataPlaneNodeSet resource.

Chapter 7. Removing a DCN node

Remove a node from a DCN site when it is no longer needed. This allows you to repurpose or decommission the hardware while maintaining site operations.

7.1. Cold migrating an instance

Cold migrating an instance involves stopping the instance and moving it to another Compute node. Cold migration supports migration scenarios that live migration cannot, such as migrating instances that use PCI passthrough.

The scheduler automatically selects the destination Compute node. For more information, see Migration constraints.

Procedure

  1. Access the remote shell for the OpenStackClient pod from your workstation:

    $ oc rsh -n openstack openstackclient
  2. To cold migrate an instance, enter the following command to power off and move the instance:

     $ openstack server migrate <instance> --wait
    • Replace <instance> with the name or ID of the instance to migrate.
    • Specify the --block-migration flag if you are migrating a locally stored volume.
    • Specify the --wait flag to wait for the migration to complete.
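
    For example, a sketch of the command for a hypothetical instance named pet-server-az1 that uses locally stored volumes:

     $ openstack server migrate pet-server-az1 --block-migration --wait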
  3. While you wait for the instance migration to complete, you can open another terminal window and check the migration status. For more information, see Checking migration status.
  4. Check the status of the instance:

     $ openstack server list --all-projects

    A status of "VERIFY_RESIZE" indicates you need to confirm or revert the migration:

    • If the migration worked as expected, confirm it:

       $ openstack server resize --confirm <instance>

      Replace <instance> with the name or ID of the instance to migrate. A status of "ACTIVE" indicates that the instance is ready to use.

    • If the migration did not work as expected, revert it:

       $ openstack server resize --revert <instance>

      Replace <instance> with the name or ID of the instance.

  5. Restart the instance:

     $ openstack server start <instance>

    Replace <instance> with the name or ID of the instance.

  6. Optional: If you disabled the source Compute node for maintenance, you must re-enable the node so that new instances can be assigned to it:

     $ openstack compute service set <source> nova-compute --enable

    Replace <source> with the hostname of the source Compute node.

  7. Exit the OpenStackClient pod:

    $ exit

7.2. Removing a compute node from a host aggregate

Remove a compute node from its host aggregate before decommissioning or reassigning the node. This ensures that the scheduler stops directing workloads to the node.

Procedure

  1. Access the remote shell for the OpenStackClient pod from your workstation:

    $ oc rsh -n openstack openstackclient
  2. View a list of all the Compute nodes assigned to the host aggregate:

    # openstack aggregate show <aggregate_name>
  3. To remove an assigned Compute node from the host aggregate, enter the following command:

    # openstack aggregate remove host <aggregate_name> <host_name>
    • Replace <aggregate_name> with the name of the host aggregate to remove the Compute node from.
    • Replace <host_name> with the name of the Compute node to remove from the host aggregate.
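
    For example, assuming a host aggregate named az2 and a Compute node named edpm-compute-az2-1 (a hypothetical name):

    # openstack aggregate remove host az2 edpm-compute-az2-1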

7.3. Removing a host from the Ceph cluster

Remove a host from your hyperconverged Ceph cluster before decommissioning the node. This process safely removes the node from the Ceph cluster while maintaining cluster availability.

Prerequisites

  • You are running hyperconverged infrastructure (HCI) on a DCN cluster.
  • You have root-level access to all the nodes.
  • The host you are removing is added to the storage cluster.
  • Cephadm is deployed on the node where the service is to be removed.

Procedure

  1. Log in to the Cephadm shell:

    [root@host01 ~]# cephadm shell
  2. Fetch the host details:

    [ceph: root@host01 /]# ceph orch host ls
  3. Drain all the daemons from the host:

    [ceph: root@host01 /]# ceph orch host drain <hostname>
  4. Check the status of OSD removal:

    [ceph: root@host01 /]# ceph orch osd rm status

    When no placement groups (PG) are left on the OSD, the OSD is decommissioned and removed from the storage cluster.

  5. Check if all the daemons are removed from the storage cluster:

    [ceph: root@host01 /]# ceph orch ps <hostname>
    • Replace <hostname> with the name of the host that you are removing.
  6. Remove the host:

    [ceph: root@host01 /]# ceph orch host rm <hostname>
    • Replace <hostname> with the name of the host that you are removing.

7.4. Removing a Compute node from the data plane

You can remove a Compute node from a node set on the data plane. If you remove all the nodes from a node set, then you must also remove the node set from the data plane.

Prerequisites

  • You are logged in to the RHOCP cluster as a user with cluster-admin privileges.
  • The workloads on the Compute nodes have been migrated to other Compute nodes.

Procedure

  1. Access the remote shell for the openstackclient pod:

    $ oc rsh -n openstack openstackclient
  2. Retrieve the IP address of the Compute node that you want to remove:

    $ openstack hypervisor list
  3. Retrieve a list of your Compute nodes to identify the name and UUID of the node that you want to remove:

    $ openstack compute service list
  4. Disable the nova-compute service on the Compute node to be removed:

    $ openstack compute service set <hostname> nova-compute --disable
    Tip

    Use the --disable-reason option to add a short explanation on why the service is being disabled. This is useful if you intend to redeploy the Compute service.
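
    For example, a sketch that records a hypothetical reason while disabling the service:

     $ openstack compute service set <hostname> nova-compute --disable --disable-reason "Removing node from the az2 node set"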

  5. Exit the OpenStackClient pod:

    $ exit
  6. SSH into the Compute node to be removed and stop the ovn and nova-compute containers:

    $ ssh -i <key_file_name> cloud-admin@<node_IP_address>
    [cloud-admin@<hostname> ~]$ sudo systemctl stop edpm_ovn_controller
    [cloud-admin@<hostname> ~]$ sudo systemctl stop edpm_ovn_metadata_agent
    [cloud-admin@<hostname> ~]$ sudo systemctl stop edpm_nova_compute
    • Replace <key_file_name> with the name and location of the SSH key pair file you created to enable Ansible to manage the RHEL nodes.
    • Replace <node_IP_address> with the IP address for the Compute node that you retrieved in step 2.
  7. Remove the systemd unit files that manage the ovn and nova-compute containers to prevent the agents from being automatically started and registered in the database if the removed node is rebooted:

    [cloud-admin@<hostname> ~]$ sudo rm -f /etc/systemd/system/edpm_ovn_controller
    [cloud-admin@<hostname> ~]$ sudo rm -f /etc/systemd/system/edpm_ovn_metadata_agent
    [cloud-admin@<hostname> ~]$ sudo rm -f /etc/systemd/system/edpm_nova_compute
  8. Disconnect from the Compute node:

    $ exit
  9. Access the remote shell for openstackclient:

    $ oc rsh -n openstack openstackclient
  10. Delete the network agents for the Compute node to be removed:

    $ openstack network agent list [--host <hostname>]
    $ openstack network agent delete <agent_id>
    • Replace <hostname> with the host name of the Compute node that you are removing.
    • Replace <agent_id> with the ID of each network agent returned for that Compute node.
  11. Delete the nova-compute service for the Compute node to be removed:

    $ openstack compute service delete <node_uuid>
    • Replace <node_uuid> with the UUID of the node to be removed that you retrieved in step 3.
  12. Exit the OpenStackClient pod:

    $ exit
  13. Remove the node from the OpenStackDataPlaneNodeSet CR:

    $ oc patch openstackdataplanenodeset/<node_set_name> --type json --patch '[{ "op": "remove", "path": "/spec/nodes/<node_name>" }]'
    • Replace <node_set_name> with the name of the OpenStackDataPlaneNodeSet CR that the node belongs to.
    • Replace <node_name> with the name of the node defined in the nodes section of the OpenStackDataPlaneNodeSet CR.
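
    For example, a sketch that removes a hypothetical node named edpm-compute-az2-2 from a node set named openstack-edge-az2:

    $ oc patch openstackdataplanenodeset/openstack-edge-az2 --type json --patch '[{ "op": "remove", "path": "/spec/nodes/edpm-compute-az2-2" }]'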
  14. Create a file on your workstation to define the OpenStackDataPlaneDeployment CR to update the node set with the Compute node removed:

    apiVersion: dataplane.openstack.org/v1beta1
    kind: OpenStackDataPlaneDeployment
    metadata:
      name: <node_set_deployment_name>
    • Replace <node_set_deployment_name> with the name of the OpenStackDataPlaneDeployment CR. The name must be unique, must consist of lower case alphanumeric characters, - (hyphen) or . (period), and must start and end with an alphanumeric character.
    Tip

    Give the definition file and the OpenStackDataPlaneDeployment CR unique and descriptive names that indicate the purpose of the modified node set.

  15. Add the OpenStackDataPlaneNodeSet CR that you removed the node from:

    spec:
      nodeSets:
        - <nodeSet_name>
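
    Taken together, a minimal sketch of the complete CR, assuming a deployment named openstack-edge-az2-node-removal and a node set named openstack-edge-az2 (both hypothetical names), might look like the following:

    apiVersion: dataplane.openstack.org/v1beta1
    kind: OpenStackDataPlaneDeployment
    metadata:
      name: openstack-edge-az2-node-removal
    spec:
      nodeSets:
        - openstack-edge-az2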
  16. Save the OpenStackDataPlaneDeployment CR deployment file.
  17. Deploy the OpenStackDataPlaneDeployment CR to delete the removed nodes:

    $ oc create -f openstack_data_plane_deploy.yaml -n openstack

    You can view the Ansible logs while the deployment executes:

    $ oc get pod -l app=openstackansibleee -w
    $ oc logs -l app=openstackansibleee -f --max-log-requests 10

    If the oc logs command returns an error similar to the following error, increase the --max-log-requests value:

    error: you are attempting to follow 19 log streams, but maximum allowed concurrency is 10, use --max-log-requests to increase the limit
  18. Verify that the modified OpenStackDataPlaneNodeSet CR is deployed:

    $ oc get openstackdataplanedeployment -n openstack
    NAME                  NODESETS                  STATUS  MESSAGE
    openstack-data-plane  ["openstack-data-plane"]  True    Setup Complete
    
    $ oc get openstackdataplanenodeset -n openstack
    NAME                  STATUS  MESSAGE
    openstack-data-plane  True    NodeSet Ready

    For information about the meaning of the returned status, see Data plane conditions and states in Deploying Red Hat OpenStack Services on OpenShift.

    If the status indicates that the data plane has not been deployed, then troubleshoot the deployment. For more information, see Troubleshooting data plane creation and deployment in Deploying Red Hat OpenStack Services on OpenShift.

Chapter 8. Validating edge storage

To ensure that the deployments of the central and edge sites are working, test glance multistore and instance creation. You can use the openstackclient pod to test commands as an OpenStack administrator.

You can import images into glance that are available on the local file system or on a web server.

Note

Always store an image copy in the central site, even if there are no instances using the image at the central location.

8.1. Viewing Image service stores in DCN

Verify that all Image service (glance) stores are available to confirm that your multi-store configuration is working correctly.

Procedure

  • Check the stores that are available through the Image service by using the glance stores-info command. In the following example, three stores are available: az0, az1, and az2. These correspond to the glance stores at the central location and the edge sites, respectively:

      $ glance stores-info
      +----------+----------------------------------------------------------------------------------+
      | Property | Value                                                                            |
      +----------+----------------------------------------------------------------------------------+
      | stores   | [{"default": "true", "id": "az0", "description": "central rbd glance             |
      |          | store"}, {"id": "az1", "description": "az1 rbd glance store"},                   |
      |          | {"id": "az2", "description": "az2 rbd glance store"}]                            |
      +----------+----------------------------------------------------------------------------------+

8.2. Importing an image from a local file

Import an image from a local file to the central location and distribute it to edge sites. This workflow ensures that images are centrally managed and consistently available across all locations.

Procedure

  1. Ensure that your image file is in raw format. If the image is not in raw format, you must convert the image before importing it into the Image service:

    $ file cirros-0.5.1-x86_64-disk.img
    cirros-0.5.1-x86_64-disk.img: QEMU QCOW2 Image (v3), 117440512 bytes
    
    $ qemu-img convert -f qcow2 -O raw cirros-0.5.1-x86_64-disk.img cirros-0.5.1-x86_64-disk.raw
  2. Import the image into the default back end at the central site:

    openstack image create \
    --disk-format raw --container-format bare \
    --name cirros --file cirros-0.5.1-x86_64-disk.raw \
    --store central

8.3. Importing an image from a web server

Import images directly from a web server to multiple storage locations. This method streamlines image distribution across your DCN deployment by eliminating manual copying steps.

This procedure assumes that the default image conversion plugin is enabled in the Image service (glance). This feature automatically converts QCOW2 file formats into raw images, which are optimal for Ceph RBD. You can confirm that a glance image is in raw format by running the glance image-show <image_id> | grep disk_format command, as shown in the example that follows.
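
The following sketch assumes an image named cirros has already been imported:

    ID=$(openstack image show cirros -c id -f value)
    glance image-show $ID | grep disk_format
    | disk_format      | raw                  |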

Procedure

  1. Use the glance image-create-via-import command to import an image from a web server, and use the --stores parameter to specify the target stores:

    # glance image-create-via-import \
    --disk-format qcow2 \
    --container-format bare \
    --name cirros \
    --uri http://download.cirros-cloud.net/0.4.0/cirros-0.4.0-x86_64-disk.img \
    --import-method web-download \
    --stores az0,az1

    In this example, the QCOW2 CirrOS image is downloaded from the official CirrOS site, converted to raw by glance, and imported into the central site and edge site 1 as specified by the --stores parameter.

    Note

    Alternatively, you can replace the --stores parameter with --all-stores True to upload the image to all of the stores.

8.4. Copying an image to a new site

Copy images from the central location to edge sites to make them available for instance creation. This reduces instance launch time by eliminating the need to transfer images over the WAN during deployment.

  1. Use the UUID of the glance image for the copy operation:

    ID=$(openstack image show cirros -c id -f value)
    
    glance image-import $ID --stores az1,az2 --import-method copy-image
    Note

    In this example, the --stores option specifies that the cirros image is copied from the central site, az0, to the edge sites az1 and az2. Alternatively, you can use the --all-stores True option, which uploads the image to all the stores that do not currently have the image.

  2. Confirm that a copy of the image is in each store. Note that the stores key, which is the last item in the properties map, is set to az0,az1,az2:

    $ openstack image show $ID | grep properties
    
    | properties | os_glance_failed_import=', os_glance_importing_to_stores=', os_hash_algo=sha512, os_hash_value=6b813aa46bb90b4da216a4d19376593fa3f4fc7e617f03a92b7fe11e9a3981cbe8f0959dbebe36225e5f53dc4492341a4863cac4ed1ee0909f3fc78ef9c3e869, os_hidden=False, stores=az0,az1,az2 |
Note

Always store an image copy in the central site, even if there are no instances using the image at that site.

8.5. Creating image-based boot volumes at edge sites

Create persistent boot volumes from images stored at edge sites to verify image-based volume functionality.

Procedure

  1. Identify the ID of the image to use as the volume source, and pass that ID to the openstack volume create command:

    IMG_ID=$(openstack image show cirros -c id -f value)
    openstack volume create --size 8 --availability-zone az1 pet-volume-az1 --image $IMG_ID
  2. Identify the volume ID of the newly created volume and pass it to the openstack server create command:

    VOL_ID=$(openstack volume show -f value -c id pet-volume-az1)
    openstack server create --flavor tiny --key-name az1-key --network az1-network --security-group basic --availability-zone az1 --volume $VOL_ID pet-server-az1
  3. Verify that the volume is based on the image by running the rbd command within a cephadm shell at the az1 edge site to list the volumes pool:

    $ sudo cephadm shell -- rbd -p volumes ls -l
    NAME                                      SIZE  PARENT                                           FMT PROT LOCK
    volume-28c6fc32-047b-4306-ad2d-de2be02716b7 8 GiB images/8083c7e7-32d8-4f7a-b1da-0ed7884f1076@snap   2      excl
    $
  4. Confirm that you can create a cinder snapshot of the root volume of the instance. Ensure that the server is stopped to quiesce data to create a clean snapshot. Use the --force option, because the volume status remains in-use when the instance is off.

    openstack server stop pet-server-az1
    openstack volume snapshot create pet-volume-az1-snap --volume $VOL_ID --force
    openstack server start pet-server-az1
  5. List the contents of the volumes pool on the az1 Ceph cluster to show the newly created snapshot.

    $ sudo cephadm shell -- rbd -p volumes ls -l
    NAME                                                                                      SIZE  PARENT                                           FMT PROT LOCK
    volume-28c6fc32-047b-4306-ad2d-de2be02716b7                                               8 GiB images/8083c7e7-32d8-4f7a-b1da-0ed7884f1076@snap   2      excl
    volume-28c6fc32-047b-4306-ad2d-de2be02716b7@snapshot-a1ca8602-6819-45b4-a228-b4cd3e5adf60 8 GiB images/8083c7e7-32d8-4f7a-b1da-0ed7884f1076@snap   2 yes

8.6. Creating and copying instance snapshots between sites

Create instance snapshots at edge sites and copy them to the central location for backup and image management. This validates your image copying workflow across distributed sites.

  1. Verify that you can create a new image at the az1 location. Ensure that the server is stopped to quiesce data to create a clean snapshot:

    NOVA_ID=$(openstack server show pet-server-az1 -f value -c id)
    openstack server stop $NOVA_ID
    openstack server image create --name cirros-snapshot $NOVA_ID
    openstack server start $NOVA_ID
  2. Copy the image from the az1 edge site back to the central location, which is the default back end for glance:

    IMAGE_ID=$(openstack image show cirros-snapshot -f value -c id)
    glance image-import $IMAGE_ID --stores az0 --import-method copy-image

8.7. Backing up and restoring volumes across edge sites

Back up and restore Block Storage (cinder) volumes across edge sites and the central location for data protection and disaster recovery. All backups are centrally stored and managed to provide a consistent recovery point across your distributed architecture.

Back up and restore operations directly between edge sites are not supported.

Prerequisites

  • The Block Storage backup service is deployed in the central AZ. For more information, see Updating the control plane.
  • You are using Block Storage (cinder) REST API microversion 3.51 or later.
  • All sites use a common openstack cephx client name.

Procedure

  1. Create a backup of a volume in the first DCN site:

    $ openstack --os-volume-api-version 3.51 volume backup create \
    --name <volume_backup> --availability-zone <az0> <edge_volume>
    • Replace <volume_backup> with a name for the volume backup.
    • Replace <az0> with the name of the central availability zone that hosts the cinder-backup service.
    • Replace <edge_volume> with the name of the volume that you want to back up.

      Note

      If you experience issues with Ceph keyrings, you might need to restart the cinder-backup container so that the keyrings copy from the host to the container successfully.

  2. Restore the backup to a new volume in the second DCN site:

    $ openstack --os-volume-api-version 3.47 volume create \
    --backup <volume_backup> --availability-zone <az_2> <new_volume>
    • Replace <az_2> with the name of the availability zone where you want to restore the backup.
    • Replace <new_volume> with a name for the new volume.
    • Replace <volume_backup> with the name of the volume backup that you created in the previous step.
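
    For example, a sketch of the full workflow with hypothetical names, which backs up a volume named az1-data-volume through the central az0 backup service and restores it as a new volume at az2:

    $ openstack --os-volume-api-version 3.51 volume backup create \
    --name az1-data-backup --availability-zone az0 az1-data-volume
    $ openstack --os-volume-api-version 3.47 volume create \
    --backup az1-data-backup --availability-zone az2 az2-data-volume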

Legal Notice

Copyright © Red Hat.
Except as otherwise noted below, the text of and illustrations in this documentation are licensed by Red Hat under the Creative Commons Attribution–Share Alike 3.0 Unported license. If you distribute this document or an adaptation of it, you must provide the URL for the original version.
Red Hat, as the licensor of this document, waives the right to enforce, and agrees not to assert, Section 4d of CC-BY-SA to the fullest extent permitted by applicable law.
Red Hat, the Red Hat logo, JBoss, Hibernate, and RHCE are trademarks or registered trademarks of Red Hat, LLC. or its subsidiaries in the United States and other countries.
Linux® is the registered trademark of Linus Torvalds in the United States and other countries.
XFS is a trademark or registered trademark of Hewlett Packard Enterprise Development LP or its subsidiaries in the United States and other countries.
The OpenStack® Word Mark and OpenStack logo are trademarks or registered trademarks of the Linux Foundation, used under license.
All other trademarks are the property of their respective owners.