Deploying a Distributed Compute Node (DCN) architecture
Edge and storage configuration for Red Hat OpenStack Services on OpenShift
Abstract
Providing feedback on Red Hat documentation
We appreciate your feedback. Tell us how we can improve the documentation.
To provide documentation feedback for Red Hat OpenStack Services on OpenShift (RHOSO), create a Jira issue in the OSPRH Jira project.
Procedure
- Log in to the Red Hat Atlassian Jira.
- Click the following link to open a Create Issue page: Create issue
- Complete the Summary and Description fields. In the Description field, include the documentation URL, chapter or section number, and a detailed description of the issue.
- Click Create.
- Review the details of the issue you created.
Chapter 1. Understanding DCN
An upgrade from Red Hat OpenStack Platform (RHOSP) 17.1 to Red Hat OpenStack Services on OpenShift (RHOSO) 18.0.3 is not currently supported for Distributed Compute Node (DCN) deployments.
Distributed compute node (DCN) architecture is for edge use cases that require Compute and storage nodes to be deployed remotely while sharing a centralized control plane. With DCN architecture, you can position workloads strategically closer to your operational needs for higher performance.
The central location consists of, at a minimum, the RHOSO control plane installed on a Red Hat OpenShift Container Platform (RHOCP) cluster. Compute nodes can also be deployed at the central location. The edge locations consist of Compute and optional storage nodes.
DCN architecture consists of multiple availability zones (AZs) to ensure isolated, per-site scheduling of OpenStack resources.
You configure each site with a unique AZ. In this guide, the central site uses az0, the first edge location uses az1, and so on. You can use any naming convention to ensure that the AZ names are unique per site.
Figure 1.1. Basic distributed compute node architecture with storage
DCN architecture is a hub-and-spoke routed network deployment. DCN is comparable to a spine-leaf deployment for routed provisioning and control plane networking with RHOSO.
- The hub is the central site with core routers and a datacenter gateway (DC-GW). The hub hosts the control plane which manages the geographically dispersed sites.
- The spokes are the remote edge sites. Each site is defined by using an OpenStackDataPlaneNodeSet custom resource. Red Hat Ceph Storage (RHCS) is used as the storage back end. You can deploy RHCS in either a hyperconverged configuration, or as a standalone storage back end.
When you launch an instance at an edge site, the required image is copied to the local Image service (glance) store automatically. You can copy images from the central Image store to edge sites by using glance multistore to save time during instance launch.
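For example, after an image exists in the central store, you can copy it to an edge store ahead of time. The following is a minimal sketch that assumes a raw image already uploaded at the central location and an edge Image service store named az1; the image name, ID, and store names are illustrative:

# Upload the image to the central (az0) store first; raw format is required with Ceph storage.
$ openstack image create --disk-format raw --container-format bare \
    --file ./rhel9.raw rhel9

# Copy the existing image to the az1 edge store by using the Image service import workflow.
$ glance image-import <image_id> --import-method copy-image --stores az1

# The stores property of the image lists every store that now holds a copy.
$ openstack image show <image_id>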
1.1. Required software for DCN architecture
Verify that your environment meets the minimum software version requirements for distributed compute node (DCN) architecture. Meeting these requirements ensures compatibility and support for your distributed deployment.
| Platform | Version | Optional |
|---|---|---|
| Red Hat OpenShift Container Platform | 4.16 | No |
| Red Hat Enterprise Linux | 9.2 | No |
| Red Hat OpenStack Services on OpenShift | 18.0.3 | No |
| Red Hat Ceph Storage | 7 or 8 | Yes |
1.2. DCN storage
Choose from three storage deployment configurations for your central and edge locations. Each configuration option provides different trade-offs between performance, capacity, and operational complexity.
- Without storage.
- Using hyperconverged Ceph storage.
- Using Red Hat Ceph Storage (RHCS) as a standalone storage backend.
The storage you deploy is dedicated to the site you deploy it on. DCN architecture uses an Image service (glance) pod, and a Block Storage service (cinder) pod for each site, deployed at the central location, on the Red Hat OpenShift Container Platform (RHOCP) cluster.
For edge sites deployed without storage, you can use the aggregate cache command to store images in the Compute service (nova) cache. Caching Image service images in the Compute service provides faster boot times for instances by avoiding the process of downloading images across a WAN link.
Example:
$ openstack aggregate cache image <dcn0> <myimage>
- Replace <dcn0> with the name of your availability zone.
- Replace <myimage> with the name of your image.
Red Hat OpenStack Services on OpenShift (RHOSO) supports external deployments of Red Hat Ceph Storage 7 and 8. Configuration examples that reference Red Hat Ceph Storage use Release 7 information. If you are using a later version of Red Hat Ceph Storage, adjust the configuration examples accordingly.
Chapter 2. Planning a DCN deployment
Plan your distributed compute node (DCN) deployment to verify that required technologies are available and supported. This planning helps to ensure a successful deployment and to avoid compatibility issues.
2.1. Storage considerations for DCN architecture
Understand storage-specific requirements and supported features for distributed compute node (DCN) deployments. These considerations help you plan storage configurations that work within DCN architectural constraints.
The following features are not currently supported for DCN architectures:
- Copying a volume between edge sites. You can work around this by creating an image from the volume and using the Image service (glance) to copy the image. After the image is copied, you can create a volume from it, as shown in the sketch after this list.
- Ceph Rados Gateway (RGW) at the edge sites.
- CephFS at the edge sites.
- Instance high availability (HA) at the edge sites.
- RBD mirroring between edge sites.
- Instance migration, live or cold, either between edge sites, or from the central location to edge sites. You can only migrate instances within a site boundary. To move an image between sites, you must snapshot the image, and use glance image-import.
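The following is a minimal sketch of the volume workaround described in the first item of this list, assuming a source volume at one edge site and an edge store and availability zone named az2 at the destination; the names, size, and IDs are illustrative:

# Create an image from the source volume in the Image service.
$ openstack image create --volume <source_volume> volume-copy-image

# Copy the image to the destination edge store by using the Image service import workflow.
$ glance image-import <image_id> --import-method copy-image --stores az2

# Create a new volume from the copied image in the destination availability zone.
$ openstack volume create --image volume-copy-image --size 10 --availability-zone az2 <new_volume>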
Additionally, you must consider the following:
- You must upload images to the central location before copying them to edge sites. A copy of each image must exist in the Image service at the central location.
- You must use the RBD storage driver for the Image, Compute and Block Storage services.
- For each site, including the central location, assign a unique availability zone.
- You can migrate an offline volume from an edge site to the central location, or vice versa. You cannot migrate volumes directly between edge sites.
2.2. Networking considerations for DCN architecture
Understand architectural limitations and requirements for DCN architecture deployments. These considerations help you successfully deploy distributed edge locations and maintain performance.
The following features are not currently supported for DCN architectures:
- DHCP on DPDK nodes
- TC Flower Hardware Offload
The following ML2/OVN networking technologies are fully supported:
- Routed provider networks
- OVN GW (Networker node) with Neutron AZs supported
Additionally, you must consider the following:
- Network latency: Balance the latency as measured in round-trip time (RTT), with the expected number of concurrent API operations to maintain acceptable performance. Maximum TCP/IP throughput is inversely proportional to RTT. You can mitigate some issues with high-latency, high-bandwidth connections by tuning kernel TCP parameters, as shown in the example settings after this list. Contact Red Hat Support if cross-site communication latency exceeds 100 ms.
- Network drop outs: If the edge site temporarily loses connection to the central site, then no control plane API or CLI operations can be executed at the impacted edge site for the duration of the outage. For example, Compute nodes at the edge site are consequently unable to create a snapshot of an instance, issue an auth token, or delete an image. General control plane API and CLI operations remain functional during this outage, and can continue to serve any other edge sites that have a working connection.
- Image type: You must use raw images when deploying a DCN architecture with Ceph storage.
Image sizing:
- Compute images: Compute images are downloaded from the central location. These images are potentially large files that are transferred across all necessary networks from the central site to the edge site during provisioning.
- Instance images: If there is no block storage at the edge, then the Image service images traverse the WAN during first use. The images are copied or cached locally to the target edge nodes for all subsequent use. There is no size limit for images. Transfer times vary with available bandwidth and network latency.
- Provider networks are the most common approach for DCN deployments. Note that the Networking service (neutron) does not validate where you can attach available networks. For example, if you use a provider network called "site-a" only in edge site A, the Networking service does not validate and prevent you from attaching "site-a" to an instance at site B, which does not work.
- Site-specific networks: A limitation in DCN networking arises if you use networks that are specific to a certain site: When you deploy centralized neutron controllers with Compute nodes, there are no triggers in the Networking service to identify a certain Compute node as a remote node. Consequently, the Compute nodes receive a list of other Compute nodes and automatically form tunnels between each other. The tunnels are formed from edge to edge through the central site. If you use VXLAN or GENEVE, every Compute node at every site forms a tunnel with every other Compute node, whether they are local or remote. This is not an issue if you are using the same networks everywhere. When you use VLANs, the Networking service expects that all Compute nodes have the same bridge mappings, and that all VLANs are available at every site.
- If edge servers are not pre-provisioned, you must configure DHCP relay for introspection and provisioning on routed segments.
- Routing must be configured either on the cloud or within the networking infrastructure that connects each edge site to the hub. You should implement a networking design that allocates an L3 subnet for each RHOSO cluster network (external, internal API, and so on), unique to each site. If you are using BGP, you must configure BGP on the routers in these locations to learn the routes advertised by the RHOSO control plane and data plane nodes.
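The following sysctl settings illustrate the kind of kernel TCP tuning mentioned in the network latency item of this list for high-latency, high-bandwidth WAN links. The values are illustrative starting points only, not Red Hat recommendations; validate them for your environment:

# /etc/sysctl.d/99-wan-tuning.conf (example values only)
# Larger socket buffers let the TCP window grow on high-RTT links.
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
# Minimum / default / maximum receive and send buffer sizes for TCP.
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216
# Window scaling must stay enabled for large windows to be negotiated (default on RHEL).
net.ipv4.tcp_window_scaling = 1

Apply the settings with sysctl --system and measure the effect before rolling them out to all sites.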
2.3. IP address pool sizing for the internalapi network
Size your internal API network address pool based on the number of distributed compute node (DCN) sites in your deployment. Each site requires dedicated IP addresses for Image service (glance) endpoints and load balancers.
The Image service operator creates an endpoint for each Image service pod with its own DNS name, such as glance-az0-internal.openstack.svc:9292. Each Compute service and Block storage service in each availability zone uses the Image service API server in the same availability zone. For example, when you update the cinderVolumes field in the OpenStackControlPlane custom resource (CR), add a field called glance_api_servers under customServiceConfig:
cinderVolumes:
  az0:
    customServiceConfig: |
      [DEFAULT]
      enabled_backends = az0
      glance_api_servers = https://glance-az0-internal.openstack.svc:9292
The Image service endpoint DNS name maps to a load balancer IP address in the internalapi address pool as indicated by the internal metadata annotations:
[glance_store]
default_backend = ceph
[ceph]
rbd_store_ceph_conf = /etc/ceph/ceph.conf
store_description = "ceph RBD backend"
rbd_store_pool = images
rbd_store_user = openstack
rbd_thin_provisioning = True
networkAttachments:
- storage
override:
  service:
    internal:
      metadata:
        annotations:
          metallb.universe.tf/address-pool: internalapi
          metallb.universe.tf/allow-shared-ip: internalapi
          metallb.universe.tf/loadBalancerIPs: 172.17.0.80
The range of addresses in this address pool should be sized according to the number of DCN sites; a sketch of a larger pool follows the output below. For example, the following shows only 11 available addresses in the internalapi network.
$ oc get ipaddresspool -n metallb-system
NAME AUTO ASSIGN AVOID BUGGY IPS ADDRESSES
ctlplane true false ["192.168.122.80-192.168.122.90"]
internalapi true false ["172.17.0.80-172.17.0.90"]
storage true false ["172.18.0.80-172.18.0.90"]
tenant true false ["172.19.0.80-172.19.0.90"]
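As a sketch of how a larger pool might look, the following IPAddressPool CR widens the internalapi range so that one internal Image service load-balancer IP is available per planned DCN site. The range shown is an illustrative value for this example environment, not a required one:

apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: internalapi
  namespace: metallb-system
spec:
  autoAssign: true
  addresses:
  # Size the range for the number of sites you plan to deploy,
  # plus the other services that share the internalapi pool.
  - 172.17.0.80-172.17.0.130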
Use commands like the following after updating the glance section of the OpenStackControlPlane CR in order to confirm that the Glance Operator has created the service endpoint and route.
$ oc get svc | grep glance
glance-az0-internal LoadBalancer 172.30.217.178 172.17.0.80 9292:32134/TCP 24h
glance-az0-public ClusterIP 172.30.78.47 <none> 9292/TCP 24h
glance-az1-internal LoadBalancer 172.30.52.123 172.17.0.81 9292:31679/TCP 23h
glance-c1ca8-az0-external-api ClusterIP None <none> 9292/TCP 24h
glance-c1ca8-az0-internal-api ClusterIP None <none> 9292/TCP 24h
glance-c1ca8-az1-edge-api ClusterIP None <none> 9292/TCP 23h
$ oc get route | grep glance
glance-az0-public glance-az0-public-openstack.apps.ocp.openstack.lab glance-az0-public glance-az0-public reencrypt/Redirect None
glance-default-public glance-default-public-openstack.apps.ocp.openstack.lab glance-default-public glance-default-public reencrypt/Redirect None
Chapter 3. Installing and preparing the OpenStack Operator
You install the Red Hat OpenStack Services on OpenShift (RHOSO) OpenStack Operator (openstack-operator) and create the RHOSO control plane on an operational Red Hat OpenShift Container Platform (RHOCP) cluster. You install the OpenStack Operator by using the RHOCP OperatorHub. You perform the control plane installation tasks and all data plane creation tasks on a workstation that has access to the RHOCP cluster.
For information about mapping RHOSO versions to OpenStack Operators and OpenStackVersion Custom Resources (CRs), see the Red Hat Knowledgebase article How RHOSO versions map to OpenStack Operators and OpenStackVersion CRs.
3.1. Prerequisites
- An operational RHOCP cluster, version 4.18. For the RHOCP system requirements, see Red Hat OpenShift Container Platform cluster requirements in Planning your deployment.
- For the minimum RHOCP hardware requirements for hosting your RHOSO control plane, see Minimum RHOCP hardware requirements.
- For the minimum RHOCP network requirements, see RHOCP network requirements.
- For a list of the Operators that must be installed before you install the openstack-operator, see RHOCP software requirements.
- The oc command line tool is installed on your workstation.
- You are logged in to the RHOCP cluster as a user with cluster-admin privileges.
3.2. Installing the OpenStack Operator by using the web console
You can use the Red Hat OpenShift Container Platform (RHOCP) web console to install the OpenStack Operator (openstack-operator) on your RHOCP cluster from the OperatorHub. After you install the Operator, you configure a single instance of the OpenStack Operator initialization resource, OpenStack, to start the OpenStack Operator on your cluster.
Procedure
- Log in to the RHOCP web console as a user with cluster-admin permissions.
- Select Operators → OperatorHub.
- In the Filter by keyword field, type OpenStack.
- Click the OpenStack Operator tile with the Red Hat source label.
- Read the information about the Operator and click Install.
- On the Install Operator page, select "Operator recommended Namespace: openstack-operators" from the Installed Namespace list.
- On the Install Operator page, select "Manual" from the Update approval list. For information about how to manually approve a pending Operator update, see Manually approving a pending Operator update in the RHOCP Operators guide.
- Click Install to make the Operator available to the openstack-operators namespace. The OpenStack Operator is installed when the Status is Succeeded.
- Click Create OpenStack to open the Create OpenStack page.
- On the Create OpenStack page, click Create to create an instance of the OpenStack Operator initialization resource. The OpenStack Operator is ready to use when the Status of the openstack instance is Conditions: Ready.
3.3. Installing the OpenStack Operator by using the CLI
You can use the Red Hat OpenShift Container Platform (RHOCP) CLI (oc) to install the OpenStack Operator (openstack-operator) on your RHOCP cluster from the OperatorHub.
To install the OpenStack Operator by using the CLI, you create the openstack-operators namespace for the Red Hat OpenStack Services on OpenShift (RHOSO) service Operators. You then create the OperatorGroup and Subscription custom resources (CRs) within the namespace. After you install the Operator, you configure a single instance of the OpenStack Operator initialization resource, OpenStack, to start the OpenStack Operator on your cluster.
Procedure
- Create the openstack-operators namespace for the RHOSO service Operators:

$ cat << EOF | oc apply -f -
apiVersion: v1
kind: Namespace
metadata:
  name: openstack-operators
spec:
  finalizers:
  - kubernetes
EOF

- Create the OperatorGroup CR in the openstack-operators namespace:

$ cat << EOF | oc apply -f -
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  name: openstack
  namespace: openstack-operators
EOF

- Create the Subscription CR that subscribes to openstack-operator:

$ cat << EOF | oc apply -f -
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: openstack-operator
  namespace: openstack-operators
spec:
  name: openstack-operator
  channel: stable-v1.0
  source: redhat-operators
  sourceNamespace: openshift-marketplace
  installPlanApproval: Manual
EOF

- Wait for the install plan to be created:

$ oc get installplan -n openstack-operators -o json | jq -r '.items[] | select(.spec.approval=="Manual" and .spec.approved==false) | .metadata.name' | head -n1

- Approve the install plan:

$ oc patch installplan <install_plan_name> -n openstack-operators --type merge -p '{"spec":{"approved":true}}'

- Verify that the OpenStack Operator is installed:

$ oc wait csv -n openstack-operators \
  -l operators.coreos.com/openstack-operator.openstack-operators="" \
  --for jsonpath='{.status.phase}'=Succeeded

- Create an instance of the openstack-operator:

$ cat << EOF | oc apply -f -
apiVersion: operator.openstack.org/v1beta1
kind: OpenStack
metadata:
  name: openstack
  namespace: openstack-operators
EOF

- Confirm that the OpenStack Operator is deployed:

$ oc wait openstack/openstack -n openstack-operators --for condition=Ready --timeout=500s
Chapter 4. Deploying the DCN control plane
Deploy the control plane on a Red Hat OpenShift Container Platform (RHOCP) cluster and configure the required networks. DCN deployments require careful network configuration to manage multiple subnets across central and edge locations.
Configure RHOCP networks before installing the control plane. The subnets that you use are specific to your environment. This document uses the following configuration in each of its examples.
| Network | Central location (AZ-0) | AZ-1 | AZ-2 |
|---|---|---|---|
| Control plane | 192.168.122.0/24 | 192.168.133.0/24 | 192.168.144.0/24 |
| External | 10.0.0.0/24 | 10.0.10.0/24 | 10.0.20.0/24 |
| Internal | 172.17.0.0/24 | 172.17.10.0/24 | 172.17.20.0/24 |
| Storage | 172.18.0.0/24 | 172.18.10.0/24 | 172.18.20.0/24 |
| Tenant | 172.19.0.0/24 | 172.19.10.0/24 | 172.19.20.0/24 |
| Storage Management | 172.20.0.0/24 | 172.20.10.0/24 | 172.20.20.0/24 |
4.1. Spine-leaf network topology for DCN
Configure the routed spine-leaf network topology to interconnect geographically distributed nodes in your distributed compute node (DCN) deployment. This network topology is required for edge deployments.
You must configure the following CRs:
- NodeNetworkConfigurationPolicy: Use the NodeNetworkConfigurationPolicy CR to configure the interfaces for each isolated network on each worker node in the RHOCP cluster.
- NetworkAttachmentDefinition: Use the NetworkAttachmentDefinition CR to attach service pods to the isolated networks, where needed.
- L2Advertisement: Use the L2Advertisement resource to define how the Virtual IPs (VIPs) are announced.
- IPAddressPool: Use the IPAddressPool resource to configure which IPs can be used as VIPs.
- NetConfig: Use the NetConfig CR to specify the subnets for the data plane networks.
- OpenStackControlPlane: Use the OpenStackControlPlane CR to define and configure OpenStack services on OpenShift.
4.2. Preparing DCN networking
Configure networking for your distributed compute node (DCN) deployment by setting up network interfaces, routes, and IP address pools. Proper network configuration ensures reliable communication between the central control plane and distributed edge locations.
Prerequisites
- The OpenStack Operator is installed.
Procedure
-
Create a
NodeNetworkConfigurationPolicy(nncp) CR definition file on your workstation for each worker node in the RHOCP cluster that hosts OpenStack services. In each
nncpCR file, configure the interfaces for each isolated network. Each service interface must have its own unique address:apiVersion: nmstate.io/v1 kind: NodeNetworkConfigurationPolicy metadata: labels: osp/nncm-config-type: standard name: worker-0 namespace: openstack spec: desiredState: dns-resolver: config: search: [] server: - 192.168.122.1 interfaces: - description: internalapi vlan interface ipv4: address: - ip: 172.17.0.10 prefix-length: "24" dhcp: false enabled: true ipv6: enabled: false mtu: 1496 name: internalapi state: up type: vlan vlan: base-iface: enp7s0 id: "20" - description: storage vlan interface ipv4: address: - ip: 172.18.0.10 prefix-length: "24" dhcp: false enabled: true ipv6: enabled: false mtu: 1496 name: storage state: up type: vlan vlan: base-iface: enp7s0 id: "21" - description: tenant vlan interface ipv4: address: - ip: 172.19.0.10 prefix-length: "24" dhcp: false enabled: true ipv6: enabled: false mtu: 1496 name: tenant state: up type: vlan vlan: base-iface: enp7s0 id: "22" - description: ctlplane interface mtu: 1500 name: enp7s0 state: up type: ethernet - bridge: options: stp: enabled: false port: - name: enp7s0 vlan: {} description: linux-bridge over ctlplane interface ipv4: address: - ip: 192.168.122.10 prefix-length: "24" dhcp: false enabled: true ipv6: enabled: false mtu: 1500 name: ospbr state: up type: linux-bridgeAdd the
route-rulesattribute and the route configuration to networks in each remote location to eachnncpCR file:route-rules: config: [] routes: config: - destination: 192.168.133.0/24 next-hop-address: 192.168.122.1 next-hop-interface: ospbr table-id: 254 - destination: 192.168.144.0/24 next-hop-address: 192.168.122.1 next-hop-interface: ospbr table-id: 254 - destination: 172.17.10.0/24 next-hop-address: 172.17.0.1 next-hop-interface: internalapi table-id: 254 - destination: 172.18.10.0/24 next-hop-address: 172.18.0.1 next-hop-interface: storage table-id: 254 - destination: 172.19.10.0/24 next-hop-address: 172.19.0.1 next-hop-interface: tenant table-id: 254 - destination: 172.17.20.0/24 next-hop-address: 172.17.0.1 next-hop-interface: internalapi table-id: 254 - destination: 172.18.20.0/24 next-hop-address: 172.18.0.1 next-hop-interface: storage table-id: 254 - destination: 172.19.20.0/24 next-hop-address: 172.19.0.1 next-hop-interface: tenant table-id: 254 nodeSelector: kubernetes.io/hostname: worker-0 node-role.kubernetes.io/worker: ""NoteEach service network routes to the same network at each remote location. For example, the
internalapi network (172.17.0.0/24) has a route to the internalapi network at each remote location (172.17.10.0/24 and 172.17.20.0/24) through a local router at 172.17.0.1.

Create the nncp CRs in the cluster:

$ oc create -f worker0-nncp.yaml
$ oc create -f worker1-nncp.yaml
$ oc create -f worker2-nncp.yaml

Create a NetworkAttachmentDefinition CR definition file for each network. Include routes to each remote location to the networks of the same function. For example, the internalapi NetworkAttachmentDefinition specifies its own subnet range as well as routes to the internalapi networks at remote sites.

Create a
NetworkAttachmentDefinitionCR definition file for theinternalapinetwork:apiVersion: k8s.cni.cncf.io/v1 kind: NetworkAttachmentDefinition metadata: labels: osp/net: internalapi osp/net-attach-def-type: standard name: internalapi namespace: openstack spec: config: | { "cniVersion": "0.3.1", "name": "internalapi", "type": "macvlan", "master": "internalapi", "ipam": { "type": "whereabouts", "range": "172.17.0.0/24", "range_start": "172.17.0.30", "range_end": "172.17.0.70", "routes": [ { "dst": "172.17.10.0/24", "gw": "172.17.0.1" }, { "dst": "172.17.20.0/24", "gw": "172.17.0.1" } ] } }Create a
NetworkAttachmentDefinitionCR definition file for thecontrolnetwork:apiVersion: k8s.cni.cncf.io/v1 kind: NetworkAttachmentDefinition metadata: labels: osp/net: ctlplane osp/net-attach-def-type: standard name: ctlplane namespace: openstack spec: config: | { "cniVersion": "0.3.1", "name": "ctlplane", "type": "macvlan", "master": "ospbr", "ipam": { "type": "whereabouts", "range": "192.168.122.0/24", "range_start": "192.168.122.30", "range_end": "192.168.122.70", "routes": [ { "dst": "192.168.133.0/24", "gw": "192.168.122.1" }, { "dst": "192.168.144.0/24", "gw": "192.168.122.1" } ] } }Create a
NetworkAttachmentDefinitionCR definition file for thestoragenetwork:apiVersion: k8s.cni.cncf.io/v1 kind: NetworkAttachmentDefinition metadata: labels: osp/net: storage osp/net-attach-def-type: standard name: storage namespace: openstack spec: config: | { "cniVersion": "0.3.1", "name": "storage", "type": "macvlan", "master": "storage", "ipam": { "type": "whereabouts", "range": "172.18.0.0/24", "range_start": "172.18.0.30", "range_end": "172.18.0.70", "routes": [ { "dst": "172.18.10.0/24", "gw": "172.18.0.1" }, { "dst": "172.18.20.0/24", "gw": "172.18.0.1" } ] } }Create a
NetworkAttachmentDefinitionCR definition file for thetenantnetwork:apiVersion: k8s.cni.cncf.io/v1 kind: NetworkAttachmentDefinition metadata: labels: osp/net: tenant osp/net-attach-def-type: standard name: tenant namespace: openstack spec: config: | { "cniVersion": "0.3.1", "name": "tenant", "type": "macvlan", "master": "tenant", "ipam": { "type": "whereabouts", "range": "172.19.0.0/24", "range_start": "172.19.0.30", "range_end": "172.19.0.70", "routes": [ { "dst": "172.19.10.0/24", "gw": "172.19.0.1" }, { "dst": "172.19.20.0/24", "gw": "172.19.0.1" } ] } }
Create the NetworkAttachmentDefinition CRs:

$ oc create -f internalapi-net-attach-def.yaml
$ oc create -f control-net-attach-def.yaml
$ oc create -f storage-net-attach-def.yaml
$ oc create -f tenant-net-attach-def.yaml

Create a NetConfig CR definition file to define the subnets for the data plane networks. Each network is defined under the dnsDomain field, with allocationRanges for each geographic region. These ranges cannot overlap with the whereabouts IPAM range.

Create the file with the added allocation ranges for the control plane networking similar to the following:
apiVersion: network.openstack.org/v1beta1 kind: NetConfig metadata: name: netconfig namespace: openstack spec: networks: - dnsDomain: ctlplane.example.com mtu: 1500 name: ctlplane subnets: - allocationRanges: - end: 192.168.122.120 start: 192.168.122.100 - end: 192.168.122.170 start: 192.168.122.150 cidr: 192.168.122.0/24 gateway: 192.168.122.1 name: subnet1 routes: - destination: 192.168.133.0/24 nexthop: 192.168.122.1 - destination: 192.168.144.0/24 nexthop: 192.168.122.1 - allocationRanges: - end: 192.168.133.120 start: 192.168.133.100 - end: 192.168.133.170 start: 192.168.133.150 cidr: 192.168.133.0/24 gateway: 192.168.133.1 name: subnet2 routes: - destination: 192.168.122.0/24 nexthop: 192.168.133.1 - destination: 192.168.144.0/24 nexthop: 192.168.133.1 - allocationRanges: - end: 192.168.144.120 start: 192.168.144.100 - end: 192.168.144.170 start: 192.168.144.150 cidr: 192.168.144.0/24 gateway: 192.168.144.1 name: subnet3 routes: - destination: 192.168.122.0/24 nexthop: 192.168.144.1 - destination: 192.168.133.0/24 nexthop: 192.168.144.1Add an allocation range for the
internalapinetwork:- dnsDomain: internalapi.example.com mtu: 1496 name: internalapi subnets: - allocationRanges: - end: 172.17.0.250 start: 172.17.0.100 cidr: 172.17.0.0/24 name: subnet1 routes: - destination: 172.17.10.0/24 nexthop: 172.17.0.1 - destination: 172.17.20.0/24 nexthop: 172.17.0.1 vlan: 20 - allocationRanges: - end: 172.17.10.250 start: 172.17.10.100 cidr: 172.17.0.0/24 name: subnet2 routes: - destination: 172.17.0.0/24 nexthop: 172.17.10.1 - destination: 172.17.20.0/24 nexthop: 172.17.10.1 vlan: 30 - allocationRanges: - end: 172.17.20.250 start: 172.17.20.100 cidr: 172.17.20.0/24 name: subnet3 routes: - destination: 172.17.0.0/24 nexthop: 172.17.20.1 - destination: 172.17.10.0/24 nexthop: 172.17.20.1 vlan: 40Add an allocation range for the
externalnetwork:- dnsDomain: external.example.com mtu: 1500 name: external subnets: - allocationRanges: - end: 10.0.0.250 start: 10.0.0.100 cidr: 10.0.0.0/24 name: subnet1 vlan: 22 - dnsDomain: external.example.com mtu: 1500 name: external subnets: - allocationRanges: - end: 10.0.10.250 start: 10.0.10.100 cidr: 10.0.10.0/24 name: subnet2 vlan: 22 - dnsDomain: external.example.com mtu: 1500 name: external subnets: - allocationRanges: - end: 10.0.20.250 start: 10.0.20.100 cidr: 10.0.20.0/24 name: subnet3 vlan: 22 - dnsDomain: storage.example.com mtu: 1496 name: storage subnets: - allocationRanges: - end: 172.18.0.250 start: 172.18.0.100 cidr: 172.18.0.0/24 name: subnet1 routes: - destination: 172.18.10.0/24 nexthop: 172.18.0.1 - destination: 172.18.20.0/24 nexthop: 172.18.0.1 vlan: 21 - allocationRanges: - end: 172.18.10.250 start: 172.18.10.100 cidr: 172.18.10.0/24 name: subnet2 routes: - destination: 172.18.0.0/24 nexthop: 172.18.10.1 - destination: 172.18.20.0/24 nexthop: 172.18.10.1 vlan: 31 - allocationRanges: - end: 172.18.20.250 start: 172.18.20.100 cidr: 172.18.20.0/24 name: subnet3 routes: - destination: 172.18.0.0/24 nexthop: 172.18.20.1 - destination: 172.18.10.0/24 nexthop: 172.18.20.1 vlan: 41Add an allocation range for the
tenantnetwork:- dnsDomain: tenant.example.com mtu: 1496 name: tenant subnets: - allocationRanges: - end: 172.19.0.250 start: 172.19.0.100 cidr: 172.19.0.0/24 name: subnet1 routes: - destination: 172.19.10.0/24 nexthop: 172.19.0.1 - destination: 172.19.20.0/24 nexthop: 172.19.0.1 vlan: 22 - allocationRanges: - end: 172.19.10.250 start: 172.19.10.100 cidr: 172.19.10.0/24 name: subnet2 routes: - destination: 172.19.0.0/24 nexthop: 172.19.10.1 - destination: 172.19.20.0/24 nexthop: 172.19.10.1 vlan: 32 - allocationRanges: - end: 172.19.20.250 start: 172.19.20.100 cidr: 172.19.20.0/24 name: subnet3 routes: - destination: 172.19.0.0/24 nexthop: 172.19.20.1 - destination: 172.19.10.0/24 nexthop: 172.19.20.1 vlan: 42Add an allocation range for the
storagemgmtnetwork:- dnsDomain: storagemgmt.example.com mtu: 1500 name: storagemgmt subnets: - allocationRanges: - end: 172.20.0.250 start: 172.20.0.100 cidr: 172.20.0.0/24 name: subnet1 routes: - destination: 172.20.10.0/24 nexthop: 172.20.0.1 - destination: 172.20.20.0/24 nexthop: 172.20.0.1 vlan: 23 - allocationRanges: - end: 172.20.10.250 start: 172.20.10.100 cidr: 172.20.10.0/24 name: subnet2 routes: - destination: 172.20.0.0/24 nexthop: 172.20.10.1 - destination: 172.20.20.0/24 nexthop: 172.20.10.1 vlan: 33 - allocationRanges: - end: 172.20.20.250 start: 172.20.20.100 cidr: 172.20.20.0/24 name: subnet3 routes: - destination: 172.20.0.0/24 nexthop: 172.20.20.1 - destination: 172.20.10.0/24 nexthop: 172.20.20.1 vlan: 43
Create the NetConfig CR:
oc create -f netconfig
4.3. Creating the DCN control plane
Create the control plane that manages your distributed cloud infrastructure. The control plane centrally orchestrates workloads across central and edge node sets.
Prerequisites
- The OpenStack Operator (openstack-operator) is installed.
- The RHOCP cluster is prepared for RHOSO networks.
- The RHOCP cluster is not configured with any network policies that prevent communication between the openstack-operators namespace and the control plane namespace (default openstack). Use the following command to check the existing network policies on the cluster:

$ oc get networkpolicy -n openstack

- You are logged on to a workstation that has access to the RHOCP cluster, as a user with cluster-admin privileges.
Procedure
Create a file on your workstation named openstack_control_plane.yaml to define the OpenStackControlPlane CR:

apiVersion: core.openstack.org/v1beta1
kind: OpenStackControlPlane
metadata:
  name: openstack-control-plane
  namespace: openstack

Use the spec field to specify the Secret CR you create to provide secure access to your pod, and the storageClass you create for your Red Hat OpenShift Container Platform (RHOCP) cluster storage back end:

apiVersion: core.openstack.org/v1beta1
kind: OpenStackControlPlane
metadata:
  name: openstack-control-plane
  namespace: openstack
spec:
  secret: osp-secret
  storageClass: <RHOCP_storage_class>

- Replace <RHOCP_storage_class> with the storage class you created for your RHOCP cluster storage back end.
Add service configurations. Include service configurations for all required services:
Block Storage service (cinder):
cinder: uniquePodNames: false apiOverride: route: {} template: customServiceConfig: | [DEFAULT] storage_availability_zone = az0 databaseInstance: openstack secret: osp-secret cinderAPI: replicas: 3 override: service: internal: metadata: annotations: metallb.universe.tf/address-pool: internalapi metallb.universe.tf/allow-shared-ip: internalapi metallb.universe.tf/loadBalancerIPs: 172.17.0.80 spec: type: LoadBalancer cinderScheduler: replicas: 1 cinderVolumes: az0: networkAttachments: - storage replicas: 0NoteIn RHOSO 18.0.3, You must set the
uniquePodNamesfield to a value offalseto allow for the propagation of Secrets. For more information see OSPRH-11240.Note-
Set the
replicasfield to a value of0. The replica count is changed and additionalcinderVolumeservices are added after storage is configured. -
Set the
storage_availability_zonefield in the template section toaz0. All Block storage service (cinder) pods inherit this value, such ascinderBackup,cinderVolume, and so on. You can override this AZ for thecinderVolumeservice by specifying thebackend_availability_zone.
-
Set the
Compute service (nova):
nova: apiOverride: route: {} template: apiServiceTemplate: replicas: 3 override: service: internal: metadata: annotations: metallb.universe.tf/address-pool: internalapi metallb.universe.tf/allow-shared-ip: internalapi metallb.universe.tf/loadBalancerIPs: 172.17.0.80 spec: type: LoadBalancer metadataServiceTemplate: replicas: 3 override: service: metadata: annotations: metallb.universe.tf/address-pool: internalapi metallb.universe.tf/allow-shared-ip: internalapi metallb.universe.tf/loadBalancerIPs: 172.17.0.80 spec: type: LoadBalancer schedulerServiceTemplate: replicas: 3 override: service: metadata: annotations: metallb.universe.tf/address-pool: internalapi metallb.universe.tf/allow-shared-ip: internalapi metallb.universe.tf/loadBalancerIPs: 172.17.0.80 spec: type: LoadBalancer cellTemplates: cell0: cellDatabaseAccount: nova-cell0 cellDatabaseInstance: openstack cellMessageBusInstance: rabbitmq hasAPIAccess: true cell1: cellDatabaseAccount: nova-cell1 cellDatabaseInstance: openstack-cell1 cellMessageBusInstance: rabbitmq-cell1 noVNCProxyServiceTemplate: enabled: true networkAttachments: - ctlplane hasAPIAccess: true secret: osp-secretDNS service for the data plane:
dns: template: options: - key: server values: - <IP address for DNS server reachable from dnsmasq pod> override: service: metadata: annotations: metallb.universe.tf/address-pool: ctlplane metallb.universe.tf/allow-shared-ip: ctlplane metallb.universe.tf/loadBalancerIPs: 192.168.122.80 spec: type: LoadBalancer replicas: 2-
options: Defines thednsmasqinstances required for each DNS server by using key-value pairs. In this example, there is one key-value pair defined because there is only one DNS server configured to forward requests to. key: Specifies thednsmasqparameter to customize for the deployeddnsmasqinstance. Set to one of the following valid values:-
server -
rev-server -
srv-host -
txt-record -
ptr-record -
rebind-domain-ok -
naptr-record -
cname -
host-record -
caa-record -
dns-rr -
auth-zone -
synth-domain -
no-negcache -
local
-
values: Specifies the value for the DNS server reachable from thednsmasqpod on the RHOCP cluster network. You can specify a generic DNS server as the value, for example,1.1.1.1, or a DNS server for a specific domain, for example,/google.com/8.8.8.8.NoteThis DNS service,
dnsmasq, provides DNS services for nodes on the RHOSO data plane.dnsmasqis different from the RHOSO DNS service (designate) that provides DNS as a service for cloud tenants.
-
Galera
galera: templates: openstack: storageRequest: 5000M secret: osp-secret replicas: 3 openstack-cell1: storageRequest: 5000M secret: osp-secret replicas: 3Identity service (keystone)
keystone: apiOverride: route: {} template: override: service: internal: metadata: annotations: metallb.universe.tf/address-pool: internalapi metallb.universe.tf/allow-shared-ip: internalapi metallb.universe.tf/loadBalancerIPs: 172.17.0.80 spec: type: LoadBalancer databaseInstance: openstack secret: osp-secret replicas: 3Image service (glance):
glance: apiOverrides: default: route: {} template: databaseInstance: openstack storage: storageRequest: 10G secret: osp-secret keystoneEndpoint: default glanceAPIs: default: replicas: 0 override: service: internal: metadata: annotations: metallb.universe.tf/address-pool: internalapi metallb.universe.tf/allow-shared-ip: internalapi metallb.universe.tf/loadBalancerIPs: 172.17.0.80 spec: type: LoadBalancer networkAttachments: - storageNoteYou must initially set the
replicasfield to a value of0. The replica count is changed and additionalglanceAPIservices are added after storage is configured.Key Management service (barbican):
barbican: apiOverride: route: {} template: databaseInstance: openstack secret: osp-secret barbicanAPI: replicas: 3 override: service: internal: metadata: annotations: metallb.universe.tf/address-pool: internalapi metallb.universe.tf/allow-shared-ip: internalapi metallb.universe.tf/loadBalancerIPs: 172.17.0.80 spec: type: LoadBalancer barbicanWorker: replicas: 3 barbicanKeystoneListener: replicas: 1Memcached
memcached: templates: memcached: replicas: 3Networking service (neutron):
neutron: apiOverride: route: {} template: customServiceConfig: | [DEFAULT] network_scheduler_driver = neutron.scheduler.dhcp_agent_scheduler.AZAwareWeightScheduler default_availability_zones = az0 [ml2_type_vlan] network_vlan_ranges = datacentre:1:1000 [neutron] physnets = datacentre replicas: 3 override: service: internal: metadata: annotations: metallb.universe.tf/address-pool: internalapi metallb.universe.tf/allow-shared-ip: internalapi metallb.universe.tf/loadBalancerIPs: 172.17.0.80 spec: type: LoadBalancer databaseInstance: openstack secret: osp-secret networkAttachments: - internalapi-
Set the
network_scheduler_driverto a value ofneutron.scheduler.dhcp_agent_scheduler.AZAwareWeightSchedulerif a DHCP agent is deployed. OVN
ovn: template: ovnController: external-ids: availability-zones: - az0 enable-chassis-as-gateway: true ovn-bridge: br-int ovn-encap-type: geneve system-id: random networkAttachment: tenant nicMappings: datacentre: ospbr ovnDBCluster: ovndbcluster-nb: replicas: 3 dbType: NB storageRequest: 10G networkAttachment: internalapi ovndbcluster-sb: replicas: 3 dbType: SB storageRequest: 10G networkAttachment: internalapi ovnNorthd: networkAttachment: internalapiPlacement service (placement)
placement: apiOverride: route: {} template: override: service: internal: metadata: annotations: metallb.universe.tf/address-pool: internalapi metallb.universe.tf/allow-shared-ip: internalapi metallb.universe.tf/loadBalancerIPs: 172.17.0.80 spec: type: LoadBalancer databaseInstance: openstack replicas: 3 secret: osp-secretRabbitMQ
rabbitmq: templates: rabbitmq: replicas: 3 override: service: metadata: annotations: metallb.universe.tf/address-pool: internalapi metallb.universe.tf/loadBalancerIPs: 172.17.0.85 spec: type: LoadBalancer rabbitmq-cell1: replicas: 3 override: service: metadata: annotations: metallb.universe.tf/address-pool: internalapi metallb.universe.tf/loadBalancerIPs: 172.17.0.86 spec: type: LoadBalancer
Create the control plane:
oc create -f openstack_control_plane.yaml -n openstack
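After you create the CR, you can watch the control plane roll out. The following checks are a minimal sketch; the resource name matches the metadata.name used in the example above:

# Show the overall status of the control plane resource.
$ oc get openstackcontrolplane -n openstack

# Optionally block until the control plane reports Ready.
$ oc wait openstackcontrolplane/openstack-control-plane -n openstack \
    --for condition=Ready --timeout=30m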
4.4. Distributing Ceph secret keys across sites
When you deploy a distributed compute node (DCN) environment with multiple Red Hat Ceph Storage backends, you create a Ceph authentication key for each backend. Limiting key distribution to only necessary credentials minimizes security exposure at each location.
- Add the key for each Ceph backend to the secret for the default location.
- Add the key for the default Ceph backend, as well as the key for the local Ceph backend, to the secret for each additional location.
For three locations, az0, az1, and az2, you must have three secrets. Locations az1 and az2 each have keys for the local backend as well as the keys for az0. Location az0 contains all Ceph back end keys.
You create the required secrets after Ceph has been deployed at each edge location, and the keyring and configuration file for each has been collected. Alternatively, you can deploy each Ceph backend as needed, and update secrets with each edge deployment.
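As an illustration of where the keyring and configuration files used in the following procedure can come from, you can export them from each Red Hat Ceph Storage cluster with commands similar to the following. This sketch assumes a client.openstack user already exists on the az1 cluster and simply follows the az1.* file naming convention used below:

# On a node with admin access to the az1 Ceph cluster:
# export the keyring for the client.openstack user.
$ ceph auth get client.openstack -o az1.client.openstack.keyring

# Generate a minimal configuration file with the cluster FSID and monitor addresses.
$ ceph config generate-minimal-conf > az1.conf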
Procedure
- Create a secret for location az0.

  If you have already deployed Red Hat Ceph Storage (RHCS) at all edge sites which require storage, create a secret for az0 which contains all keyrings and conf files:

oc create secret generic ceph-conf-az-0 \
  --from-file=az0.client.openstack.keyring \
  --from-file=az0.conf \
  --from-file=az1.client.openstack.keyring \
  --from-file=az1.conf \
  --from-file=az2.client.openstack.keyring \
  --from-file=az2.conf -n openstack

  If you have not deployed RHCS at all edge sites, create a secret for az0 which contains the keyring and conf file for az0:

oc create secret generic ceph-conf-az-0 \
  --from-file=az0.client.openstack.keyring \
  --from-file=az0.conf -n openstack

- When you deploy RHCS at the edge location at availability zone 1 (az1), create a secret for location az1 which contains keyrings and conf files for the local backend, and the default backend:

oc create secret generic ceph-conf-az-1 \
  --from-file=az0.client.openstack.keyring \
  --from-file=az0.conf \
  --from-file=az1.client.openstack.keyring \
  --from-file=az1.conf -n openstack

  If needed, update the secret for the central location:

oc delete secret ceph-conf-az-0 -n openstack
oc create secret generic ceph-conf-az-0 \
  --from-file=az0.client.openstack.keyring \
  --from-file=az0.conf \
  --from-file=az1.client.openstack.keyring \
  --from-file=az1.conf -n openstack

- When you deploy RHCS at the edge location at availability zone 2 (az2), create a secret for location az2 which contains keyrings and conf files for the local backend, and the default backend:

oc create secret generic ceph-conf-az-2 \
  --from-file=az0.client.openstack.keyring \
  --from-file=az0.conf \
  --from-file=az2.client.openstack.keyring \
  --from-file=az2.conf -n openstack

  If needed, update the secret for the central location:

oc delete secret ceph-conf-az-0 -n openstack
oc create secret generic ceph-conf-az-0 \
  --from-file=az0.client.openstack.keyring \
  --from-file=az0.conf \
  --from-file=az1.client.openstack.keyring \
  --from-file=az1.conf \
  --from-file=az2.client.openstack.keyring \
  --from-file=az2.conf -n openstack

- Optional: When you have finished creating the necessary keys, you can verify that they show up in the openstack namespace:

oc get secret -n openstack -o name | grep ceph-conf

  Example output:

secret/ceph-conf-az-0
secret/ceph-conf-az-1
secret/ceph-conf-az-2

- When you create an
OpenStackDataPlaneNodeSet, use the appropriate key under theextraMountsfield:apiVersion: dataplane.openstack.org/v1beta1 kind: OpenStackDataPlaneNodeSet metadata: name: openstack-edpm-dcn-0 namespace: openstack spec: ... nodeTemplate: extraMounts: - extraVolType: Ceph volumes: - name: ceph secret: secretName: ceph-conf-az-0 mounts: - name: ceph mountPath: "/etc/ceph" readOnly: trueWhen you create a data plane NodeSet, you must also update the
OpenStackControlPlanecustom resource (CR) with the secret name:apiVersion: core.openstack.org/v1beta1 kind: OpenStackControlPlane spec: extraMounts: - name: v1 region: r1 extraVol: - propagation: - az0 - CinderBackup extraVolType: Ceph volumes: - name: ceph secret: name: ceph-conf-az-0 mounts: - name: ceph mountPath: "/etc/ceph" readOnly: true - propagation: - az1 extraVolType: Ceph volumes: - name: ceph secret: name: ceph-conf-az-1 mounts: - name: ceph mountPath: "/etc/ceph" readOnly: true ...NoteIf the
CinderBackupservice is a part of the deployment, then you must include it in the propagation list because it does not have the availability zone in its pod name.When you update the
glanceAPIsfield in theOpenStackControlPlaneCR, the Image service (glance) pod name matches theextraMounts propagationinstances:glanceAPIs: az0: customServiceConfig: | ... az1: customServiceConfig: | ...When you update the
cinderVolumesfield in theOpenStackControlPlaneCR, the Block Storage service (cinder) pod names must also match theextraMounts propagationinstances:kind: OpenStackControlPlane spec: <...> cinder <...> cinderVolumes: az0: <...> az1: <...>
Chapter 5. Deploying a DCN node set
Deploy node sets at central and remote edge locations using the same procedures, and use a single control plane to manage your geographically distributed workloads.
Each edge location requires a separate availability zone to ensure proper isolation and resource scheduling. For example, deploy the central location node set at az0, deploy the first edge site at az1, and so on.
5.1. Configuring the data plane node networks
Configure data plane node networks to meet Red Hat Ceph Storage networking requirements. Proper network configuration ensures optimal storage performance and reliable communication between compute and storage services.
Prerequisites
- Control plane deployment is complete but has not yet been modified to use Ceph Storage.
- The data plane nodes have been pre-provisioned with an operating system.
- The data plane nodes are accessible through an SSH key that Ansible can use.
- If you are using HCI, then the data plane nodes have disks available to be used as Ceph OSDs.
- There are a minimum of three available data plane nodes. Ceph Storage clusters must have a minimum of three nodes to ensure redundancy.
Procedure
Create a file on your workstation named
dcn-data-plane-networks.yamlto define theOpenStackDataPlaneNodeSetCR that configures the data plane node networks:apiVersion: dataplane.openstack.org/v1beta1 kind: OpenStackDataPlaneNodeSet metadata: name: dcn-data-plane-networks namespace: openstack spec: env: - name: ANSIBLE_FORCE_COLOR value: "True"Specify the services to apply to the nodes:
spec: ... services: - bootstrap - configure-network - validate-network - install-os - ceph-hci-pre - configure-os - ssh-known-hosts - run-os - reboot-osSet the edpm_enable_chassis_gw and edpm_ovn_availability_zones fields on the data plane:
spec: env: - name: ANSIBLE_FORCE_COLOR value: "True" networkAttachments: - ctlplane nodeTemplate: ansible: ansiblePort: 22 ansibleUser: cloud-admin ansibleVars: edpm_enable_chassis_gw: true edpm_ovn_availability_zones: - az0Optional: The
ceph-hci-preservice prepares data plane nodes to host Red Hat Ceph Storage services after network configuration using theedpm_ceph_hci_pre edpm-ansiblerole. By default, theedpm_ceph_hci_pre_enabled_servicesparameter of this role only containsRBD,RGW, andNFSservices. DCN only supportsRBDservices at DCN sites. If you are deploying HCI, disable the RGW and NFS services by adding theedpm_ceph_hci_pre_enabled_servicesparameter, and adding only ceph RBD services.apiVersion: dataplane.openstack.org/v1beta1 kind: OpenStackDataPlaneNodeSet metadata: name: openstack-edpm namespace: openstack spec: env: - name: ANSIBLE_FORCE_COLOR value: "True" networkAttachments: - ctlplane nodeTemplate: ansible: ansiblePort: 22 ansibleUser: cloud-admin ansibleVars: edpm_ceph_hci_pre_enabled_services: - ceph_mon - ceph_mgr - ceph_osd ...NoteIf other services, such as the Dashboard, are deployed with HCI nodes, they must be added to the
edpm_ceph_hci_pre_enabled_servicesparameter list. For more information about this role, see edpm_ceph_hci_pre role.Configure the Red Hat Ceph Storage cluster network for storage management.
The following example has 3 nodes. It assumes the storage management is on
VLAN23:apiVersion: dataplane.openstack.org/v1beta1 kind: OpenStackDataPlaneNodeSet metadata: name: openstack-edpm namespace: openstack spec: env: - name: ANSIBLE_FORCE_COLOR value: "True" networkAttachments: - ctlplane nodeTemplate: ansible: ansiblePort: 22 ansibleUser: cloud-admin ansibleVars: edpm_ceph_hci_pre_enabled_services: - ceph_mon - ceph_mgr - ceph_osd edpm_fips_mode: check edpm_iscsid_image: {{ registry_url }}/openstack-iscsid:{{ image_tag }} edpm_logrotate_crond_image: {{ registry_url }}/openstack-cron:{{ image_tag }} edpm_network_config_hide_sensitive_logs: false edpm_network_config_os_net_config_mappings: edpm-compute-0: nic1: 52:54:00:1e:af:6b nic2: 52:54:00:d9:cb:f4 edpm-compute-1: nic1: 52:54:00:f2:bc:af nic2: 52:54:00:f1:c7:dd edpm-compute-2: nic1: 52:54:00:dd:33:14 nic2: 52:54:00:50:fb:c3 edpm_network_config_template: | --- {% set mtu_list = [ctlplane_mtu] %} {% for network in nodeset_networks %} {{ mtu_list.append(lookup(vars, networks_lower[network] ~ _mtu)) }} {%- endfor %} {% set min_viable_mtu = mtu_list | max %} network_config: - type: ovs_bridge name: {{ neutron_physical_bridge_name }} mtu: {{ min_viable_mtu }} use_dhcp: false dns_servers: {{ ctlplane_dns_nameservers }} domain: {{ dns_search_domains }} addresses: - ip_netmask: {{ ctlplane_ip }}/{{ ctlplane_cidr }} routes: {{ ctlplane_host_routes }} members: - type: interface name: nic2 mtu: {{ min_viable_mtu }} # force the MAC address of the bridge to this interface primary: true {% for network in nodeset_networks %} - type: vlan mtu: {{ lookup(vars, networks_lower[network] ~ _mtu) }} vlan_id: {{ lookup(vars, networks_lower[network] ~ _vlan_id) }} addresses: - ip_netmask: {{ lookup(vars, networks_lower[network] ~ _ip) }}/{{ lookup(vars, networks_lower[network] ~ _cidr) }} routes: {{ lookup(vars, networks_lower[network] ~ _host_routes) }} {% endfor %} edpm_neutron_metadata_agent_image: {{ registry_url }}/openstack-neutron-metadata-agent-ovn:{{ image_tag }} edpm_nodes_validation_validate_controllers_icmp: false edpm_nodes_validation_validate_gateway_icmp: false edpm_selinux_mode: enforcing edpm_sshd_allowed_ranges: - 192.168.111.0/24 - 192.168.122.0/24 - 192.168.133.0/24 - 192.168.144.0/24 edpm_sshd_configure_firewall: true enable_debug: false gather_facts: false image_tag: current-podified neutron_physical_bridge_name: br-ex neutron_public_interface_name: eth0 service_net_map: nova_api_network: internalapi nova_libvirt_network: internalapi storage_mgmt_cidr: "24" storage_mgmt_host_routes: [] storage_mgmt_mtu: 9000 storage_mgmt_vlan_id: 23 storage_mtu: 9000 timesync_ntp_servers: - hostname: pool.ntp.org ansibleSSHPrivateKeySecret: dataplane-ansible-ssh-private-key-secret managementNetwork: ctlplane networks: - defaultRoute: true name: ctlplane subnetName: subnet1 - name: internalapi subnetName: subnet1 - name: storage subnetName: subnet1 - name: tenant subnetName: subnet1 nodes: edpm-compute-0: ansible: host: 192.168.122.100 hostName: compute-0 networks: - defaultRoute: true fixedIP: 192.168.122.100 name: ctlplane subnetName: subnet1 - name: internalapi subnetName: subnet1 - name: storage subnetName: subnet1 - name: storagemgmt subnetName: subnet1 - name: tenant subnetName: subnet1 edpm-compute-1: ansible: host: 192.168.122.101 hostName: compute-1 networks: - defaultRoute: true fixedIP: 192.168.122.101 name: ctlplane subnetName: subnet1 - name: internalapi subnetName: subnet1 - name: storage subnetName: subnet1 - name: storagemgmt subnetName: subnet1 - name: tenant subnetName: subnet1 edpm-compute-2: ansible: host: 
192.168.122.102 hostName: compute-2 networks: - defaultRoute: true fixedIP: 192.168.122.102 name: ctlplane subnetName: subnet1 - name: internalapi subnetName: subnet1 - name: storage subnetName: subnet1 - name: storagemgmt subnetName: subnet1 - name: tenant subnetName: subnet1 preProvisioned: true services: - bootstrap - configure-network - validate-network - install-os - ceph-hci-pre - configure-os - ssh-known-hosts - run-os - reboot-osApply the CR:
$ oc apply -f <dataplane_cr_file>Replace
<dataplane_cr_file>with the name of your file.NoteAnsible does not configure or validate the networks until the
OpenStackDataPlaneDeploymentCRD is created.
- Create an OpenStackDataPlaneDeployment CR, as described in Creating the data plane in the Deploying Red Hat OpenStack Services on OpenShift guide, that references the OpenStackDataPlaneNodeSet CR defined above, so that Ansible configures the services on the data plane nodes. To confirm that the network is configured, complete the following steps (see the example after these steps):
- SSH into a data plane node.
- Use the ip a command to display configured networks.
- Confirm the storage networks are in the list of configured networks.
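For example, a quick check might look like the following; the node address and subnets are the example values used earlier in this chapter:

# SSH to the first data plane node as the Ansible user.
$ ssh cloud-admin@192.168.122.100

# Confirm that addresses on the storage (172.18.x.x) and storage management (172.20.x.x) networks are configured.
$ ip -brief addr show | grep -E '172\.18\.|172\.20\.'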
5.2. Configuring hyperconverged Ceph Storage
Configure and deploy hyperconverged Red Hat Ceph Storage on your data plane nodes. This configuration enables compute and storage services to run on the same hardware for optimal resource utilization at edge sites.
The following steps are specifically for a hyperconverged configuration of Red Hat Ceph Storage (RHCS), and are not required if you have deployed an external RHCS cluster.
Procedure
- Edit the Red Hat Ceph Storage configuration file.
Add the
StorageandStorage Managementnetwork ranges. Red Hat Ceph Storage uses theStoragenetwork as the Red Hat Ceph Storagepublic_networkand theStorage Managementnetwork as thecluster_network.The following example is for a configuration file entry where the
Storagenetwork range is172.18.0.0/24and theStorage Managementnetwork range is172.20.0.0/24:[global] public_network = 172.18.0.0/24 cluster_network = 172.20.0.0/24Add collocation boundaries between the Compute service and Ceph OSD services. Boundaries should be set between collocated Compute service and Ceph OSD services to reduce CPU and memory contention.
The following is an example for a Ceph configuration file entry with these boundaries set:
[osd] osd_memory_target_autotune = true osd_numa_auto_affinity = true [mgr] mgr/cephadm/autotune_memory_target_ratio = 0.2In this example, the
osd_memory_target_autotuneparameter is set totrueso that the OSD daemons adjust memory consumption based on theosd_memory_targetoption. Theautotune_memory_target_ratiodefaults to 0.7. This means 70 percent of the total RAM in the system is the starting point from which any memory consumed by non-autotuned Ceph daemons is subtracted. The remaining memory is divided between the OSDs; assuming all OSDs haveosd_memory_target_autotuneset to true. For HCI deployments, you can setmgr/cephadm/autotune_memory_target_ratioto 0.2 so that more memory is available for the Compute service.For additional information about service collocation, see Collocating services in a HCI environment for NUMA nodes.
NoteIf these values need to be adjusted after the deployment, use the
ceph config set osd <key> <value>command.Deploy Ceph Storage with the edited configuration file on a data plane node:
$ cephadm bootstrap --config <config_file> --mon-ip <data_plane_node_ip> --skip-monitoring-stack

- Replace <config_file> with the name of your Ceph configuration file.
- Replace <data_plane_node_ip> with the Storage network IP address of the data plane node on which Red Hat Ceph Storage will be installed.

Note: The --skip-monitoring-stack option is used in the cephadm bootstrap command to skip the deployment of monitoring services. This ensures that the Red Hat Ceph Storage deployment completes successfully if monitoring services have already been deployed by a preceding process. If monitoring services have not been deployed, see the Red Hat Ceph Storage documentation for information and procedures on enabling monitoring services.

After the Red Hat Ceph Storage cluster is bootstrapped on the first EDPM node, see "Red Hat Ceph Storage installation" in the Red Hat Ceph Storage Installation Guide to add the other EDPM nodes to the Ceph cluster.
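The following is a minimal sketch of that expansion, run from the bootstrap node; the host name and Storage network IP address are illustrative, so substitute the values for your own nodes:

$ ceph cephadm get-pub-key > ~/ceph.pub
$ ssh-copy-id -f -i ~/ceph.pub root@edpm-compute-1
$ ceph orch host add edpm-compute-1 172.18.0.101
$ ceph orch host ls

Repeat for each remaining EDPM node, and confirm with ceph orch host ls that every node appears in the cluster.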
5.3. Configuring the DCN data plane
Configure the data plane to integrate with your Red Hat Ceph Storage backend. This configuration enables data plane nodes to access Ceph for persistent storage operations.
Prerequisites
- Complete the procedures in Integrating Red Hat Ceph Storage.
Procedure
- Edit the OpenStackDataPlaneNodeSet CR. To make the cephx key and configuration file available to the Compute service (nova), use the extraMounts parameter. The following is an example of using the
extraMountsparameter for this purpose:apiVersion: dataplane.openstack.org/v1beta1 kind: OpenStackDataPlaneNodeSet spec: ... nodeTemplate: extraMounts: - extraVolType: Ceph volumes: - name: ceph secret: secretName: ceph-conf-files mounts: - name: ceph mountPath: "/etc/ceph" readOnly: trueCreate a
ConfigMapto add required configuration details to the Compute service (nova). Create a file calledceph-nova-az0.yamland add contents similar to the following. You must add the Image service (glance) endpoint for the local availability zone, as well as set thecross_az_attachparameter to false:apiVersion: v1 kind: ConfigMap metadata: name: ceph-nova-az0 namespace: openstack data: 03-ceph-nova.conf: [libvirt] images_type = rbd images_rbd_pool = vms images_rbd_ceph_conf = /etc/ceph/az0.conf images_rbd_glance_store_name = az0 images_rbd_glance_copy_poll_interval = 15 images_rbd_glance_copy_timeout = 600 rbd_user = openstack rbd_secret_uuid = 9cfb3a03-3f91-516a-881e-a675f67c30ea hw_disk_discard = unmap volume_use_multipath = False [glance] endpoint_override = http://glance-az0-internal.openstack.svc:9292 valid_interfaces = internal [cinder] cross_az_attach = False catalog_info = volumev3:cinderv3:internalURLCreate the
ConfigMap:oc create -f ceph-nova-az0.yamlCreate a custom Compute (nova) service to use the ConfigMap. Create a file called
nova-custom-az0.yamland add contents similar to the following. You must add the name of theConfigMapthat you just created under thedataSourcesfield:apiVersion: dataplane.openstack.org/v1beta1 kind: OpenStackDataPlaneService metadata: name: nova-custom-ceph-az0 spec: addCertMounts: false caCerts: combined-ca-bundle dataSources: - configMapRef: name: ceph-nova-az0 - secretRef: name: nova-cell1-compute-config - secretRef: name: nova-migration-ssh-key edpmServiceType: nova playbook: osp.edpm.nova tlsCerts: default: contents: - dnsnames - ips edpmRoleServiceName: nova issuer: osp-rootca-issuer-internal networks: - ctlplaneCreate the custom service:
oc create -f nova-custom-ceph-az0.yamlNoteYou must create a unique
ConfigMapand custom Compute service for each availability zone. Append the availability zone to the end of these file names as shown in the previous steps.-
Locate the
serviceslist in the CR. Edit the
serviceslist to restore all of the services removed in Configuring the data plane node networks. Restoring the fullserviceslist allows the remaining jobs to be run that complete the configuration of the HCI environment.The following is an example of a full
serviceslist with the additional services in bold:apiVersion: dataplane.openstack.org/v1beta1 kind: OpenStackDataPlaneNodeSet spec: ... services: - bootstrap - configure-network - validate-network - install-os - ceph-hci-pre - configure-os - ssh-known-hosts - run-os - reboot-os - install-certs - ceph-client - ovn - neutron-metadata - libvirt - nova-custom-ceph-az0NoteIn addition to restoring the default service list, the
ceph-clientservice is added after therun-osservice. Theceph-clientservice configures EDPM nodes as clients of a Red Hat Ceph Storage server. This service distributes the files necessary for the clients to connect to the Red Hat Ceph Storage server. Theceph-hci-preservice is only needed when you deploy HCI.Optional: You can assign compute nodes to Compute service (nova) cells the same as you can in any other environment. Replace the
novaservice in yourOpenStackDataPlaneNodeSetCR with your customnovaservice:apiVersion: dataplane.openstack.org/v1beta1 kind: OpenStackDataPlaneNodeSet metadata: name: openstack-cell2 spec: services: - download-cache - bootstrap - configure-network - validate-network - install-os - configure-os - ssh-known-hosts - run-os - ovn - libvirt - *nova-cell-custom*For more information, see Connecting an OpenStackDataPlaneNodeSetSR to a Compute cell.
NoteIf you are using cells, then the
neutron-metadataservice is unique per cell and defined separately. For exampleneutron-metadata-cell1:apiVersion: dataplane.openstack.org/v1beta1 kind: OpenStackDataPlaneService metadata: labels: app.kubernetes.io/instance: neutron-metadata-cell1 app.kubernetes.io/name: openstackdataplaneservice app.kubernetes.io/part-of: openstack-operator name: neutron-metadata-cell1 ...The
nova-custom-cephservice is unique for each availability zone and defined separately. For example,nova-custom-ceph-az0:apiVersion: dataplane.openstack.org/v1beta1 kind: OpenStackDataPlaneService metadata: labels: app.kubernetes.io/instance: nova-custom-ceph-az0 app.kubernetes.io/name: openstackdataplaneservice app.kubernetes.io/part-of: openstack-operator name: nova-custom-ceph-az0 namespace: openstackOptional: If you are deploying Red Hat Ceph Storage (RHCS) as a hyperconverged solution, complete the following steps:
Create a
ConfigMapto set thereserved_host_memory_mbparameter to a value appropriate for your configuration:The following is an example of a ConfigMap used for this purpose:
apiVersion: v1 kind: ConfigMap metadata: name: reserved-memory-nova data: 04-reserved-memory-nova.conf: | [DEFAULT] reserved_host_memory_mb=75000NoteThe value for the
reserved_host_memory_mbparameter may be set so that the Compute service scheduler does not give memory to a virtual machine that a Ceph OSD on the same server needs. The example reserves 5 GB per OSD for 10 OSDs per host in addition to the default reserved memory for the hypervisor. In an IOPS-optimized cluster, you can improve performance by reserving more memory for each OSD. The 5 GB number is provided as a starting point which can be further tuned if necessary.Edit the
OpenStackDataPlaneService/nova-custom-ceph-azfile. Addreserved-memory-novato theconfigMapslist in theOpenStackDataPlaneServiceCR calledceph-nova-az0that you created earlier:kind: OpenStackDataPlaneService <...> spec: configMaps: - ceph-nova - reserved-memory-nova
Apply the CR changes.
$ oc apply -f <dataplane_cr_file>

Replace <dataplane_cr_file> with the name of your file.

Note: Ansible does not configure or validate the networks until the OpenStackDataPlaneDeployment CRD is created.
- Create an OpenStackDataPlaneDeployment CRD, as described in Creating the data plane in the Deploying Red Hat OpenStack Services on OpenShift guide, that references the OpenStackDataPlaneNodeSet CRD defined above so that Ansible configures the services on the data plane nodes.
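A minimal sketch of such a CR, assuming that the node set in this chapter is named openstack-edpm; the deployment name is illustrative, and the Ready condition name is an assumption based on the data plane conditions referenced elsewhere in this guide:

$ oc create -n openstack -f - <<'EOF'
apiVersion: dataplane.openstack.org/v1beta1
kind: OpenStackDataPlaneDeployment
metadata:
  name: dcn-az0-deploy
spec:
  nodeSets:
    - openstack-edpm
EOF
$ oc wait openstackdataplanedeployment dcn-az0-deploy -n openstack --for condition=Ready --timeout=40m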
5.4. Example node set resource
Review a configuration example for a node set resource to understand the structure and required fields. This example demonstrates a three-node deployment with storage management networking configured.
apiVersion: dataplane.openstack.org/v1beta1
kind: OpenStackDataPlaneNodeSet
metadata:
name: openstack-edpm
namespace: openstack
spec:
env:
- name: ANSIBLE_FORCE_COLOR
value: "True"
networkAttachments:
- ctlplane
nodeTemplate:
ansible:
ansiblePort: 22
ansibleUser: cloud-admin
ansibleVars:
edpm_ceph_hci_pre_enabled_services:
- ceph_mon
- ceph_mgr
- ceph_osd
edpm_fips_mode: check
edpm_iscsid_image: {{ registry_url }}/openstack-iscsid:{{ image_tag }}
edpm_logrotate_crond_image: {{ registry_url }}/openstack-cron:{{ image_tag }}
edpm_network_config_hide_sensitive_logs: false
edpm_network_config_os_net_config_mappings:
edpm-compute-0:
nic1: 52:54:00:1e:af:6b
nic2: 52:54:00:d9:cb:f4
edpm-compute-1:
nic1: 52:54:00:f2:bc:af
nic2: 52:54:00:f1:c7:dd
edpm-compute-2:
nic1: 52:54:00:dd:33:14
nic2: 52:54:00:50:fb:c3
edpm_network_config_template: |
---
{% set mtu_list = [ctlplane_mtu] %}
{% for network in nodeset_networks %}
{{ mtu_list.append(lookup(vars, networks_lower[network] ~ _mtu)) }}
{%- endfor %}
{% set min_viable_mtu = mtu_list | max %}
network_config:
- type: ovs_bridge
name: {{ neutron_physical_bridge_name }}
mtu: {{ min_viable_mtu }}
use_dhcp: false
dns_servers: {{ ctlplane_dns_nameservers }}
domain: {{ dns_search_domains }}
addresses:
- ip_netmask: {{ ctlplane_ip }}/{{ ctlplane_cidr }}
routes: {{ ctlplane_host_routes }}
members:
- type: interface
name: nic2
mtu: {{ min_viable_mtu }}
# force the MAC address of the bridge to this interface
primary: true
{% for network in nodeset_networks %}
- type: vlan
mtu: {{ lookup(vars, networks_lower[network] ~ _mtu) }}
vlan_id: {{ lookup(vars, networks_lower[network] ~ _vlan_id) }}
addresses:
- ip_netmask:
{{ lookup(vars, networks_lower[network] ~ _ip) }}/{{ lookup(vars, networks_lower[network] ~ _cidr) }}
routes: {{ lookup(vars, networks_lower[network] ~ _host_routes) }}
{% endfor %}
edpm_neutron_metadata_agent_image: {{ registry_url }}/openstack-neutron-metadata-agent-ovn:{{ image_tag }}
edpm_nodes_validation_validate_controllers_icmp: false
edpm_nodes_validation_validate_gateway_icmp: false
edpm_selinux_mode: enforcing
edpm_sshd_allowed_ranges:
- 192.168.111.0/24
- 192.168.122.0/24
- 192.168.133.0/24
- 192.168.144.0/24
edpm_sshd_configure_firewall: true
enable_debug: false
gather_facts: false
image_tag: current-podified
neutron_physical_bridge_name: br-ex
neutron_public_interface_name: eth0
service_net_map:
nova_api_network: internalapi
nova_libvirt_network: internalapi
storage_mgmt_cidr: "24"
storage_mgmt_host_routes: []
storage_mgmt_mtu: 9000
storage_mgmt_vlan_id: 23
storage_mtu: 9000
timesync_ntp_servers:
- hostname: pool.ntp.org
ansibleSSHPrivateKeySecret: dataplane-ansible-ssh-private-key-secret
managementNetwork: ctlplane
networks:
- defaultRoute: true
name: ctlplane
subnetName: subnet1
- name: internalapi
subnetName: subnet1
- name: storage
subnetName: subnet1
- name: tenant
subnetName: subnet1
- name: external
subnetName: external1
nodes:
edpm-compute-0:
ansible:
host: 192.168.122.100
hostName: compute-0
networks:
- defaultRoute: true
fixedIP: 192.168.122.100
name: ctlplane
subnetName: subnet1
- name: internalapi
subnetName: subnet1
- name: storage
subnetName: subnet1
- name: storagemgmt
subnetName: subnet1
- name: tenant
subnetName: subnet1
- name: external
subnetName: external1
edpm-compute-1:
ansible:
host: 192.168.122.101
hostName: compute-1
networks:
- defaultRoute: true
fixedIP: 192.168.122.101
name: ctlplane
subnetName: subnet1
- name: internalapi
subnetName: subnet1
- name: storage
subnetName: subnet1
- name: storagemgmt
subnetName: subnet1
- name: tenant
subnetName: subnet1
- name: external
subnetName: external1
edpm-compute-2:
ansible:
host: 192.168.122.102
hostName: compute-2
networks:
- defaultRoute: true
fixedIP: 192.168.122.102
name: ctlplane
subnetName: subnet1
- name: internalapi
subnetName: subnet1
- name: storage
subnetName: subnet1
- name: storagemgmt
subnetName: subnet1
- name: tenant
subnetName: subnet1
- name: external
subnetName: external1
preProvisioned: true
services:
- bootstrap
- configure-network
- validate-network
- install-os
- ceph-hci-pre
- configure-os
- ssh-known-hosts
- run-os
- reboot-os
It is not necessary to add the storage management network to the networkAttachments key.
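If you save the example as a file, for example openstack-edpm.yaml (an illustrative name), you can validate it against the API server before creating any resources:

$ oc apply -f openstack-edpm.yaml -n openstack --dry-run=server

Server-side dry-run reports schema or admission errors without changing the cluster.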
5.5. Updating the control plane
After you have deployed the data plane at the central location, you must update the control plane to integrate the newly deployed data plane.
Prerequisites
- You have deployed a node set at the central location by using Red Hat OpenStack Services on OpenShift.
- You have deployed Red Hat Ceph Storage (RHCS).
Procedure
Optional: Configure the Block Storage backup service in your
openstack_control_plane.yamlfile:cinderBackup: customServiceConfig: | [DEFAULT] backup_driver = cinder.backup.drivers.ceph.CephBackupDriver backup_ceph_pool = backups backup_ceph_user = openstackFor more information about configuring the Block Storage backup service, see Configuring the Block Storage backup service.
Update the Block Storage cinder volume service in your
openstack_control_plane.yamlfile:cinderVolumes: az0: customServiceConfig: | [DEFAULT] enabled_backends = ceph glance_api_servers = https://glance-az0-internal.openstack.svc:9292 [ceph] volume_backend_name = ceph volume_driver = cinder.volume.drivers.rbd.RBDDriver rbd_ceph_conf = /etc/ceph/az0.conf rbd_user = openstack rbd_pool = volumes rbd_flatten_volume_from_snapshot = False rbd_secret_uuid = 795dcbca-e715-5ac3-9b7e-a3f5c64eb89f rbd_cluster_name = az0 backend_availability_zone = az0For more information about configuring the Block Storage volume service, see Configuring the Block Storage volume service component.
Add the extraMounts field to your
openstack_control_plane.yamlfile to define the services that require access to the Red Hat Ceph Storage secret:extraMounts: - extraVol: - extraVolType: Ceph mounts: - mountPath: /etc/ceph name: ceph readOnly: true propagation: - az0 - CinderBackup volumes: - name: ceph projected: sources: - secret: name: ceph-conf-az-0Update the Image service (glance) in your
openstack_control_plane.yamlfile to configure the Block Storage service as the backend:glanceAPIs: az0: customServiceConfig: | [DEFAULT] enabled_import_methods = [web-download,copy-image,glance-direct] enabled_backends = az0:rbd [glance_store] default_backend = az0 [az0] rbd_store_ceph_conf = /etc/ceph/az0.conf store_description = "az0 RBD backend" rbd_store_pool = images rbd_store_user = openstack rbd_thin_provisioning = TrueApply the changes made to the
OpenStackControlPlaneCR:oc apply -f openstack_control_plane.yamlAdd the AZ to a host aggregate. This allows OpenStack administrators to schedule workloads to a geographical location based on image metadata.
Open a terminal to the
openstackclientpod:# oc rsh openstackclientCreate a new OpenStack aggregate:
$ openstack aggregate create <aggregate_name>Label the OpenStack aggregate with the name of the AZ:
$ openstack aggregate set --zone <availability_zone> <aggregate_name>Add each host in the AZ to the aggregate:
$ openstack aggregate add host <aggregate_name> <compute_node_1> $ openstack aggregate add host <aggregate_name> <compute_node_2> ...
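For example, a sketch that follows the az0 convention used in this guide; the aggregate name and host name are illustrative:

$ openstack aggregate create central-az0
$ openstack aggregate set --zone az0 central-az0
$ openstack aggregate add host central-az0 compute-0
$ openstack aggregate show central-az0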
5.6. Updating the control plane for new edge sites
Update the control plane configuration to integrate newly deployed edge locations. This enables the control plane to manage storage and compute services at the new edge sites.
You can create additional node sets by using the OpenStackDataPlaneNodeSet custom resource (CR). Use a unique availability zone, and the VLANs, NIC mappings, and IP addresses specific to the site you are deploying. For more information about deploying an OpenStackDataPlaneNodeSet CR, see Creating the data plane.
When you deploy a DCN node set with storage, you must update the following fields of the OpenStackControlPlane CR at the central location:
- cinderVolumes
- glanceAPIs
- Neutron
- OVN
If you are using cells, you must also configure cells for the new DCN site.
Prerequisites
- You have deployed the central location.
-
You have deployed an additional
OpenStackDataPlanenode set.
Procedure
In the neutron service configuration, update the customServiceConfig field to add the new availability zone and network leaf:
customServiceConfig: |
  [DEFAULT]
  router_scheduler_driver = neutron.scheduler.l3_agent_scheduler.AZLeastRoutersScheduler
  network_scheduler_driver = neutron.scheduler.dhcp_agent_scheduler.AZAwareWeightScheduler
  default_availability_zones = az0,az1
  [ml2_type_vlan]
  network_vlan_ranges = datacentre:1:1000,leaf1:1:1000
  [neutron]
  physnets = datacentre,leaf1

In the OVN service configuration, update the availability zones:
ovnController: external-ids: availability-zones: - az0 - az1 enable-chassis-as-gateway: true ovn-bridge: br-int ovn-encap-type: geneve system-id: random networkAttachment: tenant nicMappings: datacentre: ospbrUpdate the
cinderVolumesfield in theOpenStackControlPlaneCR to add the availability zones definitions of the remote location. Each cinder volume service in each availability zone uses the Glance API server for its availability zone. For example,glance_api_servers = https://glance-az1-internal.openstack.svc:9292:cinderVolumes: az0: customServiceConfig: | [DEFAULT] .... az1: customServiceConfig: | [DEFAULT] enabled_backends = ceph glance_api_servers = https://glance-az1-internal.openstack.svc:9292 [ceph] volume_backend_name = az1 volume_driver = cinder.volume.drivers.rbd.RBDDriver rbd_ceph_conf = /etc/ceph/az1.conf rbd_user = openstack rbd_pool = volumes rbd_flatten_volume_from_snapshot = False rbd_secret_uuid = 19ccdd60-79a0-5f0f-aece-ece700e514f8 rbd_cluster_name = az1 backend_availability_zone = az1Register an Image service (glance) pod to the Identity service (keystone) catalog:
In DCN, an Image service pod is deployed for each node set. A single Image service pod is registered to the Identity service catalog at any one time. For this reason, in the top-level Glance custom resource (CR), the
keystoneEndpointparameter is defined and exposed. Unless a single instance is deployed, the human operator can choose, before the mainOpenStackControlPlaneCR is applied, which instance should be registered. Because our default endpoint is theaz0Image service API, thekeystoneEndpointis set toaz0:spec: <...> glance: enabled: true keystoneEndpoint: az0 glanceAPIs: az0: apiTimeout: 60Update the
glanceAPIsfield:For the node sets at
az0, theglanceAPIsfield configures the Image service pods for the central location. When you add an additional node set in AZ1, theOpenStackControlPlaneCR is updated such that theglanceAPIsfield contains the Image service (glance) pod definition for AZ0 and AZ1. Additionally, the Image service pod for AZ1 defines the ceph the backend for the central location, and the AZ0 Image service pod for the central location is updated so that it has the ceph backend definition for AZ1.glanceAPIs: az1: customServiceConfig: | [DEFAULT] enabled_import_methods = [web-download,copy-image,glance-direct] enabled_backends = az0:rbd,az1:rbd [glance_store] default_backend = az1 [az1] rbd_store_ceph_conf = /etc/ceph/az1.conf store_description = "az1 RBD backend" rbd_store_pool = images rbd_store_user = openstack rbd_thin_provisioning = True [az0] rbd_store_ceph_conf = /etc/ceph/az0.conf store_description = "az0 RBD backend" rbd_store_pool = images rbd_store_user = openstack rbd_thin_provisioning = True networkAttachments: - storage override: service: internal: metadata: annotations: metallb.universe.tf/address-pool: internalapi metallb.universe.tf/allow-shared-ip: internalapi metallb.universe.tf/loadBalancerIPs: 172.17.0.81 spec: type: LoadBalancer replicas: 2 type: edge az0: customServiceConfig: | [DEFAULT] enabled_import_methods = [web-download,copy-image,glance-direct] enabled_backends = az0:rbd,az1:rbd [glance_store] default_backend = az0 [az0] rbd_store_ceph_conf = /etc/ceph/az0.conf store_description = "az0 RBD backend" rbd_store_pool = images rbd_store_user = openstack rbd_thin_provisioning = True [az1] rbd_store_ceph_conf = /etc/ceph/az1.conf store_description = "az1 RBD backend" rbd_store_pool = images rbd_store_user = openstack rbd_thin_provisioning = True networkAttachments: - storage override: service: internal: metadata: annotations: metallb.universe.tf/address-pool: internalapi metallb.universe.tf/allow-shared-ip: internalapi metallb.universe.tf/loadBalancerIPs: 172.17.0.80 spec: type: LoadBalancer replicas: 3 type: splitNoteAvailability zone
az0is of typesplit, and all other availability zones are of typeedge.The split type is for cloud users to use when uploading images. The edge type is created so that when Cinder or Nova interact with Glance they can be configured to whichever glance is local to them. Use at least 3 replicas for the default split glance pods and 2 replicas for the edge glance pods and increase replicas proportionally to the workload.
Optional: Update the Cell configuration.
By default, compute nodes across all availability zones (AZs) are placed in a common cell called
cell1. You can increase the performance of large deployments by partitioning compute nodes into separate cells. For DCN deployments, place each availability zone into its own cell. For more information, see Adding Compute cells to the control plane.Apply the changes made to the
OpenStackControlPlane CR:

oc apply -f openstack_control_plane.yaml

Continue to update the control plane for each additional edge site that you add. Add the Red Hat Ceph Storage (RHCS) configuration for each new site to your OpenShift secrets as needed.
In the neutron service configuration, update the customServiceConfig field to add the new availability zone and network leaf:
customServiceConfig: | [DEFAULT] router_scheduler_driver = neutron.scheduler.l3_agent_scheduler.AZLeastRoutersScheduler network_scheduler_driver = neutron.scheduler.dhcp_agent_scheduler.AZAwareWeightScheduler default_availability_zones = az0,az1,az2 [ml2_type_vlan] network_vlan_ranges = datacentre:1:1000,leaf1:1:1000,leaf2:1:1000 [neutron] physnets = datacentre,leaf1,leaf2In the OVN service configuration, update the availability zones
ovnController: external-ids: availability-zones: - az0 - az1 - az2 enable-chassis-as-gateway: true ovn-bridge: br-int ovn-encap-type: geneve system-id: random networkAttachment: tenant nicMappings: datacentre: ospbrAdd an additional availability zone the
cinderVolumesservice field:cinderVolumes: az0: customServiceConfig: | [DEFAULT] ... az1: customServiceConfig: | [DEFAULT] ... az2: customServiceConfig: | [DEFAULT] enabled_backends = ceph glance_api_servers = https://glance-az2-internal.openstack.svc:9292 [ceph] volume_backend_name = ceph volume_driver = cinder.volume.drivers.rbd.RBDDriver rbd_ceph_conf = /etc/ceph/az2.conf rbd_user = openstack rbd_pool = volumes rbd_flatten_volume_from_snapshot = False rbd_secret_uuid = 5c0c7a8e-55b1-5fa8-bc5c-9756b7862d2f rbd_cluster_name = az2 backend_availability_zone = az2Add an additional availability zone to the
glanceAPIsfield:As you add additional AZs, you must ensure that each Image service pod definition contains the storage configuration of the central location (AZ0), and its own local ceph configuration. You must also ensure that the central location has the storage definition of all other sites. This creates a hub and spoke relationship between the central location Image service pod, and the Image service pods for geographically dispersed node sets:
glanceAPIs: az0: customServiceConfig: | [DEFAULT] enabled_import_methods = [web-download,copy-image,glance-direct] enabled_backends = az0:rbd,az1:rbd,az2:rbd [glance_store] default_backend = az0 [az0] rbd_store_ceph_conf = /etc/ceph/az0.conf store_description = "az0 RBD backend" rbd_store_pool = images rbd_store_user = openstack rbd_thin_provisioning = True [az1] rbd_store_ceph_conf = /etc/ceph/az1.conf store_description = "az1 RBD backend" rbd_store_pool = images rbd_store_user = openstack rbd_thin_provisioning = True [az2] rbd_store_ceph_conf = /etc/ceph/az2.conf store_description = "az2 RBD backend" rbd_store_pool = images rbd_store_user = openstack rbd_thin_provisioning = True networkAttachments: - storage override: service: internal: metadata: annotations: metallb.universe.tf/address-pool: internalapi metallb.universe.tf/allow-shared-ip: internalapi metallb.universe.tf/loadBalancerIPs: 172.17.0.80 spec: type: LoadBalancer replicas: 3 type: split az1: customServiceConfig: | [DEFAULT] enabled_import_methods = [web-download,copy-image,glance-direct] enabled_backends = az0:rbd,az1:rbd [glance_store] default_backend = az1 [az1] rbd_store_ceph_conf = /etc/ceph/az1.conf store_description = "az1 RBD backend" rbd_store_pool = images rbd_store_user = openstack rbd_thin_provisioning = True [az0] rbd_store_ceph_conf = /etc/ceph/az0.conf store_description = "az0 RBD backend" rbd_store_pool = images rbd_store_user = openstack rbd_thin_provisioning = True networkAttachments: - storage override: service: internal: metadata: annotations: metallb.universe.tf/address-pool: internalapi metallb.universe.tf/allow-shared-ip: internalapi metallb.universe.tf/loadBalancerIPs: 172.17.0.81 spec: type: LoadBalancer replicas: 2 type: edge az2: customServiceConfig: | [DEFAULT] enabled_import_methods = [web-download,copy-image,glance-direct] enabled_backends = az0:rbd,az2:rbd [glance_store] default_backend = az2 [az2] rbd_store_ceph_conf = /etc/ceph/az2.conf store_description = "az2 RBD backend" rbd_store_pool = images rbd_store_user = openstack rbd_thin_provisioning = True [az0] rbd_store_ceph_conf = /etc/ceph/az0.conf store_description = "az0 RBD backend" rbd_store_pool = images rbd_store_user = openstack rbd_thin_provisioning = True networkAttachments: - storage override: service: internal: metadata: annotations: metallb.universe.tf/address-pool: internalapi metallb.universe.tf/allow-shared-ip: internalapi metallb.universe.tf/loadBalancerIPs: 172.17.0.82 spec: type: LoadBalancer replicas: 2 type: edge
Apply the changes made to the
OpenStackControlPlaneCR:oc apply -f openstack_control_plane.yamlAdd the AZ to a host aggregate. This allows OpenStack administrators to schedule workloads to a geographical location by passing the
--availability-zoneargument:Open a terminal to the
openstackclientpod:# oc rsh openstackclientCreate a new OpenStack aggregate:
$ openstack aggregate create <aggregate_name>Label the OpenStack aggregate with the name of the AZ:
$ openstack aggregate set --zone <availability_zone> <aggregate_name>Add each host in the AZ to the aggregate:
$ openstack aggregate add host <aggregate_name> <compute_node_1> $ openstack aggregate add host <aggregate_name> <compute_node_2> ...
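With the aggregate and availability zone in place, workloads can be pinned to the new site by passing the --availability-zone argument. A sketch with illustrative flavor, image, and network names:

$ openstack server create --flavor tiny --image cirros \
    --network az2-network --availability-zone az2 test-instance-az2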
5.7. Adding nodes to a DCN site
Scale out your data plane capacity at a distributed compute node (DCN) site by adding new compute nodes. This increases the available resources for running workloads at the edge location.
Prerequisites
- The nodes that you are adding are preprovisioned. For more information on preprovisioning, see Creating an OpenStackDataPlaneNodeSet CR with pre-provisioned nodes.
Procedure
-
Open the
OpenStackDataPlaneNodeSetmanifest file for the node set you want to update, for example,openstack_data_plane.yaml. Add the new node to the node set:
apiVersion: dataplane.openstack.org/v1beta1 kind: OpenStackDataPlaneNodeSet metadata: name: openstack-node-set spec: preProvisioned: True nodes: ... edpm-compute-3: hostName: edpm-compute-3 ansible: ansibleHost: 192.168.122.103 networks: - name: ctlplane subnetName: subnet1 defaultRoute: true fixedIP: 192.168.122.103 - name: internalapi subnetName: subnet1 - name: storage subnetName: subnet1 - name: tenant subnetName: subnet1 ...In the
OpenStackDataPlaneNodeSetmanifest file, edit theedpm_network_config_os_net_config_mappingsto add the mac address of the new host:... edpm_network_config_os_net_config_mappings: edpm-compute-0: nic1: 52:54:00:1e:af:6b nic2: 52:54:00:d9:cb:f4 edpm-compute-1: nic1: 52:54:00:f2:bc:af nic2: 52:54:00:f1:c7:dd edpm-compute-2: nic1: 52:54:00:dd:33:14 nic2: 52:54:00:50:fb:c3 edpm-compute-3: nic1: 52:54:12:8b:be:9a nic2: 52:54:12:8b:be:9b(Optional) If you are adding a node to a data plane deployment with HCI, then you must complete the following:
The
extraMountsparameter must be present to define thecephxkey and configuration file for the Compute service (nova). Ensure that yourextraMountsconfiguration is present in yourOpenStackDataPlaneNodeSetmanifest file:apiVersion: dataplane.openstack.org/v1beta1 kind: OpenStackDataPlaneNodeSet spec: ... nodeTemplate: extraMounts: - extraVolType: Ceph volumes: - name: ceph secret: secretName: ceph-conf-files mounts: - name: ceph mountPath: "/etc/ceph" readOnly: trueSpecify the additional services to apply to the nodes:
apiVersion: dataplane.openstack.org/v1beta1 kind: OpenStackDataPlaneNodeSet spec: ... services: - bootstrap - configure-network - validate-network - install-os - ceph-hci-pre - configure-os - ssh-known-hosts - run-os - reboot-os - install-certs - ceph-client - ovn - neutron-metadata - libvirt - nova-custom-ceph-az0NoteThe
nova-custom-ceph-az0service is created when you configure the DCN data plane, and should be present during this step. For more information see Configuring the DCN data plane.
-
Save the
OpenStackDataPlaneNodeSetmanifest file. Apply the updated
OpenStackDataPlaneNodeSetCR configuration:$ oc apply -f <data-plane-custom-resource-file>-
Replace
<data-plane-custom-resource-file>with the name of the manifest file you have edited.
-
Replace
Verify that the data plane resource is updated by confirming that the status is
SetupReady:$ oc wait openstackdataplanenodeset <node-set-name> --for condition=SetupReady --timeout=10m-
Replace
<node-set-name>with the name of theOpenStackDataPlaneNodeSetCR that you are adding the node to.
When the status is
SetupReady, the command returns acondition metmessage, otherwise it returns a timeout error. For information about the data plane conditions and states, see Data plane conditions and states in Deploying Red Hat OpenStack Services on OpenShift.-
Replace
Create a file on your workstation to define the
OpenStackDataPlaneDeploymentCR:apiVersion: dataplane.openstack.org/v1beta1 kind: OpenStackDataPlaneDeployment metadata: name: <node_set_scaling_deployment_name>-
Replace
<node_set_scaling_deployment_name>with the name of theOpenStackDataPlaneDeploymentCR. The name must be unique, and cannot match the previously createdOpenStackDataPlaneDeployment. The name you choose must consist of lower case alphanumeric characters,-(hyphen) or.(period), and it must start and end with an alphanumeric character.
TipGive the definition file and the
OpenStackDataPlaneDeploymentCR unique and descriptive names that indicate the purpose of the modified node set.-
Replace
Add the
OpenStackDataPlaneNodeSetCR that you modified:spec: nodeSets: - <nodeSet_name>-
Save the
OpenStackDataPlaneDeploymentCR deployment file. Deploy the modified
OpenStackDataPlaneNodeSetCR:$ oc create -f openstack_data_plane_deploy.yaml -n openstackYou can view the Ansible logs while the deployment executes:
$ oc get pod -l app=openstackansibleee -w $ oc logs -l app=openstackansibleee -f --max-log-requests 10If the
oc logscommand returns an error similar to the following error, increase the--max-log-requestsvalue:error: you are attempting to follow 19 log streams, but maximum allowed concurrency is 10, use --max-log-requests to increase the limitVerify that the modified
OpenStackDataPlaneNodeSetCR is deployed:$ oc get openstackdataplanedeployment -n openstack NAME NODESETS STATUS MESSAGE openstack-data-plane ["openstack-data-plane"] True Setup Complete $ oc get openstackdataplanenodeset -n openstack NAME STATUS MESSAGE openstack-data-plane True NodeSet ReadyFor information about the meaning of the returned status, see Data plane conditions and states in Deploying Red Hat OpenStack Services on OpenShift.
If the status indicates that the data plane has not been deployed, then troubleshoot the deployment. For more information about troubleshooting, see Troubleshooting data plane creation and deployment in Deploying Red Hat OpenStack Services on OpenShift.
(Optional): If you are adding a node to a data plane deployment with HCI, then you must configure the node as a Ceph OSD node, and configure it to use the collocated Red Hat Ceph Storage cluster. For more information, see "Adding a Ceph OSD node" in the Red Hat Ceph Storage Operations Guide:
If the new nodes are Compute nodes, you must bring them online:
Map the Compute nodes to the Compute cell that they are connected to:
$ oc rsh nova-cell0-conductor-0 nova-manage cell_v2 discover_hosts --verboseIf you did not create additional cells, this command maps the Compute nodes to
cell1.Access the remote shell for the
openstackclientpod and verify that the deployed Compute nodes are visible on the control plane:$ oc rsh -n openstack openstackclient $ openstack hypervisor listNoteUse the Ceph Orchestrator with Cephadm in the back end to add hosts to an existing Red Hat Ceph Storage cluster. For more information, see [Adding hosts using the Ceph Orchestrator].
Add the new host to the availability zone (AZ) that corresponds to the node set or geographic location in which they reside:
Open a terminal to the
openstackclientpod:$ oc rsh openstackclientAdd the host to the AZ aggregate:
$ openstack aggregate add host <availability_zone> <compute_node_3>-
Replace
<availability_zone>with the name of the AZ that the new host is in. -
Replace
<compute_node_3>with the name of the host you are adding to the node set.
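For example, a sketch in which edge-az1 is the host aggregate mapped to the az1 availability zone and edpm-compute-3 is the new host; both names are illustrative:

$ openstack aggregate add host edge-az1 edpm-compute-3
$ openstack aggregate show edge-az1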
Chapter 6. Removing a DCN node set
Remove an unused edge location and its availability zone from the DCN configuration. This allows you to decommission sites that are no longer needed and reclaim resources.
6.1. Decommissioning a DCN edge site
Decommission edge sites that are no longer required to reclaim hardware and clean up resources. This process safely removes a site from your DCN deployment while preserving the integrity of remaining sites.
Prerequisites
- All data and workloads on the DCN site have been migrated off the site.

Important: Unsaved data and workloads that are not migrated are lost when you complete this procedure.
Procedure
Delete the host aggregate for the DCN site that you want to remove, for example, “az2”:
Access the remote shell for the OpenStackClient pod from your workstation:
$ oc rsh -n openstack openstackclient

Optional: View the Compute nodes assigned to the host aggregate:
# openstack aggregate show <aggregate_name>To remove the assigned Compute nodes from the host aggregate, enter the following command for each Compute node:
# openstack aggregate remove host <aggregate_name> \ <host_name>-
Replace
<aggregate_name>with the aggregate that corresponds to the DCN site to be removed, for example "az2". -
Replace
<host_name>with each host in the aggregate being removed, in turn.
-
Replace
Remove the aggregate:
openstack aggregate delete <aggregate_name>-
Replace
<aggregate_name>with the aggregate that corresponds to the DCN site to be removed, for example "az2".
-
Replace
Exit the
openstackclientpod:$ exit
Optional: Remove the cell:
- Open your OpenStackControlPlane CR file, openstack_control_plane.yaml, on your workstation.
Remove the cell definition from the
cellTemplates:cellTemplates: cell0: hasAPIAccess: true cellDatabaseAccount: nova-cell0 cellDatabaseInstance: openstack cellMessageBusInstance: rabbitmq cell1: hasAPIAccess: true cellDatabaseAccount: nova-cell1 cellDatabaseInstance: openstack-cell1 cellMessageBusInstance: rabbitmq-cell1 cell2: hasAPIAccess: true cellDatabaseAccount: nova-cell2 cellDatabaseInstance: openstack-cell2 cellMessageBusInstance: rabbitmq-cell2 - cell3: - hasAPIAccess: true - cellDatabaseAccount: nova-cell3 - cellDatabaseInstance: openstack-cell3 - cellMessageBusInstance: rabbitmq-cell3Delete the cell-specific RabbitMQ definition from the OpenStackControlPlane CR:
spec: ... rabbitmq: templates: ... rabbitmq-<cellname>: ...Delete the cell-specific Galera definition from the `OpenStackControlPlane CR file:
spec: ... galera: templates: ... openstack-<cellname>: ...Update the control plane:
$ oc apply -f openstack_control_plane.yaml -n openstack
Remove the Block storage (cinder) pods for the DCN site:
Get a list of volumes:
$ openstack volume service listExample
+------------------+--------------------------+------+---------+-------+----------------------------+ | Binary | Host | Zone | Status | State | Updated At | +------------------+--------------------------+------+---------+-------+----------------------------+ | cinder-scheduler | cinder-e479e-scheduler-0 | nova | enabled | down | 2024-11-10T16:29:40.000000 | | cinder-scheduler | cinder-scheduler-0 | nova | enabled | up | 2024-11-12T19:11:08.000000 | | cinder-volume | cinder-volume-az0-0@ceph | az0 | enabled | up | 2024-11-12T19:11:09.000000 | | cinder-backup | cinder-backup-0 | nova | enabled | up | 2024-11-12T19:11:08.000000 | | cinder-backup | cinder-backup-1 | nova | enabled | up | 2024-11-12T19:11:13.000000 | | cinder-backup | cinder-backup-2 | nova | enabled | up | 2024-11-12T19:11:16.000000 | | cinder-volume | cinder-volume-az1-0@ceph | az1 | enabled | up | 2024-11-12T19:11:15.000000 | | cinder-volume | cinder-volume-az2-0@ceph | az2 | enabled | up | 2024-11-12T17:28:28.000000 | +------------------+--------------------------+------+---------+-------+----------------------------+Disable the Block storage volume service of the availability zone(AZ) being removed:
$ openstack volume service set \ --disable cinder-volume-az2-0@ceph cinder-volumeVerify that the Block storage volume is disabled:
openstack volume service listExample
+------------------+--------------------------+------+----------+-------+----------------------------+ | Binary | Host | Zone | Status | State | Updated At | +------------------+--------------------------+------+----------+-------+----------------------------+ | cinder-scheduler | cinder-e479e-scheduler-0 | nova | enabled | down | 2024-11-10T16:29:40.000000 | | cinder-scheduler | cinder-scheduler-0 | nova | enabled | up | 2024-11-12T19:23:38.000000 | | cinder-volume | cinder-volume-az0-0@ceph | az0 | enabled | up | 2024-11-12T19:23:29.000000 | | cinder-backup | cinder-backup-0 | nova | enabled | up | 2024-11-12T19:23:38.000000 | | cinder-backup | cinder-backup-1 | nova | enabled | up | 2024-11-12T19:23:33.000000 | | cinder-backup | cinder-backup-2 | nova | enabled | up | 2024-11-12T19:23:36.000000 | | cinder-volume | cinder-volume-az1-0@ceph | az1 | enabled | up | 2024-11-12T19:23:35.000000 | | cinder-volume | cinder-volume-az2-0@ceph | az2 | disabled | up | 2024-11-12T19:23:24.000000 | +------------------+--------------------------+------+----------+-------+----------------------------+Open the OpenStackControlPlane manifest file,
openstack_control_plane.yaml. Remove theCinderVolumepod for the site being removed, as shown in the following example:cinderVolumes: az2: customServiceConfig: | [DEFAULT] enabled_backends = ceph glance_api_servers = https://glance-az2-internal.openstack.svc:9292 [ceph] volume_backend_name = ceph volume_driver = cinder.volume.drivers.rbd.RBDDriver rbd_ceph_conf = /etc/ceph/az2.conf rbd_user = openstack rbd_pool = volumes rbd_flatten_volume_from_snapshot = False rbd_secret_uuid = 795dcbca-e715-5ac3-9b7e-a3f5c64eb89f rbd_cluster_name = az2 backend_availability_zone = az2
Remove the cinder volume service for the DCN site you are removing:
Open a shell to the cinder scheduler pod:
oc rsh cinder-scheduler-0Remove the cinder volume service:
cinder-manage service remove cinder-volume cinder-volume-az2-0@cephExit the shell
$ exit
Remove the
GlanceAPIpod for the site being removed:In the openstack-control-plane.yaml custom resource (CR) file, remove the az2 field and all fields under it:
glanceAPIs: az0: <...> az1: <...> az2: apiTimeout: 60 customServiceConfig: | [DEFAULT] enabled_import_methods = [web-download,copy-image,glance-direct] enabled_backends = az0:rbd,az2:rbd [glance_store] default_backend = az2 [az0] rbd_store_ceph_conf = /etc/ceph/az0.conf store_description = "az0 RBD backend" rbd_store_pool = images rbd_store_user = openstack rbd_thin_provisioning = True [az2] rbd_store_ceph_conf = /etc/ceph/az2.conf store_description = "az2 RBD backend" rbd_store_pool = images rbd_store_user = openstack rbd_thin_provisioning = True imageCache: cleanerScheduler: '*/30 * * * *' prunerScheduler: 1 0 * * * size: "" networkAttachments: - storage override: service: internal: metadata: annotations: metallb.universe.tf/address-pool: internalapi metallb.universe.tf/allow-shared-ip: internalapi metallb.universe.tf/loadBalancerIPs: 172.17.0.82 spec: type: LoadBalancer replicas: 1 resources: {} storage: {} tls: api: internal: {} public: {} type: edgeReapply the control plane:
oc apply -f openstack-control-plane.yamlEnsure that the Image service (glance) pods have been removed:
$ oc get pods | grep glance | grep -v purgeExample
glance-e479e-az0-external-api-0 3/3 Running 0 2d glance-e479e-az0-external-api-1 3/3 Running 0 2d glance-e479e-az0-external-api-2 3/3 Running 0 2d glance-e479e-az0-internal-api-0 3/3 Running 0 2d glance-e479e-az0-internal-api-1 3/3 Running 0 2d glance-e479e-az0-internal-api-2 3/3 Running 0 2d glance-e479e-az1-edge-api-0Noteglance-e4793-az2-edge-api-0does not appear in this list.Ensure that the az2 pods are removed:
$ oc get pods | grep cinder-volumeExample
cinder-volume-az0-0 2/2 Running 0 2d cinder-volume-az1-0 2/2 Running 0 2d
Remove the Ceph cluster from the DCN site:
Shut down the Ceph clusters, but do not power off the hosts. For more information, see "Powering down and rebooting the cluster using the Ceph Orchestrator" in the Red Hat Ceph Storage Administration Guide:
- Red Hat Ceph Storage 7 Administration Guide
- Red Hat Ceph Storage 8 Administration Guide
Red Hat Ceph Storage 9 Administration Guide
NoteThe hosts must remain powered on to complete the following steps.
Remove the secret that was used for accessing the removed Ceph cluster:
$ oc delete secret ceph-conf-az-2 -n openstackRe-create the secret for the central site so that it does not contain the secret for the Red Hat Ceph Storage cluster at az2. For example, if you have a three availability zones,
az0,az1, andaz2, and you are removing the edge location that corresponds toaz2, run the following:oc delete secret ceph-conf-az-0 -n openstack oc create secret generic ceph-conf-az-0 \ --from-file=az0.client.openstack.keyring \ --from-file=az0.conf \ --from-file=az1.client.openstack.keyring \ --from-file=az1.conf -n openstackEdit the
extraMountsin the OpenStackControlPlane to remove the reference to the AZ being removed and the removed secret. For example, for AZ2, remove the following list element:- propagation: - az2 extraVolType: Ceph volumes: - name: ceph secret: name: ceph-conf-az-2
-
Remove the node set. To remove the node set that corresponds to the
az2availability zone, complete the steps in Removing an OpenStackDataPlaneNodeSet resource.
Chapter 7. Removing a DCN node
Remove a node from a DCN site when it is no longer needed. This allows you to repurpose or decommission the hardware while maintaining site operations.
- Migrate all instances from the host. For more information see Cold migrating an instance.
- Remove the compute node from its host aggregate. For more information, see Remove a compute node from a host aggregate.
- If you are running HCI, you must remove the host from the Ceph cluster. For more information see Removing a host using the Ceph Orchestrator.
- Remove the Compute node from the data plane. For more information see Removing a Compute node from the data plane.
7.1. Cold migrating an instance
Cold migrating an instance involves stopping the instance and moving it to another Compute node. Cold migration supports migration scenarios that live migration cannot, such as migrating instances that use PCI passthrough.
The scheduler automatically selects the destination Compute node. For more information, see Migration constraints.
Procedure
Access the remote shell for the
OpenStackClientpod from your workstation:$ oc rsh -n openstack openstackclientTo cold migrate an instance, enter the following command to power off and move the instance:
$ openstack server migrate <instance> --wait-
Replace
<instance>with the name or ID of the instance to migrate. -
Specify the
--block-migrationflag if migrating a locally stored volume. -
Specify the
--waitflag to indicate that you must wait for the migration to complete.
-
Replace
- While you wait for the instance migration to complete, you can open another terminal window and check the migration status. For more information, see Checking migration status.
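A quick way to poll the state from another terminal while the migration runs; a sketch, with <instance> as in the previous step:

$ openstack server show <instance> -c status -c OS-EXT-STS:task_state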
Check the status of the instance:
$ openstack server list --all-projectsA status of "VERIFY_RESIZE" indicates you need to confirm or revert the migration:
If the migration worked as expected, confirm it:
$ openstack server resize --confirm <instance>Replace
<instance>with the name or ID of the instance to migrate. A status of "ACTIVE" indicates that the instance is ready to use.If the migration did not work as expected, revert it:
$ openstack server resize --revert <instance>Replace
<instance>with the name or ID of the instance.
Restart the instance:
$ openstack server start <instance>Replace
<instance>with the name or ID of the instance.Optional: If you disabled the source Compute node for maintenance, you must re-enable the node so that new instances can be assigned to it:
$ openstack compute service set <source> nova-compute --enableReplace
<source>with the hostname of the source Compute node.Exit the
OpenStackClientpod:$ exit
7.2. Removing a compute node from a host aggregate
Remove a compute node from its host aggregate before decommissioning or reassigning the node. This ensures that the scheduler stops directing workloads to the node.
Procedure
Access the remote shell for the OpenStackClient pod from your workstation:
$ oc rsh -n openstack openstackclientView a list of all the Compute nodes assigned to the host aggregate:
# openstack aggregate show <aggregate_name>To remove an assigned Compute node from the host aggregate, enter the following command:
# openstack aggregate remove host <aggregate_name> <host_name>-
Replace
<aggregate_name>with the name of the host aggregate to remove the Compute node from. -
Replace
<host_name>with the name of the Compute node to remove from the host aggregate.
7.3. Removing a host from the Ceph cluster
Remove a host from your hyperconverged Ceph cluster before decommissioning the node. This process safely removes the node from the Ceph cluster while maintaining cluster availability.
Prerequisites
- You are running HCI on a DCN cluster.
- You have root-level access to all the nodes.
- The host you are removing is added to the storage cluster.
- Cephadm is deployed on the node where the service is to be removed.
Procedure
Log in to the Cephadm shell:
[root@host01 ~]# cephadm shellFetch the host details:
[ceph: root@host01 /]# ceph orch host lsDrain all the daemons from the host:
[ceph: root@host01 /]# ceph orch host drain <hostname>Replace
<hostname>with the name of the host that you are removing.For more information about managing Red Hat Ceph Storage services, see the Administration Guide for your Red Hat Ceph Storage version:
Check the status of OSD removal:
[ceph: root@host01 /]# ceph orch osd rm statusWhen no placement groups (PG) are left on the OSD, the OSD is decommissioned and removed from the storage cluster.
Check if all the daemons are removed from the storage cluster:
[ceph: root@host01 /]# ceph orch ps <hostname>-
Replace
<hostname>with the name of the host that you are removing.
-
Replace
Remove the host:
[ceph: root@host01 /]# ceph orch host rm <hostname>-
Replace
<hostname>with the name of the host that you are removing.
7.4. Removing a Compute node from the data plane
You can remove a Compute node from a node set on the data plane. If you remove all the nodes from a node set, then you must also remove the node set from the data plane.
Prerequisites
- You are logged in to the RHOCP cluster as a user with cluster-admin privileges.
- The workloads on the Compute nodes have been migrated to other Compute nodes.
Procedure
Access the remote shell for the
openstackclientpod:$ oc rsh -n openstack openstackclientRetrieve the IP address of the Compute node that you want to remove:
$ openstack hypervisor listRetrieve a list of your Compute nodes to identify the name and UUID of the node that you want to remove:
$ openstack compute service listDisable the
nova-computeservice on the Compute node to be removed:$ openstack compute service set <hostname> nova-compute --disableTipUse the
--disable-reasonoption to add a short explanation on why the service is being disabled. This is useful if you intend to redeploy the Compute service.Exit the
OpenStackClientpod:$ exitSSH into the Compute node to be removed and stop the
ovnandnova-computecontainers:$ ssh -i <key_file_name> cloud-admin@<node_IP_address> [cloud-admin@<hostname> ~]$ sudo systemctl stop edpm_ovn_controller [cloud-admin@<hostname> ~]$ sudo systemctl stop edpm_ovn_metadata_agent [cloud-admin@<hostname> ~]$ sudo systemctl stop edpm_nova_compute-
Replace
<key_file_name>with the name and location of the SSH key pair file you created to enable Ansible to manage the RHEL nodes. -
Replace
<node_IP_address>with the IP address for the Compute node that you retrieved in step 2.
-
Replace
Remove the
systemdunit files that manage theovnandnova-computecontainers to prevent the agents from being automatically started and registered in the database if the removed node is rebooted:[cloud-admin@<hostname> ~]$ sudo rm -f /etc/systemd/system/edpm_ovn_controller [cloud-admin@<hostname> ~]$ sudo rm -f /etc/systemd/system/edpm_ovn_metadata_agent [cloud-admin@<hostname> ~]$ sudo rm -f /etc/systemd/system/edpm_nova_computeDisconnect from the Compute node:
$ exitAccess the remote shell for
openstackclient:$ oc rsh -n openstack openstackclientDelete the network agents for the Compute node to be removed:
$ openstack network agent list [--host <hostname>] $ openstack network agent delete <agent_id>Delete the
nova-computeservice for the Compute node to be removed:$ openstack compute service delete <node_uuid>-
Replace
<node_uuid>with the UUID of the node to be removed that you retrieved in step 3.
-
Replace
Exit the
OpenStackClientpod:$ exitRemove the node from the
OpenStackDataPlaneNodeSetCR:$ oc patch openstackdataplanenodeset/<node_set_name> --type json --patch '[{ "op": "remove", "path": "/spec/nodes/<node_name>" }]'-
Replace
<node_set_name>with the name of theOpenStackDataPlaneNodeSetCR that the node belongs to. -
Replace
<node_name>with the name of the node defined in thenodessection of theOpenStackDataPlaneNodeSetCR.
-
Replace
Create a file on your workstation to define the
OpenStackDataPlaneDeploymentCR to update the node set with the Compute node removed:apiVersion: dataplane.openstack.org/v1beta1 kind: OpenStackDataPlaneDeployment metadata: name: <node_set_deployment_name>-
Replace
<node_set_deployment_name>with the name of theOpenStackDataPlaneDeploymentCR. The name must be unique, must consist of lower case alphanumeric characters,-(hyphen) or.(period), and must start and end with an alphanumeric character.
TipGive the definition file and the
OpenStackDataPlaneDeploymentCR unique and descriptive names that indicate the purpose of the modified node set.-
Replace
Add the
OpenStackDataPlaneNodeSetCR that you removed the node from:spec: nodeSets: - <nodeSet_name>-
Save the
OpenStackDataPlaneDeploymentCR deployment file. Deploy the
OpenStackDataPlaneDeploymentCR to delete the removed nodes:$ oc create -f openstack_data_plane_deploy.yaml -n openstackYou can view the Ansible logs while the deployment executes:
$ oc get pod -l app=openstackansibleee -w $ oc logs -l app=openstackansibleee -f --max-log-requests 10If the
oc logscommand returns an error similar to the following error, increase the--max-log-requestsvalue:error: you are attempting to follow 19 log streams, but maximum allowed concurrency is 10, use --max-log-requests to increase the limitVerify that the modified
OpenStackDataPlaneNodeSetCR is deployed:$ oc get openstackdataplanedeployment -n openstack NAME NODESETS STATUS MESSAGE openstack-data-plane ["openstack-data-plane"] True Setup Complete $ oc get openstackdataplanenodeset -n openstack NAME STATUS MESSAGE openstack-data-plane True NodeSet ReadyFor information about the meaning of the returned status, see Data plane conditions and states in Deploying Red Hat OpenStack Services on OpenShift.
If the status indicates that the data plane has not been deployed, then troubleshoot the deployment. For information, see Troubleshooting the data plane creation and deployment in Deploying Red Hat OpenStack Services on OpenShift.
Chapter 8. Validating edge storage
To ensure that the deployment of the central and edge sites is working, test Image service (glance) multi-store operation and instance creation. You can use the openstackclient pod to run commands as an OpenStack administrator.
You can import images into glance that are available on the local filesystem or available on a web server.
Always store an image copy in the central site, even if there are no instances using the image at the central location.
8.1. Viewing Image service stores in DCN
Verify that all Image service (glance) stores are available to confirm that your multi-store configuration is working correctly.
Procedure
Check the stores that are available through the Image service by using the glance stores-info command. In the following example, three stores are available: az0, az1, and az2. These correspond to the glance stores at the central location and the edge sites, respectively:

$ glance stores-info
+----------+----------------------------------------------------------------------------------+
| Property | Value                                                                            |
+----------+----------------------------------------------------------------------------------+
| stores   | [{"default": "true", "id": "az0", "description": "central rbd glance            |
|          | store"}, {"id": "az1", "description": "az1 rbd glance store"},                   |
|          | {"id": "az2", "description": "az2 rbd glance store"}]                           |
+----------+----------------------------------------------------------------------------------+
8.2. Importing an image from a local file
Import an image from a local file to the central location and distribute it to edge sites. This workflow ensures that images are centrally managed and consistently available across all locations.
Procedure
Ensure that your image file is in raw format. If the image is not in raw format, you must convert the image before importing it into the Image service:
$ file cirros-0.5.1-x86_64-disk.img cirros-0.5.1-x86_64-disk.img: QEMU QCOW2 Image (v3), 117440512 bytes $ qemu-img convert -f qcow2 -O raw cirros-0.5.1-x86_64-disk.img cirros-0.5.1-x86_64-disk.rawImport the image into the default back end at the central site:
openstack image create \
  --disk-format raw --container-format bare \
  --name cirros --file cirros-0.5.1-x86_64-disk.raw \
  --store az0
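To confirm that the image landed in the intended store, you can inspect the image properties; a sketch assuming the az0 store naming used throughout this guide:

$ openstack image show cirros -c properties | grep stores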
8.3. Importing an image from a web server
Import images directly from a web server to multiple storage locations. This method streamlines image distribution across your DCN deployment by eliminating manual copying steps.
This procedure assumes that the default image conversion plugin is enabled in the Image service (glance). This feature automatically converts QCOW2 file formats into raw images, which are optimal for Ceph RBD. You can confirm that a glance image is in raw format by running the glance image-show ID | grep disk_format.
Procedure
Use the
image-create-via-importparameter of theglancecommand to import an image from a web server. Use the--storesparameter.# glance image-create-via-import \ --disk-format qcow2 \ --container-format bare \ --name cirros \ --uri http://download.cirros-cloud.net/0.4.0/cirros-0.4.0-x86_64-disk.img \ --import-method web-download \ --stores az0,az1In this example, the QCOW2 CirrOS image is downloaded from the official CirrOS site, converted to raw by glance, and imported into the central site and edge site 1 as specified by the
--storesparameter.NoteAlternatively you can replace
--storeswith--all-stores Trueto upload the image to all of the stores.
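To confirm that the conversion plugin produced a raw image, a sketch using the openstack CLI equivalent of the glance command mentioned above:

$ openstack image show cirros -c disk_format -f value

The command should print raw after the import completes.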
8.4. Copying an image to a new site
Copy images from the central location to edge sites to make them available for instance creation. This reduces instance launch time by eliminating the need to transfer images over the WAN during deployment.
Use the UUID of the glance image for the copy operation:
ID=$(openstack image show cirros -c id -f value)
glance image-import $ID --stores az1,az2 --import-method copy-image

Note
In this example, the --stores option specifies that the cirros image is copied from the central site, az0, to edge sites az1 and az2. Alternatively, you can use the --all-stores True option, which uploads the image to all the stores that do not currently have the image.

Confirm that a copy of the image is in each store. Note that the stores key, which is the last item in the properties map, is set to az0,az1,az2:

$ openstack image show $ID | grep properties
| properties | os_glance_failed_import=', os_glance_importing_to_stores=', os_hash_algo=sha512, os_hash_value=6b813aa46bb90b4da216a4d19376593fa3f4fc7e617f03a92b7fe11e9a3981cbe8f0959dbebe36225e5f53dc4492341a4863cac4ed1ee0909f3fc78ef9c3e869, os_hidden=False, stores=az0,az1,az2 |
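The copy-image import runs asynchronously. One optional way to watch its progress, based on the properties shown in the output above, is to poll the os_glance_importing_to_stores property, which lists the stores that have not yet received the image and becomes empty when the copy completes:

$ openstack image show $ID | grep os_glance_importing_to_stores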
Always store a copy of the image at the central site, even if no instances at that site use it.
8.5. Creating image-based boot volumes at edge sites
Create persistent boot volumes from images stored at edge sites to verify image-based volume functionality.
Procedure
Identify the ID of the image to create as a volume, and pass that ID to the openstack volume create command:

IMG_ID=$(openstack image show cirros -c id -f value)
openstack volume create --size 8 --availability-zone az1 pet-volume-az1 --image $IMG_ID

Identify the volume ID of the newly created volume and pass it to the openstack server create command:

VOL_ID=$(openstack volume show -f value -c id pet-volume-az1)
openstack server create --flavor tiny --key-name az1-key --network az1-network --security-group basic --availability-zone az1 --volume $VOL_ID pet-server-az1

You can verify that the volume is based on the image by running the rbd command within a ceph-mon container at the az1 edge site to list the volumes pool:

$ sudo cephadm shell -- rbd -p volumes ls -l
NAME                                          SIZE   PARENT                                             FMT  PROT  LOCK
volume-28c6fc32-047b-4306-ad2d-de2be02716b7   8 GiB  images/8083c7e7-32d8-4f7a-b1da-0ed7884f1076@snap   2          excl
$

Confirm that you can create a cinder snapshot of the root volume of the instance. Ensure that the server is stopped to quiesce data to create a clean snapshot. Use the --force option, because the volume status remains in-use when the instance is off:

openstack server stop pet-server-az1
openstack volume snapshot create pet-volume-az1-snap --volume $VOL_ID --force
openstack server start pet-server-az1

List the contents of the volumes pool on the az1 Ceph cluster to show the newly created snapshot:

$ sudo cephadm shell -- rbd -p volumes ls -l
NAME                                                                                         SIZE   PARENT                                             FMT  PROT  LOCK
volume-28c6fc32-047b-4306-ad2d-de2be02716b7                                                  8 GiB  images/8083c7e7-32d8-4f7a-b1da-0ed7884f1076@snap   2          excl
volume-28c6fc32-047b-4306-ad2d-de2be02716b7@snapshot-a1ca8602-6819-45b4-a228-b4cd3e5adf60    8 GiB  images/8083c7e7-32d8-4f7a-b1da-0ed7884f1076@snap   2    yes
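You can also confirm the snapshot from the Block Storage side. The following check is optional and assumes the same volume name used above:

$ openstack volume snapshot list --volume pet-volume-az1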
8.6. Creating and copying instance snapshots between sites
Create instance snapshots at edge sites and copy them to the central location for backup and image management. This validates your image copying workflow across distributed sites.
Verify that you can create a new image at the az1 location. Ensure that the server is stopped to quiesce data to create a clean snapshot:

NOVA_ID=$(openstack server show pet-server-az1 -f value -c id)
openstack server stop $NOVA_ID
openstack server image create --name cirros-snapshot $NOVA_ID
openstack server start $NOVA_ID

Copy the image from the az1 edge site back to the central location, which is the default back end for glance:

IMAGE_ID=$(openstack image show cirros-snapshot -f value -c id)
glance image-import $IMAGE_ID --stores az0 --import-method copy-image
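To confirm that the copy reached the central store, you can reuse the property check shown earlier in this chapter. The stores key for the snapshot image should now include az0 in addition to az1:

$ openstack image show $IMAGE_ID | grep properties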
8.7. Backing up and restoring volumes across edge sites
Back up and restore Block Storage (cinder) volumes across edge sites and the central location for data protection and disaster recovery. All backups are centrally stored and managed to provide a consistent recovery point across your distributed architecture.
Backup and restore operations directly between edge sites are not supported.
Prerequisites
- The Block Storage backup service is deployed in the central AZ. For more information, see Updating the control plane.
- Block Storage (cinder) REST API microversion 3.51 or later.
- All sites use a common openstack cephx client name.
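Before you create a backup, you can optionally verify that the cinder-backup service is up. This check is a sketch and assumes that admin credentials are loaded in your environment:

$ openstack volume service list --service cinder-backup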
Procedure
Create a backup of a volume in the first DCN site:
$ openstack --os-volume-api-version 3.51 volume backup create \
  --name <volume_backup> --availability-zone <az0> <edge_volume>

- Replace <volume_backup> with a name for the volume backup.
- Replace <az0> with the name of the central availability zone that hosts the cinder-backup service.
- Replace <edge_volume> with the name of the volume that you want to back up.

Note
If you experience issues with Ceph keyrings, you might need to restart the cinder-backup container so that the keyrings copy from the host to the container successfully.

Restore the backup to a new volume in the second DCN site:

$ openstack --os-volume-api-version 3.47 volume create \
  --backup <volume_backup> --availability-zone <az_2> <new_volume>

- Replace <az_2> with the name of the availability zone where you want to restore the backup.
- Replace <new_volume> with a name for the new volume.
- Replace <volume_backup> with the name of the volume backup that you created in the previous step.
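For example, assuming the pet-volume-az1 volume from the earlier sections, the central availability zone az0, and az2 as the restore target, the two commands might look like the following. The backup and volume names shown here are illustrative:

$ openstack --os-volume-api-version 3.51 volume backup create \
  --name pet-volume-az1-backup --availability-zone az0 pet-volume-az1
$ openstack --os-volume-api-version 3.47 volume create \
  --backup pet-volume-az1-backup --availability-zone az2 pet-volume-az1-restore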