Chapter 5. Adopting the data plane
Adopting the Red Hat OpenStack Services on OpenShift (RHOSO) data plane involves the following steps:
- Stop any remaining services on the Red Hat OpenStack Platform (RHOSP) 17.1 control plane.
- Deploy the required custom resources.
- Perform a fast-forward upgrade on Compute services from RHOSP 17.1 to RHOSO 18.0.
- Adopt Networker services to the RHOSO data plane.
After the RHOSO control plane manages the newly deployed data plane, you must not re-enable services on the RHOSP 17.1 control plane and data plane. If you re-enable services, workloads are managed by two control planes or two data planes, resulting in data corruption, loss of control of existing workloads, inability to start new workloads, or other issues.
5.1. Stopping infrastructure management and Compute services
You must stop cloud database nodes and messaging nodes on the Red Hat OpenStack Platform 17.1 control plane. Do not stop nodes that are running the following roles:
- Compute
- Storage
- Networker
- Controller, if it is running the OVN Controller Gateway agent network agent
The following procedure applies to a standalone director deployment. You must stop the Pacemaker services on your host so that you can install libvirt packages when the Compute roles are adopted as data plane nodes. Modular libvirt daemons no longer run in podman containers on data plane nodes.
Prerequisites
Define the shell variables. Replace the following example values with values that apply to your environment:
CONTROLLER1_SSH="ssh -i <path_to_SSH_key> root@<controller-1 IP>"
# ...
# ...
EDPM_PRIVATEKEY_PATH="<path_to_SSH_key>"
- CONTROLLER<X>_SSH defines the SSH connection details for all Controller nodes, including cell Controller nodes, of the source director cloud.
- <path_to_SSH_key> defines the path to your SSH key.
Procedure
Stop the Pacemaker services:
PacemakerResourcesToStop=(
  "galera-bundle"
  "haproxy-bundle"
  "rabbitmq-bundle")

echo "Stopping pacemaker services"
for i in {1..3}; do
  SSH_CMD=CONTROLLER${i}_SSH
  if [ ! -z "${!SSH_CMD}" ]; then
    echo "Using controller $i to run pacemaker commands"
    for resource in ${PacemakerResourcesToStop[*]}; do
      if ${!SSH_CMD} sudo pcs resource config $resource; then
        ${!SSH_CMD} sudo pcs resource disable $resource
      fi
    done
    break
  fi
done
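To confirm that the bundles are stopped before you continue, you can query the cluster state from the same Controller node. This is an optional check that reuses the CONTROLLER1_SSH variable defined above:

${CONTROLLER1_SSH} sudo pcs status --full | grep -E 'galera|haproxy|rabbitmq'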
5.2. Adopting Compute services to the RHOSO data plane
Adopt your Compute (nova) services to the Red Hat OpenStack Services on OpenShift (RHOSO) data plane.
Prerequisites
- You have stopped the remaining control plane nodes, repositories, and packages on the Compute service (nova) hosts. For more information, see Stopping infrastructure management and Compute services.
- If you have a Red Hat Ceph Storage environment, you have configured the Ceph back end for the NovaLibvirt service. For more information, see Configuring a Ceph back end.
- You have configured IP Address Management (IPAM):
$ oc apply -f - <<EOF
apiVersion: network.openstack.org/v1beta1
kind: NetConfig
metadata:
  name: netconfig
spec:
  networks:
  - name: ctlplane
    dnsDomain: ctlplane.example.com
    subnets:
    - name: subnet1
      allocationRanges:
      - end: 192.168.122.120
        start: 192.168.122.100
      - end: 192.168.122.200
        start: 192.168.122.150
      cidr: 192.168.122.0/24
      gateway: 192.168.122.1
  - name: internalapi
    dnsDomain: internalapi.example.com
    subnets:
    - name: subnet1
      allocationRanges:
      - end: 172.17.0.250
        start: 172.17.0.100
      cidr: 172.17.0.0/24
      vlan: 20
  - name: External
    dnsDomain: external.example.com
    subnets:
    - name: subnet1
      allocationRanges:
      - end: 10.0.0.250
        start: 10.0.0.100
      cidr: 10.0.0.0/24
      gateway: 10.0.0.1
  - name: storage
    dnsDomain: storage.example.com
    subnets:
    - name: subnet1
      allocationRanges:
      - end: 172.18.0.250
        start: 172.18.0.100
      cidr: 172.18.0.0/24
      vlan: 21
  - name: storagemgmt
    dnsDomain: storagemgmt.example.com
    subnets:
    - name: subnet1
      allocationRanges:
      - end: 172.20.0.250
        start: 172.20.0.100
      cidr: 172.20.0.0/24
      vlan: 23
  - name: tenant
    dnsDomain: tenant.example.com
    subnets:
    - name: subnet1
      allocationRanges:
      - end: 172.19.0.250
        start: 172.19.0.100
      cidr: 172.19.0.0/24
      vlan: 22
EOF
- If neutron-sriov-nic-agent is running on your Compute service nodes, ensure that the physical device mappings match the values that are defined in the OpenStackDataPlaneNodeSet custom resource (CR). For more information, see Pulling the configuration from a director deployment.
- To prevent workload shutdown, you have created the tripleo_nova_libvirt_guests_service_cleanup.yaml playbook:

  - become: true
    hosts: all
    strategy: tripleo_free
    name: disable and clean tripleo_nova_libvirt_guests
    tasks:
      - name: tripleo_nova_libvirt_guests removal
        become: true
        shell: |
          set -o pipefail
          systemctl disable tripleo_nova_libvirt_guests.service
          rm -f /etc/systemd/system/tripleo_nova_libvirt_guests.service
          rm -f /etc/systemd/system/virt-guest-shutdown.target
          systemctl daemon-reload

- You have used the following command to run the playbook:

  $ ansible-playbook -i overcloud-deploy/overcloud/tripleo-ansible-inventory.yaml tripleo_nova_libvirt_guests_service_cleanup.yaml

- You have defined the shell variables to run the script that runs the upgrade:
$ CEPH_FSID=$(oc get secret ceph-conf-files -o json | jq -r .data."ceph.conf" | base64 -d | grep fsid | sed -e 's/fsid = //')
$ alias openstack="oc exec -t openstackclient -- openstack"
$ DEFAULT_CELL_NAME="cell3"
$ RENAMED_CELLS="cell1 cell2 $DEFAULT_CELL_NAME"

$ declare -A COMPUTES_CELL1
$ export COMPUTES_CELL1=(
>   ["standalone.localdomain"]="192.168.122.100"
>   # <compute1>
>   # <compute2>
>   # <compute3>
> )
$ declare -A COMPUTES_CELL2
$ export COMPUTES_CELL2=(
>   # <compute1>
> )
$ declare -A COMPUTES_CELL3
$ export COMPUTES_CELL3=(
>   # <compute1>
>   # <compute2>
> )
$ declare -A COMPUTES_API_CELL1
$ export COMPUTES_API_CELL1=(
>   ["standalone.localdomain"]="172.17.0.100"
>   ["standalone2.localdomain"]="172.17.0.101"
> )

$ NODESETS=""
$ for CELL in $(echo $RENAMED_CELLS); do
>   ref="COMPUTES_$(echo ${CELL}|tr '[:lower:]' '[:upper:]')"
>   eval names=\${!${ref}[@]}
>   [ -z "$names" ] && continue
>   NODESETS="'openstack-${CELL}', $NODESETS"
> done
$ NODESETS="[${NODESETS%,*}]"
- DEFAULT_CELL_NAME="cell3" defines the source cloud default cell, which acquires a new DEFAULT_CELL_NAME on the destination cloud after adoption. In a multi-cell adoption scenario, you can retain the original name, default, or create a new cell default name by providing the incremented index of the last cell in the source cloud. For example, if the incremented index of the last cell is cell5, the new cell default name is cell6.
- export COMPUTES_CELL1=( : For each cell, update the ["standalone.localdomain"]="x.x.x.x" value and the COMPUTES_CELL<X> value with the names and IP addresses of the Compute service nodes that are connected to the ctlplane and internalapi networks. Do not specify a real FQDN that is defined for each network. Always use the same hostname for each connected network of a Compute node. Provide the IP addresses and the names of the hosts on the remaining networks of the source cloud as needed, or manually adjust the files that you generate in step 9 of this procedure.
- <compute1>, <compute2>, and <compute3> specify the names of your Compute service nodes for each cell. Assign all Compute service nodes from the source cloud cell1 cell into COMPUTES_CELL1, and so on.
- export COMPUTES_CELL<X>=( specifies all Compute service nodes that you assign from the source cloud default cell into COMPUTES_CELL<X> and COMPUTES_API_CELL<X>, where <X> is the DEFAULT_CELL_NAME environment variable value. In this example, the DEFAULT_CELL_NAME environment variable value equals cell3.
- export COMPUTES_API_CELL1=( : For each cell, update the ["standalone.localdomain"]="192.168.122.100" value and the COMPUTES_API_CELL<X> value with the names and IP addresses of the Compute service nodes that are connected to the ctlplane and internalapi networks. ["standalone.localdomain"]="192.168.122.100" defines the custom DNS domain in the FQDN value of the nodes. This value is used in the data plane node set spec.nodes.<NODE NAME>.hostName. Do not specify a real FQDN that is defined for each network. Use the same hostname for each of its connected networks. Provide the IP addresses and the names of the hosts on the remaining networks of the source cloud as needed, or manually adjust the files that you generate in step 9 of this procedure.
- NODESETS="'openstack-${CELL}', $NODESETS" specifies the cells that contain Compute nodes. Cells that do not contain Compute nodes are omitted from this template because no node sets are created for them.

  Note: If you deployed the source cloud with a default cell, and you want to rename it during adoption, define the new name that you want to use, as shown in the following example:

  $ DEFAULT_CELL_NAME="cell1"
  $ RENAMED_CELLS="cell1"
- Do not set a value for the CEPH_FSID parameter if the Compute service is configured to use the local storage back end for libvirt. The storage back end must match the source cloud storage back end; you cannot change the storage back end during adoption.
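If you are unsure which storage back end the source cloud uses, one way to check is to inspect the libvirt section of the Compute service configuration on a source Compute node. The path below is typical for a director-deployed 17.1 node and might differ in your environment; images_type=rbd indicates a Red Hat Ceph Storage back end, while a missing or local value means you must not set CEPH_FSID:

$ ssh -i <path_to_SSH_key> root@<compute-1 IP> \
    "grep -E '^(images_type|images_rbd_pool)' \
     /var/lib/config-data/puppet-generated/nova_libvirt/etc/nova/nova.conf"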
Procedure
Create an SSH authentication secret for the data plane nodes:
$ oc apply -f - <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: dataplane-adoption-secret
data:
  ssh-privatekey: |
$(cat <path_to_SSH_key> | base64 | sed 's/^/    /')
EOF

Replace <path_to_SSH_key> with the path to your SSH key.

For more information about creating data plane secrets, see Creating the data plane secrets in Deploying Red Hat OpenStack Services on OpenShift.
Generate an SSH key pair nova-migration-ssh-key secret:

$ cd "$(mktemp -d)"
$ ssh-keygen -f ./id -t ecdsa-sha2-nistp521 -N ''
$ oc get secret nova-migration-ssh-key || oc create secret generic nova-migration-ssh-key \
  --from-file=ssh-privatekey=id \
  --from-file=ssh-publickey=id.pub \
  --type kubernetes.io/ssh-auth
$ rm -f id*
$ cd -

If TLS Everywhere is enabled, set LIBVIRT_PASSWORD to match the existing RHOSP deployment password:

declare -A TRIPLEO_PASSWORDS
TRIPLEO_PASSWORDS[default]="$HOME/overcloud-passwords.yaml"
LIBVIRT_PASSWORD=$(cat ${TRIPLEO_PASSWORDS[default]} | grep ' LibvirtTLSPassword:' | awk -F ': ' '{ print $2; }')
LIBVIRT_PASSWORD_BASE64=$(echo -n "$LIBVIRT_PASSWORD" | base64)

Create the libvirt-secret secret when TLS Everywhere is enabled:

$ oc apply -f - <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: libvirt-secret
type: Opaque
data:
  LibvirtPassword: ${LIBVIRT_PASSWORD_BASE64}
EOF
Create a configuration map to use for all cells to configure a local storage back end for libvirt:
$ oc apply -f - <<EOF
apiVersion: v1
kind: ConfigMap
metadata:
  name: nova-cells-global-config
data:
  99-nova-compute-cells-workarounds.conf: |
    [workarounds]
    disable_compute_service_check_for_ffu=true
EOF

- data provides the configuration files for all the cells.
- 99-nova-compute-cells-workarounds.conf: | specifies the index of the <*.conf> files. You must index the <*.conf> files from 03 to 99, based on precedence. A <99-*.conf> file takes the highest precedence, while indexes below 03 are reserved for internal use.

Note: If you adopt a live cloud, you might be required to carry over additional configurations for the default nova data plane services that are stored in the cell1 default nova-extra-config configuration map. Do not delete or overwrite the existing configuration in the cell1 default nova-extra-config configuration map that is assigned to nova. Overwriting the configuration can break the data plane services that rely on specific contents of the nova-extra-config configuration map.
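Before you change any cell configuration, you can review what the default cell configuration map already contains so that you do not overwrite it. This is a read-only check and assumes the default openstack project:

$ oc get configmap nova-extra-config -n openstack -o yaml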
Configure a Red Hat Ceph Storage back end for libvirt:
$ oc apply -f - <<EOF
apiVersion: v1
kind: ConfigMap
metadata:
  name: nova-cells-global-config
data:
  99-nova-compute-cells-workarounds.conf: |
    [workarounds]
    disable_compute_service_check_for_ffu=true
  03-ceph-nova.conf: |
    [libvirt]
    images_type=rbd
    images_rbd_pool=vms
    images_rbd_ceph_conf=/etc/ceph/ceph.conf
    images_rbd_glance_store_name=default_backend
    images_rbd_glance_copy_poll_interval=15
    images_rbd_glance_copy_timeout=600
    rbd_user=openstack
    rbd_secret_uuid=$CEPH_FSID
EOF

Note: For Red Hat Ceph Storage environments with multi-cell configurations, you must name configuration maps and Red Hat OpenStack Platform data plane services similar to the following examples: nova-custom-ceph-cellX and nova-compute-extraconfig-cellX.

Note: For Distributed Compute Node (DCN) deployments, do not use the single nova-cells-global-config ConfigMap. Create a per-site ConfigMap and a per-site OpenStackDataPlaneService for each site in your DCN deployment. Each site's Compute service nodes require a different Red Hat Ceph Storage configuration and a different Image service endpoint. For more information, see Adopting Compute services with multiple Ceph back ends (DCN).

Create the data plane services for Compute service cells to enable pre-upgrade workarounds, and to configure the Compute services for your chosen storage back end:
for CELL in $(echo $RENAMED_CELLS); do
  oc apply -f - <<EOF
apiVersion: dataplane.openstack.org/v1beta1
kind: OpenStackDataPlaneService
metadata:
  name: nova-$CELL
spec:
  dataSources:
    - secretRef:
        name: nova-$CELL-compute-config
    - secretRef:
        name: nova-migration-ssh-key
    - configMapRef:
        name: nova-cells-global-config
  playbook: osp.edpm.nova
  caCerts: combined-ca-bundle
  edpmServiceType: nova
  containerImageFields:
    - NovaComputeImage
    - EdpmIscsidImage
EOF
done
- spec.dataSources.secretRef specifies an additional auto-generated nova-cell<X>-metadata-neutron-config secret to enable a local metadata service for cell<X>. You should also set spec.nova.template.cellTemplates.cell<X>.metadataServiceTemplate.enable in the OpenStackControlPlane/openstack CR, as described in Adopting the Compute service. You can configure a single top-level metadata service, or define the metadata service per cell.
- nova-$CELL-compute-config specifies the secret that is auto-generated for each cell<X>. You must append the nova-cell<X>-compute-config secret for each custom OpenStackDataPlaneService CR that is related to the Compute service.
- nova-migration-ssh-key specifies the secret that you must append for each custom OpenStackDataPlaneService CR that is related to the Compute service.

Note: When you create your data plane services for Compute service cells, review the following considerations:
- In this example, the same nova-migration-ssh-key key is shared across cells. However, you should use different keys for different cells.
- For simple configuration overrides, you do not need a custom data plane service. However, to reconfigure a cell, for example cell1, the safest option is to create a custom service and a dedicated configuration map for it.
- The cell cell1 is already managed with the default OpenStackDataPlaneService CR called nova and its nova-extra-config configuration map. Do not change the default nova data plane service definition. The changes are lost when the RHOSO operator is updated with OLM.
- When a cell spans multiple node sets, give the custom OpenStackDataPlaneService resources a name that relates to the node set, for example, nova-cell1-nfv and nova-cell1-enterprise. The auto-generated configuration maps are then named nova-cell1-nfv-extra-config and nova-cell1-enterprise-extra-config.
- Different configurations for nodes in multiple node sets of the same cell are also supported, but are not covered in this guide.
If TLS Everywhere is enabled, append the following content to the OpenStackDataPlaneService CR:

  tlsCerts:
    nova:
      contents:
        - dnsnames
        - ips
      networks:
        - ctlplane
      issuer: osp-rootca-issuer-internal
      edpmRoleServiceName: nova
  caCerts: combined-ca-bundle
  edpmServiceType: nova

Create a secret for the subscription manager:
$ oc create secret generic subscription-manager \
    --from-literal rhc_auth='{"login": {"username": "<subscription_manager_username>", "password": "<subscription_manager_password>"}}'
- Replace <subscription_manager_username> with the applicable username.
- Replace <subscription_manager_password> with the applicable password.
Create a secret for the Red Hat registry:
$ oc create secret generic redhat-registry \
    --from-literal edpm_container_registry_logins='{"registry.redhat.io": {"<registry_username>": "<registry_password>"}}'
- Replace <registry_username> with the applicable username.
- Replace <registry_password> with the applicable password.

Note: You do not need to reference the subscription-manager secret in the dataSources field of the OpenStackDataPlaneService CR. The secret is already passed in with a node-specific OpenStackDataPlaneNodeSet CR in the ansibleVarsFrom property in the nodeTemplate field.
Create the data plane node set definitions for each cell:
$ declare -A names $ for CELL in $(echo $RENAMED_CELLS); do ref="COMPUTES_$(echo ${CELL}|tr [:lower:] [:upper:])" eval names=\${!${ref}[@]} ref_api="COMPUTES_API_$(echo ${CELL}|tr [:lower:] [:upper:])" [ -z "$names" ] && continue ind=0 rm -f computes-$CELL for compute in $names; do ip="${ref}[$compute]" ip_api="${ref_api}[$compute]" cat >> computes-$CELL << EOF ${compute}: hostName: $compute ansible: ansibleHost: $compute networks: - defaultRoute: true fixedIP: ${!ip} name: ctlplane subnetName: subnet1 - name: internalapi subnetName: subnet1 fixedIP: ${!ip_api} - name: storage subnetName: subnet1 - name: tenant subnetName: subnet1 EOF ind=$(( ind + 1 )) done test -f computes-$CELL || continue cat > nodeset-${CELL}.yaml <<EOF apiVersion: dataplane.openstack.org/v1beta1 kind: OpenStackDataPlaneNodeSet metadata: name: openstack-$CELL spec: tlsEnabled: false networkAttachments: - ctlplane preProvisioned: true services: - redhat - bootstrap - download-cache - configure-network - validate-network - install-os - configure-os - ssh-known-hosts - run-os - reboot-os - install-certs - ovn - neutron-metadata - libvirt - nova-$CELL - telemetry env: - name: ANSIBLE_CALLBACKS_ENABLED value: "profile_tasks" - name: ANSIBLE_FORCE_COLOR value: "True" - name: ANSIBLE_VERBOSITY value: "3" nodeTemplate: ansibleSSHPrivateKeySecret: dataplane-adoption-secret ansible: ansibleUser: root ansibleVarsFrom: - secretRef: name: subscription-manager - secretRef: name: redhat-registry ansibleVars: rhc_release: 9.2 rhc_repositories: - {name: "*", state: disabled} - {name: "rhel-9-for-x86_64-baseos-eus-rpms", state: enabled} - {name: "rhel-9-for-x86_64-appstream-eus-rpms", state: enabled} - {name: "rhel-9-for-x86_64-highavailability-eus-rpms", state: enabled} - {name: "rhoso-18.0-for-rhel-9-x86_64-rpms", state: enabled} - {name: "fast-datapath-for-rhel-9-x86_64-rpms", state: enabled} - {name: "rhceph-7-tools-for-rhel-9-x86_64-rpms", state: enabled} edpm_bootstrap_release_version_package: [] # edpm_network_config # Default nic config template for a EDPM node # These vars are edpm_network_config role vars edpm_network_config_template: | --- {% set mtu_list = [ctlplane_mtu] %} {% for network in nodeset_networks %} {% set _ = mtu_list.append(lookup(vars, networks_lower[network] ~ _mtu)) %} {%- endfor %} {% set min_viable_mtu = mtu_list | max %} network_config: - type: ovs_bridge name: {{ neutron_physical_bridge_name }} mtu: {{ min_viable_mtu }} use_dhcp: false dns_servers: {{ ctlplane_dns_nameservers }} domain: {{ dns_search_domains }} addresses: - ip_netmask: {{ ctlplane_ip }}/{{ ctlplane_cidr }} routes: {{ ctlplane_host_routes }} members: - type: interface name: nic1 mtu: {{ min_viable_mtu }} # force the MAC address of the bridge to this interface primary: true {% for network in nodeset_networks %} - type: vlan mtu: {{ lookup(vars, networks_lower[network] ~ _mtu) }} vlan_id: {{ lookup(vars, networks_lower[network] ~ _vlan_id) }} addresses: - ip_netmask: {{ lookup(vars, networks_lower[network] ~ _ip) }}/{{ lookup(vars, networks_lower[network] ~ _cidr) }} routes: {{ lookup(vars, networks_lower[network] ~ _host_routes) }} {% endfor %} edpm_network_config_nmstate: false # Control resolv.conf management by NetworkManager # false = disable NetworkManager resolv.conf update (default) # true = enable NetworkManager resolv.conf update edpm_bootstrap_network_resolvconf_update: false edpm_network_config_hide_sensitive_logs: false # # These vars are for the network config templates themselves and are # considered EDPM network defaults. 
neutron_physical_bridge_name: br-ctlplane neutron_public_interface_name: eth0 # edpm_nodes_validation edpm_nodes_validation_validate_controllers_icmp: false edpm_nodes_validation_validate_gateway_icmp: false # edpm ovn-controller configuration edpm_ovn_bridge_mappings: [<"bridge_mappings">] edpm_ovn_bridge: br-int edpm_ovn_encap_type: geneve ovn_monitor_all: true edpm_ovn_remote_probe_interval: 60000 edpm_ovn_ofctrl_wait_before_clear: 8000 timesync_ntp_servers: - hostname: clock.redhat.com - hostname: clock2.redhat.com edpm_bootstrap_command: | set -euxo pipefail dnf -y upgrade openstack-selinux rm -f /run/virtlogd.pid gather_facts: false # edpm firewall, change the allowed CIDR if needed edpm_sshd_configure_firewall: true edpm_sshd_allowed_ranges: [192.168.122.0/24] # Do not attempt OVS major upgrades here edpm_ovs_packages: - openvswitch3.3 edpm_default_mounts: - path: /dev/hugepages<size> opts: pagesize=<size> fstype: hugetlbfs group: hugetlbfs nodes: EOF cat computes-$CELL >> nodeset-${CELL}.yaml done-
- ${compute}.hostName specifies the FQDN for the node if your deployment has a custom DNS domain.
- ${compute}.networks specifies the network composition. The network composition must match the source cloud configuration to avoid data plane connectivity downtime. The ctlplane network must come first. The commands only retain IP addresses for the hosts on the ctlplane and internalapi networks. Repeat this step for other isolated networks, or update the resulting files manually.
- metadata.name specifies the node set names for each cell, for example, openstack-cell1, openstack-cell2. Only create node sets for cells that contain Compute nodes.
- spec.tlsEnabled specifies whether TLS Everywhere is enabled. If it is enabled, change tlsEnabled to true.
- spec.services specifies the services to be adopted. If you are not adopting telemetry services, omit telemetry from the services list.
- neutron_physical_bridge_name: br-ctlplane specifies the bridge name. The bridge name and other OVN and Networking service-specific values must match the source cloud configuration to avoid data plane connectivity downtime.
- edpm_ovn_bridge_mappings: Replace [<"bridge_mappings">] with the value of the bridge mappings in your configuration, for example, ["datacentre:br-ctlplane"].
- path: /dev/hugepages<size> and opts: pagesize=<size> configure huge pages. Replace <size> with the size of the page. To configure multi-sized huge pages, create more items in the list. The mount points must match the source cloud configuration.

Note: Ensure that you use the same ovn-controller settings in the OpenStackDataPlaneNodeSet CR that you used in the Compute service nodes before adoption. This configuration is stored in the external_ids column in the Open_vSwitch table in the Open vSwitch database:

$ ovs-vsctl list Open .
...
external_ids : {hostname=standalone.localdomain, ovn-bridge=br-int, ovn-bridge-mappings=<bridge_mappings>, ovn-chassis-mac-mappings="datacentre:1e:0a:bb:e6:7c:ad", ovn-encap-ip="172.19.0.100", ovn-encap-tos="0", ovn-encap-type=geneve, ovn-match-northd-version=False, ovn-monitor-all=True, ovn-ofctrl-wait-before-clear="8000", ovn-openflow-probe-interval="60", ovn-remote="tcp:ovsdbserver-sb.openstack.svc:6642", ovn-remote-probe-interval="60000", rundir="/var/run/openvswitch", system-id="2eec68e6-aa21-4c95-a868-31aeafc11736"}
...
Deploy the OpenStackDataPlaneNodeSet CRs for each Compute cell:

for CELL in $(echo $RENAMED_CELLS); do
  test -f nodeset-${CELL}.yaml || continue
  oc apply -f nodeset-${CELL}.yaml
done

If you use a Red Hat Ceph Storage back end for the Block Storage service (cinder), prepare the adopted data plane workloads:
for CELL in $(echo $RENAMED_CELLS); do
  test -f nodeset-${CELL}.yaml || continue
  oc patch osdpns/openstack-$CELL --type=merge --patch "
spec:
  services:
    - redhat
    - bootstrap
    - download-cache
    - configure-network
    - validate-network
    - install-os
    - configure-os
    - ssh-known-hosts
    - run-os
    - reboot-os
    - ceph-client
    - install-certs
    - ovn
    - neutron-metadata
    - libvirt
    - nova-$CELL
    - telemetry
  nodeTemplate:
    extraMounts:
      - extraVolType: Ceph
        volumes:
          - name: ceph
            secret:
              secretName: ceph-conf-files
        mounts:
          - name: ceph
            mountPath: "/etc/ceph"
            readOnly: true
"
done

Optional: Enable neutron-sriov-nic-agent in the OpenStackDataPlaneNodeSet CR:

for CELL in $(echo $RENAMED_CELLS); do
  test -f nodeset-${CELL}.yaml || continue
  oc patch openstackdataplanenodeset openstack-$CELL --type='json' --patch='[
    {"op": "add", "path": "/spec/services/-", "value": "neutron-sriov"},
    {"op": "add", "path": "/spec/nodeTemplate/ansible/ansibleVars/edpm_neutron_sriov_agent_SRIOV_NIC_physical_device_mappings", "value": "dummy_sriov_net:dummy-dev"},
    {"op": "add", "path": "/spec/nodeTemplate/ansible/ansibleVars/edpm_neutron_sriov_agent_SRIOV_NIC_resource_provider_bandwidths", "value": "dummy-dev:40000000:40000000"},
    {"op": "add", "path": "/spec/nodeTemplate/ansible/ansibleVars/edpm_neutron_sriov_agent_SRIOV_NIC_resource_provider_hypervisors", "value": "dummy-dev:standalone.localdomain"}]'
done

Optional: Enable neutron-dhcp in the OpenStackDataPlaneNodeSet CR:

for CELL in $(echo $RENAMED_CELLS); do
  test -f nodeset-${CELL}.yaml || continue
  oc patch openstackdataplanenodeset openstack-$CELL --type='json' --patch='[
    {"op": "add", "path": "/spec/services/-", "value": "neutron-dhcp"}]'
done

Note: To use neutron-dhcp with OVN for the Bare Metal Provisioning service (ironic), you must set the disable_ovn_dhcp_for_baremetal_ports configuration option for the Networking service (neutron) to true. You can set this configuration in the NeutronAPI spec:

..
spec:
  serviceUser: neutron
  ...
  customServiceConfig: |
    [DEFAULT]
    dhcp_agent_notification = True
    [ovn]
    disable_ovn_dhcp_for_baremetal_ports = true

Run the pre-adoption validation:
Create the validation service:
$ oc apply -f - <<EOF
apiVersion: dataplane.openstack.org/v1beta1
kind: OpenStackDataPlaneService
metadata:
  name: pre-adoption-validation
spec:
  playbook: osp.edpm.pre_adoption_validation
EOF

Create an OpenStackDataPlaneDeployment CR that runs only the validation:

$ oc apply -f - <<EOF
apiVersion: dataplane.openstack.org/v1beta1
kind: OpenStackDataPlaneDeployment
metadata:
  name: openstack-pre-adoption
spec:
  nodeSets: $NODESETS
  servicesOverride:
    - pre-adoption-validation
EOF

Note: If you created different migration SSH keys for different OpenStackDataPlaneService CRs, you should also define a separate OpenStackDataPlaneDeployment CR for each node set or group of node sets that represents a cell.
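For example, a minimal per-cell validation deployment might look like the following sketch. The node set name openstack-cell1 is illustrative and must match one of the node sets that you created earlier:

$ oc apply -f - <<EOF
apiVersion: dataplane.openstack.org/v1beta1
kind: OpenStackDataPlaneDeployment
metadata:
  name: openstack-pre-adoption-cell1
spec:
  nodeSets:
    - openstack-cell1
  servicesOverride:
    - pre-adoption-validation
EOF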
When the validation is finished, confirm that the status of the Ansible EE pods is Completed:

$ watch oc get pod -l app=openstackansibleee
$ oc logs -l app=openstackansibleee -f --max-log-requests 20

Wait for the deployment to reach the Ready status:

$ oc wait --for condition=Ready openstackdataplanedeployment/openstack-pre-adoption --timeout=10m

Important: If any openstack-pre-adoption validations fail, check the Ansible logs to determine which validations were unsuccessful, and then try the following troubleshooting options:
- If the hostname validation failed, check that the hostname of the data plane node is correctly listed in the OpenStackDataPlaneNodeSet CR.
- If the kernel argument check failed, ensure that the kernel argument configuration in the edpm_kernel_args and edpm_kernel_hugepages variables in the OpenStackDataPlaneNodeSet CR is the same as the kernel argument configuration that you used in the Red Hat OpenStack Platform (RHOSP) 17.1 node.
- If the tuned profile check failed, ensure that the edpm_tuned_profile variable in the OpenStackDataPlaneNodeSet CR is configured to use the same profile as the one set on the RHOSP 17.1 node.
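To compare the expected values against the source node, you can gather the current kernel arguments and tuned profile directly on the RHOSP 17.1 host. These are standard RHEL commands and are shown here only as a convenience:

$ cat /proc/cmdline
$ tuned-adm active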
Remove the remaining director services:
Create an OpenStackDataPlaneService CR to clean up the data plane services that you are adopting:

$ oc apply -f - <<EOF
apiVersion: dataplane.openstack.org/v1beta1
kind: OpenStackDataPlaneService
metadata:
  name: tripleo-cleanup
spec:
  playbook: osp.edpm.tripleo_cleanup
EOF

Create the OpenStackDataPlaneDeployment CR to run the clean-up:

$ oc apply -f - <<EOF
apiVersion: dataplane.openstack.org/v1beta1
kind: OpenStackDataPlaneDeployment
metadata:
  name: tripleo-cleanup
spec:
  nodeSets: $NODESETS
  servicesOverride:
    - tripleo-cleanup
EOF
When the clean-up is finished, deploy the OpenStackDataPlaneDeployment CR:

$ oc apply -f - <<EOF
apiVersion: dataplane.openstack.org/v1beta1
kind: OpenStackDataPlaneDeployment
metadata:
  name: openstack
spec:
  nodeSets: $NODESETS
EOF

Note: If you have other node sets to deploy, such as Networker nodes, you can add them to the nodeSets list in this step, or create separate OpenStackDataPlaneDeployment CRs later. You cannot add new node sets to an OpenStackDataPlaneDeployment CR after deployment.
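If you add a node set later, create a new OpenStackDataPlaneDeployment for it instead of editing the existing one. The following sketch assumes a hypothetical Networker node set named openstack-networker:

$ oc apply -f - <<EOF
apiVersion: dataplane.openstack.org/v1beta1
kind: OpenStackDataPlaneDeployment
metadata:
  name: openstack-networker
spec:
  nodeSets:
    - openstack-networker
EOF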
Verification
Confirm that all the Ansible EE pods reach a Completed status:

$ watch oc get pod -l app=openstackansibleee
$ oc logs -l app=openstackansibleee -f --max-log-requests 20

Wait for the data plane node sets to reach the Ready status:

for CELL in $(echo $RENAMED_CELLS); do
  oc wait --for condition=Ready osdpns/openstack-$CELL --timeout=30m
done

Verify that the Networking service (neutron) agents are running:
$ oc exec openstackclient -- openstack network agent list
+--------------------------------------+----------------------+------------------------+-------------------+-------+-------+----------------------------+
| ID                                   | Agent Type           | Host                   | Availability Zone | Alive | State | Binary                     |
+--------------------------------------+----------------------+------------------------+-------------------+-------+-------+----------------------------+
| 174fc099-5cc9-4348-b8fc-59ed44fcfb0e | DHCP agent           | standalone.localdomain | nova              | :-)   | UP    | neutron-dhcp-agent         |
| 10482583-2130-5b0d-958f-3430da21b929 | OVN Metadata agent   | standalone.localdomain |                   | :-)   | UP    | neutron-ovn-metadata-agent |
| a4f1b584-16f1-4937-b2b0-28102a3f6eaa | OVN Controller agent | standalone.localdomain |                   | :-)   | UP    | ovn-controller             |
+--------------------------------------+----------------------+------------------------+-------------------+-------+-------+----------------------------+
After you remove all the services from the director cell controllers, you can decommission the cell controllers. To create new cell Compute nodes, re-provision the decommissioned controllers as new data plane hosts and add them to the node sets of corresponding or new cells.
Next steps
- You must perform a fast-forward upgrade on your Compute services. For more information, see Performing a fast-forward upgrade on Compute services.
5.3. Configuring data plane node sets for DCN sites
If you are adopting a Distributed Compute Node (DCN) deployment, you must create separate OpenStackDataPlaneNodeSet custom resources (CRs) for each site. Each node set requires site-specific configuration for network subnets, OVN bridge mappings, and inter-site routes.
Prerequisites
- You have adopted the Red Hat OpenStack Platform (RHOSP) control plane to Red Hat OpenStack Services on OpenShift (RHOSO).
- You have configured control plane networking for your spine-leaf topology, including multi-subnet NetConfig and NetworkAttachmentDefinition CRs with routes to remote sites. For more information, see Configuring control plane networking for spine-leaf topologies.
- You have the network configuration information for each DCN site:
- IP addresses and hostnames for all Compute nodes
- VLAN IDs for each service network
- Gateway addresses for inter-site routing
- You have identified the OVN bridge mappings (physnets) for each site.
Procedure
Define the OVN bridge mappings for each site. Each site requires a unique physnet that maps to the local provider network bridge:
Table 5.1. Example OVN bridge mappings

  Site     OVN bridge mapping
  Central  leaf0:br-ex
  DCN1     leaf1:br-ex
  DCN2     leaf2:br-ex

Configure OVN for DCN sites. The default OVN controller configuration uses the Kubernetes ClusterIP (ovsdbserver-sb.openstack.svc), which is not routable from remote DCN sites. You must create a DCN-specific configuration that uses direct internalapi IP addresses.

Get the OVN Southbound database internalapi IP addresses:

$ oc get pod -l service=ovsdbserver-sb -o jsonpath='{range .items[*]}{.metadata.annotations.k8s\.v1\.cni\.cncf\.io/network-status}{"\n"}{end}' | jq -r '.[] | select(.name=="openstack/internalapi") | .ips[0]'

Example output:

172.17.0.34
172.17.0.35
172.17.0.36

Create a ConfigMap with the OVN SB direct IPs for DCN sites:

$ oc apply -f - <<EOF
apiVersion: v1
kind: ConfigMap
metadata:
  name: ovncontroller-config-dcn
  namespace: openstack
data:
  ovsdb-config: |
    ovn-remote: tcp:172.17.0.34:6642,tcp:172.17.0.35:6642,tcp:172.17.0.36:6642
EOF

Replace the IP addresses with the actual internalapi IPs from the previous step.
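Optionally, before you deploy the edge node sets, you can confirm from an edge node that the OVN Southbound database IPs and port are reachable over the routed internalapi network. This sketch assumes the dcn1-compute-0 host name used elsewhere in this chapter and one of the IP addresses from the previous step:

$ ssh dcn1-compute-0 "nc -zv 172.17.0.34 6642"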
Create an OpenStackDataPlaneService CR for the DCN OVN configuration:

$ oc apply -f - <<EOF
apiVersion: dataplane.openstack.org/v1beta1
kind: OpenStackDataPlaneService
metadata:
  name: ovn-dcn
  namespace: openstack
spec:
  addCertMounts: false
  caCerts: combined-ca-bundle
  containerImageFields:
    - OvnControllerImage
  dataSources:
    - configMapRef:
        name: ovncontroller-config-dcn
  edpmServiceType: ovn
  playbook: osp.edpm.ovn
  tlsCerts:
    default:
      contents:
        - dnsnames
        - ips
      issuer: osp-rootca-issuer-ovn
      keyUsages:
        - digital signature
        - key encipherment
        - server auth
        - client auth
      networks:
        - ctlplane
EOF

Note: The ovn-dcn service uses the ovncontroller-config-dcn ConfigMap (through dataSources), which contains the direct internalapi IPs instead of the ClusterIP. DCN node sets must use this service instead of the default ovn service.
Create an
OpenStackDataPlaneNodeSetCR for the central site Compute nodes:apiVersion: dataplane.openstack.org/v1beta1 kind: OpenStackDataPlaneNodeSet metadata: name: openstack-edpm spec: tlsEnabled: false networkAttachments: - ctlplane preProvisioned: true services: - redhat - bootstrap - download-cache - configure-network - validate-network - install-os - configure-os - ssh-known-hosts - run-os - reboot-os - install-certs - ovn - neutron-metadata - libvirt - nova-cell1 - telemetry nodeTemplate: ansibleSSHPrivateKeySecret: dataplane-adoption-secret ansible: ansibleUser: root ansibleVars: edpm_ovn_bridge_mappings: ["leaf0:br-ex"] edpm_ovn_bridge: br-int edpm_ovn_encap_type: geneve # Network configuration template for central site edpm_network_config_template: | --- network_config: - type: ovs_bridge name: {{ neutron_physical_bridge_name }} use_dhcp: false dns_servers: {{ ctlplane_dns_nameservers }} addresses: - ip_netmask: {{ ctlplane_ip }}/{{ ctlplane_cidr }} routes: {{ ctlplane_host_routes }} members: - type: interface name: nic1 primary: true {% for network in nodeset_networks %} - type: vlan vlan_id: {{ lookup(vars, networks_lower[network] ~ _vlan_id) }} addresses: - ip_netmask: {{ lookup(vars, networks_lower[network] ~ _ip) }}/{{ lookup(vars, networks_lower[network] ~ _cidr) }} routes: {{ lookup(vars, networks_lower[network] ~ _host_routes) }} {% endfor %} nodes: compute-0: hostName: compute-0.example.com ansible: ansibleHost: compute-0.example.com networks: - defaultRoute: true fixedIP: 192.168.122.100 name: ctlplane subnetName: subnet1 - name: internalapi subnetName: subnet1 - name: storage subnetName: subnet1 - name: tenant subnetName: subnet1-
- The OVN bridge mapping uses the central site physnet leaf0.
- Central site nodes reference subnet1 for all networks.
Create an
OpenStackDataPlaneNodeSetCR for DCN1 edge site compute nodes. You must add inter-site routes to the network configuration template and use theovn-dcnservice:apiVersion: dataplane.openstack.org/v1beta1 kind: OpenStackDataPlaneNodeSet metadata: name: openstack-edpm-dcn1 spec: tlsEnabled: false networkAttachments: - ctlplane preProvisioned: true services: - redhat - bootstrap - download-cache - configure-network - validate-network - install-os - configure-os - ssh-known-hosts - run-os - reboot-os - install-certs - ovn-dcn - neutron-metadata - libvirt - nova-cell1 - telemetry nodeTemplate: ansibleSSHPrivateKeySecret: dataplane-adoption-secret ansible: ansibleUser: root ansibleVars: edpm_ovn_bridge_mappings: ["leaf1:br-ex"] edpm_ovn_bridge: br-int edpm_ovn_encap_type: geneve # Network configuration template for DCN1 site with inter-site routes edpm_network_config_template: | --- network_config: - type: ovs_bridge name: {{ neutron_physical_bridge_name }} use_dhcp: false dns_servers: {{ ctlplane_dns_nameservers }} addresses: - ip_netmask: {{ ctlplane_ip }}/{{ ctlplane_cidr }} routes:1 {{ ctlplane_host_routes }} - ip_netmask: 192.168.122.0/24 next_hop: 192.168.133.1 - ip_netmask: 192.168.144.0/24 next_hop: 192.168.133.1 members: - type: interface name: nic1 primary: true {% for network in nodeset_networks %} - type: vlan vlan_id: {{ lookup(vars, networks_lower[network] ~ _vlan_id) }} addresses: - ip_netmask: {{ lookup(vars, networks_lower[network] ~ _ip) }}/{{ lookup(vars, networks_lower[network] ~ _cidr) }} routes: {{ lookup(vars, networks_lower[network] ~ _host_routes) }} {% if network == internalapi %} - ip_netmask: 172.17.0.0/24 next_hop: 172.17.10.1 - ip_netmask: 172.17.20.0/24 next_hop: 172.17.10.1 {% endif %} {% if network == storage %} - ip_netmask: 172.18.0.0/24 next_hop: 172.18.10.1 - ip_netmask: 172.18.20.0/24 next_hop: 172.18.10.1 {% endif %} {% if network == tenant %} - ip_netmask: 172.19.0.0/24 next_hop: 172.19.10.1 - ip_netmask: 172.19.20.0/24 next_hop: 172.19.10.1 {% endif %} {% endfor %} nodes: dcn1-compute-0: hostName: dcn1-compute-0.example.com ansible: ansibleHost: dcn1-compute-0.example.com networks: - defaultRoute: true fixedIP: 192.168.133.100 name: ctlplane subnetName: ctlplanedcn1 - name: internalapi subnetName: internalapidcn1 - name: storage subnetName: storagedcn1 - name: tenant subnetName: tenantdcn1-
- Replace ovn with ovn-dcn under spec: services. This ensures that the OVN controller connects to the OVN Southbound database by using direct internalapi IPs instead of the unreachable ClusterIP.
- DCN1 uses the leaf1 physnet for its OVN bridge mapping under spec: nodeTemplate: ansible: ansibleVars: edpm_ovn_bridge_mappings.
- Inter-site routes must be added to the network configuration template. These routes enable DCN1 Compute nodes to reach the central site (192.168.122.0/24) and other DCN sites (192.168.144.0/24 for DCN2). Similar routes are added for each service network (internalapi, storage, tenant).
- DCN1 nodes reference site-specific subnet names like ctlplanedcn1 and internalapidcn1. These subnet names must match those defined in the NetConfig CR.
Repeat step 3 for all other DCN sites. Adjust the site-specific parameters:

- The node set name, for example: openstack-edpm-dcn2
- The OVN bridge mapping, for example: leaf2:br-ex
- The subnet names, for example: ctlplanedcn2 and internalapidcn2
- The inter-site routes. The routes from DCN2 should point to the central site subnets and the DCN1 site subnets.
- The Compute node definitions with site-appropriate IP addresses.
Deploy all node sets by creating an OpenStackDataPlaneDeployment CR:

apiVersion: dataplane.openstack.org/v1beta1
kind: OpenStackDataPlaneDeployment
metadata:
  name: openstack-edpm-deployment
spec:
  nodeSets:
    - openstack-edpm
    - openstack-edpm-dcn1
    - openstack-edpm-dcn2

Note: All node sets can be deployed in parallel after the control plane adoption is complete.
Wait for the deployment to complete:
$ oc wait --for condition=Ready openstackdataplanedeployment/openstack-edpm-deployment --timeout=40m
Verification
Verify that all node sets reach the Ready status:

$ oc get openstackdataplanenodeset
NAME                  STATUS   MESSAGE
openstack-edpm        True     Ready
openstack-edpm-dcn1   True     Ready
openstack-edpm-dcn2   True     Ready

Verify that Compute services are running across all sites. Ensure that all nova-compute services show State=up for nodes in all availability zones:

$ oc exec openstackclient -- openstack compute service list
Verify inter-site connectivity by checking routes on a DCN Compute node:

$ ssh dcn1-compute-0 ip route show | grep 172.17.0
172.17.0.0/24 via 172.17.10.1 dev internalapi
$ ssh dcn1-compute-0 ping -c 3 172.17.0.30Replace
172.17.0.30with an IP address of a control plane service on the internalapi network.
5.4. Adopting Compute services with multiple Red Hat Ceph Storage back ends (DCN)
In a Distributed Compute Node (DCN) deployment where the Image service (glance) and the Block Storage service (cinder) run on edge Compute nodes, each site has its own Red Hat Ceph Storage cluster. The Compute service (nova) nodes at each site must be configured with the Red Hat Ceph Storage connection details and the Image service endpoint for their local site. Because the Image service has a separate API endpoint at each site, each site's OpenStackDataPlaneNodeSet custom resource (CR) must use a different OpenStackDataPlaneService CR that points to the correct Image service.
In a DCN deployment, all node sets belong to a single Compute service cell. The central site and each edge site are separate OpenStackDataPlaneNodeSet resources within that cell. The per-site OpenStackDataPlaneService resources deliver different Red Hat Ceph Storage and Image service configurations to each node set while sharing the same cell-level Compute service configuration.
Prerequisites
- You have adopted the Image service with multiple Red Hat Ceph Storage back ends. For more information, see Adopting the Image service with multiple Ceph back ends.
- You have adopted the Block Storage service with multiple Red Hat Ceph Storage back ends. For more information, see Adopting the Block Storage service with multiple Ceph back ends.
- The per-site Red Hat Ceph Storage secrets (ceph-conf-central, ceph-conf-dcn1, ceph-conf-dcn2) exist. For more information, see Configuring a Red Hat Ceph Storage back end. Retrieve the fsid for each Red Hat Ceph Storage cluster:

  $ oc get secret ceph-conf-central -o json | jq -r '.data | to_entries[] | select(.key | endswith(".conf")) | "\(.key): \(.value | @base64d)"' | grep fsid
Procedure
Set the cell name variable. In a DCN deployment, all node sets belong to a single cell:
$ DEFAULT_CELL_NAME="cell1"

Retrieve the fsid for each Red Hat Ceph Storage cluster and store the values in shell variables:

$ CEPH_FSID_CENTRAL=$(oc get secret ceph-conf-central -o json | jq -r .data."<central.conf>" | base64 -d | awk '/fsid/ {print $3}')
$ CEPH_FSID_DCN1=$(oc get secret ceph-conf-dcn1 -o json | jq -r .data."<dcn1.conf>" | base64 -d | awk '/fsid/ {print $3}')
$ CEPH_FSID_DCN2=$(oc get secret ceph-conf-dcn2 -o json | jq -r .data."<dcn2.conf>" | base64 -d | awk '/fsid/ {print $3}')
where:

- <central.conf> specifies the name of the Red Hat Ceph Storage configuration file for the central site in the ceph-conf-central secret.
- <dcn1.conf> specifies the name of the Red Hat Ceph Storage configuration file for an edge site in the ceph-conf-dcn1 secret.
- <dcn2.conf> specifies the name of the Red Hat Ceph Storage configuration file for an additional edge site in the ceph-conf-dcn2 secret.
Create a
ConfigMapfor each site. EachConfigMapcontains the Red Hat Ceph Storage and Image service configuration specific to that site.The following example creates
ConfigMapresources for a central site and two edge sites.Create the
ConfigMapfor the central site:$ oc apply -f - <<EOF apiVersion: v1 kind: ConfigMap metadata: name: nova-ceph-central data: 99-nova-compute-cells-workarounds.conf: | [workarounds] disable_compute_service_check_for_ffu=true 03-ceph-nova.conf: | [libvirt] images_type=rbd images_rbd_pool=vms images_rbd_ceph_conf=/etc/ceph/central.conf images_rbd_glance_store_name=central images_rbd_glance_copy_poll_interval=15 images_rbd_glance_copy_timeout=600 rbd_user=openstack rbd_secret_uuid=${CEPH_FSID_CENTRAL} [glance] endpoint_override = http://glance-central-internal.openstack.svc:9292 valid_interfaces = internal [cinder] cross_az_attach = False catalog_info = volumev3:cinderv3:internalURL EOFEach
ConfigMap contains three configuration sections:

- [libvirt] points to the local Red Hat Ceph Storage cluster configuration and uses the local fsid as the rbd_secret_uuid.
- [glance] uses endpoint_override to direct Image service requests to the local Image service API endpoint instead of the endpoint that is registered in the Identity service catalog. The examples use http:// for the Image service endpoints. If your Red Hat OpenStack Platform deployment uses TLS for internal endpoints, use https:// instead, and ensure that you have completed the TLS migration. For more information, see Migrating TLS-e to the RHOSO deployment.
- [cinder] sets cross_az_attach = False to prevent volumes from being attached to instances in a different availability zone.
Create the
ConfigMapfor the first edge site:$ oc apply -f - <<EOF apiVersion: v1 kind: ConfigMap metadata: name: nova-ceph-dcn1 data: 99-nova-compute-cells-workarounds.conf: | [workarounds] disable_compute_service_check_for_ffu=true 03-ceph-nova.conf: | [libvirt] images_type=rbd images_rbd_pool=vms images_rbd_ceph_conf=/etc/ceph/dcn1.conf images_rbd_glance_store_name=dcn1 images_rbd_glance_copy_poll_interval=15 images_rbd_glance_copy_timeout=600 rbd_user=openstack rbd_secret_uuid=${CEPH_FSID_DCN1} [glance] endpoint_override = http://glance-dcn1-internal.openstack.svc:9292 valid_interfaces = internal [cinder] cross_az_attach = False catalog_info = volumev3:cinderv3:internalURL EOFCreate the
ConfigMapfor the second edge site:$ oc apply -f - <<EOF apiVersion: v1 kind: ConfigMap metadata: name: nova-ceph-dcn2 data: 99-nova-compute-cells-workarounds.conf: | [workarounds] disable_compute_service_check_for_ffu=true 03-ceph-nova.conf: | [libvirt] images_type=rbd images_rbd_pool=vms images_rbd_ceph_conf=/etc/ceph/dcn2.conf images_rbd_glance_store_name=dcn2 images_rbd_glance_copy_poll_interval=15 images_rbd_glance_copy_timeout=600 rbd_user=openstack rbd_secret_uuid=${CEPH_FSID_DCN2} [glance] endpoint_override = http://glance-dcn2-internal.openstack.svc:9292 valid_interfaces = internal [cinder] cross_az_attach = False catalog_info = volumev3:cinderv3:internalURL EOFImportantThe
endpoint_override in the [glance] section is different for each site. This setting directs the Compute service to contact the local Image service API instead of the central endpoint registered in the Identity service catalog. Without this setting, all Compute nodes contact the central Image service, and image data is transferred across the WAN instead of read from the local Red Hat Ceph Storage cluster.

- Central Compute nodes use glance-central-internal.openstack.svc
- DCN1 Compute nodes use glance-dcn1-internal.openstack.svc
- DCN2 Compute nodes use glance-dcn2-internal.openstack.svc

These endpoint names correspond to the GlanceAPI instances that are created when you adopt the Image service with DCN back ends.
Create a per-site
OpenStackDataPlaneServiceCR for each site. Each service references the site-specificConfigMapthat you created in the previous step:$ oc apply -f - <<EOF --- apiVersion: dataplane.openstack.org/v1beta1 kind: OpenStackDataPlaneService metadata: name: nova-custom-ceph-central spec: dataSources: - configMapRef: name: nova-ceph-central - secretRef: name: nova-${DEFAULT_CELL_NAME}-compute-config - secretRef: name: nova-migration-ssh-key playbook: osp.edpm.nova caCerts: combined-ca-bundle edpmServiceType: nova containerImageFields: - NovaComputeImage - EdpmIscsidImage --- apiVersion: dataplane.openstack.org/v1beta1 kind: OpenStackDataPlaneService metadata: name: nova-custom-ceph-dcn1 spec: dataSources: - configMapRef: name: nova-ceph-dcn1 - secretRef: name: nova-${DEFAULT_CELL_NAME}-compute-config - secretRef: name: nova-${DEFAULT_CELL_NAME}-metadata-neutron-config - secretRef: name: nova-migration-ssh-key playbook: osp.edpm.nova caCerts: combined-ca-bundle edpmServiceType: nova containerImageFields: - NovaComputeImage - EdpmIscsidImage --- apiVersion: dataplane.openstack.org/v1beta1 kind: OpenStackDataPlaneService metadata: name: nova-custom-ceph-dcn2 spec: dataSources: - configMapRef: name: nova-ceph-dcn2 - secretRef: name: nova-${DEFAULT_CELL_NAME}-compute-config - secretRef: name: nova-${DEFAULT_CELL_NAME}-metadata-neutron-config - secretRef: name: nova-migration-ssh-key playbook: osp.edpm.nova caCerts: combined-ca-bundle edpmServiceType: nova containerImageFields: - NovaComputeImage - EdpmIscsidImage EOFNoteAll
OpenStackDataPlaneServiceCRs reference the same cell secret (nova-cell1-compute-config) because all node sets belong to a single cell. The per-siteConfigMapis what differentiates the Red Hat Ceph Storage and Image service configuration for each site.When you create the
OpenStackDataPlaneNodeSetCR for each site, reference the per-site service in theserviceslist instead ofnova-$CELL. For example:-
The central node set uses
nova-custom-ceph-centralin itsserviceslist. -
The DCN1 node set uses
nova-custom-ceph-dcn1in itsserviceslist. The DCN2 node set uses
nova-custom-ceph-dcn2in itsserviceslist.If you have already created the
OpenStackDataPlaneNodeSetCRs with the defaultnova-$CELLservice, patch each node set to use the per-site service. The following example patches the central node set:$ oc patch osdpns/openstack-${DEFAULT_CELL_NAME} --type=merge --patch " spec: services: - bootstrap - download-cache - configure-network - validate-network - install-os - configure-os - ssh-known-hosts - run-os - reboot-os - install-certs - ceph-client - ovn - neutron-metadata - libvirt - nova-custom-ceph-central nodeTemplate: extraMounts: - extraVolType: Ceph volumes: - name: ceph secret: secretName: ceph-conf-central mounts: - name: ceph mountPath: "/etc/ceph" readOnly: true "Patch each DCN edge node set with the same services list, replacing
ovnwithovn-dcnandnova-custom-ceph-centralwith the per-site service name. You must include theceph-clientservice so that the Red Hat Ceph Storage configuration files from the per-site secret are deployed into the Compute service containers on the edge nodes. Withoutceph-client, the/etc/ceph/directory inside the Compute service container is empty and instances fail to launch with aRADOS object not found (error calling conf_read_file)error.For example, for the DCN1 node set named
dcn1:$ oc patch osdpns/dcn1 --type=merge --patch " spec: services: - bootstrap - download-cache - configure-network - validate-network - install-os - configure-os - ssh-known-hosts - run-os - reboot-os - install-certs - ceph-client - ovn-dcn - neutron-metadata - libvirt - nova-custom-ceph-dcn1 nodeTemplate: extraMounts: - extraVolType: Ceph volumes: - name: ceph secret: secretName: ceph-conf-dcn1 mounts: - name: ceph mountPath: "/etc/ceph" readOnly: true "Repeat this step for each additional edge site, replacing
dcn1andnova-custom-ceph-dcn1with the appropriate site name, for example,dcn2andnova-custom-ceph-dcn2.
-
The central node set uses
5.5. Performing a fast-forward upgrade on Compute services
You must upgrade the Compute services from Red Hat OpenStack Platform 17.1 to Red Hat OpenStack Services on OpenShift (RHOSO) 18.0 on the control plane and data plane by completing the following tasks:
- Update the cell1 Compute data plane services version.
- Remove pre-fast-forward upgrade workarounds from the Compute control plane services and Compute data plane services.
- Run Compute database online migrations to update live data.
Prerequisites
You have defined the shell variables necessary to apply the fast-forward upgrade commands for each Compute service cell.
  DEFAULT_CELL_NAME="cell1"
  RENAMED_CELLS="$DEFAULT_CELL_NAME"

  declare -A PODIFIED_DB_ROOT_PASSWORD
  for CELL in $(echo "super $RENAMED_CELLS"); do
    PODIFIED_DB_ROOT_PASSWORD[$CELL]=$(oc get -o json secret/osp-secret | jq -r .data.DbRootPassword | base64 -d)
  done

- You have completed the steps in Adopting Compute services to the RHOSO data plane.
Procedure
Wait for the Compute service data plane services version to update for all the cells:
for CELL in $(echo $RENAMED_CELLS); do
  oc exec openstack-$CELL-galera-0 -c galera -- mysql -rs -uroot -p"${PODIFIED_DB_ROOT_PASSWORD[$CELL]}" \
    -e "select a.version from nova_${CELL}.services a join nova_${CELL}.services b where a.version!=b.version and a.binary='nova-compute' and a.deleted=0;"
done

Note: The query returns an empty result when the update is completed. No downtime is expected for virtual machine (VM) workloads.
Review any errors in the nova Compute agent logs on the data plane, and in the nova-conductor journal records on the control plane.

Patch the OpenStackControlPlane CR to remove the pre-fast-forward upgrade workarounds from the Compute control plane services:

$ rm -f celltemplates
$ for CELL in $(echo $RENAMED_CELLS); do
  cat >> celltemplates << EOF
  ${CELL}:
    metadataServiceTemplate:
      customServiceConfig: |
        [workarounds]
        disable_compute_service_check_for_ffu=false
    conductorServiceTemplate:
      customServiceConfig: |
        [workarounds]
        disable_compute_service_check_for_ffu=false
EOF
done
$ cat > oscp-patch.yaml << EOF
spec:
  nova:
    template:
      apiServiceTemplate:
        customServiceConfig: |
          [workarounds]
          disable_compute_service_check_for_ffu=false
      metadataServiceTemplate:
        customServiceConfig: |
          [workarounds]
          disable_compute_service_check_for_ffu=false
      schedulerServiceTemplate:
        customServiceConfig: |
          [workarounds]
          disable_compute_service_check_for_ffu=false
      cellTemplates:
        cell0:
          conductorServiceTemplate:
            customServiceConfig: |
              [workarounds]
              disable_compute_service_check_for_ffu=false
EOF
$ cat celltemplates >> oscp-patch.yaml

If you are adopting the Compute service with the Bare Metal Provisioning service (ironic), append the following novaComputeTemplates in the cell<X> section of the Compute service CR patch:

  cell<X>:
    novaComputeTemplates:
      <hostname>:
        customServiceConfig: |
          [DEFAULT]
          host = <hostname>
          [workarounds]
          disable_compute_service_check_for_ffu=true
        computeDriver: ironic.IronicDriver
  ...

where:

- <hostname> specifies the hostname of the node that is running the ironic Compute driver in the source cloud of cell<X>.
Apply the patch file:
$ oc patch openstackcontrolplane openstack --type=merge --patch-file=oscp-patch.yaml

Wait until the Compute control plane services CRs are ready:

$ oc wait --for condition=Ready --timeout=300s Nova/nova

Remove the pre-fast-forward upgrade workarounds from the Compute data plane services:

$ oc patch cm nova-cells-global-config --type=json -p='[{"op": "replace", "path": "/data/99-nova-compute-cells-workarounds.conf", "value": "[workarounds]\n"}]'
$ for CELL in $(echo $RENAMED_CELLS); do
  oc get Openstackdataplanenodeset openstack-${CELL} || continue
  oc apply -f - <<EOF
---
apiVersion: dataplane.openstack.org/v1beta1
kind: OpenStackDataPlaneDeployment
metadata:
  name: openstack-nova-compute-ffu-$CELL
spec:
  nodeSets:
    - openstack-${CELL}
  servicesOverride:
    - nova-${CELL}
  backoffLimit: 3
EOF
done

Wait for the Compute data plane services to be ready for all the cells:
$ oc wait --for condition=Ready openstackdataplanedeployments --all --timeout=5m

Run Compute database online migrations to complete the upgrade:

$ oc exec -it nova-cell0-conductor-0 -- nova-manage db online_data_migrations
$ for CELL in $(echo $RENAMED_CELLS); do
  oc exec -it nova-${CELL}-conductor-0 -- nova-manage db online_data_migrations
done

Discover the Compute hosts in the cells:

$ oc rsh nova-cell0-conductor-0 nova-manage cell_v2 discover_hosts --verbose

If you have a test VM that is not a production workload, complete the following verification steps:
Verify that the existing test VM instance is running:

${BASH_ALIASES[openstack]} server --os-compute-api-version 2.48 show --diagnostics test 2>&1 || echo FAIL

Verify that the Compute services can stop the existing test VM instance:

${BASH_ALIASES[openstack]} server list -c Name -c Status -f value | grep -qF "test ACTIVE" && ${BASH_ALIASES[openstack]} server stop test || echo PASS
${BASH_ALIASES[openstack]} server list -c Name -c Status -f value | grep -qF "test SHUTOFF" || echo FAIL
${BASH_ALIASES[openstack]} server --os-compute-api-version 2.48 show --diagnostics test 2>&1 || echo PASS

Verify that the Compute services can start the existing test VM instance:

${BASH_ALIASES[openstack]} server list -c Name -c Status -f value | grep -qF "test SHUTOFF" && ${BASH_ALIASES[openstack]} server start test || echo PASS
${BASH_ALIASES[openstack]} server list -c Name -c Status -f value | grep -qF "test ACTIVE" && \
  ${BASH_ALIASES[openstack]} server --os-compute-api-version 2.48 show --diagnostics test --fit-width -f json | jq -r '.state' | grep running || echo FAIL
Next steps
After the data plane adoption, the Compute hosts continue to run Red Hat Enterprise Linux (RHEL) 9.2. To take advantage of RHEL 9.4, perform a minor update procedure after finishing the adoption procedure.
5.6. Adopting Networker services to the RHOSO data plane
Adopt the Networker services in your existing Red Hat OpenStack Platform deployment to the Red Hat OpenStack Services on OpenShift (RHOSO) data plane. The Networker services could be running on Controller nodes or dedicated Networker nodes. You decide which services you want to run on the Networker nodes, and create a separate OpenStackDataPlaneNodeSet custom resource (CR) for the Networker nodes.
By definition, any node that has the enable-chassis-as-gw option set is considered a Networker node. After the adoption process, these nodes continue to be Networker nodes.
You can implement the following options if they apply to your environment:
- Depending on your topology, you might need to run the neutron-metadata service on the nodes, specifically when you want to serve metadata to SR-IOV ports that are hosted on Compute nodes.
- If you want to continue running OVN gateway services on Networker nodes, keep the ovn service in the list of services to deploy.
- Optional: You can run the neutron-dhcp service on your Networker nodes instead of your Compute nodes. You might not need neutron-dhcp with OVN, unless your deployment uses DHCP relays, or advanced DHCP options that are supported by dnsmasq but not by the OVN DHCP implementation.
Adopt each Controller or Networker node in your existing Red Hat OpenStack Platform deployment to the Red Hat OpenStack Services on OpenShift (RHOSO) data plane when the node is set as an OVN chassis gateway, that is, when the enable-chassis-as-gw option is set on the node. After adoption, these nodes become EDPM Networker nodes.
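If you want to confirm the gateway setting directly on a node, one possible check, assuming root SSH access to the node, is to read the ovn-cms-options key from the local Open vSwitch database; the expected value is enable-chassis-as-gw:

$ ssh root@controller-0.localdomain sudo ovs-vsctl get Open_vSwitch . external_ids:ovn-cms-options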
Check for the nodes where
the OVN Controller Gateway agent is running. The list of agents varies depending on the services you enabled:

$ oc exec openstackclient -- openstack network agent list
+--------------------------------------+------------------------------+--------------------------+-------------------+-------+-------+----------------------------+
| ID                                   | Agent Type                   | Host                     | Availability Zone | Alive | State | Binary                     |
+--------------------------------------+------------------------------+--------------------------+-------------------+-------+-------+----------------------------+
| e5075ee0-9dd9-4f0a-a42a-6bbdf1a6111c | OVN Controller Gateway agent | controller-0.localdomain |                   | XXX   | UP    | ovn-controller             |
| f3112349-054c-403a-b00a-e219238192b8 | OVN Controller agent         | compute-0.localdomain    |                   | XXX   | UP    | ovn-controller             |
| af9dae2d-1c1c-55a8-a743-f84719f6406d | OVN Metadata agent           | compute-0.localdomain    |                   | XXX   | UP    | neutron-ovn-metadata-agent |
| 51a11df8-a66e-47a2-aec0-52eb8589626c | OVN Controller Gateway agent | controller-1.localdomain |                   | XXX   | UP    | ovn-controller             |
| bb817e5e-7832-410a-9e67-934dac8c602f | OVN Controller Gateway agent | controller-2.localdomain |                   | XXX   | UP    | ovn-controller             |
+--------------------------------------+------------------------------+--------------------------+-------------------+-------+-------+----------------------------+
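To narrow the output to the gateway nodes only, one simple filter, not part of the original procedure, is:

$ oc exec openstackclient -- openstack network agent list | grep "OVN Controller Gateway agent"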
Prerequisites
Define the shell variable. Based on the agent list output above, controller-0, controller-1, and controller-2 are the target hosts. If you have both Controller and Networker nodes running Networker services, add all of those hosts:

declare -A networkers
networkers+=(
  ["controller-0.localdomain"]="192.168.122.100"
  ["controller-1.localdomain"]="192.168.122.101"
  ["controller-2.localdomain"]="192.168.122.102"
  # ...
)

- Replace ["<node-name>"]="192.168.122.100" with the name and IP address of the corresponding Networker or Controller node in your environment.
Procedure
Deploy the
OpenStackDataPlaneNodeSet CR for your nodes:

Note: You can reuse most of the
nodeTemplatesection from theOpenStackDataPlaneNodeSetCR that is designated for your Compute nodes. You can omit some of the variables because of the limited set of services that are running on the Networker nodes.$ oc apply -f - <<EOF apiVersion: dataplane.openstack.org/v1beta1 kind: OpenStackDataPlaneNodeSet metadata: name: openstack-networker spec: tlsEnabled: false networkAttachments: - ctlplane preProvisioned: true services: - redhat - bootstrap - download-cache - configure-network - validate-network - install-os - configure-os - ssh-known-hosts - run-os - install-certs - ovn env: - name: ANSIBLE_CALLBACKS_ENABLED value: "profile_tasks" - name: ANSIBLE_FORCE_COLOR value: "True" nodes: controller-0: hostName: controller-0 ansible: ansibleHost: ${networkers[controller-0.localdomain]} networks: - defaultRoute: true fixedIP: ${networkers[controller-0.localdomain]} name: ctlplane subnetName: subnet1 - name: internalapi subnetName: subnet1 - name: storage subnetName: subnet1 - name: tenant subnetName: subnet1 controller-1: hostName: controller-1 ansible: ansibleHost: ${networkers[controller-1.localdomain]} networks: - defaultRoute: true fixedIP: ${networkers[controller-1.localdomain]} name: ctlplane subnetName: subnet1 - name: internalapi subnetName: subnet1 - name: storage subnetName: subnet1 - name: tenant subnetName: subnet1 controller-2: hostName: controller-2 ansible: ansibleHost: ${networkers[controller-2.localdomain]} networks: - defaultRoute: true fixedIP: ${networkers[controller-2.localdomain]} name: ctlplane subnetName: subnet1 - name: internalapi subnetName: subnet1 - name: storage subnetName: subnet1 - name: tenant subnetName: subnet1 nodeTemplate: ansibleSSHPrivateKeySecret: dataplane-adoption-secret ansible: ansibleUser: root ansibleVarsFrom: - secretRef: name: subscription-manager - secretRef: name: redhat-registry ansibleVars: rhc_release: 9.2 rhc_repositories: - {name: "*", state: disabled} - {name: "rhel-9-for-x86_64-baseos-eus-rpms", state: enabled} - {name: "rhel-9-for-x86_64-appstream-eus-rpms", state: enabled} - {name: "rhel-9-for-x86_64-highavailability-eus-rpms", state: enabled} - {name: "rhoso-18.0-for-rhel-9-x86_64-rpms", state: enabled} - {name: "fast-datapath-for-rhel-9-x86_64-rpms", state: enabled} - {name: "rhceph-7-tools-for-rhel-9-x86_64-rpms", state: enabled} edpm_bootstrap_release_version_package: [] # edpm_network_config # Default nic config template for a EDPM node # These vars are edpm_network_config role vars edpm_network_config_template: | --- {% set mtu_list = [ctlplane_mtu] %} {% for network in nodeset_networks %} {% set _ = mtu_list.append(lookup(vars, networks_lower[network] ~ _mtu)) %} {%- endfor %} {% set min_viable_mtu = mtu_list | max %} network_config: - type: ovs_bridge name: {{ neutron_physical_bridge_name }} mtu: {{ min_viable_mtu }} use_dhcp: false dns_servers: {{ ctlplane_dns_nameservers }} domain: {{ dns_search_domains }} addresses: - ip_netmask: {{ ctlplane_ip }}/{{ ctlplane_cidr }} routes: {{ ctlplane_host_routes }} members: - type: interface name: nic1 mtu: {{ min_viable_mtu }} # force the MAC address of the bridge to this interface primary: true {% for network in nodeset_networks %} - type: vlan mtu: {{ lookup(vars, networks_lower[network] ~ _mtu) }} vlan_id: {{ lookup(vars, networks_lower[network] ~ _vlan_id) }} addresses: - ip_netmask: {{ lookup(vars, networks_lower[network] ~ _ip) }}/{{ lookup(vars, networks_lower[network] ~ _cidr) }} routes: {{ lookup(vars, networks_lower[network] ~ _host_routes) }} {% endfor %} 
edpm_network_config_nmstate: false edpm_network_config_hide_sensitive_logs: false # # These vars are for the network config templates themselves and are # considered EDPM network defaults. neutron_physical_bridge_name: br-ctlplane neutron_public_interface_name: eth0 # edpm_nodes_validation edpm_nodes_validation_validate_controllers_icmp: false edpm_nodes_validation_validate_gateway_icmp: false # edpm ovn-controller configuration edpm_ovn_bridge_mappings: [<"bridge_mappings">] edpm_ovn_bridge: br-int edpm_ovn_encap_type: geneve ovn_monitor_all: true edpm_ovn_remote_probe_interval: 60000 edpm_ovn_ofctrl_wait_before_clear: 8000 # serve as a OVN gateway edpm_enable_chassis_gw: true timesync_ntp_servers: - hostname: clock.redhat.com - hostname: clock2.redhat.com gather_facts: false enable_debug: false # edpm firewall, change the allowed CIDR if needed edpm_sshd_configure_firewall: true edpm_sshd_allowed_ranges: [192.168.122.0/24] # SELinux module edpm_selinux_mode: enforcing # Do not attempt OVS major upgrades here edpm_ovs_packages: - openvswitch3.3 EOF-
spec.tlsEnabled specifies whether TLS Everywhere is enabled. If TLS is enabled, change spec:tlsEnabled to true.
- edpm_ovn_bridge_mappings: Replace [<"bridge_mappings">] with the bridge mapping values that you used in your Red Hat OpenStack Platform 17.1 deployment, for example, ["datacentre:br-ctlplane"].
- edpm_enable_chassis_gw specifies whether to run ovn-controller in gateway mode.
-
Ensure that you use the same
ovn-controller settings in the OpenStackDataPlaneNodeSet CR that you used on the Networker nodes before adoption. This configuration is stored in the external_ids column in the Open_vSwitch table in the Open vSwitch database:

ovs-vsctl list Open .
...
external_ids : {hostname=controller-0.localdomain, ovn-bridge=br-int, ovn-bridge-mappings=<bridge_mappings>, ovn-chassis-mac-mappings="datacentre:1e:0a:bb:e6:7c:ad", ovn-cms-options=enable-chassis-as-gw, ovn-encap-ip="172.19.0.100", ovn-encap-tos="0", ovn-encap-type=geneve, ovn-match-northd-version=False, ovn-monitor-all=True, ovn-ofctrl-wait-before-clear="8000", ovn-openflow-probe-interval="60", ovn-remote="tcp:ovsdbserver-sb.openstack.svc:6642", ovn-remote-probe-interval="60000", rundir="/var/run/openvswitch", system-id="2eec68e6-aa21-4c95-a868-31aeafc11736"}
...

- Replace <bridge_mappings> with the value of the bridge mappings in your configuration, for example, "datacentre:br-ctlplane".
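If you manage several nodes, one way to compare these settings across all of the hosts, sketched here under the assumption that the networkers variable from the prerequisites is still defined in your shell and that you have root SSH access, is:

$ for host in "${!networkers[@]}"; do
    echo "== ${host} =="
    ssh root@${networkers[$host]} sudo ovs-vsctl get Open_vSwitch . external_ids:ovn-bridge-mappings external_ids:ovn-cms-options
  done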
Optional: Enable
neutron-metadata in the OpenStackDataPlaneNodeSet CR:

$ oc patch openstackdataplanenodeset <networker_CR_name> --type='json' --patch='[
  {
    "op": "add",
    "path": "/spec/services/-",
    "value": "neutron-metadata"
  }]'

- Replace <networker_CR_name> with the name of the CR that you deployed for your Networker nodes, for example, openstack-networker.
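To confirm that the service was appended to the node set, one optional check, not part of the original procedure, is to print the services list:

$ oc get openstackdataplanenodeset <networker_CR_name> -o jsonpath='{.spec.services}'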
Optional: Enable
neutron-dhcp in the OpenStackDataPlaneNodeSet CR:

$ oc patch openstackdataplanenodeset <networker_CR_name> --type='json' --patch='[
  {
    "op": "add",
    "path": "/spec/services/-",
    "value": "neutron-dhcp"
  }]'

Run the pre-adoption-validation service for Networker nodes:

Create an OpenStackDataPlaneDeployment CR that runs only the validation:

$ oc apply -f - <<EOF
apiVersion: dataplane.openstack.org/v1beta1
kind: OpenStackDataPlaneDeployment
metadata:
  name: openstack-pre-adoption-networker
spec:
  nodeSets:
  - openstack-networker
  servicesOverride:
  - pre-adoption-validation
EOF

When the validation is finished, confirm that the status of the Ansible EE pods is Completed:
$ watch oc get pod -l app=openstackansibleee

$ oc logs -l app=openstackansibleee -f --max-log-requests 20

Wait for the deployment to reach the Ready status:

$ oc wait --for condition=Ready openstackdataplanedeployment/openstack-pre-adoption-networker --timeout=10m
Deploy the
OpenStackDataPlaneDeployment CR for Networker nodes:

$ oc apply -f - <<EOF
apiVersion: dataplane.openstack.org/v1beta1
kind: OpenStackDataPlaneDeployment
metadata:
  name: openstack-networker
spec:
  nodeSets:
  - openstack-networker
EOF

Note: Alternatively, you can include the Networker node set in the nodeSets list before you deploy the main OpenStackDataPlaneDeployment CR. You cannot add new node sets to the OpenStackDataPlaneDeployment CR after deployment.

Clean up any Networking service (neutron) agents that are no longer running.
Note: In some cases, agents from the old data plane that are replaced or retired remain in RHOSO. The function that these agents provided might now be provided by a new agent running in RHOSO, or it might be replaced by other components. For example, DHCP agents might no longer be needed because OVN DHCP in RHOSO can provide this function.
List the agents:
$ oc exec openstackclient -- openstack network agent list +--------------------------------------+------------------------------+--------------------------+-------------------+-------+-------+----------------------------+ | ID | Agent Type | Host | Availability Zone | Alive | State | Binary | +--------------------------------------+------------------------------+--------------------------+-------------------+-------+-------+----------------------------+ | e5075ee0-9dd9-4f0a-a42a-6bbdf1a6111c | OVN Controller Gateway agent | controller-0.localdomain | | :-) | UP | ovn-controller | | 856960f0-5530-46c7-a331-6eadcba362da | DHCP agent | controller-1.localdomain | nova | XXX | UP | neutron-dhcp-agent | | 8bd22720-789f-45b8-8d7d-006dee862bf9 | DHCP agent | controller-2.localdomain | nova | XXX | UP | neutron-dhcp-agent | | e584e00d-be4c-4e98-a11a-4ecd87d21be7 | DHCP agent | controller-0.localdomain | nova | XXX | UP | neutron-dhcp-agent | +--------------------------------------+------------------------------+--------------------------+-------------------+-------+-------+----------------------------+If any agent in the list shows
XXX in the Alive field, check the Host and Agent Type to verify that the function of this agent is no longer required and that the agent has been permanently stopped on the Red Hat OpenStack Platform host. Then delete the agent:

$ oc exec openstackclient -- openstack network agent delete <agent_id>

- Replace <agent_id> with the ID of the agent to delete, for example, 856960f0-5530-46c7-a331-6eadcba362da.
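If you have confirmed that none of the remaining DHCP agents are needed, one way to remove them in a single pass, sketched here rather than taken from the original procedure, is:

$ oc exec openstackclient -- openstack network agent list --agent-type dhcp -f value -c ID | \
    xargs -r -n1 oc exec openstackclient -- openstack network agent delete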
Verification
Confirm that all the Ansible EE pods reach a
Completed status:

$ watch oc get pod -l app=openstackansibleee

$ oc logs -l app=openstackansibleee -f --max-log-requests 20

Wait for the data plane node set to reach the Ready status:

$ oc wait --for condition=Ready osdpns/<networker_CR_name> --timeout=30m

- Replace <networker_CR_name> with the name of the CR that you deployed for your Networker nodes, for example, openstack-networker.
Verify that the Networking service (neutron) agents are running. The list of agents varies depending on the services you enabled:
$ oc exec openstackclient -- openstack network agent list +--------------------------------------+------------------------------+--------------------------+-------------------+-------+-------+----------------------------+ | ID | Agent Type | Host | Availability Zone | Alive | State | Binary | +--------------------------------------+------------------------------+--------------------------+-------------------+-------+-------+----------------------------+ | e5075ee0-9dd9-4f0a-a42a-6bbdf1a6111c | OVN Controller Gateway agent | controller-0.localdomain | | :-) | UP | ovn-controller | | f3112349-054c-403a-b00a-e219238192b8 | OVN Controller agent | compute-0.localdomain | | :-) | UP | ovn-controller | | af9dae2d-1c1c-55a8-a743-f84719f6406d | OVN Metadata agent | compute-0.localdomain | | :-) | UP | neutron-ovn-metadata-agent | | 51a11df8-a66e-47a2-aec0-52eb8589626c | OVN Controller Gateway agent | controller-1.localdomain | | :-) | UP | ovn-controller | | bb817e5e-7832-410a-9e67-934dac8c602f | OVN Controller Gateway agent | controller-2.localdomain | | :-) | UP | ovn-controller | +--------------------------------------+------------------------------+--------------------------+-------------------+-------+-------+----------------------------+
5.7. Enabling the high availability for Compute instances service
To enable the high availability for Compute instances (Instance HA) service, you create the following resources:
- Fencing secret.
- Configuration map. You can create the configuration map manually, or the configuration map is created automatically when you deploy the Instance HA resource. However, you must create the configuration map manually if you want to disable the Instance HA service.
- Instance HA resource.
Prerequisites
-
You have created the
fencing-secret.yaml configuration file. For more information, see Maintaining the Instance HA functionality after adoption.
- You have disabled Pacemaker on your Compute nodes. For more information, see Preventing Pacemaker from monitoring Compute nodes.
Procedure
Create the secret:
$ oc apply -f fencing-secret.yaml -n openstack

Optional: Create the Instance HA configuration map and set the
DISABLED parameter to false. For example:

$ cat << EOF > iha-cm.yaml
kind: ConfigMap
metadata:
  name: instanceha-0-config
  namespace: openstack
apiVersion: v1
data:
  config.yaml: |
    config:
      EVACUABLE_TAG: "evacuable"
      TAGGED_IMAGES: "true"
      TAGGED_FLAVORS: "true"
      TAGGED_AGGREGATES: "true"
      SMART_EVACUATION: "false"
      DELTA: "30"
      DELAY: "0"
      POLL: "45"
      THRESHOLD: "50"
      WORKERS: "4"
      RESERVED_HOSTS: "false"
      LEAVE_DISABLED: "false"
      CHECK_KDUMP: "false"
      LOGLEVEL: "info"
      DISABLED: "false"
EOF

Apply the configuration:
$ oc apply -f iha-cm.yaml -n openstack

Note: If you want to restrict which Compute nodes are evacuated, create host aggregates and set them by using the
EVACUABLE_TAG parameter. Alternatively, you can set the TAGGED_AGGREGATES parameter to false to enable monitoring and evacuation of all your Compute nodes. For more information about Instance HA service parameters, see Editing the Instance HA service parameters in Configuring high availability for instances.
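For example, one hypothetical way to mark a group of Compute nodes as evacuable, assuming the tag is applied as aggregate metadata that matches the EVACUABLE_TAG value, is:

$ oc exec openstackclient -- openstack aggregate create --property evacuable=true evacuable-hosts

$ oc exec openstackclient -- openstack aggregate add host evacuable-hosts compute-0.localdomain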
Create an Instance HA resource and reference the fencing secret and configuration map. For example:
$ cat << EOF > iha.yaml
apiVersion: instanceha.openstack.org/v1beta1
kind: InstanceHa
metadata:
  name: instanceha-0
  namespace: openstack
spec:
  caBundleSecretName: combined-ca-bundle
  instanceHaConfigMap: instanceha-0-config
  fencingSecret: fencing-secret
EOF
spec.instanceHaConfigMap defines the name of the Instance HA configuration map that you created. If you do not create the configuration map, a configuration map named instanceha-config is created automatically when the Instance HA service is installed, providing the default values of the Instance HA service parameters. You can then edit the values as needed.
-
Deploy the Instance HA resource:
$ oc apply -f iha.yaml -n openstack
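To confirm that the resource was created and that its pod is running, a quick check that is not part of the original procedure is:

$ oc get instanceha instanceha-0 -n openstack

$ oc get pods -n openstack | grep instanceha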
Next steps
After you complete the Red Hat OpenStack Services on OpenShift adoption, remove the Pacemaker components from the Compute nodes. You must run the following commands on each Compute node:
$ sudo systemctl stop pacemaker_remote $ sudo systemctl stop pcsd $ sudo systemctl stop pcsd-ruby.service $ sudo systemctl disable pacemaker_remote $ sudo systemctl disable pcsd $ sudo systemctl disable pcsd-ruby.service $ sudo dnf remove pacemaker pacemaker-remote pcs pcsd -y
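If you have many Compute nodes, one way to run the same cleanup from a bastion host, assuming root SSH access and substituting your own hostnames, is sketched below:

$ for node in compute-0.localdomain compute-1.localdomain; do
    ssh root@${node} "systemctl disable --now pacemaker_remote pcsd pcsd-ruby.service && dnf -y remove pacemaker pacemaker-remote pcs pcsd"
  done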
5.8. Post-adoption tasks for the Load-balancing service
If you adopted the Load-balancing service (octavia), after you complete the data plane adoption, you must perform the following tasks:
- Upgrade the amphorae virtual machines to the new images.
- Remove obsolete resources from your existing load balancers.
Prerequisites
- You have adopted the Load-balancing service. For more information, see Adopting the Load-balancing service.
Procedure
Ensure that the connectivity between the new control plane and the adopted Compute nodes is functional by creating a new load balancer and checking that its
provisioning_status becomes ACTIVE:

$ alias openstack="oc exec -t openstackclient -- openstack"

$ openstack loadbalancer create --vip-subnet-id public-subnet --name lb-post-adoption --wait

Trigger a failover for all existing load balancers to upgrade the amphorae virtual machines to use the new image and to establish connectivity with the new control plane:
$ openstack loadbalancer list -f value -c id | \
    xargs -r -n1 -P4 ${BASH_ALIASES[openstack]} loadbalancer failover --wait

Delete old flavors that were migrated to the new control plane:
$ openstack flavor delete octavia_65

# The following flavors might not exist in OSP 17.1 deployments
$ openstack flavor show octavia_amphora-mvcpu-ha && \
    openstack flavor delete octavia_amphora-mvcpu-ha
$ openstack loadbalancer flavor show octavia_amphora-mvcpu-ha && \
    openstack loadbalancer flavor delete octavia_amphora-mvcpu-ha
$ openstack loadbalancer flavorprofile show octavia_amphora-mvcpu-ha_profile && \
    openstack loadbalancer flavorprofile delete octavia_amphora-mvcpu-ha_profile

Note: Some flavors might still be used by load balancers and cannot be deleted.
Delete the old management network and its ports:
$ for net_id in $(openstack network list -f value -c ID --name lb-mgmt-net); do \
    desc=$(openstack network show "$net_id" -f value -c description); \
    [ -z "$desc" ] && WALLABY_LB_MGMT_NET_ID="$net_id" ; \
  done

$ for id in $(openstack port list --network "$WALLABY_LB_MGMT_NET_ID" -f value -c ID); do \
    openstack port delete "$id" ; \
  done

$ openstack network delete "$WALLABY_LB_MGMT_NET_ID"

Verify that only one
lb-mgmt-net and one lb-mgmt-subnet exists:

$ openstack network list | grep lb-mgmt-net
| fe470c29-0482-4809-9996-6d636e3feea3 | lb-mgmt-net | 6a881091-097d-441c-937b-5a23f4f243b7 |

$ openstack subnet list | grep lb-mgmt-subnet
| 6a881091-097d-441c-937b-5a23f4f243b7 | lb-mgmt-subnet | fe470c29-0482-4809-9996-6d636e3feea3 | 172.24.0.0/16 |