Adopting a Red Hat OpenStack Platform 17.1 deployment
Adopting a Red Hat OpenStack Platform 17.1 overcloud to a Red Hat OpenStack Services on OpenShift 18.0 data plane
Abstract
Providing feedback on Red Hat documentation
We appreciate your feedback. Tell us how we can improve the documentation.
To provide documentation feedback for Red Hat OpenStack Services on OpenShift (RHOSO), create a Jira issue in the OSPRH Jira project.
Procedure
- Log in to the Red Hat Atlassian Jira.
- Click the following link to open a Create Issue page: Create issue
- Complete the Summary and Description fields. In the Description field, include the documentation URL, chapter or section number, and a detailed description of the issue.
- Click Create.
- Review the details of the bug you created.
Chapter 1. Red Hat OpenStack Services on OpenShift 18.0 adoption overview
Adoption is the process of migrating a Red Hat OpenStack Platform (RHOSP) 17.1 control plane to Red Hat OpenStack Services on OpenShift 18.0, and then completing an in-place upgrade of the data plane. You can retain existing infrastructure investments and modernize your RHOSP deployment on a containerized Red Hat OpenShift Container Platform (RHOCP) foundation. To ensure that you understand the entire adoption process and how to sufficiently prepare your RHOSP environment, review the prerequisites, adoption process, and post-adoption tasks.
Read the whole adoption guide before you start the adoption to ensure that you understand the procedure. Prepare the necessary configuration snippets for each RHOSP service in advance, and test the migration in a representative test environment before you apply it to production.
1.1. Adoption limitations
Before you proceed with the adoption, check which features are Technology Previews or unsupported.
- Technology Preview
The following features are Technology Previews and have not been tested within the context of the Red Hat OpenStack Services on OpenShift (RHOSO) adoption:
- Key Manager service (barbican) adoption with Proteccio hardware security module (HSM) integration
- DNS-as-a-service (designate)
The following Compute service (nova) features are Technology Previews:
- NUMA-aware vswitches
- PCI passthrough by flavor
- SR-IOV trusted virtual functions
- vGPU
- Emulated virtual Trusted Platform Module (vTPM)
- UEFI
- AMD SEV
- Direct download from Rados Block Device (RBD)
- File-backed memory
- Defining a custom inventory of resources in a YAML file, provider.yaml
- Unsupported features
The adoption process does not support the following features:
- Adopting Border Gateway Protocol (BGP) environments to the RHOSO data plane
- Adopting a Federal Information Processing Standards (FIPS) environment
1.2. Adoption prerequisites
Before you begin the adoption procedure, complete the following prerequisites:
- Planning information
- Review the Adoption limitations.
- Review the Red Hat OpenShift Container Platform (RHOCP) requirements, data plane node requirements, Compute node requirements, and so on. For more information, see Planning your deployment.
- Review the adoption-specific networking requirements. For more information, see Configuring the network for the RHOSO deployment.
- Review the adoption-specific storage requirements. For more information, see Storage requirements.
- Review how to customize your deployed control plane with the services that are required for your environment. For more information, see Customizing the Red Hat OpenStack Services on OpenShift deployment.
Familiarize yourself with the following RHOCP concepts that are used during adoption:
- Familiarize yourself with mapping RHOSO versions to OpenStack Operators and OpenStackVersion custom resources (CRs). For more information, see the Red Hat Knowledgebase article How RHOSO versions map to OpenStack Operators and OpenStackVersion CRs.
- Back-up information
Back up your Red Hat OpenStack Platform (RHOSP) 17.1 environment by using one of the following options:
- The Relax-and-Recover tool. For more information, see Backing up the undercloud and the control plane nodes by using the Relax-and-Recover tool in Backing up and restoring the undercloud and control plane nodes.
- The Snapshot and Revert tool. For more information, see Backing up your Red Hat OpenStack Platform cluster by using the Snapshot and Revert tool in Backing up and restoring the undercloud and control plane nodes.
- A third-party backup and recovery tool. For more information about certified backup and recovery tools, see the Red Hat Ecosystem Catalog.
- Back up the configuration files from the RHOSP services and director on your file system. For more information, see Pulling the configuration from a director deployment.
- Compute
- Upgrade your Compute nodes to Red Hat Enterprise Linux 9.2. For more information, see Upgrading all Compute nodes to RHEL 9.2 in Framework for upgrades (16.2 to 17.1).
- On your Compute hosts, the systemd-container package must be installed and the systemd-machined service must be running. For more information about how to verify that the package is installed and that the service is running, see Installing the systemd-container package on Compute hosts.
- ML2/OVS
- If you use the Modular Layer 2 plug-in with Open vSwitch mechanism driver (ML2/OVS), migrate it to the Modular Layer 2 plug-in with Open Virtual Networking (ML2/OVN) mechanism driver. For more information, see Migrating to the OVN mechanism driver.
- Tools
- The oc and podman command line tools are installed on your workstation.
Make sure to set the correct RHOSO project namespace in which to run commands.
$ oc project openstack
- RHOSP 17.1 release
- The RHOSP 17.1 cloud is updated to the 17.1.4 release or later. For more information, see Performing a minor update of Red Hat OpenStack Platform.
- RHOSP 17.1 hosts
- All control plane and data plane hosts of the RHOSP 17.1 cloud are up and running, and continue to run throughout the adoption procedure.
1.3. Guidelines for planning the adoption
When planning to adopt a Red Hat OpenStack Services on OpenShift (RHOSO) 18.0 environment, consider the scope of the change. An adoption is similar in scope to a data center upgrade. Different firmware levels, hardware vendors, hardware profiles, networking interfaces, storage interfaces, and so on affect the adoption process and can cause changes in behavior during the adoption.
Review the following guidelines to adequately plan for the adoption and increase the chance that you complete the adoption successfully:
All commands in the adoption documentation are examples. Do not copy and paste the commands without understanding what the commands do.
- To minimize the risk of an adoption failure, reduce the number of environmental differences between the staging environment and the production sites.
- If the staging environment is not representative of the production sites or if a staging environment is not available, you must plan to include contingency time in case the adoption fails.
Review your custom Red Hat OpenStack Platform (RHOSP) service configuration at every major release.
- Every major release upgrades through multiple OpenStack releases.
- Each major release might deprecate configuration options or change the format of the configuration.
- Prepare a Method of Procedure (MOP) that is specific to your environment to reduce the risk of variance or omitted steps when running the adoption process.
You can use representative hardware in a staging environment to prepare a MOP and validate any content changes.
- Include a cross-section of firmware versions, additional interface or device hardware, and any additional software in the representative staging environment to ensure that it is broadly representative of the variety that is present in the production environments.
- Ensure that you validate any Red Hat Enterprise Linux update or upgrade in the representative staging environment.
- Use Satellite for localized and version-pinned RPM content where your data plane nodes are located.
- In the production environment, use the content that you tested in the staging environment.
1.4. Adoption process overview
Familiarize yourself with the steps of the adoption process.
- Main adoption process
- Migrate TLS everywhere (TLS-e) to the Red Hat OpenStack Services on OpenShift (RHOSO) deployment.
- Migrate your existing databases to the new control plane.
- Adopt your Red Hat OpenStack Platform 17.1 control plane services to the new RHOSO 18.0 deployment.
- Adopt the RHOSO 18.0 data plane.
- Migrate the Object Storage service (swift) to the RHOSO nodes.
- Distributed Compute Node (DCN) architecture process
- Overview of Distributed Compute Node adoption
- Configuring spine-leaf networks for the Red Hat OpenStack Services on OpenShift deployment
- Configuring control plane networking for spine-leaf topologies
- Configuring data plane node sets for DCN sites
If you use a DCN architecture with storage, the following additional steps apply, depending on the services that are included in your deployment:
- Adopting the Image service with multiple Red Hat Ceph Storage back ends (DCN)
- Adopting the Block Storage service with multiple Red Hat Ceph Storage back ends (DCN)
- Adopting Compute services with multiple Red Hat Ceph Storage back ends (DCN)
- Red Hat Ceph Storage migration for Distributed Compute Node deployments
- Post-adoption tasks
- For more details on the tasks you must perform after completing the adoption, see Post-adoption tasks.
1.5. Adoption duration and impact
The durations in the following table were recorded in a test environment that consisted of 228 Compute nodes and 3 Networker nodes. To accurately estimate the adoption duration for each task, perform these procedures in a test environment with hardware that is similar to your production environment. Ensure that you set up the Red Hat OpenShift Container Platform (RHOCP) environment and install the Operators before testing.
Durations can vary significantly based on the content of your environment, for example, the size of your service databases or the number of services. The durations represent raw execution time. They do not include human operator activity.
| Adoption stage | Duration | Notes |
|---|---|---|
| Scenario | Notes |
|---|---|
| Migrate a 17.1 OVN gateway on the control plane to a RHOCP-hosted OVN gateway | Possible L3 downtime due to the migration of the traffic path to new hosts |
| Migrate a 17.1 OVN gateway on the control plane to an 18.0 data plane Networker node | No L2/L3 data plane connectivity loss because the traffic path remains unchanged |
| Migrate a 17.1 OVN gateway on a Networker node to an 18.0 data plane Networker node | No L2/L3 data plane connectivity loss because the traffic path remains unchanged |
| L3 handled through provider networks | No L2/L3 data plane connectivity loss because the traffic path remains unchanged |
1.6. Overview of Distributed Compute Node adoption
The process to adopt a Distributed Compute Node (DCN) deployment from Red Hat OpenStack Platform (RHOSP) to Red Hat OpenStack Services on OpenShift (RHOSO) requires the following additional adoption tasks:
- You must map a multi-stack deployment to multiple node sets.
- You must map additional networking configurations.
- Multi-stack to multi-node set mapping
In director deployments, DCN environments use multiple Heat stacks:
- The central stack provides the templates for the Controller nodes and the central Compute nodes.
- An edge stack provides the templates for the edge Compute nodes at a site. There is one edge stack per DCN site.
When you perform an adoption, map director stacks to OpenStackDataPlaneNodeSet custom resources (CRs):

Table 1.3. Mapping director stacks to RHOSO node sets

| Director stack | RHOSO node set | Availability zone |
|---|---|---|
| Central stack (Compute role) | openstack-edpm or openstack-cell1 | az-central |
| DCN1 stack (ComputeDcn1 role) | openstack-edpm-dcn1 or openstack-cell1-dcn1 | az-dcn1 |
| DCN2 stack (ComputeDcn2 role) | openstack-edpm-dcn2 or openstack-cell1-dcn2 | az-dcn2 |

Note: Keep all node sets in the same Nova cell to maintain unified scheduling through a shared cell. The default cell is cell1.
- Key differences from standard adoption
The following table summarizes the differences between standard adoption and DCN adoption:
Table 1.4. Comparison of standard and DCN adoption

| Aspect | Standard adoption | DCN adoption |
|---|---|---|
| Director stacks | Single stack | Multiple stacks (central + edge sites) |
| Network topology | Flat L2 networks | Routed L3 networks with multiple subnets |
| Data plane node sets | Single node set | Multiple node sets (one per site minimum) |
| Network routes | Usually not required | Required for inter-site connectivity |
| Physnets | Single physnet (for example, datacentre) | Multiple physnets (for example, leaf0, leaf1, leaf2) |
| Availability zones | Often single AZ | Multiple AZs (one per site) |
| OVN bridge mappings | Single mapping | Site-specific mappings |
| Provider networks | Single segment | Multi-segment routed provider networks |
- Requirements for DCN adoption
Before you adopt a DCN deployment, ensure that you have the following information:
- Network topology information for all sites (IP ranges, VLANs, gateways)
- Inter-site routing configuration (routes between site subnets)
- Mapping of director roles to availability zones
- OVN bridge mapping configuration for each site
You must complete the control plane adoption before you adopt any data plane nodes. After the control plane is adopted, you can run the edge site data plane adoptions in parallel with the central site data plane adoption.
- DCN adoption workflow overview
The adoption of a Distributed Compute Node (DCN) deployment from Red Hat OpenStack Platform (RHOSP) to Red Hat OpenStack Services on OpenShift (RHOSO) consists of the following stages:
- Control plane adoption: Adopt all control plane services from the central director stack to the RHOSO control plane. This is identical to standard adoption.
- Network configuration: Configure multi-subnet NetConfig and NetworkAttachmentDefinition CRs to support all site networks.
- Data plane node set creation: Create separate OpenStackDataPlaneNodeSet CRs for each site, each with site-specific network configurations:
  - Network subnet references
  - OVN bridge mappings (physnets)
  - Inter-site routing configuration
- Data plane deployment: Deploy all node sets. The edge site node sets can be deployed in parallel after the central site control plane is adopted.
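The following is a minimal, illustrative sketch of what a site-specific OpenStackDataPlaneNodeSet CR for an edge site might look like. The node set name, node name, availability zone convention, and the storagedcn1 and tenantdcn1 subnet names are assumptions based on the mapping and naming conventions described in this section; only ctlplanedcn1 and internalapidcn1 appear in the NetConfig example later in this guide. Treat it as a sketch, not a complete definition:

apiVersion: dataplane.openstack.org/v1beta1
kind: OpenStackDataPlaneNodeSet
metadata:
  name: openstack-edpm-dcn1
  namespace: openstack
spec:
  nodes:
    edpm-compute-dcn1-0:
      hostName: edpm-compute-dcn1-0
      networks:
      # The fixedIP value is illustrative and falls inside the DCN1 control plane range
      - defaultRoute: true
        fixedIP: 192.168.133.100
        name: ctlplane
        subnetName: ctlplanedcn1
      - name: internalapi
        subnetName: internalapidcn1
      - name: storage
        subnetName: storagedcn1
      - name: tenant
        subnetName: tenantdcn1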
1.7. Installing the systemd-container package on Compute hosts
Before you adopt the Red Hat OpenStack Services on OpenShift (RHOSO) data plane, you must verify that the systemd-container package is installed and that systemd-machined is running on all the Compute hosts. You must install the systemd-container package on each Compute host that does not have this package.
Procedure
- Log in to the Compute node host as a user with the appropriate permissions.
List the instances that are running on the host:
$ sudo machinectl list

Sample output:

MACHINE                  CLASS SERVICE      OS VERSION ADDRESSES
qemu-1-instance-000000b9 vm    libvirt-qemu -  -       -
qemu-2-instance-000000c2 vm    libvirt-qemu -  -       -

2 machines listed.
Verify that the systemd-machined service is running:

$ sudo systemctl status systemd-machined.service

Sample output:

systemd-machined.service - Virtual Machine and Container Registration Service
     Loaded: loaded (/usr/lib/systemd/system/systemd-machined.service; static)
     Active: active (running) since Mon 2025-06-16 11:42:07 EDT; 2min 48s ago
       Docs: man:systemd-machined.service(8)
             man:org.freedesktop.machine1(5)
   Main PID: 136614 (systemd-machine)
     Status: "Processing requests..."
      Tasks: 1 (limit: 838860)
     Memory: 1.4M
        CPU: 33ms
     CGroup: /system.slice/systemd-machined.service
             └─136614 /usr/lib/systemd/systemd-machined

Jun 16 11:42:07 computehost001 systemd[1]: Starting Virtual Machine and Container Registration Service...
Jun 16 11:42:07 computehost001 systemd[1]: Started Virtual Machine and Container Registration Service.
Jun 16 11:43:44 computehost001 systemd-machined[136614]: New machine qemu-1-instance-000000b9.
Jun 16 11:43:51 computehost001 systemd-machined[136614]: New machine qemu-2-instance-000000c2.

Important: If the systemd-machined service is running, skip the rest of this procedure. Ensure that you verify that the systemd-machined service is running on each Compute node host in the cluster.
- If the systemd-machined service is not running, live migrate all virtual machines from the host before you install the systemd-container package. For more information about live migration, see Rebooting Compute nodes in Performing a minor update of Red Hat OpenStack Platform.
- Install the systemd-container package on the host:
  - If you upgraded your environment from an earlier version of Red Hat OpenStack Platform, reboot the Compute host to install the systemd-container package automatically.
  - If you deployed a new RHOSO environment, install the systemd-container package manually by using the following command. Rebooting the Compute host is not required:

    $ sudo dnf -y install systemd-container

    Note: If your Compute host is not running a virtual machine, you can install the systemd-container package automatically or manually.
- Repeat this procedure on each Compute host in the cluster where the systemd-machined service is not running.
1.8. Identity service authentication
If you have custom policies enabled, complete the following steps for adoption:
- Remove custom policies.
- Run the adoption.
- Re-add custom policies by using the new SRBAC syntax.
Red Hat does not support customized roles or policies. Syntax errors or misapplied authorization can negatively impact security or usability. If you need customized roles or policies in your production environment, contact Red Hat support for a support exception before you begin the adoption.
After you adopt a director-based OpenStack deployment to a Red Hat OpenStack Services on OpenShift deployment, the Identity service performs user authentication and authorization by using Secure RBAC (SRBAC). If SRBAC is already enabled, then there is no change to how you perform operations. If SRBAC is disabled, then adopting a director-based OpenStack deployment might change how you perform operations due to changes in API access policies.
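For reference, SRBAC-style policies combine a role check with a project or system scope check. The following is a minimal, illustrative sketch of a single rule in oslo.policy YAML format; the rule name and check string are examples only, not the service defaults, and any override of this kind requires the support exception described above:

# Illustrative only: allow users with the reader role to list servers in their own project
"os_compute_api:servers:index": "role:reader and project_id:%(project_id)s"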
1.9. Configuring the network for the Red Hat OpenStack Services on OpenShift deployment
When you adopt a new Red Hat OpenStack Services on OpenShift (RHOSO) deployment, you must align the network configuration with the adopted cluster to maintain connectivity for existing workloads.
Perform the following tasks to incorporate the existing network configuration:
- Configure Red Hat OpenShift Container Platform (RHOCP) worker nodes to align VLAN tags and IP Address Management (IPAM) configuration with the existing deployment.
- Configure control plane services to use compatible IP ranges for service and load-balancing IP addresses.
- Configure data plane nodes to use corresponding compatible configuration for VLAN tags and IPAM.
When configuring nodes and services, the general approach is as follows:
- For IPAM, you can either reuse subnet ranges from the existing deployment or, if there is a shortage of free IP addresses in existing subnets, define new ranges for the new control plane services. If you define new ranges, you configure IP routing between the old and new ranges.
- For VLAN tags, always reuse the configuration from the existing deployment.
1.9.1. Retrieving the network configuration from your existing deployment
You must determine which isolated networks are defined in your existing deployment. After you retrieve your network configuration, you have the following information:
- A list of isolated networks that are used in the existing deployment.
- For each of the isolated networks, the VLAN tag and IP ranges used for dynamic address allocation.
- A list of existing IP address allocations that are used in the environment. When reusing the existing subnet ranges to host the new control plane services, these addresses are excluded from the corresponding allocation pools.
Procedure
- Find the network configuration in the network_data.yaml file. For example:

  - name: InternalApi
    mtu: 1500
    vip: true
    vlan: 20
    name_lower: internal_api
    dns_domain: internal.mydomain.tld.
    service_net_map_replace: internal
    subnets:
      internal_api_subnet:
        ip_subnet: '172.17.0.0/24'
        allocation_pools: [{'start': '172.17.0.4', 'end': '172.17.0.250'}]

- Retrieve the VLAN tag that is used in the vlan key and the IP range in the ip_subnet key for each isolated network from the network_data.yaml file. When reusing subnet ranges from the existing deployment for the new control plane services, the ranges are split into separate pools for control plane services and load-balancer IP addresses.
- Use the tripleo-ansible-inventory.yaml file to determine the list of IP addresses that are already consumed in the adopted environment. For each listed host in the file, make a note of the IP and VIP addresses that are consumed by the node. For example:

  Standalone:
    hosts:
      standalone:
        ...
        internal_api_ip: 172.17.0.100
        ...
  ...
  standalone:
    children:
      Standalone: {}
    vars:
      ...
      internal_api_vip: 172.17.0.2
      ...

  Note: In this example, the 172.17.0.2 and 172.17.0.100 values are consumed and are not available for the new control plane services until the adoption is complete.
- Repeat this procedure for each isolated network and each host in the configuration.
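For example, to collect candidate consumed addresses quickly, you can filter the inventory for IP and VIP keys. This is only a convenience sketch; the exact key names depend on the networks in your configuration:

$ grep -E '_(ip|vip):' tripleo-ansible-inventory.yaml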
1.9.2. Planning your IPAM configuration
In a Red Hat OpenStack Services on OpenShift (RHOSO) deployment, each service that is deployed on the Red Hat OpenShift Container Platform (RHOCP) worker nodes requires an IP address from the IP Address Management (IPAM) pool. In a Red Hat OpenStack Platform (RHOSP) deployment, all services that are hosted on a Controller node share the same IP address.
The RHOSO control plane has different requirements for the number of IP addresses that are made available for services. Depending on the size of the IP ranges that are used in the existing RHOSP deployment, you might reuse these ranges for the RHOSO control plane.
The total number of IP addresses that are required for the new control plane services in each isolated network is calculated as the sum of the following:
- The number of RHOCP worker nodes. Each worker node requires 1 IP address in the NodeNetworkConfigurationPolicy custom resource (CR).
- The number of IP addresses required for the data plane nodes. Each node requires an IP address from the NetConfig CRs.
- The number of IP addresses required for control plane services. Each service requires an IP address from the NetworkAttachmentDefinition CRs. This number depends on the number of replicas for each service.
- The number of IP addresses required for load balancer IP addresses. Each service requires a Virtual IP address from the IPAddressPool CRs.
For example, a simple single worker node RHOCP deployment with Red Hat OpenShift Local has the following IP ranges defined for the internalapi network:
- 1 IP address for the single worker node
- 1 IP address for the data plane node
- NetworkAttachmentDefinition CRs for control plane services: X.X.X.30-X.X.X.70 (41 addresses)
- IPAddressPool CRs for load balancer IPs: X.X.X.80-X.X.X.90 (11 addresses)
This example shows a total of 54 IP addresses allocated to the internalapi allocation pools.
The requirements might differ depending on the list of RHOSP services to be deployed, their replica numbers, and the number of RHOCP worker nodes and data plane nodes.
Additional IP addresses might be required in future RHOSP releases, so you must plan for some extra capacity for each of the allocation pools that are used in the new environment.
After you determine the required IP pool size for the new deployment, you can choose to define new IP address ranges or reuse your existing IP address ranges. Regardless of the scenario, the VLAN tags in the existing deployment are reused in the new deployment. Ensure that the VLAN tags are properly retained in the new configuration.
1.9.2.1. Configuring new subnet ranges
If you are using IPv6, you can reuse existing subnet ranges in most cases. For more information about existing subnet ranges, see Reusing existing subnet ranges.
You can define new IP ranges for control plane services that belong to a different subnet that is not used in the existing cluster. Then you configure link local IP routing between the existing and new subnets to enable existing and new service deployments to communicate. This involves using the director mechanism on a pre-adopted cluster to configure additional link local routes. This enables the data plane deployment to reach out to Red Hat OpenStack Platform (RHOSP) nodes by using the existing subnet addresses. You can use new subnet ranges with any existing subnet configuration, and when the existing cluster subnet ranges do not have enough free IP addresses for the new control plane services.
You must size the new subnet appropriately to accommodate the new control plane services. There are no specific requirements for the existing deployment allocation pools that are already consumed by the RHOSP environment.
Defining a new subnet for Storage and Storage management is not supported because Compute service (nova) and Red Hat Ceph Storage do not allow modifying those networks during adoption.
In the following procedure, you configure NetworkAttachmentDefinition custom resources (CRs) to use a different subnet from what is configured in the network_config section of the OpenStackDataPlaneNodeSet CR for the same networks. The new range in the NetworkAttachmentDefinition CR is used for control plane services, while the existing range in the OpenStackDataPlaneNodeSet CR is used to manage IP Address Management (IPAM) for data plane nodes.
The values that are used in the following procedure are examples. Use values that are specific to your configuration.
Procedure
- Configure link local routes on the existing deployment nodes for the control plane subnets. This is done through director configuration:

  network_config:
  - type: ovs_bridge
    name: br-ctlplane
    routes:
    - ip_netmask: 0.0.0.0/0
      next_hop: 192.168.1.1
    - ip_netmask: 172.31.0.0/24
      next_hop: 192.168.1.100

  - ip_netmask defines the new control plane subnet.
  - next_hop defines the control plane IP address of the existing data plane node.

  Repeat this configuration for other networks that need to use different subnets for the new and existing parts of the deployment.
- Apply the new configuration to every RHOSP node:

  (undercloud)$ openstack overcloud network provision \
    --output <deployment_file> \
    [--templates <templates_directory>] \
    /home/stack/templates/<networks_definition_file>

  (undercloud)$ openstack overcloud node provision \
    --stack <stack> \
    --network-config \
    --output <deployment_file> \
    [--templates <templates_directory>] \
    /home/stack/templates/<node_definition_file>

  - Optional: Include the --templates option to use your own templates instead of the default templates located in /usr/share/openstack-tripleo-heat-templates. Replace <templates_directory> with the path to the directory that contains your templates.
  - Replace <stack> with the name of the stack for which the bare-metal nodes are provisioned. If not specified, the default is overcloud.
  - Include the --network-config optional argument to provide the network definitions to the cli-overcloud-node-network-config.yaml Ansible playbook. The cli-overcloud-node-network-config.yaml playbook uses the os-net-config tool to apply the network configuration on the deployed nodes. If you do not use --network-config to provide the network definitions, then you must configure the {{role.name}}NetworkConfigTemplate parameters in your network-environment.yaml file, otherwise the default network definitions are used.
  - Replace <deployment_file> with the name of the heat environment file to generate for inclusion in the deployment command, for example /home/stack/templates/overcloud-baremetal-deployed.yaml.
  - Replace <node_definition_file> with the name of your node definition file, for example, overcloud-baremetal-deploy.yaml. Ensure that the network_config_update variable is set to true in the node definition file.

  Note: Network configuration changes are not applied by default to avoid the risk of network disruption. You must enforce the changes by setting StandaloneNetworkConfigUpdate: true in the director configuration files.
- Confirm that there are new link local routes to the new subnet on each node. For example:

  # ip route | grep 172
  172.31.0.0/24 via 192.168.122.100 dev br-ctlplane
- You also must configure link local routes to the existing deployment on Red Hat OpenStack Services on OpenShift (RHOSO) worker nodes. This is achieved by adding routes entries to the NodeNetworkConfigurationPolicy CRs for each network. For example:

  - destination: 192.168.122.0/24
    next-hop-interface: ospbr

  - destination defines the original subnet of the isolated network on the data plane.
  - next-hop-interface defines the Red Hat OpenShift Container Platform (RHOCP) worker network interface that corresponds to the isolated network on the data plane.

  As a result, the following route is added to your RHOCP nodes:

  # ip route | grep 192
  192.168.122.0/24 dev ospbr proto static scope link
- Later, during the data plane adoption, in the network_config section of the OpenStackDataPlaneNodeSet CR, add the same link local routes for the new control plane subnet ranges. For example:

  nodeTemplate:
    ansible:
      ansibleUser: root
      ansibleVars:
        additional_ctlplane_host_routes:
        - ip_netmask: 172.31.0.0/24
          next_hop: '{{ ctlplane_ip }}'
        edpm_network_config_template: |
          network_config:
          - type: ovs_bridge
            routes: {{ ctlplane_host_routes + additional_ctlplane_host_routes }}
          ...
- List the IP addresses that are used for the data plane nodes in the existing deployment as ansibleHost and fixedIP. For example:

  nodes:
    standalone:
      ansible:
        ansibleHost: 192.168.122.100
        ansibleUser: ""
      hostName: standalone
      networks:
      - defaultRoute: true
        fixedIP: 192.168.122.100
        name: ctlplane
        subnetName: subnet1

  Important: Do not change RHOSP node IP addresses during the adoption process. List previously used IP addresses in the fixedIP fields for each node entry in the nodes section of the OpenStackDataPlaneNodeSet CR.
- Expand the SSH range for the firewall configuration to include both subnets to allow SSH access to data plane nodes from both subnets:

  edpm_sshd_allowed_ranges:
  - 192.168.122.0/24
  - 172.31.0.0/24

  This provides SSH access from the new subnet to the RHOSP nodes as well as the RHOSP subnets.
1.9.2.2. Reusing existing subnet ranges
You can reuse existing subnet ranges if they have enough IP addresses to allocate to the new control plane services. You configure the new control plane services to use the same subnet as you used in the Red Hat OpenStack Platform (RHOSP) environment, and configure the allocation pools that are used by the new services to exclude IP addresses that are already allocated to existing cluster nodes. By reusing existing subnets, you avoid additional link local route configuration between the existing and new subnets.
If your existing subnets do not have enough IP addresses in the existing subnet ranges for the new control plane services, you must create new subnet ranges.
No special routing configuration is required to reuse subnet ranges. However, you must ensure that the IP addresses that are consumed by RHOSP services do not overlap with the new allocation pools configured for Red Hat OpenStack Services on OpenShift control plane services.
If you are especially constrained by the size of the existing subnet, you may have to apply elaborate exclusion rules when defining allocation pools for the new control plane services.
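For example, the following whereabouts IPAM fragment from a NetworkAttachmentDefinition CR sketches how an existing subnet can be reused while excluding addresses that the RHOSP nodes already consume. The addresses are illustrative, and the full CR format is shown later in Configuring isolated networks on control plane services:

"ipam": {
  "type": "whereabouts",
  "range": "172.17.0.0/24",
  "range_start": "172.17.0.20",
  "range_end": "172.17.0.50",
  "exclude": [
    "172.17.0.24/32",
    "172.17.0.44/31"
  ]
}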
1.9.3. Configuring isolated networks
Before you begin replicating your existing VLAN and IPAM configuration in the Red Hat OpenStack Services on OpenShift (RHOSO) environment, you must have the following IP address allocations for the new control plane services:
- 1 IP address for each isolated network on each Red Hat OpenShift Container Platform (RHOCP) worker node. You configure these IP addresses in the NodeNetworkConfigurationPolicy custom resources (CRs) for the RHOCP worker nodes.
- 1 IP range for each isolated network for the data plane nodes. You configure these ranges in the NetConfig CRs for the data plane nodes.
- 1 IP range for each isolated network for control plane services. These ranges enable pod connectivity for isolated networks in the NetworkAttachmentDefinition CRs.
- 1 IP range for each isolated network for load balancer IP addresses. These IP ranges define load balancer IP addresses for MetalLB in the IPAddressPool CRs.
The exact list and configuration of isolated networks in the following procedures should reflect the actual Red Hat OpenStack Platform environment. The number of isolated networks might differ from the examples used in the procedures. The IPAM scheme might also differ. Only the parts of the configuration that are relevant to configuring networks are shown. The values that are used in the following procedures are examples. Use values that are specific to your configuration.
1.9.3.1. Configuring isolated networks on RHOCP worker nodes
To connect service pods to isolated networks on Red Hat OpenShift Container Platform (RHOCP) worker nodes that run Red Hat OpenStack Platform services, physical network configuration on the hypervisor is required.
This configuration is managed by the NMState operator, which uses NodeNetworkConfigurationPolicy custom resources (CRs) to define the desired network configuration for the nodes.
Procedure
- For each RHOCP worker node, define a NodeNetworkConfigurationPolicy CR that describes the desired network configuration. For example:

  apiVersion: v1
  items:
  - apiVersion: nmstate.io/v1
    kind: NodeNetworkConfigurationPolicy
    spec:
      desiredState:
        interfaces:
        - description: internalapi vlan interface
          ipv4:
            address:
            - ip: 172.17.0.10
              prefix-length: 24
            dhcp: false
            enabled: true
          ipv6:
            enabled: false
          name: enp6s0.20
          state: up
          type: vlan
          vlan:
            base-iface: enp6s0
            id: 20
            reorder-headers: true
        - description: storage vlan interface
          ipv4:
            address:
            - ip: 172.18.0.10
              prefix-length: 24
            dhcp: false
            enabled: true
          ipv6:
            enabled: false
          name: enp6s0.21
          state: up
          type: vlan
          vlan:
            base-iface: enp6s0
            id: 21
            reorder-headers: true
        - description: tenant vlan interface
          ipv4:
            address:
            - ip: 172.19.0.10
              prefix-length: 24
            dhcp: false
            enabled: true
          ipv6:
            enabled: false
          name: enp6s0.22
          state: up
          type: vlan
          vlan:
            base-iface: enp6s0
            id: 22
            reorder-headers: true
      nodeSelector:
        kubernetes.io/hostname: ocp-worker-0
        node-role.kubernetes.io/worker: ""

  Note: For environments that are enabled with border gateway protocol (BGP), you might need to add additional routes in the NodeNetworkConfigurationPolicy CR so that RHOCP worker nodes can reach the Red Hat OpenStack Platform Controller nodes and Compute nodes over the control plane and internal API networks.
  When you configure the RHOCP worker nodes network in the NodeNetworkConfigurationPolicy CR, add routes for each of the following networks:
  - External network (for example, 172.31.0.0/24)
  - Control plane network (for example, 192.168.188.0/24)
  - BGP main network (for example, 99.99.0.0/16)

  The following example shows the routes.config section from a NodeNetworkConfigurationPolicy CR for a worker node with BGP configured. In this example, 100.64.0.17 and 100.65.0.17 are the IP addresses of the leaf switches that are connected to the specific RHOCP node:

  routes:
    config:
    - destination: 99.99.0.0/16
      next-hop-address: 100.64.0.17
      next-hop-interface: enp7s0
      weight: 200
    - destination: 99.99.0.0/16
      next-hop-address: 100.65.0.17
      next-hop-interface: enp8s0
      weight: 200
    - destination: 172.31.0.0/24
      next-hop-address: 100.64.0.17
      next-hop-interface: enp7s0
      weight: 200
    - destination: 172.31.0.0/24
      next-hop-address: 100.65.0.17
      next-hop-interface: enp8s0
      weight: 200
    - destination: 192.168.188.0/24
      next-hop-address: 100.64.0.17
      next-hop-interface: enp7s0
      weight: 200
    - destination: 192.168.188.0/24
      next-hop-address: 100.65.0.17
      next-hop-interface: enp8s0
      weight: 200
1.9.3.2. Configuring isolated networks on control plane services
After the NMState operator creates the desired hypervisor network configuration for isolated networks, you must configure the Red Hat OpenStack Platform (RHOSP) services to use the configured interfaces. You define a NetworkAttachmentDefinition custom resource (CR) for each isolated network. In some clusters, these CRs are managed by the Cluster Network Operator, in which case you use Network CRs instead. For more information, see Cluster Network Operator in Networking.
Procedure
- Define a NetworkAttachmentDefinition CR for each isolated network. For example:

  apiVersion: k8s.cni.cncf.io/v1
  kind: NetworkAttachmentDefinition
  metadata:
    name: internalapi
  spec:
    config: |
      {
        "cniVersion": "0.3.1",
        "name": "internalapi",
        "type": "macvlan",
        "master": "enp6s0.20",
        "ipam": {
          "type": "whereabouts",
          "range": "172.17.0.0/24",
          "range_start": "172.17.0.20",
          "range_end": "172.17.0.50"
        }
      }

  Important: Ensure that the interface name and IPAM range match the configuration that you used in the NodeNetworkConfigurationPolicy CRs.
- Optional: When reusing existing IP ranges, you can exclude part of the range that is used in the existing deployment by using the exclude parameter in the NetworkAttachmentDefinition pool. For example:

  apiVersion: k8s.cni.cncf.io/v1
  kind: NetworkAttachmentDefinition
  metadata:
    name: internalapi
  spec:
    config: |
      {
        "cniVersion": "0.3.1",
        "name": "internalapi",
        "type": "macvlan",
        "master": "enp6s0.20",
        "ipam": {
          "type": "whereabouts",
          "range": "172.17.0.0/24",
          "range_start": "172.17.0.20",
          "range_end": "172.17.0.50",
          "exclude": [
            "172.17.0.24/32",
            "172.17.0.44/31"
          ]
        }
      }

  - spec.config.ipam.range_start defines the start of the IP range.
  - spec.config.ipam.range_end defines the end of the IP range.
  - spec.config.ipam.exclude excludes part of the IP range. This example excludes IP addresses 172.17.0.24/32 and 172.17.0.44/31 from the allocation pool.
- If your RHOSP services require load balancer IP addresses, define the pools for these services in an IPAddressPool CR. For example:

  Note: The load balancer IP addresses belong to the same IP range as the control plane services, and are managed by MetalLB. This pool should also be aligned with the RHOSP configuration.

  - apiVersion: metallb.io/v1beta1
    kind: IPAddressPool
    spec:
      addresses:
      - 172.17.0.60-172.17.0.70

  Define IPAddressPool CRs for each isolated network that requires load balancer IP addresses.
- Optional: When reusing existing IP ranges, you can exclude part of the range by listing multiple entries in the addresses section of the IPAddressPool. For example:

  - apiVersion: metallb.io/v1beta1
    kind: IPAddressPool
    spec:
      addresses:
      - 172.17.0.60-172.17.0.64
      - 172.17.0.66-172.17.0.70

  The example above excludes the 172.17.0.65 address from the allocation pool.
- For environments that are enabled with border gateway protocol (BGP), add routes to the NetworkAttachmentDefinition CRs so that the pods can communicate with the Red Hat OpenStack Platform Controller nodes and Compute nodes over the isolated networks. This is similar to the routes that should be added to the NodeNetworkConfigurationPolicy CRs in BGP environments. For more information about isolated networks, see Configuring isolated networks on RHOCP worker nodes. The following example shows a NetworkAttachmentDefinition CR for the storage network with routes:

  apiVersion: k8s.cni.cncf.io/v1
  kind: NetworkAttachmentDefinition
  metadata:
    name: storage
    namespace: openstack
  spec:
    config: |
      {
        "cniVersion": "0.3.1",
        "name": "storage",
        "type": "bridge",
        "isDefaultGateway": false,
        "isGateway": true,
        "forceAddress": false,
        "hairpinMode": true,
        "ipMasq": false,
        "bridge": "storage",
        "ipam": {
          "type": "whereabouts",
          "range": "172.18.0.0/24",
          "range_start": "172.18.0.30",
          "range_end": "172.18.0.70",
          "routes": [
            {"dst": "172.31.0.0/24", "gw": "172.18.0.1"},
            {"dst": "192.168.188.0/24", "gw": "172.18.0.1"},
            {"dst": "99.99.0.0/16", "gw": "172.18.0.1"}
          ]
        }
      }
1.9.3.3. Configuring isolated networks on data plane nodes
Data plane nodes are configured by the OpenStack Operator and your OpenStackDataPlaneNodeSet custom resources (CRs). The OpenStackDataPlaneNodeSet CRs define your desired network configuration for the nodes.
Your Red Hat OpenStack Services on OpenShift (RHOSO) network configuration should reflect the existing Red Hat OpenStack Platform (RHOSP) network setup. You must pull the network_data.yaml files from each RHOSP node and reuse them when you define the OpenStackDataPlaneNodeSet CRs. The format of the configuration does not change, so you can put network templates under edpm_network_config_template variables, either for all nodes or for each node.
Procedure
- Configure a NetConfig CR with your desired VLAN tags and IPAM configuration. For example:

  apiVersion: network.openstack.org/v1beta1
  kind: NetConfig
  metadata:
    name: netconfig
  spec:
    networks:
    - name: internalapi
      dnsDomain: internalapi.example.com
      subnets:
      - name: subnet1
        allocationRanges:
        - end: 172.17.0.250
          start: 172.17.0.100
        cidr: 172.17.0.0/24
        vlan: 20
    - name: storage
      dnsDomain: storage.example.com
      subnets:
      - name: subnet1
        allocationRanges:
        - end: 172.18.0.250
          start: 172.18.0.100
        cidr: 172.18.0.0/24
        vlan: 21
    - name: tenant
      dnsDomain: tenant.example.com
      subnets:
      - name: subnet1
        allocationRanges:
        - end: 172.19.0.250
          start: 172.19.0.100
        cidr: 172.19.0.0/24
        vlan: 22

  where:
  - spec.networks specifies the networks composition. The networks composition must match the source cloud configuration to avoid data plane connectivity downtime.
- Optional: In the NetConfig CR, list multiple ranges for the allocationRanges field to exclude some of the IP addresses, for example, to accommodate IP addresses that are already consumed by the adopted environment:

  apiVersion: network.openstack.org/v1beta1
  kind: NetConfig
  metadata:
    name: netconfig
  spec:
    networks:
    - name: internalapi
      dnsDomain: internalapi.example.com
      subnets:
      - name: subnet1
        allocationRanges:
        - end: 172.17.0.199
          start: 172.17.0.100
        - end: 172.17.0.250
          start: 172.17.0.201
        cidr: 172.17.0.0/24
        vlan: 20

  This example excludes the 172.17.0.200 address from the pool.
1.10. Configuring spine-leaf networks for the Red Hat OpenStack Services on OpenShift deployment
When you adopt a Red Hat OpenStack Platform (RHOSP) deployment with spine-leaf networking, such as a Distributed Compute Node (DCN) architecture, you must configure each L2 network segment with a separate IP subnet and create routed provider networks. Traffic between sites is routed at L3 through spine routers or similar network infrastructure.
You must configure routing for Compute nodes at edge sites to connect with control plane services, such as RabbitMQ or the database at the central site. The cloud will not function correctly without routes configured.
DHCP relay is not supported in adopted Red Hat OpenStack Services on OpenShift (RHOSO) environments with spine-leaf topologies. This affects bare-metal provisioning scenarios that use PXE boot.
If you need to provision bare-metal nodes at edge sites, use Redfish virtual media or similar BMC virtual media features instead of PXE boot.
| Destination network | Next hop | Purpose |
|---|---|---|
| 172.17.0.0/24 | 172.17.10.1 | Route to central internalapi |
| 172.17.20.0/24 | 172.17.10.1 | Route to DCN2 internalapi |
| 172.18.0.0/24 | 172.18.10.1 | Route to central storage |
| 172.18.20.0/24 | 172.18.10.1 | Route to DCN2 storage |
You configure these routes in the edpm_network_config_template within the OpenStackDataPlaneNodeSet custom resource (CR) for each site.
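For example, a DCN1 node set might carry the routes from the table above in its network configuration template. The following is an illustrative sketch only: it reuses the os-net-config route format shown earlier in this guide, and the interface name, VLAN ID, and address templating are assumptions, not values taken from a working deployment:

nodeTemplate:
  ansible:
    ansibleVars:
      edpm_network_config_template: |
        network_config:
        - type: vlan
          vlan_id: 30
          device: nic2
          # Address assignment is illustrative; use the templating your node set already defines
          addresses:
          - ip_netmask: 172.17.10.50/24
          routes:
          - ip_netmask: 172.17.0.0/24
            next_hop: 172.17.10.1
          - ip_netmask: 172.17.20.0/24
            next_hop: 172.17.10.1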
| Network | Central site | DCN1 site | DCN2 site |
|---|---|---|---|
| Control plane | 192.168.122.0/24 | 192.168.133.0/24 | 192.168.144.0/24 |
| Internal API | 172.17.0.0/24 | 172.17.10.0/24 | 172.17.20.0/24 |
| Storage | 172.18.0.0/24 | 172.18.10.0/24 | 172.18.20.0/24 |
| Tenant | 172.19.0.0/24 | 172.19.10.0/24 | 172.19.20.0/24 |
When you adopt a spine-leaf deployment, you configure the NetConfig CR with multiple subnets for each service network. Each subnet represents a different site.
Example NetConfig with multiple subnets per network
apiVersion: network.openstack.org/v1beta1
kind: NetConfig
metadata:
name: netconfig
spec:
networks:
- name: ctlplane
dnsDomain: ctlplane.example.com
subnets:
- name: subnet1 # Central site
allocationRanges:
- end: 192.168.122.120
start: 192.168.122.100
cidr: 192.168.122.0/24
gateway: 192.168.122.1
- name: ctlplanedcn1 # DCN1 site
allocationRanges:
- end: 192.168.133.120
start: 192.168.133.100
cidr: 192.168.133.0/24
gateway: 192.168.133.1
- name: ctlplanedcn2 # DCN2 site
allocationRanges:
- end: 192.168.144.120
start: 192.168.144.100
cidr: 192.168.144.0/24
gateway: 192.168.144.1
- name: internalapi
dnsDomain: internalapi.example.com
subnets:
- name: subnet1 # Central site
allocationRanges:
- end: 172.17.0.250
start: 172.17.0.100
cidr: 172.17.0.0/24
vlan: 20
- name: internalapidcn1 # DCN1 site
allocationRanges:
- end: 172.17.10.250
start: 172.17.10.100
cidr: 172.17.10.0/24
vlan: 30
- name: internalapidcn2 # DCN2 site
allocationRanges:
- end: 172.17.20.250
start: 172.17.20.100
cidr: 172.17.20.0/24
vlan: 40
- Each network defines multiple subnets, one for each site.
- Each site uses unique VLAN IDs. In this example, central uses VLANs 20-23, DCN1 uses VLANs 30-33, and DCN2 uses VLANs 40-43.
- The subnet naming convention typically uses subnet1 for the central site and site-specific names like internalapidcn1 for edge sites.
Because the sites are geographically distributed, each site requires its own provider network (physnet). The Networking service (neutron) must be configured to recognize all physnets.
Example Neutron ML2 configuration for multiple physnets
[ml2_type_vlan]
network_vlan_ranges = leaf0:1:1000,leaf1:1:1000,leaf2:1:1000
[neutron]
physnets = leaf0,leaf1,leaf2
- leaf0 corresponds to the central site.
- leaf1 corresponds to the DCN1 site.
- leaf2 corresponds to the DCN2 site.
When you create routed provider networks in RHOSO, you create network segments that map to these physnets:
- Segment for central: physnet=leaf0, subnet=192.168.122.0/24
- Segment for DCN1: physnet=leaf1, subnet=192.168.133.0/24
- Segment for DCN2: physnet=leaf2, subnet=192.168.144.0/24
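For illustration, the following commands sketch how such a routed provider network and its per-site segments can be created with the OpenStack client. The network and segment names, VLAN IDs, and allocation details are assumptions based on the example mapping above; adapt them to your environment:

$ openstack network create --share \
    --provider-network-type vlan \
    --provider-physical-network leaf0 \
    --provider-segment 100 \
    multisegment-net

$ openstack network segment create --network multisegment-net \
    --network-type vlan --physical-network leaf1 --segment 100 \
    segment-dcn1

$ openstack subnet create --network multisegment-net \
    --network-segment segment-dcn1 \
    --subnet-range 192.168.133.0/24 \
    subnet-dcn1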
1.11. Storage requirements
Storage in a Red Hat OpenStack Platform (RHOSP) deployment refers to the following types:
- The storage that is needed for the service to run
- The storage that the service manages
Before you can deploy the services in Red Hat OpenStack Services on OpenShift (RHOSO), you must review the storage requirements, plan your Red Hat OpenShift Container Platform (RHOCP) node selection, prepare your RHOCP nodes, and so on.
1.11.1. Storage driver certification
Before you adopt your Red Hat OpenStack Platform 17.1 deployment to a Red Hat OpenStack Services on OpenShift (RHOSO) 18.0 deployment, confirm that your deployed storage drivers are certified for use with RHOSO 18.0. For information on software certified for use with RHOSO 18.0, see the Red Hat Ecosystem Catalog.
1.11.2. Block Storage service guidelines
Prepare to adopt your Block Storage service (cinder):
- Take note of the Block Storage service back ends that you use.
- Determine all the transport protocols that the Block Storage service back ends use, such as RBD, iSCSI, FC, NFS, NVMe-TCP, and so on. You must consider them when you place the Block Storage services and when the right storage transport-related binaries are running on the Red Hat OpenShift Container Platform (RHOCP) nodes. For more information about each storage transport protocol, see RHOCP preparation for Block Storage service adoption.
- Use a separate Block Storage service volume service for each Block Storage service volume back end.
  For example, if you have an LVM back end and a Ceph back end, you need two entries in cinderVolumes, and you cannot set global defaults for all volume services. You must define a service for each of them:

  apiVersion: core.openstack.org/v1beta1
  kind: OpenStackControlPlane
  metadata:
    name: openstack
  spec:
    cinder:
      enabled: true
      template:
        cinderVolumes:
          lvm:
            customServiceConfig: |
              [DEFAULT]
              debug = True
              [lvm]
              < . . . >
          ceph:
            customServiceConfig: |
              [DEFAULT]
              debug = True
              [ceph]
              < . . . >

  Warning: Check that all configuration options are still valid for the RHOSO 18.0 version. Configuration options might be deprecated, removed, or added. This applies to both back-end driver-specific configuration options and other generic options.
1.11.3. Limitations for adopting the Block Storage service
Before you begin the Block Storage service (cinder) adoption, review the following limitations:
- There is no global nodeSelector option for all Block Storage service volumes. You must specify the nodeSelector for each back end.
- There are no global customServiceConfig or customServiceConfigSecrets options for all Block Storage service volumes. You must specify these options for each back end.
- Support for Block Storage service back ends that require kernel modules that are not included in Red Hat Enterprise Linux is not tested in Red Hat OpenStack Services on OpenShift (RHOSO).
1.11.4. RHOCP preparation for Block Storage service adoption
Before you deploy Red Hat OpenStack Platform (RHOSP) in Red Hat OpenShift Container Platform (RHOCP) nodes, ensure that the networks are ready, that you decide which RHOCP nodes to restrict, and that you make any necessary changes to the RHOCP nodes.
- Node selection
You might need to restrict the RHOCP nodes where the Block Storage service volume and backup services run.
An example of when you need to restrict nodes for a specific Block Storage service is when you deploy the Block Storage service with the LVM driver. In that scenario, the LVM data where the volumes are stored only exists in a specific host, so you need to pin the Block Storage-volume service to that specific RHOCP node. Running the service on any other RHOCP node does not work. You cannot use the RHOCP host node name to restrict the LVM back end. You need to identify the LVM back end by using a unique label, an existing label, or a new label:
$ oc label nodes worker0 lvm=cinder-volumesapiVersion: core.openstack.org/v1beta1 kind: OpenStackControlPlane metadata: name: openstack spec: secret: osp-secret storageClass: local-storage cinder: enabled: true template: cinderVolumes: lvm-iscsi: nodeSelector: lvm: cinder-volumes < . . . >For more information about node selection, see About node selectors.
NoteIf your nodes do not have enough local disk space for temporary images, you can use a remote NFS location by setting the extra volumes feature,
extraMounts.- Transport protocols
Some changes to the storage transport protocols might be required for RHOCP:
-
If you use a
MachineConfigto make changes to RHOCP nodes, the nodes reboot. -
Check the back-end sections that are listed in the
enabled_backendsconfiguration option in yourcinder.conffile to determine the enabled storage back-end sections. -
Depending on the back end, you can find the transport protocol by viewing the
volume_driverortarget_protocolconfiguration options. The
iscsidservice,multipathdservice, andNVMe-TCPkernel modules start automatically on data plane nodes.- NFS
- RHOCP connects to NFS back ends without additional changes.
- Rados Block Device and Red Hat Ceph Storage
- RHOCP connects to Red Hat Ceph Storage back ends without additional changes. You must provide credentials and configuration files to the services.
- iSCSI
- To connect to iSCSI volumes, the iSCSI initiator must run on the RHOCP hosts where the volume and backup services run. The Linux Open iSCSI initiator does not support network namespaces, so you must only run one instance of the service for the normal RHOCP usage, as well as the RHOCP CSI plugins and the RHOSP services.
If you are not already running
iscsidon the RHOCP nodes, then you must apply aMachineConfig. For example:apiVersion: machineconfiguration.openshift.io/v1 kind: MachineConfig metadata: labels: machineconfiguration.openshift.io/role: worker service: cinder name: 99-master-cinder-enable-iscsid spec: config: ignition: version: 3.2.0 systemd: units: - enabled: true name: iscsid.service-
If you use labels to restrict the nodes where the Block Storage services run, you must use a
MachineConfigPoolto limit the effects of theMachineConfigto the nodes where your services might run. For more information, see About node selectors. -
If you are using a single node deployment to test the process, replace
workerwithmasterin theMachineConfig. - For production deployments that use iSCSI volumes, configure multipathing for better I/O.
- FC
- The Block Storage service volume and Block Storage service backup services must run in an RHOCP host that has host bus adapters (HBAs). If some nodes do not have HBAs, then use labels to restrict where these services run. For more information, see About node selectors.
- If the Image service is configured to use Block Storage service as a back end with FC, the Image service must also run on an RHOCP host that has HBAs and follow the same node selection requirements as the Block Storage service.
- If you have virtualized RHOCP clusters that use FC you need to expose the host HBAs inside the virtual machine.
- For production deployments that use FC volumes, configure multipathing for better I/O.
- NVMe-TCP
- To connect to NVMe-TCP volumes, load NVMe-TCP kernel modules on the RHOCP hosts.
If you do not already load the
nvme-fabricsmodule on the RHOCP nodes where the volume and backup services are going to run, then you must apply aMachineConfig. For example:apiVersion: machineconfiguration.openshift.io/v1 kind: MachineConfig metadata: labels: machineconfiguration.openshift.io/role: worker service: cinder name: 99-master-cinder-load-nvme-fabrics spec: config: ignition: version: 3.2.0 storage: files: - path: /etc/modules-load.d/nvme_fabrics.conf overwrite: false # Mode must be decimal, this is 0644 mode: 420 user: name: root group: name: root contents: # Source can be a http, https, tftp, s3, gs, or data as defined in rfc2397. # This is the rfc2397 text/plain string format source: data:,nvme-fabrics-
If you use labels to restrict the nodes where Block Storage services run, use a
MachineConfigPoolto limit the effects of theMachineConfigto the nodes where your services run. For more information, see About node selectors. -
If you use a single node deployment to test the process, replace
workerwithmasterin theMachineConfig. -
Only load the
nvme-fabricsmodule because it loads the transport-specific modules, such as TCP, RDMA, or FC, as needed. - For production deployments that use NVMe-TCP volumes, use multipathing for better I/O. For NVMe-TCP volumes, RHOCP uses native multipathing, called ANA.
After the RHOCP nodes reboot and load the
nvme-fabricsmodule, you can confirm that the operating system is configured and that it supports ANA by checking the host:$ cat /sys/module/nvme_core/parameters/multipathImportantANA does not use the Linux Multipathing Device Mapper, but RHOCP requires
multipathdto run on Compute nodes for the Compute service (nova) to be able to use multipathing. Multipathing is automatically configured on data plane nodes when they are provisioned.
- Multipathing
Use multipathing for iSCSI and FC protocols. To configure multipathing on these protocols, you perform the following tasks:
- Prepare the RHOCP hosts
- Configure the Block Storage services
- Prepare the Compute service nodes
- Configure the Compute service
To prepare the RHOCP hosts, ensure that the Linux Multipath Device Mapper is configured and running on the RHOCP hosts by using
MachineConfig. For example:# Includes the /etc/multipathd.conf contents and the systemd unit changes apiVersion: machineconfiguration.openshift.io/v1 kind: MachineConfig metadata: labels: machineconfiguration.openshift.io/role: worker service: cinder name: 99-master-cinder-enable-multipathd spec: config: ignition: version: 3.2.0 storage: files: - path: /etc/multipath.conf overwrite: false # Mode must be decimal, this is 0600 mode: 384 user: name: root group: name: root contents: # Source can be a http, https, tftp, s3, gs, or data as defined in rfc2397. # This is the rfc2397 text/plain string format source: data:,defaults%20%7B%0A%20%20user_friendly_names%20no%0A%20%20recheck_wwid%20yes%0A%20%20skip_kpartx%20yes%0A%20%20find_multipaths%20yes%0A%7D%0A%0Ablacklist%20%7B%0A%7D systemd: units: - enabled: true name: multipathd.service-
If you use labels to restrict the nodes where Block Storage services run, you need to use a
MachineConfigPool to limit the effects of the MachineConfig to only the nodes where your services run. For more information, see About node selectors.
If you are using a single node deployment to test the process, replace
worker with master in the MachineConfig.
- Cinder volume and backup are configured by default to use multipathing.
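As an optional check that the multipathd MachineConfig took effect, you can query the service state on a node; the node name is a placeholder:

$ oc debug node/<node_name> -- chroot /host systemctl is-active multipathd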
1.11.5. Converting the Block Storage service configuration
In your previous deployment, you used the same cinder.conf file for all the services. To prepare your Block Storage service (cinder) configuration for adoption, split this single-file configuration into individual configurations for each Block Storage service. Review the following information to guide you in converting your previous configuration; a combined sketch of the result follows the examples in this section:
-
Determine what part of the configuration is generic for all the Block Storage services and remove anything that would change when deployed in Red Hat OpenShift Container Platform (RHOCP), such as the
connection in the [database] section, the transport_url and log_dir in the [DEFAULT] sections, the whole [coordination] and [barbican] sections. The remaining generic configuration goes into the customServiceConfig option, or a Secret custom resource (CR), and is then used in the customServiceConfigSecrets section, at the cinder: template: level.
Determine if there is a scheduler-specific configuration and add it to the
customServiceConfig option in cinder: template: cinderScheduler.
Determine if there is an API-specific configuration and add it to the
customServiceConfig option in cinder: template: cinderAPI.
If the Block Storage service backup is deployed, add the Block Storage service backup configuration options to
customServiceConfig option, or to a Secret CR that you can add to the customServiceConfigSecrets section at the cinder: template: cinderBackup: level. Remove the host configuration in the [DEFAULT] section to support multiple replicas later.
Determine the individual volume back-end configuration for each of the drivers. The configuration is in the specific driver section, and it includes the
[backend_defaults] section and FC zoning sections if you use them. The Block Storage service operator does not support a global customServiceConfig option for all volume services. Each back end has its own section under cinder: template: cinderVolumes, and the configuration goes in the customServiceConfig option or in a Secret CR and is then used in the customServiceConfigSecrets section. If any of the Block Storage service volume drivers require a custom vendor image, find the location of the image in the Red Hat Ecosystem Catalog, and create or modify an
OpenStackVersion CR to specify the custom image by using the key from the cinderVolumes section. For example, if you have the following configuration:
spec:
  cinder:
    enabled: true
    template:
      cinderVolume:
        pure:
          customServiceConfigSecrets:
            - openstack-cinder-pure-cfg
< . . . >

Then the
OpenStackVersion CR that describes the container image for that back end looks like the following example:

apiVersion: core.openstack.org/v1beta1
kind: OpenStackVersion
metadata:
  name: openstack
spec:
  customContainerImages:
    cinderVolumeImages:
      pure: registry.connect.redhat.com/purestorage/openstack-cinder-volume-pure-rhosp-18-0

Note

The name of the
OpenStackVersion must match the name of your OpenStackControlPlane CR.

If your Block Storage services use external files, for example, for a custom policy, or to store credentials or SSL certificate authority bundles to connect to a storage array, make those files available to the right containers. Use
SecretsorConfigMapto store the information in RHOCP and then in theextraMountskey. For example, for Red Hat Ceph Storage credentials that are stored in aSecretcalledceph-conf-files, you patch the top-levelextraMountskey in theOpenstackControlPlaneCR:spec: extraMounts: - extraVol: - extraVolType: Ceph mounts: - mountPath: /etc/ceph name: ceph readOnly: true propagation: - CinderVolume - CinderBackup - Glance volumes: - name: ceph projected: sources: - secret: name: ceph-conf-filesFor a service-specific file, such as the API policy, you add the configuration on the service itself. In the following example, you include the
CinderAPI configuration that references the policy you are adding from a ConfigMap called my-cinder-conf that has a policy key with the contents of the policy:

spec:
  cinder:
    enabled: true
    template:
      cinderAPI:
        customServiceConfig: |
          [oslo_policy]
          policy_file=/etc/cinder/api/policy.yaml
      extraMounts:
        - extraVol:
            - extraVolType: Ceph
              mounts:
                - mountPath: /etc/cinder/api
                  name: policy
                  readOnly: true
              propagation:
                - CinderAPI
              volumes:
                - name: policy
                  projected:
                    sources:
                      - configMap:
                          name: my-cinder-conf
                          items:
                            - key: policy
                              path: policy.yaml
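Taken together, a converted configuration might resemble the following minimal sketch. The option values and the ceph back-end name are illustrative assumptions, not values from your deployment; substitute the sections that you extracted from your own cinder.conf:

spec:
  cinder:
    enabled: true
    template:
      customServiceConfig: |
        [DEFAULT]
        debug = true
      cinderAPI:
        customServiceConfig: |
          [DEFAULT]
          osapi_volume_workers = 3
      cinderScheduler:
        customServiceConfig: |
          [DEFAULT]
          scheduler_max_attempts = 3
      cinderBackup:
        customServiceConfig: |
          [DEFAULT]
          backup_driver = cinder.backup.drivers.ceph.CephBackupDriver
          backup_ceph_pool = backups
      cinderVolumes:
        ceph:
          customServiceConfig: |
            [ceph]
            volume_backend_name = ceph
            volume_driver = cinder.volume.drivers.rbd.RBDDriver
            rbd_pool = volumes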
1.11.6. Changes to CephFS through NFS
Before you begin the adoption, review the following information to understand the changes to CephFS through NFS between Red Hat OpenStack Platform (RHOSP) 17.1 and Red Hat OpenStack Services on OpenShift (RHOSO) 18.0:
-
If the RHOSP 17.1 deployment uses CephFS through NFS as a back end for Shared File Systems service (manila), you cannot directly import the
ceph-nfs service on the RHOSP Controller nodes into RHOSO 18.0. In RHOSO 18.0, the Shared File Systems service only supports using a clustered NFS service that is directly managed on the Red Hat Ceph Storage cluster. Adoption with the ceph-nfs service involves a data path disruption to existing NFS clients.
On RHOSP 17.1, Pacemaker manages the high availability of the
ceph-nfs service. This service is assigned a Virtual IP (VIP) address that is also managed by Pacemaker. The VIP is typically created on an isolated StorageNFS network. The Controller nodes have ordering and collocation constraints established between this VIP, ceph-nfs, and the Shared File Systems service (manila) share manager service. Prior to adopting Shared File Systems service, you must adjust the Pacemaker ordering and collocation constraints to separate the share manager service. This establishes ceph-nfs with its VIP as an isolated, standalone NFS service that you can decommission after completing the RHOSO adoption.
- In Red Hat Ceph Storage 7, a native clustered Ceph NFS service has to be deployed on the Red Hat Ceph Storage cluster by using the Ceph Orchestrator prior to adopting the Shared File Systems service. This NFS service eventually replaces the standalone NFS service from RHOSP 17.1 in your deployment. When the Shared File Systems service is adopted into the RHOSO 18.0 environment, it establishes all the existing exports and client restrictions on the new clustered Ceph NFS service. Clients can continue to read and write data on existing NFS shares, and are not affected until the old standalone NFS service is decommissioned. After the service is decommissioned, you can re-mount the same share from the new clustered Ceph NFS service during a scheduled downtime.
-
To ensure that NFS users are not required to make any networking changes to their existing workloads, assign an IP address from the same isolated
StorageNFS network to the clustered Ceph NFS service. NFS users only need to discover and re-mount their shares by using new export paths. When the adoption is complete, RHOSO users can query the Shared File Systems service API to list the export locations on existing shares to identify the preferred paths to mount these shares. These preferred paths correspond to the new clustered Ceph NFS service, in contrast to other non-preferred export paths that continue to be displayed until the old isolated, standalone NFS service is decommissioned.
- When you migrate your workloads from the old NFS service, you must ensure that exports are not consumed from both the old NFS service and the new clustered Ceph NFS service at the same time. This simultaneous access to both services is considered dangerous and bypasses the protections for concurrent access that are ensured by the NFS protocol. When you migrate the workloads to use exports from the new NFS service, you must ensure that you migrate the use of each export entirely so that no part of the workload stays connected to the old NFS service.
-
You can no longer control the old Pacemaker-managed
ceph-nfs service through the Red Hat OpenStack Platform director after the control plane adoption is complete. This means that there is no support for updating the NFS Ganesha software, or changing any configuration. While data is protected from server crashes or restarts, high availability and data recovery is still limited, and these maintenance issues are no longer visible to the Shared File Systems service.
- Cloud administrators must ensure a reasonably short window to switch over all end-user workloads to the new NFS service.
-
While the old
ceph-nfs service only supported NFS version 4.1 and later, the new clustered NFS service supports NFS protocols 3 and 4.1 and later. Mixing protocol versions with an export results in unintended consequences. You should mount a given share across all clients by using a consistent NFS protocol version.
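For example, after the new export paths are available, a client re-mount that pins a consistent protocol version might look like the following. The VIP and export path are placeholders, not values from your deployment:

$ mount -t nfs -o vers=4.1 <ceph_nfs_vip>:/<export_path> /mnt/share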
1.12. Red Hat Ceph Storage prerequisites
Before you migrate your Red Hat Ceph Storage cluster daemons from your Controller nodes, you must complete the following tasks in your Red Hat OpenStack Platform 17.1 environment to prepare for the Red Hat OpenStack Services on OpenShift (RHOSO) adoption.
- Upgrade your Red Hat Ceph Storage cluster to release 7. For more information, see "Upgrading Red Hat Ceph Storage 6 to 7" in Framework for upgrades (16.2 to 17.1).
-
Your Red Hat Ceph Storage 7 deployment is managed by
cephadm.
- The undercloud is still available, and the nodes and networks are managed by director.
-
If you use an externally deployed Red Hat Ceph Storage cluster, you must recreate a
ceph-nfs cluster on the target nodes, as well as propagate the StorageNFS network. Complete the prerequisites for your specific Red Hat Ceph Storage environment:
- Red Hat Ceph Storage with monitoring stack components
- Red Hat Ceph Storage RGW
- Red Hat Ceph Storage RBD
- NFS Ganesha
1.12.1. Completing prerequisites for a Red Hat Ceph Storage cluster with monitoring stack components
Before you migrate a Red Hat Ceph Storage cluster with monitoring stack components, you must gather monitoring stack information, review and update the container image registry, and remove the undercloud container images.
In addition to updating the container images related to the monitoring stack, you must update the configuration entry related to the container_image_base. This has an impact on all the Red Hat Ceph Storage daemons that rely on the undercloud images. New daemons are deployed by using the new image registry location that is configured in the Red Hat Ceph Storage cluster.
Procedure
Gather the current status of the monitoring stack. Verify that the hosts have no
monitoring label, or the grafana, prometheus, or alertmanager labels in the case of a per-daemon placement evaluation:

Note

The entire relocation process is driven by
cephadm and relies on labels to be assigned to the target nodes, where the daemons are scheduled. For more information about assigning labels to nodes, review the Red Hat Knowledgebase article Red Hat Ceph Storage: Supported configurations.

[tripleo-admin@controller-0 ~]$ sudo cephadm shell -- ceph orch host ls
HOST                        ADDR           LABELS          STATUS
cephstorage-0.redhat.local  192.168.24.11  osd mds
cephstorage-1.redhat.local  192.168.24.12  osd mds
cephstorage-2.redhat.local  192.168.24.47  osd mds
controller-0.redhat.local   192.168.24.35  _admin mon mgr
controller-1.redhat.local   192.168.24.53  mon _admin mgr
controller-2.redhat.local   192.168.24.10  mon _admin mgr
6 hosts in cluster

Confirm that the cluster is healthy and that both
ceph orch ls and ceph orch ps return the expected number of deployed daemons.

Review and update the container image registry:
Note

If you run the Red Hat Ceph Storage externalization procedure after you migrate the Red Hat OpenStack Platform control plane, update the container images in the Red Hat Ceph Storage cluster configuration. The current container images point to the undercloud registry, which might not be available anymore. Because the undercloud is not available after adoption is complete, replace the undercloud-provided images with an alternative registry.
$ ceph config dump ... ... mgr advanced mgr/cephadm/container_image_alertmanager undercloud-0.ctlplane.redhat.local:8787/rh-osbs/openshift-ose-prometheus-alertmanager:v4.10 mgr advanced mgr/cephadm/container_image_base undercloud-0.ctlplane.redhat.local:8787/rh-osbs/rhceph mgr advanced mgr/cephadm/container_image_grafana undercloud-0.ctlplane.redhat.local:8787/rh-osbs/grafana:latest mgr advanced mgr/cephadm/container_image_node_exporter undercloud-0.ctlplane.redhat.local:8787/rh-osbs/openshift-ose-prometheus-node-exporter:v4.10 mgr advanced mgr/cephadm/container_image_prometheus undercloud-0.ctlplane.redhat.local:8787/rh-osbs/openshift-ose-prometheus:v4.10Remove the undercloud container images:
$ cephadm shell -- ceph config rm mgr mgr/cephadm/container_image_base
$ for i in prometheus grafana alertmanager node_exporter; do
>     cephadm shell -- ceph config rm mgr mgr/cephadm/container_image_$i
> done
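After the removal, you can dump the configuration again and filter for image entries to confirm that the undercloud-specific settings are gone. This is an optional check:

$ cephadm shell -- ceph config dump | grep container_image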
1.12.2. Completing prerequisites for Red Hat Ceph Storage RGW migration
Complete the following prerequisites before you begin the Ceph Object Gateway (RGW) migration.
Procedure
Check the current status of the Red Hat Ceph Storage nodes:
(undercloud) [stack@undercloud-0 ~]$ metalsmith list +------------------------+ +----------------+ | IP Addresses | | Hostname | +------------------------+ +----------------+ | ctlplane=192.168.24.25 | | cephstorage-0 | | ctlplane=192.168.24.10 | | cephstorage-1 | | ctlplane=192.168.24.32 | | cephstorage-2 | | ctlplane=192.168.24.28 | | compute-0 | | ctlplane=192.168.24.26 | | compute-1 | | ctlplane=192.168.24.43 | | controller-0 | | ctlplane=192.168.24.7 | | controller-1 | | ctlplane=192.168.24.41 | | controller-2 | +------------------------+ +----------------+Log in to
controller-0and check the Pacemaker status to identify important information for the RGW migration:Full List of Resources: * ip-192.168.24.46 (ocf:heartbeat:IPaddr2): Started controller-0 * ip-10.0.0.103 (ocf:heartbeat:IPaddr2): Started controller-1 * ip-172.17.1.129 (ocf:heartbeat:IPaddr2): Started controller-2 * ip-172.17.3.68 (ocf:heartbeat:IPaddr2): Started controller-0 * ip-172.17.4.37 (ocf:heartbeat:IPaddr2): Started controller-1 * Container bundle set: haproxy-bundle [undercloud-0.ctlplane.redhat.local:8787/rh-osbs/rhosp17-openstack-haproxy:pcmklatest]: * haproxy-bundle-podman-0 (ocf:heartbeat:podman): Started controller-2 * haproxy-bundle-podman-1 (ocf:heartbeat:podman): Started controller-0 * haproxy-bundle-podman-2 (ocf:heartbeat:podman): Started controller-1Identify the ranges of the storage networks. The following is an example and the values might differ in your environment:
[heat-admin@controller-0 ~]$ ip -o -4 a 1: lo inet 127.0.0.1/8 scope host lo\ valid_lft forever preferred_lft forever 2: enp1s0 inet 192.168.24.45/24 brd 192.168.24.255 scope global enp1s0\ valid_lft forever preferred_lft forever 2: enp1s0 inet 192.168.24.46/32 brd 192.168.24.255 scope global enp1s0\ valid_lft forever preferred_lft forever 7: br-ex inet 10.0.0.122/24 brd 10.0.0.255 scope global br-ex\ valid_lft forever preferred_lft forever 8: vlan70 inet 172.17.5.22/24 brd 172.17.5.255 scope global vlan70\ valid_lft forever preferred_lft forever 8: vlan70 inet 172.17.5.94/32 brd 172.17.5.255 scope global vlan70\ valid_lft forever preferred_lft forever 9: vlan50 inet 172.17.2.140/24 brd 172.17.2.255 scope global vlan50\ valid_lft forever preferred_lft forever 10: vlan30 inet 172.17.3.73/24 brd 172.17.3.255 scope global vlan30\ valid_lft forever preferred_lft forever 10: vlan30 inet 172.17.3.68/32 brd 172.17.3.255 scope global vlan30\ valid_lft forever preferred_lft forever 11: vlan20 inet 172.17.1.88/24 brd 172.17.1.255 scope global vlan20\ valid_lft forever preferred_lft forever 12: vlan40 inet 172.17.4.24/24 brd 172.17.4.255 scope global vlan40\ valid_lft forever preferred_lft forever-
br-ex represents the External Network, where in the current environment, HAProxy has the front-end Virtual IP (VIP) assigned.
- vlan30 represents the Storage Network, where the new RGW instances should be started on the Red Hat Ceph Storage nodes.
-
Identify the network that you previously had in HAProxy and propagate it through director to the Red Hat Ceph Storage nodes. Use this network to reserve a new VIP that is owned by Red Hat Ceph Storage as the entry point for the RGW service.
Log in to
controller-0and find theceph_rgwsection in the current HAProxy configuration:$ less /var/lib/config-data/puppet-generated/haproxy/etc/haproxy/haproxy.cfg ... ... listen ceph_rgw bind 10.0.0.103:8080 transparent bind 172.17.3.68:8080 transparent mode http balance leastconn http-request set-header X-Forwarded-Proto https if { ssl_fc } http-request set-header X-Forwarded-Proto http if !{ ssl_fc } http-request set-header X-Forwarded-Port %[dst_port] option httpchk GET /swift/healthcheck option httplog option forwardfor server controller-0.storage.redhat.local 172.17.3.73:8080 check fall 5 inter 2000 rise 2 server controller-1.storage.redhat.local 172.17.3.146:8080 check fall 5 inter 2000 rise 2 server controller-2.storage.redhat.local 172.17.3.156:8080 check fall 5 inter 2000 rise 2Confirm that the network is used as an HAProxy front end. The following example shows that
controller-0exposes the services by using the external network, which is absent from the Red Hat Ceph Storage nodes. You must propagate the external network through director:[controller-0]$ ip -o -4 a ... 7: br-ex inet 10.0.0.106/24 brd 10.0.0.255 scope global br-ex\ valid_lft forever preferred_lft forever ...NoteIf the target nodes are not managed by director, you cannot use this procedure to configure the network. An administrator must manually configure all the required networks.
Propagate the HAProxy front-end network to Red Hat Ceph Storage nodes.
In the NIC template that you use to define the
ceph-storagenetwork interfaces, add the new config section in the Red Hat Ceph Storage network configuration template file, for example,/home/stack/composable_roles/network/nic-configs/ceph-storage.j2:--- network_config: - type: interface name: nic1 use_dhcp: false dns_servers: {{ ctlplane_dns_nameservers }} addresses: - ip_netmask: {{ ctlplane_ip }}/{{ ctlplane_cidr }} routes: {{ ctlplane_host_routes }} - type: vlan vlan_id: {{ storage_mgmt_vlan_id }} device: nic1 addresses: - ip_netmask: {{ storage_mgmt_ip }}/{{ storage_mgmt_cidr }} routes: {{ storage_mgmt_host_routes }} - type: interface name: nic2 use_dhcp: false defroute: false - type: vlan vlan_id: {{ storage_vlan_id }} device: nic2 addresses: - ip_netmask: {{ storage_ip }}/{{ storage_cidr }} routes: {{ storage_host_routes }} - type: ovs_bridge name: {{ neutron_physical_bridge_name }} dns_servers: {{ ctlplane_dns_nameservers }} domain: {{ dns_search_domains }} use_dhcp: false addresses: - ip_netmask: {{ external_ip }}/{{ external_cidr }} routes: {{ external_host_routes }} members: [] - type: interface name: nic3 primary: trueAdd the External Network to the bare metal file, for example,
/home/stack/composable_roles/network/baremetal_deployment.yamlthat is used bymetalsmith:NoteEnsure that network_config_update is enabled for network propagation to the target nodes when
os-net-configis triggered.- name: CephStorage count: 3 hostname_format: cephstorage-%index% instances: - hostname: cephstorage-0 name: ceph-0 - hostname: cephstorage-1 name: ceph-1 - hostname: cephstorage-2 name: ceph-2 defaults: profile: ceph-storage network_config: template: /home/stack/composable_roles/network/nic-configs/ceph-storage.j2 network_config_update: true networks: - network: ctlplane vif: true - network: storage - network: storage_mgmt - network: externalConfigure the new network on the bare metal nodes:
(undercloud) [stack@undercloud-0]$ openstack overcloud node provision \ -o overcloud-baremetal-deployed-0.yaml \ --stack overcloud \ --network-config -y \ $PWD/composable_roles/network/baremetal_deployment.yamlVerify that the new network is configured on the Red Hat Ceph Storage nodes:
[root@cephstorage-0 ~]# ip -o -4 a 1: lo inet 127.0.0.1/8 scope host lo\ valid_lft forever preferred_lft forever 2: enp1s0 inet 192.168.24.54/24 brd 192.168.24.255 scope global enp1s0\ valid_lft forever preferred_lft forever 11: vlan40 inet 172.17.4.43/24 brd 172.17.4.255 scope global vlan40\ valid_lft forever preferred_lft forever 12: vlan30 inet 172.17.3.23/24 brd 172.17.3.255 scope global vlan30\ valid_lft forever preferred_lft forever 14: br-ex inet 10.0.0.133/24 brd 10.0.0.255 scope global br-ex\ valid_lft forever preferred_lft forever
1.12.3. Completing prerequisites for a Red Hat Ceph Storage RBD migration
Complete the following prerequisites before you begin the Red Hat Ceph Storage Rados Block Device (RBD) migration.
-
The target CephStorage or ComputeHCI nodes are configured to have both
storage and storage_mgmt networks. This ensures that you can use both Red Hat Ceph Storage public and cluster networks from the same node. From Red Hat OpenStack Platform 17.1 and later you do not have to run a stack update.
NFS Ganesha is migrated from a director deployment to
cephadm. For more information, see "Creating an NFS Ganesha cluster".
- Ceph Metadata Server, monitoring stack, Ceph Object Gateway, and any other daemon that is deployed on Controller nodes.
- The daemons distribution follows the cardinality constraints that are described in "Red Hat Ceph Storage: Supported configurations".
-
The Red Hat Ceph Storage cluster is healthy, and the
ceph -s command returns HEALTH_OK. Run
os-net-configon the bare metal node and configure additional networks:If target nodes are
CephStorage, ensure that the network is defined in the bare metal file for theCephStoragenodes, for example,/home/stack/composable_roles/network/baremetal_deployment.yaml:- name: CephStorage count: 2 instances: - hostname: oc0-ceph-0 name: oc0-ceph-0 - hostname: oc0-ceph-1 name: oc0-ceph-1 defaults: networks: - network: ctlplane vif: true - network: storage_cloud_0 subnet: storage_cloud_0_subnet - network: storage_mgmt_cloud_0 subnet: storage_mgmt_cloud_0_subnet network_config: template: templates/single_nic_vlans/single_nic_vlans_storage.j2Add the missing network:
$ openstack overcloud node provision \ -o overcloud-baremetal-deployed-0.yaml --stack overcloud-0 \ /--network-config -y --concurrency 2 /home/stack/metalsmith-0.yamlVerify that the storage network is configured on the target nodes:
(undercloud) [stack@undercloud ~]$ ssh heat-admin@192.168.24.14 ip -o -4 a 1: lo inet 127.0.0.1/8 scope host lo\ valid_lft forever preferred_lft forever 5: br-storage inet 192.168.24.14/24 brd 192.168.24.255 scope global br-storage\ valid_lft forever preferred_lft forever 6: vlan1 inet 192.168.24.14/24 brd 192.168.24.255 scope global vlan1\ valid_lft forever preferred_lft forever 7: vlan11 inet 172.16.11.172/24 brd 172.16.11.255 scope global vlan11\ valid_lft forever preferred_lft forever 8: vlan12 inet 172.16.12.46/24 brd 172.16.12.255 scope global vlan12\ valid_lft forever preferred_lft forever
1.12.4. Creating an NFS Ganesha cluster
If you use CephFS through NFS with the Shared File Systems service (manila), you must create a new clustered NFS service on the Red Hat Ceph Storage cluster. This service replaces the standalone, Pacemaker-controlled ceph-nfs service that you use in Red Hat OpenStack Platform (RHOSP) 17.1.
Procedure
Identify the Red Hat Ceph Storage nodes to deploy the new clustered NFS service, for example,
cephstorage-0, cephstorage-1, cephstorage-2.

Note

You must deploy this service on the
StorageNFS isolated network so that you can mount your existing shares through the new NFS export locations. You can deploy the new clustered NFS service on your existing CephStorage nodes or HCI nodes, or on new hardware that you enrolled in the Red Hat Ceph Storage cluster.

If you deployed your Red Hat Ceph Storage nodes with director, propagate the
StorageNFS network to the target nodes where the ceph-nfs service is deployed.

Note

If the target nodes are not managed by director, you cannot use this procedure to configure the network. An administrator must manually configure all the required networks.
-
Identify the node definition file,
overcloud-baremetal-deploy.yaml, that is used in the RHOSP environment. For more information about identifying the overcloud-baremetal-deploy.yaml file, see Customizing overcloud networks in Customizing the Red Hat OpenStack Services on OpenShift deployment. Edit the networks that are associated with the Red Hat Ceph Storage nodes to include the
StorageNFSnetwork:- name: CephStorage count: 3 hostname_format: cephstorage-%index% instances: - hostname: cephstorage-0 name: ceph-0 - hostname: cephstorage-1 name: ceph-1 - hostname: cephstorage-2 name: ceph-2 defaults: profile: ceph-storage network_config: template: /home/stack/network/nic-configs/ceph-storage.j2 network_config_update: true networks: - network: ctlplane vif: true - network: storage - network: storage_mgmt - network: storage_nfsEdit the network configuration template file, for example,
/home/stack/network/nic-configs/ceph-storage.j2, for the Red Hat Ceph Storage nodes to include an interface that connects to the StorageNFS network:

- type: vlan
  device: nic2
  vlan_id: {{ storage_nfs_vlan_id }}
  addresses:
    - ip_netmask: {{ storage_nfs_ip }}/{{ storage_nfs_cidr }}
  routes: {{ storage_nfs_host_routes }}

Update the Red Hat Ceph Storage nodes:
$ openstack overcloud node provision \
    --stack overcloud \
    --network-config -y \
    -o overcloud-baremetal-deployed-storage_nfs.yaml \
    --concurrency 2 \
    /home/stack/network/baremetal_deployment.yaml

When the update is complete, ensure that a new interface is created in the Red Hat Ceph Storage nodes and that they are tagged with the VLAN that is associated with StorageNFS.
Identify the IP address from the
StorageNFS network to use as the Virtual IP address (VIP) for the Ceph NFS service:

$ openstack port list -c "Fixed IP Addresses" --network storage_nfs

In a running cephadm shell, identify the hosts for the NFS service:

$ ceph orch host ls

Label each host that you identified. Repeat this command for each host that you want to label:
$ ceph orch host label add <hostname> nfs
Replace
<hostname> with the name of the host that you identified.
Create the NFS cluster:
$ ceph nfs cluster create cephfs \
    "label:nfs" \
    --ingress \
    --virtual-ip=<VIP> \
    --ingress-mode=haproxy-protocol

Replace
<VIP> with the VIP for the Ceph NFS service.

Note

You must set the
ingress-mode argument to haproxy-protocol. No other ingress-mode is supported. This ingress mode allows you to enforce client restrictions through the Shared File Systems service.

For more information about deploying the clustered Ceph NFS service, see "Management of NFS-Ganesha gateway using the Ceph Orchestrator" in the Operations Guide for your Red Hat Ceph Storage version.
Check the status of the NFS cluster:
$ ceph nfs cluster ls
$ ceph nfs cluster info cephfs
1.13. Preparing an Instance HA deployment for adoption
To enable the high availability for Compute instances (Instance HA) service after you adopt the Red Hat OpenStack Services on OpenShift (RHOSO) 18.0 data plane, perform the following preparation tasks:
- Create a fencing configuration file to use after you adopt the RHOSO data plane.
- Prevent Pacemaker from monitoring or recovering the Compute nodes.
1.13.1. Maintaining the Instance HA functionality after adoption
To maintain the high availability for Compute instances (Instance HA) functionality after you adopt Red Hat OpenStack Services on OpenShift 18.0, create a fencing configuration file to use in your adopted environment.
Procedure
-
Gather the fencing information from the
fencing.yaml file in your Red Hat OpenStack Platform (RHOSP) 17.1 cluster.

Retrieve the RHOSP 17.1 stonith configuration from any of your overcloud Controller nodes:
$ sudo pcs configStonith Devices: ... Resource: stonith-fence_ipmilan-525400dde4f7 (class=stonith type=fence_ipmilan) Attributes: stonith-fence_ipmilan-525400dde4f7-instance_attributes delay=20 ipaddr=172.16.0.1 ipport=6231 lanplus=true login=admin passwd=password pcmk_host_list=compute-1 Operations: monitor: stonith-fence_ipmilan-525400dde4f7-monitor-interval-60s interval=60s Resource: stonith-fence_ipmilan-525400819ad3 (class=stonith type=fence_ipmilan) Attributes: stonith-fence_ipmilan-525400819ad3-instance_attributes delay=20 ipaddr=172.16.0.1 ipport=6230 lanplus=true login=admin passwd=password pcmk_host_list=compute-0 Operations: monitor: stonith-fence_ipmilan-525400819ad3-monitor-interval-60s interval=60s ...Generate the fencing configuration file:
- To install the script that automatically generates this file, see How do I automatically generate fencing secret for RHOSO18 instanceha from a osp17.1 cluster that I want to adopt?.
- To create the fencing configuration file manually, see Configuring the fencing of Compute nodes in Configuring high availability for instances.
1.13.2. Preventing Pacemaker from monitoring Compute nodes
You must disable Pacemaker so that it does not monitor your Compute nodes during the adoption. For example, if a network issue occurs during the adoption, Pacemaker attempts to reboot the Compute nodes to recover them, which breaks the adoption.
Procedure
Retrieve the names of the Compute remote resources:
$ sudo pcs stonith |grep -B1 stonith-fence_compute-fence-nova |grep Target |awk -F ': ' '{print $2}'

Disable the stonith and pacemaker_remote resources on each Compute remote resource:

$ sudo pcs property set stonith-enabled=false
$ sudo pcs resource disable <compute_remote_resource>

where:
- <compute_remote_resource>
Specifies the name of the Compute remote resource in your environment.
Retrieve the name of the Compute stonith resources:
$ sudo pcs stonith |grep Level |grep fence_compute |awk '{print $4}' |awk -F ',' '{print $1}' |sort |uniq

Remove the Compute node pacemaker_remote and fencing resources:

$ sudo pcs stonith disable stonith-fence_compute-fence-nova
$ sudo pcs stonith disable <compute_stonith_resource>
$ sudo pcs stonith delete <compute_stonith_resource>
$ sudo pcs resource delete <compute_remote_resource>
$ sudo pcs resource disable compute-unfence-trigger-clone
$ sudo pcs resource delete compute-unfence-trigger-clone
$ sudo pcs resource disable nova-evacuate
$ sudo pcs resource delete nova-evacuate

where:
- <compute_stonith_resource>
Specifies the name of the Compute stonith resource in your environment.
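As an optional check after you remove the resources, you can confirm that no Compute fencing or Instance HA resources remain in the cluster configuration. If the resources were removed, the following command returns no matches:

$ sudo pcs status | grep -E 'fence_compute|compute-unfence|nova-evacuate'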
1.14. Comparing configuration files between deployments
To help you manage the configuration for your director and Red Hat OpenStack Platform (RHOSP) services, you can compare the configuration files between your director deployment and the Red Hat OpenStack Services on OpenShift (RHOSO) cloud by using the os-diff tool.
Prerequisites
Golang is installed and configured on your environment:
dnf install -y golang-github-openstack-k8s-operators-os-diff
Procedure
Configure the
/etc/os-diff/os-diff.cfgfile and the/etc/os-diff/ssh.configfile according to your environment. To allow os-diff to connect to your clouds and pull files from the services that you describe in theconfig.yamlfile, you must set the following options in theos-diff.cfgfile:[Default] local_config_dir=/tmp/ service_config_file=config.yaml [Tripleo] ssh_cmd=ssh -F ssh.config director_host=standalone container_engine=podman connection=ssh remote_config_path=/tmp/tripleo local_config_path=/tmp/ [Openshift] ocp_local_config_path=/tmp/ocp connection=local ssh_cmd=""-
ssh_cmd=ssh -F ssh.config instructs os-diff to access your director host through SSH. The default value is ssh -F ssh.config. However, you can set the value without an ssh.config file, for example, ssh -i /home/user/.ssh/id_rsa stack@my.undercloud.local.
- director_host=standalone specifies the host to use to access your cloud, where the podman/docker binary is installed and allowed to interact with the running containers. You can leave this key blank.
-
If you use a host file to connect to your cloud, configure the
ssh.configfile to allow os-diff to access your RHOSP environment, for example:Host * IdentitiesOnly yes Host virthost Hostname virthost IdentityFile ~/.ssh/id_rsa User root StrictHostKeyChecking no UserKnownHostsFile=/dev/null Host standalone Hostname standalone IdentityFile <path to SSH key> User root StrictHostKeyChecking no UserKnownHostsFile=/dev/null Host crc Hostname crc IdentityFile ~/.ssh/id_rsa User stack StrictHostKeyChecking no UserKnownHostsFile=/dev/null-
Replace
<path to SSH key> with the path to your SSH key. You must provide a value for IdentityFile to get full working access to your RHOSP environment.
If you use an inventory file to connect to your cloud, generate the
ssh.config file from your Ansible inventory, for example, the tripleo-ansible-inventory.yaml file:

$ os-diff configure -i tripleo-ansible-inventory.yaml -o ssh.config --yaml
Verification
Test your connection:
$ ssh -F ssh.config standalone
1.15. Preventing configuration loss when using the oc patch command
When you use the oc patch command to modify a resource, the changes are applied directly to the live object in your OpenShift cluster. If you later edit the custom resource (CR) file for the resource and apply the updates by using oc apply -f <filename>, your previous patched changes are overwritten and lost from the resource.
To prevent loss of configuration, you can use the --patch-file option to configure the patch and retain patch files. Alternatively, you can export your openstackcontrolplane CR after the patch is applied:
$ oc get <resource_type> <resource_name> -o yaml > <filename>.yaml
For example:
$ oc get OpenStackControlPlane openstack-control-plane -o yaml > openstack_control_plane.yaml
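If you prefer to keep your changes in patch files, a hedged example of the --patch-file workflow follows. The file name and patch contents are illustrative, not part of the documented procedure:

$ cat > cinder_api_patch.yaml <<EOF
spec:
  cinder:
    template:
      cinderAPI:
        replicas: 3
EOF
$ oc patch OpenStackControlPlane openstack-control-plane --type=merge --patch-file cinder_api_patch.yaml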
Chapter 2. Migrating TLS-e to the RHOSO deployment
If you enabled TLS everywhere (TLS-e) in your Red Hat OpenStack Platform (RHOSP) 17.1 deployment, you must migrate TLS-e to the Red Hat OpenStack Services on OpenShift (RHOSO) deployment.
The RHOSO deployment uses the cert-manager operator to issue, track, and renew the certificates. In the following procedure, you extract the CA signing certificate from the FreeIPA instance that you use to provide the certificates in the RHOSP environment, and then import them into cert-manager in the RHOSO environment. As a result, you minimize the disruption on the Compute nodes because you do not need to install a new chain of trust.
You then decommission the previous FreeIPA node and no longer use it to issue certificates. This might not be possible if you use the IPA server to issue certificates for non-RHOSP systems.
- The following procedure was reproduced on a FreeIPA 4.10.1 server. The location of the files and directories might change depending on the version.
- If the signing keys are stored in a hardware security module (HSM) instead of an NSS shared database (NSSDB), and the keys are retrievable, special HSM utilities might be required.
Prerequisites
- Your RHOSP deployment is using TLS-e.
- Ensure that the back-end services on the new deployment are not started yet.
Define the following shell variables. The values are examples and refer to a single-node standalone director deployment. Replace these example values with values that are correct for your environment:
IPA_SSH="ssh -i <path_to_ssh_key> <admin user>@<freeipa-server-ip-address> sudo"
Procedure
To locate the CA certificate and key, list all the certificates inside your NSSDB:
$IPA_SSH certutil -L -d /etc/pki/pki-tomcat/alias
The
-L option lists all certificates. The
-d option specifies where the certificates are stored.

The command produces an output similar to the following example:
Certificate Nickname                                         Trust Attributes
                                                             SSL,S/MIME,JAR/XPI

caSigningCert cert-pki-ca                                    CTu,Cu,Cu
ocspSigningCert cert-pki-ca                                  u,u,u
Server-Cert cert-pki-ca                                      u,u,u
subsystemCert cert-pki-ca                                    u,u,u
auditSigningCert cert-pki-ca                                 u,u,Pu
Export the certificate and key from the
/etc/pki/pki-tomcat/alias directory. The following example uses the caSigningCert cert-pki-ca certificate:

$IPA_SSH pk12util -o /tmp/freeipa.p12 -n 'caSigningCert\ cert-pki-ca' -d /etc/pki/pki-tomcat/alias -k /etc/pki/pki-tomcat/alias/pwdfile.txt -w /etc/pki/pki-tomcat/alias/pwdfile.txt

Note

The command generates a P12 file with both the certificate and the key. The
/etc/pki/pki-tomcat/alias/pwdfile.txt file contains the password that protects the key. You can use the password to both extract the key and generate the new file, /tmp/freeipa.p12. You can also choose another password. If you choose a different password for the new file, replace the parameter of the -w option, or use the -W option followed by the password, in clear text.

With that file, you can also get the certificate and the key by using the
openssl pkcs12 command.

Create the secret that contains the root CA:
$ oc create secret generic rootca-internal

Import the certificate and the key from FreeIPA:
$ oc patch secret rootca-internal -p="{\"data\":{\"ca.crt\": \"`$IPA_SSH openssl pkcs12 -in /tmp/freeipa.p12 -passin file:/etc/pki/pki-tomcat/alias/pwdfile.txt -nokeys | openssl x509 | base64 -w 0`\"}}" $ oc patch secret rootca-internal -p="{\"data\":{\"tls.crt\": \"`$IPA_SSH openssl pkcs12 -in /tmp/freeipa.p12 -passin file:/etc/pki/pki-tomcat/alias/pwdfile.txt -nokeys | openssl x509 | base64 -w 0`\"}}" $ oc patch secret rootca-internal -p="{\"data\":{\"tls.key\": \"`$IPA_SSH openssl pkcs12 -in /tmp/freeipa.p12 -passin file:/etc/pki/pki-tomcat/alias/pwdfile.txt -nocerts -noenc | openssl rsa | base64 -w 0`\"}}"Create the cert-manager issuer and reference the secret:
$ oc apply -f - <<EOF apiVersion: cert-manager.io/v1 kind: Issuer metadata: name: rootca-internal labels: osp-rootca-issuer-public: "" osp-rootca-issuer-internal: "" osp-rootca-issuer-libvirt: "" osp-rootca-issuer-ovn: "" spec: ca: secretName: rootca-internal EOFDelete the previously created p12 files:
$IPA_SSH rm /tmp/freeipa.p12
Verification
Verify that the necessary resources are created:
$ oc get issuers
$ oc get secret rootca-internal -o yaml
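As an optional convenience check, and not part of the documented procedure, you can wait for the issuer to report a Ready condition before you continue:

$ oc wait --for=condition=Ready issuer/rootca-internal --timeout=60s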
After the adoption is complete, the cert-manager operator issues new certificates and updates the secrets with the new certificates. As a result, the pods on the control plane automatically restart in order to obtain the new certificates. On the data plane, you must manually initiate a new deployment and restart certain processes to use the new certificates. The old certificates remain active until both the control plane and data plane obtain the new certificates.
Chapter 3. Migrating databases to the control plane
To begin creating the control plane, enable back-end services and import the databases from your original Red Hat OpenStack Platform 17.1 deployment.
3.1. Retrieving topology-specific service configuration
Before you migrate your databases to the Red Hat OpenStack Services on OpenShift (RHOSO) control plane, retrieve the topology-specific service configuration from your Red Hat OpenStack Platform (RHOSP) environment. You need this configuration for the following reasons:
- To check your current database for inaccuracies
- To ensure that you have the data you need before the migration
- To compare your RHOSP database with the adopted RHOSO database
Prerequisites
Define the following shell variables. Replace the example values with values that are correct for your environment:
Note

If you use IPv6, define the
SOURCE_MARIADB_IPvalue without brackets. For example,SOURCE_MARIADB_IP=fd00:bbbb::2.$ PASSWORD_FILE="$HOME/overcloud-passwords.yaml" $ MARIADB_IMAGE=registry.redhat.io/rhoso/openstack-mariadb-rhel9:18.0 $ declare -A TRIPLEO_PASSWORDS $ CELLS="default cell1 cell2" $ for CELL in $(echo $CELLS); do > TRIPLEO_PASSWORDS[$CELL]="$PASSWORD_FILE" > done $ declare -A SOURCE_DB_ROOT_PASSWORD $ for CELL in $(echo $CELLS); do > SOURCE_DB_ROOT_PASSWORD[$CELL]=$(cat ${TRIPLEO_PASSWORDS[$CELL]} | grep ' MysqlRootPassword:' | awk -F ': ' '{ print $2; }') > doneDefine the following shell variables. Replace the example values with values that are correct for your environment:
$ MARIADB_CLIENT_ANNOTATIONS='--annotations=k8s.v1.cni.cncf.io/networks=internalapi' $ MARIADB_RUN_OVERRIDES="$MARIADB_CLIENT_ANNOTATIONS"NoteFor environments that are enabled with border gateway protocol (BGP), the network annotation must include a default route to enable proper routing. Use the following instead:
$ MARIADB_CLIENT_ANNOTATIONS='--annotations=k8s.v1.cni.cncf.io/networks=[{"name":"internalapi","namespace":"openstack","default-route":["<172.17.0.1>"]}]' $ MARIADB_RUN_OVERRIDES="$MARIADB_CLIENT_ANNOTATIONS"where:
- <172.17.0.1>
-
Replace with the gateway IP address of your
internalapi network.
$ CONTROLLER1_SSH="ssh -i *<path to SSH key>* root@*<node IP>*" $ declare -A SOURCE_MARIADB_IP $ SOURCE_MARIADB_IP[default]=*<galera cluster VIP>* $ SOURCE_MARIADB_IP[cell1]=*<galera cell1 cluster VIP>* $ SOURCE_MARIADB_IP[cell2]=*<galera cell2 cluster VIP>* # ...-
Provide
CONTROLLER1_SSH settings with SSH connection details for any non-cell Controller of the source director cloud.
For each cell that is defined in
CELLS, replace SOURCE_MARIADB_IP[*]= ..., with the record lists for the cell names and VIP addresses of MariaDB Galera clusters, including the cells, of the source director cloud. To get the values for
SOURCE_MARIADB_IP, query the puppet-generated configurations in a Controller and CellController node:$ sudo grep -rI 'listen mysql' -A10 /var/lib/config-data/puppet-generated/ | grep bindNoteThe source cloud always uses the same password for cells databases. For that reason, the same passwords file is used for all cells stacks. However, split-stack topology allows using different passwords files for each stack.
Procedure
If your source RHOSP environment uses border gateway protocol (BGP) for Layer 3 networking, create a
BGPConfigurationcustom resource to enable BGP routing:$ cat << EOF > bgp.yaml apiVersion: network.openstack.org/v1beta1 kind: BGPConfiguration metadata: name: bgpconfiguration namespace: openstack spec: {} EOF $ oc apply -f bgp.yamlThe
BGPConfigurationresource enables BGP route advertisement between the Red Hat OpenShift Container Platform (RHOCP) cluster and the source cloud, which is necessary for themariadb-clientpod to reach the source MariaDB cluster.Create a persistent
mariadb-clientpod for database operations:$ oc delete pod mariadb-client || true $ oc run mariadb-client ${MARIADB_RUN_OVERRIDES} -q --image ${MARIADB_IMAGE} --restart=Never -- /usr/bin/sleep infinityThis creates a long-running pod that is used for all subsequent database operations, avoiding the need to create temporary pods for each command.
Wait for the
mariadb-clientpod to be able to reach the source MariaDB:$ oc rsh mariadb-client mysql -rsh "${SOURCE_MARIADB_IP[default]}" -uroot -p"${SOURCE_DB_ROOT_PASSWORD[default]}" -e 'select 1;'NoteFor BGP-enabled environments, this command might take a few moments to succeed while BGP routes are advertised and propagated through the network. The
mariadb-clientpod needs to receive the route to the source MariaDB IP address through BGP before it can establish a connection. If the command fails, wait a few seconds and retry. The connection should succeed once the BGP route advertisement is complete.For IPv6 environments, this command might take a few moments to succeed while the network IPv6 stack completes its setup. If the command fails, wait a few seconds and retry.
For standard deployments, such as non-BGP deployments or IPv4 deployments, this command should succeed immediately.
Export the shell variables for the following outputs and test the connection to the RHOSP database:
$ unset PULL_OPENSTACK_CONFIGURATION_DATABASES $ declare -xA PULL_OPENSTACK_CONFIGURATION_DATABASES $ for CELL in $(echo $CELLS); do > PULL_OPENSTACK_CONFIGURATION_DATABASES[$CELL]=$(oc rsh mariadb-client \ > mysql -rsh "${SOURCE_MARIADB_IP[$CELL]}" -uroot -p"${SOURCE_DB_ROOT_PASSWORD[$CELL]}" -e 'SHOW databases;') > doneIf the connection is successful, the expected output is nothing.
NoteThe
nova,nova_api, andnova_cell0databases are included in the same database host for the main overcloud Orchestration service (heat) stack. Additional cells use thenovadatabase of their local Galera clusters.Run
mysqlcheckon the RHOSP database to check for inaccuracies:$ unset PULL_OPENSTACK_CONFIGURATION_MYSQLCHECK_NOK $ declare -xA PULL_OPENSTACK_CONFIGURATION_MYSQLCHECK_NOK $ run_mysqlcheck() { > PULL_OPENSTACK_CONFIGURATION_MYSQLCHECK_NOK=$(oc rsh mariadb-client \ > mysqlcheck --all-databases -h ${SOURCE_MARIADB_IP[$CELL]} -u root -p"${SOURCE_DB_ROOT_PASSWORD[$CELL]}" | grep -v OK) > } $ for CELL in $(echo $CELLS); do > run_mysqlcheck $CELL > done $ if [ "$PULL_OPENSTACK_CONFIGURATION_MYSQLCHECK_NOK" != "" ]; then > for CELL in $(echo $CELLS); do > MYSQL_UPGRADE=$(oc rsh mariadb-client \ > mysql_upgrade --skip-version-check -v -h ${SOURCE_MARIADB_IP[$CELL]} -u root -p"${SOURCE_DB_ROOT_PASSWORD[$CELL]}") > # rerun mysqlcheck to check if problem is resolved > run_mysqlcheck > done > fiGet the Compute service (nova) cell mappings:
export PULL_OPENSTACK_CONFIGURATION_NOVADB_MAPPED_CELLS=$(oc rsh mariadb-client \ mysql -rsh "${SOURCE_MARIADB_IP[default]}" -uroot -p"${SOURCE_DB_ROOT_PASSWORD[default]}" nova_api -e \ 'select uuid,name,transport_url,database_connection,disabled from cell_mappings;')Get the hostnames of the registered Compute services:
$ unset PULL_OPENSTACK_CONFIGURATION_NOVA_COMPUTE_HOSTNAMES $ declare -xA PULL_OPENSTACK_CONFIGURATION_NOVA_COMPUTE_HOSTNAMES $ for CELL in $(echo $CELLS); do > PULL_OPENSTACK_CONFIGURATION_NOVA_COMPUTE_HOSTNAMES[$CELL]=$(oc rsh mariadb-client \ > mysql -rsh "${SOURCE_MARIADB_IP[$CELL]}" -uroot -p"${SOURCE_DB_ROOT_PASSWORD[$CELL]}" -e \ > "select host from nova.services where services.binary='nova-compute' and deleted=0;") > doneGet the list of the mapped Compute service cells:
export PULL_OPENSTACK_CONFIGURATION_NOVAMANAGE_CELL_MAPPINGS=$($CONTROLLER1_SSH sudo podman exec -it nova_conductor nova-manage cell_v2 list_cells)Store the exported variables for future use:
$ unset SRIOV_AGENTS $ declare -xA SRIOV_AGENTS $ for CELL in $(echo $CELLS); do > RCELL=$CELL > [ "$CELL" = "$DEFAULT_CELL_NAME" ] && RCELL=default > cat > ~/.source_cloud_exported_variables_$CELL << EOF > unset PULL_OPENSTACK_CONFIGURATION_DATABASES > unset PULL_OPENSTACK_CONFIGURATION_MYSQLCHECK_NOK > unset PULL_OPENSTACK_CONFIGURATION_NOVA_COMPUTE_HOSTNAMES > declare -xA PULL_OPENSTACK_CONFIGURATION_DATABASES > declare -xA PULL_OPENSTACK_CONFIGURATION_MYSQLCHECK_NOK > declare -xA PULL_OPENSTACK_CONFIGURATION_NOVA_COMPUTE_HOSTNAMES > PULL_OPENSTACK_CONFIGURATION_DATABASES[$CELL]="$(oc rsh mariadb-client \ > mysql -rsh ${SOURCE_MARIADB_IP[$RCELL]} -uroot -p${SOURCE_DB_ROOT_PASSWORD[$RCELL]} -e 'SHOW databases;')" > PULL_OPENSTACK_CONFIGURATION_MYSQLCHECK_NOK[$CELL]="$(oc rsh mariadb-client \ > mysqlcheck --all-databases -h ${SOURCE_MARIADB_IP[$RCELL]} -u root -p${SOURCE_DB_ROOT_PASSWORD[$RCELL]} | grep -v OK)" > PULL_OPENSTACK_CONFIGURATION_NOVA_COMPUTE_HOSTNAMES[$CELL]="$(oc rsh mariadb-client \ > mysql -rsh ${SOURCE_MARIADB_IP[$RCELL]} -uroot -p${SOURCE_DB_ROOT_PASSWORD[$RCELL]} -e \ > "select host from nova.services where services.binary='nova-compute' and deleted=0;")" > if [ "$RCELL" = "default" ]; then > PULL_OPENSTACK_CONFIGURATION_NOVADB_MAPPED_CELLS="$(oc rsh mariadb-client \ > mysql -rsh ${SOURCE_MARIADB_IP[$RCELL]} -uroot -p${SOURCE_DB_ROOT_PASSWORD[$RCELL]} nova_api -e \ > 'select uuid,name,transport_url,database_connection,disabled from cell_mappings;')" > PULL_OPENSTACK_CONFIGURATION_NOVAMANAGE_CELL_MAPPINGS="$($CONTROLLER1_SSH sudo podman exec -it nova_conductor nova-manage cell_v2 list_cells)" > fi > EOF > done $ chmod 0600 ~/.source_cloud_exported_variables*-
declare -xA SRIOV_AGENTS gets the neutron-sriov-nic-agent configuration to use for the data plane adoption if neutron-sriov-nic-agent agents are running in your RHOSP deployment.
-
Clean up the
mariadb-client pod:

$ oc delete pod mariadb-client

The mariadb-client pod is no longer needed after all the data is exported and stored.
Next steps
This configuration and the exported values are required later, during the data plane adoption post-checks. After the RHOSP control plane services are shut down, if any of the exported values are lost, re-running the export command fails because the control plane services are no longer running on the source cloud, and the data cannot be retrieved. To avoid data loss, preserve the exported values in an environment file before shutting down the control plane services.
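For example, you might copy the generated variable files to a host outside the source cloud; the destination user, host, and path are placeholders:

$ scp ~/.source_cloud_exported_variables_* <backup_user>@<backup_host>:/path/to/safe/location/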
3.2. Deploying back-end services
Create the OpenStackControlPlane custom resource (CR) with the basic back-end services deployed, and disable all the Red Hat OpenStack Platform (RHOSP) services. This CR is the foundation of the control plane.
Prerequisites
- The cloud that you want to adopt is running, and it is on RHOSP 17.1.4 or later.
- All control plane and data plane hosts of the source cloud are running, and continue to run throughout the adoption procedure.
-
The
openstack-operator is deployed, but OpenStackControlPlane is not deployed.
- Install the OpenStack Operators. For more information, see Installing and preparing the OpenStack Operator in Deploying Red Hat OpenStack Services on OpenShift.
-
If you enabled TLS everywhere (TLS-e) on the RHOSP environment, you must copy the
tls root CA from the RHOSP environment to the rootca-internal issuer.
- There are free PVs available for Galera and RabbitMQ.
Set the desired admin password for the control plane deployment. This can be the admin password from your original deployment or a different password:
ADMIN_PASSWORD=SomePassword

To use the existing RHOSP deployment password:
declare -A TRIPLEO_PASSWORDS
TRIPLEO_PASSWORDS[default]="$HOME/overcloud-passwords.yaml"
ADMIN_PASSWORD=$(cat ${TRIPLEO_PASSWORDS[default]} | grep ' AdminPassword:' | awk -F ': ' '{ print $2; }')

Set the service password variables to match the original deployment. Database passwords can differ in the control plane environment, but you must synchronize the service account passwords.
For example, in developer environments with director Standalone, the passwords can be extracted:
AODH_PASSWORD=$(cat ${TRIPLEO_PASSWORDS[default]} | grep ' AodhPassword:' | awk -F ': ' '{ print $2; }') BARBICAN_PASSWORD=$(cat ${TRIPLEO_PASSWORDS[default]} | grep ' BarbicanPassword:' | awk -F ': ' '{ print $2; }') CEILOMETER_METERING_SECRET=$(cat ${TRIPLEO_PASSWORDS[default]} | grep ' CeilometerMeteringSecret:' | awk -F ': ' '{ print $2; }') CEILOMETER_PASSWORD=$(cat ${TRIPLEO_PASSWORDS[default]} | grep ' CeilometerPassword:' | awk -F ': ' '{ print $2; }') CINDER_PASSWORD=$(cat ${TRIPLEO_PASSWORDS[default]} | grep ' CinderPassword:' | awk -F ': ' '{ print $2; }') GLANCE_PASSWORD=$(cat ${TRIPLEO_PASSWORDS[default]} | grep ' GlancePassword:' | awk -F ': ' '{ print $2; }') HEAT_AUTH_ENCRYPTION_KEY=$(cat ${TRIPLEO_PASSWORDS[default]} | grep ' HeatAuthEncryptionKey:' | awk -F ': ' '{ print $2; }') HEAT_PASSWORD=$(cat ${TRIPLEO_PASSWORDS[default]} | grep ' HeatPassword:' | awk -F ': ' '{ print $2; }') HEAT_STACK_DOMAIN_ADMIN_PASSWORD=$(cat ${TRIPLEO_PASSWORDS[default]} | grep ' HeatStackDomainAdminPassword:' | awk -F ': ' '{ print $2; }') IRONIC_PASSWORD=$(cat ${TRIPLEO_PASSWORDS[default]} | grep ' IronicPassword:' | awk -F ': ' '{ print $2; }') MANILA_PASSWORD=$(cat ${TRIPLEO_PASSWORDS[default]} | grep ' ManilaPassword:' | awk -F ': ' '{ print $2; }') NEUTRON_PASSWORD=$(cat ${TRIPLEO_PASSWORDS[default]} | grep ' NeutronPassword:' | awk -F ': ' '{ print $2; }') NOVA_PASSWORD=$(cat ${TRIPLEO_PASSWORDS[default]} | grep ' NovaPassword:' | awk -F ': ' '{ print $2; }') OCTAVIA_PASSWORD=$(cat ${TRIPLEO_PASSWORDS[default]} | grep ' OctaviaPassword:' | awk -F ': ' '{ print $2; }') PLACEMENT_PASSWORD=$(cat ${TRIPLEO_PASSWORDS[default]} | grep ' PlacementPassword:' | awk -F ': ' '{ print $2; }') SWIFT_PASSWORD=$(cat ${TRIPLEO_PASSWORDS[default]} | grep ' SwiftPassword:' | awk -F ': ' '{ print $2; }')
Procedure
Ensure that you are using the Red Hat OpenShift Container Platform (RHOCP) namespace where you want the control plane to be deployed:
$ oc project openstack

- Create the RHOSP secret. For more information, see Providing secure access to the Red Hat OpenStack Services on OpenShift services in Deploying Red Hat OpenStack Services on OpenShift.
If the
$ADMIN_PASSWORD is different than the password you set in osp-secret, amend the AdminPassword key in the osp-secret:

$ oc set data secret/osp-secret "AdminPassword=$ADMIN_PASSWORD"

Set service account passwords in
osp-secretto match the service account passwords from the original deployment:$ oc set data secret/osp-secret "AodhPassword=$AODH_PASSWORD" $ oc set data secret/osp-secret "BarbicanPassword=$BARBICAN_PASSWORD" $ oc set data secret/osp-secret "CeilometerPassword=$CEILOMETER_PASSWORD" $ oc set data secret/osp-secret "CinderPassword=$CINDER_PASSWORD" $ oc set data secret/osp-secret "GlancePassword=$GLANCE_PASSWORD" $ oc set data secret/osp-secret "HeatAuthEncryptionKey=$HEAT_AUTH_ENCRYPTION_KEY" $ oc set data secret/osp-secret "HeatPassword=$HEAT_PASSWORD" $ oc set data secret/osp-secret "HeatStackDomainAdminPassword=$HEAT_STACK_DOMAIN_ADMIN_PASSWORD" $ oc set data secret/osp-secret "IronicPassword=$IRONIC_PASSWORD" $ oc set data secret/osp-secret "IronicInspectorPassword=$IRONIC_PASSWORD" $ oc set data secret/osp-secret "ManilaPassword=$MANILA_PASSWORD" $ oc set data secret/osp-secret "MetadataSecret=$METADATA_SECRET" $ oc set data secret/osp-secret "NeutronPassword=$NEUTRON_PASSWORD" $ oc set data secret/osp-secret "NovaPassword=$NOVA_PASSWORD" $ oc set data secret/osp-secret "OctaviaPassword=$OCTAVIA_PASSWORD" $ oc set data secret/osp-secret "PlacementPassword=$PLACEMENT_PASSWORD" $ oc set data secret/osp-secret "SwiftPassword=$SWIFT_PASSWORD"Deploy the
OpenStackControlPlaneCR. Ensure that you only enable the DNS, Galera, Memcached, and RabbitMQ services. All other services must be disabled:$ oc apply -f - <<EOF apiVersion: core.openstack.org/v1beta1 kind: OpenStackControlPlane metadata: name: openstack spec: secret: osp-secret storageClass: <storage_class> barbican: enabled: false template: barbicanAPI: {} barbicanWorker: {} barbicanKeystoneListener: {} cinder: enabled: false template: cinderAPI: {} cinderScheduler: {} cinderBackup: {} cinderVolumes: {} dns: template: override: service: metadata: annotations: metallb.universe.tf/address-pool: ctlplane metallb.universe.tf/allow-shared-ip: ctlplane metallb.universe.tf/loadBalancerIPs: <loadBalancer_IP> spec: type: LoadBalancer options: - key: server values: - <DNS_server_IP> replicas: 1 glance: enabled: false template: glanceAPIs: {} heat: enabled: false template: {} horizon: enabled: false template: {} ironic: enabled: false template: ironicConductors: [] keystone: enabled: false template: {} manila: enabled: false template: manilaAPI: {} manilaScheduler: {} manilaShares: {} galera: enabled: true templates: openstack: secret: osp-secret replicas: 3 storageRequest: 5G openstack-cell1: secret: osp-secret replicas: 3 storageRequest: 5G openstack-cell2: secret: osp-secret replicas: 1 storageRequest: 5G openstack-cell3: secret: osp-secret replicas: 1 storageRequest: 5G memcached: enabled: true templates: memcached: replicas: 3 neutron: enabled: false template: {} nova: enabled: false template: {} ovn: enabled: false template: ovnController: networkAttachment: tenant ovnNorthd: replicas: 0 ovnDBCluster: ovndbcluster-nb: replicas: 3 dbType: NB networkAttachment: internalapi ovndbcluster-sb: replicas: 3 dbType: SB networkAttachment: internalapi placement: enabled: false template: {} rabbitmq: templates: rabbitmq: override: service: metadata: annotations: metallb.universe.tf/address-pool: internalapi metallb.universe.tf/loadBalancerIPs: <loadBalancer_IP> spec: type: LoadBalancer rabbitmq-cell1: persistence: storage: 10Gi override: service: metadata: annotations: metallb.universe.tf/address-pool: internalapi metallb.universe.tf/loadBalancerIPs: <loadBalancer_IP> spec: type: LoadBalancer rabbitmq-cell2: persistence: storage: 10Gi override: service: metadata: annotations: metallb.universe.tf/address-pool: internalapi metallb.universe.tf/loadBalancerIPs: <loadBalancer_IP> spec: type: LoadBalancer rabbitmq-cell3: persistence: storage: 10Gi override: service: metadata: annotations: metallb.universe.tf/address-pool: internalapi metallb.universe.tf/loadBalancerIPs: <loadBalancer_IP> spec: type: LoadBalancer telemetry: enabled: false tls: podLevel: enabled: false ingress: enabled: false swift: enabled: false template: swiftRing: ringReplicas: 3 swiftStorage: replicas: 0 swiftProxy: replicas: 2 EOFwhere:
- <DNS_server_IP>
Specifies the value for the DNS server reachable from the
dnsmasq pod on the RHOCP cluster network. You can specify a generic DNS server as the value, for example, 1.1.1.1, or a DNS server for a specific domain, for example, /google.com/8.8.8.8.

Note

This DNS service, dnsmasq, provides DNS services for nodes on the RHOSO data plane. dnsmasq is different from the RHOSO DNS service (designate), which provides DNS as a service for cloud tenants.

- <storage_class>
- Specifies an existing storage class in your RHOCP cluster.
- <loadBalancer_IP>
Specifies the LoadBalancer IP address. If you use IPv6, change the load balancer IPs to the IPs in your environment, for example:
...
    metallb.universe.tf/allow-shared-ip: ctlplane
    metallb.universe.tf/loadBalancerIPs: fd00:aaaa::80
...
    metallb.universe.tf/address-pool: internalapi
    metallb.universe.tf/loadBalancerIPs: fd00:bbbb::85
...
    metallb.universe.tf/address-pool: internalapi
    metallb.universe.tf/loadBalancerIPs: fd00:bbbb::86
galera.openstack-cell1provides the required infrastructure database and messaging services for the Compute cells, for example,cell1,cell2, andcell3. Adjust the values for fields such asreplicas,storage, orstorageRequest, for each Compute cell as needed. spec.tlsspecifies whether TLS-e is enabled. If you enabled TLS-e in your RHOSP environment, settlsto the following:spec: ... tls: podLevel: enabled: true internal: ca: customIssuer: rootca-internal libvirt: ca: customIssuer: rootca-internal ovn: ca: customIssuer: rootca-internal ingress: ca: customIssuer: rootca-internal enabled: true
-
Verification
Verify that the Galera and RabbitMQ status is
Running for all defined cells:

$ RENAMED_CELLS="cell1 cell2 cell3"
$ oc get pod openstack-galera-0 -o jsonpath='{.status.phase}{"\n"}'
$ oc get pod rabbitmq-server-0 -o jsonpath='{.status.phase}{"\n"}'
$ for CELL in $(echo $RENAMED_CELLS); do
>   oc get pod openstack-$CELL-galera-0 -o jsonpath='{.status.phase}{"\n"}'
>   oc get pod rabbitmq-$CELL-server-0 -o jsonpath='{.status.phase}{"\n"}'
> done

The given cell names are later referenced through the RENAMED_CELLS environment variable. During the database migration procedure, the Nova cells are renamed, and RENAMED_CELLS represents the new cell names that are used in the RHOSO deployment.

Ensure that the status of all the RabbitMQ and Galera CRs is
Setup complete:

$ oc get Rabbitmqs,Galera
NAME                                             STATUS   MESSAGE
rabbitmq.rabbitmq.openstack.org/rabbitmq         True     Setup complete
rabbitmq.rabbitmq.openstack.org/rabbitmq-cell1   True     Setup complete

NAME                                           READY   MESSAGE
galera.mariadb.openstack.org/openstack         True    Setup complete
galera.mariadb.openstack.org/openstack-cell1   True    Setup complete

Verify that the OpenStackControlPlane CR is waiting for deployment of the openstackclient pod:

$ oc get OpenStackControlPlane openstack
NAME        STATUS    MESSAGE
openstack   Unknown   OpenStackControlPlane Client not started
3.3. Configuring a Red Hat Ceph Storage back end
If your Red Hat OpenStack Platform (RHOSP) 17.1 deployment uses a Red Hat Ceph Storage back end for any service, such as Image Service (glance), Block Storage service (cinder), Compute service (nova), or Shared File Systems service (manila), you must configure the custom resources (CRs) to use the same back end in the Red Hat OpenStack Services on OpenShift (RHOSO) 18.0 deployment.
To run ceph commands, you must use SSH to connect to a Red Hat Ceph Storage node and run sudo cephadm shell. This generates a Ceph orchestrator container that enables you to run administrative commands against the Red Hat Ceph Storage cluster. If you deployed the Red Hat Ceph Storage cluster by using director, you can launch the cephadm shell from an RHOSP Controller node.
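For example, a minimal check, assuming SSH access as root to a Red Hat Ceph Storage node (adjust the key path and host for your environment), to confirm that the cluster responds and to review the current client.openstack capabilities before you modify them:

$ CEPH_SSH="ssh -i <path to SSH key> root@<ceph node IP>"
# Run one-off ceph commands inside the cephadm shell container
$ $CEPH_SSH "sudo cephadm shell -- ceph -s"
$ $CEPH_SSH "sudo cephadm shell -- ceph auth get client.openstack"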
Prerequisites
-
- The OpenStackControlPlane CR is created.
- If your RHOSP 17.1 deployment uses the Shared File Systems service, the openstack keyring is updated. Modify the openstack user so that you can use it across all RHOSP services:

  ceph auth caps client.openstack \
    mgr 'allow *' \
    mon 'allow r, profile rbd' \
    osd 'profile rbd pool=vms, profile rbd pool=volumes, profile rbd pool=images, allow rw pool manila_data'

  Using the same user across all services makes it simpler to create a common Red Hat Ceph Storage secret that includes the keyring and ceph.conf file, and to propagate the secret to all the services that need it.

- The following shell variables are defined. Replace the example values with values that are correct for your environment:

  CEPH_SSH="ssh -i <path to SSH key> root@<node IP>"
  CEPH_KEY=$($CEPH_SSH "cat /etc/ceph/ceph.client.openstack.keyring | base64 -w 0")
  CEPH_CONF=$($CEPH_SSH "cat /etc/ceph/ceph.conf | base64 -w 0")
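As an optional sanity check that uses only the variables defined above, confirm that the captured keyring and configuration decode cleanly before you create the secret:

# The decoded keyring should start with the [client.openstack] section
$ echo "$CEPH_KEY" | base64 -d | head -n 1
# The decoded configuration should contain the fsid and monitor addresses
$ echo "$CEPH_CONF" | base64 -d | grep -E 'fsid|mon_host'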
Procedure
Create the
ceph-conf-files secret that includes the Red Hat Ceph Storage configuration:

$ oc apply -f - <<EOF
apiVersion: v1
data:
  ceph.client.openstack.keyring: $CEPH_KEY
  ceph.conf: $CEPH_CONF
kind: Secret
metadata:
  name: ceph-conf-files
type: Opaque
EOF

The content of the secret should be similar to the following example:

apiVersion: v1
kind: Secret
metadata:
  name: ceph-conf-files
stringData:
  ceph.client.openstack.keyring: |
    [client.openstack]
    key = <secret key>
    caps mgr = "allow *"
    caps mon = "allow r, profile rbd"
    caps osd = "profile rbd pool=vms, profile rbd pool=volumes, profile rbd pool=images, allow rw pool manila_data"
  ceph.conf: |
    [global]
    fsid = 7a1719e8-9c59-49e2-ae2b-d7eb08c695d4
    mon_host = 10.1.1.2,10.1.1.3,10.1.1.4

where:
mon_host specifies the addresses of the cluster's monitors.

If you use IPv6, use brackets for the mon_host value. For example:

mon_host = [v2:[fd00:cccc::100]:3300/0,v1:[fd00:cccc::100]:6789/0]

Note

For Distributed Compute Node (DCN) deployments with multiple Red Hat Ceph Storage clusters, create one secret per site. Each secret contains only the keys that the respective site requires. For more information on the rationale and key distribution pattern, see Red Hat Ceph Storage migration for Distributed Compute Node deployments.
The Red Hat Ceph Storage configuration files for all clusters are available on the RHOSP controller at either
/var/lib/tripleo-config/ceph/ or /etc/ceph. Copy them locally and create the per-site secrets:

$ CEPH_SSH="ssh root@<controller>"
$ CEPH_DIR="/var/lib/tripleo-config/ceph"
$ TMPDIR=$(mktemp -d)
$ $CEPH_SSH "cat ${CEPH_DIR}/central.conf" > ${TMPDIR}/central.conf
$ $CEPH_SSH "sudo cat ${CEPH_DIR}/central.client.openstack.keyring" > ${TMPDIR}/central.client.openstack.keyring
$ $CEPH_SSH "cat ${CEPH_DIR}/dcn1.conf" > ${TMPDIR}/dcn1.conf
$ $CEPH_SSH "sudo cat ${CEPH_DIR}/dcn1.client.openstack.keyring" > ${TMPDIR}/dcn1.client.openstack.keyring
$ $CEPH_SSH "cat ${CEPH_DIR}/dcn2.conf" > ${TMPDIR}/dcn2.conf
$ $CEPH_SSH "sudo cat ${CEPH_DIR}/dcn2.client.openstack.keyring" > ${TMPDIR}/dcn2.client.openstack.keyring

# Central site secret: contains all clusters
$ oc create secret generic ceph-conf-central \
  --from-file=${TMPDIR}/central.conf \
  --from-file=${TMPDIR}/central.client.openstack.keyring \
  --from-file=${TMPDIR}/dcn1.conf \
  --from-file=${TMPDIR}/dcn1.client.openstack.keyring \
  --from-file=${TMPDIR}/dcn2.conf \
  --from-file=${TMPDIR}/dcn2.client.openstack.keyring \
  -n openstack

# DCN1 edge site secret: central + local only
$ oc create secret generic ceph-conf-dcn1 \
  --from-file=${TMPDIR}/central.conf \
  --from-file=${TMPDIR}/central.client.openstack.keyring \
  --from-file=${TMPDIR}/dcn1.conf \
  --from-file=${TMPDIR}/dcn1.client.openstack.keyring \
  -n openstack

# DCN2 edge site secret: central + local only
$ oc create secret generic ceph-conf-dcn2 \
  --from-file=${TMPDIR}/central.conf \
  --from-file=${TMPDIR}/central.client.openstack.keyring \
  --from-file=${TMPDIR}/dcn2.conf \
  --from-file=${TMPDIR}/dcn2.client.openstack.keyring \
  -n openstack

$ rm -rf ${TMPDIR}

Repeat for each additional edge site. Each edge site secret must include the central cluster files and only the files for that edge site's local cluster.
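You can confirm which files each site secret carries before you reference the secrets in extraMounts. The secret names below follow the example above; adjust them to your site names:

# The central secret lists files for every cluster; each edge secret lists
# only the central files plus its own site files
$ oc describe secret ceph-conf-central -n openstack
$ oc describe secret ceph-conf-dcn1 -n openstack
$ oc describe secret ceph-conf-dcn2 -n openstack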
When configuring
extraMounts on the OpenStackControlPlane, use propagation labels that match the service instance names (for example, central, dcn1, dcn2) so that each pod mounts only its site-specific secret.
In your
OpenStackControlPlaneCR, inject the Red Hat Ceph Storage configuration into the RHOSP service pods usingextraMounts. For a single-cluster deployment, propagate one secret to all services:$ oc patch openstackcontrolplane openstack --type=merge --patch ' spec: extraMounts: - name: v1 region: r1 extraVol: - propagation: - CinderVolume - CinderBackup - GlanceAPI - ManilaShare extraVolType: Ceph volumes: - name: ceph projected: sources: - secret: name: ceph-conf-files mounts: - name: ceph mountPath: "/etc/ceph" readOnly: true 'For a DCN deployment with per-site secrets, use propagation labels matching each service instance name so that each pod receives only the keys for its site:
$ oc patch openstackcontrolplane openstack --type=merge --patch ' spec: extraMounts: - name: v1 region: r1 extraVol: - extraVolType: Ceph propagation: - central - CinderBackup - ManilaShare volumes: - name: ceph-central projected: sources: - secret: name: ceph-conf-central mounts: - name: ceph-central mountPath: "/etc/ceph" readOnly: true - extraVolType: Ceph propagation: - dcn1 volumes: - name: ceph-dcn1 projected: sources: - secret: name: ceph-conf-dcn1 mounts: - name: ceph-dcn1 mountPath: "/etc/ceph" readOnly: true - extraVolType: Ceph propagation: - dcn2 volumes: - name: ceph-dcn2 projected: sources: - secret: name: ceph-conf-dcn2 mounts: - name: ceph-dcn2 mountPath: "/etc/ceph" readOnly: true 'The propagation label
central matches the Image service and Block Storage service pod instances named central. The CinderBackup and ManilaShare labels are service-type propagations that apply to all Block Storage service backup and Shared File Systems service pods, which run only at the central site. Replace central, dcn1, and dcn2 with the instance names that are used in your deployment.
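After the control plane reconciles the patch, you can spot-check that a service pod sees the expected configuration files. The pod name below is a placeholder; substitute a pod from your deployment:

# List the Ceph configuration files that are visible inside a service pod
$ oc get pods | grep -E 'glance|cinder|manila'
$ oc rsh <service pod name> ls -l /etc/ceph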
3.4. Stopping Red Hat OpenStack Platform services
Before you start the Red Hat OpenStack Services on OpenShift (RHOSO) adoption, you must stop the Red Hat OpenStack Platform (RHOSP) services to avoid inconsistencies in the data that you migrate for the data plane adoption. Inconsistencies are caused by resource changes after the database is copied to the new deployment.
Do not yet stop the infrastructure management services, such as the following:
- Database
- RabbitMQ
- HAProxy Load Balancer
- Ceph-nfs
- Compute service
- Containerized modular libvirt daemons
- Object Storage service (swift) back-end services
Prerequisites
Ensure that there are no long-running tasks that require the services that you plan to stop, such as instance live migrations, volume migrations, volume creation, backup and restore, attaching, detaching, and other similar operations:
$ openstack server list --all-projects -c ID -c Status | grep -E '\| .+ing \|'
$ openstack volume list --all-projects -c ID -c Status | grep -E '\| .+ing \|' | grep -vi error
$ openstack volume backup list --all-projects -c ID -c Status | grep -E '\| .+ing \|' | grep -vi error
$ openstack share list --all-projects -c ID -c Status | grep -E '\| .+ing \|' | grep -vi error
$ openstack image list -c ID -c Status | grep -E '\| .+ing \|'

- Collect the topology-specific service configuration. For more information, see Retrieving topology-specific service configuration.
Define the following shell variables. The values are examples and refer to a single node standalone director deployment. Replace these example values with values that are correct for your environment:
CONTROLLER1_SSH="ssh -i <path to SSH key> root@<controller-1 IP>"
CONTROLLER2_SSH="ssh -i <path to SSH key> root@<controller-2 IP>"
CONTROLLER3_SSH="ssh -i <path to SSH key> root@<controller-3 IP>"

Specify the IP addresses of all Controller nodes, for example:

CONTROLLER1_SSH="ssh -i <path to SSH key> root@<controller-1 IP>"
CONTROLLER2_SSH="ssh -i <path to SSH key> root@<controller-2 IP>"
CONTROLLER3_SSH="ssh -i <path to SSH key> root@<controller-3 IP>"
# ...
<path_to_SSH_key> defines the path to your SSH key.
<controller-<X> IP> defines the IP addresses of all Controller nodes.
-
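Optionally, before you proceed, you can confirm that each defined Controller variable reaches its node; a minimal sketch:

# Verify SSH connectivity for every Controller variable that is defined
for i in {1..3}; do
  SSH_CMD=CONTROLLER${i}_SSH
  if [ ! -z "${!SSH_CMD}" ]; then
    echo "Controller $i reachable as: $(${!SSH_CMD} hostname)"
  fi
done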
Procedure
If your deployment enables CephFS through NFS as a back end for Shared File Systems service (manila), remove the following Pacemaker ordering and co-location constraints that govern the Virtual IP address of the
ceph-nfs service and the manila-share service:

# check the co-location and ordering constraints concerning "manila-share"
sudo pcs constraint list --full

# remove these constraints
sudo pcs constraint remove colocation-openstack-manila-share-ceph-nfs-INFINITY
sudo pcs constraint remove order-ceph-nfs-openstack-manila-share-Optional

Disable the RHOSP control plane services:
# Update the services list to be stopped ServicesToStop=("tripleo_aodh_api.service" "tripleo_aodh_api_cron.service" "tripleo_aodh_evaluator.service" "tripleo_aodh_listener.service" "tripleo_aodh_notifier.service" "tripleo_ceilometer_agent_central.service" "tripleo_ceilometer_agent_notification.service" "tripleo_designate_api.service" "tripleo_designate_backend_bind9.service" "tripleo_designate_central.service" "tripleo_designate_mdns.service" "tripleo_designate_producer.service" "tripleo_designate_worker.service" "tripleo_octavia_api.service" "tripleo_octavia_health_manager.service" "tripleo_octavia_rsyslog.service" "tripleo_octavia_driver_agent.service" "tripleo_octavia_housekeeping.service" "tripleo_octavia_worker.service" "tripleo_horizon.service" "tripleo_keystone.service" "tripleo_barbican_api.service" "tripleo_barbican_worker.service" "tripleo_barbican_keystone_listener.service" "tripleo_cinder_api.service" "tripleo_cinder_api_cron.service" "tripleo_cinder_scheduler.service" "tripleo_cinder_volume.service" "tripleo_cinder_backup.service" "tripleo_collectd.service" "tripleo_glance_api.service" "tripleo_gnocchi_api.service" "tripleo_gnocchi_metricd.service" "tripleo_gnocchi_statsd.service" "tripleo_manila_api.service" "tripleo_manila_api_cron.service" "tripleo_manila_scheduler.service" "tripleo_neutron_api.service" "tripleo_placement_api.service" "tripleo_nova_api_cron.service" "tripleo_nova_api.service" "tripleo_nova_conductor.service" "tripleo_nova_metadata.service" "tripleo_nova_scheduler.service" "tripleo_nova_vnc_proxy.service" "tripleo_aodh_api.service" "tripleo_aodh_api_cron.service" "tripleo_aodh_evaluator.service" "tripleo_aodh_listener.service" "tripleo_aodh_notifier.service" "tripleo_ceilometer_agent_central.service" "tripleo_ceilometer_agent_compute.service" "tripleo_ceilometer_agent_ipmi.service" "tripleo_ceilometer_agent_notification.service" "tripleo_ovn_cluster_northd.service" "tripleo_ironic_neutron_agent.service" "tripleo_ironic_api.service" "tripleo_ironic_inspector.service" "tripleo_ironic_conductor.service" "tripleo_ironic_inspector_dnsmasq.service" "tripleo_ironic_pxe_http.service" "tripleo_ironic_pxe_tftp.service" "tripleo_unbound.service") PacemakerResourcesToStop=("openstack-cinder-volume" "openstack-cinder-backup" "openstack-manila-share") echo "Stopping systemd OpenStack services" for service in ${ServicesToStop[*]}; do for i in {1..3}; do SSH_CMD=CONTROLLER${i}_SSH if [ ! -z "${!SSH_CMD}" ]; then echo "Stopping the $service in controller $i" if ${!SSH_CMD} sudo systemctl is-active $service; then ${!SSH_CMD} sudo systemctl stop $service fi fi done done echo "Checking systemd OpenStack services" for service in ${ServicesToStop[*]}; do for i in {1..3}; do SSH_CMD=CONTROLLER${i}_SSH if [ ! -z "${!SSH_CMD}" ]; then if ! ${!SSH_CMD} systemctl show $service | grep ActiveState=inactive >/dev/null; then echo "ERROR: Service $service still running on controller $i" else echo "OK: Service $service is not running on controller $i" fi fi done done echo "Stopping pacemaker OpenStack services" for i in {1..3}; do SSH_CMD=CONTROLLER${i}_SSH if [ ! -z "${!SSH_CMD}" ]; then echo "Using controller $i to run pacemaker commands" for resource in ${PacemakerResourcesToStop[*]}; do if ${!SSH_CMD} sudo pcs resource config $resource &>/dev/null; then echo "Stopping $resource" ${!SSH_CMD} sudo pcs resource disable $resource else echo "Service $resource not present" fi done break fi done echo "Checking pacemaker OpenStack services" for i in {1..3}; do SSH_CMD=CONTROLLER${i}_SSH if [ ! 
-z "${!SSH_CMD}" ]; then echo "Using controller $i to run pacemaker commands" for resource in ${PacemakerResourcesToStop[*]}; do if ${!SSH_CMD} sudo pcs resource config $resource &>/dev/null; then if ! ${!SSH_CMD} sudo pcs resource status $resource | grep Started; then echo "OK: Service $resource is stopped" else echo "ERROR: Service $resource is started" fi fi done break fi doneIf the status of each service is
OK, then the services stopped successfully.

For Distributed Compute Node (DCN) deployments where the Image service, Block Storage service, and Red Hat Ceph Storage services run on edge Compute nodes, stop the Image service, Block Storage service, and etcd services on all edge Compute nodes with the DistributedComputeHCI role:

Note

The DistributedComputeHCI role runs the GlanceApiEdge, CinderVolumeEdge, and Etcd services. A minimum of three nodes per site use this role. Skip this step if your DCN deployment does not run these services on edge Compute nodes. The examples in this procedure use hyper-converged (HCI) roles. If your deployment does not use HCI, the same services apply to the DistributedCompute role, which runs the same GlanceApiEdge, CinderVolumeEdge, and Etcd services but without Ceph OSD, Ceph Monitor, or Ceph Manager.

Define shell variables for your
DistributedComputeHCIedge Compute nodes:# DCN1 edge site DistributedComputeHCI nodes DCN1_HCI0_SSH="ssh -i <path to SSH key> root@<dcn1-hci-0 IP>" DCN1_HCI1_SSH="ssh -i <path to SSH key> root@<dcn1-hci-1 IP>" DCN1_HCI2_SSH="ssh -i <path to SSH key> root@<dcn1-hci-2 IP>" # DCN2 edge site DistributedComputeHCI nodes DCN2_HCI0_SSH="ssh -i <path to SSH key> root@<dcn2-hci-0 IP>" DCN2_HCI1_SSH="ssh -i <path to SSH key> root@<dcn2-hci-1 IP>" DCN2_HCI2_SSH="ssh -i <path to SSH key> root@<dcn2-hci-2 IP>"where:
<path to SSH key>-
Specifies the path to your SSH key for each
DistributedComputeHCIedge Compute node on each DCN edge site. <dcn1-hci-0 IP>,<dcn1-hci-1 IP>,<dcn1-hci-2 IP>-
Specifies the IP address for each
DistributedComputeHCIedge Compute node within the DCN1 edge site. <dcn2-hci-0 IP>,<dcn2-hci-1 IP>,<dcn2-hci-2 IP>-
Specifies the IP address for each
DistributedComputeHCIedge Compute node within the DCN2 edge site.
Stop the storage services on all
DistributedComputeHCInodes:# Services to stop on DistributedComputeHCI edge compute nodes DCN_HCI_SERVICES=("tripleo_glance_api_internal.service" "tripleo_cinder_volume.service" "tripleo_etcd.service") # List of all DistributedComputeHCI node SSH commands DCN_HCI_NODES=("$DCN1_HCI0_SSH" "$DCN1_HCI1_SSH" "$DCN1_HCI2_SSH" "$DCN2_HCI0_SSH" "$DCN2_HCI1_SSH" "$DCN2_HCI2_SSH") echo "Stopping storage services on DistributedComputeHCI nodes" for node_ssh in "${DCN_HCI_NODES[@]}"; do [ -z "$node_ssh" ] && continue echo "Processing node: $node_ssh" for service in "${DCN_HCI_SERVICES[@]}"; do if $node_ssh sudo systemctl is-active $service 2>/dev/null; then echo "Stopping $service" $node_ssh sudo systemctl stop $service fi done done echo "Checking storage services on DistributedComputeHCI nodes" for node_ssh in "${DCN_HCI_NODES[@]}"; do [ -z "$node_ssh" ] && continue for service in "${DCN_HCI_SERVICES[@]}"; do if ! $node_ssh systemctl show $service 2>/dev/null | grep ActiveState=inactive >/dev/null; then echo "ERROR: Service $service still running on $node_ssh" else echo "OK: Service $service is not running on $node_ssh" fi done doneNote-
On edge sites, the Image service runs with the service name
tripleo_glance_api_internal.service, which is different from the tripleo_glance_api.service on the central controller.
The Block Storage service volume service (
tripleo_cinder_volume.service) uses the same service name on both edge sites and the central controller. -
The etcd service (
tripleo_etcd.service) is used as a distributed lock manager (DLM) for the Block Storage service volume service running in active/active mode on edge sites.
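For example, a quick spot check on a single edge node after the stop loop completes, using one of the variables defined earlier; all three services should report inactive:

# Spot-check one DistributedComputeHCI node; each unit should report "inactive"
$DCN1_HCI0_SSH sudo systemctl is-active tripleo_glance_api_internal.service tripleo_cinder_volume.service tripleo_etcd.service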
If your DCN deployment includes
DistributedComputeHCIScaleOut nodes, stop the HAProxy service on those nodes:

Note

The DistributedComputeHCIScaleOut role is used to scale compute and storage capacity beyond the initial three DistributedComputeHCI nodes at each site. These nodes run HAProxyEdge, which proxies Image service requests to the GlanceApiEdge instances on DistributedComputeHCI nodes. Skip this step if your DCN deployment does not include DistributedComputeHCIScaleOut nodes. For non-HCI deployments, the equivalent role is DistributedComputeScaleOut, which runs the same HAProxyEdge service.

Define shell variables for your
DistributedComputeHCIScaleOutedge Compute nodes:# DCN1 edge site DistributedComputeHCIScaleOut nodes DCN1_SCALEOUT0_SSH="ssh -i <path to SSH key> root@<dcn1-scaleout-0 IP>" DCN1_SCALEOUT1_SSH="ssh -i <path to SSH key> root@<dcn1-scaleout-1 IP>" # DCN2 edge site DistributedComputeHCIScaleOut nodes DCN2_SCALEOUT0_SSH="ssh -i <path to SSH key> root@<dcn2-scaleout-0 IP>" DCN2_SCALEOUT1_SSH="ssh -i <path to SSH key> root@<dcn2-scaleout-1 IP>"where:
<dcn1-scaleout-0 IP>,<dcn1-scaleout-1 IP>-
Specifies the IP address for each
DistributedComputeHCIScaleOutnode within the DCN1 edge site. <dcn2-scaleout-0 IP>,<dcn2-scaleout-1 IP>-
Specifies the IP address for each
DistributedComputeHCIScaleOutnode within the DCN2 edge site.
Stop the services on all
DistributedComputeHCIScaleOutnodes:# Services to stop on DistributedComputeHCIScaleOut edge compute nodes DCN_SCALEOUT_SERVICES=("tripleo_haproxy_edge.service") # List of all DistributedComputeHCIScaleOut node SSH commands DCN_SCALEOUT_NODES=("$DCN1_SCALEOUT0_SSH" "$DCN1_SCALEOUT1_SSH" "$DCN2_SCALEOUT0_SSH" "$DCN2_SCALEOUT1_SSH") echo "Stopping services on DistributedComputeHCIScaleOut nodes" for node_ssh in "${DCN_SCALEOUT_NODES[@]}"; do [ -z "$node_ssh" ] && continue echo "Processing node: $node_ssh" for service in "${DCN_SCALEOUT_SERVICES[@]}"; do if $node_ssh sudo systemctl is-active $service 2>/dev/null; then echo "Stopping $service" $node_ssh sudo systemctl stop $service fi done done echo "Checking services on DistributedComputeHCIScaleOut nodes" for node_ssh in "${DCN_SCALEOUT_NODES[@]}"; do [ -z "$node_ssh" ] && continue for service in "${DCN_SCALEOUT_SERVICES[@]}"; do if ! $node_ssh systemctl show $service 2>/dev/null | grep ActiveState=inactive >/dev/null; then echo "ERROR: Service $service still running on $node_ssh" else echo "OK: Service $service is not running on $node_ssh" fi done doneNote-
The HAProxy edge service (
tripleo_haproxy_edge.service) provided a local Image service endpoint on DistributedComputeHCIScaleOut nodes, proxying requests to the GlanceApiEdge instances on DistributedComputeHCI nodes. During adoption, Red Hat OpenShift Container Platform (RHOCP) Kubernetes service endpoints backed by MetalLB replace HAProxy.
3.5. Migrating databases to MariaDB instances
Migrate your databases from the original Red Hat OpenStack Platform (RHOSP) deployment to the MariaDB instances in the Red Hat OpenShift Container Platform (RHOCP) cluster.
Prerequisites
- Ensure that the control plane MariaDB and RabbitMQ are running, and that no other control plane services are running.
- Retrieve the topology-specific service configuration. For more information, see Retrieving topology-specific service configuration.
- Stop the RHOSP services. For more information, see Stopping Red Hat OpenStack Platform services.
- Ensure that there is network routability between the original MariaDB and the MariaDB for the control plane.
Define the following shell variables. Replace the following example values with values that are correct for your environment:
$ STORAGE_CLASS=local-storage $ MARIADB_IMAGE=registry.redhat.io/rhoso/openstack-mariadb-rhel9:18.0 $ CELLS="default cell1 cell2" $ DEFAULT_CELL_NAME="cell3" $ RENAMED_CELLS="cell1 cell2 $DEFAULT_CELL_NAME" $ CHARACTER_SET=utf8 # $ COLLATION=utf8_general_ci $ declare -A PODIFIED_DB_ROOT_PASSWORD $ for CELL in $(echo "super $RENAMED_CELLS"); do > PODIFIED_DB_ROOT_PASSWORD[$CELL]=$(oc get -o json secret/osp-secret | jq -r .data.DbRootPassword | base64 -d) > done $ declare -A PODIFIED_MARIADB_IP $ for CELL in $(echo "super $RENAMED_CELLS"); do > if [ "$CELL" = "super" ]; then > PODIFIED_MARIADB_IP[$CELL]=$(oc get svc --selector "mariadb/name=openstack" -ojsonpath='{.items[0].spec.clusterIP}') > else > PODIFIED_MARIADB_IP[$CELL]=$(oc get svc --selector "mariadb/name=openstack-$CELL" -ojsonpath='{.items[0].spec.clusterIP}') > fi > done $ declare -A TRIPLEO_PASSWORDS $ for CELL in $(echo $CELLS); do > if [ "$CELL" = "default" ]; then > TRIPLEO_PASSWORDS[default]="$HOME/overcloud-passwords.yaml" > else > # in a split-stack source cloud, it should take a stack-specific passwords file instead > TRIPLEO_PASSWORDS[$CELL]="$HOME/overcloud-passwords.yaml" > fi > done $ declare -A SOURCE_DB_ROOT_PASSWORD $ for CELL in $(echo $CELLS); do > SOURCE_DB_ROOT_PASSWORD[$CELL]=$(cat ${TRIPLEO_PASSWORDS[$CELL]} | grep ' MysqlRootPassword:' | awk -F ': ' '{ print $2; }') > done $ declare -A SOURCE_MARIADB_IP $ SOURCE_MARIADB_IP[default]=*<galera cluster VIP>* $ SOURCE_MARIADB_IP[cell1]=*<galera cell1 cluster VIP>* $ SOURCE_MARIADB_IP[cell2]=*<galera cell2 cluster VIP>* # ... $ declare -A SOURCE_GALERA_MEMBERS_DEFAULT $ SOURCE_GALERA_MEMBERS_DEFAULT=( > ["standalone.localdomain"]=172.17.0.100 > # [...]=... > ) $ declare -A SOURCE_GALERA_MEMBERS_CELL1 $ SOURCE_GALERA_MEMBERS_CELL1=( > # ... > ) $ declare -A SOURCE_GALERA_MEMBERS_CELL2 $ SOURCE_GALERA_MEMBERS_CELL2=( > # ... > )-
CELLS and RENAMED_CELLS represent changes that are made after you import the databases. The default cell takes a new name from DEFAULT_CELL_NAME. In a multi-cell adoption scenario, the default cell might also retain its original name.
- CHARACTER_SET and COLLATION must match the source database. If they do not match, foreign key relationships break for any tables that are created in the future as part of the database sync.
- SOURCE_MARIADB_IP[X]=... includes the data for each cell that is defined in CELLS. Provide records for the cell names and VIP addresses of the MariaDB Galera clusters.
- <galera_cell1_cluster_VIP> defines the VIP of your Galera cell1 cluster.
- <galera_cell2_cluster_VIP> defines the VIP of your Galera cell2 cluster, and so on.
- SOURCE_GALERA_MEMBERS_CELL<X> defines the names of the MariaDB Galera cluster members and their IP addresses for each cell that is defined in CELLS. Replace ["standalone.localdomain"]="172.17.0.100" with the real host data.
-
A standalone director environment only creates a default cell, which should be the only CELLS value in this case. The DEFAULT_CELL_NAME value should be cell1.
The super instance is the top-scope Nova API (upcall) database instance; the super conductor connects to that database. In the subsequent examples, the upcall and cell databases use the same password that is defined in osp-secret. The old passwords are needed only to prepare the data exports.
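For reference, you can read the control plane database root password, which the examples reuse for the upcall and cell database instances, directly from osp-secret:

# The same DbRootPassword value is used for the super (upcall) and cell databases
$ oc get secret osp-secret -o jsonpath='{.data.DbRootPassword}' | base64 -d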
To get the values for
SOURCE_MARIADB_IP, query the puppet-generated configurations on the Controller and CellController nodes:

$ sudo grep -rI 'listen mysql' -A10 /var/lib/config-data/puppet-generated/ | grep bind

To get the values for SOURCE_GALERA_MEMBERS_*, query the puppet-generated configurations on the Controller and CellController nodes:

$ sudo grep -rI 'listen mysql' -A10 /var/lib/config-data/puppet-generated/ | grep server

The source cloud always uses the same password for the cell databases. For that reason, the same passwords file is used for all cell stacks. However, a split-stack topology allows using a different passwords file for each stack.
Prepare the MariaDB adoption helper pod:
Create a temporary volume claim and a pod for the database data copy. Edit the volume claim storage request if necessary, to give it enough space for the overcloud databases:
$ oc apply -f - <<EOF --- apiVersion: v1 kind: PersistentVolumeClaim metadata: name: mariadb-data spec: storageClassName: $STORAGE_CLASS accessModes: - ReadWriteOnce resources: requests: storage: 10Gi --- apiVersion: v1 kind: Pod metadata: name: mariadb-copy-data annotations: openshift.io/scc: anyuid k8s.v1.cni.cncf.io/networks: internalapi labels: app: adoption spec: containers: - image: $MARIADB_IMAGE command: [ "sh", "-c", "sleep infinity"] name: adoption volumeMounts: - mountPath: /backup name: mariadb-data securityContext: allowPrivilegeEscalation: false capabilities: drop: ALL runAsNonRoot: true seccompProfile: type: RuntimeDefault volumes: - name: mariadb-data persistentVolumeClaim: claimName: mariadb-data EOFWait for the pod to be ready:
$ oc wait --for condition=Ready pod/mariadb-copy-data --timeout=30s
Procedure
Check that the source Galera database clusters in each cell have their members online and synced:
for CELL in $(echo $CELLS); do
  MEMBERS=SOURCE_GALERA_MEMBERS_$(echo ${CELL}|tr '[:lower:]' '[:upper:]')[@]
  for i in "${!MEMBERS}"; do
    echo "Checking for the database node $i WSREP status Synced"
    oc rsh mariadb-copy-data mysql \
      -h "$i" -uroot -p"${SOURCE_DB_ROOT_PASSWORD[$CELL]}" \
      -e "show global status like 'wsrep_local_state_comment'" | \
      grep -qE "\bSynced\b"
  done
done

Note

Each additional Compute service (nova) v2 cell runs a dedicated Galera database cluster, so the command checks each cell.
Get the count of source databases with the
NOK (not-OK) status:

for CELL in $(echo $CELLS); do
  oc rsh mariadb-copy-data mysql -h "${SOURCE_MARIADB_IP[$CELL]}" -uroot -p"${SOURCE_DB_ROOT_PASSWORD[$CELL]}" -e "SHOW databases;"
done

Check that
mysqlcheck had no errors:

for CELL in $(echo $CELLS); do
  set +u
  . ~/.source_cloud_exported_variables_$CELL
  set -u
  test -z "${PULL_OPENSTACK_CONFIGURATION_MYSQLCHECK_NOK[$CELL]}" || [ "${PULL_OPENSTACK_CONFIGURATION_MYSQLCHECK_NOK[$CELL]}" = " " ] && echo "OK" || echo "CHECK FAILED"
done

Test the connection to the control plane upcall and cells databases:
for CELL in $(echo "super $RENAMED_CELLS"); do oc rsh mariadb-copy-data mysql -rsh "${PODIFIED_MARIADB_IP[$CELL]}" -uroot -p"${PODIFIED_DB_ROOT_PASSWORD[$CELL]}" -e 'SHOW databases;' doneNoteYou must transition Compute services that you import later into a superconductor architecture by deleting the old service records in the cell databases, starting with
cell1. New records are registered with different hostnames that are provided by the Compute service operator. All Compute services, except the Compute agent, have no internal state, and you can safely delete their service records. You also need to rename the former default cell to DEFAULT_CELL_NAME.

Create a dump of the original databases:
for CELL in $(echo $CELLS); do
  oc rsh mariadb-copy-data << EOF
    mysql -h"${SOURCE_MARIADB_IP[$CELL]}" -uroot -p"${SOURCE_DB_ROOT_PASSWORD[$CELL]}" \
    -N -e "show databases" | grep -E -v "schema|mysql|gnocchi|aodh" | \
    while read dbname; do
      echo "Dumping $CELL cell \${dbname}";
      mysqldump -h"${SOURCE_MARIADB_IP[$CELL]}" -uroot -p"${SOURCE_DB_ROOT_PASSWORD[$CELL]}" \
      --single-transaction --complete-insert --skip-lock-tables --lock-tables=0 \
      "\${dbname}" > /backup/"${CELL}.\${dbname}".sql;
    done
EOF
done

Restore the databases from
.sqlfiles into the control plane MariaDB:for CELL in $(echo $CELLS); do RCELL=$CELL [ "$CELL" = "default" ] && RCELL=$DEFAULT_CELL_NAME oc rsh mariadb-copy-data << EOF declare -A db_name_map db_name_map['nova']="nova_$RCELL" db_name_map['ovs_neutron']='neutron' db_name_map['ironic-inspector']='ironic_inspector' declare -A db_cell_map db_cell_map['nova']="nova_$DEFAULT_CELL_NAME" db_cell_map["nova_$RCELL"]="nova_$RCELL" declare -A db_server_map db_server_map['default']=${PODIFIED_MARIADB_IP['super']} db_server_map["nova"]=${PODIFIED_MARIADB_IP[$DEFAULT_CELL_NAME]} db_server_map["nova_$RCELL"]=${PODIFIED_MARIADB_IP[$RCELL]} declare -A db_server_password_map db_server_password_map['default']=${PODIFIED_DB_ROOT_PASSWORD['super']} db_server_password_map["nova"]=${PODIFIED_DB_ROOT_PASSWORD[$DEFAULT_CELL_NAME]} db_server_password_map["nova_$RCELL"]=${PODIFIED_DB_ROOT_PASSWORD[$RCELL]} cd /backup for db_file in \$(ls ${CELL}.*.sql); do db_name=\$(echo \${db_file} | awk -F'.' '{ print \$2; }') [[ "$CELL" != "default" && ! -v "db_cell_map[\${db_name}]" ]] && continue if [[ "$CELL" == "default" && -v "db_cell_map[\${db_name}]" ]] ; then target=$DEFAULT_CELL_NAME elif [[ "$CELL" == "default" && ! -v "db_cell_map[\${db_name}]" ]] ; then target=super else target=$RCELL fi renamed_db_file="\${target}_new.\${db_name}.sql" mv -f \${db_file} \${renamed_db_file} if [[ -v "db_name_map[\${db_name}]" ]]; then echo "renaming $CELL cell \${db_name} to \$target \${db_name_map[\${db_name}]}" db_name=\${db_name_map[\${db_name}]} fi db_server=\${db_server_map["default"]} if [[ -v "db_server_map[\${db_name}]" ]]; then db_server=\${db_server_map[\${db_name}]} fi db_password=\${db_server_password_map['default']} if [[ -v "db_server_password_map[\${db_name}]" ]]; then db_password=\${db_server_password_map[\${db_name}]} fi echo "creating $CELL cell \${db_name} in \$target \${db_server}" mysql -h"\${db_server}" -uroot "-p\${db_password}" -e \ "CREATE DATABASE IF NOT EXISTS \${db_name} DEFAULT \ CHARACTER SET ${CHARACTER_SET} DEFAULT COLLATE ${COLLATION};" echo "importing $CELL cell \${db_name} into \$target \${db_server} from \${renamed_db_file}" mysql -h "\${db_server}" -uroot "-p\${db_password}" "\${db_name}" < "\${renamed_db_file}" done if [ "$CELL" = "default" ] ; then mysql -h "\${db_server_map['default']}" -uroot -p"\${db_server_password_map['default']}" -e \ "update nova_api.cell_mappings set name='$DEFAULT_CELL_NAME' where name='default';" fi mysql -h "\${db_server_map["nova_$RCELL"]}" -uroot -p"\${db_server_password_map["nova_$RCELL"]}" -e \ "delete from nova_${RCELL}.services where host not like '%nova_${RCELL}-%' and services.binary != 'nova-compute';" EOF done-
db_name_map defines which common databases to rename when importing them.
db_cell_map defines which cell databases to import, and how to rename them, if needed.
db_cell_map["nova_$RCELL"]="nova_$RCELL" omits importing the special cell0 databases of the cells, because their contents cannot be consolidated during adoption.
db_server_map defines which databases to import into which servers, usually dedicated for cells.
db_server_password_map defines the root password map for the database servers. You can only use the same password for now.
renamed_db_file="\${target}_new.\${db_name}.sql" assigns which databases to import into which hosts when extracting databases from the default cell.
-
Verification
Compare the following outputs with the topology-specific service configuration. For more information, see Retrieving topology-specific service configuration.
Check that the databases are imported correctly:
$ set +u $ . ~/.source_cloud_exported_variables_default $ set -u $ dbs=$(oc exec openstack-galera-0 -c galera -- mysql -rs -uroot -p"${PODIFIED_DB_ROOT_PASSWORD['super']}" -e 'SHOW databases;') $ echo $dbs | grep -Eq '\bkeystone\b' && echo "OK" || echo "CHECK FAILED" $ echo $dbs | grep -Eq '\bneutron\b' && echo "OK" || echo "CHECK FAILED" $ echo "${PULL_OPENSTACK_CONFIGURATION_DATABASES[@]}" | grep -Eq '\bovs_neutron\b' && echo "OK" || echo "CHECK FAILED" $ novadb_mapped_cells=$(oc exec openstack-galera-0 -c galera -- mysql -rs -uroot -p"${PODIFIED_DB_ROOT_PASSWORD['super']}" \ > nova_api -e 'select uuid,name,transport_url,database_connection,disabled from cell_mappings;') > uuidf='\S{8,}-\S{4,}-\S{4,}-\S{4,}-\S{12,}' > default=$(printf "%s\n" "$PULL_OPENSTACK_CONFIGURATION_NOVADB_MAPPED_CELLS" | sed -rn "s/^($uuidf)\s+default\b.*$/\1/p") > difference=$(diff -ZNua \ > <(printf "%s\n" "$PULL_OPENSTACK_CONFIGURATION_NOVADB_MAPPED_CELLS") \ > <(printf "%s\n" "$novadb_mapped_cells")) || true > if [ "$DEFAULT_CELL_NAME" != "default" ]; then > printf "%s\n" "$difference" | grep -qE "^\-$default\s+default\b" && echo "OK" || echo "CHECK FAILED" > printf "%s\n" "$difference" | grep -qE "^\+$default\s+$DEFAULT_CELL_NAME\b" && echo "OK" || echo "CHECK FAILED" > [ $(grep -E "^[-\+]$uuidf" <<<"$difference" | wc -l) -eq 2 ] && echo "OK" || echo "CHECK FAILED" > else > [ "x$difference" = "x" ] && echo "OK" || echo "CHECK FAILED" > fi > for CELL in $(echo $RENAMED_CELLS); do > RCELL=$CELL > [ "$CELL" = "$DEFAULT_CELL_NAME" ] && RCELL=default > set +u > . ~/.source_cloud_exported_variables_$RCELL > set -u > c1dbs=$(oc exec openstack-$CELL-galera-0 -c galera -- mysql -rs -uroot -p${PODIFIED_DB_ROOT_PASSWORD[$CELL]} -e 'SHOW databases;') > echo $c1dbs | grep -Eq "\bnova_${CELL}\b" && echo "OK" || echo "CHECK FAILED" > novadb_svc_records=$(oc exec openstack-$CELL-galera-0 -c galera -- mysql -rs -uroot -p${PODIFIED_DB_ROOT_PASSWORD[$CELL]} \ > nova_$CELL -e "select host from services where services.binary='nova-compute' and deleted=0 order by host asc;") > diff -Z <(echo "x$novadb_svc_records") <(echo "x${PULL_OPENSTACK_CONFIGURATION_NOVA_COMPUTE_HOSTNAMES[@]}") && echo "OK" || echo "CHECK FAILED" > done-
echo "${PULL_OPENSTACK_CONFIGURATION_DATABASES[@]}" | grep -Eq '\bovs_neutron\b' && echo "OK" || echo "CHECK FAILED"ensures that the Networking service (neutron) database is renamed fromovs_neutron. -
nova_api -e 'select uuid,name,transport_url,database_connection,disabled from cell_mappings;')ensures that thedefaultcell is renamed to$DEFAULT_CELL_NAME, and the cell UUIDs are retained. -
for CELL in $(echo $RENAMED_CELLS); doensures that the registered Compute services names have not changed. -
c1dbs=$(oc exec openstack-$CELL-galera-0 -c galera -- mysql -rs -uroot -p${PODIFIED_DB_ROOT_PASSWORD[$CELL]} -e 'SHOW databases;')ensures Compute service cells databases are extracted to separate database servers, and renamed fromnovatonova_cell<X>. -
diff -Z <(echo "x$novadb_svc_records") <(echo "x${PULL_OPENSTACK_CONFIGURATION_NOVA_COMPUTE_HOSTNAMES[@]}") && echo "OK" || echo "CHECK FAILED"ensures that the registered Compute service name has not changed.
-
Delete the
mariadb-copy-data pod and the mariadb-data persistent volume claim that contains the database backup:

Note

Consider taking a snapshot of them before deleting.
$ oc delete pod mariadb-copy-data
$ oc delete pvc mariadb-data
During the pre-checks and post-checks, the mariadb-client pod might return a pod security warning related to the restricted:latest security context constraint. This warning is due to default security context constraints and does not prevent the admission controller from creating a pod. You see a warning for the short-lived pod, but it does not interfere with functionality. For more information, see About pod security standards and warnings.
3.6. Migrating OVN data
Migrate the data in the OVN databases from the original Red Hat OpenStack Platform deployment to ovsdb-server instances that are running in the Red Hat OpenShift Container Platform (RHOCP) cluster.
Prerequisites
-
The
OpenStackControlPlane resource is created.
NetworkAttachmentDefinition custom resources (CRs) for the original cluster are defined. Specifically, the internalapi network is defined.
The original Networking service (neutron) and OVN
northd are not running.
- There is network routability between the control plane services and the adopted cluster.
- The cloud is migrated to the Modular Layer 2 plug-in with Open Virtual Networking (ML2/OVN) mechanism driver.
Define the following shell variables. Replace the example values with values that are correct for your environment:
STORAGE_CLASS=local-storage
OVSDB_IMAGE=registry.redhat.io/rhoso/openstack-ovn-base-rhel9:18.0
SOURCE_OVSDB_IP=172.17.0.100       # For IPv4
SOURCE_OVSDB_IP=[fd00:bbbb::100]   # For IPv6

To get the value to set SOURCE_OVSDB_IP, query the puppet-generated configurations on a Controller node:

$ grep -rI 'ovn_[ns]b_conn' /var/lib/config-data/puppet-generated/
Procedure
Prepare a temporary
PersistentVolumeclaim and the helper pod for the OVN backup. Adjust the storage requests for a large database, if needed:$ oc apply -f - <<EOF --- apiVersion: cert-manager.io/v1 kind: Certificate metadata: name: ovn-data-cert spec: commonName: ovn-data-cert secretName: ovn-data-cert issuerRef: name: rootca-internal --- apiVersion: v1 kind: PersistentVolumeClaim metadata: name: ovn-data spec: storageClassName: $STORAGE_CLASS accessModes: - ReadWriteOnce resources: requests: storage: 10Gi --- apiVersion: v1 kind: Pod metadata: name: ovn-copy-data annotations: openshift.io/scc: anyuid k8s.v1.cni.cncf.io/networks: internalapi labels: app: adoption spec: containers: - image: $OVSDB_IMAGE command: [ "sh", "-c", "sleep infinity"] name: adoption volumeMounts: - mountPath: /backup name: ovn-data - mountPath: /etc/pki/tls/misc name: ovn-data-cert readOnly: true securityContext: allowPrivilegeEscalation: false capabilities: drop: ALL runAsNonRoot: true seccompProfile: type: RuntimeDefault volumes: - name: ovn-data persistentVolumeClaim: claimName: ovn-data - name: ovn-data-cert secret: secretName: ovn-data-cert EOFWait for the pod to be ready:
$ oc wait --for=condition=Ready pod/ovn-copy-data --timeout=30s

If the podified internalapi CIDR is different from the source internalapi CIDR, add an iptables accept rule on the Controller nodes:
$ $CONTROLLER1_SSH sudo iptables -I INPUT -s {PODIFIED_INTERNALAPI_NETWORK} -p tcp -m tcp --dport 6641 -m conntrack --ctstate NEW -j ACCEPT
$ $CONTROLLER1_SSH sudo iptables -I INPUT -s {PODIFIED_INTERNALAPI_NETWORK} -p tcp -m tcp --dport 6642 -m conntrack --ctstate NEW -j ACCEPT

Back up your OVN databases:
If you did not enable TLS everywhere, run the following command:
$ oc exec ovn-copy-data -- bash -c "ovsdb-client backup tcp:$SOURCE_OVSDB_IP:6641 > /backup/ovs-nb.db" $ oc exec ovn-copy-data -- bash -c "ovsdb-client backup tcp:$SOURCE_OVSDB_IP:6642 > /backup/ovs-sb.db"If you enabled TLS everywhere, run the following command:
$ oc exec ovn-copy-data -- bash -c "ovsdb-client backup --ca-cert=/etc/pki/tls/misc/ca.crt --private-key=/etc/pki/tls/misc/tls.key --certificate=/etc/pki/tls/misc/tls.crt ssl:$SOURCE_OVSDB_IP:6641 > /backup/ovs-nb.db" $ oc exec ovn-copy-data -- bash -c "ovsdb-client backup --ca-cert=/etc/pki/tls/misc/ca.crt --private-key=/etc/pki/tls/misc/tls.key --certificate=/etc/pki/tls/misc/tls.crt ssl:$SOURCE_OVSDB_IP:6642 > /backup/ovs-sb.db"
Start the control plane OVN database services prior to import, with
northddisabled:$ oc patch openstackcontrolplane openstack --type=merge --patch ' spec: ovn: enabled: true template: ovnDBCluster: ovndbcluster-nb: replicas: 3 dbType: NB storageRequest: 10G networkAttachment: internalapi ovndbcluster-sb: replicas: 3 dbType: SB storageRequest: 10G networkAttachment: internalapi ovnNorthd: replicas: 0 'Wait for the OVN database services to reach the
Running phase:

$ oc wait --for=jsonpath='{.status.phase}'=Running pod --selector=service=ovsdbserver-nb
$ oc wait --for=jsonpath='{.status.phase}'=Running pod --selector=service=ovsdbserver-sb
clusterIP service network:

PODIFIED_OVSDB_NB_IP=$(oc get svc --selector "statefulset.kubernetes.io/pod-name=ovsdbserver-nb-0" -ojsonpath='{.items[0].spec.clusterIP}')
PODIFIED_OVSDB_SB_IP=$(oc get svc --selector "statefulset.kubernetes.io/pod-name=ovsdbserver-sb-0" -ojsonpath='{.items[0].spec.clusterIP}')
ovsdb-* tools:

PODIFIED_OVSDB_NB_IP=[$PODIFIED_OVSDB_NB_IP]
PODIFIED_OVSDB_SB_IP=[$PODIFIED_OVSDB_SB_IP]

Upgrade the database schema for the backup files:
If you did not enable TLS everywhere, use the following command:
$ oc exec ovn-copy-data -- bash -c "ovsdb-client get-schema tcp:$PODIFIED_OVSDB_NB_IP:6641 > /backup/ovs-nb.ovsschema && ovsdb-tool convert /backup/ovs-nb.db /backup/ovs-nb.ovsschema" $ oc exec ovn-copy-data -- bash -c "ovsdb-client get-schema tcp:$PODIFIED_OVSDB_SB_IP:6642 > /backup/ovs-sb.ovsschema && ovsdb-tool convert /backup/ovs-sb.db /backup/ovs-sb.ovsschema"If you enabled TLS everywhere, use the following command:
$ oc exec ovn-copy-data -- bash -c "ovsdb-client get-schema --ca-cert=/etc/pki/tls/misc/ca.crt --private-key=/etc/pki/tls/misc/tls.key --certificate=/etc/pki/tls/misc/tls.crt ssl:$PODIFIED_OVSDB_NB_IP:6641 > /backup/ovs-nb.ovsschema && ovsdb-tool convert /backup/ovs-nb.db /backup/ovs-nb.ovsschema" $ oc exec ovn-copy-data -- bash -c "ovsdb-client get-schema --ca-cert=/etc/pki/tls/misc/ca.crt --private-key=/etc/pki/tls/misc/tls.key --certificate=/etc/pki/tls/misc/tls.crt ssl:$PODIFIED_OVSDB_SB_IP:6642 > /backup/ovs-sb.ovsschema && ovsdb-tool convert /backup/ovs-sb.db /backup/ovs-sb.ovsschema"
Restore the database backup to the new OVN database servers:
If you did not enable TLS everywhere, use the following command:
$ oc exec ovn-copy-data -- bash -c "ovsdb-client restore tcp:$PODIFIED_OVSDB_NB_IP:6641 < /backup/ovs-nb.db" $ oc exec ovn-copy-data -- bash -c "ovsdb-client restore tcp:$PODIFIED_OVSDB_SB_IP:6642 < /backup/ovs-sb.db"If you enabled TLS everywhere, use the following command:
$ oc exec ovn-copy-data -- bash -c "ovsdb-client restore --ca-cert=/etc/pki/tls/misc/ca.crt --private-key=/etc/pki/tls/misc/tls.key --certificate=/etc/pki/tls/misc/tls.crt ssl:$PODIFIED_OVSDB_NB_IP:6641 < /backup/ovs-nb.db" $ oc exec ovn-copy-data -- bash -c "ovsdb-client restore --ca-cert=/etc/pki/tls/misc/ca.crt --private-key=/etc/pki/tls/misc/tls.key --certificate=/etc/pki/tls/misc/tls.crt ssl:$PODIFIED_OVSDB_SB_IP:6642 < /backup/ovs-sb.db"
Check that the data was successfully migrated by running the following commands against the new database servers, for example:
$ oc exec -it ovsdbserver-nb-0 -- ovn-nbctl show
$ oc exec -it ovsdbserver-sb-0 -- ovn-sbctl list Chassis

Start the control plane
ovn-northd service to keep both OVN databases in sync:

$ oc patch openstackcontrolplane openstack --type=merge --patch '
spec:
  ovn:
    enabled: true
    template:
      ovnNorthd:
        replicas: 1
'

If you are running OVN gateway services on RHOCP nodes, enable the control plane
ovn-controllerservice:$ oc patch openstackcontrolplane openstack --type=merge --patch ' spec: ovn: enabled: true template: ovnController: nicMappings: physNet: NIC 'physNetdefines the name of your physical network.NICis the name of the physical interface that is connected to your physical network.NoteRunning OVN gateways on RHOCP nodes might be prone to data plane downtime during Open vSwitch upgrades. Consider running OVN gateways on dedicated
Networker data plane nodes for production deployments instead.
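For example, a minimal sketch of the same patch with concrete values, assuming a physical network named datacentre that is attached to the ospbr interface; replace both names with the values from your environment:

$ oc patch openstackcontrolplane openstack --type=merge --patch '
spec:
  ovn:
    enabled: true
    template:
      ovnController:
        nicMappings:
          datacentre: ospbr
'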
Delete the
ovn-copy-data helper pod and the temporary ovn-data PersistentVolumeClaim that is used to store the OVN database backup files:

$ oc delete --ignore-not-found=true pod ovn-copy-data
$ oc delete --ignore-not-found=true pvc ovn-data

Note

Consider taking a snapshot of the ovn-copy-data helper pod and the temporary ovn-data PersistentVolumeClaim before deleting them. For more information, see About volume snapshots in OpenShift Container Platform storage overview.

Stop the adopted OVN database servers:
ServicesToStop=("tripleo_ovn_cluster_north_db_server.service" "tripleo_ovn_cluster_south_db_server.service") echo "Stopping systemd OpenStack services" for service in ${ServicesToStop[*]}; do for i in {1..3}; do SSH_CMD=CONTROLLER${i}_SSH if [ ! -z "${!SSH_CMD}" ]; then echo "Stopping the $service in controller $i" if ${!SSH_CMD} sudo systemctl is-active $service; then ${!SSH_CMD} sudo systemctl stop $service fi fi done done echo "Checking systemd OpenStack services" for service in ${ServicesToStop[*]}; do for i in {1..3}; do SSH_CMD=CONTROLLER${i}_SSH if [ ! -z "${!SSH_CMD}" ]; then if ! ${!SSH_CMD} systemctl show $service | grep ActiveState=inactive >/dev/null; then echo "ERROR: Service $service still running on controller $i" else echo "OK: Service $service is not running on controller $i" fi fi done done
Chapter 4. Adopting Red Hat OpenStack Platform control plane services
Adopt your Red Hat OpenStack Platform 17.1 control plane services to deploy them in the Red Hat OpenStack Services on OpenShift (RHOSO) 18.0 control plane.
4.1. Adopting the Identity service
To adopt the Identity service (keystone), you patch an existing OpenStackControlPlane custom resource (CR) where the Identity service is disabled. The patch starts the service with the configuration parameters that are provided by the Red Hat OpenStack Platform (RHOSP) environment.
Prerequisites
Create the keystone secret that includes the Fernet keys that were copied from the RHOSP environment:
$ oc apply -f - <<EOF apiVersion: v1 data: CredentialKeys0: $($CONTROLLER1_SSH sudo cat /var/lib/config-data/puppet-generated/keystone/etc/keystone/credential-keys/0 | base64 -w 0) CredentialKeys1: $($CONTROLLER1_SSH sudo cat /var/lib/config-data/puppet-generated/keystone/etc/keystone/credential-keys/1 | base64 -w 0) FernetKeys0: $($CONTROLLER1_SSH sudo cat /var/lib/config-data/puppet-generated/keystone/etc/keystone/fernet-keys/0 | base64 -w 0) FernetKeys1: $($CONTROLLER1_SSH sudo cat /var/lib/config-data/puppet-generated/keystone/etc/keystone/fernet-keys/1 | base64 -w 0) kind: Secret metadata: name: keystone type: Opaque EOF
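Optionally, confirm that the copied keys match the source environment by comparing checksums; both commands should print the same hash for a given key index:

$ oc get secret keystone -o jsonpath='{.data.FernetKeys0}' | base64 -d | sha256sum
$ $CONTROLLER1_SSH sudo sha256sum /var/lib/config-data/puppet-generated/keystone/etc/keystone/fernet-keys/0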
Procedure
Patch the
OpenStackControlPlaneCR to deploy the Identity service:$ oc patch openstackcontrolplane openstack --type=merge --patch ' spec: keystone: enabled: true apiOverride: route: {} template: override: service: internal: metadata: annotations: metallb.universe.tf/address-pool: internalapi metallb.universe.tf/allow-shared-ip: internalapi metallb.universe.tf/loadBalancerIPs: <172.17.0.80> spec: type: LoadBalancer databaseInstance: openstack secret: osp-secret 'where:
- <172.17.0.80>
-
Specifies the load balancer IP in your environment. If you use IPv6, change the load balancer IP to the load balancer IP in your environment, for example,
metallb.universe.tf/loadBalancerIPs: fd00:bbbb::80.
Create an alias to use the
openstack command in the Red Hat OpenStack Services on OpenShift (RHOSO) deployment:

$ alias openstack="oc exec -t openstackclient -- openstack"

Remove services and endpoints that still point to the RHOSP control plane, excluding the Identity service and its endpoints:

$ openstack endpoint list | grep keystone | awk '/admin/{ print $2; }' | xargs ${BASH_ALIASES[openstack]} endpoint delete || true
$ for service in aodh heat heat-cfn barbican cinderv3 glance gnocchi manila manilav2 neutron nova placement swift ironic-inspector ironic octavia; do
>   openstack service list | awk "/ $service /{ print \$2; }" | xargs -r ${BASH_ALIASES[openstack]} service delete || true
> done
Verification
-
Verify that you can access the
OpenStackClient pod. For more information, see Accessing the OpenStackClient pod in Maintaining the Red Hat OpenStack Services on OpenShift deployment.
$ openstack endpoint list | grep keystone

Wait for the OpenStackControlPlane resource to become Ready:

$ oc wait --for=condition=Ready --timeout=1m OpenStackControlPlane openstack
4.2. Configuring LDAP with domain-specific drivers
The content in this section is available in this release as a Technology Preview, and therefore is not fully supported by Red Hat. It should only be used for testing, and should not be deployed in a production environment. For more information, see Technology Preview.
If you need to integrate the Identity service (keystone) with one or more LDAP servers using domain-specific configurations, you can enable domain-specific drivers and provide the necessary LDAP settings.
This involves two main steps:
- Create the secret that holds the domain-specific LDAP configuration files that the Identity service uses. Each file within the secret corresponds to an LDAP domain.
-
Patch the
OpenStackControlPlanecustom resource (CR) to enable domain-specific drivers for the Identity service and mount a secret that contains the LDAP configurations.
Procedure
To create the
keystone-domains secret that stores the actual LDAP configuration files that the Identity service uses, create a local file that includes your LDAP configuration, for example, keystone.myldapdomain.conf:

The following example file includes the configuration for a single LDAP domain. If you have multiple LDAP domains, create a configuration file for each, for example, keystone.DOMAIN_ONE.conf and keystone.DOMAIN_TWO.conf.
keystone.DOMAIN_ONE.conf,keystone.DOMAIN_TWO.conf.[identity] driver = ldap [ldap] url = ldap://<ldap_server_host>:<ldap_server_port> user = <bind_dn_user> password = <bind_dn_password> suffix = <user_tree_dn> query_scope = sub # User configuration user_tree_dn = <user_tree_dn> user_objectclass = <user_object_class> user_id_attribute = <user_id_attribute> user_name_attribute = <user_name_attribute> user_mail_attribute = <user_mail_attribute> user_enabled_attribute = <user_enabled_attribute> user_enabled_default = true # Group configuration group_tree_dn = <group_tree_dn> group_objectclass = <group_object_class> group_id_attribute = <group_id_attribute> group_name_attribute = <group_name_attribute> group_member_attribute = <group_member_attribute> group_members_are_ids = true-
Replace the values, such as
<ldap_server_host>, <bind_dn_user>, <user_tree_dn>, and so on, with your LDAP server details.
Create the secret from this file:
$ oc create secret generic keystone-domains --from-file=<keystone.DOMAIN_NAME.conf>Replace
<keystone.DOMAIN_NAME.conf> with the name of your local configuration file. If applicable, include additional configuration files by using the --from-file option. After creating the secret, you can remove the local configuration file if it is no longer needed, or store it securely.

Important

The name of the file that you provide to
--from-file, for example keystone.DOMAIN_NAME.conf, is critical. The Identity service uses this filename to map incoming authentication requests for a domain to the correct LDAP configuration. Ensure that DOMAIN_NAME matches the name of the domain that you are configuring in the Identity service.
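The Identity service applies a domain-specific configuration file to the domain of the same name, so that domain must also exist in the Identity service. If it does not exist yet, you can create it, for example:

$ oc exec -t openstackclient -- openstack domain create <DOMAIN_NAME>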
Patch the
OpenStackControlPlaneCR:$ oc patch openstackcontrolplane <cr_name> --type=merge -p ' spec: keystone: template: customServiceConfig: | [identity] domain_specific_drivers_enabled = true extraMounts: - name: v1 region: r1 extraVol: - propagation: - Keystone extraVolType: Conf volumes: - name: keystone-domains secret: secretName: keystone-domains mounts: - name: keystone-domains mountPath: "/etc/keystone/domains" readOnly: true-
Replace
<cr_name> with the name of your OpenStackControlPlane CR (for example, openstack). This patch does the following:
-
Sets
spec.keystone.template.customServiceConfig. Ensure that you do not overwrite any previously defined value. Defines
spec.keystone.template.extraMounts to mount a secret named keystone-domains into the Identity service pods at /etc/keystone/domains. This secret contains your LDAP configuration files.

Note

You might need to wait a few minutes for the changes to propagate and for the Identity service pods to be updated.
Verification
Verify that users from the LDAP domain are accessible:
$ oc exec -t openstackclient -- openstack user list --domain <domain_name>

Replace <domain_name> with your LDAP domain name.

This command returns a list of users from your LDAP server.
Verify that groups from the LDAP domain are accessible:
$ oc exec -t openstackclient -- openstack group list --domain <domain_name>

This command returns a list of groups from your LDAP server.
Test authentication with an LDAP user:
$ oc exec -t openstackclient -- openstack --os-auth-url <keystone_auth_url> --os-identity-api-version 3 --os-user-domain-name <domain_name> --os-username <ldap_username> --os-password <ldap_password> token issue-
Replace
<keystone_auth_url> with the Identity service authentication URL. Replace <ldap_username> and <ldap_password> with valid LDAP user credentials.

If successful, this command returns a token, confirming that LDAP authentication is working correctly.
Verify group membership for an LDAP user:
$ oc exec -t openstackclient -- openstack group contains user --group-domain <domain_name> --user-domain <domain_name> <group_name> <username>

Replace <domain_name>, <group_name>, and <username> with the appropriate values from your LDAP server.

This command verifies that the user is properly associated with the group through LDAP.
4.3. Adopting the Key Manager service
To adopt the Key Manager service (barbican), you patch an existing OpenStackControlPlane custom resource (CR) where Key Manager service is disabled. The patch starts the service with the configuration parameters that are provided by the Red Hat OpenStack Platform (RHOSP) environment. You configure the Key Manager service to use the simple_crypto back end.
The Key Manager service adoption is complete if you see the following results:
- The BarbicanAPI, BarbicanWorker, and BarbicanKeystoneListener services are up and running.
- Keystone endpoints are updated, and the same crypto plugin of the source cloud is available.
To configure hardware security module (HSM) integration with Proteccio HSM, see Adopting the Key Manager service with Proteccio HSM integration.
Procedure
Add the KEK secret:

$ oc set data secret/osp-secret "BarbicanSimpleCryptoKEK=$($CONTROLLER1_SSH "python3 -c \"import configparser; c = configparser.ConfigParser(); c.read('/var/lib/config-data/puppet-generated/barbican/etc/barbican/barbican.conf'); print(c['simple_crypto_plugin']['kek'])\"")"

Patch the OpenStackControlPlane CR to deploy the Key Manager service:

$ oc patch openstackcontrolplane openstack --type=merge --patch ' spec: barbican: enabled: true apiOverride: route: {} template: databaseInstance: openstack databaseAccount: barbican messagingBus: cluster: rabbitmq secret: osp-secret simpleCryptoBackendSecret: osp-secret serviceAccount: barbican serviceUser: barbican passwordSelectors: service: BarbicanPassword simplecryptokek: BarbicanSimpleCryptoKEK barbicanAPI: replicas: 1 override: service: internal: metadata: annotations: metallb.universe.tf/address-pool: internalapi metallb.universe.tf/allow-shared-ip: internalapi metallb.universe.tf/loadBalancerIPs: <172.17.0.80> spec: type: LoadBalancer barbicanWorker: replicas: 1 barbicanKeystoneListener: replicas: 1 '

where:

- <172.17.0.80>
- Specifies the load balancer IP in your environment. If you use IPv6, change the load balancer IP to the load balancer IP in your environment, for example, metallb.universe.tf/loadBalancerIPs: fd00:bbbb::80.
- messagingBus.Cluster
- For more information about RHOSO RabbitMQ clusters, see RHOSO RabbitMQ clusters in Monitoring high availability services.
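Optionally, before you continue, confirm that the BarbicanSimpleCryptoKEK entry was added to the osp-secret secret. This is a minimal check that only confirms the key exists and is non-empty:

$ oc get secret osp-secret -o jsonpath='{.data.BarbicanSimpleCryptoKEK}' | base64 -d | wc -c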
Verification
Ensure that the Identity service (keystone) endpoints are defined and are pointing to the control plane FQDNs:
$ openstack endpoint list | grep key-manager

Ensure that the Barbican API service is registered in the Identity service:

$ openstack service list | grep key-manager
$ openstack endpoint list | grep key-manager

List the secrets:
$ openstack secret list
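Optionally, store and retrieve a test secret to confirm that the simple_crypto back end is functional after adoption. This is a sketch; the secret name is arbitrary, and you replace <secret_href> with the href that the store command returns:

$ openstack secret store --name adoption-smoke-test --payload 'test payload'
$ openstack secret get <secret_href> --payload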
4.4. Adopting the Key Manager service with HSM integration
This content in this section is available in this release as a Technology Preview, and therefore is not fully supported by Red Hat. It should only be used for testing, and should not be deployed in a production environment.
Adopt the Key Manager service (barbican) from director to Red Hat OpenStack Services on OpenShift (RHOSO) when your source environment includes hardware security module (HSM) integration, to preserve HSM functionality and maintain access to HSM-backed secrets. HSM provides enhanced security for cryptographic operations by storing encryption keys in dedicated hardware devices.
For additional information about the Key Manager service before you start the adoption, see the following resources:
- Key Manager service configuration documentation
- Hardware security module vendor-specific documentation
- OpenStack Barbican PKCS#11 plugin documentation
4.4.1. Key Manager service HSM adoption approaches
This content in this section is available in this release as a Technology Preview, and therefore is not fully supported by Red Hat. It should only be used for testing, and should not be deployed in a production environment. For more information, see Technology Preview.
The Key Manager service (barbican) adoption approach depends on your source director environment configuration.
- Use the standard adoption approach if your environment includes only the simple_crypto plugin for secret storage and has no HSM integration.
- Use the HSM-enabled adoption approach if your source environment has HSM integration that uses Public Key Cryptography Standard (PKCS) #11, Key Management Interoperability Protocol (KMIP), or other HSM back ends alongside simple_crypto.

- Standard adoption approach
- Uses the existing Key Manager service adoption procedure
- Migrates a simple crypto back-end configuration
- Provides a single-step adoption process
Is suitable for development, testing, and standard production environments
- HSM-enabled adoption approach
- Uses the enhanced barbican_adoption role with HSM awareness
- Configures HSM integration through a simple boolean flag (barbican_hsm_enabled: true)
- Automatically creates required Kubernetes secrets (hsm-login and proteccio-data)
- Preserves HSM metadata during database migration
- Supports both simple crypto and HSM back ends in the target environment
- Requires HSM-specific configuration variables and custom container images with HSM client libraries (built using the rhoso_proteccio_hsm Ansible role)
- Uses HSM client certificates and configuration files accessible via URLs
- Requires proper HSM partition and key configuration that matches your source environment
The HSM-enabled adoption approach currently supports:
- Proteccio (Eviden Trustway): Fully supported with PKCS#11 integration
- Luna (Thales): PKCS#11 support available
- nCipher (Entrust): PKCS#11 support available
HSM adoption requires additional configuration steps, including:
- Custom Barbican container images with HSM client libraries that are built using the rhoso_proteccio_hsm Ansible role
- HSM client certificates and configuration files that are accessible by using URLs
- Proper HSM partition and key configuration that matches your source environment
These approaches are mutually exclusive. Choose an approach based on your source environment configuration.
| Source environment characteristic | Approach | Rationale |
|---|---|---|
| Only the simple_crypto plugin | Standard adoption | No HSM complexity needed |
| HSM integration present (PKCS#11, KMIP, and so on) | HSM-enabled adoption | Preserves HSM functionality and secrets |
| Development or testing environment | Standard adoption | Simpler setup and maintenance |
| Production with compliance requirements | HSM-enabled adoption | Maintains security compliance |
| Unknown back-end configuration | Check source environment first | Determine appropriate approach |
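If you are unsure which back ends the source environment uses, you can inspect the plugin sections in the source barbican.conf before you choose an approach. This is a sketch that assumes SSH access to a Controller node, as used elsewhere in this guide:

$ ssh tripleo-admin@controller-0.ctlplane \
    "sudo grep -E '^\[.*plugin\]' /var/lib/config-data/puppet-generated/barbican/etc/barbican/barbican.conf"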
4.4.2. Adopting the Key Manager service with Proteccio HSM integration
This content in this section is available in this release as a Technology Preview, and therefore is not fully supported by Red Hat. It should only be used for testing, and should not be deployed in a production environment. For more information, see Technology Preview.
To adopt the Key Manager service (barbican) with Proteccio hardware security module (HSM) integration, you use the enhanced Barbican adoption role with HSM support enabled through a configuration flag. This approach preserves HSM integration while adopting all existing secrets from your source Red Hat OpenStack Platform (RHOSP) environment. When you run the data plane adoption tests with HSM support enabled, the adoption process performs the following actions:
- Extracts the simple crypto KEK from the source configuration.
- Creates the required HSM secrets (hsm-login and proteccio-data) in the target namespace.
- Deploys Barbican with HSM-enabled configuration by using the PKCS#11 plugin.
- Verifies the HSM functionality and secret migration.
The Key Manager service Proteccio HSM adoption is complete if you see the following results:
- The BarbicanAPI and BarbicanWorker services are up and running with HSM-enabled configuration.
- All secrets from the source RHOSP 17.1 environment are available in Red Hat OpenStack Services on OpenShift (RHOSO) 18.0.
- The PKCS#11 crypto plugin is available alongside simple_crypto for new secret storage.
- HSM functionality is verified and operational.
If your environment does not include Proteccio HSM, adopt the Key Manager service by using simple_crypto instead. For more information, see Adopting the Key Manager service.
The enhanced Key Manager service adoption role supports HSM configuration through a simple boolean flag. This approach integrates seamlessly with the standard data plane adoption framework while providing HSM support.
Prerequisites
- You have a running director environment with Proteccio HSM integration (the source cloud).
- You have a Single Node OpenShift or OpenShift Local running in the Red Hat OpenShift Container Platform (RHOCP) cluster.
- You have SSH access to the source director undercloud and Controller nodes.
- You have configured HSM variables in your adoption configuration files.
- Custom Key Manager service container images with the Proteccio client libraries are available in your registry.
The HSM adoption process requires proper configuration of HSM-related variables. The adoption role automatically creates the required Kubernetes secrets (hsm-login and proteccio-data) when barbican_hsm_enabled is set to true. Ensure that your environment includes the following:
- All HSM-related variables are properly set in your configuration files
- The Proteccio client ISO, certificates, and configuration files are accessible from the configured URLs
- Custom Key Manager service images with Proteccio client are built and available in your container registry
Without proper HSM configuration, your HSM-protected secrets become inaccessible after adoption.
Procedure
Configure HSM integration variables in your adoption configuration (Zuul job vars or CI framework configuration):
# Enable HSM integration for the Barbican adoption role
barbican_hsm_enabled: true

# HSM login credentials
proteccio_login_password: "your_hsm_password"

# Kubernetes secret names (defaults shown)
proteccio_login_secret_name: "hsm-login"
proteccio_client_data_secret_name: "proteccio-data"

# HSM partition and key configuration
cifmw_hsm_proteccio_partition: "VHSM1"
cifmw_hsm_mkek_label: "adoption_mkek_1"
cifmw_hsm_hmac_label: "adoption_hmac_1"
cifmw_hsm_proteccio_library_path: "/usr/lib64/libnethsm.so"
cifmw_hsm_key_wrap_mechanism: "CKM_AES_CBC_PAD"

# HSM client sources (URLs to download Proteccio client files)
cifmw_hsm_proteccio_client_src: "<URL_of_Proteccio_ISO_file>"
cifmw_hsm_proteccio_conf_src: "<URL_of_proteccio.rc_config_file>"
cifmw_hsm_proteccio_client_crt_src: "<URL_of_client_certificate_file>"
cifmw_hsm_proteccio_client_key_src: "<URL_of_client_certificate_key>"
cifmw_hsm_proteccio_server_crt_src:
  - "<URL_of_HSM_certificate_file>"

where:
- <URL_of_Proteccio_ISO_file>
- Specifies the full URL (including "http://" or "https://") of the Proteccio client ISO image file.
- <URL_of_proteccio.rc_config_file>
- Specifies the full URL (including "http://" or "https://") of the proteccio.rc configuration file in your RHOSO environment.
- <URL_of_client_certificate_file>
- Specifies the full URL (including "http://" or "https://") of the HSM client certificate file.
- <URL_of_client_certificate_key>
- Specifies the full URL (including "http://" or "https://") of the client key file.
- <URL_of_HSM_certificate_file>
- Specifies the full URL (including "http://" or "https://") of the HSM certificate file.
- Run the data plane adoption tests with HSM support enabled.
Verification
Ensure that the Identity service (keystone) endpoints are defined and are pointing to the control plane FQDNs:
$ openstack endpoint list | grep key-manager

Ensure that the Barbican API service is registered in the Identity service:

$ openstack service list | grep key-manager

Verify that all secrets from the source environment are available:

$ openstack secret list

Confirm that Barbican services are running:

$ oc get pods -n openstack -l service=barbican -o wide

Test secret creation to verify HSM functionality:

$ openstack secret store --name adoption-verification --payload 'HSM adoption successful'

Verify that the HSM back end is operational:

$ openstack secret get <secret_id> --payload

where:
- <secret_id>
- Specifies the ID of the HSM secret.
4.4.3. Adopting the Key Manager service with HSM integration
This content in this section is available in this release as a Technology Preview, and therefore is not fully supported by Red Hat. It should only be used for testing, and should not be deployed in a production environment. For more information, see Technology Preview.
When your source director environment includes hardware security module (HSM) integration, you must use the HSM-enabled adoption approach to preserve HSM functionality and maintain access to HSM-backed secrets.
Prerequisites
- The source director environment with HSM integration is configured.
- HSM client software and certificates are available from accessible URLs.
- The target Red Hat OpenStack Services on OpenShift (RHOSO) environment with HSM infrastructure is accessible.
- HSM-enabled Key Manager service (barbican) container images are built and available in your registry.
If you use the automated adoption process by setting barbican_hsm_enabled: true, the required HSM secrets (hsm-login and proteccio-data) are created automatically. You need to create these secrets manually only when you perform the manual adoption steps.
Procedure
Confirm that your source environment configuration includes HSM integration:
$ ssh tripleo-admin@controller-0.ctlplane \
    "sudo cat /var/lib/config-data/puppet-generated/barbican/etc/barbican/barbican.conf | grep -A5 '\[.*plugin\]'"

If you see [p11_crypto_plugin] or other HSM-specific sections, continue with the HSM adoption.

Extract the simple crypto key encryption key (KEK) from your source environment:

$ SIMPLE_CRYPTO_KEK=$(ssh tripleo-admin@controller-0.ctlplane \
    "sudo python3 -c \"import configparser; c = configparser.ConfigParser(); c.read('/var/lib/config-data/puppet-generated/barbican/etc/barbican/barbican.conf'); print(c['simple_crypto_plugin']['kek'])\"")

Add the KEK to the target environment:

$ oc set data secret/osp-secret "BarbicanSimpleCryptoKEK=${SIMPLE_CRYPTO_KEK}"

If you are not using the automated adoption, create HSM-specific secrets in the target environment:

# Create HSM login credentials secret
$ oc create secret generic hsm-login \
    --from-literal=PKCS11Pin=<your_hsm_password> \
    -n openstack

# Create HSM client configuration and certificates secret
$ oc create secret generic proteccio-data \
    --from-file=client.crt=<path_to_client_cert> \
    --from-file=client.key=<path_to_client_key> \
    --from-file=10_8_60_93.CRT=<path_to_server_cert> \
    --from-file=proteccio.rc=<path_to_hsm_config> \
    -n openstack

where:
- <your_hsm_password>
- Specifies the HSM password for your RHOSO environment.
- <path_to_client_cert>
- Specifies the path to the HSM client certificate.
- <path_to_client_key>
- Specifies the path to the client key.
- <path_to_server_cert>
- Specifies the path to the server certificate.
- <path_to_hsm_config>
Specifies the path to your HSM configuration in your RHOSO environment.
Note: When you use the automated adoption by setting barbican_hsm_enabled: true, the barbican_adoption role creates these secrets automatically. The secret names default to hsm-login and proteccio-data.
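Before you patch the control plane, you can optionally confirm that both secrets exist and contain the expected keys. A minimal sketch:

$ oc get secret hsm-login proteccio-data -n openstack
$ oc get secret proteccio-data -n openstack -o json | jq -r '.data | keys[]'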
Patch the OpenStackControlPlane custom resource to deploy the Key Manager service with HSM support:

$ oc patch openstackcontrolplane openstack --type=merge --patch ' spec: barbican: enabled: true apiOverride: route: {} template: databaseInstance: openstack databaseAccount: barbican rabbitMqClusterName: rabbitmq secret: osp-secret simpleCryptoBackendSecret: osp-secret serviceAccount: barbican serviceUser: barbican passwordSelectors: database: BarbicanDatabasePassword service: BarbicanPassword simplecryptokek: BarbicanSimpleCryptoKEK customServiceConfig: | [p11_crypto_plugin] plugin_name = PKCS11 library_path = /usr/lib64/libnethsm.so token_labels = VHSM1 mkek_label = adoption_mkek_1 hmac_label = adoption_hmac_1 encryption_mechanism = CKM_AES_CBC hmac_key_type = CKK_GENERIC_SECRET hmac_keygen_mechanism = CKM_GENERIC_SECRET_KEY_GEN hmac_mechanism = CKM_SHA256_HMAC key_wrap_mechanism = CKM_AES_CBC_PAD key_wrap_generate_iv = true always_set_cka_sensitive = true os_locking_ok = false globalDefaultSecretStore: pkcs11 enabledSecretStores: ["simple_crypto", "pkcs11"] pkcs11: loginSecret: hsm-login clientDataSecret: proteccio-data clientDataPath: /etc/proteccio barbicanAPI: replicas: 1 override: service: internal: metadata: annotations: metallb.universe.tf/address-pool: internalapi metallb.universe.tf/allow-shared-ip: internalapi metallb.universe.tf/loadBalancerIPs: 172.17.0.80 spec: type: LoadBalancer barbicanWorker: replicas: 1 barbicanKeystoneListener: replicas: 1 '

- library_path specifies the path to the PKCS#11 library, for example, /usr/lib64/libnethsm.so for Proteccio.
- token_labels specifies the HSM partition name, for example, VHSM1.
- mkek_label and hmac_label specify key labels that are configured in the HSM.
- loginSecret specifies the name of the Kubernetes secret that contains the HSM PIN.
- clientDataSecret specifies the name of the Kubernetes secret that contains the HSM certificates and configuration.
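After you apply the patch, the Key Manager service pods are redeployed with the HSM configuration. A minimal sketch for waiting until the pods are ready, using the same service=barbican label as the verification steps:

$ oc wait --for=condition=Ready pod -l service=barbican -n openstack --timeout=300s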
Verification
Verify that both secret stores are available:
$ openstack secret store list

Test the HSM back-end functionality:

$ openstack secret store --name "hsm-test-$(date +%s)" \
    --payload "test-payload" \
    --algorithm aes --mode cbc --bit-length 256

Verify that the migrated secrets are accessible:

$ openstack secret list

Check that the Key Manager service pods are operational:

$ oc get pods -l service=barbican
NAME                                          READY   STATUS    RESTARTS      AGE
barbican-api-5d65949b4-xhkd7                  2/2     Running   7 (10m ago)   29d
barbican-keystone-listener-687cbdc77d-4kjnk   2/2     Running   3 (11m ago)   29d
barbican-worker-5c4b947d5c-l9jdh              2/2     Running   3 (11m ago)   29d
HSM adoption preserves both simple crypto and HSM-backed secrets. The migration process maintains HSM metadata and secret references, ensuring continued access to existing secrets while enabling new secrets to use either back-end.
4.4.4. Troubleshooting Key Manager HSM adoption
This content in this section is available in this release as a Technology Preview, and therefore is not fully supported by Red Hat. It should only be used for testing, and should not be deployed in a production environment. For more information, see Technology Preview.
Review troubleshooting guidance for common issues that you might encounter while you perform the HSM-enabled Key Manager (Barbican) service adoption.
If issues persist after following the troubleshooting guide:
- Collect adoption logs and configuration for analysis.
- Check the HSM vendor documentation for vendor-specific troubleshooting.
- Verify HSM server status and connectivity independently.
- Review the adoption summary report for additional diagnostic information.
4.4.4.1. Resolving configuration validation failures
This content in this section is available in this release as a Technology Preview, and therefore is not fully supported by Red Hat. It should only be used for testing, and should not be deployed in a production environment. For more information, see Technology Preview.
If the adoption fails with validation errors about placeholder values, replace the placeholder values with your environment’s configuration values.
Example error:
TASK [Validate all required variables are set] ****
fatal: [localhost]: FAILED! => {
"msg": "Required variable proteccio_certs_path contains placeholder value."
}
Procedure
- Edit your hardware security module configuration in the Zuul job vars or CI framework configuration file.
Check the following key variables and replace all placeholder values with actual configuration values for your environment:
cifmw_hsm_password: <your_actual_hsm_password>
cifmw_barbican_proteccio_partition: <VHSM1>
cifmw_barbican_proteccio_mkek_label: <your_mkek_label>
cifmw_barbican_proteccio_hmac_label: <your_hmac_label>
cifmw_hsm_proteccio_client_src: <https://your-server/path/to/Proteccio.iso>
cifmw_hsm_proteccio_conf_src: <https://your-server/path/to/proteccio.rc>

- Verify that no placeholder values remain in your configuration.
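One quick, optional way to perform that check is to search the configuration file for leftover angle-bracket placeholders. This is a sketch; the file name is an assumption, so point it at your actual configuration file:

$ grep -n '<.*>' <your_adoption_vars_file>.yml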
4.4.4.2. Resolving missing HSM file prerequisites
This content in this section is available in this release as a Technology Preview, and therefore is not fully supported by Red Hat. It should only be used for testing, and should not be deployed in a production environment. For more information, see Technology Preview.
If the adoption fails because hardware security module (HSM) certificates or client software cannot be found, update your configuration to point to the files in their specific locations.
Example error:
TASK [Validate Proteccio prerequisites exist] ****
fatal: [localhost]: FAILED! => {
"msg": "Proteccio client ISO not found: /opt/proteccio/Proteccio3.06.05.iso"
}
Procedure
Verify that all required HSM files are accessible from the configured URLs. For example:
$ curl -I https://your-server/path/to/Proteccio3.06.05.iso
$ curl -I https://your-server/path/to/proteccio.rc
$ curl -I https://your-server/path/to/client.crt
$ curl -I https://your-server/path/to/client.key

If the files are in different locations, update the URL variables in your configuration. For example:

cifmw_hsm_proteccio_client_src: "https://correct-server/path/to/Proteccio3.06.05.iso"
cifmw_hsm_proteccio_conf_src: "https://correct-server/path/to/proteccio.rc"
cifmw_hsm_proteccio_client_crt_src: "https://correct-server/path/to/client.crt"
cifmw_hsm_proteccio_client_key_src: "https://correct-server/path/to/client.key"

- Check the network connectivity and authentication to ensure that the URLs are accessible from the CI environment.
4.4.4.3. Resolving source environment connectivity issues
This content in this section is available in this release as a Technology Preview, and therefore is not fully supported by Red Hat. It should only be used for testing, and should not be deployed in a production environment. For more information, see Technology Preview.
If the adoption cannot connect to the source Red Hat OpenStack Platform environment to extract the configuration, check your SSH connectivity to the source Controller node and update the configuration if needed.
Example error:
TASK [detect source environment HSM configuration] ****
fatal: [localhost]: FAILED! => {
"msg": "SSH connection to source environment failed"
}
Procedure
Verify SSH connectivity to the source Controller node:
$ ssh -o StrictHostKeyChecking=no tripleo-admin@controller-0.ctlplane

Update the controller1_ssh variable if needed:

controller1_ssh: "ssh -o StrictHostKeyChecking=no tripleo-admin@<controller_ip>"

where:

- <controller_ip>
- Specifies the IP address of your Controller node.
- Ensure that the SSH keys are properly configured for passwordless access.
4.4.4.4. Resolving HSM secret creation failures
This content in this section is available in this release as a Technology Preview, and therefore is not fully supported by Red Hat. It should only be used for testing, and should not be deployed in a production environment. For more information, see Technology Preview.
If hardware security module (HSM) secrets cannot be created in the target environment, check whether you need to update the names of your secrets in your source configuration file.
Example error:
TASK [Create HSM secrets in target environment] ****
fatal: [localhost]: FAILED! => {
"msg": "Failed to create secret proteccio-data"
}
Procedure
Verify target environment access:
$ export KUBECONFIG=/path/to/.kube/config
$ oc get secrets -n openstack

Check if secrets already exist:

$ oc get secret proteccio-data hsm-login -n openstack

If secrets exist with different names, update the configuration variables:

proteccio_login_secret_name: "your-hsm-login-secret"
proteccio_client_data_secret_name: "your-proteccio-data-secret"
4.4.4.5. Resolving custom image registry issues
This content in this section is available in this release as a Technology Preview, and therefore is not fully supported by Red Hat. It should only be used for testing, and should not be deployed in a production environment. For more information, see Technology Preview.
If custom Barbican images cannot be pushed to or pulled from the configured registry, you can verify the authentication, test image push permissions, and then update the configuration as needed.
Example error:
TASK [Create Proteccio-enabled Barbican images] ****
fatal: [localhost]: FAILED! => {
"msg": "Failed to push image to registry"
}
Procedure
Verify registry authentication:
$ podman login <registry_url>where:
<registry_url>- Specifies the URL of your configured registry.
Test image push permissions:
$ podman tag hello-world <registry>/<namespace>/test:latest
$ podman push <registry>/<namespace>/test:latest

where:

- <registry>
- Specifies the name of your registry server.
- <namespace>
- Specifies the namespace of your container image.
Update registry configuration variables if needed:
cifmw_update_containers_registry: "your-registry:5001"
cifmw_update_containers_org: "your-namespace"
cifmw_image_registry_verify_tls: false
4.4.4.6. Resolving HSM back-end detection failures
If the adoption role cannot detect hardware security module (HSM) configuration in the source environment, you must force the HSM adoption.
Example error:
TASK [detect source environment HSM configuration] ****
ok: [localhost] => {
"msg": "No HSM configuration found - using standard adoption"
}
Procedure
Manually verify that the HSM configuration exists in the source environment:
$ ssh tripleo-admin@controller-0.ctlplane \
    "sudo grep -A 10 '\[p11_crypto_plugin\]' \
    /var/lib/config-data/puppet-generated/barbican/etc/barbican/barbican.conf"

If HSM is configured but not detected, force HSM adoption by setting the barbican_hsm_enabled variable:

# In your Zuul job vars or CI framework configuration
barbican_hsm_enabled: true

This configuration ensures that the barbican_adoption role uses the HSM-enabled patch for Key Manager service (barbican) deployment.
4.4.4.7. Resolving database migration issues
This content in this section is available in this release as a Technology Preview, and therefore is not fully supported by Red Hat. It should only be used for testing, and should not be deployed in a production environment. For more information, see Technology Preview.
If hardware security module (HSM) metadata is not preserved during database migration, check the database logs for any errors and verify that the source database includes the HSM secrets.
Example error:
TASK [Verify database migration preserves HSM references] ****
ok: [localhost] => {
"msg": "HSM secrets found in migrated database: 0"
}
Procedure
Verify that the source database contains the HSM secrets:
$ ssh tripleo-admin@controller-0.ctlplane \
    "sudo mysql barbican -e 'SELECT COUNT(*) FROM secret_store_metadata WHERE key=\"plugin_name\" AND value=\"PKCS11\";'"

Check the database migration logs for errors:

$ oc logs deployment/barbican-api | grep -i migration

- If the migration failed, restore the database from backup and retry.
4.4.4.8. Resolving service startup failures
This content in this section is available in this release as a Technology Preview, and therefore is not fully supported by Red Hat. It should only be used for testing, and should not be deployed in a production environment. For more information, see Technology Preview.
If the Key Manager service (barbican) services fail to start after the hardware security module (HSM) configuration is applied, check the configuration in the pod.
Example error:
$ oc get pods -l service=barbican
NAME READY STATUS RESTARTS AGE
barbican-api-xyz 0/1 Error 0 2m
Procedure
Check pod logs for HSM connectivity issues:
$ oc logs barbican-api-xyz

Verify that the HSM library is accessible:

$ oc exec barbican-api-xyz -- ls -la /usr/lib64/libnethsm.so

Check the HSM configuration in the pod:
$ oc exec barbican-api-xyz -- cat /etc/proteccio/proteccio.rc
4.4.4.9. Resolving performance and connectivity issues
This content in this section is available in this release as a Technology Preview, and therefore is not fully supported by Red Hat. It should only be used for testing, and should not be deployed in a production environment. For more information, see Technology Preview.
If the hardware security module (HSM) operations are slow or fail intermittently, check the HSM connectivity and monitor the HSM server logs.
Procedure
Test HSM connectivity from Key Manager service (barbican) pods:
$ oc exec barbican-api-xyz -- pkcs11-tool --module /usr/lib64/libnethsm.so --list-slots

Check HSM server connectivity:

$ oc exec barbican-api-xyz -- nc -zv <hsm_server_ip> <hsm_port>

where:

- <hsm_server_ip>
- Specifies the IP address of the HSM server.
- <hsm_port>
- Specifies the port of your HSM server.
- Monitor HSM server logs for authentication or capacity issues.
4.4.5. Troubleshooting Key Manager service Proteccio HSM adoption
This content in this section is available in this release as a Technology Preview, and therefore is not fully supported by Red Hat. It should only be used for testing, and should not be deployed in a production environment. For more information, see Technology Preview.
Use this reference to troubleshoot common issues that might occur during Key Manager service (barbican) adoption with Proteccio HSM integration. If Proteccio HSM issues persist, consult the Eviden Trustway documentation and ensure that HSM server configuration matches the client settings.
4.4.5.1. Resolving prerequisite validation failures
This content in this section is available in this release as a Technology Preview, and therefore is not fully supported by Red Hat. It should only be used for testing, and should not be deployed in a production environment. For more information, see Technology Preview.
If the adoption script fails during the prerequisites check, verify that your configuration includes all the required Proteccio files and that the HSM Ansible role is available.
Example error:
ERROR: Required file proteccio_files/YOUR_CERT_FILE not found
ERROR: Cannot connect to OpenShift cluster
ERROR: Proteccio HSM Ansible role not found
Procedure
Verify that all required Proteccio files are present:
$ ls -la /path/to/your/proteccio_files/

Ensure that your configured certificate files, private key, HSM certificate file, and configuration file exist as specified in your proteccio_required_files configuration.

Test OpenShift cluster connectivity:

$ oc cluster-info
$ oc get pods -n openstack

Verify that the HSM Ansible role is available:
$ ls -la /path/to/your/roles/ansible-role-rhoso-proteccio-hsm/
4.4.5.2. Resolving SSH connection failures to the source environment
This content in this section is available in this release as a Technology Preview, and therefore is not fully supported by Red Hat. It should only be used for testing, and should not be deployed in a production environment. For more information, see Technology Preview.
If you cannot connect to the source director environment, verify your SSH key access and test the SSH commands that the adoption uses.
Example error:
Warning: Permanently added 'YOUR_UNDERCLOUD_HOST' (ED25519) to the list of known hosts.
Permission denied (publickey).
Procedure
Verify SSH key access to the undercloud:
$ ssh YOUR_UNDERCLOUD_HOST echo "Connection test"

Test the specific SSH commands used by the adoption:

$ sudo ssh -t YOUR_UNDERCLOUD_HOST 'sudo -u stack bash -lc "echo test"'
$ sudo ssh -t YOUR_UNDERCLOUD_HOST 'sudo -u stack ssh -t tripleo-admin@YOUR_CONTROLLER_HOST.ctlplane "echo test"'

- If the connection fails, verify the SSH configuration and ensure that the undercloud hostname resolves correctly.
4.4.5.3. Resolving database import failures
This content in this section is available in this release as a Technology Preview, and therefore is not fully supported by Red Hat. It should only be used for testing, and should not be deployed in a production environment. For more information, see Technology Preview.
If the source database export or import fails, check the source Galera container, database connectivity, and the source Key Manager service (barbican) configuration.
Example error:
Error: no container with name or ID "galera-bundle-podman-0" found
mysqldump: Got error: 1045: "Access denied for user 'barbican'@'localhost'"
Procedure
Verify that the source Galera container is running:
$ sudo ssh -t YOUR_UNDERCLOUD_HOST 'sudo -u stack ssh -t tripleo-admin@YOUR_CONTROLLER_HOST.ctlplane "sudo podman ps | grep galera"'

Test database connectivity with the extracted credentials:

$ sudo ssh -t YOUR_UNDERCLOUD_HOST 'sudo -u stack ssh -t tripleo-admin@YOUR_CONTROLLER_HOST.ctlplane "sudo podman exec galera-bundle-podman-0 mysql -u barbican -p<password> -e \"SELECT 1;\""'

where:

- <password>
- Specifies your database password.
Check the source Key Manager service configuration for the correct database password:
$ sudo ssh -t YOUR_UNDERCLOUD_HOST 'sudo -u stack ssh -t tripleo-admin@YOUR_CONTROLLER_HOST.ctlplane "sudo grep connection /var/lib/config-data/puppet-generated/barbican/etc/barbican/barbican.conf"'
4.4.5.4. Resolving custom image pull failures
This content in this section is available in this release as a Technology Preview, and therefore is not fully supported by Red Hat. It should only be used for testing, and should not be deployed in a production environment. For more information, see Technology Preview.
If Proteccio custom images fail to pull or start, verify image registry access, image pull secrets, and registry authentication.
Example error:
Failed to pull image "<Your Custom Pod Image and Tag>": rpc error
Pod has unbound immediate PersistentVolumeClaims
Procedure
Verify image registry access:
$ podman pull <custom_pod_image_and_tag>

where:

- <custom_pod_image_and_tag>
- Specifies your custom pod image and the image tag.
Check image pull secrets and registry authentication:
$ oc get secrets -n openstack | grep pull
$ oc describe pod <barbican_pod_name> -n openstack

where:

- <barbican_pod_name>
- Specifies your Barbican pod name.
Verify that the OpenStackVersion resource was applied correctly:

$ oc get openstackversion openstack -n openstack -o yaml
4.4.5.5. Resolving HSM certificate mounting issues
This content in this section is available in this release as a Technology Preview, and therefore is not fully supported by Red Hat. It should only be used for testing, and should not be deployed in a production environment. For more information, see Technology Preview.
If Proteccio client certificates are not properly mounted in pods, check the secret creation and ensure that the Key Manager service (barbican) configuration includes the correct volume mounts.
Example error:
$ oc exec <barbican-pod> -c barbican-api -- ls -la /etc/proteccio/
ls: cannot access '/etc/proteccio/': No such file or directory
Procedure
Verify that the proteccio-data secret was created correctly:

$ oc describe secret proteccio-data -n openstack

Check that the secret contains the expected files:

$ oc get secret proteccio-data -n openstack -o yaml

Verify that the Key Manager service configuration includes the correct volume mounts:
$ oc get barbican barbican -n openstack -o yaml | grep -A10 pkcs11
4.4.5.6. Resolving service startup failures
This content in this section is available in this release as a Technology Preview, and therefore is not fully supported by Red Hat. It should only be used for testing, and should not be deployed in a production environment. For more information, see Technology Preview.
If the Key Manager service (barbican) services fail to start after the hardware security module (HSM) configuration is applied, check the configuration in the pod.
Example error:
$ oc get pods -l service=barbican
NAME READY STATUS RESTARTS AGE
barbican-api-xyz 0/1 Error 0 2m
Procedure
Check pod logs for HSM connectivity issues:
$ oc logs barbican-api-xyz

Verify that the HSM library is accessible:

$ oc exec barbican-api-xyz -- ls -la /usr/lib64/libnethsm.so

Check the HSM configuration in the pod:
$ oc exec barbican-api-xyz -- cat /etc/proteccio/proteccio.rc
4.4.5.7. Resolving adoption verification failures
This content in this section is available in this release as a Technology Preview, and therefore is not fully supported by Red Hat. It should only be used for testing, and should not be deployed in a production environment. For more information, see Technology Preview.
If the secrets from the source environment are not accessible after adoption, verify that the database import completed successfully, test API connectivity, and check for schema adoption issues.
Example error:
$ openstack secret list
# Returns empty list or HTTP 500 errors
Procedure
Verify that the database import completed successfully:
$ oc exec openstack-galera-0 -n openstack -- mysql -u root -p<password> barbican -e "SELECT COUNT(*) FROM secrets;"

where:

- <password>
- Specifies your database password.
Check for schema adoption issues:
$ oc logs job.batch/barbican-db-sync -n openstack

Test API connectivity:

$ oc exec openstackclient -n openstack -- curl -s -k -H "X-Auth-Token: $(openstack token issue -f value -c id)" https://barbican-internal.openstack.svc:9311/v1/secrets

- Verify that projects and users were adopted correctly, as secrets are project-scoped.
4.4.6. Rolling back the HSM adoption
If the hardware security module (HSM) adoption fails, you can restore your environment to its original state and attempt the adoption again.
This content in this section is available in this release as a Technology Preview, and therefore is not fully supported by Red Hat. It should only be used for testing, and should not be deployed in a production environment. For more information, see Technology Preview.
Procedure
Restore the Red Hat OpenStack Services on OpenShift (RHOSO) 18.0 database backup:
$ oc exec -i openstack-galera-0 -n openstack -- mysql -u root -p<password> barbican < /path/to/your/backups/rhoso18_barbican_backup.sql

where:
- <password>
- Specifies your database password.
Reset to standard images:
$ oc delete openstackversion openstack -n openstack

Restore the base control plane configuration:
$ oc apply -f /path/to/your/base_controlplane.yaml
Next steps
To avoid additional issues when attempting your adoption again, consider the following suggestions:
- Check the adoption logs that are stored in your configured working directory with timestamped summary reports.
- For HSM-specific issues, consult the Proteccio documentation and verify HSM connectivity from the target environment.
- Run the adoption in dry-run mode (./run_proteccio_adoption.sh option 3) to validate the environment before making changes.
4.5. Adopting the Networking service
To adopt the Networking service (neutron), you patch an existing OpenStackControlPlane custom resource (CR) that has the Networking service disabled. The patch starts the service with the configuration parameters that are provided by the Red Hat OpenStack Platform (RHOSP) environment.
The Networking service adoption is complete if you see the following results:
- The NeutronAPI service is running.
- The Identity service (keystone) endpoints are updated, and the same back end of the source cloud is available.
Prerequisites
- Ensure that Single Node OpenShift or OpenShift Local is running in the Red Hat OpenShift Container Platform (RHOCP) cluster.
- Adopt the Identity service. For more information, see Adopting the Identity service.
- Migrate your OVN databases to ovsdb-server instances that run in the Red Hat OpenShift Container Platform (RHOCP) cluster. For more information, see Migrating OVN data.
Procedure
Patch the OpenStackControlPlane CR to deploy the Networking service:

$ oc patch openstackcontrolplane openstack --type=merge --patch ' spec: neutron: enabled: true apiOverride: route: {} template: override: service: internal: metadata: annotations: metallb.universe.tf/address-pool: internalapi metallb.universe.tf/allow-shared-ip: internalapi metallb.universe.tf/loadBalancerIPs: <172.17.0.80> spec: type: LoadBalancer databaseInstance: openstack databaseAccount: neutron secret: osp-secret networkAttachments: - internalapi '

where:

- <172.17.0.80>
- Specifies the load balancer IP in your environment. If you use IPv6, change the load balancer IP to the load balancer IP in your environment, for example, metallb.universe.tf/loadBalancerIPs: fd00:bbbb::80.

Note: If you used the neutron-dhcp-agent in your RHOSP 17.1 deployment and you still need to use it after adoption, you must enable the dhcp_agent_notification for the neutron-api service:

$ oc patch openstackcontrolplane openstack --type=merge --patch ' spec: neutron: template: customServiceConfig: | [DEFAULT] dhcp_agent_notification = True '

Note: If you are adopting the Bare Metal Provisioning service (ironic), you must configure the ML2 mechanism drivers to include both ovn and baremetal:

$ oc patch openstackcontrolplane openstack --type=merge --patch ' spec: neutron: template: ml2MechanismDrivers: - ovn - baremetal '
Verification
Inspect the resulting Networking service pods:
$ oc get pods -l service=neutron

Ensure that the Neutron API service is registered in the Identity service:

$ openstack service list | grep network

$ openstack endpoint list | grep network
| 6a805bd6c9f54658ad2f24e5a0ae0ab6 | regionOne | neutron | network | True | public | http://neutron-public-openstack.apps-crc.testing |
| b943243e596847a9a317c8ce1800fa98 | regionOne | neutron | network | True | internal | http://neutron-internal.openstack.svc:9696 |

Create sample resources so that you can test whether the user can create networks, subnets, ports, or routers:

$ openstack network create net
$ openstack subnet create --network net --subnet-range 10.0.0.0/24 subnet
$ openstack router create router
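If you created the sample resources only for verification, you can remove them afterwards. A minimal cleanup sketch, assuming no other resources are attached to them:

$ openstack router delete router
$ openstack subnet delete subnet
$ openstack network delete net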
4.6. Configuring control plane networking for spine-leaf topologies
If you are adopting a spine-leaf or Distributed Compute Node (DCN) deployment, update the control plane networking for communication across sites. Add subnets for remote sites to your existing NetConfig custom resource (CR) and update NetworkAttachmentDefinition CRs with routes to enable connectivity between the central control plane and remote sites.
Prerequisites
- You have deployed the Red Hat OpenStack Services on OpenShift (RHOSO) control plane.
-
You have configured a
NetConfigCR for the central site. For more information, see Configuring isolated networks. You have the network topology information for all remote sites, including:
- IP address ranges for each service network at each site
- VLAN IDs for each service network at each site
- Gateway addresses for inter-site routing
Procedure
Update your existing NetConfig CR to add subnets for each remote site. Each service network must include a subnet for the central site and each remote site. Use unique VLAN IDs for each site. For example:
- Edge site 1: VLANs 30-33
Edge site 2: VLANs 40-43
apiVersion: network.openstack.org/v1beta1 kind: NetConfig metadata: name: netconfig spec: networks: - name: ctlplane dnsDomain: ctlplane.example.com subnets: - name: <subnet1> allocationRanges: - end: 192.168.122.120 start: 192.168.122.100 cidr: 192.168.122.0/24 gateway: 192.168.122.1 - name: <ctlplanesite1> allocationRanges: - end: 192.168.133.120 start: 192.168.133.100 cidr: 192.168.133.0/24 gateway: 192.168.133.1 - name: <ctlplanesite2> allocationRanges: - end: 192.168.144.120 start: 192.168.144.100 cidr: 192.168.144.0/24 gateway: 192.168.144.1 - name: internalapi dnsDomain: internalapi.example.com subnets: - name: subnet1 allocationRanges: - end: 172.17.0.250 start: 172.17.0.100 cidr: 172.17.0.0/24 vlan: 20 - name: internalapisite1 allocationRanges: - end: 172.17.10.250 start: 172.17.10.100 cidr: 172.17.10.0/24 vlan: 30 - name: internalapisite2 allocationRanges: - end: 172.17.20.250 start: 172.17.20.100 cidr: 172.17.20.0/24 vlan: 40 - name: storage dnsDomain: storage.example.com subnets: - name: subnet1 allocationRanges: - end: 172.18.0.250 start: 172.18.0.100 cidr: 172.18.0.0/24 vlan: 21 - name: storagesite1 allocationRanges: - end: 172.18.10.250 start: 172.18.10.100 cidr: 172.18.10.0/24 vlan: 31 - name: storagesite2 allocationRanges: - end: 172.18.20.250 start: 172.18.20.100 cidr: 172.18.20.0/24 vlan: 41 - name: tenant dnsDomain: tenant.example.com subnets: - name: subnet1 allocationRanges: - end: 172.19.0.250 start: 172.19.0.100 cidr: 172.19.0.0/24 vlan: 22 - name: tenantsite1 allocationRanges: - end: 172.19.10.250 start: 172.19.10.100 cidr: 172.19.10.0/24 vlan: 32 - name: tenantsite2 allocationRanges: - end: 172.19.20.250 start: 172.19.20.100 cidr: 172.19.20.0/24 vlan: 42where:
- <subnet1>
- Specifies a user-defined subnet name for the central site subnet.
- <ctlplanesite1>
- Specifies a user-defined subnet for the first DCN edge site.
- <ctlplanesite2>
- Specifies a user-defined subnet for the second DCN edge site.

Note: You must have the storagemgmt network on OpenShift nodes when using DCN with Swift storage. It is not necessary when using Red Hat Ceph Storage.
Update the
NetworkAttachmentDefinitionCR for theinternalapinetwork to include routes to remote site subnets. Theseroutesfields enable control plane pods attached to theinternalapinetwork, such as OVN Southbound database, to communicate with Compute nodes at remote sites through the central site gateway, and are required for DCN:apiVersion: k8s.cni.cncf.io/v1 kind: NetworkAttachmentDefinition metadata: name: internalapi spec: config: | { "cniVersion": "0.3.1", "name": "internalapi", "type": "macvlan", "master": "internalapi", "ipam": { "type": "whereabouts", "range": "172.17.0.0/24", "range_start": "172.17.0.30", "range_end": "172.17.0.70", "routes": [ { "dst": "172.17.10.0/24", "gw": "172.17.0.1" }, { "dst": "172.17.20.0/24", "gw": "172.17.0.1" } ] } }Update the
NetworkAttachmentDefinitionCR for thectlplanenetwork to include routes to remote site subnets:apiVersion: k8s.cni.cncf.io/v1 kind: NetworkAttachmentDefinition metadata: name: ctlplane spec: config: | { "cniVersion": "0.3.1", "name": "ctlplane", "type": "macvlan", "master": "ospbr", "ipam": { "type": "whereabouts", "range": "192.168.122.0/24", "range_start": "192.168.122.30", "range_end": "192.168.122.70", "routes": [ { "dst": "192.168.133.0/24", "gw": "192.168.122.1" }, { "dst": "192.168.144.0/24", "gw": "192.168.122.1" } ] } }Update the
NetworkAttachmentDefinitionCR for thestoragenetwork to include routes to remote site subnets:apiVersion: k8s.cni.cncf.io/v1 kind: NetworkAttachmentDefinition metadata: name: storage spec: config: | { "cniVersion": "0.3.1", "name": "storage", "type": "macvlan", "master": "storage", "ipam": { "type": "whereabouts", "range": "172.18.0.0/24", "range_start": "172.18.0.30", "range_end": "172.18.0.70", "routes": [ { "dst": "172.18.10.0/24", "gw": "172.18.0.1" }, { "dst": "172.18.20.0/24", "gw": "172.18.0.1" } ] } }Update the
NetworkAttachmentDefinitionCR for thetenantnetwork to include routes to remote site subnets:apiVersion: k8s.cni.cncf.io/v1 kind: NetworkAttachmentDefinition metadata: name: tenant spec: config: | { "cniVersion": "0.3.1", "name": "tenant", "type": "macvlan", "master": "tenant", "ipam": { "type": "whereabouts", "range": "172.19.0.0/24", "range_start": "172.19.0.30", "range_end": "172.19.0.70", "routes": [ { "dst": "172.19.10.0/24", "gw": "172.19.0.1" }, { "dst": "172.19.20.0/24", "gw": "172.19.0.1" } ] } }NoteAdjust the IP ranges, subnets, and gateway addresses in all NAD configurations to match your network topology. The
master interface name must match the interface on the OpenShift nodes where the VLAN is configured.

If you have already deployed OVN services, restart the OVN Southbound database pods to pick up the new routes:

$ oc delete pod -l service=ovsdbserver-sb

The pods are automatically recreated with the updated network configuration.
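To confirm that the recreated pods are ready before you continue, a minimal check:

$ oc wait --for=condition=Ready pod -l service=ovsdbserver-sb --timeout=300s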
Configure the Networking service (neutron) to recognize all site physnets. In the OpenStackControlPlane CR, ensure that the Networking service configuration includes all physnets:

apiVersion: core.openstack.org/v1beta1 kind: OpenStackControlPlane metadata: name: openstack spec: neutron: template: customServiceConfig: | [ml2_type_vlan] network_vlan_ranges = leaf0:1:1000,leaf1:1:1000,leaf2:1:1000 [ovn] ovn_emit_need_to_frag = false

where:
- leaf0
- Represents the physnet for the central site.
- leaf1
- Represents the physnet for the first remote site.
- leaf2
- Represents the physnet for the second remote site.

Note: Adjust the physnet names to match your Red Hat OpenStack Platform deployment. Common conventions include leaf0/leaf1/leaf2 or datacentre/dcn1/dcn2.
Verification
Verify that the NetConfig CR is created with all subnets:

$ oc get netconfig netconfig -o yaml | grep -A2 "name: subnet1\|name: .*site"

Verify that each NetworkAttachmentDefinition includes routes to remote site subnets:

for nad in ctlplane internalapi storage tenant; do
  echo "=== $nad ==="
  oc get net-attach-def $nad -o jsonpath='{.spec.config}' | jq '.ipam.routes'
done

After restarting the OVN SB pods, verify that they have routes to remote site subnets:

$ oc exec $(oc get pod -l service=ovsdbserver-sb -o name | head -1) -- ip route show | grep 172.17

Sample output:

172.17.10.0/24 via 172.17.0.1 dev internalapi
172.17.20.0/24 via 172.17.0.1 dev internalapi
4.7. Adopting the Object Storage service
If you are using Object Storage as a service, adopt the Object Storage service (swift) to the Red Hat OpenStack Services on OpenShift (RHOSO) environment. If you are using the Object Storage API of the Ceph Object Gateway (RGW), skip the following procedure.
Prerequisites
- The Object Storage service storage back-end services are running in the Red Hat OpenStack Platform (RHOSP) deployment.
- The storage network is properly configured on the Red Hat OpenShift Container Platform (RHOCP) cluster. For more information, see Preparing Red Hat OpenShift Container Platform for Red Hat OpenStack Services on OpenShift in Deploying Red Hat OpenStack Services on OpenShift.
Procedure
Create the
swift-confsecret that includes the Object Storage service hash path suffix and prefix:$ oc apply -f - <<EOF apiVersion: v1 kind: Secret metadata: name: swift-conf type: Opaque data: swift.conf: $($CONTROLLER1_SSH sudo cat /var/lib/config-data/puppet-generated/swift/etc/swift/swift.conf | base64 -w0) EOFCreate the
swift-ring-filesConfigMapthat includes the Object Storage service ring files:$ oc apply -f - <<EOF apiVersion: v1 kind: ConfigMap metadata: name: swift-ring-files binaryData: swiftrings.tar.gz: $($CONTROLLER1_SSH "cd /var/lib/config-data/puppet-generated/swift/etc/swift && tar cz *.builder *.ring.gz backups/ | base64 -w0") account.ring.gz: $($CONTROLLER1_SSH "base64 -w0 /var/lib/config-data/puppet-generated/swift/etc/swift/account.ring.gz") container.ring.gz: $($CONTROLLER1_SSH "base64 -w0 /var/lib/config-data/puppet-generated/swift/etc/swift/container.ring.gz") object.ring.gz: $($CONTROLLER1_SSH "base64 -w0 /var/lib/config-data/puppet-generated/swift/etc/swift/object.ring.gz") EOFPatch the
OpenStackControlPlanecustom resource to deploy the Object Storage service:$ oc patch openstackcontrolplane openstack --type=merge --patch ' spec: swift: enabled: true template: memcachedInstance: memcached swiftRing: ringReplicas: 3 swiftStorage: replicas: 0 networkAttachments: - storage storageClass: local-storage storageRequest: 10Gi swiftProxy: secret: osp-secret replicas: 2 encryptionEnabled: false passwordSelectors: service: SwiftPassword serviceUser: swift override: service: internal: metadata: annotations: metallb.universe.tf/address-pool: internalapi metallb.universe.tf/allow-shared-ip: internalapi metallb.universe.tf/loadBalancerIPs: <172.17.0.80> spec: type: LoadBalancer networkAttachments: - storage '-
- spec.swift.swiftStorage.storageClass must match the RHOSO deployment storage class.
- metallb.universe.tf/loadBalancerIPs: <172.17.0.80> specifies the load balancer IP in your environment. If you use IPv6, change the load balancer IP to the load balancer IP in your environment, for example, metallb.universe.tf/loadBalancerIPs: fd00:bbbb::80.
- spec.swift.swiftProxy.networkAttachments must match the network attachment for the previous Object Storage service configuration from the RHOSP deployment.

Note: If SwiftEncryptionEnabled: true was set in Red Hat OpenStack Platform, ensure that spec.swift.swiftProxy.encryptionEnabled is set to true and that the Key Manager service (barbican) adoption is complete before proceeding.
Verification
Inspect the resulting Object Storage service pods:
$ oc get pods -l component=swift-proxy

Verify that the Object Storage proxy service is registered in the Identity service (keystone):

$ openstack service list | grep swift
| b5b9b1d3c79241aa867fa2d05f2bbd52 | swift | object-store |

$ openstack endpoint list | grep swift
| 32ee4bd555414ab48f2dc90a19e1bcd5 | regionOne | swift | object-store | True | public | https://swift-public-openstack.apps-crc.testing/v1/AUTH_%(tenant_id)s |
| db4b8547d3ae4e7999154b203c6a5bed | regionOne | swift | object-store | True | internal | http://swift-internal.openstack.svc:8080/v1/AUTH_%(tenant_id)s |

Verify that you are able to upload and download objects:
$ openstack container create test +---------------------------------------+-----------+------------------------------------+ | account | container | x-trans-id | +---------------------------------------+-----------+------------------------------------+ | AUTH_4d9be0a9193e4577820d187acdd2714a | test | txe5f9a10ce21e4cddad473-0065ce41b9 | +---------------------------------------+-----------+------------------------------------+ $ openstack object create test --name obj <(echo "Hello World!") +--------+-----------+----------------------------------+ | object | container | etag | +--------+-----------+----------------------------------+ | obj | test | d41d8cd98f00b204e9800998ecf8427e | +--------+-----------+----------------------------------+ $ openstack object save test obj --file - Hello World!
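If you created the test container and object only for verification, you can optionally remove them afterwards:

$ openstack object delete test obj
$ openstack container delete test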
The Object Storage data is still stored on the existing RHOSP nodes. For more information about migrating the actual data from the RHOSP deployment to the RHOSO deployment, see Migrating the Object Storage service (swift) data from RHOSP to Red Hat OpenStack Services on OpenShift (RHOSO) nodes.
4.8. Adopting the Image service
To adopt the Image service (glance), you patch an existing OpenStackControlPlane custom resource (CR) that has the Image service disabled. The patch starts the service with the configuration parameters that are provided by the Red Hat OpenStack Platform (RHOSP) environment.
The Image service adoption is complete if you see the following results:
- The GlanceAPI service is up and running.
- The Identity service endpoints are updated, and the same back end of the source cloud is available.
To complete the Image service adoption, ensure that your environment meets the following criteria:
- You have a running director environment (the source cloud).
- You have a Single Node OpenShift or OpenShift Local that is running in the Red Hat OpenShift Container Platform (RHOCP) cluster.
- Optional: You can reach an internal or external Ceph cluster from both crc and director.
If you have image quotas in RHOSP 17.1, these quotas are not transferred to Red Hat OpenStack Services on OpenShift (RHOSO) 18.0 because the image quota system in 18.0 is disabled by default. If you enable image quotas in RHOSO 18.0, the new quotas replace the legacy quotas from RHOSP 17.1.
4.8.1. Adopting the Image service that is deployed with an Object Storage service back end
Adopt the Image Service (glance) that you deployed with an Object Storage service (swift) back end in the Red Hat OpenStack Platform (RHOSP) environment. The control plane glanceAPI instance is deployed with the following configuration. You use this configuration in the patch manifest that deploys the Image service with the object storage back end:
..
spec:
  glance:
    ...
    customServiceConfig: |
      [DEFAULT]
      enabled_backends = default_backend:swift
      [glance_store]
      default_backend = default_backend
      [default_backend]
      swift_store_create_container_on_put = True
      swift_store_auth_version = 3
      swift_store_auth_address = {{ .KeystoneInternalURL }}
      swift_store_endpoint_type = internalURL
      swift_store_user = service:glance
      swift_store_key = {{ .ServicePassword }}
Prerequisites
- You have completed the previous adoption steps.
Procedure
Create a new file, for example,
glance_swift.patch, and include the following content:spec: glance: enabled: true apiOverride: route: {} template: secret: osp-secret databaseInstance: openstack storage: storageRequest: 10G customServiceConfig: | [DEFAULT] enabled_backends = default_backend:swift [glance_store] default_backend = default_backend [default_backend] swift_store_create_container_on_put = True swift_store_auth_version = 3 swift_store_auth_address = {{ .KeystoneInternalURL }} swift_store_endpoint_type = internalURL swift_store_user = service:glance swift_store_key = {{ .ServicePassword }} glanceAPIs: default: replicas: 1 override: service: internal: metadata: annotations: metallb.universe.tf/address-pool: internalapi metallb.universe.tf/allow-shared-ip: internalapi metallb.universe.tf/loadBalancerIPs: <172.17.0.80> spec: type: LoadBalancer networkAttachments: - storagewhere:
- <172.17.0.80>
Specifies the load balancer IP in your environment. If you use IPv6, change the value to an IPv6 load balancer IP from your environment, for example, metallb.universe.tf/loadBalancerIPs: fd00:bbbb::80.

Note
The Object Storage service as a back end establishes a dependency with the Image service. Any deployed GlanceAPI instances do not work if the Image service is configured to use an Object Storage service that is not available in the OpenStackControlPlane custom resource. After the Object Storage service, and in particular SwiftProxy, is adopted, you can proceed with the GlanceAPI adoption. For more information, see Adopting the Object Storage service.
Verify that SwiftProxy is available:

$ oc get pod -l component=swift-proxy | grep Running
swift-proxy-75cb47f65-92rxq 3/3 Running 0

Patch the GlanceAPI service that is deployed in the control plane:

$ oc patch openstackcontrolplane openstack --type=merge --patch-file=glance_swift.patch
4.8.2. Adopting the Image service that is deployed with a Block Storage service back end
Adopt the Image Service (glance) that you deployed with a Block Storage service (cinder) back end in the Red Hat OpenStack Platform (RHOSP) environment. The control plane glanceAPI instance is deployed with the following configuration. You use this configuration in the patch manifest that deploys the Image service with the block storage back end:
..
spec:
  glance:
    ...
    customServiceConfig: |
      [DEFAULT]
      enabled_backends = default_backend:cinder
      [glance_store]
      default_backend = default_backend
      [default_backend]
      description = Default cinder backend
      cinder_store_auth_address = {{ .KeystoneInternalURL }}
      cinder_store_user_name = {{ .ServiceUser }}
      cinder_store_password = {{ .ServicePassword }}
      cinder_store_project_name = service
      cinder_catalog_info = volumev3::internalURL
      cinder_use_multipath = true
      [oslo_concurrency]
      lock_path = /var/lib/glance/tmp
Prerequisites
- You have completed the previous adoption steps.
Procedure
Create a new file, for example
glance_cinder.patch, and include the following content:spec: glance: enabled: true apiOverride: route: {} template: secret: osp-secret databaseInstance: openstack storage: storageRequest: 10G customServiceConfig: | [DEFAULT] enabled_backends = default_backend:cinder [glance_store] default_backend = default_backend [default_backend] description = Default cinder backend cinder_store_auth_address = {{ .KeystoneInternalURL }} cinder_store_user_name = {{ .ServiceUser }} cinder_store_password = {{ .ServicePassword }} cinder_store_project_name = service cinder_catalog_info = volumev3::internalURL cinder_use_multipath = true [oslo_concurrency] lock_path = /var/lib/glance/tmp glanceAPIs: default: replicas: 1 override: service: internal: metadata: annotations: metallb.universe.tf/address-pool: internalapi metallb.universe.tf/allow-shared-ip: internalapi metallb.universe.tf/loadBalancerIPs: <172.17.0.80> spec: type: LoadBalancer networkAttachments: - storagewhere:
- <172.17.0.80>
Specifies the load balancer IP. If you use IPv6, change the value to an IPv6 load balancer IP from your environment, for example, metallb.universe.tf/loadBalancerIPs: fd00:bbbb::80.

Note
The Block Storage service as a back end establishes a dependency with the Image service. Any deployed GlanceAPI instances do not work if the Image service is configured to use a Block Storage service that is not available in the OpenStackControlPlane custom resource. After the Block Storage service, and in particular CinderVolume, is adopted, you can proceed with the GlanceAPI adoption. For more information, see Adopting the Block Storage service.
Verify that CinderVolume is available:

$ oc get pod -l component=cinder-volume | grep Running
cinder-volume-75cb47f65-92rxq 3/3 Running 0

Patch the GlanceAPI service that is deployed in the control plane:

$ oc patch openstackcontrolplane openstack --type=merge --patch-file=glance_cinder.patch
4.8.3. Adopting the Image service that is deployed with an NFS back end
Adopt the Image Service (glance) that you deployed with an NFS back end. To complete the following procedure, ensure that your environment meets the following criteria:
- The Storage network is propagated to the Red Hat OpenStack Platform (RHOSP) control plane.
- The Image service can reach the Storage network and connect to the nfs-server through port 2049.
Prerequisites
- You have completed the previous adoption steps.
In the source cloud, verify the NFS parameters that the overcloud uses to configure the Image service back end. Specifically, in your director heat templates, find the following variables that override the default content that is provided by the glance-nfs.yaml file in the /usr/share/openstack-tripleo-heat-templates/environments/storage directory:

GlanceBackend: file
GlanceNfsEnabled: true
GlanceNfsShare: 192.168.24.1:/var/nfs

Note
In this example, the GlanceBackend variable shows that the Image service has no notion of an NFS back end. The variable is using the File driver and, in the background, the filesystem_store_datadir. The filesystem_store_datadir is mapped to the export value provided by the GlanceNfsShare variable instead of /var/lib/glance/images/. If you do not export the GlanceNfsShare through a network that is propagated to the adopted Red Hat OpenStack Services on OpenShift (RHOSO) control plane, you must stop the nfs-server and remap the export to the storage network. Before doing so, ensure that the Image service is stopped in the source Controller nodes.

In the control plane, the Image service is attached to the Storage network, then propagated through the associated NetworkAttachmentDefinition custom resource (CR), and the resulting pods already have the right permissions to handle the Image service traffic through this network. In a deployed RHOSO control plane, you can verify that the network mapping matches what was deployed in the director-based environment by checking both the NodeNetworkConfigurationPolicy (nncp) and the NetworkAttachmentDefinition (net-attach-def). The following is an example of the output that you should check in the Red Hat OpenShift Container Platform (RHOCP) environment to make sure that there are no issues with the propagated networks:

$ oc get nncp
NAME                        STATUS      REASON
enp6s0-crc-8cf2w-master-0   Available   SuccessfullyConfigured

$ oc get net-attach-def
NAME
ctlplane
internalapi
storage
tenant

$ oc get ipaddresspool -n metallb-system
NAME          AUTO ASSIGN   AVOID BUGGY IPS   ADDRESSES
ctlplane      true          false             ["192.168.122.80-192.168.122.90"]
internalapi   true          false             ["172.17.0.80-172.17.0.90"]
storage       true          false             ["172.18.0.80-172.18.0.90"]
tenant        true          false             ["172.19.0.80-172.19.0.90"]
Procedure
Adopt the Image service and create a new
defaultGlanceAPIinstance that is connected with the existing NFS share:$ cat << EOF > glance_nfs_patch.yaml spec: extraMounts: - extraVol: - extraVolType: Nfs mounts: - mountPath: /var/lib/glance/images name: nfs propagation: - Glance volumes: - name: nfs nfs: path: <exported_path> server: <ip_address> name: r1 region: r1 glance: enabled: true template: databaseInstance: openstack customServiceConfig: | [DEFAULT] enabled_backends = default_backend:file [glance_store] default_backend = default_backend [default_backend] filesystem_store_datadir = /var/lib/glance/images/ storage: storageRequest: 10G keystoneEndpoint: nfs glanceAPIs: nfs: replicas: 3 type: single override: service: internal: metadata: annotations: metallb.universe.tf/address-pool: internalapi metallb.universe.tf/allow-shared-ip: internalapi metallb.universe.tf/loadBalancerIPs: <172.17.0.80> spec: type: LoadBalancer networkAttachments: - storage EOFwhere:
- <exported_path>
Specifies the exported path in the nfs-server.
- <ip_address>
Specifies the IP address that you use to communicate with the nfs-server.
- <172.17.0.80>
Specifies the load balancer IP in your environment. If you use IPv6, change the value to an IPv6 load balancer IP from your environment, for example, metallb.universe.tf/loadBalancerIPs: fd00:bbbb::80.
Patch the OpenStackControlPlane CR to deploy the Image service with an NFS back end:

$ oc patch openstackcontrolplane openstack --type=merge --patch-file glance_nfs_patch.yaml

Patch the OpenStackControlPlane CR to remove the default Image service:

$ oc patch openstackcontrolplane openstack --type=json -p="[{'op': 'remove', 'path': '/spec/glance/template/glanceAPIs/default'}]"
Verification
When GlanceAPI is active, confirm that you can see a single API instance:

$ oc get pods -l service=glance
NAME                  READY   STATUS    RESTARTS
glance-nfs-single-0   2/2     Running   0
glance-nfs-single-1   2/2     Running   0
glance-nfs-single-2   2/2     Running   0

Ensure that the description of the pod reports the following output:
Mounts:
...
nfs:
  Type:      NFS (an NFS mount that lasts the lifetime of a pod)
  Server:    {{ server ip address }}
  Path:      {{ nfs export path }}
  ReadOnly:  false
...

Check that the mountpoint that points to /var/lib/glance/images is mapped to the expected nfs server ip and nfs path that you defined in the new GlanceAPI instance:

$ oc rsh -c glance-api glance-nfs-single-0
sh-5.1# mount
...
{{ ip address }}:/var/nfs on /var/lib/glance/images type nfs4 (rw,relatime,vers=4.2,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=172.18.0.5,local_lock=none,addr=172.18.0.5)
...

Confirm that the UUID is created in the exported directory on the NFS node. For example:

$ oc rsh openstackclient
$ openstack image list
sh-5.1$ curl -L -o /tmp/cirros-0.6.3-x86_64-disk.img http://download.cirros-cloud.net/0.6.3/cirros-0.6.3-x86_64-disk.img
sh-5.1$ openstack image create --container-format bare --disk-format raw --file /tmp/cirros-0.6.3-x86_64-disk.img cirros
sh-5.1$ openstack image list
+--------------------------------------+--------+--------+
| ID                                   | Name   | Status |
+--------------------------------------+--------+--------+
| 634482ca-4002-4a6d-b1d5-64502ad02630 | cirros | active |
+--------------------------------------+--------+--------+

On the nfs-server node, the same uuid is in the exported /var/nfs:

$ ls /var/nfs/
634482ca-4002-4a6d-b1d5-64502ad02630
4.8.4. Adopting the Image service that is deployed with a Red Hat Ceph Storage back end
Adopt the Image Service (glance) that you deployed with a Red Hat Ceph Storage back end. Use the customServiceConfig parameter to inject the right configuration to the GlanceAPI instance.
Prerequisites
- You have completed the previous adoption steps.
Ensure that the Ceph-related secret (
ceph-conf-files) is created in the openstack namespace and that the extraMounts property of the OpenStackControlPlane custom resource (CR) is configured properly. For more information, see Configuring a Ceph back end.

$ cat << EOF > glance_patch.yaml
spec:
  glance:
    enabled: true
    template:
      databaseInstance: openstack
      customServiceConfig: |
        [DEFAULT]
        enabled_backends=default_backend:rbd
        [glance_store]
        default_backend=default_backend
        [default_backend]
        rbd_store_ceph_conf=/etc/ceph/ceph.conf
        rbd_store_user=openstack
        rbd_store_pool=images
        store_description=Ceph glance store backend.
      storage:
        storageRequest: 10G
      glanceAPIs:
        default:
          replicas: 3
          override:
            service:
              internal:
                metadata:
                  annotations:
                    metallb.universe.tf/address-pool: internalapi
                    metallb.universe.tf/allow-shared-ip: internalapi
                    metallb.universe.tf/loadBalancerIPs: <172.17.0.80>
                spec:
                  type: LoadBalancer
          networkAttachments:
            - storage
EOF

where:
- <172.17.0.80>
Specifies the load balancer IP. If you use IPv6, change the value to an IPv6 load balancer IP from your environment, for example, metallb.universe.tf/loadBalancerIPs: fd00:bbbb::80.
If you backed up your Red Hat OpenStack Platform (RHOSP) services configuration file from the original environment, you can compare it with the configuration file that you adopted and ensure that the configuration is correct. For more information, see Pulling the configuration from a director deployment.
os-diff diff /tmp/collect_tripleo_configs/glance/etc/glance/glance-api.conf glance_patch.yaml --crd
This command produces the difference between both ini configuration files.
Procedure
Patch the OpenStackControlPlane CR to deploy the Image service with a Red Hat Ceph Storage back end:

$ oc patch openstackcontrolplane openstack --type=merge --patch-file glance_patch.yaml
4.8.5. Adopting the Image service with multiple Red Hat Ceph Storage back ends (DCN)
Adopt the Image Service (glance) in a Distributed Compute Node (DCN) deployment where multiple Red Hat Ceph Storage clusters provide storage at different sites. This configuration deploys multiple GlanceAPI instances: a central API with access to all Red Hat Ceph Storage clusters, and edge APIs at each DCN site with access to their local cluster and the central cluster.
During adoption, the Image service instances that ran on edge site Compute nodes are migrated to run on Red Hat OpenShift Container Platform (RHOCP) at the central site. Although the control path for API requests now traverses the WAN to reach the Image service running on Red Hat OpenShift Container Platform (RHOCP), the data path remains local. Image data continues to be stored in the Red Hat Ceph Storage cluster at each edge site. When you create a virtual machine or volume from an image, the operation occurs at the local Red Hat Ceph Storage cluster. This architecture uses Red Hat Ceph Storage shallow copies (copy-on-write clones) to enable fast boot times without transferring image data across the WAN.
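As an illustrative check of this copy-on-write behavior, and not a step in the adoption procedure, you can inspect a boot volume on an edge site's Red Hat Ceph Storage cluster and confirm that its parent points at an image in the local images pool. The volume name, image ID, and the configuration and keyring paths are placeholders that depend on your environment:

$ sudo cephadm shell --config /etc/ceph/dcn1.conf --keyring /etc/ceph/dcn1.client.openstack.keyring \
  -- rbd info volumes/volume-<volume-id> --cluster dcn1 | grep parent
        parent: images/<image-id>@snap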
The virtual IP addresses (VIPs) used by Compute service nodes to reach the Image service change during adoption. Before adoption, edge site nodes contact a local Image service VIP on the internalapi subnet. After adoption, they contact a Red Hat OpenShift Container Platform (RHOCP) service endpoint on a different internalapi subnet. The following table shows an example of this change:
| Site | Before adoption | After adoption |
|---|---|---|
| Central | Identity service catalog VIP | Identity service catalog updated to |
| DCN1 | | |
| DCN2 | | |
In Red Hat OpenStack Platform, the internal Image service endpoint at edge sites used TCP port 9293; after adoption, all Image service endpoints use port 9292. The new endpoints are backed by MetalLB load balancer IPs that you assign by using the metallb.universe.tf/loadBalancerIPs annotation on each GlanceAPI. When you patch the OpenStackControlPlane custom resource (CR), Red Hat OpenShift Container Platform (RHOCP) creates internal Kubernetes services (for example, glance-dcn1-internal.openstack.svc) that resolve to those MetalLB IPs. The Compute service nodes are configured to use these endpoints when you adopt the data plane. For more information, see Adopting Compute services with multiple Ceph back ends (DCN). The examples in this procedure use http:// for the Image service endpoints. If your Red Hat OpenStack Platform deployment uses TLS for internal endpoints, use https:// and ensure that you have completed the TLS migration. For more information, see Migrating TLS-e to the RHOSO deployment.
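As an illustrative check, and not a step in the adoption procedure, you can confirm after you apply the control plane patch that a per-site internal Image service exists and carries the expected MetalLB IP. The service names shown here are hypothetical and depend on the names that you give each GlanceAPI:

$ oc get svc -n openstack | grep glance
glance-central-internal   LoadBalancer   ...   172.17.0.80   ...
glance-dcn1-internal      LoadBalancer   ...   172.17.0.81   ...
glance-dcn2-internal      LoadBalancer   ...   172.17.0.82   ...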
Prerequisites
- You have completed the previous adoption steps.
-
The per-site Red Hat Ceph Storage secrets (
ceph-conf-central,ceph-conf-dcn1,ceph-conf-dcn2) exist and contain the configuration and keyrings for each site’s Red Hat Ceph Storage cluster. For more information, see Configuring a Red Hat Ceph Storage back end. -
The
extraMountsproperty of theOpenStackControlPlaneCR is configured to mount the Red Hat Ceph Storage configuration to all Glance instances. -
You have stopped the Image service on all DCN nodes. If your deployment includes
DistributedComputeHCIScaleOutorDistributedComputeScaleOutnodes, you have also stopped HAProxy on those nodes. For more information, see Stopping Red Hat OpenStack Platform services.
Procedure
Create a patch file for the Image service with multiple Red Hat Ceph Storage back ends. Use MetalLB loadbalancer IPs for the Image service endpoints:
Example DCN deployment with a central site and two edge sites:
$ cat << EOF > glance_dcn_patch.yaml spec: glance: enabled: true template: databaseInstance: openstack databaseAccount: glance keystoneEndpoint: central storage: storageRequest: <10G> glanceAPIs: central: type: split replicas: 3 override: service: internal: metadata: annotations: metallb.universe.tf/address-pool: internalapi metallb.universe.tf/allow-shared-ip: internalapi metallb.universe.tf/loadBalancerIPs: <172.17.0.80> spec: type: LoadBalancer networkAttachments: - storage customServiceConfig: | [DEFAULT] enabled_import_methods = [web-download,copy-image,glance-direct] enabled_backends = central:rbd,dcn1:rbd,dcn2:rbd [glance_store] default_backend = central [central] rbd_store_ceph_conf = /etc/ceph/central.conf store_description = "Central RBD backend" rbd_store_pool = images rbd_store_user = openstack rbd_thin_provisioning = True [dcn1] rbd_store_ceph_conf = /etc/ceph/dcn1.conf store_description = "DCN1 RBD backend" rbd_store_pool = images rbd_store_user = openstack rbd_thin_provisioning = True [dcn2] rbd_store_ceph_conf = /etc/ceph/dcn2.conf store_description = "DCN2 RBD backend" rbd_store_pool = images rbd_store_user = openstack rbd_thin_provisioning = True dcn1: type: edge replicas: 2 override: service: internal: metadata: annotations: metallb.universe.tf/address-pool: internalapi metallb.universe.tf/allow-shared-ip: internalapi metallb.universe.tf/loadBalancerIPs: <172.17.0.81> spec: type: LoadBalancer networkAttachments: - storage customServiceConfig: | [DEFAULT] enabled_import_methods = [web-download,copy-image,glance-direct] enabled_backends = central:rbd,dcn1:rbd [glance_store] default_backend = dcn1 [central] rbd_store_ceph_conf = /etc/ceph/central.conf store_description = "Central RBD backend" rbd_store_pool = images rbd_store_user = openstack rbd_thin_provisioning = True [dcn1] rbd_store_ceph_conf = /etc/ceph/dcn1.conf store_description = "DCN1 RBD backend" rbd_store_pool = images rbd_store_user = openstack rbd_thin_provisioning = True dcn2: type: edge replicas: 2 override: service: internal: metadata: annotations: metallb.universe.tf/address-pool: internalapi metallb.universe.tf/allow-shared-ip: internalapi metallb.universe.tf/loadBalancerIPs: <172.17.0.82> spec: type: LoadBalancer networkAttachments: - storage customServiceConfig: | [DEFAULT] enabled_import_methods = [web-download,copy-image,glance-direct] enabled_backends = central:rbd,dcn2:rbd [glance_store] default_backend = dcn2 [central] rbd_store_ceph_conf = /etc/ceph/central.conf store_description = "Central RBD backend" rbd_store_pool = images rbd_store_user = openstack rbd_thin_provisioning = True [dcn2] rbd_store_ceph_conf = /etc/ceph/dcn2.conf store_description = "DCN2 RBD backend" rbd_store_pool = images rbd_store_user = openstack rbd_thin_provisioning = True EOFwhere:
- <172.17.0.80>
Specifies the load balancer IP for the central Image service API.
- <172.17.0.81>
Specifies the load balancer IP for the DCN1 edge Image service API.
- <172.17.0.82>
Specifies the load balancer IP for the DCN2 edge Image service API.
You must configure the Compute nodes at each site to use their local Image service endpoints. For example, Compute nodes at central use 172.17.0.80, Compute nodes at dcn1 use 172.17.0.81, and Compute nodes at dcn2 use 172.17.0.82. This configuration is applied when you adopt the data plane by adding a per-site ConfigMap with the
glance_api_servers setting to each OpenStackDataPlaneNodeSet. For more information, see Adopting Compute services to the data plane.
Note-
The central
GlanceAPIusestype: splitand has access to all Red Hat Ceph Storage clusters. ThekeystoneEndpoint: centralsetting registers this API as the public endpoint in the Identity service. -
Each edge
GlanceAPIusestype: edgeand has access to its local Red Hat Ceph Storage cluster plus the central cluster. This enables image copying between sites. -
Set the
storageRequestPVC size based on the storage requirements of each edge site. - Adjust the number of edge sites and their names to match your DCN deployment.
Patch the OpenStackControlPlane CR to deploy the Image service with multiple Red Hat Ceph Storage back ends:

$ oc patch openstackcontrolplane openstack --type=merge --patch-file glance_dcn_patch.yaml

Verify that the Image service stores are available for each site:
$ glance stores-info
+----------+----------------------------------------------------------------------------------+
| Property | Value                                                                            |
+----------+----------------------------------------------------------------------------------+
| stores   | [{"id": "central", "description": "Central RBD backend", "default": "true"},    |
|          | {"id": "dcn1", "description": "dcn1 RBD backend"}, {"id": "dcn2", "description": |
|          | "dcn2 RBD backend"}]                                                             |
+----------+----------------------------------------------------------------------------------+

The output should list one store for each Red Hat Ceph Storage back end configured in the central GlanceAPI, and the central store should be marked as the default. If any stores are missing, check the customServiceConfig in the glanceAPIs section of the patch and verify that the Red Hat Ceph Storage configuration files are present in the ceph-conf-central secret.

Verify that image import methods include copy-image, which is required for copying images between stores:

$ glance import-info
+----------------+----------------------------------------------------------------------------------+
| Property       | Value                                                                            |
+----------------+----------------------------------------------------------------------------------+
| import-methods | {"description": "Import methods available.", "type": "array", "value": ["web-   |
|                | download", "copy-image", "glance-direct"]}                                       |
+----------------+----------------------------------------------------------------------------------+

Upload a test image to the central store. Note the image ID:

$ glance image-create --disk-format raw --container-format bare --name test-image \
  --file <image-file> --store central

Verify that the image ID from the previous command is shown in the central Red Hat Ceph Storage cluster’s images pool:

$ sudo cephadm shell --config /etc/ceph/central.conf --keyring /etc/ceph/central.client.openstack.keyring \
  -- rbd -p images --cluster central ls -l
NAME        SIZE    PARENT  FMT  PROT  LOCK
<image-id>  20 MiB          2

Copy the image to an edge site by using the copy-image import method:

$ glance image-import <image-id> --stores dcn1 --import-method copy-image

After the import completes, verify that the stores field on the image now includes both central and dcn1:

$ glance image-show <image-id> | grep stores
| stores | central,dcn1 |

Verify that the image was copied to the DCN1 Red Hat Ceph Storage cluster:

$ sudo cephadm shell --config /etc/ceph/dcn1.conf --keyring /etc/ceph/dcn1.client.openstack.keyring \
  -- rbd -p images --cluster dcn1 ls -l
NAME        SIZE    PARENT  FMT  PROT  LOCK
<image-id>  20 MiB          2

The image is now present on the DCN1 Red Hat Ceph Storage cluster, confirming that the Image service can copy images between sites. Repeat the glance image-import command for each additional edge site to distribute the image to all DCN locations.
4.8.6. Verifying the Image service adoption
Verify that you adopted the Image Service (glance) to the Red Hat OpenStack Services on OpenShift (RHOSO) 18.0 deployment.
Procedure
Test the Image service from the Red Hat OpenStack Platform CLI. You can compare and ensure that the configuration is applied to the Image service pods:
$ os-diff diff /etc/glance/glance.conf.d/02-config.conf glance_patch.yaml --frompod -p glance-api

If no line appears, then the configuration is correct.
Inspect the resulting Image service pods:
GLANCE_POD=`oc get pod |grep glance-default | cut -f 1 -d' ' | head -n 1`
oc exec -t $GLANCE_POD -c glance-api -- cat /etc/glance/glance.conf.d/02-config.conf

[DEFAULT]
enabled_backends=default_backend:rbd
[glance_store]
default_backend=default_backend
[default_backend]
rbd_store_ceph_conf=/etc/ceph/ceph.conf
rbd_store_user=openstack
rbd_store_pool=images
store_description=Ceph glance store backend.

If you use a Red Hat Ceph Storage back end, ensure that the Red Hat Ceph Storage secrets are mounted:
$ oc exec -t $GLANCE_POD -c glance-api -- ls /etc/ceph
ceph.client.openstack.keyring
ceph.conf

Check that the service is active, and that the endpoints are updated in the RHOSP CLI:
$ oc rsh openstackclient
$ openstack service list | grep image
| fc52dbffef36434d906eeb99adfc6186 | glance | image |

$ openstack endpoint list | grep image
| 569ed81064f84d4a91e0d2d807e4c1f1 | regionOne | glance | image | True | internal | http://glance-internal-openstack.apps-crc.testing |
| 5843fae70cba4e73b29d4aff3e8b616c | regionOne | glance | image | True | public | http://glance-public-openstack.apps-crc.testing |

Check that the images that you previously listed in the source cloud are available in the adopted service:
$ openstack image list +--------------------------------------+--------+--------+ | ID | Name | Status | +--------------------------------------+--------+--------+ | c3158cad-d50b-452f-bec1-f250562f5c1f | cirros | active | +--------------------------------------+--------+--------+
4.9. Adopting the Placement service
To adopt the Placement service, you patch an existing OpenStackControlPlane custom resource (CR) that has the Placement service disabled. The patch starts the service with the configuration parameters that are provided by the Red Hat OpenStack Platform (RHOSP) environment.
Prerequisites
- You import your databases to MariaDB instances on the control plane. For more information, see Migrating databases to MariaDB instances.
- You adopt the Identity service (keystone). For more information, see Adopting the Identity service.
Procedure
Patch the OpenStackControlPlane CR to deploy the Placement service:

$ oc patch openstackcontrolplane openstack --type=merge --patch '
spec:
  placement:
    enabled: true
    apiOverride:
      route: {}
    template:
      databaseInstance: openstack
      databaseAccount: placement
      secret: osp-secret
      override:
        service:
          internal:
            metadata:
              annotations:
                metallb.universe.tf/address-pool: internalapi
                metallb.universe.tf/allow-shared-ip: internalapi
                metallb.universe.tf/loadBalancerIPs: <172.17.0.80>
            spec:
              type: LoadBalancer
'

where:
- <172.17.0.80>
Specifies the load balancer IP in your environment. If you use IPv6, change the value to an IPv6 load balancer IP from your environment, for example, metallb.universe.tf/loadBalancerIPs: fd00:bbbb::80.
Verification
Check that the Placement service endpoints are defined and pointing to the control plane FQDNs, and that the Placement API responds:
$ alias openstack="oc exec -t openstackclient -- openstack"
$ openstack endpoint list | grep placement

# Without OpenStack CLI placement plugin installed:
$ PLACEMENT_PUBLIC_URL=$(openstack endpoint list -c 'Service Name' -c 'Service Type' -c URL | grep placement | grep public | awk '{ print $6; }')
$ oc exec -t openstackclient -- curl "$PLACEMENT_PUBLIC_URL"

# With OpenStack CLI placement plugin installed:
$ openstack resource class list
4.10. Adopting the Bare Metal Provisioning service
Review information about your Bare Metal Provisioning service (ironic) configuration and then adopt the Bare Metal Provisioning service to the Red Hat OpenStack Services on OpenShift control plane.
4.10.1. Bare Metal Provisioning service configurations
You configure the Bare Metal Provisioning service (ironic) by using configuration snippets.
Some Bare Metal Provisioning service configuration is overridden in director, for example, PXE Loader file names are often overridden at intermediate layers. You must pay attention to the settings that you apply in your Red Hat OpenStack Services on OpenShift (RHOSO) deployment. The ironic-operator applies a reasonable working default configuration, but if you override those defaults with your prior configuration, your experience might not be ideal, or your new Bare Metal Provisioning service might fail to operate. Similarly, additional configuration might be necessary, for example, if you enabled and use additional hardware types in your ironic.conf file.
The model of reasonable defaults includes commonly used hardware-types and driver interfaces. For example, the redfish-virtual-media boot interface and the ramdisk deploy interface are enabled by default. If you add new bare metal nodes after the adoption is complete, the driver interface selection occurs based on the order of precedence in the configuration if you do not explicitly set it on the node creation request or as an established default in the ironic.conf file.
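For example, rather than relying on the order of precedence, you can set the interfaces explicitly when you enroll a new node. The following command is illustrative only; the driver and interface names are examples of commonly enabled values, not values taken from your configuration:

$ openstack baremetal node create --name <node_name> \
    --driver redfish \
    --boot-interface redfish-virtual-media \
    --deploy-interface direct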
Some configuration parameters, for example, network UUID values, do not need to be set on an individual node level because they are centrally configured in the ironic.conf file, where the setting controls security behavior.
It is critical that you carry the following parameters, formatted as [section] and parameter name, from the prior deployment to the new deployment. These parameters govern the underlying behavior of the service, and if they were set in the previous configuration, they used values that are specific to your environment.
- [neutron]cleaning_network
- [neutron]provisioning_network
- [neutron]rescuing_network
- [neutron]inspection_network
- [conductor]automated_clean
- [deploy]erase_devices_priority
- [deploy]erase_devices_metadata_priority
- [conductor]force_power_state_during_sync
You can set the following parameters individually on a node. However, you might choose to use embedded configuration options to avoid the need to set the parameters individually when creating or managing bare metal nodes. Check your prior ironic.conf file for these parameters, and if set, apply a specific override configuration.
- [conductor]bootloader
- [conductor]rescue_ramdisk
- [conductor]rescue_kernel
- [conductor]deploy_kernel
- [conductor]deploy_ramdisk
The instances of kernel_append_params, formerly pxe_append_params in the [pxe] and [redfish] configuration sections, are used to apply boot time options like "console" for the deployment ramdisk and as such often must be changed.
You cannot migrate hardware types that are set with the ironic.conf enabled_hardware_types parameter, or hardware type driver interfaces that start with staging-, into the adopted configuration.
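The following hypothetical customServiceConfig snippet illustrates how settings of this kind can be carried into the adopted deployment. The section and option names are standard Bare Metal Provisioning service options; the values are placeholders that you must replace with the values from your prior ironic.conf file:

customServiceConfig: |
  [conductor]
  bootloader = <bootloader_image>
  deploy_kernel = <deploy_kernel_image>
  deploy_ramdisk = <deploy_ramdisk_image>
  automated_clean = true
  [deploy]
  erase_devices_metadata_priority = 10
  [pxe]
  kernel_append_params = console=ttyS0,115200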
4.10.2. Deploying the Bare Metal Provisioning service
To deploy the Bare Metal Provisioning service (ironic), you patch an existing OpenStackControlPlane custom resource (CR) that has the Bare Metal Provisioning service disabled. The ironic-operator applies the configuration and starts the Bare Metal Provisioning services. After the services are running, the Bare Metal Provisioning service automatically begins polling the power state of the bare-metal nodes that it manages.
By default, RHOSO 18.0 and later versions of the Bare Metal Provisioning service include a new multi-tenant-aware role-based access control (RBAC) model. As a result, bare-metal nodes might be missing when you run the openstack baremetal node list command after you adopt the Bare Metal Provisioning service. Your nodes are not deleted. Due to the increased access restrictions in the RBAC model, you must identify which project owns the missing bare-metal nodes and set the owner field on each missing bare-metal node.
Prerequisites
- You have imported the service databases into the control plane database.
The Bare Metal Provisioning service is disabled in the RHOSO 18.0 deployment. The following command should return a string of false:

$ oc get openstackcontrolplanes.core.openstack.org <name> -o jsonpath='{.spec.ironic.enabled}'

Replace <name> with the name of your existing OpenStackControlPlane CR, for example, openstack-control-plane.
- The Identity service (keystone), Networking service (neutron), and Image Service (glance) are operational.
The Networking service is configured with the ML2 baremetal mechanism driver. For more information, see Adopting the Networking service.
Note
If you use the Bare Metal Provisioning service in a Bare Metal as a Service configuration, do not adopt the Compute service (nova) before you adopt the Bare Metal Provisioning service.
- The Bare Metal Provisioning service conductor services must be able to reach the Baseboard Management Controllers of the hardware that the Bare Metal Provisioning service is configured to manage. If this hardware is unreachable, the nodes might enter the "maintenance" state and be unavailable until connectivity is restored.
You have downloaded the ironic.conf file locally:

$CONTROLLER1_SSH cat /var/lib/config-data/puppet-generated/ironic/etc/ironic/ironic.conf > ironic.conf

Note
This configuration file must come from one of the Controller nodes and not from a director undercloud node. The director undercloud node operates with a different configuration that does not apply when you adopt the overcloud Ironic deployment.
-
If you are adopting the Ironic Inspector service, you need the value of the IronicInspectorSubnets director parameter. Use the same values to populate the dhcpRanges parameter in the RHOSO environment.
- You have defined the following shell variables. Replace the following example values with values that apply to your environment:
$ alias openstack="oc exec -t openstackclient -- openstack"
Procedure
Patch the
OpenStackControlPlanecustom resource (CR) to deploy the Bare Metal Provisioning service:$ oc patch openstackcontrolplane openstack --type=merge --patch ' spec: ironic: enabled: true template: rpcTransport: oslo databaseInstance: openstack ironicAPI: replicas: 1 override: service: internal: metadata: annotations: metallb.universe.tf/address-pool: internalapi metallb.universe.tf/allow-shared-ip: internalapi metallb.universe.tf/loadBalancerIPs: <loadBalancer_IP> spec: type: LoadBalancer ironicConductors: - replicas: 1 networkAttachments: - baremetal provisionNetwork: baremetal storageRequest: 10G customServiceConfig: | [neutron] cleaning_network=<cleaning network uuid> provisioning_network=<provisioning network uuid> rescuing_network=<rescuing network uuid> inspection_network=<introspection network uuid> [conductor] automated_clean=true ironicInspector: replicas: 1 inspectionNetwork: baremetal networkAttachments: - baremetal dhcpRanges: - name: inspector-0 cidr: 172.20.1.0/24 start: 172.20.1.190 end: 172.20.1.199 gateway: 172.20.1.1 serviceUser: ironic-inspector databaseAccount: ironic-inspector passwordSelectors: database: IronicInspectorDatabasePassword service: IronicInspectorPassword ironicNeutronAgent: replicas: 1 messagingBus: cluster: rabbitmq secret: osp-secret 'where:
<loadBalancer_IP>
Specifies the load balancer IP in your environment. If you use IPv6, change the value to an IPv6 load balancer IP from your environment, for example, metallb.universe.tf/loadBalancerIPs: fd00:bbbb::80.
messagingBus.Cluster
For more information about RHOSO RabbitMQ clusters, see RHOSO RabbitMQ clusters in Monitoring high availability services.
Wait for the Bare Metal Provisioning service control plane services CRs to become ready:
$ oc wait --for condition=Ready --timeout=300s ironics.ironic.openstack.org ironic

Verify that the individual services are ready:

$ oc wait --for condition=Ready --timeout=300s ironicapis.ironic.openstack.org ironic-api
$ oc wait --for condition=Ready --timeout=300s ironicconductors.ironic.openstack.org ironic-conductor
$ oc wait --for condition=Ready --timeout=300s ironicinspectors.ironic.openstack.org ironic-inspector
$ oc wait --for condition=Ready --timeout=300s ironicneutronagents.ironic.openstack.org ironic-ironic-neutron-agent

Update the DNS Nameservers on the provisioning, cleaning, and rescue networks:
Note
For name resolution to work for Bare Metal Provisioning service operations, you must set the DNS nameserver to use the internal DNS servers in the RHOSO control plane:

$ openstack subnet set --dns-nameserver 192.168.122.80 provisioning-subnet

Verify that no Bare Metal Provisioning service nodes are missing from the node list:
$ openstack baremetal node list

Important
If the openstack baremetal node list command output reports an incorrect power status, wait a few minutes and re-run the command to see if the output syncs with the actual state of the hardware being managed. The time that the Bare Metal Provisioning service requires to review and reconcile the power state of bare-metal nodes depends on the number of operating conductors, which is set through the replicas parameter, and on the number of bare-metal nodes that are present in the deployment being adopted.

If any Bare Metal Provisioning service nodes are missing from the openstack baremetal node list command, temporarily disable the new RBAC policy to see the nodes again:

$ oc patch openstackcontrolplane openstack --type=merge --patch '
spec:
  ironic:
    enabled: true
    template:
      databaseInstance: openstack
      ironicAPI:
        replicas: 1
        customServiceConfig: |
          [oslo_policy]
          enforce_scope=false
          enforce_new_defaults=false
'

After this configuration is applied, the operator restarts the Ironic API service and disables the new RBAC policy that is enabled by default.
View the bare-metal nodes that do not have an owner assigned:
$ openstack baremetal node list --long -c UUID -c Owner -c 'Provisioning State'

Assign all bare-metal nodes with no owner to a new project, for example, the admin project:

ADMIN_PROJECT_ID=$(openstack project show -c id -f value --domain default admin)
for node in $(openstack baremetal node list -f json -c UUID -c Owner | jq -r '.[] | select(.Owner == null) | .UUID'); do
  openstack baremetal node set --owner $ADMIN_PROJECT_ID $node
done

Re-apply the default RBAC by removing the customServiceConfig section or by setting the following values in the customServiceConfig section to true. For example:

$ oc patch openstackcontrolplane openstack --type=merge --patch '
spec:
  ironic:
    enabled: true
    template:
      databaseInstance: openstack
      ironicAPI:
        replicas: 1
        customServiceConfig: |
          [oslo_policy]
          enforce_scope=true
          enforce_new_defaults=true
'
Verification
Verify the list of endpoints:
$ openstack endpoint list | grep ironic

Verify the list of bare-metal nodes:

$ openstack baremetal node list

Reset the deploy images on all bare-metal nodes to use the new centrally configured images:

Note
After adoption, bare-metal nodes might still reference the old deployment’s kernel and ramdisk images in their driver_info fields. Resetting these values causes the Bare Metal Provisioning service to use the new centrally configured deploy_kernel and deploy_ramdisk values from the ironic.conf file.

for node in $(openstack baremetal node list -c UUID -f value); do
  openstack baremetal node set $node \
    --driver-info deploy_ramdisk= \
    --driver-info deploy_kernel=
done
4.11. Adopting the Compute service
To adopt the Compute service (nova), you patch an existing OpenStackControlPlane custom resource (CR) where the Compute service is disabled. The patch starts the service with the configuration parameters that are provided by the Red Hat OpenStack Platform (RHOSP) environment. The following procedure describes a single-cell setup.
Prerequisites
- You have completed the previous adoption steps.
You have defined the following shell variables. Replace the following example values with the values that are correct for your environment:
alias openstack="oc exec -t openstackclient -- openstack" DEFAULT_CELL_NAME="cell3" RENAMED_CELLS="cell1 cell2 $DEFAULT_CELL_NAME"-
The source cloud default cell takes a new $DEFAULT_CELL_NAME. In a multi-cell adoption scenario, the default cell might retain its original name, DEFAULT_CELL_NAME=default, or become renamed as a cell that is free for use. Do not use other existing cell names for DEFAULT_CELL_NAME, except for default.
- If you deployed the source cloud with a default cell, and you want to rename it during adoption, define the new name that you want to use, as shown in the following example:

DEFAULT_CELL_NAME="cell1"
RENAMED_CELLS="cell1"
Procedure
Patch the OpenStackControlPlane CR to deploy the Compute service:

Note
This procedure assumes that Compute service metadata is deployed on the top level and not on each cell level. If the RHOSP deployment has a per-cell metadata deployment, adjust the following patch as needed. You cannot run the metadata service in
cell0. To enable the metadata services of a local cell, set theenabledproperty in themetadataServiceTemplatefield of the local cell totruein theOpenStackControlPlaneCR.$ rm -f celltemplates $ for CELL in $(echo $RENAMED_CELLS); do > cat >> celltemplates << EOF > ${CELL}: > hasAPIAccess: true > cellDatabaseAccount: nova-$CELL > cellDatabaseInstance: openstack-$CELL > cellMessageBusInstance: rabbitmq-$CELL > metadataServiceTemplate: > enabled: false > override: > service: > metadata: > annotations: > metallb.universe.tf/address-pool: internalapi > metallb.universe.tf/allow-shared-ip: internalapi > metallb.universe.tf/loadBalancerIPs: 172.17.0.$(( 79 + ${CELL##*cell} )) > spec: > type: LoadBalancer > customServiceConfig: | > [workarounds] > disable_compute_service_check_for_ffu=true > conductorServiceTemplate: > customServiceConfig: | > [workarounds] > disable_compute_service_check_for_ffu=true >EOF >done $ cat > oscp-patch.yaml << EOF >spec: > nova: > enabled: true > apiOverride: > route: {} > template: > secret: osp-secret > apiDatabaseAccount: nova-api > apiServiceTemplate: > override: > service: > internal: > metadata: > annotations: > metallb.universe.tf/address-pool: internalapi > metallb.universe.tf/allow-shared-ip: internalapi > metallb.universe.tf/loadBalancerIPs: <172.17.0.80> > spec: > type: LoadBalancer > customServiceConfig: | > [workarounds] > disable_compute_service_check_for_ffu=true > metadataServiceTemplate: > enabled: true > override: > service: > metadata: > annotations: > metallb.universe.tf/address-pool: internalapi > metallb.universe.tf/allow-shared-ip: internalapi > metallb.universe.tf/loadBalancerIPs: <172.17.0.80> > spec: > type: LoadBalancer > customServiceConfig: | > [workarounds] > disable_compute_service_check_for_ffu=true > schedulerServiceTemplate: > customServiceConfig: | > [workarounds] > disable_compute_service_check_for_ffu=true > cellTemplates: > cell0: > hasAPIAccess: true > cellDatabaseAccount: nova-cell0 > cellDatabaseInstance: openstack > cellMessageBusInstance: rabbitmq > conductorServiceTemplate: > customServiceConfig: | > [workarounds] > disable_compute_service_check_for_ffu=true >EOF $ cat celltemplates >> oscp-patch.yaml $ oc patch openstackcontrolplane openstack --type=merge --patch-file=oscp-patch.yaml-
- ${CELL}.hasAPIAccess specifies upcall access to the API. In the source cloud, cells are always configured with the main Nova API database upcall access. You can disable upcall access to the API by setting hasAPIAccess to false. However, do not make changes to the API during adoption.
- ${CELL}.cellDatabaseInstance specifies the database instance that is used by the cell. The database instance names must match the names that are defined in the OpenStackControlPlane CR that you created when you deployed the back-end services, as described in Deploying back-end services.
- ${CELL}.cellMessageBusInstance specifies the message bus instance that is used by the cell. The message bus instance names must match the names that are defined in the OpenStackControlPlane CR.
- metallb.universe.tf/loadBalancerIPs: <172.17.0.80> specifies the load balancer IP in your environment. If you use IPv6, change the value to an IPv6 load balancer IP from your environment, for example, metallb.universe.tf/loadBalancerIPs: fd00:bbbb::80.
-
If you are adopting the Compute service with the Bare Metal Provisioning service (ironic), append the novaComputeTemplates field with the following content in each cell in the Compute service CR patch. For example:

cell1:
  novaComputeTemplates:
    standalone:
      customServiceConfig: |
        [DEFAULT]
        host = <hostname>
        [workarounds]
        disable_compute_service_check_for_ffu=true
      computeDriver: ironic.IronicDriver
...
Replace
<hostname> with the hostname of the node that is running the ironic Compute driver in the source cloud.
-
Replace
Wait for the CRs for the Compute control plane services to be ready:
$ oc wait --for condition=Ready --timeout=300s Nova/nova

Note
The local Conductor services are started for each cell, while the superconductor runs in cell0. Note that disable_compute_service_check_for_ffu is mandatory for all imported Compute services until the external data plane is imported, and until the Compute services are fast-forward upgraded. For more information, see Adopting Compute services to the RHOSO data plane and Performing a fast-forward upgrade on Compute services.
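As an illustrative check, and not a required step, you can confirm that a conductor pod is running for cell0 and for each renamed cell. The exact pod names depend on your cell names:

$ oc get pods -n openstack | grep conductor
nova-cell0-conductor-0   1/1   Running   0   ...
nova-cell1-conductor-0   1/1   Running   0   ...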
Verification
Check that Compute service endpoints are defined and pointing to the control plane FQDNs, and that the Nova API responds:
$ openstack endpoint list | grep nova
$ openstack server list

- Compare the outputs with the topology-specific configuration in Retrieving topology-specific service configuration.
Query the superconductor to check that the expected cells exist, and compare it to its pre-adoption values:
for CELL in $(echo $CELLS); do
  set +u
  . ~/.source_cloud_exported_variables_$CELL
  set -u
  RCELL=$CELL
  [ "$CELL" = "default" ] && RCELL=$DEFAULT_CELL_NAME
  echo "comparing $CELL to $RCELL"
  echo $PULL_OPENSTACK_CONFIGURATION_NOVAMANAGE_CELL_MAPPINGS | grep -F "| $CELL |"
  oc rsh nova-cell0-conductor-0 nova-manage cell_v2 list_cells | grep -F "| $RCELL |"
done
-
The
cellXnovadatabase and username becomenova_cellX. -
The
defaultcell is renamed toDEFAULT_CELL_NAME. Thedefaultcell might retain the original name if there are multiple cells. -
The RabbitMQ transport URL no longer uses
guest.
-
The
At this point, the Compute service control plane services do not control the existing Compute service workloads. The control plane manages the data plane only after the data adoption process is completed. For more information, see Adopting Compute services to the RHOSO data plane.
To import external Compute services to the RHOSO data plane, you must upgrade them first. For more information, see Adopting Compute services to the RHOSO data plane, and Performing a fast-forward upgrade on Compute services.
4.12. Adopting the Block Storage service
To adopt a director-deployed Block Storage service (cinder), create the manifest based on the existing cinder.conf file, deploy the Block Storage service, and validate the new deployment.
Prerequisites
- You have reviewed the Block Storage service limitations. For more information, see Limitations for adopting the Block Storage service.
- You have planned the placement of the Block Storage services.
- You have prepared the Red Hat OpenShift Container Platform (RHOCP) nodes where the volume and backup services run. For more information, see RHOCP preparation for Block Storage service adoption.
- The Block Storage service (cinder) is stopped.
- The service databases are imported into the control plane MariaDB.
- The Identity service (keystone) is adopted.
- If your Red Hat OpenStack Platform 17.1 deployment included the Key Manager service (barbican), the Key Manager service is adopted.
- The Storage network is correctly configured on the RHOCP cluster.
The contents of the cinder.conf file. Download the file so that you can access it locally:

$CONTROLLER1_SSH sudo cat /var/lib/config-data/puppet-generated/cinder/etc/cinder/cinder.conf > cinder.conf
Procedure
Create a new file, for example,
cinder_api.patch, and apply the configuration:

$ oc patch openstackcontrolplane openstack --type=merge --patch-file=<patch_name>

Replace <patch_name> with the name of your patch file.

The following example shows a
cinder_api.patchfile:spec: extraMounts: - extraVol: - extraVolType: Ceph mounts: - mountPath: /etc/ceph name: ceph readOnly: true propagation: - CinderVolume - CinderBackup - Glance volumes: - name: ceph projected: sources: - secret: name: ceph-conf-files cinder: enabled: true apiOverride: route: {} template: databaseInstance: openstack databaseAccount: cinder secret: osp-secret cinderAPI: override: service: internal: metadata: annotations: metallb.universe.tf/address-pool: internalapi metallb.universe.tf/allow-shared-ip: internalapi metallb.universe.tf/loadBalancerIPs: <172.17.0.80> spec: type: LoadBalancer replicas: 1 customServiceConfig: | [DEFAULT] default_volume_type=tripleo cinderScheduler: replicas: 0 cinderBackup: networkAttachments: - storage replicas: 0 cinderVolumes: ceph: networkAttachments: - storage replicas: 0where:
- <172.17.0.80>
Specifies the load balancer IP in your environment. If you use IPv6, change the value to an IPv6 load balancer IP from your environment, for example, metallb.universe.tf/loadBalancerIPs: fd00:bbbb::80.
Retrieve the list of the previous scheduler and backup services:
$ openstack volume service list
+------------------+------------------------+------+---------+-------+----------------------------+
| Binary           | Host                   | Zone | Status  | State | Updated At                 |
+------------------+------------------------+------+---------+-------+----------------------------+
| cinder-scheduler | standalone.localdomain | nova | enabled | down  | 2024-11-04T17:47:14.000000 |
| cinder-backup    | standalone.localdomain | nova | enabled | down  | 2024-11-04T17:47:14.000000 |
| cinder-volume    | hostgroup@tripleo_ceph | nova | enabled | down  | 2024-11-04T17:47:14.000000 |
+------------------+------------------------+------+---------+-------+----------------------------+

Remove services for hosts that are in the down state:

$ oc exec -t cinder-api-0 -c cinder-api -- cinder-manage service remove <service_binary> <service_host>
Replace
<service_binary> with the name of the binary, for example, cinder-backup.
Replace
<service_host> with the host name, for example, cinder-backup-0.
-
Replace
Deploy the scheduler, backup, and volume services:
Create another file, for example,
cinder_services.patch, and apply the configuration:$ oc patch openstackcontrolplane openstack --type=merge --patch-file=<patch_name>-
Replace
<patch_name>with the name of your patch file. The following example shows a
cinder_services.patchfile for a Ceph RBD deployment:spec: cinder: enabled: true template: cinderScheduler: replicas: 1 cinderBackup: networkAttachments: - storage replicas: 1 customServiceConfig: | [DEFAULT] backup_driver=cinder.backup.drivers.ceph.CephBackupDriver backup_ceph_conf=/etc/ceph/ceph.conf backup_ceph_user=openstack backup_ceph_pool=backups cinderVolumes: ceph: networkAttachments: - storage replicas: 1 customServiceConfig: | [tripleo_ceph] backend_host=hostgroup volume_backend_name=tripleo_ceph volume_driver=cinder.volume.drivers.rbd.RBDDriver rbd_ceph_conf=/etc/ceph/ceph.conf rbd_user=openstack rbd_pool=volumes rbd_flatten_volume_from_snapshot=False report_discard_supported=TrueNoteEnsure that you use the same configuration group name for the driver that you used in the source cluster. In this example, the driver configuration group in
customServiceConfig is called tripleo_ceph because it reflects the value of the configuration group name in the cinder.conf file of the source OpenStack cluster.
Configure the NetApp NFS Block Storage volume service:
Create a secret that includes sensitive information such as hostnames, passwords, and usernames to access the third-party NetApp NFS storage. You can find the credentials in the
cinder.conffile that was generated from the director deployment:$ oc apply -f - <<EOF apiVersion: v1 kind: Secret metadata: labels: service: cinder component: cinder-volume name: cinder-volume-ontap-secrets type: Opaque stringData: ontap-cinder-secrets: | [tripleo_netapp] netapp_login= netapp_username netapp_password= netapp_password netapp_vserver= netapp_vserver nas_host= netapp_nfsip nas_share_path=/netapp_nfspath netapp_pool_name_search_pattern=(netapp_poolpattern) EOFPatch the
OpenStackControlPlane CR to deploy the NetApp NFS Block Storage volume back end:

$ oc patch openstackcontrolplane openstack --type=merge --patch-file=<cinder_netappNFS.patch>

Replace <cinder_netappNFS.patch> with the name of the patch file for your NetApp NFS Block Storage volume back end.

The following example shows a
cinder_netappNFS.patchfile that configures a NetApp NFS Block Storage volume service:spec: cinder: enabled: true template: cinderVolumes: ontap-nfs: networkAttachments: - storage customServiceConfig: | [tripleo_netapp] volume_backend_name=ontap-nfs volume_driver=cinder.volume.drivers.netapp.common.NetAppDriver nfs_snapshot_support=true nas_secure_file_operations=false nas_secure_file_permissions=false netapp_server_hostname= netapp_backendip netapp_server_port=80 netapp_storage_protocol=nfs netapp_storage_family=ontap_cluster customServiceConfigSecrets: - cinder-volume-ontap-secrets
Configure the NetApp iSCSI Block Storage volume service:
Create a secret that includes sensitive information such as hostnames, passwords, and usernames to access the third-party NetApp iSCSI storage. You can find the credentials in the
cinder.conffile that was generated from the director deployment:$ oc apply -f - <<EOF apiVersion: v1 kind: Secret metadata: labels: service: cinder component: cinder-volume name: cinder-volume-ontap-secrets type: Opaque stringData: ontap-cinder-secrets: | [tripleo_netapp] netapp_server_hostname = netapp_host netapp_login = netapp_username netapp_password = netapp_password netapp_vserver = netapp_vserver netapp_pool_name_search_pattern=(netapp_poolpattern) EOF
Patch the
OpenStackControlPlane custom resource (CR) to deploy the NetApp iSCSI Block Storage volume back end:

$ oc patch openstackcontrolplane openstack --type=merge --patch-file=<cinder_netappISCSI.patch>

Replace <cinder_netappISCSI.patch> with the name of the patch file for your NetApp iSCSI Block Storage volume back end.

The following example shows a
cinder_netappISCSI.patchfile that configures a NetApp iSCSI Block Storage volume service:spec: cinder: enabled: true template: cinderVolumes: ontap-iscsi: networkAttachments: - storage customServiceConfig: | [tripleo_netapp] volume_backend_name=ontap-iscsi volume_driver=cinder.volume.drivers.netapp.common.NetAppDriver netapp_storage_protocol=iscsi netapp_storage_family=ontap_cluster consistencygroup_support=True customServiceConfigSecrets: - cinder-volume-ontap-secrets
Check if all the services are up and running:
$ openstack volume service list
+------------------+--------------------------+------+---------+-------+----------------------------+
| Binary           | Host                     | Zone | Status  | State | Updated At                 |
+------------------+--------------------------+------+---------+-------+----------------------------+
| cinder-volume    | hostgroup@tripleo_netapp | nova | enabled | up    | 2023-06-28T17:00:03.000000 |
| cinder-scheduler | cinder-scheduler-0       | nova | enabled | up    | 2023-06-28T17:00:02.000000 |
| cinder-backup    | cinder-backup-0          | nova | enabled | up    | 2023-06-28T17:00:01.000000 |
+------------------+--------------------------+------+---------+-------+----------------------------+

Apply the DB data migrations:

Note
You are not required to run the data migrations at this step, but you must run them before the next upgrade. However, for adoption, you can run the migrations now to ensure that there are no issues before you run production workloads on the deployment.
$ oc exec -it cinder-scheduler-0 -- cinder-manage db online_data_migrations
Verification
Ensure that the openstack alias is defined:

$ alias openstack="oc exec -t openstackclient -- openstack"

Confirm that Block Storage service endpoints are defined and pointing to the control plane FQDNs:
$ openstack endpoint list --service <endpoint>
Replace <endpoint> with the name of the endpoint that you want to confirm.
Confirm that the Block Storage services are running:
$ openstack volume service listNoteCinder API services do not appear in the list. However, if you get a response from the
openstack volume service listcommand, that means at least one of the cinder API services is running.Confirm that you have your previous volume types, volumes, snapshots, and backups:
$ openstack volume type list $ openstack volume list $ openstack volume snapshot list $ openstack volume backup listTo confirm that the configuration is working, perform the following steps:
Create a volume from an image to check that the connection to Image Service (glance) is working:
$ openstack volume create --image cirros --bootable --size 1 disk_newBack up the previous attached volume:
$ openstack --os-volume-api-version 3.47 volume create --backup <backup_name>Replace
<backup_name> with the name of your new backup location.
Note
Do not boot a Compute service (nova) instance by using the new volume that you created from the image, and do not try to detach the previous volume, because the Compute service and the Block Storage service are not yet connected.
4.13. Adopting the Block Storage service with multiple Red Hat Ceph Storage back ends (DCN)
Adopt the Block Storage service (cinder) in a Distributed Compute Node (DCN) deployment where multiple Red Hat Ceph Storage clusters provide storage at different sites. You can deploy multiple CinderVolume instances, one for each availability zone, with each volume service configured to use its local Red Hat Ceph Storage cluster.
During adoption, the Block Storage service volume services that ran on edge site Compute nodes are migrated to run on Red Hat OpenShift Container Platform (RHOCP) at the central site. Although the control path for API requests now traverses the WAN to reach the Block Storage service running on Red Hat OpenShift Container Platform (RHOCP), the data path remains local. Volume data continues to be stored in the Red Hat Ceph Storage cluster at each edge site. When you create a volume or clone a volume from a snapshot, the operation occurs entirely within the local Red Hat Ceph Storage cluster. This preserves data locality.
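For example, after you complete the adoption in this section, a volume that is created in an edge availability zone is served by the volume back end at that site. The following commands are a minimal illustration; the az-dcn1 zone name matches the examples used later in this section, and dcn1-local-test is an arbitrary volume name:
$ openstack volume create --availability-zone az-dcn1 --size 1 dcn1-local-test
$ openstack volume show dcn1-local-test -c availability_zone -c os-vol-host-attr:host -f value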
Prerequisites
- You have completed the previous adoption steps.
-
The per-site Red Hat Ceph Storage secrets (
ceph-conf-central,ceph-conf-dcn1,ceph-conf-dcn2) exist and contain the configuration and keyrings for each site’s Red Hat Ceph Storage cluster. For more information, see Configuring a Red Hat Ceph Storage back end. -
The
extraMountsproperty of theOpenStackControlPlanecustom resource (CR) is configured to mount the Red Hat Ceph Storage configuration to all Block Storage service instances. -
You have stopped the Block Storage service on all DCN nodes. For more information, see Stopping Red Hat OpenStack Platform services. On edge sites, the Block Storage service volume service runs on Compute nodes with the service name
tripleo_cinder_volume.service.
Procedure
Retrieve the
fsidfor each Red Hat Ceph Storage cluster in your DCN deployment. Thefsidis used as therbd_secret_uuidfor libvirt integration:$ oc get secret ceph-conf-central -o json | jq -r '.data | to_entries[] | select(.key | endswith(".conf")) | "\(.key): \(.value | @base64d)"' | grep fsidCreate a patch file for the Block Storage service with multiple Red Hat Ceph Storage back ends. The following example shows a DCN deployment with a central site and two edge sites:
$ cat << EOF > cinder_dcn_patch.yaml spec: cinder: enabled: true template: cinderAPI: customServiceConfig: | [DEFAULT] default_availability_zone = az-central cinderScheduler: replicas: 1 cinderVolumes: central: networkAttachments: - storage replicas: 1 customServiceConfig: | [DEFAULT] enabled_backends = central glance_api_servers = http://glance-central-internal.openstack.svc:9292 [central] backend_host = hostgroup volume_backend_name = central volume_driver = cinder.volume.drivers.rbd.RBDDriver rbd_ceph_conf = /etc/ceph/central.conf rbd_user = openstack rbd_pool = volumes rbd_flatten_volume_from_snapshot = False report_discard_supported = True rbd_secret_uuid = <central_fsid> rbd_cluster_name = central backend_availability_zone = az-central dcn1: networkAttachments: - storage replicas: 1 customServiceConfig: | [DEFAULT] enabled_backends = dcn1 glance_api_servers = http://glance-dcn1-internal.openstack.svc:9292 [dcn1] backend_host = hostgroup volume_backend_name = dcn1 volume_driver = cinder.volume.drivers.rbd.RBDDriver rbd_ceph_conf = /etc/ceph/dcn1.conf rbd_user = openstack rbd_pool = volumes rbd_flatten_volume_from_snapshot = False report_discard_supported = True rbd_secret_uuid = <dcn1_fsid> rbd_cluster_name = dcn1 backend_availability_zone = az-dcn1 dcn2: networkAttachments: - storage replicas: 1 customServiceConfig: | [DEFAULT] enabled_backends = dcn2 glance_api_servers = http://glance-dcn2-internal.openstack.svc:9292 [dcn2] backend_host = hostgroup volume_backend_name = dcn2 volume_driver = cinder.volume.drivers.rbd.RBDDriver rbd_ceph_conf = /etc/ceph/dcn2.conf rbd_user = openstack rbd_pool = volumes rbd_flatten_volume_from_snapshot = False report_discard_supported = True rbd_secret_uuid = <dcn2_fsid> rbd_cluster_name = dcn2 backend_availability_zone = az-dcn2 EOFwhere:
<central_fsid>-
Specifies the
fsidof the central Red Hat Ceph Storage cluster, used as the libvirt secret UUID. <dcn1_fsid>-
Specifies the
fsidof the DCN1 edge Red Hat Ceph Storage cluster. <dcn2_fsid>-
Specifies the
fsidof the DCN2 edge Red Hat Ceph Storage cluster.
Note-
You must configure each
CinderVolumewith thebackend_availability_zonevalue that matches your Compute service availability zone for that site, becausecross_az_attach = Falseis set in the Compute service configuration. If the names do not match, instances cannot attach volumes. Replace the examples (az-central,az-dcn1,az-dcn2) with the names used in your Red Hat OpenStack Platform deployment. -
Each
CinderVolumepoints to its local Image service API endpoint throughglance_api_servers. This ensures that volume creation from images uses the local Image service and Red Hat Ceph Storage cluster. The examples usehttp://for the Image service endpoints. If your Red Hat OpenStack Platform deployment uses TLS for internal endpoints, usehttps://instead, and ensure that you have completed the TLS migration. For more information, see Migrating TLS-e to the RHOSO deployment. -
The
rbd_cluster_namesetting identifies which Red Hat Ceph Storage cluster configuration to use from the mounted secrets. - Adjust the number of edge sites and their names to match your DCN deployment.
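Before you apply the patch, you can confirm that the availability zone names and the per-site Red Hat Ceph Storage configuration file names that you reference exist. The following commands are example checks; they assume the openstack alias used elsewhere in this guide, the per-site secret names from the prerequisites, and that jq is available:
$ openstack availability zone list --compute
$ for site in central dcn1 dcn2; do oc get secret ceph-conf-$site -o json | jq -r '.data | keys[]'; done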
Patch the
OpenStackControlPlaneCR to deploy the Block Storage service with multiple Red Hat Ceph Storage back ends:$ oc patch openstackcontrolplane openstack --type=merge --patch-file cinder_dcn_patch.yamlConfigure the Block Storage service backup service. In this example, the backup service runs at the central site and uses the central Red Hat Ceph Storage cluster. Add the
cinderBackupssection to your patch file and re-apply it:$ cat << EOF >> cinder_dcn_patch.yaml cinderBackups: central: networkAttachments: - storage replicas: 1 customServiceConfig: | [DEFAULT] backup_driver=cinder.backup.drivers.ceph.CephBackupDriver backup_ceph_conf=/etc/ceph/central.conf backup_ceph_user=openstack backup_ceph_pool=backups storage_availability_zone=az-central EOF $ oc patch openstackcontrolplane openstack --type=merge --patch-file cinder_dcn_patch.yamlNoteUnlike a single-site Red Hat Ceph Storage deployment where the backup config references
/etc/ceph/ceph.conf, in a DCN deployment the Red Hat Ceph Storage configuration files in theceph-conf-filessecret are named by cluster. Setbackup_ceph_confto the path of the Red Hat Ceph Storage configuration file for whichever cluster hosts yourbackupspool. In this example the file is namedcentral.conf, so the path is/etc/ceph/central.conf. Using a path that does not match a file in the secret will cause the backup service to fail with aconf_read_fileerror.Set
storage_availability_zoneto match the availability zone of the volumes you want to back up. The backup scheduler uses this to route backup requests to a service in the correct zone. If the backup service zone does not match the volume zone, backup creation fails withService not found for creating backup.Verify that the Block Storage service volume services are running for each availability zone:
$ openstack volume service list --service cinder-volume +------------------+---------------------+------------+---------+-------+----------------------------+ | Binary | Host | Zone | Status | State | Updated At | +------------------+---------------------+------------+---------+-------+----------------------------+ | cinder-volume | hostgroup@central | az-central | enabled | up | 2024-01-01T00:00:00.000000 | | cinder-volume | hostgroup@dcn1 | az-dcn1 | enabled | up | 2024-01-01T00:00:00.000000 | | cinder-volume | hostgroup@dcn2 | az-dcn2 | enabled | up | 2024-01-01T00:00:00.000000 | +------------------+---------------------+------------+---------+-------+----------------------------+Verify that the Block Storage service backup service is running and in the correct availability zone:
$ openstack volume service list --service cinder-backup +---------------+-------------------------+------------+---------+-------+----------------------------+ | Binary | Host | Zone | Status | State | Updated At | +---------------+-------------------------+------------+---------+-------+----------------------------+ | cinder-backup | cinder-backup-central-0 | az-central | enabled | up | 2024-01-01T00:00:00.000000 | +---------------+-------------------------+------------+---------+-------+----------------------------+Test the backup service by creating a volume, backing it up, and restoring the backup:
$ openstack volume create --size 1 backup-test-vol $ openstack volume backup create --name backup-test-backup backup-test-vol $ openstack volume backup show backup-test-backup +-----------------------+--------------------------------------+ | Field | Value | +-----------------------+--------------------------------------+ | container | backups | | fail_reason | None | | name | backup-test-backup | | size | 1 | | status | available | +-----------------------+--------------------------------------+ $ openstack volume backup restore backup-test-backup backup-test-restoreNoteSome versions of the Red Hat OpenShift Container Platform (RHOCP) client display a
cannot unpack non-iterable VolumeBackupsRestore object error after the restore command. This is a known issue in the client; the restore operation itself might have succeeded. Verify by checking the restored volume status directly:
$ openstack volume show backup-test-restore -c status -c availability_zone -c os-vol-host-attr:host -f value
available
az-central
hostgroup@central#central
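When the restore check succeeds, you can optionally delete the test resources. The names match the test volume, backup, and restored volume that you created in the previous step:
$ openstack volume backup delete backup-test-backup
$ openstack volume delete backup-test-restore backup-test-vol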
4.14. Adopting the Dashboard service
To adopt the Dashboard service (horizon), you patch an existing OpenStackControlPlane custom resource (CR) that has the Dashboard service disabled. The patch starts the service with the configuration parameters that are provided by the Red Hat OpenStack Platform environment.
Prerequisites
- You adopted Memcached. For more information, see Deploying back-end services.
- You adopted the Identity service (keystone). For more information, see Adopting the Identity service.
Procedure
Patch the
OpenStackControlPlaneCR to deploy the Dashboard service:$ oc patch openstackcontrolplane openstack --type=merge --patch ' spec: horizon: enabled: true apiOverride: route: {} template: memcachedInstance: memcached secret: osp-secret '
Verification
Verify that the Dashboard service instance is successfully deployed and ready:
$ oc get horizonConfirm that the Dashboard service is reachable and returns a
200status code:PUBLIC_URL=$(oc get horizon horizon -o jsonpath='{.status.endpoint}') curl --silent --output /dev/stderr --head --write-out "%{http_code}" "$PUBLIC_URL/dashboard/auth/login/?next=/dashboard/" -k | grep 200
4.16. Adopting the Orchestration service
To adopt the Orchestration service (heat), you patch an existing OpenStackControlPlane custom resource (CR), where the Orchestration service is disabled. The patch starts the service with the configuration parameters that are provided by the Red Hat OpenStack Platform (RHOSP) environment.
After you complete the adoption process, you have CRs for Heat, HeatAPI, HeatEngine, and HeatCFNAPI, and endpoints within the Identity service (keystone) to facilitate these services.
Prerequisites
- The source director environment is running.
- The target Red Hat OpenShift Container Platform (RHOCP) environment is running.
- You adopted MariaDB and the Identity service.
- If your existing Orchestration service stacks contain resources from other services, such as the Networking service (neutron), Compute service (nova), Object Storage service (swift), and so on, adopt those services before you adopt the Orchestration service.
Procedure
Retrieve the existing
auth_encryption_keyandservicepasswords. You use these passwords to patch theosp-secret. In the following example, theauth_encryption_keyis used asHeatAuthEncryptionKeyand theservicepassword is used asHeatPassword:[stack@rhosp17 ~]$ grep -E 'HeatPassword|HeatAuth|HeatStackDomainAdmin' ~/overcloud-deploy/overcloud/overcloud-passwords.yaml HeatAuthEncryptionKey: Q60Hj8PqbrDNu2dDCbyIQE2dibpQUPg2 HeatPassword: dU2N0Vr2bdelYH7eQonAwPfI3 HeatStackDomainAdminPassword: dU2N0Vr2bdelYH7eQonAwPfI3Log in to a Controller node and verify the
auth_encryption_keyvalue in use:[stack@rhosp17 ~]$ ansible -i overcloud-deploy/overcloud/config-download/overcloud/tripleo-ansible-inventory.yaml overcloud-controller-0 -m shell -a "grep auth_encryption_key /var/lib/config-data/puppet-generated/heat/etc/heat/heat.conf | grep -Ev '^#|^$'" -b overcloud-controller-0 | CHANGED | rc=0 >> auth_encryption_key=Q60Hj8PqbrDNu2dDCbyIQE2dibpQUPg2Encode the password to Base64 format:
$ echo Q60Hj8PqbrDNu2dDCbyIQE2dibpQUPg2 | base64 UTYwSGo4UHFickROdTJkRENieUlRRTJkaWJwUVVQZzIKPatch the
osp-secretto update theHeatAuthEncryptionKeyandHeatPasswordparameters. These values must match the values in the director Orchestration service configuration:$ oc patch secret osp-secret --type='json' -p='[{"op" : "replace" ,"path" : "/data/HeatAuthEncryptionKey" ,"value" : "UTYwSGo4UHFickROdTJkRENieUlRRTJkaWJwUVVQZzIK"}]' secret/osp-secret patchedPatch the
OpenStackControlPlaneCR to deploy the Orchestration service:$ oc patch openstackcontrolplane openstack --type=merge --patch ' spec: heat: enabled: true apiOverride: route: {} template: databaseInstance: openstack databaseAccount: heat secret: osp-secret memcachedInstance: memcached passwordSelectors: authEncryptionKey: HeatAuthEncryptionKey service: HeatPassword stackDomainAdminPassword: HeatStackDomainAdminPassword '
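Optionally, before you run the verification steps, wait for the Orchestration service CR to report Ready. This follows the same oc wait pattern that is used for other services in this guide and assumes the default CR name heat:
$ oc wait --for condition=Ready --timeout=300s heat.heat.openstack.org/heat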
Verification
Ensure that the statuses of all the CRs are
Setup complete:$ oc get Heat,HeatAPI,HeatEngine,HeatCFNAPI NAME STATUS MESSAGE heat.heat.openstack.org/heat True Setup complete NAME STATUS MESSAGE heatapi.heat.openstack.org/heat-api True Setup complete NAME STATUS MESSAGE heatengine.heat.openstack.org/heat-engine True Setup complete NAME STATUS MESSAGE heatcfnapi.heat.openstack.org/heat-cfnapi True Setup completeCheck that the Orchestration service is registered in the Identity service:
$ oc exec -it openstackclient -- openstack service list -c Name -c Type +------------+----------------+ | Name | Type | +------------+----------------+ | heat | orchestration | | glance | image | | heat-cfn | cloudformation | | ceilometer | Ceilometer | | keystone | identity | | placement | placement | | cinderv3 | volumev3 | | nova | compute | | neutron | network | +------------+----------------+$ oc exec -it openstackclient -- openstack endpoint list --service=heat -f yaml - Enabled: true ID: 1da7df5b25b94d1cae85e3ad736b25a5 Interface: public Region: regionOne Service Name: heat Service Type: orchestration URL: http://heat-api-public-openstack-operators.apps.okd.bne-shift.net/v1/%(tenant_id)s - Enabled: true ID: 414dd03d8e9d462988113ea0e3a330b0 Interface: internal Region: regionOne Service Name: heat Service Type: orchestration URL: http://heat-api-internal.openstack-operators.svc:8004/v1/%(tenant_id)sCheck that the Orchestration service engine services are running:
$ oc exec -it openstackclient -- openstack orchestration service list -f yaml - Binary: heat-engine Engine ID: b16ad899-815a-4b0c-9f2e-e6d9c74aa200 Host: heat-engine-6d47856868-p7pzz Hostname: heat-engine-6d47856868-p7pzz Status: up Topic: engine Updated At: '2023-10-11T21:48:01.000000' - Binary: heat-engine Engine ID: 887ed392-0799-4310-b95c-ac2d3e6f965f Host: heat-engine-6d47856868-p7pzz Hostname: heat-engine-6d47856868-p7pzz Status: up Topic: engine Updated At: '2023-10-11T21:48:00.000000' - Binary: heat-engine Engine ID: 26ed9668-b3f2-48aa-92e8-2862252485ea Host: heat-engine-6d47856868-p7pzz Hostname: heat-engine-6d47856868-p7pzz Status: up Topic: engine Updated At: '2023-10-11T21:48:00.000000' - Binary: heat-engine Engine ID: 1011943b-9fea-4f53-b543-d841297245fd Host: heat-engine-6d47856868-p7pzz Hostname: heat-engine-6d47856868-p7pzz Status: up Topic: engine Updated At: '2023-10-11T21:48:01.000000'Verify that you can see your Orchestration service stacks:
$ openstack stack list -f yaml - Creation Time: '2023-10-11T22:03:20Z' ID: 20f95925-7443-49cb-9561-a1ab736749ba Project: 4eacd0d1cab04427bc315805c28e66c9 Stack Name: test-networks Stack Status: CREATE_COMPLETE Updated Time: null
4.17. Adopting the Load-balancing service
To adopt the Load-balancing service (octavia), you patch an existing OpenStackControlPlane custom resource (CR) where the Load-balancing service is disabled. The patch starts the service with the configuration parameters that are provided by the Red Hat OpenStack Platform (RHOSP) environment. After completing the data plane adoption, you must trigger a failover of existing load balancers to upgrade their amphora virtual machines to use the new image and to establish connectivity with the new control plane.
Procedure
Migrate the server certificate authority (CA) passphrase from the previous deployment:
SERVER_CA_PASSPHRASE=$($CONTROLLER1_SSH grep ^ca_private_key_passphrase /var/lib/config-data/puppet-generated/octavia/etc/octavia/octavia.conf) oc apply -f - <<EOF apiVersion: v1 kind: Secret metadata: name: octavia-ca-passphrase type: Opaque data: server-ca-passphrase: $(echo -n $SERVER_CA_PASSPHRASE | base64 -w0) EOFTo isolate the management network, add the network interface for the VLAN base interface:
$ oc get --no-headers nncp | cut -f 1 -d ' ' | grep -v nncp-dns | while read; do interfaces=$(oc get nncp $REPLY -o jsonpath="{.spec.desiredState.interfaces[*].name}") (echo $interfaces | grep -w -q "octbr\|enp6s0.24") || \ oc patch nncp $REPLY --type json --patch ' [{ "op": "add", "path": "/spec/desiredState/interfaces/-", "value": { "description": "Octavia VLAN host interface", "name": "enp6s0.24", "state": "up", "type": "vlan", "vlan": { "base-iface": "<enp6s0>", "id": 24 } } }, { "op": "add", "path": "/spec/desiredState/interfaces/-", "value": { "description": "Octavia Bridge", "mtu": <mtu>, "state": "up", "type": "linux-bridge", "name": "octbr", "bridge": { "options": { "stp": { "enabled": "false" } }, "port": [ { "name": "enp6s0.24" } ] } } }]' donewhere:
- <enp6s0>
- Specifies the name of the network interface in your RHOCP setup.
- <mtu>
-
Specifies the
mtuvalue in your environment.
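After you apply the patch, you can check that the updated node network configuration policies were applied on every node. This is a quick check that uses the NMState short names for the policy and enactment resources:
$ oc get nncp
$ oc get nnce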
To connect the pods that manage the load balancer virtual machines (amphorae) with the Open vSwitch pods that the OVN operator manages, configure the Load-balancing service network attachment definition:
$ cat > octavia-nad.yaml << EOF_CAT apiVersion: k8s.cni.cncf.io/v1 kind: NetworkAttachmentDefinition metadata: labels: osp/net: octavia name: octavia spec: config: | { "cniVersion": "0.3.1", "name": "octavia", "type": "bridge", "bridge": "octbr", "ipam": { "type": "whereabouts", "range": "172.23.0.0/24", "range_start": "172.23.0.30", "range_end": "172.23.0.70", "routes": [ { "dst": "172.24.0.0/16", "gw" : "172.23.0.150" } ] } } EOF_CAT
Create the
NetworkAttachmentDefinitionCR:$ oc apply -f octavia-nad.yamlEnable the Load-balancing service in RHOCP:
$ oc patch openstackcontrolplane openstack --type=merge --patch ' spec: ovn: template: ovnController: networkAttachment: tenant nicMappings: octavia: octbr octavia: enabled: true template: amphoraImageContainerImage: quay.io/gthiemonge/octavia-amphora-image octaviaHousekeeping: networkAttachments: - octavia octaviaHealthManager: networkAttachments: - octavia octaviaWorker: networkAttachments: - octavia 'Wait for the Load-balancing service control plane services CRs to be ready:
$ oc wait --for condition=Ready --timeout=600s octavia.octavia.openstack.org/octaviaEnsure that the Load-balancing service is registered in the Identity service:
$ alias openstack="oc exec -t openstackclient -- openstack" $ openstack service list | grep load-balancer | bd078ca6f90c4b86a48801f45eb6f0d7 | octavia | load-balancer | $ openstack endpoint list --service load-balancer +----------------------------------+-----------+--------------+---------------+---------+-----------+---------------------------------------------------+ | ID | Region | Service Name | Service Type | Enabled | Interface | URL | +----------------------------------+-----------+--------------+---------------+---------+-----------+---------------------------------------------------+ | f1ae7756b6164baf9cb82a1a670067a2 | regionOne | octavia | load-balancer | True | public | https://octavia-public-openstack.apps-crc.testing | | ff3222b4621843669e89843395213049 | regionOne | octavia | load-balancer | True | internal | http://octavia-internal.openstack.svc:9876 | +----------------------------------+-----------+--------------+---------------+---------+-----------+---------------------------------------------------+
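Optionally, you can also confirm that the Load-balancing service pods are running. The service=octavia label in this example follows the same labeling convention as the other service pods in this guide:
$ oc get pods -l service=octavia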
Next steps
After you complete the data plane adoption, you must upgrade existing load balancers and remove old resources. For more information, see Post-adoption tasks for the Load-balancing service.
4.18. Adopting Telemetry services
To adopt Telemetry services, you patch an existing OpenStackControlPlane custom resource (CR) that has Telemetry services disabled to start the service with the configuration parameters that are provided by the Red Hat OpenStack Platform (RHOSP) 17.1 environment.
If you adopt Telemetry services, the observability solution that is used in the RHOSP 17.1 environment, Service Telemetry Framework, is removed from the cluster. The new solution is deployed in the Red Hat OpenStack Services on OpenShift (RHOSO) environment, allowing for metrics, and optionally logs, to be retrieved and stored in the new back ends.
You cannot automatically migrate old data because different back ends are used. Metrics and logs are considered short-lived data and are not intended to be migrated to the RHOSO environment. For information about adopting legacy autoscaling stack templates to the RHOSO environment, see Adopting Autoscaling services.
Prerequisites
- The director environment is running (the source cloud).
- A Single Node OpenShift or OpenShift Local cluster is running as the Red Hat OpenShift Container Platform (RHOCP) environment.
- Previous adoption steps are completed.
Procedure
Deploy the cluster-observability-operator:
$ oc create -f - <<EOF apiVersion: operators.coreos.com/v1alpha1 kind: Subscription metadata: name: cluster-observability-operator namespace: openshift-operators spec: channel: stable installPlanApproval: Automatic name: cluster-observability-operator source: redhat-operators sourceNamespace: openshift-marketplace EOF
$ oc wait --for jsonpath="{.status.phase}"=Succeeded csv --namespace=openshift-operators -l operators.coreos.com/cluster-observability-operator.openshift-operatorsPatch the
OpenStackControlPlaneCR to deploy Ceilometer services:$ oc patch openstackcontrolplane openstack --type=merge --patch ' spec: telemetry: enabled: true template: ceilometer: passwordSelector: ceilometerService: CeilometerPassword enabled: true secret: osp-secret serviceUser: ceilometer 'Enable the metrics storage back end:
$ oc patch openstackcontrolplane openstack --type=merge --patch ' spec: telemetry: template: metricStorage: enabled: true monitoringStack: alertingEnabled: true scrapeInterval: 30s storage: strategy: persistent retention: 24h persistent: pvcStorageRequest: 20G '
Verification
Verify that the
alertmanagerandprometheuspods are available:$ oc get pods -l alertmanager=metric-storage NAME READY STATUS RESTARTS AGE alertmanager-metric-storage-0 2/2 Running 0 46s alertmanager-metric-storage-1 2/2 Running 0 46s $ oc get pods -l prometheus=metric-storage NAME READY STATUS RESTARTS AGE prometheus-metric-storage-0 3/3 Running 0 46sInspect the resulting Ceilometer pods:
CEILOMETER_POD=`oc get pods -l service=ceilometer | tail -n 1 | cut -f 1 -d' '`
oc exec -t $CEILOMETER_POD -c ceilometer-central-agent -- cat /etc/ceilometer/ceilometer.conf
Inspect enabled pollsters:
$ oc get secret ceilometer-config-data -o jsonpath="{.data['polling\.yaml\.j2']}" | base64 -dOptional: Override default pollsters according to the requirements of your environment:
$ oc patch openstackcontrolplane controlplane --type=merge --patch ' spec: telemetry: template: ceilometer: defaultConfigOverwrite: polling.yaml.j2: | --- sources: - name: pollsters interval: 100 meters: - volume.* - image.size enabled: true secret: osp-secret '
Next steps
Optional: Patch the
OpenStackControlPlaneCR to includelogging:$ oc patch openstackcontrolplane openstack --type=merge --patch ' spec: telemetry: template: logging: enabled: false ipaddr: 172.17.0.80 port: 10514 cloNamespace: openshift-logging '
4.19. Adopting the DNS service
The content in this section is available in this release as a Technology Preview, and therefore is not fully supported by Red Hat. Use it only for testing, and do not deploy it in a production environment. For more information, see Technology Preview.
To adopt the DNS service (designate), you patch an existing OpenStackControlPlane custom resource (CR) where the DNS service is disabled. The patch starts the service with the configuration parameters that are provided by the Red Hat OpenStack Platform (RHOSP) environment.
Procedure
Create an alias for the
openstackcommand:$ alias openstack="oc exec -t openstackclient -- openstack"To isolate the DNS service networks, add the network interfaces for the VLAN base interfaces:
$ oc get --no-headers nncp --output=custom-columns=NAME:.metadata.name | while read; do interfaces=$(oc get nncp $REPLY -o jsonpath="{.spec.desiredState.interfaces[*].name}") (echo $interfaces | grep -w -q "enp6s0.25\|enp6s0.26") || \ oc patch nncp $REPLY --type json --patch ' [{ "op": "add", "path": "/spec/desiredState/interfaces/-", "value": { "description": "Designate vlan interface", "name": "enp6s0.25", "state": "up", "type": "vlan", "vlan": { "base-iface": "<enp6s0>", "id": 25, "reorder-headers": true }, "ipv4": { "address": [{"ip": "172.28.0.5", "prefix-length": 24}], "enabled": true, "dhcp": false }, "ipv6": { "enabled": false } } }, { "op": "add", "path": "/spec/desiredState/interfaces/-", "value": { "description": "Designate external vlan interface", "name": "enp6s0.26", "state": "up", "type": "vlan", "vlan": { "base-iface": "<enp6s0>", "id": 26, "reorder-headers": true }, "ipv4": { "address": [{"ip": "172.50.0.5", "prefix-length": 24}], "enabled": true, "dhcp": false }, "ipv6": { "enabled": false } } }]' donewhere:
<enp6s0>- Specifies the name of the network interface in your Red Hat OpenShift Container Platform (RHOCP) setup.
Configure the DNS service internal network attachment definition:
$ cat >> designate-nad.yaml << EOF_CAT apiVersion: k8s.cni.cncf.io/v1 kind: NetworkAttachmentDefinition metadata: labels: osp/net: designate name: designate spec: config: | { "cniVersion": "0.3.1", "name": "designate", "type": "macvlan", "master": "enp6s0.25", "ipam": { "type": "whereabouts", "range": "172.28.0.0/24", "range_start": "172.28.0.30", "range_end": "172.28.0.70" } } EOF_CATApply the configuration:
$ oc apply -f designate-nad.yamlConfigure the DNS service external network attachment definition:
$ cat >> designateext-nad.yaml << EOF_CAT apiVersion: k8s.cni.cncf.io/v1 kind: NetworkAttachmentDefinition metadata: labels: osp/net: designateext name: designateext spec: config: | { "cniVersion": "0.3.1", "name": "designateext", "type": "macvlan", "master": "enp6s0.26", "ipam": { "type": "whereabouts", "range": "172.50.0.0/24", "range_start": "172.50.0.30", "range_end": "172.50.0.70" } } EOF_CATApply the configuration:
$ oc apply -f designateext-nad.yamlCreate a MetalLB IPAddressPool for the DNS service external network:
$ oc apply -f - <<EOF apiVersion: metallb.io/v1beta1 kind: IPAddressPool metadata: name: designateext namespace: metallb-system spec: autoAssign: false addresses: - 172.50.0.80-172.50.0.90 EOFCreate an L2Advertisement for the DNS service external network:
$ oc apply -f - <<EOF apiVersion: metallb.io/v1beta1 kind: L2Advertisement metadata: name: designateext namespace: metallb-system spec: ipAddressPools: - designateext interfaces: - enp6s0.26 EOFRetrieve the nameserver records from the DNS service database to preserve DNS delegation:
$ DB_ROOT_PASSWORD=$(oc get -o json secret/osp-secret | jq -r .data.DbRootPassword | base64 -d) $ oc exec openstack-galera-0 -c galera -- mysql -rs -uroot -p"$DB_ROOT_PASSWORD" \ -e "select a.hostname, a.priority from designate.pool_ns_records a;" > /tmp/designate_ns_records_raw.txtParse the nameserver records into YAML format for the DNS service CR:
$ raw=/tmp/designate_ns_records_raw.txt $ out=/tmp/designate_ns_records.yaml $ if [ ! -s "$raw" ]; then echo "[]" > "$out" else awk '{ gsub(/\.$/, "", $1); if (NF >= 2) printf "- hostname: %s.\n priority: %s\n", $1, $2 }' "$raw" > "$out" fi
Enable the DNS service Redis instance in RHOCP:
$ oc patch openstackcontrolplane openstack --type=merge --patch ' spec: redis: enabled: true templates: designate-redis: replicas: 1 'Wait for the DNS service Redis instance to become ready:
$ oc wait --for condition=Ready --timeout=60s redises.redis.openstack.org/designate-redisCreate the DNS service CR patch file:
$ cat > /tmp/designate_osp_patch.yaml << MAINCR spec: designate: enabled: true template: designateAPI: networkAttachments: - internalapi designateWorker: networkAttachments: - designate replicas: 3 designateCentral: replicas: 3 designateProducer: replicas: 3 designateBackendbind9: networkAttachments: - designate override: services: - metadata: annotations: metallb.universe.tf/address-pool: designateext metallb.universe.tf/allow-shared-ip: designateext metallb.universe.tf/loadBalancerIPs: 172.50.0.80 - metadata: annotations: metallb.universe.tf/address-pool: designateext metallb.universe.tf/allow-shared-ip: designateext metallb.universe.tf/loadBalancerIPs: 172.50.0.81 - metadata: annotations: metallb.universe.tf/address-pool: designateext metallb.universe.tf/allow-shared-ip: designateext metallb.universe.tf/loadBalancerIPs: 172.50.0.82 replicas: 3 storageClass: <storage_class> storageRequest: 10G designateMdns: networkAttachments: - designate replicas: 3 designateUnbound: networkAttachments: - designate replicas: 2 override: services: - metadata: annotations: metallb.universe.tf/address-pool: designateext metallb.universe.tf/allow-shared-ip: designateext metallb.universe.tf/loadBalancerIPs: 172.50.0.85 - metadata: annotations: metallb.universe.tf/address-pool: designateext metallb.universe.tf/allow-shared-ip: designateext metallb.universe.tf/loadBalancerIPs: 172.50.0.86 MAINCRwhere:
<storage_class>-
Specifies the storage class name for persistent volumes (for example,
local-storage).
Append the nameserver records to the patch file:
$ ns_yaml=/tmp/designate_ns_records.yaml $ patch_file=/tmp/designate_osp_patch.yaml $ echo ' nsRecords:' >> "$patch_file" $ if [ -s "$ns_yaml" ] && [ "$(cat "$ns_yaml")" != "[]" ]; then sed 's/^/ /' "$ns_yaml" >> "$patch_file" else echo ' []' >> "$patch_file" fi
Enable the DNS service in RHOCP:
$ oc patch openstackcontrolplane openstack --type=merge --patch-file /tmp/designate_osp_patch.yamlWait for the DNS service to become ready:
$ oc wait --for condition=Ready --timeout=600s designate.designate.openstack.org/designate
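Optionally, after the DNS service reports Ready, confirm that the designate services registered and that your existing zones are still present. These checks assume the openstack alias that you created at the start of this procedure and that the DNS client plugin is available in the openstackclient pod:
$ openstack dns service list
$ openstack zone list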
4.20. Adopting autoscaling services
To adopt services that enable autoscaling, you patch an existing OpenStackControlPlane custom resource (CR) where the Alarming services (aodh) are disabled. The patch starts the service with the configuration parameters that are provided by the Red Hat OpenStack Platform environment.
Prerequisites
- The source director environment is running.
- A Single Node OpenShift or OpenShift Local is running in the Red Hat OpenShift Container Platform (RHOCP) cluster.
You have adopted the following services:
- MariaDB
- Identity service (keystone)
- Orchestration service (heat)
- Telemetry service
Procedure
Patch the
OpenStackControlPlaneCR to deploy the autoscaling services:$ oc patch openstackcontrolplane openstack --type=merge --patch ' spec: telemetry: enabled: true template: autoscaling: enabled: true aodh: passwordSelector: aodhService: AodhPassword databaseAccount: aodh databaseInstance: openstack secret: osp-secret serviceUser: aodh heatInstance: heat 'Inspect the aodh pods:
$ AODH_POD=`oc get pods -l service=aodh | tail -n 1 | cut -f 1 -d' '` $ oc exec -t $AODH_POD -c aodh-api -- cat /etc/aodh/aodh.confCheck whether the aodh API service is registered in the Identity service:
$ openstack endpoint list | grep aodh | d05d120153cd4f9b8310ac396b572926 | regionOne | aodh | alarming | True | internal | http://aodh-internal.openstack.svc:8042 | | d6daee0183494d7a9a5faee681c79046 | regionOne | aodh | alarming | True | public | http://aodh-public.openstack.svc:8042 |Optional: Create aodh alarms with the
PrometheusAlarmalarm type:NoteYou must use the
PrometheusAlarmalarm type instead ofGnocchiAggregationByResourcesAlarm.$ openstack alarm create --name high_cpu_alarm \ --type prometheus \ --query "(rate(ceilometer_cpu{resource_name=~'cirros'})) * 100" \ --alarm-action 'log://' \ --granularity 15 \ --evaluation-periods 3 \ --comparison-operator gt \ --threshold 7000000000Verify that the alarm is enabled:
$ openstack alarm list +--------------------------------------+------------+------------------+-------------------+----------+ | alarm_id | type | name | state | severity | enabled | +--------------------------------------+------------+------------------+-------------------+----------+ | 209dc2e9-f9d6-40e5-aecc-e767ce50e9c0 | prometheus | prometheus_alarm | ok | low | True | +--------------------------------------+------------+------------------+-------------------+----------+
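Optionally, inspect the alarm to confirm the Prometheus query and threshold. The name in this example matches the alarm that was created in the previous step:
$ openstack alarm show high_cpu_alarm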
4.21. Pulling the configuration from a director deployment
Before you start the data plane adoption workflow, back up the configuration from the Red Hat OpenStack Platform (RHOSP) services and director. You can then use the files during the configuration of the adopted services to ensure that nothing is missed or misconfigured.
Prerequisites
- The os-diff tool is installed and configured. For more information, see Comparing configuration files between deployments.
Procedure
Update the SSH parameters in the
os-diff.cfg file according to your environment. Os-diff uses the SSH parameters to connect to your director node, and then to query and download the configuration files:
ssh_cmd=ssh -F ssh.config standalone
container_engine=podman
connection=ssh
remote_config_path=/tmp/tripleo
Ensure that the SSH command that you provide in the ssh_cmd parameter is correct and includes key authentication.
/etc/os-diff/config.yamlfile, and disable the services that you want to exclude from the file. Ensure that you have the correct permissions to edit the file:$ chown ospng:ospng /etc/os-diff/config.yamlThe following example enables the default Identity service (keystone) to be included in the
/etc/os-diff/config.yamlfile:# service name and file location services: # Service name keystone: # Bool to enable/disable a service (not implemented yet) enable: true # Pod name, in both OCP and podman context. # It could be strict match or will only just grep the podman_name # and work with all the pods which matched with pod_name. # To enable/disable use strict_pod_name_match: true/false podman_name: keystone pod_name: keystone container_name: keystone-api # pod options # strict match for getting pod id in TripleO and podman context strict_pod_name_match: false # Path of the config files you want to analyze. # It could be whatever path you want: # /etc/<service_name> or /etc or /usr/share/<something> or even / # @TODO: need to implement loop over path to support multiple paths such as: # - /etc # - /usr/share path: - /etc/ - /etc/keystone - /etc/keystone/keystone.conf - /etc/keystone/logging.confRepeat this step for each RHOSP service that you want to disable or enable.
If you use non-containerized services, such as the
ovs-external-ids, pull the configuration or the command output. For example:services: ovs_external_ids: hosts: - standalone service_command: "ovs-vsctl list Open_vSwitch . | grep external_ids | awk -F ': ' '{ print $2; }'" cat_output: true path: - ovs_external_ids.json config_mapping: ovn-bridge-mappings: edpm_ovn_bridge_mappings ovn-bridge: edpm_ovn_bridge ovn-encap-type: edpm_ovn_encap_type ovn-monitor-all: ovn_monitor_all ovn-remote-probe-interval: edpm_ovn_remote_probe_interval ovn-ofctrl-wait-before-clear: edpm_ovn_ofctrl_wait_before_clearNoteYou must correctly configure an SSH configuration file or equivalent for non-standard services, such as OVS. The
ovs_external_ids service does not run in a container, and the OVS data is stored on each host of your cloud, for example, controller_1, controller_2, and so on.
hostsspecifies the list of hosts, for example,compute-1,compute-2. -
service_command: "ovs-vsctl list Open_vSwitch . | grep external_ids | awk -F ': ' '{ print $2; }'"runs against the hosts. -
cat_output: trueprovides os-diff with the output of the command and stores the output in a file that is specified by the key path. -
config_mappingprovides a mapping between, in this example, the data plane custom resource definition and theovs-vsctloutput. ovn-bridge-mappings: edpm_ovn_bridge_mappingsmust be a list of strings, for example,["datacentre:br-ex"].Compare the values:
$ os-diff diff ovs_external_ids.json edpm.crd --crd --service ovs_external_idsFor example, to check the
/etc/yum.confon every host, you must put the following statement in theconfig.yamlfile. The following example uses a file calledyum_config:services: yum_config: hosts: - undercloud - controller_1 - compute_1 - compute_2 service_command: "cat /etc/yum.conf" cat_output: true path: - yum.conf
-
Pull the configuration:
NoteThe following command pulls all the configuration files that are included in the
/etc/os-diff/config.yamlfile. You can configure os-diff to update this file automatically according to your running environment by using the--updateor--update-onlyoption. These options set the podman information into theconfig.yamlfor all running containers. The podman information can be useful later, when all the Red Hat OpenStack Platform services are turned off.Note that when the
config.yaml file is populated automatically, you must provide the configuration paths manually for each service.
# will only update the /etc/os-diff/config.yaml
os-diff pull --update-only
# will update the /etc/os-diff/config.yaml and pull the configuration
os-diff pull --update
# will pull the configuration
os-diff pull
/tmp/tripleo/
Verification
Verify that you have a directory for each service configuration in your local path:
▾ tmp/ ▾ tripleo/ ▾ glance/ ▾ keystone/
4.22. Rolling back the control plane adoption
If you encountered a problem and are unable to complete the adoption of the Red Hat OpenStack Platform (RHOSP) control plane services, you can roll back the control plane adoption.
Do not attempt the rollback if you altered the data plane nodes in any way. You can roll back the control plane adoption only if your changes are limited to the control plane.
During the control plane adoption, services on the RHOSP control plane are stopped but not removed. The databases on the RHOSP control plane are not edited during the adoption procedure. The Red Hat OpenStack Services on OpenShift (RHOSO) control plane receives a copy of the original control plane databases. The rollback procedure assumes that the data plane has not yet been modified by the adoption procedure, and it is still connected to the RHOSP control plane.
The rollback procedure consists of the following steps:
- Restoring the functionality of the RHOSP control plane.
- Removing the partially or fully deployed RHOSO control plane.
Procedure
To restore the source cloud to a working state, start the RHOSP control plane services that you previously stopped during the adoption procedure:
ServicesToStart=("tripleo_horizon.service" "tripleo_keystone.service" "tripleo_barbican_api.service" "tripleo_barbican_worker.service" "tripleo_barbican_keystone_listener.service" "tripleo_cinder_api.service" "tripleo_cinder_api_cron.service" "tripleo_cinder_scheduler.service" "tripleo_cinder_volume.service" "tripleo_cinder_backup.service" "tripleo_glance_api.service" "tripleo_manila_api.service" "tripleo_manila_api_cron.service" "tripleo_manila_scheduler.service" "tripleo_neutron_api.service" "tripleo_placement_api.service" "tripleo_nova_api_cron.service" "tripleo_nova_api.service" "tripleo_nova_conductor.service" "tripleo_nova_metadata.service" "tripleo_nova_scheduler.service" "tripleo_nova_vnc_proxy.service" "tripleo_aodh_api.service" "tripleo_aodh_api_cron.service" "tripleo_aodh_evaluator.service" "tripleo_aodh_listener.service" "tripleo_aodh_notifier.service" "tripleo_ceilometer_agent_central.service" "tripleo_ceilometer_agent_compute.service" "tripleo_ceilometer_agent_ipmi.service" "tripleo_ceilometer_agent_notification.service" "tripleo_ovn_cluster_north_db_server.service" "tripleo_ovn_cluster_south_db_server.service" "tripleo_ovn_cluster_northd.service" "tripleo_octavia_api.service" "tripleo_octavia_health_manager.service" "tripleo_octavia_rsyslog.service" "tripleo_octavia_driver_agent.service" "tripleo_octavia_housekeeping.service" "tripleo_octavia_worker.service") PacemakerResourcesToStart=("galera-bundle" "haproxy-bundle" "rabbitmq-bundle" "openstack-cinder-volume" "openstack-cinder-backup" "openstack-manila-share") echo "Starting systemd OpenStack services" for service in ${ServicesToStart[*]}; do for i in {1..3}; do SSH_CMD=CONTROLLER${i}_SSH if [ ! -z "${!SSH_CMD}" ]; then if ${!SSH_CMD} sudo systemctl is-enabled $service &> /dev/null; then echo "Starting the $service in controller $i" ${!SSH_CMD} sudo systemctl start $service fi fi done done echo "Checking systemd OpenStack services" for service in ${ServicesToStart[*]}; do for i in {1..3}; do SSH_CMD=CONTROLLER${i}_SSH if [ ! -z "${!SSH_CMD}" ]; then if ${!SSH_CMD} sudo systemctl is-enabled $service &> /dev/null; then if ! ${!SSH_CMD} systemctl show $service | grep ActiveState=active >/dev/null; then echo "ERROR: Service $service is not running on controller $i" else echo "OK: Service $service is running in controller $i" fi fi fi done done echo "Starting pacemaker OpenStack services" for i in {1..3}; do SSH_CMD=CONTROLLER${i}_SSH if [ ! -z "${!SSH_CMD}" ]; then echo "Using controller $i to run pacemaker commands" for resource in ${PacemakerResourcesToStart[*]}; do if ${!SSH_CMD} sudo pcs resource config $resource &>/dev/null; then echo "Starting $resource" ${!SSH_CMD} sudo pcs resource enable $resource else echo "Service $resource not present" fi done break fi done echo "Checking pacemaker OpenStack services" for i in {1..3}; do SSH_CMD=CONTROLLER${i}_SSH if [ ! -z "${!SSH_CMD}" ]; then echo "Using controller $i to run pacemaker commands" for resource in ${PacemakerResourcesToStop[*]}; do if ${!SSH_CMD} sudo pcs resource config $resource &>/dev/null; then if ${!SSH_CMD} sudo pcs resource status $resource | grep Started >/dev/null; then echo "OK: Service $resource is started" else echo "ERROR: Service $resource is stopped" fi fi done break fi doneIf the Ceph NFS service is running on the deployment as a Shared File Systems service (manila) back end, you must restore the Pacemaker order and colocation constraints for the
openstack-manila-shareservice:$ sudo pcs constraint order start ceph-nfs then openstack-manila-share kind=Optional id=order-ceph-nfs-openstack-manila-share-Optional $ sudo pcs constraint colocation add openstack-manila-share with ceph-nfs score=INFINITY id=colocation-openstack-manila-share-ceph-nfs-INFINITY-
Verify that the source cloud is operational again, for example, you can run
openstackCLI commands such asopenstack server list, or check that you can access the Dashboard service (horizon). Remove the partially or fully deployed control plane so that you can attempt the adoption again later:
$ oc delete --ignore-not-found=true --wait=false openstackcontrolplane/openstack $ oc patch openstackcontrolplane openstack --type=merge --patch ' > metadata: > finalizers: [] > ' || true > >while oc get pod | grep rabbitmq-server-0; do > sleep 2 >done >while oc get pod | grep openstack-galera-0; do > sleep 2 >done $ oc delete --ignore-not-found=true --wait=false pod mariadb-copy-data $ oc delete --ignore-not-found=true --wait=false pvc mariadb-data $ oc delete --ignore-not-found=true --wait=false pod ovn-copy-data $ oc delete --ignore-not-found=true secret osp-secret
Next steps
After you restore the RHOSP control plane services, their internal state might have changed. Before you retry the adoption procedure, verify that all the control plane resources are removed and that there are no leftovers which could affect the following adoption procedure attempt. You must not use previously created copies of the database contents in another adoption attempt. You must make a new copy of the latest state of the original source database contents. For more information about making new copies of the database, see Migrating databases to the control plane.
Chapter 5. Adopting the data plane
Adopting the Red Hat OpenStack Services on OpenShift (RHOSO) data plane involves the following steps:
- Stop any remaining services on the Red Hat OpenStack Platform (RHOSP) 17.1 control plane.
- Deploy the required custom resources.
- Perform a fast-forward upgrade on Compute services from RHOSP 17.1 to RHOSO 18.0.
- Adopt Networker services to the RHOSO data plane.
After the RHOSO control plane manages the newly deployed data plane, you must not re-enable services on the RHOSP 17.1 control plane and data plane. If you re-enable services, workloads are managed by two control planes or two data planes, resulting in data corruption, loss of control of existing workloads, inability to start new workloads, or other issues.
5.1. Stopping infrastructure management and Compute services
You must stop cloud database nodes and messaging nodes on the Red Hat OpenStack Platform 17.1 control plane. Do not stop nodes that are running the following roles:
- Compute
- Storage
- Networker
- Controller, if it is running the OVN Controller Gateway agent network agent
The following procedure applies to a standalone director deployment. You must stop the Pacemaker services on your host so that you can install libvirt packages when the Compute roles are adopted as data plane nodes. Modular libvirt daemons no longer run in podman containers on data plane nodes.
Prerequisites
Define the shell variables. Replace the following example values with values that apply to your environment:
CONTROLLER1_SSH="ssh -i <path_to_SSH_key> root@<controller-1 IP>" # ... # ... EDPM_PRIVATEKEY_PATH="<path_to_SSH_key>"-
CONTROLLER<X>_SSHdefines the SSH connection details for all Controller nodes, including cell Controller nodes, of the source director cloud. -
<path_to_SSH_key>defines the path to your SSH key.
-
Procedure
Stop the Pacemaker services:
PacemakerResourcesToStop=( "galera-bundle" "haproxy-bundle" "rabbitmq-bundle") echo "Stopping pacemaker services" for i in {1..3}; do SSH_CMD=CONTROLLER${i}_SSH if [ ! -z "${!SSH_CMD}" ]; then echo "Using controller $i to run pacemaker commands" for resource in ${PacemakerResourcesToStop[*]}; do if ${!SSH_CMD} sudo pcs resource config $resource; then ${!SSH_CMD} sudo pcs resource disable $resource fi done break fi done
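To confirm that the bundles are stopped before you continue, you can check the cluster status from a Controller node. This quick check reuses the CONTROLLER1_SSH variable that is defined in the prerequisites:
$CONTROLLER1_SSH sudo pcs status | grep -E 'galera|rabbitmq|haproxy'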
5.2. Adopting Compute services to the RHOSO data plane
Adopt your Compute (nova) services to the Red Hat OpenStack Services on OpenShift (RHOSO) data plane.
Prerequisites
- You have stopped the remaining control plane nodes, repositories, and packages on the Compute service (nova) hosts. For more information, see Stopping infrastructure management and Compute services.
-
If you have a Red Hat Ceph Storage environment, you have configured the Ceph back end for the
NovaLibvirtservice. For more information, see Configuring a Ceph back end. You have configured IP Address Management (IPAM):
$ oc apply -f - <<EOF apiVersion: network.openstack.org/v1beta1 kind: NetConfig metadata: name: netconfig spec: networks: - name: ctlplane dnsDomain: ctlplane.example.com subnets: - name: subnet1 allocationRanges: - end: 192.168.122.120 start: 192.168.122.100 - end: 192.168.122.200 start: 192.168.122.150 cidr: 192.168.122.0/24 gateway: 192.168.122.1 - name: internalapi dnsDomain: internalapi.example.com subnets: - name: subnet1 allocationRanges: - end: 172.17.0.250 start: 172.17.0.100 cidr: 172.17.0.0/24 vlan: 20 - name: External dnsDomain: external.example.com subnets: - name: subnet1 allocationRanges: - end: 10.0.0.250 start: 10.0.0.100 cidr: 10.0.0.0/24 gateway: 10.0.0.1 - name: storage dnsDomain: storage.example.com subnets: - name: subnet1 allocationRanges: - end: 172.18.0.250 start: 172.18.0.100 cidr: 172.18.0.0/24 vlan: 21 - name: storagemgmt dnsDomain: storagemgmt.example.com subnets: - name: subnet1 allocationRanges: - end: 172.20.0.250 start: 172.20.0.100 cidr: 172.20.0.0/24 vlan: 23 - name: tenant dnsDomain: tenant.example.com subnets: - name: subnet1 allocationRanges: - end: 172.19.0.250 start: 172.19.0.100 cidr: 172.19.0.0/24 vlan: 22 EOF-
If
neutron-sriov-nic-agentis running on your Compute service nodes, ensure that the physical device mappings match the values that are defined in theOpenStackDataPlaneNodeSetcustom resource (CR). For more information, see Pulling the configuration from a director deployment. To prevent workload shutdown, you have created the
tripleo_nova_libvirt_guests_service_cleanup.yamlplaybook:- become: true hosts: all strategy: tripleo_free name: disable and clean tripleo_nova_libvirt_guests tasks: - name: tripleo_nova_libvirt_guests removal become: true shell: | set -o pipefail systemctl disable tripleo_nova_libvirt_guests.service rm -f /etc/systemd/system/tripleo_nova_libvirt_guests.service rm -f /etc/systemd/system/virt-guest-shutdown.target systemctl daemon-reloadYou have used the following command to run the playbook:
$ ansible-playbook -i overcloud-deploy/overcloud/tripleo-ansible-inventory.yaml tripleo_nova_libvirt_guests_service_cleanup.yaml
You have defined the shell variables to run the script that runs the upgrade:
$ CEPH_FSID=$(oc get secret ceph-conf-files -o json | jq -r .data."ceph.conf" | base64 -d | grep fsid | sed -e s/fsid = //) $ alias openstack="oc exec -t openstackclient -- openstack" $ DEFAULT_CELL_NAME="cell3" $ RENAMED_CELLS="cell1 cell2 $DEFAULT_CELL_NAME" $ declare -A COMPUTES_CELL1 $ export COMPUTES_CELL1=( > ["standalone.localdomain"]="192.168.122.100" > # <compute1> > # <compute2> > # <compute3> >) $ declare -A COMPUTES_CELL2 $ export COMPUTES_CELL2=( > # <compute1> >) $ declare -A COMPUTES_CELL3 $export COMPUTES_CELL3=( > # <compute1> > # <compute2> >) $ declare -A COMPUTES_API_CELL1 $export COMPUTES_API_CELL1=( > ["standalone.localdomain"]="172.17.0.100" > ["standalone2.localdomain"]="172.17.0.101" >) $ NODESETS="" $ for CELL in $(echo $RENAMED_CELLS); do > ref="COMPUTES_$(echo ${CELL}|tr [:lower:] [:upper:])" > eval names=\${!${ref}[@]} > [ -z "$names" ] && continue > NODESETS="openstack-${CELL}, $NODESETS" >done $ NODESETS="[${NODESETS%,*}]"-
DEFAULT_CELL_NAME="cell3"defines the source clouddefaultcell that acquires a newDEFAULT_CELL_NAMEon the destination cloud after adoption. In a multi-cell adoption scenario, you can retain the original name,default, or create a new cell default name by providing the incremented index of the last cell in the source cloud. For example, if the incremented index of the last cell iscell5, the new cell default name iscell6. -
export COMPUTES_CELL1=For each cell, update the<["standalone.localdomain"]="x.x.x.x">value and theCOMPUTES_CELL<X>value with the names and IP addresses of the Compute service nodes that are connected to thectlplaneandinternalapinetworks. Do not specify a real FQDN defined for each network. Always use the same hostname for each connected network of a Compute node. Provide the IP addresses and the names of the hosts on the remaining networks of the source cloud as needed, or you can manually adjust the files that you generate in step 9 of this procedure. -
<compute1>,<compute2>, and<compute3>specifies the names of your Compute service nodes for each cell. Assign all Compute service nodes from the source cloudcell1cell intoCOMPUTES_CELL1, and so on. -
export COMPUTES_CELL<X>=(specifies all Compute service nodes that you assign from the source clouddefaultcell intoCOMPUTES_CELL<X>andCOMPUTES_API_CELL<X>, where<X>is theDEFAULT_CELL_NAMEenvironment variable value. In this example, theDEFAULT_CELL_NAMEenvironment variable value equalscell3. -
export COMPUTES_API_CELL1=(For each cell, update the<["standalone.localdomain"]="192.168.122.100">value and theCOMPUTES_API_CELL<X>value with the names and IP addresses of the Compute service nodes that are connected to thectlplaneandinternalapinetworks.["standalone.localdomain"]="192.168.122.100"defines the custom DNS domain in the FQDN value of the nodes. This value is used in the data plane node setspec.nodes.<NODE NAME>.hostName. Do not specify a real FQDN defined for each network. Use the same hostname for each of its connected networks. Provide the IP addresses and the names of the hosts on the remaining networks of the source cloud as needed, or you can manually adjust the files that you generate in step 9 of this procedure. NODESETS="'openstack-${CELL}', $NODESETS"specifies the cells that contain Compute nodes. Cells that do not contain Compute nodes are omitted from this template because no node sets are created for the cells.NoteIf you deployed the source cloud with a
defaultcell, and want to rename it during adoption, define the new name that you want to use, as shown in the following example:$ DEFAULT_CELL_NAME="cell1" $ RENAMED_CELLS="cell1"
-
Do not set a value for the CEPH_FSID parameter if the Compute service is configured to use a local storage back end for libvirt. The storage back end must match the source cloud storage back end. You cannot change the storage back end during adoption.
Procedure
Create an SSH authentication secret for the data plane nodes:
$ oc apply -f - <<EOF apiVersion: v1 kind: Secret metadata: name: dataplane-adoption-secret data: ssh-privatekey: | $(cat <path_to_SSH_key> | base64 | sed 's/^/ /') EOFReplace
<path_to_SSH_key>with the path to your SSH key.For more information about creating data plane secrets, see Creating the data plane secrets in Deploying Red Hat OpenStack Services on OpenShift.
Generate an SSH key pair for the nova-migration-ssh-key secret:
$ cd "$(mktemp -d)"
$ ssh-keygen -f ./id -t ecdsa-sha2-nistp521 -N ''
$ oc get secret nova-migration-ssh-key || oc create secret generic nova-migration-ssh-key \
  --from-file=ssh-privatekey=id \
  --from-file=ssh-publickey=id.pub \
  --type kubernetes.io/ssh-auth
$ rm -f id*
$ cd -
If TLS Everywhere is enabled, set LIBVIRT_PASSWORD to match the existing RHOSP deployment password:
declare -A TRIPLEO_PASSWORDS
TRIPLEO_PASSWORDS[default]="$HOME/overcloud-passwords.yaml"
LIBVIRT_PASSWORD=$(cat ${TRIPLEO_PASSWORDS[default]} | grep ' LibvirtTLSPassword:' | awk -F ': ' '{ print $2; }')
LIBVIRT_PASSWORD_BASE64=$(echo -n "$LIBVIRT_PASSWORD" | base64)
Create the libvirt-secret secret when TLS Everywhere (TLS-e) is enabled:
$ oc apply -f - <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: libvirt-secret
type: Opaque
data:
  LibvirtPassword: ${LIBVIRT_PASSWORD_BASE64}
EOF
Create a configuration map to use for all cells to configure a local storage back end for libvirt:
$ oc apply -f - <<EOF
apiVersion: v1
kind: ConfigMap
metadata:
  name: nova-cells-global-config
data:
  99-nova-compute-cells-workarounds.conf: |
    [workarounds]
    disable_compute_service_check_for_ffu=true
EOF
- data provides the configuration files for all the cells.
- 99-nova-compute-cells-workarounds.conf: | specifies the index of the <*.conf> files. Index the <*.conf> files from 03 to 99, based on precedence. A <99-*.conf> file takes the highest precedence, while indexes below 03 are reserved for internal use.
Note: If you adopt a live cloud, you might be required to carry over additional configurations for the default nova data plane service that are stored in the cell1 default nova-extra-config configuration map. Do not delete or overwrite the existing configuration in the cell1 default nova-extra-config configuration map that is assigned to nova. Overwriting the configuration can break the data plane services that rely on specific contents of the nova-extra-config configuration map.
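Before you add any cell-specific configuration, it can help to review what the default configuration map already carries so that you do not overwrite it. This read-only check is generic oc usage rather than a documented adoption step; the keys under data show which configuration snippets the default nova service currently delivers:

$ oc get configmap nova-extra-config -o yaml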
-
Configure a Red Hat Ceph Storage back end for libvirt:
$ oc apply -f - <<EOF
apiVersion: v1
kind: ConfigMap
metadata:
  name: nova-cells-global-config
data:
  99-nova-compute-cells-workarounds.conf: |
    [workarounds]
    disable_compute_service_check_for_ffu=true
  03-ceph-nova.conf: |
    [libvirt]
    images_type=rbd
    images_rbd_pool=vms
    images_rbd_ceph_conf=/etc/ceph/ceph.conf
    images_rbd_glance_store_name=default_backend
    images_rbd_glance_copy_poll_interval=15
    images_rbd_glance_copy_timeout=600
    rbd_user=openstack
    rbd_secret_uuid=$CEPH_FSID
EOF
Note: For Red Hat Ceph Storage environments with multi-cell configurations, you must name configuration maps and Red Hat OpenStack Platform data plane services similar to the following examples: nova-custom-ceph-cellX and nova-compute-extraconfig-cellX.
Note: For Distributed Compute Node (DCN) deployments, do not use the single nova-cells-global-config ConfigMap. Create a per-site ConfigMap and a per-site OpenStackDataPlaneService for each site in your DCN deployment. Each site's Compute service nodes require a different Red Hat Ceph Storage configuration and a different Image service endpoint. For more information, see Adopting Compute services with multiple Ceph back ends (DCN).
Create the data plane services for Compute service cells to enable pre-upgrade workarounds, and to configure the Compute services for your chosen storage back end:
for CELL in $(echo $RENAMED_CELLS); do oc apply -f - <<EOF apiVersion: dataplane.openstack.org/v1beta1 kind: OpenStackDataPlaneService metadata: name: nova-$CELL spec: dataSources: - secretRef: name: nova-$CELL-compute-config - secretRef: name: nova-migration-ssh-key - configMapRef: name: nova-cells-global-config playbook: osp.edpm.nova caCerts: combined-ca-bundle edpmServiceType: nova containerImageFields: - NovaComputeImage - EdpmIscsidImage EOF done-
spec.dataSources.secretRef specifies an additional auto-generated nova-cell<X>-metadata-neutron-config secret to enable a local metadata service for cell<X>. You should also set spec.nova.template.cellTemplates.cell<X>.metadataServiceTemplate.enable in the OpenStackControlPlane/openstack CR, as described in Adopting the Compute service. You can configure a single top-level metadata, or define the metadata per cell. -
nova-$CELL-compute-config specifies the secret that auto-generates for each cell<X>. You must append the nova-cell<X>-compute-config for each custom OpenStackDataPlaneService CR that is related to the Compute service.
-
In this example, the same
nova-migration-ssh-keykey is shared across cells. However, you should use different keys for different cells. -
For simple configuration overrides, you do not need a custom data plane service. However, to reconfigure the cell,
cell1, the safest option is to create a custom service and a dedicated configuration map for it. -
The cell,
cell1, is already managed with the defaultOpenStackDataPlaneServiceCR callednovaand itsnova-extra-configconfiguration map. Do not change the default data plane servicenovadefinition. The changes are lost when the RHOSO operator is updated with OLM. -
When a cell spans multiple node sets, give the custom
OpenStackDataPlaneServiceresources a name that relates to the node set, for example,nova-cell1-nfvandnova-cell1-enterprise. The auto-generated configuration maps are then namednova-cell1-nfv-extra-configandnova-cell1-enterprise-extra-config. - Different configurations for nodes in multiple node sets of the same cell are also supported, but are not covered in this guide.
-
In this example, the same
-
If TLS Everywhere is enabled, append the following content to the
OpenStackDataPlaneServiceCR:tlsCerts: nova: contents: - dnsnames - ips networks: - ctlplane issuer: osp-rootca-issuer-internal edpmRoleServiceName: nova caCerts: combined-ca-bundle edpmServiceType: novaCreate a secret for the subscription manager:
$ oc create secret generic subscription-manager \ --from-literal rhc_auth='{"login": {"username": "<subscription_manager_username>", "password": "<subscription_manager_password>"}}'-
Replace
<subscription_manager_username>with the applicable username. -
Replace
<subscription_manager_password>with the applicable password.
-
Replace
Create a secret for the Red Hat registry:
$ oc create secret generic redhat-registry \ --from-literal edpm_container_registry_logins='{"registry.redhat.io": {"<registry_username>": "<registry_password>"}}'-
Replace
<registry_username>with the applicable username. Replace
<registry_password>with the applicable password.NoteYou do not need to reference the
subscription-managersecret in thedataSourcesfield of theOpenStackDataPlaneServiceCR. The secret is already passed in with a node-specificOpenStackDataPlaneNodeSetCR in theansibleVarsFromproperty in thenodeTemplatefield.
-
Replace
Create the data plane node set definitions for each cell:
$ declare -A names $ for CELL in $(echo $RENAMED_CELLS); do ref="COMPUTES_$(echo ${CELL}|tr [:lower:] [:upper:])" eval names=\${!${ref}[@]} ref_api="COMPUTES_API_$(echo ${CELL}|tr [:lower:] [:upper:])" [ -z "$names" ] && continue ind=0 rm -f computes-$CELL for compute in $names; do ip="${ref}[$compute]" ip_api="${ref_api}[$compute]" cat >> computes-$CELL << EOF ${compute}: hostName: $compute ansible: ansibleHost: $compute networks: - defaultRoute: true fixedIP: ${!ip} name: ctlplane subnetName: subnet1 - name: internalapi subnetName: subnet1 fixedIP: ${!ip_api} - name: storage subnetName: subnet1 - name: tenant subnetName: subnet1 EOF ind=$(( ind + 1 )) done test -f computes-$CELL || continue cat > nodeset-${CELL}.yaml <<EOF apiVersion: dataplane.openstack.org/v1beta1 kind: OpenStackDataPlaneNodeSet metadata: name: openstack-$CELL spec: tlsEnabled: false networkAttachments: - ctlplane preProvisioned: true services: - redhat - bootstrap - download-cache - configure-network - validate-network - install-os - configure-os - ssh-known-hosts - run-os - reboot-os - install-certs - ovn - neutron-metadata - libvirt - nova-$CELL - telemetry env: - name: ANSIBLE_CALLBACKS_ENABLED value: "profile_tasks" - name: ANSIBLE_FORCE_COLOR value: "True" - name: ANSIBLE_VERBOSITY value: "3" nodeTemplate: ansibleSSHPrivateKeySecret: dataplane-adoption-secret ansible: ansibleUser: root ansibleVarsFrom: - secretRef: name: subscription-manager - secretRef: name: redhat-registry ansibleVars: rhc_release: 9.2 rhc_repositories: - {name: "*", state: disabled} - {name: "rhel-9-for-x86_64-baseos-eus-rpms", state: enabled} - {name: "rhel-9-for-x86_64-appstream-eus-rpms", state: enabled} - {name: "rhel-9-for-x86_64-highavailability-eus-rpms", state: enabled} - {name: "rhoso-18.0-for-rhel-9-x86_64-rpms", state: enabled} - {name: "fast-datapath-for-rhel-9-x86_64-rpms", state: enabled} - {name: "rhceph-7-tools-for-rhel-9-x86_64-rpms", state: enabled} edpm_bootstrap_release_version_package: [] # edpm_network_config # Default nic config template for a EDPM node # These vars are edpm_network_config role vars edpm_network_config_template: | --- {% set mtu_list = [ctlplane_mtu] %} {% for network in nodeset_networks %} {% set _ = mtu_list.append(lookup(vars, networks_lower[network] ~ _mtu)) %} {%- endfor %} {% set min_viable_mtu = mtu_list | max %} network_config: - type: ovs_bridge name: {{ neutron_physical_bridge_name }} mtu: {{ min_viable_mtu }} use_dhcp: false dns_servers: {{ ctlplane_dns_nameservers }} domain: {{ dns_search_domains }} addresses: - ip_netmask: {{ ctlplane_ip }}/{{ ctlplane_cidr }} routes: {{ ctlplane_host_routes }} members: - type: interface name: nic1 mtu: {{ min_viable_mtu }} # force the MAC address of the bridge to this interface primary: true {% for network in nodeset_networks %} - type: vlan mtu: {{ lookup(vars, networks_lower[network] ~ _mtu) }} vlan_id: {{ lookup(vars, networks_lower[network] ~ _vlan_id) }} addresses: - ip_netmask: {{ lookup(vars, networks_lower[network] ~ _ip) }}/{{ lookup(vars, networks_lower[network] ~ _cidr) }} routes: {{ lookup(vars, networks_lower[network] ~ _host_routes) }} {% endfor %} edpm_network_config_nmstate: false # Control resolv.conf management by NetworkManager # false = disable NetworkManager resolv.conf update (default) # true = enable NetworkManager resolv.conf update edpm_bootstrap_network_resolvconf_update: false edpm_network_config_hide_sensitive_logs: false # # These vars are for the network config templates themselves and are # considered EDPM network defaults. 
neutron_physical_bridge_name: br-ctlplane neutron_public_interface_name: eth0 # edpm_nodes_validation edpm_nodes_validation_validate_controllers_icmp: false edpm_nodes_validation_validate_gateway_icmp: false # edpm ovn-controller configuration edpm_ovn_bridge_mappings: [<"bridge_mappings">] edpm_ovn_bridge: br-int edpm_ovn_encap_type: geneve ovn_monitor_all: true edpm_ovn_remote_probe_interval: 60000 edpm_ovn_ofctrl_wait_before_clear: 8000 timesync_ntp_servers: - hostname: clock.redhat.com - hostname: clock2.redhat.com edpm_bootstrap_command: | set -euxo pipefail dnf -y upgrade openstack-selinux rm -f /run/virtlogd.pid gather_facts: false # edpm firewall, change the allowed CIDR if needed edpm_sshd_configure_firewall: true edpm_sshd_allowed_ranges: [192.168.122.0/24] # Do not attempt OVS major upgrades here edpm_ovs_packages: - openvswitch3.3 edpm_default_mounts: - path: /dev/hugepages<size> opts: pagesize=<size> fstype: hugetlbfs group: hugetlbfs nodes: EOF cat computes-$CELL >> nodeset-${CELL}.yaml done-
- ${compute}.hostName specifies the FQDN for the node if your deployment has a custom DNS domain.
- ${compute}.networks specifies the network composition. The network composition must match the source cloud configuration to avoid data plane connectivity downtime. The ctlplane network must come first. The commands only retain IP addresses for the hosts on the ctlplane and internalapi networks. Repeat this step for other isolated networks, or update the resulting files manually.
- metadata.name: specifies the node set names for each cell, for example, openstack-cell1 and openstack-cell2. Only create node sets for cells that contain Compute nodes.
- spec.tlsEnabled specifies whether TLS Everywhere is enabled. If it is enabled, change tlsEnabled to true.
- spec.services specifies the services to be adopted. If you are not adopting telemetry services, omit telemetry from the services list.
- neutron_physical_bridge_name: br-ctlplane specifies the bridge name. The bridge name and other OVN and Networking service-specific values must match the source cloud configuration to avoid data plane connectivity downtime.
- edpm_ovn_bridge_mappings: Replace [<"bridge_mappings">] with the value of the bridge mappings in your configuration, for example, ["datacentre:br-ctlplane"].
- path: /dev/hugepages<size> and opts: pagesize=<size> configure huge pages. Replace <size> with the size of the page. To configure multi-sized huge pages, create more items in the list. The mount points must match the source cloud configuration.
Note: Ensure that you use the same
ovn-controllersettings in theOpenStackDataPlaneNodeSetCR that you used in the Compute service nodes before adoption. This configuration is stored in theexternal_idscolumn in theOpen_vSwitchtable in the Open vSwitch database:$ ovs-vsctl list Open . ... external_ids : {hostname=standalone.localdomain, ovn-bridge=br-int, ovn-bridge-mappings=<bridge_mappings>, ovn-chassis-mac-mappings="datacentre:1e:0a:bb:e6:7c:ad", ovn-encap-ip="172.19.0.100", ovn-encap-tos="0", ovn-encap-type=geneve, ovn-match-northd-version=False, ovn-monitor-all=True, ovn-ofctrl-wait-before-clear="8000", ovn-openflow-probe-interval="60", ovn-remote="tcp:ovsdbserver-sb.openstack.svc:6642", ovn-remote-probe-interval="60000", rundir="/var/run/openvswitch", system-id="2eec68e6-aa21-4c95-a868-31aeafc11736"} ...
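As noted above, multi-sized huge pages are configured by adding more entries to the edpm_default_mounts list. The following sketch shows 2 MiB and 1 GiB pages; the sizes are examples and must match the mounts that exist on the source cloud nodes:

edpm_default_mounts:
  - path: /dev/hugepages2M
    opts: pagesize=2M
    fstype: hugetlbfs
    group: hugetlbfs
  - path: /dev/hugepages1G
    opts: pagesize=1G
    fstype: hugetlbfs
    group: hugetlbfs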
-
Deploy the OpenStackDataPlaneNodeSet CRs for each Compute cell:
for CELL in $(echo $RENAMED_CELLS); do
  test -f nodeset-${CELL}.yaml || continue
  oc apply -f nodeset-${CELL}.yaml
done
If you use a Red Hat Ceph Storage back end for the Block Storage service (cinder), prepare the adopted data plane workloads:
for CELL in $(echo $RENAMED_CELLS); do test -f nodeset-${CELL}.yaml || continue oc patch osdpns/openstack-$CELL --type=merge --patch " spec: services: - redhat - bootstrap - download-cache - configure-network - validate-network - install-os - configure-os - ssh-known-hosts - run-os - reboot-os - ceph-client - install-certs - ovn - neutron-metadata - libvirt - nova-$CELL - telemetry nodeTemplate: extraMounts: - extraVolType: Ceph volumes: - name: ceph secret: secretName: ceph-conf-files mounts: - name: ceph mountPath: "/etc/ceph" readOnly: true " doneOptional: Enable
neutron-sriov-nic-agentin theOpenStackDataPlaneNodeSetCR:for CELL in $(echo $RENAMED_CELLS); do test -f nodeset-${CELL}.yaml || continue oc patch openstackdataplanenodeset openstack-$CELL --type='json' --patch='[ { "op": "add", "path": "/spec/services/-", "value": "neutron-sriov" }, { "op": "add", "path": "/spec/nodeTemplate/ansible/ansibleVars/edpm_neutron_sriov_agent_SRIOV_NIC_physical_device_mappings", "value": "dummy_sriov_net:dummy-dev" }, { "op": "add", "path": "/spec/nodeTemplate/ansible/ansibleVars/edpm_neutron_sriov_agent_SRIOV_NIC_resource_provider_bandwidths", "value": "dummy-dev:40000000:40000000" }, { "op": "add", "path": "/spec/nodeTemplate/ansible/ansibleVars/edpm_neutron_sriov_agent_SRIOV_NIC_resource_provider_hypervisors", "value": "dummy-dev:standalone.localdomain" }]' doneOptional: Enable
neutron-dhcpin theOpenStackDataPlaneNodeSetCR:for CELL in $(echo $RENAMED_CELLS); do test -f nodeset-${CELL}.yaml || continue oc patch openstackdataplanenodeset openstack-$CELL --type='json' --patch='[ { "op": "add", "path": "/spec/services/-", "value": "neutron-dhcp" }]' doneNoteTo use
neutron-dhcp with OVN for the Bare Metal Provisioning service (ironic), you must set the disable_ovn_dhcp_for_baremetal_ports configuration option for the Networking service (neutron) to true. You can set this configuration in the NeutronAPI spec:
..
spec:
  serviceUser: neutron
  ...
  customServiceConfig: |
    [DEFAULT]
    dhcp_agent_notification = True
    [ovn]
    disable_ovn_dhcp_for_baremetal_ports = true
Run the pre-adoption validation:
Create the validation service:
$ oc apply -f - <<EOF apiVersion: dataplane.openstack.org/v1beta1 kind: OpenStackDataPlaneService metadata: name: pre-adoption-validation spec: playbook: osp.edpm.pre_adoption_validation EOFCreate a
OpenStackDataPlaneDeploymentCR that runs only the validation:$ oc apply -f - <<EOF apiVersion: dataplane.openstack.org/v1beta1 kind: OpenStackDataPlaneDeployment metadata: name: openstack-pre-adoption spec: nodeSets: $NODESETS servicesOverride: - pre-adoption-validation EOFNoteIf you created different migration SSH keys for different
OpenStackDataPlaneServiceCRs, you should also define a separateOpenStackDataPlaneDeploymentCR for each node set or node sets that represent a cell.When the validation is finished, confirm that the status of the Ansible EE pods is
Completed:$ watch oc get pod -l app=openstackansibleee$ oc logs -l app=openstackansibleee -f --max-log-requests 20Wait for the deployment to reach the
Readystatus:$ oc wait --for condition=Ready openstackdataplanedeployment/openstack-pre-adoption --timeout=10mImportantIf any openstack-pre-adoption validations fail, you must reference the Ansible logs to determine which ones were unsuccessful, and then try the following troubleshooting options:
- If the hostname validation failed, check that the hostname of the data plane node is correctly listed in the OpenStackDataPlaneNodeSet CR.
- If the kernel argument check failed, ensure that the kernel argument configuration in the edpm_kernel_args and edpm_kernel_hugepages variables in the OpenStackDataPlaneNodeSet CR is the same as the kernel argument configuration that you used in the Red Hat OpenStack Platform (RHOSP) 17.1 node.
- If the tuned profile check failed, ensure that the edpm_tuned_profile variable in the OpenStackDataPlaneNodeSet CR is configured to use the same profile as the one set on the RHOSP 17.1 node.
Remove the remaining director services:
Create an
OpenStackDataPlaneServiceCR to clean up the data plane services you are adopting:$ oc apply -f - <<EOF apiVersion: dataplane.openstack.org/v1beta1 kind: OpenStackDataPlaneService metadata: name: tripleo-cleanup spec: playbook: osp.edpm.tripleo_cleanup EOFCreate the
OpenStackDataPlaneDeploymentCR to run the clean-up:$ oc apply -f - <<EOF apiVersion: dataplane.openstack.org/v1beta1 kind: OpenStackDataPlaneDeployment metadata: name: tripleo-cleanup spec: nodeSets: $NODESETS servicesOverride: - tripleo-cleanup EOF
When the clean-up is finished, deploy the
OpenStackDataPlaneDeploymentCR:$ oc apply -f - <<EOF apiVersion: dataplane.openstack.org/v1beta1 kind: OpenStackDataPlaneDeployment metadata: name: openstack spec: nodeSets: $NODESETS EOFNoteIf you have other node sets to deploy, such as Networker nodes, you can add them in the
nodeSetslist in this step, or create separateOpenStackDataPlaneDeploymentCRs later. You cannot add new node sets to anOpenStackDataPlaneDeploymentCR after deployment.
Verification
Confirm that all the Ansible EE pods reach a
Completedstatus:$ watch oc get pod -l app=openstackansibleee$ oc logs -l app=openstackansibleee -f --max-log-requests 20Wait for the data plane node sets to reach the
Readystatus:for CELL in $(echo $RENAMED_CELLS); do oc wait --for condition=Ready osdpns/openstack-$CELL --timeout=30m doneVerify that the Networking service (neutron) agents are running:
$ oc exec openstackclient -- openstack network agent list +--------------------------------------+------------------------------+------------------------+-------------------+-------+-------+----------------------------+ | ID | Agent Type | Host | Availability Zone | Alive | State | Binary | +--------------------------------------+------------------------------+------------------------+-------------------+-------+-------+----------------------------+ | 174fc099-5cc9-4348-b8fc-59ed44fcfb0e | DHCP agent | standalone.localdomain | nova | :-) | UP | neutron-dhcp-agent | | 10482583-2130-5b0d-958f-3430da21b929 | OVN Metadata agent | standalone.localdomain | | :-) | UP | neutron-ovn-metadata-agent | | a4f1b584-16f1-4937-b2b0-28102a3f6eaa | OVN Controller agent | standalone.localdomain | | :-) | UP | ovn-controller | +--------------------------------------+------------------------------+------------------------+-------------------+-------+-------+----------------------------+
After you remove all the services from the director cell controllers, you can decommission the cell controllers. To create new cell Compute nodes, you re-provision the decommissioned controllers as new data plane hosts and add them to the node sets of corresponding or new cells.
Next steps
- You must perform a fast-forward upgrade on your Compute services. For more information, see Performing a fast-forward upgrade on Compute services.
5.3. Configuring data plane node sets for DCN sites
If you are adopting a Distributed Compute Node (DCN) deployment, you must create separate OpenStackDataPlaneNodeSet custom resources (CRs) for each site. Each node set requires site-specific configuration for network subnets, OVN bridge mappings, and inter-site routes.
Prerequisites
- You have adopted the Red Hat OpenStack Platform (RHOSP) control plane to Red Hat OpenStack Services on OpenShift (RHOSO).
-
You have configured control plane networking for your spine-leaf topology, including multi-subnet NetConfig and NetworkAttachmentDefinition CRs with routes to remote sites. For more information, see Configuring control plane networking for spine-leaf topologies.
You have the network configuration information for each DCN site:
- IP addresses and hostnames for all Compute nodes
- VLAN IDs for each service network
- Gateway addresses for inter-site routing
- You have identified the OVN bridge mappings (physnets) for each site.
Procedure
Define the OVN bridge mappings for each site. Each site requires a unique physnet that maps to the local provider network bridge:
Table 5.1. Example OVN bridge mappings

Site     OVN bridge mapping
Central  leaf0:br-ex
DCN1     leaf1:br-ex
DCN2     leaf2:br-ex

Configure OVN for DCN sites. The default OVN controller configuration uses the Kubernetes ClusterIP (
ovsdbserver-sb.openstack.svc), which is not routable from remote DCN sites. You must create a DCN-specific configuration that uses directinternalapiIP addresses.Get the OVN Southbound database
internalapiIP addresses:$ oc get pod -l service=ovsdbserver-sb -o jsonpath='{range .items[*]}{.metadata.annotations.k8s\.v1\.cni\.cncf\.io/network-status}{"\n"}{end}' | jq -r '.[] | select(.name=="openstack/internalapi") | .ips[0]'Example output:
172.17.0.34 172.17.0.35 172.17.0.36Create a ConfigMap with the OVN SB direct IPs for DCN sites:
$ oc apply -f - <<EOF apiVersion: v1 kind: ConfigMap metadata: name: ovncontroller-config-dcn namespace: openstack data: ovsdb-config: | ovn-remote: tcp:172.17.0.34:6642,tcp:172.17.0.35:6642,tcp:172.17.0.36:6642 EOF-
Replace the IP addresses with the actual internalapi IPs from the previous step.
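If you prefer to assemble the ovn-remote string programmatically rather than typing the addresses, the following optional helper (not part of the guide) reuses the pod query from the previous step:

# Optional helper: turn the internalapi IPs of the OVN SB pods into the
# comma-separated ovn-remote value used in the ConfigMap above.
OVN_REMOTE=$(oc get pod -l service=ovsdbserver-sb \
  -o jsonpath='{range .items[*]}{.metadata.annotations.k8s\.v1\.cni\.cncf\.io/network-status}{"\n"}{end}' \
  | jq -r '.[] | select(.name=="openstack/internalapi") | .ips[0]' \
  | sed 's/^/tcp:/; s/$/:6642/' \
  | paste -sd, -)
echo "ovn-remote: ${OVN_REMOTE}"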
-
Create an
OpenStackDataPlaneServiceCR for DCN OVN configuration:$ oc apply -f - <<EOF apiVersion: dataplane.openstack.org/v1beta1 kind: OpenStackDataPlaneService metadata: name: ovn-dcn namespace: openstack spec: addCertMounts: false caCerts: combined-ca-bundle containerImageFields: - OvnControllerImage dataSources: - configMapRef: name: ovncontroller-config-dcn edpmServiceType: ovn playbook: osp.edpm.ovn tlsCerts: default: contents: - dnsnames - ips issuer: osp-rootca-issuer-ovn keyUsages: - digital signature - key encipherment - server auth - client auth networks: - ctlplane EOFNoteThe
ovn-dcnservice uses theovncontroller-config-dcnConfigMap (throughdataSources), which contains the directinternalapiIPs instead of theClusterIP. DCN node sets must use this service instead of the defaultovnservice.
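After the DCN node sets are deployed later in this procedure, one way to confirm that an edge node connects through the direct internalapi IPs rather than the ClusterIP is to query the same external_ids field that appears in the ovs-vsctl output elsewhere in this chapter. The hostname is illustrative:

$ ssh dcn1-compute-0 sudo ovs-vsctl get Open_vSwitch . external_ids:ovn-remote
"tcp:172.17.0.34:6642,tcp:172.17.0.35:6642,tcp:172.17.0.36:6642"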
Create an
OpenStackDataPlaneNodeSetCR for the central site Compute nodes:apiVersion: dataplane.openstack.org/v1beta1 kind: OpenStackDataPlaneNodeSet metadata: name: openstack-edpm spec: tlsEnabled: false networkAttachments: - ctlplane preProvisioned: true services: - redhat - bootstrap - download-cache - configure-network - validate-network - install-os - configure-os - ssh-known-hosts - run-os - reboot-os - install-certs - ovn - neutron-metadata - libvirt - nova-cell1 - telemetry nodeTemplate: ansibleSSHPrivateKeySecret: dataplane-adoption-secret ansible: ansibleUser: root ansibleVars: edpm_ovn_bridge_mappings: ["leaf0:br-ex"] edpm_ovn_bridge: br-int edpm_ovn_encap_type: geneve # Network configuration template for central site edpm_network_config_template: | --- network_config: - type: ovs_bridge name: {{ neutron_physical_bridge_name }} use_dhcp: false dns_servers: {{ ctlplane_dns_nameservers }} addresses: - ip_netmask: {{ ctlplane_ip }}/{{ ctlplane_cidr }} routes: {{ ctlplane_host_routes }} members: - type: interface name: nic1 primary: true {% for network in nodeset_networks %} - type: vlan vlan_id: {{ lookup(vars, networks_lower[network] ~ _vlan_id) }} addresses: - ip_netmask: {{ lookup(vars, networks_lower[network] ~ _ip) }}/{{ lookup(vars, networks_lower[network] ~ _cidr) }} routes: {{ lookup(vars, networks_lower[network] ~ _host_routes) }} {% endfor %} nodes: compute-0: hostName: compute-0.example.com ansible: ansibleHost: compute-0.example.com networks: - defaultRoute: true fixedIP: 192.168.122.100 name: ctlplane subnetName: subnet1 - name: internalapi subnetName: subnet1 - name: storage subnetName: subnet1 - name: tenant subnetName: subnet1-
- The OVN bridge mapping uses the central site physnet leaf0.
- Central site nodes reference subnet1 for all networks.
Create an
OpenStackDataPlaneNodeSetCR for DCN1 edge site compute nodes. You must add inter-site routes to the network configuration template and use theovn-dcnservice:apiVersion: dataplane.openstack.org/v1beta1 kind: OpenStackDataPlaneNodeSet metadata: name: openstack-edpm-dcn1 spec: tlsEnabled: false networkAttachments: - ctlplane preProvisioned: true services: - redhat - bootstrap - download-cache - configure-network - validate-network - install-os - configure-os - ssh-known-hosts - run-os - reboot-os - install-certs - ovn-dcn - neutron-metadata - libvirt - nova-cell1 - telemetry nodeTemplate: ansibleSSHPrivateKeySecret: dataplane-adoption-secret ansible: ansibleUser: root ansibleVars: edpm_ovn_bridge_mappings: ["leaf1:br-ex"] edpm_ovn_bridge: br-int edpm_ovn_encap_type: geneve # Network configuration template for DCN1 site with inter-site routes edpm_network_config_template: | --- network_config: - type: ovs_bridge name: {{ neutron_physical_bridge_name }} use_dhcp: false dns_servers: {{ ctlplane_dns_nameservers }} addresses: - ip_netmask: {{ ctlplane_ip }}/{{ ctlplane_cidr }} routes:1 {{ ctlplane_host_routes }} - ip_netmask: 192.168.122.0/24 next_hop: 192.168.133.1 - ip_netmask: 192.168.144.0/24 next_hop: 192.168.133.1 members: - type: interface name: nic1 primary: true {% for network in nodeset_networks %} - type: vlan vlan_id: {{ lookup(vars, networks_lower[network] ~ _vlan_id) }} addresses: - ip_netmask: {{ lookup(vars, networks_lower[network] ~ _ip) }}/{{ lookup(vars, networks_lower[network] ~ _cidr) }} routes: {{ lookup(vars, networks_lower[network] ~ _host_routes) }} {% if network == internalapi %} - ip_netmask: 172.17.0.0/24 next_hop: 172.17.10.1 - ip_netmask: 172.17.20.0/24 next_hop: 172.17.10.1 {% endif %} {% if network == storage %} - ip_netmask: 172.18.0.0/24 next_hop: 172.18.10.1 - ip_netmask: 172.18.20.0/24 next_hop: 172.18.10.1 {% endif %} {% if network == tenant %} - ip_netmask: 172.19.0.0/24 next_hop: 172.19.10.1 - ip_netmask: 172.19.20.0/24 next_hop: 172.19.10.1 {% endif %} {% endfor %} nodes: dcn1-compute-0: hostName: dcn1-compute-0.example.com ansible: ansibleHost: dcn1-compute-0.example.com networks: - defaultRoute: true fixedIP: 192.168.133.100 name: ctlplane subnetName: ctlplanedcn1 - name: internalapi subnetName: internalapidcn1 - name: storage subnetName: storagedcn1 - name: tenant subnetName: tenantdcn1-
- Replace ovn with ovn-dcn under spec:services. This ensures that the OVN controller connects to the OVN Southbound database by using the direct internalapi IPs instead of the unreachable ClusterIP.
- DCN1 uses the leaf1 physnet for its OVN bridge mapping under spec:nodeTemplate:ansible:ansibleVars:edpm_ovn_bridge_mappings.
- Inter-site routes must be added to the network configuration template. These routes enable DCN1 Compute nodes to reach the central site (192.168.122.0/24) and other DCN sites (192.168.144.0/24 for DCN2). Similar routes are added for each service network (internalapi, storage, tenant).
- DCN1 nodes reference site-specific subnet names such as ctlplanedcn1 and internalapidcn1. These subnet names must match the subnet names defined in the NetConfig CR.
-
Repeat step 3 for all other DCN sites. Adjust the site-specific parameters:
- The node set name, for example: openstack-edpm-dcn2
- The OVN bridge mapping, for example: leaf2:br-ex
- The subnet names, for example: ctlplanedcn2 and internalapidcn2
- The inter-site routes. The routes from DCN2 should point to the central site subnets and the DCN1 site subnets.
- The compute node definitions with site-appropriate IP addresses.
Deploy all nodesets by creating an
OpenStackDataPlaneDeploymentCR:apiVersion: dataplane.openstack.org/v1beta1 kind: OpenStackDataPlaneDeployment metadata: name: openstack-edpm-deployment spec: nodeSets: - openstack-edpm - openstack-edpm-dcn1 - openstack-edpm-dcn2NoteAll nodesets can be deployed in parallel once the control plane adoption is complete.
Wait for the deployment to complete:
$ oc wait --for condition=Ready openstackdataplanedeployment/openstack-edpm-deployment --timeout=40m
Verification
Verify that all node sets reach the
Readystatus:$ oc get openstackdataplanenodeset NAME STATUS MESSAGE openstack-edpm True Ready openstack-edpm-dcn1 True Ready openstack-edpm-dcn2 True ReadyVerify that Compute services are running across all sites. Ensure that all
nova-computeservices showState=upfor nodes in all availability zones:$ oc exec openstackclient -- openstack compute service listVerify inter-site connectivity by checking routes on a DCN Compute node:
$ ssh dcn1-compute-0 ip route show | grep 172.17.0 172.17.0.0/24 via 172.17.10.1 dev internalapiTest that DCN Compute nodes can reach the control plane:
$ ssh dcn1-compute-0 ping -c 3 172.17.0.30Replace
172.17.0.30with an IP address of a control plane service on the internalapi network.
5.4. Adopting Compute services with multiple Red Hat Ceph Storage back ends (DCN)
In a Distributed Compute Node (DCN) deployment where the Image service (glance) and the Block Storage service (cinder) run on edge Compute nodes, each site has its own Red Hat Ceph Storage cluster. The Compute service (nova) nodes at each site must be configured with the Red Hat Ceph Storage connection details and Image service endpoint for their local site. Because the Image service has a separate API endpoint at each site, each site's OpenStackDataPlaneNodeSet custom resource (CR) must use a different OpenStackDataPlaneService CR that points to the correct Image service.
In a DCN deployment, all node sets belong to a single Compute service cell. The central site and each edge site are separate OpenStackDataPlaneNodeSet resources within that cell. The per-site OpenStackDataPlaneService resources deliver different Red Hat Ceph Storage and Image service configurations to each node set while sharing the same cell-level Compute service configuration.
Prerequisites
- You have adopted the Image service with multiple Red Hat Ceph Storage back ends. For more information, see Adopting the Image service with multiple Ceph back ends.
- You have adopted the Block Storage service with multiple Red Hat Ceph Storage back ends. For more information, see Adopting the Block Storage service with multiple Ceph back ends.
-
The per-site Red Hat Ceph Storage secrets (
ceph-conf-central, ceph-conf-dcn1, and ceph-conf-dcn2) exist. For more information, see Configuring a Red Hat Ceph Storage back end. Retrieve the
fsidfor each Red Hat Ceph Storage cluster:$ oc get secret ceph-conf-central -o json | jq -r '.data | to_entries[] | select(.key | endswith(".conf")) | "\(.key): \(.value | @base64d)"' | grep fsid
Procedure
Set the cell name variable. In a DCN deployment, all node sets belong to a single cell:
$ DEFAULT_CELL_NAME="cell1"
Retrieve the
fsidfor each Red Hat Ceph Storage cluster and store them in shell variables:$ CEPH_FSID_CENTRAL=$(oc get secret ceph-conf-central -o json | jq -r .data."<central.conf>" | base64 -d | awk /fsid/{print $3}) $ CEPH_FSID_DCN1=$(oc get secret ceph-conf-dcn1 -o json | jq -r .data."<dcn1.conf>" | base64 -d | awk /fsid/{print $3}) $ CEPH_FSID_DCN2=$(oc get secret ceph-conf-dcn2 -o json | jq -r .data."<dcn2.conf>" | base64 -d | awk /fsid/{print $3})where:
<central.conf>-
Specifies the name of the Red Hat Ceph Storage configuration file for the central site in the
ceph-conf-centralsecret. <dcn1.conf>-
Specifies the name of the Red Hat Ceph Storage configuration file for an edge site in the
ceph-conf-dcn1secret. <dcn2.conf>-
Specifies the name of the Red Hat Ceph Storage configuration file for an additional edge site in the
ceph-conf-dcn2secret.
Create a
ConfigMapfor each site. EachConfigMapcontains the Red Hat Ceph Storage and Image service configuration specific to that site.The following example creates
ConfigMapresources for a central site and two edge sites.Create the
ConfigMapfor the central site:$ oc apply -f - <<EOF apiVersion: v1 kind: ConfigMap metadata: name: nova-ceph-central data: 99-nova-compute-cells-workarounds.conf: | [workarounds] disable_compute_service_check_for_ffu=true 03-ceph-nova.conf: | [libvirt] images_type=rbd images_rbd_pool=vms images_rbd_ceph_conf=/etc/ceph/central.conf images_rbd_glance_store_name=central images_rbd_glance_copy_poll_interval=15 images_rbd_glance_copy_timeout=600 rbd_user=openstack rbd_secret_uuid=${CEPH_FSID_CENTRAL} [glance] endpoint_override = http://glance-central-internal.openstack.svc:9292 valid_interfaces = internal [cinder] cross_az_attach = False catalog_info = volumev3:cinderv3:internalURL EOFEach
ConfigMapcontains three configuration sections:-
[libvirt]points to the local Red Hat Ceph Storage cluster configuration and uses the localfsidas therbd_secret_uuid. -
[glance]usesendpoint_overrideto direct Image service requests to the local Image service API endpoint instead of the endpoint that is registered in the Identity service catalog. The examples usehttp://for the Image service endpoints. If your Red Hat OpenStack Platform deployment uses TLS for internal endpoints, usehttps://instead, and ensure that you have completed the TLS migration. For more information, see Migrating TLS-e to the RHOSO deployment. -
[cinder]setscross_az_attach = Falseto prevent volumes from being attached to instances in a different availability zone.
-
Create the
ConfigMapfor the first edge site:$ oc apply -f - <<EOF apiVersion: v1 kind: ConfigMap metadata: name: nova-ceph-dcn1 data: 99-nova-compute-cells-workarounds.conf: | [workarounds] disable_compute_service_check_for_ffu=true 03-ceph-nova.conf: | [libvirt] images_type=rbd images_rbd_pool=vms images_rbd_ceph_conf=/etc/ceph/dcn1.conf images_rbd_glance_store_name=dcn1 images_rbd_glance_copy_poll_interval=15 images_rbd_glance_copy_timeout=600 rbd_user=openstack rbd_secret_uuid=${CEPH_FSID_DCN1} [glance] endpoint_override = http://glance-dcn1-internal.openstack.svc:9292 valid_interfaces = internal [cinder] cross_az_attach = False catalog_info = volumev3:cinderv3:internalURL EOFCreate the
ConfigMapfor the second edge site:$ oc apply -f - <<EOF apiVersion: v1 kind: ConfigMap metadata: name: nova-ceph-dcn2 data: 99-nova-compute-cells-workarounds.conf: | [workarounds] disable_compute_service_check_for_ffu=true 03-ceph-nova.conf: | [libvirt] images_type=rbd images_rbd_pool=vms images_rbd_ceph_conf=/etc/ceph/dcn2.conf images_rbd_glance_store_name=dcn2 images_rbd_glance_copy_poll_interval=15 images_rbd_glance_copy_timeout=600 rbd_user=openstack rbd_secret_uuid=${CEPH_FSID_DCN2} [glance] endpoint_override = http://glance-dcn2-internal.openstack.svc:9292 valid_interfaces = internal [cinder] cross_az_attach = False catalog_info = volumev3:cinderv3:internalURL EOFImportantThe
endpoint_overridein the[glance]section is different for each site. This setting directs the Compute service to contact the local Image service API instead of the central endpoint registered in the Identity service catalog. Without this setting, all Compute nodes contact the central Image service, and image data is transferred across the WAN instead of read from the local Red Hat Ceph Storage cluster.-
Central Compute nodes use
glance-central-internal.openstack.svc -
DCN1 Compute nodes use
glance-dcn1-internal.openstack.svc -
DCN2 Compute nodes use
glance-dcn2-internal.openstack.svc
These endpoint names correspond to the
GlanceAPIinstances that are created when you adopt the Image service with DCN back ends.-
Create a per-site
OpenStackDataPlaneServiceCR for each site. Each service references the site-specificConfigMapthat you created in the previous step:$ oc apply -f - <<EOF --- apiVersion: dataplane.openstack.org/v1beta1 kind: OpenStackDataPlaneService metadata: name: nova-custom-ceph-central spec: dataSources: - configMapRef: name: nova-ceph-central - secretRef: name: nova-${DEFAULT_CELL_NAME}-compute-config - secretRef: name: nova-migration-ssh-key playbook: osp.edpm.nova caCerts: combined-ca-bundle edpmServiceType: nova containerImageFields: - NovaComputeImage - EdpmIscsidImage --- apiVersion: dataplane.openstack.org/v1beta1 kind: OpenStackDataPlaneService metadata: name: nova-custom-ceph-dcn1 spec: dataSources: - configMapRef: name: nova-ceph-dcn1 - secretRef: name: nova-${DEFAULT_CELL_NAME}-compute-config - secretRef: name: nova-${DEFAULT_CELL_NAME}-metadata-neutron-config - secretRef: name: nova-migration-ssh-key playbook: osp.edpm.nova caCerts: combined-ca-bundle edpmServiceType: nova containerImageFields: - NovaComputeImage - EdpmIscsidImage --- apiVersion: dataplane.openstack.org/v1beta1 kind: OpenStackDataPlaneService metadata: name: nova-custom-ceph-dcn2 spec: dataSources: - configMapRef: name: nova-ceph-dcn2 - secretRef: name: nova-${DEFAULT_CELL_NAME}-compute-config - secretRef: name: nova-${DEFAULT_CELL_NAME}-metadata-neutron-config - secretRef: name: nova-migration-ssh-key playbook: osp.edpm.nova caCerts: combined-ca-bundle edpmServiceType: nova containerImageFields: - NovaComputeImage - EdpmIscsidImage EOFNoteAll
OpenStackDataPlaneServiceCRs reference the same cell secret (nova-cell1-compute-config) because all node sets belong to a single cell. The per-siteConfigMapis what differentiates the Red Hat Ceph Storage and Image service configuration for each site.When you create the
OpenStackDataPlaneNodeSetCR for each site, reference the per-site service in theserviceslist instead ofnova-$CELL. For example:-
The central node set uses
nova-custom-ceph-centralin itsserviceslist. -
The DCN1 node set uses
nova-custom-ceph-dcn1in itsserviceslist. The DCN2 node set uses
nova-custom-ceph-dcn2in itsserviceslist.If you have already created the
OpenStackDataPlaneNodeSetCRs with the defaultnova-$CELLservice, patch each node set to use the per-site service. The following example patches the central node set:$ oc patch osdpns/openstack-${DEFAULT_CELL_NAME} --type=merge --patch " spec: services: - bootstrap - download-cache - configure-network - validate-network - install-os - configure-os - ssh-known-hosts - run-os - reboot-os - install-certs - ceph-client - ovn - neutron-metadata - libvirt - nova-custom-ceph-central nodeTemplate: extraMounts: - extraVolType: Ceph volumes: - name: ceph secret: secretName: ceph-conf-central mounts: - name: ceph mountPath: "/etc/ceph" readOnly: true "Patch each DCN edge node set with the same services list, replacing
ovnwithovn-dcnandnova-custom-ceph-centralwith the per-site service name. You must include theceph-clientservice so that the Red Hat Ceph Storage configuration files from the per-site secret are deployed into the Compute service containers on the edge nodes. Withoutceph-client, the/etc/ceph/directory inside the Compute service container is empty and instances fail to launch with aRADOS object not found (error calling conf_read_file)error.For example, for the DCN1 node set named
dcn1:$ oc patch osdpns/dcn1 --type=merge --patch " spec: services: - bootstrap - download-cache - configure-network - validate-network - install-os - configure-os - ssh-known-hosts - run-os - reboot-os - install-certs - ceph-client - ovn-dcn - neutron-metadata - libvirt - nova-custom-ceph-dcn1 nodeTemplate: extraMounts: - extraVolType: Ceph volumes: - name: ceph secret: secretName: ceph-conf-dcn1 mounts: - name: ceph mountPath: "/etc/ceph" readOnly: true "Repeat this step for each additional edge site, replacing
dcn1andnova-custom-ceph-dcn1with the appropriate site name, for example,dcn2andnova-custom-ceph-dcn2.
-
5.5. Performing a fast-forward upgrade on Compute services
You must upgrade the Compute services from Red Hat OpenStack Platform 17.1 to Red Hat OpenStack Services on OpenShift (RHOSO) 18.0 on the control plane and data plane by completing the following tasks:
- Update the cell1 Compute data plane services version.
- Remove pre-fast-forward upgrade workarounds from the Compute control plane services and Compute data plane services.
- Run Compute database online migrations to update live data.
Prerequisites
You have defined the shell variables necessary to apply the fast-forward upgrade commands for each Compute service cell.
DEFAULT_CELL_NAME="cell1"
RENAMED_CELLS="$DEFAULT_CELL_NAME"
declare -A PODIFIED_DB_ROOT_PASSWORD
for CELL in $(echo "super $RENAMED_CELLS"); do
  PODIFIED_DB_ROOT_PASSWORD[$CELL]=$(oc get -o json secret/osp-secret | jq -r .data.DbRootPassword | base64 -d)
done
- You have completed the steps in Adopting Compute services to the RHOSO data plane.
Procedure
Wait for the Compute service data plane services version to update for all the cells:
for CELL in $(echo $RENAMED_CELLS); do oc exec openstack-$CELL-galera-0 -c galera -- mysql -rs -uroot -p"${PODIFIED_DB_ROOT_PASSWORD[$CELL]}" \ -e "select a.version from nova_${CELL}.services a join nova_${CELL}.services b where a.version!=b.version and a.binary='nova-compute' and a.deleted=0;" doneNoteThe query returns an empty result when the update is completed. No downtime is expected for virtual machine (VM) workloads.
Review any errors in the nova Compute agent logs on the data plane, and the nova-conductor journal records on the control plane.
Patch the
OpenStackControlPlaneCR to remove the pre-fast-forward upgrade workarounds from the Compute control plane services:$ rm -f celltemplates $ for CELL in $(echo $RENAMED_CELLS); do $ cat >> celltemplates << EOF ${CELL}: metadataServiceTemplate: customServiceConfig: | [workarounds] disable_compute_service_check_for_ffu=false conductorServiceTemplate: customServiceConfig: | [workarounds] disable_compute_service_check_for_ffu=false EOF done $ cat > oscp-patch.yaml << EOF spec: nova: template: apiServiceTemplate: customServiceConfig: | [workarounds] disable_compute_service_check_for_ffu=false metadataServiceTemplate: customServiceConfig: | [workarounds] disable_compute_service_check_for_ffu=false schedulerServiceTemplate: customServiceConfig: | [workarounds] disable_compute_service_check_for_ffu=false cellTemplates: cell0: conductorServiceTemplate: customServiceConfig: | [workarounds] disable_compute_service_check_for_ffu=false EOF $ cat celltemplates >> oscp-patch.yamlIf you are adopting the Compute service with the Bare Metal Provisioning service (ironic), append the following
novaComputeTemplatesin thecell<X>section of the Compute service CR patch:cell<X>: novaComputeTemplates: <hostname>: customServiceConfig: | [DEFAULT] host = <hostname> [workarounds] disable_compute_service_check_for_ffu=true computeDriver: ironic.IronicDriver ...where:
- <hostname>
-
Specifies the hostname of the node that is running the
ironicCompute driver in the source cloud ofcell<X>.
Apply the patch file:
$ oc patch openstackcontrolplane openstack --type=merge --patch-file=oscp-patch.yamlWait until the Compute control plane services CRs are ready:
$ oc wait --for condition=Ready --timeout=300s Nova/novaRemove the pre-fast-forward upgrade workarounds from the Compute data plane services:
$ oc patch cm nova-cells-global-config --type=json -p='[{"op": "replace", "path": "/data/99-nova-compute-cells-workarounds.conf", "value": "[workarounds]\n"}]' $ for CELL in $(echo $RENAMED_CELLS); do $ oc get Openstackdataplanenodeset openstack-${CELL} || continue $ oc apply -f - <<EOF --- apiVersion: dataplane.openstack.org/v1beta1 kind: OpenStackDataPlaneDeployment metadata: name: openstack-nova-compute-ffu-$CELL spec: nodeSets: - openstack-${CELL} servicesOverride: - nova-${CELL} backoffLimit: 3 EOF doneWait for the Compute data plane services to be ready for all the cells:
$ oc wait --for condition=Ready openstackdataplanedeployments --all --timeout=5mRun Compute database online migrations to complete the upgrade:
$ oc exec -it nova-cell0-conductor-0 -- nova-manage db online_data_migrations $ for CELL in $(echo $RENAMED_CELLS); do $ oc exec -it nova-${CELL}-conductor-0 -- nova-manage db online_data_migrations doneDiscover the Compute hosts in the cells:
$ oc rsh nova-cell0-conductor-0 nova-manage cell_v2 discover_hosts --verboseIf you have a test VM that is not a production workload, complete the following verification steps:
Verify if the existing test VM instance is running:
${BASH_ALIASES[openstack]} server --os-compute-api-version 2.48 show --diagnostics test 2>&1 || echo FAILVerify if the Compute services can stop the existing test VM instance:
${BASH_ALIASES[openstack]} server list -c Name -c Status -f value | grep -qF "test ACTIVE" && ${BASH_ALIASES[openstack]} server stop test || echo PASS ${BASH_ALIASES[openstack]} server list -c Name -c Status -f value | grep -qF "test SHUTOFF" || echo FAIL ${BASH_ALIASES[openstack]} server --os-compute-api-version 2.48 show --diagnostics test 2>&1 || echo PASSVerify if the Compute services can start the existing test VM instance:
${BASH_ALIASES[openstack]} server list -c Name -c Status -f value | grep -qF "test SHUTOFF" && ${BASH_ALIASES[openstack]} server start test || echo PASS ${BASH_ALIASES[openstack]} server list -c Name -c Status -f value | grep -qF "test ACTIVE" && \ ${BASH_ALIASES[openstack]} server --os-compute-api-version 2.48 show --diagnostics test --fit-width -f json | jq -r '.state' | grep running || echo FAIL
Next steps
After the data plane adoption, the Compute hosts continue to run Red Hat Enterprise Linux (RHEL) 9.2. To take advantage of RHEL 9.4, perform a minor update procedure after finishing the adoption procedure.
5.6. Adopting Networker services to the RHOSO data plane
Adopt the Networker services in your existing Red Hat OpenStack Platform deployment to the Red Hat OpenStack Services on OpenShift (RHOSO) data plane. The Networker services could be running on Controller nodes or dedicated Networker nodes. You decide which services you want to run on the Networker nodes, and create a separate OpenStackDataPlaneNodeSet custom resource (CR) for the Networker nodes.
By definition, any node that has set enable-chassis-as-gw is considered a Networker node. After the adoption process, these nodes continue to be Networker nodes.
You can implement the following options if they apply to your environment:
- Depending on your topology, you might need to run the neutron-metadata service on the nodes, specifically when you want to serve metadata to SR-IOV ports that are hosted on Compute nodes.
- If you want to continue running OVN gateway services on Networker nodes, keep the ovn service in the list of services to deploy.
- Optional: You can run the neutron-dhcp service on your Networker nodes instead of your Compute nodes. You might not need to use neutron-dhcp with OVN, unless your deployment uses DHCP relays, or advanced DHCP options that are supported by dnsmasq but not by the OVN DHCP implementation.
Adopt each Controller or Networker node in your existing Red Hat OpenStack Platform deployment to the Red Hat OpenStack Services on OpenShift (RHOSO) data plane when the node is set as an OVN chassis gateway. Any node with the enable-chassis-as-gw parameter set is considered an OVN gateway chassis. Such nodes become EDPM Networker nodes after adoption.
Check for the nodes where
OVN Controller Gateway agentagents are running. The list of agents varies depending on the services you enabled:$ oc exec openstackclient -- openstack network agent list +--------------------------------------+------------------------------+--------------------------+-------------------+-------+-------+----------------------------+ | ID | Agent Type | Host | Availability Zone | Alive | State | Binary | +--------------------------------------+------------------------------+--------------------------+-------------------+-------+-------+----------------------------+ | e5075ee0-9dd9-4f0a-a42a-6bbdf1a6111c | OVN Controller Gateway agent | controller-0.localdomain | | XXX | UP | ovn-controller | | f3112349-054c-403a-b00a-e219238192b8 | OVN Controller agent | compute-0.localdomain | | XXX | UP | ovn-controller | | af9dae2d-1c1c-55a8-a743-f84719f6406d | OVN Metadata agent | compute-0.localdomain | | XXX | UP | neutron-ovn-metadata-agent | | 51a11df8-a66e-47a2-aec0-52eb8589626c | OVN Controller Gateway agent | controller-1.localdomain | | XXX | UP | ovn-controller | | bb817e5e-7832-410a-9e67-934dac8c602f | OVN Controller Gateway agent | controller-2.localdomain | | XXX | UP | ovn-controller | +--------------------------------------+------------------------------+--------------------------+-------------------+-------+-------+----------------------------+
Prerequisites
Define the shell variable. Based on the agent list output above, controller-0, controller-1, and controller-2 are the target hosts. If you have both Controller and Networker nodes running Networker services, add all of those hosts:
declare -A networkers
networkers+=(
  ["controller-0.localdomain"]="192.168.122.100"
  ["controller-1.localdomain"]="192.168.122.101"
  ["controller-2.localdomain"]="192.168.122.102"
  # ...
)
Replace
["<node-name>"]="192.168.122.100"with the name and IP address of the corresponding Networker or Controller node as per your environment.
-
Replace
Procedure
Deploy the
OpenStackDataPlaneNodeSetCR for your nodes:NoteYou can reuse most of the
nodeTemplatesection from theOpenStackDataPlaneNodeSetCR that is designated for your Compute nodes. You can omit some of the variables because of the limited set of services that are running on the Networker nodes.$ oc apply -f - <<EOF apiVersion: dataplane.openstack.org/v1beta1 kind: OpenStackDataPlaneNodeSet metadata: name: openstack-networker spec: tlsEnabled: false networkAttachments: - ctlplane preProvisioned: true services: - redhat - bootstrap - download-cache - configure-network - validate-network - install-os - configure-os - ssh-known-hosts - run-os - install-certs - ovn env: - name: ANSIBLE_CALLBACKS_ENABLED value: "profile_tasks" - name: ANSIBLE_FORCE_COLOR value: "True" nodes: controller-0: hostName: controller-0 ansible: ansibleHost: ${networkers[controller-0.localdomain]} networks: - defaultRoute: true fixedIP: ${networkers[controller-0.localdomain]} name: ctlplane subnetName: subnet1 - name: internalapi subnetName: subnet1 - name: storage subnetName: subnet1 - name: tenant subnetName: subnet1 controller-1: hostName: controller-1 ansible: ansibleHost: ${networkers[controller-1.localdomain]} networks: - defaultRoute: true fixedIP: ${networkers[controller-1.localdomain]} name: ctlplane subnetName: subnet1 - name: internalapi subnetName: subnet1 - name: storage subnetName: subnet1 - name: tenant subnetName: subnet1 controller-2: hostName: controller-2 ansible: ansibleHost: ${networkers[controller-2.localdomain]} networks: - defaultRoute: true fixedIP: ${networkers[controller-2.localdomain]} name: ctlplane subnetName: subnet1 - name: internalapi subnetName: subnet1 - name: storage subnetName: subnet1 - name: tenant subnetName: subnet1 nodeTemplate: ansibleSSHPrivateKeySecret: dataplane-adoption-secret ansible: ansibleUser: root ansibleVarsFrom: - secretRef: name: subscription-manager - secretRef: name: redhat-registry ansibleVars: rhc_release: 9.2 rhc_repositories: - {name: "*", state: disabled} - {name: "rhel-9-for-x86_64-baseos-eus-rpms", state: enabled} - {name: "rhel-9-for-x86_64-appstream-eus-rpms", state: enabled} - {name: "rhel-9-for-x86_64-highavailability-eus-rpms", state: enabled} - {name: "rhoso-18.0-for-rhel-9-x86_64-rpms", state: enabled} - {name: "fast-datapath-for-rhel-9-x86_64-rpms", state: enabled} - {name: "rhceph-7-tools-for-rhel-9-x86_64-rpms", state: enabled} edpm_bootstrap_release_version_package: [] # edpm_network_config # Default nic config template for a EDPM node # These vars are edpm_network_config role vars edpm_network_config_template: | --- {% set mtu_list = [ctlplane_mtu] %} {% for network in nodeset_networks %} {% set _ = mtu_list.append(lookup(vars, networks_lower[network] ~ _mtu)) %} {%- endfor %} {% set min_viable_mtu = mtu_list | max %} network_config: - type: ovs_bridge name: {{ neutron_physical_bridge_name }} mtu: {{ min_viable_mtu }} use_dhcp: false dns_servers: {{ ctlplane_dns_nameservers }} domain: {{ dns_search_domains }} addresses: - ip_netmask: {{ ctlplane_ip }}/{{ ctlplane_cidr }} routes: {{ ctlplane_host_routes }} members: - type: interface name: nic1 mtu: {{ min_viable_mtu }} # force the MAC address of the bridge to this interface primary: true {% for network in nodeset_networks %} - type: vlan mtu: {{ lookup(vars, networks_lower[network] ~ _mtu) }} vlan_id: {{ lookup(vars, networks_lower[network] ~ _vlan_id) }} addresses: - ip_netmask: {{ lookup(vars, networks_lower[network] ~ _ip) }}/{{ lookup(vars, networks_lower[network] ~ _cidr) }} routes: {{ lookup(vars, networks_lower[network] ~ _host_routes) }} {% endfor %} 
edpm_network_config_nmstate: false edpm_network_config_hide_sensitive_logs: false # # These vars are for the network config templates themselves and are # considered EDPM network defaults. neutron_physical_bridge_name: br-ctlplane neutron_public_interface_name: eth0 # edpm_nodes_validation edpm_nodes_validation_validate_controllers_icmp: false edpm_nodes_validation_validate_gateway_icmp: false # edpm ovn-controller configuration edpm_ovn_bridge_mappings: [<"bridge_mappings">] edpm_ovn_bridge: br-int edpm_ovn_encap_type: geneve ovn_monitor_all: true edpm_ovn_remote_probe_interval: 60000 edpm_ovn_ofctrl_wait_before_clear: 8000 # serve as a OVN gateway edpm_enable_chassis_gw: true timesync_ntp_servers: - hostname: clock.redhat.com - hostname: clock2.redhat.com gather_facts: false enable_debug: false # edpm firewall, change the allowed CIDR if needed edpm_sshd_configure_firewall: true edpm_sshd_allowed_ranges: [192.168.122.0/24] # SELinux module edpm_selinux_mode: enforcing # Do not attempt OVS major upgrades here edpm_ovs_packages: - openvswitch3.3 EOF-
- spec.tlsEnabled specifies whether TLS Everywhere is enabled. If TLS is enabled, change spec:tlsEnabled to true.
- edpm_ovn_bridge_mappings: Replace [<"bridge_mappings">] with the bridge mapping values that you used in your Red Hat OpenStack Platform 17.1 deployment, for example, ["datacentre:br-ctlplane"].
- edpm_enable_chassis_gw specifies whether to run ovn-controller in gateway mode.
-
Ensure that you use the same
ovn-controllersettings in theOpenStackDataPlaneNodeSetCR that you used in the Networker nodes before adoption. This configuration is stored in theexternal_idscolumn in theOpen_vSwitchtable in the Open vSwitch database:ovs-vsctl list Open . ... external_ids : {hostname=controller-0.localdomain, ovn-bridge=br-int, ovn-bridge-mappings=<bridge_mappings>, ovn-chassis-mac-mappings="datacentre:1e:0a:bb:e6:7c:ad", ovn-cms-options=enable-chassis-as-gw, ovn-encap-ip="172.19.0.100", ovn-encap-tos="0", ovn-encap-type=geneve, ovn-match-northd-version=False, ovn-monitor-all=True, ovn-ofctrl-wait-before-clear="8000", ovn-openflow-probe-interval="60", ovn-remote="tcp:ovsdbserver-sb.openstack.svc:6642", ovn-remote-probe-interval="60000", rundir="/var/run/openvswitch", system-id="2eec68e6-aa21-4c95-a868-31aeafc11736"} ...-
Replace
<bridge_mappings> with the value of the bridge mappings in your configuration, for example, "datacentre:br-ctlplane".
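To confirm that a node is still configured as a gateway chassis after adoption, you can query the same external_ids field that is shown in the ovs-vsctl output above. The hostname is illustrative:

$ ssh controller-0.localdomain sudo ovs-vsctl get Open_vSwitch . external_ids:ovn-cms-options
"enable-chassis-as-gw"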
-
Replace
Optional: Enable
neutron-metadatain theOpenStackDataPlaneNodeSetCR:$ oc patch openstackdataplanenodeset <networker_CR_name> --type='json' --patch='[ { "op": "add", "path": "/spec/services/-", "value": "neutron-metadata" }]'-
Replace
<networker_CR_name> with the name of the CR that you deployed for your Networker nodes, for example, openstack-networker.
-
Replace
Optional: Enable
neutron-dhcpin theOpenStackDataPlaneNodeSetCR:$ oc patch openstackdataplanenodeset <networker_CR_name> --type='json' --patch='[ { "op": "add", "path": "/spec/services/-", "value": "neutron-dhcp" }]'Run the
pre-adoption-validation service for Networker nodes: Create an
OpenStackDataPlaneDeployment CR that runs only the validation:$ oc apply -f - <<EOF apiVersion: dataplane.openstack.org/v1beta1 kind: OpenStackDataPlaneDeployment metadata: name: openstack-pre-adoption-networker spec: nodeSets: - openstack-networker servicesOverride: - pre-adoption-validation EOF When the validation is finished, confirm that the status of the Ansible EE pods is
Completed:$ watch oc get pod -l app=openstackansibleee$ oc logs -l app=openstackansibleee -f --max-log-requests 20Wait for the deployment to reach the
Ready status:$ oc wait --for condition=Ready openstackdataplanedeployment/openstack-pre-adoption-networker --timeout=10m
Deploy the
OpenStackDataPlaneDeployment CR for Networker nodes:$ oc apply -f - <<EOF apiVersion: dataplane.openstack.org/v1beta1 kind: OpenStackDataPlaneDeployment metadata: name: openstack-networker spec: nodeSets: - openstack-networker EOF Note: Alternatively, you can include the Networker node set in the
nodeSets list before you deploy the main OpenStackDataPlaneDeployment CR. You cannot add new node sets to the OpenStackDataPlaneDeployment CR after deployment. Clean up any Networking service (neutron) agents that are no longer running.
Note: In some cases, agents from the old data plane that are replaced or retired remain visible in RHOSO. Their function might now be provided by a new agent that runs in RHOSO, or it might be replaced by other components. For example, DHCP agents might no longer be needed, because OVN DHCP in RHOSO can provide this function.
List the agents:
$ oc exec openstackclient -- openstack network agent list +--------------------------------------+------------------------------+--------------------------+-------------------+-------+-------+----------------------------+ | ID | Agent Type | Host | Availability Zone | Alive | State | Binary | +--------------------------------------+------------------------------+--------------------------+-------------------+-------+-------+----------------------------+ | e5075ee0-9dd9-4f0a-a42a-6bbdf1a6111c | OVN Controller Gateway agent | controller-0.localdomain | | :-) | UP | ovn-controller | | 856960f0-5530-46c7-a331-6eadcba362da | DHCP agent | controller-1.localdomain | nova | XXX | UP | neutron-dhcp-agent | | 8bd22720-789f-45b8-8d7d-006dee862bf9 | DHCP agent | controller-2.localdomain | nova | XXX | UP | neutron-dhcp-agent | | e584e00d-be4c-4e98-a11a-4ecd87d21be7 | DHCP agent | controller-0.localdomain | nova | XXX | UP | neutron-dhcp-agent | +--------------------------------------+------------------------------+--------------------------+-------------------+-------+-------+----------------------------+If any agent in the list shows
XXX in the Alive field, check the Host and Agent Type, verify that the function of this agent is no longer required, and confirm that the agent has been permanently stopped on the Red Hat OpenStack Platform host. Then, delete the agent:$ oc exec openstackclient -- openstack network agent delete <agent_id>
Replace
<agent_id> with the ID of the agent to delete, for example, 856960f0-5530-46c7-a331-6eadcba362da.
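For example, to remove the three dead DHCP agents shown in the sample output above (the IDs in your environment will differ), a small loop such as the following sketch can be used:
$ for agent_id in 856960f0-5530-46c7-a331-6eadcba362da 8bd22720-789f-45b8-8d7d-006dee862bf9 e584e00d-be4c-4e98-a11a-4ecd87d21be7; do
    oc exec openstackclient -- openstack network agent delete "$agent_id"
  done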
-
Replace
Verification
Confirm that all the Ansible EE pods reach a
Completed status:$ watch oc get pod -l app=openstackansibleee $ oc logs -l app=openstackansibleee -f --max-log-requests 20 Wait for the data plane node set to reach the
Ready status:$ oc wait --for condition=Ready osdpns/<networker_CR_name> --timeout=30m
Replace
<networker_CR_name> with the name of the CR that you deployed for your Networker nodes, for example, openstack-networker.
-
Replace
Verify that the Networking service (neutron) agents are running. The list of agents varies depending on the services you enabled:
$ oc exec openstackclient -- openstack network agent list +--------------------------------------+------------------------------+--------------------------+-------------------+-------+-------+----------------------------+ | ID | Agent Type | Host | Availability Zone | Alive | State | Binary | +--------------------------------------+------------------------------+--------------------------+-------------------+-------+-------+----------------------------+ | e5075ee0-9dd9-4f0a-a42a-6bbdf1a6111c | OVN Controller Gateway agent | controller-0.localdomain | | :-) | UP | ovn-controller | | f3112349-054c-403a-b00a-e219238192b8 | OVN Controller agent | compute-0.localdomain | | :-) | UP | ovn-controller | | af9dae2d-1c1c-55a8-a743-f84719f6406d | OVN Metadata agent | compute-0.localdomain | | :-) | UP | neutron-ovn-metadata-agent | | 51a11df8-a66e-47a2-aec0-52eb8589626c | OVN Controller Gateway agent | controller-1.localdomain | | :-) | UP | ovn-controller | | bb817e5e-7832-410a-9e67-934dac8c602f | OVN Controller Gateway agent | controller-2.localdomain | | :-) | UP | ovn-controller | +--------------------------------------+------------------------------+--------------------------+-------------------+-------+-------+----------------------------+
5.7. Enabling the high availability for Compute instances service
To enable the high availability for Compute instances (Instance HA) service, you create the following resources:
- Fencing secret.
- Configuration map. You can create the configuration map manually, or the configuration map is created automatically when you deploy the Instance HA resource. However, you must create the configuration map manually if you want to disable the Instance HA service.
- Instance HA resource.
Prerequisites
-
You have created the
fencing-secret.yaml configuration file. For more information, see Maintaining the Instance HA functionality after adoption. - You have disabled Pacemaker on your Compute nodes. For more information, see Preventing Pacemaker from monitoring Compute nodes.
Procedure
Create the secret:
$ oc apply -f fencing-secret.yaml -n openstack Optional: Create the Instance HA configuration map and set the
DISABLEDparameter tofalse. For example:$ cat << EOF > iha-cm.yaml kind: ConfigMap metadata: name: instanceha-0-config namespace: openstack apiVersion: v1 data: config.yaml: | config: EVACUABLE_TAG: "evacuable" TAGGED_IMAGES: "true" TAGGED_FLAVORS: "true" TAGGED_AGGREGATES: "true" SMART_EVACUATION: "false" DELTA: "30" DELAY: "0" POLL: "45" THRESHOLD: "50" WORKERS: "4" RESERVED_HOSTS: "false" LEAVE_DISABLED: "false" CHECK_KDUMP: "false" LOGLEVEL: "info" DISABLED: "false" EOFApply the configuration:
$ oc apply -f iha-cm.yaml -n openstack Note: If you want to restrict which Compute nodes are evacuated, create host aggregates and set them by using the
EVACUABLE_TAG parameter. Alternatively, you can set the TAGGED_AGGREGATES parameter to false to enable monitoring and evacuation of all your Compute nodes. For more information about Instance HA service parameters, see Editing the Instance HA service parameters in Configuring high availability for instances.
Create an Instance HA resource and reference the fencing secret and configuration map. For example:
$ cat << EOF > iha.yaml apiVersion: instanceha.openstack.org/v1beta1 kind: InstanceHa metadata: name: instanceha-0 namespace: openstack spec: caBundleSecretName: combined-ca-bundle instanceHaConfigMap: fencingSecret: fencing-secret EOF-
spec.instanceHaConfigMap defines the name of the configuration map that contains the Instance HA configuration that you created. If you do not create this configuration map, a configuration map called instanceha-config is created automatically when the Instance HA service is installed, providing the default values of the Instance HA service parameters. You can then edit the values as needed.
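For example, to review the generated defaults and adjust them later, commands such as the following can be used; this sketch assumes that the automatically created configuration map keeps the default name instanceha-config mentioned above:
$ oc get configmap instanceha-config -n openstack -o yaml
$ oc edit configmap instanceha-config -n openstack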
-
Deploy the Instance HA resource:
$ oc apply -f iha.yaml -n openstack
Next steps
After you complete the Red Hat OpenStack Services on OpenShift adoption, remove the Pacemaker components from the Compute nodes. You must run the following commands on each Compute node:
$ sudo systemctl stop pacemaker_remote $ sudo systemctl stop pcsd $ sudo systemctl stop pcsd-ruby.service $ sudo systemctl disable pacemaker_remote $ sudo systemctl disable pcsd $ sudo systemctl disable pcsd-ruby.service $ sudo dnf remove pacemaker pacemaker-remote pcs pcsd -y
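A quick spot check on each Compute node might look like the following sketch; after the cleanup, the services should no longer be active and the packages should be reported as not installed:
$ sudo systemctl is-active pacemaker_remote pcsd
$ rpm -q pacemaker pacemaker-remote pcs pcsd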
5.8. Post-adoption tasks for the Load-balancing service
If you adopted the Load-balancing service (octavia), after you complete the data plane adoption, you must perform the following tasks:
- Upgrade the amphorae virtual machines to the new images.
- Remove obsolete resources from your existing load balancers.
Prerequisites
- You have adopted the Load-balancing service. For more information, see Adopting the Load-balancing service.
Procedure
Ensure that the connectivity between the new control plane and the adopted Compute nodes is functional by creating a new load balancer and checking that its
provisioning_status becomes ACTIVE:$ alias openstack="oc exec -t openstackclient -- openstack" $ openstack loadbalancer create --vip-subnet-id public-subnet --name lb-post-adoption --wait
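If you do not want to keep the temporary load balancer after the check, it can be inspected and removed, for example, by using the openstack alias defined above:
$ openstack loadbalancer show lb-post-adoption -c provisioning_status -f value
$ openstack loadbalancer delete lb-post-adoption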
$ openstack loadbalancer list -f value -c id | \ xargs -r -n1 -P4 ${BASH_ALIASES[openstack]} loadbalancer failover --waitDelete old flavors that were migrated to the new control plane:
$ openstack flavor delete octavia_65 # The following flavors might not exist in OSP 17.1 deployments $ openstack flavor show octavia_amphora-mvcpu-ha && \ openstack flavor delete octavia_amphora-mvcpu-ha $ openstack loadbalancer flavor show octavia_amphora-mvcpu-ha && \ openstack loadbalancer flavor delete octavia_amphora-mvcpu-ha $ openstack loadbalancer flavorprofile show octavia_amphora-mvcpu-ha_profile && \ openstack loadbalancer flavorprofile delete octavia_amphora-mvcpu-ha_profileNoteSome flavors might still be used by load balancers and cannot be deleted.
Delete the old management network and its ports:
$ for net_id in $(openstack network list -f value -c ID --name lb-mgmt-net); do \ desc=$(openstack network show "$net_id" -f value -c description); \ [ -z "$desc" ] && WALLABY_LB_MGMT_NET_ID="$net_id" ; \ done $ for id in $(openstack port list --network "$WALLABY_LB_MGMT_NET_ID" -f value -c ID); do \ openstack port delete "$id" ; \ done $ openstack network delete "$WALLABY_LB_MGMT_NET_ID"Verify that only one
lb-mgmt-netand onelb-mgmt-subnetexists:$ openstack network list | grep lb-mgmt-net | fe470c29-0482-4809-9996-6d636e3feea3 | lb-mgmt-net | 6a881091-097d-441c-937b-5a23f4f243b7 | $ openstack subnet list | grep lb-mgmt-subnet | 6a881091-097d-441c-937b-5a23f4f243b7 | lb-mgmt-subnet | fe470c29-0482-4809-9996-6d636e3feea3 | 172.24.0.0/16 |
Chapter 6. Migrating the Object Storage service to Red Hat OpenStack Services on OpenShift nodes
If you are using the Red Hat OpenStack Platform Object Storage service (swift) as an Object Storage service, you must migrate your Object Storage service to Red Hat OpenStack Services on OpenShift nodes.
If you are using the Object Storage API of the Ceph Object Gateway (RGW), you can skip this chapter and migrate your Red Hat Ceph Storage cluster. For more information, see "Migrating the Red Hat Ceph Storage cluster". If you are not planning to migrate Ceph daemons from Controller nodes, you must perform the steps that are described in "Deploying a Ceph ingress daemon" and "Create or update the Object Storage service endpoints".
The data migration happens replica by replica. For example, if you have 3 replicas, move them one at a time to ensure that the other 2 replicas are still operational, which enables you to continue to use the Object Storage service during the migration.
Data migration to the new deployment is a long-running process that executes mostly in the background. The Object Storage service replicators move data from old to new nodes, which might take a long time depending on the amount of storage used. To reduce downtime, you can use the old nodes if they are running and continue with adopting other services while waiting for the migration to complete. Performance might be degraded due to the amount of replication traffic in the network.
6.1. Migrating the Object Storage service data from RHOSP to RHOSO nodes
The Object Storage service (swift) migration involves the following steps:
- Add new nodes to the Object Storage service rings.
- Set weights of existing nodes to 0.
- Rebalance rings by moving one replica.
- Copy rings to old nodes and restart services.
- Check replication status and repeat the previous two steps until the old nodes are drained.
- Remove the old nodes from the rings.
Prerequisites
- Adopt the Object Storage service. For more information, see Adopting the Object Storage service.
For DNS servers, ensure that all existing nodes are able to resolve the hostnames of the Red Hat OpenShift Container Platform (RHOCP) pods, for example, by using the external IP of the DNSMasq service as the nameserver in
/etc/resolv.conf:$ oc get service dnsmasq-dns -o jsonpath="{.status.loadBalancer.ingress[0].ip}" | $CONTROLLER1_SSH sudo tee /etc/resolv.confTrack the current status of the replication by using the
swift-dispersion tool:$ oc debug --keep-labels=true job/swift-ring-rebalance -- /bin/sh -c 'swift-ring-tool get && swift-dispersion-populate' The command might need a few minutes to complete. It creates 0-byte objects that are distributed across the Object Storage service deployment, and you can use the
swift-dispersion-report afterward to show the current replication status:$ oc debug --keep-labels=true job/swift-ring-rebalance -- /bin/sh -c 'swift-ring-tool get && swift-dispersion-report' The output of the
swift-dispersion-report command looks similar to the following: Queried 1024 containers for dispersion reporting, 5s, 0 retries 100.00% of container copies found (3072 of 3072) Sample represents 100.00% of the container partition space Queried 1024 objects for dispersion reporting, 4s, 0 retries There were 1024 partitions missing 0 copies. 100.00% of object copies found (3072 of 3072) Sample represents 100.00% of the object partition space
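Before you begin the migration, it can also be useful to confirm that the existing nodes resolve the new pod host names, for example by checking one of the swift-storage service names that appear later in this procedure (a sketch; adjust the hostname to your deployment):
$CONTROLLER1_SSH "getent hosts swift-storage-0.swift-storage.openstack.svc"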
Procedure
Add new nodes by scaling up the SwiftStorage resource from 0 to 3:
$ oc patch openstackcontrolplane openstack --type=merge -p='{"spec":{"swift":{"template":{"swiftStorage":{"replicas": 3}}}}}'This command creates three storage instances on the Red Hat OpenShift Container Platform (RHOCP) cluster that use Persistent Volume Claims.
Wait until all three pods are running and the rings include the new devices:
$ oc wait pods --for condition=Ready -l component=swift-storage $ oc debug --keep-labels=true job/swift-ring-rebalance -- /bin/sh -c 'swift-ring-tool get && swift-ring-builder object.builder search --device pv'From the current rings, get the storage management IP addresses of the nodes to drain:
$ oc debug --keep-labels=true job/swift-ring-rebalance -- /bin/sh -c 'swift-ring-tool get && swift-ring-builder object.builder search _' | tail -n +2 | awk '{print $4}' | sort -uThe output looks similar to the following:
172.20.0.100 swift-storage-0.swift-storage.openstack.svc swift-storage-1.swift-storage.openstack.svc swift-storage-2.swift-storage.openstack.svcDrain the old nodes. In the following example, the old node
172.20.0.100 is drained:$ oc debug --keep-labels=true job/swift-ring-rebalance -- /bin/sh -c ' > swift-ring-tool get > swift-ring-tool drain 172.20.0.100 > swift-ring-tool rebalance > swift-ring-tool push'
Copy and apply the updated rings to the original nodes. Run the ssh commands for your existing nodes that store the Object Storage service data:
$ oc extract --confirm cm/swift-ring-files $ $CONTROLLER1_SSH "tar -C /var/lib/config-data/puppet-generated/swift/etc/swift/ -xzf -" < swiftrings.tar.gz $ $CONTROLLER1_SSH "systemctl restart tripleo_swift_*" $ $CONTROLLER2_SSH "tar -C /var/lib/config-data/puppet-generated/swift/etc/swift/ -xzf -" < swiftrings.tar.gz $ $CONTROLLER2_SSH "systemctl restart tripleo_swift_*" $ $CONTROLLER3_SSH "tar -C /var/lib/config-data/puppet-generated/swift/etc/swift/ -xzf -" < swiftrings.tar.gz $ $CONTROLLER3_SSH "systemctl restart tripleo_swift_*"Track the replication progress by using the
swift-dispersion-report tool:$ oc debug --keep-labels=true job/swift-ring-rebalance -- /bin/sh -c "swift-ring-tool get && swift-dispersion-report" The output shows less than 100% of copies found. Repeat the command until all container and object copies are found:
Queried 1024 containers for dispersion reporting, 6s, 0 retries There were 5 partitions missing 1 copy. 99.84% of container copies found (3067 of 3072) Sample represents 100.00% of the container partition space Queried 1024 objects for dispersion reporting, 7s, 0 retries There were 739 partitions missing 1 copy. There were 285 partitions missing 0 copies. 75.94% of object copies found (2333 of 3072) Sample represents 100.00% of the object partition spaceNoteThe rebalance command moves only one replica at a time to ensure that data is available continuously. This requires running the rebalance command multiple times to complete the full rebalance operation. Additionally, a minimum wait time of one hour between consecutive rebalance commands is enforced to prevent moving multiple replicas at the same time. Running the rebalance again before this period elapses has no effect.
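To see the enforced interval, you can, for example, inspect the ring builder metadata, which includes the min_part_hours value; this is a sketch based on the same debug pod used above:
$ oc debug --keep-labels=true job/swift-ring-rebalance -- /bin/sh -c 'swift-ring-tool get && swift-ring-builder object.builder'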
Move the next replica to the new nodes by rebalancing and distributing the rings:
$ oc debug --keep-labels=true job/swift-ring-rebalance -- /bin/sh -c ' > swift-ring-tool get > swift-ring-tool rebalance > swift-ring-tool push' $ oc extract --confirm cm/swift-ring-files $ $CONTROLLER1_SSH "tar -C /var/lib/config-data/puppet-generated/swift/etc/swift/ -xzf -" < swiftrings.tar.gz $ $CONTROLLER1_SSH "systemctl restart tripleo_swift_*"Monitor the
swift-dispersion-report output again, wait until all copies are found, and then repeat this step until all your replicas are moved to the new nodes. Remove the nodes from the rings:
$ oc debug --keep-labels=true job/swift-ring-rebalance -- /bin/sh -c ' > swift-ring-tool get > swift-ring-tool remove 172.20.0.100 > swift-ring-tool rebalance > swift-ring-tool push'NoteEven if all replicas are on the new nodes and the
swift-dispersion-report command reports 100% of the copies found, there might still be data on the old nodes. The replicators remove this data, but it might take more time.
Verification
Check the disk usage of all disks in the cluster:
$ oc debug --keep-labels=true job/swift-ring-rebalance -- /bin/sh -c 'swift-ring-tool get && swift-recon -d'Confirm that there are no more
*.db or *.data files in the /srv/node directory on the nodes:$CONTROLLER1_SSH "find /srv/node/ -type f -name '*.db' -o -name '*.data' | wc -l"
6.2. Troubleshooting the Object Storage service migration
You can troubleshoot issues with the Object Storage service (swift) migration.
If the replication is not working and the
swift-dispersion-report is not back to 100% availability, check the replicator progress to help you debug:$ CONTROLLER1_SSH tail /var/log/containers/swift/swift.log | grep object-server
Mar 14 06:05:30 standalone object-server[652216]: <f+++++++++ 4e2/9cbea55c47e243994b0b10d8957184e2/1710395823.58025.data Mar 14 06:05:30 standalone object-server[652216]: Successful rsync of /srv/node/vdd/objects/626/4e2 to swift-storage-1.swift-storage.openstack.svc::object/d1/objects/626 (0.094) Mar 14 06:05:30 standalone object-server[652216]: Removing partition: /srv/node/vdd/objects/626 Mar 14 06:05:31 standalone object-server[652216]: <f+++++++++ 85f/cf53b5a048e5b19049e05a548cde185f/1710395796.70868.data Mar 14 06:05:31 standalone object-server[652216]: Successful rsync of /srv/node/vdb/objects/829/85f to swift-storage-2.swift-storage.openstack.svc::object/d1/objects/829 (0.095) Mar 14 06:05:31 standalone object-server[652216]: Removing partition: /srv/node/vdb/objects/829You can also check the ring consistency and replicator status:
$ oc debug --keep-labels=true job/swift-ring-rebalance -- /bin/sh -c 'swift-ring-tool get && swift-recon -r --md5'The output might show a md5 mismatch until approximately 2 minutes after pushing the new rings. After the 2 minutes, the output looks similar to the following example:
[...] Oldest completion was 2024-03-14 16:53:27 (3 minutes ago) by 172.20.0.100:6000. Most recent completion was 2024-03-14 16:56:38 (12 seconds ago) by swift-storage-0.swift-storage.openstack.svc:6200. =============================================================================== [2024-03-14 16:56:50] Checking ring md5sums 4/4 hosts matched, 0 error[s] while checking hosts. [...]
Chapter 7. Migrating the Red Hat Ceph Storage cluster
In the context of data plane adoption, where the Red Hat OpenStack Platform (RHOSP) services are redeployed in Red Hat OpenShift Container Platform (RHOCP), you migrate a director-deployed Red Hat Ceph Storage cluster by using a process called “externalizing” the Red Hat Ceph Storage cluster.
There are two deployment topologies that include an internal Red Hat Ceph Storage cluster:
- RHOSP includes dedicated Red Hat Ceph Storage nodes to host object storage daemons (OSDs)
- Hyperconverged Infrastructure (HCI), where Compute and Storage services are colocated on hyperconverged nodes
In either scenario, there are some Red Hat Ceph Storage processes that are deployed on RHOSP Controller nodes: Red Hat Ceph Storage monitors, Ceph Object Gateway (RGW), Rados Block Device (RBD), Ceph Metadata Server (MDS), Ceph Dashboard, and NFS Ganesha. To migrate your Red Hat Ceph Storage cluster, you must decommission the Controller nodes and move the Red Hat Ceph Storage daemons to a set of target nodes that are already part of the Red Hat Ceph Storage cluster.
7.1. Prerequisites
- Before you begin the migration, complete the tasks in your Red Hat OpenStack Platform 17.1 environment. For more information, see "Red Hat Ceph Storage prerequisites" in the Adoption overview chapter.
7.2. Red Hat Ceph Storage migration for Distributed Compute Node deployments
Before you adopt your Distributed Compute Node (DCN) deployments that host Red Hat Ceph Storage clusters on Compute nodes at edge sites so that your architecture runs on Red Hat OpenStack Services on OpenShift (RHOSO), be aware of important considerations.
- Supported edge storage topologies
DCN deployments support the following storage topologies at edge sites:
- Hyperconverged Infrastructure (HCI): Red Hat Ceph Storage daemons run on Compute nodes at each edge site.
- director-deployed dedicated storage: Red Hat Ceph Storage runs on separate storage nodes deployed by director.
- External Red Hat Ceph Storage cluster: Edge sites connect to pre-existing Red Hat Ceph Storage clusters not managed by director.
- Central site Red Hat Ceph Storage migration
- For the central site, migrate Red Hat Ceph Storage daemons from the RHOSP Controller nodes by using the same process as a non-DCN deployment. For more information, see Red Hat Ceph Storage daemon cardinality.
- Edge site Red Hat Ceph Storage migration
For edge sites that use HCI or director-deployed dedicated storage, the Red Hat Ceph Storage daemons can continue to run on their current nodes without migration. The Compute nodes or dedicated storage nodes at edge sites are not decommissioned during adoption, so the Red Hat Ceph Storage daemons remain operational.
For edge sites that use external Red Hat Ceph Storage clusters, no migration is required because the Red Hat Ceph Storage cluster is not managed by director.
- Red Hat Ceph Storage back-end configuration and key distribution
In a DCN deployment, each site has its own Red Hat Ceph Storage cluster with its own configuration file and Red Hat Ceph Storage keyring. These must be stored in Kubernetes secrets and mounted into the appropriate Red Hat OpenStack Services on OpenShift (RHOSO) service pods.
Rather than storing all Red Hat Ceph Storage keys in a single secret accessible to every pod, the recommended approach is to create one secret per site containing only the keys that site actually needs. This limits the security impact if a site is compromised: a pod at an edge site can authenticate only to its local Red Hat Ceph Storage cluster and the central cluster, not to the Red Hat Ceph Storage keyrings of other edge sites.
The key distribution rule for N sites is:
-
The central site (site 0) receives the Red Hat Ceph Storage keys and configuration for all clusters, because central services such as Image service use the
split back end and must be able to copy images to and from any site. Each edge site (site 1 through N) receives only the keys for the central cluster and its own local cluster.
For example, in a three-site deployment with a central site and two edge sites:
ceph-conf-central -> central.conf + central.keyring dcn1.conf + dcn1.keyring dcn2.conf + dcn2.keyring ceph-conf-dcn1 -> central.conf + central.keyring dcn1.conf + dcn1.keyring ceph-conf-dcn2 -> central.conf + central.keyring dcn2.conf + dcn2.keyringThe per-site secrets are created and then mounted into the appropriate pods using
extraMounts propagation labels. The procedure in Configuring a Red Hat Ceph Storage back end covers both creating the secrets and applying the propagation labels so that each pod receives only its site-specific keys.
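For example, the dcn1 secret from the example above might be created as follows, assuming the configuration files and keyrings have been copied to the current directory; the referenced procedure covers the full workflow, including the propagation labels:
$ oc create secret generic ceph-conf-dcn1 \
    --from-file=central.conf --from-file=central.keyring \
    --from-file=dcn1.conf --from-file=dcn1.keyring \
    -n openstack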
-
7.3. Red Hat Ceph Storage daemon cardinality
Red Hat Ceph Storage 7 and later applies strict constraints in the way daemons can be colocated within the same node. For more information, see the Red Hat Knowledgebase article Red Hat Ceph Storage: Supported configurations. Your topology depends on the available hardware and the number of Red Hat Ceph Storage services on the Controller nodes that you retire. The number of services that you can migrate depends on the number of available nodes in the cluster. The following diagrams show the distribution of Red Hat Ceph Storage daemons on Red Hat Ceph Storage nodes where at least 3 nodes are required.
The following scenario includes only RGW and RBD, without the Red Hat Ceph Storage dashboard:
| | | | |----|---------------------|-------------| | osd | mon/mgr/crash | rgw/ingress | | osd | mon/mgr/crash | rgw/ingress | | osd | mon/mgr/crash | rgw/ingress |With the Red Hat Ceph Storage dashboard, but without Shared File Systems service (manila), at least 4 nodes are required. The Red Hat Ceph Storage dashboard has no failover:
| | | | |-----|---------------------|-------------| | osd | mon/mgr/crash | rgw/ingress | | osd | mon/mgr/crash | rgw/ingress | | osd | mon/mgr/crash | dashboard/grafana | | osd | rgw/ingress | (free) |With the Red Hat Ceph Storage dashboard and the Shared File Systems service, a minimum of 5 nodes are required, and the Red Hat Ceph Storage dashboard has no failover:
| | | | |-----|---------------------|-------------------------| | osd | mon/mgr/crash | rgw/ingress | | osd | mon/mgr/crash | rgw/ingress | | osd | mon/mgr/crash | mds/ganesha/ingress | | osd | rgw/ingress | mds/ganesha/ingress | | osd | mds/ganesha/ingress | dashboard/grafana |
7.4. Migrating the monitoring stack component to new nodes within an existing Red Hat Ceph Storage cluster
The Red Hat Ceph Storage Dashboard module adds web-based monitoring and administration to the Ceph Manager. With director-deployed Red Hat Ceph Storage, the Red Hat Ceph Storage Dashboard is enabled as part of the overcloud deploy and is composed of the following components:
- Ceph Manager module
- Grafana
- Prometheus
- Alertmanager
- Node exporter
The Red Hat Ceph Storage Dashboard containers are included through tripleo-container-image-prepare parameters, and high availability (HA) relies on HAProxy and Pacemaker to be deployed on the Red Hat OpenStack Platform (RHOSP) environment. For an external Red Hat Ceph Storage cluster, HA is not supported.
7.4.1. Prerequisites
You migrate and relocate the Ceph Monitoring components to free Controller nodes. Before you begin the migration, complete the tasks in your Red Hat OpenStack Platform 17.1 environment. For more information, see "Red Hat Ceph Storage prerequisites" in the "Adoption overview" chapter.
7.4.2. Migrating the monitoring stack to the target nodes
To migrate the monitoring stack to the target nodes, you add the monitoring label to your existing nodes and update the configuration of each daemon. You do not need to migrate node exporters. These daemons are deployed across the nodes that are part of the Red Hat Ceph Storage cluster (the placement is ‘*’).
Depending on the target nodes and the number of deployed or active daemons, you can either relocate the existing containers to the target nodes, or select a subset of nodes that host the monitoring stack daemons. High availability (HA) is not supported. Reducing the placement with count: 1 allows you to migrate the existing daemons in a Hyperconverged Infrastructure (HCI) or hardware-limited scenario without impacting other services.
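For example, a single-instance Grafana placement might look like the following sketch, which reuses the storage network and port from the spec shown later in this procedure and adds count: 1:
service_type: grafana
service_name: grafana
placement:
  label: monitoring
  count: 1
networks:
- 172.17.3.0/24
spec:
  port: 3100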
7.4.2.1. Migrating the existing daemons to the target nodes
The following procedure is an example of an environment with 3 Red Hat Ceph Storage nodes or ComputeHCI nodes. This scenario extends the monitoring labels to all the Red Hat Ceph Storage or ComputeHCI nodes that are part of the cluster. This means that you keep 3 placements for the target nodes.
Prerequisites
- Confirm that the firewall rules are in place and the ports are open for a given monitoring stack service.
Procedure
Add the monitoring label to all the Red Hat Ceph Storage or ComputeHCI nodes in the cluster:
for item in $(sudo cephadm shell -- ceph orch host ls --format json | jq -r '.[].hostname'); do sudo cephadm shell -- ceph orch host label add $item monitoring; doneVerify that all the hosts on the target nodes have the monitoring label:
[tripleo-admin@controller-0 ~]$ sudo cephadm shell -- ceph orch host ls HOST ADDR LABELS cephstorage-0.redhat.local 192.168.24.11 osd monitoring cephstorage-1.redhat.local 192.168.24.12 osd monitoring cephstorage-2.redhat.local 192.168.24.47 osd monitoring controller-0.redhat.local 192.168.24.35 _admin mon mgr monitoring controller-1.redhat.local 192.168.24.53 mon _admin mgr monitoring controller-2.redhat.local 192.168.24.10 mon _admin mgr monitoringRemove the labels from the Controller nodes:
$ for i in 0 1 2; do sudo cephadm shell -- ceph orch host label rm "controller-$i.redhat.local" monitoring; done Removed label monitoring from host controller-0.redhat.local Removed label monitoring from host controller-1.redhat.local Removed label monitoring from host controller-2.redhat.localDump the current monitoring stack spec:
function export_spec { local component="$1" local target_dir="$2" sudo cephadm shell -- ceph orch ls --export "$component" > "$target_dir/$component" } SPEC_DIR=${SPEC_DIR:-"$PWD/ceph_specs"} mkdir -p ${SPEC_DIR} for m in grafana prometheus alertmanager; do export_spec "$m" "$SPEC_DIR" doneFor each daemon, edit the current spec and replace the
placement.hosts: section with the placement.label: section, for example:service_type: grafana service_name: grafana placement: label: monitoring networks: - 172.17.3.0/24 spec: port: 3100
Apply the new monitoring spec to relocate the monitoring stack daemons:
SPEC_DIR=${SPEC_DIR:-"$PWD/ceph_specs"} function migrate_daemon { local component="$1" local target_dir="$2" sudo cephadm shell -m "$target_dir" -- ceph orch apply -i /mnt/ceph_specs/$component } for m in grafana prometheus alertmanager; do migrate_daemon "$m" "$SPEC_DIR" doneVerify that the daemons are deployed on the expected nodes:
[ceph: root@controller-0 /]# ceph orch ps | grep -iE "(prome|alert|grafa)" alertmanager.cephstorage-2 cephstorage-2.redhat.local 172.17.3.144:9093,9094 grafana.cephstorage-0 cephstorage-0.redhat.local 172.17.3.83:3100 prometheus.cephstorage-1 cephstorage-1.redhat.local 172.17.3.53:9092 Note: After you migrate the monitoring stack, you lose high availability. The monitoring stack daemons no longer have a virtual IP address or HAProxy in front of them. Node exporters are still running on all the nodes.
Review the Red Hat Ceph Storage configuration to ensure that it aligns with the configuration on the target nodes. In particular, focus on the following configuration entries:
[ceph: root@controller-0 /]# ceph config dump | grep -i dashboard ... mgr advanced mgr/dashboard/ALERTMANAGER_API_HOST http://172.17.3.83:9093 mgr advanced mgr/dashboard/GRAFANA_API_URL https://172.17.3.144:3100 mgr advanced mgr/dashboard/PROMETHEUS_API_HOST http://172.17.3.83:9092 mgr advanced mgr/dashboard/controller-0.ycokob/server_addr 172.17.3.33 mgr advanced mgr/dashboard/controller-1.lmzpuc/server_addr 172.17.3.147 mgr advanced mgr/dashboard/controller-2.xpdgfl/server_addr 172.17.3.138Verify that the
API_HOST/URLof thegrafana,alertmanagerandprometheusservices points to the IP addresses on the storage network of the node where each daemon is relocated:[ceph: root@controller-0 /]# ceph orch ps | grep -iE "(prome|alert|grafa)" alertmanager.cephstorage-0 cephstorage-0.redhat.local 172.17.3.83:9093,9094 alertmanager.cephstorage-1 cephstorage-1.redhat.local 172.17.3.53:9093,9094 alertmanager.cephstorage-2 cephstorage-2.redhat.local 172.17.3.144:9093,9094 grafana.cephstorage-0 cephstorage-0.redhat.local 172.17.3.83:3100 grafana.cephstorage-1 cephstorage-1.redhat.local 172.17.3.53:3100 grafana.cephstorage-2 cephstorage-2.redhat.local 172.17.3.144:3100 prometheus.cephstorage-0 cephstorage-0.redhat.local 172.17.3.83:9092 prometheus.cephstorage-1 cephstorage-1.redhat.local 172.17.3.53:9092 prometheus.cephstorage-2 cephstorage-2.redhat.local 172.17.3.144:9092[ceph: root@controller-0 /]# ceph config dump ... ... mgr advanced mgr/dashboard/ALERTMANAGER_API_HOST http://172.17.3.83:9093 mgr advanced mgr/dashboard/PROMETHEUS_API_HOST http://172.17.3.83:9092 mgr advanced mgr/dashboard/GRAFANA_API_URL https://172.17.3.144:3100NoteThe Ceph Dashboard, as the service provided by the Ceph
mgr, is not impacted by the relocation. You might experience an impact when the active mgr daemon is migrated or is force-failed. However, you can define 3 replicas in the Ceph Manager configuration to redirect requests to a different instance.
7.5. Migrating Red Hat Ceph Storage MDS to new nodes within the existing cluster
You can migrate the MDS daemon when the Shared File Systems service (manila), deployed with either a cephfs-native or ceph-nfs back end, is part of the overcloud deployment. The MDS migration is performed by cephadm, and you move the daemon placement from a hosts-based approach to a label-based approach. This ensures that you can visualize the status of the cluster and where daemons are placed by using the ceph orch host command. You can also get a general view of how the daemons are co-located within a given host, as described in the Red Hat Knowledgebase article Red Hat Ceph Storage: Supported configurations.
Prerequisites
- Complete the tasks in your Red Hat OpenStack Platform 17.1 environment. For more information, see Red Hat Ceph Storage prerequisites.
Procedure
Verify that the Red Hat Ceph Storage cluster is healthy and check the MDS status:
$ sudo cephadm shell -- ceph fs ls name: cephfs, metadata pool: manila_metadata, data pools: [manila_data ] $ sudo cephadm shell -- ceph mds stat cephfs:1 {0=mds.controller-2.oebubl=up:active} 2 up:standby $ sudo cephadm shell -- ceph fs status cephfs cephfs - 0 clients ====== RANK STATE MDS ACTIVITY DNS INOS DIRS CAPS 0 active mds.controller-2.oebubl Reqs: 0 /s 696 196 173 0 POOL TYPE USED AVAIL manila_metadata metadata 152M 141G manila_data data 3072M 141G STANDBY MDS mds.controller-0.anwiwd mds.controller-1.cwzhogRetrieve more detailed information on the Ceph File System (CephFS) MDS status:
$ sudo cephadm shell -- ceph fs dump e8 enable_multiple, ever_enabled_multiple: 1,1 default compat: compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds uses versioned encoding,6=dirfrag is stored in omap,8=no anchor table,9=file layout v2,10=snaprealm v2} legacy client fscid: 1 Filesystem 'cephfs' (1) fs_name cephfs epoch 5 flags 12 joinable allow_snaps allow_multimds_snaps created 2024-01-18T19:04:01.633820+0000 modified 2024-01-18T19:04:05.393046+0000 tableserver 0 root 0 session_timeout 60 session_autoclose 300 max_file_size 1099511627776 required_client_features {} last_failure 0 last_failure_osd_epoch 0 compat compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds uses versioned encoding,6=dirfrag is stored in omap,7=mds uses inline data,8=no anchor table,9=file layout v2,10=snaprealm v2} max_mds 1 in 0 up {0=24553} failed damaged stopped data_pools [7] metadata_pool 9 inline_data disabled balancer standby_count_wanted 1 [mds.mds.controller-2.oebubl{0:24553} state up:active seq 2 addr [v2:172.17.3.114:6800/680266012,v1:172.17.3.114:6801/680266012] compat {c=[1],r=[1],i=[7ff]}] Standby daemons: [mds.mds.controller-0.anwiwd{-1:14715} state up:standby seq 1 addr [v2:172.17.3.20:6802/3969145800,v1:172.17.3.20:6803/3969145800] compat {c=[1],r=[1],i=[7ff]}] [mds.mds.controller-1.cwzhog{-1:24566} state up:standby seq 1 addr [v2:172.17.3.43:6800/2227381308,v1:172.17.3.43:6801/2227381308] compat {c=[1],r=[1],i=[7ff]}] dumped fsmap epoch 8Check the OSD blocklist and clean up the client list:
$ sudo cephadm shell -- ceph osd blocklist ls $ for item in $(sudo cephadm shell -- ceph osd blocklist ls | awk '{print $1}'); do > sudo cephadm shell -- ceph osd blocklist rm $item; > doneNoteWhen a file system client is unresponsive or misbehaving, the access to the file system might be forcibly terminated. This process is called eviction. Evicting a CephFS client prevents it from communicating further with MDS daemons and OSD daemons.
Ordinarily, a blocklisted client cannot reconnect to the servers; you must unmount and then remount the client. However, permitting a client that was evicted to attempt to reconnect can be useful. Because CephFS uses the RADOS OSD blocklist to control client eviction, you can permit CephFS clients to reconnect by removing them from the blocklist.
Get the hosts that are currently part of the Red Hat Ceph Storage cluster:
[ceph: root@controller-0 /]# ceph orch host ls HOST ADDR LABELS STATUS cephstorage-0.redhat.local 192.168.24.25 osd cephstorage-1.redhat.local 192.168.24.50 osd cephstorage-2.redhat.local 192.168.24.47 osd controller-0.redhat.local 192.168.24.24 _admin mgr mon controller-1.redhat.local 192.168.24.42 mgr _admin mon controller-2.redhat.local 192.168.24.37 mgr _admin mon 6 hosts in clusterApply the MDS labels to the target nodes:
for item in $(sudo cephadm shell -- ceph orch host ls --format json | jq -r '.[].hostname'); do sudo cephadm shell -- ceph orch host label add $item mds; doneVerify that all the hosts have the MDS label:
$ sudo cephadm shell -- ceph orch host ls HOST ADDR LABELS cephstorage-0.redhat.local 192.168.24.11 osd mds cephstorage-1.redhat.local 192.168.24.12 osd mds cephstorage-2.redhat.local 192.168.24.47 osd mds controller-0.redhat.local 192.168.24.35 _admin mon mgr mds controller-1.redhat.local 192.168.24.53 mon _admin mgr mds controller-2.redhat.local 192.168.24.10 mon _admin mgr mdsDump the current MDS spec:
$ SPEC_DIR=${SPEC_DIR:-"$PWD/ceph_specs"} $ mkdir -p ${SPEC_DIR} $ sudo cephadm shell -- ceph orch ls --export mds > ${SPEC_DIR}/mdsEdit the retrieved spec and replace the
placement.hosts section with placement.label:service_type: mds service_id: mds service_name: mds.mds placement: label: mds
ceph orchestrator to apply the new MDS spec:$ SPEC_DIR=${SPEC_DIR:-"$PWD/ceph_specs"} $ sudo cephadm shell -m ${SPEC_DIR}/mds -- ceph orch apply -i /mnt/mds Scheduling new mds deployment ... This results in an increased number of MDS daemons.
Check the new standby daemons that are temporarily added to the CephFS:
$ sudo cephadm shell -- ceph fs dump Active standby_count_wanted 1 [mds.mds.controller-0.awzplm{0:463158} state up:active seq 307 join_fscid=1 addr [v2:172.17.3.20:6802/51565420,v1:172.17.3.20:6803/51565420] compat {c=[1],r=[1],i=[7ff]}] Standby daemons: [mds.mds.cephstorage-1.jkvomp{-1:463800} state up:standby seq 1 join_fscid=1 addr [v2:172.17.3.135:6820/2075903648,v1:172.17.3.135:6821/2075903648] compat {c=[1],r=[1],i=[7ff]}] [mds.mds.controller-2.gfrqvc{-1:475945} state up:standby seq 1 addr [v2:172.17.3.114:6800/2452517189,v1:172.17.3.114:6801/2452517189] compat {c=[1],r=[1],i=[7ff]}] [mds.mds.cephstorage-0.fqcshx{-1:476503} state up:standby seq 1 join_fscid=1 addr [v2:172.17.3.92:6820/4120523799,v1:172.17.3.92:6821/4120523799] compat {c=[1],r=[1],i=[7ff]}] [mds.mds.cephstorage-2.gnfhfe{-1:499067} state up:standby seq 1 addr [v2:172.17.3.79:6820/2448613348,v1:172.17.3.79:6821/2448613348] compat {c=[1],r=[1],i=[7ff]}] [mds.mds.controller-1.tyiziq{-1:499136} state up:standby seq 1 addr [v2:172.17.3.43:6800/3615018301,v1:172.17.3.43:6801/3615018301] compat {c=[1],r=[1],i=[7ff]}]To migrate MDS to the target nodes, set the MDS affinity that manages the MDS failover:
Note: It is possible to elect a dedicated MDS as "active" for a particular file system. To configure this preference,
CephFS provides a configuration option for MDS called mds_join_fs, which enforces this affinity. When failing over MDS daemons, cluster monitors prefer standby daemons with mds_join_fs equal to the file system name with the failed rank. If no standby exists with mds_join_fs equal to the file system name, it chooses an unqualified standby as a replacement.$ sudo cephadm shell -- ceph config set mds.mds.cephstorage-0.fqcshx mds_join_fs cephfs
Replace
mds.mds.cephstorage-0.fqcshx with the daemon deployed on cephstorage-0 that was retrieved from the previous step.
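For example, to confirm that the affinity is set before you force the failover, the configuration can be dumped and filtered; this is a sketch based on the commands used elsewhere in this chapter:
$ sudo cephadm shell -- ceph config dump | grep mds_join_fs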
-
Replace
Remove the labels from the Controller nodes and force the MDS failover to the target node:
$ for i in 0 1 2; do sudo cephadm shell -- ceph orch host label rm "controller-$i.redhat.local" mds; done Removed label mds from host controller-0.redhat.local Removed label mds from host controller-1.redhat.local Removed label mds from host controller-2.redhat.localThe switch to the target node happens in the background. The new active MDS is the one that you set by using the
mds_join_fs command. Check the result of the failover and the newly deployed daemons:
$ sudo cephadm shell -- ceph fs dump … … standby_count_wanted 1 [mds.mds.cephstorage-0.fqcshx{0:476503} state up:active seq 168 join_fscid=1 addr [v2:172.17.3.92:6820/4120523799,v1:172.17.3.92:6821/4120523799] compat {c=[1],r=[1],i=[7ff]}] Standby daemons: [mds.mds.cephstorage-2.gnfhfe{-1:499067} state up:standby seq 1 addr [v2:172.17.3.79:6820/2448613348,v1:172.17.3.79:6821/2448613348] compat {c=[1],r=[1],i=[7ff]}] [mds.mds.cephstorage-1.jkvomp{-1:499760} state up:standby seq 1 join_fscid=1 addr [v2:172.17.3.135:6820/452139733,v1:172.17.3.135:6821/452139733] compat {c=[1],r=[1],i=[7ff]}] $ sudo cephadm shell -- ceph orch ls NAME PORTS RUNNING REFRESHED AGE PLACEMENT crash 6/6 10m ago 10d * mds.mds 3/3 10m ago 32m label:mds $ sudo cephadm shell -- ceph orch ps | grep mds mds.mds.cephstorage-0.fqcshx cephstorage-0.redhat.local running (79m) 3m ago 79m 27.2M - 17.2.6-100.el9cp 1af7b794f353 2a2dc5ba6d57 mds.mds.cephstorage-1.jkvomp cephstorage-1.redhat.local running (79m) 3m ago 79m 21.5M - 17.2.6-100.el9cp 1af7b794f353 7198b87104c8 mds.mds.cephstorage-2.gnfhfe cephstorage-2.redhat.local running (79m) 3m ago 79m 24.2M - 17.2.6-100.el9cp 1af7b794f353 f3cb859e2a15
7.6. Migrating Red Hat Ceph Storage RGW to external RHEL nodes
For Hyperconverged Infrastructure (HCI) or dedicated Storage nodes, you must migrate the Ceph Object Gateway (RGW) daemons that are included in the Red Hat OpenStack Platform Controller nodes into the existing external Red Hat Enterprise Linux (RHEL) nodes. The external RHEL nodes typically include the Compute nodes for an HCI environment or Red Hat Ceph Storage nodes. Your environment must have Red Hat Ceph Storage 7 or later and be managed by cephadm or Ceph Orchestrator.
7.6.1. Prerequisites
Before you begin the migration, complete the tasks in your Red Hat OpenStack Platform 17.1 environment. For more information, see the "Red Hat Ceph Storage prerequisites" in the "Adoption overview" chapter.
7.6.2. Migrating the Red Hat Ceph Storage RGW back ends
You must migrate your Ceph Object Gateway (RGW) back ends from your Controller nodes to your Red Hat Ceph Storage nodes. To ensure that you distribute the correct amount of services to your available nodes, you use cephadm labels to refer to a group of nodes where a given daemon type is deployed. For more information about the cardinality diagram, see Red Hat Ceph Storage daemon cardinality. The following procedure assumes that you have three target nodes, cephstorage-0, cephstorage-1, cephstorage-2.
Procedure
Add the RGW label to the Red Hat Ceph Storage nodes that you want to migrate your RGW back ends to:
$ sudo cephadm shell -- ceph orch host label add cephstorage-0 rgw; $ sudo cephadm shell -- ceph orch host label add cephstorage-1 rgw; $ sudo cephadm shell -- ceph orch host label add cephstorage-2 rgw; Added label rgw to host cephstorage-0 Added label rgw to host cephstorage-1 Added label rgw to host cephstorage-2 $ sudo cephadm shell -- ceph orch host ls HOST ADDR LABELS STATUS cephstorage-0 192.168.24.54 osd rgw cephstorage-1 192.168.24.44 osd rgw cephstorage-2 192.168.24.30 osd rgw controller-0 192.168.24.45 _admin mon mgr controller-1 192.168.24.11 _admin mon mgr controller-2 192.168.24.38 _admin mon mgr 6 hosts in clusterLocate the RGW spec and dump in the spec directory:
$ SPEC_DIR=${SPEC_DIR:-"$PWD/ceph_specs"} $ mkdir -p ${SPEC_DIR} $ sudo cephadm shell -- ceph orch ls --export rgw > ${SPEC_DIR}/rgw $ cat ${SPEC_DIR}/rgwnetworks: - 172.17.3.0/24 placement: hosts: - controller-0 - controller-1 - controller-2 service_id: rgw service_name: rgw.rgw service_type: rgw spec: rgw_frontend_port: 8080 rgw_realm: default rgw_zone: defaultThis example assumes that
172.17.3.0/24 is the storage network. In the
placement section, ensure that the label and rgw_frontend_port values are set:--- networks: - 172.17.3.0/24 placement: label: rgw service_id: rgw service_name: rgw.rgw service_type: rgw spec: rgw_frontend_port: 8090 rgw_realm: default rgw_zone: default rgw_frontend_ssl_certificate: ... ssl: true
networks defines the storage network where the RGW back ends are deployed.
placement.label: rgw replaces the Controller nodes with the rgw label.
spec.rgw_frontend_port specifies the value as 8090 to avoid conflicts with the Ceph ingress daemon.
spec.rgw_frontend_ssl_certificate defines the SSL certificate and key concatenation if TLS is enabled as described in Configuring RGW with TLS for an external Red Hat Ceph Storage cluster in Configuring persistent storage.
-
Apply the new RGW spec by using the orchestrator CLI:
$ SPEC_DIR=${SPEC_DIR:-"$PWD/ceph_specs"} $ sudo cephadm shell -m ${SPEC_DIR}/rgw -- ceph orch apply -i /mnt/rgwThis command triggers the redeploy, for example:
... osd.9 cephstorage-2 rgw.rgw.cephstorage-0.wsjlgx cephstorage-0 172.17.3.23:8090 starting rgw.rgw.cephstorage-1.qynkan cephstorage-1 172.17.3.26:8090 starting rgw.rgw.cephstorage-2.krycit cephstorage-2 172.17.3.81:8090 starting rgw.rgw.controller-1.eyvrzw controller-1 172.17.3.146:8080 running (5h) rgw.rgw.controller-2.navbxa controller-2 172.17.3.66:8080 running (5h) ... osd.9 cephstorage-2 rgw.rgw.cephstorage-0.wsjlgx cephstorage-0 172.17.3.23:8090 running (19s) rgw.rgw.cephstorage-1.qynkan cephstorage-1 172.17.3.26:8090 running (16s) rgw.rgw.cephstorage-2.krycit cephstorage-2 172.17.3.81:8090 running (13s)Ensure that the new RGW back ends are reachable on the new ports, so you can enable an ingress daemon on port
8080later. Log in to each Red Hat Ceph Storage node that includes RGW and add theiptablesrule to allow connections to both 8080 and 8090 ports in the Red Hat Ceph Storage nodes:$ iptables -I INPUT -p tcp -m tcp --dport 8080 -m conntrack --ctstate NEW -m comment --comment "ceph rgw ingress" -j ACCEPT $ iptables -I INPUT -p tcp -m tcp --dport 8090 -m conntrack --ctstate NEW -m comment --comment "ceph rgw backends" -j ACCEPT $ sudo iptables-save $ sudo systemctl restart iptablesIf
nftables is used in the existing deployment, edit /etc/nftables/tripleo-rules.nft and add the following content:# 100 ceph_rgw {'dport': ['8080','8090']} add rule inet filter TRIPLEO_INPUT tcp dport { 8080,8090 } ct state new counter accept comment "100 ceph_rgw"- Save the file.
Restart the
nftables service:$ sudo systemctl restart nftables
$ sudo nft list ruleset | grep ceph_rgwFrom a Controller node, such as
controller-0, try to reach the RGW back ends:$ curl http://cephstorage-0.storage:8090;You should observe the following output:
<?xml version="1.0" encoding="UTF-8"?><ListAllMyBucketsResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/"><Owner><ID>anonymous</ID><DisplayName></DisplayName></Owner><Buckets></Buckets></ListAllMyBucketsResult>Repeat the verification for each node where a RGW daemon is deployed.
If you migrated RGW back ends to the Red Hat Ceph Storage nodes, there is no
internalAPInetwork, except in the case of HCI nodes. You must reconfigure the RGW keystone endpoint to point to the external network that you propagated:[ceph: root@controller-0 /]# ceph config dump | grep keystone global basic rgw_keystone_url http://172.16.1.111:5000 [ceph: root@controller-0 /]# ceph config set global rgw_keystone_url http://<keystone_endpoint>:5000-
Replace
<keystone_endpoint> with the Identity service (keystone) internal endpoint of the service that is deployed in the OpenStackControlPlane CR when you adopt the Identity service. For more information, see Adopting the Identity service.
-
Replace
7.6.3. Deploying a Red Hat Ceph Storage ingress daemon
To deploy the Ceph ingress daemon, you perform the following actions:
-
Remove the existing
ceph_rgwconfiguration. - Clean up the configuration created by director.
- Redeploy the Object Storage service (swift).
When you deploy the ingress daemon, two new containers are created:
- HAProxy, which you use to reach the back ends.
- Keepalived, which you use to own the virtual IP address.
You use the rgw label to distribute the ingress daemon to only the number of nodes that host Ceph Object Gateway (RGW) daemons. For more information about distributing daemons among your nodes, see Red Hat Ceph Storage daemon cardinality.
After you complete this procedure, you can reach the RGW back end from the ingress daemon and use RGW through the Object Storage service CLI.
Procedure
Log in to each Controller node and remove the following configuration from the
/var/lib/config-data/puppet-generated/haproxy/etc/haproxy/haproxy.cfgfile:listen ceph_rgw bind 10.0.0.103:8080 transparent mode http balance leastconn http-request set-header X-Forwarded-Proto https if { ssl_fc } http-request set-header X-Forwarded-Proto http if !{ ssl_fc } http-request set-header X-Forwarded-Port %[dst_port] option httpchk GET /swift/healthcheck option httplog option forwardfor server controller-0.storage.redhat.local 172.17.3.73:8080 check fall 5 inter 2000 rise 2 server controller-1.storage.redhat.local 172.17.3.146:8080 check fall 5 inter 2000 rise 2 server controller-2.storage.redhat.local 172.17.3.156:8080 check fall 5 inter 2000 rise 2Restart
haproxy-bundleand confirm that it is started:[root@controller-0 ~]# sudo pcs resource restart haproxy-bundle haproxy-bundle successfully restarted [root@controller-0 ~]# sudo pcs status | grep haproxy * Container bundle set: haproxy-bundle [undercloud-0.ctlplane.redhat.local:8787/rh-osbs/rhosp17-openstack-haproxy:pcmklatest]: * haproxy-bundle-podman-0 (ocf:heartbeat:podman): Started controller-0 * haproxy-bundle-podman-1 (ocf:heartbeat:podman): Started controller-1 * haproxy-bundle-podman-2 (ocf:heartbeat:podman): Started controller-2Confirm that no process is connected to port 8080:
[root@controller-0 ~]# ss -antop | grep 8080 [root@controller-0 ~]#You can expect the Object Storage service (swift) CLI to fail to establish the connection:
(overcloud) [root@cephstorage-0 ~]# swift list HTTPConnectionPool(host='10.0.0.103', port=8080): Max retries exceeded with url: /swift/v1/AUTH_852f24425bb54fa896476af48cbe35d3?format=json (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fc41beb0430>: Failed to establish a new connection: [Errno 111] Connection refused'))Set the required images for both HAProxy and Keepalived:
[ceph: root@controller-0 /]# ceph config set mgr mgr/cephadm/container_image_haproxy registry.redhat.io/rhceph/rhceph-haproxy-rhel9:latest [ceph: root@controller-0 /]# ceph config set mgr mgr/cephadm/container_image_keepalived registry.redhat.io/rhceph/keepalived-rhel9:latestCreate a file called
rgw_ingressincontroller-0:$ SPEC_DIR=${SPEC_DIR:-"$PWD/ceph_specs"} $ vim ${SPEC_DIR}/rgw_ingressPaste the following content into the
rgw_ingressfile:--- service_type: ingress service_id: rgw.rgw placement: label: rgw spec: backend_service: rgw.rgw virtual_ip: 10.0.0.89/24 frontend_port: 8080 monitor_port: 8898 virtual_interface_networks: - <external_network> ssl_cert: ...-
Replace
<external_network> with your external network, for example, 10.0.0.0/24. For more information, see Completing prerequisites for migrating Red Hat Ceph Storage RGW. - If TLS is enabled, add the SSL certificate and key concatenation as described in Configuring RGW with TLS for an external Red Hat Ceph Storage cluster in Configuring persistent storage.
-
Replace
Apply the
rgw_ingressspec by using the Ceph orchestrator CLI:$ SPEC_DIR=${SPEC_DIR:-"$PWD/ceph_specs"} $ cephadm shell -m ${SPEC_DIR}/rgw_ingress -- ceph orch apply -i /mnt/rgw_ingressWait until the ingress is deployed and query the resulting endpoint:
$ sudo cephadm shell -- ceph orch ls NAME PORTS RUNNING REFRESHED AGE PLACEMENT crash 6/6 6m ago 3d * ingress.rgw.rgw 10.0.0.89:8080,8898 6/6 37s ago 60s label:rgw mds.mds 3/3 6m ago 3d controller-0;controller-1;controller-2 mgr 3/3 6m ago 3d controller-0;controller-1;controller-2 mon 3/3 6m ago 3d controller-0;controller-1;controller-2 osd.default_drive_group 15 37s ago 3d cephstorage-0;cephstorage-1;cephstorage-2 rgw.rgw ?:8090 3/3 37s ago 4m label:rgw$ curl 10.0.0.89:8080 --- <?xml version="1.0" encoding="UTF-8"?><ListAllMyBucketsResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/"><Owner><ID>anonymous</ID><DisplayName></DisplayName></Owner><Buckets></Buckets></ListAllMyBucketsResult>[ceph: root@controller-0 /]# —
7.6.4. Create or update the Object Storage service endpoints
You must create or update the Object Storage service (swift) endpoints to configure the new virtual IP address (VIP) that you reserved on the same network that you used to deploy RGW ingress.
Procedure
List the current swift endpoints and service:
$ oc rsh openstackclient openstack endpoint list | grep "swift.*object"$ oc rsh openstackclient openstack service list | grep "swift.*object"If the service and endpoints do not exist, create the missing swift resources:
$ oc rsh openstackclient openstack service create --name swift --description 'OpenStack Object Storage' object-store $ oc rsh openstackclient openstack role add --user swift --project service member $ oc rsh openstackclient openstack role add --user swift --project service admin > for i in public internal; do > oc rsh openstackclient openstack endpoint create --region regionOne object-store $i http://<RGW_VIP>:8080/swift/v1/AUTH_%\(tenant_id\)s > done $ oc rsh openstackclient openstack role add --project admin --user admin swiftoperator
Replace
<RGW_VIP> with the Ceph RGW ingress VIP.
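For example, with the ingress virtual IP address 10.0.0.89 that is used in the ingress example earlier in this chapter, a hypothetical public endpoint creation would look like:
$ oc rsh openstackclient openstack endpoint create --region regionOne object-store public http://10.0.0.89:8080/swift/v1/AUTH_%\(tenant_id\)s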
-
Replace
If the endpoints exist, update the endpoints to point to the right RGW ingress VIP:
$ oc rsh openstackclient openstack endpoint set --url http://<RGW_VIP>:8080/swift/v1/AUTH_%\(tenant_id\)s <swift_public_endpoint_uuid> $ oc rsh openstackclient openstack endpoint set --url http://<RGW_VIP>:8080/swift/v1/AUTH_%\(tenant_id\)s <swift_internal_endpoint_uuid> $ oc rsh openstackclient openstack endpoint list | grep object | 0d682ad71b564cf386f974f90f80de0d | regionOne | swift | object-store | True | public | http://172.18.0.100:8080/swift/v1/AUTH_%(tenant_id)s | | b311349c305346f39d005feefe464fb1 | regionOne | swift | object-store | True | internal | http://172.18.0.100:8080/swift/v1/AUTH_%(tenant_id)s |-
- Replace <swift_public_endpoint_uuid> with the UUID of the swift public endpoint.
- Replace <swift_internal_endpoint_uuid> with the UUID of the swift internal endpoint. For a quick way to retrieve these UUIDs, see the sketch after this list.
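The following is a minimal sketch, assuming the openstackclient pod is reachable, for listing the endpoint UUIDs that the previous commands expect:

$ for iface in public internal; do
    oc rsh openstackclient openstack endpoint list --service object-store --interface $iface -f value -c ID
  done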
Test the migrated service:
$ oc rsh openstackclient openstack container list --debug
...
REQ: curl -g -i -X GET http://keystone-public-openstack.apps.ocp.openstack.lab -H "Accept: application/json" -H "User-Agent: openstacksdk/1.0.2 keystoneauth1/5.1.3 python-requests/2.25.1 CPython/3.9.23"
Starting new HTTP connection (1): keystone-public-openstack.apps.ocp.openstack.lab:80
http://keystone-public-openstack.apps.ocp.openstack.lab:80 "GET / HTTP/1.1" 300 298
RESP: [300] content-length: 298 content-type: application/json date: Mon, 14 Jul 2025 17:41:29 GMT location: http://keystone-public-openstack.apps.ocp.openstack.lab/v3/ server: Apache set-cookie: b5697f82cf3c19ece8be533395142512=d5c6a9ee2267c4b63e9f656ad7565270; path=/; HttpOnly vary: X-Auth-Token x-openstack-request-id: req-452e42c5-e60f-440f-a6e8-fe1b9ea89055
RESP BODY: {"versions": {"values": [{"id": "v3.14", "status": "stable", "updated": "2020-04-07T00:00:00Z", "links": [{"rel": "self", "href": "http://keystone-public-openstack.apps.ocp.openstack.lab/v3/"}], "media-types": [{"base": "application/json", "type": "application/vnd.openstack.identity-v3+json"}]}]}}
GET call to http://keystone-public-openstack.apps.ocp.openstack.lab/ used request id req-452e42c5-e60f-440f-a6e8-fe1b9ea89055
...
REQ: curl -g -i -X GET "http://172.18.0.100:8080/swift/v1/AUTH_44477474b0dc4b5b8911ceec23a22246?format=json" -H "User-Agent: openstacksdk/1.0.2 keystoneauth1/5.1.3 python-requests/2.25.1 CPython/3.9.23" -H "X-Auth-Token: {SHA256}ec5deca0be37bd8bfe659f132b9cdf396b8f409db5dc16972d50cbf3f28474d4"
Starting new HTTP connection (1): 172.18.0.100:8080
http://172.18.0.100:8080 "GET /swift/v1/AUTH_44477474b0dc4b5b8911ceec23a22246?format=json HTTP/1.1" 200 2
RESP: [200] accept-ranges: bytes content-length: 2 content-type: application/json; charset=utf-8 date: Mon, 14 Jul 2025 17:41:31 GMT x-account-bytes-used: 0 x-account-bytes-used-actual: 0 x-account-container-count: 0 x-account-object-count: 0 x-account-storage-policy-default-placement-bytes-used: 0 x-account-storage-policy-default-placement-bytes-used-actual: 0 x-account-storage-policy-default-placement-container-count: 0 x-account-storage-policy-default-placement-object-count: 0 x-openstack-request-id: tx000001e95361131ccf694-006875414a-7753-default x-timestamp: 1752514891.25991 x-trans-id: tx000001e95361131ccf694-006875414a-7753-default
RESP BODY: []
GET call to http://172.18.0.100:8080/swift/v1/AUTH_44477474b0dc4b5b8911ceec23a22246?format=json used request id tx000001e95361131ccf694-006875414a-7753-default
clean_up ListContainer:
END return value: 0
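Beyond the debug listing, a quick end-to-end smoke test is to create, list, and delete a container through the adopted object-store endpoint. This is a minimal sketch; the container name is arbitrary:

$ oc rsh openstackclient openstack container create adoption-smoke-test
$ oc rsh openstackclient openstack container list
$ oc rsh openstackclient openstack container delete adoption-smoke-test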
7.7. Migrating Red Hat Ceph Storage RBD to external RHEL nodes
For Hyperconverged Infrastructure (HCI) or dedicated Storage nodes that are running Red Hat Ceph Storage 7 or later, you must migrate the daemons that are included in the Red Hat OpenStack Platform control plane into the existing external Red Hat Enterprise Linux (RHEL) nodes. The external RHEL nodes typically include the Compute nodes for an HCI environment or dedicated storage nodes.
7.7.1. Prerequisites
Before you begin the migration, complete the tasks in your Red Hat OpenStack Platform 17.1 environment. For more information, see the "Red Hat Ceph Storage prerequisites" in the "Adoption overview" chapter.
7.7.2. Migrating Ceph Manager daemons to Red Hat Ceph Storage nodes
You must migrate your Ceph Manager daemons from the Red Hat OpenStack Platform (RHOSP) Controller nodes to a set of target nodes. Target nodes are either existing Red Hat Ceph Storage nodes, or RHOSP Compute nodes if Red Hat Ceph Storage is deployed by director with a Hyperconverged Infrastructure (HCI) topology.
The following procedure uses cephadm and the Ceph Orchestrator to drive the Ceph Manager migration, and uses the Ceph spec to modify the placement and reschedule the Ceph Manager daemons. Ceph Manager runs in an active/passive configuration and provides many modules, including the Ceph Orchestrator. Every module that ceph-mgr provides, such as the Ceph Dashboard, is implicitly migrated with Ceph Manager.
Procedure
SSH into the target node and enable the firewall rules that are required to reach a Ceph Manager service:
dports="6800:7300"
ssh heat-admin@<target_node> sudo iptables -I INPUT \
    -p tcp --match multiport --dports $dports -j ACCEPT;

- Replace <target_node> with the hostname of a host that is listed in the Red Hat Ceph Storage environment. Run ceph orch host ls to see the list of hosts.

Repeat this step for each target node. A hedged loop over all target nodes follows this step.
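The following is a minimal sketch of that repetition; the hostnames are hypothetical placeholders, so substitute the hosts reported by ceph orch host ls:

$ for node in ceph-0 ceph-1 ceph-2; do
    ssh heat-admin@$node sudo iptables -I INPUT \
        -p tcp --match multiport --dports 6800:7300 -j ACCEPT
  done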
Check that the rules are properly applied to the target node and persist them:
$ sudo iptables-save
$ sudo systemctl restart iptables

Note: The default dashboard port for ceph-mgr in a greenfield deployment is 8443. With director-deployed Red Hat Ceph Storage, the default port is 8444 because the service ran on the Controller node, and it was necessary to use this port to avoid a conflict. For adoption, update the dashboard port to 8443 in the ceph-mgr configuration and firewall rules.

Log in to controller-0 and update the dashboard port in the ceph-mgr configuration to 8443:

$ sudo cephadm shell
$ ceph config set mgr mgr/dashboard/server_port 8443
$ ceph config set mgr mgr/dashboard/ssl_server_port 8443
$ ceph mgr module disable dashboard
$ ceph mgr module enable dashboard

If nftables is used in the existing deployment, edit /etc/nftables/tripleo-rules.nft and add the following content:

# 113 ceph_mgr {'dport': ['6800-7300', 8443]}
add rule inet filter TRIPLEO_INPUT tcp dport { 6800-7300,8443 } ct state new counter accept comment "113 ceph_mgr"

- Save the file.
Restart the nftables service:

$ sudo systemctl restart nftables

Verify that the rules are applied:

$ sudo nft list ruleset | grep ceph_mgr

Prepare the target node to host the new Ceph Manager daemon, and add the mgr label to the target node:

$ sudo cephadm shell -- ceph orch host label add <target_node> mgr

- Repeat steps 1-7 for each target node that hosts a Ceph Manager daemon.
Get the Ceph Manager spec:
$ SPEC_DIR=${SPEC_DIR:-"$PWD/ceph_specs"}
$ mkdir -p ${SPEC_DIR}
$ sudo cephadm shell -- ceph orch ls --export mgr > ${SPEC_DIR}/mgr

Edit the retrieved spec and add the label: mgr section to the placement section:

service_type: mgr
service_id: mgr
placement:
  label: mgr

- Save the spec.
Apply the spec with cephadm by using the Ceph Orchestrator:

$ SPEC_DIR=${SPEC_DIR:-"$PWD/ceph_specs"}
$ sudo cephadm shell -m ${SPEC_DIR}/mgr -- ceph orch apply -i /mnt/mgr
Verification
Verify that the new Ceph Manager daemons are created in the target nodes:
$ sudo cephadm shell -- ceph orch ps | grep -i mgr
$ sudo cephadm shell -- ceph -s

The Ceph Manager daemon count should match the number of hosts where the mgr label is added. A hedged sketch for comparing the two counts follows this step.

Note: The migration does not shrink the number of Ceph Manager daemons. The count grows by the number of target nodes, and migrating the Ceph Monitor daemons to Red Hat Ceph Storage nodes decommissions the standby Ceph Manager instances. For more information, see Migrating Ceph Monitor daemons to Red Hat Ceph Storage nodes.
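As a minimal sketch, assuming jq is available, you can compare the running mgr daemon count with the number of hosts that carry the mgr label:

$ sudo cephadm shell -- ceph orch ps --daemon_type mgr --format json | jq 'length'
$ sudo cephadm shell -- ceph orch host ls --format json | jq '[.[] | select(.labels | index("mgr"))] | length'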
7.7.3. Migrating Ceph Monitor daemons to Red Hat Ceph Storage nodes
You must move Ceph Monitor daemons from the Red Hat OpenStack Platform (RHOSP) Controller nodes to a set of target nodes. Target nodes are either existing Red Hat Ceph Storage nodes, or RHOSP Compute nodes if Red Hat Ceph Storage is deployed by director with a Hyperconverged Infrastructure (HCI) topology. Additional Ceph Monitors are deployed to the target nodes, and they are promoted as _admin nodes that you can use to manage the Red Hat Ceph Storage cluster and perform day 2 operations.
To migrate the Ceph Monitor daemons, you must perform the following high-level steps:
- Configure the target nodes for Ceph Monitor migration.
- Drain the source node.
- Migrate your Ceph Monitor IP addresses to the target nodes.
- Redeploy the Ceph Monitor on the target node.
- Verify that the Red Hat Ceph Storage cluster is healthy.
Repeat these steps for any additional Controller node that hosts a Ceph Monitor until you migrate all the Ceph Monitor daemons to the target nodes.
7.7.3.1. Configuring target nodes for Ceph Monitor migration
Prepare the target Red Hat Ceph Storage nodes for the Ceph Monitor migration by performing the following actions:
- Enable firewall rules in a target node and persist them.
- Create a spec that is based on labels and apply it by using cephadm.
- Ensure that the Ceph Monitor quorum is maintained during the migration process.
Procedure
SSH into the target node and enable the firewall rules that are required to reach a Ceph Monitor service:
$ for port in 3300 6789; {
    ssh heat-admin@<target_node> sudo iptables -I INPUT \
    -p tcp -m tcp --dport $port -m conntrack --ctstate NEW \
    -j ACCEPT;
}
- Replace <target_node> with the hostname of the node that hosts the new Ceph Monitor.
Check that the rules are properly applied to the target node and persist them:
$ sudo iptables-save
$ sudo systemctl restart iptables

If nftables is used in the existing deployment, edit /etc/nftables/tripleo-rules.nft and add the following content:

# 110 ceph_mon {'dport': [6789, 3300, '9100']}
add rule inet filter TRIPLEO_INPUT tcp dport { 6789,3300,9100 } ct state new counter accept comment "110 ceph_mon"

- Save the file.
Restart the nftables service:

$ sudo systemctl restart nftables

Verify that the rules are applied:

$ sudo nft list ruleset | grep ceph_mon

To migrate the existing Ceph Monitors to the target Red Hat Ceph Storage nodes, retrieve the Red Hat Ceph Storage mon spec from the first Ceph Monitor, or the first Controller node:

$ SPEC_DIR=${SPEC_DIR:-"$PWD/ceph_specs"}
$ mkdir -p ${SPEC_DIR}
$ sudo cephadm shell -- ceph orch ls --export mon > ${SPEC_DIR}/mon

Add the label: mon section to the placement section:

service_type: mon
service_id: mon
placement:
  label: mon

- Save the spec.
Apply the spec with cephadm by using the Ceph Orchestrator:

$ SPEC_DIR=${SPEC_DIR:-"$PWD/ceph_specs"}
$ sudo cephadm shell -m ${SPEC_DIR}/mon -- ceph orch apply -i /mnt/mon

Extend the mon label to the remaining Red Hat Ceph Storage target nodes to ensure that quorum is maintained during the migration process:

for item in $(sudo cephadm shell -- ceph orch host ls --format json | jq -r '.[].hostname'); do
    sudo cephadm shell -- ceph orch host label add $item mon;
    sudo cephadm shell -- ceph orch host label add $item _admin;
done

Note: Applying the mon spec allows the existing strategy to use labels instead of hosts. As a result, any node with the mon label can host a Ceph Monitor daemon. Perform this step only once to avoid multiple iterations when multiple Ceph Monitors are migrated.

Check the status of the Red Hat Ceph Storage cluster and the Ceph Orchestrator daemons list. Ensure that the Ceph Monitors are in a quorum and listed by the ceph orch command:

$ sudo cephadm shell -- ceph -s
  cluster:
    id: f6ec3ebe-26f7-56c8-985d-eb974e8e08e3
    health: HEALTH_OK

  services:
    mon: 6 daemons, quorum controller-0,controller-1,controller-2,ceph-0,ceph-1,ceph-2 (age 19m)
    mgr: controller-0.xzgtvo(active, since 32m), standbys: controller-1.mtxohd, controller-2.ahrgsk
    osd: 8 osds: 8 up (since 12m), 8 in (since 18m); 1 remapped pgs

  data:
    pools:   1 pools, 1 pgs
    objects: 0 objects, 0 B
    usage:   43 MiB used, 400 GiB / 400 GiB avail
    pgs:     1 active+clean

$ sudo cephadm shell -- ceph orch host ls
HOST          ADDR           LABELS               STATUS
ceph-0        192.168.24.14  osd mon mgr _admin
ceph-1        192.168.24.7   osd mon mgr _admin
ceph-2        192.168.24.8   osd mon mgr _admin
controller-0  192.168.24.15  _admin mgr mon
controller-1  192.168.24.23  _admin mgr mon
controller-2  192.168.24.13  _admin mgr mon
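As an additional minimal check, assuming jq is available, you can print the monitors that are currently in quorum:

$ sudo cephadm shell -- ceph quorum_status --format json | jq -r '.quorum_names[]'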
Back up the content of
/etc/cephin theceph_client_backupdirectory.$ mkdir -p $HOME/ceph_client_backup $ sudo cp -R /etc/ceph/* $HOME/ceph_client_backup-
Edit
/etc/os-net-config/config.yamland add- ip_netmask: 172.17.3.200after the IP address on the VLAN that belongs to the storage network. Replace172.17.3.200with any other available IP address on the storage network that can be statically assigned tocontroller-0. Save the file and refresh the
controller-0network configuration:$ sudo os-net-config -c /etc/os-net-config/config.yamlVerify that the IP address is present in the Controller node:
$ ip -o a | grep 172.17.3.200Ping the IP address and confirm that it is reachable:
$ ping -c 3 172.17.3.200Verify that you can interact with the Red Hat Ceph Storage cluster:
$ sudo cephadm shell -c $HOME/ceph_client_backup/ceph.conf -k $HOME/ceph_client_backup/ceph.client.admin.keyring -- ceph -s
Next steps
Proceed to the next step Draining the source node.
7.7.3.2. Draining the source node
Drain the source node and remove the source node host from the Red Hat Ceph Storage cluster.
Procedure
On the source node, back up the /etc/ceph/ directory to run cephadm and get a shell for the Red Hat Ceph Storage cluster from the source node:

$ mkdir -p $HOME/ceph_client_backup
$ sudo cp -R /etc/ceph $HOME/ceph_client_backup

Identify the active ceph-mgr instance:

$ sudo cephadm shell -- ceph mgr stat

Fail the ceph-mgr if it is active on the source node:

$ sudo cephadm shell -- ceph mgr fail <mgr_instance>
- Replace <mgr_instance> with the Ceph Manager daemon to fail. A hedged sketch for extracting the active manager name follows this step.
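As a minimal sketch, assuming jq is available, you can capture the active Ceph Manager name directly instead of reading it from the ceph mgr stat output:

$ active_mgr=$(sudo cephadm shell -- ceph mgr stat --format json | jq -r '.active_name')
$ echo $active_mgr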
From the cephadm shell, remove the labels on the source node:

$ for label in mon mgr _admin; do
    sudo cephadm shell -- ceph orch host label rm <source_node> $label;
  done
- Replace <source_node> with the hostname of the source node.
Optional: Ensure that you remove the Ceph Monitor daemon from the source node if it is still running:
$ sudo cephadm shell -- ceph orch daemon rm mon.<source_node> --force

Drain the source node to remove any leftover daemons:

$ sudo cephadm shell -- ceph orch host drain <source_node>

Remove the source node host from the Red Hat Ceph Storage cluster:

$ sudo cephadm shell -- ceph orch host rm <source_node> --force

Note: The source node is no longer part of the cluster and should not appear in the Red Hat Ceph Storage host list when you run sudo cephadm shell -- ceph orch host ls. However, if you run sudo podman ps on the source node, the list might show that both Ceph Monitors and Ceph Managers are still running.

[root@controller-1 ~]# sudo podman ps
CONTAINER ID  IMAGE                                                                                                   COMMAND           CREATED         STATUS             NAMES
5c1ad36472bc  registry.redhat.io/ceph/rhceph@sha256:320c364dcc8fc8120e2a42f54eb39ecdba12401a2546763b7bef15b02ce93bc4  -n mon.contro...  35 minutes ago  Up 35 minutes ago  ceph-f6ec3ebe-26f7-56c8-985d-eb974e8e08e3-mon-controller-1
3b14cc7bf4dd  registry.redhat.io/ceph/rhceph@sha256:320c364dcc8fc8120e2a42f54eb39ecdba12401a2546763b7bef15b02ce93bc4  -n mgr.contro...  35 minutes ago  Up 35 minutes ago  ceph-f6ec3ebe-26f7-56c8-985d-eb974e8e08e3-mgr-controller-1-mtxohd

To clean up the existing containers and remove the cephadm data from the source node, contact Red Hat Support.

Confirm that the Ceph Monitors are still in quorum:

$ sudo cephadm shell -- ceph -s
$ sudo cephadm shell -- ceph orch ps | grep -i mon
Next steps
Proceed to the next step Migrating the Ceph Monitor IP address.
7.7.3.3. Migrating the Ceph Monitor IP address
You must migrate your Ceph Monitor IP addresses to the target Red Hat Ceph Storage nodes. The IP address migration assumes that the target nodes are originally deployed by director and that the network configuration is managed by os-net-config.
Procedure
Get the original Ceph Monitor IP addresses from the $HOME/ceph_client_backup/ceph.conf file on the mon_host line, for example:

mon_host = [v2:172.17.3.60:3300/0,v1:172.17.3.60:6789/0] [v2:172.17.3.29:3300/0,v1:172.17.3.29:6789/0] [v2:172.17.3.53:3300/0,v1:172.17.3.53:6789/0]
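As a minimal sketch, you can extract just the monitor IP addresses from the backed-up configuration file:

$ grep mon_host $HOME/ceph_client_backup/ceph.conf | grep -oE 'v1:[0-9.]+' | cut -d: -f2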
[tripleo-admin@controller-0 ~]$ ip -o -4 a | grep 172.17.3 9: vlan30 inet 172.17.3.60/24 brd 172.17.3.255 scope global vlan30\ valid_lft forever preferred_lft forever 9: vlan30 inet 172.17.3.13/32 brd 172.17.3.255 scope global vlan30\ valid_lft forever preferred_lft foreverConfirm that the Ceph Monitor IP address is present in the
os-net-configconfiguration that is located in the/etc/os-net-configdirectory on the source node:[tripleo-admin@controller-0 ~]$ grep "172.17.3.60" /etc/os-net-config/config.yaml - ip_netmask: 172.17.3.60/24-
Edit the /etc/os-net-config/config.yaml file and remove the ip_netmask line.

Save the file and refresh the node network configuration:

$ sudo os-net-config -c /etc/os-net-config/config.yaml

Verify that the IP address is no longer present in the source node, for example:

[controller-0]$ ip -o a | grep 172.17.3.60
SSH into the target node, for example cephstorage-0, and add the IP address for the new Ceph Monitor.

On the target node, edit /etc/os-net-config/config.yaml and add the - ip_netmask: 172.17.3.60 line that you removed in the source node.

Save the file and refresh the node network configuration:

$ sudo os-net-config -c /etc/os-net-config/config.yaml

Verify that the IP address is present in the target node:

$ ip -o a | grep 172.17.3.60

From the Ceph client node, controller-0, ping the IP address that is migrated to the target node and confirm that it is still reachable:

[controller-0]$ ping -c 3 172.17.3.60
Next steps
Proceed to the next step Redeploying the Ceph Monitor on the target node.
7.7.3.4. Redeploying a Ceph Monitor on the target node
You use the IP address that you migrated to the target node to redeploy the Ceph Monitor on the target node.
Procedure
From the Ceph client node, for example controller-0, get the Ceph mon spec:

$ SPEC_DIR=${SPEC_DIR:-"$PWD/ceph_specs"}
$ sudo cephadm shell -- ceph orch ls --export mon > ${SPEC_DIR}/mon

Edit the retrieved spec and add the unmanaged: true keyword:

service_type: mon
service_id: mon
placement:
  label: mon
unmanaged: true

- Save the spec.

Apply the spec with cephadm by using the Ceph Orchestrator:

$ SPEC_DIR=${SPEC_DIR:-"$PWD/ceph_specs"}
$ sudo cephadm shell -m ${SPEC_DIR}/mon -- ceph orch apply -i /mnt/mon

The Ceph Monitor daemons are marked as unmanaged, and you can now redeploy the existing daemon and bind it to the migrated IP address.

Delete the existing Ceph Monitor on the target node:

$ sudo cephadm shell -- ceph orch daemon rm mon.<target_node> --force

- Replace <target_node> with the hostname of the target node that is included in the Red Hat Ceph Storage cluster.

Redeploy the new Ceph Monitor on the target node by using the migrated IP address:

$ sudo cephadm shell -- ceph orch daemon add mon <target_node>:<ip_address>

- Replace <ip_address> with the migrated IP address.

Get the Ceph Monitor spec:

$ SPEC_DIR=${SPEC_DIR:-"$PWD/ceph_specs"}
$ sudo cephadm shell -- ceph orch ls --export mon > ${SPEC_DIR}/mon

Edit the retrieved spec and set the unmanaged keyword to false:

service_type: mon
service_id: mon
placement:
  label: mon
unmanaged: false

- Save the spec.

Apply the spec with cephadm by using the Ceph Orchestrator:

$ SPEC_DIR=${SPEC_DIR:-"$PWD/ceph_specs"}
$ sudo cephadm shell -m ${SPEC_DIR}/mon -- ceph orch apply -i /mnt/mon

The new Ceph Monitor runs on the target node with the original IP address.

Identify the running mgr:

$ sudo cephadm shell -- ceph mgr stat

Refresh the Ceph Manager information by force-failing it:

$ sudo cephadm shell -- ceph mgr fail

Refresh the OSD information:

$ sudo cephadm shell -- ceph orch reconfig osd.default_drive_group
Next steps
Repeat the procedure, starting from the step Draining the source node, for each node that you want to decommission. Proceed to the next step Verifying the Red Hat Ceph Storage cluster after Ceph Monitor migration.
7.7.3.5. Verifying the Red Hat Ceph Storage cluster after Ceph Monitor migration
After you finish migrating your Ceph Monitor daemons to the target nodes, verify that the Red Hat Ceph Storage cluster is healthy.
Procedure
Verify that the Red Hat Ceph Storage cluster is healthy:
$ ceph -s
  cluster:
    id: f6ec3ebe-26f7-56c8-985d-eb974e8e08e3
    health: HEALTH_OK
  ...

Verify that the Red Hat Ceph Storage mons are running with the old IP addresses. SSH into the target nodes and verify that the Ceph Monitor daemons are bound to the expected IP and port:
$ netstat -tulpn | grep 3300
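If netstat is not installed on the target nodes, ss provides the same information. A minimal alternative sketch:

$ sudo ss -tlnp | grep -E ':3300|:6789'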
7.8. Updating the Red Hat Ceph Storage cluster Ceph Dashboard configuration
If the Ceph Dashboard is among the enabled Ceph Manager modules, you must reconfigure the failover settings.
Procedure
Regenerate the following Red Hat Ceph Storage configuration keys to point to the right mgr container:

mgr advanced mgr/dashboard/controller-0.ycokob/server_addr 172.17.3.33
mgr advanced mgr/dashboard/controller-1.lmzpuc/server_addr 172.17.3.147
mgr advanced mgr/dashboard/controller-2.xpdgfl/server_addr 172.17.3.138

$ sudo cephadm shell
$ ceph orch ps | awk '/mgr./ {print $1}'

For each retrieved mgr daemon, update the corresponding entry in the Red Hat Ceph Storage configuration:

$ ceph config set mgr mgr/dashboard/<mgr_daemon>/server_addr <ip_address>
Chapter 8. Post-adoption tasks
Perform the following post-adoption tasks to ensure that your Red Hat OpenStack Services on OpenShift (RHOSO) environment is functioning optimally.
After adoption, RHOSO data plane nodes run Red Hat Enterprise Linux (RHEL) 9.2. The data plane nodes can remain on RHEL 9.2; however, you must perform a system update to use the full feature set from the release, and to align your environment with the maximum support lifecycle of RHOSO.
- You can perform a system update any time after you complete the adoption procedure.
- You can defer the system update to a separate maintenance window.
- You can perform the system update on one node set at a time. For example, you can update one node set from RHEL 9.2 to RHEL 9.4 or 9.6 in one maintenance window, and then update a different node set in another maintenance window later.
- If you enabled high availability for Compute instances (Instance HA), remove the Pacemaker components from your Compute nodes.
- Enable TLS Everywhere (TLS-e).
- Verify that you migrated all services from the Controller nodes, and then power off the nodes. If any services, such as Open Virtual Networking (ML2/OVN), the Object Storage service (swift), or Red Hat Ceph Storage, are still running on the Controller nodes, do not power off the nodes.
- Optional: Run tempest to verify that the entire adoption process is working correctly.