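1.21. EDPM adoption
1.21.1. Prerequisites
- The previous adoption steps are completed.
- On the remaining source cloud, infrastructure management and Compute services are stopped on the Compute hosts.
WARNING: This step is the point of no return in the EDPM adoption procedure. After EDPM is deployed and the podified control plane has taken control of it, the source control plane and data plane services must never be re-enabled.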
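1.21.2. Variables
Define the shell variables used in the fast-forward upgrade steps below. Set FIP to the floating IP address of the test VM that was pre-created earlier on the source cloud. Define the map of compute node name and IP pairs. The values are only illustrative; use values that are correct for your environment: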
PODIFIED_DB_ROOT_PASSWORD=$(oc get -o json secret/osp-secret | jq -r .data.DbRootPassword | base64 -d)
alias openstack="oc exec -t openstackclient -- openstack"
FIP=192.168.122.20
declare -A computes
export computes=(
["standalone.localdomain"]="192.168.122.100"
# ...
)
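A mistyped address in the computes map tends to surface only later, as an SSH or Ansible failure, so it may be worth sanity-checking the map before continuing. The following is a minimal sketch, not part of the adoption tooling: the helper names is_ipv4 and check_computes are hypothetical, and the code assumes bash 4 or later (associative arrays) and IPv4 addresses only.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds if $1 looks like a dotted-quad IPv4 address.
is_ipv4() {
    local ip=$1 octet
    [[ $ip =~ ^([0-9]{1,3}\.){3}[0-9]{1,3}$ ]] || return 1
    local IFS=.
    for octet in $ip; do
        # 10# forces base 10 so that octets such as "08" are not read as octal
        (( 10#$octet <= 255 )) || return 1
    done
}

# Hypothetical helper: checks every value of the global "computes" map
# and reports the entries that are not valid IPv4 addresses.
check_computes() {
    local name rc=0
    for name in "${!computes[@]}"; do
        if ! is_ipv4 "${computes[$name]}"; then
            echo "invalid IP for $name: ${computes[$name]}" >&2
            rc=1
        fi
    done
    return $rc
}
```

After the variables above are defined, running check_computes returns a non-zero status if any entry is malformed.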
1.21.3. Pre-checks
- Make sure that IPAM is configured:
oc apply -f - <<EOF
apiVersion: network.openstack.org/v1beta1
kind: NetConfig
metadata:
  name: netconfig
spec:
  networks:
  - name: CtlPlane
    dnsDomain: ctlplane.example.com
    subnets:
    - name: subnet1
      allocationRanges:
      - end: 192.168.122.120
        start: 192.168.122.100
      - end: 192.168.122.200
        start: 192.168.122.150
      cidr: 192.168.122.0/24
      gateway: 192.168.122.1
  - name: InternalApi
    dnsDomain: internalapi.example.com
    subnets:
    - name: subnet1
      allocationRanges:
      - end: 172.17.0.250
        start: 172.17.0.100
      cidr: 172.17.0.0/24
      vlan: 20
  - name: External
    dnsDomain: external.example.com
    subnets:
    - name: subnet1
      allocationRanges:
      - end: 10.0.0.250
        start: 10.0.0.100
      cidr: 10.0.0.0/24
      gateway: 10.0.0.1
  - name: Storage
    dnsDomain: storage.example.com
    subnets:
    - name: subnet1
      allocationRanges:
      - end: 172.18.0.250
        start: 172.18.0.100
      cidr: 172.18.0.0/24
      vlan: 21
  - name: StorageMgmt
    dnsDomain: storagemgmt.example.com
    subnets:
    - name: subnet1
      allocationRanges:
      - end: 172.20.0.250
        start: 172.20.0.100
      cidr: 172.20.0.0/24
      vlan: 23
  - name: Tenant
    dnsDomain: tenant.example.com
    subnets:
    - name: subnet1
      allocationRanges:
      - end: 172.19.0.250
        start: 172.19.0.100
      cidr: 172.19.0.0/24
      vlan: 22
EOF
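The addresses in the computes map are later used as the fixedIP of each node on the CtlPlane network, so they need to fall inside the CtlPlane subnet defined above. As a rough illustration of that check, the sketch below tests whether an address belongs to a CIDR. The helper names ip_to_int and in_cidr are hypothetical, and the code assumes IPv4 dotted-quad input without leading zeros.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints the 32-bit integer value of a dotted-quad address.
ip_to_int() {
    local a b c d
    IFS=. read -r a b c d <<< "$1"
    echo $(( (a << 24) | (b << 16) | (c << 8) | d ))
}

# Hypothetical helper: succeeds if address $1 lies inside CIDR $2
# (for example 192.168.122.0/24).
in_cidr() {
    local ip=$1 net=${2%/*} bits=${2#*/}
    local mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
    (( ( $(ip_to_int "$ip") & mask ) == ( $(ip_to_int "$net") & mask ) ))
}
```

With the example values in this section, in_cidr "${computes[standalone.localdomain]}" 192.168.122.0/24 succeeds, whereas an address from another network would make it fail.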
1.21.4. Procedure - EDPM adoption
The following step is a temporary fix until the stable compute UUID feature is backported to OSP 17.
For each compute node, get the UUID of the compute service and write it to the stable compute_id file in the /var/lib/nova/ directory:
for name in "${!computes[@]}"; do
  uuid=$(openstack hypervisor show "$name" -f value -c 'id')
  echo "Writing $uuid to /var/lib/nova/compute_id on $name"
  ssh \
    -i ~/install_yamls/out/edpm/ansibleee-ssh-key-id_rsa \
    root@"${computes[$name]}" \
    "echo $uuid > /var/lib/nova/compute_id"
done
Create an SSH authentication secret for the EDPM nodes:
oc apply -f - <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: dataplane-adoption-secret
  namespace: openstack
data:
  ssh-privatekey: |
$(cat ~/install_yamls/out/edpm/ansibleee-ssh-key-id_rsa | base64 | sed 's/^/    /')
EOF
Generate an SSH key pair and store it in the nova-migration-ssh-key secret:
cd "$(mktemp -d)"
ssh-keygen -f ./id -t ecdsa-sha2-nistp521 -N ''
oc get secret nova-migration-ssh-key || oc create secret generic nova-migration-ssh-key \
  -n openstack \
  --from-file=ssh-privatekey=id \
  --from-file=ssh-publickey=id.pub \
  --type kubernetes.io/ssh-auth
rm -f id*
cd -
Create the Nova Compute Extra Config service:
oc apply -f - <<EOF
apiVersion: v1
kind: ConfigMap
metadata:
  name: nova-compute-extraconfig
  namespace: openstack
data:
  19-nova-compute-cell1-workarounds.conf: |
    [workarounds]
    disable_compute_service_check_for_ffu=true
---
apiVersion: dataplane.openstack.org/v1beta1
kind: OpenStackDataPlaneService
metadata:
  name: nova-compute-extraconfig
  namespace: openstack
spec:
  label: nova.compute.extraconfig
  configMaps:
    - nova-compute-extraconfig
  secrets:
    - nova-cell1-compute-config
    - nova-migration-ssh-key
  playbook: osp.edpm.nova
EOF
The secret nova-cell<X>-compute-config is generated automatically for each cell <X>. That secret and nova-migration-ssh-key must always be specified for every custom OpenStackDataPlaneService related to Nova.
Deploy the OpenStackDataPlaneNodeSet:
oc apply -f - <<EOF
apiVersion: dataplane.openstack.org/v1beta1
kind: OpenStackDataPlaneNodeSet
metadata:
  name: openstack
spec:
  networkAttachments:
    - ctlplane
  preProvisioned: true
  services:
    - bootstrap
    - download-cache
    - configure-network
    - validate-network
    - install-os
    - configure-os
    - run-os
    - libvirt
    - nova-compute-extraconfig
    - ovn
    - neutron-metadata
  env:
    - name: ANSIBLE_CALLBACKS_ENABLED
      value: "profile_tasks"
    - name: ANSIBLE_FORCE_COLOR
      value: "True"
  nodes:
    standalone:
      hostName: standalone
      ansible:
        ansibleHost: ${computes[standalone.localdomain]}
      networks:
      - defaultRoute: true
        fixedIP: ${computes[standalone.localdomain]}
        name: CtlPlane
        subnetName: subnet1
      - name: InternalApi
        subnetName: subnet1
      - name: Storage
        subnetName: subnet1
      - name: Tenant
        subnetName: subnet1
  nodeTemplate:
    ansibleSSHPrivateKeySecret: dataplane-adoption-secret
    managementNetwork: ctlplane
    ansible:
      ansibleUser: root
      ansiblePort: 22
      ansibleVars:
        service_net_map:
          nova_api_network: internal_api
          nova_libvirt_network: internal_api

        # edpm_network_config
        # Default nic config template for a EDPM compute node
        # These vars are edpm_network_config role vars
        edpm_network_config_override: ""
        edpm_network_config_template: |
          ---
          {% set mtu_list = [ctlplane_mtu] %}
          {% for network in role_networks %}
          {{ mtu_list.append(lookup('vars', networks_lower[network] ~ '_mtu')) }}
          {%- endfor %}
          {% set min_viable_mtu = mtu_list | max %}
          network_config:
          - type: ovs_bridge
            name: {{ neutron_physical_bridge_name }}
            mtu: {{ min_viable_mtu }}
            use_dhcp: false
            dns_servers: {{ ctlplane_dns_nameservers }}
            domain: {{ dns_search_domains }}
            addresses:
            - ip_netmask: {{ ctlplane_ip }}/{{ ctlplane_subnet_cidr }}
            routes: {{ ctlplane_host_routes }}
            members:
            - type: interface
              name: nic1
              mtu: {{ min_viable_mtu }}
              # force the MAC address of the bridge to this interface
              primary: true
          {% for network in role_networks %}
            - type: vlan
              mtu: {{ lookup('vars', networks_lower[network] ~ '_mtu') }}
              vlan_id: {{ lookup('vars', networks_lower[network] ~ '_vlan_id') }}
              addresses:
              - ip_netmask: {{ lookup('vars', networks_lower[network] ~ '_ip') }}/{{ lookup('vars', networks_lower[network] ~ '_cidr') }}
              routes: {{ lookup('vars', networks_lower[network] ~ '_host_routes') }}
          {% endfor %}

        edpm_network_config_hide_sensitive_logs: false
        #
        # These vars are for the network config templates themselves and are
        # considered EDPM network defaults.
        neutron_physical_bridge_name: br-ctlplane
        neutron_public_interface_name: eth0
        role_networks:
        - InternalApi
        - Storage
        - Tenant
        networks_lower:
          External: external
          InternalApi: internal_api
          Storage: storage
          Tenant: tenant

        # edpm_nodes_validation
        edpm_nodes_validation_validate_controllers_icmp: false
        edpm_nodes_validation_validate_gateway_icmp: false

        timesync_ntp_servers:
        - hostname: clock.redhat.com
        - hostname: clock2.redhat.com

        edpm_ovn_controller_agent_image: registry.redhat.io/rhosp-dev-preview/openstack-ovn-controller-rhel9:18.0
        edpm_iscsid_image: registry.redhat.io/rhosp-dev-preview/openstack-iscsid-rhel9:18.0
        edpm_logrotate_crond_image: registry.redhat.io/rhosp-dev-preview/openstack-cron-rhel9:18.0
        edpm_nova_compute_container_image: registry.redhat.io/rhosp-dev-preview/openstack-nova-compute-rhel9:18.0
        edpm_nova_libvirt_container_image: registry.redhat.io/rhosp-dev-preview/openstack-nova-libvirt-rhel9:18.0
        edpm_ovn_metadata_agent_image: registry.redhat.io/rhosp-dev-preview/openstack-neutron-metadata-agent-ovn-rhel9:18.0

        edpm_bootstrap_command: |
          subscription-manager register --username <subscription_manager_username> --password <subscription_manager_password>
          subscription-manager release --set=9.2
          subscription-manager repos --disable=*
          subscription-manager repos --enable=rhel-9-for-x86_64-baseos-eus-rpms --enable=rhel-9-for-x86_64-appstream-eus-rpms --enable=rhel-9-for-x86_64-highavailability-eus-rpms --enable=openstack-17.1-for-rhel-9-x86_64-rpms --enable=fast-datapath-for-rhel-9-x86_64-rpms --enable=openstack-dev-preview-for-rhel-9-x86_64-rpms
          podman login -u <registry_username> -p <registry_password> registry.redhat.io

        gather_facts: false
        enable_debug: false
        # edpm firewall, change the allowed CIDR if needed
        edpm_sshd_configure_firewall: true
        edpm_sshd_allowed_ranges: ['192.168.122.0/24']
        # SELinux module
        edpm_selinux_mode: enforcing
        plan: overcloud

        # Do not attempt OVS 3.2 major upgrades here
        edpm_ovs_packages:
        - openvswitch3.1
EOF
Deploy the OpenStackDataPlaneDeployment:
oc apply -f - <<EOF
apiVersion: dataplane.openstack.org/v1beta1
kind: OpenStackDataPlaneDeployment
metadata:
  name: openstack
spec:
  nodeSets:
  - openstack
EOF
1.21.5. Post-checks
Check that all the Ansible EE pods reach the Completed status:
# watching the pods
watch oc get pod -l app=openstackansibleee
# following the ansible logs with:
oc logs -l app=openstackansibleee -f --max-log-requests 10
Wait for the dataplane node set to reach the Ready status:
oc wait --for condition=Ready osdpns/openstack --timeout=30m
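Watching the pod list by eye works for an interactive session, but a script needs a yes/no answer. The sketch below shows one possible way to get it by reading the pod list from standard input. The function name all_pods_completed is hypothetical, and it assumes the default column layout of oc get pod (NAME, READY, STATUS, RESTARTS, AGE) with the header suppressed by --no-headers.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads "oc get pod --no-headers" style lines on stdin.
# Succeeds only if at least one pod is listed and the STATUS column
# (third field) of every pod is "Completed".
all_pods_completed() {
    awk '
        { n++; if ($3 != "Completed") bad++ }
        END { if (n > 0 && bad == 0) exit 0; exit 1 }
    '
}
```

A possible invocation is oc get pod -l app=openstackansibleee --no-headers | all_pods_completed, which returns a non-zero status while any pod is still running or has failed.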
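1.21.6. Nova compute services upgrade from Wallaby to Antelope
A rolling upgrade of the Nova services cannot be done during adoption in lock-step with the Nova control plane services, because the two are managed independently: the compute services by EDPM Ansible and the control plane services by Kubernetes operators. The Nova service operator and the OpenStack Dataplane operator ensure that each side is upgraded independently of the other by configuring [upgrade_levels]compute=auto for the Nova services. The Nova control plane services apply the change right after the CR is patched. The Nova compute EDPM services pick up the same configuration change later, with the Ansible deployment.
NOTE: Additional orchestration around the FFU workaround configuration for the Nova compute EDPM service is subject to future changes.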
Wait until the cell1 Nova compute EDPM service versions are updated (this may take some time):
oc exec openstack-cell1-galera-0 -c galera -- mysql -rs -uroot -p$PODIFIED_DB_ROOT_PASSWORD \
  -e "select a.version from nova_cell1.services a join nova_cell1.services b where a.version!=b.version and a.binary='nova-compute';"
The query above must return an empty result; that is the completion criterion.
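Because the completion criterion is "the command prints nothing", the wait can be automated with a small polling loop. The following is a generic sketch under stated assumptions: the function name wait_until_empty and its argument order (timeout in seconds, poll interval in seconds, then the command to run) are hypothetical and not part of any OpenStack or OpenShift tooling.

```shell
#!/usr/bin/env bash
# Hypothetical helper: runs the given command repeatedly until it exits
# successfully AND prints nothing, or until the timeout expires.
# Usage: wait_until_empty <timeout_seconds> <interval_seconds> <command> [args...]
wait_until_empty() {
    local timeout=$1 interval=$2
    shift 2
    local deadline=$(( SECONDS + timeout )) out
    while :; do
        # A failing command (for example a dropped connection) is not
        # treated as "empty result"; it is simply retried.
        out=$("$@") && [[ -z $out ]] && return 0
        (( SECONDS >= deadline )) && return 1
        sleep "$interval"
    done
}
```

For instance, if the query above is wrapped in a shell function (say, cell1_version_mismatch, a made-up name), then wait_until_empty 600 10 cell1_version_mismatch would poll every 10 seconds for up to 10 minutes.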
Remove the pre-FFU workarounds for the Nova control plane services:
oc patch openstackcontrolplane openstack -n openstack --type=merge --patch '
spec:
  nova:
    template:
      cellTemplates:
        cell0:
          conductorServiceTemplate:
            customServiceConfig: |
              [workarounds]
              disable_compute_service_check_for_ffu=false
        cell1:
          metadataServiceTemplate:
            customServiceConfig: |
              [workarounds]
              disable_compute_service_check_for_ffu=false
          conductorServiceTemplate:
            customServiceConfig: |
              [workarounds]
              disable_compute_service_check_for_ffu=false
      apiServiceTemplate:
        customServiceConfig: |
          [workarounds]
          disable_compute_service_check_for_ffu=false
      metadataServiceTemplate:
        customServiceConfig: |
          [workarounds]
          disable_compute_service_check_for_ffu=false
      schedulerServiceTemplate:
        customServiceConfig: |
          [workarounds]
          disable_compute_service_check_for_ffu=false
'
Wait for the Nova control plane service CRs to become ready:
oc wait --for condition=Ready --timeout=300s Nova/nova
Remove the pre-FFU workarounds for the Nova compute EDPM services:
oc apply -f - <<EOF
apiVersion: v1
kind: ConfigMap
metadata:
  name: nova-compute-ffu
  namespace: openstack
data:
  20-nova-compute-cell1-ffu-cleanup.conf: |
    [workarounds]
    disable_compute_service_check_for_ffu=false
---
apiVersion: dataplane.openstack.org/v1beta1
kind: OpenStackDataPlaneService
metadata:
  name: nova-compute-ffu
  namespace: openstack
spec:
  label: nova.compute.ffu
  configMaps:
    - nova-compute-ffu
  secrets:
    - nova-cell1-compute-config
    - nova-migration-ssh-key
  playbook: osp.edpm.nova
---
apiVersion: dataplane.openstack.org/v1beta1
kind: OpenStackDataPlaneDeployment
metadata:
  name: openstack-nova-compute-ffu
  namespace: openstack
spec:
  nodeSets:
    - openstack
  servicesOverride:
    - nova-compute-ffu
EOF
Wait for the Nova compute EDPM service to become ready:
oc wait --for condition=Ready osdpd/openstack-nova-compute-ffu --timeout=5m
Run the Nova DB online migrations to complete the FFU:
oc exec -it nova-cell0-conductor-0 -- nova-manage db online_data_migrations
oc exec -it nova-cell1-conductor-0 -- nova-manage db online_data_migrations
Verify that the Nova services can stop the existing test VM instance:
${BASH_ALIASES[openstack]} server list | grep -qF '| test | ACTIVE |' && openstack server stop test
${BASH_ALIASES[openstack]} server list | grep -qF '| test | SHUTOFF |'
${BASH_ALIASES[openstack]} server --os-compute-api-version 2.48 show --diagnostics test | grep "it is in power state shutdown" || echo PASS
Verify that the Nova services can start the existing test VM instance:
${BASH_ALIASES[openstack]} server list | grep -qF '| test | SHUTOFF |' && openstack server start test
${BASH_ALIASES[openstack]} server list | grep -F '| test | ACTIVE |'
${BASH_ALIASES[openstack]} server --os-compute-api-version 2.48 show --diagnostics test --fit-width -f json | jq -r '.state' | grep running
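The checks above match a fixed string in the table printed by server list, so a stop or start that has not finished yet simply fails the grep. For scripting, it can be handier to extract the status itself and decide what to do with it. The sketch below is one possible way to do that; the function name server_status is hypothetical, and it assumes the default table layout in which ID, Name and Status are the first three columns.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads the table printed by "openstack server list"
# on stdin and prints the Status column of the server named $1.
# Prints nothing if no row has that name.
server_status() {
    local name=$1
    awk -F'|' -v name="$name" '
        # Trim the padding spaces around every cell of the row.
        { for (i = 1; i <= NF; i++) gsub(/^ +| +$/, "", $i) }
        # With "|" as separator: $2 is ID, $3 is Name, $4 is Status.
        $3 == name { print $4 }
    '
}
```

A possible use is ${BASH_ALIASES[openstack]} server list | server_status test, whose output can then be compared with ACTIVE or SHUTOFF in a retry loop.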