16.4. Performing an image-based upgrade for single-node OpenShift clusters using GitOps ZTP
You can use a single resource on the hub cluster, the ImageBasedGroupUpgrade custom resource (CR), to manage an image-based upgrade on a selected group of managed clusters through all stages. Topology Aware Lifecycle Manager (TALM) reconciles the ImageBasedGroupUpgrade CR and creates the underlying resources to complete the defined stage transitions, either in a manually controlled or a fully automated upgrade flow.
For more information about the image-based upgrade, see "Understanding the image-based upgrade for single-node OpenShift clusters".
The ImageBasedGroupUpgrade CR combines the ImageBasedUpgrade and ClusterGroupUpgrade APIs. For example, you can define the cluster selection and rollout strategy with the ImageBasedGroupUpgrade API in the same way as with the ClusterGroupUpgrade API. The stage transitions differ from those of the ImageBasedUpgrade API: the ImageBasedGroupUpgrade API lets you combine several stage transitions, also called actions, into one step that shares one rollout strategy.
Example ImageBasedGroupUpgrade.yaml
apiVersion: lcm.openshift.io/v1alpha1
kind: ImageBasedGroupUpgrade
metadata:
  name: <filename>
  namespace: default
spec:
  clusterLabelSelectors: # 1
  - matchExpressions:
    - key: name
      operator: In
      values:
      - spoke1
      - spoke4
      - spoke6
  ibuSpec:
    seedImageRef: # 2
      image: quay.io/seed/image:4.20.0-rc.1
      version: 4.20.0-rc.1
      pullSecretRef:
        name: "<seed_pull_secret>"
    extraManifests: # 3
    - name: example-extra-manifests
      namespace: openshift-lifecycle-agent
    oadpContent: # 4
    - name: oadp-cm
      namespace: openshift-adp
  plan: # 5
  - actions: ["Prep", "Upgrade", "FinalizeUpgrade"]
    rolloutStrategy:
      maxConcurrency: 200 # 6
      timeout: 2400 # 7
1. Clusters to upgrade.
2. Target platform version, the seed image to be used, and the secret required to access the image.
   Note: If you add the seed image pull secret in the hub cluster, in the same namespace as the ImageBasedGroupUpgrade resource, the secret is added to the manifest list for the Prep stage. The secret is recreated in each spoke cluster in the openshift-lifecycle-agent namespace.
3. Optional: Applies additional manifests, which are not in the seed image, to the target cluster. Also applies ConfigMap objects for custom catalog sources.
4. ConfigMap resources that contain the OADP Backup and Restore CRs.
5. Upgrade plan details.
6. Number of clusters to update in a batch.
7. Timeout limit to complete the action in minutes.
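As a minimal sketch of the note on callout 2, the seed image pull secret could be created on the hub cluster in the same namespace as the ImageBasedGroupUpgrade resource (default in the example above). The secret type and the data key follow the usual Kubernetes convention for image pull secrets; the source does not state them, so treat both as assumptions. The base64 payload is a placeholder.

```yaml
# Hypothetical hub-side pull secret for the seed image.
# Assumption: a standard dockerconfigjson image pull secret is expected.
apiVersion: v1
kind: Secret
metadata:
  name: "<seed_pull_secret>"   # must match spec.ibuSpec.seedImageRef.pullSecretRef.name
  namespace: default            # same namespace as the ImageBasedGroupUpgrade CR
type: kubernetes.io/dockerconfigjson
data:
  .dockerconfigjson: "<base64_encoded_pull_secret>"   # placeholder, not a real value
```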
16.4.1.1. Supported action combinations
Actions are the list of stage transitions that TALM completes in the steps of an upgrade plan for the selected group of clusters. Each action entry in the ImageBasedGroupUpgrade CR is a separate step and a step contains one or several actions that share the same rollout strategy. You can achieve more control over the rollout strategy for each action by separating actions into steps.
You can combine these actions differently in your upgrade plan, and you can add subsequent steps later. Wait until the previous steps either complete or fail before adding a step to your plan. For clusters that failed a previous step, the first action of an added step must be either Abort or Rollback.
You cannot remove actions or steps from an ongoing plan.
The following table shows example plans for different levels of control over the rollout strategy:
| Example plan | Description |
|---|---|
| | All actions share the same strategy |
| | Some actions share the same strategy |
| | All actions have different strategies |
Clusters that fail one of the actions will skip the remaining actions in the same step.
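The three levels of control listed in the table can be sketched as the following plan fragments. This is an illustration only: the grouping of actions in the second fragment and all maxConcurrency and timeout values are assumptions, not values taken from the source. Each YAML document is an alternative spec.plan section, not a sequence to apply together.

```yaml
# All actions share the same strategy: one step, one rolloutStrategy.
plan:
- actions: ["Prep", "Upgrade", "FinalizeUpgrade"]
  rolloutStrategy:
    maxConcurrency: 200   # illustrative batch size
    timeout: 60           # illustrative timeout, in minutes
---
# Some actions share the same strategy: two steps.
plan:
- actions: ["Prep", "Upgrade"]
  rolloutStrategy:
    maxConcurrency: 200
    timeout: 60
- actions: ["FinalizeUpgrade"]
  rolloutStrategy:
    maxConcurrency: 500
    timeout: 10
---
# All actions have different strategies: one step per action.
plan:
- actions: ["Prep"]
  rolloutStrategy:
    maxConcurrency: 200
    timeout: 60
- actions: ["Upgrade"]
  rolloutStrategy:
    maxConcurrency: 200
    timeout: 20
- actions: ["FinalizeUpgrade"]
  rolloutStrategy:
    maxConcurrency: 500
    timeout: 10
```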
The ImageBasedGroupUpgrade API accepts the following actions:
- Prep: Start preparing the upgrade resources by moving to the Prep stage.
- Upgrade: Start the upgrade by moving to the Upgrade stage.
- FinalizeUpgrade: Finalize the upgrade on selected clusters that completed the Upgrade action by moving to the Idle stage.
- Rollback: Start a rollback only on successfully upgraded clusters by moving to the Rollback stage.
- FinalizeRollback: Finalize the rollback by moving to the Idle stage.
- AbortOnFailure: Cancel the upgrade on selected clusters that failed the Prep or Upgrade actions by moving to the Idle stage.
- Abort: Cancel an ongoing upgrade only on clusters that are not yet upgraded by moving to the Idle stage.
The following action combinations are supported. A pair of brackets signifies one step in the plan section:
- ["Prep"], ["Abort"]
- ["Prep", "Upgrade", "FinalizeUpgrade"]
- ["Prep"], ["AbortOnFailure"], ["Upgrade"], ["AbortOnFailure"], ["FinalizeUpgrade"]
- ["Rollback", "FinalizeRollback"]
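For illustration, the longest supported combination, in which an AbortOnFailure step follows both Prep and Upgrade, might be written as the following plan section. Each pair of brackets becomes one entry with its own rolloutStrategy. The maxConcurrency and timeout values are placeholders chosen for the sketch, not recommendations from the source.

```yaml
# Sketch of ["Prep"],["AbortOnFailure"],["Upgrade"],["AbortOnFailure"],["FinalizeUpgrade"].
# All rolloutStrategy values below are illustrative assumptions.
plan:
- actions: ["Prep"]
  rolloutStrategy:
    maxConcurrency: 200
    timeout: 60
- actions: ["AbortOnFailure"]     # cancels the upgrade on clusters that failed Prep
  rolloutStrategy:
    maxConcurrency: 200
    timeout: 10
- actions: ["Upgrade"]
  rolloutStrategy:
    maxConcurrency: 200
    timeout: 30
- actions: ["AbortOnFailure"]     # cancels the upgrade on clusters that failed Upgrade
  rolloutStrategy:
    maxConcurrency: 200
    timeout: 10
- actions: ["FinalizeUpgrade"]
  rolloutStrategy:
    maxConcurrency: 500
    timeout: 10
```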
Use one of the following combinations when you need to resume or cancel an ongoing upgrade from a completely new ImageBasedGroupUpgrade CR:
- ["Upgrade", "FinalizeUpgrade"]
- ["FinalizeUpgrade"]
- ["FinalizeRollback"]
- ["Abort"]
- ["AbortOnFailure"]
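As a rough sketch, a completely new CR that resumes an upgrade on clusters that already completed the Prep stage could look like the following. The CR name is hypothetical, the rollout values are placeholders, and the sketch assumes that the new CR repeats the same ibuSpec as the CR that started the upgrade.

```yaml
# Hypothetical new CR that resumes an upgrade with ["Upgrade", "FinalizeUpgrade"].
apiVersion: lcm.openshift.io/v1alpha1
kind: ImageBasedGroupUpgrade
metadata:
  name: "<new_ibgu_name>"        # hypothetical name for the new CR
  namespace: default
spec:
  clusterLabelSelectors:
  - matchExpressions:
    - key: name
      operator: In
      values:
      - spoke1
      - spoke4
  ibuSpec:                       # assumed to match the original CR
    seedImageRef:
      image: quay.io/seed/image:4.20.0-rc.1
      version: 4.20.0-rc.1
      pullSecretRef:
        name: "<seed_pull_secret>"
  plan:
  - actions: ["Upgrade", "FinalizeUpgrade"]
    rolloutStrategy:
      maxConcurrency: 200        # illustrative
      timeout: 60                # illustrative, in minutes
```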
16.4.1.2. Labeling for cluster selection
Use the spec.clusterLabelSelectors field for initial cluster selection. In addition, TALM labels the managed clusters according to the results of their last stage transition.
When a stage completes or fails, TALM marks the relevant clusters with the following labels:
- lcm.openshift.io/ibgu-<stage>-completed
- lcm.openshift.io/ibgu-<stage>-failed
Use these cluster labels to cancel or roll back an upgrade on a group of clusters after troubleshooting issues that you might encounter.
If you are using the ImageBasedGroupUpgrade CR to upgrade your clusters, ensure that the lcm.openshift.io/ibgu-<stage>-completed or lcm.openshift.io/ibgu-<stage>-failed cluster labels are updated properly after performing troubleshooting or recovery steps on the managed clusters. This ensures that TALM continues to manage the image-based upgrade for the cluster.
For example, if you want to cancel the upgrade for all managed clusters except for clusters that successfully completed the upgrade, you can add an Abort action to your plan. The Abort action moves the ImageBasedUpgrade CR back to the Idle stage, which cancels the upgrade on clusters that are not yet upgraded. Adding a separate Abort action ensures that TALM does not perform the Abort action on clusters that have the lcm.openshift.io/ibgu-upgrade-completed label.
The cluster labels are removed after successfully canceling or finalizing the upgrade.
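One way to act on these labels, sketched under assumptions, is to select clusters by label key in the clusterLabelSelectors field of a new CR. The fragment below uses the standard Kubernetes label selector operator Exists, so it does not depend on the label value, which the source does not show. The label key lcm.openshift.io/ibgu-prep-failed is inferred from the <stage> pattern above, and the rollout values are placeholders.

```yaml
# Hypothetical spec fragment: cancel the upgrade on clusters labeled as failed in Prep.
# Assumptions: the label key follows the ibgu-<stage>-failed pattern with stage "prep",
# and matching on key presence (Exists) is sufficient.
spec:
  clusterLabelSelectors:
  - matchExpressions:
    - key: lcm.openshift.io/ibgu-prep-failed
      operator: Exists
  plan:
  - actions: ["Abort"]
    rolloutStrategy:
      maxConcurrency: 100   # illustrative
      timeout: 10           # illustrative, in minutes
```

A complete CR would also carry the metadata and ibuSpec sections shown in the main example.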
16.4.1.3. Status monitoring
The ImageBasedGroupUpgrade CR aggregates status reporting for all selected clusters in one place, which simplifies monitoring. You can monitor the following status fields:
- status.clusters.completedActions: Shows all completed actions defined in the plan section.
- status.clusters.currentAction: Shows all actions that are currently in progress.
- status.clusters.failedActions: Shows all failed actions along with a detailed error message.
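To make the three fields concrete, a status section for the example clusters might look roughly like the following. Only the completedActions, currentAction, and failedActions field names come from the text above. The per-entry keys (name, action, message) and the overall layout are assumptions made for this sketch, and the error text is a placeholder.

```yaml
# Hypothetical status excerpt; the entry layout is assumed, not taken from the source.
status:
  clusters:
  - name: spoke1
    completedActions:
    - action: Prep
    currentAction:
      action: Upgrade            # still in progress on this cluster
  - name: spoke4
    completedActions:
    - action: Prep
    - action: Upgrade
    - action: FinalizeUpgrade
  - name: spoke6
    failedActions:
    - action: Prep
      message: "<error_message>" # placeholder for the detailed error message
```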