16.4. Performing an image-based upgrade for single-node OpenShift clusters using GitOps ZTP


You can use a single resource on the hub cluster, the ImageBasedGroupUpgrade custom resource (CR), to manage an imaged-based upgrade on a selected group of managed clusters through all stages. Topology Aware Lifecycle Manager (TALM) reconciles the ImageBasedGroupUpgrade CR and creates the underlying resources to complete the defined stage transitions, either in a manually controlled or a fully automated upgrade flow.

For more information about the image-based upgrade, see "Understanding the image-based upgrade for single-node OpenShift clusters".

The ImageBasedGroupUpgrade CR combines the ImageBasedUpgrade and ClusterGroupUpgrade APIs. For example, you can define the cluster selection and rollout strategy with the ImageBasedGroupUpgrade API in the same way as the ClusterGroupUpgrade API. The stage transitions are different from the ImageBasedUpgrade API. The ImageBasedGroupUpgrade API allows you to combine several stage transitions, also called actions, into one step that share one rollout strategy.

Example ImageBasedGroupUpgrade.yaml

apiVersion: lcm.openshift.io/v1alpha1
kind: ImageBasedGroupUpgrade
metadata:
  name: <filename>
  namespace: default
spec:
  clusterLabelSelectors: 
1

    - matchExpressions:
      - key: name
        operator: In
        values:
        - spoke1
        - spoke4
        - spoke6
  ibuSpec:
    seedImageRef: 
2

      image: quay.io/seed/image:4.20.0-rc.1
      version: 4.20.0-rc.1
      pullSecretRef:
        name: "<seed_pull_secret>"
    extraManifests: 
3

      - name: example-extra-manifests
        namespace: openshift-lifecycle-agent
    oadpContent: 
4

      - name: oadp-cm
        namespace: openshift-adp
  plan: 
5

    - actions: ["Prep", "Upgrade", "FinalizeUpgrade"]
      rolloutStrategy:
        maxConcurrency: 200 
6

        timeout: 2400 
7

1
Clusters to upgrade.
2
Target platform version, the seed image to be used, and the secret required to access the image.
注意

If you add the seed image pull secret in the hub cluster, in the same namespace as the ImageBasedGroupUpgrade resource, the secret is added to the manifest list for the Prep stage. The secret is recreated in each spoke cluster in the openshift-lifecycle-agent namespace.

3
Optional: Applies additional manifests, which are not in the seed image, to the target cluster. Also applies ConfigMap objects for custom catalog sources.
4
ConfigMap resources that contain the OADP Backup and Restore CRs.
5
Upgrade plan details.
6
Number of clusters to update in a batch.
7
Timeout limit to complete the action in minutes.

16.4.1.1. Supported action combinations

Actions are the list of stage transitions that TALM completes in the steps of an upgrade plan for the selected group of clusters. Each action entry in the ImageBasedGroupUpgrade CR is a separate step and a step contains one or several actions that share the same rollout strategy. You can achieve more control over the rollout strategy for each action by separating actions into steps.

These actions can be combined differently in your upgrade plan and you can add subsequent steps later. Wait until the previous steps either complete or fail before adding a step to your plan. The first action of an added step for clusters that failed a previous steps must be either Abort or Rollback.

重要

You cannot remove actions or steps from an ongoing plan.

The following table shows example plans for different levels of control over the rollout strategy:

Expand
表 16.5. Example upgrade plans
Example planDescription
plan:
- actions: ["Prep", "Upgrade", "FinalizeUpgrade"]
  rolloutStrategy:
    maxConcurrency: 200
    timeout: 60

All actions share the same strategy

plan:
- actions: ["Prep", "Upgrade"]
  rolloutStrategy:
    maxConcurrency: 200
    timeout: 60
- actions: ["FinalizeUpgrade"]
  rolloutStrategy:
    maxConcurrency: 500
    timeout: 10

Some actions share the same strategy

plan:
- actions: ["Prep"]
  rolloutStrategy:
    maxConcurrency: 200
    timeout: 60
- actions: ["Upgrade"]
  rolloutStrategy:
    maxConcurrency: 200
    timeout: 20
- actions: ["FinalizeUpgrade"]
  rolloutStrategy:
    maxConcurrency: 500
    timeout: 10

All actions have different strategies

重要

Clusters that fail one of the actions will skip the remaining actions in the same step.

The ImageBasedGroupUpgrade API accepts the following actions:

Prep
Start preparing the upgrade resources by moving to the Prep stage.
Upgrade
Start the upgrade by moving to the Upgrade stage.
FinalizeUpgrade
Finalize the upgrade on selected clusters that completed the Upgrade action by moving to the Idle stage.
Rollback
Start a rollback only on successfully upgraded clusters by moving to the Rollback stage.
FinalizeRollback
Finalize the rollback by moving to the Idle stage.
AbortOnFailure
Cancel the upgrade on selected clusters that failed the Prep or Upgrade actions by moving to the Idle stage.
Abort
Cancel an ongoing upgrade only on clusters that are not yet upgraded by moving to the Idle stage.

The following action combinations are supported. A pair of brackets signifies one step in the plan section:

  • ["Prep"], ["Abort"]
  • ["Prep", "Upgrade", "FinalizeUpgrade"]
  • ["Prep"], ["AbortOnFailure"], ["Upgrade"], ["AbortOnFailure"], ["FinalizeUpgrade"]
  • ["Rollback", "FinalizeRollback"]

Use one of the following combinations when you need to resume or cancel an ongoing upgrade from a completely new ImageBasedGroupUpgrade CR:

  • ["Upgrade","FinalizeUpgrade"]
  • ["FinalizeUpgrade"]
  • ["FinalizeRollback"]
  • ["Abort"]
  • ["AbortOnFailure"]

16.4.1.2. Labeling for cluster selection

Use the spec.clusterLabelSelectors field for initial cluster selection. In addition, TALM labels the managed clusters according to the results of their last stage transition.

When a stage completes or fails, TALM marks the relevant clusters with the following labels:

  • lcm.openshift.io/ibgu-<stage>-completed
  • lcm.openshift.io/ibgu-<stage>-failed

Use these cluster labels to cancel or roll back an upgrade on a group of clusters after troubleshooting issues that you might encounter.

重要

If you are using the ImageBasedGroupUpgrade CR to upgrade your clusters, ensure that the lcm.openshift.io/ibgu-<stage>-completed or lcm.openshift.io/ibgu-<stage>-failed cluster labels are updated properly after performing troubleshooting or recovery steps on the managed clusters. This ensures that the TALM continues to manage the image-based upgrade for the cluster.

For example, if you want to cancel the upgrade for all managed clusters except for clusters that successfully completed the upgrade, you can add an Abort action to your plan. The Abort action moves back the ImageBasedUpgrade CR to the Idle stage, which cancels the upgrade on clusters that are not yet upgraded. Adding a separate Abort action ensures that the TALM does not perform the Abort action on clusters that have the lcm.openshift.io/ibgu-upgrade-completed label.

The cluster labels are removed after successfully canceling or finalizing the upgrade.

16.4.1.3. Status monitoring

The ImageBasedGroupUpgrade CR ensures a better monitoring experience with a comprehensive status reporting for all clusters that is aggregated in one place. You can monitor the following actions:

status.clusters.completedActions
Shows all completed actions defined in the plan section.
status.clusters.currentAction
Shows all actions that are currently in progress.
status.clusters.failedActions
Shows all failed actions along with a detailed error message.
Red Hat logoGithubredditYoutubeTwitter

学习

尝试、购买和销售

社区

關於紅帽

我们提供强化的解决方案,使企业能够更轻松地跨平台和环境(从核心数据中心到网络边缘)工作。

让开源更具包容性

红帽致力于替换我们的代码、文档和 Web 属性中存在问题的语言。欲了解更多详情,请参阅红帽博客.

关于红帽文档

Legal Notice

Theme

© 2026 Red Hat
返回顶部