Upgrading OpenShift AI Cloud Service
Upgrade OpenShift AI on an OpenShift Dedicated or Red Hat OpenShift Service on AWS (ROSA Classic) cluster
Preface
The Red Hat OpenShift AI Add-on is automatically updated as new releases or versions become available.
Chapter 1. Overview of upgrading OpenShift AI
Red Hat OpenShift AI is automatically updated as new releases or versions become available. Currently, no administrator action is necessary to trigger the process.
When an OpenShift AI upgrade occurs, complete the tasks described in Requirements for upgrading OpenShift AI.
Notes:
- Before you can use an accelerator in OpenShift AI, your instance must have the associated hardware profile. If your OpenShift cluster instance has an accelerator, its hardware profile is preserved after the upgrade. For more information about accelerators, see Working with accelerators.
- Notebook images are integrated into the image stream during the upgrade and subsequently appear in the OpenShift AI dashboard. Notebook images are prebuilt externally and change on a quarterly basis; they do not change with every OpenShift AI upgrade.
Previously, data science pipelines in OpenShift AI were based on KubeFlow Pipelines v1. Data science pipelines are now based on KubeFlow Pipelines v2, which uses a different workflow engine. Data science pipelines 2.0 is enabled and deployed by default in OpenShift AI.
Data science pipelines 1.0 resources are no longer supported or managed by OpenShift AI. After upgrading to OpenShift AI with data science pipelines 2.0, it is no longer possible to deploy, view, or edit the details of pipelines that are based on data science pipelines 1.0 from either the dashboard or the KFP API server. If you are a current data science pipelines user, do not upgrade to OpenShift AI with data science pipelines 2.0 until you are ready to migrate to the new data science pipelines solution.
OpenShift AI does not automatically migrate existing data science pipelines 1.0 instances to 2.0. If you are upgrading to OpenShift AI with data science pipelines 2.0, you must manually migrate your existing data science pipelines 1.0 instances and update your workbenches. For more information, see Migrating to data science pipelines 2.0.
Data science pipelines 2.0 contains an installation of Argo Workflows. Red Hat does not support direct customer usage of this installation of Argo Workflows. To upgrade to OpenShift AI with data science pipelines 2.0, ensure that no separate installation of Argo Workflows exists on your cluster.
Chapter 2. Configuring the upgrade strategy for OpenShift AI
As a cluster administrator, you can configure either an automatic or manual upgrade strategy for the Red Hat OpenShift AI Operator.
By default, the Red Hat OpenShift AI Operator follows a sequential update process. This means that if there are several versions between the current version and the version that you intend to upgrade to, Operator Lifecycle Manager (OLM) upgrades the Operator to each of the intermediate versions before it upgrades it to the final, target version. If you configure automatic upgrades, OLM automatically upgrades the Operator to the latest available version, without human intervention. If you configure manual upgrades, a cluster administrator must manually approve each sequential update between the current version and the final, target version.
For information about supported versions, see the Red Hat OpenShift AI Life Cycle Knowledgebase article.
Prerequisites
- You have cluster administrator privileges for your OpenShift cluster.
- The Red Hat OpenShift AI Operator is installed.
Procedure
- Log in to the OpenShift cluster web console as a cluster administrator.
- In the Administrator perspective, in the left menu, select Operators → Installed Operators.
- Click the Red Hat OpenShift AI Operator.
- Click the Subscription tab.
- Under Update approval, click the pencil icon and select one of the following update strategies (a declarative equivalent is sketched after this procedure):
  - Automatic: New updates are installed as soon as they become available.
  - Manual: A cluster administrator must approve any new update before installation begins.
- Click Save.
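If you manage the Operator subscription declaratively (for example, through GitOps), the same choice maps to the installPlanApproval field of the Subscription resource. The following is a minimal sketch; the subscription name, namespace, and channel are assumptions and vary by installation:

```yaml
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: rhods-operator            # assumed name; varies by installation
  namespace: redhat-ods-operator  # assumed namespace
spec:
  channel: stable                 # assumed channel
  installPlanApproval: Manual     # or Automatic
  name: rhods-operator
  source: redhat-operators
  sourceNamespace: openshift-marketplace
```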
Additional resources
- For more information about upgrading Operators that have been installed by using OLM, see Updating installed Operators in OpenShift Dedicated or Updating installed Operators in Red Hat OpenShift Service on AWS (ROSA).
Chapter 3. Requirements for upgrading OpenShift AI
When upgrading OpenShift AI, you must complete the following tasks.
Check the components in the DataScienceCluster object
When you upgrade Red Hat OpenShift AI, the upgrade process automatically uses the values from the previous DataScienceCluster object.
After the upgrade, you should inspect the DataScienceCluster object and optionally update the status of any components as described in Updating the installation status of Red Hat OpenShift AI components by using the web console.
New components are not automatically added to the DataScienceCluster object during upgrade. If you want to use a new component, you must manually edit the DataScienceCluster object to add the component entry, as shown in the sketch that follows.
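For example, a minimal sketch of adding a component entry under spec.components and setting it to Managed; the trainingoperator entry is illustrative, so substitute the name of the component that you want to enable:

```yaml
apiVersion: datasciencecluster.opendatahub.io/v1
kind: DataScienceCluster
metadata:
  name: default-dsc
spec:
  components:
    # ...existing component entries...
    trainingoperator:             # illustrative: the newly added component
      managementState: Managed
```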
Migrate data science pipelines
Previously, data science pipelines in OpenShift AI were based on KubeFlow Pipelines v1. Data science pipelines are now based on KubeFlow Pipelines v2, which uses a different workflow engine. Data science pipelines 2.0 is enabled and deployed by default in OpenShift AI.
Data science pipelines 1.0 resources are no longer supported or managed by OpenShift AI. It is no longer possible to deploy, view, or edit the details of pipelines that are based on data science pipelines 1.0 from either the dashboard or the KFP API server.
OpenShift AI does not automatically migrate existing data science pipelines 1.0 instances to 2.0. Before upgrading OpenShift AI, you must manually migrate your existing data science pipelines 1.0 instances. For more information, see Migrating to data science pipelines 2.0.
Data science pipelines 2.0 contains an installation of Argo Workflows. Red Hat does not support direct customer usage of this installation of Argo Workflows.
If you upgrade to OpenShift AI with data science pipelines 2.0 while an Argo Workflows installation that was not installed by data science pipelines exists on your cluster, OpenShift AI components are not upgraded. To complete the component upgrade, disable data science pipelines or remove the separate installation of Argo Workflows. The component upgrade then completes automatically.
Address KServe requirements
For the KServe component, which is used by the single-model serving platform to serve large models, you must meet the following requirements:
- To fully install and use KServe, you must also install Operators for Red Hat OpenShift Serverless and Red Hat OpenShift Service Mesh and perform additional configuration. For more information, see Serving large models.
- If you want to add an authorization provider for the single-model serving platform, you must install the Red Hat - Authorino Operator. For more information, see Adding an authorization provider for the single-model serving platform.
Update workflows interacting with the OdhDashboardConfig resource
Previously, cluster administrators used the groupsConfig option in the OdhDashboardConfig resource to manage the OpenShift groups (both administrators and non-administrators) that can access the OpenShift AI dashboard. Starting with OpenShift AI 2.17, this functionality has moved to the Auth resource. If you have workflows (such as GitOps workflows) that interact with OdhDashboardConfig, you must update them to reference the Auth resource instead. The following table and the sketch after it summarize the change:
| | OpenShift AI 2.16 and earlier | OpenShift AI 2.17 and later |
|---|---|---|
| apiVersion | opendatahub.io/v1alpha | services.platform.opendatahub.io/v1alpha1 |
| kind | OdhDashboardConfig | Auth |
| name | odh-dashboard-config | auth |
| Admin groups | spec.groupsConfig.adminGroups | spec.adminGroups |
| User groups | spec.groupsConfig.allowedGroups | spec.allowedGroups |
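As an illustration of the workflow update, a manifest that previously configured dashboard access roughly like this (the group names are examples):

```yaml
apiVersion: opendatahub.io/v1alpha
kind: OdhDashboardConfig
metadata:
  name: odh-dashboard-config
spec:
  groupsConfig:
    adminGroups: 'rhods-admins'            # comma-separated string of group names
    allowedGroups: 'system:authenticated'
```

would instead define the Auth resource; a sketch under the same assumptions (note that the fields are lists rather than comma-separated strings):

```yaml
apiVersion: services.platform.opendatahub.io/v1alpha1
kind: Auth
metadata:
  name: auth
spec:
  adminGroups:
    - rhods-admins
  allowedGroups:
    - 'system:authenticated'
```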
Update Kueue
In OpenShift AI, cluster administrators use Kueue to configure quota management for distributed workloads.
When upgrading from OpenShift AI 2.17 or earlier, the version of the MultiKueue Custom Resource Definitions (CRDs) changes from v1alpha1 to v1beta1.
However, if the kueue component is set to Managed, the Red Hat OpenShift AI Operator does not automatically remove the v1alpha1 MultiKueue CRDs during the upgrade. The deployment of the Kueue component then becomes blocked, as indicated in the default-dsc DataScienceCluster custom resource, where the value of the kueueReady condition remains set to False.
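The blocked state looks roughly like the following fragment of the DataScienceCluster status; this is an abbreviated sketch with surrounding condition fields omitted:

```yaml
status:
  conditions:
    - type: kueueReady
      status: "False"   # remains False while the v1alpha1 CRDs are present
```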
You can resolve this problem as follows:
Important: The MultiKueue feature is not currently supported in Red Hat OpenShift AI. If you created any resources based on the MultiKueue CRDs, those resources will be deleted when you delete the CRDs. If you do not want to lose your data, create a backup before deleting the CRDs.
- Log in to the OpenShift Console.
- In the Administrator perspective, click Administration → CustomResourceDefinitions.
- In the search field, enter multik. Update the MultiKueueCluster CRD as follows:
  - Click the CRD name, and click the YAML tab.
  - Ensure that the metadata.labels section includes the following entry (see the sketch after this procedure):

    ```yaml
    app.opendatahub.io/kueue: 'true'
    ```

  - Click Save.
- Repeat the above steps to update the MultiKueueConfig CRD.
Remove the MultiKueueCluster and MultiKueueConfig CRDs by completing the following steps for each CRD:
- Click the Actions menu.
- Click Delete CustomResourceDefinition.
- Click Delete to confirm the deletion.
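For reference, the label from the earlier step sits under metadata.labels of each MultiKueue CRD. The following is a minimal sketch; the CRD name shown follows the upstream Kueue naming convention and is an assumption here:

```yaml
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: multikueueclusters.kueue.x-k8s.io   # assumed upstream CRD name
  labels:
    app.opendatahub.io/kueue: 'true'        # lets the Operator manage the CRD during upgrade
```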
The Red Hat OpenShift AI Operator starts the Kueue Controller, and Kueue then automatically creates the v1beta1 MultiKueue CRDs. In the default-dsc DataScienceCluster custom resource, the kueueReady condition changes to True. For information about how to check that the kueue-controller-manager-<pod-id> pod is Running, see Installing the distributed workloads components.
Chapter 4. Updating the installation status of Red Hat OpenShift AI components by using the web console
You can use the OpenShift web console to update the installation status of components of Red Hat OpenShift AI on your OpenShift cluster.
If you upgraded OpenShift AI, the upgrade process automatically used the values of the previous version’s DataScienceCluster object. New components are not automatically added to the DataScienceCluster object.
After upgrading OpenShift AI:
- Inspect the default DataScienceCluster object to check and optionally update the managementState status of the existing components.
- Add any new components to the DataScienceCluster object.
Prerequisites
- Red Hat OpenShift AI is installed as an Add-on to your Red Hat OpenShift cluster.
- You have cluster administrator privileges for your OpenShift cluster.
Procedure
- Log in to the OpenShift web console as a cluster administrator.
- In the web console, click Operators → Installed Operators and then click the Red Hat OpenShift AI Operator.
- Click the Data Science Cluster tab.
- On the DataScienceClusters page, click the default object.
- Click the YAML tab. An embedded YAML editor opens, showing the default custom resource (CR) for the DataScienceCluster object, similar to the following example:

```yaml
apiVersion: datasciencecluster.opendatahub.io/v1
kind: DataScienceCluster
metadata:
  name: default-dsc
spec:
  components:
    codeflare:
      managementState: Removed
    dashboard:
      managementState: Removed
    datasciencepipelines:
      managementState: Removed
    kserve:
      managementState: Removed
    kueue:
      managementState: Removed
    modelmeshserving:
      managementState: Removed
    ray:
      managementState: Removed
    trainingoperator:
      managementState: Removed
    trustyai:
      managementState: Removed
    workbenches:
      managementState: Removed
```
- In the spec.components section of the CR, for each OpenShift AI component shown, set the value of the managementState field to either Managed or Removed. These values are defined as follows:
  - Managed: The Operator actively manages the component, installs it, and tries to keep it active. The Operator will upgrade the component only if it is safe to do so.
  - Removed: The Operator actively manages the component but does not install it. If the component is already installed, the Operator will try to remove it.
Important:
- To learn how to install the KServe component, which is used by the single-model serving platform to serve large models, see Installing the single-model serving platform.
- If you have not enabled the KServe component (that is, you set the value of the managementState field to Removed), you must also disable the dependent Service Mesh component to avoid errors. See Disabling KServe dependencies.
- To learn how to install the distributed workloads feature, see Installing the distributed workloads components.
- Click Save.
For any components that you updated, OpenShift AI initiates a rollout that affects all pods to use the updated image.
- If you are upgrading from OpenShift AI 2.19 or earlier, upgrade the Authorino Operator to the stable update channel, version 1.2.1 or later (a declarative Subscription sketch follows these steps):
  - Update Authorino to the latest available release in the tech-preview-v1 channel (1.1.2), if you have not done so already.
  - Switch to the stable channel:
    - Navigate to the Subscription settings of the Authorino Operator.
    - Under Update channel, click the highlighted tech-preview-v1.
    - Change the channel to stable.
  - Select the update option for Authorino 1.2.1.
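If you manage the Authorino Operator subscription through GitOps, the channel switch corresponds to updating the Subscription resource. The following is a minimal sketch; the subscription name and namespace are assumptions:

```yaml
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: authorino-operator          # assumed subscription name
  namespace: openshift-operators    # assumed namespace
spec:
  channel: stable                   # previously tech-preview-v1
  name: authorino-operator
  source: redhat-operators
  sourceNamespace: openshift-marketplace
```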
Verification
Confirm that there is a running pod for each component:
- In the OpenShift web console, click Workloads → Pods.
- In the Project list at the top of the page, select redhat-ods-applications.
- In the applications namespace, confirm that there are running pods for each of the OpenShift AI components that you installed.
Confirm the status of all installed components:
- In the OpenShift web console, click Operators → Installed Operators.
- Click the Red Hat OpenShift AI Operator.
- Click the Data Science Cluster tab and select the DataScienceCluster object called default-dsc.
- Select the YAML tab.
- In the installedComponents section, confirm that the components you installed have a status value of true (see the sketch after this list).

Note: If a component shows with the component-name: {} format in the spec.components section of the CR, the component is not installed.
- In the Red Hat OpenShift AI dashboard, users can view the list of the installed OpenShift AI components, their corresponding source (upstream) components, and the versions of the installed components, as described in Viewing installed components.
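For illustration, the installedComponents section of the DataScienceCluster status might look like the following sketch; the component names shown are examples:

```yaml
status:
  installedComponents:
    dashboard: true      # installed components report true
    kserve: true
    workbenches: true
```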