Chapter 3. Requirements for upgrading OpenShift AI
When upgrading OpenShift AI, you must complete the following tasks.
Check the components in the DataScienceCluster object
When you upgrade Red Hat OpenShift AI, the upgrade process automatically uses the values from the previous DataScienceCluster object.
After the upgrade, you should inspect the DataScienceCluster object and optionally update the status of any components as described in Updating the installation status of Red Hat OpenShift AI components by using the web console.
New components are not automatically added to the DataScienceCluster object during upgrade. If you want to use a new component, you must manually edit the DataScienceCluster object to add the component entry.
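For example, a minimal sketch of adding a component entry by editing the DataScienceCluster object; the kueue entry and the Managed value shown here are illustrative, so substitute the component that you want to enable:

    apiVersion: datasciencecluster.opendatahub.io/v1
    kind: DataScienceCluster
    metadata:
      name: default-dsc
    spec:
      components:
        kueue:
          managementState: Managed   # set to Removed to disable this component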
Migrate data science pipelines
Previously, data science pipelines in OpenShift AI were based on KubeFlow Pipelines v1. Starting with OpenShift AI 2.9, data science pipelines are based on KubeFlow Pipelines v2, which uses a different workflow engine. Data science pipelines 2.0 is enabled and deployed by default in OpenShift AI.
Starting with OpenShift AI 2.16, data science pipelines 1.0 resources are no longer supported or managed by OpenShift AI. It is no longer possible to deploy, view, or edit the details of pipelines that are based on data science pipelines 1.0 from either the dashboard or the KFP API server.
OpenShift AI does not automatically migrate existing data science pipelines 1.0 instances to 2.0. If you are upgrading to OpenShift AI 2.16 or later, you must manually migrate your existing data science pipelines 1.0 instances. For more information, see Migrating to data science pipelines 2.0.
Data science pipelines 2.0 contains an installation of Argo Workflows. Red Hat does not support direct customer usage of this installation of Argo Workflows.
If you upgrade to OpenShift AI with data science pipelines 2.0 while your cluster contains an Argo Workflows installation that was not installed by data science pipelines, OpenShift AI components are not upgraded. To complete the component upgrade, disable data science pipelines or remove the separate installation of Argo Workflows; the component upgrade then completes automatically.
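For example, a minimal sketch of disabling data science pipelines, shown as a spec fragment of the DataScienceCluster object:

    spec:
      components:
        datasciencepipelines:
          managementState: Removed   # disabling data science pipelines unblocks the component upgrade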
Address KServe requirements
For the KServe component, which is used by the single-model serving platform to serve large models, you must meet the following requirements:
- To fully install and use KServe, you must also install Operators for Red Hat OpenShift Serverless and Red Hat OpenShift Service Mesh and perform additional configuration. For more information, see Serving large models.
- If you want to add an authorization provider for the single-model serving platform, you must install the Red Hat - Authorino Operator. For information, see Adding an authorization provider for the single-model serving platform.
- If you have not enabled the KServe component (that is, you set the value of the managementState field to Removed in the DataScienceCluster object), you must also disable the dependent Service Mesh component to avoid errors, as shown in the sketch after this list. See Disabling KServe dependencies.
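A minimal sketch of the two settings involved when KServe is disabled, shown as spec fragments; the Service Mesh setting lives in the DSCInitialization object rather than the DataScienceCluster object, and the default-dsci name is the assumed default:

    # DataScienceCluster object: KServe disabled
    spec:
      components:
        kserve:
          managementState: Removed

    # DSCInitialization object (default name: default-dsci): disable the dependent Service Mesh component
    spec:
      serviceMesh:
        managementState: Removed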
Update workflows interacting with OdhDashboardConfig resource
Previously, cluster administrators used the groupsConfig option in the OdhDashboardConfig resource to manage the OpenShift groups (both administrators and non-administrators) that can access the OpenShift AI dashboard. Starting with OpenShift AI 2.17, this functionality has moved to the Auth resource. If you have workflows (such as GitOps workflows) that interact with OdhDashboardConfig, you must update them to reference the Auth resource instead.
| | OpenShift AI 2.16 and earlier | OpenShift AI 2.17 and later |
|---|---|---|
| Resource | OdhDashboardConfig | Auth |
| Admin groups | spec.groupsConfig.adminGroups | spec.adminGroups |
| User groups | spec.groupsConfig.allowedGroups | spec.allowedGroups |
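For example, a minimal sketch of an Auth resource that a GitOps workflow could apply instead of the OdhDashboardConfig groups settings; the apiVersion shown is an assumption and the group names are placeholders:

    apiVersion: services.platform.opendatahub.io/v1alpha1   # assumed API group and version for the Auth resource
    kind: Auth
    metadata:
      name: auth
    spec:
      adminGroups:
        - rhods-admins              # placeholder: OpenShift group with dashboard administrator access
      allowedGroups:
        - 'system:authenticated'    # placeholder: groups allowed to access the dashboard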
Update Kueue
In OpenShift AI, cluster administrators use Kueue to configure quota management for distributed workloads.
When upgrading from OpenShift AI 2.17 or earlier, the version of the MultiKueue Custom Resource Definitions (CRDs) changes from v1alpha1 to v1beta1.
However, if the kueue component is set to Managed, the Red Hat OpenShift AI Operator does not automatically remove the v1alpha1 MultiKueue CRDs during the upgrade. The deployment of the Kueue component then becomes blocked, as indicated in the default-dsc DataScienceCluster custom resource, where the value of the kueueReady condition remains set to False.
You can resolve this problem as follows:
The MultiKueue feature is not currently supported in Red Hat OpenShift AI. If you created any resources based on the MultiKueue CRDs, those resources will be deleted when you delete the CRDs. If you do not want to lose your data, create a backup before deleting the CRDs.
- Log in to the OpenShift Console.
- In the Administrator perspective, click Administration → CustomResourceDefinitions.
- In the search field, enter multik.
- Update the MultiKueueCluster CRD as follows:
- Click the CRD name, and click the YAML tab.
- Ensure that the metadata:labels section includes the following entry:

    app.opendatahub.io/kueue: 'true'
- Click Save.
- Repeat the above steps to update the MultiKueueConfig CRD.
- Remove the MultiKueueCluster and MultiKueueConfig CRDs by completing the following steps for each CRD:
- Click the Actions menu.
- Click Delete CustomResourceDefinition.
- Click Delete to confirm the deletion.
The Red Hat OpenShift AI Operator starts the Kueue Controller, and Kueue then automatically creates the v1beta1 MultiKueue CRDs. In the default-dsc DataScienceCluster custom resource, the kueueReady condition changes to True. For information about how to check that the kueue-controller-manager-<pod-id> pod is Running, see Installing the distributed workloads components (for disconnected environments, see Installing the distributed workloads components).
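A minimal sketch of the status condition to look for in the default-dsc DataScienceCluster custom resource after the CRDs are recreated (the exact condition layout can vary between releases):

    status:
      conditions:
        - type: kueueReady
          status: "True"   # the Kueue component deployment is no longer blocked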
Check the status of certificate management
You can use self-signed certificates in OpenShift AI.
After you upgrade, check the management status for Certificate Authority (CA) bundles as described in Working with certificates.
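A minimal sketch of where that management status is typically set, assuming the trustedCABundle field in the DSCInitialization object; confirm the exact fields in Working with certificates:

    # Spec fragment of the DSCInitialization object (assumed field names)
    spec:
      trustedCABundle:
        managementState: Managed   # controls whether OpenShift AI manages the CA bundle
        customCABundle: ""         # optionally add self-signed certificates here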