Installing and uninstalling OpenShift AI Self-Managed


Red Hat OpenShift AI Self-Managed 3.2

Install and uninstall OpenShift AI Self-Managed

Abstract

Install and uninstall OpenShift AI Self-Managed on your OpenShift cluster.

Preface

Learn how to install Red Hat OpenShift AI Self-Managed on your OpenShift cluster by using either the OpenShift CLI (oc) or the web console, and how to uninstall the product by using the recommended command-line interface (CLI) method.

Important

You cannot upgrade from OpenShift AI 2.25 or any earlier version to 3.0. OpenShift AI 3.0 introduces significant technology and component changes and is intended for new installations only. To use OpenShift AI 3.0, install the Red Hat OpenShift AI Operator on a cluster running OpenShift Container Platform 4.19 or later and select the fast-3.x channel.

Support for upgrades will be available in a later release, including upgrades from OpenShift AI 2.25 to a stable 3.x version.

For more information, see the Why upgrades to OpenShift AI 3.0 are not supported Knowledgebase article.

Note

Red Hat does not support installing more than one instance of OpenShift AI on your cluster.

Red Hat does not support installing the Red Hat OpenShift AI Operator on the same cluster as the Red Hat OpenShift AI Add-on.

Chapter 1. Architecture of OpenShift AI Self-Managed

Red Hat OpenShift AI Self-Managed is delivered as an Operator and is available in self-managed environments, such as Red Hat OpenShift Container Platform, and in Red Hat-managed cloud environments such as Red Hat OpenShift Dedicated (with a Customer Cloud Subscription for AWS or GCP), Red Hat OpenShift Service on Amazon Web Services (ROSA classic or ROSA HCP), or Microsoft Azure Red Hat OpenShift.

OpenShift AI integrates the following components and services:

  • At the service layer:

    OpenShift AI dashboard
    A customer-facing dashboard that shows available and installed applications for the OpenShift AI environment as well as learning resources such as tutorials, quick starts, and documentation. Administrative users can access functionality to manage users, clusters, workbench images, and model-serving runtimes. Data scientists can use the dashboard to create projects to organize their data science work.
    Model serving
    Data scientists can deploy trained machine-learning models to serve intelligent applications in production. After deployment, applications can send requests to the model using its deployed API endpoint.
    AI pipelines
    Data scientists can build portable machine learning (ML) workflows with AI pipelines by using Docker containers. With AI pipelines, data scientists can automate workflows as they develop their data science models.
    Jupyter (self-managed)
    A self-managed application that allows data scientists to configure a basic standalone workbench and develop machine learning models in JupyterLab.
    Distributed workloads
    Data scientists can use multiple nodes in parallel to train machine-learning models or process data more quickly. This approach significantly reduces the task completion time, and enables the use of larger datasets and more complex models.
    Retrieval-Augmented Generation (RAG)
    Data scientists and AI engineers can leverage Retrieval-Augmented Generation (RAG) capabilities provided by the integrated Llama Stack Operator. By combining large language model inference, semantic retrieval, and vector database storage, data scientists and AI engineers can obtain tailored, accurate, and verifiable answers to complex queries based on their own datasets within a project.
  • At the management layer:

    The Red Hat OpenShift AI Operator
    A meta-operator that deploys and maintains all components and sub-operators that are part of OpenShift AI.

When you install the Red Hat OpenShift AI Operator in the OpenShift cluster using the predefined projects, the following new projects are created:

  • The redhat-ods-operator project contains the Red Hat OpenShift AI Operator.
  • The redhat-ods-applications project includes the dashboard and other required components of OpenShift AI.
  • The rhods-notebooks project is where basic workbenches are deployed by default.

You can specify custom projects if needed. You or your data scientists must also create additional projects for the applications that will use your machine learning models.

Do not install independent software vendor (ISV) applications in namespaces associated with OpenShift AI.

Chapter 2. Understanding update channels

You can use update channels to specify which Red Hat OpenShift AI minor version you intend to update your Operator to. Update channels also let you choose the timing and level of support for your updates through the fast, fast-x.y, stable, stable-x.y, eus-x.y, and alpha channel options.

The subscription of an installed Operator specifies the update channel, which is used to track and receive updates for the Operator. You can change the update channel to start tracking and receiving updates from a newer channel. For more information about the release frequency and the lifecycle associated with each of the available update channels, see the Red Hat OpenShift AI Self-Managed Life Cycle Knowledgebase article.
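You can also change the update channel from the CLI by patching the Subscription object. The following is a minimal sketch; it assumes the default Subscription name and namespace used later in this guide (rhods-operator in the redhat-ods-operator project) and uses fast-3.x as an example target channel.

```shell
# Switch the Operator subscription to a different update channel (example: fast-3.x).
# Assumes the default Subscription name and namespace; adjust both if you customized them.
oc patch subscription rhods-operator \
  -n redhat-ods-operator \
  --type merge \
  -p '{"spec": {"channel": "fast-3.x"}}'
```

With an Automatic approval strategy, Operator Lifecycle Manager (OLM) begins tracking the new channel immediately; with Manual approval, a cluster administrator must approve the resulting install plan.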

fast or fast-x.y
  Support: One month of full support
  Release frequency: Every month
  Recommended environment: Production environments with access to the latest product features. Select this streaming channel with automatic updates to avoid manually upgrading every month.
  NOTE: OpenShift AI 3.0 is available only through the fast-3.x channel, not the general fast channel.

stable
  Support: Three months of full support
  Release frequency: Every three months
  Recommended environment: Production environments with stability prioritized over new feature availability. Select this streaming channel with automatic updates to access the latest stable release and avoid manually upgrading.

stable-x.y
  Support: Seven months of full support
  Release frequency: Every three months
  Recommended environment: Production environments with stability prioritized over new feature availability. Select numbered stable channels (such as stable-2.25) to plan and upgrade to the next stable release while keeping your deployment under full support.

eus-x.y
  Support: Seven months of full support followed by Extended Update Support for eleven months
  Release frequency: Every nine months
  Recommended environment: Enterprise-grade environments that cannot upgrade within a seven-month window. Select this streaming channel if you prioritize stability over new feature availability.

alpha
  Support: One month of full support
  Release frequency: Every month
  Recommended environment: Development environments with early-access features that might not be functionally complete. Select this channel to use early-access features to test functionality and provide feedback during the development process. Early-access features are not supported with Red Hat production service level agreements (SLAs).

For more information about the support scope of Red Hat Technology Preview features, see Technology Preview Features Support Scope.

For more information about the support scope of Red Hat Developer Preview features, see Developer Preview Features Support Scope.

Note

The embedded and beta channels are legacy channels that will be removed in a future release. Do not select the embedded or beta channels for a new Operator installation.

Chapter 3. Installing and deploying OpenShift AI

Important

You cannot upgrade from OpenShift AI 2.25 or any earlier version to 3.0. OpenShift AI 3.0 introduces significant technology and component changes and is intended for new installations only. To use OpenShift AI 3.0, install the Red Hat OpenShift AI Operator on a cluster running OpenShift Container Platform 4.19 or later and select the fast-3.x channel.

Support for upgrades will be available in a later release, including upgrades from OpenShift AI 2.25 to a stable 3.x version.

For more information, see the Why upgrades to OpenShift AI 3.0 are not supported Knowledgebase article.

Red Hat OpenShift AI is a platform for data scientists and developers of artificial intelligence (AI) applications. It provides a fully supported environment that lets you rapidly develop, train, test, and deploy machine learning models on-premises or in the public cloud.

OpenShift AI is provided as a managed cloud service add-on for Red Hat OpenShift or as self-managed software that you can install on-premises or in the public cloud on OpenShift.

For information about installing OpenShift AI as self-managed software on your OpenShift cluster in a disconnected environment, see Installing and uninstalling OpenShift AI Self-Managed in a disconnected environment. For information about installing OpenShift AI as a managed cloud service add-on, see Installing and uninstalling OpenShift AI Cloud Service.

Installing OpenShift AI involves the following high-level tasks:

  1. Confirm that your OpenShift cluster meets all requirements. See Requirements for OpenShift AI Self-Managed.
  2. Install the Red Hat OpenShift AI Operator. See Installing the Red Hat OpenShift AI Operator.
  3. Install OpenShift AI components. See Installing and managing Red Hat OpenShift AI components.
  4. Complete any additional configuration required for the components you enabled. See the component-specific configuration sections for details.
  5. Configure user and administrator groups to provide user access to OpenShift AI. See Adding users to OpenShift AI user groups.
  6. Access the OpenShift AI dashboard. See Accessing the OpenShift AI dashboard.

3.1. Requirements for OpenShift AI Self-Managed

You must meet the following requirements before you can install Red Hat OpenShift AI on your Red Hat OpenShift cluster.

3.1.1. Platform requirements

Subscriptions

  • A subscription for Red Hat OpenShift AI Self-Managed is required.
  • If you want to install OpenShift AI Self-Managed in a Red Hat-managed cloud environment, you must also have a subscription for one of the following platforms:

    • Red Hat OpenShift Dedicated on Amazon Web Services (AWS) or Google Cloud Platform (GCP)
    • Red Hat OpenShift Service on Amazon Web Services (ROSA classic)
    • Red Hat OpenShift Service on Amazon Web Services with hosted control planes (ROSA HCP)
    • Microsoft Azure Red Hat OpenShift
    • Red Hat OpenShift Kubernetes Engine (OKE)

      Note

      While OpenShift Kubernetes Engine (OKE) typically restricts the installation of certain post-installation Operators, Red Hat provides a specific licensing exception for Red Hat OpenShift AI users. This exception exclusively applies to Operators used to support Red Hat OpenShift AI workloads. Installing or using these Operators for purposes unrelated to OpenShift AI is a violation of the OKE service agreement.

Contact your Red Hat account manager to purchase new subscriptions. If you do not yet have an account manager, complete the form at https://www.redhat.com/en/contact to request one.

Cluster administrator access

  • Cluster administrator access is required to install OpenShift AI.
  • You can use an existing cluster or create a new one that meets the supported version requirements.

Supported OpenShift versions

The following OpenShift versions are supported for installing OpenShift AI:

  • OpenShift Container Platform 4.19 to 4.20. See OpenShift Container Platform installation overview.

    • To deploy models by using Distributed Inference with llm-d, your cluster must be running version 4.20 or later.
  • OpenShift Dedicated 4. See Creating an OpenShift Dedicated cluster.
  • ROSA classic 4. See Install ROSA classic clusters.
  • ROSA HCP 4. See Install ROSA with HCP clusters.
  • OpenShift Kubernetes Engine (OKE). See About OpenShift Kubernetes Engine.

    Note

    While OpenShift Kubernetes Engine (OKE) typically restricts the installation of certain post-installation Operators, Red Hat provides a specific licensing exception for Red Hat OpenShift AI users. This exception exclusively applies to Operators used to support Red Hat OpenShift AI workloads. Installing or using these Operators for purposes unrelated to OpenShift AI is a violation of the OKE service agreement.

    • The following Operators are required dependencies for Red Hat OpenShift AI 2.x and 3.x. These Operators are not supported on OKE, but can be installed if given an exception.

      Red Hat OpenShift AI 2.x: Authorino Operator, Service Mesh Operator, Serverless Operator

      Red Hat OpenShift AI 3.x: Job-set-operator, openshift-custom-metrics-autoscaler-operator, cert-manager Operator, Leader Worker Set Operator, Red Hat Connectivity Link Operator, Kueue Operator (RHBOK), SR-IOV Operator, GPU Operator (with custom configurations), OpenTelemetry, Tempo, Cluster Observability Operator

Important

In OpenStack, CodeReady Containers (CRC), and other private cloud environments without integrated external DNS, you must manually configure DNS A or CNAME records after installing the Operator and components, when the LoadBalancer IP becomes available. For more information, see Configuring External DNS for RHOAI 3.x on OpenStack and Private Clouds.

Cluster configuration

  • A minimum of 2 worker nodes with at least 8 CPUs and 32 GiB RAM each is required to install the Operator.
  • For single-node OpenShift clusters, the node must have at least 32 CPUs and 128 GiB RAM.
  • Additional resources are required depending on your workloads.
  • Open Data Hub must not be installed on the cluster.

Storage requirements

  • Your cluster must have a default storage class that supports dynamic provisioning. To confirm that a default storage class is configured, run the following command:

    oc get storageclass

    If no storage class is marked as the default, see Changing the default storage class in the OpenShift Container Platform documentation.
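    For a scripted check, you can inspect the storageclass.kubernetes.io/is-default-class annotation directly. The following sketch prints each storage class together with its default-class flag:

    ```shell
    # List each storage class and whether it is marked as the cluster default.
    # A default storage class carries the annotation
    # storageclass.kubernetes.io/is-default-class=true.
    oc get storageclass \
      -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.metadata.annotations.storageclass\.kubernetes\.io/is-default-class}{"\n"}{end}'
    ```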

Identity provider configuration

Internet access

  • Along with internet access, the following domains must be accessible during the installation of OpenShift AI:

    • cdn.redhat.com
    • subscription.rhn.redhat.com
    • registry.access.redhat.com
    • registry.redhat.io
    • quay.io
  • For environments that build or customize CUDA-based images using NVIDIA’s base images, or that directly pull artifacts from the NVIDIA NGC catalog, the following domains must also be accessible:

    • ngc.download.nvidia.cn
    • developer.download.nvidia.com
Note

Access to these NVIDIA domains is not required for standard OpenShift AI installations. The CUDA-based container images used by OpenShift AI are prebuilt and hosted on Red Hat’s registry at registry.redhat.io.
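You can spot-check the reachability of these domains from any host with egress equivalent to the cluster's. A rough sketch, assuming curl is available:

```shell
# Spot-check egress to the registries required during installation.
# Any HTTP response (even 403/404) indicates the host is reachable.
for host in cdn.redhat.com subscription.rhn.redhat.com \
            registry.access.redhat.com registry.redhat.io quay.io; do
  if curl -sS --connect-timeout 5 --output /dev/null "https://${host}"; then
    echo "${host}: reachable"
  else
    echo "${host}: unreachable"
  fi
done
```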

Object storage

  • Several components of OpenShift AI require or can use S3-compatible object storage, such as AWS S3, MinIO, Ceph, or IBM Cloud Storage. Object storage provides HTTP-based access to data by using the S3 API, which is the standard interface for most object storage services.
  • Object storage is required for:

    • Single-model serving platform, for storing and deploying models.
    • AI pipelines, for storing artifacts, logs, and intermediate results.
  • Object storage can also be used by:

    • Workbenches, for accessing large datasets.
    • Kueue-based workloads, for reading input data and writing output results.
    • Code executed inside pipelines, for persisting generated models or other artifacts.

Custom namespaces

  • By default, OpenShift AI uses predefined namespaces, but you can define custom namespaces for the Operator, applications, and workbenches if needed. Namespaces created by OpenShift AI typically include openshift or redhat in their name. Do not rename these system namespaces because they are required for OpenShift AI to function properly.
  • If you use custom namespaces, create and label them before installing the OpenShift AI Operator. See Configuring custom namespaces.

3.1.2. Component requirements

Meet the requirements for the components and capabilities that you plan to use.

Workbenches (workbenches)

AI Pipelines (aipipelines)

  • To store your pipeline artifacts in an S3-compatible object storage bucket so that you do not consume local storage, configure write access to your S3 bucket on your storage account.
  • If your cluster is running in FIPS mode, any custom container images for data science pipelines must be based on UBI 9 or RHEL 9. This ensures compatibility with FIPS-approved pipeline components and prevents errors related to mismatched OpenSSL or GNU C Library (glibc) versions.
  • To use your own Argo Workflows instance, after installing the OpenShift AI Operator, see Configuring pipelines with your own Argo Workflows instance.

Kueue-based workloads (kueue, ray, trainingoperator)

Model serving platform (kserve)

  • Install the cert-manager Operator.

Distributed Inference with llm-d (advanced kserve)

Llama Stack and RAG workloads (llamastackoperator)

  • Install the Llama Stack Operator.
  • Install the Red Hat OpenShift Service Mesh Operator 3.x.
  • Install the cert-manager Operator.
  • Ensure you have GPU-enabled nodes available on your cluster.
  • Install the Node Feature Discovery Operator.
  • Install the NVIDIA GPU Operator.
  • Configure access to S3-compatible object storage for your model artifacts.
  • See Working with Llama Stack.

Model registry (modelregistry)

  • Configure access to an external MySQL database 5.x or later; 8.x is recommended.
  • Configure access to S3-compatible object storage.
  • See Creating a model registry.

3.2. Configuring custom namespaces

By default, OpenShift AI uses the following predefined namespaces:

  • redhat-ods-operator contains the Red Hat OpenShift AI Operator
  • redhat-ods-applications includes the dashboard and other required components of OpenShift AI
  • rhods-notebooks is where basic workbenches are deployed by default

If needed, you can define custom namespaces to use instead of the predefined ones before installing OpenShift AI. This flexibility supports environments with naming policies or conventions and allows cluster administrators to control where components such as workbenches are deployed.

Namespaces created by OpenShift AI typically include openshift or redhat in their name. Do not rename these system namespaces because they are required for OpenShift AI to function properly.

Prerequisites

  • You have access to an OpenShift AI cluster with cluster administrator privileges.
  • You have installed the OpenShift CLI (oc) as described in the appropriate documentation for your cluster.

  • You have not yet installed the Red Hat OpenShift AI Operator.

Procedure

  1. In a terminal window, if you are not already logged in to your OpenShift cluster as a cluster administrator, log in to the OpenShift CLI (oc) as shown in the following example:

    oc login <openshift_cluster_url> -u <admin_username> -p <password>
  2. Optional: To configure a custom operator namespace:

    1. Create a namespace YAML file named operator-namespace.yaml.

      apiVersion: v1
      kind: Namespace
      metadata:
        name: <operator-namespace>  # Defines the operator namespace.
    2. Create the namespace in your OpenShift cluster.

      $ oc create -f operator-namespace.yaml

      You see output similar to the following:

      namespace/<operator-namespace> created
    3. When you install the Red Hat OpenShift AI Operator, use this namespace instead of redhat-ods-operator.
  3. Optional: To configure a custom applications namespace:

    1. Create a namespace YAML file named applications-namespace.yaml.

      apiVersion: v1
      kind: Namespace
      metadata:
        name: <applications-namespace>  # Defines the applications namespace.
        labels:
          opendatahub.io/application-namespace: 'true'  # Adds the required label.
    2. Create the namespace in your OpenShift cluster.

      $ oc create -f applications-namespace.yaml

      You see output similar to the following:

      namespace/<applications-namespace> created
  4. Optional: To configure a custom workbench namespace:

    1. Create a namespace YAML file named workbench-namespace.yaml.

      apiVersion: v1
      kind: Namespace
      metadata:
        name: <workbench-namespace>  # Defines the workbench namespace.
    2. Create the namespace in your OpenShift cluster.

      $ oc create -f workbench-namespace.yaml

      You see output similar to the following:

      namespace/<workbench-namespace> created
    3. When you install the Red Hat OpenShift AI components, specify this namespace for the spec.workbenches.workbenchNamespace field. You cannot change the default workbench namespace after you have installed the Red Hat OpenShift AI Operator.
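After creating the namespaces, you can verify that they exist and that the applications namespace carries the required label. A sketch, using the placeholder names from this procedure:

```shell
# Confirm the custom namespaces exist.
oc get namespace <operator-namespace> <applications-namespace> <workbench-namespace>

# Confirm the applications namespace carries the required label.
oc get namespace <applications-namespace> \
  -o jsonpath='{.metadata.labels.opendatahub\.io/application-namespace}{"\n"}'
# Expected output: true
```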

3.3. Installing the Red Hat OpenShift AI Operator

This section shows how to install the Red Hat OpenShift AI Operator on your OpenShift cluster using the command-line interface (CLI) and the OpenShift web console.

Note

If your OpenShift cluster uses a proxy to access the Internet, you can configure the proxy settings for the Red Hat OpenShift AI Operator. See Overriding proxy settings of an Operator for more information.

The following procedure shows how to use the OpenShift CLI (oc) to install the Red Hat OpenShift AI Operator on your OpenShift cluster. You must install the Operator before you can install OpenShift AI components on the cluster.

Prerequisites

  • You have a running OpenShift cluster, version 4.19 or greater, configured with a default storage class that can be dynamically provisioned.
  • You have cluster administrator privileges for your OpenShift cluster.
  • You have installed the OpenShift CLI (oc) as described in the appropriate documentation for your cluster.

  • If you are using custom namespaces, you have created and labeled them as required.

    Note

    The example commands in this procedure use the predefined operator namespace. If you are using a custom operator namespace, replace redhat-ods-operator with your namespace.

Procedure

  1. Open a new terminal window.
  2. Follow these steps to log in to your OpenShift cluster as a cluster administrator:

    1. In the upper-right corner of the OpenShift web console, click your user name and select Copy login command.
    2. After you have logged in, click Display token.
    3. Copy the Log in with this token command and paste it in your terminal.

      $ oc login --token=<token> --server=<openshift_cluster_url>
  3. Create a namespace for installation of the Operator by performing the following actions:

    Note

    If you have already created a custom namespace for the Operator, you can skip this step.

    1. Create a namespace YAML file named rhods-operator-namespace.yaml.

      apiVersion: v1
      kind: Namespace
      metadata:
        name: redhat-ods-operator  # Defines the operator namespace.
    2. Create the namespace in your OpenShift cluster.

      $ oc create -f rhods-operator-namespace.yaml

      You see output similar to the following:

      namespace/redhat-ods-operator created
  4. Create an operator group for installation of the Operator by performing the following actions:

    1. Create an OperatorGroup object custom resource (CR) file, for example, rhods-operator-group.yaml.

      apiVersion: operators.coreos.com/v1
      kind: OperatorGroup
      metadata:
        name: rhods-operator
        namespace: redhat-ods-operator  # Defines the operator namespace.
    2. Create the OperatorGroup object in your OpenShift cluster.

      $ oc create -f rhods-operator-group.yaml

      You see output similar to the following:

      operatorgroup.operators.coreos.com/rhods-operator created
  5. Create a subscription for installation of the Operator by performing the following actions:

    1. Create a Subscription object CR file, for example, rhods-operator-subscription.yaml.

      apiVersion: operators.coreos.com/v1alpha1
      kind: Subscription
      metadata:
        name: rhods-operator
        namespace: redhat-ods-operator  # Defines the operator namespace.
      spec:
        name: rhods-operator
        channel: <channel>  # Update channel: fast, fast-x.y, stable, stable-x.y, eus-x.y, or alpha.
        source: redhat-operators
        sourceNamespace: openshift-marketplace
        startingCSV: rhods-operator.x.y.z  # Optional: pins the Operator version.

      For more information about update channels, see Understanding update channels. If you omit startingCSV, the subscription defaults to the latest Operator version. For more information, see the Red Hat OpenShift AI Self-Managed Life Cycle Knowledgebase article.
    2. Create the Subscription object in your OpenShift cluster to install the Operator.

      $ oc create -f rhods-operator-subscription.yaml

      You see output similar to the following:

      subscription.operators.coreos.com/rhods-operator created
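You can also verify the installation from the CLI instead of the web console. A sketch, assuming the predefined operator namespace:

```shell
# Check that the subscription was created and is tracking a channel.
oc get subscription rhods-operator -n redhat-ods-operator

# Check the ClusterServiceVersion; its PHASE should reach Succeeded.
oc get csv -n redhat-ods-operator

# Confirm the Operator pod is running.
oc get pods -n redhat-ods-operator
```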

Verification

  • In the OpenShift web console, click Operators → Installed Operators and confirm that the Red Hat OpenShift AI Operator shows one of the following statuses:

    • Installing - installation is in progress; wait for this to change to Succeeded. This might take several minutes.
    • Succeeded - installation is successful.

The following procedure shows how to use the OpenShift web console to install the Red Hat OpenShift AI Operator on your cluster. You must install the Operator before you can install OpenShift AI components on the cluster.

Prerequisites

  • You have a running OpenShift cluster, version 4.19 or greater, configured with a default storage class that can be dynamically provisioned.
  • You have cluster administrator privileges for your OpenShift cluster.
  • If you are using custom namespaces, you have created and labeled them as required.

Procedure

  1. Log in to the OpenShift web console as a cluster administrator.
  2. In the web console, click Operators → OperatorHub.
  3. On the OperatorHub page, locate the Red Hat OpenShift AI Operator by scrolling through the available Operators or by typing Red Hat OpenShift AI into the Filter by keyword box.
  4. Click the Red Hat OpenShift AI tile. The Red Hat OpenShift AI information pane opens.
  5. Select a Channel. For information about subscription update channels, see Understanding update channels.
  6. Select a Version.
  7. Click Install. The Install Operator page opens.
  8. Review or change the selected channel and version as needed.
  9. For Installation mode, note that the only available value is All namespaces on the cluster (default). This installation mode makes the Operator available to all namespaces in the cluster.
  10. For Installed Namespace, choose one of the following options:

    • To use the predefined operator namespace, select the Operator recommended Namespace: redhat-ods-operator option.
    • To use the custom operator namespace that you created, select the Select a Namespace option, and then select the namespace from the drop-down list.
  11. For Update approval, select one of the following update strategies:

    • Automatic: New updates in the update channel are installed as soon as they become available.
    • Manual: A cluster administrator must approve any new updates before installation begins.

      Important

      By default, the Red Hat OpenShift AI Operator follows a sequential update process. This means that if there are several versions between the current version and the target version, Operator Lifecycle Manager (OLM) upgrades the Operator to each of the intermediate versions before it upgrades it to the final, target version.

      If you configure automatic upgrades, OLM automatically upgrades the Operator to the latest available version. If you configure manual upgrades, a cluster administrator must manually approve each sequential update between the current version and the final, target version.

      For information about supported versions, see the Red Hat OpenShift AI Life Cycle Knowledgebase article.

  12. Click Install.

    The Installing Operators pane appears. When the installation finishes, a checkmark appears next to the Operator name.

Verification

  • In the OpenShift web console, click Operators → Installed Operators and confirm that the Red Hat OpenShift AI Operator shows one of the following statuses:

    • Installing - installation is in progress; wait for this to change to Succeeded. This might take several minutes.
    • Succeeded - installation is successful.

3.4. Installing and managing Red Hat OpenShift AI components

You can use the OpenShift command-line interface (CLI) or the OpenShift web console to install and manage components of Red Hat OpenShift AI on your OpenShift cluster.

To install Red Hat OpenShift AI components by using the OpenShift CLI (oc), you must create and configure a DataScienceCluster object.

Important

The following procedure describes how to create and configure a DataScienceCluster object to install Red Hat OpenShift AI components as part of a new installation.

For information about changing the installation status of OpenShift AI components after installation, see Updating the installation status of Red Hat OpenShift AI components by using the web console.

Prerequisites

Procedure

  1. Open a new terminal window.
  2. Follow these steps to log in to your OpenShift cluster as a cluster administrator:

    1. In the upper-right corner of the OpenShift web console, click your user name and select Copy login command.
    2. After you have logged in, click Display token.
    3. Copy the Log in with this token command and paste it in your terminal.

      $ oc login --token=<token> --server=<openshift_cluster_url>
  3. Create a DataScienceCluster object custom resource (CR) file, for example, rhods-operator-dsc.yaml.

    apiVersion: datasciencecluster.opendatahub.io/v2
    kind: DataScienceCluster
    metadata:
      name: default-dsc
    spec:
      components:
        aipipelines:
          argoWorkflowsControllers:
            managementState: Removed  # Set to Removed to use your own Argo Workflows instance.
          managementState: Removed
        dashboard:
          managementState: Removed
        feastoperator:
          managementState: Removed
        kserve:
          managementState: Removed
        kueue:
          defaultClusterQueueName: default
          defaultLocalQueueName: default
          managementState: Removed
        llamastackoperator:
          managementState: Removed
        modelregistry:
          managementState: Removed
          registriesNamespace: rhoai-model-registries
        ray:
          managementState: Removed
        trainingoperator:
          managementState: Removed
        trustyai:
          managementState: Removed
        workbenches:
          managementState: Removed
          workbenchNamespace: rhods-notebooks  # Predefined workbench namespace; replace if using a custom one.

    To use your own Argo Workflows instance with the aipipelines component, set argoWorkflowsControllers.managementState to Removed. This allows you to integrate with a managed Argo Workflows installation already on your OpenShift cluster and avoids conflicts with the embedded controller. See Configuring pipelines with your own Argo Workflows instance.

    To use the predefined workbench namespace, set workbenchNamespace to rhods-notebooks or omit the line. To use a custom workbench namespace, set the value to your namespace.
  4. In the spec.components section of the CR, for each OpenShift AI component shown, set the value of the managementState field to either Managed or Removed. These values are defined as follows:

    Managed
    The Operator actively manages the component, installs it, and tries to keep it active. The Operator will upgrade the component only if it is safe to do so.
    Removed
    The Operator actively manages the component but does not install it. If the component is already installed, the Operator will try to remove it.
  5. Create the DataScienceCluster object in your OpenShift cluster to install the specified OpenShift AI components.

    $ oc create -f rhods-operator-dsc.yaml

    You see output similar to the following:

    datasciencecluster.datasciencecluster.opendatahub.io/default-dsc created
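For example, to install only the dashboard and workbenches components, you would set just those two managementState fields to Managed and leave the other components set to Removed. The following is a sketch of the relevant portion of the CR, not a complete file:

```yaml
# Sketch: enable only the dashboard and workbenches components.
# All other components keep managementState: Removed.
spec:
  components:
    dashboard:
      managementState: Managed
    workbenches:
      managementState: Managed
      workbenchNamespace: rhods-notebooks
```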

Verification

  1. Confirm that there is at least one running pod for each component:

    1. In the OpenShift web console, click Workloads → Pods.
    2. In the Project list at the top of the page, select redhat-ods-applications.
    3. In the applications namespace, confirm that there are one or more running pods for each of the OpenShift AI components that you installed.
  2. Confirm the status of all installed components:

    1. In the OpenShift web console, click Operators → Installed Operators.
    2. Click the Red Hat OpenShift AI Operator.
    3. Click the Data Science Cluster tab.
    4. For the DataScienceCluster object called default-dsc, verify that the status is Phase: Ready.

      Note

      When you edit the spec.components section to change the installation status of a component, the default-dsc status also changes. During the initial installation, it might take a few minutes for the status phase to change from Progressing to Ready. You can access the OpenShift AI dashboard before the default-dsc status phase is Ready, but all components might not be ready.

    5. Click the default-dsc link to display the data science cluster details.
    6. Select the YAML tab.
    7. In the status.installedComponents section, confirm that the components you installed have a status value of true.

      Note

      If a component shows with the component-name: {} format in the spec.components section of the CR, the component is not installed.

  3. In the OpenShift AI dashboard, users can view the list of the installed OpenShift AI components, their corresponding source (upstream) components, and the versions of the installed components, as described in Viewing installed OpenShift AI components.

Next steps

  • If you are using OpenStack, CodeReady Containers (CRC), or other private cloud environments without integrated external DNS, manually configure DNS A or CNAME records after the LoadBalancer IP becomes available. For more information, see Configuring External DNS for RHOAI 3.x on OpenStack and Private Clouds.
  • Complete any additional configuration required for the components you enabled. See the component-specific configuration sections for details.

To install Red Hat OpenShift AI components by using the OpenShift web console, you must create and configure a DataScienceCluster object.

Important

The following procedure describes how to create and configure a DataScienceCluster object to install Red Hat OpenShift AI components as part of a new installation.

Prerequisites

  • The Red Hat OpenShift AI Operator is installed on your OpenShift cluster. See Installing the Red Hat OpenShift AI Operator.
  • You have cluster administrator privileges for your OpenShift cluster.
  • If you are using custom namespaces, you have created the namespaces.

Procedure

  1. Log in to the OpenShift web console as a cluster administrator.
  2. In the web console, click Operators → Installed Operators and then click the Red Hat OpenShift AI Operator.
  3. Click the Data Science Cluster tab.
  4. Click Create DataScienceCluster.
  5. For Configure via, select YAML view.

    An embedded YAML editor opens showing a default custom resource (CR) for the DataScienceCluster object, similar to the following example:

    apiVersion: datasciencecluster.opendatahub.io/v2
    kind: DataScienceCluster
    metadata:
      name: default-dsc
    spec:
      components:
        aipipelines:
          argoWorkflowsControllers:
            managementState: Removed 1
          managementState: Removed
        dashboard:
          managementState: Removed
        feastoperator:
          managementState: Removed
        kserve:
          managementState: Removed
        kueue:
          defaultClusterQueueName: default
          defaultLocalQueueName: default
          managementState: Removed
        llamastackoperator:
          managementState: Removed
        modelregistry:
          managementState: Removed
          registriesNamespace: rhoai-model-registries
        ray:
          managementState: Removed
        trainingoperator:
          managementState: Removed
        trustyai:
          managementState: Removed
        workbenches:
          managementState: Removed
          workbenchNamespace: rhods-notebooks 2
    1
    To use your own Argo Workflows instance with the aipipelines component, set argoWorkflowsControllers.managementState to Removed. This allows you to integrate with a managed Argo Workflows installation already on your OpenShift cluster and avoid conflicts with the embedded controller. See Configuring pipelines with your own Argo Workflows instance.
    2
    To use the predefined workbench namespace, set this value to rhods-notebooks or omit this line. To use a custom workbench namespace, set this value to your namespace.
  6. In the spec.components section of the CR, for each OpenShift AI component shown, set the value of the managementState field to either Managed or Removed. These values are defined as follows:

    Managed
    The Operator actively manages the component, installs it, and tries to keep it active. The Operator will upgrade the component only if it is safe to do so.
    Removed
    The Operator actively manages the component but does not install it. If the component is already installed, the Operator will try to remove it.
  7. Click Create.

Verification

  1. Confirm the status of all installed components:

    1. In the OpenShift web console, click Operators → Installed Operators.
    2. Click the Red Hat OpenShift AI Operator.
    3. Click the Data Science Cluster tab.
    4. For the DataScienceCluster object called default-dsc, verify that the status is Phase: Ready.

      Note

      When you edit the spec.components section to change the installation status of a component, the default-dsc status also changes. During the initial installation, it might take a few minutes for the status phase to change from Progressing to Ready. You can access the OpenShift AI dashboard before the default-dsc status phase is Ready, but all components might not be ready.

    5. Click the default-dsc link to display the data science cluster details.
    6. Select the YAML tab.
    7. In the status.installedComponents section, confirm that the components you installed have a status value of true.

      Note

      If a component shows with the component-name: {} format in the spec.components section of the CR, the component is not installed.

  2. Confirm that there is at least one running pod for each component:

    1. In the OpenShift web console, click Workloads → Pods.
    2. In the Project list at the top of the page, select redhat-ods-applications or your custom applications namespace.
    3. In the applications namespace, confirm that there are one or more running pods for each of the OpenShift AI components that you installed.
  3. In the OpenShift AI dashboard, users can view the list of the installed OpenShift AI components, their corresponding source (upstream) components, and the versions of the installed components, as described in Viewing installed OpenShift AI components.

Next steps

  • If you are using OpenStack, CodeReady Containers (CRC), or other private cloud environments without integrated external DNS, manually configure DNS A or CNAME records after the LoadBalancer IP becomes available. For more information, see Configuring External DNS for RHOAI 3.x on OpenStack and Private Clouds.
  • Complete any additional configuration required for the components you enabled. See the component-specific configuration sections for details.

You can use the OpenShift web console to update the installation status of components of Red Hat OpenShift AI on your OpenShift cluster.

Prerequisites

  • The Red Hat OpenShift AI Operator is installed on your OpenShift cluster.
  • You have cluster administrator privileges for your OpenShift cluster.

Procedure

  1. Log in to the OpenShift web console as a cluster administrator.
  2. In the web console, click Operators → Installed Operators and then click the Red Hat OpenShift AI Operator.
  3. Click the Data Science Cluster tab.
  4. On the DataScienceClusters page, click the default-dsc object.
  5. Click the YAML tab.

    An embedded YAML editor opens showing the default custom resource (CR) for the DataScienceCluster object, similar to the following example:

    apiVersion: datasciencecluster.opendatahub.io/v2
    kind: DataScienceCluster
    metadata:
      name: default-dsc
    spec:
      components:
        aipipelines:
          argoWorkflowsControllers:
            managementState: Removed
          managementState: Removed
        dashboard:
          managementState: Removed
        feastoperator:
          managementState: Removed
        kserve:
          managementState: Removed
        kueue:
          defaultClusterQueueName: default
          defaultLocalQueueName: default
          managementState: Removed
        llamastackoperator:
          managementState: Removed
        modelregistry:
          managementState: Removed
          registriesNamespace: rhoai-model-registries
        ray:
          managementState: Removed
        trainingoperator:
          managementState: Removed
        trustyai:
          managementState: Removed
        workbenches:
          managementState: Removed
          workbenchNamespace: rhods-notebooks
  6. In the spec.components section of the CR, for each OpenShift AI component shown, set the value of the managementState field to either Managed or Removed. These values are defined as follows:

    Managed
    The Operator actively manages the component, installs it, and tries to keep it active. The Operator will upgrade the component only if it is safe to do so.
    Removed
    The Operator actively manages the component but does not install it. If the component is already installed, the Operator will try to remove it.
  7. Click Save.

    For any components that you updated, OpenShift AI initiates a rollout so that all affected pods use the updated image.

  8. If you are upgrading from OpenShift AI 2.19 or earlier, upgrade the Authorino Operator to the stable update channel, version 1.2.1 or later.

    1. Update Authorino to the latest available release in the tech-preview-v1 channel (1.1.2), if you have not done so already.
    2. Switch to the stable channel:

      1. Navigate to the Subscription settings of the Authorino Operator.
      2. Under Update channel, click the highlighted tech-preview-v1.
      3. Change the channel to stable.
    3. Select the update option for Authorino 1.2.1.

Verification

  1. Confirm that there is at least one running pod for each component:

    1. In the OpenShift web console, click Workloads → Pods.
    2. In the Project list at the top of the page, select redhat-ods-applications or your custom applications namespace.
    3. In the applications namespace, confirm that there are one or more running pods for each of the OpenShift AI components that you installed.
  2. Confirm the status of all installed components:

    1. In the OpenShift web console, click Operators → Installed Operators.
    2. Click the Red Hat OpenShift AI Operator.
    3. Click the Data Science Cluster tab and select the DataScienceCluster object called default-dsc.
    4. Select the YAML tab.
    5. In the status.installedComponents section, confirm that the components you installed have a status value of true.

      Note

      If a component shows with the component-name: {} format in the spec.components section of the CR, the component is not installed.

  3. In the OpenShift AI dashboard, users can view the list of the installed OpenShift AI components, their corresponding source (upstream) components, and the versions of the installed components, as described in Viewing installed OpenShift AI components.

3.4.4. Viewing installed OpenShift AI components

In the Red Hat OpenShift AI dashboard, you can view a list of the installed OpenShift AI components, their corresponding source (upstream) components, and the versions of the installed components.

Prerequisites

  • OpenShift AI is installed in your OpenShift cluster.

Procedure

  1. Log in to the OpenShift AI dashboard.
  2. In the top navigation bar, click the help icon and then select About.

Verification

The About page shows a list of the installed OpenShift AI components along with their corresponding upstream components and upstream component versions.

You can configure OpenShift AI to use an existing Argo Workflows instance instead of the embedded one included with AI pipelines. This configuration is useful if your OpenShift cluster already includes a managed Argo Workflows instance and you want to integrate it with OpenShift AI pipelines without conflicts. Disabling the embedded Argo Workflows controller allows cluster administrators to manage the lifecycles of OpenShift AI and Argo Workflows independently.

Note

You cannot enable both the embedded Argo Workflows instance and your own Argo Workflows instance on the same cluster.

Prerequisites

  • You have cluster administrator privileges for your OpenShift cluster.
  • You have installed Red Hat OpenShift AI.

Procedure

  1. Log in to the OpenShift web console as a cluster administrator.
  2. In the OpenShift console, click Operators → Installed Operators.
  3. Search for the Red Hat OpenShift AI Operator, and then click the Operator name to open the Operator details page.
  4. Click the Data Science Cluster tab.
  5. Click the default instance name (for example, default-dsc) to open the instance details page.
  6. Click the YAML tab to show the instance specifications.
  7. Disable the embedded Argo Workflows controllers that are managed by the OpenShift AI Operator:

    1. In the spec.components section, set the value of the managementState field for the aipipelines component to Managed.
    2. In the spec.components.aipipelines section, set the value of the managementState field for argoWorkflowsControllers to Removed, as shown in the following example:

      Example aipipelines specification

      # ...
      spec:
        components:
          aipipelines:
            argoWorkflowsControllers:
              managementState: Removed
            managementState: Managed
      # ...

  8. Click Save to apply your changes.
  9. Install and configure a compatible version of Argo Workflows on your cluster. For compatible version information, see Supported Configurations for 3.x. For installation information, see the Argo Workflows Installation documentation.

Verification

  1. On the Details tab of the DataScienceCluster instance (for example, default-dsc), verify that AIPipelinesReady has a Status of True.
  2. Verify that the ds-pipeline-workflow-controller pod does not exist:

    1. Go to Workloads → Pods.
    2. Search for the ds-pipeline-workflow-controller pod.
    3. Verify that this pod does not exist. The absence of this pod confirms that the embedded Argo Workflows controller is disabled.

To use the distributed workloads feature in OpenShift AI, you must install several components.

Prerequisites

  • You have logged in to OpenShift with the cluster-admin role and you can access the data science cluster.
  • You have installed Red Hat OpenShift AI.
  • You have installed the Red Hat build of Kueue Operator on your OpenShift cluster, as described in the Red Hat build of Kueue documentation.
  • You have sufficient resources. In addition to the minimum OpenShift AI resources described in Installing and deploying OpenShift AI (for disconnected environments, see Deploying OpenShift AI in a disconnected environment), you need 1.6 vCPU and 2 GiB memory to deploy the distributed workloads infrastructure.
  • You have installed the cert-manager Operator in OpenShift by using the web console as described in Installing the cert-manager Operator for Red Hat OpenShift.
  • If you want to use graphics processing units (GPUs), you have enabled GPU support in OpenShift AI. If you use NVIDIA GPUs, see Enabling NVIDIA GPUs. If you use AMD GPUs, see AMD GPU integration.

    Note

    In OpenShift AI, Red Hat supports the use of accelerators within the same cluster only.

    Starting from Red Hat OpenShift AI 2.19, Red Hat supports remote direct memory access (RDMA) for NVIDIA GPUs only, enabling them to communicate directly with each other by using NVIDIA GPUDirect RDMA across either Ethernet or InfiniBand networks.

  • If you want to use self-signed certificates, you have added them to a central Certificate Authority (CA) bundle as described in Working with certificates (for disconnected environments, see Working with certificates). No additional configuration is necessary to use those certificates with distributed workloads. The centrally configured self-signed certificates are automatically available in the workload pods at the following mount points:

    • Cluster-wide CA bundle:

      /etc/pki/tls/certs/odh-trusted-ca-bundle.crt
      /etc/ssl/certs/odh-trusted-ca-bundle.crt
    • Custom CA bundle:

      /etc/pki/tls/certs/odh-ca-bundle.crt
      /etc/ssl/certs/odh-ca-bundle.crt
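For example, inside a workload pod you can point common TLS clients at the injected bundle by setting standard environment variables. This is a sketch: SSL_CERT_FILE is honored by OpenSSL-based tools and REQUESTS_CA_BUNDLE by the Python requests library; these variable names are conventions of those tools, not OpenShift AI settings.

```shell
# Point TLS clients at the cluster-wide bundle mounted into the pod.
# The path below is one of the mount points listed above.
export SSL_CERT_FILE=/etc/pki/tls/certs/odh-trusted-ca-bundle.crt
export REQUESTS_CA_BUNDLE=/etc/pki/tls/certs/odh-trusted-ca-bundle.crt
```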

Procedure

  1. In the OpenShift console, click Operators → Installed Operators.
  2. Search for the Red Hat OpenShift AI Operator, and then click the Operator name to open the Operator details page.
  3. Click the Data Science Cluster tab.
  4. Click the default instance name (for example, default-dsc) to open the instance details page.
  5. Click the YAML tab to show the instance specifications.
  6. Enable the required distributed workloads components. In the spec.components section, set the managementState field correctly for the required components:

    • Set kueue to Unmanaged to allow the Red Hat build of Kueue Operator to manage Kueue.
    • If you want to use the Ray framework to tune models, set ray to Managed.
    • If you want to use the Kubeflow Training Operator to tune models, set trainingoperator to Managed.
    • The list of required components depends on whether the distributed workload is run from a pipeline, a workbench, or both, as shown in the following table.

    Table 5.1. Components required for distributed workloads

    Component          Pipelines only   Workbenches only   Pipelines and workbenches
    dashboard          Managed          Managed            Managed
    aipipelines        Managed          Removed            Managed
    kueue              Unmanaged        Unmanaged          Unmanaged
    ray                Managed          Managed            Managed
    trainingoperator   Managed          Managed            Managed
    workbenches        Removed          Managed            Managed

  7. Click Save. After a short time, the components with a Managed state are ready.
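Assembled from the "Pipelines and workbenches" column of the table, the spec.components section of the DataScienceCluster CR might look like the following sketch. Components unrelated to distributed workloads are omitted here, and the Kueue queue names shown are the defaults from the earlier examples:

```yaml
# Sketch: distributed workloads run from both pipelines and workbenches.
spec:
  components:
    dashboard:
      managementState: Managed
    aipipelines:
      managementState: Managed
    kueue:
      managementState: Unmanaged   # managed by the Red Hat build of Kueue Operator
      defaultClusterQueueName: default
      defaultLocalQueueName: default
    ray:
      managementState: Managed
    trainingoperator:
      managementState: Managed
    workbenches:
      managementState: Managed
```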

Verification

Check the status of the kubeflow-training-operator, kuberay-operator, kueue-controller-manager, and openshift-kueue-operator pods, as follows:

  1. In the OpenShift console, click Workloads → Deployments.
  2. In the Search by name field, enter the following search strings:

    • In the redhat-ods-applications project, search for kubeflow-training-operator and kuberay-operator.
    • In the openshift-kueue-operator project, search for kueue-controller-manager and openshift-kueue-operator.
  3. In each case, check the status as follows:

    1. Click the deployment name to open the deployment details page.
    2. Click the Pods tab.
    3. Check the pod status.

      When the status of the pods is Running, the pods are ready to use.

    4. To see more information about each pod, click the pod name to open the pod details page, and then click the Logs tab.

Next Step

Configure the distributed workloads feature as described in Managing distributed workloads.

Chapter 6. Accessing the dashboard

After you have installed OpenShift AI and added users, you can access the URL for your OpenShift AI console and share the URL with the users to let them log in and work on their models.

Prerequisites

  • You have installed OpenShift AI on your OpenShift cluster.
  • You have added at least one user to the user group for OpenShift AI.

Procedure

  1. Log in to OpenShift web console.
  2. Click the application launcher.
  3. Right-click Red Hat OpenShift AI and copy the URL for your OpenShift AI instance.
  4. Provide this instance URL to your data scientists to let them log in to OpenShift AI.

Verification

  • Confirm that you and your users can log in to OpenShift AI by using the instance URL.

Note: In the Red Hat OpenShift AI dashboard, users can view the list of the installed OpenShift AI components, their corresponding source (upstream) components, and the versions of the installed components, as described in Viewing installed OpenShift AI components.

Chapter 7. Enabling accelerators

Before you can use an accelerator in OpenShift AI, you must install the relevant software components. The installation process varies based on the accelerator type.

Prerequisites

  • You have logged in to your OpenShift cluster.
  • You have the cluster-admin role in your OpenShift cluster.
  • You have installed an accelerator and confirmed that it is detected in your environment.

Procedure

  1. Follow the appropriate documentation to enable your accelerator:

  2. After installing your accelerator, create a hardware profile as described in Working with hardware profiles.

Verification

  • From the Administrator perspective, go to the Operators → Installed Operators page. Confirm that the following Operators appear:

    • The Operator for your accelerator
    • Node Feature Discovery (NFD)
    • Kernel Module Management (KMM)
  • The accelerator is correctly detected a few minutes after full installation of the Node Feature Discovery (NFD) and the relevant accelerator Operator. The OpenShift CLI (oc) displays the appropriate output for the GPU worker node. For example, here is output confirming that an NVIDIA GPU is detected:

    $ oc describe node <node_name>
    ...
    Capacity:
      cpu:                4
      ephemeral-storage:  313981932Ki
      hugepages-1Gi:      0
      hugepages-2Mi:      0
      memory:             16076568Ki
      nvidia.com/gpu:     1
      pods:               250
    Allocatable:
      cpu:                3920m
      ephemeral-storage:  288292006229
      hugepages-1Gi:      0
      hugepages-2Mi:      0
      memory:             12828440Ki
      nvidia.com/gpu:     1
      pods:               250

Chapter 8. Working with certificates

When you install Red Hat OpenShift AI, OpenShift automatically applies a default Certificate Authority (CA) bundle to manage authentication for most OpenShift AI components, such as workbenches and model servers. These certificates are trusted self-signed certificates that help secure communication. However, as a cluster administrator, you might need to configure additional self-signed certificates to use some components, such as the AI pipeline server and object storage solutions. If an OpenShift AI component uses a self-signed certificate that is not part of the existing cluster-wide CA bundle, you have the following options for including the certificate:

  • Add it to the OpenShift cluster-wide CA bundle.
  • Add it to a custom CA bundle, separate from the cluster-wide CA bundle.

As a cluster administrator, you can also change how to manage authentication for OpenShift AI as follows:

  • Manually manage certificate changes, instead of relying on the OpenShift AI Operator to handle them automatically.
  • Remove the cluster-wide CA bundle, either from all namespaces or specific ones. If you prefer to implement a different authentication approach, you can override the default OpenShift AI behavior, as described in Removing the CA bundle.

After installing OpenShift AI, the Red Hat OpenShift AI Operator automatically creates an empty odh-trusted-ca-bundle configuration file (ConfigMap). The Cluster Network Operator (CNO) injects the cluster-wide CA bundle into the odh-trusted-ca-bundle configMap with the label "config.openshift.io/inject-trusted-cabundle".

apiVersion: v1
kind: ConfigMap
metadata:
  labels:
    app.kubernetes.io/part-of: opendatahub-operator
    config.openshift.io/inject-trusted-cabundle: 'true'
  name: odh-trusted-ca-bundle

After the CNO injects the bundle, it updates the ConfigMap with the contents of the ca-bundle.crt file.

apiVersion: v1
kind: ConfigMap
metadata:
  labels:
    app.kubernetes.io/part-of: opendatahub-operator
    config.openshift.io/inject-trusted-cabundle: 'true'
  name: odh-trusted-ca-bundle
data:
  ca-bundle.crt: |
    <BUNDLE OF CLUSTER-WIDE CERTIFICATES>
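A quick way to sanity-check a bundle file, for example after extracting ca-bundle.crt from the ConfigMap, is to count its PEM blocks. The following is a minimal local sketch that uses an illustrative two-certificate file rather than a real bundle:

```shell
# Create an illustrative two-entry bundle file (placeholder contents).
cat > sample-bundle.crt <<'EOF'
-----BEGIN CERTIFICATE-----
MIIB...placeholder-one
-----END CERTIFICATE-----
-----BEGIN CERTIFICATE-----
MIIB...placeholder-two
-----END CERTIFICATE-----
EOF

# Count the certificates in the bundle.
grep -c -- "-----BEGIN CERTIFICATE-----" sample-bundle.crt   # prints 2
```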

The management of CA bundles is configured through the Data Science Cluster Initialization (DSCI) object. Within this object, you can set the spec.trustedCABundle.managementState field to one of the following values:

  • Managed: (Default) The Red Hat OpenShift AI Operator manages the odh-trusted-ca-bundle ConfigMap and adds it to all non-reserved existing and new namespaces. It does not add the ConfigMap to any reserved or system namespaces, such as default, openshift-* or kube-*. The Red Hat OpenShift AI Operator automatically updates the ConfigMap to reflect any changes made to the customCABundle field.
  • Unmanaged: The Red Hat OpenShift AI administrator manually manages the odh-trusted-ca-bundle ConfigMap, instead of allowing the Operator to manage it. Changing the managementState from Managed to Unmanaged does not remove the odh-trusted-ca-bundle ConfigMap. However, the ConfigMap is no longer automatically updated if changes are made to the customCABundle field.

    The Unmanaged setting is useful if your organization implements a different method for managing trusted CA bundles, such as Ansible automation, and does not want the Red Hat OpenShift AI Operator to handle certificates automatically. This setting provides greater control, preventing the Operator from overwriting custom configurations.

  • Removed: The Red Hat OpenShift AI Operator removes the odh-trusted-ca-bundle ConfigMap, if present, and prevents ConfigMaps from being created in new namespaces. Changing this field from Managed to Removed also deletes the ConfigMap from existing namespaces. This is the default value after upgrading Red Hat OpenShift AI from 2.7 or earlier versions to 3.2.

    The Removed setting reduces complexity and mitigates security risks, such as unauthorized certificate changes. In high-security environments, removing the CA bundle ensures that only approved CAs are trusted, reducing the risk of cyberattacks. For example, your organization might want to restrict cluster administrators from creating trusted CA bundles to prevent OpenShift pods from communicating externally.
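For example, to stop distributing the CA bundle entirely, the relevant fragment of the DSCI object would set the field as follows (a sketch; the rest of the DSCI spec is unchanged):

```yaml
# Sketch: remove the odh-trusted-ca-bundle ConfigMap from all namespaces
# and prevent it from being created in new ones.
spec:
  trustedCABundle:
    managementState: Removed
```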

8.2. Adding certificates

If you must use a self-signed certificate that is not part of the existing cluster-wide CA bundle, you have two options for configuring the certificate:

  • Add it to the cluster-wide CA bundle.

    This option is useful when the certificate is needed for secure communication across multiple services or when it’s required by security policies to be trusted cluster-wide. This option ensures that all services and components in the cluster trust the certificate automatically. It simplifies management because the certificate is trusted across the entire cluster, avoiding the need to configure the certificate separately for each service.

  • Add it to a custom CA bundle that is separate from the OpenShift cluster-wide bundle.

    Consider this option for the following scenarios:

    • Limit scope: Only specific services need the certificate, not the whole cluster.
    • Isolation: Keeps custom certificates separate, preventing changes to the global configuration.
    • Avoid global impact: Does not affect services that do not need the certificate.
    • Easier management: Makes it simpler to manage certificates for specific services.

You can add a self-signed certificate to a cluster-wide Certificate Authority (CA) bundle (ca-bundle.crt).

When the cluster-wide CA bundle is updated, the Cluster Network Operator (CNO) automatically detects the change and injects the updated bundle into the odh-trusted-ca-bundle ConfigMap, making the certificate available to OpenShift AI components.

Note: By default, the management state for the Trusted CA bundle is Managed (that is, the spec.trustedCABundle.managementState field in the Red Hat OpenShift AI Operator’s DSCI object is set to Managed). If you change this setting to Unmanaged, you must manually update the odh-trusted-ca-bundle ConfigMap to include the updated cluster-wide CA bundle.

Alternatively, you can add certificates to a custom CA bundle, as described in Adding certificates to a custom CA bundle.

Prerequisites

  • You have created a self-signed certificate and saved the certificate to a file. For example, you have created a certificate using OpenSSL and saved it to a file named example-ca.crt.
  • You have cluster administrator access for the OpenShift cluster where Red Hat OpenShift AI is installed.
  • You have installed the OpenShift CLI (oc) as described in the appropriate documentation for your cluster:
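For reference, a self-signed CA certificate like the one named in the prerequisites can be generated with OpenSSL as follows. This is a sketch; the subject name and validity period are illustrative:

```shell
# Generate a self-signed CA certificate and private key (illustrative subject).
openssl req -x509 -newkey rsa:2048 -nodes -days 365 \
  -subj "/CN=example-ca" \
  -keyout example-ca.key -out example-ca.crt

# Inspect the certificate to confirm it parses and shows the expected subject.
openssl x509 -in example-ca.crt -noout -subject
```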

Procedure

  1. Create a ConfigMap that includes the root CA certificate used to sign the certificate, where </path/to/example-ca.crt> is the path to the CA certificate bundle on your local file system:

    $ oc create configmap custom-ca \
        --from-file=ca-bundle.crt=</path/to/example-ca.crt> \
        -n openshift-config
  2. Update the cluster-wide proxy configuration with the newly created ConfigMap:

    $ oc patch proxy/cluster \
        --type=merge \
        --patch='{"spec":{"trustedCA":{"name":"custom-ca"}}}'

Verification

Run the following command to verify that all non-reserved namespaces contain the odh-trusted-ca-bundle ConfigMap:

$ oc get configmaps --all-namespaces -l app.kubernetes.io/part-of=opendatahub-operator | grep odh-trusted-ca-bundle


8.4. Adding certificates to a custom CA bundle

You can add self-signed certificates to a custom CA bundle that is separate from the OpenShift cluster-wide bundle.

This method is ideal for scenarios where components need access to external resources that require a self-signed certificate. For example, you might need to add self-signed certificates to grant AI pipelines access to S3-compatible object storage.

Prerequisites

  • You have created a self-signed certificate and saved the certificate to a file. For example, you have created a certificate using OpenSSL and saved it to a file named example-ca.crt.
  • You have cluster administrator access for the OpenShift cluster where Red Hat OpenShift AI is installed.
  • You have installed the OpenShift CLI (oc) as described in the appropriate documentation for your cluster.

Procedure

  1. Log in to the OpenShift web console.
  2. Click Operators → Installed Operators and then click the Red Hat OpenShift AI Operator.
  3. Click the DSC Initialization tab.
  4. Click the default-dsci object.
  5. Click the YAML tab.
  6. In the spec.trustedCABundle section, add the custom certificate to the customCABundle field, as shown in the following example:

    spec:
      trustedCABundle:
        managementState: Managed
        customCABundle: |
          -----BEGIN CERTIFICATE-----
          examplebundle123
          -----END CERTIFICATE-----
  7. Click Save.

The Red Hat OpenShift AI Operator automatically updates the ConfigMap to reflect any changes made to the customCABundle field. It adds the odh-ca-bundle.crt file containing the certificates to the odh-trusted-ca-bundle ConfigMap, as shown in the following example:

apiVersion: v1
kind: ConfigMap
metadata:
  labels:
    app.kubernetes.io/part-of: opendatahub-operator
    config.openshift.io/inject-trusted-cabundle: 'true'
  name: odh-trusted-ca-bundle
data:
  ca-bundle.crt: |
    <BUNDLE OF CLUSTER-WIDE CERTIFICATES>
  odh-ca-bundle.crt: |
    <BUNDLE OF CUSTOM CERTIFICATES>
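If you extract either key of this ConfigMap to a local file, a quick sanity check is to count the PEM blocks it contains. The following sketch uses a synthetic two-certificate bundle; the path is illustrative:

```shell
# Create a stand-in bundle with two PEM blocks, then count them.
printf -- '-----BEGIN CERTIFICATE-----\nAAA\n-----END CERTIFICATE-----\n' > /tmp/odh-ca-bundle.crt
printf -- '-----BEGIN CERTIFICATE-----\nBBB\n-----END CERTIFICATE-----\n' >> /tmp/odh-ca-bundle.crt
grep -c 'BEGIN CERTIFICATE' /tmp/odh-ca-bundle.crt   # prints 2, one per certificate
```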

Verification

Run the following command to verify that a non-reserved namespace contains the odh-trusted-ca-bundle ConfigMap and that the ConfigMap contains your customCABundle value. In the following command, example-namespace is the non-reserved namespace and examplebundle123 is the customCABundle value.

oc get configmap odh-trusted-ca-bundle -n example-namespace -o yaml | grep examplebundle123

Some OpenShift AI components have additional options or required configuration for self-signed certificates.

To securely connect OpenShift AI components to object storage solutions or databases that are deployed within an OpenShift cluster that uses self-signed certificates, you must provide a certificate authority (CA) certificate. Each namespace includes a ConfigMap named kube-root-ca.crt, which contains the CA certificate of the internal API Server.

Prerequisites

  • You have cluster administrator privileges for your OpenShift cluster.
  • You have installed the OpenShift CLI (oc) as described in the appropriate documentation for your cluster.

  • You have deployed an object storage solution or database in your OpenShift cluster.

Procedure

  1. In a terminal window, log in to the OpenShift CLI (oc) as shown in the following example:

    oc login api.<cluster_name>.<cluster_domain>:6443 --web
  2. Retrieve the current OpenShift AI trusted CA configuration and store it in a new file:

    oc get dscinitializations.dscinitialization.opendatahub.io default-dsci -o json | jq -r '.spec.trustedCABundle.customCABundle' > /tmp/my-custom-ca-bundles.crt
  3. Add the cluster’s kube-root-ca.crt ConfigMap to the OpenShift AI trusted CA configuration:

    oc get configmap kube-root-ca.crt -o jsonpath="{['data']['ca\.crt']}" >> /tmp/my-custom-ca-bundles.crt
  4. Update the OpenShift AI trusted CA configuration to trust certificates issued by the certificate authorities in kube-root-ca.crt:

    oc patch dscinitialization default-dsci --type='json' -p='[{"op":"replace","path":"/spec/trustedCABundle/customCABundle","value":"'"$(awk '{printf "%s\\n", $0}' /tmp/my-custom-ca-bundles.crt)"'"}]'
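The awk expression in the patch command above is the easy part to misread: it rewrites each real newline in the bundle file as a literal \n escape so that the whole multi-line bundle fits into a single JSON string value. A standalone sketch of that transformation:

```shell
# Two lines in, one JSON-safe line out: awk appends a literal "\n" to each line.
printf 'line1\nline2\n' > /tmp/demo-bundle.crt
escaped=$(awk '{printf "%s\\n", $0}' /tmp/demo-bundle.crt)
printf '%s\n' "$escaped"   # prints: line1\nline2\n  (a single line with literal \n sequences)
```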

Verification

  • You can successfully deploy components that are configured to use object storage solutions or databases that are deployed in the OpenShift cluster. For example, a pipeline server that is configured to use a database deployed in the cluster starts successfully.
Note

You can verify your new certificate configuration by following the steps in the OpenShift AI tutorial - Fraud Detection example. Run the script to install local object storage buckets and create connections, and then enable AI pipelines.

For more information about running the script to install local object storage buckets, see Running a script to install local object storage buckets and create connections.

For more information about enabling AI pipelines, see Enabling pipelines.

8.5.2. Configuring a certificate for pipelines

By default, OpenShift AI includes OpenShift cluster-wide certificates in the odh-trusted-ca-bundle ConfigMap. These cluster-wide certificates cover most components, such as workbenches and model servers. However, the pipeline server might require additional Certificate Authority (CA) configuration, especially when interacting with external systems that use self-signed or custom certificates.

The following procedure shows how to add a certificate for AI pipelines by creating a ConfigMap and referencing it from the pipeline server configuration.

Prerequisites

  • You have cluster administrator access for the OpenShift cluster where Red Hat OpenShift AI is installed.
  • You have created a self-signed certificate and saved the certificate to a file. For example, you have created a certificate using OpenSSL and saved it to a file named example-ca.crt.
  • You have configured an AI pipeline server.

Procedure

  1. Log in to the OpenShift console.
  2. From Workloads → ConfigMaps, create a ConfigMap with the required bundle in the same project as the target AI pipeline:

    kind: ConfigMap
    apiVersion: v1
    metadata:
      name: custom-ca-bundle
    data:
      ca-bundle.crt: |
        # contents of ca-bundle.crt
  3. Add the following snippet to the .spec.apiServer.cABundle field of the underlying DataSciencePipelinesApplication (DSPA):

    apiVersion: datasciencepipelinesapplications.opendatahub.io/v1
    kind: DataSciencePipelinesApplication
    metadata:
      name: data-science-dspa
    spec:
      ...
      apiServer:
        ...
        cABundle:
          configMapName: custom-ca-bundle
          configMapKey: ca-bundle.crt
  4. Save your changes. The pipeline server pod automatically redeploys with the updated bundle.

Verification

Confirm that your CA bundle was successfully mounted:

  1. Log in to the OpenShift console.
  2. Go to the project that has the target AI pipeline.
  3. Click the Pods tab.
  4. Click the pipeline server pod with the ds-pipeline-dspa-<hash> prefix.
  5. Click Terminal.
  6. Enter cat /dsp-custom-certs/dsp-ca.crt.
  7. Verify that your CA bundle is present within this file.

8.5.3. Configuring a certificate for workbenches

Important

By default, self-signed certificates apply to workbenches that you create after configuring cluster-wide certificates. To apply cluster-wide certificates to an existing workbench, stop and then restart the workbench.

Self-signed certificates are stored in /etc/pki/tls/custom-certs/ca-bundle.crt. Workbenches preset an environment variable that points many popular HTTP client packages to this certificate bundle. For packages that do not read that variable, you can provide the certificate path explicitly. For example, for the kfp package to connect to the AI pipeline server:

from kfp.client import Client

with open(sa_token_file_path, 'r') as token_file:
    bearer_token = token_file.read()

    client = Client(
        host='https://<GO_TO_ROUTER_OF_PROJECT>/',
        existing_token=bearer_token,
        ssl_ca_cert='/etc/pki/tls/custom-certs/ca-bundle.crt'
    )
    print(client.list_experiments())

By default, the model serving platform in OpenShift AI uses a self-signed certificate generated at installation for the endpoints that are created when deploying a server.

If you have configured cluster-wide certificates on your OpenShift cluster, they are used by default for other types of endpoints, such as endpoints for routes.

The following procedure explains how to use the same certificate that you already have for your OpenShift cluster.

Prerequisites

  • You have cluster administrator access for the OpenShift cluster where Red Hat OpenShift AI is installed.
  • You have configured cluster-wide certificates in OpenShift.
  • You have configured the model serving platform, as described in Installing the model serving platform.

Procedure

  1. Log in to the OpenShift console.
  2. From the list of projects, open the openshift-ingress project.
  3. Click YAML.
  4. Search for "cert" to find a secret with a name that includes "cert". For example, rhods-internal-primary-cert-bundle-secret. The contents of the secret should contain two items that are used for all OpenShift Routes: tls.crt (the certificate) and tls.key (the key).
  5. Copy the reference to the secret.
  6. From the list of projects, open the istio-system project.
  7. Create a YAML file and paste the reference to the secret that you copied from the openshift-ingress YAML file.
  8. Edit the YAML code to keep only the relevant content, as shown in the following example. Replace rhods-internal-primary-cert-bundle-secret with the name of your secret:

    kind: Secret
    apiVersion: v1
    metadata:
      name: rhods-internal-primary-cert-bundle-secret
    data:
      tls.crt: >-
        LS0tLS1CRUd...
      tls.key: >-
        LS0tLS1CRUd...
    type: kubernetes.io/tls
  9. Save the YAML file in the istio-system project.
  10. Navigate to Operators → Installed Operators → Red Hat OpenShift AI.
  11. Click the Data Science Cluster tab, and then click default-dsc → YAML.
  12. Edit the kserve configuration section to refer to your secret as shown in the following example. Replace rhods-internal-primary-cert-bundle-secret with the name of the secret that you created in Step 8.

    kserve:
      devFlags: {}
      managementState: Managed
      serving:
        ingressGateway:
          certificate:
            secretName: rhods-internal-primary-cert-bundle-secret
            type: Provided
        managementState: Managed
        name: knative-serving
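The LS0tLS1CRUd... values shown in the Secret in step 8 are the base64-encoded PEM files, as required by the kubernetes.io/tls Secret type. A local sketch of how such a value is produced; the PEM content here is a placeholder, not a real certificate:

```shell
# base64-encode a (placeholder) PEM file into a single-line Secret value.
printf -- '-----BEGIN CERTIFICATE-----\nMIIB...\n-----END CERTIFICATE-----\n' > /tmp/tls.crt
b64=$(base64 < /tmp/tls.crt | tr -d '\n')
printf '%s\n' "$b64"   # starts with LS0tLS1CRUd, the base64 encoding of "-----BEGIN"
```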

By default, the Red Hat OpenShift AI Operator manages the odh-trusted-ca-bundle ConfigMap, which contains the trusted CA bundle and is applied to all non-reserved namespaces in the cluster. The Operator automatically updates this ConfigMap whenever changes are made to the CA bundle.

If your organization prefers to manage trusted CA bundles independently, for example, by using Ansible automation, you can disable this default behavior to prevent automatic updates by the Red Hat OpenShift AI Operator.

Prerequisites

  • You have cluster administrator privileges for your OpenShift cluster.

Procedure

  1. In the OpenShift web console, click Operators → Installed Operators and then click the Red Hat OpenShift AI Operator.
  2. Click the DSC Initialization tab.
  3. Click the default-dsci object.
  4. Click the YAML tab.
  5. In the spec section, change the value of the managementState field for trustedCABundle to Unmanaged, as shown:

    spec:
      trustedCABundle:
        managementState: Unmanaged
  6. Click Save.

    Changing the managementState from Managed to Unmanaged prevents automatic updates when the customCABundle field is modified, but does not remove the odh-trusted-ca-bundle ConfigMap.

Verification

  1. In the spec section, set the value of the customCABundle field for trustedCABundle, for example:

    spec:
      trustedCABundle:
        managementState: Unmanaged
        customCABundle: example123
  2. Click Save.
  3. Click Workloads → ConfigMaps.
  4. Select a project from the project list.
  5. Click the odh-trusted-ca-bundle ConfigMap.
  6. Click the YAML tab and verify that the value of the customCABundle field did not update.

8.7. Removing the CA bundle

If you prefer to implement a different authentication approach for your OpenShift AI installation, you can override the default behavior by removing the CA bundle.

You have two options for removing the CA bundle:

  • Remove the CA bundle from all non-reserved projects in OpenShift AI.
  • Remove the CA bundle from a specific project.

8.7.1. Removing the CA bundle from all namespaces

You can remove a Certificate Authority (CA) bundle from all non-reserved namespaces in OpenShift AI. This process changes the default configuration and disables the creation of the odh-trusted-ca-bundle configuration file (ConfigMap), as described in Working with certificates (OpenShift AI Self-Managed) or Working with certificates (OpenShift AI Self-Managed in a disconnected environment).

Note

The odh-trusted-ca-bundle ConfigMaps are only deleted from namespaces when you set the managementState of trustedCABundle to Removed; deleting the DSC Initialization does not delete the ConfigMaps.

To remove a CA bundle from a single namespace only, see Removing the CA bundle from a single namespace (OpenShift AI Self-Managed) or Removing the CA bundle from a single namespace (OpenShift AI Self-Managed in a disconnected environment).

Prerequisites

  • You have cluster administrator privileges for your OpenShift cluster.
  • You have installed the OpenShift CLI (oc) as described in the appropriate documentation for your cluster.

Procedure

  1. In the OpenShift web console, click Operators → Installed Operators and then click the Red Hat OpenShift AI Operator.
  2. Click the DSC Initialization tab.
  3. Click the default-dsci object.
  4. Click the YAML tab.
  5. In the spec section, change the value of the managementState field for trustedCABundle to Removed:

    spec:
      trustedCABundle:
        managementState: Removed
  6. Click Save.

Verification

  • Run the following command to verify that the odh-trusted-ca-bundle ConfigMap has been removed from all namespaces:

    oc get configmaps --all-namespaces | grep odh-trusted-ca-bundle

    The command should not return any ConfigMaps.

8.7.2. Removing the CA bundle from a single namespace

You can remove a custom Certificate Authority (CA) bundle from individual namespaces in OpenShift AI. This process disables the creation of the odh-trusted-ca-bundle configuration file (ConfigMap) for the specified namespace only.

To remove a CA bundle from all namespaces, see Removing the CA bundle from all namespaces (OpenShift AI Self-Managed) or Removing the CA bundle from all namespaces (OpenShift AI Self-Managed in a disconnected environment).

Prerequisites

  • You have cluster administrator privileges for your OpenShift cluster.
  • You have installed the OpenShift CLI (oc) as described in the appropriate documentation for your cluster.

Procedure

  • Run the following command to remove a CA bundle from a namespace. In the following command, example-namespace is the non-reserved namespace.

    oc annotate ns example-namespace security.opendatahub.io/inject-trusted-ca-bundle=false

Verification

  • Run the following command to verify that the CA bundle has been removed from the namespace. In the following command, example-namespace is the non-reserved namespace.

    oc get configmap odh-trusted-ca-bundle -n example-namespace

    The command should return configmaps "odh-trusted-ca-bundle" not found.

Chapter 9. Viewing logs and audit records

As a cluster administrator, you can use the OpenShift AI Operator logger to monitor and troubleshoot issues. You can also use OpenShift audit records to review a history of changes made to the OpenShift AI Operator configuration.

9.1. Configuring the OpenShift AI Operator logger

You can change the log level for OpenShift AI Operator components by setting the .spec.devFlags.logmode flag of the DSCInitialization (DSCI) custom resource at runtime. If you do not set a logmode value, the logger uses the INFO log level by default.

The log level that you set with .spec.devFlags.logmode applies to all components, not just those in a Managed state.

The following table shows the available log levels:

Log level                  Stacktrace level  Verbosity  Output   Timestamp type
devel or development       WARN              INFO       Console  Epoch timestamps
"" (no logmode value set)  ERROR             INFO       JSON     Human-readable timestamps
prod or production         ERROR             INFO       JSON     Human-readable timestamps

Logs that are set to devel or development generate in a plain-text console format. Logs that are set to prod or production, or that have no logmode value set, generate in a JSON format.

Prerequisites

  • You have administrator access to the DSCInitialization resources in the OpenShift cluster.
  • You have installed the OpenShift CLI (oc) as described in the appropriate documentation for your cluster.

Procedure

  1. Log in to the OpenShift web console as a cluster administrator.
  2. Click Operators → Installed Operators and then click the Red Hat OpenShift AI Operator.
  3. Click the DSC Initialization tab.
  4. Click the default-dsci object.
  5. Click the YAML tab.
  6. In the spec section, update the .spec.devFlags.logmode flag with the log level that you want to set.

    apiVersion: dscinitialization.opendatahub.io/v2
    kind: DSCInitialization
    metadata:
      name: default-dsci
    spec:
      devFlags:
        logmode: development
  7. Click Save.

You can also configure the log level from the OpenShift CLI (oc) by using the following command with the logmode value set to the log level that you want.

oc patch dsci default-dsci -p '{"spec":{"devFlags":{"logmode":"development"}}}' --type=merge

Verification

  • If you set the component log level to devel or development, logs generate more frequently and include logs at WARN level and above.
  • If you set the component log level to prod or production, or do not set a log level, logs generate less frequently and include logs at ERROR level or above.

9.1.1. Viewing the OpenShift AI Operator logs

  1. Log in to the OpenShift CLI (oc).
  2. Run the following command to stream logs from all Operator pods:

    for pod in $(oc get pods -l name=rhods-operator -n redhat-ods-operator -o name); do
      oc logs -f "$pod" -n redhat-ods-operator &
    done

    The Operator pod logs open in your terminal.

    Tip

    Press Ctrl+C to stop viewing. To fully stop all log streams, run kill $(jobs -p).

You can also view each Operator pod log in the OpenShift console by navigating to Workloads → Pods, selecting the redhat-ods-operator project, clicking a pod name, and then clicking the Logs tab.

9.2. Viewing audit records

Cluster administrators can use OpenShift auditing to see changes made to the OpenShift AI Operator configuration by reviewing modifications to the DataScienceCluster (DSC) and DSCInitialization (DSCI) custom resources. Audit logging is enabled by default in standard OpenShift cluster configurations. For more information, see Viewing audit logs in the OpenShift documentation.

Note

In Red Hat OpenShift Service on AWS, audit logging is disabled by default because the Elasticsearch log store does not provide secure storage for audit logs. To configure log forwarding, see Logging in the Red Hat OpenShift Service on AWS documentation.

The following example shows how to use the OpenShift audit logs to see the history of changes made (by users) to the DSC and DSCI custom resources.

Prerequisites

  • You have cluster administrator privileges for your OpenShift cluster.
  • You have installed the OpenShift CLI (oc) as described in the appropriate documentation for your cluster.

Procedure

  1. In a terminal window, if you are not already logged in to your OpenShift cluster as a cluster administrator, log in to the OpenShift CLI as shown in the following example:

    $ oc login <openshift_cluster_url> -u <admin_username> -p <password>
    Copy to Clipboard Toggle word wrap
  2. To access the full content of the changed custom resources, set the OpenShift audit log policy to WriteRequestBodies or a more comprehensive profile. For more information, see Configuring the audit log policy.
  3. Fetch the audit log files that are available for the relevant control plane nodes. For example:

    oc adm node-logs --role=master --path=kube-apiserver/ \
      | awk '{ print $1 }' | sort -u \
      | while read node ; do
          oc adm node-logs $node --path=kube-apiserver/audit.log < /dev/null
        done \
      | grep opendatahub > /tmp/kube-apiserver-audit-opendatahub.log
  4. Search the files for the DSC and DSCI custom resources. For example:

    jq 'select((.objectRef.apiGroup == "dscinitialization.opendatahub.io"
                    or .objectRef.apiGroup == "datasciencecluster.opendatahub.io")
                  and .user.username != "system:serviceaccount:redhat-ods-operator:redhat-ods-operator-controller-manager"
                  and .verb != "get" and .verb != "watch" and .verb != "list")' < /tmp/kube-apiserver-audit-opendatahub.log
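To see what the jq filter in step 4 keeps and discards, you can run it against synthetic audit records. The two records below are invented for the demonstration; only the write operation by a regular user survives the filter, while reads (get, watch, list) are dropped:

```shell
# Show what the jq filter from step 4 keeps, using two synthetic audit records.
cat > /tmp/audit-sample.log <<'EOF'
{"objectRef":{"apiGroup":"datasciencecluster.opendatahub.io"},"user":{"username":"kube:admin"},"verb":"patch"}
{"objectRef":{"apiGroup":"datasciencecluster.opendatahub.io"},"user":{"username":"kube:admin"},"verb":"get"}
EOF
matches=$(jq -c 'select((.objectRef.apiGroup == "dscinitialization.opendatahub.io"
                 or .objectRef.apiGroup == "datasciencecluster.opendatahub.io")
                and .user.username != "system:serviceaccount:redhat-ods-operator:redhat-ods-operator-controller-manager"
                and .verb != "get" and .verb != "watch" and .verb != "list")' < /tmp/audit-sample.log)
printf '%s\n' "$matches"   # only the "patch" record survives the filter
```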

Verification

  • The commands return relevant log entries.
Tip

To configure the log retention time, see the Logging section in the OpenShift documentation.

Chapter 10. Troubleshooting common installation problems

If you are experiencing difficulties installing the Red Hat OpenShift AI Operator, read this chapter to understand what could be causing the problem and how to resolve it.

If the problem is not included here or in the release notes, contact Red Hat Support. When opening a support case, it is helpful to include debugging information about your cluster. You can collect this information by using the must-gather tool as described in Must-Gather for Red Hat OpenShift AI and Gathering data about your cluster.

You can also adjust the log level of OpenShift AI Operator components to increase or reduce log verbosity to suit your use case. For more information, see Configuring the OpenShift AI Operator logger.

Problem

When attempting to retrieve the Red Hat OpenShift AI Operator from the image registry, a Failure to pull from quay error message appears. The Red Hat OpenShift AI Operator might be unavailable for retrieval in the following circumstances:

  • The image registry is unavailable.
  • There is a problem with your network connection.
  • Your cluster is not operational and is therefore unable to retrieve the image registry.

Diagnosis

Check the logs in the Events section in OpenShift for further information about the Failure to pull from quay error message.

Resolution

  • Contact Red Hat support.

Problem

You are deploying on an environment that is not documented as supported by the Red Hat OpenShift AI Operator.

Diagnosis

  1. In the OpenShift web console, switch to the Administrator perspective.
  2. Click Workloads → Pods.
  3. Set the Project to All Projects or redhat-ods-operator.
  4. Click the rhods-operator-<random string> pod that shows an error in the Status column.

    The Pod details page appears.

  5. Click Logs.
  6. Select rhods-operator from the drop-down list.
  7. Check the log for the ERROR: Deploying on $infrastructure, which is not supported. Failing Installation error message.

Resolution

  • Before proceeding with a new installation, ensure that you have a fully supported environment on which to install OpenShift AI. For more information, see Supported Configurations for 3.x.

Problem

During the installation process, the OpenShift AI Custom Resource (CR) does not get created. This issue occurs in unknown circumstances.

Diagnosis

  1. In the OpenShift web console, switch to the Administrator perspective.
  2. Click Workloads → Pods.
  3. Set the Project to All Projects or redhat-ods-operator.
  4. Click the rhods-operator-<random string> pod that shows an error in the Status column.

    The Pod details page appears.

  5. Click Logs.
  6. Select rhods-operator from the drop-down list.
  7. Check the log for the ERROR: Attempt to create the ODH CR failed. error message.

Resolution

  • Contact Red Hat support.

Problem

During the installation process, the OpenShift AI Notebooks Custom Resource (CR) does not get created. This issue occurs in unknown circumstances.

Diagnosis

  1. In the OpenShift web console, switch to the Administrator perspective.
  2. Click Workloads → Pods.
  3. Set the Project to All Projects or redhat-ods-operator.
  4. Click the rhods-operator-<random string> pod that shows an error in the Status column.

    The Pod details page appears.

  5. Click Logs.
  6. Select rhods-operator from the drop-down list.
  7. Check the log for the ERROR: Attempt to create the RHODS Notebooks CR failed. error message.

Resolution

  • Contact Red Hat support.

10.5. The OpenShift AI dashboard is not accessible

Problem

After installing OpenShift AI, the redhat-ods-applications, redhat-ods-monitoring, and redhat-ods-operator project namespaces are Active but you cannot access the dashboard due to an error in one of the pods.

Diagnosis

  1. In the OpenShift web console, switch to the Administrator perspective.
  2. Click Workloads → Pods.
  3. Set the Project to All Projects.
  4. Click Filter and select the checkbox for every status except Running and Completed.

    The page displays the pods that have an error.

Resolution

  • To see more information and troubleshooting steps for a pod, on the Pods page, click the link in the Status column for the pod.
  • If the Status column does not display a link, click the pod name to open the pod details page and then click the Logs tab.

Problem

After uninstalling the OpenShift AI Operator and reinstalling it by using the CLI, the reinstallation fails with an unable to find DSCInitialization error in one of the OpenShift AI Operator pod logs. This issue can occur if the Auth custom resource from the previous installation was not deleted after uninstalling the OpenShift AI Operator and before reinstalling it. For more information, see Understanding the uninstallation process.

Diagnosis

  1. In the OpenShift web console, switch to the Administrator perspective.
  2. Click Workloads → Pods.
  3. Set the Project to All Projects or redhat-ods-operator.
  4. Click the rhods-operator-<random string> pod that shows an error in the Status column.

    The Pod details page appears.

  5. Click Logs.
  6. Select rhods-operator from the drop-down list.
  7. Check the log for an error message similar to the following:

    {"name":"auth"},"namespace":"","name":"auth","reconcileID":"7bff53ae-1252-46fe-831a-fdc824078a1b","error":"unable to find DSCInitialization","stacktrace":"sigs.k8s.io/controller-runtime/pkg/internal/controller.

Resolution

  1. Uninstall the OpenShift AI Operator.
  2. Delete the Auth custom resource:

    1. In the OpenShift web console, switch to the Administrator perspective.
    2. Click API Explorer.
    3. From the All groups drop-down list, select or enter services.platform.opendatahub.io.
    4. Click the Auth kind.
    5. Click the Instances tab.
    6. Click the action menu (⋮) and select Delete Auth.

      The Delete Auth dialog appears.

    7. Click Delete.
  3. Install the OpenShift AI Operator again.

Problem

The Role-based access control (RBAC) policy for the dedicated-admins group in the target project cannot be created. This issue occurs in unknown circumstances.

Diagnosis

  1. In the OpenShift web console, switch to the Administrator perspective.
  2. Click Workloads → Pods.
  3. Set the Project to All Projects or redhat-ods-operator.
  4. Click the rhods-operator-<random string> pod that shows an error in the Status column.

    The Pod details page appears.

  5. Click Logs.
  6. Select rhods-operator from the drop-down list.
  7. Check the log for the ERROR: Attempt to create the RBAC policy for dedicated admins group in $target_project failed. error message.

Resolution

  • Contact Red Hat support.

Problem

An issue with the OpenShift AI Operator’s installation flow could result in a failure to create the ODH parameter secret.

Diagnosis

  1. In the OpenShift web console, switch to the Administrator perspective.
  2. Click Workloads → Pods.
  3. Set the Project to All Projects or redhat-ods-operator.
  4. Click the rhods-operator-<random string> pod that shows an error in the Status column.

    The Pod details page appears.

  5. Click Logs.
  6. Select rhods-operator from the drop-down list.
  7. Check the log for the ERROR: Addon managed odh parameter secret does not exist. error message.

Resolution

  • Contact Red Hat support.

Chapter 11. Uninstalling Red Hat OpenShift AI Self-Managed

This chapter shows how to use the OpenShift CLI (oc) to uninstall the Red Hat OpenShift AI Operator and any OpenShift AI components installed and managed by the Operator.

Note

Using the OpenShift CLI (oc) is the recommended way to uninstall the Operator. Depending on your version of OpenShift, using the web console to perform the uninstallation might not prompt you to uninstall all associated components. This could leave you unclear about the final state of your cluster.

11.1. Understanding the uninstallation process

Installing Red Hat OpenShift AI created several custom resource instances on your OpenShift cluster for various components of OpenShift AI. After installation, users likely created additional resources while using OpenShift AI. Uninstalling OpenShift AI removes the resources that were created by the Operator, but retains the resources created by users to prevent inadvertently deleting information that you might want to keep.

What is deleted

Uninstalling OpenShift AI removes the following resources from your OpenShift cluster:

  • DataScienceCluster custom resource instance and the custom resource instances it created for each component
  • DSCInitialization custom resource instance
  • Auth custom resource instance created during or after installation
  • FeatureTracker custom resource instances created during or after installation
  • ServiceMesh custom resource instance created by the Operator during or after installation
  • KNativeServing custom resource instance created by the Operator during or after installation
  • redhat-ods-applications, redhat-ods-monitoring, and rhods-notebooks namespaces created by the Operator
  • Workloads in the rhods-notebooks namespace
  • Subscription, ClusterServiceVersion, and InstallPlan objects
  • KfDef object (version 1 Operator only)

What might remain

Uninstalling OpenShift AI retains the following resources in your OpenShift cluster:

  • Projects created by users
  • Custom resource instances created by users
  • Custom resource definitions (CRDs) created by users or by the Operator

While these resources might still remain in your OpenShift cluster, they are not functional. After uninstalling, Red Hat recommends that you review the projects and custom resources in your OpenShift cluster and delete anything no longer in use to prevent potential issues, such as pipelines that cannot run or notebooks and models that cannot be undeployed.

The following procedure shows how to use the OpenShift CLI (oc) to uninstall the Red Hat OpenShift AI Operator and any OpenShift AI components installed and managed by the Operator.

Prerequisites

  • You have cluster administrator privileges for your OpenShift cluster.
  • You have installed the OpenShift CLI (oc) as described in the appropriate documentation for your cluster.

  • You have backed up the persistent disks or volumes used by your persistent volume claims (PVCs).

Procedure

  1. Open a new terminal window.
  2. Log in to your OpenShift cluster as a cluster administrator, as shown in the following example:

    $ oc login <openshift_cluster_url> -u system:admin
  3. Create a ConfigMap object for deletion of the Red Hat OpenShift AI Operator.

    $ oc create configmap delete-self-managed-odh -n redhat-ods-operator
  4. To delete the rhods-operator, set the addon-managed-odh-delete label to true.

    $ oc label configmap/delete-self-managed-odh api.openshift.com/addon-managed-odh-delete=true -n redhat-ods-operator
  5. When all objects associated with the Operator are removed, delete the redhat-ods-operator project.

    1. Set an environment variable for the redhat-ods-applications project.

      $ PROJECT_NAME=redhat-ods-applications
    2. Wait until the redhat-ods-applications project has been deleted.

      while oc get project $PROJECT_NAME &> /dev/null; do
        echo "The $PROJECT_NAME project still exists"
        sleep 1
      done
      echo "The $PROJECT_NAME project no longer exists"

      When the redhat-ods-applications project has been deleted, you see the following output.

      The redhat-ods-applications project no longer exists
    3. When the redhat-ods-applications project has been deleted, delete the redhat-ods-operator project.

      $ oc delete namespace redhat-ods-operator
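For convenience, steps 3 through 5 can be combined into a single script. This is a sketch of the same commands shown above; it assumes you are already logged in to the cluster as a cluster administrator:

```shell
#!/usr/bin/env bash

# Trigger deletion of the Operator and its managed components.
oc create configmap delete-self-managed-odh -n redhat-ods-operator
oc label configmap/delete-self-managed-odh \
    api.openshift.com/addon-managed-odh-delete=true -n redhat-ods-operator

# Wait for the applications project to be fully deleted before
# removing the Operator project itself.
PROJECT_NAME=redhat-ods-applications
while oc get project "$PROJECT_NAME" &> /dev/null; do
  echo "The $PROJECT_NAME project still exists"
  sleep 1
done
echo "The $PROJECT_NAME project no longer exists"

oc delete namespace redhat-ods-operator
```

The wait loop matters: deleting the redhat-ods-operator project while the Operator is still removing its managed objects can leave resources stranded with finalizers.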

Verification

  1. Confirm that the rhods-operator subscription no longer exists. The following command returns no output when the subscription has been removed.

    $ oc get subscriptions --all-namespaces | grep rhods-operator
  2. Confirm that the following projects no longer exist.

    • redhat-ods-applications
    • redhat-ods-monitoring
    • redhat-ods-operator
    • rhods-notebooks

      $ oc get namespaces | grep -e 'redhat-ods' -e 'rhods'

      The rhods-notebooks project existed only if you installed the workbenches component of OpenShift AI. See Installing and managing Red Hat OpenShift AI components.
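The namespace check can also be wrapped in a short conditional so it reports a clear result. This sketch only checks namespaces; the messages are illustrative:

```shell
# Report whether any OpenShift AI namespaces remain after uninstalling.
if oc get namespaces | grep -q -e 'redhat-ods' -e 'rhods'; then
  echo "OpenShift AI namespaces still present; uninstall may be in progress"
else
  echo "No OpenShift AI namespaces remain"
fi
```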

Legal Notice

Copyright © Red Hat.
The text of and illustrations in this document are licensed by Red Hat under a Creative Commons Attribution–Share Alike 3.0 Unported license ("CC-BY-SA"). An explanation of CC-BY-SA is available at http://creativecommons.org/licenses/by-sa/3.0/. In accordance with CC-BY-SA, if you distribute this document or an adaptation of it, you must provide the URL for the original version.
Red Hat, as the licensor of this document, waives the right to enforce, and agrees not to assert, Section 4d of CC-BY-SA to the fullest extent permitted by applicable law.
Red Hat, Red Hat Enterprise Linux, the Shadowman logo, JBoss, OpenShift, Fedora, the Infinity logo, and RHCE are trademarks of Red Hat, Inc., registered in the United States and other countries.
Linux® is the registered trademark of Linus Torvalds in the United States and other countries.
Java® is a registered trademark of Oracle and/or its affiliates.
XFS® is a trademark of Silicon Graphics International Corp. or its subsidiaries in the United States and/or other countries.
MySQL® is a registered trademark of MySQL AB in the United States, the European Union and other countries.
Node.js® is an official trademark of Joyent. Red Hat Software Collections is not formally related to or endorsed by the official Joyent Node.js open source or commercial project.
The OpenStack® Word Mark and OpenStack logo are either registered trademarks/service marks or trademarks/service marks of the OpenStack Foundation, in the United States and other countries and are used with the OpenStack Foundation's permission. We are not affiliated with, endorsed or sponsored by the OpenStack Foundation, or the OpenStack community.
All other trademarks are the property of their respective owners.