Chapter 9. Troubleshooting common installation problems


If you are experiencing difficulties installing the Red Hat OpenShift AI Operator, read this section to understand what could be causing the problem and how to resolve it.

If the problem is not included here or in the release notes, contact Red Hat Support. When opening a support case, it is helpful to include debugging information about your cluster. You can collect this information by using the must-gather tool as described in Must-Gather for Red Hat OpenShift AI and Gathering data about your cluster.

You can also adjust the log level of OpenShift AI Operator components to increase or reduce log verbosity to suit your use case. For more information, see Configuring the OpenShift AI Operator logger.

9.1. The Red Hat OpenShift AI Operator cannot be retrieved from the image registry

Problem

When you attempt to retrieve the Red Hat OpenShift AI Operator from the image registry, a Failure to pull from quay error message appears. The Red Hat OpenShift AI Operator might be unavailable for retrieval in the following circumstances:

  • The image registry is unavailable.
  • There is a problem with your network connection.
  • Your cluster is not operational and is therefore unable to retrieve the image from the image registry.

Diagnosis

Check the logs in the Events section in OpenShift for further information about the Failure to pull from quay error message.
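
If you prefer to review events outside the web console, the check above can be sketched in Python. This is a minimal sketch under stated assumptions, not a supported tool: it assumes that you saved the output of `oc get events -o json` for the relevant project to a string, and that pull failures are reported as Warning events whose message mentions "pull". The function name `find_pull_failures` is hypothetical.

```python
import json


def find_pull_failures(events_json: str) -> list[str]:
    """Return "object: message" strings for Warning events that mention a pull.

    Assumes events_json is the JSON list produced by `oc get events -o json`.
    Matching on the word "pull" is a heuristic, not an official filter.
    """
    events = json.loads(events_json).get("items", [])
    failures = []
    for event in events:
        message = event.get("message", "")
        # Normal events include successful pulls, so only Warning events count.
        if event.get("type") == "Warning" and "pull" in message.lower():
            name = event.get("involvedObject", {}).get("name", "<unknown>")
            failures.append(f"{name}: {message}")
    return failures
```

The function returns an empty list when no matching event exists, which does not prove that the pull succeeded; events expire after a retention period.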

Resolution

  • Contact Red Hat support.

9.2. OpenShift AI does not install on unsupported infrastructure

Problem

You are deploying on an environment that is not documented as supported by the Red Hat OpenShift AI Operator.

Diagnosis

  1. In the OpenShift web console, switch to the Administrator perspective.
  2. Click Workloads → Pods.
  3. Set the Project to All Projects or redhat-ods-operator.
  4. Click the rhods-operator-<random string> pod.

    The Pod details page appears.

  5. Click Logs.
  6. Select rhods-operator from the drop-down list.
  7. Check the log for the ERROR: Deploying on $infrastructure, which is not supported. Failing Installation error message.
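
The log check in the last step, and the equivalent step in the sections that follow, can also be run against a saved copy of the operator log. The following Python sketch scans log text for the error messages quoted in this chapter. It is an illustration only: the name `scan_operator_log` is hypothetical, the section numbers are used as labels, and the `$infrastructure` and `$target_project` placeholders are matched as arbitrary text, which is an assumption about how the real log lines look.

```python
import re

# Error messages quoted in this chapter, keyed by section number.
# "$infrastructure" and "$target_project" are placeholders in the documented
# text, so they are matched as arbitrary text (an assumption).
KNOWN_ERRORS = {
    "9.2": r"ERROR: Deploying on .+?, which is not supported\. Failing Installation",
    "9.3": r"ERROR: Attempt to create the ODH CR failed\.",
    "9.4": r"ERROR: Attempt to create the RHODS Notebooks CR failed\.",
    "9.6": r"ERROR: Attempt to create the RBAC policy for dedicated admins group in .+? failed\.",
    "9.7": r"ERROR: Pagerduty secret does not exist",
    "9.8": r"ERROR: SMTP secret does not exist",
    "9.9": r"ERROR: Addon managed odh parameter secret does not exist\.",
}


def scan_operator_log(log_text: str) -> list[str]:
    """Return the section numbers whose documented error appears in log_text."""
    return [
        section
        for section, pattern in KNOWN_ERRORS.items()
        if re.search(pattern, log_text)
    ]
```

One way to obtain the log text is `oc logs <pod name> -n redhat-ods-operator -c rhods-operator`, where the pod name is the rhods-operator-<random string> pod from the steps above.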

Resolution

9.3. The creation of the OpenShift AI Custom Resource (CR) fails

Problem

During the installation process, the OpenShift AI Custom Resource (CR) is not created. This issue occurs in unknown circumstances.

Diagnosis

  1. In the OpenShift web console, switch to the Administrator perspective.
  2. Click Workloads → Pods.
  3. Set the Project to All Projects or redhat-ods-operator.
  4. Click the rhods-operator-<random string> pod.

    The Pod details page appears.

  5. Click Logs.
  6. Select rhods-operator from the drop-down list.
  7. Check the log for the ERROR: Attempt to create the ODH CR failed. error message.

Resolution

  • Contact Red Hat support.

9.4. The creation of the OpenShift AI Notebooks Custom Resource (CR) fails

Problem

During the installation process, the OpenShift AI Notebooks Custom Resource (CR) is not created. This issue occurs in unknown circumstances.

Diagnosis

  1. In the OpenShift web console, switch to the Administrator perspective.
  2. Click Workloads → Pods.
  3. Set the Project to All Projects or redhat-ods-operator.
  4. Click the rhods-operator-<random string> pod.

    The Pod details page appears.

  5. Click Logs.
  6. Select rhods-operator from the drop-down list.
  7. Check the log for the ERROR: Attempt to create the RHODS Notebooks CR failed. error message.

Resolution

  • Contact Red Hat support.

9.5. The OpenShift AI dashboard is not accessible

Problem

After installing OpenShift AI, the redhat-ods-applications, redhat-ods-monitoring, and redhat-ods-operator project namespaces are Active, but you cannot access the dashboard because of an error in a pod.

Diagnosis

  1. In the OpenShift web console, switch to the Administrator perspective.
  2. Click Workloads → Pods.
  3. Set the Project to All Projects.
  4. Click Filter and select the checkbox for every status except Running and Completed.

    The page displays the pods that have an error.
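
A rough command-line counterpart to this filter can be sketched in Python. The sketch assumes that you saved the output of `oc get pods --all-namespaces -o json` to a string; the function name `pods_needing_attention` is hypothetical. It filters on the pod phase only, which is narrower than the Status column in the console: a pod in CrashLoopBackOff, for example, can still report the Running phase and would not be flagged here.

```python
import json

# The console status "Completed" corresponds to the pod phase "Succeeded".
HEALTHY_PHASES = {"Running", "Succeeded"}


def pods_needing_attention(pods_json: str) -> list[tuple[str, str, str]]:
    """Return (namespace, name, phase) for pods outside the healthy phases.

    Assumes pods_json is the JSON list produced by `oc get pods -o json`.
    """
    pods = json.loads(pods_json).get("items", [])
    flagged = []
    for pod in pods:
        phase = pod.get("status", {}).get("phase", "Unknown")
        if phase not in HEALTHY_PHASES:
            meta = pod.get("metadata", {})
            flagged.append((meta.get("namespace", ""), meta.get("name", ""), phase))
    return flagged
```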

Resolution

  • To see more information and troubleshooting steps for a pod, on the Pods page, click the link in the Status column for the pod.
  • If the Status column does not display a link, click the pod name to open the pod details page and then click the Logs tab.

9.6. The dedicated-admins role-based access control (RBAC) policy cannot be created

Problem

The role-based access control (RBAC) policy for the dedicated-admins group in the target project cannot be created. This issue occurs in unknown circumstances.

Diagnosis

  1. In the OpenShift web console, switch to the Administrator perspective.
  2. Click Workloads → Pods.
  3. Set the Project to All Projects or redhat-ods-operator.
  4. Click the rhods-operator-<random string> pod.

    The Pod details page appears.

  5. Click Logs.
  6. Select rhods-operator from the drop-down list.
  7. Check the log for the ERROR: Attempt to create the RBAC policy for dedicated admins group in $target_project failed. error message.

Resolution

  • Contact Red Hat support.

9.7. The PagerDuty secret does not get created

Problem

An issue with the Managed Tenants SRE automation process prevents the PagerDuty secret from being created.

Diagnosis

  1. In the OpenShift web console, switch to the Administrator perspective.
  2. Click Workloads → Pods.
  3. Set the Project to All Projects or redhat-ods-operator.
  4. Click the rhods-operator-<random string> pod.

    The Pod details page appears.

  5. Click Logs.
  6. Select rhods-operator from the drop-down list.
  7. Check the log for the ERROR: Pagerduty secret does not exist error message.

Resolution

  • Contact Red Hat support.

9.8. The SMTP secret does not exist

Problem

An issue with the Managed Tenants SRE automation process prevents the SMTP secret from being created.

Diagnosis

  1. In the OpenShift web console, switch to the Administrator perspective.
  2. Click Workloads → Pods.
  3. Set the Project to All Projects or redhat-ods-operator.
  4. Click the rhods-operator-<random string> pod.

    The Pod details page appears.

  5. Click Logs.
  6. Select rhods-operator from the drop-down list.
  7. Check the log for the ERROR: SMTP secret does not exist error message.

Resolution

  • Contact Red Hat support.

9.9. The ODH parameter secret does not get created

Problem

An issue with the OpenShift AI Operator’s flow can prevent the ODH parameter secret from being created.

Diagnosis

  1. In the OpenShift web console, switch to the Administrator perspective.
  2. Click Workloads → Pods.
  3. Set the Project to All Projects or redhat-ods-operator.
  4. Click the rhods-operator-<random string> pod.

    The Pod details page appears.

  5. Click Logs.
  6. Select rhods-operator from the drop-down list.
  7. Check the log for the ERROR: Addon managed odh parameter secret does not exist. error message.

Resolution

  • Contact Red Hat support.

9.10. Data science pipelines are not enabled after installing OpenShift AI 2.9 or later due to existing Argo Workflows resources

Problem

If your cluster has an Argo Workflows installation that was not installed by OpenShift AI, data science pipelines are not enabled after you install OpenShift AI 2.9 or later, even though the datasciencepipelines component is enabled in the DataScienceCluster object.

Diagnosis

After you install OpenShift AI 2.9 or later, the Data Science Pipelines tab is not visible on the OpenShift AI dashboard navigation menu.

Resolution

  • Delete the separate installation of Argo Workflows on your cluster. After you remove any Argo Workflows resources that were not created by OpenShift AI, data science pipelines are enabled automatically.
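
Before deleting anything, it can help to confirm whether Argo Workflows resource definitions exist on the cluster. The following Python sketch lists Argo Workflows custom resource definitions (CRDs) found in saved `oc get crd -o json` output. The function name `argo_workflows_crds` is hypothetical, and the set of CRD names comes from upstream Argo Workflows as an assumption; it might not be exhaustive for every version. The sketch cannot tell which installation created a CRD, so verify ownership separately before you remove any resource.

```python
import json

# CRD names used by upstream Argo Workflows. Listed as an assumption and
# not necessarily exhaustive for every Argo Workflows version.
ARGO_WORKFLOWS_CRDS = {
    "workflows.argoproj.io",
    "workflowtemplates.argoproj.io",
    "clusterworkflowtemplates.argoproj.io",
    "cronworkflows.argoproj.io",
    "workfloweventbindings.argoproj.io",
    "workflowtasksets.argoproj.io",
    "workflowtaskresults.argoproj.io",
    "workflowartifactgctasks.argoproj.io",
}


def argo_workflows_crds(crds_json: str) -> list[str]:
    """Return the sorted Argo Workflows CRD names present in crds_json.

    Assumes crds_json is the JSON list produced by `oc get crd -o json`.
    """
    items = json.loads(crds_json).get("items", [])
    names = (item.get("metadata", {}).get("name", "") for item in items)
    return sorted(name for name in names if name in ARGO_WORKFLOWS_CRDS)
```

Matching on explicit names rather than on the argoproj.io API group avoids flagging CRDs from other Argo projects, such as Argo CD, that share the same group.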