Chapter 11. Troubleshooting common installation problems


If you are experiencing difficulties installing the Red Hat OpenShift AI Operator, read this section to understand what could be causing the problem and how to resolve it.

If the problem is not included here or in the release notes, contact Red Hat Support. When opening a support case, it is helpful to include debugging information about your cluster. You can collect this information by using the must-gather tool as described in Must-Gather for Red Hat OpenShift AI and Gathering data about your cluster.

You can also adjust the log level of OpenShift AI Operator components to increase or reduce log verbosity to suit your use case. For more information, see Configuring the OpenShift AI Operator logger.

11.1. The Red Hat OpenShift AI Operator cannot be retrieved from the image registry

Problem

When you attempt to retrieve the Red Hat OpenShift AI Operator from the image registry, a Failure to pull from quay error message appears. The Red Hat OpenShift AI Operator might be unavailable for retrieval in the following circumstances:

  • The image registry is unavailable.
  • There is a problem with your network connection.
  • Your cluster is not operational and therefore cannot reach the image registry.

Diagnosis

Check the logs in the Events section in OpenShift for further information about the Failure to pull from quay error message.
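
If you prefer to check events outside the web console, the following minimal sketch filters an event list for pull-related warnings. It assumes that you saved the output of a command such as `oc get events -n redhat-ods-operator -o json` and pass it in as a string. The function name and the sample image reference are illustrative only, and matching on the word "pull" is a simple heuristic rather than an official check.

```python
import json


def pull_warnings(events_json: str) -> list:
    """Return 'object: message' strings for Warning events that mention a pull."""
    items = json.loads(events_json).get("items", [])
    found = []
    for event in items:
        message = event.get("message", "")
        # Heuristic: keep only Warning events whose message mentions "pull".
        if event.get("type") == "Warning" and "pull" in message.lower():
            name = event.get("involvedObject", {}).get("name", "<unknown>")
            found.append(f"{name}: {message}")
    return found


if __name__ == "__main__":
    # Hypothetical sample data; real events come from your cluster.
    sample = json.dumps({
        "items": [
            {
                "type": "Warning",
                "reason": "Failed",
                "message": "Failed to pull image \"quay.io/example/image:tag\"",
                "involvedObject": {"kind": "Pod", "name": "rhods-operator-abc"},
            },
            {
                "type": "Normal",
                "reason": "Pulled",
                "message": "Successfully pulled image",
                "involvedObject": {"kind": "Pod", "name": "other-pod"},
            },
        ]
    })
    for line in pull_warnings(sample):
        print(line)
```

The matching messages are useful debugging information to attach when you open a support case.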

Resolution

  • Contact Red Hat support.

11.2. OpenShift AI cannot be installed on unsupported infrastructure

Problem

You are deploying on an environment that is not documented as supported by the Red Hat OpenShift AI Operator.

Diagnosis

  1. In the OpenShift web console, switch to the Administrator perspective.
  2. Click Workloads → Pods.
  3. Set the Project to All Projects or redhat-ods-operator.
  4. Click the rhods-operator-<random string> pod that shows an error in the Status column.

    The Pod details page appears.

  5. Click Logs.
  6. Select rhods-operator from the drop-down list.
  7. Check the log for the ERROR: Deploying on $infrastructure, which is not supported. Failing Installation error message.
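
The log check in the steps above recurs in several sections of this chapter, each with a different error message. The following minimal sketch scans saved log text for the messages that this chapter quotes. It assumes that you saved the rhods-operator container log to a string, for example from the Logs tab. The labels and the function name are illustrative names, not official identifiers.

```python
import re

# Error messages quoted in this chapter, each paired with an illustrative label.
KNOWN_ERRORS = [
    (re.compile(r"Deploying on .+, which is not supported"),
     "unsupported infrastructure"),
    (re.compile(r"Attempt to create the ODH CR failed"),
     "OpenShift AI CR not created"),
    (re.compile(r"Attempt to create the RHODS Notebooks CR failed"),
     "Notebooks CR not created"),
    (re.compile(r"Attempt to create the RBAC policy for dedicated admins group in .+ failed"),
     "dedicated-admins RBAC policy not created"),
    (re.compile(r"Pagerduty secret does not exist"),
     "PagerDuty secret missing"),
    (re.compile(r"SMTP secret does not exist"),
     "SMTP secret missing"),
    (re.compile(r"Addon managed odh parameter secret does not exist"),
     "ODH parameter secret missing"),
]


def match_known_errors(log_text: str) -> list:
    """Return the labels of known errors, in order of first appearance."""
    hits = []
    for line in log_text.splitlines():
        for pattern, label in KNOWN_ERRORS:
            if pattern.search(line) and label not in hits:
                hits.append(label)
    return hits


if __name__ == "__main__":
    # Hypothetical log excerpt for illustration.
    log = (
        "INFO: starting installation\n"
        "ERROR: Deploying on example-infra, which is not supported. Failing Installation\n"
    )
    print(match_known_errors(log))
```

A match only tells you which section of this chapter applies. It does not replace reading the surrounding log lines.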

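Resolution

  • Before you proceed with a new installation, ensure that your environment is one that is documented as supported by the Red Hat OpenShift AI Operator.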

11.3. The creation of the OpenShift AI Custom Resource (CR) fails

Problem

During the installation process, the OpenShift AI Custom Resource (CR) is not created. The circumstances that cause this issue are unknown.

Diagnosis

  1. In the OpenShift web console, switch to the Administrator perspective.
  2. Click Workloads → Pods.
  3. Set the Project to All Projects or redhat-ods-operator.
  4. Click the rhods-operator-<random string> pod that shows an error in the Status column.

    The Pod details page appears.

  5. Click Logs.
  6. Select rhods-operator from the drop-down list.
  7. Check the log for the ERROR: Attempt to create the ODH CR failed. error message.

Resolution

  • Contact Red Hat support.

11.4. The creation of the OpenShift AI Notebooks Custom Resource (CR) fails

Problem

During the installation process, the OpenShift AI Notebooks Custom Resource (CR) is not created. The circumstances that cause this issue are unknown.

Diagnosis

  1. In the OpenShift web console, switch to the Administrator perspective.
  2. Click Workloads → Pods.
  3. Set the Project to All Projects or redhat-ods-operator.
  4. Click the rhods-operator-<random string> pod that shows an error in the Status column.

    The Pod details page appears.

  5. Click Logs.
  6. Select rhods-operator from the drop-down list.
  7. Check the log for the ERROR: Attempt to create the RHODS Notebooks CR failed. error message.

Resolution

  • Contact Red Hat support.

11.5. The OpenShift AI dashboard is not accessible

Problem

After installing OpenShift AI, the redhat-ods-applications, redhat-ods-monitoring, and redhat-ods-operator project namespaces are Active, but you cannot access the dashboard because one of the pods has an error.

Diagnosis

  1. In the OpenShift web console, switch to the Administrator perspective.
  2. Click Workloads → Pods.
  3. Set the Project to All Projects.
  4. Click Filter and select the checkbox for every status except Running and Completed.

    The page displays the pods that have an error.
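
The filter in the steps above can be approximated outside the console. The following minimal sketch assumes that you saved the output of `oc get pods --all-namespaces -o json` and parsed it into a dictionary. It reports pods whose phase is neither Running nor Succeeded (the console shows Succeeded pods as Completed), and also pods with a container in a waiting state, such as CrashLoopBackOff. The function name and sample data are illustrative, and the result is only an approximation of the Status column in the console.

```python
def unhealthy_pods(pods: dict) -> list:
    """Return (namespace, name, reason) for pods that are not Running or Completed."""
    result = []
    for pod in pods.get("items", []):
        meta = pod.get("metadata", {})
        status = pod.get("status", {})
        phase = status.get("phase", "Unknown")
        reason = None
        # A waiting container (for example CrashLoopBackOff) marks the pod as
        # unhealthy even when the pod phase is still Running.
        for container in status.get("containerStatuses", []):
            waiting = container.get("state", {}).get("waiting")
            if waiting:
                reason = waiting.get("reason", "Waiting")
                break
        if reason is None and phase not in ("Running", "Succeeded"):
            reason = phase
        if reason is not None:
            result.append((meta.get("namespace", ""), meta.get("name", ""), reason))
    return result


if __name__ == "__main__":
    # Hypothetical sample data; pod names are placeholders.
    sample = {
        "items": [
            {"metadata": {"namespace": "redhat-ods-applications", "name": "dashboard-1"},
             "status": {"phase": "Running",
                        "containerStatuses": [{"state": {"running": {}}}]}},
            {"metadata": {"namespace": "redhat-ods-applications", "name": "dashboard-2"},
             "status": {"phase": "Running",
                        "containerStatuses": [
                            {"state": {"waiting": {"reason": "CrashLoopBackOff"}}}]}},
        ]
    }
    for namespace, name, reason in unhealthy_pods(sample):
        print(namespace, name, reason)
```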

Resolution

  • To see more information and troubleshooting steps for a pod, on the Pods page, click the link in the Status column for the pod.
  • If the Status column does not display a link, click the pod name to open the pod details page and then click the Logs tab.

11.6. Reinstalling OpenShift AI fails with an error

Problem

After uninstalling the OpenShift AI Operator and reinstalling it by using the CLI, the reinstallation fails with an unable to find DSCInitialization error in one of the OpenShift AI Operator pod logs. This issue can occur if the Auth custom resource from the previous installation was not deleted after uninstalling the OpenShift AI Operator and before reinstalling it. For more information, see Understanding the uninstallation process.

Diagnosis

  1. In the OpenShift web console, switch to the Administrator perspective.
  2. Click Workloads → Pods.
  3. Set the Project to All Projects or redhat-ods-operator.
  4. Click the rhods-operator-<random string> pod that shows an error in the Status column.

    The Pod details page appears.

  5. Click Logs.
  6. Select rhods-operator from the drop-down list.
  7. Check the log for an error message similar to the following:

    {"name":"auth"},"namespace":"","name":"auth","reconcileID":"7bff53ae-1252-46fe-831a-fdc824078a1b","error":"unable to find DSCInitialization","stacktrace":"sigs.k8s.io/controller-runtime/pkg/internal/controller.
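
Because the log line shown above is a truncated fragment rather than complete JSON, a JSON parser cannot read it directly. The following minimal sketch instead extracts the "error" field with a regular expression and checks it for the DSCInitialization message. The function names are illustrative, and the approach assumes that the Operator log keeps the "error":"..." layout shown in the example.

```python
import re
from typing import Optional

# Matches the value of the "error" field, allowing escaped characters inside it.
ERROR_FIELD = re.compile(r'"error":"((?:[^"\\]|\\.)*)"')


def extract_error_field(line: str) -> Optional[str]:
    """Return the text of the "error" field in a log line, or None if absent."""
    match = ERROR_FIELD.search(line)
    return match.group(1) if match else None


def is_missing_dsci(line: str) -> bool:
    """Report whether the log line carries the 'unable to find DSCInitialization' error."""
    error = extract_error_field(line)
    return error is not None and "unable to find DSCInitialization" in error


if __name__ == "__main__":
    # The truncated line from the example above.
    line = (
        '{"name":"auth"},"namespace":"","name":"auth",'
        '"reconcileID":"7bff53ae-1252-46fe-831a-fdc824078a1b",'
        '"error":"unable to find DSCInitialization",'
        '"stacktrace":"sigs.k8s.io/controller-runtime/pkg/internal/controller.'
    )
    print(extract_error_field(line))
    print(is_missing_dsci(line))
```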

Resolution

  1. Uninstall the OpenShift AI Operator.
  2. Delete the Auth custom resource:

    1. In the OpenShift web console, switch to the Administrator perspective.
    2. Click API Explorer.
    3. From the All groups drop-down list, select or enter services.platform.opendatahub.io.
    4. Click the Auth kind.
    5. Click the Instances tab.
    6. Click the action menu (⋮) and select Delete Auth.

      The Delete Auth dialog appears.

    7. Click Delete.
  3. Install the OpenShift AI Operator again.
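
If you work from the CLI, the deletion in step 2 above might be expressed as `oc` commands. The following minimal sketch only builds and prints the commands, so that you can review them before you run anything. The resource name `auth` is an assumption taken from the log line in the Diagnosis section, and the `auth.services.platform.opendatahub.io` form combines the kind and API group named in the steps above. List the existing instances first to confirm the name on your cluster.

```python
import shlex

# Kind and API group as named in the web console steps above.
AUTH_RESOURCE = "auth.services.platform.opendatahub.io"


def auth_cleanup_commands(name: str = "auth") -> list:
    """Build the oc commands to list and then delete the leftover Auth resource.

    The default name "auth" is an assumption based on the example log line.
    """
    return [
        ["oc", "get", AUTH_RESOURCE],
        ["oc", "delete", AUTH_RESOURCE, name],
    ]


if __name__ == "__main__":
    # Print the commands for review; this sketch does not execute them.
    for argv in auth_cleanup_commands():
        print(shlex.join(argv))
```

Run the commands only after the OpenShift AI Operator is uninstalled, and reinstall the Operator afterwards, as the steps above describe.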

11.7. The dedicated-admins role-based access control (RBAC) policy cannot be created

Problem

The role-based access control (RBAC) policy for the dedicated-admins group in the target project cannot be created. The circumstances that cause this issue are unknown.

Diagnosis

  1. In the OpenShift web console, switch to the Administrator perspective.
  2. Click Workloads → Pods.
  3. Set the Project to All Projects or redhat-ods-operator.
  4. Click the rhods-operator-<random string> pod that shows an error in the Status column.

    The Pod details page appears.

  5. Click Logs.
  6. Select rhods-operator from the drop-down list.
  7. Check the log for the ERROR: Attempt to create the RBAC policy for dedicated admins group in $target_project failed. error message.

Resolution

  • Contact Red Hat support.

11.8. The PagerDuty secret does not get created

Problem

An issue with the Managed Tenants SRE automation process prevents the PagerDuty secret from being created.

Diagnosis

  1. In the OpenShift web console, switch to the Administrator perspective.
  2. Click Workloads → Pods.
  3. Set the Project to All Projects or redhat-ods-operator.
  4. Click the rhods-operator-<random string> pod that shows an error in the Status column.

    The Pod details page appears.

  5. Click Logs.
  6. Select rhods-operator from the drop-down list.
  7. Check the log for the ERROR: Pagerduty secret does not exist error message.

Resolution

  • Contact Red Hat support.

11.9. The SMTP secret does not exist

Problem

An issue with the Managed Tenants SRE automation process prevents the SMTP secret from being created.

Diagnosis

  1. In the OpenShift web console, switch to the Administrator perspective.
  2. Click Workloads → Pods.
  3. Set the Project to All Projects or redhat-ods-operator.
  4. Click the rhods-operator-<random string> pod that shows an error in the Status column.

    The Pod details page appears.

  5. Click Logs.
  6. Select rhods-operator from the drop-down list.
  7. Check the log for the ERROR: SMTP secret does not exist error message.

Resolution

  • Contact Red Hat support.

11.10. The ODH parameter secret does not get created

Problem

An issue with the OpenShift AI Operator’s flow can cause the creation of the ODH parameter secret to fail.

Diagnosis

  1. In the OpenShift web console, switch to the Administrator perspective.
  2. Click Workloads → Pods.
  3. Set the Project to All Projects or redhat-ods-operator.
  4. Click the rhods-operator-<random string> pod that shows an error in the Status column.

    The Pod details page appears.

  5. Click Logs.
  6. Select rhods-operator from the drop-down list.
  7. Check the log for the ERROR: Addon managed odh parameter secret does not exist. error message.

Resolution

  • Contact Red Hat support.

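11.11. Data science pipelines are not enabled after installing OpenShift AI 2.9 or later due to existing Argo Workflows resources

Problem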

After installing OpenShift AI 2.9 or later with an Argo Workflows installation that is not installed by OpenShift AI on your cluster, data science pipelines are not enabled despite the datasciencepipelines component being enabled in the DataScienceCluster object.

Diagnosis

After you install OpenShift AI 2.9 or later, the Data science pipelines tab is not visible on the OpenShift AI dashboard navigation menu.

Resolution

  • Delete the separate instance of Argo Workflows from your cluster. After you have removed any Argo Workflows resources that are not created by OpenShift AI from your cluster, data science pipelines are enabled automatically.