Chapter 1. Support overview


Red Hat offers cluster administrators tools for gathering data about a cluster, monitoring it, and troubleshooting issues.

1.1. Get support

Visit the Red Hat Customer Portal to review knowledge base articles, submit a support case, and review additional product documentation and resources.

1.2. Remote health monitoring issues

Red Hat OpenShift Service on AWS collects telemetry and configuration data about your cluster and reports it to Red Hat by using the Telemeter Client and the Insights Operator. Red Hat uses this data to understand and resolve issues in connected clusters. Red Hat OpenShift Service on AWS collects data and monitors health by using the following:

  • Telemetry: The Telemeter Client gathers metrics values and uploads them to Red Hat every four minutes and thirty seconds. Red Hat uses this data to:

    • Monitor the clusters.
    • Roll out Red Hat OpenShift Service on AWS upgrades.
    • Improve the upgrade experience.
  • Insights Operator: By default, Red Hat OpenShift Service on AWS installs and enables the Insights Operator, which reports configuration and component failure status every two hours. The Insights Operator helps to:

    • Identify potential cluster issues proactively.
    • Provide a solution and preventive action in Red Hat OpenShift Cluster Manager.

You can review telemetry information.

If you have enabled remote health reporting, use Insights to identify issues. You can optionally disable remote health reporting.
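
As a rough illustration of the reporting intervals described above, the following Python sketch converts each interval into an approximate number of reports per cluster per day. The constant and function names are hypothetical and chosen only for this example, and the calculation assumes that every report is sent exactly on schedule.

```python
from datetime import timedelta

# Intervals as stated in this section.
TELEMETRY_INTERVAL = timedelta(minutes=4, seconds=30)
INSIGHTS_INTERVAL = timedelta(hours=2)


def reports_per_day(interval: timedelta) -> float:
    """Return how many reports fit into 24 hours at the given interval."""
    return timedelta(days=1) / interval


print(reports_per_day(TELEMETRY_INTERVAL))  # 320.0
print(reports_per_day(INSIGHTS_INTERVAL))   # 12.0
```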

1.3. Troubleshooting issues

A cluster administrator can monitor and troubleshoot the following Red Hat OpenShift Service on AWS component issues:

  • Node issues: A cluster administrator can verify and troubleshoot node-related issues by reviewing the status, resource usage, and configuration of a node. You can query the following:

    • The kubelet’s status on a node.
    • Cluster node journal logs.
  • Operator issues: A cluster administrator can do the following to resolve Operator issues:

    • Verify Operator subscription status.
    • Check Operator pod health.
    • Gather Operator logs.
  • Pod issues: A cluster administrator can troubleshoot pod-related issues by reviewing the status of a pod and completing the following:

    • Review pod and container logs.
    • Start debug pods with root access.
  • Storage issues: A multi-attach storage error occurs when a volume cannot be mounted on a new node because the failed node cannot unmount the attached volume. A cluster administrator can do the following to resolve multi-attach storage issues:

    • Enable multiple attachments by using RWX volumes.
    • Recover or delete the failed node when using an RWO volume.
  • Monitoring issues: A cluster administrator can follow the procedures on the troubleshooting page for monitoring. If the metrics for your user-defined projects are unavailable or if Prometheus is consuming a lot of disk space, do the following:

    • Investigate why user-defined metrics are unavailable.
    • Determine why Prometheus is consuming a lot of disk space.
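
As one possible starting point for the node checks listed above, the following Python sketch reads a node list in the JSON form that `oc get nodes -o json` returns and reports the nodes whose Ready condition is not "True". This is a minimal sketch, not a documented procedure: the function names, the sample data, and the node names worker-a and worker-b are hypothetical, and the sketch assumes that a node without a Ready condition should be reported as not ready.

```python
import json
import subprocess
from typing import List


def not_ready_nodes(node_list: dict) -> List[str]:
    """Return the names of nodes whose Ready condition is not "True"."""
    names = []
    for node in node_list.get("items", []):
        conditions = node.get("status", {}).get("conditions", [])
        ready = next((c for c in conditions if c.get("type") == "Ready"), None)
        # Assumption of this sketch: a missing Ready condition counts as not ready.
        if ready is None or ready.get("status") != "True":
            names.append(node["metadata"]["name"])
    return names


def fetch_nodes() -> dict:
    """Run `oc get nodes -o json`. Requires a logged-in oc session."""
    result = subprocess.run(
        ["oc", "get", "nodes", "-o", "json"],
        check=True, capture_output=True, text=True,
    )
    return json.loads(result.stdout)


# Hypothetical sample data in the same shape as the command output.
SAMPLE = {
    "items": [
        {"metadata": {"name": "worker-a"},
         "status": {"conditions": [{"type": "Ready", "status": "True"}]}},
        {"metadata": {"name": "worker-b"},
         "status": {"conditions": [{"type": "Ready", "status": "Unknown"}]}},
    ]
}

print(not_ready_nodes(SAMPLE))  # ['worker-b']
```

On a live cluster, pass the result of fetch_nodes() to not_ready_nodes() instead of SAMPLE. The nodes that are reported are candidates for a closer look at kubelet status and node journal logs.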