Monitoring your OpenShift cluster health with Insights Advisor
Using Insights advisor service to monitor your OpenShift cluster infrastructure
Abstract
Chapter 1. About the Red Hat Insights for OpenShift advisor service
You can use the Insights for OpenShift advisor service to assess and monitor the health of your OpenShift clusters. Whether you are concerned with individual clusters or with your whole infrastructure, it is important to be aware of your exposure to issues that can affect service availability, fault tolerance, performance, or security.
Insights repeatedly analyzes data collected by Insights Operator against a database of recommendations, which are sets of conditions that can leave your clusters at risk. You can then perform the following actions in the Red Hat Hybrid Cloud Console:
- See clusters impacted by a specific recommendation.
- Use robust filtering capabilities to refine your results.
- Learn more about individual recommendations, details about the risks they present, and get resolutions tailored to your individual clusters.
- Share results with other stakeholders.
To use the advisor service, your cluster must be registered to Red Hat OpenShift Cluster Manager. To register a disconnected cluster, see Registering OpenShift Container Platform clusters to OpenShift Cluster Manager.
1.1. Understanding advisor service recommendations
The Insights for OpenShift advisor service bundles information about various cluster states and component configurations that can negatively affect the service availability, fault tolerance, performance, or security of your clusters. This information set is called a recommendation in advisor service and includes the following information:
- Name: A concise description of the recommendation
- Added: When the recommendation was published to the Insights archive
- Category: Whether the issue has the potential to negatively affect service availability, fault tolerance, performance, or security
- Total risk: A value derived from the likelihood that the condition will negatively affect your infrastructure, and the impact on operations if that were to happen
- Clusters: A list of clusters on which a recommendation is detected
- Link to associated topics: More information from Red Hat about the issue
Chapter 2. Using the Insights for OpenShift advisor service
The Insights advisor service repeatedly analyzes the data collected by the Insights Operator. You can view and manage reports showing advisor results for your OpenShift cluster in the advisor service GUI on the Red Hat Hybrid Cloud Console.
2.1. Displaying potential issues with your cluster
This section describes how to view advisor results in the Insights for OpenShift GUI.
Note that Insights repeatedly analyzes the data that the Insights Operator collects from your cluster and shows the latest results in the Insights for OpenShift advisor service. These results can change, for example, if you fix an issue or a new issue is detected.
Prerequisites
- Your cluster is registered with OpenShift Cluster Manager.
- Remote health reporting is enabled, which is the default.
- You are logged in to the Hybrid Cloud Console.
Procedure
- Navigate to OpenShift > Advisor > Recommendations.
  Depending on the result, the advisor service displays one of the following:
  - No matching recommendations found, if Insights did not identify any issues.
  - A list of issues Insights has detected, grouped by risk (low, moderate, important, and critical).
  - No clusters yet, if Insights has not yet analyzed the cluster. The analysis starts shortly after the cluster has been installed, registered, and connected to the internet.
- If any issues are displayed, click the > icon in front of the entry for more details.
  Depending on the issue, the details can also contain a link to more information from Red Hat about the issue.
2.2. Displaying all Insights for OpenShift advisor service recommendations
The Recommendations view, by default, only displays the recommendations that are detected on your clusters. However, you can view all of the recommendations contained in the Insights for OpenShift archive.
Prerequisites
- Remote health reporting is enabled, which is the default.
- Your cluster is registered with OpenShift Cluster Manager.
- You are logged in to the Hybrid Cloud Console.
Procedure
- Navigate to OpenShift > Advisor > Recommendations.
- Click the X icons next to the Clusters Impacted and Status filters.
  You can now browse through all of the recommendations against which your cluster is scanned.
2.3. Filtering advisor recommendations to see your most critical issues
The Insights for OpenShift advisor service can return a large number of recommendations. To focus on your most critical recommendations, you can apply filters to the advisor service recommendations list to remove low-priority recommendations.
By default, filters are set to show only enabled recommendations that are impacting one or more clusters. To view all recommendations, or only disabled recommendations, in the Insights library, you can customize the filters.
To apply a filter, select a filter type and then set its value based on the options that are available in the drop-down list. You can apply multiple filters to the list of recommendations.
You can set the following filter types:
- Name: Search for a recommendation by name.
- Total risk: Select one or more values from Critical, Important, Moderate, and Low indicating the likelihood and the severity of a negative impact on a cluster.
- Impact: Select one or more values from Critical, High, Medium, and Low indicating the potential impact to the continuity of cluster operations.
- Likelihood: Select one or more values from Critical, High, Medium, and Low indicating the potential for a negative impact to a cluster if the recommendation comes to fruition.
- Category: Select one or more categories from Service Availability, Performance, Fault Tolerance, Security, and Best Practice to focus your attention on.
- Status: Click a radio button to show enabled recommendations (default), disabled recommendations, or all recommendations.
- Clusters impacted: Set the filter to show recommendations currently impacting one or more clusters, non-impacting recommendations, or all recommendations.
- Risk of change: Select one or more values from High, Moderate, Low, and Very low indicating the risk that the implementation of the resolution could have on cluster operations.
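The way several filters combine can be pictured with a short Python sketch. This is only an illustrative model, not the advisor service's implementation: the `Recommendation` class, the `apply_filters` function, and the sample data are all hypothetical, and only a few of the filter types listed above are modeled. The sketch assumes that a recommendation must satisfy every active filter to remain in the list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Recommendation:
    """Hypothetical, simplified stand-in for one advisor recommendation."""
    name: str
    total_risk: str         # "Critical", "Important", "Moderate", or "Low"
    category: str           # for example "Security" or "Performance"
    enabled: bool           # what the Status filter looks at
    clusters_impacted: int  # what the Clusters impacted filter looks at


def apply_filters(recs, *, total_risk=None, category=None,
                  status="enabled", impacting=True):
    """Return the recommendations that match every active filter.

    A filter set to None is inactive. The default arguments mimic the
    default view: enabled recommendations impacting one or more clusters.
    """
    result = []
    for rec in recs:
        if total_risk is not None and rec.total_risk not in total_risk:
            continue
        if category is not None and rec.category not in category:
            continue
        if status == "enabled" and not rec.enabled:
            continue
        if status == "disabled" and rec.enabled:
            continue
        if impacting is True and rec.clusters_impacted < 1:
            continue
        if impacting is False and rec.clusters_impacted > 0:
            continue
        result.append(rec)
    return result


# Hypothetical sample data, not real advisor output.
SAMPLE = [
    Recommendation("Example A", "Critical", "Security", True, 2),
    Recommendation("Example B", "Low", "Performance", True, 1),
    Recommendation("Example C", "Important", "Fault Tolerance", True, 0),
    Recommendation("Example D", "Critical", "Service Availability", False, 3),
]

default_view = apply_filters(SAMPLE)
critical_only = apply_filters(SAMPLE, total_risk={"Critical"})
everything = apply_filters(SAMPLE, status="all", impacting=None)
```

In this model, clearing the Status and Clusters impacted filters corresponds to passing `status="all"` and `impacting=None`, which returns every recommendation in the sample.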
2.3.1. Applying filters to the list of advisor recommendations
As an OpenShift cluster administrator, you can filter the advisor recommendations that are displayed on the recommendations list. By applying filters, you can reduce the number of reported recommendations and concentrate on your highest priority recommendations.
The following procedure demonstrates how to set and remove Category filters; however, the procedure is applicable to any of the filter types and respective values.
Prerequisites
- You are logged in to the Hybrid Cloud Console.
Procedure
- Navigate to OpenShift > Advisor > Recommendations.
- In the main filter-type drop-down list, select the Category filter type.
- Expand the filter-value drop-down list and select the checkbox next to each category of recommendation you want to view. Leave the checkboxes for unnecessary categories clear.
- Optional: Apply additional filters to further refine the list.
Only recommendations from the selected categories are shown in the list.
Verification
- After applying filters, you can view the updated recommendations list. The applied filters are added next to the default filters.
2.3.2. Removing filters from the advisor recommendations list
You can apply multiple filters to the list of recommendations. When ready, you can remove them individually or completely reset them.
Removing filters individually
- Click the X icon next to each filter, including the default filters, to remove them individually.
Removing all non-default filters
- Click Reset filters to remove only the filters that you applied, leaving the default filters in place.
2.4. Disabling advisor recommendations
You can disable specific recommendations that affect your clusters, so that they no longer appear in your reports. It is possible to disable a recommendation for a single cluster or all of your clusters.
Disabling a recommendation for all of your current clusters also disables that recommendation on all future clusters.
Prerequisites
- Remote health reporting is enabled, which is the default.
- Your cluster is registered with Red Hat OpenShift Cluster Manager.
- You are logged in to the Red Hat Hybrid Cloud Console.
Procedure
- Navigate to OpenShift > Advisor > Recommendations.
- To disable the recommendation for a single cluster:
  - Click the name of the recommendation to disable. You are directed to the details page for the recommendation.
  - Click the options icon (⋮) for that cluster, and then click Disable recommendation for cluster.
  - Enter a justification note and click Save.
- To disable the recommendation for all of your clusters:
  - Click the name of the recommendation to disable. You are directed to the details page for the recommendation.
  - Click Actions > Disable recommendation.
  - Enter a justification note and click Save.
Verification
- If you disabled the recommendation for all clusters, go to OpenShift > Advisor > Recommendations and search for the recommendation. If the procedure was successful, you will not see the recommendation in the search result.
- If you disabled the recommendation for an individual cluster, go to the Recommendations page and search for the recommendation. Click the recommendation and search for the cluster ID of the cluster on which it was disabled. If the procedure was successful, you will not see your cluster in the list of clusters affected by the recommendation.
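The difference between the two disable scopes can be pictured with a small Python model. It is a sketch under stated assumptions, not a description of how the advisor service stores this state: the `DisableState` class, its field names, and the recommendation and cluster identifiers are hypothetical. The model assumes that an all-clusters disable is recorded against the recommendation itself, which is one way to explain why it also applies to clusters that are added later.

```python
from dataclasses import dataclass, field


@dataclass
class DisableState:
    """Hypothetical model of which recommendations are disabled, and where."""
    # Recommendation names disabled for every cluster, current and future.
    disabled_for_all: set = field(default_factory=set)
    # (recommendation name, cluster ID) pairs disabled for one cluster only.
    disabled_for_cluster: set = field(default_factory=set)

    def is_reported(self, recommendation: str, cluster_id: str) -> bool:
        """Return True if the recommendation would still appear for the cluster."""
        if recommendation in self.disabled_for_all:
            return False
        return (recommendation, cluster_id) not in self.disabled_for_cluster


state = DisableState()
# Disabled for a single cluster: other clusters still report it.
state.disabled_for_cluster.add(("example-rec-1", "cluster-a"))
# Disabled for all clusters: a cluster registered later does not report it either.
state.disabled_for_all.add("example-rec-2")
```

The two verification checks above correspond to calling `is_reported` with the disabled recommendation and either any cluster ID (all-clusters scope) or the specific cluster ID (single-cluster scope).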
2.5. Enabling a previously disabled advisor recommendation
When an Insights advisor recommendation is disabled for all clusters, you will no longer see it on the list of recommendations. You can reenable the recommendation so that your clusters are scanned for it.
Prerequisites
- Remote health reporting is enabled, which is the default.
- Your cluster is registered with Red Hat OpenShift Cluster Manager.
- You are logged in to the Red Hat Hybrid Cloud Console.
Procedure
- Navigate to OpenShift > Advisor > Recommendations.
- Filter the recommendations by Status > Disabled.
- Locate the recommendation to enable.
- Click the options icon (⋮), and then click Enable recommendation.
2.6. Displaying the Insights advisor status in the OpenShift web console
Red Hat Insights for OpenShift repeatedly analyzes your cluster. You can display the status of the potential issues identified in your cluster in the Red Hat OpenShift Container Platform web console. This status shows the number of issues in the different categories and links to the reports in the Red Hat Hybrid Cloud Console for further details.
Prerequisites
- Your cluster is registered with OpenShift Cluster Manager.
- Remote health reporting is enabled, which is the default.
- You are logged in to the OpenShift Container Platform web console.
Procedure
- Navigate to Home → Overview in the OpenShift Container Platform web console.
- Click Insights on the Status card.
  The pop-up window lists potential issues grouped by risk. Click the individual categories or View all recommendations in Insights ADVISOR to display more details.
2.7. Using update-risk assessment to identify and mitigate cluster-update risks
Update-risk assessment uses machine learning developed in collaboration with IBM Research to compare the recent state of the cluster with conditions known to cause updates to fail, including failing operator conditions, alerts, and other metrics.
The advisor service identifies clusters with update risks, and also shows you alerts and components that need attention. You can use the update-risk feature to generate a checklist of issues to fix before beginning a cluster update.
Complete the following steps to view any update risks in your clusters.
Prerequisites
- The cluster is connected to the Red Hat Hybrid Cloud Console. For instructions, see Enabling remote health reporting in the OpenShift Remote Health Monitoring documentation.
- The cluster has sent data to Red Hat within the last two hours.
- You are logged in to the Red Hat Hybrid Cloud Console.
Procedure
- Navigate to the Advisor clusters page in the Hybrid Cloud Console.
- Select a cluster with the Update risk label. The label is visible with the cluster name or ID.
- Click the cluster name to open the cluster details page.
  - If no risks exist for the cluster, a banner displays the message, "No known update risks identified for this cluster."
  - If the cluster has not checked in for more than two hours, the banner message says "Warning alert: Update risks are not currently available. This cluster has gone more than two hours without sending metrics. Check the cluster’s web console if you think that this is incorrect."
- Click the Update risks tab.
  - View alerts or cluster operator risks for the cluster.
  - Click an alert to open the Alert details page in the Red Hat OpenShift web console.
  - Click a cluster operator to open the in-cluster ClusterOperator details page in the Red Hat OpenShift web console.
For more information about alerts, see OpenShift documentation, Getting information about alerts, silences, and alerting rules.
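The banner behavior described in the procedure follows a simple decision order, which the following Python sketch illustrates. It is a hypothetical model, not the console's code: the function name `update_risk_banner`, its parameters, and the third message (shown when risks exist) are assumptions made for illustration. Only the two-hour threshold and the first two messages come from the text above.

```python
from datetime import datetime, timedelta, timezone

# The cluster must have sent data within the last two hours.
MAX_AGE = timedelta(hours=2)


def update_risk_banner(last_check_in, risks, now=None):
    """Pick the banner text for a cluster (hypothetical helper).

    last_check_in: timezone-aware datetime of the most recent data upload.
    risks: list of detected alert or cluster operator risks.
    """
    now = now or datetime.now(timezone.utc)
    # Stale data is checked first: without recent metrics, risks are unknown.
    if now - last_check_in > MAX_AGE:
        return "Update risks are not currently available."
    if not risks:
        return "No known update risks identified for this cluster."
    # Assumed wording; the source does not quote a message for this case.
    return f"{len(risks)} update risk(s) need attention."
```

The staleness check comes first in this sketch because an empty risk list from a cluster that has stopped reporting does not mean the cluster is safe to update.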
2.8. Identifying workload recommendations for namespaces in your clusters
The Insights for OpenShift advisor service compiles workload recommendations for specific objects in namespaces in clusters and outputs them to OpenShift > Advisor > Workloads. Workload recommendations are best practices for cloud-native Kubernetes applications and highlight misconfigurations within specific namespaces in managed and on-premise clusters in your organization.
Workload recommendations are based on data provided by the Deployment Validation Operator (DVO), which executes KubeLinter checks.
If you are a cluster administrator, you can view workload recommendation details in OpenShift > Advisor > Workloads, including the cluster, namespace, and affected object IDs, and the steps to resolve the issue. As a cluster administrator, you can also either fix the issue or inform developers who manage the namespaces so that they can apply a resolution.
2.8.1. Viewing workload recommendations for namespaces in your cluster
You can view and gather information about recommendations for specific objects in namespaces in your clusters by viewing them in the Workloads view of the Red Hat Hybrid Cloud Console. Depending on whether you have a managed or self-managed cluster, you will see the following:
- For self-managed clusters: cluster name and namespace name
- For managed clusters: cluster name and namespace UUID
Prerequisites
- You have access as a cluster administrator to at least one cluster in your infrastructure.
- You have logged in to the Red Hat Hybrid Cloud Console.
Procedure
- Go to OpenShift > Advisor > Workloads.
- Click a namespace (identified by cluster and namespace) to display the namespace details page, which includes recommendations based on the Deployment Validation Operator (DVO) checks for this namespace.
  Note: You can use the filtering function to search by cluster name or namespace name.
- Click the arrow next to a recommendation to see the following information about the issue:
  - A description of the detected issue
  - Steps to resolve the issue on affected objects in the namespace
  - Identifiers for affected objects in the namespace
  - Additional resources and information
- Click View all objects to view a list of all of the objects affected by the recommendation in this namespace.
You can use this information to work with your developers to resolve issues, as needed. In some cases, you might find that your organization’s policies are different from a recommendation and the proposed solution to resolve it. In these cases, you can exclude objects from workload recommendations.
2.8.2. Excluding objects from workload recommendations in your clusters
After viewing workload recommendations, you might decide that some of the recommendations for specific objects in a cluster are not aligned with your organization’s policies or cluster management workflow. You can exclude these recommendations from view by excluding specific objects from the Deployment Validation Operator (DVO) checks. For example, you might receive a recommendation to run containers as non-root because some containers in your organization run as root. You can change your configuration to turn off the recommendation and exclude the specified workload object from future Deployment Validation Operator (DVO) checks.
Prerequisites
- You have OCP Advisor administrator permissions to a cluster.
- You have permission to edit the namespace.
- You have logged in to the Red Hat Hybrid Cloud Console.
- You have identified a cluster with a recommendation and have at least one object in Workloads that you want to exclude.
Procedure
- Navigate to OpenShift > Advisor > Workloads.
- Click a namespace that relates to the recommendation that includes the object you want to exclude. A namespace is identified by cluster and namespace.
  Note: You can use the filtering function to search by the following:
  - For self-managed clusters: cluster name and namespace name
  - For managed clusters: cluster name and namespace UUID
- Expand Description on the details page to see all issues or click the arrow next to a recommendation to see the following information about the issue:
  - A description of the detected issue
  - Steps to resolve the issue on affected objects in the namespace
  - Identifiers for affected objects in the namespace
  - Additional resources and information
- Note the name of the KubeLinter check, for example run-as-non-root or non-isolated-pod.
- In the Steps to resolve section, click View all objects to display a list of all of the objects affected by the recommendation in this namespace.
- Note the Name and Kind values for the workload object you want to exclude from the Deployment Validation Operator’s (DVO) KubeLinter checks. You will need this information to change the annotation for the object.
- Use Ignoring specific resources in the GitHub resource for Deployment Validation Operator (DVO) to annotate the object based on your organization’s workflow. For example, to exclude the nginx deployment object in the project named test-dvo from the KubeLinter check run-as-non-root, you would add the following annotation to the object:
  ignore-check.kube-linter.io/run-as-non-root: "This image must be run as a privileged user for it to work."
- It takes approximately two hours for the Insights Operator to update the recommendations. After the update completes, the recommendation is no longer shown for the annotated object.
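As an illustration of the annotation step, the following Python sketch builds a patch body that adds the ignore annotation to an object's metadata. It is a minimal sketch, not a procedure from the Deployment Validation Operator documentation: the helper name `ignore_check_patch` is hypothetical, and the sketch assumes that the annotation belongs under the object's `metadata.annotations`. The check name, object, project, and reason text are taken from the example above.

```python
import json

CHECK = "run-as-non-root"
REASON = "This image must be run as a privileged user for it to work."


def ignore_check_patch(check, reason):
    """Build a merge-patch body that adds a KubeLinter ignore annotation.

    Hypothetical helper: assumes the annotation is set on the object's
    own metadata, keyed by ignore-check.kube-linter.io/<check name>.
    """
    return {
        "metadata": {
            "annotations": {
                f"ignore-check.kube-linter.io/{check}": reason,
            }
        }
    }


patch = ignore_check_patch(CHECK, REASON)
# Serialize the patch so it can be reviewed or passed to a CLI tool.
patch_json = json.dumps(patch)
print(patch_json)
```

One possible way to apply the printed JSON, if it suits your workflow, is a merge patch with the `oc` client, for example `oc patch deployment nginx -n test-dvo --type merge -p '<printed JSON>'`. Editing the object's YAML directly achieves the same result.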
Verification
- Wait approximately two hours or until the Insights Operator has updated the recommendations, and then log in as a cluster administrator and check that the object is excluded from the workload recommendation.
2.9. Using the Deployment Validation Operator in your Red Hat Insights for OpenShift workflow
The Deployment Validation Operator validates on-premise and managed clusters against a curated collection of KubeLinter checks. These checks implement best practices for Kubernetes-native workloads, helping to ensure that applications are optimized for the operational stability of the cluster.
The Insights Operator gathers Deployment Validation Operator checks every two hours by default and presents data in the Insights advisor service. If the Deployment Validation Operator detects issues, cluster administrators can view resolutions in the advisor service. If the Deployment Validation Operator detects no issues, no results are visible in the advisor service.
Curated KubeLinter checks
The Deployment Validation Operator runs a curated, limited subset of the available KubeLinter checks rather than the whole list. Insights advisor service recommendations do not exist for all available KubeLinter checks.
Supported OpenShift Container Platform versions
All Red Hat-supported versions of Red Hat OpenShift Container Platform, OpenShift Dedicated, and Red Hat OpenShift Service on AWS support the Deployment Validation Operator.
With a managed OpenShift cluster, the Deployment Validation Operator is installed and operational by default. For an on-premise OpenShift cluster, a cluster administrator must download and install the Deployment Validation Operator from OperatorHub and can change the default list of checks.
2.9.1. The Deployment Validation Operator on managed OpenShift clusters
The Deployment Validation Operator is already installed and operational on managed OpenShift clusters. This includes clusters on OpenShift Dedicated and Red Hat OpenShift Service on AWS.
Deployment Validation Operator configuration
The Deployment Validation Operator for managed clusters comes preconfigured. The Deployment Validation Operator configuration file contains the default, curated set of KubeLinter checks and is not editable.
Deployment Validation Operator updates
On managed clusters, the Deployment Validation Operator updates automatically.
2.9.2. The Deployment Validation Operator on on-premise OpenShift clusters
Administrators of on-premise clusters must install the Deployment Validation Operator from OperatorHub in the OpenShift web console. On-premise cluster administrators can also configure the default set of KubeLinter checks.
The advisor does not have recommendations for all of the checks that KubeLinter has available.
2.9.2.1. Installing the Deployment Validation Operator on on-premise OpenShift clusters
You can find the Deployment Validation Operator in OperatorHub and install it from there.
Prerequisites
- You are logged in to the Red Hat OpenShift web console as a cluster-admin.
Procedure
- Navigate to Red Hat OpenShift web console > Operators > OperatorHub.
- In the Search box, start typing "deployment-validation-operator".
- Click the Deployment Validation Operator card when it appears.
- The Deployment Validation Operator card displays information about capabilities, configuration, version, and GitHub source files. When you are ready to install the Operator, click Install.
- Choose the namespace or use the default.
- Click Install to install the Operator. You can confirm the installation in the Installed Operators view. The Deployment Validation Operator is also visible in the Pods and Deployments views for the cluster in the corresponding namespace.
2.9.2.2. Configuring the list of Deployment Validation Operator checks on on-premise clusters
If you are an administrator of an on-premise OpenShift cluster, you can change the default list of Deployment Validation Operator checks to focus on specific best practices of interest. For more information, see Configuring Checks, in the Deployment Validation Operator GitHub repository.
2.9.2.3. Updating the Deployment Validation Operator on on-premise clusters
During installation from OperatorHub, you can set the Deployment Validation Operator to update automatically.
2.9.3. Viewing Deployment Validation Operator results in the Insights for OpenShift advisor service
If the Deployment Validation Operator detects issues, the Overview page for the cluster in the OpenShift web console shows an Insights link with the number of detected issues in the Status information block. Click the link to learn more about the issue and how to resolve it in the advisor service in the Red Hat Hybrid Cloud Console.
If the latest Deployment Validation Operator checks return no issues, you will not see any issues in Insights.
Prerequisites
- You are logged in to the Red Hat Hybrid Cloud Console.
Procedure
- Go to the Overview page for the cluster in the OpenShift web console.
- Look for the Status block, an Insights link, and the number of detected issues.
- If one or more issues exist, click the Insights link to open the advisor service in the Hybrid Cloud Console.
  The cluster information page is displayed. On the Recommendations tab, you can see the issues detected for the cluster.
- Click the arrow to view complete information about the issue, including the necessary actions to resolve it.
2.9.4. Viewing the default list of Deployment Validation Operator checks
The Deployment Validation Operator checks your cluster against a default list of checks. You can view the list of DVO checks in GitHub.
Administrators of managed clusters cannot modify the list of default checks, but can view them in GitHub.
Providing feedback on Red Hat documentation
We appreciate and prioritize your feedback regarding our documentation. Provide as much detail as possible, so that your request can be quickly addressed.
Prerequisites
- You are logged in to the Red Hat Customer Portal.
Procedure
To provide feedback, perform the following steps:
- Click the following link: Create Issue
- Describe the issue or enhancement in the Summary text box.
- Provide details about the issue or requested enhancement in the Description text box.
- Type your name in the Reporter text box.
- Click the Create button.
This action creates a documentation ticket and routes it to the appropriate documentation team. Thank you for taking the time to provide feedback.