Chapter 4. Remote health monitoring with connected clusters
4.1. About remote health monitoring
OpenShift Container Platform collects telemetry and configuration data about your cluster and reports it to Red Hat by using the Telemeter Client and the Insights Operator. The data that is provided to Red Hat enables the benefits outlined in this document.
A cluster that reports data to Red Hat through Telemetry and the Insights Operator is considered a connected cluster.
Telemetry is the term that Red Hat uses to describe the information being sent to Red Hat by the OpenShift Container Platform Telemeter Client. Lightweight attributes are sent from connected clusters to Red Hat to enable subscription management automation, monitor the health of clusters, assist with support, and improve customer experience.
The Insights Operator gathers OpenShift Container Platform configuration data and sends it to Red Hat. The data is used to produce insights about potential issues that a cluster might be exposed to. These insights are communicated to cluster administrators on OpenShift Cluster Manager.
More information is provided in this document about these two processes.
Telemetry and Insights Operator benefits
Telemetry and the Insights Operator enable the following benefits for end-users:
- Enhanced identification and resolution of issues. Events that might seem normal to an end-user can be observed by Red Hat from a broader perspective across a fleet of clusters. Some issues can be more rapidly identified from this point of view and resolved without an end-user needing to open a support case or file a Jira issue.
- Advanced release management. OpenShift Container Platform offers the candidate, fast, and stable release channels, which enable you to choose an update strategy. The graduation of a release from fast to stable is dependent on the success rate of updates and on the events seen during upgrades. With the information provided by connected clusters, Red Hat can improve the quality of releases to stable channels and react more rapidly to issues found in the fast channels.
- Targeted prioritization of new features and functionality. The data collected provides insights about which areas of OpenShift Container Platform are used most. With this information, Red Hat can focus on developing the new features and functionality that have the greatest impact for our customers.
- A streamlined support experience. You can provide a cluster ID for a connected cluster when creating a support ticket on the Red Hat Customer Portal. This enables Red Hat to deliver a streamlined support experience that is specific to your cluster, by using the connected information. This document provides more information about that enhanced support experience.
- Predictive analytics. The insights displayed for your cluster on OpenShift Cluster Manager are enabled by the information collected from connected clusters. Red Hat is investing in applying deep learning, machine learning, and artificial intelligence automation to help identify issues that OpenShift Container Platform clusters are exposed to.
4.1.1. About Telemetry
Telemetry sends a carefully chosen subset of the cluster monitoring metrics to Red Hat. The Telemeter Client fetches the metrics values every four minutes and thirty seconds and uploads the data to Red Hat. These metrics are described in this document.
This stream of data is used by Red Hat to monitor the clusters in real time and to react as necessary to problems that impact our customers. It also allows Red Hat to roll out OpenShift Container Platform upgrades to customers to minimize service impact and continuously improve the upgrade experience.
This debugging information is available to Red Hat Support and Engineering teams with the same restrictions as accessing data reported through support cases. All connected cluster information is used by Red Hat to help make OpenShift Container Platform better and more intuitive to use.
Additional resources
- See the OpenShift Container Platform update documentation for more information about updating or upgrading a cluster.
4.1.1.1. Information collected by Telemetry
The following information is collected by Telemetry:
4.1.1.1.1. System information
- Version information, including the OpenShift Container Platform cluster version and installed update details that are used to determine update version availability
- Update information, including the number of updates available per cluster, the channel and image repository used for an update, update progress information, and the number of errors that occur in an update
- The unique random identifier that is generated during an installation
- Configuration details that help Red Hat Support to provide beneficial support for customers, including node configuration at the cloud infrastructure level, hostnames, IP addresses, Kubernetes pod names, namespaces, and services
- The OpenShift Container Platform framework components installed in a cluster and their condition and status
- Events for all namespaces listed as "related objects" for a degraded Operator
- Information about degraded software
- Information about the validity of certificates
- The name of the provider platform that OpenShift Container Platform is deployed on and the data center location
4.1.1.1.2. Sizing information
- Sizing information about clusters, machine types, and machines, including the number of CPU cores and the amount of RAM used for each
- The number of etcd members and the number of objects stored in the etcd cluster
- Number of application builds by build strategy type
4.1.1.1.3. Usage information
- Usage information about components, features, and extensions
- Usage details about Technology Previews and unsupported configurations
Telemetry does not collect identifying information such as usernames or passwords. Red Hat does not intend to collect personal information. If Red Hat discovers that personal information has been inadvertently received, Red Hat will delete such information. To the extent that any telemetry data constitutes personal data, please refer to the Red Hat Privacy Statement for more information about Red Hat’s privacy practices.
Additional resources
- See Showing data collected by Telemetry for details about how to list the attributes that Telemetry gathers from Prometheus in OpenShift Container Platform.
- See the upstream cluster-monitoring-operator source code for a list of the attributes that Telemetry gathers from Prometheus.
- Telemetry is installed and enabled by default. If you need to opt out of remote health reporting, see Opting out of remote health reporting.
4.1.2. About the Insights Operator
The Insights Operator periodically gathers configuration and component failure status and, by default, reports that data every two hours to Red Hat. This information enables Red Hat to assess configuration and deeper failure data than is reported through Telemetry.
Users of OpenShift Container Platform can display the report of each cluster in the Insights Advisor service on Red Hat Hybrid Cloud Console. If any issues have been identified, Insights provides further details and, if available, steps on how to solve a problem.
The Insights Operator does not collect identifying information, such as user names, passwords, or certificates. See Red Hat Insights Data & Application Security for information about Red Hat Insights data collection and controls.
Red Hat uses all connected cluster information to:
- Identify potential cluster issues and provide a solution and preventive actions in the Insights Advisor service on Red Hat Hybrid Cloud Console
- Improve OpenShift Container Platform by providing aggregated and critical information to product and support teams
- Make OpenShift Container Platform more intuitive
Additional resources
- The Insights Operator is installed and enabled by default. If you need to opt out of remote health reporting, see Opting out of remote health reporting.
4.1.2.1. Information collected by the Insights Operator
The following information is collected by the Insights Operator:
- General information about your cluster and its components to identify issues that are specific to your OpenShift Container Platform version and environment
- Configuration files, such as the image registry configuration, of your cluster to determine incorrect settings and issues that are specific to parameters you set
- Errors that occur in the cluster components
- Progress information of running updates, and the status of any component upgrades
- Details of the platform that OpenShift Container Platform is deployed on, such as Amazon Web Services, and the region that the cluster is located in
- Cluster workload information transformed into discrete Secure Hash Algorithm (SHA) values, which allows Red Hat to assess workloads for security and version vulnerabilities without disclosing sensitive details
- If an Operator reports an issue, information is collected about core OpenShift Container Platform pods in the openshift-* and kube-* projects. This includes state, resource, security context, volume information, and more.
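The list above mentions workload information being transformed into SHA values. The exact scheme that the Insights Operator applies is not described in this document, so the following is only a minimal sketch of the general idea: a one-way digest replaces a sensitive string, yet identical inputs still map to identical digests and can be compared. The function name `anonymize` and the choice of SHA-256 are assumptions made for illustration.

```python
import hashlib


def anonymize(value: str) -> str:
    """Return the SHA-256 hex digest of a string (illustrative only)."""
    # The digest does not reveal the input, but equal inputs give equal digests.
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    # Hypothetical image reference, used only to show the transformation.
    print(anonymize("registry.example.com/team/app:1.0"))
```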
Additional resources
- See Showing data collected by the Insights Operator for details about how to review the data that is collected by the Insights Operator.
- The Insights Operator source code is available for review and contribution. See the Insights Operator upstream project for a list of the items collected by the Insights Operator.
4.1.3. Understanding Telemetry and Insights Operator data flow
The Telemeter Client collects selected time series data from the Prometheus API. The time series data is uploaded to api.openshift.com every four minutes and thirty seconds for processing.
The Insights Operator gathers selected data from the Kubernetes API and the Prometheus API into an archive. The archive is uploaded to OpenShift Cluster Manager every two hours for processing. The Insights Operator also downloads the latest Insights analysis from OpenShift Cluster Manager. This is used to populate the Insights status pop-up that is included in the Overview page in the OpenShift Container Platform web console.
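To put the two intervals in perspective, the following short sketch works out how many uploads each component makes in 24 hours. It uses only the intervals stated above; the helper name `uploads_per_day` is hypothetical.

```python
from datetime import timedelta

# Intervals as stated in the text above.
TELEMETRY_INTERVAL = timedelta(minutes=4, seconds=30)
INSIGHTS_INTERVAL = timedelta(hours=2)


def uploads_per_day(interval: timedelta) -> float:
    """Return how many uploads fit into 24 hours at the given interval."""
    return timedelta(days=1) / interval


if __name__ == "__main__":
    print(f"Telemeter Client: {uploads_per_day(TELEMETRY_INTERVAL):.0f} uploads per day")
    print(f"Insights Operator: {uploads_per_day(INSIGHTS_INTERVAL):.0f} uploads per day")
```

At these intervals the Telemeter Client uploads 320 times per day and the Insights Operator 12 times per day.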
All of the communication with Red Hat occurs over encrypted channels by using Transport Layer Security (TLS) and mutual certificate authentication. All of the data is encrypted in transit and at rest.
Access to the systems that handle customer data is controlled through multi-factor authentication and strict authorization controls. Access is granted on a need-to-know basis and is limited to required operations.
Figure: Telemetry and Insights Operator data flow
Additional resources
- See Monitoring overview for more information about the OpenShift Container Platform monitoring stack.
- See Configuring your firewall for details about configuring a firewall and enabling endpoints for Telemetry and Insights.
4.1.4. Additional details about how remote health monitoring data is used
The information collected to enable remote health monitoring is detailed in Information collected by Telemetry and Information collected by the Insights Operator.
As further described in the preceding sections of this document, Red Hat collects data about your use of the Red Hat Product(s) for purposes such as providing support and upgrades, optimizing performance or configuration, minimizing service impacts, identifying and remediating threats, troubleshooting, improving the offerings and user experience, responding to issues, and for billing purposes if applicable.
Collection safeguards
Red Hat employs technical and organizational measures designed to protect the telemetry and configuration data.
Sharing
Red Hat may share the data collected through Telemetry and the Insights Operator internally within Red Hat to improve your user experience. Red Hat may share telemetry and configuration data with its business partners in an aggregated form that does not identify customers to help the partners better understand their markets and their customers’ use of Red Hat offerings or to ensure the successful integration of products jointly supported by those partners.
Third parties
Red Hat may engage certain third parties to assist in the collection, analysis, and storage of the Telemetry and configuration data.
User control / enabling and disabling telemetry and configuration data collection
You may disable OpenShift Container Platform Telemetry and the Insights Operator by following the instructions in Opting out of remote health reporting.
4.2. Showing data collected by remote health monitoring
As an administrator, you can review the metrics collected by Telemetry and the Insights Operator.
4.2.1. Showing data collected by Telemetry
You can see the cluster and components time series data captured by Telemetry.
Prerequisites
- Install the OpenShift CLI (oc).
- You must log in to the cluster with a user that has either the cluster-admin role or the cluster-monitoring-view role.
Procedure
Find the URL for the Prometheus service that runs in the OpenShift Container Platform cluster:
$ oc get route prometheus-k8s -n openshift-monitoring -o jsonpath="{.spec.host}"
- Navigate to the URL.
Enter this query in the Expression input box and press Execute:
{__name__=~"cluster:usage:.*|count:up0|count:up1|cluster_version|cluster_version_available_updates|cluster_operator_up|cluster_operator_conditions|cluster_version_payload|cluster_installer|cluster_infrastructure_provider|cluster_feature_set|instance:etcd_object_counts:sum|ALERTS|code:apiserver_request_total:rate:sum|cluster:capacity_cpu_cores:sum|cluster:capacity_memory_bytes:sum|cluster:cpu_usage_cores:sum|cluster:memory_usage_bytes:sum|openshift:cpu_usage_cores:sum|openshift:memory_usage_bytes:sum|workload:cpu_usage_cores:sum|workload:memory_usage_bytes:sum|cluster:virt_platform_nodes:sum|cluster:node_instance_type_count:sum|cnv:vmi_status_running:count|cluster:vmi_request_cpu_cores:sum|node_role_os_version_machine:cpu_capacity_cores:sum|node_role_os_version_machine:cpu_capacity_sockets:sum|subscription_sync_total|olm_resolution_duration_seconds|csv_succeeded|csv_abnormal|cluster:kube_persistentvolumeclaim_resource_requests_storage_bytes:provisioner:sum|cluster:kubelet_volume_stats_used_bytes:provisioner:sum|ceph_cluster_total_bytes|ceph_cluster_total_used_raw_bytes|ceph_health_status|job:ceph_osd_metadata:count|job:kube_pv:count|job:ceph_pools_iops:total|job:ceph_pools_iops_bytes:total|job:ceph_versions_running:count|job:noobaa_total_unhealthy_buckets:sum|job:noobaa_bucket_count:sum|job:noobaa_total_object_count:sum|noobaa_accounts_num|noobaa_total_usage|console_url|cluster:network_attachment_definition_instances:max|cluster:network_attachment_definition_enabled_instance_up:max|cluster:ingress_controller_aws_nlb_active:sum|insightsclient_request_send_total|cam_app_workload_migrations|cluster:apiserver_current_inflight_requests:sum:max_over_time:2m|cluster:alertmanager_integrations:max|cluster:telemetry_selected_series:count|openshift:prometheus_tsdb_head_series:sum|openshift:prometheus_tsdb_head_samples_appended_total:sum|monitoring:container_memory_working_set_bytes:sum|namespace_job:scrape_series_added:topk3_sum1h|namespace_job:scrape_samples_post_metric_relabeling:topk3|monitoring:haproxy_server_http_responses_total:sum|rhmi_status|cluster_legacy_scheduler_policy|cluster_master_schedulable|che_workspace_status|che_workspace_started_total|che_workspace_failure_total|che_workspace_start_time_seconds_sum|che_workspace_start_time_seconds_count|cco_credentials_mode|cluster:kube_persistentvolume_plugin_type_counts:sum|visual_web_terminal_sessions_total|acm_managed_cluster_info|cluster:vsphere_vcenter_info:sum|cluster:vsphere_esxi_version_total:sum|cluster:vsphere_node_hw_version_total:sum|openshift:build_by_strategy:sum|rhods_aggregate_availability|rhods_total_users|instance:etcd_disk_wal_fsync_duration_seconds:histogram_quantile|instance:etcd_mvcc_db_total_size_in_bytes:sum|instance:etcd_network_peer_round_trip_time_seconds:histogram_quantile|instance:etcd_mvcc_db_total_size_in_use_in_bytes:sum|instance:etcd_disk_backend_commit_duration_seconds:histogram_quantile|jaeger_operator_instances_storage_types|jaeger_operator_instances_strategies|jaeger_operator_instances_agent_strategies|appsvcs:cores_by_product:sum|nto_custom_profiles:count|openshift_csi_share_configmap|openshift_csi_share_secret|openshift_csi_share_mount_failures_total|openshift_csi_share_mount_requests_total",alertstate=~"firing|"}
This query replicates the request that Telemetry makes against a running OpenShift Container Platform cluster’s Prometheus service and returns the full set of time series captured by Telemetry.
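The query above is a single PromQL selector: a regular expression over the metric name, plus an alertstate matcher that accepts series that are firing or that have no alertstate label. The following minimal sketch shows how such a selector could be assembled from a list of metric names and turned into a URL for the standard Prometheus HTTP API instant-query endpoint (/api/v1/query). The function names and the example host are hypothetical, and the sketch only builds the strings. Authentication against the route is not shown.

```python
from urllib.parse import urlencode


def build_selector(metric_names: list[str]) -> str:
    """Join metric names into one PromQL selector on the __name__ label."""
    pattern = "|".join(metric_names)
    # alertstate=~"firing|" matches alertstate="firing" and series without the label.
    return '{__name__=~"' + pattern + '",alertstate=~"firing|"}'


def build_query_url(prometheus_host: str, selector: str) -> str:
    """Build an instant-query URL for the Prometheus HTTP API."""
    return f"https://{prometheus_host}/api/v1/query?" + urlencode({"query": selector})


if __name__ == "__main__":
    # A shortened list, taken from the full query above.
    selector = build_selector(["cluster_version", "cluster_operator_up", "ALERTS"])
    print(selector)
    # Hypothetical host; use the value returned by the oc get route command.
    print(build_query_url("prometheus-k8s-openshift-monitoring.apps.example.com", selector))
```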
4.2.2. Showing data collected by the Insights Operator
You can review the data that is collected by the Insights Operator.
Prerequisites
- Access to the cluster as a user with the cluster-admin role.
Procedure
Find the name of the currently running pod for the Insights Operator:
$ INSIGHTS_OPERATOR_POD=$(oc get pods --namespace=openshift-insights -o custom-columns=:metadata.name --no-headers --field-selector=status.phase=Running)
Copy the recent data archives collected by the Insights Operator:
$ oc cp openshift-insights/$INSIGHTS_OPERATOR_POD:/var/lib/insights-operator ./insights-data
The recent Insights Operator archives are now available in the insights-data directory.
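After copying the data, you might want a quick overview of what each archive contains before opening individual files. The following is a minimal sketch that assumes the copied archives are gzip-compressed tar files with a .tar.gz extension; verify this against the files in your insights-data directory. The function name `list_archive_members` is hypothetical.

```python
import tarfile
from pathlib import Path


def list_archive_members(data_dir: str) -> dict[str, list[str]]:
    """Map each .tar.gz archive under data_dir to the file names it contains."""
    contents = {}
    for archive in sorted(Path(data_dir).rglob("*.tar.gz")):
        with tarfile.open(archive, "r:gz") as tar:
            # Keep regular files only; directories are skipped.
            contents[archive.name] = sorted(
                member.name for member in tar.getmembers() if member.isfile()
            )
    return contents


if __name__ == "__main__":
    for name, members in list_archive_members("./insights-data").items():
        print(f"{name}: {len(members)} files")
```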
4.3. Opting out of remote health reporting
You may choose to opt out of reporting health and usage data for your cluster.
To opt out of remote health reporting, you must:
- Modify the global cluster pull secret to disable remote health reporting.
- Update the cluster to use this modified pull secret.
4.3.1. Consequences of disabling remote health reporting
In OpenShift Container Platform, customers can opt out of reporting usage information. However, connected clusters allow Red Hat to react more quickly to problems and better support our customers, as well as better understand how product upgrades impact clusters. Connected clusters also help to simplify the subscription and entitlement process and enable the OpenShift Cluster Manager service to provide an overview of your clusters and their subscription status.
Red Hat strongly recommends leaving health and usage reporting enabled for pre-production and test clusters even if it is necessary to opt out for production clusters. This allows Red Hat to be a participant in qualifying OpenShift Container Platform in your environments and react more rapidly to product issues.
Some of the consequences of opting out of having a connected cluster are:
- Red Hat will not be able to monitor the success of product upgrades or the health of your clusters without a support case being opened.
- Red Hat will not be able to use configuration data to better triage customer support cases and identify which configurations our customers find important.
- The OpenShift Cluster Manager will not show data about your clusters including health and usage information.
- Your subscription entitlement information must be manually entered via console.redhat.com without the benefit of automatic usage reporting.
In restricted networks, Telemetry and Insights data can still be reported through appropriate configuration of your proxy.
4.3.2. Modifying the global cluster pull secret to disable remote health reporting
You can modify your existing global cluster pull secret to disable remote health reporting. This disables both Telemetry and the Insights Operator.
Prerequisites
- You have access to the cluster as a user with the cluster-admin role.
Procedure
Download the global cluster pull secret to your local file system.
$ oc extract secret/pull-secret -n openshift-config --to=.
- In a text editor, edit the .dockerconfigjson file that was downloaded.
- Remove the cloud.openshift.com JSON entry, for example:
"cloud.openshift.com":{"auth":"<hash>","email":"<email_address>"}
- Save the file.
You can now update your cluster to use this modified pull secret.
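Editing JSON by hand makes it easy to leave a stray comma behind. As an alternative to the text editor step, the following minimal sketch removes the cloud.openshift.com entry programmatically and writes the result back as valid JSON. It assumes the extracted file has a top-level auths object, as shown in the pull secret examples in this document. The function name `remove_registry_entry` is hypothetical.

```python
import json
from pathlib import Path


def remove_registry_entry(dockerconfigjson: str, registry: str = "cloud.openshift.com") -> str:
    """Return the pull secret JSON without the entry for the given registry."""
    config = json.loads(dockerconfigjson)
    # pop() with a default leaves the file unchanged if the entry is already absent.
    config.get("auths", {}).pop(registry, None)
    return json.dumps(config)


if __name__ == "__main__":
    path = Path(".dockerconfigjson")  # file created by the oc extract command
    path.write_text(remove_registry_entry(path.read_text()))
```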
4.3.3. Registering your disconnected cluster
Register your disconnected OpenShift Container Platform cluster on the Red Hat Hybrid Cloud Console so that your cluster is not impacted by the consequences listed in the section named "Consequences of disabling remote health reporting".
By registering your disconnected cluster, you can continue to report your subscription usage to Red Hat. In turn, Red Hat can return accurate usage and capacity trends associated with your subscription, so that you can use the returned information to better organize subscription allocations across all of your resources.
Prerequisites
- You are logged in to the OpenShift Container Platform web console as cluster-admin.
- You can log in to the Red Hat Hybrid Cloud Console.
Procedure
- Go to the Register disconnected cluster web page on the Red Hat Hybrid Cloud Console.
- Optional: To access the Register disconnected cluster web page from the home page of the Red Hat Hybrid Cloud Console, go to the Clusters navigation menu item and then select the Register cluster button.
- Enter your cluster’s details in the provided fields on the Register disconnected cluster page.
- From the Subscription settings section of the page, select the subscription settings that apply to your Red Hat subscription offering.
- To register your disconnected cluster, select the Register cluster button.
Additional resources
- Consequences of disabling remote health reporting
- How does the subscriptions service show my subscription data? (Getting Started with the Subscription Service)
4.3.4. Updating the global cluster pull secret
You can update the global pull secret for your cluster by either replacing the current pull secret or appending a new pull secret.
This procedure is required when users store images in a registry other than the one used during installation.
Prerequisites
- You have access to the cluster as a user with the cluster-admin role.
Procedure
Optional: To append a new pull secret to the existing pull secret, complete the following steps:
Enter the following command to download the pull secret:
$ oc get secret/pull-secret -n openshift-config --template='{{index .data ".dockerconfigjson" | base64decode}}' > <pull_secret_location>
Replace <pull_secret_location> with the path to the pull secret file.
Enter the following command to add the new pull secret:
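$ oc registry login --registry="<registry>" \
  --auth-basic="<username>:<password>" \
  --to=<pull_secret_location>
Replace <registry> with the new registry, <username>:<password> with the credentials for that registry, and <pull_secret_location> with the path to the pull secret file.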
Alternatively, you can perform a manual update to the pull secret file.
Enter the following command to update the global pull secret for your cluster:
$ oc set data secret/pull-secret -n openshift-config --from-file=.dockerconfigjson=<pull_secret_location>
Replace <pull_secret_location> with the path to the new pull secret file.
This update is rolled out to all nodes, which can take some time depending on the size of your cluster.
Note: As of OpenShift Container Platform 4.7.4, changes to the global pull secret no longer trigger a node drain or reboot.
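If you choose the manual update of the pull secret file mentioned in this procedure, it helps to know how a registry entry is encoded. In the Docker configuration file format, the auth value is the Base64 encoding of the string username:password. The following minimal sketch adds such an entry to an existing pull secret. The function name `add_registry_credentials` and the example registry are hypothetical, and the sketch is not a substitute for the oc registry login command.

```python
import base64
import json


def add_registry_credentials(dockerconfigjson: str, registry: str,
                             username: str, password: str) -> str:
    """Return the pull secret JSON with an added or replaced registry entry."""
    config = json.loads(dockerconfigjson)
    # The auth field is Base64 of "username:password".
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    config.setdefault("auths", {})[registry] = {"auth": token}
    return json.dumps(config, indent=2)


if __name__ == "__main__":
    # Hypothetical registry and credentials, for illustration only.
    print(add_registry_credentials('{"auths": {}}', "registry.example.com", "user", "pass"))
```

Existing entries for other registries are left untouched, which corresponds to appending rather than replacing the pull secret.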
4.4. Enabling remote health reporting
If you or your organization have disabled remote health reporting, you can enable this feature again. You can see that remote health reporting is disabled from the message "Insights not available" in the Status tile on the OpenShift Container Platform Web Console Overview page.
To enable remote health reporting, you must modify the global cluster pull secret with a new authorization token.
Enabling remote health reporting enables both Insights Operator and Telemetry.
4.4.1. Modifying your global cluster pull secret to enable remote health reporting
You can modify your existing global cluster pull secret to enable remote health reporting. If you have previously disabled remote health monitoring, you must first download a new pull secret with your console.openshift.com access token from Red Hat OpenShift Cluster Manager.
Prerequisites
- Access to the cluster as a user with the cluster-admin role.
- Access to OpenShift Cluster Manager.
Procedure
- Navigate to https://console.redhat.com/openshift/downloads.
From Tokens → Pull Secret, click Download. The file pull-secret.txt, which contains your cloud.openshift.com access token in JSON format, downloads:
{ "auths": { "cloud.openshift.com": { "auth": "<your_token>", "email": "<email_address>" } } }
Download the global cluster pull secret to your local file system.
$ oc get secret/pull-secret -n openshift-config --template='{{index .data ".dockerconfigjson" | base64decode}}' > pull-secret
Make a backup copy of your pull secret.
$ cp pull-secret pull-secret-backup
- Open the pull-secret file in a text editor.
- Append the cloud.openshift.com JSON entry from pull-secret.txt into auths.
- Save the file.
Update the secret in your cluster.
$ oc set data secret/pull-secret -n openshift-config --from-file=.dockerconfigjson=pull-secret
It may take several minutes for the secret to update and your cluster to begin reporting.
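The append step in this procedure can also be done programmatically, which avoids JSON syntax mistakes. The following minimal sketch copies the cloud.openshift.com entry from the downloaded pull-secret.txt into the auths object of the pull-secret file. It assumes both files are in the working directory under the names used in this procedure. The function name `merge_registry_entry` is hypothetical.

```python
import json
from pathlib import Path


def merge_registry_entry(current: str, downloaded: str,
                         registry: str = "cloud.openshift.com") -> str:
    """Copy one registry entry from the downloaded pull secret into the current one."""
    current_cfg = json.loads(current)
    downloaded_cfg = json.loads(downloaded)
    # Raises KeyError if the downloaded file does not contain the entry.
    entry = downloaded_cfg["auths"][registry]
    current_cfg.setdefault("auths", {})[registry] = entry
    return json.dumps(current_cfg)


if __name__ == "__main__":
    merged = merge_registry_entry(
        Path("pull-secret").read_text(),
        Path("pull-secret.txt").read_text(),
    )
    Path("pull-secret").write_text(merged)
```

Run it after making the backup copy and before updating the secret in your cluster.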
Verification
- Navigate to the OpenShift Container Platform Web Console Overview page.
- Insights in the Status tile reports the number of issues found.
4.5. Using Insights to identify issues with your cluster
Insights repeatedly analyzes the data Insights Operator sends. Users of OpenShift Container Platform can display the report in the Insights Advisor service on Red Hat Hybrid Cloud Console.
4.5.1. About Red Hat Insights Advisor for OpenShift Container Platform
You can use Insights Advisor to assess and monitor the health of your OpenShift Container Platform clusters. Whether you are concerned about individual clusters, or with your whole infrastructure, it is important to be aware of your exposure to issues that can affect service availability, fault tolerance, performance, or security.
Insights repeatedly analyzes the data that Insights Operator sends using a database of recommendations, which are sets of conditions that can leave your OpenShift Container Platform clusters at risk. Your data is then uploaded to the Insights Advisor service on Red Hat Hybrid Cloud Console where you can perform the following actions:
- See clusters impacted by a specific recommendation.
- Use robust filtering capabilities to refine your results to those recommendations.
- Learn more about individual recommendations, details about the risks they present, and get resolutions tailored to your individual clusters.
- Share results with other stakeholders.
4.5.2. Understanding Insights Advisor recommendations
Insights Advisor bundles information about various cluster states and component configurations that can negatively affect the service availability, fault tolerance, performance, or security of your clusters. This information set is called a recommendation in Insights Advisor and includes the following information:
- Name: A concise description of the recommendation
- Added: When the recommendation was published to the Insights Advisor archive
- Category: Whether the issue has the potential to negatively affect service availability, fault tolerance, performance, or security
- Total risk: A value derived from the likelihood that the condition will negatively affect your infrastructure, and the impact on operations if that were to happen
- Clusters: A list of clusters on which a recommendation is detected
- Description: A brief synopsis of the issue, including how it affects your clusters
- Link to associated topics: More information from Red Hat about the issue
4.5.3. Displaying potential issues with your cluster
This section describes how to display the Insights report in Insights Advisor on OpenShift Cluster Manager.
Note that Insights repeatedly analyzes your cluster and shows the latest results. These results can change, for example, if you fix an issue or a new issue has been detected.
Prerequisites
- Your cluster is registered on OpenShift Cluster Manager.
- Remote health reporting is enabled, which is the default.
- You are logged in to OpenShift Cluster Manager.
Procedure
Navigate to Advisor → Recommendations on OpenShift Cluster Manager. Depending on the result, Insights Advisor displays one of the following:
- No matching recommendations found, if Insights did not identify any issues.
- A list of issues Insights has detected, grouped by risk (low, moderate, important, and critical).
- No clusters yet, if Insights has not yet analyzed the cluster. The analysis starts shortly after the cluster has been installed, registered, and connected to the internet.
If any issues are displayed, click the > icon in front of the entry for more details.
Depending on the issue, the details can also contain a link to more information from Red Hat about the issue.
4.5.4. Displaying all Insights Advisor recommendations
The Recommendations view, by default, only displays the recommendations that are detected on your clusters. However, you can view all of the recommendations in the advisor archive.
Prerequisites
- Remote health reporting is enabled, which is the default.
- Your cluster is registered on Red Hat Hybrid Cloud Console.
- You are logged in to OpenShift Cluster Manager.
Procedure
- Navigate to Advisor → Recommendations on OpenShift Cluster Manager.
- Click the X icons next to the Clusters Impacted and Status filters.
You can now browse through all of the potential recommendations for your cluster.
4.5.5. Disabling Insights Advisor recommendations
You can disable specific recommendations that affect your clusters, so that they no longer appear in your reports. It is possible to disable a recommendation for a single cluster or all of your clusters.
Disabling a recommendation for all of your clusters also applies to any future clusters.
Prerequisites
- Remote health reporting is enabled, which is the default.
- Your cluster is registered on OpenShift Cluster Manager.
- You are logged in to OpenShift Cluster Manager.
Procedure
- Navigate to Advisor → Recommendations on OpenShift Cluster Manager.
- Click the name of the recommendation to disable. You are directed to the single recommendation page.
To disable the recommendation for a single cluster:
- Click the Options menu for that cluster, and then click Disable recommendation for cluster.
- Enter a justification note and click Save.
To disable the recommendation for all of your clusters:
- Click Actions → Disable recommendation.
- Enter a justification note and click Save.
4.5.6. Enabling a previously disabled Insights Advisor recommendation
When a recommendation is disabled for all clusters, you will no longer see the recommendation in Insights Advisor. You can change this behavior.
Prerequisites
- Remote health reporting is enabled, which is the default.
- Your cluster is registered on OpenShift Cluster Manager.
- You are logged in to OpenShift Cluster Manager.
Procedure
- Navigate to Advisor → Recommendations on OpenShift Cluster Manager.
- Filter the recommendations by Status → Disabled.
- Locate the recommendation to enable.
- Click the Options menu, and then click Enable recommendation.
4.5.7. Displaying the Insights status in the web console
Insights repeatedly analyzes your cluster, and you can display the status of the potential issues identified for your cluster in the OpenShift Container Platform web console. This status shows the number of issues in each category and links to the reports in OpenShift Cluster Manager for further details.
Prerequisites
- Your cluster is registered in OpenShift Cluster Manager.
- Remote health reporting is enabled, which is the default.
- You are logged in to the OpenShift Container Platform web console.
Procedure
- Navigate to Home → Overview in the OpenShift Container Platform web console.
- Click Insights on the Status card.
The pop-up window lists potential issues grouped by risk. Click the individual categories or View all recommendations in Insights Advisor to display more details.
4.6. Using Insights Operator
The Insights Operator periodically gathers configuration and component failure status and, by default, reports that data every two hours to Red Hat. This information enables Red Hat to assess configuration and deeper failure data than is reported through Telemetry. Users of OpenShift Container Platform can display the report in the Insights Advisor service on Red Hat Hybrid Cloud Console.
Additional resources
- The Insights Operator is installed and enabled by default. If you need to opt out of remote health reporting, see Opting out of remote health reporting.
- For more information on using Insights Advisor to identify issues with your cluster, see Using Insights to identify issues with your cluster.
4.6.1. Downloading your Insights Operator archive
Insights Operator stores gathered data in an archive located in the openshift-insights namespace of your cluster. You can download and review the data that is gathered by the Insights Operator.
Prerequisites
- Access to the cluster as a user with the cluster-admin role.
Procedure
Find the name of the running pod for the Insights Operator:
$ oc get pods --namespace=openshift-insights -o custom-columns=:metadata.name --no-headers --field-selector=status.phase=Running
Copy the recent data archives collected by the Insights Operator:
$ oc cp openshift-insights/<insights_operator_pod_name>:/var/lib/insights-operator ./insights-data
Replace <insights_operator_pod_name> with the pod name output from the preceding command.
The recent Insights Operator archives are now available in the insights-data directory.
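To review what was copied, you can list the contents of an archive without unpacking it by hand. The following is a minimal Python sketch, assuming that the files in the insights-data directory are gzip-compressed tar archives with a .tar.gz suffix. That suffix and the helper names newest_archive and list_members are assumptions made for illustration, not something this document specifies.

```python
import tarfile
from pathlib import Path


def newest_archive(directory):
    """Return the most recently modified *.tar.gz file in directory, or None."""
    # Assumption: archives carry a .tar.gz suffix.
    archives = sorted(Path(directory).glob("*.tar.gz"), key=lambda p: p.stat().st_mtime)
    return archives[-1] if archives else None


def list_members(archive_path):
    """Return the sorted names of the regular files stored in a tar archive."""
    # "r:*" lets tarfile detect the compression type on its own.
    with tarfile.open(archive_path, "r:*") as tar:
        return sorted(m.name for m in tar.getmembers() if m.isfile())
```

For example, list_members(newest_archive("./insights-data")) would return the file names in the latest archive, provided that at least one archive exists in that directory.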
4.6.2. Viewing Insights Operator gather durations
You can view the time it takes for the Insights Operator to gather the information contained in the archive. This helps you to understand Insights Operator resource usage and issues with Insights Advisor.
Prerequisites
- A recent copy of your Insights Operator archive.
Procedure
From your archive, open /insights-operator/gathers.json. The file contains a list of Insights Operator gather operations:
{
  "name": "clusterconfig/authentication",
  "duration_in_ms": 730,
  "records_count": 1,
  "errors": null,
  "panic": null
}
duration_in_ms is the amount of time in milliseconds for each gather operation.
- Inspect each gather operation for abnormalities.
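Inspecting many gather operations by eye can be tedious, so a short script may help. The following Python sketch assumes that the gather operations are available as a list of records with the fields shown in the sample above. In a real archive the list might be nested under a top-level key, so adjust the loading step accordingly. The 1000 ms threshold, the function names, and the second sample record are illustrative assumptions.

```python
import json


def slow_gathers(records, threshold_ms=1000):
    """Return (name, duration_in_ms) pairs at or above threshold_ms, slowest first."""
    slow = [
        (r["name"], r["duration_in_ms"])
        for r in records
        if r.get("duration_in_ms", 0) >= threshold_ms
    ]
    return sorted(slow, key=lambda item: item[1], reverse=True)


def failed_gathers(records):
    """Return the names of gather operations that reported errors or a panic."""
    return [r["name"] for r in records if r.get("errors") or r.get("panic")]


# Illustrative data only: the first record is the sample from this section,
# the second is invented to show a slow, failing gather.
sample = json.loads("""
[
  {"name": "clusterconfig/authentication", "duration_in_ms": 730,
   "records_count": 1, "errors": null, "panic": null},
  {"name": "example/slow_gather", "duration_in_ms": 4200,
   "records_count": 0, "errors": ["timeout"], "panic": null}
]
""")

print(slow_gathers(sample))
print(failed_gathers(sample))
```

Sorting the slowest operations first keeps the most likely contributors to resource usage at the top of the output.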
4.7. Using remote health reporting in a restricted network
You can manually gather and upload Insights Operator archives to diagnose issues from a restricted network.
To use the Insights Operator in a restricted network, you must:
- Create a copy of your Insights Operator archive.
- Upload the Insights Operator archive to console.redhat.com.
Additionally, you can choose to obfuscate the Insights Operator data before upload.
4.7.1. Running an Insights Operator gather operation
You must run a gather operation to create an Insights Operator archive.
Prerequisites
- You are logged in to OpenShift Container Platform as cluster-admin.
Procedure
Create a file named gather-job.yaml using this template:
apiVersion: batch/v1
kind: Job
metadata:
  name: insights-operator-job
  annotations:
    config.openshift.io/inject-proxy: insights-operator
spec:
  backoffLimit: 6
  ttlSecondsAfterFinished: 600
  template:
    spec:
      restartPolicy: OnFailure
      serviceAccountName: operator
      nodeSelector:
        beta.kubernetes.io/os: linux
        node-role.kubernetes.io/master: ""
      tolerations:
      - effect: NoSchedule
        key: node-role.kubernetes.io/master
        operator: Exists
      - effect: NoExecute
        key: node.kubernetes.io/unreachable
        operator: Exists
        tolerationSeconds: 900
      - effect: NoExecute
        key: node.kubernetes.io/not-ready
        operator: Exists
        tolerationSeconds: 900
      volumes:
      - name: snapshots
        emptyDir: {}
      - name: service-ca-bundle
        configMap:
          name: service-ca-bundle
          optional: true
      initContainers:
      - name: insights-operator
        image: quay.io/openshift/origin-insights-operator:latest
        terminationMessagePolicy: FallbackToLogsOnError
        volumeMounts:
        - name: snapshots
          mountPath: /var/lib/insights-operator
        - name: service-ca-bundle
          mountPath: /var/run/configmaps/service-ca-bundle
          readOnly: true
        ports:
        - containerPort: 8443
          name: https
        resources:
          requests:
            cpu: 10m
            memory: 70Mi
        args:
        - gather
        - -v=4
        - --config=/etc/insights-operator/server.yaml
      containers:
        - name: sleepy
          image: quay.io/openshift/origin-base:latest
          args:
            - /bin/sh
            - -c
            - sleep 10m
          volumeMounts: [{name: snapshots, mountPath: /var/lib/insights-operator}]
Copy your insights-operator image version:
$ oc get -n openshift-insights deployment insights-operator -o yaml
Paste your image version in gather-job.yaml:
initContainers:
- name: insights-operator
  image: <your_insights_operator_image_version>
  terminationMessagePolicy: FallbackToLogsOnError
  volumeMounts:
Create the gather job:
$ oc apply -n openshift-insights -f gather-job.yaml
Find the name of the job pod:
$ oc describe -n openshift-insights job/insights-operator-job
Example output
Events:
  Type    Reason            Age    From            Message
  ----    ------            ----   ----            -------
  Normal  SuccessfulCreate  7m18s  job-controller  Created pod: insights-operator-job-<your_job>
where insights-operator-job-<your_job> is the name of the pod.
Verify that the operation has finished:
$ oc logs -n openshift-insights insights-operator-job-<your_job> insights-operator
Example output
I0407 11:55:38.192084 1 diskrecorder.go:34] Wrote 108 records to disk in 33ms
Save the created archive:
$ oc cp openshift-insights/insights-operator-job-<your_job>:/var/lib/insights-operator ./insights-data
Clean up the job:
$ oc delete -n openshift-insights job insights-operator-job
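If you script this procedure, the pod name and the completion message can be extracted from the command output instead of being read manually. The following Python sketch is based only on the two example outputs shown above. It assumes that the event message keeps the form "Created pod: <name>" and that the log line keeps the form "Wrote N records to disk", which might differ between versions. The function names are hypothetical.

```python
import re

# Patterns derived from the example outputs in this procedure (assumed stable).
POD_CREATED = re.compile(r"Created pod:\s*(\S+)")
RECORDS_WRITTEN = re.compile(r"Wrote (\d+) records to disk")


def job_pod_name(describe_output):
    """Return the pod name from the last 'Created pod:' event, or None."""
    matches = POD_CREATED.findall(describe_output)
    return matches[-1] if matches else None


def records_written(log_output):
    """Return the record count from the 'Wrote N records to disk' line, or None."""
    match = RECORDS_WRITTEN.search(log_output)
    return int(match.group(1)) if match else None
```

A return value of None from records_written suggests that the gather has not finished yet, so the archive should not be copied until the line appears.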
4.7.2. Uploading an Insights Operator archive
You can manually upload an Insights Operator archive to console.redhat.com to diagnose potential issues.
Prerequisites
- You are logged in to OpenShift Container Platform as cluster-admin.
- You have a workstation with unrestricted internet access.
- You have created a copy of the Insights Operator archive.
Procedure
Download the dockerconfig.json file:
$ oc extract secret/pull-secret -n openshift-config --to=.
Copy your "cloud.openshift.com" "auth" token from the dockerconfig.json file:
{
  "auths": {
    "cloud.openshift.com": {
      "auth": "<your_token>",
      "email": "asd@redhat.com"
    }
  }
}
Upload the archive to console.redhat.com:
$ curl -v -H "User-Agent: insights-operator/one10time200gather184a34f6a168926d93c330 cluster/<cluster_id>" -H "Authorization: Bearer <your_token>" -F "upload=@<path_to_archive>; type=application/vnd.redhat.openshift.periodic+tar" https://console.redhat.com/api/ingress/v1/upload
where <cluster_id> is your cluster ID, <your_token> is the token from your pull secret, and <path_to_archive> is the path to the Insights Operator archive.
If the operation is successful, the command returns a "request_id" and "account_number":
Example output
* Connection #0 to host console.redhat.com left intact
{"request_id":"393a7cf1093e434ea8dd4ab3eb28884c","upload":{"account_number":"6274079"}}
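The token lookup and the success check in this procedure can be sketched in Python. This example assumes that the pull secret file has the JSON layout shown above and that a successful response has the shape of the example output. The function names are hypothetical, and the sketch does not perform the upload itself.

```python
import json


def cloud_auth_token(dockerconfig_text):
    """Return the "auth" value stored under "cloud.openshift.com"."""
    config = json.loads(dockerconfig_text)
    return config["auths"]["cloud.openshift.com"]["auth"]


def upload_headers(token, cluster_id):
    """Build the two request headers that the curl command in this procedure sends."""
    return {
        "User-Agent": "insights-operator/one10time200gather184a34f6a168926d93c330 cluster/" + cluster_id,
        "Authorization": "Bearer " + token,
    }


def upload_succeeded(response_text):
    """Return True if the response body carries a request_id and an account_number."""
    try:
        body = json.loads(response_text)
    except json.JSONDecodeError:
        return False
    if not isinstance(body, dict):
        return False
    upload = body.get("upload")
    return "request_id" in body and isinstance(upload, dict) and "account_number" in upload
```

Keeping the header construction in one function makes it harder to mistype the long User-Agent string when the upload is repeated for several archives.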
Verification steps
- Log in to https://console.redhat.com/openshift.
- Click the Clusters menu in the left pane.
- To display the details of the cluster, click the cluster name.
Open the Insights Advisor tab of the cluster.
If the upload was successful, the tab displays one of the following:
- Your cluster passed all recommendations, if Insights Advisor did not identify any issues.
- A list of issues that Insights Advisor has detected, prioritized by risk (low, moderate, important, and critical).
4.7.3. Enabling Insights Operator data obfuscation
You can enable obfuscation to mask sensitive and identifiable IPv4 addresses and cluster base domains that the Insights Operator sends to console.redhat.com.
Although this feature is available, Red Hat recommends keeping obfuscation disabled for a more effective support experience.
Obfuscation assigns non-identifying values to cluster IPv4 addresses. It uses a translation table that is retained in memory to change IP addresses to their obfuscated versions throughout the Insights Operator archive before the data is uploaded to console.redhat.com.
For cluster base domains, obfuscation changes the base domain to a hardcoded substring. For example, cluster-api.openshift.example.com becomes cluster-api.<CLUSTER_BASE_DOMAIN>.
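The idea of a translation table can be illustrated with a short Python sketch. This is not the Insights Operator implementation. The class name, the IPv4 pattern, and the scheme of handing out sequential 0.0.x.y addresses are assumptions chosen only to show how one table keeps replacements consistent across a whole archive.

```python
import re

# Simplified IPv4 pattern; it can also match version-like strings such as 4.10.0.1.
IPV4 = re.compile(r"\b(?:(?:25[0-5]|2[0-4]\d|1?\d?\d)\.){3}(?:25[0-5]|2[0-4]\d|1?\d?\d)\b")


class Obfuscator:
    """Replace IPv4 addresses and the cluster base domain in text."""

    def __init__(self, base_domain):
        self.base_domain = base_domain
        self.table = {}  # original IPv4 address -> obfuscated address

    def _obfuscated_ip(self, match):
        original = match.group(0)
        if original not in self.table:
            # Assumed scheme: number the addresses in order of first appearance.
            n = len(self.table) + 1
            self.table[original] = "0.0.%d.%d" % (n // 256, n % 256)
        return self.table[original]

    def obfuscate(self, text):
        text = IPV4.sub(self._obfuscated_ip, text)
        return text.replace(self.base_domain, "<CLUSTER_BASE_DOMAIN>")
```

Because the table is kept on the object, the same original address maps to the same obfuscated value wherever it appears in the processed text.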
The following procedure enables obfuscation using the support secret in the openshift-config namespace.
Prerequisites
- You are logged in to the OpenShift Container Platform web console as cluster-admin.
Procedure
- Navigate to Workloads → Secrets.
- Select the openshift-config project.
- Search for the support secret using the Search by name field. If it does not exist, click Create → Key/value secret to create it.
- Click the Options menu, and then click Edit Secret.
- Click Add Key/Value.
- Create a key named enableGlobalObfuscation with a value of true, and click Save.
- Navigate to Workloads → Pods.
- Select the openshift-insights project.
- Find the insights-operator pod.
- To restart the insights-operator pod, click the Options menu, and then click Delete Pod.
Verification
- Navigate to Workloads → Secrets.
- Select the openshift-insights project.
- Search for the obfuscation-translation-table secret using the Search by name field.
If the obfuscation-translation-table secret exists, then obfuscation is enabled and working.
Alternatively, you can inspect /insights-operator/gathers.json in your Insights Operator archive for the value "is_global_obfuscation_enabled": true.
Additional resources
- For more information on how to download your Insights Operator archive, see Showing data collected by the Insights Operator.
4.8. Importing simple content access entitlements with Insights Operator
Insights Operator periodically imports your simple content access entitlements from OpenShift Cluster Manager and stores them in the etc-pki-entitlement secret in the openshift-config-managed namespace. Simple content access is a capability in Red Hat subscription tools which simplifies the behavior of the entitlement tooling. This feature makes it easier to consume the content provided by your Red Hat subscriptions without the complexity of configuring subscription tooling.
By default, Insights Operator imports simple content access entitlements every eight hours. You can change the interval or disable the import by using the support secret in the openshift-config namespace.
Simple content access must be enabled in Red Hat Subscription Management for the importing to function.
Additional resources
- See About simple content access in the Red Hat Subscription Central documentation for more information about simple content access.
- See Using Red Hat subscriptions in builds for more information about using simple content access entitlements in OpenShift Container Platform builds.
4.8.1. Configuring simple content access import interval
You can configure how often the Insights Operator imports the simple content access entitlements using the support secret in the openshift-config namespace. The entitlement import normally occurs every eight hours, but you can shorten this interval if you update your simple content access configuration in Red Hat Subscription Management.
This procedure describes how to update the import interval to one hour.
Prerequisites
- You are logged in to the OpenShift Container Platform web console as cluster-admin.
Procedure
- Navigate to Workloads → Secrets.
- Select the openshift-config project.
- Search for the support secret using the Search by name field. If it does not exist, click Create → Key/value secret to create it.
- Click the Options menu, and then click Edit Secret.
- Click Add Key/Value.
- Create a key named scaInterval with a value of 1h, and click Save.
Note
The interval 1h can also be entered as 60m for 60 minutes.
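The equivalence of 1h and 60m can be checked with a small Python sketch. It assumes that the interval is a sequence of whole numbers followed by h, m, or s, which covers the two forms mentioned in the note. Whether the Insights Operator accepts other forms is not stated here, and the function name is hypothetical.

```python
import re

UNIT_SECONDS = {"h": 3600, "m": 60, "s": 1}
PART = re.compile(r"(\d+)([hms])")


def interval_seconds(value):
    """Convert an interval such as "1h", "60m", or "1h30m" to seconds."""
    parts = PART.findall(value)
    # Reject input that contains anything besides number-and-unit pairs.
    if not parts or "".join(n + u for n, u in parts) != value:
        raise ValueError("unsupported interval: %r" % value)
    return sum(int(n) * UNIT_SECONDS[u] for n, u in parts)
```

With this helper, interval_seconds("1h") and interval_seconds("60m") return the same number of seconds, which is a quick way to compare a new value against the eight-hour default before saving it.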
4.8.2. Disabling simple content access import
You can disable the importing of simple content access entitlements using the support secret in the openshift-config namespace.
Prerequisites
- You are logged in to the OpenShift Container Platform web console as cluster-admin.
Procedure
- Navigate to Workloads → Secrets.
- Select the openshift-config project.
- Search for the support secret using the Search by name field. If it does not exist, click Create → Key/value secret to create it.
- Click the Options menu, and then click Edit Secret.
- Click Add Key/Value.
- Create a key named scaPullDisabled with a value of true, and click Save.
The simple content access entitlement import is now disabled.
Note
To enable the simple content access import again, edit the support secret and delete the scaPullDisabled key.
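For reference, the following Python sketch shows what the support secret might look like as a manifest after this procedure. It assumes a generic key/value (Opaque) Secret whose values are stored base64-encoded, as is usual for Kubernetes Secret data. The function name is hypothetical, and the web console steps above remain the documented method.

```python
import base64
import json


def support_secret(settings):
    """Build a Secret manifest named "support" with base64-encoded values."""
    data = {
        key: base64.b64encode(value.encode("utf-8")).decode("ascii")
        for key, value in settings.items()
    }
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": "support", "namespace": "openshift-config"},
        "type": "Opaque",  # assumption: a generic key/value secret
        "data": data,
    }


manifest = support_secret({"scaPullDisabled": "true"})
print(json.dumps(manifest, indent=2))
```

The same helper could take the scaInterval or enableGlobalObfuscation keys described earlier, because all of them are plain key/value pairs in the same secret.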