Chapter 12. Support
12.1. Support overview
You can request assistance from Red Hat Support, report bugs, collect data about your environment, and monitor the health of your cluster and virtual machines (VMs) with the following tools.
12.1.1. Opening support tickets
If you have encountered an issue that requires immediate assistance from Red Hat Support, you can submit a support case.
To report a bug, you can create a Jira issue directly.
12.1.1.1. Submitting a support case
To request support from Red Hat Support, follow the instructions for submitting a support case.
It is helpful to collect debugging data to include with your support request.
12.1.1.1.1. Collecting data for Red Hat Support
You can gather debugging information by performing the following steps:
- Collecting data about your environment
- Configure Prometheus and Alertmanager and collect must-gather data for Red Hat OpenShift Service on AWS and OpenShift Virtualization.
12.1.1.2. Creating a Jira issue
To report a bug, you can create a Jira issue directly by filling out the form on the Create Issue page.
12.1.2. Web console monitoring
You can monitor the health of your cluster and VMs by using the Red Hat OpenShift Service on AWS web console. The web console displays resource usage, alerts, events, and trends for your cluster and for OpenShift Virtualization components and resources.
Page | Description |
---|---|
Overview page | Cluster details, status, alerts, inventory, and resource usage |
Virtualization | OpenShift Virtualization resources, usage, alerts, and status |
Virtualization | Top consumers of CPU, memory, and storage |
Virtualization | Progress of live migrations |
VirtualMachines | VM resource usage, storage, network, and migration |
VirtualMachines | List of VM events |
VirtualMachines | VM status conditions and volume snapshot status |
12.2. Collecting data for Red Hat Support
When you submit a support case to Red Hat Support, it is helpful to provide debugging information for Red Hat OpenShift Service on AWS and OpenShift Virtualization by using the following tools:
- Prometheus
- Prometheus is a time-series database and a rule evaluation engine for metrics. Prometheus sends alerts to Alertmanager for processing.
- Alertmanager
- The Alertmanager service handles alerts received from Prometheus. The Alertmanager is also responsible for sending the alerts to external notification systems.
For information about the Red Hat OpenShift Service on AWS monitoring stack, see About Red Hat OpenShift Service on AWS monitoring.
12.2.1. Collecting data about your environment
Collecting data about your environment minimizes the time required to analyze and determine the root cause.
Prerequisites
- Set the retention time for Prometheus metrics data to a minimum of seven days.
- Configure the Alertmanager to capture relevant alerts and to send alert notifications to a dedicated mailbox so that they can be viewed and persisted outside the cluster.
- Record the exact number of affected nodes and virtual machines.
Procedure
12.2.2. Collecting data about virtual machines
Collecting data about malfunctioning virtual machines (VMs) minimizes the time required to analyze and determine the root cause.
Prerequisites
- Linux VMs: Install the latest QEMU guest agent.
Windows VMs:
- Record the Windows patch update details.
- Install the latest VirtIO drivers.
- Install the latest QEMU guest agent.
- If Remote Desktop Protocol (RDP) is enabled, connect by using the desktop viewer to determine whether there is a problem with the connection software.
Procedure
- Collect screenshots of VMs that have crashed before you restart them.
- Collect memory dumps from VMs before remediation attempts.
- Record factors that the malfunctioning VMs have in common. For example, the VMs have the same host or network.
12.3. Troubleshooting
OpenShift Virtualization provides tools and logs for troubleshooting virtual machines (VMs) and virtualization components.
You can troubleshoot OpenShift Virtualization components by using the tools provided in the web console or by using the oc CLI tool.
12.3.1. Events
Red Hat OpenShift Service on AWS events are records of important life-cycle information and are useful for monitoring and troubleshooting virtual machine, namespace, and resource issues.
VM events: Navigate to the Events tab of the VirtualMachine details page in the web console.
- Namespace events
You can view namespace events by running the following command:
$ oc get events -n <namespace>
See the list of events for details about specific events.
- Resource events
You can view resource events by running the following command:
$ oc describe <resource> <resource_name>
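When a namespace produces many events, it can help to reduce the event list to warnings only. The following Python sketch assumes that you saved the JSON form of the event list (for example, the output of oc get events -n <namespace> -o json) and loaded it into a dictionary. The function name and the sample data are hypothetical and only illustrate the idea.

```python
from collections import Counter


def warning_reasons(event_list):
    """Count Warning events by (kind, name, reason) of the involved object."""
    counts = Counter()
    for event in event_list.get("items", []):
        if event.get("type") != "Warning":
            continue  # keep only warnings, skip Normal events
        obj = event.get("involvedObject", {})
        counts[(obj.get("kind"), obj.get("name"), event.get("reason"))] += 1
    return counts


# Made-up sample data in the shape of a Kubernetes event list.
sample_events = {
    "items": [
        {"type": "Normal", "reason": "Started",
         "involvedObject": {"kind": "Pod", "name": "virt-launcher-example"}},
        {"type": "Warning", "reason": "FailedScheduling",
         "involvedObject": {"kind": "Pod", "name": "virt-launcher-example"}},
        {"type": "Warning", "reason": "FailedScheduling",
         "involvedObject": {"kind": "Pod", "name": "virt-launcher-example"}},
    ]
}

for key, count in warning_reasons(sample_events).items():
    print(key, count)
```

Grouping by the involved object makes repeated warnings against the same resource stand out before you run oc describe on it.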
12.3.2. Pod logs
You can view logs for OpenShift Virtualization pods by using the web console or the CLI. You can also view aggregated logs by using the LokiStack in the web console.
12.3.2.1. Configuring OpenShift Virtualization pod log verbosity
You can configure the verbosity level of OpenShift Virtualization pod logs by editing the HyperConverged custom resource (CR).
Procedure
To set log verbosity for specific components, open the HyperConverged CR in your default text editor by running the following command:
$ oc edit hyperconverged kubevirt-hyperconverged -n openshift-cnv
Set the log level for one or more components by editing the spec.logVerbosityConfig stanza. For example:
apiVersion: hco.kubevirt.io/v1beta1
kind: HyperConverged
metadata:
  name: kubevirt-hyperconverged
spec:
  logVerbosityConfig:
    kubevirt:
      virtAPI: 5 # (1)
      virtController: 4
      virtHandler: 3
      virtLauncher: 2
      virtOperator: 6
(1) The log verbosity value must be an integer in the range 1–9, where a higher number indicates a more detailed log. In this example, the virtAPI component logs are exposed if their priority level is 5 or higher.
Apply your changes by saving and exiting the editor.
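Because the verbosity values must be integers in the range 1–9, it can be useful to check a planned configuration before editing the CR. The following Python sketch is a hypothetical helper, not part of any OpenShift tooling. It takes the mapping that you intend to place under spec.logVerbosityConfig.kubevirt and returns the entries that fall outside the documented range.

```python
# Hypothetical pre-check for values intended for spec.logVerbosityConfig.kubevirt.
VALID_LEVELS = range(1, 10)  # integers 1 through 9, as documented above


def invalid_verbosity_entries(component_levels):
    """Return {component: value} for entries that are not integers in 1-9."""
    return {
        name: level
        for name, level in component_levels.items()
        # bool is a subclass of int in Python, so reject it explicitly
        if isinstance(level, bool)
        or not isinstance(level, int)
        or level not in VALID_LEVELS
    }


planned = {
    "virtAPI": 5,
    "virtController": 4,
    "virtHandler": 3,
    "virtLauncher": 2,
    "virtOperator": 6,
}
print(invalid_verbosity_entries(planned))  # prints {} because every value is valid
```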
12.3.2.2. Viewing virt-launcher pod logs with the web console
You can view the virt-launcher pod logs for a virtual machine by using the Red Hat OpenShift Service on AWS web console.
Procedure
- Navigate to Virtualization → VirtualMachines.
- Select a virtual machine to open the VirtualMachine details page.
- On the General tile, click the pod name to open the Pod details page.
- Click the Logs tab to view the logs.
12.3.2.3. Viewing OpenShift Virtualization pod logs with the CLI
You can view logs for the OpenShift Virtualization pods by using the oc CLI tool.
Procedure
View a list of pods in the OpenShift Virtualization namespace by running the following command:
$ oc get pods -n openshift-cnv
Example 12.1. Example output
NAME                               READY   STATUS    RESTARTS   AGE
disks-images-provider-7gqbc        1/1     Running   0          32m
disks-images-provider-vg4kx        1/1     Running   0          32m
virt-api-57fcc4497b-7qfmc          1/1     Running   0          31m
virt-api-57fcc4497b-tx9nc          1/1     Running   0          31m
virt-controller-76c784655f-7fp6m   1/1     Running   0          30m
virt-controller-76c784655f-f4pbd   1/1     Running   0          30m
virt-handler-2m86x                 1/1     Running   0          30m
virt-handler-9qs6z                 1/1     Running   0          30m
virt-operator-7ccfdbf65f-q5snk     1/1     Running   0          32m
virt-operator-7ccfdbf65f-vllz8     1/1     Running   0          32m
View the pod log by running the following command:
$ oc logs -n openshift-cnv <pod_name>
Note
If a pod fails to start, you can use the --previous option to view logs from the last attempt.
To monitor log output in real time, use the -f option.
Example 12.2. Example output
{"component":"virt-handler","level":"info","msg":"set verbosity to 2","pos":"virt-handler.go:453","timestamp":"2022-04-17T08:58:37.373695Z"}
{"component":"virt-handler","level":"info","msg":"set verbosity to 2","pos":"virt-handler.go:453","timestamp":"2022-04-17T08:58:37.373726Z"}
{"component":"virt-handler","level":"info","msg":"setting rate limiter to 5 QPS and 10 Burst","pos":"virt-handler.go:462","timestamp":"2022-04-17T08:58:37.373782Z"}
{"component":"virt-handler","level":"info","msg":"CPU features of a minimum baseline CPU model: map[apic:true clflush:true cmov:true cx16:true cx8:true de:true fpu:true fxsr:true lahf_lm:true lm:true mca:true mce:true mmx:true msr:true mtrr:true nx:true pae:true pat:true pge:true pni:true pse:true pse36:true sep:true sse:true sse2:true sse4.1:true ssse3:true syscall:true tsc:true]","pos":"cpu_plugin.go:96","timestamp":"2022-04-17T08:58:37.390221Z"}
{"component":"virt-handler","level":"warning","msg":"host model mode is expected to contain only one model","pos":"cpu_plugin.go:103","timestamp":"2022-04-17T08:58:37.390263Z"}
{"component":"virt-handler","level":"info","msg":"node-labeller is running","pos":"node_labeller.go:94","timestamp":"2022-04-17T08:58:37.391011Z"}
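Because each log line in the example output is a JSON object, you can filter saved logs by severity with a short script. The following Python sketch assumes that the log was saved as one JSON object per line, as shown above. The function name and the severity ordering (debug, info, warning, error) are assumptions made for illustration.

```python
import json

# Assumed severity ordering; adjust it if your logs use other level names.
LEVELS = {"debug": 0, "info": 1, "warning": 2, "error": 3}


def filter_log_lines(lines, min_level="warning"):
    """Return parsed log entries whose level is at or above min_level."""
    threshold = LEVELS[min_level]
    selected = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not JSON
        if not isinstance(entry, dict):
            continue
        level = entry.get("level")
        if isinstance(level, str) and LEVELS.get(level, -1) >= threshold:
            selected.append(entry)
    return selected


# Two lines taken from the example output above.
sample = [
    '{"component":"virt-handler","level":"info","msg":"set verbosity to 2",'
    '"pos":"virt-handler.go:453","timestamp":"2022-04-17T08:58:37.373695Z"}',
    '{"component":"virt-handler","level":"warning","msg":"host model mode is '
    'expected to contain only one model","pos":"cpu_plugin.go:103",'
    '"timestamp":"2022-04-17T08:58:37.390263Z"}',
]

for entry in filter_log_lines(sample):
    print(entry["timestamp"], entry["level"], entry["msg"])
```

With the default threshold, only the warning about the host model mode is printed.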
12.3.3. Guest system logs
Viewing the boot logs of VM guests can help diagnose issues. You can configure access to guests' logs and view them by using either the Red Hat OpenShift Service on AWS web console or the oc CLI.
This feature is disabled by default. If a VM does not explicitly have this setting enabled or disabled, it inherits the cluster-wide default setting.
If sensitive information such as credentials or other personally identifiable information (PII) is written to the serial console, it is logged with all other visible text. Red Hat recommends using SSH to send sensitive data instead of the serial console.
12.3.3.1. Enabling default access to VM guest system logs with the web console
You can enable default access to VM guest system logs by using the web console.
Procedure
- From the side menu, click Virtualization → Overview.
- Click the Settings tab.
- Click Cluster → Guest management.
- Set Enable guest system log access to on.
12.3.3.2. Enabling default access to VM guest system logs with the CLI
You can enable default access to VM guest system logs by editing the HyperConverged custom resource (CR).
Procedure
Open the HyperConverged CR in your default editor by running the following command:
$ oc edit hyperconverged kubevirt-hyperconverged -n openshift-cnv
Update the disableSerialConsoleLog value. For example:
kind: HyperConverged
metadata:
  name: kubevirt-hyperconverged
spec:
  virtualMachineOptions:
    disableSerialConsoleLog: true # (1)
#...
(1) Set the value of disableSerialConsoleLog to false if you want serial console access to be enabled on VMs by default.
12.3.3.3. Setting guest system log access for a single VM with the web console
You can configure access to VM guest system logs for a single VM by using the web console. This setting takes precedence over the cluster-wide default configuration.
Procedure
- Click Virtualization → VirtualMachines from the side menu.
- Select a virtual machine to open the VirtualMachine details page.
- Click the Configuration tab.
- Set Guest system log access to on or off.
12.3.3.4. Setting guest system log access for a single VM with the CLI
You can configure access to VM guest system logs for a single VM by editing the VirtualMachine CR. This setting takes precedence over the cluster-wide default configuration.
Procedure
Edit the virtual machine manifest by running the following command:
$ oc edit vm <vm_name>
Update the value of the logSerialConsole field. For example:
apiVersion: kubevirt.io/v1
kind: VirtualMachine
metadata:
  name: example-vm
spec:
  template:
    spec:
      domain:
        devices:
          logSerialConsole: true # (1)
#...
(1) To enable access to the guest’s serial console log, set the logSerialConsole value to true.
Save and exit the editor to apply the new configuration to the VM. If you edited a local copy of the manifest instead, apply it by running the following command:
$ oc apply -f <vm_manifest_file>
Optional: If you edited a running VM, restart the VM to apply the new configuration. For example:
$ virtctl restart <vm_name> -n <namespace>
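The interaction between the cluster-wide default and the per-VM setting can be summarized in a few lines. The following Python sketch is only an illustration of the precedence rule described in this section. The function and parameter names are hypothetical: cluster_disable_serial_console_log stands for the disableSerialConsoleLog value in the HyperConverged CR, and vm_log_serial_console stands for the logSerialConsole value in the VirtualMachine CR, with None meaning that the VM does not set it.

```python
def guest_log_access_enabled(cluster_disable_serial_console_log,
                             vm_log_serial_console=None):
    """Return True if guest system log access is in effect for a VM.

    A per-VM setting, when present, takes precedence over the cluster default.
    """
    if vm_log_serial_console is not None:
        return vm_log_serial_console
    # No per-VM setting: the VM inherits the cluster-wide default.
    return not cluster_disable_serial_console_log


# Cluster default disables logging, but the VM explicitly enables it.
print(guest_log_access_enabled(True, vm_log_serial_console=True))  # True
# The VM has no explicit setting, so it inherits the disabled default.
print(guest_log_access_enabled(True))  # False
```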
12.3.3.5. Viewing guest system logs with the web console
You can view the serial console logs of a virtual machine (VM) guest by using the web console.
Prerequisites
- Guest system log access is enabled.
Procedure
- Click Virtualization → VirtualMachines from the side menu.
- Select a virtual machine to open the VirtualMachine details page.
- Click the Diagnostics tab.
- Click Guest system logs to load the serial console.
12.3.3.6. Viewing guest system logs with the CLI
You can view the serial console logs of a VM guest by running the oc logs command.
Prerequisites
- Guest system log access is enabled.
Procedure
View the logs by running the following command, substituting your own values for <namespace> and <vm_name>:
$ oc logs -n <namespace> -l kubevirt.io/domain=<vm_name> --tail=-1 -c guest-console-log
12.3.4. Log aggregation
You can facilitate troubleshooting by aggregating and filtering logs.
12.3.4.1. Viewing aggregated OpenShift Virtualization logs with the LokiStack
You can view aggregated logs for OpenShift Virtualization pods and containers by using the LokiStack in the web console.
Prerequisites
- You deployed the LokiStack.
Procedure
- Navigate to Observe → Logs in the web console.
- Select application, for virt-launcher pod logs, or infrastructure, for OpenShift Virtualization control plane pods and containers, from the log type list.
- Click Show Query to display the query field.
- Enter the LogQL query in the query field and click Run Query to display the filtered logs.
12.3.4.2. OpenShift Virtualization LogQL queries
You can view and filter aggregated logs for OpenShift Virtualization components by running Loki Query Language (LogQL) queries on the Observe → Logs page in the web console.
The default log type is infrastructure. The virt-launcher log type is application.
Optional: You can include or exclude strings or regular expressions by using line filter expressions.
If the query matches a large number of logs, the query might time out.
Component | LogQL query |
---|---|
All | {log_type=~".+"}|json |kubernetes_labels_app_kubernetes_io_part_of="hyperconverged-cluster" |
| {log_type=~".+"}|json |kubernetes_labels_app_kubernetes_io_part_of="hyperconverged-cluster" |kubernetes_labels_app_kubernetes_io_component="storage" |
| {log_type=~".+"}|json |kubernetes_labels_app_kubernetes_io_part_of="hyperconverged-cluster" |kubernetes_labels_app_kubernetes_io_component="deployment" |
| {log_type=~".+"}|json |kubernetes_labels_app_kubernetes_io_part_of="hyperconverged-cluster" |kubernetes_labels_app_kubernetes_io_component="network" |
| {log_type=~".+"}|json |kubernetes_labels_app_kubernetes_io_part_of="hyperconverged-cluster" |kubernetes_labels_app_kubernetes_io_component="compute" |
| {log_type=~".+"}|json |kubernetes_labels_app_kubernetes_io_part_of="hyperconverged-cluster" |kubernetes_labels_app_kubernetes_io_component="schedule" |
Container | {log_type=~".+",kubernetes_container_name=~"<container>|<container>"} |json|kubernetes_labels_app_kubernetes_io_part_of="hyperconverged-cluster" |
virt-launcher | You must select application from the log type list before running this query. {log_type=~".+", kubernetes_container_name="compute"}|json |!= "custom-ga-command" |
You can filter log lines to include or exclude strings or regular expressions by using line filter expressions.
Line filter expression | Description |
---|---|
|= "<string>" | Log line contains string |
!= "<string>" | Log line does not contain string |
|~ "<regex>" | Log line contains regular expression |
!~ "<regex>" | Log line does not contain regular expression |
Example line filter expression
{log_type=~".+"}|json |kubernetes_labels_app_kubernetes_io_part_of="hyperconverged-cluster" |= "error" != "timeout"
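The queries in this section share a common prefix and differ only in the component label and the line filters, so they can be assembled programmatically. The following Python sketch is a hypothetical helper. It composes the strings shown above and does not contact Loki. The function and parameter names are assumptions, and filter strings are inserted as given, without escaping embedded double quotes.

```python
# Common prefix shared by the queries in the table above.
BASE_QUERY = (
    '{log_type=~".+"}|json '
    '|kubernetes_labels_app_kubernetes_io_part_of="hyperconverged-cluster"'
)


def build_query(component=None, contains=(), excludes=()):
    """Compose a LogQL query string from an optional component and line filters."""
    query = BASE_QUERY
    if component:
        query += f' |kubernetes_labels_app_kubernetes_io_component="{component}"'
    for text in contains:
        query += f' |= "{text}"'  # log line contains string
    for text in excludes:
        query += f' != "{text}"'  # log line does not contain string
    return query


# Reproduces the example line filter expression shown above.
print(build_query(contains=["error"], excludes=["timeout"]))

# Storage component logs that contain the string "error".
print(build_query(component="storage", contains=["error"]))
```

You can paste the printed string into the query field on the Observe → Logs page.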
Additional resources for LokiStack and LogQL
- About log storage
- LogQL log queries in the Grafana documentation
12.3.5. Common error messages
The following error messages might appear in OpenShift Virtualization logs:
ErrImagePull or ImagePullBackOff
- Indicates an incorrect deployment configuration or problems with the images that are referenced.
12.3.6. Troubleshooting data volumes
You can check the Conditions and Events sections of the DataVolume object to analyze and resolve issues.
12.3.6.1. About data volume conditions and events
You can diagnose data volume issues by examining the output of the Conditions and Events sections generated by the command:
$ oc describe dv <DataVolume>
The Conditions section displays the following Types:
- Bound
- Running
- Ready
The Events section provides the following additional information:
- Type of event
- Reason for logging
- Source of the event
- Message containing additional diagnostic information.
The output from oc describe does not always contain Events.
An event is generated when the Status, Reason, or Message changes. Both conditions and events react to changes in the state of the data volume.
For example, if you misspell the URL during an import operation, the import generates a 404 message. That message change generates an event with a reason. The output in the Conditions section is updated as well.
12.3.6.2. Analyzing data volume conditions and events
By inspecting the Conditions and Events sections generated by the describe command, you can determine the state of the data volume in relation to persistent volume claims (PVCs), and whether an operation is actively running or completed. You might also receive messages that offer specific details about the status of the data volume and how it came to be in its current state.
There are many different combinations of conditions. Each must be evaluated in its unique context.
Examples of various combinations follow.
Bound
- A successfully bound PVC displays in this example.
Note that the Type is Bound, so the Status is True. If the PVC is not bound, the Status is False.
When the PVC is bound, an event is generated stating that the PVC is bound. In this case, the Reason is Bound and Status is True. The Message indicates which PVC owns the data volume.
Message, in the Events section, provides further details including how long the PVC has been bound (Age) and by what resource (From), in this case datavolume-controller:
Example output
Status:
  Conditions:
    Last Heart Beat Time:  2020-07-15T03:58:24Z
    Last Transition Time:  2020-07-15T03:58:24Z
    Message:               PVC win10-rootdisk Bound
    Reason:                Bound
    Status:                True
    Type:                  Bound
...
Events:
  Type    Reason  Age  From                   Message
  ----    ------  ---  ----                   -------
  Normal  Bound   24s  datavolume-controller  PVC example-dv Bound
Running
- In this case, note that Type is Running and Status is False, indicating that an event has occurred that caused an attempted operation to fail, changing the Status from True to False.
However, note that Reason is Completed and the Message field indicates Import Complete.
In the Events section, the Reason and Message contain additional troubleshooting information about the failed operation. In this example, the Message displays an inability to connect due to a 404, listed in the Events section’s first Warning.
From this information, you conclude that an import operation was running, creating contention for other operations that are attempting to access the data volume:
Example output
Status:
  Conditions:
    Last Heart Beat Time:  2020-07-15T04:31:39Z
    Last Transition Time:  2020-07-15T04:31:39Z
    Message:               Import Complete
    Reason:                Completed
    Status:                False
    Type:                  Running
...
Events:
  Type     Reason  Age                From                   Message
  ----     ------  ----               ----                   -------
  Warning  Error   12s (x2 over 14s)  datavolume-controller  Unable to connect to http data source: expected status code 200, got 404. Status: 404 Not Found
Ready
- If Type is Ready and Status is True, then the data volume is ready to be used, as in the following example. If the data volume is not ready to be used, the Status is False:
Example output
Status:
  Conditions:
    Last Heart Beat Time:  2020-07-15T04:31:39Z
    Last Transition Time:  2020-07-15T04:31:39Z
    Status:                True
    Type:                  Ready
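The three condition types can also be read from the JSON form of the DataVolume object. The following Python sketch assumes that you loaded the object (for example, the output of oc get dv <DataVolume> -o json) into a dictionary, and that each entry under status.conditions carries type, status, reason, and message keys with status given as the string True or False. The function names and the order of the checks are a simplified illustration, not an authoritative decision procedure. Evaluate each combination in its own context, as noted above.

```python
def summarize_conditions(datavolume):
    """Map each condition type to its status, reason, and message."""
    conditions = datavolume.get("status", {}).get("conditions", [])
    return {
        c["type"]: {
            "status": c.get("status"),
            "reason": c.get("reason"),
            "message": c.get("message"),
        }
        for c in conditions
    }


def describe_state(datavolume):
    """Return a short, simplified reading of the Bound, Running, Ready conditions."""
    summary = summarize_conditions(datavolume)

    def is_true(condition_type):
        return summary.get(condition_type, {}).get("status") == "True"

    if is_true("Ready"):
        return "ready to use"
    if not is_true("Bound"):
        return "PVC not bound"
    if is_true("Running"):
        return "operation in progress"
    return "bound but not ready; check the Events section"


# Sample assembled from the example outputs in this section.
sample = {
    "status": {
        "conditions": [
            {"type": "Bound", "status": "True", "reason": "Bound",
             "message": "PVC win10-rootdisk Bound"},
            {"type": "Running", "status": "False", "reason": "Completed",
             "message": "Import Complete"},
            {"type": "Ready", "status": "True"},
        ]
    }
}

print(describe_state(sample))  # ready to use
```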